Published in International Journal of Advanced Research in Computer Science Engineering and Information Technology
ISSN: 2321-3337 Impact Factor: 1.521 Volume: 3 Issue: 3 Year: 11 November, 2014 Pages: 375-385
The World Wide Web (Web) provides an indispensable platform for receiving and disseminating information and for interacting with society on the Internet. With its astronomical growth over the past decade, the Web has become huge, diverse and dynamic. The application of data mining techniques to the Web is called Web Mining. Web Mining aims to discover interesting patterns in the structure, the contents and the usage of web sites. Although it is an indispensable tool for the webmaster, it still has a long road ahead, in which visualization plays an important role. Web mining techniques have emerged as an important research area for helping Web users find the information they need. This paper analyzes the views and methodologies stated by various authors on the various processes involved in mining the Web. Web Usage Mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Usage data captures the identity or origin of Web users along with their browsing behavior at a Web site. Learning is the ability to improve a machine’s behaviour and its inference or interpretation on the basis of training data or experience. There are many criteria for categorizing learning algorithms, and in practice selecting a suitable algorithm is a very complex process. An abstract notion of a learning algorithm (a neural network, a genetic algorithm, a classifier system) must be made very specific, depending on the situation in which it is to work. We therefore investigate the applicability of a number of learning algorithms by tuning certain aspects of each algorithm.
Data Preprocessing, Data Cleaning, Session Identification
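As a rough illustration of the data cleaning and session identification steps named in the keywords, the following Python sketch filters a small usage log and groups the remaining requests into sessions. Everything in it is an assumption made for illustration and not taken from the paper: the simplified record layout (client IP, timestamp, URL), the list of ignored file suffixes, the 30-minute inactivity timeout, and the function names `clean` and `identify_sessions`.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (client IP, ISO timestamp, requested URL).
SAMPLE_LOG = [
    ("10.0.0.1", "2014-11-11T10:00:00", "/index.html"),
    ("10.0.0.1", "2014-11-11T10:00:01", "/logo.png"),
    ("10.0.0.1", "2014-11-11T10:05:00", "/products.html"),
    ("10.0.0.2", "2014-11-11T10:06:00", "/index.html"),
    ("10.0.0.1", "2014-11-11T11:00:00", "/contact.html"),
]

# Assumed cleaning rule: embedded resources are not user page views.
IGNORED_SUFFIXES = (".png", ".jpg", ".gif", ".css", ".js")

# Assumed heuristic: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(records):
    """Data cleaning: drop requests for images, stylesheets and scripts."""
    return [r for r in records if not r[2].lower().endswith(IGNORED_SUFFIXES)]


def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """Session identification: group each client's requests by inactivity gap."""
    sessions = []       # list of (ip, [urls]) in order of session start
    open_session = {}   # ip -> (time of last request, that session's url list)
    for ip, stamp, url in sorted(records, key=lambda r: r[1]):
        when = datetime.fromisoformat(stamp)
        current = open_session.get(ip)
        if current is None or when - current[0] > timeout:
            pages = []                  # start a new session for this client
            sessions.append((ip, pages))
        else:
            pages = current[1]          # continue the client's open session
        pages.append(url)
        open_session[ip] = (when, pages)
    return sessions


sessions = identify_sessions(clean(SAMPLE_LOG))
for ip, pages in sessions:
    print(ip, pages)
```

Identifying a user by IP address alone is a simplification. Real logs often need additional fields such as the user agent or referrer to separate users behind a shared address.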