Importance of Web Usage Mining in Machine Learning

J. Sharmila, A. Subramani

Published in International Journal of Advanced Research in Computer Science Engineering and Information Technology

ISSN: 2321-3337    Impact Factor: 1.521    Volume: 3    Issue: 3    Year: 11 November, 2014    Pages: 375-385

Abstract

The World Wide Web (Web) has become an indispensable platform for receiving and disseminating information and for interacting with society on the Internet. With its astronomical growth over the past decade, the Web has become huge, diverse and dynamic. The application of data mining techniques to the Web is called Web mining. Web mining aims to discover interesting patterns in the structure, the contents and the usage of Web sites. Although it is an indispensable tool for the webmaster, it still has a long road ahead, in which visualization plays an important role. Web mining techniques have emerged as an important research area for helping Web users find the information they need. This paper analyzes the views and methodologies stated by various authors on the processes involved in mining the Web. Web usage mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Usage data captures the identity or origin of Web users along with their browsing behavior at a Web site. Learning is the ability to improve a machine's behaviour and its inference or interpretation based on training data or experience. There are many criteria for categorizing learning algorithms, and in practice selecting a suitable algorithm is a very complex process. The abstract notion of a learning algorithm (a neural network, a genetic algorithm, a classifier system) must be made very specific, depending on the situation in which it should work. We therefore investigate the applicability of a number of learning algorithms by tuning certain aspects of each algorithm.

Keywords

Data preprocessing, Data cleaning, Session identification
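
The preprocessing steps named in the keywords can be illustrated with a minimal Python sketch. It is not taken from the paper. The log records, the list of static-file suffixes treated as noise, and the 30-minute inactivity timeout are all assumptions chosen for illustration. The timeout is a commonly used heuristic rather than a value stated by the authors.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (client IP, timestamp, requested URL).
LOG = [
    ("10.0.0.1", "2014-11-11 10:00:00", "/index.html"),
    ("10.0.0.1", "2014-11-11 10:00:01", "/logo.png"),
    ("10.0.0.1", "2014-11-11 10:05:00", "/products.html"),
    ("10.0.0.2", "2014-11-11 10:06:00", "/index.html"),
    ("10.0.0.1", "2014-11-11 11:30:00", "/index.html"),
]

# Assumption: requests for embedded resources are noise, not page views.
STATIC_SUFFIXES = (".png", ".jpg", ".gif", ".css", ".js")
# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(records):
    """Data cleaning: drop requests for embedded static resources."""
    return [r for r in records if not r[2].lower().endswith(STATIC_SUFFIXES)]


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Session identification: group page views per IP, splitting on long gaps."""
    sessions = []      # list of (ip, [urls]) in order of session start
    open_session = {}  # ip -> index of that IP's current session
    last_seen = {}     # ip -> timestamp of that IP's previous request
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), ip, url)
        for ip, ts, url in records
    )
    for ts, ip, url in parsed:
        if ip not in last_seen or ts - last_seen[ip] > timeout:
            sessions.append((ip, []))
            open_session[ip] = len(sessions) - 1
        sessions[open_session[ip]][1].append(url)
        last_seen[ip] = ts
    return sessions


sessions = sessionize(clean(LOG))
for ip, pages in sessions:
    print(ip, pages)
```

Identifying users by IP address alone is a simplification. Real logs usually also need user-agent or cookie information to separate visitors who share an address.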
