DOI: 10.1145/2331801.2331804
research-article

CAMEUD: clustering approach for mining evolving usage data

Published: 20 May 2012

ABSTRACT

The growing number of traces left behind by user transactions on the Internet (e.g. customer purchases and user navigations) has increased the importance of Web usage data analysis. A notable challenge for this analysis is that the way in which a website is visited can evolve over time. As a result, usage models must be continuously updated to reflect the current behaviour of the visitors. In this article, we introduce CAMEUD, a clustering approach to mine and detect changes in evolving usage data. The proposed approach is independent of the clustering algorithm applied to the classification problem and is able to detect and determine the nature of the changes undergone by the usage groups (appearance, disappearance, fusion and split) over subsequent time intervals. Experiments on synthetic and real usage data sets evaluate the efficiency of CAMEUD.
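
The abstract names four kinds of change between usage groups (appearance, disappearance, fusion and split) but does not say how they are detected. The sketch below is not the CAMEUD procedure. It is a minimal illustration under stated assumptions: the same user identifiers are clustered in two consecutive time intervals, and two clusters are linked when they share at least a fraction `threshold` of the smaller cluster's members. The function name `detect_changes`, the overlap rule and the default threshold are all hypothetical. Because the function only consumes cluster memberships, it works with the output of any clustering algorithm.

```python
from typing import Dict, Set

# A clustering maps a cluster label to the set of user ids it contains.
Clustering = Dict[str, Set[int]]


def detect_changes(prev: Clustering, curr: Clustering,
                   threshold: float = 0.5) -> Dict[str, list]:
    """Label changes between two clusterings using set overlap.

    Hypothetical rule (an assumption, not taken from the paper): a previous
    cluster and a current cluster are linked when their intersection covers
    at least `threshold` of the smaller of the two.
    """
    links = set()
    for p, p_items in prev.items():
        for c, c_items in curr.items():
            smaller = min(len(p_items), len(c_items))
            # Skip empty clusters to avoid dividing by zero.
            if smaller and len(p_items & c_items) / smaller >= threshold:
                links.add((p, c))

    changes = {"appearance": [], "disappearance": [], "fusion": [], "split": []}

    # A previous cluster with no link has disappeared; with several, it split.
    for p in sorted(prev):
        targets = sorted(c for (q, c) in links if q == p)
        if not targets:
            changes["disappearance"].append(p)
        elif len(targets) > 1:
            changes["split"].append((p, targets))

    # A current cluster with no link is new; with several, it is a fusion.
    for c in sorted(curr):
        sources = sorted(p for (p, d) in links if d == c)
        if not sources:
            changes["appearance"].append(c)
        elif len(sources) > 1:
            changes["fusion"].append((sources, c))

    return changes


if __name__ == "__main__":
    # Toy memberships for two consecutive time intervals (made-up data).
    previous = {"A": {1, 2, 3, 4, 5, 6}, "B": {7, 8, 9},
                "C": {10, 11, 12}, "D": {13, 14}}
    current = {"a1": {1, 2, 3}, "a2": {4, 5, 6},
               "bc": {7, 8, 9, 10, 11, 12}, "new": {20, 21}}
    print(detect_changes(previous, current))
```

A cluster linked to exactly one counterpart is treated as surviving and is not reported. Real usage data rarely contains the same visitors in every interval, so a practical variant would have to compare clusters by some other means, such as their prototypes.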

    • Published in

      IIWeb '12: Proceedings of the Ninth International Workshop on Information Integration on the Web
      May 2012
      47 pages
      ISBN: 9781450312394
      DOI: 10.1145/2331801

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
