research-article

CAMEUD: clustering approach for mining evolving usage data

Authors:
Alzennyr Da Silva

EDF R&D, Clamart, France

Yves Lechevallier

AxIS project, INRIA, France

Francisco A. T. de Carvalho

Cln, UFPE, Brazil


IIWeb '12: Proceedings of the Ninth International Workshop on Information Integration on the Web, May 2012, Article No.: 3, Pages 1–6, https://doi.org/10.1145/2331801.2331804

Published: 20 May 2012

ABSTRACT

The growing number of traces left behind by user transactions on the Internet (e.g. customer purchases, user navigation) has increased the importance of Web usage data analysis. A notable challenge for this analysis is that the way a website is visited can evolve over time. As a result, usage models must be continuously updated to reflect the current behaviour of visitors. In this article, we introduce CAMEUD, a clustering approach to mine and detect changes in evolving usage data. The proposed approach is independent of the clustering algorithm used, and it can both detect changes in the usage groups and determine their nature (appearance, disappearance, fusion and split) over successive time intervals. Experiments on synthetic and real usage data sets evaluate the efficiency of CAMEUD.
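The four kinds of change named in the abstract can be illustrated with a small sketch. This is not the procedure defined in the paper, whose details are not given here; it is a simple overlap heuristic, assuming that the same set of items has been labelled twice (once by the previous usage groups, once by the current ones). The function name `detect_changes` and the threshold `tau` are hypothetical choices made for this illustration.

```python
from collections import Counter, defaultdict


def detect_changes(old_labels, new_labels, tau=0.5):
    """Classify cluster transitions between two partitions of the same items.

    old_labels[i] / new_labels[i] are the cluster ids of item i in the
    previous and the current partition. tau is an assumed overlap threshold
    (illustrative only, not taken from the paper).
    """
    if len(old_labels) != len(new_labels):
        raise ValueError("both partitions must label the same items")

    old_sizes = Counter(old_labels)
    new_sizes = Counter(new_labels)
    overlap = Counter(zip(old_labels, new_labels))

    # children[o]: new clusters that draw at least tau of their members from o
    # sources[n]:  old clusters that send at least tau of their members to n
    children = defaultdict(list)
    sources = defaultdict(list)
    for (o, n), count in overlap.items():
        if count / new_sizes[n] >= tau:
            children[o].append(n)
        if count / old_sizes[o] >= tau:
            sources[n].append(o)

    # An old cluster with several children is split; a new cluster with
    # several sources is a fusion.
    split = {o: sorted(c) for o, c in children.items() if len(c) >= 2}
    fusion = {n: sorted(s) for n, s in sources.items() if len(s) >= 2}

    absorbed = {o for olds in sources.values() for o in olds}
    derived = {n for news in children.values() for n in news}

    return {
        "split": split,
        "fusion": fusion,
        # old clusters whose members were scattered across the new partition
        "disappearance": sorted(
            o for o in old_sizes if o not in absorbed and o not in split
        ),
        # new clusters that no single old cluster accounts for
        "appearance": sorted(
            n for n in new_sizes if n not in derived and n not in fusion
        ),
    }


if __name__ == "__main__":
    previous = [0, 0, 0, 0, 1, 1, 2, 2]
    current = ["a", "a", "b", "b", "c", "c", "c", "c"]
    print(detect_changes(previous, current))
```

Because the sketch only compares label vectors, it works with any clustering algorithm that produced them, which mirrors the algorithm independence claimed in the abstract. The threshold `tau` controls how strict the matching is and would need tuning on real data.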

Supplemental Material

a3-silva.zip (37.4 KB): supplemental file.

Index Terms

Information systems > Information systems applications > Data mining

Published in
IIWeb '12: Proceedings of the Ninth International Workshop on Information Integration on the Web
May 2012
47 pages
ISBN:9781450312394
DOI:10.1145/2331801
General Chairs:
Ullas Nambiar, EMC India COE, Bangalore
Zaiqing Nie, Microsoft Research Asia, Beijing
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
change detection
clustering
evolving data
web usage mining (WUM)