Discovering user profiles for Web personalized recommendation

  • Knowledge and Data Processing
Journal of Computer Science and Technology

Abstract

With the growing popularity of the World Wide Web, a large volume of user access data has been gathered automatically by Web servers and stored in Web logs. Discovering and understanding user behavior patterns from these log files can support personalized Web recommendation services. This paper presents a novel clustering method for log files, Clustering large Weblog based on Key Path Model (CWKPM), which uses a key path model of user browsing to derive user behavior profiles. Compared with the previous Boolean model, the key path model takes into account the major features of users' Web access: it is ordinal, contiguous and may contain duplicates. Moreover, the model has fewer dimensions for clustering. Analysis and experiments show that CWKPM is an efficient and effective approach for clustering large, high-dimensional Web logs.
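
The contrast between a Boolean session model and an order-preserving path model can be illustrated with a minimal Python sketch. This is not the CWKPM algorithm, whose details are not given in the abstract; the function names, the page labels and the use of fixed-length contiguous subpaths are assumptions made purely for illustration.

```python
from collections import Counter


def boolean_vector(session, pages):
    """Boolean model: 1 if the page appears anywhere in the session, else 0.

    Visit order, adjacency and repeat visits are all discarded.
    """
    visited = set(session)
    return [1 if page in visited else 0 for page in pages]


def contiguous_subpaths(session, length):
    """Hypothetical path-style features: count each contiguous run of pages.

    Order, contiguity and duplicates are all preserved in the counts.
    """
    return Counter(
        tuple(session[i:i + length])
        for i in range(len(session) - length + 1)
    )


# Two illustrative sessions over the same three pages (made-up data).
pages = ["A", "B", "C"]
session_1 = ["A", "B", "C", "B"]
session_2 = ["C", "B", "A", "B"]

# The Boolean model cannot tell the two sessions apart.
print(boolean_vector(session_1, pages) == boolean_vector(session_2, pages))

# The subpath counts differ, because the pages were visited in a different order.
print(contiguous_subpaths(session_1, 2) == contiguous_subpaths(session_2, 2))
```

Both sessions map to the same Boolean vector, while their contiguous subpaths differ, which is the kind of ordering information the abstract says the Boolean model misses.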

Author information

Corresponding author

Correspondence to Ai-Bo Song.

Additional information

This work is supported by the Special Program “Network-based Science Activity Environment” of the National Natural Science Foundation of China, and by the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201.

Ai-Bo Song received the B.S. and M.S. degrees from the School of Information and Engineering, Shandong University of Science and Technology, in 1993 and 1996, respectively. He is currently a Ph.D. candidate in the Department of Computer Science and Engineering at Southeast University. His current research interests include data mining, data warehousing and Petri nets.

Mao-Xian Zhao is a Ph.D. candidate in the School of Transportation and Traffic at Northern Jiaotong University. His current research interests are optimization and its applications, and algorithm analysis.

Zuo-Peng Liang is a Ph.D. candidate in the Department of Computer Science and Engineering at Southeast University. His current research interests include data mining and XML data management.

Yi-Sheng Dong received the B.S. degree from the Department of Computer Science and Engineering, Southeast University, in 1965. Since then, he has been with Southeast University. His main research interests are database and software technology.

Jun-Zhou Luo is a professor in the Department of Computer Science and Engineering, Southeast University, the secretary-general of the Petri Net Committee of the China Computer Federation, and an active member of the New York Academy of Sciences. His current research interests include Petri-net-based protocol engineering, computer networks, and concurrent engineering.

About this article

Cite this article

Song, AB., Zhao, MX., Liang, ZP. et al. Discovering user profiles for Web personalized recommendation. J. Comput. Sci. & Technol. 19, 320–328 (2004). https://doi.org/10.1007/BF02944902

  • DOI: https://doi.org/10.1007/BF02944902
