Web User Grouping Based on Navigation Patterns Through Pair Wise Sequence Alignment and Breadth First Search

  • Conference paper
ICDSMLA 2019

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 601)


Abstract

User grouping has attracted growing interest among web researchers because it allows users to be offered a customized and better environment. This paper attempts to find user groups from their web page navigation patterns. The methodology incorporates five different ways of interpreting the input navigation sequence and investigates whether the interpretation influences the results. For each interpretation, pair-wise sequence alignment is carried out to determine the number of aligned pages. A similarity matrix is then formulated from the number of aligned pages and the number of unique pages between the users. The matrix is thresholded using the maximum value of each column (column thresholding), after which breadth first search is performed to identify the user groups. The publicly available MSNBC dataset is used to assess the proposed methodology. The Jaccard similarity coefficient is computed to measure the inter-group similarity. The values show that the sorted input navigation sequence without redundancy provides effective clustering.
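The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the alignment scoring scheme, the exact similarity formula, or the thresholding details, so the choices below are assumptions. Specifically, the alignment scores a match as 1 and mismatches and gaps as 0, similarity is taken as aligned pages divided by the number of unique pages of the two users, and an edge is kept wherever an entry equals its column maximum. The function names and the sample sessions are hypothetical. Only the "sorted, without redundancy" interpretation named in the abstract is shown.

```python
from collections import deque


def sorted_unique(session):
    """Interpretation from the abstract: sorted sequence without redundancy."""
    return sorted(set(session))


def aligned_pages(a, b):
    """Number of identical pages matched by a global pairwise alignment.

    Assumed scoring: match = 1, mismatch = 0, gap = 0, so the optimal
    alignment score equals the count of aligned pages.
    """
    score = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = score[i - 1][j - 1] + (1 if a[i - 1] == b[j - 1] else 0)
            score[i][j] = max(diag, score[i - 1][j], score[i][j - 1])
    return score[-1][-1]


def similarity_matrix(sessions):
    """Assumed formula: aligned pages / unique pages across both users."""
    n = len(sessions)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            unique = len(set(sessions[i]) | set(sessions[j]))
            if i != j and unique:
                sim[i][j] = aligned_pages(sessions[i], sessions[j]) / unique
    return sim


def column_threshold(sim):
    """Keep, per column, only the entries equal to the (non-zero) column maximum."""
    n = len(sim)
    adjacency = {i: set() for i in range(n)}
    for j in range(n):
        col_max = max(sim[i][j] for i in range(n))
        if col_max == 0:
            continue  # user j is similar to nobody
        for i in range(n):
            if sim[i][j] == col_max:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def bfs_groups(adjacency):
    """Connected components of the thresholded graph via breadth first search."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(group))
    return groups


def jaccard(a, b):
    """Jaccard similarity coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def group_users(sessions):
    prepared = [sorted_unique(s) for s in sessions]
    return bfs_groups(column_threshold(similarity_matrix(prepared)))


# Hypothetical sessions of page-category IDs (not taken from the MSNBC data).
sessions = [[1, 2, 3, 2], [3, 1, 2], [7, 8, 8], [8, 7, 9]]
groups = group_users(sessions)
print(groups)  # [[0, 1], [2, 3]]
```

For the two groups found here, the pages visited are {1, 2, 3} and {7, 8, 9}, so `jaccard` between them returns 0.0. A low inter-group value of this kind is what well-separated groups would be expected to show.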

Author information

Corresponding author

Correspondence to P. Revathy.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Geetharamani, R., Revathy, P. (2020). Web User Grouping Based on Navigation Patterns Through Pair Wise Sequence Alignment and Breadth First Search. In: Kumar, A., Paprzycki, M., Gunjan, V. (eds) ICDSMLA 2019. Lecture Notes in Electrical Engineering, vol 601. Springer, Singapore. https://doi.org/10.1007/978-981-15-1420-3_180
