DOI: 10.1145/2505515.2505535
research-article

Overlapping community detection using seed set expansion

Published: 27 October 2013

ABSTRACT

Community detection is an important task in network analysis. A community (also referred to as a cluster) is a set of cohesive vertices that have more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. One of the most successful techniques for finding overlapping communities is based on local optimization and expansion of a community metric around a seed set of vertices. In this paper, we propose an efficient overlapping community detection algorithm using a seed set expansion approach. In particular, we develop new seeding strategies for a personalized PageRank scheme that optimizes the conductance community score. The key idea of our algorithm is to find good seeds, and then expand these seed sets using the personalized PageRank clustering procedure. Experimental results show that this seed set expansion approach outperforms other state-of-the-art overlapping community detection methods. We also show that our new seeding strategies are better than previous strategies, and are thus effective in finding good overlapping clusters in a graph.
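The expansion step described above can be illustrated with a short sketch. The abstract does not give the seeding strategies or the exact expansion procedure, so the code below is only an assumption-laden illustration: it uses the push-style approximate personalized PageRank of Andersen, Chung, and Lang followed by a sweep cut that keeps the lowest-conductance prefix. The function names, the teleport probability `alpha`, the tolerance `eps`, and the toy graph are all hypothetical choices, not values from the paper.

```python
from collections import deque


def conductance(adj, nodes):
    """cut(S) / min(vol(S), vol(V - S)) for an undirected, unweighted graph."""
    s = set(nodes)
    vol_s = sum(len(adj[u]) for u in s)
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    denom = min(vol_s, vol_total - vol_s)
    return cut / denom if denom > 0 else 1.0


def approx_ppr(adj, seeds, alpha=0.15, eps=1e-5):
    """Push-style approximate personalized PageRank (lazy random walk).

    Assumes no self-loops and no isolated vertices. alpha is the teleport
    probability; a vertex is pushed while its residual is >= eps * degree.
    """
    p = {}
    r = {s: 1.0 / len(seeds) for s in seeds}
    queue = deque(s for s in seeds if r[s] >= eps * len(adj[s]))
    while queue:
        u = queue.popleft()
        deg = len(adj[u])
        ru = r.get(u, 0.0)
        if ru < eps * deg:
            continue  # stale queue entry
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1 - alpha) * ru / 2
        share = (1 - alpha) * ru / (2 * deg)
        for v in adj[u]:
            old = r.get(v, 0.0)
            r[v] = old + share
            # enqueue v only when its residual crosses the push threshold
            if old < eps * len(adj[v]) <= r[v]:
                queue.append(v)
        if r[u] >= eps * deg:
            queue.append(u)
    return p


def sweep_cut(adj, p):
    """Scan vertices by p[u] / degree and keep the lowest-conductance prefix."""
    order = sorted(p, key=lambda u: p[u] / len(adj[u]), reverse=True)
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    in_set, vol, cut = set(), 0, 0
    best, best_phi = [], float("inf")
    for i, u in enumerate(order):
        in_set.add(u)
        vol += len(adj[u])
        inside = sum(1 for v in adj[u] if v in in_set)
        cut += len(adj[u]) - 2 * inside  # new cut edges minus absorbed ones
        denom = min(vol, vol_total - vol)
        if denom == 0:
            break  # prefix covers the whole graph
        phi = cut / denom
        if phi < best_phi:
            best_phi, best = phi, order[: i + 1]
    return set(best), best_phi


def expand_seed(adj, seeds, alpha=0.15, eps=1e-5):
    """Grow one community around a seed set."""
    return sweep_cut(adj, approx_ppr(adj, seeds, alpha, eps))


# Toy graph (hypothetical): two 4-cliques joined by the single edge 3-4.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
community, phi = expand_seed(adj, [0])
print(sorted(community), phi)
```

Running `expand_seed` once per seed set gives one community per seed; since each expansion is independent, the resulting communities are free to share vertices, which is how a scheme of this kind produces overlap.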

Published in

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

CIKM '13 paper acceptance rate: 143 of 848 submissions, 17%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
