ABSTRACT
The processes by which communities come together, attract new members, and develop over time are a central research issue in the social sciences: political movements, professional organizations, and religious denominations all provide fundamental examples of such communities. In the digital domain, on-line groups are becoming increasingly prominent due to the growth of community and social networking sites such as MySpace and LiveJournal. However, the challenge of collecting and analyzing large-scale time-resolved data on social groups and communities has left most basic questions about the evolution of such groups largely unresolved: what are the structural features that influence whether individuals will join communities, which communities will grow rapidly, and how do the overlaps among pairs of communities change over time?

Here we address these questions using two large sources of data: friendship links and community membership on LiveJournal, and co-authorship and conference publications in DBLP. Both of these datasets provide explicit user-defined communities, where conferences serve as proxies for communities in DBLP. We study how the evolution of these communities relates to properties such as the structure of the underlying social networks. We find that the propensity of individuals to join communities, and of communities to grow rapidly, depends in subtle ways on the underlying network structure. For example, the tendency of an individual to join a community is influenced not just by the number of friends he or she has within the community, but also crucially by how those friends are connected to one another. We use decision-tree techniques to identify the most significant structural determinants of these properties. We also develop a novel methodology for measuring movement of individuals between communities, and show how such movements are closely aligned with changes in the topics of interest within the communities.
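To make the two structural quantities in the example above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's feature set or code: the function name `membership_features`, the dictionary-of-sets graph representation, and the toy data are all hypothetical. For a given individual it counts the friends who already belong to a community and measures what fraction of pairs of those friends are themselves friends.

```python
from itertools import combinations


def membership_features(user, friends, community):
    """Return (k, connected_fraction) for `user` relative to `community`.

    Assumed representation (hypothetical, for illustration only):
      friends   -- dict mapping each person to the set of their friends,
                   with every friendship stored in both directions
      community -- set of current community members
    """
    # Friends of the user who are already members of the community.
    inside = sorted(friends.get(user, set()) & community)
    k = len(inside)

    # Fraction of pairs of those friends who are friends with each other.
    pairs = list(combinations(inside, 2))
    linked = sum(1 for a, b in pairs if b in friends.get(a, set()))
    connected_fraction = linked / len(pairs) if pairs else 0.0
    return k, connected_fraction


# Toy, made-up friendship graph: "u" has three friends in the community,
# and only one pair of them ("a", "b") know each other.
friends = {
    "u": {"a", "b", "c", "x"},
    "a": {"u", "b"},
    "b": {"u", "a"},
    "c": {"u"},
    "x": {"u"},
}
community = {"a", "b", "c"}

k, connected_fraction = membership_features("u", friends, community)
print(k, connected_fraction)
```

Features of this kind could be fed to a decision-tree learner of the sort the abstract mentions; which features the authors actually used is not stated here.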
Index Terms
- Group formation in large social networks: membership, growth, and evolution