DANCer: dynamic attributed networks with community structure generation

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Most networks, such as those generated from social media, tend to evolve gradually with frequent changes in the activity and the interactions of their participants. Furthermore, the communities inside the network can grow, shrink, merge, or split, and the entities can move from one community to another. The aim of dynamic community detection methods is precisely to detect the evolution of these communities. However, evaluating these algorithms requires tests on real or artificial networks with verifiable ground truth. Dynamic network generators have recently been proposed for this task, but most of them consider only the structure of the network, disregarding the characteristics of the nodes. In this paper, we propose a new generator for dynamic attributed networks with community structure that follow the properties of real-world networks. The evolution of the network is performed using two kinds of operations: micro-operations are applied to the edges and vertices, while macro-operations are applied to the communities. Moreover, the properties of real-world networks, such as preferential attachment and homophily, are preserved during the evolution of the network, as confirmed by our experiments.


Notes

  1. http://perso.univ-st-etienne.fr/largeron/DANC_Generator/.

References

  1. Akoglu L, Faloutsos C (2009) RTG: a recursive realistic graph generator using random typing. Data Min Knowl Discov 19(2):194–209

  2. Akoglu L et al (2008) RTM: laws and a recursive generator for weighted time-evolving graphs. In: Eighth IEEE international conference on data mining, 2008 (ICDM’08). IEEE, pp 701–706

  3. Albert R, Barabási A-L (2002) Statistical mechanics of complex networks. Rev Mod Phys 74(1):47–97

  4. Amaral LAN et al (2000) Classes of small-world networks. Proc Natl Acad Sci 97(21):11149–11152

  5. Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512

  6. Benson AR et al (2014) Learning multifractal structure in large networks. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1326–1335

  7. Chung F, Lu L (2002) The average distances in random graphs with given expected degrees. Proc Natl Acad Sci 99(25):15879–15882

  8. Dang TA (2012) Analysis of communities in social networks. Ph.D. thesis, Université Paris 13

  9. Easley D, Kleinberg J (2010) Networks, crowds and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge

  10. Erdős P, Rényi A (1960) On the evolution of random graphs. Publ Math Inst Hung Acad Sci 5:17–61

  11. Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826

  12. Gong NZ et al (2012) Evolution of social-attribute networks: measurements, modeling, and implications using Google+. In: ACM conference on internet measurement conference (IMC). ACM, pp 131–144

  13. Görke R et al (2012) An efficient generator for clustered dynamic random networks. Springer, Berlin

  14. Görke R, Staudt C (2009) A generator for dynamic clustered random graphs. Tech. rep., ITI Wagner, Department of Informatics, Universität Karlsruhe. Informatik, Uni Karlsruhe, TR 2009-7

  15. Granell C et al (2015) A benchmark model to assess community structure in evolving networks. CoRR arXiv:1501.05808

  16. Holland PW, Leinhardt S (1971) Transitivity in structural models of small groups. Comp Group Stud 2:107–124

  17. Kim M, Leskovec J (2012) Multiplicative attribute graph model of real-world networks. Internet Math 8(1–2):113–160

  18. Lancichinetti A, Fortunato S (2009) Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys Rev E 80(1):016118

  19. Lancichinetti A et al (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110

  20. Largeron C et al (2015) Generating attributed networks with communities. PLoS ONE 10(4):e0122777

  21. Lazarsfeld PF, Merton RK (1954) Friendship as a social process: a substantive and methodological analysis. Freedom Control Mod Soc 18(1):18–66

  22. Leskovec J et al (2008) Microscopic evolution of social networks. In: ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 462–470

  23. Leskovec J et al (2005a) Realistic, mathematically tractable graph generation and evolution, using Kronecker multiplication. In: Knowledge discovery in databases: PKDD 2005. Springer, Berlin, pp 133–145

  24. Leskovec J et al (2010) Kronecker graphs: an approach to modeling networks. J Mach Learn Res 11:985–1042

  25. Leskovec J et al (2005b) Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, pp 177–187

  26. McPherson M et al (2001) Birds of a feather: homophily in social networks. Annu Rev Sociol 27(1):415–444

  27. Milgram S (1967) The small-world problem. Psychol Today 2:60–67

  28. Newman ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104

  29. Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113

  30. Palla G et al (2010) Multifractal network generator. Proc Natl Acad Sci 107(17):7640–7645

  31. Pfeiffer JJ III et al (2014) Attributed graph models: modeling network structure with correlated attributes. In: Proceedings of the 23rd international conference on World Wide Web. ACM, pp 831–842

  32. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442

  33. Wong LH et al (2006) A spatial model for social networks. Phys A Stat Mech Its Appl 360(1):99–120

Author information

Corresponding author

Correspondence to C. Largeron.

Appendices

Appendix 1: Additional functions

See Table 4.

Table 4 Additional functions used in the algorithms

Appendix 2: User manual

The DANCer software and a detailed user manual are available at http://perso.univ-st-etienne.fr/largeron/DANC_Generator/. The user interface of the DANCer generator consists of three views, as shown in Fig. 9.

1.1 Graph parameters

The parameters are set in the left panel. They correspond to the parameters of the algorithms given in Table 2 and are detailed below.

1.1.1 Communities

  • K : Number of communities in the first graph;

  • n : Number of vertices in the first graph;

  • Nb. Rep. : Number of representatives in each community. The higher the value, the slower the computation;

  • Theta : Percentage of vertices assigned to a random community. The higher this value, the less homogeneous the communities are likely to be w.r.t. the attributes.

1.1.2 Attributes

  • Nb. Attr. : Number of real-valued attributes associated with the vertices. Each attribute follows a centered normal distribution (mean equal to 0);

  • Dev. i : Standard deviation of the ith attribute.
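As an illustration of the two attribute parameters, the following minimal sketch draws one attribute vector per vertex, with the ith attribute sampled from a normal distribution of mean 0 and standard deviation Dev. i. The function name `sample_attributes` and its arguments are hypothetical and are not taken from the DANCer code; the sketch covers only the distribution stated above, not the way DANCer assigns vertices to communities.

```python
import random


def sample_attributes(n_vertices, std_devs, seed=None):
    """Return one attribute vector per vertex.

    Attribute i is drawn from N(0, std_devs[i]^2), matching the
    'Nb. Attr.' (= len(std_devs)) and 'Dev. i' parameters.
    All names here are illustrative assumptions.
    """
    rng = random.Random(seed)  # private generator, so the seed is reproducible
    return [[rng.gauss(0.0, s) for s in std_devs] for _ in range(n_vertices)]


# Two attributes with standard deviations 1.0 and 2.5 for five vertices.
attrs = sample_attributes(n_vertices=5, std_devs=[1.0, 2.5], seed=42)
for vertex_id, values in enumerate(attrs):
    print(vertex_id, values)
```

Calling the function twice with the same seed yields identical vectors, which mirrors the role of the seed parameter described in the network reproduction section.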

1.1.3 Edges

  • Edges Within : Maximum number of within community edges added to a newly inserted vertex;

  • Edges Between : Maximum number of between community edges added to a newly inserted vertex;

  • MTE : Minimum number of edges in the resulting graph (up to a graph where communities are cliques).

1.1.4 Micro-dynamic

  • Proba Micro : The probability to perform a micro-update operation;

  • Add Vertex : The ratio of vertices created at each timestamp. When set to 1, the number of vertices is doubled at each timestamp;

  • Remove Vertex : The ratio of vertices removed at each timestamp;

  • Update Attr. : The ratio of vertices having their attribute values updated;

  • Add Btw. Edges : The ratio of edges inserted connecting two vertices in different communities;

  • Remove Btw. Edges : The ratio of edges removed connecting two vertices in different communities;

  • Add Wth. Edges : The ratio of edges inserted connecting two vertices in the same community;

  • Remove Wth. Edges : The ratio of edges removed connecting two vertices in the same community.
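To make the ratio semantics concrete, the sketch below tracks the number of vertices over a few timestamps from the Add Vertex and Remove Vertex ratios. It assumes that both ratios are taken relative to the vertex count at the start of the timestamp and that counts are rounded to the nearest integer; neither detail is stated above, and the function name is hypothetical.

```python
def vertices_after_timestamp(n, add_ratio, remove_ratio):
    """Vertex count after one timestamp (illustrative assumption).

    Both ratios are applied to the count n at the start of the
    timestamp; the rounding rule is an assumption of this sketch.
    """
    added = round(add_ratio * n)
    removed = round(remove_ratio * n)
    return n + added - removed


def vertex_trajectory(n0, add_ratio, remove_ratio, timestamps):
    """Vertex counts for the initial graph and each following timestamp."""
    counts = [n0]
    for _ in range(timestamps):
        counts.append(vertices_after_timestamp(counts[-1], add_ratio, remove_ratio))
    return counts


# With Add Vertex = 1 and no removal, the vertex count doubles each timestamp.
print(vertex_trajectory(100, add_ratio=1.0, remove_ratio=0.0, timestamps=3))
```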

1.1.5 Macro-dynamic

  • Timestamps : The number of timestamps (i.e., the number of single graphs generated to form the dynamic network);

  • Proba Merge : The probability to perform a merge operation at a single timestamp (i.e., merging two communities into a single one);

  • Proba Split : The probability to perform a split operation at a single timestamp (i.e., splitting one community into two);

  • Proba Migrate : The probability to perform a migrate operation at a single timestamp (i.e., migrating vertices from a community to either a new or an existing community).
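One way to read the three macro-dynamic probabilities is as independent draws made at every timestamp. The sketch below follows that reading, which is an assumption: the text above does not say whether the operations are drawn independently or are mutually exclusive. The function name and the operation labels are hypothetical.

```python
import random


def draw_macro_operations(p_merge, p_split, p_migrate, timestamps, seed=None):
    """Return, for each timestamp, the list of macro-operations to apply.

    Assumption of this sketch: each operation is triggered by its own
    independent Bernoulli draw, so several may occur at one timestamp.
    """
    rng = random.Random(seed)
    probabilities = (("merge", p_merge), ("split", p_split), ("migrate", p_migrate))
    schedule = []
    for _ in range(timestamps):
        ops = [name for name, p in probabilities if rng.random() < p]
        schedule.append(ops)
    return schedule


# A schedule over 5 timestamps; the same seed always gives the same schedule.
for t, ops in enumerate(draw_macro_operations(0.2, 0.2, 0.5, timestamps=5, seed=7)):
    print(t, ops)
```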

1.1.6 Network reproduction

  • Seed parameter : The seed of the random number generator. Reusing the same seed with the same parameters reproduces exactly the same network.

1.2 Graph visualization and manipulation

The central part of the user interface, as shown in Fig. 9, displays the generated network and the changes in its communities at each time step. Each graph in the sequence can be viewed separately in the Graph tab. The sequence of graphs can also be browsed with the timestamp scrollbar on the right side of the panel.

For each plotted graph, the Graph View tab offers several options (see Fig. 32), for example to hide or display the edges and vertices through the Graph View section of the right side panel. The graph can be displayed with different layouts (Kamada-Kawai, Fruchterman-Reingold or self-organizing map), and the sizes of the plotted vertices can be chosen according to their degree, age or community membership. Moreover, the displayed vertices can be selected or filtered from the Select Vertices panel according to the events that affected them, as described in the micro-dynamic operations.

Fig. 32 Graph options panel

In the plotted graph, vertices of the same color are members of the same community. The user can interactively select or manipulate a vertex (or a group of vertices) using the cursor. The information for each node (id, degree, attributes) is displayed when the cursor points at a vertex.

The community dynamics (see Figs. 24, 25, 26) are available through the Community Dynamics tab, in the central part of the user interface. This tab displays the size and the evolution of the different communities in the sequence of graphs according to the macro-dynamic operations (split, merge and migrate).

1.3 Graph measures

Several measures, listed in Table 3, such as modularity or homophily, are computed on each graph of the dynamic network to describe its properties, notably P1, P2, P3, P4 and P5 detailed in Sect. 2. The changes in these measures over the sequence of graphs are presented at the bottom of the interface, as Fig. 9 shows.

1.3.1 Attribute measures

  • Observed homophily : Ratio of edges connecting similar vertices w.r.t. their attribute values;

  • Expected homophily : Ratio of pairs of similar vertices among all possible pairs of vertices;

    The difference between the expected and observed homophily measures whether vertices that are similar according to their attributes tend to be more connected than dissimilar vertices (cf. P5);

  • Within inertia : Measure of the dispersion of the attribute values inside the communities (cf. P4). A low within inertia indicates that the communities are highly homogeneous with regard to the attribute values.
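The three attribute measures can be sketched as follows. Several choices here are assumptions, since the definitions above leave them open: two vertices are treated as similar when the Euclidean distance between their attribute vectors is at most a user-chosen `threshold`, and the within inertia is computed as the sum of squared distances to the community centroids divided by the number of vertices. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def observed_homophily(edges, attrs, threshold):
    """Share of edges whose endpoints are similar (assumed: distance <= threshold)."""
    if not edges:
        return 0.0
    similar = sum(1 for u, v in edges if dist(attrs[u], attrs[v]) <= threshold)
    return similar / len(edges)


def expected_homophily(attrs, threshold):
    """Share of similar pairs among all possible pairs of vertices."""
    pairs = list(combinations(attrs, 2))
    if not pairs:
        return 0.0
    similar = sum(1 for u, v in pairs if dist(attrs[u], attrs[v]) <= threshold)
    return similar / len(pairs)


def within_inertia(attrs, community):
    """Mean squared distance of each vertex to its community centroid (assumed form)."""
    groups = {}
    for vertex, comm in community.items():
        groups.setdefault(comm, []).append(attrs[vertex])
    total = 0.0
    for members in groups.values():
        centroid = [sum(column) / len(members) for column in zip(*members)]
        total += sum(dist(m, centroid) ** 2 for m in members)
    return total / len(attrs)


# Toy graph: two tight attribute groups joined by a single edge.
attrs = {0: [0.0], 1: [0.1], 2: [5.0], 3: [5.1]}
community = {0: 0, 1: 0, 2: 1, 3: 1}
edges = [(0, 1), (2, 3), (1, 2)]
print(observed_homophily(edges, attrs, threshold=0.5))
print(expected_homophily(attrs, threshold=0.5))
print(within_inertia(attrs, community))
```

In this toy graph two of the three edges join similar vertices, while only two of the six possible pairs are similar, so the observed homophily exceeds the expected one.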

1.3.2 Structural measures

  • Modularity : gives the partition modularity measure as defined by [28] (cf. P3);

  • Average clustering coefficient : is given as an indication of the transitivity of connections in the network [32];

  • Random clustering coefficient : gives the clustering coefficient in an Erdős–Rényi random graph having the same number of vertices and edges;

    The network average clustering coefficient is a measure of the clustering tendency of the network (cf. P3). This observed value can be compared with the expected value computed on a random graph having the same number of vertices and edges: an observed value higher than the expected value supports the presence of a community structure;

  • Average degree : the average number of neighbors of the vertices (cf. P1);

  • Average shortest path length : the average minimum number of hops required to go from one vertex to another (cf. P2). It is not computed when the graph is formed by several disconnected components (i.e., \(E^{\mathrm{max}}_{\mathrm{btw}} = 0\));

  • Diameter : length of the longest shortest path between any pair of vertices (cf. P2);

  • Nb. edges between : number of edges connecting two vertices belonging to different communities;

  • Nb. edges within : number of edges connecting two vertices belonging to the same community (cf. P3);

  • Nb. edges : total number of edges in the graph, i.e., \(|\mathcal{E}|\).
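Several of the structural measures listed above can be computed directly from an edge list and a community assignment. The following sketch is self-contained and makes its own assumptions: the graph is simple and undirected with vertices numbered 0 to n-1 and at least one edge, modularity uses the standard Newman formulation, and the random clustering coefficient is taken to be the edge density 2m/(n(n-1)), which is the expected clustering coefficient of an Erdős–Rényi graph with the same number of vertices and edges. The function name and the dictionary keys are hypothetical.

```python
from itertools import combinations


def structural_measures(n, edges, community):
    """Structural measures for a simple undirected graph on vertices 0..n-1."""
    m = len(edges)
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Edges whose two endpoints share a community.
    within = sum(1 for u, v in edges if community[u] == community[v])

    # Newman modularity: sum over communities of L_c/m - (d_c/(2m))^2.
    degree_sum, within_per_comm = {}, {}
    for v in range(n):
        c = community[v]
        degree_sum[c] = degree_sum.get(c, 0) + len(adj[v])
    for u, v in edges:
        if community[u] == community[v]:
            c = community[u]
            within_per_comm[c] = within_per_comm.get(c, 0) + 1
    modularity = sum(within_per_comm.get(c, 0) / m - (d / (2 * m)) ** 2
                     for c, d in degree_sum.items())

    def local_clustering(v):
        k = len(adj[v])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    return {
        "nb_edges": m,
        "edges_within": within,
        "edges_between": m - within,
        "average_degree": 2 * m / n,
        "modularity": modularity,
        "average_clustering": sum(local_clustering(v) for v in range(n)) / n,
        "random_clustering": 2 * m / (n * (n - 1)),
    }


# Two triangles joined by one edge, one community per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
measures = structural_measures(6, edges, community=[0, 0, 0, 1, 1, 1])
for name, value in measures.items():
    print(name, value)
```

In this example the average clustering coefficient is well above the random clustering coefficient, which is the comparison described for P3.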

1.3.3 Degree distribution

The bottom of the user interface also includes a panel displaying the distribution of vertex degrees for each graph of the sequence, as shown in Fig. 33.

Fig. 33 Degree distribution panel

Table 5 Predefined benchmark profiles

1.4 Output files

The generated dynamic network can be saved as a collection of files, one for each time step, under the out directory located in the same working directory as the generator. For each graph of the sequence, the file with the extension “.graph” describes the composition of the graph (vertices and edges), and the “parameters” file lists all the parameters used by the generator.

  • Parameters : The parameters are written to a separate file. Each line gives a parameter name followed by its value.

  • Vertices : In the graph file, the vertices section starts with the line # Vertices. Each consecutive line describes a vertex. A line consists of an integer corresponding to the vertex id, the list of its attribute values separated by “; ” and an integer corresponding to the vertex community id.

  • Edges : This section starts with the line # Edges. Each consecutive line corresponds to an edge. A line is composed of two vertex ids separated by a “; ”.

  • Measures : The measures are saved in a separate file. Each line gives a measure name followed by its successive values at each time step.
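The layout of a “.graph” file described above can be read with a short parser. The sketch below assumes that every field on a vertex line (id, attribute values, community id) is separated by “;” and that any surrounding whitespace can be ignored; the exact spacing in real DANCer output is not confirmed here, and the sample content is invented for illustration.

```python
def parse_graph(text):
    """Parse the content of a '.graph' file into vertices and edges.

    Returns (vertices, edges), where vertices maps a vertex id to
    (attribute_values, community_id) and edges is a list of id pairs.
    The field layout is an assumption based on the description above.
    """
    vertices, edges = {}, []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            header = line.lstrip("#").strip().lower()
            section = header if header in ("vertices", "edges") else None
            continue
        fields = [f.strip() for f in line.split(";") if f.strip()]
        if section == "vertices":
            vertex_id, community_id = int(fields[0]), int(fields[-1])
            vertices[vertex_id] = ([float(x) for x in fields[1:-1]], community_id)
        elif section == "edges":
            edges.append((int(fields[0]), int(fields[1])))
    return vertices, edges


# Invented sample content following the described sections.
sample = """# Vertices
0; 0.12; -1.5; 0
1; 0.30; -1.1; 0
2; 2.40; 0.7; 1
# Edges
0; 1
1; 2
"""
vertices, edges = parse_graph(sample)
print(vertices)
print(edges)
```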

Appendix 3: Benchmark profiles

Table 5 presents a first network (Configuration 1, obtained with the parameters given in Table 6) that has a good community structure with respect to both the relationships and the attributes, followed by three other networks in which the link-based structure, the attribute-based structure, or both are weakened. Table 7 presents modifications related to the dynamicity of the first network. The parameter settings are given for each network, along with its characteristics (modularity and within inertia).

Table 6 Default parameters
Table 7 Benchmark profiles derived from configuration 1 by changing dynamicity

Cite this article

Largeron, C., Mougel, P.N., Benyahia, O. et al. DANCer: dynamic attributed networks with community structure generation. Knowl Inf Syst 53, 109–151 (2017). https://doi.org/10.1007/s10115-017-1028-2
