A complex network model for a society with socioeconomic classes

Abstract: People's attitudes and behaviors are partially shaped by the socioeconomic class to which they belong. In this work, a scale-free graph model is proposed to represent the daily personal contacts in a society with three social classes. In the model, the probability of a connection between two individuals depends on their social classes and on their physical distance. Numerical simulations are performed by considering sociodemographic data from France, Peru, and Zimbabwe. For the complex networks built for these three countries, average values of node degree, shortest-path length, clustering coefficient, closeness centrality, betweenness centrality, and eigenvector centrality are computed. These numerical results are discussed by taking into account the propagation of information about COVID-19.


Introduction
A social class can be simplistically defined as a group of individuals with similar socioeconomic status [1][2][3][4]. People's social class affects their habits, opportunities, relationships, traditions, and values. In addition, the features of the socioeconomic stratification of a society are relevant for governments implementing policies related to education, the labor market, public health, and public safety [1][2][3][4]. These features also influence, for instance, the propagation of a contagious disease like COVID-19 [5][6][7][8]. Usually, from an economic perspective, societies are stratified into lower class, middle class, and upper class [1][2][3][4]. In this manuscript, these three classes are taken into consideration in a scale-free network model proposed for representing daily personal contacts.
People primarily interact with family, friends, and neighbors, who usually belong to the same social class; however, face-to-face encounters among individuals belonging to different classes do occur, for instance, in the workplace, in a subway, in a shopping mall, or in a park. Interactions among distinct social classes can also occur in virtual environments [9].
Social contacts have been theoretically modeled by complex networks [10][11][12][13][14][15]. Two classic examples are the scale-free graphs representing exchanged e-mails [16] and human sexual contacts [17]. Recent applications deal with scheduling problems [18] and rumor propagation [19]. Usually, theoretical studies on social connectivity are based on the three main models of complex networks found in the literature. These well-known models were conceived by Erdős and Rényi, Watts and Strogatz, and Barabási and Albert [10][11][12][13][14][15]. Unfortunately, these models are not suitable for representing social interactions: the Erdős-Rényi network leads to a Poissonian degree distribution and a low average clustering coefficient; the Watts-Strogatz network leads to a Poissonian degree distribution; the Barabási-Albert network leads to a low average clustering coefficient [11,12,15]. An appropriate network model should present a scale-free degree distribution and a high average clustering coefficient [11,12,15]. The model proposed here presents these features.
Notice that a suitable model of social connectivity could be employed in studies on homophily. This sociological concept states that similarities among people facilitate the formation of social bonds [20]. Homophily has been investigated by analyzing, for instance, data from mobile phones in Singapore [21], the ethnoracial residential segregation in Detroit [22], friendship patterns in American high schools [23,24], and the sociocultural dimension in Dutch urban areas [25]. In these analyses, however, basic statistical measures commonly used to identify the network structure were not computed. For the model proposed here, these measures are computed for three countries and compared. There are also studies on homophily that incorporate game theory [26] and degree heterogeneity [27]; however, their theoretical predictions were not tested in real-world scenarios.
The aim of this work is to introduce a complex network model to represent the social connectivity of a community with socioeconomic classes. This model is inspired by the coupling pattern originally developed for studying the neurophysiological phenomenon called spreading depression [28] and also used in investigations on the spread of contagious diseases in a host population [29][30][31].
This manuscript is organized as follows. In Section 2, a new model of complex network is proposed. In Section 3, the topological structures of the networks built with sociodemographic data from France, Peru, and Zimbabwe are characterized by computing average values of node degree, shortest-path length, clustering coefficient, closeness centrality, betweenness centrality, and eigenvector centrality. In Section 4, the numerical results obtained in these computer simulations are discussed from a public health perspective, by taking into account the COVID-19 pandemic.

The model of complex network
Let a square lattice be composed of $\eta \times \eta$ cells, in which each cell corresponds to an individual. Thus, there are $N = \eta^2$ individuals in this society. In order to avoid edge effects, the top and bottom edges are connected and the left and right edges are also connected. Therefore, all individuals living in this lattice are equivalent from a geographical standpoint; that is, their spatial coordinates can be neglected. Consider that the index $\alpha = 1, 2, ..., N$ labels an individual belonging to the social class $x \in \{l, m, u\}$, in which $l$ denotes lower class, $m$ middle class, and $u$ upper class. Undirected connections between individuals are created by a random process, in which the $\alpha$-th individual is connected to $k_\alpha$ others placed within the square matrix of size $2r + 1$ centered on this individual (self-connections and multiple connections are not allowed). Here, for the $\alpha$-th individual, a number $\sigma_\alpha$ is randomly picked from the standard uniform distribution. Then, the value of $k_\alpha$ is obtained from $\rho(k_\alpha) = \sigma_\alpha$, in which $\rho(\theta)$ is a power law given by $\rho(\theta) \propto \theta^{-\delta}$ with $\delta = 2.5$ (because the degree distribution for most social networks has $2 \le \delta \le 3$ [11,12,15]). The value of $\delta$ remains fixed and is the same for the three countries. Also, the minimum and maximum degrees of the degree distribution must be suitably chosen in order to adjust the average degree of the model to the average degree found in the real-world populations.
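The connected edges and the square neighborhood of size $2r + 1$ can be illustrated with a short sketch. The function names `wrapped_gap` and `neighborhood` are hypothetical helpers introduced only for this illustration, and cells are assumed to be addressed by (row, column) pairs starting at zero; the original work does not specify an implementation.

```python
def wrapped_gap(u, v, eta):
    """Shortest separation between coordinates u and v on a ring of eta cells."""
    gap = abs(u - v) % eta
    return min(gap, eta - gap)


def neighborhood(center, r, eta):
    """Cells of the (2r+1) x (2r+1) block centered on `center`, center excluded.

    Indices are taken modulo eta, so the block wraps around the lattice edges.
    """
    row, col = center
    return {
        ((row + dr) % eta, (col + dc) % eta)
        for dr in range(-r, r + 1)
        for dc in range(-r, r + 1)
        if (dr, dc) != (0, 0)
    }


# A corner cell has as many candidate partners as any interior cell.
corner = neighborhood((0, 0), r=2, eta=100)
print(len(corner))
print((99, 99) in corner)
print(wrapped_gap(0, 99, 100))
```

Because every index is reduced modulo $\eta$, no cell is treated differently from any other, which is the sense in which the spatial coordinates can be neglected.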
In the proposed model, the probability of linking two individuals depends on the distance between them and on their social classes as follows. The distance between the individuals is taken into account in the term $q_\alpha^i$, which is the probability of creating a connection between the $\alpha$-th individual and any individual at the $i$-th layer of the square matrix of size $2r + 1$ centered on this $\alpha$-th individual. Here, $q_\alpha^i$ is obtained from:
$$q_\alpha^i = \frac{2(r + 1 - i)}{r(r + 1)} \qquad (2.1)$$
with $i = 1, 2, ..., r$ and $\sum_{i=1}^{r} q_\alpha^i = 1$. The $i$-th layer is formed by individuals with Moore radius equal to $i$ [32]. For instance, for $r = 2$, the square matrix centered on the $\alpha$-th individual is $5 \times 5$. Therefore, there are 8 individuals in the layer $i = 1$ and 16 individuals in the layer $i = 2$ (8 + 16 plus the central $\alpha$-th individual is equal to $5 \times 5 = 25$ individuals). For $r = 2$, Eq. (2.1) gives $q_\alpha^1 = 2/3$ and $q_\alpha^2 = 1/3$; thus, the chance of connecting the $\alpha$-th individual to any of the 8 individuals forming the layer $i = 1$ is 2/3 and to any of the 16 individuals forming the layer $i = 2$ is 1/3. Table 1 illustrates an individual with six neighbors in a lattice with $r = 2$.
Table 1. A $5 \times 5$ block of a lattice with $r = 2$ showing the neighborhood of a single cell. In this example, the central cell (white) has four neighbors in the layer $i = 1$ (light gray) and two neighbors in the layer $i = 2$ (dark gray). Recall that $m$ denotes middle class and $l$ denotes lower class. The empty cells are occupied by individuals that are not neighbors of the central cell; hence, their social classes are omitted. In this model, the probability of two cells becoming connected (neighbors) is given by Eq. (2.3), which depends on their distance according to Eq. (2.1) and on their social classes according to Eq. (2.2).
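The layer probabilities and layer sizes quoted for $r = 2$ can be checked numerically. The sketch below assumes the linearly decreasing form $q_\alpha^i = 2(r + 1 - i)/(r(r + 1))$, which reproduces the values 2/3 and 1/3 given in the text and sums to one over the $r$ layers; the function names are hypothetical.

```python
from fractions import Fraction


def layer_probability(i, r):
    """Assumed form of Eq. (2.1): probability of picking layer i out of r layers."""
    if not 1 <= i <= r:
        raise ValueError("layer index must satisfy 1 <= i <= r")
    return Fraction(2 * (r + 1 - i), r * (r + 1))


def layer_size(i):
    """Number of cells at Moore radius i: (2i+1)^2 - (2i-1)^2 = 8i."""
    return (2 * i + 1) ** 2 - (2 * i - 1) ** 2


r = 2
probs = [layer_probability(i, r) for i in range(1, r + 1)]
sizes = [layer_size(i) for i in range(1, r + 1)]
print(probs, sum(probs), sizes)
```

Exact fractions are used so that the normalization can be verified without rounding error.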
Let $n_\alpha^y$ be the number of neighbors of the $\alpha$-th individual belonging to the social class $y \in \{l, m, u\}$; thus, $n_\alpha^l$ is the number of lower-class neighbors, $n_\alpha^m$ the number of middle-class neighbors, and $n_\alpha^u$ the number of upper-class neighbors. Evidently, $n_\alpha^l + n_\alpha^m + n_\alpha^u = k_\alpha$. In the creation of the complex network, the social classes of the individuals are taken into account in the term $s_\alpha^{xy}$ defined as:
$$s_\alpha^{xy} = \frac{w_{xy}\, n_\alpha^{y}}{w_{xl}\, n_\alpha^{l} + w_{xm}\, n_\alpha^{m} + w_{xu}\, n_\alpha^{u}} \qquad (2.2)$$
in which $w_{xy}$ is a weighting factor that depends on the country where these people live. The higher the value of $w_{xy}$, the higher the probability of two individuals of the classes $x$ and $y$ being socially connected. Recall that the $\alpha$-th individual belongs to the social class $x$. Obviously, $\sum_{y \in \{l,m,u\}} s_\alpha^{xy} = 1$.
For instance, assume that the $\alpha$-th individual belongs to the middle class (that is, $x = m$) and $n_\alpha^l = 2$, $n_\alpha^m = 4$, and $n_\alpha^u = 0$ (that is, this individual has two lower-class neighbors and four middle-class neighbors), as in the example shown in Table 1. Also, assume that in the region where they live, $w_{ml} = 4$, $w_{mm} = 8$, and $w_{mu} = 1$. For this $\alpha$-th individual, then $s_\alpha^{ml} = 1/5$, $s_\alpha^{mm} = 4/5$, and $s_\alpha^{mu} = 0$. In the proposed model, the probability $Q_\alpha^j$ of the $\alpha$-th individual of the class $x$ being connected to an individual of the class $y$ in the layer $i$ is given by:
$$Q_\alpha^{j} = q_\alpha^{i}\, s_\alpha^{xy} \qquad (2.3)$$
Notice that the number of social classes considered in the model can be easily changed. This network model with $s_\alpha^{xy} = 1$ (that is, a single social class) and with $k_\alpha$ taken as a constant (instead of being drawn from a power law $\rho(\theta)$ as done here) was already employed in other works [28][29][30][31].
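The worked example for the class term can be reproduced with a few lines of code. This sketch assumes that $s_\alpha^{xy}$ is the weighted neighbor count $w_{xy} n_\alpha^y$ normalized over the three classes, a form that yields the values 1/5, 4/5, and 0 quoted above; `class_terms` and the dictionary layout are hypothetical choices made for illustration.

```python
from fractions import Fraction

CLASSES = ("l", "m", "u")


def class_terms(x, neighbor_counts, weights):
    """Assumed form of Eq. (2.2): s^{xy} = w_xy * n^y / sum over z of (w_xz * n^z)."""
    weighted = {y: weights[x + y] * neighbor_counts[y] for y in CLASSES}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("individual has no weighted neighbors")
    return {y: Fraction(weighted[y], total) for y in CLASSES}


# Middle-class individual with two lower-class and four middle-class neighbors.
s = class_terms("m", {"l": 2, "m": 4, "u": 0}, {"ml": 4, "mm": 8, "mu": 1})
print(s)
```

The three terms sum to one by construction, in agreement with the normalization stated for $s_\alpha^{xy}$.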
In short, the model parameters are: $\eta$ (the lattice size), $\rho(\theta)$ (the power law used to determine the degree $k_\alpha$ of the $\alpha$-th individual), $r$ (the Moore radius of the area where the connections can be made), $W$ (the $3 \times 3$ matrix formed by the weights $w_{xy}$, with $x, y \in \{l, m, u\}$), and the percentage of individuals in each social class.

Numerical results
Here, the topological structure of each graph is characterized by computing $P(k)$, $\langle k \rangle$, $\langle k_l \rangle$, $\langle k_m \rangle$, $\langle k_u \rangle$, $\langle \ell \rangle$, $\langle c \rangle$, $\langle C_c \rangle$, $\langle C_b \rangle$, and $\langle C_e \rangle$. These symbols and the corresponding measures are defined below.
The degree distribution $P(k)$ expresses how the fraction of individuals with degree $k$ varies with $k$. The average degree of the whole population $\langle k \rangle$ is given by [10][11][12][15]:
$$\langle k \rangle = \sum_{k=k_{\min}}^{k_{\max}} k\, P(k)$$
in which $k_{\min}$ and $k_{\max}$ are the minimum and maximum degrees found in the network, respectively.
Here, the average degree of the lower-class individuals $\langle k_l \rangle$ is also calculated by considering only the links in which at least one endpoint is a lower-class individual. Likewise, the average degrees of middle-class individuals $\langle k_m \rangle$ and of upper-class individuals $\langle k_u \rangle$ are computed.
For the $\alpha$-th individual, the clustering coefficient $c_\alpha$ is defined as [10][11][12][15]:
$$c_\alpha = \frac{2 e_\alpha}{k_\alpha (k_\alpha - 1)}$$
in which $e_\alpha$ is the number of connections among its $k_\alpha$ neighbors.
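As an illustration of the clustering coefficient, the sketch below evaluates the usual local definition $c_\alpha = 2 e_\alpha / (k_\alpha (k_\alpha - 1))$ on a small hypothetical graph stored as a dictionary of neighbor sets. Returning zero for nodes with fewer than two neighbors is a convention assumed here; the text does not state how such nodes are handled.

```python
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """c = 2e / (k (k - 1)), with e the number of links among the k neighbors."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    e = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2 * e / (k * (k - 1))


# Hypothetical toy graph: a triangle 0-1-2 plus a pendant node 3 attached to 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
values = {node: clustering_coefficient(toy, node) for node in toy}
average = sum(values.values()) / len(values)
print(values, average)
```

Node 0 has three neighbors but only one link among them, so its coefficient is lower than that of nodes 1 and 2, whose two neighbors are directly connected.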
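Centrality measures are usually employed to quantify the relevance of the nodes composing the network. The closeness centrality $C_c(\alpha)$ of the individual $\alpha$ is defined as [15,33]:
$$C_c(\alpha) = \frac{N - 1}{\sum_{\beta \neq \alpha} d_{\alpha\beta}}$$
in which $d_{\alpha\beta}$ is the shortest-path length between the individuals $\alpha$ and $\beta$. The betweenness centrality $C_b(\alpha)$ of the individual $\alpha$ is defined as [15,33]:
$$C_b(\alpha) = \sum_{\beta \neq \alpha \neq \gamma} \frac{g_{\beta\gamma}(\alpha)}{g_{\beta\gamma}}$$
in which $g_{\beta\gamma}$ is the number of shortest paths between the individuals $\beta$ and $\gamma$ and $g_{\beta\gamma}(\alpha)$ is the number of shortest paths between the individuals $\beta$ and $\gamma$ passing through the individual $\alpha$. The eigenvector centrality of the individual $\alpha$ is determined from [12,33]:
$$C_e(\alpha) = \frac{1}{\lambda} \sum_{\beta=1}^{N} a_{\alpha\beta}\, C_e(\beta)$$
in which $\lambda$ is the greatest eigenvalue of the adjacency matrix $A$ formed by the elements $a_{\alpha\beta}$, so that $a_{\alpha\beta} = 1$ if the individuals $\alpha$ and $\beta$ are connected, and $a_{\alpha\beta} = 0$ otherwise. For the whole network, average values of these measures are computed as $\langle c \rangle = \sum_{\alpha=1}^{N} c_\alpha / N$, $\langle C_c \rangle = \sum_{\alpha=1}^{N} C_c(\alpha)/N$, $\langle C_b \rangle = \sum_{\alpha=1}^{N} C_b(\alpha)/N$, and $\langle C_e \rangle = \sum_{\alpha=1}^{N} C_e(\alpha)/N$.
Undirected graphs were built by using Eq. (2.3) and sociodemographic data from France, Peru, and Zimbabwe. Table 2 shows the actual percentages of individuals in each social class in these three countries [34][35][36]. These percentages determine the numbers of individuals of each class in the graph. In the simulations, $\eta = 100$ (thus, $N = 10000$), $r = 10$, and the matrix $W$ is written in terms of a single parameter $\omega$ as:
$$W = \begin{pmatrix} w_{ll} & w_{lm} & w_{lu} \\ w_{ml} & w_{mm} & w_{mu} \\ w_{ul} & w_{um} & w_{uu} \end{pmatrix} = \begin{pmatrix} \omega & \omega/2 & 1 \\ \omega/2 & \omega & \omega/2 \\ 1 & \omega/2 & \omega \end{pmatrix}$$
Thus, $w_{ll} = w_{mm} = w_{uu} = \omega$, $w_{lm} = w_{ml} = w_{mu} = w_{um} = \omega/2$, and $w_{lu} = w_{ul} = 1$. Assume that the value of $\omega$ decreases with the Human Development Index (HDI) and increases with the Gini coefficient. Since $\text{HDI}_{\text{France}} > \text{HDI}_{\text{Peru}} > \text{HDI}_{\text{Zimbabwe}}$ [37] and $\text{Gini}_{\text{France}} < \text{Gini}_{\text{Peru}} < \text{Gini}_{\text{Zimbabwe}}$ [37], then $\omega_{\text{France}} < \omega_{\text{Peru}} < \omega_{\text{Zimbabwe}}$. The values chosen for the constant $\omega$ are $\omega = 4$ for France, $\omega = 10$ for Peru, and $\omega = 40$ for Zimbabwe. Thus, the weights $w_{xy}$ are assumed to be more uniform for France and more heterogeneous for Zimbabwe, which is consistent with the HDI and the Gini coefficient for these countries. Notice that connections between individuals of the same class are favored by the weights, in agreement with results found in studies on homophily [20][21][22][23][24][25]. Alternatively, the matrix $W$ could be written in terms of two or more parameters, in order to represent different connectivity patterns.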
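The single-parameter weight matrix can be generated directly from the relations $w_{ll} = w_{mm} = w_{uu} = \omega$, $w_{lm} = w_{ml} = w_{mu} = w_{um} = \omega/2$, and $w_{lu} = w_{ul} = 1$. The function name `weight_matrix` and the class ordering (l, m, u) along rows and columns are choices made for this sketch only.

```python
def weight_matrix(omega):
    """Weights w_xy for classes ordered (l, m, u), as a nested list."""
    return [
        [omega, omega / 2, 1],
        [omega / 2, omega, omega / 2],
        [1, omega / 2, omega],
    ]


for country, omega in {"France": 4, "Peru": 10, "Zimbabwe": 40}.items():
    W = weight_matrix(omega)
    ratio = W[0][0] / W[0][2]  # same-class weight relative to the lower-upper weight
    print(country, W, ratio)
```

The ratio between the same-class weight and the lower-upper weight equals $\omega$ itself, which makes explicit why a larger $\omega$ corresponds to more heterogeneous weights.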
For the $\alpha$-th individual (for $\alpha = 1, 2, ..., N$), $k_\alpha$ is obtained from $\rho(\theta) = A\theta^{-2.5}$ for $k_{\min} \le \theta \le k_{\max}$, in which $A = 1/\left(\sum_{\theta=k_{\min}}^{k_{\max}} \theta^{-2.5}\right)$ is a normalization constant. Suppose that $k_{\min} = 11$ and $k_{\max} = 39$ for France; $k_{\min} = 11$ and $k_{\max} = 36$ for Peru; and $k_{\min} = 7$ and $k_{\max} = 31$ for Zimbabwe. Let $\bar{k}$ be the average number of daily contacts per individual typical of each country found in the literature. For France, $\bar{k} = 17$ [38]; for Peru, $\bar{k} = 16$ [39]; and for Zimbabwe, $\bar{k} = 11$ [40]. It is assumed that the complex network created according to Eq. (2.3) is suitable to represent the social contacts in each country if $\langle k \rangle \simeq \bar{k}$; that is, if the average degree of the computer-generated network is close to the average degree found in the real world.
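The truncated power law and its normalization constant can be sketched as follows. Drawing $k_\alpha$ by inverse-transform sampling from one uniform number is an interpretation assumed here, since the text only states that $k_\alpha$ is obtained from $\rho$ and $\sigma_\alpha$; the function names are hypothetical. The mean printed below is the mean of the prescribed degrees, which need not coincide with the $\langle k \rangle$ measured on the generated graph.

```python
import bisect
import random


def degree_distribution(k_min, k_max, delta=2.5):
    """Normalized rho(theta) = A * theta**(-delta) on the integers k_min..k_max."""
    raw = {theta: theta ** (-delta) for theta in range(k_min, k_max + 1)}
    A = 1.0 / sum(raw.values())
    return {theta: A * value for theta, value in raw.items()}


def sample_degree(rho, rng):
    """Inverse-transform sampling (assumed): one uniform number picks one degree."""
    degrees = sorted(rho)
    cumulative = []
    running = 0.0
    for theta in degrees:
        running += rho[theta]
        cumulative.append(running)
    sigma = rng.random()
    # Guard against rounding error leaving the last cumulative value below sigma.
    index = min(bisect.bisect_left(cumulative, sigma), len(degrees) - 1)
    return degrees[index]


rng = random.Random(0)
bounds = {"France": (11, 39), "Peru": (11, 36), "Zimbabwe": (7, 31)}
for country, (k_min, k_max) in bounds.items():
    rho = degree_distribution(k_min, k_max)
    prescribed_mean = sum(theta * p for theta, p in rho.items())
    drawn = [sample_degree(rho, rng) for _ in range(1000)]
    print(country, round(prescribed_mean, 2), sum(drawn) / len(drawn))
```

Changing $k_{\min}$ and $k_{\max}$ shifts the prescribed mean, which is how the bounds can be tuned toward the literature values of the average number of daily contacts.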
Table 3 presents the values of $\langle k \rangle$, $\langle k_l \rangle$, $\langle k_m \rangle$, $\langle k_u \rangle$, $\langle \ell \rangle$, $\langle c \rangle$, $\langle C_c \rangle$, $\langle C_b \rangle$, and $\langle C_e \rangle$. Table 4 exhibits the classes of the 100 individuals with the highest values of $k$, $c$, $C_c$, $C_b$, and $C_e$ for the three countries.
Table 3 shows that $\langle k \rangle_{\text{France}} \simeq \langle k \rangle_{\text{Peru}} > \langle k \rangle_{\text{Zimbabwe}}$. Observe that these numbers are close to the values of $\bar{k}$ (the average number of daily contacts per individual) found in the literature [38][39][40] and mentioned above. By considering the average degrees of the social classes given by $\langle k_l \rangle$, $\langle k_m \rangle$, and $\langle k_u \rangle$, the middle class is more connected than the other two classes in France, and the lower class is more connected than the other two classes in Peru and Zimbabwe.
Table 3 also shows that $\langle \ell \rangle_{\text{France}} \simeq \langle \ell \rangle_{\text{Peru}} < \langle \ell \rangle_{\text{Zimbabwe}}$. Since $\langle C_c \rangle$ increases by decreasing $\langle \ell \rangle$, this table consistently shows that $\langle C_c \rangle_{\text{France}} \simeq \langle C_c \rangle_{\text{Peru}} > \langle C_c \rangle_{\text{Zimbabwe}}$. These relations suggest that information travels faster in France and Peru than in Zimbabwe. Also, $\langle c \rangle_{\text{France}} > \langle c \rangle_{\text{Peru}} \simeq \langle c \rangle_{\text{Zimbabwe}}$ and $\langle C_b \rangle_{\text{France}} \simeq \langle C_b \rangle_{\text{Peru}} < \langle C_b \rangle_{\text{Zimbabwe}}$. Hence, the value of $\langle c \rangle$ does not distinguish Peru from Zimbabwe and the value of $\langle C_b \rangle$ does not distinguish France from Peru. These inequalities suggest that the neighbors of an individual are more connected in France and there are more individuals controlling the flow of information in Zimbabwe. Surprisingly, the values of $\langle C_e \rangle$ were found to be identical for the three countries.
The values of $\langle k_l \rangle$, $\langle k_m \rangle$, and $\langle k_u \rangle$ shown in Table 3 and the results presented in Table 4 suggest that the flow of information is mainly controlled by the middle class in France and by the lower class in Peru and Zimbabwe; however, the middle class has a greater influence in Peru than in Zimbabwe. Despite the predominance of the middle class in France and of the lower class in Peru and Zimbabwe, the proportions shown in Table 4 for each topological measure are different from the sociodemographic data shown in Table 2. This computer experiment was repeated three times for each country. The standard deviations associated with the values shown in Table 3 were about 1%-3% and about 0%-10% in Table 4. For better readability of the results, the deviations were omitted in these tables.
Figure 1 shows the double-logarithmic plot (log base 10) of the degree distribution $P(k)$ (black dots) and the fitted function $P_0(k) = B_0 k^{-\delta_0}$ (red line) for the three countries for $k_{\min} \le k \le 2\bar{k}$. Table 5 presents the values of $B_0$, $\delta_0$, and the mean square error determined from the least-squares fitting method [41]. Notice that, for the three countries, the degree distributions follow a power law with $\delta_0 \approx 2.5$, as expected. By considering the whole range of $k$, a better fit is obtained with the function $P_1(k) = B_1 k^{-\delta_1} 10^{\delta_2 k^{10}}$ (blue line), as shown in Table 6 and Figure 1. Observe that $\delta_1 \approx 2.5$ for the three countries. The exponential tails exhibited in Figure 1 were already found in other real-world networks [16,42]. They appear when the most connected nodes have degrees lower than those predicted by a pure power law. In our model, this exponential cutoff is affected by $\eta$ and $r$.

Discussion and conclusion
In this work, scale-free graphs were numerically generated and analyzed. These graphs represent the daily personal contacts occurring in a society with three social classes. Socioeconomic data from France, Peru, and Zimbabwe related to the social stratification and the income distribution in these countries were taken into account. For each country, the power-law exponent of the degree distribution and the average degree present realistic values. Here, it is assumed that more contacts mean more information being exchanged. This assumption concerns the volume of the disseminated information and not its quality. Under this assumption, the results shown in Tables 3 and 4 can help to understand, for instance, the propagation of information on COVID-19 in the considered countries.
Information affects the perception of reality and the decision-making process. In fact, information can become a matter of life and death. Hence, in every country, authorities have been fighting fake news and misinformation on COVID-19. For instance, in France, a website was launched to provide reliable information about the use of drugs during the COVID-19 outbreak [43]. In Peru, creating and spreading fake news about COVID-19 could be punished with a prison sentence [44]. In Zimbabwe, a social networking service was used to disseminate trustworthy COVID-19 information [45]. The results obtained here via computer simulations can help these three countries to understand how interpersonal communication is influenced by social stratification.
The COVID-19 pandemic highlighted economic inequality, since individuals belonging to the lower class had a higher risk of losing their jobs and their lives [5][6][7][8]. Unfortunately, their fear of unemployment hampered adherence to movement restriction measures, and their low income made it difficult to improve personal hygiene habits.
During the pandemic, there was an overload of technical information, which sometimes sounded contradictory. Hence, the scientific findings on COVID-19 should have been summarized and rephrased to facilitate their understanding. In addition, public health interventions (such as implementing preventive protocols and conducting vaccination campaigns) should have been planned by taking into account the topological characteristics of the underlying structure of the social contacts. Personal experiences, unverified information, true news, and fake news related to COVID-19 are spread through the same network. This work suggests that the middle class in France and the lower class in Peru and Zimbabwe primarily affect the volume of information exchanged in these countries.
In brief, the complex network model proposed here can highlight the influence of each social class on the propagation of information in every country, which can guide the development of strategies for disseminating scientifically accurate information.

Figure 1. Log-log plots (log base 10) of the degree distribution $P(k)$ of the computer-generated graph (black dots), the fitted function $B_0 k^{-\delta_0}$ (red line) for $k_{\min} \le k \le 2\bar{k}$, and the fitted function $B_1 k^{-\delta_1} 10^{\delta_2 k^{10}}$ (blue line) for $k_{\min} \le k \le k_{\max}$. Tables 5 and 6 present the values of $B_0$, $\delta_0$, $B_1$, $\delta_1$, and $\delta_2$ for France, Peru, and Zimbabwe.

Table 3. Average degree of the whole population $\langle k \rangle$, average degree of the lower class $\langle k_l \rangle$, average degree of the middle class $\langle k_m \rangle$, average degree of the upper class $\langle k_u \rangle$, average shortest-path length $\langle \ell \rangle$, average clustering coefficient $\langle c \rangle$, average closeness centrality $\langle C_c \rangle$, average betweenness centrality $\langle C_b \rangle$, and average eigenvector centrality $\langle C_e \rangle$ for France, Peru, and Zimbabwe obtained in three numerical simulations.

Table 4. The socioeconomic composition of the groups of the 100 individuals with the highest values of $k$, $c$, $C_c$, $C_b$, and $C_e$ for France, Peru, and Zimbabwe obtained in three numerical simulations.

Table 5. Values of $B_0$ and $\delta_0$ for the fitted function $P_0(k) = B_0 k^{-\delta_0}$ for $k_{\min} \le k \le 2\bar{k}$ obtained from the graphs built for France, Peru, and Zimbabwe.
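A pure power law is a straight line on a log-log plot, so one simple way to estimate its two parameters is a linear least-squares fit of $\log_{10} P$ against $\log_{10} k$. The sketch below assumes that approach; the text only states that a least-squares fitting method was used, so the exact procedure may differ. The data are synthetic, generated from the hypothetical values 2 and 2.5, and `fit_power_law` is a name chosen for this illustration.

```python
import math


def fit_power_law(ks, ps):
    """Least-squares line through (log10 k, log10 P); returns (B0, delta0)."""
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(p) for p in ps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log10 P = log10 B0 - delta0 * log10 k
    return 10 ** intercept, -slope


# Synthetic check: exact power-law data with hypothetical parameters 2 and 2.5.
ks = list(range(11, 35))
ps = [2.0 * k ** -2.5 for k in ks]
B0, delta0 = fit_power_law(ks, ps)
print(B0, delta0)
```

On noiseless data the fit recovers the generating parameters up to rounding error; on simulated degree distributions, empty bins with $P(k) = 0$ would have to be excluded before taking logarithms.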

Table 6. Values of $B_1$, $\delta_1$, and $\delta_2$ for the fitted function $P_1(k) = B_1 k^{-\delta_1} 10^{\delta_2 k^{10}}$ for $k_{\min} \le k \le k_{\max}$ obtained from the graphs built for France, Peru, and Zimbabwe.