Modeling complex network epidemics using nodes' roles

In this work we investigate the usefulness of node roles in complex systems for modeling dynamical processes. More specifically, we use the characteristic equations of an SIS epidemic spreading model to approximate an actual epidemic spreading. The equations encapsulate either the degree correlations or the role distribution, and their results are compared with a simulation of an actual epidemic release.


Introduction
The study of complex systems has attracted growing interest over the past decade. One field that has been researched thoroughly is that of scale-free networks. Key to the study of degree correlations in complex networks has been the pioneering work of Barabasi and Albert [1], [2] on the "preferential attachment" rule. Because they follow directly from the degree, these correlations have strongly influenced the way complex systems are examined in general.
More recently, the scientific community has put considerable effort into analyzing the repeated occurrences of local subgraphs in complex networks, named motifs [3]. These patterns are formed by clusters of nodes that have identical connectivity patterns. The motif correlations usually emerge by using the Erdos-Renyi random network as a reference [4], [5]. This approach has been followed in many fields, but mainly in biology [6].
The first attempts to characterize the nodes of a complex network with respect to their roles have attracted a lot of attention [7], [8], [9]. This paper demonstrates an approach that analyzes networks in terms of the roles of their nodes, and the merits of this analysis. This is achieved by using the role distribution described in [10] to extract correlations between these roles and to use them to model a dynamical process. The dynamical process chosen for this test is an SIS epidemic spreading model, which is simulated on two different networks.

The SIS epidemic spreading model
The SIS epidemic spreading model [11], [12], [13] describes the spreading of an epidemic in a network under the assumption that at any time step a node can be in one of two states: Susceptible or Infected. The transition of nodes between these two states is controlled by the characteristic general equations of the model, where:
• $I_k^t$ are the infected nodes with degree $k$ at time step $t$.
• $S_k^t$ are the susceptible nodes with degree $k$ at time step $t$.
• $\lambda$ is the infection rate of the epidemic.
• $\mu$ is the recovery rate of the epidemic.
• $\Theta_k^t$ is the quantity that changes between the two distinct cases of degree analysis.
The quantity $\Theta_k^t$ differs depending on whether no correlations between the nodes are taken into account, or the network has been analyzed with respect to the degrees of the nodes and their interconnectivity.
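As a rough illustration of how such an update could be iterated, the sketch below advances the per-degree quantities by one time step. It assumes a common discrete-time mean-field form, $I_k^{t+1} = I_k^t - \mu I_k^t + \lambda k S_k^t \Theta_k^t$ with the opposite change applied to $S_k^t$, and it treats $I_k^t$ and $S_k^t$ as node counts and $\Theta_k^t$ as the probability that one edge of a degree-$k$ node leads to an infected node. The exact form of the general equations may differ; the function name `sis_step` and the dictionary layout are hypothetical.

```python
def sis_step(infected, susceptible, theta, lam, mu):
    """Advance per-degree SIS counts by one time step (assumed mean-field form).

    infected, susceptible, theta: dicts keyed by degree k.
    lam: infection rate, mu: recovery rate.
    """
    new_infected, new_susceptible = {}, {}
    for k in infected:
        # Assumed per-node infection probability lam * k * theta_k,
        # capped at 1 so the counts can never become negative.
        p_infect = min(1.0, lam * k * theta[k])
        newly_infected = p_infect * susceptible[k]
        recovered = mu * infected[k]
        new_infected[k] = infected[k] + newly_infected - recovered
        new_susceptible[k] = susceptible[k] - newly_infected + recovered
    return new_infected, new_susceptible
```

Because every node that leaves one state enters the other, the sum of infected and susceptible nodes of each degree stays constant from step to step.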

Uncorrelated case
In this case, the network has not been analyzed and the only information available is the degree of each node. Here $\Theta_k^t$ gives the probability that the disease infects a node with degree $k$, and it enters the general equations as:
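A minimal sketch of one way to evaluate this quantity is given below. It assumes the usual uncorrelated mean-field choice, in which $\Theta^t$ is the fraction of all edge ends that belong to infected nodes, $\Theta^t = \sum_k k I_k^t / \sum_k k N_k$, so that it is the same for every degree. This specific form, the symbol $N_k$ (number of nodes with degree $k$) and the function name `theta_uncorrelated` are assumptions made for illustration.

```python
def theta_uncorrelated(infected, node_counts):
    """Fraction of edge ends attached to infected nodes (assumed form).

    infected: dict degree -> number of infected nodes with that degree.
    node_counts: dict degree -> total number of nodes with that degree.
    """
    # Total number of edge ends in the network: sum_k k * N_k.
    total_edge_ends = sum(k * n for k, n in node_counts.items())
    # Edge ends that belong to infected nodes: sum_k k * I_k.
    infected_edge_ends = sum(k * infected.get(k, 0.0) for k in node_counts)
    return infected_edge_ends / total_edge_ends
```

For example, with four nodes of degree 1 and two of degree 3, of which one node of each degree is infected, 4 of the 10 edge ends lead to an infected node.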

Degree correlated case
In this scenario, it is assumed that a further analysis of the network has been done, and that the correlations of the nodes are captured by the joint degree distribution $P(k, k')$ of the network, which gives the probability that an arbitrarily chosen edge has endpoints of degree $k$ and $k'$ respectively (equivalent to $P(k \to k')$). In this case, $\Theta_k^t$ becomes:
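The sketch below shows one way the degree correlations could be measured from an edge list and used. It assumes the common degree-correlated form $\Theta_k^t = \sum_{k'} P(k'|k) \, I_{k'}^t / N_{k'}$, where $P(k'|k)$ is obtained by normalizing the joint edge counts over $k'$. This form, the symbol $N_{k'}$ and the function names are assumptions for illustration, not necessarily the expression used here.

```python
from collections import Counter, defaultdict


def conditional_degree_prob(edges):
    """Estimate P(k'|k) from an undirected edge list of (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count edge ends by the degrees found at both endpoints.
    pair_counts = defaultdict(Counter)
    for u, v in edges:
        pair_counts[degree[u]][degree[v]] += 1
        pair_counts[degree[v]][degree[u]] += 1
    # Normalize each row so that sum over k' of P(k'|k) equals 1.
    return {
        k: {k2: c / sum(row.values()) for k2, c in row.items()}
        for k, row in pair_counts.items()
    }


def theta_correlated(cond_prob, infected, node_counts):
    """Assumed form: Theta_k = sum_k' P(k'|k) * I_k' / N_k'."""
    return {
        k: sum(p * infected.get(k2, 0.0) / node_counts[k2]
               for k2, p in row.items())
        for k, row in cond_prob.items()
    }
```

On a star graph every edge joins the degree-3 center to a degree-1 leaf, so the conditional probabilities are all 0 or 1 and $\Theta_k^t$ for the center equals the infected fraction of the leaves.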

Role analysis
Instead of looking at the network in terms of the degree of its nodes, we focus on the role of each node and the type of connection it might have with other nodes of similar or different role. As explained in [10], after performing a role analysis of the network, we get a set of vectors that contain the number of connections of a role to all the others, denoted as $CV(r, r')$. We also get the number of nodes that occupy each role, $N_r$. Summing all values of each vector, we get the number of connections of the role to all the others, that is, its degree $k_r$.
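The summation described above can be sketched as follows. The nested-dictionary layout for the connectivity vectors, the reading of $CV(r, r')$ as the connections of a single node of role $r$ to nodes of role $r'$, and the function names are assumptions made for illustration.

```python
def role_degrees(cv):
    """Degree k_r of each role: the sum of its connectivity vector.

    cv: dict role -> dict role -> number of connections CV(r, r').
    """
    return {r: sum(row.values()) for r, row in cv.items()}


def total_edge_ends(cv, node_counts):
    """Sum over roles of N_r * k_r (edge ends contributed by each role)."""
    k = role_degrees(cv)
    return sum(node_counts[r] * k[r] for r in cv)


# Hypothetical example: one "hub" role connected to three "leaf" nodes.
example_cv = {
    "hub": {"hub": 0, "leaf": 3},
    "leaf": {"hub": 1, "leaf": 0},
}
example_counts = {"hub": 1, "leaf": 3}
```

In this hypothetical star-shaped example the hub role has $k_r = 3$, the leaf role has $k_r = 1$, and the roles together account for 6 edge ends, that is, 3 undirected edges.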

General equations
The same reasoning that was applied in the degree cases above produces similar general equations for the epidemic spreading, now from the point of view of roles, where:
• $I_r^t$ are the infected nodes with role $r$ at time step $t$.
• $S_r^t$ are the susceptible nodes with role $r$ at time step $t$.
• $\Theta_r^t$ is the probability that the disease infects a node with role $r$ at time step $t$.

Role correlations
The conditional role probability $P(r'|r)$ describes the probability that an edge starting from a node with role $r$ ends at a node with role $r'$. This probability can be written as the ratio of the probability of arbitrarily picking an edge from a node of role $r$ to a node of role $r'$, $P(r \to r')$, over the probability that an edge starts from a node with role $r$, $P(r \to)$. So:
$$P(r'|r) = \frac{P(r \to r')}{P(r \to)}$$
and
$$P(r \to) = \frac{N_r k_r}{E}$$
which means that the above ratio can be expressed through the connectivity vectors. It is interesting to note that the role correlations are shown to be independent of the number of nodes that occupy each role.
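The independence from the number of nodes per role can be checked numerically with the sketch below. It assumes that $P(r \to r') = N_r \, CV(r, r') / E$, by analogy with the expression for $P(r \to)$, so that the ratio reduces to $CV(r, r') / k_r$. That assumption, the nested-dictionary layout and both function names are illustrative only.

```python
def conditional_role_prob(cv):
    """Assumed closed form: P(r'|r) = CV(r, r') / k_r.

    cv: dict role -> dict role -> CV(r, r'); every role is assumed
    to have at least one connection, so k_r is never zero.
    """
    probs = {}
    for r, row in cv.items():
        k_r = sum(row.values())
        probs[r] = {r2: c / k_r for r2, c in row.items()}
    return probs


def conditional_role_prob_via_ratio(cv, node_counts, num_edges):
    """Same quantity computed as P(r -> r') / P(r ->)."""
    probs = {}
    for r, row in cv.items():
        k_r = sum(row.values())
        p_start = node_counts[r] * k_r / num_edges  # P(r ->)
        probs[r] = {
            # Assumed P(r -> r') = N_r * CV(r, r') / E, divided by P(r ->).
            r2: (node_counts[r] * c / num_edges) / p_start
            for r2, c in row.items()
        }
    return probs
```

Calling the second function with different values of `node_counts` or `num_edges` gives the same probabilities as the first one up to rounding, since $N_r$ and $E$ cancel in the ratio.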
Since the above quantity is calculated with the connectivity vectors, $\Theta_r^t$ becomes:

Networks to experiment

Model created network
The first network that is examined is generated by the model described in [14], a modified Barabasi-Albert model that generates small-world networks with scale-free properties. The network is generated with preferential attachment and with the appearance of local links on new nodes. The specific network is generated with 2000 nodes.

Real network
One of the most interesting datasets distributed in the field of complex systems is the network describing the topology of the Western States Power Grid of the United States, compiled by Duncan Watts and Steven Strogatz [15]. The network has 4941 nodes and is undirected.

Results and interpretation
The process allows us to demonstrate the behavior of an SIS epidemic spreading in the above networks, by factoring in certain information. In the graphs that we present, the curves represent the following:
• The case where the number of infected nodes is calculated by factoring in only the degree of the nodes of each network.
• The case where the number of infected nodes is calculated by factoring in the probability distribution of a connection between two degree values.
• The case where the number of infected nodes is calculated by factoring in the connectivity vectors of the roles that are assumed by the nodes of the networks.
• The case where the number of infected nodes is measured by releasing an infection within the networks and monitoring the actual spreading.
The last case presents the realistic scenario of an epidemic spreading in the network. Figures 1a and 1b show the measurements of the above cases in comparison to each other.
It is seen from Figure 1 that the actual spreading in both networks stabilizes at a lower number of infected nodes than all the modeled cases. It is also seen that in both networks the uncorrelated case and the degree-correlated case have very similar curves and progress almost identically, whereas the curve of the role-correlated case differs and progresses significantly closer to the actual spreading, even though the value at which it stabilizes is closer to that of the first two cases. This suggests that factoring role correlations into a dynamical process gives a more accurate approximation of the inner dynamics of certain networks, which cannot be explored otherwise.
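A minimal sketch of how such an epidemic release could be simulated directly on a network is shown below. The synchronous update rule (each infected neighbor independently transmits with probability $\lambda$, each infected node recovers with probability $\mu$ per step), the adjacency-dictionary layout and the function name `simulate_sis` are assumptions; the actual simulation procedure may differ.

```python
import random


def simulate_sis(adjacency, initially_infected, lam, mu, steps, seed=0):
    """Stochastic SIS on a graph; returns the infected count per step.

    adjacency: dict node -> iterable of neighboring nodes.
    """
    rng = random.Random(seed)
    infected = set(initially_infected)
    history = [len(infected)]
    for _ in range(steps):
        next_infected = set()
        for node in adjacency:
            if node in infected:
                # An infected node stays infected unless it recovers.
                if rng.random() >= mu:
                    next_infected.add(node)
            else:
                # Each neighbor infected at the previous step gets one
                # chance to transmit the disease.
                for neighbor in adjacency[node]:
                    if neighbor in infected and rng.random() < lam:
                        next_infected.add(node)
                        break
        infected = next_infected
        history.append(len(infected))
    return history
```

The returned list can be plotted against the time step and compared with the curves obtained from the model equations. Averaging over several seeds would reduce the noise of a single run.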