Epidemic extinction in networks: Insights from the 12,110 smallest graphs

We investigate the expected time to extinction in the susceptible-infectious-susceptible (SIS) model of disease spreading. Rather than using stochastic simulations or asymptotic calculations in network models, we solve the extinction time exactly for all connected graphs with three to eight vertices. This approach enables us to discover descriptive relations that would be impossible to find with stochastic simulations. It also helps us discover graphs and configurations of S and I with anomalous behaviors with respect to disease spreading. We find that for large transmission rates the extinction time is independent of the configuration, depending only on the graph. In this limit, the numbers of vertices and edges determine the extinction time very accurately (with deviations primarily coming from fluctuations in the degrees). We find that the rankings of configurations with respect to extinction times at low and high transmission rates are positively correlated at low prevalences and negatively correlated at high prevalences. The most important structural factor determining this ranking is the degrees of the infectious vertices.


I. INTRODUCTION
The susceptible-infectious-susceptible (SIS) model is the canonical model of infectious diseases that leave people susceptible again upon recovery. Like other compartmental models of infectious diseases [1,12], it consists of two main components. First, a local description of how the disease spreads between pairs of people and how infections end: a susceptible individual in contact with an infectious individual becomes infectious at a rate β; infectious persons become re-susceptible at a rate ν. Second, every epidemic model also describes how people come in contact with each other. Traditionally, one has assumed a fully connected, or well-mixed, scenario: that anyone can meet anyone else with the same chance at all times. Lately, it has become popular to assume the population is connected into a network, where any two individuals connected by an edge have equal probability of meeting one another, while pairs with no edge will never meet.
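For concreteness, these local rules can be sketched as a Gillespie-type simulation (the kind of stochastic sampling this paper deliberately avoids; the function and variable names below are ours, for illustration only):

```python
import random

def gillespie_sis(edges, infected, beta, nu=1.0, seed=0):
    """Simulate one SIS outbreak on a graph (list of undirected edges)
    and return the time until no vertex is infectious.  Each S-I edge
    transmits at rate beta; each infectious vertex recovers at rate nu."""
    rng = random.Random(seed)
    inf = set(infected)
    t = 0.0
    while inf:
        # edges with exactly one infectious endpoint can transmit
        si = [(u, v) for u, v in edges if (u in inf) != (v in inf)]
        total = nu * len(inf) + beta * len(si)
        t += rng.expovariate(total)                 # exponential waiting time
        if rng.random() < nu * len(inf) / total:
            inf.discard(rng.choice(sorted(inf)))    # recovery event
        else:
            u, v = rng.choice(si)                   # infection along an S-I edge
            inf.add(v if v not in inf else u)
    return t
```

Averaging many such runs approximates the expected extinction time; for example, with β = 0 the mean time starting from three infectious vertices approaches the harmonic number 1 + 1/2 + 1/3.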
Research on the SIS model typically focuses on one of three questions. First, in a finite population, how long does it take for the outbreak to die out [8,9,26,28,32]? Second, in an infinite population, there will be a threshold value of β (given ν) below which the outbreak inevitably dies out and above which it can live forever. This line of research investigates how the network structure (the probability distribution of degree, i.e. the number of neighbors; the number of triangles; etc.) affects the threshold [29]. In almost all cases (Ref. [14] being an exception), authors have explored the large-size limit by stochastic simulations or approximate calculations. Other questions include the ranking of vertices by their importance to the outbreak [30] and the chains of events that are most likely to lead to extinction [13].
In this work, we investigate a mix of the questions above: namely, how the network structure and the positions of the infectious vertices affect the time to extinction in small connected graphs.
To our knowledge, this is the first study to investigate the time to extinction from different configurations (of who is susceptible and who is infectious). Scanning small graphs, however, has occasionally been used in network science [15,18,22]. Rather than addressing these questions with stochastic simulations, we calculate the exact expression for the expected time to extinction as a function of β (we set ν = 1 without loss of generality). This approach is computationally expensive, so we restrict ourselves to connected graphs of eight vertices or fewer. On the other hand, we are not restricted to graph models but can go through every distinct (non-isomorphic) such graph; 12,110 in total.
Our non-stochastic computational approach makes it possible to discover exact relations among small graphs, such as the smallest graph for which the ranking of configurations' extinction times is independent of β. We can also discover scaling relations whose validity is, on one hand, only verified for small graphs but which, on the other hand, cover the large-β regime that is inaccessible to stochastic simulations. In general, the study of small graphs can be seen as a complement to the (more common) large-scale studies. Of course, the large-scale limit is a limiting case just as small networks are. One can argue that some types of networks are so small that they lack the self-averaging of large networks, making the large-size approach plainly wrong for them. Animal trade networks [2], for example, can be represented as graphs where the nodes are farms (technically speaking, metapopulations). These are often small by design, to restrict outbreaks.
In the rest of the paper, we will describe our approach, presenting one example in parallel with the general theory. Then we will go through the numerical findings, first in the limit of large β, and finally study how the ranking of configurations of susceptible and infectious vertices depends on β.

II. PRELIMINARIES

A. The SIS model
Assume a graph G = (V, E) with N vertices labeled from 0 to N − 1. Let φ_i be a binary state variable (φ_i ∈ {0, 1}). We will interpret φ_i = 0 as vertex i being susceptible, while φ_i = 1 means that i is infectious. In the common formulation of the SIS model [7], the probability of a susceptible vertex i being infected by an infectious neighbor j is β per time unit, independent of when j became infectious. Likewise, the recovery of j is time independent, leading to an exponential distribution of the durations of infections. Without loss of generality, we can set the recovery rate to unity. This lack of memory (i.e., the Markov property) means that we can encode the current situation of the outbreak into a single number

s = Σ_{i∈V} φ_i 2^i,

giving s ∈ [0, 2^N − 1], just like reading the string of S and I as 0 and 1 and interpreting it as a binary number. We will refer to s as a configuration of vertex states. Next, we will set up the equations for the expected time to extinction from a given configuration. The derivation closely follows that of the master equations (or Kolmogorov equations) giving the probability of the system being in a certain configuration [7,19]. We thus effectively treat the SIS dynamics as a random walk in the space of configurations s, where s = 0 is an absorbing configuration [23]. Now let I(s) be the set of configurations reachable from s by an infection event and S(s) the set of configurations reachable from s by a recovery. Let ω_s = |S(s)| be the number of infectious vertices, a.k.a. the prevalence. Let m_st be the number of edges between an infectious vertex in configuration s and the vertex that is susceptible in s and infectious in t (in our encoding of configurations, this vertex is log_2(t − s)). Because of the exponential distribution of the durations in the susceptible and infectious states, the rates of events are additive. The total event rate is

z_s(β) = ω_s + β Σ_{t∈I(s)} m_st,

where m_st counts the number of infection events that would turn s into t.
This gives the expected duration of configuration s as 1/z_s(β). The probability that the next configuration becomes t via an infection event is β m_st / z_s(β), while the probability that the next configuration is a t reachable through a recovery event is 1/z_s(β), see Fig. 1.
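As a sketch, the configuration encoding and the event rates above can be written out as follows (illustrative Python with our own function names; vertex i is infectious when bit i of s is set):

```python
def infectious_set(s, N):
    """Vertices i with phi_i = 1, reading s as a binary number."""
    return {i for i in range(N) if s >> i & 1}

def event_rates(s, edges, N, beta):
    """Map each configuration t reachable from s to its rate:
    a recovery (one bit 1 -> 0) has rate 1; infecting susceptible
    vertex i has rate beta * m_st, where m_st is the number of edges
    between i and the infectious vertices of s."""
    inf = infectious_set(s, N)
    rates = {s ^ (1 << i): 1.0 for i in inf}              # the set S(s)
    for i in set(range(N)) - inf:
        m = sum(1 for u, v in edges
                if (u == i and v in inf) or (v == i and u in inf))
        if m:
            rates[s | (1 << i)] = beta * m                # the set I(s)
    return rates

def total_rate(s, edges, N, beta):
    """z_s(beta) = omega_s + beta * sum over t in I(s) of m_st."""
    return sum(event_rates(s, edges, N, beta).values())
```

For the triangle with configuration s = 0b011 (vertices 0 and 1 infectious), the reachable configurations are two recoveries at rate 1 each and one infection of vertex 2 at rate 2β, so z_s(β) = 2 + 2β.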

B. Expected time to extinction
Consider a graph G. Let x_s denote the expected time to extinction from configuration s. We can write down self-consistency equations for x by noting that x_s is the expected lifetime of the configuration s, T_s = 1/z_s(β), plus the expected extinction times of the configurations reachable from s times their transition probabilities. Symbolically:

x_s = T_s + Σ_t P(s → t) x_t.

By the elementary laws of probability and the probabilities given in the previous section, this equation becomes

x_s = [1 + β Σ_{t∈I(s)} m_st x_t + Σ_{t∈S(s)} x_t] / z_s(β).

From the above equation, we can write the system in matrix form,

U(β) x = −1,

where −1 is the vector with all elements equal to minus one, and U(β) is a polynomial matrix [10] (since some of its elements depend on the parameter β) defined by

U_st(β) = β m_st if t ∈ I(s), 1 if t ∈ S(s), −z_s(β) if t = s, and 0 otherwise,

with the indices s, t running over the non-absorbing configurations 1, …, 2^N − 1, where we use the property that s − t = 2^i, i ∈ V, if and only if the only difference between s and t is that the vertex with number i is infectious in s and susceptible in t. Extending Eq. (4) to all configurations s generates a linear system of equations with as many equations as unknowns. We can thus solve it (we use Gaussian elimination in favor of more elaborate methods [24]) to get the expected extinction time from any initial configuration s.
1. Below the diagonal, the nonzero elements are ones.
2. Above the diagonal, the elements are integers times β.
3. In each row, except the last (corresponding to the all-infectious configuration), there are two β-dependent elements. The constant coefficients of these terms sum to zero.
4. The diagonal is such that the rows sum to zero, except for the rows representing states that can reach s = 0 by one recovery event, for which the row sum is −1.
Moreover, we note that the number of automorphic equivalence classes, n, gives the rank of the reduced matrix.
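The linear system above can be sketched numerically for a fixed β (the paper's actual computation is exact polynomial algebra in C with FLINT; this float-based stand-in, with our own function names, is for illustration):

```python
def extinction_times(edges, N, beta):
    """Solve z_s x_s - sum_t rate(s->t) x_t = 1 for all configurations
    s > 0 (x_0 = 0 is absorbing) by Gauss-Jordan elimination."""
    n = 1 << N
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    A[0][0] = 1.0                                   # absorbing: x_0 = 0
    for s in range(1, n):
        inf = {i for i in range(N) if s >> i & 1}
        b[s] = 1.0
        for i in range(N):
            if i in inf:                            # recovery, rate 1
                A[s][s ^ (1 << i)] -= 1.0
                A[s][s] += 1.0
            else:                                   # infection, rate beta * m_st
                m = sum((u == i and v in inf) or (v == i and u in inf)
                        for u, v in edges)
                A[s][s | (1 << i)] -= beta * m
                A[s][s] += beta * m
    for c in range(n):                              # elimination with pivoting
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p], b[c], b[p] = A[p], A[c], b[p], b[c]
        for r in range(n):
            if r != c and A[r][c]:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[s] / A[s][s] for s in range(n)]
```

At β = 0 the solver returns the harmonic numbers H_ω, and for the path 0–1–2 the automorphically equivalent configurations 1 and 4 get equal extinction times.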

C. Algebraic calculations
Solving Eq. (5) is computationally demanding. The major bottleneck is the polynomial algebra (to be precise, calculating the greatest common divisors needed to reduce the fractions of polynomials to their canonical form). The code was implemented in C with the FLINT library [11] for polynomial algebra. To group automorphically equivalent configurations, it also relies on the subgraph-isomorphism algorithm VF2 [5] as implemented in the igraph C library [6]. Finding subgraph isomorphisms, although a classical, computationally hard problem, is relatively quick in practice, and this enables us to discover and exploit all symmetries rather than a priori focusing on symmetric graphs (cf. Ref. [19]).

D. Small distinct graphs
We systematically evaluate all small distinct (non-isomorphic) connected graphs with 3 ≤ N ≤ 8 vertices. There are two such graphs with N = 3, six with N = 4, 20 with N = 5, 112 with N = 6, 853 with N = 7, and 11,117 with N = 8; in total, 12,110 graphs. To generate these, we use the program Geng [25]. They can also be downloaded and viewed at http://www.graphclasses.org/smallgraphs.html.

E. Kendall's τ
We compare several types of correlations (e.g. between structural measures and times to extinction) in this work. To do that, we will use Kendall's τ, a rank-type correlation coefficient. It is defined as the fraction of pairs of data points connected by a line with a positive slope, minus the fraction of pairs connected by a line with a negative slope [20]. If its value is +1, there is a perfect correlation between the ranks of all data points; if the value is −1, there is a perfect anti-correlation; τ = 0 represents no correlation. We use this coefficient rather than other popular ones for three reasons. First, the output data is typically not Gaussian, so the premises of Pearson's correlation coefficient are violated. Second, to reduce disk-space usage, we do not store the explicit expressions of x, but rather their order in the large- and small-β limits (the actual values are not needed to calculate τ, only the ranks). Third, the number of data points is small enough to use Kendall's τ rather than the faster, but less principled, Spearman rank correlation coefficient.
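A direct, if O(n²), implementation of this definition (an illustrative sketch; production code would typically use scipy.stats.kendalltau):

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs.
    A pair (i, j) is concordant when x and y order it the same way."""
    assert len(x) == len(y)
    n = len(x)
    conc = disc = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                conc += 1
            elif sign < 0:
                disc += 1
    return (conc - disc) / (n * (n - 1) // 2)
```

Note that only the ranks of x and y matter, which is why storing the orderings of the solutions (rather than the full expressions) suffices.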

III. RESULTS

A. An example
We start the discussion of our results by examining the example of Section II B and Fig. 1. Many properties of the solution, Eqs. (10), hold for other N as well.
First, in the small-β limit, the solutions are the harmonic numbers of ω_s. This follows immediately from the dynamics defined above: all events are recovery events, the time to the next event is 1/ω, and ω decreases by one at every event, leading to the harmonic number H_{ω_s} = Σ_{t=1}^{ω_s} 1/t. Second, for large β, the extinction time approaches the asymptote uβ^(N−1). For all graphs we study, u is constant with respect to s but dependent on the graph structure G. Below, we study u for all our graphs.
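The pure-death-process argument behind the small-β limit is easy to make concrete (a sketch with exact rational arithmetic):

```python
from fractions import Fraction

def beta_zero_extinction_time(omega):
    """Expected extinction time when beta = 0: the waiting time in a
    configuration with k infectious vertices is 1/k, and k can only
    decrease, so the total is the harmonic number H_omega."""
    return sum(Fraction(1, k) for k in range(1, omega + 1))
```

For instance, starting from three infectious vertices the exact expectation is 1 + 1/2 + 1/3 = 11/6, independent of the graph.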

B. Solving our example with Cramer's rule
In this section, we introduce Cramer's rule as a way to solve for the extinction times. This is a computationally inefficient method, but a way to get some analytic insight into the asymptotic behavior uβ^(N−1) (as we will in subsequent sections). Let us derive this exponent for Eqs. (10). To do this, we apply Cramer's rule to the polynomial matrix Y(β), denoted further as Y (we will drop the β argument for most of the derivation below). Cramer's rule states that the s'th element of the vector x from Eq. (5) is

x_s = det Y_s / det Y,

where Y_s is the matrix obtained from Y by replacing the s'th column by the vector −1 (i.e., all elements being minus one). Let us consider x_{1,4} for our example above (x_1 and x_4 are identical since configurations 1 and 4 are automorphically equivalent). We will use the row and column indices of Y in this section.
In order to calculate the polynomial degree of the determinant of the matrix Y_s, we first make a cofactor expansion along the first column of Y. This gives an expression in terms of the quantities M_st, where M_st is the determinant of the matrix Y without its s'th row and t'th column (i.e., the st-minor of Y). Evaluating these minors gives, via Eq. (12), a result in agreement with the numerical results from Eq. (10).

C. Asymptotic scaling: exact relations
We will prove that the leading term of x_s is uβ^c for an integer c. For all our 12,110 graphs and all 2,963,056 configurations, we have c = N − 1. We believe this holds in general, but we have to leave a proof of that for the future.
From Eq. (11), we see that our assertion holds if we can show that the leading term of det Y_s is independent of s. In the Appendix, we show that the determinants of the ns-minors of Y (cf. Eq. (14b)) have leading terms of polynomial degree n − 1 with the same prefactor, independent of s. Such a large polynomial degree is impossible to attain for st-minors with s < n, since they have rows not containing any β: some of the n − 1 factors in the Leibniz expansion of the determinant must have polynomial degree zero with respect to β. Thus the leading behavior of det Y_s comes from M_ns and is unique. Since det Y is trivially independent of s, the leading behavior of x_s is also s-independent. Since c = N − 1, we can conclude that the polynomial degree of det Y is n − N. For our example graph (and some other simple graphs of N ≤ 4 that we checked), this holds for all states t. If this is true in general, then, curiously, det Y is determined by the minors corresponding to the configurations with the lowest prevalence, and det Y_s by the one with the highest prevalence. This is reminiscent of current-flow networks, where the determinant of the st-minor of the adjacency matrix is proportional to the potential drop between s and t [4].
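The claimed exponent c = N − 1 can also be probed numerically by solving the linear system of Section II B at two large values of β and comparing on a log-log scale; below is a float-based sketch for the triangle (illustrative code, repeating the linear-system construction for self-containment; not the paper's implementation):

```python
import math

def extinction_times(edges, N, beta):
    """Numerically solve the expected-extinction-time system for one beta."""
    n = 1 << N
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    A[0][0] = 1.0                                   # absorbing: x_0 = 0
    for s in range(1, n):
        inf = {i for i in range(N) if s >> i & 1}
        b[s] = 1.0
        for i in range(N):
            if i in inf:                            # recovery, rate 1
                A[s][s ^ (1 << i)] -= 1.0
                A[s][s] += 1.0
            else:                                   # infection, rate beta * m_st
                m = sum((u == i and v in inf) or (v == i and u in inf)
                        for u, v in edges)
                A[s][s | (1 << i)] -= beta * m
                A[s][s] += beta * m
    for c in range(n):                              # Gauss-Jordan elimination
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p], b[c], b[p] = A[p], A[c], b[p], b[c]
        for r in range(n):
            if r != c and A[r][c]:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[s] / A[s][s] for s in range(n)]

triangle = [(0, 1), (1, 2), (2, 0)]
x1 = extinction_times(triangle, 3, 1e3)
x2 = extinction_times(triangle, 3, 1e4)
slope = math.log(x2[7] / x1[7]) / math.log(10.0)    # estimate of the exponent c
```

The estimated slope is close to N − 1 = 2, and the ratio x_1/x_7 approaches one at large β, consistent with a configuration-independent prefactor u.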

D. Numerical results for the large β asymptotics
As mentioned above, the β → ∞ behavior is the same for all configurations s, namely x_s = uβ^(N−1) + O(β). In this section, we investigate how the sizes of the graphs control the prefactor u.
As we can see in Fig. 2(a), for a given N, u is a power law of the number of edges M,

u = u_0 M^α

(keeping in mind that u depends on the graph G). For the graphs we study, the coefficients α and ln u_0 have a close-to-linear dependence on N (Fig. 2(b) and (c)), where the numbers in parentheses represent the standard errors in the last digit. Of course, the error estimates are based on the data we have and are subject to small-size effects. In other words, it is conceivable that α could be taken as N − 1, giving a large-β approximation x̃ of the extinction time x:

x̃ = a (bMβ)^(N−1),

with constants a ≈ 126 and b ≈ 0.0268. Note also that there is a weak but consistent bend (negative second derivative) of ln u_0 as a function of N (i.e., a and b seem to be slowly varying functions of N).
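The fit itself is ordinary least squares in log-log space. The sketch below demonstrates the procedure on synthetic data generated from the approximate form u = a(bM)^(N−1) quoted above (a = 126, b = 0.0268); no empirical values from the paper are reproduced, and the function name is ours:

```python
import math

def loglog_fit(ms, us):
    """Least-squares fit of ln u = ln u0 + alpha * ln M.
    Returns (u0, alpha)."""
    xs = [math.log(m) for m in ms]
    ys = [math.log(u) for u in us]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    alpha = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    u0 = math.exp(ybar - alpha * xbar)
    return u0, alpha

# synthetic check: data drawn from u = a (bM)^(N-1) with N = 4
a, b, N = 126.0, 0.0268, 4
ms = [3, 4, 5, 6]
us = [a * (b * m) ** (N - 1) for m in ms]
u0, alpha = loglog_fit(ms, us)
```

On data exactly of this form, the fit recovers α = N − 1 = 3 and u_0 = a b^(N−1), illustrating why the fitted α being close to N − 1 supports the approximation x̃.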
As seen in Fig. 2, u(G) is not completely determined by Eq. (17); there is also some spread of the points for a given N and M. To understand what causes two graphs of the same N and M to differ, we try several structural predictors: the clustering coefficient (a.k.a. transitivity: the fraction of triangles among all connected subsets of three vertices), the degree assortativity (the Pearson correlation of the degrees on either side of an edge), the average distance (d(i, j) being the fewest number of edges of any path between i and j), and the standard deviation of the degree. See Refs. [3,27] for detailed descriptions of our measures. For all pairs of N and M, we calculate the correlation between u and these measures, then average these values over all graphs. The results, shown in Table I, show that all correlations, except the one with degree assortativity, are negative, and that the strongest correlation is with the standard deviation of degree, σ_k. This means that epidemics in graphs with more homogeneous degree sequences tend to last longer in the large-β limit. The relationship between u and σ_k is shown explicitly in Fig. 4. Indeed, for every combination of N and M, the u vs. σ_k curves are almost always decaying. We highlight the curves with the largest range in ln u, and note that these occur for close-to-maximally dense graphs.

E. Pervasiveness of β-dependent rankings of configurations
Already from Fig. 1, we know that the ranking of configurations with the same number of infectious vertices can depend on β. For one such pair of configurations, the numerators of the solutions differ in only a few coefficients (with the differences highlighted by bold face). Needless to say (since they also share the small-β asymptotics), plotting them in the same graph does not show any visible difference. This is an example of a result that would be almost impossible to detect by stochastic simulations. As N increases, there are more opportunities for symmetry breaking among configurations with the same ω_s (below, where it is clear from the context, we write simply ω). One scenario is that, for even larger N, the only graphs with β-independent rankings are the fully connected graphs (because for them, all configurations with the same ω are automorphically equivalent, which simplifies the system of equations from Section II B). We note that fully connected graphs are the most common interaction structures studied in the literature, and perhaps an unfortunately atypical case.

F. Correlation of asymptotic behaviors
In this section, we continue the investigation of the β-dependence of the rankings of expected extinction times. After calculating x, we rank the configurations in the limits of large and small β. Let r_L(s, G) be the normalized rank of configuration s among all configurations of the same prevalence in the large-β limit, and r_S(s, G) the corresponding quantity as β → 0. Then we use Kendall's τ coefficient to measure the correlation between r_L and r_S.
In practice, we first put all x_s over the minimal common denominator and compare the numerators (which are polynomials with integer coefficients). To rank polynomials in the small-β limit, one first compares the constant coefficients (which are the same for all configurations of the same prevalence), then uses coefficients of increasing polynomial degree as tiebreakers. To rank polynomials as β → ∞, one goes through the coefficients in the opposite direction: one polynomial is considered larger than the other if the highest-order coefficient where they differ is larger. As the full expressions for the solution x_s(β) take much disk space, we only save the rankings.
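These coefficient-wise comparisons amount to lexicographic orderings of the coefficient lists. A sketch (polynomials represented as equal-length lists of coefficients, lowest degree first; function names are ours):

```python
def rank_small_beta(polys):
    """Order polynomial indices as beta -> 0: compare constant terms
    first, then coefficients of increasing degree as tiebreakers."""
    return sorted(range(len(polys)), key=lambda i: polys[i])

def rank_large_beta(polys):
    """Order polynomial indices as beta -> infinity: compare from the
    highest-order coefficient downwards."""
    return sorted(range(len(polys)), key=lambda i: polys[i][::-1])
```

For example, for p = 1 + 2β + β² and q = 1 + β + 3β², q is smaller for small β (tiebreak on the linear term) while p is smaller for large β (smaller quadratic term), so the two rankings disagree.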
With the ranking of the solutions at hand, we proceed to calculate τ (using the NumPy library of Python). In Fig. 5(a), we see τ as a function of the prevalence ω, averaged over all connected graphs of a given number of vertices N = 3, . . . , 8. τ is strictly decreasing from +1 to −1. The decrease (in particular for larger N) is faster at the beginning and end than in the middle. Since the curve gets consistently less steep with N for intermediate prevalence values, it seems possible that the curve would flatten out with growing N. In other words, for configurations with few infectious vertices, the ranking is rather independent of β, while for high-prevalence configurations, all curves of expected time to extinction will cross as β increases. In Fig. 5(b), we take a closer look at the N = 8 graphs and split the average into different curves depending on the number of edges. τ has an intermediate maximum for M = 12, while the sparsest graphs (M = 7) have a smaller τ than the densest (M = 27). For graphs close to the maximum number of edges, the region of slower decrease for small ω is almost gone.

G. Structural determinants of the asymptotic behavior
From the analysis of Fig. 5, we know that the number of vertices and edges affect the ranking of configurations. Now we will look at more detailed explanations based on graph structure: what determines the ranking for graphs of the same N and M? Fig. 6 presents a case study for N = 8 and M = 14 (when exactly half of the vertex pairs are connected by an edge).
To measure the correlation, we once again use Kendall's τ. We pick four structural measures to characterize a configuration. Then we correlate each one with either r S or r L . We present these measures briefly below. For a thorough account, we refer to Refs. [3,27].
1. Average degree (number of neighbors) of the infectious vertices. Degree is the simplest notion of centrality, but also a local one (in the sense that a vertex's degree depends only on its neighborhood).
2. Average eigenvector centrality over the infectious vertices. The eigenvector centrality is given by the eigenvector corresponding to the leading eigenvalue of the adjacency matrix. It is perhaps the most straightforward generalization of degree to account for the idea that being close to central vertices makes a vertex central.
3. Average vitality of the infectious vertices. Vitality is a general class of measures based on the response of some graph descriptor to the deletion of a vertex [21]. Following Ref. [15], we define the vitality (technically, component-size vitality) of vertex i to be v(i) = [S(G) − 1]/S(G \ {i}), where S(G) denotes the number of vertices of the largest connected component of the graph G. This measure will be very close to one for larger graphs and would thus be unsuitable if one were to scale this study up. It is, on the other hand, interesting as it directly measures the contribution the presence of a vertex makes in a worst-case, β → ∞, scenario.
4. Average distance d(i, j) between pairs of infectious vertices i and j. This is the only measure we use that does not involve averaging a centrality measure.
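The four measures above can be sketched in plain Python (illustrative function names; a real study would use a graph library such as igraph or networkx):

```python
from itertools import combinations

def build_adjacency(edges, N):
    adj = {i: set() for i in range(N)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def eigenvector_centrality(adj, iters=200):
    """Power iteration on A + I (the +I avoids oscillation on bipartite
    graphs); normalized so the largest entry is 1."""
    x = {i: 1.0 for i in adj}
    for _ in range(iters):
        y = {i: x[i] + sum(x[j] for j in adj[i]) for i in adj}
        top = max(y.values())
        x = {i: y[i] / top for i in adj}
    return x

def bfs_distances(adj, src):
    dist, frontier = {src: 0}, [src]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    return dist

def largest_component(adj, removed=frozenset()):
    best, seen = 0, set()
    for s in adj:
        if s in seen or s in removed:
            continue
        comp, stack = {s}, [s]
        while stack:
            for v in adj[stack.pop()]:
                if v not in comp and v not in removed:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        best = max(best, len(comp))
    return best

def configuration_measures(edges, N, inf):
    """Average degree, eigenvector centrality, component-size vitality
    v(i) = [S(G) - 1]/S(G minus i), and inter-vertex distance over the
    infectious set (distance is 0.0 if fewer than two are infectious)."""
    adj = build_adjacency(edges, N)
    ev = eigenvector_centrality(adj)
    S = largest_component(adj)
    avg = lambda vals: sum(vals) / len(vals)
    pairs = list(combinations(sorted(inf), 2))
    return {
        "degree": avg([len(adj[i]) for i in inf]),
        "eigenvector": avg([ev[i] for i in inf]),
        "vitality": avg([(S - 1) / largest_component(adj, frozenset({i}))
                         for i in inf]),
        "distance": avg([bfs_distances(adj, i)[j] for i, j in pairs]) if pairs else 0.0,
    }
```

For a star on four vertices, for instance, the hub has degree 3 and vitality 3 (its removal shatters the graph), while a leaf has degree 1 and vitality 1.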
In addition to these, we also try the standard measures betweenness and closeness centrality, but they do not contribute much to the understanding: partly because they are strongly correlated with vitality and eigenvector centrality, respectively, for these small graphs; partly because their rationales involve flows along shortest paths between random pairs, a type of process not present in disease spreading.
In Fig. 6, we plot the correlation coefficients between the above measures of position and the ranks of the configurations for all connected graphs with N = 8 and M = 14. Fig. 6(a) shows the case of large β. For configurations with low prevalence, we see that large degree is strongly correlated with the extinction-time ranking. This is natural, since for low β and ω secondary effects are negligible: the number of ways to increase ω, i.e. the degree, counts more than other factors. Eigenvector centrality behaves almost like degree. Although not identical, these two quantities are strongly correlated for the small graphs we study, so it is natural that the values are close. Somewhat interestingly, which one of them gives the largest τ varies with β in an irregular way. For sparser graphs (than the one in Fig. 6), which are more prone to disintegrating upon vertex deletion, vitality shows a correlation on par with degree and eigenvector centrality. Finally, the average distance shows an increasing correlation with ω. That the infectious vertices are far apart means that the interface to the susceptible vertices is larger, and thus that the next event is more likely to be an infection event.
The structural correlations for the small-β case (Fig. 6(b)) are no surprise in the light of Fig. 5 and Fig. 6(a). Since there is a correlation between r_S and r_L at small ω and a corresponding anti-correlation at large ω, we expect the correlations with structural measures to be similar between the small- and large-β cases for small ω and different for large ω. This rather accurately describes what happens. In this case, degree and eigenvector centrality are strongly correlated with r_S for all ω, while the distance becomes strongly anti-correlated for large ω.

IV. DISCUSSION
We have studied the extinction of SIS epidemics on small graphs. We did so by calculating the exact expressions for the expected time to extinction for all connected graphs between three and eight vertices.
We find that, for a given graph, the limiting behavior as β → ∞ is independent of the configuration of susceptible and infectious vertices, while for β → 0 it depends (trivially) only on the number of infectious vertices. Neither of these limits is of immediate practical interest, but to understand the in-between reality we need to understand the extremes.
The large-β asymptote of the extinction time depends on the size and structure of the graph: for large β we find x ≈ u_0 M^α β^(N−1), where both ln u_0 and α are close-to-linear functions of N for the graphs we investigate. Our final formula for the large-β behavior of x is a(bβM)^(N−1), where a ≈ 126 and b ≈ 0.0268. This super-exponential N-dependence is in line with earlier observations [8,9,14,26,28,32]. Simply speaking, even though there is a finite chance of extinction of SIS epidemics in finite graphs, for β only a little larger than one this probability is so small that it can be ignored for all practical purposes in all but the smallest graphs. For graphs with the same number of vertices and edges, the strongest determinant of the asymptotic behavior of the time to extinction is the variability (measured by the standard deviation) of the degree: outbreaks tend to last longer in graphs with more homogeneous degree sequences.
Furthermore, given an interaction graph, we investigated when the ranking of configurations with respect to extinction time changes. For configurations with few infectious vertices, the rankings are typically the same for large and small β; for configurations with many infectious vertices, there is an anti-correlation between the extinction times in the large and small β limits. The main structural predictors for the rank of configurations with the same number of infectious vertices (for the same graph) are degree and eigenvector centrality, while the correlations with vitality and inter-vertex distance are weaker.
The main contribution of this paper is to give a view of the relation between graph structure and epidemic behavior from another perspective than the usual. Rather than studying the N → ∞ limit by stochastic simulations, we study exact expectation values on small graphs. This enables us to discover hypotheses that could be tested in standard stochastic simulations. It also makes it possible to discover the smallest graphs and configurations with some specific property. For example, the graph of Fig. 1(a), where the configurations 3 and 6 have longer extinction times than configuration 5 in the interval 0 < β < 1/2 and vice versa for β > 1/2, is the smallest graph where configurations change their order of expected extinction times with β. The answer to the reversed question, what is the smallest graph where all same-prevalence configurations are ranked equally for large and small β, is the triangle E = {(0, 1), (1, 2), (2, 0)}. In fact, no graph where at least two vertices are in different positions (not automorphically equivalent) has the equal-ranking property, but 20 of the 23 graphs we study where all vertices are equivalent do have it. On one hand, these observations do not generalize, and thus follow more of a mathematical mode of scientific exploration. On the other hand, they are the basis of hypotheses that could be tested by future theory. For example, for every graph where not all vertices are equivalent, will the ranking of configurations always depend on β?
We anticipate more computational epidemiology studies without random numbers in the future, as well as simulation studies testing whether the findings in this work hold for larger graphs. It would also be interesting to go beyond expected times and derive the probability distribution of extinction times. That would need a different computational approach. For a model of networks, one could consider mapping the problem to that of mean first-passage times [16,17,23], or to combinatorial stochastic processes [31].