Contact tracing in configuration models

Quarantining and contact tracing are popular ad hoc practices for mitigating epidemic outbreaks. However, few mathematical theories are currently available to assess the role of a network in the effectiveness of these practices. In this paper, we study how the final size of an epidemic is influenced by a procedure that combines contact tracing and quarantining on a network null model: the configuration model. Namely, we suppose that infected vertices may self-quarantine and trace their infector with a given success probability. A traced infector is, in turn, less likely to infect others. We show that the effectiveness of such a tracing process strongly depends on the network structure. In contrast to previous findings, the tracing procedure is not necessarily more effective on networks with heterogeneous degrees. We also show that network clustering influences the effectiveness of the tracing process in a non-trivial way: depending on the infectiousness parameter, contact tracing on clustered networks may be either more or less efficient than on networks without clustering.


Introduction
Contact tracing is a frequently used method to control epidemic outbreaks. In this method, individuals who show symptoms of a disease report themselves and identify their recent contacts, who are then tested for the disease. If a contact tests positive, they are isolated to prevent further spreading of the disease. In this way, an epidemic may be contained in its early stages.
The effect of contact tracing has been investigated mathematically by extending compartmental models, such as the SIR model, with an additional rule that infected individuals may be detected and removed with some rate that represents a tracing process [2,7,14], or by other differential equation approaches [11,10]. However, such compartmental models simplify the structure of contact networks by representing it with one numerical parameter. Complex networks, on the other hand, may have nontrivial structure, featuring heavy-tailed degree distributions, clustering, and other phenomena. For example, the contact network of the HIV/AIDS epidemic in Cuba was found to be well-approximated by a power-law degree distribution [6], so that the proportion of vertices with $k$ neighbors scales as $k^{-\tau}$. Such degree distributions feature a large variability of node degrees, with vertices of large degree (also called hubs) being present along with a large number of small-degree nodes. We will refer to this phenomenon as degree-heterogeneity. Furthermore, power-law degree distributions were shown to cause important epidemiological properties, such as vanishing epidemic thresholds [17,3], strong finite-size effects [18], and novel universality classes for critical exponents [8].
A recent simulation study suggested that contact tracing is more effective on networks with high degree-heterogeneity [13]. Intuitively, high-degree vertices infect more vertices than low-degree vertices do, so that they are also more likely to be traced. Furthermore, quarantining high-degree vertices has a larger effect on the spreading of epidemics than quarantining low-degree vertices. Thus, on these types of networks, contact tracing is expected to be more effective than predicted by standard SIR models, due to degree-heterogeneity. In [1], this expectation was made more formal by showing that the tracing process becomes more effective when high-degree vertices are likely to install contact tracing apps.
While approaches in [13,1] rely on networks being locally tree-like, many real-world networks violate this condition and feature clustering: they contain a high density of triangles. Simulations suggest that network clustering has a strong positive impact on the effectiveness of the contact tracing process in homogeneous networks [12]. In general epidemics, clustering can either speed up, or slow down the spread of an epidemic process [19].
In this paper, we quantify the network effect on the effectiveness of contact tracing by mapping it to a combination of bond and site percolation models. We show that the extent to which contact tracing reduces the number of infections depends strongly on the exact choice of tracing model. In particular, when the tracing process is not immediate but takes a nonzero amount of time, this drastically affects the outbreak size. We then investigate the effect of degree-heterogeneity and clustering on the impact of contact tracing on the final outbreak size using percolation models, and find that clustering can either increase or decrease the effectiveness of tracing processes, depending on the infectiousness of the epidemic. This shows that the interplay between the underlying network structure and the exact choice of tracing process is delicate, and important to take into account.
We first describe the network model and define the tracing process in Section 2. Then we show the relation between the success probability of tracing and the characteristic time of the tracing process. Section 3 analyzes the final outbreak size of our epidemic model with a generating function approach. We then study the effect of inducing clustering in the network in Section 4.

Network and tracing model
In this paper, we assume that the underlying network is given by the configuration model, a network model that can generate networks with any prescribed degree distribution $(q_k)_{k\geq 1}$ [4]. In the configuration model, every vertex of degree $k$ is equipped with $k$ half-edges, which are paired uniformly at random. We assume that the disease spreads on this network as a bond percolation process: it removes each edge independently with probability $1 - \pi$. While this is a very simple variant of an epidemic process, the final size of an SIR epidemic with constant recovery duration can be identified as the size of the largest connected component after bond percolation [9]. In this setting, the effective basic reproductive number $R_0$, or the average number of vertices infected by one infected vertex, is given by $R_0 = \pi\,\mathbb{E}[D(D-1)]/\mathbb{E}[D]$, where $D$ denotes the degree of a uniformly chosen vertex [15].
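The construction above can be illustrated with a minimal sketch, assuming a pure-Python implementation in which half-edges are shuffled and paired consecutively (which yields a uniform random pairing), self-loops and multi-edges are kept, and the outbreak is read off as the largest connected component after bond percolation. The function names `configuration_model`, `bond_percolation` and `largest_component_fraction` are hypothetical and chosen only for this illustration.

```python
import random
from collections import Counter

def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random (self-loops and multi-edges allowed)."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2:
        raise ValueError("the sum of the degrees must be even")
    rng.shuffle(stubs)  # a uniform shuffle followed by consecutive pairing
    return list(zip(stubs[::2], stubs[1::2]))

def bond_percolation(edges, pi, rng):
    """Keep each edge independently with probability pi."""
    return [e for e in edges if rng.random() < pi]

def largest_component_fraction(n, edges):
    """Fraction of the n vertices in the largest connected component (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    sizes = Counter(find(v) for v in range(n))
    return max(sizes.values()) / n

if __name__ == "__main__":
    rng = random.Random(1)
    n = 20000
    edges = configuration_model([4] * n, rng)  # regular graph, q_4 = 1
    for pi in (0.2, 0.5, 0.9):
        kept = bond_percolation(edges, pi, rng)
        print(pi, round(largest_component_fraction(n, kept), 3))
```

For the regular degree-4 example, the largest component should be negligible for small $\pi$ and close to the whole network for $\pi$ near one; the tracing layers of the model are not included in this sketch.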
We investigate the effect of a tracing process illustrated in Figure 1 on the final size of the epidemic. In this tracing model, every infected vertex 'reports' its infection independently with probability $1 - p_s$. After reporting, a vertex quarantines, so that it is unable to infect other vertices, as shown in Figure 1(b). Furthermore, a vertex that 'reports' itself as infectious lists its recent contacts and, with success probability $p_t$, the infector of the reporting vertex is identified in this list. In this case, we say that the infector vertex was 'traced'.
After a vertex is 'traced', it quarantines, so that it is unable to infect other vertices. However, the traced vertex may already have infected other vertices before it was traced. We therefore model such secondary quarantining of the traced vertices by removing each edge incident to a traced vertex with probability 1 − δ. That is, the tracing process is modelled as an extra layer of bond percolation, see Figure 1(c).

Immediate or delayed tracing: the impact of δ
The probability that the connection to a vertex is removed when its parent is traced, $1 - \delta$, depends on the parameters $p_s$ and $p_t$. Here we show how $\delta$ relates to these parameters under two assumptions on the tracing process, immediate and delayed tracing, and discuss the impact of these assumptions on the effectiveness of the tracing process.

Immediate tracing
We first assume that the tracing process is immediate: once a vertex self-reports, it immediately traces its parent with probability $p_t$. If successful, the traced vertex immediately quarantines and cannot infect other vertices anymore. We now show that this assumption leads to a degree-dependent version of $\delta$, denoted $\delta_k$. Consider an outcome of the infection process as a tree composed of infected vertices. Tracing and self-reporting happen with the same probability, $(1-p_s)p_t$, for all infected vertices. Therefore, for a given infected vertex in the tree that infects $k$ neighbors of which $d$ neighbors trace it, the first of these $d$ 'tracing' contacts can be viewed as the first red ball drawn without replacement from an urn with $d$ red balls and $k - d$ black balls. The number of black balls drawn before the first red ball is on average $(k - d)/(d + 1)$, which corresponds to the average number of infectious contacts of a vertex before it is first traced. Therefore, the average fraction of non-tracing contacts that occur before the vertex is traced equals $1/(d + 1)$.
The number of tracing vertices, $d$, is binomially distributed with parameters $(k, (1 - p_s)p_t)$, where $k$ denotes the number of infectious contacts of the vertex. Using that $\mathbb{E}[(X + 1)^{-1}] = p^{-1}(1 + k)^{-1}(1 - (1 - p)^{k+1})$ when $X$ is distributed as $\mathrm{Bin}(k, p)$, we obtain that the average fraction of contacts that appear before the first tracing occurs, $\delta_k$, equals
$$\delta_k = \frac{1 - \left(1 - (1-p_s)p_t\right)^{k+1}}{(k+1)(1-p_s)p_t},$$
so that $\delta_k$ is decreasing in $k$ (see Figure 2), and asymptotically, as $k$ becomes large, we have
$$\delta_k \approx \frac{1}{k(1-p_s)p_t}.$$
Thus, we see that $\delta_k$ tends to zero when $k$ becomes large, implying that for large values of $k$, only a vanishing fraction of contacts occurs before the vertex is traced.
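The binomial identity used above can be checked numerically. The following is a minimal sketch, assuming only the identity and the parameter $p = (1-p_s)p_t$ stated in the text; the function names `mean_inverse_direct` and `delta_k` are hypothetical.

```python
from math import comb

def mean_inverse_direct(k, p):
    """E[1/(X+1)] for X ~ Bin(k, p), by direct summation over the pmf."""
    return sum(comb(k, d) * p**d * (1 - p)**(k - d) / (d + 1)
               for d in range(k + 1))

def delta_k(k, p_s, p_t):
    """Closed form of E[1/(d+1)] with d ~ Bin(k, (1 - p_s) * p_t)."""
    p = (1 - p_s) * p_t
    if p == 0:
        return 1.0  # nobody traces, so no contact is removed
    return (1 - (1 - p)**(k + 1)) / (p * (k + 1))

if __name__ == "__main__":
    p_s, p_t = 0.5, 0.5
    for k in (1, 5, 20, 100):
        direct = mean_inverse_direct(k, (1 - p_s) * p_t)
        # the two columns should agree, and k * delta_k should approach 1 / p
        print(k, direct, delta_k(k, p_s, p_t), k * delta_k(k, p_s, p_t))
```

The last printed column illustrates the large-$k$ behaviour: $k\,\delta_k$ approaches $1/((1-p_s)p_t)$.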
Phase transition under immediate tracing. From (2) we obtain that the expected number of edges that remain for every vertex of degree $k$ is asymptotically $\delta_k k \approx 1/((1 - p_s)p_t)$, since each edge of a traced vertex is retained with probability $\delta_k$. As this quantity is independent of the vertex degree $k$, one might expect that the immediate tracing process removes the degree-heterogeneity. We will now show that the immediate tracing process is indeed very effective by calculating the critical value $\pi_c$ of the infectiousness parameter $\pi$ above which the epidemic outbreak becomes extensive. That is, when $\pi < \pi_c$, the size of epidemic outbreaks is sublinear in the total number of nodes, and when $\pi > \pi_c$, this size is linear. When the outbreak size scales linearly with the total number of vertices, we call such an outbreak extensive or giant.
In Appendix A, we derive the condition under which a giant outbreak occurs, in terms of the random variable $D$, which denotes the degree of a randomly chosen vertex in the network, and its probability generating function $g_D(x) = \sum_k q_k x^k$. The critical value of the percolation parameter, $\pi_c$, at which a giant outbreak emerges is given by (4). Figure 3 shows the value of $\pi_c$ for two choices of the degree distribution: a regular graph where every vertex has degree 4 ($q_4 = 1$), and a power-law degree distribution with exponent 2.65 and average degree 4 ($q_k = Ck^{-2.65}$). Interestingly, we see a qualitative difference between the tracing and no-tracing scenarios. Figure 3a shows that $\pi_c > 0$ when $p_s, p_t > 0$, even for power-law distributions with degree exponent $\tau \in (2, 3)$. This means that under tracing, there is a regime for the infectiousness parameter $\pi$ such that there are only small outbreaks. On the other hand, without tracing, $\pi_c = 0$ for power-law distributions with degree exponent $\tau \in (2, 3)$ [17], showing that a giant outbreak always occurs regardless of the value of the infectiousness $\pi$. Thus, this tracing process is very effective: it can reduce an extensive outbreak to a sub-extensive one.
In the standard SIR model, a comparable qualitative change in the size of the outbreak corresponds to a bifurcation taking place when the basic reproduction number $R_0 = 1$. In the regular graph, Figure 3b, decreasing $p_s$ or increasing $p_t$ increases the critical value $\pi_c$. Thus, when decreasing $p_s$ or increasing $p_t$, there is a wider range of values of the infectiousness parameter $\pi$ such that only small outbreaks occur, or alternatively, where the effective value of $R_0$ remains below one.
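The no-tracing baseline mentioned above can be made concrete with a short sketch. It assumes the standard bond percolation threshold on the configuration model, $\pi_c = \mathbb{E}[D]/\mathbb{E}[D(D-1)]$, which is not stated explicitly in this section, and a hypothetical truncated power law with cutoff `k_max` that is introduced only to keep the moments finite (it is not tuned to average degree 4). The critical value under tracing, (4), is not reproduced here.

```python
def no_tracing_threshold(q):
    """Return E[D] / E[D(D-1)] for a degree distribution q: {degree: probability}."""
    mean = sum(k * qk for k, qk in q.items())
    second_factorial = sum(k * (k - 1) * qk for k, qk in q.items())
    return mean / second_factorial

def truncated_power_law(tau, k_max):
    """q_k proportional to k^(-tau) on 1..k_max (the cutoff is an assumption)."""
    weights = {k: k ** (-tau) for k in range(1, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

if __name__ == "__main__":
    print("regular, degree 4:", no_tracing_threshold({4: 1.0}))
    for k_max in (10**2, 10**3, 10**4, 10**5):
        q = truncated_power_law(2.65, k_max)
        print("power law, cutoff", k_max, ":", no_tracing_threshold(q))
```

As the cutoff grows, the second moment grows without bound for $\tau \in (2, 3)$, so the computed threshold keeps decreasing, in line with $\pi_c = 0$ without tracing.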

Tracing with delay
Even though immediate tracing can result in a significant reduction of the giant outbreak, in practice, the tracing process may not be immediate. In what follows, we assume that there is a time delay between the moment when a vertex self-reports and successfully traces its infector and the moment when the infector quarantines. We then again obtain an expression for the probability that the connection to a vertex is removed when its parent is traced, and obtain a degree-dependent version of the parameter $\delta$, denoted $\delta_k$.
Suppose that it takes time $T$ for a vertex to self-report and trace its infector, and that all infections from a degree-$k$ vertex occur as independent exponential time clocks of rate $\lambda$. In the time window of length $T$ (the incubation period) in which an infector is not traced yet, it can still infect others. Specifically, every remaining neighbor of the infector is infected independently in this time interval with probability $1 - e^{-\lambda T}$.
Denote by $N_q$ the number of neighbors of a degree-$k$ vertex that are infected during the incubation period, and consider in addition the number of neighbors that were already infected before the incubation period. From these two quantities one obtains the average number of vertices that are infected before tracing occurs, and hence the average fraction of neighbors that are infected before tracing occurs. For large $k$, this fraction tends to $1 - e^{-\lambda T}$, which is independent of $k$. This implies that we can use $\delta = 1 - e^{-\lambda T}$ as a proxy, instead of having a $k$-dependent $\delta$.
Figure 3: The critical percolation value $\pi_c$ from (4) as a function of $p_s$ and $p_t$ in networks with (a) a power-law degree distribution with exponent 2.65 and average degree 4, (b) a regular graph of degree 4.
We therefore use a k-independent value of δ throughout the rest of the paper, which assumes a tracing process that is not immediate.
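The per-neighbor infection probability $1 - e^{-\lambda T}$ used as a proxy for $\delta$ can be illustrated by simulating the exponential infection clocks. This is a minimal sketch under the stated assumption of independent rate-$\lambda$ clocks and a fixed window of length $T$; the function names `delta_proxy` and `simulated_fraction` and the chosen parameter values are hypothetical.

```python
import math
import random

def delta_proxy(lam, T):
    """Probability that a rate-lam exponential clock rings within a window of length T."""
    return 1.0 - math.exp(-lam * T)

def simulated_fraction(lam, T, n_neighbors, seed=0):
    """Fraction of n_neighbors independent exponential clocks that ring before time T."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(lam) <= T for _ in range(n_neighbors))
    return hits / n_neighbors

if __name__ == "__main__":
    lam, T = 1.0, 0.5  # illustrative values, not taken from the paper
    print("closed form:", delta_proxy(lam, T))
    print("simulation :", simulated_fraction(lam, T, 200000))
```

A longer delay $T$ or a higher infection rate $\lambda$ pushes $\delta$ towards one, meaning that a traced vertex retains almost all of its infectious contacts.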
Phase transition under delayed tracing. In Appendix B we show that the critical value of $\pi$ beyond which a giant outbreak occurs satisfies equation (6). Equation (6) implies that $\pi_c = 0$ for power-law degree distributions with $\tau \in (2, 3)$, as then $\mathbb{E}[D^2]$, which appears on the left-hand side, diverges. Figure 4 shows the value of $\pi_c$ in regular graphs. We see that the value of $\pi_c$ is more sensitive to $p_s$, the self-quarantining probability, than to $p_t$, the tracing probability. Thus, increasing the effectiveness of the tracing procedure barely influences the value of the epidemic threshold, though it may still influence the final size of the epidemic.
The influence of the tracing process on the critical value beyond which an epidemic becomes extensive is substantially more pronounced when the tracing process is immediate. Under immediate tracing, there is a wider range of parameters where a giant outbreak becomes a sublinear outbreak (or where $R_0$ is pushed below one) than under delayed tracing.

Final outbreak size under contact tracing
We now investigate the size of the remaining outbreak after tracing using a generating function approach under fixed $\delta$, as described in Section 2.2.1. In Appendix C, we show that in the large-network limit, the fraction of vertices in the giant outbreak $S$ is given by
$$S = p_s - p_s\, g_D(1 - \pi + \pi u), \qquad (7)$$
where $u$ is obtained by solving the implicit equation $u = G(u)$, with $G$ as derived in Appendix C in terms of $g_{D^*-1}(x)$, the generating function of the excess degree distribution, $g_{D^*-1}(x) = g_D'(x)/\mathbb{E}[D]$. Figure 5a plots the size of the giant outbreak for networks with two different degree distributions, and shows that the analytical results of (7) match well with numerical simulations.
By comparing the outbreak size with and without tracing, we can determine the effectiveness of contact tracing, defined as $\mathrm{eff} = S_{\text{no tracing}} - S_{\text{tracing}}$: the outbreak size in an epidemic without contact tracing, minus the outbreak size in an epidemic with tracing. Here the outbreak size without tracing can be obtained by setting $p_s = 1$. Figure 5b plots the effectiveness of contact tracing for two networks with the same average degree, but different degree distributions: a power-law degree distribution and a regular degree distribution. In both networks, the effectiveness of the tracing process depends on the infectiousness parameter $\pi$.
In the regular network, the tracing process may shift the critical value $\pi_c$ at which the giant outbreak occurs, so that tracing completely removes a giant outbreak. In that regime, tracing is very effective. When a giant outbreak occurs in both the epidemic with tracing and the epidemic without tracing, the effectiveness of contact tracing decreases in $\pi$. That is, the more infectious the disease, the less effective the tracing procedure. In the power-law network, a giant outbreak is always present in both the traced and the non-traced version of the epidemic. In this situation, there seems to be an 'optimal' value of the infectiousness parameter $\pi$ where the tracing process is most effective.
We see that tracing is not necessarily more effective in heterogeneous power-law networks compared to the homogeneous regular graph, in contrast with previous studies [13,1]. This difference is caused by the immediate tracing assumption discussed in Section 2.1. Immediate tracing removes most of the degree-heterogeneity, and is therefore extremely effective on heterogeneous networks, which were studied in [13,1]. However, Figure 5b shows that contact tracing with delay is sometimes more effective on homogeneous networks than on heterogeneous networks. For larger values of π, tracing becomes more effective on the heterogeneous power-law network than on the homogeneous regular graph.
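The no-tracing baseline $S_{\text{no tracing}}$ used in the definition of effectiveness can be computed with a simple fixed-point iteration. The sketch below covers only the case $p_s = 1$, and it assumes that in this case the implicit equation reduces to the standard bond percolation form $u = g_{D^*-1}(1 - \pi + \pi u)$ with $S = 1 - g_D(1 - \pi + \pi u)$; the generating function $G$ for the tracing process itself is not reproduced. The function name `outbreak_size_no_tracing` is hypothetical.

```python
def outbreak_size_no_tracing(q, pi, tol=1e-13, max_iter=100000):
    """Giant outbreak fraction for bond percolation with occupancy pi (case p_s = 1).

    q maps each degree k to its probability q_k.
    """
    mean = sum(k * qk for k, qk in q.items())

    def g(x):          # pgf of the degree distribution
        return sum(qk * x**k for k, qk in q.items())

    def g_excess(x):   # pgf of the excess degree distribution, g'(x) / E[D]
        return sum(k * qk * x**(k - 1) for k, qk in q.items()) / mean

    u = 0.0            # iterating from 0 converges to the smallest fixed point
    for _ in range(max_iter):
        new_u = g_excess(1 - pi + pi * u)
        if abs(new_u - u) < tol:
            u = new_u
            break
        u = new_u
    return 1 - g(1 - pi + pi * u)

if __name__ == "__main__":
    regular = {4: 1.0}
    for pi in (0.2, 0.4, 0.5, 0.8, 1.0):
        print(pi, outbreak_size_no_tracing(regular, pi))
```

For the regular degree-4 graph the computed size is zero below $\pi = 1/3$ and positive above it, which is the threshold that tracing may shift.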

The effect of clustering on tracing
The configuration model is known to be locally tree-like: the fraction of triangles in the network vanishes asymptotically [5]. However, many real-world networks contain a non-trivial number of triangles, which motivates studying the tracing process on a configuration model with enhanced clustering [16]. In this model, each vertex $v$ has an edge-degree $d_v^{(1)}$ and a triangle-degree $d_v^{(2)}$, the latter denoting the number of triangles that the vertex is part of. Then a random graph is formed by pairing edges uniformly at random and pairing triangles uniformly at random.
Let the degree-triangle distribution be denoted by $q_{k,l}$, where $k$ denotes the edge-degree, and $l$ the triangle-degree. Let $g(x, y) = \sum_{k,l>0} q_{k,l} x^k y^l$ be the generating function of the edge and triangle degrees. Furthermore, let $g_q(x, y) = \frac{1}{\langle l \rangle} \sum_{k,l>0} l\, q_{k,l} x^k y^{l-1}$ be the generating function of the number of edges and triangles that are reached by following a randomly chosen triangle, and define its counterpart for a randomly chosen edge analogously, with normalization $\langle s \rangle$, where $\langle s \rangle := \sum_{k,l>0} k q_{k,l}$ and $\langle l \rangle := \sum_{k,l>0} l q_{k,l}$. In Appendix D, we show that the outbreak size after tracing can be expressed in terms of two quantities $u$ and $v$, which are obtained by solving a system of implicit equations involving $w = p_s + (1 - p_s)(1 - p_t)$. Figures 6a and 6b show the epidemic size in networks with the same degree distribution but with a different number of triangles. The analytic results for the final epidemic outbreak on networks with triangles obtained from (11) closely match the results obtained by numerical simulations.
Furthermore, one may conclude from Figures 7a and 7b that the effectiveness of the contact tracing non-trivially depends on the amount of clustering. In the regular graph, Figure 7a shows that there is a range of the infectiousness parameter π where the tracing procedure is more effective on clustered networks than on tree-like networks, but there is also a range of parameters where the tracing procedure is more effective on the tree-like networks instead. On the heterogeneous power-law networks on the other hand, Figure 7b shows that the effectiveness of tracing is always higher in the tree-like network than in the clustered networks. Furthermore, the difference between the clustered and non-clustered networks is less pronounced in the power-law network.
Intuitively, introducing triangles has two effects: on the one hand they make it easier for an epidemic to spread, as they induce multiple paths for a person $i$ to infect another person $j$, but on the other hand, they reduce the number of vertices that the epidemic can reach from a given vertex in $k$ steps compared to a tree. The latter effect makes it easier for the tracing process to stop the epidemic in the presence of triangles. For power-law networks, this is less pronounced, as in the presence of high-degree vertices, it is likely that a vertex has already infected many other neighbors before being traced. This may intuitively explain the difference between introducing triangles in power-law networks compared to homogeneous networks.
In general, Figure 7 shows that the effectiveness of contact tracing delicately depends on the interplay between the network degree distribution and its structure in terms of clustering.

Conclusion
In this paper, we have analytically studied a contact tracing process on networks with arbitrary degree distributions. In this process, infected vertices self-report and quarantine with some probability $1 - p_s$, and they trace their parent with probability $p_t$. Using generating functions, we derive analytical expressions for the giant outbreak size after the tracing process.
We investigated the effect of the network structure on the tracing process and found that degree heterogeneity may either enhance or diminish the effectiveness of tracing depending on the exact parameter values. In our tracing model, we assume that there is a time-delay between the time that a person is infected and the time that its infector is traced. This assumption makes the network heterogeneity non-trivially affect the tracing effectiveness.
Likewise, enhancing clustering in the network has a non-trivial effect on the effectiveness of contact tracing. Depending on the infectiousness of the epidemic, clustering may either increase or decrease the effectiveness of contact tracing, in contrast with conclusions from simulations on homogeneous networks [12]. This underlines the importance of taking the network structure into account when investigating such tracing processes.
In this paper, we investigated bond percolation, which can be mapped to the final size of an SIR epidemic with constant recovery duration. In further research, it would be interesting to investigate the entire time evolution of the number of infected vertices in the SIR process as well, and to investigate the effect of the network structure on this time evolution. This would make it possible to answer the question of whether the network structure affects the speed at which tracing processes slow down the spread of an epidemic.
Furthermore, our results on power-law networks suggest that there is an optimal value of the infectiousness parameter π such that tracing is the most effective. It would be interesting to investigate the relation between this optimal value of π and the parameters of the tracing process, to enable the design of optimally efficient tracing processes.

A Critical value under immediate tracing
Using that $\mathbb{E}[(X + 1)^{-1}] = p^{-1}(1 + k)^{-1}(1 - (1 - p)^{k+1})$ when $X$ is distributed as $\mathrm{Bin}(k, p)$, together with the fact that the probability generating function of a $\mathrm{Bin}(k, p)$ random variable is $(1 - p + px)^k$, we obtain the expected number of offspring of an infected vertex in terms of $g_{D^*-1}(x)$, the probability generating function of the size-biased degree distribution minus 1. A giant outbreak occurs when the expected number of offspring surpasses one. Thus, the critical value of the percolation parameter $\pi_c$ is the value of $\pi$ at which this expectation equals one, which gives (4).

B Critical value under delayed tracing
We now derive the critical percolation value under delayed tracing with fixed $\delta$. We use the same notation as in Appendix A. When $R$ vertices report themselves, the probability that their infector is not traced is $(1-p_t)^R$. There are $N_t^{(\pi)} - R$ non-reporting vertices. When their infector is traced, on average a fraction $\delta$ of them remain infected. The expected number of offspring $\mathbb{E}[N_f]$ then follows by averaging over $R$ and $N_t$, where $N_t$ is distributed as $D^* - 1$ and $D^*$ is the size-biased degree distribution, and by using that $g_{D^*-1}(x) = g_D'(x)/\mathbb{E}[D]$. The critical value of $\pi$ where a giant outbreak occurs is attained when $\mathbb{E}[N_f] = 1$, which yields equation (6).

C The giant outbreak size
In this section, we compute the giant outbreak size after tracing. A vertex does not trace its infector if it does not self-report, which happens with probability $p_s$, or if it does self-report but does not successfully trace its infector, which happens with probability $(1 - p_s)(1 - p_t)$. Thus, the probability that a vertex of degree $k$ is traced by none of its offspring equals $(p_s + (1 - p_s)(1 - p_t))^k$.
Let $p_k = (k + 1)q_{k+1} / \sum_{k\geq 1} k q_k$ be the excess degree distribution, and let $p^*_k$ be the excess degree distribution after tracing. A vertex loses all its offspring after self-reporting, which happens with probability $1 - p_s$, and this contributes to $p^*(0)$. When a vertex is not traced, its degree remains the same. When a vertex is traced, an extra layer of percolation occurs with parameter $\delta$. Thus, for $k > 0$, $p^*_k$ combines two cases: either none of the offspring trace this node, so that its degree $k$ is kept and the contribution is proportional to $p_k$, or at least one offspring traces this node, in which case the contribution involves $p^{\delta}_{k,j}$, the probability that a vertex of degree $j$ has remaining degree $k$ after percolation with bond occupancy $\delta$.
The generating function for $p^*_k$, equation (15), is expressed in terms of $g_{D^*-1}(x)$, the generating function of $p_k$. This is the generating function of the degree distribution of a tracing process on a network with excess degree distribution $p_k$. However, before the tracing process takes place, an epidemic modeled by a bond percolation process with occupancy $\pi$ takes place. Thus, to obtain the generating function $G(x)$ of the degree distribution after the epidemic and the tracing process, we add the bond percolation process with bond occupancy probability $\pi$ by substituting $x \to 1 - \pi + \pi x$ in (15). We then obtain the size of the giant outbreak $S = p_s - p_s\, g_D(1 - \pi + \pi u)$, where $u$ is obtained by solving the implicit equation $u = G(u)$ and $g_D(x)$ is the generating function of the degree distribution.

D Derivation of the giant outbreak size in clustered networks
Under bond percolation with probability $\pi$, a triangle from a given vertex can still be connected to its two triangle members, with probability $\pi^2(3 - 2\pi)$; it can connect to only one of its triangle members, with probability $2(1 - \pi)^2\pi$; or it can become disconnected from both other triangle members, with probability $(1 - \pi)^2$. Thus, for a vertex of triangle-degree $k$, the number of neighbors that are reachable through these triangles after bond percolation has generating function $g_{D^*-1}(z) = ((1 - \pi)^2 + 2(1 - \pi)^2\pi z + \pi^2(3 - 2\pi)z^2)^k$. Let $u$ denote the probability that a randomly chosen half-edge is not connected to the giant component. Similarly, let $v$ denote the probability that following a randomly chosen triangle does not lead to the largest component. Then, after bond percolation with probability $\pi$, $u$ and $v$ satisfy a system of implicit equations, and adding site percolation with probability $p_s$ modifies this system accordingly. Let $w$ denote the probability that a vertex of degree 1 is traced by none of its offspring, so that $w = p_s + (1 - p_s)(1 - p_t)$.
Figure 8: After percolation with probability $\pi$, a triangle that is reached at the red vertex has become one of these types. Thus, when arriving at a percolated triangle at the red vertex, zero, one, or two other vertices may be reached. The labels below the types provide the probability that a percolated triangle equals this type.
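The three triangle probabilities stated in this appendix can be verified by enumerating all eight edge configurations of a percolated triangle. This is a minimal sketch; the vertex labelling (vertex 0 is the vertex at which the triangle is reached) and the function name `triangle_reach_distribution` are assumptions made for the illustration.

```python
from itertools import product

def triangle_reach_distribution(pi):
    """Return [P(0 reached), P(1 reached), P(2 reached)] for the two other
    triangle members, when each of the three edges is kept with probability pi."""
    probs = [0.0, 0.0, 0.0]
    # edges: (0,1), (0,2), (1,2); vertex 0 is the arrival vertex
    for e01, e02, e12 in product((0, 1), repeat=3):
        weight = 1.0
        for kept in (e01, e02, e12):
            weight *= pi if kept else 1 - pi
        reach1 = bool(e01 or (e02 and e12))  # directly, or via vertex 2
        reach2 = bool(e02 or (e01 and e12))  # directly, or via vertex 1
        probs[int(reach1) + int(reach2)] += weight
    return probs

if __name__ == "__main__":
    pi = 0.3
    enumerated = triangle_reach_distribution(pi)
    closed_form = [(1 - pi)**2, 2 * (1 - pi)**2 * pi, pi**2 * (3 - 2 * pi)]
    print(enumerated)
    print(closed_form)
```

The two printed lists should coincide, and each sums to one, confirming that the three types exhaust all outcomes.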