Pandemic Spread in Communities via Random Graphs

Working in the multi-type Galton-Watson branching-process framework, we analyse the spread of a pandemic via a general multi-type random contact graph. Our model consists of several communities, and takes as input parameters that outline the contacts between individuals in distinct communities. Given these parameters, we determine whether there will be an outbreak and, if so, we calculate the size of the giant connected component of the graph, thereby determining the fraction of the population of each type that would be infected before the pandemic ends. We show that the pandemic spread has a natural evolution direction given by the Perron-Frobenius eigenvector of a matrix whose entries encode the average number of individuals of one type expected to be infected by an individual of another type. The corresponding eigenvalue is the basic reproduction number of the pandemic. We perform numerical simulations that compare homogeneous and heterogeneous spread graphs and quantify the difference between them. We elaborate on the difference between herd immunity and the end of the pandemic, and on the effect of countermeasures on the fraction of infected population.

to the basic reproduction number of the pandemic. More precisely, one defines a matrix M whose (i, j) entry, for i, j = 1, ..., r, encodes the average number of individuals of type j expected to be infected by an individual of type i. The matrix M determines the progression of the pandemic, and the "direction" vector is its Perron-Frobenius eigenvector, whose largest eigenvalue ρ(M) corresponds to the basic reproduction number of the pandemic. ρ(M) is the base of the exponential growth of the disease, before a significant fraction of the population has contracted it. The components of the Perron-Frobenius eigenvector hold the information about the potential number of spreaders of each type, and hence are intuitively related to the direction of the spread.
Given the input matrix P, one can determine when ρ(M) ≤ 1 − ε, implying that the spread ends naturally without an outbreak, in contrast to when ρ(M) ≥ 1 + ε, in which case a large fraction of the population would be infected, namely all of the GCC. In the latter case, we can furthermore calculate the size of the GCC of the graph. Moreover, our framework allows us to calculate the fraction of the infected population for each community type separately.
Note that during the evolution of the pandemic, the ratios between the numbers of unsaturated vertices (individuals that have not been infected and remain susceptible to the disease) of each type change, and thus the matrix M changes with time. This is because some types are more infectious and susceptible than others, and they are likely to have a larger percentage of infected vertices. While this time dependency is not important if we are interested only in the final size of the GCC at the end of the pandemic, it is important for questions that are time dependent or concern the pandemic before its end. We will discuss such questions in Section 5.
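The role of M described above can be illustrated with a minimal numpy sketch. The values of Λ and n below are illustrative (not taken from the paper); the code builds M with entries p_{i,j}·n_j and extracts ρ(M) together with the Perron-Frobenius direction of spread.

```python
import numpy as np

# Hypothetical transmissibility matrix Lambda (symmetric) and community sizes n
# for r = 3 types; the values are illustrative only.
Lam = np.array([[2.0, 0.5, 0.1],
                [0.5, 1.5, 0.3],
                [0.1, 0.3, 1.0]])
n = np.array([1000, 2000, 500])

# p_{i,j} = lambda_{i,j} / sqrt(n_i n_j); M_{i,j} = p_{i,j} * n_j is the
# expected number of type-j individuals infected by one type-i individual.
P = Lam / np.sqrt(np.outer(n, n))
M = P * n  # broadcasts n over the columns

eigvals, eigvecs = np.linalg.eig(M)
k = np.argmax(eigvals.real)
R0 = eigvals.real[k]              # basic reproduction number rho(M)
v = np.abs(eigvecs[:, k].real)    # Perron-Frobenius "direction" of the spread
v /= v.sum()
print(R0, v)
```

Since M = D^{-1/2} Λ D^{1/2} with D = diag(n), its spectrum coincides with that of the symmetric matrix Λ, which the eigendecomposition above confirms numerically.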

The graph model
Our model is best described via the formalism of graph theory. Let r ∈ N be the number of types of individuals in the population, and let n = (n_1, ..., n_r) be the vector indicating, for each type i = 1, ..., r, the number n_i of individuals of that type at the beginning of the pandemic. Let us remark here that r should be thought of as constant when compared to the total size of the population n := n_1 + ... + n_r (our analysis nonetheless follows through even if r is a slowly growing function of the total population, say logarithmic). For any two types i, j ∈ [r] (not necessarily distinct), we have a non-negative parameter λ_{i,j} which captures (after an appropriate normalization) the susceptibility and infection between an individual of type i and an individual of type j; this is a term often called transmissibility (see for instance [29]). Let us assume, throughout, that these parameters are symmetric, i.e. λ_{i,j} = λ_{j,i}. Using these parameters we define the matrix P = (p_{i,j})_{i,j∈[r]} by p_{i,j} = λ_{i,j}/√(n_i n_j), which describes the probability that an individual of type i infects an individual of type j (given that the individual of type j is susceptible). Once that matrix is set, we formulate the distribution G(n, P) over graphs G = (V, E) as follows: the vertex set V consists of r disjoint sets of vertices V_1, ..., V_r, where |V_i| = n_i for all i ∈ [r]; as for the edges, for all v_i ∈ V_i, u_j ∈ V_j, each edge (v_i, u_j) occurs in G, independently, with probability p_{i,j}.

When i = j, we have p_{i,i} = λ_{i,i}/n_i; this is the well-known normalization from the Erdős-Rényi model G(n, p) [11], where the threshold function was found to be 1/n (this model is often stated with a parameter λ such that p = λ/n, and the value of λ determines whether a giant component will appear or not). Thus it fits logically in our model, as λ_{i,i} describes the strength of connections inside a given community of size n_i, i.e. between vertices of the same type. On the other hand, when i ≠ j, we want λ_{i,j} to describe the connection strength between two disjoint sides (each side containing the vertices of one of the types i or j). These edges between different types resemble the connections between vertices in the bipartite random graph described in [18], where the threshold function for a GCC in G(m, n; p) was found to be 1/√(mn). Therefore, this setting of parameters generalizes both the Erdős-Rényi and the bipartite random graph models, and will turn out to be suitable for multi-type graphs as well.
A random graph sampled this way describes the progression of the pandemic as follows: if at a certain point in time a vertex v ∈ V was infected, then at the next step v infects all vertices adjacent to it in G. In other words, the random variable corresponding to an edge (v, u) occurring in G encapsulates the probability that these two individuals would be in close proximity when v is infected, as well as the probability that v would pass it along in those interactions (see Figure 1).
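Sampling from G(n, P) is straightforward; the sketch below (with illustrative parameters and small community sizes, chosen for speed rather than taken from the paper) draws each potential edge independently with the type-dependent probability p_{i,j}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the paper): r = 2 types.
Lam = np.array([[1.5, 0.4],
                [0.4, 1.2]])
n = np.array([300, 200])
P = Lam / np.sqrt(np.outer(n, n))   # p_{i,j} = lambda_{i,j} / sqrt(n_i n_j)

# Sample G(n, P): vertices 0..n_1-1 are of type 0, the rest of type 1;
# each edge (u, v) appears independently with probability p_{type(u), type(v)}.
types = np.repeat(np.arange(len(n)), n)
N = n.sum()
probs = P[np.ix_(types, types)]           # N x N matrix of edge probabilities
upper = np.triu(rng.random((N, N)) < probs, k=1)
A = upper | upper.T                       # symmetric adjacency, no self-loops
print(int(A.sum()) // 2, "edges")
```

The expected number of edges is Σ_{i≤j} p_{i,j} (number of pairs), roughly 440 with these values, which the sample reproduces up to binomial fluctuations.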
This perspective highlights the importance of a giant component in the graph G in the study of the size of an outbreak, as well as the number of infected individuals at each step of the spread. Intuitively, if an individual v ∈ V is infected, then every individual in the connected component of v would eventually be infected too. This motivates us to study the following question: Question 1. How does the distribution of connected components in G behave as a function of (λ i,j ) i,j∈ [r] ?
Let us remark here that this question is a variant of the well-known problem of determining the size of the giant component in the classical Erdős-Rényi model [3,17], and closely related variants that have been studied more recently [18,14,19].
Our method, combining spectral considerations and the theory of multi-type Galton-Watson processes, is nevertheless more natural in our view. As a byproduct of our approach, we uncover other parameters that may be of interest in the study of the spread of a pandemic. Our model is a general model of multi-type random graphs. We will describe a branching process over such a random graph in order to analyse some of its combinatorial properties, most importantly for pandemic analysis, the size of the GCC depending on the initial parameters of the graph. The branching process matches the way a pandemic spreads through society in our model.
Throughout, we pose the basic assumptions of the SIR framework [22]. We have two classes of vertices in the graph, called unsaturated and saturated. The unsaturated vertices are individuals susceptible to infection, and the saturated vertices are infected individuals. An individual can only be infected once. At the beginning of the process all the vertices of the graph are unsaturated except for one (the so-called patient zero), which is infectious. At each step, infectious vertices infect according to their connections with other vertices in the graph, but they infect only other unsaturated vertices. At the next step, the vertices that just infected their neighbours become saturated (i.e. recovered/removed in the SIR framework) and step out of the pandemic spread cycle, in the sense that they cannot get infected (or infect others) again. Under these assumptions of the SIR framework, the GCC in the random graph corresponds to those individuals who got infected during the process.
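Under the SIR dynamics just described, the set of eventually infected individuals is exactly the connected component of patient zero. The following sketch computes it by breadth-first exploration on a single-type sample; λ and n are illustrative values, not taken from the paper.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(1)

# Tiny single-type example: n = 500 individuals, lambda = 2 (super-critical).
n, lam = 500, 2.0
p = lam / n
upper = np.triu(rng.random((n, n)) < p, k=1)
A = upper | upper.T

def spread(A, patient_zero):
    """SIR dynamics on graph A: returns the set of eventually infected
    vertices (the connected component of patient_zero) and the round count."""
    infected = {patient_zero}
    frontier = deque([patient_zero])          # infectious this round
    rounds = 0
    while frontier:
        rounds += 1
        nxt = deque()
        for v in frontier:                    # each infectious vertex infects
            for u in np.flatnonzero(A[v]):    # its unsaturated neighbours...
                if u not in infected:
                    infected.add(int(u))
                    nxt.append(int(u))
        frontier = nxt                        # ...then becomes saturated
    return infected, rounds

infected, rounds = spread(A, 0)
print(len(infected) / n)  # fraction of the population ultimately infected
```

For a single type this is the classical G(n, λ/n) setting, where the infected fraction concentrates around the survival probability of the corresponding branching process.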

The descendants matrix and its spectrum
Consider the matrix Λ ∈ R^{r×r} whose (i, j) entry equals λ_{i,j}, and recall the matrix P whose (i, j) entry is p_{i,j} = λ_{i,j}/√(n_i n_j). We define the descendants matrix M := (p_{i,j} n_j)_{i,j∈[r]}, whose (i, j) entry measures the expected number of individuals of type j that a given individual of type i would infect.
It is well known (and easy to observe) that M is diagonalizable and, furthermore, its eigenvalues are real (we reproduce a proof of this fact in Section 2). The one parameter of utmost interest regarding this matrix is its highest eigenvalue, which will be denoted throughout by ρ(M). Our result shows that this parameter is in fact exactly the same as the basic reproduction number (i.e. R_0). Namely, we prove:

Theorem 1.1. With the set-up above, for all r, A ∈ N, ε > 0 there exists N ∈ N such that the following holds. Suppose n_1, ..., n_r are such that n ≥ N and n_i ≤ A·n_j for all i, j ∈ [r], and let Λ, M be the matrices as above. Further suppose that Λ is connected, and for any i, j ∈ [r] we either have ε ≤ λ_{i,j} ≤ 1/ε or λ_{i,j} = 0. Then, with probability 1 − o(1) over G ∼ G(n, P): if ρ(M) ≤ 1 − ε, all connected components of G have size ≤ polylog(n); and if ρ(M) ≥ 1 + ε, then G contains a unique giant connected component, which includes a constant fraction α_i of the vertices of each type i, and all other connected components have size ≤ polylog(n).
In our view, the main case of interest is ρ(M ) > 1 + ε (otherwise the pandemic disappears fairly quickly with little impact). In this case, we give explicit, simple equations for computing α 1 , . . . , α r (see Equation (3) and Theorem 2.8). Furthermore, our proof demonstrates that the parameter ρ(M ) behaves like the basic reproduction number in a manner more general than just implying whether the disease is likely to spread out or not. In particular, our arguments show that the growth in the number of infections over time is an exponential function whose base is ρ(M ) (at least as long as the total number of infections has not reached a significant proportion of the population).

Comparison to related literature
A large number of approaches have been proposed to model pandemic spread, as well as other information transmission processes, using percolation in complex networks. Thus, much effort was devoted to characterizing sharp thresholds for many such models. In [21], Karrer, Newman and Zdeborova consider a related problem of percolation over sparse networks. They address the random subgraph model over a general host graph, wherein one starts with some host graph H and samples a random subgraph G of H by independently including each edge from H with probability p. Their work considers the problem of locating the critical probability p c , such that for p < p c the graph G is unlikely to contain large clusters (i.e. large connected components), and for p > p c the graph is likely to contain large connected components. The authors relate this critical probability to the non-backtracking random walk matrix on H, and in particular to the topmost eigenvalue of a matrix related to the adjacency matrix of H. The intuition behind this result is that the topmost eigenvalue represents a certain notion of average degree in the percolation graph, so that, in a sense, if the topmost eigenvalue is λ, then a vertex in G has λ neighbours on average, which results in an exponential growth with the number of steps of the process. The authors argue that their result holds for large girth, locally tree-like host graphs, and support their results by performing numerical simulations over several host graphs. Furthermore, a similar result was obtained by Bollobás et al. [5] for dense networks. In the latter, the threshold function obtained for p is the inverse of the leading eigenvalue of the simple adjacency matrix. Our work has several points of similarity and dissimilarity with [21,5], which we outline next.
1. Our work seeks a formal mathematical proof of our results, and we are therefore constrained to work with a less general class of host graphs. Indeed, there are pathological examples of host graphs, and to allow for a formal proof one has to make relatively strong assumptions on the structure of the host graph. In our case, we are working with blow-ups of graphs on k vertices, with k constant, meaning that we start with any host graph H over k vertices and replace each vertex in it with a new cloud of vertices. To simplify terminology, we refer to the original vertices of H as types, and denote the cloud of vertices in our graph that replaces type i in H by V_i. Our graph only allows edges between a vertex in V_i and a vertex in V_j if the original host graph H had an edge between types i and j.
2. Our model is more general in the sense that it allows different survival probabilities for different types of edges (occupation probabilities in the language of [21]). To be more precise, for any two types i, j ∈ [k], the probability of an edge (v_i, v_j), where v_i ∈ V_i and v_j ∈ V_j, may depend on the pair i, j (namely, in the language of [21], the occupation probabilities of edges coming from different types may be different). These probabilities are the input to our problem (related to the parameters λ_{i,j} and the initial population sizes), and we wish to determine when such graphs are likely to contain large clusters, and when they are unlikely to.
3. Our work also identifies that the important parameter in our problem is the topmost eigenvalue, but of a different matrix. The matrix we consider here is M, a k × k matrix associated with the prototype of our host graph, whose entries are appropriate normalizations of the susceptibility and rate of interaction between different types. As in [21], this parameter also carries with it a certain notion of average-degreeness; however, it is more subtle in our case, as not all of the neighbours contribute equally to further percolation of the process. Still, we prove that the magnitude of the topmost eigenvalue of the matrix M determines the existence of large clusters in our model.
Pandemic spread models of networks with general power-law degree distributions were studied by Newman in [29] and later by Miller in [28]. The transmissibility between each two vertices u, v is drawn according to an appropriate random variable T_{u,v}. All those random variables are i.i.d., taken from a predefined distribution; thus a new parameter T is introduced, which takes a weighted average of these variables according to the predefined distribution. In this model, vertices can sometimes be infectious for multiple rounds, or rather be non-infectious (albeit carriers of the virus) for multiple rounds. The paper [29] develops several methods in order to analyze questions concerning parameters such as: the relation between the degree of a vertex and its probability of avoiding infection; sizes of outbreaks; and thresholds for the occurrence of an outbreak. Those are obtained using the generating-functions method [31], described also in [34, Section IV.C] and initially in [38].
The general multi-type framework that we analyse in this paper was described in [30] as an undirected network whose vertices are partitioned into types that interact with each other. A study of such a model using the method of generating functions, with an application to pandemic spread, has been carried out in [2], where, under some assumptions, a criterion for the threshold for having a GCC was proposed. The criterion differs from ours, and since the proof methods used in [2] are non-rigorous, it is not clear to us when their result holds. In the current work we study a model where the degree distributions of the vertices are binomial, and prove rigorously the results about the GCC. The advantage of having a rigorous proof is two-fold: first, it explains exactly under which conditions on the parameters the result holds; second, in order to make the proof go through, we identify several parameters that may then actually become the center of interest, and are completely absent in [2]. In this case, as our technique mostly relies on spectral graph theory, we identify the types matrix M and the importance of its topmost eigenvalue ρ(M), which for pandemics has the interpretation of the basic reproduction number, but more generally may be regarded as the rate of increase of the components as a function of "time" (where each step of the aforementioned Galton-Watson branching process advances the "time" by exactly one step). The associated eigenvector also plays a crucial role in our proof, and turns out to encode the number of "active" vertices at various stages of the process (a term we later refer to as "unsaturated vertices"). In contrast, in [37] a topmost eigenvalue of a matrix R is indeed noted to be related to the basic reproduction number of the disease.
However, no explicit statement is given there to relate the eigenvalue to the emergence of the giant component or to its growth rate through a branching process on the graph, nor is a connection drawn between the direction of the spread and the corresponding leading eigenvector, which was one of the main goals of this paper.
More recent analyses of pandemic spread in networks include [36,16] and [33,32], where the base graph is complete, and heterogeneity is obtained by drawing connections according to some specific power-law degree distribution (e.g. appropriate Gamma distributions). Using these models, herd immunity factors are calculated for families of distributions, in particular for COVID-19. Moreover, numerical simulations of a multi-type heterogeneous society network were described in [7]; this resembles our model in the sense that it considers partitioning society according to different characteristics, but it takes a rather experimental approach, tackling specifically COVID-19 data. In summary, the described models often assume homogeneity of the society (individual vertices are indistinguishable) and obtain heterogeneity only by applying degree distributions that describe the disease parameters. Others allow multiple communities, but include only simulations of specific data, and no general analytical results.

Related applications
The questions addressed here are applicable beyond the spread of pandemics. Taking a large set of vertices and analyzing their interaction by first partitioning them into types and then assuming random connections arises in many distinct areas where the setup consists of a large complex network. These include networks in the physical world such as in the life sciences, in the virtual world of the internet, and in society (for a survey see e.g. [24]). One natural example is search algorithms. The web is a graph whose vertices are all the known web-pages, and where a directed edge connects one vertex to another if the first includes a link to the other. Trying to figure out the best pages for a given query amounts to analyzing that graph.
The assumption that web-pages have types, and that connections between types can be assumed to be quite random, has been promoted both theoretically (for a review see e.g. [25]) and in practice, where search algorithms that run in practice are rumored to employ such tactics.
Organization: The paper is organized as follows. In Section 2 we introduce some preliminaries and probabilistic tools and outline the multi-type Galton-Watson branching process framework. In Section 3 we prove various properties of the multi-type Galton-Watson process including a quantitative bound on the extinction probabilities, as well as the Galton-Watson simulation of the connected components in random graphs. In Section 4 we prove Theorem 1.1. In Section 5 we perform numerical simulations to demonstrate the analytical statements. Section 6 is devoted to a discussion and outlook.

Preliminaries
Notations. Throughout the paper we use asymptotic big-O notation. We write X = O(Y) or Y = Ω(X) to say that there exists an absolute constant C such that X ≤ C·Y. Sometimes this constant will depend on other parameters, say on ε, in which case we write X = O_ε(Y) or Y = Ω_ε(X). We let [r] = {1, ..., r} and N = {0, 1, 2, ...}. We use the standard Euclidean inner product ⟨x, y⟩ = Σ_i x_i y_i and the ℓ_p-norms ∥x∥_p = (Σ_i |x_i|^p)^{1/p}.

Definition 2.1. We say n ∈ N^r is A-balanced if for any i, j ∈ [r] it holds that n_i ≤ A·n_j.
Definition 2.2. We say a symmetric matrix Λ = (λ_{i,j})_{i,j∈[r]} is ε-separated if for all i, j ∈ [r], either λ_{i,j} = 0 or ε ≤ λ_{i,j} ≤ 1/ε.

Definition 2.3. We say a symmetric Λ = (λ_{i,j})_{i,j∈[r]} is connected if the graph whose vertex set is [r], with an edge between i and j whenever λ_{i,j} > 0, is connected.

Spectral properties of the matrix M
The following lemma asserts that our matrix M is diagonalizable, and furthermore the eigenvalues and eigenvectors satisfy several useful properties.
Lemma 2.4. For all r ∈ N, ε > 0 there exists δ > 0 such that the following holds. Suppose Λ is ε-separated and connected. Then there exists a basis x_1, ..., x_r of R^r consisting of eigenvectors of M. Furthermore, if θ_1 ≥ θ_2 ≥ ... ≥ θ_r are the eigenvalues of M, then |θ_i| ≤ θ_1 − δ for all i ≠ 1.
Proof. Writing D = diag(n_1, ..., n_r), the definition of M gives M = D^{-1/2} Λ D^{1/2}, so M is similar to Λ. Since Λ is symmetric, there is an orthonormal basis y_1, ..., y_r consisting of eigenvectors of Λ with real eigenvalues θ_1, ..., θ_r. Thus, for each i ∈ [r], the vector x_i := D^{-1/2} y_i is an eigenvector of M with eigenvalue θ_i. For the furthermore statement, we use a quantitative version of the Perron-Frobenius theorem. More precisely, we use [13, Inequality (3)], and for that we first bound the quantity m(A) defined therein.
Normalize y_1 so that ∥y_1∥_2 = 1; we show that y_1(i) ≥ δ for all i, for some δ > 0 depending only on r and ε. Indeed, first taking i that maximizes y_1(i), we have that y_1(i) ≥ r^{-1/2}. For the rest of the types, we first upper bound ρ(Λ), and then lower bound y_1(j) for all j. Take j minimizing y_1(j); as Λ is connected, there is a path of length at most r from j to i, which yields a lower bound on y_1(j) in terms of y_1(i), ε and r. We can now lower bound the quantity m(Λ) from [13, Inequality (3)]: as Λ is connected, any non-trivial split of the types has i, j on opposite sides with λ_{i,j} > 0, so that λ_{i,j} ≥ ε, and then we get |θ_i| ≤ θ_1 − δ for all i ≠ 1, and we're done.

Let x_1, ..., x_r be the basis from Lemma 2.4. Since this is a basis, any vector z may be written as a linear combination of these vectors. However, since this is not an orthonormal basis, finding these coefficients and proving estimates on them may be a bit tricky. In the lemma below, we use the fact that this basis is obtained from an orthonormal basis by multiplication with a diagonal matrix, to prove several estimates on such coefficients that will show up in our proofs.

Lemma 2.5. For all A, r ∈ N and ε > 0 there are δ > 0 and S > 0 such that the following holds. Let z ∈ [0, ∞)^r, and write z = Σ_j α_j x_j. Then:
1. α_1 ≥ δ∥z∥_2;
2. for all j, ∥α_j x_j∥_2 ≤ S∥z∥_2.
Proof. As y_1, ..., y_r is orthonormal and each x_j is obtained from y_j by multiplication with the diagonal matrix D^{-1/2}, the coefficients α_j admit an explicit formula; since z has non-negative entries, this proves the first item. For the second item, fix j and note that ∥α_j x_j∥_2 can be bounded by the same explicit formula, which is at most S∥z∥_2.

The multi-type Galton-Watson process
A key component in our analysis is the multi-type Galton-Watson process. In this section we introduce this process and recall several of its basic properties. In Section 3 we state and prove additional properties of it that will be important in the proof of our main results. We will specialize our exposition to our case of interest, in which the number of offspring is distributed according to a binomial distribution. We refer the reader to [15] for a more systematic treatment of this process.

Development of the Galton-Watson Process
The parameters defining the Galton-Watson process are identical to the parameters defining our disease: namely, a vector n = (n_1, ..., n_r) of integers, and a matrix P := (p_{i,j})_{i,j∈[r]} with positive entries. The process starts from an initial configuration x ∈ N^r, specifying for each type i ∈ [r] the number x(i) of individuals of type i. Individuals are classified as "unsaturated" if we have not explored their offspring yet, and as "saturated" otherwise. Initially, all individuals are classified as unsaturated. There are two equivalent ways to develop this process, both of which will be useful for us:

1. Sequential Galton-Watson process. At each step, as long as there is an unsaturated individual, the process picks some unsaturated individual w and explores its offspring. Namely, suppose w is of type i; then for each j ∈ [r], the process generates an independent binomial sample N_j ∼ Binomial(n_j, p_{i,j}). The process adds N_j unsaturated individuals of type j for all j ∈ [r], and changes the classification of w to "saturated". The process halts when all individuals are classified as saturated.
2. Parallel Galton-Watson process. Here, at each point in time, instead of picking a single unsaturated individual from the list, we look at all of them together, and generate their descendants simultaneously.
We denote by pop(GW(n, P, x)) the random variable measuring the total population in the process once it terminates, when starting with initial population x ∈ N^r (defining it as ∞ if the process doesn't terminate).
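The sequential development admits a direct simulation. The sketch below uses illustrative sub-critical parameters, and adds a population cap as a safeguard against the (here astronomically unlikely) event of a very long run; the cap is an implementation device, not part of the paper's process.

```python
import numpy as np

rng = np.random.default_rng(2)

def gw_population(n, P, x0, cap=10_000):
    """Sequential multi-type Galton-Watson process with Binomial(n_j, p_{ij})
    offspring counts. Returns the total population once every individual is
    saturated, truncated at `cap` as a guard against non-termination."""
    queue = list(initial_queue(x0))   # unsaturated individuals, by type index
    pop = len(queue)
    while queue and pop < cap:
        i = queue.pop()                        # explore one unsaturated individual
        for j in range(len(n)):
            N_j = rng.binomial(n[j], P[i, j])  # its type-j offspring
            queue.extend([j] * N_j)
            pop += N_j
    return pop

def initial_queue(x0):
    for i, count in enumerate(x0):
        yield from [i] * count

# Illustrative sub-critical parameters: rho(M) < 1, so the process dies out.
Lam = np.array([[0.5, 0.2], [0.2, 0.4]])
n = np.array([500, 300])
P = Lam / np.sqrt(np.outer(n, n))
M = P * n
rhoM = np.max(np.linalg.eigvals(M).real)
total = gw_population(n, P, [1, 0])
print(rhoM, total)
```

With ρ(M) ≈ 0.66, the expected total population starting from a single individual is a small constant, in line with the sub-critical analysis below.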

The extinction probabilities
One aspect of the Galton-Watson process we will be concerned with is the extinction probability for a given initial configuration, as well as the distribution of the number of individuals in case the process terminates. In this case as well, it makes sense to consider the matrix M = (p_{i,j} n_j)_{i,j∈[r]}. We record below two well-known facts, showing that in this case too, the parameter that determines whether the Galton-Watson process terminates is ρ(M).

Proposition 2.6. Suppose ρ(M) ≤ 1 − ε. Then for any initial configuration x, the process GW(n, P, x) terminates with probability 1.

Proof. Let x be the initial configuration of the Galton-Watson process. We take the parallel view of the development of the Galton-Watson process. Let X_t be the number of individuals explored at time t. Note that the expectation of X_t is ∥M^t x∥_1 ≤ √r∥M^t x∥_2, so by Markov's inequality and the sum of a geometric series, the probability that the process survives beyond step k is at most O(ρ(M)^k ∥x∥_2 / (1 − ρ(M))). Sending k to infinity finishes the proof.
Remark 2.7. The above argument can actually give us strong bounds on the total size of the population when the Galton-Watson process terminates. Indeed, in Section 4.1 we give an adaptation of this argument to handle the sub-critical case in the graph process.
Next, it is natural to ask what can be said about the extinction probabilities in the super-critical case ρ(M) ≥ 1 + ε. For this, we appeal to a result from [10], stating that these extinction probabilities satisfy a simple system of equations. Towards this end, consider for each type i the probability generating function f_i of the distribution of descendants of an individual of type i. In our case of interest, as the numbers of descendants of i from different types are independent random variables, we may simplify and write f_i(z) = Π_{j∈[r]} Gen_{n_j, p_{i,j}}(z_j), where Gen_{n,p}(z) is the generating function of Binomial(n, p). With these notations, we define f(z) = (f_1(z), ..., f_r(z)). In this language, [10, Theorem 7.4] reads:

Theorem 2.8. Suppose ρ(M) ≥ 1 + ε, and let α_i be the extinction probability of GW(n, P, e_i). Then the vector α = (α_1, ..., α_r) is a fixed point of f, i.e. f(α) = α, and α_i < 1 for all i. Moreover, α is the minimal such fixed point: any other non-trivial fixed point β satisfies β ≥ α coordinate-wise.

Remark 2.9. For most purposes, it is often useful to approximate the generating function of a binomial random variable as Gen_{n,p}(z) ≈ e^{np(z−1)}.
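The fixed-point characterization of Theorem 2.8 suggests computing α by iterating f from the all-zeros vector, which converges to the minimal fixed point. The sketch below uses the binomial pgf Gen_{n,p}(z) = (1 − p + pz)^n; the parameter values are illustrative, not from the paper.

```python
import numpy as np

# Illustrative super-critical parameters (rho(M) > 1).
Lam = np.array([[2.0, 0.5], [0.5, 1.5]])
n = np.array([1000, 800])
P = Lam / np.sqrt(np.outer(n, n))

def f(z):
    # f_i(z) = prod_j Gen_{n_j, p_{ij}}(z_j), with Gen_{n,p}(z) = (1-p+pz)^n:
    # the pgf of the offspring vector of a single type-i individual.
    return np.prod((1 - P + P * z[None, :]) ** n[None, :], axis=1)

alpha = np.zeros(len(n))      # iterate from 0 toward the minimal fixed point
for _ in range(500):
    alpha = f(alpha)
print(alpha)                  # extinction probabilities, each strictly below 1
```

In the super-critical regime each α_i lies strictly between 0 and 1, and 1 − α_i is the asymptotic fraction of type-i vertices in the giant component.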

Probabilistic tools
We will need the notion of stochastic domination, as well as the following simple fact, in our proof.

Definition 2.10. Let X, Y be real-valued random variables. We say that X stochastically dominates Y if Pr[X ≥ a] ≥ Pr[Y ≥ a] for all a ∈ R.

Fact 2.11. Let X, Y be two real-valued random variables. Then X stochastically dominates Y if and only if there is a coupling (X̃, Ỹ) of X and Y such that X̃ ≥ Ỹ always.
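Fact 2.11 can be illustrated for the binomial variables that appear later in the coupling arguments: realizing Binomial(n, p) and Binomial(n − t, p) on the same Bernoulli trials gives a coupling in which the first variable is always at least the second.

```python
import numpy as np

rng = np.random.default_rng(3)

# Monotone coupling witnessing that Binomial(n, p) stochastically dominates
# Binomial(n - t, p): both are sums over the same shared Bernoulli trials.
n, t, p = 100, 30, 0.05
trials = rng.random((10_000, n)) < p      # 10,000 coupled samples of n trials
X = trials.sum(axis=1)                    # ~ Binomial(n, p)
Y = trials[:, : n - t].sum(axis=1)        # ~ Binomial(n - t, p), same randomness
print((X >= Y).all())                     # True: the coupling is pointwise monotone
```

Since Y sums a subset of the trials that X sums, X ≥ Y holds in every sample, which is exactly the coupling direction used in the exploration arguments of Section 3.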

Properties of multi-type Galton-Watson processes
In this section, we prove several more properties of the multi-type Galton-Watson process.

A quantitative bound on the extinction probabilities
Suppose x ∈ N r is the initial configuration of a Galton-Watson process. A quantity that we'll often be interested in is M x, which measures the "expected configuration" of the population after a single step of development in parallel. That is, (M x) i is the expected number of individuals of type i after a single step of development in parallel. Thus, studying norms of such expressions, or more generally of expressions such as M k x, is important in understanding the Galton-Watson process.
Claim 3.1. For all A, r ∈ N and ε > 0, there exist k ∈ N and δ > 0, such that if n is A-balanced, Λ is connected and ε-separated, and ρ(M) ≥ 1 + ε, then for any vector z with non-negative entries it holds that:

Proof. We begin with the first item. Write z = Σ_{j=1}^{r} α_j x_j, and note that for all k we have M^k z = Σ_{j=1}^{r} θ_j^k α_j x_j, where θ_j is the eigenvalue corresponding to x_j. Thus, ∥M^k z∥_2 can be bounded below in terms of δ_1, S > 0 from Lemma 2.5 and δ_2 from Lemma 2.4. As ρ(M) ≥ 1 + ε, there is k depending only on δ_1, δ_2, r, S (and therefore only on A, r and ε) for which the first item holds. For the second item, we may find large enough k which gives the first inequality of the second item. The second inequality follows immediately from the definition of ρ(M).
Next, we show that in our set-up of the Galton-Watson process, in the super-critical case ρ(M) ≥ 1 + ε, the probability that the process never goes extinct is bounded away from 0.

Lemma 3.2. For all A, r ∈ N and ε > 0 there exists δ > 0 such that the following holds. Suppose n is A-balanced, Λ is connected and ε-separated, and ρ(M) ≥ 1 + ε. Then for every i ∈ [r], the survival probability of GW(n, P, e_i) is at least δ.

Proof. Recall the functions f_1(z), ..., f_r(z) and f = (f_1(z), ..., f_r(z)) from Section 2.2, and let α_i be the extinction probability of GW(n, P, e_i).
By Theorem 2.8, α_1, ..., α_r satisfy α_i = f_i(α) and α_i < 1 for all i. Our main goal is to show that under the conditions of the lemma, there is i such that α_i ≤ 1 − δ, where δ = δ(A, r, ε) > 0; once we show that, it will be easy to deduce this is in fact the case for all i. The proof is very similar to the proof of [10, Theorem 7.4], except that we replace a part of the argument therein with Claim 3.1 above.
Specifically, let δ > 0 be chosen later, and assume towards contradiction that α_i ≥ 1 − δ for all i. Let z = 1 − α. For large enough n = n_1 + ... + n_r, we may bound z in terms of Mz; applying this twice, and combining the last two estimates, gives a contradiction if δ < 1/(C²r).
We conclude that there is i such that α_i ≤ 1 − δ for δ = 1/(2C²r), and we next use the connectedness of Λ to argue that α_j ≤ 1 − δ′ for all j, for some δ′(A, r, ε) > 0. Indeed, we note that for each j ∈ [r], as there is a path from j to i of length at most r, and as Λ is ε-separated and n is A-balanced, the probability that GW(n, P, e_j) produces an individual of type i within r steps is at least ξ(A, ε, r) > 0. Therefore, the probability that this process survives is at least ξ(A, ε, r) times the probability that GW(n, P, e_i) survives, which is at least δ. Thus, the claim is proved for δ′ = ξ(A, ε, r)·δ.

The Galton-Watson simulation of connected components in random graphs
The following lemma will be useful for us to analyze the size of connected components in graphs by translating such questions to their analogous counterparts in the Galton-Watson setting.

Lemma 3.3. Let G ∼ G(n, P), let i ∈ [r] and let v_i be a vertex of type i. Then the size of the connected component of v_i in G is stochastically dominated by pop(GW(n, P, e_i)); moreover, for any k ∈ N, it stochastically dominates min(k, pop(GW(n − T, P, e_i))), where T = (k, ..., k).

Proof. The proof proceeds by a coupling argument.
The upper bound. Consider a coupling between the exploration process of the connected component of v_i and GW(n, P, e_i). At each step, we will have queues Q_graph, Q_GW of unsaturated vertices in the graph and in the Galton-Watson process, initially containing only v_i. At each step, we take a vertex w ∈ Q_graph, suppose its type is i′, and an individual w̃ ∈ Q_GW of the same type i′. Letting T_t = (T_t(1), ..., T_t(r)) be the number of vertices of each type explored in the graph so far, we sample N_j ∼ Binomial(n_j − T_t(j), p_{i′,j}) and N_j(GW) ∼ Binomial(n_j, p_{i′,j}) in a correlated manner so that N_j ≤ N_j(GW). We add N_j vertices of type j to Q_graph and N_j(GW) individuals of type j to Q_GW.
Observe that at each step of the exploration process, for each type j ∈ [r] the number of individuals of type j in Q GW is at least the number of individuals of type j in Q graph . Also, at each step of the process the number of vertices in the graph we have explored is at most the number of individuals explored in the Galton-Watson process.
The lower bound. For the lower bound, we use the same coupling, with one difference. If the total population explored in the process so far is at most k, we take w ∈ Q_graph and w′ ∈ Q_GW of the same type i′, and sample N_j ∼ Binomial(n_j − T_t(j), p_{i′,j}) and N_j(GW) ∼ Binomial(n_j − T(j), p_{i′,j}) in a correlated manner so that N_j(GW) ≤ N_j; this is possible as T_t(j) ≤ k ≤ T(j).
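The monotone coupling used in both bounds can be realised concretely by building the two Binomial counts from the same coin flips. The sketch below (our own illustration, not the paper's construction) shows the upper-bound direction, where Binomial(n − T, p) ≤ Binomial(n, p) holds pointwise on every sample.

```python
import numpy as np

def coupled_binomials(n, T, p, rng):
    """Sample (N_graph, N_GW) with N_graph ~ Binomial(n - T, p) and
    N_GW ~ Binomial(n, p), coupled so that N_graph <= N_GW always holds:
    both counts are built from the same n Bernoulli(p) coin flips, and
    the graph count simply ignores the last T of them."""
    coins = rng.random(n) < p
    return int(coins[: n - T].sum()), int(coins.sum())

rng = np.random.default_rng(1)
samples = [coupled_binomials(100, 10, 0.05, rng) for _ in range(1000)]
```

Both marginals are the correct Binomial distributions, since restricting to the first n − T coins does not change their law; only the joint distribution is engineered.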
As Lemma 3.3 suggests, to understand the behaviour of connected components in random graphs we may need to understand two different Galton-Watson processes, namely GW(n − T, P, e_i) and GW(n, P, e_i). Intuitively, these processes can be thought of as identical as long as max_i T_i = o(n) (at least in the sense of extinction probabilities and rough statistics about the distribution of the population). For us, it will be enough to show that if the first of these processes is super-critical, then so is the second.
Note that √(n_i − n^{0.999}) ≥ √n_i − √(n^{0.999}), and similarly for n_j. Note also that for each i we have n/(rA) ≤ n_i ≤ n and λ_{i,j} ≤ 1/ε, so the last sum is at most r³A⁴/(nε²), and hence |ρ(M) − ρ(Q)|² ≤ r³A⁴/(nε²) = o(1). For i ∈ [r] and t ∈ N, we denote by U(t; T; e_i) the (random) set of unsaturated vertices at time t when we begin the exploration process of T from configuration e_i. We also denote by S(t; T; e_i) the (random) set of saturated vertices at time t. We will often drop the process T from the notation when it is clear from context.
What can be said about the distribution of |U(t; e_i)|? In the standard (i.e., non-multi-type) Galton-Watson process, this question has a rather accurate answer, and this random variable is distributed roughly as a Poisson random variable (see [3] for example). Here, intuitively, the situation is similar, but as we are not aware of any literature addressing this question, we instead establish cruder properties of this random variable (which are enough for our application).
To gain some intuition, it is instructive to check that the expectations of these numbers grow exponentially with t. Indeed, by the second item in Claim 3.1, for large enough t we have ‖M^t e_i‖₂ ≥ δρ(M)^t, and as the expectation of |U(t; e_i)| is at least ‖M^t e_i‖₂, the expected growth is at least exponential. A similar argument, using ‖M^t e_i‖₁ ≤ √r · ‖M^t e_i‖₂, shows that the growth is at most exponential. Unfortunately, expectation considerations do not seem to be enough to make our arguments go through, and we need to establish stronger properties of |U(t; e_i)|. For example, intuition suggests that not only should the expectation of |U(t; e_i)| be large, but that it will be achieved in a very skewed way: either |U(t; e_i)| will be 0 with some probability, or else it will be exponentially large in t. The following lemma proves that something along these lines indeed occurs. Proof. Let T be large enough, to be determined later. For each t = 1, ..., T, we denote by X_t the indicator random variable that is 1 if and only if |U(t; e_i)| ≤ H. Let α_i be the survival probability of the Galton-Watson process starting at e_i. Note that α_i ≤ Pr[GW survives up to step T] = ∏_{t<T} Pr[GW survives up to step t + 1 | GW survives up to step t].
Observe that for each t, Pr[GW goes extinct in step t + 1 | GW survives up to step t] ≥ Pr[X_t = 1 | GW survives up to step t] · c^H, for some c = c(A, r, ε) > 0; this is because for each individual, the probability that it has 0 descendants is at least c, for an appropriately chosen c. Set q_t = Pr[X_t = 1 | GW survives up to step t]; then, moving to the complement events in (5), this bound yields δ ≤ α_i ≤ ∏_{t<T}(1 − q_t c^H) ≤ exp(−c^H ∑_{t<T} q_t). Therefore, there is t such that q_t ≤ c^{−H} log(1/δ)/T. We thus take T = 2c^{−H} log(1/δ) and get that q_t ≤ 1/2, and the proof is concluded by noting that Pr[X_t = 0 | GW survives up to step t] = 1 − q_t ≥ 1/2. Next, we prove that the total number of individuals that we explore in constantly many steps obeys a strong tail bound. Proof. The proof is by induction on t.
Base case. When t = 1, we have |U(1; e_i) ∪ S(1; e_i)| ≤ 1 + ∑_{j=1}^{r} Binomial(n_j, p_{i,j}), where all of the binomial random variables are independent. As p_{i,j} ≤ O_{A,r,ε}(1/n), the expectation of ∑_{j=1}^{r} Binomial(n_j, p_{i,j}) is O_{A,r,ε}(1), so the claim follows from Chernoff's bound.
Inductive step. Suppose the claim is correct for t, and let ξ = ξ(A, r, ε) > 0 be a small constant to be determined later. By the inductive hypothesis, Pr[|U(t; e_i) ∪ S(t; e_i)| ≥ ξk] ≤ e^{−δk}. Condition on L = |U(t; e_i) ∪ S(t; e_i)| and assume that L ≤ ξk; the number of newly explored vertices is then at most a sum of random variables B_j, where each B_j is a Binomial random variable of the type Binomial(n_{i″}, p_{i′,i″}) for some i′, i″, and they are independent. The expectation of this sum is at most k/2 for small enough ξ (this is how we choose ξ). Therefore, the claim follows by Chernoff's bound.

In this section, we prove Theorem 1.1.

The super-critical case
In this section, we begin the proof of the second item in Theorem 1.1.
High level structure of the argument. Our proof has three steps.
1. No middle ground. We first show that for k− := log²(n), k+ := n^{0.99}, with probability 1 − o(1) the graph G ∼ G(n, P) does not have any connected component whose size is between k− and k+. This statement is the bulk of our proof.

2. Sprinkling. Second, we show that with probability 1 − o(1), all components whose size exceeds k+ will be connected to each other; thus, in conjunction with the previous statement, we conclude that with high probability G consists of components of size at most k−, and a single component that exceeds it.
3. Estimating the size of the component. Finally, by appealing to the Galton-Watson process again, we show that the probability of a vertex v_i ∈ V_i to be in the single component exceeding k− is α_i ± o(1), where α_i is the probability that the Galton-Watson process with P survives when started at initial configuration e_i. The proof is then concluded by a simple application of Chebyshev's inequality.
We remark that this proof outline is analogous to the proof in the simpler case of the standard Erdős-Rényi model (see for example [3]). The proofs of the first and third steps, however, require more work.

No middle ground: proof overview
Next, we elaborate on the main step in the argument outlined above, in which we show that with probability 1 − o(1), G does not contain components of size between k − and k + .
Suppose we wish to explore the connected component of a given vertex v i ∈ V i for some type i ∈ [r]. Towards this end, it is useful to consider a coupling between what's happening in the graph G, and in a corresponding execution of the Galton-Watson process with P on initial configuration v i .
The only difference between the two procedures is that in the graph case there is no replacement; hence, after we have explored T vertices spread according to the vector T = (T_1, ..., T_r) ∈ N^r, the type distribution of the neighbours of a new unsaturated vertex is slightly different. Namely, for each j ∈ [r], the number of neighbours of type j of a vertex v_i of type i is distributed as Binomial(n_j − T_j, p_{i,j}), whereas in the Galton-Watson process, the number of offspring of type j of an element of type i would still be distributed as Binomial(n_j, p_{i,j}). However, as these random variables are "close" to each other (at least as long as T is not too large), this difference should be thought of as minor. For clarity, we ignore this issue in this overview.
The main issue in working with the original Galton-Watson process is that it is not immediately clear from a local point of view whether the process is super-critical or not. For example, it could be the case that (a) for each j ∈ [r − 1], an individual of type j can only have descendants of type j + 1, and the average number of them could be small, say 1/2, and (b) for j = r, an individual of type j is likely to have many descendants of type j, say at least 2 on average. This process is connected and super-critical; however, if we start with an individual of type j = 1, it will be hard for us to identify this locally. Namely, we would need to make at least r steps in the process in order to have a chance of having an offspring of type r, at which point it is "clear" that the process has a chance to never go extinct.
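The chain example above is easy to verify numerically. In the sketch below (our own illustration, with r = 5) the mean-offspring matrix M is upper triangular, so ρ(M) = 2, although the expected population started from a type-1 individual shrinks for the first r − 1 generations before the exponential growth kicks in.

```python
import numpy as np

r = 5
M = np.zeros((r, r))
for j in range(r - 1):
    M[j, j + 1] = 0.5      # type j spawns 1/2 individuals of type j+1 on average
M[r - 1, r - 1] = 2.0      # type r spawns 2 individuals of type r on average

# M is upper triangular, so its eigenvalues sit on the diagonal: rho(M) = 2.
rho = max(abs(np.linalg.eigvals(M)))

# Expected generation sizes starting from a single type-1 individual:
e1 = np.zeros(r)
e1[0] = 1.0
sizes = [(e1 @ np.linalg.matrix_power(M, t)).sum() for t in range(1, 10)]
# sizes decay (0.5, 0.25, 0.125, ...) until type r is reached,
# and only then start doubling each step.
```

This is exactly the local-view problem: for the first r − 1 generations the process looks sub-critical from type 1, even though ρ(M) = 2.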
To circumvent this issue, we consider a modified Galton-Watson process, call it GW_modified, in which a step corresponds to a batch of constantly many parallel steps in the original Galton-Watson process. We will also explore the connected component of v in a process that is analogous to the modified Galton-Watson process. Using this idea, we are able to show that: 1. In order for the connected component to exceed k− in size, except for negligible probability, the modified Galton-Watson process must have gone on for at least Ω(k−) steps.

2. If the graph exploration process survives for at least T steps, then except for negligible probability it has at least Ω(T) unsaturated vertices at step T.
3. If the graph exploration process has at least U unsaturated vertices at some point, then except with probability e^{−Ω(U)}, the connected component of v_i will exceed k+ in size. Intuitively, this is true since each one of the unsaturated vertices can be thought of as initiating a Galton-Watson process, which has positive probability of not going extinct. Furthermore, the events of "not going extinct" are not too negatively correlated.
Below is a formal statement of a lemma that easily implies the "no middle ground" step.
We define a modified Galton-Watson process T_modified as follows. The process starts at some vertex v ∈ V, say of type i, and maintains, as before, lists LU(t) and LS(t) of unsaturated and saturated vertices at step t. Upon exploring a vertex w ∈ LU(t), if its type is j, we run the original Galton-Watson process from w for t_j steps (in the "parallel view"), and then update LS(t) and LU(t) accordingly.

The sandwiching random variables
We will want to estimate the number of explored vertices, as well as the number of unsaturated vertices, at various times in the process. However, as the random variables counting the number of newly explored vertices at each step are dependent, we are not able to use strong concentration bounds such as Chernoff's inequality directly. To overcome this difficulty, we will use auxiliary random variables to lower bound the number of newly found unsaturated vertices, and to upper bound the total number of explored vertices, at each point in the process.
For each i, let R_i be the random variable counting the number of unsaturated vertices found in a single step of the modified Galton-Watson process T_modified from configuration e_i. Let C > 0 be a large enough constant to be determined later, and set R = min(R_1, ..., R_r, C) (where R_1, ..., R_r are independent). It is clear that R is stochastically dominated by each R_i, and we next show that it obeys a tail bound and that the expectation of R is strictly larger than 1. Proof. First, consider the random variable R̃ = min(R_1, ..., R_r). To upper bound the last sum, we use Claim 3.6 to bound Pr[R̃ ≥ h] ≤ e^{−βh} for some β = β(A, r, ε) > 0. Therefore, taking C large enough, this expression is at most 1.
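The two properties claimed for R can be eyeballed in a quick numerical sketch. The Poisson one-step counts and the means below are our own stand-ins for the actual R_i of the modified process; the point is only that the minimum of a few independent counts, truncated at C, still has expectation well above 1 and a rapidly decaying tail.

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([4.0, 5.0, 6.0])   # hypothetical per-type one-step means
C = 30                            # truncation constant
# R = min(R_1, ..., R_r, C), with independent (here Poisson) R_i:
R = np.minimum(rng.poisson(mu, size=(100_000, 3)).min(axis=1), C)

mean_R = R.mean()                 # comfortably above 1
tail = (R >= 20).mean()           # Pr[R >= h] decays exponentially in h
```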
We also construct a random variable W that will serve as an upper bound for the population. Let T₂ = GW(n, P), and let T₂_modified be the modified Galton-Watson process of T₂. Namely, this process maintains lists of saturated and unsaturated vertices, and at each step picks an unsaturated vertex, say w of type j, and then performs t_j parallel steps of T₂ on w. Finally, the lists of saturated and unsaturated individuals are updated appropriately.
Let R̂_i be the random variable measuring the total population explored in a single step of T₂_modified from configuration e_i, and define W = max(R̂_1, ..., R̂_r). Proof. Immediate from Claim 3.6 and the union bound.

The main argument: the proof of Lemma 4.1
Fix v ∈ V. We explore the connected component of v using a process G as follows. Throughout the process, we maintain lists LU(t) and LS(t) of unsaturated and saturated vertices at step t, respectively. At each step, we take an arbitrary w ∈ LU(t), say its type is i, and then explore the neighbourhood of radius t_i around w (analogously to the modified Galton-Watson process). We add w and any other new saturated vertex found in this step to LS(t), and add all new unsaturated vertices to LU(t).
Let D be a random variable denoting the total number of steps in the process G. Let U(ℓ) be the number of unsaturated vertices found in the ℓ-th step of G; the number of unsaturated vertices at step L is then ∑_{ℓ=1}^{L} U(ℓ) − L. We also denote by S(ℓ) the number of saturated vertices introduced in the ℓ-th step of the modified process.
The coupling. We now describe a coupling procedure that will help us analyse the above random variables. Suppose we are at the ℓ-th step of the process. If we have explored more than k+ vertices, we halt. Otherwise, we take an unsaturated vertex w, say of type j, and perform a step of G.
1. Sample a random variable R(ℓ), which is an independent copy of R, and W(ℓ), an independent copy of W. 2. Sample U(ℓ), S(ℓ) conditioned on U(ℓ) ≥ R(ℓ) and S(ℓ) ≤ W(ℓ) (we justify below why this is possible).
To see that the last step is possible, let Ũ(ℓ) be the total number of unsaturated vertices found in G(n − k+, P) when we do a step from a vertex of type j. Note that U(ℓ) stochastically dominates Ũ(ℓ), which in turn stochastically dominates R(ℓ). Since the stochastic domination relation is transitive, by Fact 2.11 we can sample U(ℓ) as above. The argument for S(ℓ) is analogous.
This process halts if the total population exceeds k+, or if we run out of unsaturated vertices. To make the analysis simpler, it will be helpful to imagine that we keep sampling copies of R, namely R(ℓ), even after the above process terminates, so that we in fact have the random variables R(ℓ) for all ℓ ∈ N. We also remark that they are independent.
Analysis. Let ζ be a parameter to be chosen later. To bound the probability in question, we first use the union bound. We first show that it is very unlikely that the process terminates early yet explores many vertices. We fix ζ from Claim 4.4 henceforth.
Proof. We first argue that Pr[∑_{ℓ=1}^{ζk−} R(ℓ) ≤ 5ζk−] ≤ e^{−δ′k−} for some δ′ = δ′(A, r, ε) > 0. Indeed, the random variables R(ℓ) are independent, bounded between 0 and C = C(A, r, ε) > 0, and have expectation at least 10 by Claim 4.2, so this follows from Chernoff's bound. The rest of the proof is devoted to bounding the probability on the right hand side. Note that whenever the event on the right hand side holds, the number of unsaturated vertices explored by time ζk− is ∑_{ℓ=1}^{ζk−} R(ℓ) − ζk−, and in particular at least 4ζk−. Consider the process at time ζk−, condition on the set of vertices in LU(ζk−) and the set of vertices in LS(ζk−), and let s₁, s₂ be the sizes of these sets respectively. The contribution of the case s₁ + s₂ > k+ to the probability in question is 0, so we assume henceforth that s₁ + s₂ ≤ k+. Fix an ordering w₁, ..., w_{ζk−} on some ζk− vertices from LU(ζk−). For each 1 ≤ ℓ ≤ ζk−, let E_ℓ be the event that, starting the exploration process from w_ℓ, we find at least k+ vertices (without using vertices from LU(ζk−) ∪ LS(ζk−)). Let q_ℓ = Pr[Ē_ℓ | ⋂_{ℓ′<ℓ} Ē_{ℓ′}].
We bound each q_ℓ from above. Condition on ⋂_{ℓ′<ℓ} Ē_{ℓ′}, and further on the vertices explored in the connected components of w₁, ..., w_{ℓ−1}. Let T be the type statistics of these vertices together with the vertices in LU(ζk−) ∪ LS(ζk−). The event E_ℓ is then the event that the vertex w_ℓ has a connected component of size at least k+ in G(n − T, P). Therefore, by Lemma 3.3, letting j be the type of w_ℓ, we have q_ℓ ≤ 1 − Pr[pop(GW(n − T − k+·1, P, e_j)) ≥ k+] ≤ 1 − Pr[GW(n − T − k+·1, P, e_j) survives].
Plugging Claims 4.4 and 4.5 into (6) completes the proof. Proof. Let δ = 1/(4A). Denote k′ = δk/log n. We apply the union bound, using 1 − x ≤ e^{−x} in the last inequality. Note that the binomial coefficient (n_j choose k′) is at most n^{k′} = e^{k′ log n} = e^{δk}, and that n_j ≥ n/A and k ≤ n/(2A), so the probability in question is upper bounded by e^{δk} · e^{−k/2A} ≤ e^{−k/4A}.
Define the matrix P′ by p′_{i,j} = n^{−1.95} · 1[λ_{i,j} > 0]. Proof. Let X be the set of potential edges, i.e., pairs e = (u, v) of vertices whose types i, j satisfy λ_{i,j} > 0. For each e ∈ X, let Z_e be the indicator random variable of the event e ∈ G, and let Z′_e be the indicator random variable of the event e ∈ G′. Then the statistical distance between G and G ∪ G′ is at most the statistical distance between the ensembles of random variables (Z_e)_{e∈X} and (Z_e ∨ Z′_e)_{e∈X}, and each one of these ensembles consists of independent random variables. We compute the KL-divergence between the two ensembles. Due to independence, this KL-divergence is equal to the sum of the coordinate-wise divergences, where Z_e ∨ Z′_e is a Bernoulli random variable with parameter p̃_{i,j} = p_{i,j} + (1 − p_{i,j}) · n^{−1.95}. A standard computation using log(z) ≤ (z − 1)/ln 2 shows that this divergence, and hence the statistical distance, is o(1). Proof. Denote the probability in question by q.
We first argue that, except with probability e^{−√n}, each connected component of size at least k+ contains at least n^{0.98} vertices of each type. More specifically, let E be the corresponding event; by Claim 4.6 and the union bound, the probability of Ē is at most n · e^{−δn^{0.98}} ≤ e^{−√n}, where the last inequality holds for large enough n. We now argue that if E holds, then any connected component of size at least n^{0.99} contains at least n^{0.98} vertices of each type. First, note that there is a type i such that |C ∩ V_i| ≥ n^{0.99}/r. Since E holds, for any j such that λ_{i,j} > 0 we have that |C ∩ V_j| ≥ δn^{0.99}/log n, and as Λ is connected we get that for all j it holds that |C ∩ V_j| ≥ δ^r n^{0.99}/log^r n ≥ n^{0.98}. Sample G ∼ G(n, P) and G′ ∼ G(n, P′). Then q is at most the statistical distance between G and G ∪ G′, plus the probability of Ē, plus the probability that the number of connected components in G ∪ G′ whose size is at least k+ is at least 2 while G ∈ E.
By Claim 4.8, the statistical distance between G and G ∪ G′ is o(1). By the argument above, the probability of Ē is o(1). Finally, for the last probability, condition on G and let C₁, C₂, ..., C_ℓ be all connected components of G of size at least k+. By Claim 4.7, for any distinct ℓ₁, ℓ₂, the probability that there is no edge between C_{ℓ₁} and C_{ℓ₂} is at most e^{−n^{0.01}}, so by the union bound the last probability is at most ℓ² e^{−n^{0.01}} ≤ n² e^{−n^{0.01}} = o(1). Combining everything, we get that q = o(1).
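The quantity e^{−n^{0.01}} in this sprinkling bound comes from a short computation: two components each containing n^{0.98} vertices of a common pair of types span n^{1.96} potential sprinkled edges, each present in G′ with probability n^{−1.95} independently. A sketch of the arithmetic (with an illustrative choice of n, our own):

```python
import math

# Two large components each contain a = b = n^0.98 vertices of a common
# type, spanning a*b = n^1.96 potential sprinkled edges, each present in
# G' with probability p = n^-1.95 independently, so
#   Pr[no sprinkled edge] = (1 - p)^(a*b) <= exp(-p*a*b) = exp(-n^0.01).
n = 10 ** 6                     # illustrative population size
a = b = n ** 0.98
p_sprinkle = n ** -1.95
bound = math.exp(-p_sprinkle * a * b)
```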

The size of the giant component
Let E be the event that G ∼ G(n, P) contains only components of size at most k−, together with at most a single connected component whose size exceeds k+. Using Lemma 4.1, the probability that G ∼ G(n, P) contains a connected component whose size is between k− and k+ is at most n · e^{−δk−} = o(1) for large enough n ≥ N. Secondly, by Lemma 4.9 the probability that G contains more than one connected component of size at least k+ is o(1). Therefore, Pr[E] ≥ 1 − o(1). For the rest of the proof, we sample G ∼ G(n, P) conditioned on E. Let W be the (random) connected component of size at least k+, if it exists. For each v, let Z_v be the indicator random variable of v ∈ W, i.e., of the event |CC(v)| ≥ k−.
Let α_i be the survival probability of GW(n, P, e_i). Proof. We have, where we used the fact that Pr[E] ≥ 1 − o(1). By Lemma 3.3, and by a coupling argument, the last probability is at most Pr[pop(GW(n, P, e_i)) ≥ k−] + o(1), which is at most α_i + o(1). Proof. As before, where we used the fact that Pr[E] ≥ 1 − o(1). Let i, j ∈ [r] be types (not necessarily distinct), and let v ∈ V_i, u ∈ V_j be two distinct vertices.
Proof. By definition, to analyse the last probability we condition on the connected component of v, call it S. By symmetry, for each u ∈ V_j the probability that u ∈ S is the same, and as |S| ≤ k− the probability that u ∈ S is at most k−/n_j. Otherwise, u ∉ S, and then the second probability is determined by the remaining graph, where T is the type statistics of S. Using the natural coupling between G(n, P) and G(n − T, P), this is at most k−/n_j, and we are done.
We are now ready to show that for each i ∈ [r], with probability 1 − o(1) we have |W ∩ V_i| = (α_i + o(1))n_i. To compute the variance, we expand it in terms of the covariances of the Z_v, where we used Claim 4.12 in the last inequality. Thus, var(|W ∩ V_i|) ≤ 4Ak−²n, so by Chebyshev's inequality |W ∩ V_i| concentrates around its expectation, which by Claims 4.10 and 4.11 equals (α_i + o(1))n_i, where v ∈ V_i is some vertex. Applying the union bound on Lemma 4.13 over all i ∈ [r] gives that |W ∩ V_i| = (α_i + o(1))n_i for all i ∈ [r], except with probability o(1). Denote this event by E₂.
The event in question in the second item of Theorem 1.1 is simply E ∩ E₂, and as each of these events has probability 1 − o(1), it follows that the probability of E ∩ E₂ is also 1 − o(1), and we are done.

Numerical simulations
In this section we perform several numerical simulations to highlight some features of the pandemic spread in our proposed model. In Section 5.1 we calculate the threshold for having a GCC, and its size, in two-type models, and show the agreement with Theorem 1.1 (and with the size of the component given in Equation (3) and Theorem 2.8).
Next, we would like to consider questions regarding the pandemic while it still develops (and not only about its end result). As we proved, the basic reproduction number at the beginning of the disease spread is the Perron-Frobenius eigenvalue ρ(M) of the initial matrix M, and it determines whether or not an outbreak occurs. As the spread progresses, a non-negligible fraction of the population gets infected, and the ratios between the numbers of unsaturated vertices (vertices that are susceptible) of the different types change. This follows simply from the fact that some types are more infectious and susceptible than others, and they tend to have a larger percentage of infected vertices. Thus, while the probability matrix P is constant throughout the spread, the matrix M is modified at each time step. This change in time is not important if we are interested only in the final size of the GCC at the end of the pandemic. However, if we are interested in answering questions before the end of the pandemic, such as the size of the GCC at the herd immunity point, we have to recalculate M at each time step.
This calculation gives M_t = P · diag(n_1(t), ..., n_r(t)), where n_i(t) is the number of unsaturated (susceptible) vertices of type i in the graph at time t. We use the SIR framework, where saturated vertices are removed. At each point in time t, we calculate the matrix M_t and get a new Perron-Frobenius eigenvalue and a corresponding eigenvector. By updating the matrix we can follow the development of the disease in time. Without any countermeasures or external interventions, this time-updated eigenvalue naturally decreases until the pandemic reaches the entire GCC and dies out on its own. When trying to curb an outbreak in a given population, the time-dependent eigenvalue can be very important, e.g. to check the possible effectiveness of countermeasures.
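The recalculation of M_t is a one-liner in practice. The sketch below is our own illustration (the function name and the example rates are hypothetical): it forms M_t = P · diag(n_i(t)) from the current susceptible counts and extracts the Perron-Frobenius eigenvalue and direction.

```python
import numpy as np

def effective_R(P, susceptible):
    """Form M_t = P * diag(n_1(t), ..., n_r(t)) and return its
    Perron-Frobenius eigenvalue together with the normalised nonnegative
    eigenvector (the 'direction' of the spread)."""
    M = P * np.asarray(susceptible, float)[np.newaxis, :]   # M[i,j] = p_{i,j} n_j(t)
    vals, vecs = np.linalg.eig(M)
    k = np.argmax(vals.real)
    v = np.abs(vecs[:, k].real)
    return vals[k].real, v / v.sum()

# Two-type example: rates lambda_{i,j} and community sizes (hypothetical values)
n = np.array([10_000, 10_000])
Lam = np.array([[1.5, 0.6], [0.6, 3.0]])
P = Lam / n.sum()                      # p_{i,j} = lambda_{i,j} / n

rho0, v0 = effective_R(P, n)           # at t = 0: the basic reproduction number
rho1, v1 = effective_R(P, n // 2)      # after half of each type has been removed
```

Since M_t depends linearly on the susceptible counts, halving all of them exactly halves the effective reproduction number; with heterogeneous depletion the eigenvector shifts as well, which is the "direction" tracked in Section 5.4.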
In Section 5.2 we use our analytical results to calculate the GCC size at the end of the pandemic and compare it to its size at the herd immunity point, where the basic reproduction number drops below one. In Section 5.3 we perform the same comparison when countermeasures are being taken. We will see that countermeasures can lower the difference between the number of infected at the herd immunity point and at the end of the disease. In Section 5.4 we follow the development of the pandemic in time and show the time dependence of the Perron-Frobenius eigenvalue and the corresponding eigenvector.

Comparison of GCC: analytical calculation versus numerical simulation
In this subsection we compare the calculation of the threshold for having a GCC, and of its size, as given by Theorem 1.1, with a numerical simulation. We consider a two-type model and sample graphs with n₁ = n₂ = 10,000 vertices. We generate random two-type graphs with different configurations of λ_{1,1}, λ_{2,2} and λ_{1,2} = λ_{2,1}. The Perron-Frobenius eigenvalue ρ(M) of the matrix M reads ρ(M) = (λ_{1,1} + λ_{2,2})/4 + √((λ_{1,1} − λ_{2,2})²/16 + λ_{1,2}²/4). In Figure 2 we compare the GCC size and the phase transition threshold of the simulations and the analytical calculation. It is clear that the sampled random graphs match the analytical results, both for the threshold value ρ(M) = 1 and for the size of the GCC.
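The analytical GCC curve can be reproduced by solving the survival-probability fixed point. The sketch below is our own code (the function name is hypothetical), using the Poisson approximation Binomial(n_j, λ_{i,j}/n) ≈ Poisson(λ_{i,j} n_j/n): it iterates α_i ← 1 − exp(−∑_j M[i,j] α_j), and the resulting α_i are the per-type GCC fractions of Theorem 1.1.

```python
import numpy as np

def gcc_fractions(Lam, n, iters=500):
    """Solve alpha_i = 1 - exp(-sum_j M[i,j] * alpha_j) by fixed-point
    iteration, where M[i,j] = lambda_{i,j} * n_j / n is the mean-offspring
    matrix.  alpha_i is the survival probability of the process started at
    e_i, i.e. the limiting fraction of type-i vertices in the giant
    component (alpha = 0 when the process is sub-critical)."""
    Lam = np.asarray(Lam, dtype=float)
    n = np.asarray(n, dtype=float)
    M = Lam * (n / n.sum())[np.newaxis, :]
    alpha = np.full(len(n), 0.5)
    for _ in range(iters):
        alpha = 1.0 - np.exp(-M @ alpha)
    return alpha
```

For a single type with λ = 2 this recovers the classical Erdős-Rényi giant-component fraction ≈ 0.797, and for λ = 1/2 it converges to 0, matching the threshold at ρ(M) = 1.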

Herd immunity versus end of the pandemic
In this subsection we consider the difference between homogeneous and heterogeneous infection graphs and the effect on the fraction of infected population at the end of the disease. We show the difference between the fraction of the population infected at the herd immunity point, where the effective reproduction number drops below one, and the fraction infected at the end of the disease. This is the after-burn effect analysed in [32]. Consider two types of populations. Let the initial basic reproduction number be ρ(M) = 2.6, which is in the estimated range for COVID-19, R₀ ≈ 2 to 3. We fix λ_{1,2} = 0.6 and vary the ratio λ_{2,2}/λ_{1,1} and the fraction n₁/(n₁+n₂) of type one in the total population. In Figure 3 we compare the fraction of the population infected at the end of the disease and at the herd immunity point, where the effective reproduction number ρ(M) drops below one (this is the standard definition, see e.g. [6, 8]). The uppermost graphs (brown, red and orange) show the fraction infected at the end of the disease, and the lower graphs (pink, purple and green) show the fraction infected when herd immunity is reached, i.e. ρ(M) = 1. We also see the difference between the two fractions of infected population in the two-dimensional plot (b).
The difference between these two fractions is significant [32]; it is of much importance and is often ignored in discussions about reaching herd immunity. We also see in the graphs the difference between the homogeneous and heterogeneous cases. Real-world pandemic spread follows a heterogeneous network, and we see that the fraction of infected population can be significantly lower than the often-quoted number for homogeneous spread.
(a) (b) Figure 3: A comparison between the fraction of the population infected at the end of the disease and at the herd immunity point where the effective reproduction number ρ(M) drops below one. The uppermost graphs (brown, red and orange) show the fraction infected at the end of the disease, and the lower graphs (pink, purple and green) show the fraction infected when herd immunity is reached, ρ(M) = 1. We see a significant difference, the after-burn effect [32]. We also see the difference between the homogeneous and heterogeneous cases, where the fraction of infected population is lower in the latter. The plots are for two types, where the initial basic reproduction number is ρ(M) = 2.6 and λ_{1,2} = 0.6 (we vary λ_{1,1} over the range [0.7, 3] with steps of 0.1 and interpolate λ_{2,2}). In (a) the sizes of the communities are equal, i.e. n₁ = n₂. (b) The same effect for the same parameters, only changing the community sizes so that n₁/(n₁+n₂) = 10%, i.e. the vertices of type 1 make up 10% of all vertices.

The countermeasure effect
We consider the effect of a partial lockdown (which stops infections between distinct types) on the infected fraction of the population at the end of the pandemic. The lockdown starts once 10% of the population is infected. We take as an example the case of two types with initial basic reproduction number ρ(M) = 2.6. We use λ_{1,2} = 0.6 and vary the ratio λ_{2,2}/λ_{1,1} and the fraction n₁/(n₁+n₂) of type one in the total population. The results are plotted in Figure 4. The uppermost graphs (brown, red and orange) show the fraction of infected population at the end of the disease when no countermeasures are taken. The lower graphs (pink, purple and green) show the fraction of the population that is infected if we impose the lockdown.

The direction of the disease spread
One of our observations is that despite the complex structure of the pandemic spread, one can identify a propagation direction. It is given by the Perron-Frobenius eigenvector of the matrix M, where the corresponding eigenvalue is the reproduction number of the disease, whose threshold value of 1 separates an outbreak from no outbreak. We refer to the Perron-Frobenius eigenvector, which is time dependent (i.e., of the matrices M_t defined above); in order to calculate it, one needs to update the matrix M to M_t as the pandemic evolves. Intuitively, the Perron-Frobenius eigenvector points, at any time step, in the direction where there is the highest potential to infect. This takes into account the size of the remaining uninfected population of the different types at that time step and their probability to infect. In fact, it supports the result of Theorem 1 of [9].
In Figure 5 we plot the Perron-Frobenius eigenvector for two distinct scenarios: homogeneous and heterogeneous. In plots (b) and (c) we see the heterogeneous case. In this example we take 80% of the population from type one and 10% from each of types 2 and 3, with ρ(M) ≈ 2.7. Types 2 and 3 are relatively highly infectious, and more susceptible to being infected, than type 1. This means that they are quickly infected, and after they infect others they are removed (become saturated vertices). Type 1, however, is much less infectious and susceptible, and retains more unsaturated vertices for longer; this means that only at a later stage of the outbreak, when vertices of types 2 and 3 have become saturated, does the pandemic catch more vertices of type 1, and much more quickly. We see in the plot that types 2 and 3 decrease quickly in their potential to infect compared to type 1 (their number of unsaturated vertices decreases), while type 1's share of unsaturated vertices increases relative to types 2 and 3 as the pandemic evolves.

Discussion
The contact graph of a real-world pandemic is naturally heterogeneous and complex. It is clearly desirable to be able to work with a general graph, and this is what we have done in this work. We employed the multi-type Galton-Watson branching process and analysed the question of whether there would be an outbreak, and what the fraction of infected population would be at the end of the disease in the case of an outbreak. We defined an (r × r)-dimensional matrix M, where r is the number of types, whose entries encode the expected number of individuals of one type infected by an individual of the same or of another type (and this for each pair of types). M encodes the whole information about the evolution of the pandemic. In particular, its Perron-Frobenius eigenvalue is the basic reproduction number that determines whether there will be an outbreak. The corresponding eigenvector points in the direction of the spread, and at each time step takes into account the remaining individuals of each type that can be infected, as well as their probability to infect.
Our framework allows for a general simulation on real-world data, once collected; it would be of importance to follow this direction. In our numerical simulations we presented several examples that highlight certain properties of the spread, such as the difference between homogeneous and heterogeneous infection networks and the difference between herd immunity and the final end of the disease spread. We have also observed a property of our model regarding the eigenvector corresponding to the Perron-Frobenius eigenvalue: we have shown, at least numerically, that this eigenvector describes the direction in which the pandemic will spread (i.e., to which types first and more quickly, and to which types later on).
Further developments of our model may include applying it to more specific cases with explicit types, as done in [4]. One may want to use our model to incorporate connections between caregivers (who may be viewed as super-spreaders of one type) and patients (who may themselves be partitioned into different communities). Through this, one can study the effects of more subtle interventions and countermeasures, or plan an effective assignment of caregivers, in order to prevent further infections and stop the outbreak. It should also be considered that frameworks other than SIR are more applicable in some cases, as described in [20], e.g. by allowing a non-immediate recovery time (thus prolonging the infectious stage) that follows a certain distribution.
Moreover, our analysis is applicable to general information spread, and as such outlines a structure that can be used not just for pandemic spread. A generalization of our work that is worth pursuing in this context is to a non-symmetric probability matrix P, which naturally arises, for instance, in search engines. Another important direction is to allow general degree distributions, rather than just binomial, e.g. the Gamma distribution, which has proven valuable in describing pandemics such as COVID-19.