Fixed speed competition on the configuration model with infinite variance degrees: unequal speeds

We study competition of two spreading colors starting from single sources on the configuration model with i.i.d. degrees following a power-law distribution with exponent tau in (2,3). In this model two colors spread with a fixed but not necessarily equal speed on the unweighted random graph. We show that if the speeds are not equal, then the faster color paints almost all vertices, while the slower color can paint only a random subpolynomial fraction of the vertices. We investigate the case when the speeds are equal and typical distances in a follow-up paper.


1. Introduction and results
1.1. The model and the main result. Let us consider the configuration model $\mathrm{CM}_n(d)$ on $n$ vertices, where the degrees $D_v$, $v \in \{1, 2, \dots, n\} := [n]$, are i.i.d. with a power-law tail distribution. That is, given the number of vertices $n$, to each vertex we assign a random number of half-edges drawn independently from a distribution $F$, and the half-edges are then paired randomly to form edges. In case the total number of half-edges $L_n := \sum_{v \in [n]} D_v$ is not even, we drop one half-edge from $D_n$ (see below for more details). We assume that

$$\frac{c_1}{x^{\tau-1}} \le 1 - F(x) = \mathbb{P}(D > x) \le \frac{C_1}{x^{\tau-1}}, \tag{1.1}$$

with $\tau \in (2, 3)$, and all edges have weight 1. We assume $\mathbb{P}(D \ge 2) = 1$, guaranteeing that the graph has almost surely a unique connected component of size $n(1 - o(1))$, see e.g. [27, Theorem 10.1] or [34, 35]. We further denote the mass function of $-1$ plus the size-biased version of $D$ by

$$f^*_j := \frac{(j + 1)\,\mathbb{P}(D = j + 1)}{\mathbb{E}[D]}, \qquad j \ge 0. \tag{1.2}$$

We write $F^*(x)$ for the distribution function $F^*(x) = \sum_{j=0}^{\lfloor x \rfloor} f^*_j$.

Pick two vertices $R_0$ (red source) and $B_0$ (blue source) uniformly at random in $[n]$, and consider these as two sources of spreading infections. Each infection spreads deterministically on the graph: color blue takes $\lambda$ time units to pass through an edge, while color red needs 1 unit of time. Without loss of generality we can assume that $\lambda > 1$. Each vertex is painted the color of the infection that reaches it first, keeps its color forever, and starts coloring the outgoing edges at the speed of its color. When the two colors reach a vertex at the same time, the vertex gets color red or blue according to an arbitrary adapted rule, i.e., a rule that does not depend on the future. One example of such a rule is that the vertex is painted red or blue with probability 1/2 each, independently of everything else. Another natural adapted rule is that a vertex at which the two colours arrive at the same time is painted red or blue with probability proportional to the number of its previously red-colored and blue-colored neighbors. Let $R_t := R_t(n)$ and $B_t := B_t(n)$ denote the number of red and blue vertices occupied up to time $t$, respectively. We denote by $B_\infty := B_\infty(n)$ the number of vertices eventually occupied by blue. We emphasise that the randomness in this model comes only from the structure or topology of the graph and the uniform choice of the source vertices for the two colors; once these are settled, the dynamics is completely deterministic.
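Since the dynamics is deterministic once the graph and the sources are fixed, it can be computed by a multi-source shortest-path search. The following is a minimal sketch, not the construction used in the proofs: the function name `compete`, the adjacency-dictionary input and the choice to resolve ties in favour of red are all assumptions made for illustration (always painting tied vertices red is one admissible adapted rule).

```python
import heapq

RED, BLUE = "red", "blue"

def compete(adj, red_source, blue_source, lam):
    """Fixed-speed competition: red crosses an edge in 1 time unit, blue in lam.

    adj maps each vertex to a list of neighbours (multi-edges are harmless).
    Vertices are assumed to be mutually comparable, e.g. integers.
    Ties are resolved in favour of red (an illustrative adapted rule).
    """
    speed = {RED: 1.0, BLUE: float(lam)}
    # Heap entries: (arrival time, tie-break rank, vertex, colour).
    heap = [(0.0, 0, red_source, RED), (0.0, 1, blue_source, BLUE)]
    colour = {}
    while heap:
        t, rank, v, c = heapq.heappop(heap)
        if v in colour:          # already painted by an earlier arrival
            continue
        colour[v] = c            # first arrival wins and keeps the vertex
        for w in adj[v]:
            if w not in colour:
                heapq.heappush(heap, (t + speed[c], rank, w, c))
    return colour
```

On a path with vertices 0, ..., 6, red started at 0, blue at 6 and lam = 2, red and blue both arrive at vertex 4 at time 4, so under this tie rule red takes vertices 0 to 4 and blue keeps 5 and 6.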
Roughly speaking, the first main result of this paper, Theorem 1.2 below, tells us that in the quenched setting, i.e., for almost all realizations of the graph $\mathrm{CM}_n(d)$ and for almost all initial vertices $R_0, B_0$, the faster color always wins, that is, it gets $n - o(n)$ many vertices. Furthermore, the number of vertices the slower color paints is subpolynomial in $n$. More precisely, blue paints whp $\exp\{(\log n)^{2/(\lambda+1)} H(n, Y_r, Y_b)\}$ many vertices, i.e., a stretched exponential in $\log n$ with exponent $2/(\lambda + 1) < 1$, where the coefficient $H(n, Y_r, Y_b)$ is a random function that depends on $n$, $\lambda$, $\tau$, and two random variables $Y_r$ and $Y_b$ that can intuitively be interpreted as some measure of 'how good' the neighbourhoods of the source vertices are: the faster the local neighbourhoods grow, the larger these variables are. Moreover, $H(n, Y_r, Y_b)$ does not converge: it has an oscillatory part that exhibits 'log log-periodicity'.
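To see numerically why a stretched exponential in $\log n$ is subpolynomial, one can compare $\log B$ with $\log n$ for $B = \exp\{(\log n)^{2/(\lambda+1)} h\}$. In the sketch below the random coefficient $H(n, Y_r, Y_b)$ is replaced by a fixed constant `h`; this is purely an illustrative assumption, as is the function name.

```python
import math

def blue_exponent_ratio(n, lam, h=1.0):
    """Return log(B)/log(n) for B = exp{(log n)^(2/(lam+1)) * h}.

    h stands in for the random coefficient H(n, Y_r, Y_b); treating it as a
    constant is an assumption made only for this illustration.
    """
    log_n = math.log(n)
    return (log_n ** (2.0 / (lam + 1.0))) * h / log_n

# The ratio equals h * (log n)^(2/(lam+1) - 1), which decreases to 0 when lam > 1.
ratios = [blue_exponent_ratio(10.0 ** k, lam=2.0) for k in (3, 6, 12, 24)]
```

For any $\lambda > 1$ the ratio tends to zero, so $B$ is eventually smaller than $n^{\varepsilon}$ for every fixed $\varepsilon > 0$, although the decay is slow.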
The other main result, Theorem 1.4, shows that the degree of the maximal-degree vertex that blue ever occupies obeys asymptotic behaviour similar to blue's total number, with a strictly smaller coefficient in the exponent, and the same log log-periodicity. This phenomenon is due to integer part issues coming from the fact that the edge weights are concentrated on a lattice. We emphasise again that these results are quenched.
To be able to state the main theorem precisely, let us define the following random variables.

Definition 1.1. Let $Z^{(r)}_k$, $Z^{(b)}_k$ denote the number of individuals in the $k$th generation of two independent copies of a Galton-Watson process described as follows: the size of the first generation has distribution $F$ satisfying (1.1), and all further generations have offspring distribution $F^*$ from (1.2). Then, for a fixed but small $\rho > 0$, let us define

$$Y^{(n)}_r := (\tau - 2)^{t(n^\rho)} \log\big(Z^{(r)}_{t(n^\rho)}\big), \qquad Y^{(n)}_b := (\tau - 2)^{\lfloor t(n^\rho)/\lambda \rfloor} \log\big(Z^{(b)}_{\lfloor t(n^\rho)/\lambda \rfloor}\big), \tag{1.3}$$

where $t(n^\rho) = \inf\{k : Z^{(r)}_k \ge n^\rho\}$. Let us further introduce

$$Y_r := \lim_{k\to\infty} (\tau - 2)^k \log\big(Z^{(r)}_k\big), \qquad Y_b := \lim_{k\to\infty} (\tau - 2)^k \log\big(Z^{(b)}_k\big). \tag{1.4}$$

We will see below in Section 2 that these quantities are well defined and that $(Y^{(n)}_r, Y^{(n)}_b)$ converges to $(Y_r, Y_b)$ from (1.4) as $n \to \infty$. With this notation in mind, we have the following theorem.

Theorem 1.2. Fix $\lambda > 1$. Then $\lim_{n\to\infty} R_\infty / n = 1$ whp. Further, there exists a bounded and strictly positive random function $C_n(Y^{(n)}_r, Y^{(n)}_b)$ such that, as $n \to \infty$,

$$\frac{\log B_\infty}{(\log n)^{2/(\lambda+1)}\, C_n(Y^{(n)}_r, Y^{(n)}_b)} \xrightarrow{\;d\;} \left(\frac{Y_b^{\lambda}}{Y_r}\right)^{1/(\lambda+1)}. \tag{1.5}$$

We identify $C_n(Y^{(n)}_r, Y^{(n)}_b)$ in (7.17) as a deterministic, oscillating (non-convergent) function of $\tau, \lambda, n, Y^{(n)}_r, Y^{(n)}_b$. $C_n(Y^{(n)}_r, Y^{(n)}_b)$ has the uniform (non-tight) bounds (τ − 2) (τ − 1) 2

Remark 1.3. We also give the accompanying tight bounds on $C_n(Y^{(n)}_r, Y^{(n)}_b)$, see (7.18).

Let us denote by

$$D^{(b,n)}_{\max}(\infty) := \max_{i \in B_\infty} D_i \tag{1.6}$$

the degree of the maximal-degree vertex eventually occupied by blue. As a side result of the proof of Theorem 1.2, we get the following theorem.

Theorem 1.4. There exists a bounded and strictly positive random function $C^{\max}_n(Y^{(n)}_r, Y^{(n)}_b)$, defined below in (6.10), such that, as $n \to \infty$, [...] (1.8)

Remark 1.5. We emphasise that these results are valid for any adapted decision rule for the case when the two colors jump to a vertex at the same time. In case $\lambda$ is irrational, clearly, this rule will never be used. If $\lambda$ is rational, then the normalisation random variables $C_n(Y^{(n)}_r, Y^{(n)}_b)$, $C^{\max}_n(Y^{(n)}_r, Y^{(n)}_b)$ depend on the rule: they are slightly different when the rule always paints these vertices red than when there is a positive chance that these vertices are painted blue. The upper and lower bounds on $C_n(Y^{(n)}_r, Y^{(n)}_b)$, $C^{\max}_n(Y^{(n)}_r, Y^{(n)}_b)$ nevertheless remain the same. On the other hand, when $\lambda = 1$, this rule will play an important role in the outcome.

Remark 1.6 (More than two colors). If there are a finite number of colors with edge passage times $1 = \lambda_1 < \lambda_2 \le \dots \le \lambda_k$, then the statements of Theorems 1.2 and 1.4 stay valid for each $\lambda_i$, $2 \le i \le k$, with limit variables $(Y_i^{\lambda_i} / Y_1)^{1/(\lambda_i+1)}$ on the right hand side of (1.5), where the $Y_i$ are i.i.d. copies of $Y$. The reason for this is that with high probability each slower color only meets the fastest color and never meets the other slow ones. That is, the clusters of slower colors are separated from each other by the cluster of the fastest color.
1.2. Related work and discussion. First we give a (non-complete) overview of the literature on competition on different graph models. Then we mention some more applied results.
In a seminal paper [24], Häggström and Pemantle introduced competition on the grid $\mathbb{Z}^d$. The model is called the two-type Richardson model, and it describes the dynamics of two (red and blue) infections with single source vertices $v_0, v_1 \in \mathbb{Z}^d$ that compete to conquer the grid. In this continuous-time model, a vertex of $\mathbb{Z}^d$ gets a given color at a rate proportional to the number of infected neighbours of that color; once a vertex is infected, it keeps its color forever. Note that the evolution of a single color without the presence of the other color has independent exponential passage times across edges, and a vertex gets infected at the time that equals the length of the shortest path from the source to the vertex. Hence, a single-color process is often called first passage percolation in the literature. Multiple colours then lead to the name competing first passage percolation.
For two colors, we have two possible evolution scenarios. In the first, one of the growing clusters completely blocks the growth of the other color by surrounding it, and then it infects all the remaining healthy vertices. In the second scenario the two clusters continue to grow unboundedly forever: this is called coexistence. The important question is: does coexistence occur with positive probability? Häggström and Pemantle [24] proved that this is the case for the grid $\mathbb{Z}^2$ with i.i.d. exponential passage times. This result was later extended by Garet and Marchand [21] to $\mathbb{Z}^d$, $d \ge 2$, for a vast class of passage time distributions under mild hypotheses. For further literature on the Richardson model see [14,15,22,25,26].
Recently, a noticeable scientific interest arose in understanding the structure of large but finite networks and the behaviour of spreading processes on these networks. Typically, results on these topics are called first passage percolation, see e.g. [6,7,8]. It is then natural to ask what happens when one considers competition of multiple spreading processes on these networks. When studying competitive spreading, one might also gain a more detailed understanding of the structure of these graphs.
The idea of competitive spreading on finite random graph sequences raises several questions. First and foremost, due to the finite size of the graphs the main questions about these models must be rephrased, since infinite growth can never happen. Thus, the definition of coexistence has to be modified in this setting. Consider two competing colors on a sequence of random graphs: is there asymptotic coexistence of the two colors? That is, is it possible that both colors paint a positive proportion of vertices with positive probability, as the size of the graph tends to infinity? If this is not the case, can we determine the number of eventually occupied vertices for both colors in terms of the size of the graphs? What happens if we modify the passage dynamics so that the two infections have different rates of growth $\lambda_1$ and $\lambda_2$? Here we give a (non-complete) overview of the existing literature on these topics for different random graph models.
Antunović, Dekel, Mossel and Peres [2] give a detailed analysis of competition on random regular graphs (degree at least 3) on $n$ vertices with i.i.d. exponential edge weights. They analyse the number of vertices eventually occupied by each color as a function of the speeds $\lambda_1, \lambda_2$ and of the initial number of infected vertices, which might even grow with $n$. They show that asymptotically almost surely the color with higher rate occupies $n - o(n)$ vertices and the slower color paints approximately $n^{\beta}$ vertices for some deterministic function $\beta(\lambda_1, \lambda_2)$. Their results include asymptotic coexistence for equal speeds $\lambda_1 = \lambda_2$ for infections starting from single sources.
Next, van der Hofstad and Deijfen [16] investigate competition with exponential spreading times on the configuration model with i.i.d. degrees coming from a power-law distribution with exponent $\tau \in (2, 3)$. They prove that even if the speeds are not equal, the 'winner' color is random, i.e., the color with slower rate can still take most of the graph. Moreover, the winning color paints all but a finite number of vertices. The randomness of the 'winner' color comes from the fact that the underlying Markov branching process explodes in finite time, and the slower color has a positive chance to explode earlier than the faster color.
A slightly different, discrete time competition model is analysed by Antunović, Mossel and Rácz in [3]. There, the underlying random graph is the growing linear preferential attachment model, and vertices pick their color upon entering the network randomly from the colors of the vertices they attach to. The probability of picking a color is a (possibly linear) function of the number of neighbors with the given color, called the coloring function. The authors analyse coexistence of colors in terms of the properties of the coloring function. Note that in this case the graph has power law τ = 3. The proofs are based on comparison to Pólya urns.
Finally, this paper considers competition on the configuration model with i.i.d. power-law degrees with exponent $\tau \in (2, 3)$, but with deterministic unit edge weights. Theorem 1.2 shows that the fact that the edge weights have a support separated from zero entirely changes the picture observed in [16]: when the speeds are unequal, the faster color always paints $n - o(n)$ vertices, and the slower color can paint only subpolynomially many vertices.
If the speeds are equal, then the phenomena are richer: as a side result of the analysis of the $\lambda = 1$ case, we obtain precise distributional limits of the second order terms in typical distances in the graph. Further, we conjecture that there is still no coexistence with high probability, and the loser type can paint polynomially many vertices with a random exponent that is less than 1. However, this random exponent depends sensitively on the initial local neighbourhoods of the source vertices, and shows different behaviour depending on whether or not the corresponding random variables are within a very specific constant factor of each other. Due to the length of the analysis of this case, and to highlight the rich phenomena that come with it, we decided to treat the equal-speed case in a subsequent paper soon to be published.
From the more applied perspective, competition on networks is present in many aspects of our life. To start with an example, in marketing, companies compete for customers who are connected via their acquaintance network, and who provide word-of-mouth recommendations and opinions about the services of the different companies, see [19,20]. For economic studies on the importance of word-of-mouth, see e.g. [4,11,18]. Recently, 'word-of-mouth' recommendations also happen on a large scale on different online social media such as Facebook and Twitter. For a survey on how online feedback mechanisms differ from original word-of-mouth recommendations and what challenges they pose, see [17]. The paper [32] analyses recommendation-based viral marketing on social media, where the authors also use viral marketing to identify communities of online networks. For recent economic studies of the importance of word-of-mouth recommendations, see e.g. [12,31].
In epidemiology, viruses and bacterial infections spread through society. In this setting, competition can happen among different strains of a pathogen, see e.g. [33] for a study under which conditions coexistence can occur and references therein. In the physics community, [1,30,36] study the effect of the underlying network on co-existence of competing viruses.
The epidemiological analogies have been further exploited by [37], where they study a variation of susceptible-infectious-susceptible epidemic spread, where two epidemics are immune to each other, and the authors show that one of them completely takes over (similarly as in [16]). Then, [5] studies how partial immunity can cause coexistence in the previous model.
Discussion and open problems. The analysis of competition on the configuration model is far from complete. One can for instance ask about different spreading dynamics (edge lengths) and different power-law exponents. Further, one can ask what happens if the colors have entirely different passage time distributions (e.g. one is explosive and the other is not), or what happens if one of the colours has a major advantage by starting from one or many initial vertices of very high degree. These can correspond to e.g. a competitive advantage of different products on the network or to different marketing strategies. Here we list some conjectures for uniformly picked single-vertex sources of infection on $\mathrm{CM}_n(d)$ with i.i.d. power-law degrees of distribution $D$ with exponent $\tau$. We further assume that the passage times can be represented as i.i.d. random variables on edges, from distributions $I_r$, $I_b$ for red and blue, respectively.
1. $\tau \in (2, 3)$:
A. If the spreading dynamics are such that the underlying branching processes defined by $D, I_r$ and $D, I_b$ are both explosive, then we conjecture that there is never coexistence and either of the two colors can win. This is one of our ongoing research projects.
B. If the underlying branching process for one color is explosive while the other one is not, then we suspect that the explosive one always wins.
C. If both underlying branching processes are non-explosive, and we further assume $I_r \overset{d}{=} \lambda I_b$, then we guess that there is no coexistence if $\lambda \neq 1$ (the fastest color wins). We suspect that the number of vertices the 'loser' color paints depends sensitively on the weight distribution. The outcome in the $\lambda = 1$ case might also depend sensitively on the weight distribution.
2. $\tau > 3$:
D. We suspect that if the transmission times $I_r$, $I_b$ both have continuous distributions, and their branching process approximations have different Malthusian parameters, then there is no coexistence, and the number of vertices painted by the slower color is $n^{\beta}$ for some $\beta \in (0, 1)$. When the Malthusian parameters agree, we suspect that there is asymptotic coexistence.
3. $\tau = 3$:
E. In this case $\mathbb{P}(D > x) = L(x)/x^2$, with $L(x)$ a slowly varying function at infinity. We suspect that $L(x)$ and the transmission distributions $I_r$, $I_b$ jointly determine into which of the categories A, C, D above the spreading of the colours falls: if the $x \log x$ criterion holds for the underlying age-dependent branching processes with $D, I_r$ and $D, I_b$, then we expect that the model will show similar phenomena as in case D. If the underlying branching processes are explosive, then similar phenomena are expected as in case A, and if neither of these two holds, then as in case C. Further, if the two colours have significantly different dynamics, i.e., one is explosive and the other one is not, then we conjecture that case B applies.

1.3. Overview of the proof and structure of the paper. The heuristic idea of the proof is as follows: we start growing the two clusters simultaneously. The growth has six phases, each corresponding to a section below, described as follows.
(i) Branching process phase. At first, whp, the two colored clusters do not meet and the growth of both clusters is characterised by the growth rate of the branching process (BP) to which they can be coupled. This we call the branching process phase. The length of this phase is of order $\log\log n / |\log(\tau - 2)| + O(1)$. Then, the faster color (red) reaches the area where the coupling fails to remain valid: $R_t$ reaches size $n^{\rho}$ for some $\rho > 0$.
(ii) Mountain climbing phase. At this point, we start making use of the structure of high-degree vertices in the graph: due to high connectivity, the subgraph formed by high-degree vertices can be represented as a 'mountain' where the height function is linear in the log log-degree. Level sets of this mountain represent vertices with degree of the same order of magnitude, with the maximal degree in the graph at the top of the mountain. We partition this mountain into layers (that is, constant-length intervals on a log log-scale) and we show that every vertex in a given layer has at least one neighbour in the next layer up. As a result, we show the existence of a path for red through these layers of vertices of higher and higher degree such that the path reaches some vertex with degree larger than $n^{(\tau-2)/(\tau-1)}$ at the end. This we call the mountain climbing phase. The climbing phase lasts only finitely many steps, but the constants turn out to be important, so we perform a rather careful analysis. We denote the total time of the branching process phase and the climbing phase for red by $T_r$.
(iii) Crossing the peak of the mountain. We handle very carefully how the color red goes through the peak of the 'degree-mountain'. Vertices of degree much larger than $\sqrt{n}$ form a subgraph that is a complete graph, hence it takes only one step to paint all the very high degree vertices, but the degree of the vertices at which the faster color arrives at the end of this single step depends delicately on the initial random growth rates of the branching processes and their integer and fractional part issues.
(iv) Red avalanche from the peak. After crossing the mountain, red starts sloping down to layers of vertices of smaller and smaller degree. Since it is still true that each vertex in a layer is connected to at least one vertex in the next layer up, this means that in each additional step, red paints all the vertices in the next layer down. We call this the avalanche phase of red. (One can imagine this as red being a very careless climber who, after crossing the peak of a mountain, steps in the snow with a bucket of red paint and starts a huge painted avalanche.)
(v) At the collision time. Now we turn our attention to the blue climber, who does essentially the same as red except that it is slower: after getting out of its local neighbourhood corresponding to the branching process, blue starts its mountain climbing phase as well. Since it is slower, whp it will only reach some low layer of the degree-mountain when red starts its avalanche. With this picture in hand, we can identify the maximal-degree vertex eventually painted blue: this is the vertex in the highest layer blue can still reach. The idea of the proof is to determine the value $\ell$ such that during the total time $T_r + \ell$, blue has climbed up to the same layer as the red avalanche has sloped down to. Since red occupies every vertex in a layer it reaches, it will necessarily bump into blue, who whp reaches only some vertices in that layer. This determines the time when red starts successfully blocking blue.
(vi) Competing with the avalanche. After the meeting time $T_r + \ell$, blue cannot go higher up on the mountain since red already occupies every vertex having degree higher than the maximal degree of blue. Note that at this time most of the graph is still not reached by any color: we need to estimate the number of vertices that blue can still reach before the red avalanche closes up around the blue cluster. This is done in two steps. Heuristically, every vertex that is close enough to a blue half-edge occupied at or before $T_r + \ell$ has a high chance to become blue later. Hence, first we calculate the size of the 'optional cluster of blue', i.e., we calculate the size of the $k$-neighborhood of blue half-edges via path counting methods. The size of the optional cluster converges as $k \to \infty$: due to the presence of the red avalanche, the degrees in the blue paths get more and more restricted, and finally the red avalanche reaches vertices of constant-order degree and then the procedure stops. It can still happen that some vertices in the optional cluster of blue are occupied by red simply because they are 'accidentally' also close to some red vertex. Thus, in the second step we estimate the size of the intersection between the optional cluster of blue and the red cluster. The two steps together provide matching upper and lower bounds for the number of vertices that blue occupies after the intersection. This phase has a non-negligible impact on the order of magnitude of the number of vertices painted blue, since the constant $C_n$ in the exponent of (1.5) is influenced by this last phase.
Notation. We write $[n]$ for the set of integers $\{1, 2, \dots, n\}$. We use the same name with a superscript $(r)$ or $(b)$ for random variables, sets or other quantities belonging to the red and blue processes, respectively. We write $E(\mathrm{CM}_n(d))$ for the set of edges. For any set of vertices $S \subset [n]$, we write $N(S)$ for the set of their neighbors, i.e.,

$$N(S) = \{y \in [n] : \exists x \in S,\ (x, y) \in E(\mathrm{CM}_n(d))\}. \tag{1.9}$$

For any event $A$, $\mathbb{P}_n(A) := \mathbb{P}(A \mid D_1, D_2, \dots, D_n)$. As usual, we write i.i.d. for independent and identically distributed, and lhs and rhs for left-hand side and right-hand side. We write $\lfloor x \rfloor$, $\lceil x \rceil$ for the lower and upper integer part of $x \in \mathbb{R}$, and $\{x\}$ for the fractional part of $x \in \mathbb{R}$. Slightly misusing the notation, we use curly brackets around set elements, events and exponents as well. We say that a sequence of events $E_n$ occurs with high probability (whp) when $\lim_{n\to\infty} \mathbb{P}(E_n) = 1$. In this paper, constants are typically denoted by c in lower and C

2. The branching process phase
First we describe the exploration process of the local neighbourhood of a given vertex in order to relate it to a branching process.
The configuration model $\mathrm{CM}_n(d)$ (introduced in [9], for more see [10,27]) on $n$ vertices with i.i.d. degree distribution $D$ can be briefly described as follows: to each vertex $i \in [n]$ we assign an i.i.d. random variable $D_i \sim D$, and attach $D_i$ half-edges to that vertex. If the total degree $L_n = \sum_{i=1}^{n} D_i$ is odd, then we add an extra half-edge to the vertex $n$. Then we number the half-edges in an arbitrary way from 1 to $L_n$, and start pairing them uniformly at random, i.e., we pick an arbitrary unpaired half-edge and pair it to a uniformly chosen other unpaired half-edge to form an edge. Once paired, we remove both from the set of unpaired half-edges and continue the procedure until all half-edges are paired. We call the resulting multi-graph $\mathrm{CM}_n(d)$. Since the choice of the half-edge to be paired is arbitrary, we can start from any set of vertices, and explore their cluster simultaneously with the construction of the graph. We call this procedure the exploration process, which is a version of a breadth-first search algorithm on the random graph $\mathrm{CM}_n(d)$. We describe the exploration process in more detail for the case when the initial set is a single uniformly chosen vertex $v \in [n]$, and relate it to a corresponding branching process as follows.
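Before turning to the exploration, the pairing construction just described can be sketched in a few lines. This is an illustrative sketch only: the function names and the particular degree sampler (a discretised Pareto variable with $\mathbb{P}(D \ge x) = (x-1)^{-(\tau-1)}$ for integers $x \ge 2$) are assumptions, chosen so that $D \ge 2$ and the tail has exponent $\tau - 1$. Shuffling all half-edges and pairing consecutive entries yields a uniform pairing.

```python
import random

def configuration_model(degrees, rng=None):
    """Pair half-edges uniformly at random; returns a list of edges.

    Self-loops and multiple edges are kept, so the result is a multigraph.
    If the total degree is odd, one half-edge is added to the last vertex,
    following the convention stated in this section.
    """
    rng = rng or random.Random()
    degrees = list(degrees)
    if sum(degrees) % 2 == 1:
        degrees[-1] += 1
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)   # a uniform shuffle gives a uniform pairing
    return list(zip(half_edges[0::2], half_edges[1::2]))

def power_law_degree(tau, rng):
    """Sample D >= 2 with a tail of order x^-(tau-1) (illustrative choice)."""
    u = 1.0 - rng.random()    # uniform on (0, 1]
    return int(u ** (-1.0 / (tau - 1.0))) + 1
```

A typical use is `edges = configuration_model([power_law_degree(2.5, rng) for _ in range(n)], rng)` with a seeded `random.Random` instance `rng`.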
In each step of the exploration process, each vertex belongs to exactly one of three sets: it can be active ($A$), explored ($E$) or unexplored ($U$). Initially $E_0 = \emptyset$ and all vertices except $v$ are in $U_0$. We start by setting the status of the initial vertex $v$ to active, $A_0 = \{v\}$, and we write $A_i$ for the set of active vertices after the $i$th step of the exploration. In each step we pick a vertex $v_{i+1}$ from $A_i$ (in a first-in-first-out way, i.e., we keep track of when a vertex enters the set $A$) and do three things: remove $v_{i+1}$ from $A_i$; add it to the explored vertices $E_i$; and put all its unexplored neighbors in the active set of vertices, i.e.,

$$A_{i+1} = \big(A_i \setminus \{v_{i+1}\}\big) \cup \big(N(v_{i+1}) \cap U_i\big), \qquad E_{i+1} = E_i \cup \{v_{i+1}\}, \qquad U_{i+1} = U_i \setminus N(v_{i+1}),$$

where $N(v_{i+1})$ denotes the neighbors of $v_{i+1}$ in $\mathrm{CM}_n(d)$. The explored vertices form the sequence $v_1, v_2, \dots$. Let $B_i$ stand for the forward degree of the vertex $v_i$ in the exploration process.

We aim to determine the distribution of $B_i$. For this we note that in the construction of the random graph $\mathrm{CM}_n(d)$, an arbitrary half-edge is chosen and paired to a uniformly chosen unpaired half-edge. Hence, we can do the construction of the graph together with the exploration process. Further, the probability of picking a half-edge belonging to a vertex with degree $j + 1$ is proportional to $(j + 1) f_{j+1}$, and as long as the size of the neighbourhood is small, the probability that a vertex is connected to some vertex explored earlier vanishes. Hence, we get the size-biased distribution (1.2) as a natural candidate for the forward degrees of the vertices $v_i$ in the exploration process. More precisely, we have the following result (Proposition 2.1): [...]

In our case, we have two source vertices, red and blue, with different spreading speeds; thus, we need a slight modification of this proposition. Namely, we need that a similar coupling remains valid for two exploration processes from two uniformly chosen vertices up to the time when the red (first) color reaches size $n^{\rho}$. Let us temporarily denote the number of vertices occupied by the blue (the other) color by this time by $h(n, \rho)$. This coupling is similar to [6, Proposition 4.8], but we state it for the reader's convenience (Lemma 2.2): [...]

[...] for $i = 2, \dots, n^{\rho}$, for the red cluster. After this, connect the $n^{\rho}$-th chosen vertex to the blue source vertex $B_0$ with an imaginary edge. Then drop all the other active vertices from $A_{T(n^{\rho})}$ and restart the exploration process with only vertex $B_0$ being active. Since it takes time $\lambda$ for blue to cover an edge, up to time $T(n^{\rho})$ blue reaches all the vertices which have graph distance at most $T(n^{\rho})/\lambda$ from the source vertex $B_0$. Thus, continue the exploration process from the blue source up to finishing generation $\lfloor T(n^{\rho})/\lambda \rfloor$. Since $\lambda > 1$, the total number of vertices found by this second phase has smaller order than $n^{\rho}$, so that the coupling still remains valid. Moreover, since each of the clusters has at most $n^{\rho}$ vertices, with high probability they do not meet each other. Further, when $\lambda = 1$, the proof is the same: the first cluster to reach $n^{\rho}$ vertices takes the role of red, and the other one takes the role of blue.
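The exploration process with active, explored and unexplored vertices in first-in-first-out order can be sketched as follows. This is a simplified illustration on an already constructed graph, rather than the joint construction-and-exploration used in the coupling; the function name `explore`, the `max_explored` cut-off (playing the role of the size threshold) and the use of "number of newly activated neighbours" as a stand-in for the forward degree are assumptions.

```python
from collections import deque

def explore(adj, v, max_explored):
    """First-in-first-out exploration from v.

    adj maps a vertex to a list of neighbours. Returns the explored vertices
    in order and, for each, the number of previously unexplored neighbours it
    activated (a stand-in for the forward degree).
    """
    active = deque([v])
    seen = {v}                  # active or explored; everything else is unexplored
    order, forward = [], []
    while active and len(order) < max_explored:
        x = active.popleft()    # oldest active vertex first
        new = []
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                new.append(y)
        active.extend(new)      # newly found neighbours become active
        order.append(x)         # x is now explored
        forward.append(len(new))
    return order, forward
```

On a tree the recorded counts agree with the number of children of each vertex; on a graph with cycles they can be smaller, which is the effect the coupling has to control.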
An immediate consequence of Proposition 2.1 and Lemma 2.2 is that locally we can consider the growth of $R_t$ and $B_t$ as independent branching processes $(Z_k)_{k>0}$ with offspring distribution $F^*$ for the second and further generations, and with offspring distribution given by $F$ for the first generation.
Let us now investigate the growth of these branching processes. Since τ ∈ (2, 3), the offspring distribution of this branching process has infinite mean for every individual in the second and larger generations. To understand the behavior of this BP, we first look at what happens in a BP where all the degrees are distributed as F * , including the first generation.
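To make the infinite-mean claim concrete, the mass function $f^*_j$ from (1.2) can be computed for a truncated degree distribution and its mean tracked as the truncation level grows. The sketch below is illustrative: the truncated law $\mathbb{P}(D = d) \propto d^{-\tau}$ on $2 \le d \le d_{\max}$ and all function names are assumptions, not the degree distribution of the paper.

```python
def size_biased_minus_one(pmf):
    """Given pmf[d] = P(D = d) on a finite support, return f*_j = (j+1) P(D = j+1) / E[D]."""
    mean_d = sum(d * p for d, p in pmf.items())
    return {d - 1: d * p / mean_d for d, p in pmf.items() if d >= 1}

def truncated_power_law(tau, d_max):
    """Illustrative pmf with P(D = d) proportional to d^(-tau) for 2 <= d <= d_max."""
    weights = {d: d ** (-tau) for d in range(2, d_max + 1)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def mean(pmf):
    """Mean of a finitely supported distribution given as a dict."""
    return sum(k * p for k, p in pmf.items())

# Mean of the size-biased law for growing truncation levels (tau = 2.5).
means = [mean(size_biased_minus_one(truncated_power_law(2.5, m)))
         for m in (100, 1000, 10000)]
```

For $\tau \in (2, 3)$ the entries of `means` keep growing with the truncation level (roughly like $d_{\max}^{3-\tau}$), reflecting the infinite mean of $F^*$ in the untruncated model.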
The following theorem by Davies [13] describes the growth rate of such a branching process. To distinguish it from the process whose first generation has distribution $F$, we write $\widetilde{Z}_k$ and $\widetilde{Y}$ for the process with offspring distribution $F^*$ in every generation and for its limit.

Theorem 2.3 (Branching process with infinite mean [13]). Let $\widetilde{Z}_k$ denote the $k$-th generation of a branching process with offspring distribution given by the distribution function $F^*$. Suppose there exists an $x_0 > 0$ and a function $x \mapsto \gamma(x)$ on $\mathbb{R}_+$ that satisfies the following conditions: [...] Let us assume that for some $\tau \in (2, 3)$, the tail of the offspring distribution satisfies that, for [...] (2.1). Then $(\tau - 2)^k \log(\widetilde{Z}_k \vee 1)$ converges almost surely to a random variable $\widetilde{Y}$. Further, the variable $\widetilde{Y}$ has exponential tails.

To be able to apply this theorem to our setting, we need to show that the distribution function $F^*$ satisfies the condition (2.1). This is clearly the case, since an elementary rearrangement of weights combined with the bounds in (1.1) and elementary estimates immediately yields that there exist constants $0 < c_1^* \le C_1^* < \infty$ such that

$$\frac{c_1^*}{x^{\tau-2}} \le 1 - F^*(x) \le \frac{C_1^*}{x^{\tau-2}}.$$

Since $\mathbb{P}(D \ge 2) = \mathbb{P}(B \ge 1)$, these BPs cannot die out, i.e., we can write $\log \widetilde{Z}_k$ instead of $\log(\widetilde{Z}_k \vee 1)$ and apply Davies' theorem to obtain the a.s. convergence of $(\tau - 2)^k \log \widetilde{Z}_k$ to $\widetilde{Y}$. Recall that the degree of the first vertex in the exploration process is distributed as $F$, not $F^*$; hence we denote by $Z_k$ the corresponding BP and call it the delayed branching process. The next lemma identifies the distribution of the limit of the properly scaled delayed branching process, that is, it identifies the limit random variable $Y$ in terms of $\widetilde{Y}$.

Lemma 2.4. Let $Y := \lim_{k\to\infty} (\tau - 2)^k \log Z_k$. Then $Y$ satisfies the distributional identity

$$Y \overset{d}{=} (\tau - 2) \max_{1 \le i \le D} \widetilde{Y}^{(i)}, \tag{2.3}$$

where the $\widetilde{Y}^{(i)}$ are i.i.d. copies of the limiting random variable of the original non-delayed BP. Further, [...]

Remark 2.5. An elementary calculation using (2.3) shows that $Y$ also has exponential tails with a parameter that is $(\tau - 1)$ times the parameter of $\widetilde{Y}$.

Proof of Lemma 2.4. Since the number of offspring in the first generation is distributed as $D$, by the branching property the subtrees starting from the first generation up to level $k$ are distributed as $\widetilde{Z}_{k-1}$ and are independent of each other. Thus, for every $k \ge 1$,

$$Z_k \overset{d}{=} \sum_{i=1}^{D} \widetilde{Z}^{(i)}_{k-1}.$$

We can bound the right hand side from both sides:

$$\max_{1 \le i \le D} \widetilde{Z}^{(i)}_{k-1} \;\le\; \sum_{i=1}^{D} \widetilde{Z}^{(i)}_{k-1} \;\le\; D \max_{1 \le i \le D} \widetilde{Z}^{(i)}_{k-1}. \tag{2.6}$$

Clearly $(\tau - 2)^k \log D \xrightarrow{\mathbb{P}} 0$, and by monotonicity we can exchange $\log$ and $\max$ and use Theorem 2.3 for the convergence of $(\tau - 2)^{k-1} \log(\widetilde{Z}^{(i)}_{k-1})$ to $\widetilde{Y}^{(i)}$. Exchanging the limit with the maximum finishes the proof. The second statement of the lemma can be proved analogously.
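The normalisation $(\tau - 2)^k \log Z_k$ can be observed in a small simulation of an infinite-mean branching process. The sketch below is only an illustration under explicit assumptions: the offspring law $\mathbb{P}(X \ge x) = x^{-(\tau-2)}$ for integers $x \ge 1$ is a hypothetical stand-in for $F^*$, all generations use this law (no delay), and the run stops once a generation exceeds `cap` because the sizes grow doubly exponentially.

```python
import math
import random

def sample_offspring(tau, rng):
    """Illustrative offspring law with P(X >= x) = x^-(tau-2) for integers x >= 1."""
    u = 1.0 - rng.random()                    # uniform on (0, 1]
    return int(u ** (-1.0 / (tau - 2.0)))

def normalised_log_sizes(tau, rng, cap=10**5):
    """Return the list of (tau-2)^k * log Z_k, k = 1, 2, ..., until Z_k exceeds cap (Z_0 = 1)."""
    z, k, out = 1, 0, []
    while z <= cap:
        # Each of the z individuals reproduces independently.
        z = sum(sample_offspring(tau, rng) for _ in range(z))
        k += 1
        out.append((tau - 2.0) ** k * math.log(z))
    return out
```

Calling `normalised_log_sizes(2.5, random.Random(0))` returns a short sequence; only a handful of generations fit below the cap, so this gives a qualitative picture of the scaling rather than evidence of convergence.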

Mountain-climbing phase
In this section we describe the mountain-climbing phase. From now on we will concentrate on the growth of the red (the faster) cluster, but the very same methods will later be used for blue as well. Thus, in this section we neglect the superscript (r), and temporarily every quantity belongs to the red cluster. We denote the set of red vertices at time t by R t and its size by R t . Since Proposition 2.1 only guarantees the coupling as long as the total number of explored vertices by red is at most n for some > 0, let us first set some Note that by Lemma 2.2, and the fact that the total size of earlier generations is whp negligible compared to the last generation, t(n ρ ) = T (n ) whp. Recall Definition 1.1, i.e., Note that t(n ) and thus Y (n) r depends on n. Then, an easy calculation yields that, where Note that 1 − a n is there to make the expression on the rhs of t(n ) equal to its upper integer part. Due to this effect, the last generation has a bit more vertices than n , so let us introduce the notation for the random exponent of the overshoot We get this expression by rearranging (3.1) and using the value t(n ρ ) from (3.2). The property < (τ − 2) 2 guarantees that the coupling is still valid, i.e. we can also couple the degrees of vertices in the t(n )th generation of the branching process to i.i.d. size-biased degrees.
After time t(n ), we stop the coupling and focus on the graph: we start decomposing the graph into the following nested sets of vertices, which we call layers: where u i is defined recursively by for a large enough constant C > 0. We will see below that e.g. C = 8/c 1 is sufficient, where c 1 is from (1.1). It is not hard to see that . . . First we need to show that Z t(n ) has a nonempty intersection with the initial layer Γ 0 , and then we will build a path through the layers. The following lemma is a general lemma about the maximum of i.i.d. power-law random variables. It guarantees that R t(n ) ∩ Γ 0 ≠ ∅, and will also be repeatedly used to determine the maximum degree in a set of vertices: and for K > 0, where c 1 arises from (1.1).
Note that the distribution F * satisfies the condition of the lemma with α = τ − 2, see (2.2). So, we can apply this lemma (specifically (3.8)) in the following setting: the i.i.d. variables X i are the forward degrees (B i ) i=1,...,Z t(n ) ∼ F * in the last generation of the branching process, thus m := Z t(n ) = n and α = τ − 2. Note that the bound we get when applying (3.8) states that whp there is at least one vertex with degree at least u 0 (defined in (3.6)). Hence, we get that Γ 0 ∩ R t(n ) ≠ ∅ whp.
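The role of the threshold u 0 can be illustrated numerically. The following is a minimal sketch, not the statement of Lemma 3.1: it assumes an exact Pareto tail P(X > x) = x^(−α) for x ≥ 1 (the paper only assumes power-law bounds as in (1.1)), and it assumes that the threshold is chosen so that the expected number of exceedances among m variables equals C log m. The function names are hypothetical.

```python
import math

def first_layer_threshold(m, alpha, C):
    """Level u with m * P(X > u) = C * log(m), assuming P(X > x) = x**(-alpha)."""
    return (m / (C * math.log(m))) ** (1.0 / alpha)

def prob_max_below(m, alpha, u):
    """P(max of m i.i.d. variables <= u) = (1 - u**(-alpha))**m under the assumed tail."""
    return (1.0 - u ** (-alpha)) ** m

# Example: alpha = tau - 2 with tau = 2.5 (illustrative values only).
m, alpha, C = 10**6, 0.5, 3.0
u = first_layer_threshold(m, alpha, C)
p_fail = prob_max_below(m, alpha, u)

# Since (1 - x)^m <= exp(-m x), the failure probability is at most m^(-C).
print(u, p_fail, m ** (-C))
```

Under these assumptions the probability that no variable exceeds the threshold is polynomially small in m, which is the mechanism behind the nonempty intersection with the initial layer.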
We will repeatedly use concentration of binomial random variables of the following form. Lemma 3.2 (Concentration of binomial random variable). Let X be a binomial random variable with parameters n, p n . Then Proof. Follows from standard estimates, see e.g. [27, Theorem 2.19] or [23]. In what follows, we will build a path from Γ 0 ∩ R t(n ) to the highest-degree vertices through successive layers Γ i . The following lemma guarantees the existence of such a path. Recall that N (S) stands for the neighbors of the set S in CM n (d).
Furthermore, the previous statement can be applied repeatedly to build a path from Γ 0 to Γ i as long as Proof. Let us denote the total number of half-edges in Γ i by S i . Then, since the degrees are i.i.d., we have |Γ i+1 | ∼ Bin(n, 1 − F (u i+1 )), and each vertex w ∈ Γ i+1 has degree at least u i+1 . Thus by Lemma 3.2, Recall that L n denotes the total number of half-edges in the graph. Then, the probability that there is a vertex v ∈ Γ i not connected to Γ i+1 can be bounded from above by where we recall that P n (·) := P(·|D 1 , . . . , D n ). We have used that L n < 2E[D]n whp by the Law of Large Numbers, |Γ i | < n, and the estimate S i+1 in (3.11). The factor 1/2 in the exponent u i /2 comes from the worst-case scenario estimate when we connect all the first u i /2 half-edges back to v. Similar calculations (with indices of u i and u i+1 exchanged) are worked out in more detail in [27, Volume II., Chapter 5]. Then, using the defining recursion (3.6), it is easy to see that that is, the error term in (3.12) is bounded by The assertion of the lemma follows if ε i is small: the first term is small when picking C large enough. For the second term we need n (1), which exactly translates to the condition u i = o(n 1/(τ −1) ) and to (3.10) using (3.7). Note that as long as (3.10) is satisfied, even . This means that we can apply the lemma consecutively for the layers (Γ i ) and build a path ) whp, as long as i satisfies (3.10). This finishes the proof of the second statement of the lemma.
With Lemma 3.3 in hand we can determine how long it takes to climb up through the layers Γ i to the highest-degree vertices. Lemma 3.1 with X i = D i ∼ F , α = τ − 1 shows that the maximal degree in CM n (d) is of order n 1/(τ −1) . We write i * for the last index when Γ i is whp nonempty, i.e., i * := inf{i : (3.13) An easy calculation using (3.7) shows that (3.14) Note that i * satisfies (3.10), thus all the error terms up to this point stay small. Using the value of the overshoot exponent in (3.4) and then the value a n in (3.3), plus the fact that From (3.7) one can easily calculate that (3.16) We will repeatedly need the total time to reach the top, so let us introduce the notation which only depends on via the approximating Y (n) r , and b n is exactly the fractional part of the expression on the rhs of T r . Since also Y (n) r → Y r irrespective of the choice of , this establishes that the choice of is not relevant in the proof.
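To make the number of layers concrete, the climb can be iterated numerically. The sketch below rests on assumptions, since the displays (3.5)-(3.7) are not reproduced in this text: it assumes the recursion u_{i+1} = (u_i / (C log n))^{1/(τ−2)} with initial value u_0 = (n^ρ / (C log n))^{1/(τ−2)}, and it stops at the order n^{1/(τ−1)} of the maximal degree. The function name and the parameter values are hypothetical.

```python
import math

def count_layers(log_n, tau, rho, C):
    """Last index i with u_i <= n^(1/(tau-1)) under the assumed recursion.

    Works with log(u_i) to avoid overflow:
    log u_{i+1} = (log u_i - log(C log n)) / (tau - 2).
    """
    a = tau - 2.0
    c = math.log(C * log_n)            # log(C log n)
    log_u = (rho * log_n - c) / a      # assumed initial layer u_0
    target = log_n / (tau - 1.0)       # log of the order of the maximal degree
    if log_u <= c / (1.0 - a):
        raise ValueError("u_0 is too small for the recursion to grow")
    i = 0
    while (log_u - c) / a <= target:
        log_u = (log_u - c) / a
        i += 1
    return i

# Illustrative values: log n = 10^6, tau = 2.5, rho = 0.01, C = 8.
layers = count_layers(1e6, 2.5, 0.01, 8.0)
# Heuristic count when the C log n correction is ignored.
heuristic = math.log((1 / 1.5) / (0.01 / 0.5)) / abs(math.log(0.5))
print(layers, heuristic)
```

Each step multiplies the exponent of n by roughly 1/(τ − 2), so the number of layers grows only doubly logarithmically in n; the C log n correction changes the count by a bounded amount in this example.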

Crossing the peak of the mountain
Next we investigate what happens when the path through the layers reaches the highest-degree vertices. We have just seen that the exponent of n in Recall that the maximum degree in the graph has exponent 1/(τ − 1) whp, i.e. Γ i * +1 = ∅ whp, meaning the path cannot jump 'up' one more step. On the other hand, we can make use of the following lemma from [27, Volume II., Chapter 5]: for some function h(n), then conditioned on the degree sequence with L n ≤ 2E[D]n, the probability that the two sets are not directly connected can be bounded from above by Proof. When pairing the half-edges coming out from A, the probability that the i-th one paired is not directly connected to a half-edge in B is (1 − S B /(L n − 2i − 1)). Thus, The product only goes until S A /2 − 1, since in the worst-case scenario the first S A /2 half-edges are all paired back to another half-edge in A, thus the last S A /2 half-edges are not used anymore. In both cases, we can pair at least S A /2 many half-edges.
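The product in the proof can be evaluated directly and compared with an exponential bound. This is a sketch under an assumption: the displayed bound of Lemma 4.1 is not reproduced here, so the form exp(−S_A S_B / (2 L_n)) is used as the elementary consequence of 1 − x ≤ e^{−x} applied to each of the S_A/2 factors. The function names and numerical values are hypothetical.

```python
import math

def no_connection_product(s_a, s_b, l_n):
    """Product over i = 0, ..., s_a/2 - 1 of (1 - s_b / (l_n - 2i - 1))."""
    prod = 1.0
    for i in range(s_a // 2):
        prod *= 1.0 - s_b / (l_n - 2 * i - 1)
    return prod

def exponential_bound(s_a, s_b, l_n):
    """Assumed bound exp(-s_a * s_b / (2 * l_n)); valid for even s_a."""
    return math.exp(-s_a * s_b / (2.0 * l_n))

# Illustrative values: 200 half-edges in A, 5000 in B, 10^6 in total.
s_a, s_b, l_n = 200, 5000, 10**6
product = no_connection_product(s_a, s_b, l_n)
bound = exponential_bound(s_a, s_b, l_n)
print(product, bound)
```

The bound becomes small exactly when S_A S_B is large compared with L_n, which is how the lemma is used with S_A S_B / n of order C log n.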
Let us introduce and and the following layer: The next lemma helps us describe how the process goes through the highest-degree vertices: All the vertices in Γ 1 are occupied by red at time T r + 1, i.e., Proof. By Lemma 3.3, there is a red path up to Γ i * , and hence, red occupies some vertices in layer Γ i * at time T r . Hence, R Tr ∩ Γ i * ≠ ∅, and we have at least one vertex v i * in R Tr for which the degree is at least u i * , see (3.16). We claim that this vertex is whp connected to every vertex in Γ 1 . To see this, let us set A := {v i * } and B := {w}, that is, any single vertex in Γ 1 with degree at least (C log n)n/u i * . Then apply Lemma 4.1 with this setting to see that v i * is whp connected to w. Further, note that S A S B /n = C log n by the definition of u 1 . Hence, using the error bound in Lemma 4.1 and a union bound, (4.5) Clearly | Γ 1 | < n: picking a large enough C, we see that the error probability tends to zero. Calculating C log n · n/u i * yields the formula for u 1 .
It is important to note that vertices with degree larger than u 1 do whp exist in CM n (d) by Lemma 3.1. Moreover, i * is the first index when we can apply Lemma 4.1, since for all smaller values i < i * , there are whp no vertices with degree at least n/u i by Lemma 3.1.
This completes the phase of crossing the peak of the mountain.

Red avalanche from the peak and the blue climber
Using the value u 1 in (4.2), let us again recursively define and also the increasing sequence of sets i.e., now Γ 1 ⊂ Γ 2 ⊂ . . . holds. Since (5.1) is the very same as the recursion in (3.6) with indices exchanged, we can apply Lemma 3.3 to Γ ≥1 , now yielding that for any ε > 0, for This means that in the 'sloping down' phase, whp red occupies all vertices in Γ at time T r + . Solving the recursion (5.1) yields that where α and β were defined in (4.1). Note that the exponent of C log n stays bounded even when → ∞. Hence this procedure can be continued even to reach lower degree vertices, for every fixed ε > 0 up until In what follows, we determine the point where red and blue meet. More precisely, we calculate the value such that during the time T r + , the maximum degree vertex in the cluster of blue is of the same order as u . Since at time T r + , red occupies whp almost every vertex with degree at least u , the growing cluster of blue bumps into the occupied vertices and cannot spread to higher-degree vertices anymore.
The following proposition about the maximal degree of blue is our main building block for the proof of Theorem 1.4: of the maximal degree vertex in the blue cluster at time t. Then, at time T r +t and for any real as long as t is so that the quantity on the rhs is less than Before the proof we need some important definitions that will be used also outside the proof. Similarly as in (3.5), let us define: Note that Γ (b) i grows exactly as Γ i while Γ (b) i grows faster: there is always an extra (C log n) 2 factor causing an initial 'gap' of order (log n) 2 between u Further, let us say that a quantity We will see below in (6.5) that blue cannot make more jumps than O λ−1 λ(λ+1) in its climbing phase. In order to show Proposition 5.1, we need a lower and an upper bound on the maximal degree in each step. The next lemma handles the upper bound, but first some definitions.
We say that a sequence of vertices and half-edges (π 0 , s 0 , t 1 , π 1 , s 1 , t 2 , . . . , t k , π k ) forms a path in CM n (d), if for all 0 < i ≤ k, the half-edges s i−1 , t i form an edge between π i−1 , π i . Let us denote the vertices in a path starting from a half-edge in Z (b) t(n )/λ by π 0 , π 1 , . . . . We say that a path is good if deg(π i ) ≤ u (b) i holds for every i. Otherwise we call it bad. We decompose the set of bad paths in terms of where they turn bad, i.e. we say that a bad path belongs to BadP k if it turns bad at the kth step: BadP k :={(π 0 , s 0 , t 1 , π 1 , s 1 . . . , t k , π k ) is a path, The following lemma tells us that the probability of having a bad path tends to zero: . Then for any k 0 ≤ O (x), the following bound on the probability of having any bad paths holds: Proof. The proof uses path counting methods that we describe in the appendix. Hence we put the proof there.
Proof of Proposition 5.1. Since the method for the lower bound is very much the same as for red, plus we will need a more detailed analysis of this process below in Lemma 5.4, we just sketch the proof (read further to the proof of Lemma 5.4 for more details). First, Lemma 2.1 ensures that we can couple both the blue and the red cluster to their BP approximation until time t(n ) given in (3.3). Since it takes λ > 1 unit of time to cover an edge for blue, the number of generations covered by the branching process approximation Z (b) of blue is t(n )/λ . The size of the last generation in the blue BP is thus We start applying the method in the Mountain climbing phase for blue from this point on.
With the same technique as we used to show that R t(n ) ∩Γ 0 ≠ ∅ using Lemma 3.1, we define u i -s such that at time λ( t(n )/λ + i), blue occupies at least 1 vertex in Γ (b) i by Lemma 3.3. Note that from Γ (b) i to Γ (b) i+1 , the exponent of 1/(τ − 2) on the right-hand side of (5.8) is increased by one. Further, there is an extra +1 in the exponent for the initial maximization of the degrees in u (b) 0 similarly as in (3.6).
The total number of layers Γ (b) i jumped by blue at time T r + t is then (T r + t)/λ − t(n )/λ , that, combined with (5.8), yields formula (5.4).
We still need to check that the term arising from C log n in the definition of u (b) i 's can be put in a (1 + o P (1)) factor in the exponent. For this, write Z (b) t(n )/λ := m, then )/(3−τ ) and the last layer before time T r + t is reached after climbing i = (T r + t)/λ − t(n )/λ many Γ (b) i layers, so by (3.2) and (3.17) we calculate Thus, if t ≤ O (1), when taking the logarithm, then the term corresponding to (C log n) Hence, these terms vanish when taking out (τ − 2) − Tr+t/λ in the statement of the lemma. We will see below in (6.5) that in fact the procedure stops at t = O ( λ−1 λ(λ+1) ) since after that red will block the growth of blue entirely.
For the upper bound, according to Lemma 5.2, whp {BadP k = ∅ ∀k ≤ k 0 }, and on this event the maximal degree of blue at time λ t(n ρ )/λ + λi is at most u (b) i . Since the exponent of C log n in u (b) i is exactly (−1) times the exponent of C log n in u (b) i , these terms can also be put in the (1 + o P (1)) factor by the same argument as for the lower bound.
We will later need more information than the maximal degree of blue, namely, we also need an upper bound on how many vertices blue occupies in each layer. For this, first, we will show that the probability that blue goes above u (b) i at time λ t(n )/λ + λi is small, then we estimate the number of vertices blue paints in each layer based on this bound. We carry these out in a claim and a lemma.
Let us denote the total number of half-edges attached to vertices with degree larger than y n by E ≥yn . Then, we have the following tail bound for E ≥yn : Claim 5.3. For a sequence y = y n , and a large enough constant C < ∞, and for some constant 0 < c < ∞, (5.10) Proof. Since the degrees are i.i.d. in CM n (d), we write Now, exchanging sums, The variables (X (n) k ) k≥1 form a multinomial random variable, each marginal is a binomial, and hence large deviation type concentration bounds can be used. Lemma 3.2 combined with a union bound yields Now, by (1.1), . Note that 2 1−τ < 1, hence summing up terms in k on the right hand side of (5.11), we get that for an c ≤ C 1 /12, the error term is bounded by Since the event {∀k ≥ 1 : combining this fact with the previous error estimate finishes the proof.
Let us denote the set and number of blue vertices in the ith layer Γ (b) i right at the time when blue reaches it by Hence, for x ≤ (λ − 1)/λ(λ + 1), for some constant K 2 , whp (1)). (5.14) Proof. First, Lemma 5.2 guarantees that {BadP k = ∅ ∀k ≤ k 0 } holds whp, and on this event u (b) i serves as an upper bound on the maximal degree of blue at time λ t(n ρ )/λ + λi. So, we can give a recursive upper bound on the number of vertices reached by blue in a given layer Γ (b) i by using u (b) i as a lower and u (b) i as an upper bound on the degrees. Let us condition on the number of blue vertices A i in layer Γ (b) i . Then, we have at most A i half-edges in Γ (b) i ∩ B λ t(n )/λ +λi , with degree at most u (b) i , hence we get the stochastic domination . (5.16) Thus, with the error probability in the previous display, whp

The bound in (5.13) is nothing but
Solving the recursions for u (b) i and u (b) i in (5.5) we get that Initially A 0 ≤ 2C log n whp. This can be seen as follows: by the coupling of the exploration process to the branching process in Section 2, the last generation has size Z (b) t(n )/λ , and the degrees are i.i.d. of distribution D . Hence, the number of vertices in this last generation that have degree at least u (b) 0 has distribution A 0 ∼ Bin(Z (b) t(n ) , P(D > u (b) 0 )). Note that by the choice of u (b) 0 , E[A 0 ] ≤ CC 1 log n, and Lemma 3.2 implies that A 0 < 2CC 1 log n fails with probability at most exp{−CC 1 log n/8}, which is small when C is large enough.
Using (5.19) and evaluating (5.18) finishes the proof of (5.13). We dropped some negative terms in the exponent in (5.13). If we set i ≤ O (x) = x log log n/| log(τ − 2)| + O P (1), then by picking a large enough C, the error terms are o(n −Ai ) in (5.16). Thus we can also iterate the argument up to time O (x) to see that at time λ t(n )/λ + λO (x), in Γ (b) O (x) , the number of vertices blue occupies is bounded by the right hand side of (5.13).
6. At the collision time -the maximal degree of blue 6.1. The maximum degree of blue. In this section we analyse how red and blue collide and prove Theorem 1.4, i.e., we determine the degree of the maximum degree vertex that blue ever occupies. There are two different processes running at time T r + : the red process is in its avalanche phase and occupies every vertex that has degree higher than u , while the slower blue process is still in its mountain-climbing phase and keeps increasing its maximal Note that the left-hand side is approximately equal to the maximum degree D (b,n) max (T r + ) of blue, while the right-hand side is the approximate value of u . Thus t c is the (non-integer valued) time left till the intersection of these two functions after time T r .
We will soon see that neglecting the integer part does have an influence on the highest degree vertex blue can occupy. To get a more precise picture, we should compare which color is first and second to jump after T r + t c , since if it is blue, it can still increase its exponent. So, let us introduce the time of the last jump of red and blue before time T r + t c : In words, red jumps at times T r + r * − , T r + r * − + 1, . . . while blue jumps at times b * − , b * − + λ, . . . and T r + t c satisfies T r + r * − ≤ T r + t c < T r + r * − + 1 and b * − ≤ T r + t c < b * − + λ. We need to determine who jumps first after time T r + t c (that is, whether T r + r * − + 1 < b * − + λ or the other way round), so let us also introduce the remaining times till the next jump after the intersection for both colors: J r and J b stand for the additional time needed for red and blue till their next jump after time T r + t c .
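The jump times can be computed explicitly once T r , t c and λ are given. The following sketch assumes that T r is an integer, that red jumps at the integer times T r + 1, T r + 2, . . ., and that blue jumps at multiples of λ, so that the last jumps before the intersection are obtained by taking integer parts. The displays (6.2) and (6.3) are not reproduced in this text, so these formulas and the function name are assumptions made for illustration.

```python
import math

def jump_times(T_r, t_c, lam):
    """Return (r_last, b_last, J_r, J_b) for the intersection time T_r + t_c.

    r_last: red's last jump is at absolute time T_r + r_last.
    b_last: absolute time of blue's last jump, a multiple of lam.
    J_r, J_b: remaining times until the next jump of red and blue.
    """
    s = T_r + t_c                        # absolute intersection time
    r_last = math.floor(t_c)             # assumed r*_-
    b_last = lam * math.floor(s / lam)   # assumed b*_-
    J_r = (T_r + r_last + 1) - s
    J_b = (b_last + lam) - s
    return r_last, b_last, J_r, J_b

# Illustrative values only.
r_last, b_last, J_r, J_b = jump_times(10, 2.3, 1.7)
print(r_last, b_last, J_r, J_b)
```

With this convention J r lies in (0, 1] and J b lies in (0, λ], so comparing the two values decides which color jumps first after the intersection.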
Remark 6.1. Note that given the values Y (n) b , T r and α, with each additional jump, red decreases the exponent of 1/(τ − 2) by 1 and blue increases its exponent by 1. (Here we again neglect the terms including C log n.) Thus, for red, when plotting the exponents of 1/(τ − 2) of log u / log n one gets a line of slope −1, starting from time T r + 1 from the value α. The exponent of 1/(τ − 2) in log u (b) i / log n in the cluster of blue is a line of slope 1/λ, since it increases by one with every additional λ time units, see (5.4). These lines can be seen in Fig. 2 and Fig. 3.
Intuitively, the final exponent of the maximal degree of blue depends on two things: which color jumps first after the intersection time T r + t c and how large the difference d(t c ) is between the exponents of 1/(τ − 2) in the log(degree)/ log n of red and blue before time T r + t c . Since with each jump the exponent of 1/(τ − 2) of the jumping color is changed by one, it is crucial whether this difference is less than or larger than 1. Let us temporarily postpone the calculations and believe that this difference is We will later analyse this difference in detail around equation (6.8). Since d(t c ) is the sum of two fractional parts, it is at most 2. Recall also that J r , J b stand for the time till the next jump of red and blue after time T r + t c , respectively (see (6.3)). With these notations in mind, there are five cases (compare them to Fig. 2 and Fig. 3).
(B1) J b < J r and d(t c ) < 1. Blue jumps first after the intersection and occupies some vertices up to Γ r * − , i.e. blue can increase the exponent by a factor (τ − 2) −d(tc) . (Vertices with higher degree than that are already red). See Fig 2(a). (B2) J b < J r and d(t c ) > 1. Blue jumps first after the intersection and occupies some vertices one layer higher, namely the total exponent of 1/(τ − 2) in (5.4) reached by blue is Tr+tc λ + 1. However, since 1 < λ, the next jump after this must be a red jump, hence red occupies every vertex with higher degree than this value. See Fig. 2(b).
(R1) J r < J b and d(t c ) < 1. Red jumps first after the intersection, and occupies every not-yet blue vertex down to Γ r * − +1 , which means that blue cannot increase its exponent anymore. Thus the exponent of 1/(τ − 2) in (5.4) of the maximal degree reached by blue is Tr+tc λ . See Fig. 3(c). (R2) J r < J b < J r + 1 and d(t c ) > 1. Red can make only one jump after the intersection and occupies every vertex in Γ r * − +1 , while blue jumps after this and can reach some vertices with degree up to Γ r * − +1 with its next jump. Thus the maximal degree of blue in this case is determined by Γ r * − +1 , see Fig. 3(a). (R3) J r + 1 < J b and d(t c ) > 1. Red can make at least two consecutive jumps after the intersection and occupies every not-yet occupied vertex in Γ r * − +2 , which means that blue cannot increase its exponent. The exponent of 1/(τ − 2) in (5.4) of the maximal degree reached by blue is again Tr+tc λ , see Fig. 3(b).
Note that above we only handle the cases when J b ≠ J r : this can be ensured by restricting λ to be irrational. If λ = p/q, p, q ∈ N is rational with p and q coprime, then every vertex that is qt away from the blue source and pt away from the red source for arbitrary t ∈ N might be occupied by both colors at the same time (i.e., at time pt). In this case, the color of such a vertex is chosen with probability 1/2 independently of everything else. For the meeting time of the red avalanche and blue climber, a rational λ implies cases when J b = J r or J b = J r + 1, i.e. the two processes jump at the same time after t c . Here we list what happens in these cases, to be able to merge them in the cases above. We assume here that the adapted rule is such that there is a positive probability that a vertex becomes blue upon co-occupation. (BR1) J b = J r and d(t c ) < 1. Since there are lots of vertices just slightly smaller than u r * − , blue whp occupies some vertices up to that point, i.e. blue can increase the exponent by a factor (τ − 2) −d(tc) again. This case can be merged into Case B1. (BR2) J b = J r and d(t c ) > 1. In this case, blue can occupy some of the vertices up to one Γ (b) i higher. This case can be merged into Case B2. (BR3) J b = J r + 1 and d(t c ) > 1. In this case, red jumps first and occupies all the vertices down to Γ r * − +1 , and then the two processes jump together, so blue can occupy some vertices right below that. This case can be merged into Case R2.
Remark 6.2. If the adapted rule is such that a vertex becomes red with probability one upon co-occupation, then Case BR1 merges into Case R1, Case BR2 merges into Case B2, and Case BR3 merges into Case R3. We see that the adapted rule only influences the place where the strict and non-strict inequality signs appear inside the indicators in f (d(t c ), J r , J b ) in (6.9) below. Hence, the main result still holds true with a slightly different f (d(t c ), J r , J b ). For other adapted rules, the function f can be determined similarly.
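The case distinction above, including the tie cases for rational λ, can be summarized as a small decision procedure. This is an illustrative sketch only: the function name and the flag `blue_can_win_ties` are hypothetical, the value d(t c ) = 1 is not covered by the list of cases and is rejected, and ties are detected by exact equality, which is meaningful for exactly representable inputs but fragile for general floating-point values.

```python
def classify_case(J_r, J_b, d, blue_can_win_ties=True):
    """Return the case label (B1, B2, R1, R2, R3) for given J_r, J_b and d(t_c).

    blue_can_win_ties=True: a co-occupied vertex is blue with positive probability.
    blue_can_win_ties=False: a co-occupied vertex is always red (Remark 6.2).
    """
    if d == 1:
        raise ValueError("d(t_c) = 1 is not covered by the case list")
    small_gap = d < 1
    if J_b < J_r:                          # blue jumps first
        return "B1" if small_gap else "B2"
    if J_b == J_r:                         # simultaneous first jump
        if small_gap:                      # Case BR1
            return "B1" if blue_can_win_ties else "R1"
        return "B2"                        # Case BR2 merges into B2 under both rules
    # From here on J_r < J_b: red jumps first.
    if small_gap:
        return "R1"
    if J_b < J_r + 1:
        return "R2"
    if J_b == J_r + 1:                     # Case BR3
        return "R2" if blue_can_win_ties else "R3"
    return "R3"

print(classify_case(0.7, 1.3, 1.5))   # red first, one red jump, large gap
print(classify_case(0.5, 0.5, 0.5, blue_can_win_ties=False))
```

Only the tie cases depend on the adapted rule, in line with the observation that the rule merely moves strict and non-strict inequality signs in the indicators.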
Now we formalize these heuristics by finishing the proof of Theorem 1.4. An elementary calculation solving (6.1) yields On the other hand, since the last jump of blue before time T r + t c is at time λ[(T r + t c )/λ], blue could do (T r + t c )/λ − {(T r + t c )/λ} many up-jumps, hence right before the intersection, blue occupies some vertices that satisfy (6.7), where we have used (6.5) for (T r + t c )/λ combined with (5.4) at time (T r + t c )/λ .
Note that the formulas (6.6) and (6.7) only differ in the exponents of 1/(τ − 2), and this difference is exactly d(t c ), introduced in (6.4). More precisely,
Figure 3. In these pictures, red jumps first after the intersection, thus it can occupy more vertices: the exponent it can reach depends on how large the distance is between red and blue at their last jump before the intersection. The first two pictures show the two cases where the distance before the jump is more than one, and red can jump only once or at least twice after the intersection, respectively. The third picture shows the case when the distance before the jump is smaller than one. The colored regions illustrate the maximal degree blue can reach. (In Fig 3(b) and 3(c) blue cannot increase its maximal degree anymore.) Recall from (6.3) that the remaining time to the next jump for red and blue after the intersection at time T r + t c is denoted by J r and J b , respectively.
Since (6.7) is the exponent of the maximal degree vertex that blue occupies before the intersection, to determine the maximal degree of blue, we need to investigate whether blue can jump once more before the red avalanche reaches lower degrees than (6.7). If yes, then blue can gain an additional factor to the rhs of (6.7).
Obviously, if d(t c ) < 1, then even though blue jumps first, it cannot increase its exponent by a whole factor (τ − 2) −1 , since vertices with degree larger than (D (b,n) max (T r + t c )) (τ −2) −1 are already all red. It is not hard to see that blue in this case will occupy some vertices 'right below' u [tc] (that is, say, higher than u [tc] /(C log n)), hence blue in this case can increase its exponent by (τ − 2) d(tc) .
This case illustrates that the additional factor that we need to add to the rhs of (6.7) depends on two things: (1) which color jumps first (and possibly second) after the intersection and (2) whether d(t c ) > 1 or not. There are five cases, described above (after formula (6.4)). As a result, the gain in the exponent for blue can be summarized by multiplying (6.7) by the following function containing indicators for these five cases (the order is Case B1, R1, B2, R2, R3 here, and the cases where λ rational are also included):  .7) with the additional factor f (d(t c ), J r , J b ), so that we can introduce the 'oscillation-filtering' random variable which is oscillating with n and is random, but is depending on the same randomness as Y (n) r , Y (n) b , i.e., they are defined on the same probability space. At this point we have shown that ). max (∞). In this section we investigate how many maximum degree vertices are reached by blue. We show that in some cases (namely, Cases B1, R2) the number of these vertices is so large that it corresponds to an additional factor for the total number of half-edges in maximum degree vertices of blue.
More precisely, let us denote the set of outgoing half-edges from these maximal degree vertices by M (b) n , and its size by M (b) n . Later we will determine how many vertices blue can occupy after this phase, and to be able to count that we need to know how many half-edges are in the highest layer of blue.
n , the number of outgoing half-edges from the set of maximal degree vertices, i.e. the sum of the forward degrees reached by blue for which (1.7) holds, we have is a bounded random variable given below in formula (6.17).
Proof. Recall that A i denotes the number of vertices blue occupies in layer Γ (b) i upon reaching it, see (5.12). In the cases where blue finishes its last jump at a certain layer Γ (b) i , that is, in Case R1 (Fig 3(c)), Case R3 (Fig 3(b)) and also Case B2 (Fig 2(b)), the statement is a direct consequence of Lemma 5.4, since blue is stuck with its maximal degree at a given layer Γ (b) imax , and hence (1)). Taking logarithm we get log By (6.12), i max = O ( λ−1 λ(λ+1) ) in Lemma 5.4, so we can use the bound in (5.13) with x = (λ − 1)/λ(λ + 1). Hence, the last term in (6.13) disappears when we divide by (log n) 2/(λ+1) .
We are left with handling the cases where the last jump of blue is not a full layer, i.e., Cases B1 and R2. In these cases, after reaching layer Γ (b) imax , blue still jumps up, but not a full layer: due to the presence of red the forward degrees are truncated at u r * − in Case B1 and at u r * − +1 in Case R2. First, we apply Lemma 5.4 to see that log A imax in the last 'full' layer Γ (b) imax is small. Let us recall the notation Then we introduce a new layer and we denote the number of half-edges in this set by E γ . By Lemma 5.4, whp blue is not reaching higher degrees than u (b) imax at time i max . Recall that there are A imax many blue vertices in layer Γ (b) imax . Hence, the total number of blue half-edges in this layer is at most A imax u (b) imax . Thus, the number of vertices in Γ to which blue is connected is dominated by . (6.14) whp. Thus, conditioned on A imax , the expected value of the Binomial variable in (6.14) is bounded above by Since red occupies every vertex with degree larger than (u (b) imax ) γ , the previous formula bounds the number of vertices with degree in the interval [(u (b) imax ) γ /C log n, (u (b) imax ) γ ). Thus, the total number of half-edges going out from maximal degree vertices can be bounded by Since i max = O ( λ−1 λ(λ+1) ), we can use (5.13) and the calculations in the proof of Lemma 5.4 to see that ) ) 2 is still small, i.e., it disappears when taking logarithm and dividing by (log n) 2/(λ+1) . Hence, the main contribution comes from (u . Hence, in Cases B1 and R2, blue can get more half-edges than of order D (b,n) max (∞). To get the total number of half-edges at the last up-jump, we need to modify the function f (d(t c ), J r , J b ). 
An elementary rearranging of the indicators of the cases and the constants shows that the extra factor needed for (6.7) to get M (b) n is Then the normalizing constant for M (b) n is given by Before moving on to the next section, let us introduce the time when the maximal degree is reached, which is nothing else but the time of the last possible up-jump of blue, i.e., where E stands for the event that blue has an additional up-jump after time t c , i.e. Case B1, B2 or R2 happens.

Path counting methods for blue
By time t b , only o(n) vertices are reached by red and blue together -most of the vertices are still not colored. Thus, it still remains to determine how many vertices blue can reach after time t b . We do this via giving matching upper and lower bounds on how many vertices blue occupies in this last phase.
For the upper bound, the idea is that we count the size of the local neighborhood of the half edges that are just occupied at time t b . Since the red avalanche continues to be in its avalanche phase and occupies all vertices of smaller and smaller degrees as time passes, the spreading of blue is more and more restricted, so this local neighborhood is quite small. We call this the optional cluster of blue. Since its size is random, we give a concentration result on its size, i.e., we give a concentrated upper bound on what blue can get.
For the lower bound, we estimate how much the red color might 'bite out' of this optional cluster. This can happen since even a constant degree vertex might by chance be close to both colors. We show that this intersection of the clusters is negligible compared to the size of the optional cluster.
We start describing the first step -the optional cluster of blue -in more detail. At time t b , the half-edges in the set M (b) n start their own exploration clusters, i.e., an exploration process from the half-edge to not-yet occupied vertices. At time t b + λj, we color every vertex v, whose distance is exactly j from some half-edge h in M (b) n , and the degrees of vertices on the path from h to v are less than what red occupies at that moment, blue. That is, the degree of the jth vertex on the path must be less than u t b −λi+λj−Tr . We do this via estimating the number of paths with degree restrictions from M (b) n and call this the optional cluster of blue, denote the set by O max and its size by O max . Corollary 7.2 below determines its asymptotic behavior.
On the other hand, not just the half-edges in M (b) n can gain extra blue vertices: from half-edges in A imax−z , z = 0, 1, 2 . . . the explorations start a bit earlier (at time t b − λz) towards small degree vertices. Let us denote the vertices reached via half-edges from layer we color a vertex v blue if its distance is exactly j from a half-edge h in A imax−z , and the degrees of vertices on the path from h to v are less than u imax−z+j and also what red occupies at that moment, i.e., the degree of the jth vertex on the path must be less than min{u This extra truncation is needed since we want to avoid double counting, that is, we do not want to count vertices explored from A imax−z towards A imax−z+1 , hence the additional restriction. We show that the total number of optional blue vertices in lower layers, z≥0 O −z with these additional explorations is at most the same order as O max in Lemma 7.3.
For the lower bound of what blue can occupy after time t b , note that not every vertex in O max will be occupied by blue: red can still bite out some parts of these vertices by simply randomly being close to some parts of the blue cluster. We estimate the number of vertices in the intersection of O max and red, and then subtracting the gained estimate from the lower bound on O max gives a lower bound on what blue occupies from the graph after t b , see Lemma 7.4. Now we turn to the calculations.
We introduce the expected truncated degree of a vertex that is distance j away from the set Then, by (1.1), Let us also define Then again by (1.1), Let us call a path of length k from M (b) n with vertices (π j ) j≤k good if π j ≤ u t b +λj−Tr , and good-directed if u t b +λj+1−Tr ≤ π j ≤ u t b +λj−Tr . n , respectively. Then there exist positive constants 0 < c 2 ≤ C 2 < ∞ such that and while for the variance of the latter: n L n + e k,n , (7.4), and the error term e k,n is The proof of this lemma uses path counting methods and is similar to that of [29, Lemma 5.1]. Similar techniques can also be found in [27,Section 10.4.2]. Since our case is slightly different than the cases handled there, we work out the details in Appendix A. Now we state the immediate corollary of Lemma 7.1. Recall the definition of t b from (6.18).
Corollary 7.2 (Chebyshev's inequality for blue vertices). Take c 3 ≤ 2−ε λ+1 | log(τ − 2)| −1 and any k ≤ c 3 log log n. Then, conditioned on the number of blue half-edges M (b) n at time t b , the number of vertices optionally occupied by blue up to time t b + λk satisfies that, conditionally on M (b) n , Proof. Let us write O non-d max (k) for paths that are good but not good-directed. We show that they have a negligible contribution, while O d max (k) is well-concentrated. In this proof below, all expectations and probabilities are conditional wrt. M (b) n . Let us write Now we can apply Chebyshev's inequality on the first term while Markov's inequality on the second term (both conditioned on M (b) n ), using Lemma 7.1: The term containing γ 2 1 /ν 4 1 L n is coming from the Taylor expansion of the exponential factor in the formula for e k,n . We only have to verify that the rhs of the previous display is tending to 0. For this we need γ 1 /(ν 2 1 M (b) n ) → 0 and also γ 2 1 /(ν 4 1 L n ) → 0. For the first term, note that M (b) n ≥ D (b,n) max (∞), since it counts the number of half-edges with maximal degree , since it is not hard to see that at time t b + λ, the degree above which red occupies everything (i.e., u t b +λ−Tr ) is already less than D (b,n) max (∞), otherwise blue could have still increased its maximal degree at t b + λ by an extra jump. (Technically, this was the definition of t b . Alternatively, compare the exact values of D (b,n) max (∞) in (1.7) and (6.10), and compare it to that of u t b +λ−Tr , which can be derived from (6.6) by adding the appropriate number of (τ − 2) factors in the exponent corresponding to the five different cases. This calculation is left to the reader.) Similarly, the second term, γ 2 1 /(ν 4 1 L n ) = u 2 t b +λ−Tr /L n is less than of order D (b,n) max (∞) 2 /n and hence is small as long as Note that this is the case by Theorem 1.4 since λ > 1. Finally, we show that the last term in (7.7) is also small. 
Since λ > 1, [t b + λ(j + 1) − T r ] ≥ [t b + λj − T r ] + 1, and ν i ∈ (c 1 , C 1 ) × u 3−τ [t b +λi−Tr] hence the last term is less than 6 times where we have used the recursion u +1 = C log n u 2−τ in (5.1). Again, by the same recursion, for some large enough constant C , the sum on the rhs is at most which is small as long as log u [t b +λk−Tr] is of larger order than log((C log n) 3−τ ). Note that this holds for an appropriate choice of k, since using (6.6) and the recursion for u again, log Note that if we now pick k = o(log log n), then the exponent (log n) 2/(1+λ) stays unchanged and the expression is much larger order than log log n.
Recall that A i , A i stands for the set and number of blue vertices in layer Γ (b) i at the time when blue reaches the layer -at time λ[t(n ρ )/λ] + λi. Also recall that i max stands for the index of the last Γ (b) i layer ever reached by blue, see (6.12). Further, O −z (k) is the number of vertices explored via a path of length k starting from a half-edge in A imax−z that are not explored via a half-edge from A imax−z+1 . Next we show that z≥0 O −z (k) is at most the same order of magnitude as O max (k): Lemma 7.3. With the notation introduced before, Proof. Let us denote the number of half-edges in A imax−z that are not connected directly to Γ (b) imax−z+1 by H −z . From Lemma 5.4 we have a bound on the number of vertices A i in layer Γ (b) i , and Lemma 5.2 says that the maximal degree in A i is at most u (b) i whp. First, let us describe the following construction of the blue cluster spreading through the layers Γ (b) i . After an extra time unit λ, A i+1 half-edges out of the at most A i u half-edges of blue are connected to half-edges in Γ (b) i+1 , while the other half-edges are not. In the construction of CM n (d) in Section 2, each half-edge is paired to a uniformly chosen other half-edge. The uniform distribution restricted to a set is still uniform on that set, thus we can think of this procedure by picking A i+1 many of the half-edges out of the at most A i u i half-edges uniformly at random and connecting them to uniformly chosen half-edges in Γ (b) i+1 . The rest of the half-edges in Γ (b) i are connected to lower degree vertices, i.e., we can simply pair these half-edges to lower degree vertices than u (b) i+1 , and apply the path counting method similar as for M (b) n in Lemma 7.1, with the restriction that the degree of the j-th vertex on such a path must be less than the degree in Γ (b) imax−z+j if j ≤ z and less than the degree where the red avalanche is at the current time when j > z, respectively. 
The restriction for j ≤ z is needed to avoid double counting. Clearly, Then the degree truncation for this process λj time units later is at u (b) imax−z+j if j ≤ z and u t b +λ(j−z)−Tr if j > z. A simple modification of Lemma 7.1 gives the number of vertices found from these half-edges. Moreover, to show that the vertices reached from A imax−z , for z ≥ 1, are of smaller order than those reached via M (b) n , we can use Markov's inequality: Similarly as in (7.12), where the exponent 3 − τ comes from a calculation similar to that in (7.2). We claim that the maximum of this quantity is at z = 0.
, we need to show that the sum of the first two terms in (7.8) are less than log M (b) n . By the recursive definition of u i in (5.5), log u imax −z = (τ − 2) z log u imax (1 + o(1)). We can also use the fact from Lemma 5.4 that (1)) whp (7.9) and the second term in (7. (7.10) We see that the sum of the right hand sides of (7.9) and (7.10) is exactly log( u imax ). Thus, returning to (7.8), The right hand side is indeed maximal for z = 0, for which we have Compare this quantity to log O max (k) in Corollary 7.2. Since n , this finishes the proof of Lemma 7.3, since and the log log n factor becomes a negligible additive term when taking logarithm.
Having analysed the size of the optional cluster of blue, we are ready to finish the upper bound of Theorem 1.2 by combining the previous results.
Proof of the upper bound in Theorem 1.2. First, fix k = k(n) → ∞ so that k(n) = o(log log n). Then, Lemma 7.3 implies that the logarithm of the total number of vertices that blue paints in the last phase is at most log O max (k)(1 + o P (1)). Corollary 7.2 says that the order of magnitude of log(O max (k)) = log n is the number of blue half-edges in the highest layer that blue can reach. Further, Lemma 6.3 determines the order of magnitude of log M (b) n , which is and hence converges in distribution to (Y λ b /Y r ) 1/(λ+1) when divided by the second two factors.
We rewrite t b − T r = t c + λ 1 E − Tr+tc λ in the exponent using (6.18), and then use formula (6.5) to see that t c = O ( λ−1 λ(λ+1) ). Hence the main order term in (τ − 2) tc is (log n) (λ−1)/(λ(λ+1)) . This implies that the two smaller order terms k log(C 1 C log n) and β log(C log n)(τ − 2) tc are o(log n(τ − 2) tc ) and can be put in a (1 + o P (1)) factor of the main term. Using the exact value of t c in (6.5) we obtain then The proof of the lemma will follow from the following claim: For the right hand side to be minimal we set K n := ((τ − 2)n/|S|) 1/(τ −1) , which is o(n 1/(τ −1) ) as long as |S| → ∞ with n. With this choice of K n , Since the exponents sum up to 1, the rhs is always o(n) if |S| = o(n).
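The choice of K n above can be checked by a one-line optimisation. As a sketch only, assume that the bound in the claim has the two-term form $g(K) = K|S| + nK^{2-\tau}$ with constants suppressed. This form is an assumption made for illustration (it is not quoted from the claim); it is chosen because it reproduces the stated minimiser and the remark that the exponents sum up to 1.

```latex
% Assumption (not from the source): g(K) = K|S| + n K^{2-\tau}, constants suppressed.
\begin{align*}
g'(K) &= |S| - (\tau-2)\, n\, K^{1-\tau} = 0
  \;\Longleftrightarrow\;
  K_n = \Big(\frac{(\tau-2)\,n}{|S|}\Big)^{1/(\tau-1)},\\
K_n\,|S| &= (\tau-2)^{\frac{1}{\tau-1}}\; n^{\frac{1}{\tau-1}}\; |S|^{\frac{\tau-2}{\tau-1}},\\
n\,K_n^{2-\tau} &= (\tau-2)^{-\frac{\tau-2}{\tau-1}}\; n^{\frac{1}{\tau-1}}\; |S|^{\frac{\tau-2}{\tau-1}},\\
g(K_n) &= c_\tau\; n\,\Big(\frac{|S|}{n}\Big)^{\frac{\tau-2}{\tau-1}},
  \qquad c_\tau := (\tau-2)^{\frac{1}{\tau-1}} + (\tau-2)^{-\frac{\tau-2}{\tau-1}}.
\end{align*}
```

Since $g''(K) = (\tau-2)(\tau-1)\,n\,K^{-\tau} > 0$, the critical point is a minimum. The exponents of n and |S| are $1/(\tau-1)$ and $(\tau-2)/(\tau-1)$, which add up to 1, so under this assumed form the bound is o(n) exactly when |S| = o(n).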
Proof of Lemma 7.4. Note that we can construct the configuration model by pairing the half-edges in an arbitrarily chosen order. This enables the joint construction of the graph and the spread of the red and blue cluster. Hence, we can assume that if a vertex is not yet colored, its half-edges are still free, and we do not have to take into account the effect that whole paths can be blocked away from one color by the other color by painting one or a few vertices only.
Fix the length of the blue exploration path k. For a set S of vertices, we denote by H(S) the total number of half-edges that point out of the set S. As a lower bound, we can use the adapted rule that whenever red and blue arrive at a vertex at the same time, the vertex is colored red deterministically. We can further assume that if this is the case, i.e., there are simultaneous jumps of red and blue, then we always pair the red half-edges first, i.e., when pairing the blue half-edges at time t b + λi, we consider R t b +λi as already determined.
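The dynamics used in this argument can be made concrete with a small simulation. The following is a minimal sketch, not part of the proof: it builds a configuration model by pairing half-edges uniformly at random and then runs the two-color competition with red speed 1, blue passage time λ, and the tie rule that red wins simultaneous arrivals. The function names `configuration_model` and `compete` are hypothetical, and the sketch pairs all half-edges in advance rather than jointly with the spread.

```python
import heapq
import random
from collections import defaultdict

RED, BLUE = 0, 1  # RED sorts first in the heap, so red wins simultaneous arrivals


def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random; returns adjacency lists of a multigraph."""
    degrees = list(degrees)
    if sum(degrees) % 2 == 1:
        degrees[-1] -= 1  # drop one half-edge from the last vertex if the total is odd
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)  # consecutive entries of a uniform shuffle form a uniform pairing
    adj = defaultdict(list)
    for a, b in zip(half_edges[::2], half_edges[1::2]):
        adj[a].append(b)
        adj[b].append(a)
    return adj


def compete(adj, red_source, blue_source, lam):
    """Return {vertex: color}; red crosses an edge in time 1, blue in time lam."""
    passage_time = {RED: 1.0, BLUE: float(lam)}
    heap = [(0.0, RED, red_source), (0.0, BLUE, blue_source)]
    color = {}
    while heap:
        t, c, v = heapq.heappop(heap)
        if v in color:
            continue  # v was reached earlier (or at the same time by red)
        color[v] = c
        for w in adj.get(v, ()):
            if w not in color:
                heapq.heappush(heap, (t + passage_time[c], c, w))
    return color
```

For example, `compete(configuration_model(degs, random.Random(0)), 0, 1, 2.0)` colors the component(s) of vertices 0 and 1 for a degree list `degs`; counting the values equal to `BLUE` gives the analogue of B ∞ for that realisation.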
Let us consider a path π ending in O(k) given by the sequence of half-edges and vertices (π 0 , s 0 , t 1 , π 1 , s 1 , . . . , t k , π k ), that is, s i is the half-edge pointing out of vertex π i that we pair to t i+1 , a half-edge belonging to vertex π i+1 . We call this path thinned at step i if the half-edge s i−1 is paired to the half-edge t i where π i is already red, i.e. π i ∈ R t b +λi . We call a path thinned if it is thinned at some i ≤ k.
Clearly, each time we pair a blue half-edge at time t b + λi, it is paired to a red half-edge with probability H(R t b +λi )/L n (1 + o(1)). Let us denote σ k : Hence, the probability that a particular path ending in O(k) is thinned can be bounded using a union bound P (π 0 , s 0 , t 1 , π 1 , s 1 , . . . , t k , as long as k = k(n) is such that the quantity on the rhs is less than 1. Hence, for any function δ n,k such that δ n,k p th,k < 1, Markov's inequality shows that the proportion of vertices in O(k) that are thinned, denoted by O th (k), satisfies
$$P\big(O_{\mathrm{th}}(k) \ge \delta_{n,k}\, p_{\mathrm{th},k}\big) \le \frac{1}{\delta_{n,k}}. \qquad (7.20)$$
Now, note that we are done with the lower bound if we can pick a k = k(n) → ∞ and a δ n,k so that δ n,k → ∞ and δ n,k p th,k < 1.
For this, let us temporarily believe that k := k(n) = log log log n has the property that R t b +λ k = o P (n). Then, let us write R t b +λ k := O(n/ω n, k ) where ω n, k → ∞ with n → ∞. Set k := k(n) = min{log log log n, (ω n, k ) Clearly, k ≤ k holds, hence, by monotonicity we have R t b +λk ≤ R t b +λ k . Applying Claim 5.3 on each term in the sum, On the event {L n ∈ (1/2E[D]n, 2E[D]n)}, using (7.21), This allows us to pick δ n,k := (ω n, k ) τ −2 4(τ −1) , and then δ n,k p th,k → 0 as well as δ n,k → ∞ holds with n → ∞. As a result, the rhs of (7.20) tends to zero, showing that whp, only a negligible fraction of the vertices in O(k) will be thinned.
We are left to show that with k = k(n) = log log log n, we have |R t b +λk | = o P (n). One way to see this is to use [28, Theorem 1.2] about typical distances: typical distances in the graph are 2 log log n/| log(τ − 2)| with bounded fluctuations around this value, while t b + λ k < (1 + ε)2λ/(λ + 1) log log n/| log(τ − 2)|. Hence, the number of vertices at distance at most t b + λ k from the uniformly chosen red source vertex must be o(n).
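The comparison between the two time scales is plain arithmetic, and a short numerical sketch may help to see the gap. The helper names below are hypothetical; the formulas are the leading-order expressions quoted in the paragraph above, with the bounded fluctuations ignored.

```python
import math


def typical_distance(n, tau):
    """Leading-order typical distance 2 log log n / |log(tau - 2)| for tau in (2, 3)."""
    return 2.0 * math.log(math.log(n)) / abs(math.log(tau - 2.0))


def blue_time_bound(n, tau, lam, eps):
    """Upper bound (1 + eps) * lam / (lam + 1) * typical_distance on t_b + lam * k."""
    return (1.0 + eps) * lam / (lam + 1.0) * typical_distance(n, tau)
```

The bound stays strictly below the typical distance whenever (1 + ε)λ/(λ + 1) < 1, that is, whenever ε < 1/λ, which is why a small enough ε suffices for every fixed λ > 1.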
To keep the paper self-contained, we provide another proof of this fact here. For this, note that t b + λ k, t b defined in (6.18) is at most (1+ε)λ λ+1 2 log log n | log(τ −2)| for some ε > 0 whp. To estimate the expected size of the red cluster, we write Now, using that D(R 0 , v)/2 has the distribution t b λ=1 , we can continue the bound as In the last line, we used (6.18) with λ = 1 and put bounded terms there in the (1−2ελ) factor on the rhs inside the probability sign. Further, note that the random variables Y (n) r Y (n) b have asymptotically exponential tails. Hence, the probability is tending to zero as n → ∞. This ensures that most vertices are further away from the source of the red infection than t b + λk and hence R t b +λ k = o(n). This finishes the proof of the lemma.
Proof of the lower bound in Theorem 1.2. First, note that the time T r for red to reach the top of the mountain was a lower bound, i.e., we have shown the existence of a path that reaches the top in time T r whp in Lemma 3.3. Clearly, if red reaches the top earlier, then there is less time for blue to increase its degree, hence, it will occupy fewer vertices. Fortunately, an adaptation of Lemma 5.2 for red instead of blue shows that this cannot happen. That is, one can define the sequence u (r) i by the recursion u (r) 0 := (n C log n) 1/(τ −2) , u (r) i := (u (r) 0 C log n) 1/(τ −2) , and then exchange every superscript (b) to (r) in the definition of BadP k (see right before Lemma 5.2). Applying Lemma 5.2 yields that with high probability, red cannot jump a layer ahead, and hence the time to reach the top remains as defined in T r .
Next, everything from this point on was a concentrated estimate, hence, we only need to check what happens in the last phase, how many vertices blue can actually get from its optional cluster.
Using Lemma 7.4, we see that the log-size of the blue cluster at time t b + λk is whp log O(k) = log O max (k)(1+o P (1)). Note also that by Corollary 7.2, log O max (k) is concentrated and is equal to log M (b) n + $\sum_{j=1}^{k} \log \nu_j$ + o P (1) by Lemma 6.3. Hence, it only remains to give a lower bound on $\sum_{j=1}^{k} \log \nu_j$. For this, note that the lower bound on ν j is the same as the upper bound, with the factor C 1 replaced by c 1 . This factor becomes an additive term when taking the logarithm, and hence contributes only to the o P (1) term. Hence, The right hand side converges to (Y λ b /Y r ) 1/(λ+1) . Combining this with the upper bound completes the proof of Theorem 1.2.

Acknowledgement
The work of EB, RvdH and JK was supported in part by the Netherlands Organisation for Scientific Research (NWO) through VICI grant 639.033.806.
The work of JK was supported in part by NWO through the STAR cluster, and the work of RvdH was supported in part by NWO through Gravitation grant 024.002.003. JK thanks the Probability Group at The University of British Columbia for their hospitality while completing the project.
Appendix A. Path counting methods for restricted paths
Proof of Lemma 5.2. Here we follow the notation of [27, Section 10.4.2] as much as we can. We will use union bound and Markov's inequality to bound the probability of the existence of bad paths: (A.1) First we give an upper bound on the expected number of bad paths conditioned on the degree sequence, so let us fix the degrees first and write d v for the degree of the vertex v. A (directed) path of length k from vertex a to some vertex π k can be described as {(π 0 , s 0 ), (π 1 , t 1 , s 1 ), . . . , (π k−1 , t k−1 , s k−1 ), (π k , t k )} , where π i ∈ [n] is the i-th mid-vertex along the path, s i ∈ [d πi ] denotes the label of the outgoing and t i ∈ [d πi ] the label of the incoming half-edge of π i . Recall that we call a path good if deg(π i ) ≤ u (b) i for all 0 ≤ i ≤ k, and BadP k is a subset of bad paths with k . Since the number of half-edges out of vertex π i is d πi , there are many possible paths via the vertices (π i ) k i=0 . Thus, the expected number of paths through fixed vertices π 0 , . . . , π k equals the probability that a given path in (A.2) is present in CM n (d) multiplied by the combinatorial factor of picking the possible half-edges for the paths, i.e., where L * n is the number of free half-edges when the procedure starts. Thus, the expected number of all self-avoiding bad paths in BadP k equals where * means that we sum over distinct vertices. Allowing non-distinct vertices, we get the upper bound where the factor $e^{k^2/L_n^*}$ is a bound on the term $\prod_{i=1}^{k} L_n^*/(L_n^* - 2i + 1)$ above. Since the path counting starts at time t(n ) = O (1), and typical distances are O (2) in the graph, we have L * n = L n (1 + o(1)) whp (see the proof of Lemma 7.4 for more details).
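The combinatorial factor for the half-edge labels can be checked on small examples. The sketch below assumes the usual count for the configuration model, $d_{\pi_0} \prod_{0<i<k} d_{\pi_i}(d_{\pi_i}-1)\, d_{\pi_k}$: one outgoing label at the start, one incoming label at the end, and an ordered pair of distinct labels at every interior vertex. The displayed formula is not reproduced above, so this form is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def labelings_formula(degs):
    """Assumed count d_0 * prod_{0<i<k} d_i (d_i - 1) * d_k for degrees along the path."""
    total = degs[0] * degs[-1]
    for d in degs[1:-1]:
        total *= d * (d - 1)
    return total


def labelings_brute_force(degs):
    """Enumerate s_0, the pairs (t_i, s_i) with t_i != s_i, and t_k explicitly."""
    choices = [range(degs[0])]
    for d in degs[1:-1]:
        choices.append([(t, s) for t in range(d) for s in range(d) if t != s])
    choices.append(range(degs[-1]))
    return sum(1 for _ in product(*choices))


def expected_paths(degs, free_half_edges):
    """Labelings times prod_{i=1}^k 1/(L - 2i + 1), the chance one labelled path is present."""
    k = len(degs) - 1  # number of edges on the path
    prob = Fraction(1)
    for i in range(1, k + 1):
        prob *= Fraction(1, free_half_edges - 2 * i + 1)
    return labelings_formula(degs) * prob
```

Exact rational arithmetic is used so that the factor $\prod_{i=1}^{k} L_n^*/(L_n^*-2i+1)$ can be inspected without rounding; for k of order log log n it is very close to 1.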
By the Law of Large Numbers, L n /n → E[D], hence the i-th factor on the right hand side is close to while the last factor in (A.5) is close to by the tail behavior (1.1) of the distribution function of D, for some constant C 3 .
Since we need to set k 0 = O(log log n), the error term exp{ k 2 L * n } = 1 + o(1) in (A.5) stays close to 1. Since π 0 is a vertex that belongs to the last generation of the branching process approximation phase, we get an upper bound on the total number of bad paths by contracting all the vertices that belong to the last generation of the blue branching process. Note that by the coupling to the BP, the degrees in this generation are i.i.d. from distribution F * in (1.2), hence for some large constants C , C 2 > 0 whp by the definition of u (b) 0 in (5.5). Further note that, with m := Z [t(n )/λ] and the definition of u (b) 0 again, by Lemma 3.1, Then with this error probability we can write The recursion for u (b) i in (5.5) gives , and then in (A.8), after elementary calculation, the powers of (τ −2) −1 cancel in the exponent of u (b) 0 and C log n, and the formula simplifies to This estimate and (A.7) together implies that the union bound in (A.1) leads to This completes the proof of the upper bound in (7.3).
Similarly as in the previous lemma, a (directed) path of length k from vertex a = π 0 to b = π k can be described as {(π 0 , s 0 ), (π 1 , t 1 , s 1 ), . . . , (π k−1 , t k−1 , s k−1 ), (π k , t k )} . (A.9) We call a path now good if π i ∈ Λ i and good-directed ) for the number of self-avoiding good paths and good-directed paths going from vertex a to b, respectively, and L * n for the total number of half-edges present in CM n (d) at time t b . Similarly as in (A.4), the expected value of all self-avoiding good paths equals where * means that we sum over distinct vertices. Now clearly we have the upper bound Note that L n /L * n → 1 by the argument in Lemma 7.4. The Law of Large Numbers ensures the convergence on the right hand side, so where ν i is from (7.1). By contracting all the vertices belonging to the set M (b) n , we have d a = M (b) n and by letting b be the contraction of all the vertices with degree less than K for some arbitrary constant K ≥ 2, we have d b /L * n ≤ 1 is of constant order again. Note that the total number of explored vertices on paths of length k is bounded from below and from above by Noting that ν i grows super-exponentially, (1)). This finishes the proof of the upper bound in (7.3).
We can get a lower bound on (A.10) if in the sum over distinct vertices, we leave out the i highest degree vertices (A.14) Note that since we leave out only finitely many vertices, the ith sum within the product still converges to ν i . Again contracting all the vertices belonging to the set M (b) n , we have d a = M (b) n and by letting b be the contraction of all the vertices with degree less than K for some arbitrary constant 2 ≤ K, we have d b /L n is of constant order again. Combining with the lower bound in (A.13) finishes the proof of the lower bound in (7.3).
The proof of the bounds (7.4) for good-directed paths are analogous, but now one has to use the restricted sets Λ d j := Λ j \ Λ j+1 = v ∈ [n] : u t b +λ(j+1)−Tr < d v ≤ u t b +λj−Tr .
Next we prove the variance formula for O d max (k) following more or less the lines of [27, Section 10.4.2 and 9.4]. Note that the major difference between the proof of [27, Proposition 9.17] and our case is that here we have the extra restriction π i , ρ i ∈ Λ d i , and the Λ d i sets are disjoint.
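First write N d k (a, b) as the sum of indicators that a given good-directed path is present, and write |π ∩ ρ| for the number of edges the two paths share. Then we have the variance formula
$$\mathrm{Var}\big(N_k^d(a,b)\big) = \sum_{\ell=0}^{k}\ \sum_{\pi,\rho:\ |\pi\cap\rho|=\ell} \big[P(\pi,\rho \subseteq \mathrm{CM}_n(d)) - P(\pi \subseteq \mathrm{CM}_n(d))\,P(\rho \subseteq \mathrm{CM}_n(d))\big].$$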
Consider first the inner sum for ℓ = 0, i.e. when the two paths have disjoint edge-sets. Since at the time of pairing the ith half-edge there are L * n − 2i + 1 free half-edges to pick from, the probability that both π, ρ are present is exactly $\prod_{i=1}^{2k} (L_n^* - 2i + 1)^{-1}$. On the other hand, the square of the probability that a path is present is $P(\pi \subseteq \mathrm{CM}_n(d))^2 = \prod_{i=1}^{k} (L_n^* - 2i + 1)^{-2}$.
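To see how small the contribution of edge-disjoint pairs is, one can compare the two probabilities directly. Dividing the joint probability by the squared single-path probability gives $\prod_{i=1}^{k} (L_n^*-2i+1)/(L_n^*-2(k+i)+1)$, which is slightly larger than 1. The following sketch, with hypothetical function names, evaluates this ratio exactly.

```python
from fractions import Fraction


def path_probability(num_edges, free_half_edges):
    """prod_{i=1}^{num_edges} 1/(L - 2i + 1): chance that prescribed pairings all occur."""
    p = Fraction(1)
    for i in range(1, num_edges + 1):
        p *= Fraction(1, free_half_edges - 2 * i + 1)
    return p


def disjoint_pair_ratio(k, free_half_edges):
    """P(both paths present) / P(one path present)^2 for two edge-disjoint k-edge paths."""
    joint = path_probability(2 * k, free_half_edges)
    single = path_probability(k, free_half_edges)
    return joint / single**2
```

The logarithm of the ratio is roughly 2k²/L for k much smaller than L, so each edge-disjoint pair contributes a covariance of relative size about 2k²/L, which is the kind of term that the Taylor expansion of an exponential factor produces.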
For ℓ = k, that is, when the two paths are identical, P(π ⊆ CM n (d)) − P(π ⊆ CM n (d)) 2 ≤ P(π ⊆ CM n (d)), hence the inner sum can be bounded by the inequality
$$\sum_{\pi,\rho:\ |\pi\cap\rho|=k} P(\pi \subseteq \mathrm{CM}_n(d)) \le E[N_k^d(a,b)], \qquad (A.16)$$
explaining the first term on the right hand side of (7.5). Now we are left with handling the cases 1 ≤ ℓ ≤ k − 1. Note that in these cases we have to evaluate the sum over all possible overlaps between the paths π, ρ. For this, note that the restriction that π i , ρ i ∈ Λ d i and that the Λ d i are disjoint sets implies that for each i there are only two cases: either π i = ρ i or π i ≠ ρ i , but in both cases they are disjoint from all π j , ρ j , j ≠ i. We will merge these cases into shapes. Let us call an excursion of length s a connected component of π \ ρ, that is, a consecutive sequence of edges where the two paths are not the same. Formally, for some i, (π i , π i+1 ) ≠ (ρ i , ρ i+1 ), . . . , (π i+s−1 , π i+s ) ≠ (ρ i+s−1 , ρ i+s ) is an excursion if it is started and ended by the common edges (π i−1 , π i ) = (ρ i−1 , ρ i ) and (π i+s , π i+s+1 ) = (ρ i+s , ρ i+s+1 ), unless i − 1 = −1 or i + s + 1 = k + 1, in which cases there is no edge before/after the excursion, respectively. Since the sets Λ d i are disjoint, there are exactly the same number of edges on the π part of an excursion as on the ρ part of the excursion.
Let us denote by m the number of excursions, and again, we denote by ℓ := |π ∩ ρ| the total number of shared edges. For a fixed m, there can be m + 1, m or m − 1 many segments of π ∩ ρ, depending on whether none of, only one of, or both of a, b are part of an excursion. Let us thus introduce the indicators δ a , δ b , where δ a = 1 if vertex a is part of an excursion and δ b = 1 if vertex b is part of an excursion.
We can now define the class of shapes called $\mathrm{Shape}_{m,\ell}$, corresponding to pairs of paths for which |π ∩ ρ| = ℓ and π \ ρ consists of m excursions. That is, ρ has m edge-disjoint excursions from π, and between two consecutive excursions there is at least one edge in π ∩ ρ. Note that the number of excursions m is thus at most ℓ + 1. Also note that each shape in $\mathrm{Shape}_{m,\ell}$ can be uniquely characterised by a sequence of numbers of the form
Figure 4. Paths of length 8 belonging to $\mathrm{Shape}_{2,3}$: m = 2 indicates that there are two excursions, ℓ = 3 means that the two paths share 3 edges in total. On the first picture, the excursions do not start at the ends of the path, hence δ a = δ b = 0, on the second picture, δ a = 1, δ b = 0, while on the third picture both excursions start at the ends, hence δ a = δ b = 1. Note that in all cases, the number of degree three vertices is 2m − δ a − δ b , and the shared edges form m + 1 − δ a − δ b many connected components.
Note that if |π ∩ ρ| = ℓ is fixed, then there are exactly 2k − ℓ different edges in π ∪ ρ, so that with fixed vertices and fixed half-edges, If we now fix only the vertices, but not the half-edges, then we have to multiply this with a combinatorial factor similar to that in (A.3) counting the number of possible variations of half-edges for fixed vertices (π i , ρ i ) 1≤i≤k . Recall again that δ a = 1{a ∈ first excursion of π \ ρ} and δ b = 1{b ∈ last excursion of π \ ρ}. Let us write d σ (v) for the number of half-edges of v used in the union of paths π, ρ of shape σ, and in text we write degree σ for this degree. At the end of every excursion we have degree σ -3 vertices, while on the excursions and inside segments of π ∩ ρ we have degree σ -2 vertices. Thus the combinatorial factor to pick half-edges, once fixing the vertices along the path (but not the half-edges) is at most L n (A.21) By an argument similar to that in the proof of Claim 5.3, the sums in the previous display are converging to γ is − γ is+1 and ν it − ν it+1 , ν iu − ν iu+1 , respectively. Thus, we get that the rhs of (A.21) is at most as long as u f (i) = o(n 1/(τ −1) ). Since the maximal degree in the graph is of this order, this is always the case. That is, it is not worth merging vertices on excursions. We can continue analysing formula (A.22). Now we identify the indices i s , i t , i u , using the restrictions π i , ρ i ∈ Λ d i . The crucial observation is the following: follow the indices π 1 , π 2 , . . . , π k−1 and ρ 1 , ρ 2 , . . . , ρ k−1 along the two paths. If for some i, the vertices π i ≠ ρ i are degree σ -2 vertices on an excursion, then the corresponding ν 2 i appears in the product in (A.22). If π i = ρ i is a degree σ -3 vertex, then we have a factor γ i replacing ν 2 i in the product. If π i = ρ i is a degree σ -2 vertex in π ∩ ρ, then we only have a factor ν i in the product (instead of ν 2 i ) in (A.22).
Thus, dividing (A.22) by 2k−2 i=1 ν 2 i yields that for each degree σ -3 vertex we have a factor γ i /ν 2 i and for each coinciding degree σ -2 vertex we have a 1/ν i ≤ 1/ν k−1 in the product. Elementary calculation shows that . Now let us set d a := M (b) n and in d b we collect all the vertices that are less than ν k : these contain vertices with constant degree (say all the degrees smaller than K for F (K) = 1/2). This implies that d b ≥ L n /2 whp. Combining the contribution for ℓ = 0 in (A.15), ℓ = k in (A.16), and then m = 1 in (A.25) and finally m ≥ 2 in (A.28) yields (7.5).