Weak Convergence of Non-neutral Genealogies to Kingman’s Coalescent

Interacting particle populations undergoing repeated mutation and fitness-based selection steps model genetic evolution, and describe a broad class of sequential Monte Carlo methods. The genealogical tree embedded in the system is important in both applications. Under neutrality, when fitnesses of particles and their parents are independent, rescaled genealogies are known to converge to Kingman's coalescent. Recent work established convergence under non-neutrality, but only for finite-dimensional distributions. We prove weak convergence of non-neutral genealogies on the space of càdlàg paths under standard assumptions, enabling analysis of the whole genealogical tree. The proof relies on a conditional coupling in a random environment.


Introduction
The n-coalescent (Kingman, 1982a,b,c) is a homogeneous continuous-time Markov process on the space P_n of partitions of {1, ..., n} =: [n]. The non-zero entries of its infinitesimal generator Q are

q_{ξ,η} = −|ξ|(|ξ| − 1)/2 if η = ξ, and q_{ξ,η} = 1 if ξ ≺ η,

for every ξ, η ∈ P_n, where |ξ| denotes the number of blocks in ξ, and ξ ≺ η means that η is obtained from ξ by merging exactly two blocks. It is the limiting genealogical process, as the population size goes to infinity, for samples of n individuals from a wide range of population models. The original work of Kingman (1982c) provides sufficient conditions for convergence of genealogies from Cannings models to the n-coalescent, in the sense of finite-dimensional distributions. Cannings models are characterised by a fixed population size, exchangeable offspring counts, and i.i.d. generations. Möhle (1998) provides sufficient conditions for the wider class of models in which the population size may vary deterministically, the offspring distributions are independent (but not i.i.d.) across generations, and exchangeability is replaced by the weaker random assignment condition. Independence of family sizes in different generations is incompatible with hereditary fitness, and essentially implies neutral reproduction (Del Moral et al., 2009). For that class of models and under the same conditions, Möhle (1999) proves weak convergence of the genealogies as stochastic processes. Möhle (2000) gives a simpler condition which is necessary and sufficient for weak convergence of genealogies from Cannings models to the Kingman coalescent.
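As an illustration of the limit object, the dynamics above can be simulated directly: while k blocks remain, the holding time is exponential with rate k(k − 1)/2, and a uniformly chosen pair of blocks merges. A minimal sketch (illustrative only, with function and variable names of our choosing):

```python
import random

def simulate_kingman(n, seed=1):
    """Simulate one path of Kingman's n-coalescent.

    Returns a list of (time, partition) pairs, where each partition is a
    list of frozensets of {0, ..., n-1}.  While k blocks remain, the
    holding time is Exp(k*(k-1)/2) and a uniformly chosen pair of blocks
    merges, matching the generator rates q_{xi,eta}.
    """
    rng = random.Random(seed)
    partition = [frozenset([i]) for i in range(n)]
    t = 0.0
    path = [(t, list(partition))]
    while len(partition) > 1:
        k = len(partition)
        t += rng.expovariate(k * (k - 1) / 2)   # total merger rate
        i, j = rng.sample(range(k), 2)          # uniform pair of blocks
        merged = partition[i] | partition[j]
        partition = [b for m, b in enumerate(partition) if m not in (i, j)]
        partition.append(merged)
        path.append((t, list(partition)))
    return path

path = simulate_kingman(5)
```

Each run produces n − 1 binary mergers, terminating in the single block [n].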
We consider a still wider class of models where exchangeability is relaxed to random assignment, and independence between generations is not required, so that our results apply to non-neutral models. This class was also treated in Brown et al. (2021), where convergence of finite-dimensional distributions was proved under a non-neutral analogue of the condition of Möhle (2000). Here we prove weak convergence under the same condition. Our proof follows the structure of Möhle (1999, Theorem 3.1), but removing the assumption of independent family sizes between generations results in considerable technical complications because the pre-limiting, reverse-time genealogies are no longer Markov processes. Non-Markovianity rules out some standard weak convergence conditions, such as uniform convergence of semigroups (Ethier and Kurtz, 1986, Chapter 4, Theorem 2.5). We overcome these complications and prove weak convergence of the non-Markovian genealogical processes to the Markovian n-coalescent limit by controlling the modulus of continuity of the pre-limiting processes. Our approach yields weak convergence without assuming Markovianity (Ethier and Kurtz, 1986, Chapter 3, Corollary 7.4 and Theorem 7.8).
The models studied are of interest not only in population genetics, but also in sequential Monte Carlo (SMC): a very broad class of algorithms used in computational statistics and related disciplines (see e.g. Chopin and Papaspiliopoulos, 2020, for an introduction). In this context, the model we study characterises the dynamics of a population of "particles" whose empirical measure approximates a sequence of measures of interest, such as the conditional distribution of the latent part of a hidden Markov model given some observations. Genealogies induced by the resampling step of SMC are critical to the performance of these algorithms, as has been known since SMC was first introduced to the statistics literature (Gordon et al., 1993). Genealogical trees embedded into SMC particle systems have been the subject of numerous studies (Del Moral et al., 2016; Del Moral and Miclo, 2001; Del Moral et al., 2009); see also Del Moral (2004, Chapter 3). However, direct analysis of the marginal genealogies has only taken off recently (Brown et al., 2021; Koskela et al., 2020). Indeed, Brown et al. (2021, Section 4) verify that the conditions of our main result, Theorem 1 presented below, hold for several important classes of SMC algorithms. Hence, Theorem 1 allows us to strengthen several corollaries of Brown et al. (2021) to weak convergence of the genealogical processes, facilitating convergence statements for a larger class of test functions than those which depend only on the finite-dimensional distributions. Prominent examples of functions which cannot be computed from finite-dimensional distributions alone include the time to the most recent common ancestor (TMRCA) and the total branch length of the genealogy, both of which measure the memory cost of storing algorithm output. Jacob et al. (2015) showed that the TMRCA and total branch length of all N particles are both O(N log N), while Koskela et al. (2020) showed that those for a sample of n particles are Θ(N) and Θ(N log n), respectively, strongly suggesting that the results of Jacob et al. (2015) could be sharpened to Θ(N) and Θ(N log N), respectively. Because of the lack of a weak convergence result, the argument of Koskela et al. (2020, Corollary 2) relies on a cumbersome workaround involving a coupling of the genealogical process with a separate n-coalescent. Our Theorem 1 removes the need for such workarounds, and facilitates simpler a priori analysis of marginal SMC genealogies in other settings, such as the design of variance estimation schemes (Olsson and Douc, 2019) and conditional SMC updates in particle MCMC (Andrieu et al., 2010). In both settings, designing a large enough particle system to guarantee that a large number of distinct ancestral lines remain after a given number of generations is crucial to practical performance.

Encoding genealogies
Consider an interacting particle system in which N particles stochastically reproduce in discrete, non-overlapping generations, such that each particle in generation t ∈ N has a single parent in the previous generation. For convenience, we label time in reverse throughout this article, with the terminal generation labelled 0, their parents being in generation 1, and so on. The index of the generation-t parent of individual j in generation t − 1 is denoted a_t^{(j)}, and the number of offspring that individual i in generation t has in generation t − 1 is ν_t^{(i)}. Let (F_t)_{t∈N} be the reverse-time filtration generated by the vectors of offspring counts ν_s^{(1:N)}, s ≤ t. We study the genealogies of finitely many individuals under the asymptotic regime in which N → ∞. In particular, sample n ≤ N individuals from generation 0 uniformly without replacement, and trace back the corresponding lineages to obtain their genealogy which, following Kingman (1982a), is encoded by a P_n-valued stochastic process (G_t^{(n,N)})_{t∈N}: indices i and j belong to the same block of G_t^{(n,N)} if and only if terminal particles i and j share a common ancestor at time t (i.e. t generations back).
Under the assumption (A1) stated below, it is sufficient for our purposes to consider only the offspring counts ν_t^{(1:N)}. Assumption (A1) states that, conditionally on ν_t^{(1:N)}, the vector of parental indices a_t^{(1:N)} is uniform over all assignments such that |{j : a_t^{(j)} = i}| = ν_t^{(i)} for all i. Known as the random assignment condition, (A1) is weaker than exchangeability of the particles within a generation. As we will see, it is still sufficient to yield an exchangeable coalescent process in the N → ∞ limit.
In order to obtain a well-defined limit for the genealogical process as N → ∞, we must scale time by a suitable function τ_N(·). To define this time scale, we first define the conditional pair merger probability

c_N(t) := (N)_2^{-1} Σ_{i=1}^N (ν_t^{(i)})_2,    (1)

where (n)_k := n(n − 1)⋯(n − k + 1) is the falling factorial. This is the probability, conditional on ν_t^{(1:N)}, that a randomly chosen pair of lineages in generation t merges exactly one generation earlier. The interpretation of c_N(t) as a conditional merger probability is justified by assumption (A1), and the same is true of the interpretation of D_N(t) in (3) as an upper bound on the probability of larger mergers. To achieve a limiting pair merger rate of 1, as in the n-coalescent, we rescale time by the left-inverse

τ_N(t) := min{s ∈ N : Σ_{r=1}^s c_N(r) ≥ t}.    (2)

The function τ_N maps continuous to discrete time, providing the link between the discrete-time generations and the continuous-time scaling limit.
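The time change can be illustrated numerically. The sketch below assumes the standard definitions from this literature, c_N(t) = (N)_2^{-1} Σ_{i=1}^N (ν_t^{(i)})_2 and τ_N(t) = min{s : Σ_{r=1}^s c_N(r) ≥ t}, with (n)_k the falling factorial; the function names are ours:

```python
def falling(n, k):
    """Falling factorial (n)_k = n (n-1) ... (n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

def c_N(offspring):
    """Conditional pair-merger probability for one generation, assuming
    c_N(t) = sum_i (nu_i)_2 / (N)_2 with N the total population size."""
    N = sum(offspring)
    return sum(falling(v, 2) for v in offspring) / falling(N, 2)

def tau_N(t, offspring_by_gen):
    """Left-inverse time change: smallest generation s (1-indexed) with
    cumulative pair-merger probability sum_{r<=s} c_N(r) >= t."""
    total = 0.0
    for s, nu in enumerate(offspring_by_gen, start=1):
        total += c_N(nu)
        if total >= t:
            return s
    return float('inf')
```

For example, with N = 4 and offspring counts (2, 0, 2, 0) in every generation, c_N = 4/12 = 1/3 per generation, so τ_N(0.5) = 2: it takes two generations for the cumulative merger probability to reach 1/2.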
Our definition of c_N(t) differs from that which is usual in the population genetics literature, where the deterministic function defined as the expectation of (1) is identified as c_N(t). Our definition yields a time scale which is random, depending on the realisation of ν_t^{(1:N)} for each t. In the context of SMC, this is necessary to accommodate the heterogeneity of the system: the random time scale subsumes the time-inhomogeneity of the model, so that the rescaled ancestry converges to a time-homogeneous process. Inhomogeneous time scales have been studied in the genetics context as models of variable population size, but even in these settings c_N(t) is defined as an expectation, so that the inhomogeneous time scale varies deterministically (Möhle, 2002).
We will also make extensive use of the upper bound D_N(t) in (3) (Koskela et al., 2020) on the conditional probability of a multiple merger (three or more lineages merging, or two or more simultaneous mergers). This is used to control the rate of multiple mergers, which must be dominated by the pair-merger rate as N → ∞ if we are to recover an n-coalescent in the limit. Proposition 1 collects some basic properties of c_N, D_N and τ_N.
Proof. The outermost bounds in (a) follow from the fact that the entries of ν_t^{(1:N)} are non-negative and sum to N. The central inequality follows as outlined in Koskela et al. (2022, Supplement, pp. 13-14). The case s′ = 0 in (b) follows directly from the definition of τ_N in (2) and part (a), while the case s′ > 0 is obtained by applying the s′ = 0 case to both sums in the difference. Finally, (c) follows from (a) and the definition of τ_N in (2).
We recall the following lemma, proved in Brown (2021, Lemma 3.2).
Lemma 1. Fix t > 0, and recall that (F_r) is the backwards-in-time filtration generated by the offspring counts ν_s^{(1:N)}. Let p_{ξη}(t) denote the conditional transition probability of the genealogical process from ξ ∈ P_n to η ∈ P_n given ν_t^{(1:N)}, for each t ∈ N. The only non-zero transition probabilities p_{ξη}(t) are those where η can be obtained from ξ by merging some blocks of ξ (i.e. some lineages coalescing). Ordering the blocks by their least element, denote by b_i the number of blocks of ξ that merge to form block i in η. Then, assuming (A1), the conditional transition probability is given by (4).

We will only need to work directly with the identity transition probabilities p_{ξξ}(t). Following Brown et al. (2021, Lemma 3.6), but keeping the terms in N explicit, yields the following lower bound.
Proposition 2. Let ξ ∈ P_n and N > 2. Then the stated lower bound on p_{ξξ}(t) holds, for some K > 0 that does not depend on |ξ| or N.
Define the asymptotic notation 1_N for generic deterministic sequences vanishing as N → ∞, and let D denote the space of càdlàg paths from [0, ∞) to P_n, equipped with the associated Skorokhod (J1) topology and its Borel σ-algebra.
Theorem 1. Let ν_t^{(1:N)} denote the offspring numbers in an interacting particle system satisfying (A1) and such that, for any sufficiently large N and for all t ∈ [0, ∞), P[τ_N(t) = ∞] = 0. Suppose that there exists a deterministic sequence (b_N)_{N∈N} such that lim_{N→∞} b_N = 0 and, for all large enough N,

D_N(t) ≤ b_N c_N(t)    (5)

almost surely, uniformly in t ≥ 1. Then the rescaled genealogical process (G_{τ_N(t)}^{(n,N)})_{t≥0} converges weakly in D to Kingman's n-coalescent as N → ∞.
Remark 1. Condition (5) is very natural: it requires the rate of (combinations of) mergers involving three or more lineages to be vanishingly small in comparison to that of binary mergers. The sequence (b_N)_{N∈N} controls the rate at which the ratio of rates of large and binary mergers vanishes, and can decay to zero arbitrarily slowly. The exact values of its entries are not special; it is just the sequence that plays an implicit role in the usual little-o notation. As mentioned in Section 1, a natural analogue of (5) is known to be necessary and sufficient for convergence to the n-coalescent in the neutral case (Möhle, 2000, Equation (16)).
Proof of Theorem 1. The structure of the proof follows Möhle (1999), albeit with considerable technical complication due to the dependence between generations (non-neutrality) in our model. To make it digestible, the proof is broken down into a number of results which are organised into sections; the relationships between these are shown in Figure 1.
Convergence of the finite-dimensional distributions was proved in Brown et al. (2021, Theorem 3.2). Strengthening this to weak convergence on the space of processes amounts to establishing relative compactness of the sequence {(G_{τ_N(t)}^{(n,N)})_{t≥0}}_{N∈N}. Since P_n is finite and therefore complete and separable, and the sample paths of (G_{τ_N(t)}^{(n,N)})_{t≥0} live in D, we can apply Ethier and Kurtz (1986, Chapter 3, Corollary 7.4), which states that the sequence of processes is relatively compact if and only if the following two conditions hold:

1. For every ǫ > 0 and rational t ≥ 0, there exists a compact set Γ_{ǫ,t} ⊆ P_n such that lim inf_{N→∞} P[G_{τ_N(t)}^{(n,N)} ∈ Γ_{ǫ,t}] ≥ 1 − ǫ.

2. For every ǫ > 0, t > 0 there exists δ > 0 such that

lim inf_{N→∞} P[ω′((G_{τ_N(·)}^{(n,N)}), δ, t) ≤ ǫ] ≥ 1 − ǫ,    (6)

where ω′ is the modified modulus of continuity

ω′(x, δ, t) := inf_{T_{0:K}} max_{1≤i≤K} sup_{u,v ∈ [T_{i−1}, T_i)} ρ(x(u), x(v)),

with the infimum taken over all partitions of the form 0 = T_0 < T_1 < ⋯ < T_{K−1} < t ≤ T_K with min_{1≤i≤K} (T_i − T_{i−1}) > δ, and ρ a metric on P_n.

Since P_n is finite and hence compact, Condition 1 is satisfied automatically with Γ_{ǫ,t} = P_n. Intuitively, Condition 2 ensures that the jumps of the process are well-separated.
In our case, ρ can be taken to be the discrete metric, so that ρ(G_{τ_N(u)}^{(n,N)}, G_{τ_N(v)}^{(n,N)}) = 1 if there is at least one jump between times u and v, and 0 otherwise. The supremum and maximum indicate whether there is a jump inside any of the intervals of the given partition; this can be zero only if all of the jumps up to time t occur exactly at the times T_{0:K}. The infimum over all allowed partitions can only equal zero if no two jumps occur less than δ (unscaled) time apart.
To prove Theorem 1, it remains to verify Condition 2. To do this, we use a coupling with a process for which Condition 2 is easier to check, and which will imply that it also holds for the genealogical process of interest. Define p̄_t := max_{ξ∈P_n} {1 − p_{ξξ}(t)} = 1 − p_{∆∆}(t), where ∆ := {{1}, ..., {n}}. For a proof that the maximum is attained at ξ = ∆, see Lemma 3. Following Möhle (1999), we construct the process (Z_t, S_t)_{t∈N_0} on N_0 × P_n with initial state (Z_0, S_0) = (0, ∆), and conditional transition probabilities

P[(Z_{t+1}, S_{t+1}) = (z + 1, η) | (Z_t, S_t) = (z, ξ), F_∞] = p_{ξη}(t + 1) if η ≠ ξ, and = p̄_{t+1} − (1 − p_{ξξ}(t + 1)) if η = ξ,
P[(Z_{t+1}, S_{t+1}) = (z, ξ) | (Z_t, S_t) = (z, ξ), F_∞] = 1 − p̄_{t+1}.    (7)

The definition of p̄_t ensures that the second case of (7) is non-negative, attaining the value zero when ξ = ∆. Unlike the corresponding process in Möhle (1999), the transition probabilities in (7) depend on the offspring counts. Thus, (Z_t, S_t) is only Markovian conditional on F_∞, and can be thought of as a time-inhomogeneous Markov chain in a random environment. Marginally, (S_t) has the same distribution as the genealogical process of interest, while (Z_t) jumps whenever (S_t) does, but also has some extra jumps. The jump times of (Z_t) do not depend on the current state, making it much easier to analyse. Our construction also resembles that of Möhle (2002), where the process (S_t) is also time-inhomogeneous but still Markovian without conditioning on a random environment.
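The essential feature of the coupling, namely that the jump times of (Z_t) contain those of (S_t), can be sketched with a toy simulation. The sketch below abstracts away the partition-valued component and the random environment, keeping only per-generation jump probabilities; it assumes jump_prob_s ≤ pbar in each generation, mirroring the definition of p̄_t as a maximal jump probability, and all names are ours:

```python
import random

def coupled_step(jump_prob_s, pbar, rng):
    """One generation of a simplified coupling: with probability
    jump_prob_s both chains jump; with probability pbar - jump_prob_s
    only Z jumps; otherwise neither does.  Requires jump_prob_s <= pbar.
    Returns (z_jumped, s_jumped)."""
    u = rng.random()
    if u < jump_prob_s:
        return True, True
    if u < pbar:
        return True, False
    return False, False

def run(jump_probs, pbars, seed=0):
    """Return the generations at which Z and S jump.  By construction,
    S's jump times are a subset of Z's, so Z's modulus of continuity
    dominates S's."""
    rng = random.Random(seed)
    z_jumps, s_jumps = [], []
    for t, (ps, pb) in enumerate(zip(jump_probs, pbars)):
        z_j, s_j = coupled_step(ps, pb, rng)
        if z_j:
            z_jumps.append(t)
        if s_j:
            s_jumps.append(t)
    return z_jumps, s_jumps

z_jumps, s_jumps = run([0.3] * 50, [0.5] * 50)
```

Whatever the jump probabilities, every jump of S coincides with a jump of Z, which is exactly why controlling the holding times of (Z_t) suffices in the proof.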
The coupling of jumps implies that the modulus of continuity of (Z_t) is at least as large as that of (S_t). Hence, we will show that (6) holds for (Z_t), and conclude that it also holds for the genealogical processes of interest. Denote by 0 = T_0^{(N)} < T_1^{(N)} < ⋯ the jump times of the rescaled process (Z_{τ_N(t)})_{t≥0}, and by ϖ_i^{(N)} := T_i^{(N)} − T_{i−1}^{(N)} the corresponding holding times. Suppose that, for some fixed m ∈ N, we have min_{1≤i≤m} ϖ_i^{(N)} > δ and T_m^{(N)} > t. Then the partition T_{0:m}^{(N)} separates all jumps of (Z_{τ_N(·)}) up to time t by construction, so ω′((Z_{τ_N(·)}), δ, t) = 0. We therefore have that, for each m ∈ N and δ > 0, a sufficient condition for Condition 2 to hold is: for any ǫ > 0, t > 0, there exist m ∈ N, δ > 0 such that

lim inf_{N→∞} P[min_{1≤i≤m} ϖ_i^{(N)} > δ, T_m^{(N)} > t] ≥ 1 − ǫ.    (8)

By Lemma 2, below, the limiting distributions of ϖ_i^{(N)} are i.i.d. Exp(α_n), where α_n := n(n − 1)/2, so lim sup_{N→∞} P[ϖ_i^{(N)} ≤ δ] ≤ 1 − e^{−α_n δ} for each i, and lim sup_{N→∞} P[T_m^{(N)} ≤ t] ≤ 1 − e^{−α_n t} Σ_{j=0}^{m−1} (α_n t)^j / j!, using the series expansion for the Erlang CDF (see for example Forbes et al., 2011, Chapter 15). A union bound then gives a lower bound on the left-hand side of (8), which can be made ≥ 1 − ǫ by taking m sufficiently large and then δ sufficiently small. Since this argument applies for any ǫ and t, (8) is satisfied. Hence, so is Condition 2, and the proof is complete.
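The role of the Erlang tail can be spelled out: since, in the limit, T_m is a sum of m i.i.d. Exp(α_n) holding times, its survival function is the Erlang tail, which tends to 1 as m → ∞ for fixed t, because the partial sums converge to the exponential series:

```latex
\mathbb{P}[T_m > t]
  \;=\; e^{-\alpha_n t}\sum_{j=0}^{m-1}\frac{(\alpha_n t)^j}{j!}
  \;\xrightarrow[m\to\infty]{}\; e^{-\alpha_n t}\, e^{\alpha_n t} \;=\; 1 .
```

This is why the second error term in the union bound can be made arbitrarily small by choosing m large, independently of δ.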
The structure of the proof of weak convergence in Theorem 1 resembles that of the neutral case (Möhle, 1999, Theorem 3.1). The complications arising from non-neutrality have been subsumed into Lemma 2, stated below. As illustrated in Figure 1, its proof relies on a number of technical results, which we state and prove in Section 4, and particularly in Section 4.2.
The neutral techniques of Möhle (1999) have also been used to establish weak convergence of genealogical processes to more general coalescent models featuring simultaneous mergers of more than two lineages (Möhle and Sagitov, 2001; Möhle and Sagitov, 2003; Sagitov, 2003). Proofs of these results invariably contain two main parts: convergence of finite-dimensional distributions (established via convergence of generators in the Markovian setting), and control of the modulus of continuity. We expect that this strategy will allow similar results for convergence of non-neutral models to multiple merger coalescents: use an approach reminiscent of Koskela et al. (2020, Theorem 1) to establish convergence of finite-dimensional distributions, and then adapt Theorem 1 to control the relevant modulus of continuity.
Throughout the remainder of the manuscript, we write x 1:k ≤ y 1:k for vectors x 1:k := (x 1 , . . ., x k ) and y 1:k if the inequality holds elementwise.
Proof. For any k ∈ N, there is a continuous bijection between the jump times T_{1:k} and the holding times ϖ_{1:k}, so convergence of the holding times to ϖ_{1:k} is equivalent to convergence of the jump times to T_{1:k}, where T_i := ϖ_1 + ⋯ + ϖ_i. We will work with the jump times, following Möhle (1999, Lemma 3.2). The idea is to prove by induction that, for any k ∈ N and t_{1:k} > 0, the limit (9) holds. Take the basis case k = 1, for which the event in question is that the process has no jumps up to time t: Lemma 7 shows that this probability converges to e^{−α_n t}, as required.
For the induction step, assume that (9) holds for some k, and decompose the event {T_{1:k} ≤ t_{1:k}} according to whether T_{k+1} > t_{k+1}. The first term on the right-hand side converges to P[T_{1:k} ≤ t_{1:k}] by the induction hypothesis, and it remains to show that the limit (10) holds. As shown in Möhle (1999, p. 459), the probability on the left-hand side of (10) can be written as a sum over configurations in which there are jumps at some times r_{1:k}, and identity transitions at all other times. A similar expression is derived in Möhle (1999), but here we have an additional expectation because the probabilities p̄_r depend on the random offspring counts. Lemmata 8 and 9 show that this probability converges to the correct limit. This completes the induction.
Proof. Consider any ξ ∈ E consisting of k blocks (1 ≤ k ≤ n − 1), and any ξ′ ∈ E consisting of k + 1 blocks. Setting η = ξ in (4) yields an expression for p_{ξξ}(t), and similarly for p_{ξ′ξ′}(t). Discarding the zero summands shows that p_{ξξ}(t) is decreasing in the number of blocks of ξ, and is therefore minimised by taking ξ = ∆, which uniquely achieves the maximum of n blocks. This choice in turn maximises 1 − p_{ξξ}(t), as required.
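The monotonicity can be checked concretely in the simplest neutral special case, the Wright-Fisher model, where each of k lineages independently picks a parent uniformly from N individuals and the identity transition probability is (N)_k / N^k (all k lineages choose distinct parents). This is an illustration of the lemma's conclusion in a special case, not the general non-neutral formula (4):

```python
def p_identity_wf(N, k):
    """Identity transition probability for k lineages in a neutral
    Wright-Fisher model: the probability that all k lineages choose
    distinct parents, i.e. (N)_k / N^k."""
    p = 1.0
    for i in range(k):
        p *= (N - i) / N
    return p

# p_identity_wf(N, k) is strictly decreasing in k, so 1 - p_identity
# is maximised when every lineage is in its own block, as in Lemma 3.
probs = [p_identity_wf(100, k) for k in range(1, 11)]
```

Each additional block multiplies the identity probability by a factor (N − k)/N < 1, which is exactly the mechanism by which ∆ minimises p_{ξξ}(t).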

Bounds on sum-products
In this section we derive tractable bounds on sums of products of conditional merger probabilities, which themselves appear as upper and lower bounding envelopes of the conditional transition probability p̄_r in Propositions 2 and 3. These sums of products can be regarded as building blocks of the conditional transition probabilities of the genealogical process, and the bounds obtained here facilitate proving its convergence. The sum-product bounds will be applied multiple times in the lemmata of this section.
Lemma 4. Fix t > s > 0 and l ∈ N. Then the bounds in parts (a) and (b) hold.

Proof. (a) This follows from the displayed chain of inequalities, the first of which follows from a multinomial expansion of the middle term and the second from Proposition 1(b).
(b) We begin by multiplying the bound in Koskela et al. (2022, Supplement, equation (8)) by the indicator 1{c_N(s) ≤ t − s}, which is valid because the left-hand side is non-negative, and where the final inequality follows from the definition of τ_N and Lemma 4(a); tracking the event {c_N(s) ≤ t − s} is necessary in case of a large negative value of the lower bound t − s − c_N(τ_N(s)) when l is even. A binomial expansion of the first term on the right-hand side, followed by an application of Proposition 1(a), results in the stated lower bound. For the upper bound, we use the definition of τ_N; a binomial expansion and Proposition 1(a) then yield the result. For later uses of Lemma 4 when s = 0, we emphasise that c_N(τ_N(0)) = 0.
Lemma 5. Fix t > 0 and l ∈ N. Then the stated bound holds for any constant B > 0.

Proof. We start with the displayed binomial expansion. Since we are summing over all permutations of s_{1:l}, the inner sum depends on I only through its size I := |I|. We may therefore replace the sum over I ⊆ [l] with a sum over the size I of the subset and a binomial coefficient counting the number of terms in which the subset is of size I, where we also separate out the I = l term. There is always at least one D_N factor in the second term on the right-hand side, so using Proposition 1(a), Lemma 4(a), and the binomial theorem, we obtain (13). Substituting (13) into (12) concludes the proof.
Lemma 6. Fix t > 0 and l ∈ N. Then the stated lower bound holds for any constant B > 0.

Proof. A binomial expansion and manipulations as in (11)-(12) give the displayed bound, where the inequality arises because some positive terms have been multiplied by −1. Then (13) concludes the proof, noting that an upper bound on negative terms results in an overall lower bound.
Since l distinct objects can always be ordered, Lemmata 5 and 6 can also be phrased in terms of summations over ordered rather than distinct variables.We will use whichever representation is more convenient on a case-by-case basis.

Main components of induction argument
This section contains the technical aspects of the proof of Lemma 2, which establishes the limiting distributions of holding times of the coupled process via an induction argument.It is split into four lemmata: the first (Lemma 7) is used in the basis step, and the others in the induction step, which is established by combining upper and lower bounds proved in Lemmata 8 and 9, respectively.Lemma 10 is a technical result which is common to both the upper and lower bounds, determining the limit as N → ∞ of a certain expectation that arises in both cases.
The following limits, (14)-(16), are all consequences of (5), and hold for all t > s > 0 as N → ∞. Proofs are given in Brown et al. (2021), in Lemmata 3.4 (with small tweaks), 3.3, and 3.5, respectively.
Proof. We start by showing that the limit superior of the expectation in question is at most e^{−α_n t}. Setting ξ = ∆ in Proposition 3, we obtain, for each r and sufficiently large N, the bound (17). Since our interest is in the N → ∞ limit, it is sufficient to have bounds that hold for large enough N. However, some of the manipulations to follow will also require pre-limiting bounds to be non-negative. For this reason we introduce indicator functions which guarantee non-negativity, but which will not affect the limit. The indicators introduced at this point are such that, if their conditions do not hold, then the bound becomes the trivial 1 − p̄_r ≤ 1.
When N ≥ 3, a sufficient condition to ensure that the expression on the right-hand side of (17) is non-negative is that the event (18) occurs, where the sequence 1_N is the same as that in (17). We will also need to control the sign of c_N(r) − B′_n D_N(r), for which we define the event (19), and we define E_N^2 accordingly. Applying a multinomial expansion and then separating the positive and negative terms gives (20). This is further bounded by applying Lemma 6 and then both bounds of Lemma 4(b), yielding (21). Collecting some terms, the requirement τ_N(t) ≥ l has been dropped in all but the first term, which constitutes adding some positive terms, giving an upper bound. Now, taking the expectation and limit, applying (14)-(16), and using Lemmata 12, 13 and 14 to show that the probabilities of the relevant events converge to 1, we obtain the upper bound (22). Passing the limit and expectation inside the infinite sum is justified by dominated convergence and Fubini's theorem.

It remains to show the corresponding lower bound (27), for a constant B_n > 0. Due to Proposition 1(a), a sufficient condition for this bound to be non-negative is the event (24), and we define E_N^3 accordingly. The resulting expression is also a valid lower bound since, if E_N^3 does not occur, it collapses to the trivial lower bound 1 − p̄_t ≥ 0. We now apply a multinomial expansion to the product, and split into positive and negative terms, giving (25). From here, the argument for the lower bound follows the same steps as that used to obtain the upper bound. The right-hand side of (25) is further bounded via Lemma 5 and both bounds in Lemma 4(b). Collecting terms, dropping indicators from some non-positive terms, taking the expectation and limit, applying (14)-(16) to show that all but the first sum vanish, and using Lemmata 12 and 13 to show that the probabilities of the relevant events converge to 1, yields (27). Again, passing the limit and expectation inside the infinite sum is justified by dominated convergence and Fubini. Combining the upper and lower bounds in (22) and (27) respectively concludes the proof.
Lemma 8 (Induction step upper bound). Assume (5) holds. Fix k ∈ N, i_0 := 0, i_k := k. Then, for any sequence of times 0 = t_0 ≤ t_1 ≤ ⋯ ≤ t_k ≤ t, the stated bound holds.

Proof. We use the bound on (1 − p̄_r) from (17), which holds for sufficiently large N, and apply a multinomial expansion. Define events E_N^1 and E_N^2 as intersections of events of the form in (18) and (19), such that the manipulations in (28) make sense. The penultimate line of (28) is exactly the expansion we had in the basis step (20), except for the upper limit of the summation over l, and as such following the same arguments gives a bound analogous to that in (21). For the last line of (28), the penultimate inequality uses Lemma 4(a). Substituting the preceding two displays into (28), we obtain (29). To obtain a corresponding bound for p̄_r, we use (23) and Lemma 5 (with ordered rather than distinct indices) to obtain (30). The looser but simpler bound (31) will also be useful; using Lemma 4(a), (31) also leads to the deterministic bound (32). All the ingredients for obtaining the bound in the statement of Lemma 8 are now in place. First, we apply (29). To further bound the right-hand side, we apply (30) to the first term, (32) to the second, and (31) to the third. Taking an expectation and letting N → ∞, the second, third, fourth, and fifth lines on the right-hand side vanish by (14)-(16), leaving the limit (33), where passing the limit and expectation inside the infinite sum is justified by Lemma 16.
To see that the last line vanishes, recall that 0 = t_0 ≤ t_1 ≤ ⋯ ≤ t_k ≤ t, and use Lemma 4(a) for the final inequality. Hence, by (15), the last line has vanishing limit. By Lemmata 12, 13 and 14, the probabilities of the relevant events converge to 1, so we can apply Lemma 10 to the remaining expectations in (33), yielding the claimed limit.

Lemma 9 (Induction step lower bound). Assume (5) holds. Fix k ∈ N, i_0 := 0, i_k := k. Then, for any sequence of times 0 = t_0 ≤ t_1 ≤ ⋯ ≤ t_k ≤ t, the stated bound holds.

Proof. Firstly, we factorise the probability of interest as in (34). The second product on the right-hand side does not depend on r_{1:k}, and we can use the lower bound from (26), where E_N^3 is defined in and immediately beneath (24). We will also need an upper bound on this product, which is formed from (21) with the further deterministic bound (36), where the second inequality uses Proposition 1, parts (a) and (b). Now consider the remaining sum-product of p̄_{r_i}-factors on the right-hand side of (34). We use the same bound on p̄_r as in (17), displayed in (37), where the 1_N term does not depend on r. The right-hand side of (37) is non-negative on the event E_N^2, defined in and beneath (19). Applying Lemma 6 with ordered indices, we obtain an expression already split into positive and negative terms; a lower bound on (34) can be formed by multiplying the positive terms by the lower bound (35) and the negative terms by the upper bound (36). Due to (14)-(16), all but the first line on the right-hand side of the resulting bound have vanishing expectation. Passing the limit and expectation inside the infinite sum is justified by Lemma 16. Lemmata 12 and 14 establish that lim_{N→∞} P[E_N^2 ∩ E_N^3] = 1, and Lemma 13 deals with the indicator for {τ_N(t) ≥ l}. We can therefore apply Lemma 10 to conclude the claimed limit, as required.
Proof. As pointed out by Möhle (1999, p. 460), the sum-product on the left-hand side of the statement can be expanded in terms of the factors c_N(r_i).
Define the events E_N^4(j) and F_N^4(j), where the upper bound on the right-hand sides is strictly positive since t_j > t_{j−1}, and thus satisfies the conditions of Lemmata 12 and 15. Define the event E_N^4 := ∩_{j=1}^k [E_N^4(j) ∩ F_N^4(j) ∩ {c_N(τ_N(t_{j−1})) ≤ t_j − t_{j−1}}]. We can now evaluate the limit: both the upper and lower bounds converge to

Σ_{i_{1:k} : i_j ≥ j} Π_{j=1}^k (t_j − t_{j−1})^{i_j − i_{j−1}} / (i_j − i_{j−1})!,

where we used (14) and (15) to conclude that the sum in the expectation vanishes, and Lemmata 11, 12, and 15 to obtain that lim_{N→∞} P[E_N ∩ E_N^4] = 1. The upper and lower bounds coincide, so the proof is complete.

Indicators
Many of the preceding results make use of indicator functions in order to control the sign of certain terms. The probabilities of the corresponding events were claimed to converge to 1 as N → ∞, so that the indicators do not affect the limit. These claims are proved here. Firstly, Lemma 11 was proved in Brown (2021), and shows that it suffices to prove the limits separately for each factor in a product of indicators of two or more events.
Lemma 11 (Brown, 2021, Lemma 4.11).

The remainder of this section is split into four lemmata, each showing that the probabilities of certain events converge to 1 as N → ∞. The first three are variants of Koskela et al. (2022, Supplement, Lemma 4), with analogous proofs. For completeness, self-contained proofs of Lemmata 12-14 can be found in Brown (2021, Lemmata 4.12-4.14).

Figure 1 :
Figure 1: Graph showing dependencies between the lemmata used to prove weak convergence. Dotted arrows indicate dependence via a slight modification of the preceding lemma.
Throughout the remainder of the paper, we assume N is large enough that any 1_N terms are positive. The following upper bound can be found in Koskela et al. (2022, Supplement, Lemma 1, Case 1).