On the Number of Collisions in $\Lambda$-Coalescents

We examine the total number of collisions $C_n$ in the $\Lambda$-coalescent process which starts with $n$ particles. A linear growth and a stable limit law for $C_n$ are shown under the assumption of a power-like behaviour of the measure $\Lambda$ near $0$ with exponent $0<\alpha<1$.


Introduction
A system of particles undergoes a random Markovian evolution according to the rules of the Pitman-Sagitov $\Lambda$-coalescent [13,14] if the only possible type of interaction is a collision affecting two or more particles that merge together to form a single particle. When the total number of particles is $b \ge 2$, a collision affecting some $2 \le j \le b$ particles occurs at the probability rate $\lambda_{b,j} = \binom{b}{j} \int_0^1 x^{j-2}(1-x)^{b-j}\,\Lambda(dx)$, (1) where $\Lambda$ is a given finite measure on $[0, 1]$. A linear time change allows one to rescale $\Lambda$ by its total mass, making it a probability measure, which is always assumed below. Two important special cases are Kingman's coalescent [10] with $\Lambda$ a unit mass at 0 (when only binary collisions are possible), and the Bolthausen-Sznitman coalescent [5] with $\Lambda$ the Lebesgue measure on $[0, 1]$. See [1,2,8,12] for recent work on the $\Lambda$-coalescents and further references. A quantity of considerable interest is the number of collisions $C_n$ which occur as the system progresses from the initial state with $n$ particles to the terminal state with a single particle. Representing the coalescent process by a genealogical tree, $C_n$ can also be understood as the number of non-leaf nodes. Asymptotic properties of $C_n$ are sensitive functions of the behaviour of $\Lambda$ near 0. In this paper we explore the class of measures which satisfy $\Lambda([0, x]) = Ax^{\alpha} + O(x^{\alpha+\varsigma})$ as $x \downarrow 0$, with $0 < \alpha < 1$ and $\varsigma > 0$. (2) Under this assumption we show that $C_n \sim (1 - \alpha)n$ as $n \to \infty$ (Lemma 6) and that the law of $C_n$ approaches a completely asymmetric stable distribution of index $2 - \alpha$ (Theorem 9). The same question for the Bolthausen-Sznitman coalescent has been addressed recently in [6,7]. This can be viewed as a limiting case of (2) with $\alpha = 1$. However, the technique of [6,7] is based on the particular form of $\Lambda$ in that case and hence cannot be applied to a general $\Lambda$ satisfying (2) with $\alpha = 1$.
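For the two special cases named above, the collision rates can be written out explicitly. The short computation below is an illustrative sketch: it assumes the standard rate formula $\lambda_{b,j} = \binom{b}{j}\int_0^1 x^{j-2}(1-x)^{b-j}\Lambda(dx)$ with the convention $0^0 = 1$, and $B(\cdot,\cdot)$ denotes the beta function.

```latex
\begin{align*}
\Lambda=\delta_0:\quad
  & \lambda_{b,j} = \binom{b}{j}\int_0^1 x^{j-2}(1-x)^{b-j}\,\delta_0(dx)
    = \binom{b}{2}\,\mathbf{1}\{j=2\},\\
\Lambda=\text{Lebesgue}:\quad
  & \lambda_{b,j} = \binom{b}{j}\,B(j-1,\,b-j+1)
    = \frac{b!}{j!\,(b-j)!}\cdot\frac{(j-2)!\,(b-j)!}{(b-1)!}
    = \frac{b}{j(j-1)},\\
  & \lambda_b = \sum_{j=2}^{b}\frac{b}{j(j-1)}
    = b\Bigl(1-\frac{1}{b}\Bigr) = b-1 .
\end{align*}
```

In the first case only binary collisions have positive rate, in agreement with the description of Kingman's coalescent; in the second case the size of a collision among $b$ particles has probabilities $b/((b-1)j(j-1))$, $2 \le j \le b$.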
If $\Lambda$ is a beta$(\alpha, 2 - \alpha)$ distribution with parameter $0 < \alpha < 2$, a time-reversal of the coalescent describes the genealogy of a continuous-state branching process [4]. This connection was exploited recently to study the small-time behaviour of $\Lambda$-coalescents [1,2] in the beta case.
We develop here a more robust and straightforward approach based on the analysis of the decreasing Markov chain $M_n$ counting the number of particles. The number of collisions $C_n$ is the number of steps needed for $M_n$ to reach the absorbing state 1 from state $n$. In Kingman's case $M_n$ has unit decrements, but in general the decrements of $M_n$ are not stationary, which is a major source of difficulties preventing direct application of the classical renewal theorems for step distributions with infinite variance [9]. To overcome this obstacle we show that when (2) holds, in a certain range $M_n$ can be bounded from above and below by processes with stationary decrements. This allows us to approximate $C_n$, and it turns out that these bounds can be made tight enough to derive the limit theorem. Our method may be of interest in the wider context of pure death processes.
By Schweinsberg's result [16] a coalescent satisfying (2) comes down from infinity, hence the number of particles existing at a fixed time is bounded uniformly in $n$. Therefore the asymptotics of the number of collisions that occur prior to some fixed time are the same as those of $C_n$.

Markov chain M n
Let $M_n$ be the Markov chain whose time ticks at the collision events and whose state coincides with the number of remaining particles. Since no two collisions occur simultaneously, the number of collisions $C_n$ in the $\Lambda$-coalescent starting with $n$ particles is the number of steps the Markov chain $M_n$ needs to proceed from the initial state $n$ to the terminal state 1. Note that the number of particles decreases by $j - 1$ when a collision affects $j$ particles, hence the probability of transition from $b$ particles to $b - j + 1$ is $q_b(j) = \lambda_{b,j}/\lambda_b$, $2 \le j \le b$, (3) where $\lambda_b$ is the total collision rate of $b$ particles, $\lambda_b = \sum_{j=2}^{b} \lambda_{b,j} = \int_0^1 \bigl(1 - (1-x)^b - bx(1-x)^{b-1}\bigr)x^{-2}\,\Lambda(dx)$. (4) It is convenient to introduce the sequence of moments $\nu_b = \int_0^1 (1-x)^b\,\Lambda(dx)$. (5) In view of $\lambda_{b,2} = \binom{b}{2}\nu_{b-2}$ the rates $\lambda_{b,2}$ $(b = 2, 3, \dots)$ uniquely determine the whole array $\lambda_{b,j}$, as one can also conclude from the consistency relation $(b+1)\lambda_{b,j} = (b+1-j)\lambda_{b+1,j} + (j+1)\lambda_{b+1,j+1}$, which is equivalent to the integral representation of rates (1), see [13].
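The quantities of this section can be evaluated numerically when $\Lambda$ is a beta distribution, since the integral in the rate formula is then a ratio of beta functions. The following Python sketch is only an illustration: the function names and the choice of the beta$(a, c)$ family are assumptions made here, not notation from the paper.

```python
from math import comb, exp, lgamma


def log_beta(p, q):
    """Logarithm of the beta function B(p, q)."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)


def rate(b, j, a, c):
    """Rate lambda_{b,j} of a j-collision among b particles for Lambda = beta(a, c).

    Uses lambda_{b,j} = C(b,j) * int x^(j-2) (1-x)^(b-j) Lambda(dx)
                      = C(b,j) * B(a+j-2, c+b-j) / B(a, c).
    """
    return comb(b, j) * exp(log_beta(a + j - 2, c + b - j) - log_beta(a, c))


def total_rate(b, a, c):
    """Total collision rate lambda_b = sum_{j=2}^{b} lambda_{b,j}."""
    return sum(rate(b, j, a, c) for j in range(2, b + 1))


def moment(k, a, c):
    """Moment nu_k = int (1-x)^k Lambda(dx) = B(a, c+k) / B(a, c)."""
    return exp(log_beta(a, c + k) - log_beta(a, c))


def jump_law(b, a, c):
    """Distribution q_b(j) = lambda_{b,j} / lambda_b of the collision size."""
    lam = total_rate(b, a, c)
    return {j: rate(b, j, a, c) / lam for j in range(2, b + 1)}
```

With `a = c = 1` (the Bolthausen-Sznitman case) `rate(b, j, 1.0, 1.0)` reduces to $b/(j(j-1))$, and the identities $\lambda_{b,2} = \binom{b}{2}\nu_{b-2}$ and $\lambda_{b+1} - \lambda_b = b\nu_{b-1}$ can be checked numerically for any admissible parameters.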
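A simple computation shows that the rates can be derived from the $\nu_b$'s as $\lambda_{b,j} = \binom{b}{j}\sum_{i=0}^{j-2}(-1)^i\binom{j-2}{i}\nu_{b-j+i}$ and, from $\lambda_{b+1} - \lambda_b = b\nu_{b-1}$, we have $\lambda_b = \sum_{k=1}^{b-1} k\nu_{k-1}$. Since the second difference of We shall denote by $J_b$ a random variable with distribution $\mathbb{P}(J_b = j) = q_b(j)$, $j = 2, \dots, b$, so the first decrement of $M_n$ is distributed as $J_n - 1$, and its mean value is $\mathbb{E}(J_n - 1) = \frac{n}{\lambda_n}\sum_{k=0}^{n-2}\nu_k - 1$. (9) Let $g(n, b)$ denote the Green kernel, equal to the probability that the Markov chain $M_n$ ever visits state $b$. We have $g(n, n) = 1$, and $g(n, 1) = 1$ since 1 is the absorbing state reached in at most $n - 1$ steps. Decomposition over the first jump shows that the Green kernel satisfies the recursion $g(n, b) = \sum_{j=2}^{n-b+1} q_n(j)\,g(n-j+1, b)$, $1 \le b < n$. (10) The moments of $C_n$ can be readily expressed in terms of the Green kernel.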
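The first-jump decomposition of the Green kernel translates directly into a dynamic programme. The sketch below is an illustration under stated assumptions: `jump_law` is a hypothetical user-supplied callback returning the collision-size probabilities $q_m(j)$, and the two example laws are the Kingman and Bolthausen-Sznitman cases, for which these probabilities are explicit.

```python
def green_kernel(n, jump_law):
    """Return g with g[m][b] = P(chain started at m ever visits b), 1 <= b <= m <= n.

    jump_law(m) must return a dict {j: q_m(j)} for j = 2..m; it is a
    hypothetical callback, not part of the paper's notation.
    """
    g = {1: {1: 1.0}}
    for m in range(2, n + 1):
        q = jump_law(m)
        row = {m: 1.0}
        for b in range(1, m):
            # First-step decomposition: a j-collision moves the chain to
            # m - j + 1, which must still be at least b for b to be reachable.
            row[b] = sum(p * g[m - j + 1][b]
                         for j, p in q.items() if m - j + 1 >= b)
        g[m] = row
    return g


def kingman(m):
    """Kingman's coalescent: only binary collisions."""
    return {2: 1.0}


def bolthausen_sznitman(m):
    """Bolthausen-Sznitman coalescent: q_m(j) = m / ((m-1) j (j-1))."""
    return {j: m / ((m - 1) * j * (j - 1)) for j in range(2, m + 1)}
```

For Kingman's coalescent every state is visited, so all entries equal 1; for the Bolthausen-Sznitman coalescent started from 3 particles, state 2 is visited exactly when the first collision is binary, which has probability 3/4.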
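Lemma 1. The first two moments of the number of collisions in the $\Lambda$-coalescent started with $n$ particles are $\mathbb{E}\,C_n = \sum_{b=2}^{n} g(n, b)$ (11) and $\mathbb{E}\,C_n^2 = \sum_{b=2}^{n} g(n, b)\bigl(1 + 2\sum_{c=2}^{b-1} g(b, c)\bigr)$. (12) Proof. Formula (11) is obvious since the number of collisions is the number of sites $b > 1$ visited by $M_n$. Still, it is instructive to derive (11) from the first-step decomposition $C_n \overset{d}{=} 1 + C_{n-J_n+1}$, (13) where on the right-hand side $J_n$ has distribution (3) and is independent of $C_1, \dots, C_n$, with $C_1 = 0$. Taking expectations on both sides of (13) we obtain $\mathbb{E}\,C_n = h_n + \sum_{j=2}^{n} q_n(j)\,\mathbb{E}\,C_{n-j+1}$, (14) where $h_n \equiv 1$. Now replace repeatedly $\mathbb{E}\,C_{n-1}$, $\mathbb{E}\,C_{n-2}$, etc. using this recursion. Collecting the coefficients of $h_b$ we see that each of them is the sum over all decreasing paths from $n$ to $b$ of the probabilities of the paths, that is $g(n, b)$ by definition. Using the relations $g(n, n) = 1$ and $\mathbb{E}\,C_1 = 0$ we arrive at (11).
To calculate the second moment, equation (13) is squared, so $C_n^2 \overset{d}{=} 1 + 2C_{n-J_n+1} + C_{n-J_n+1}^2$, from which $\mathbb{E}\,C_n^2 = 1 + 2\sum_{j=2}^{n} q_n(j)\,\mathbb{E}\,C_{n-j+1} + \sum_{j=2}^{n} q_n(j)\,\mathbb{E}\,C_{n-j+1}^2$. This has the same structure as (14) with $h_n = 1 + 2\sum_{j=2}^{n} q_n(j)\,\mathbb{E}\,C_{n-j+1}$. Expressing $\mathbb{E}\,C_{n-j+1}$ from (11) and using recursion (10) yields $h_n = 1 + 2\sum_{b=2}^{n-1} g(n, b)$. Following the same line as above we get (12).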
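The first-step decomposition used in the proof also gives a direct way to compute the first two moments numerically, without forming the Green kernel. The following sketch assumes a hypothetical callback `jump_law(k)` returning the probabilities $q_k(j)$; the Bolthausen-Sznitman law is included only as a test case with explicit probabilities.

```python
def collision_moments(n, jump_law):
    """Return lists (m1, m2) with m1[k] = E C_k and m2[k] = E C_k^2, 1 <= k <= n.

    Index 0 is unused. jump_law(k) is a hypothetical callback returning
    a dict {j: q_k(j)} for j = 2..k.
    """
    m1 = [0.0] * (n + 1)
    m2 = [0.0] * (n + 1)
    for k in range(2, n + 1):
        q = jump_law(k)
        # After a j-collision the chain restarts from k - j + 1 particles.
        mean_rest = sum(p * m1[k - j + 1] for j, p in q.items())
        square_rest = sum(p * m2[k - j + 1] for j, p in q.items())
        m1[k] = 1.0 + mean_rest
        m2[k] = 1.0 + 2.0 * mean_rest + square_rest
    return m1, m2


def bolthausen_sznitman(k):
    """Bolthausen-Sznitman coalescent: q_k(j) = k / ((k-1) j (j-1))."""
    return {j: k / ((k - 1) * j * (j - 1)) for j in range(2, k + 1)}
```

As a sanity check, with only binary collisions (`lambda k: {2: 1.0}`) the recursion returns $\mathbb{E}\,C_n = n - 1$ and $\mathbb{E}\,C_n^2 = (n-1)^2$, as it must since $C_n = n - 1$ deterministically in Kingman's case.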

Asymptotics of the moments
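From now on we only consider measures $\Lambda$ satisfying (2). Standard Tauberian arguments show that in this case $\nu_n = A\,\Gamma(\alpha + 1)\,n^{-\alpha} + O(n^{-\alpha-\varsigma'})$ as $n \to \infty$. Here and henceforth $\varsigma' = \min\{1, \varsigma\}$.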
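The role of $\varsigma'$ can be seen in the simplest instance of (2). The computation below is an illustrative sketch assuming $\Lambda[0, x] = x^{\alpha}$ exactly (so $A = 1$ and the remainder in (2) vanishes), with $\nu_n = \int_0^1 (1-x)^n\,\Lambda(dx)$.

```latex
\begin{align*}
\nu_n &= \int_0^1 (1-x)^n\,\alpha x^{\alpha-1}\,dx
       = \alpha\,B(\alpha,\,n+1)
       = \frac{\Gamma(\alpha+1)\,\Gamma(n+1)}{\Gamma(n+\alpha+1)} \\
      &= \Gamma(\alpha+1)\,n^{-\alpha}\bigl(1 + O(n^{-1})\bigr),
       \qquad n \to \infty .
\end{align*}
```

Even with no remainder in (2), the gamma-function ratio contributes a relative error of order $n^{-1}$, so the error exponent cannot be improved beyond $\min\{1, \varsigma\}$.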
This behaviour will imply that the transition probabilities q n (j) stabilise as n → ∞ for each fixed j.
The relevant asymptotics of λ n and λ n,j appeared in [3,Lemma 4] under a less restrictive assumption of regular variation, but we need to explicitly control the error term.
Proof. Introduce the truncated moment Integrating by parts we derive from (2) that for $x \to 0$ Rewriting (1) in terms of $G_{-2}$ and integrating by parts we obtain for $\varsigma < 2 - \alpha$, in which case the result follows from the familiar asymptotics of the gamma function $\Gamma(n + \beta)/\Gamma(n) = n^{\beta} + O(n^{\beta-1})$ $(n \to \infty)$. If $\varsigma > 1$ the error term in this expansion is the dominant part of the error, which is why $\varsigma'$ appears instead of $\varsigma$. The case $\varsigma \ge 2 - \alpha$ is treated in the same way.
Proof. The formula for λ n follows by the direct application of Lemma 2 with m = 2. Expression for λ n,j is a difference between two subsequent tail sums. The ratio of these quantities gives q n (j).
Thus $J_n$ converges in distribution. Convergence in mean also holds. Note that the mean of the limiting distribution of jumps Lemma 4. If (2) holds then the mean decrease of the number of particles after a collision satisfies Proof. By assumption (2) relation (16) implies the existence of constants $n_0$, $c$ such that Approximating sums by integrals yields, as $n \to \infty$, by definition of $\varsigma'$. Substituting this expression into (9) and applying Corollary 3 finishes the proof.
Example. It is possible to choose the measure $\Lambda$ so that the decrement probabilities for $j < n$ are exactly the same as for the limiting distribution truncated at $n$, in which case the envisaged limit theorem for $C_n$ follows readily from [9]. To achieve this one should take the measure which is a mixture of beta$(\alpha, 1)$ and a Dirac mass at 1. Adding $\delta_1$ does not affect $\lambda_{n,j}$ for $j < n$, so the integration in (1) yields Summation (or direct integration of (4)) implies the desired expression for $q_n(j)$. That a positive mass at 1 does not affect the asymptotics of $C_n$ is seen e.g. by observing that the probability of total collision implied by this mass is of order smaller than $n^{-1}$, namely $q_n(n) = O(n^{\alpha-2})$. On the continuous time scale of the coalescent, the mass at 1 is responsible for the total coalescence time (independent of $n$), hence the insensitivity of the asymptotics to $\Lambda(\{1\})$ may be explained by the effect of coming down from infinity, as mentioned in the Introduction.
The example also demonstrates that taking minimum in the error term of Lemma 4 is necessary. Indeed, ς ′ = 1, however direct calculation using (18) shows that So the error term is O(n α−1 ), and not O(n −1 ).
The heuristics for this is that the jumps J b − 1 become almost identically distributed for large b and their mean is close to 1/(1 − α). If the distributions were indeed the same with mean 1/(1 − α) then the Lemma would follow from the renewal theorem. We postpone a rigorous proof to Section 4.
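The claim that the mean decrement is close to $1/(1 - \alpha)$ for large $b$ can be checked numerically. The sketch below is an illustration under explicit assumptions: it takes $\Lambda[0, x] = x^{\alpha}$, for which $\nu_k = \Gamma(\alpha+1)\Gamma(k+1)/\Gamma(k+\alpha+1)$, and uses the identities $\lambda_n = \sum_{k=1}^{n-1} k\nu_{k-1}$ and $\sum_{j} j\lambda_{n,j} = n\sum_{k=0}^{n-2}\nu_k$, both of which follow from the integral representation of the rates. The function names are chosen here for illustration.

```python
from math import exp, lgamma


def nu(k, alpha):
    """Moment nu_k of the measure with Lambda[0, x] = x**alpha."""
    return exp(lgamma(alpha + 1) + lgamma(k + 1) - lgamma(k + alpha + 1))


def mean_decrement(n, alpha):
    """Mean E(J_n - 1) of the first decrement of the chain started at n."""
    # lambda_n = sum_{k=1}^{n-1} k * nu_{k-1}
    total_rate = sum(k * nu(k - 1, alpha) for k in range(1, n))
    # sum_j j * lambda_{n,j} = n * sum_{k=0}^{n-2} nu_k
    weighted = n * sum(nu(k, alpha) for k in range(n - 1))
    return weighted / total_rate - 1.0
```

For `alpha = 0.5` the value of `mean_decrement(n, 0.5)` approaches $1/(1-\alpha) = 2$ from below as `n` grows, and the gap closes slowly, on the scale $n^{\alpha-1}$.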
in probability.
Proof. The argument is based on formulas of Lemma 1. Indeed, immediately from Lemma 1 and Application of Chebyshev's inequality completes the proof.
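The linear growth of $C_n$ can be observed numerically through its mean. The following sketch is an illustration, not part of the argument: it assumes $\Lambda$ is the beta$(\alpha, 1)$ distribution, computes $q_b(j)$ from the rate formula via log-gamma functions, and evaluates $\mathbb{E}\,C_n$ by the first-step recursion $\mathbb{E}\,C_n = 1 + \sum_j q_n(j)\,\mathbb{E}\,C_{n-j+1}$. The function names are hypothetical.

```python
from math import comb, exp, lgamma


def beta_jump_law(b, alpha):
    """List [q_b(2), ..., q_b(b)] for Lambda = beta(alpha, 1)."""
    # int x^(j-2) (1-x)^(b-j) Lambda(dx) = alpha * B(alpha + j - 2, b - j + 1)
    rates = [
        comb(b, j) * alpha * exp(lgamma(alpha + j - 2) + lgamma(b - j + 1)
                                 - lgamma(alpha + b - 1))
        for j in range(2, b + 1)
    ]
    total = sum(rates)
    return [r / total for r in rates]


def expected_collisions(n, alpha):
    """E C_n computed by the first-step recursion, with E C_1 = 0."""
    m1 = [0.0] * (n + 1)
    for k in range(2, n + 1):
        q = beta_jump_law(k, alpha)
        # q[i] is the probability of a collision of j = i + 2 particles,
        # after which k - j + 1 = k - i - 1 particles remain.
        m1[k] = 1.0 + sum(p * m1[k - i - 1] for i, p in enumerate(q))
    return m1[n]
```

For moderate `n` the ratio `expected_collisions(n, alpha) / n` is already in the vicinity of $1 - \alpha$, although the convergence is slow when $\alpha$ is close to 1.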

Stochastic bounds on the jumps
In this section we construct stochastic bounds on the decrements $J_b - 1$ of the Markov chain $M_n$ in a range $b = k, \dots, n$, to control the asymptotic behaviour of $C_n$. Specifically, we find random variables $J^+_{n\downarrow k}$ and $J^-_{n\downarrow k}$ to secure the distributional bounds $J^+_{n\downarrow k} \le_d J_b \le_d J^-_{n\downarrow k}$ (19) uniformly in some range $b = k, \dots, n$. Here $\le_d$ denotes the stochastic order, meaning that two random variables $X$ and $Y$ satisfy $X \le_d Y$ if $\mathbb{P}(X \ge t) \le \mathbb{P}(Y \ge t)$ for all $t$. Our approach to establishing the limit theorem for the number of collisions is based on constructing random variables $J^+_{n\downarrow k}$ and $J^-_{n\downarrow k}$ which on the one hand comply with (19) and on the other hand yield the same limit distribution of the sum of their independent copies. These two requirements point in opposite directions, forcing an adequate choice of these random variables to be a compromise. We define the distributions which depend on parameters $\gamma, \beta \in\, ]0, 1[$ and $\theta \in\, ]\beta, 1[$. The calibration of these constants will be done later. For $n \ge k > 0$ define where $\lambda_n(m : k) = \sum_{j=m}^{k} \lambda_{n,j}$.
Note that $\sum_j q^+_{n\downarrow k}(j) = \sum_j q^-_{n\downarrow k}(j) = 1$. Moreover, $q^{\pm}_{n\downarrow k}(j)$ are nonnegative for large enough $n$ and $k$. Indeed, the inequality $q^+_{n\downarrow k}(j) \ge 0$ is obvious. Lemma 2 implies that if $n$ and $k$ are large enough and $k > n^{\beta}$ then for some $c_2 > c_1 > 0$ uniformly in $\ell \in \{k, \dots, n\}$. Hence (21) holds for the maximum over these $\ell$, and it follows that $q^-_{n\downarrow k}(j) \ge 0$. Hence, the quantities $q^{\pm}_{n\downarrow k}(j)$ define probability distributions on $\mathbb{N}$, at least for large enough $n$ and $k$. Let $J^+_{n\downarrow k}$ and $J^-_{n\downarrow k}$ be random variables with these distributions, so Lemma 7. Suppose that $\beta$, $\gamma$, $\theta$ and $\upsilon$ satisfy the inequalities Then the stochastic bounds (19) hold for $n$ large enough and $b$, $k$ in the range $\lfloor n^{\upsilon} \rfloor \le k \le b \le n$.
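For distributions on the integers, the stochastic order can be verified by comparing tail probabilities at the atoms of the two distributions. The sketch below is a generic illustration with hypothetical names; it does not reproduce the specific distributions of $J^{\pm}_{n\downarrow k}$, and it uses the tail convention $\mathbb{P}(X \ge m) \le \mathbb{P}(Y \ge m)$.

```python
def tail(pmf, m):
    """P(X >= m) for a distribution given as a dict {value: probability}."""
    return sum(p for value, p in pmf.items() if value >= m)


def stochastically_smaller(pmf_x, pmf_y, tol=1e-12):
    """Check whether P(X >= m) <= P(Y >= m) for every m.

    Both tails are step functions that change only at atoms, so it is
    enough to compare them at the atoms of the two distributions.
    """
    atoms = sorted(set(pmf_x) | set(pmf_y))
    return all(tail(pmf_x, m) <= tail(pmf_y, m) + tol for m in atoms)
```

For example, truncating a variable at a level (replacing $Y$ by $\min(Y, c)$) always produces a stochastically smaller variable: with `pmf_y = {2: 0.5, 3: 0.3, 4: 0.2}` and its truncation at 3, `pmf_x = {2: 0.5, 3: 0.5}`, the check holds in one direction only.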
Proof. By definition of the stochastic order, we need to show that for all m The first inequality (24) is clearly true for m ≥ n β + 1 because the left-hand side is zero and for m ≤ 2 because both sides are 1. Suppose 3 ≤ m ≤ n β , then the first inequality reads as Since b ≥ k ≥ ⌊n υ ⌋, taking n sufficiently large enables us to apply Lemma 2 and Corollary 3 to get asymptotic estimates valid for all b in the range k ≤ b ≤ n. From the definition of λ k (m : n β ), Lemma 2 and the inequality Hence we rewrite the inequality as The leading terms on both sides cancel, and simplifying this inequality we are reduced to checking (In the above lines we neglected certain lower order terms using inequalities like The desired inequality follows from the following two inequalities with sufficiently large constants c 1 , c 2 > 0 (twice the ratio of a constant implied by the corresponding O(·) and the constant in the right-hand side is enough) and c 3 ∈ ]0, 1[. The right-hand side of (26) considered as a function of m ∈ [3, ⌊n β ⌋] has a unique minimum n β−γ/(2−α) and has the value asymptotic to Since k ≥ ⌊n υ ⌋ the RHS grows to infinity once (23) holds.
In inequality (27) we neglect the first summand in the RHS and still have the function b 1−α n −γ ≥ n υς ′ −γ which grows to infinity with n once (23) holds. Thus the first inequality in (24) holds for all sufficiently large n.
The latter follows from a simpler inequality Since k > n β , application of (21) implies max ℓ∈{k,...,n} for some c 4 > 0. We suppose that n is large enough to satisfy 1 − n −γ ≥ 1/2. These observations, Lemma 2 and Corollary 3 allow us to rewrite inequality (28) as for some c 5 > 0. Simplification shows that this inequality holds provided for suitable constants c 6 , c 7 > 0. Further simplification gives Proceeding as above, the expression in brackets attains its minimum in m ∈ [3, n β ] at m ′′ ∼ c 8 n β−γ/(2−α) , c 8 > 0, with the minimum value asymptotic to where c 9 > 0. Since b ≥ ⌊n υ ⌋ the right-hand side of (29) grows to infinity as n → ∞ as long as (23) holds. This observation finishes the proof.
We want to keep control over the difference between distributions of J + n↓k , J − n↓k and J b , n ≥ b ≥ k. In particular, the following statement provides bounds for divergence of means.
Lemma 8. Suppose β < υ < 1. Then there exists c > 0 such that for n large enough and for k in range n ≥ k ≥ ⌊n υ ⌋ the following inequalities hold: Proof. We start with the following observation.
Taking m = n β + 1, for some β ∈ ]0, υ[ we see using Corollary 3 that as n, k → ∞ with n ≥ k ≥ ⌊n υ ⌋. Now the proof follows by a simple calculation. The mean of J − n↓k can be estimated using Lemma 4 and (30): Similarly, since υ > β formula (30) is applicable and implies together with Lemma 2 that so the claim follows.
Using a familiar device, Lemma 7 enables us to couple the random variables $J^+_{n\downarrow k}$, $J_b$ and $J^-_{n\downarrow k}$ in such a way that $J^+_{n\downarrow k} \le J_b \le J^-_{n\downarrow k}$ (31) holds almost surely. From this Lemma 5 will follow by comparing $M_n$ with two random walks.
Proof of Lemma 5. For n ≥ b ≥ k ≥ 1 let g + (n↓k, b) and g − (n↓k, b) be the Green kernels of decreasing random walks started at n with decrements (J + n↓k − 1) and (J − n↓k − 1), correspondingly. Suppose that (31) holds for some n ≥ k and for all b in range n ≥ b ≥ k. Then we have g Take γ > 0 and υ > θ > β > 1/(2 − α). Combination of Lemmas 4 and 8 implies that Assume further that parameters β, γ, υ and θ are chosen to satisfy inequality (23). Then the coupling (31) exists for n ≥ b ≥ k ≥ ⌊n υ ⌋ by Lemma 7. Applying a standard result of renewal theory, as n, k → ∞ and n − b → ∞. Thus g(n, b) has the same limit.

The total number of collisions
We are in position now to present our main result on the convergence of the number of collisions C n in the Λ-coalescent on n particles.
The main idea of the proof is that the decrements $J_b$ of $M_n$ are almost identically distributed for large $b$, as Corollary 3 suggests. However, the nonstationarity prevents any direct analysis. To overcome this, we use the technique of stochastic bounds introduced in the previous section. First we introduce some auxiliary notation.
For $1 \le k \le n$ the coalescent started with $n$ particles will, after some series of collisions, reach a state with fewer than $k + 1$ particles; let $C_{n\downarrow k}$ denote the number of collisions up to that moment and let $B_{n,k} \le k$ denote the number of particles as the coalescent enters such a state. In particular, $C_n = C_{n\downarrow 1}$. For $J^{\pm}_{n\downarrow k,m}$ independent copies of $J^{\pm}_{n\downarrow k}$, introduce $C^{+}_{n\downarrow k,\ell} := \min\bigl\{c : \sum_{m=1}^{c} (J^{+}_{n\downarrow k,m} - 1) \ge \ell\bigr\}$ and, analogously, $C^{-}_{n\downarrow k,\ell}$: the minimal number of decrements distributed as $J^+_{n\downarrow k} - 1$ (respectively, $J^-_{n\downarrow k} - 1$) needed to drop by at least $\ell$. We skip the index $\ell$ when it is equal to $n - k$, so that $C^{\pm}_{n\downarrow k} \equiv C^{\pm}_{n\downarrow k,n-k}$. Under the assumptions of Lemma 7 we can couple the corresponding Markov chains so that (31) holds almost surely for all large enough $n$ once $n \ge b \ge k \ge \lfloor n^{\upsilon} \rfloor$. Consequently, for such $n$ the coupled Markov chains satisfy In other words, In order to find the limit distributions for $C^{\pm}_{n\downarrow \lfloor n^{\upsilon} \rfloor}$ we need the following statement about the characteristic function $\varphi_n(u) := \mathbb{E}\,e^{iu(J_n - 1)}$ of the first decrement of $M_n$.
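The quantity "minimal number of decrements needed to drop by at least $\ell$" is a first-passage count for a sequence of partial sums. A minimal sketch, with a hypothetical function name and a plain sequence of decrement values standing in for the independent copies of $J^{\pm}_{n\downarrow k} - 1$:

```python
from itertools import accumulate


def steps_to_drop(decrements, ell):
    """Smallest c such that decrements[0] + ... + decrements[c-1] >= ell.

    `decrements` is any iterable of positive numbers; it plays the role of
    the successive decrements of a decreasing random walk.
    """
    if ell <= 0:
        return 0  # the empty sum already covers a non-positive drop
    for c, total in enumerate(accumulate(decrements), start=1):
        if total >= ell:
            return c
    raise ValueError("the supplied decrements never add up to ell")
```

For instance, with decrements 1, 3, 2, 5 and $\ell = 5$ the partial sums are 1, 4, 6, so three steps are needed. Larger decrements can only reduce the count, which is why stochastic bounds on the decrements translate into bounds on the number of steps.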
Proof. We write for shorthand $u = s/m$. For $u = 0$ the claim is obvious, so we suppose that $u \ne 0$. The characteristic function of $J_n - 1$ can be written in terms of $\Lambda$ as follows: using the integral representation of $\lambda_{n,j}$. Denote the numerator of the fraction under the integral above by $h_n(u, x)$; then (7) we obtain because $h_1(u, x) = 0$. Taking again differences of $(1 - x)^j - (1 - (1 - e^{iu})x)^j$ with respect to $j$ and calculating it directly for $j = 0$ we represent the integral in (35) as Exchanging the sums and utilising notation (5) for the moments $\nu_k$ of $\Lambda$ we get (36) By (8) and Lemma 4 the second term above is since $\varsigma > 1 - \alpha$ by hypothesis. Recalling the notation $u = s/m$ and the inequality $n \ge m^{1/\upsilon}$ with $\upsilon < 1$ we see that for some $\delta_1 > 0$. Thus it remains to estimate the last summand in (36). Integration by parts gives Substitution of this relation into (36) leads to Summation yields For $m$ big enough and $n \ge m^{1/\upsilon}$ with $\upsilon < 1$ for some $\delta_2 > 0$. Let $\theta \in [-\pi/2, \pi/2]$ be such that $e^{i\theta} = \frac{1 - e^{is/m}}{|1 - e^{is/m}|}$. Note that $\theta = -\pi\,\mathrm{sign}(s)/2 + O(1/m)$ as $m \to \infty$. For any $\beta > 0$ we have as $m, k \to \infty$ with $k \ge m^{1+\delta_3}$ for any $\delta_3 > 0$. By assumption (2) we can write $\Lambda[0, x] = Ax^{\alpha} + f(x)$, where $|f(x)| \le cx^{\alpha+\varsigma}$ for some $c > 0$ and all $x \in [0, 1]$. Thus, as $m, k \to \infty$ with $k \ge m^{1+\delta_3}$, Take $\delta_3 = (1/\upsilon - 1)/2$ and denote $n_0 = m^{1+\delta_3}$. Divide the last sum in (37) into two sums over $k \ge n_0$ and $k < n_0$. The first sum is estimated taking (17) into account as for some $\delta_4 > 0$. The same argument applied to the sum over $k = 0, \dots, n_0 - 1$ shows that it is of lower order than the whole sum. Combining the results above gives the statement of the Lemma.
Next we show that under certain assumptions the same asymptotic expansion is also valid for the characteristic functions of J ± Then there exists δ > 0 such that as n, k, m → ∞ in such a way that n ≥ k ≥ ⌊n υ ⌋ and m ≤ cn 1/(2−α) for some c > 0.
Let S + n↓k,h , respectively S − n↓k,h , be the sum of h independent copies of J + n↓k − 1, respectively J − n↓k − 1.
Hence the claim follows from inequalities (34) since F 2−α is continuous.
Proof of Theorem 9. Recall that $B_{n,k}$ is the number of particles in the $\Lambda$-coalescent started with $n$ particles right after the number of particles drops below $k + 1$. Then for any $k \le n$ the total number of collisions is decomposable as $C_n \overset{d}{=} C_{n\downarrow k} + C'_{B_{n,k}}$, where on the right-hand side $(C'_b)$ is an independent copy of $(C_b)$. This can be iterated as B n,k 1 ↓k2 + C B n,k 2 ↓k3 + · · · + C (ℓ) B n,k ℓ−1 ↓k ℓ for any finite sequence $k_{\ell} \le k_{\ell-1} \le \dots \le k_1 \le n$, with the convention that $C_{b\downarrow k} = 0$ for $b \le k$.