A population model for $\Lambda$-coalescents with neutral mutations

Bertoin and Le Gall (2003) introduced a probability-measure-valued Markov process that describes the evolution of a population, such that a sample from this population exhibits a genealogy given by the so-called $\Lambda$-coalescent, or coalescent with multiple collisions, introduced independently by Pitman (1999) and Sagitov (1999). We show how this process can be extended to the case where lineages can experience mutations. Regenerative compositions enter naturally into this model, which is somewhat surprising, considering a negative result by Möhle (2007).


Introduction
A coalescent with multiple collisions, or Λ-coalescent, $\Pi = (\Pi_t)_{t\ge 0}$, is a Markov process on the space $\mathcal P(\mathbb N)$ of partitions of $\mathbb N = \{1, 2, \dots\}$, such that for all $n$, $\Pi^{(n)}$, the restriction of $\Pi$ to $[n] = \{1, \dots, n\}$, is a Markov process with the following transitions: if $\Pi^{(n)}_t$ has $b$ blocks, then any given collection of $k$ blocks coalesces into one block at rate $\lambda_{b,k}$, for $2 \le k \le b \le n$. Note that the rate only depends on the number of blocks, not their sizes. By considering $\Pi^{(n)}$ and $\Pi^{(n+1)}$ one realizes that $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$. From this it follows, see [9], that

$$\lambda_{b,k} = \int_{[0,1]} x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$$

for some finite measure $\Lambda$ on $[0, 1]$.

Λ-coalescents were introduced independently by Pitman [9] and Sagitov [10] as a generalization of the Kingman coalescent [4]. All these processes can arise as limiting processes when studying the genealogy of a finite sample of individuals from a haploid (one parent per child) population, see [11, Proposition 7].

* Stockholm University, Department of Mathematics, Division of Mathematical Statistics, 10691 Stockholm, Sweden, e-mail: andreas@math.su.se
Consider a large population with constant size N for all generations, which are furthermore non-overlapping. A sample of n individuals, labeled by [n], form a partition by grouping together those who have had a common ancestor by generation t backwards in time, i.e. those whose lineages have coalesced into a common lineage by that time. The Λ-coalescent, restricted to [n], is a possible limiting process when N → ∞, and time and the distribution of the number of children of each individual are scaled properly. Furthermore, it is possible to obtain a coalescent with simultaneous multiple collisions, but we will in this paper not consider such so-called Ξ-coalescents, see [8,11] for more details.
The Kingman coalescent is a Λ-coalescent with $\Lambda = \delta_0$, i.e. with the only type of transition being a merger of two blocks at a time. This process is the natural limiting process for many population models, roughly speaking those models where the number of children of each individual is always small compared to the total population size as the size tends to infinity. The probability of more than two lineages in the sample coalescing at the same time is then negligible in the limit. A Λ-coalescent with $\Lambda \neq \delta_0$ corresponds to a population where occasionally single individuals have offspring constituting a positive fraction of the entire next generation as the population size tends to infinity. If several of the lineages in the sample belong to that fraction, they will coalesce into a single lineage at that moment.
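As a small numerical illustration of the rates $\lambda_{b,k}$, the sketch below evaluates them for two choices of $\Lambda$: the Kingman case $\Lambda = \delta_0$, and, purely as an assumed example family not singled out in the text, $\Lambda$ equal to a Beta$(\alpha, \beta)$ distribution, for which the integral $\int x^{k-2}(1-x)^{b-k}\Lambda(dx)$ reduces to a ratio of Beta functions. The function names are hypothetical.

```python
from math import comb, exp, lgamma


def beta_rate(b, k, alpha, beta):
    """lambda_{b,k} when Lambda is the Beta(alpha, beta) distribution (assumed example).

    Equals B(k - 2 + alpha, b - k + beta) / B(alpha, beta).
    """
    def log_beta_fn(p, q):
        return lgamma(p) + lgamma(q) - lgamma(p + q)

    return exp(log_beta_fn(k - 2 + alpha, b - k + beta) - log_beta_fn(alpha, beta))


def kingman_rate(b, k):
    """lambda_{b,k} for Lambda = delta_0: only pairs merge, each pair at rate 1."""
    return 1.0 if k == 2 else 0.0


def total_rate(b, rate):
    """Total merger rate when b blocks are present: sum_k C(b, k) * lambda_{b,k}."""
    return sum(comb(b, k) * rate(b, k) for k in range(2, b + 1))
```

Both families satisfy the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$, which can be checked numerically with these functions.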
The main result of this paper is a description of the dynamics of the whole population when all lineages experience neutral mutations, i.e. mutations that do not influence the chance of survival. Earlier studies have only described how introducing mutations influences the dynamics of the genealogy of a sample from the population.
We will proceed as follows. First, in Section 2, we will acquaint ourselves with useful representations of random partitions and coalescent processes. Here we will also find a description of a population model such that the genealogy of a sample from this population is given by the Λ-coalescent. When we introduce mutations in the population, an obvious way of partitioning a sample of individuals is by their common genotypes. In Section 3, a general recursion formula is given for the distribution of the family sizes in the sample. Section 4 might at first be seen as a detour into the theory of regenerative composition structures, i.e. a special kind of ordered partitions, especially since it is known that our type of partitions can never appear from these regenerative composition structures if one simply disregards the order in the composition. This theory, however, is used in the last Section 5, in which we present a model for the whole population, and not just a sample from it, when all lineages experience neutral mutations, such that the distribution of a sample from this population is in accordance with the result in Section 3.

Paintboxes and a population process
In this section we will see how one can use probability measures to construct random partitions of $\mathbb N$. Let $p$ be a probability measure with atoms of sizes $b = (b_i)_{i\in\mathbb N}$ in non-increasing order. Let $(R_i)_{i\in\mathbb N}$ be an i.i.d. sample from $p$ and define the equivalence relation $\sim_p$ on $\mathbb N$ by $i \sim_p j$ if $R_i = R_j$. Then the equivalence classes of $\sim_p$ form the sought partition of $\mathbb N$.
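A more common way of using such a sequence $b$ is to partition $[0, 1]$ into intervals $I_0, I_1, I_2, \dots$ of lengths $1 - \sum_i b_i, b_1, b_2, \dots$, let $(V_i)_{i\in\mathbb N}$ be i.i.d. $U(0, 1)$, let $i$ and $j$ be equivalent if $V_i$ and $V_j$ fall in the same interval $I_k$ with $k \ge 1$, and let $i$ such that $V_i \in I_0$ be in equivalence classes of their own. In general, one can use a random measure $\pi$, and carry out the construction pointwise, given $\pi = p$. This is equivalent to using atoms of random sizes $\beta = (\beta_i)_{i\in\mathbb N}$, and the construction is called a paintbox construction, see [4].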
A random partition of $\mathbb N$ is called exchangeable if its restriction to $[n]$ has a distribution that is invariant under permutations of the labels $[n]$ for all $n \in \mathbb N$. For example, if $(\Pi_t)_{t\ge 0}$ is a Λ-coalescent, then $\Pi_t$ is exchangeable for all $t$. Kingman has shown, see e.g. [4], that any exchangeable random partition of $\mathbb N$ can be obtained from a paintbox construction, e.g. with $\beta = (\beta_i)_{i\in\mathbb N}$ being the almost sure limit of $(l_i(n)/n)_{i\in\mathbb N}$, where $l_i(n)$ is the size of the $i$th largest block of the partition restricted to $[n]$. Here, and elsewhere in this paper, we understand limits to be taken as $n \to \infty$, unless otherwise indicated.
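The paintbox construction can be sketched in a few lines. The function below is a minimal illustration under the assumption that the measure has finitely many atoms with the given sizes and that any remaining mass is diffuse, so that samples falling there form singletons. The name `paintbox_partition` and the list-based representation are hypothetical.

```python
import random


def paintbox_partition(atom_sizes, n, rng):
    """Random partition of {1, ..., n}: i ~ j iff both samples hit the same atom.

    atom_sizes: sizes of the atoms (sum at most 1); leftover mass is diffuse,
    and a sample from the diffuse part is in a class of its own.
    """
    blocks = {}        # atom index -> list of sample labels
    singletons = []
    for i in range(1, n + 1):
        u = rng.random()
        acc = 0.0
        for idx, size in enumerate(atom_sizes):
            acc += size
            if u < acc:
                blocks.setdefault(idx, []).append(i)
                break
        else:
            singletons.append([i])   # u fell in the diffuse part
    return sorted(list(blocks.values()) + singletons)
```

For instance, `paintbox_partition([0.5, 0.3], 10, random.Random(0))` returns a partition of $\{1, \dots, 10\}$ whose non-singleton blocks correspond to the two atoms.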
If one enumerates the blocks of $\Pi_s = \{A^s_1, A^s_2, \dots\}$, and for $t > s$ considers the blocks of $\Pi_t = \{A^t_1, A^t_2, \dots\}$, then each $A^t_i = \bigcup_{j\in C^{s,t}_i} A^s_j$ for some $C^{s,t}_i$. It is a property of coalescent processes that the partition $\Pi_{s,t} = \{C^{s,t}_1, C^{s,t}_2, \dots\}$ is exchangeable and distributed as $\Pi_{t-s}$. Bertoin and Le Gall [1] showed that there exists a collection $(\pi_{s,t})_{s\le t}$ of random probability measures on $[0, 1]$ such that $\sim_{\pi_{s,t}}$ corresponds to $\Pi_{s,t}$ for all $s < t$. For fixed $s$ and increasing $t$ this collection describes the genealogy of the population further and further backwards in time. The random partition $\Pi_{s,t}$ should be interpreted as describing how the lineages present in the population at time $s$ coalesce into lineages at time $t$, and it thus has a meaning even with negative arguments $s < t$, since this corresponds to future events in the population relative to time zero. We shall study, and later extend, the dynamics of the Markov process $\rho = (\rho_t)_{t\ge 0} := (\pi_{-t,0})_{t\ge 0}$, which describes the evolution of the population forwards in time. Heuristically, $\rho_t(dr)$ represents the descendants at time $t$ of the fraction $dr$ of the population at time zero.
If $\Lambda(0) = 0$, then the dynamics of $\rho$ can be described by a measure $\nu(dx) := \Lambda(dx)/x^2$. Let $\{(\tau_i, X_i, U_i)\}_{i\in\mathbb N}$ be a Poisson process on $\mathbb R \times (0, 1] \times (0, 1)$ with intensity measure $dt \otimes \nu(dx) \otimes du$. Assume for the moment $|\nu| := \nu((0, 1]) < \infty$. We will use this Poisson point process to construct the process $\rho$. We let $(\tau_i)_{i\in\mathbb N}$, which by the assumption is a homogeneous Poisson process with rate $|\nu|$, be the jump times of $\rho$. At a time $\tau_i$, the conditional law of $\rho_{\tau_i}$, given $\rho_{\tau_i-}$, is

$$\rho_{\tau_i} = (1 - X_i)\,\rho_{\tau_i-} + X_i\,\delta_{R_i}, \tag{1}$$

where $X_i$ has distribution $\nu/|\nu|$ and $R_i$ is a sample from $\rho_{\tau_i-}$, picked by the inverse transformation method:

$$R_i := \inf\{r \in [0, 1] : \rho_{\tau_i-}([0, r]) \ge U_i\}.$$

Between jumps, $\rho$ remains constant. The heuristic interpretation of these dynamics is that a person is chosen from the population just before the jump at time $\tau_i$, and she is identified with her family labeled $R_i$. At the time of the jump, she begets offspring of proportion $X_i$ of the total population, and we say that the litter $i$, born at $\tau_i$, has size $X_i$. The rest of the population must thus be scaled down by a factor $1 - X_i$, and the atom corresponding to her family is increased by mass $X_i$.
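The jump mechanism in the finite-intensity case can be sketched as follows. This is a minimal illustration, not the construction itself: it assumes that the state is stored as a dictionary of atoms plus a scalar for the remaining diffuse (Lebesgue) mass, that the process starts from Lebesgue measure, and that $R_i$ is drawn directly from the current measure rather than through the inverse transformation of $U_i$ (the two are equal in law). The jump times are omitted because the measure is constant between jumps. All names are hypothetical.

```python
import random


def jump(atoms, diffuse, x, rng):
    """Apply rho <- (1 - x) * rho + x * delta_R, with R sampled from rho.

    atoms: dict mapping a location in [0, 1] to the mass of the atom there.
    diffuse: mass of the part of rho that is still a multiple of Lebesgue measure.
    """
    u = rng.random()
    if u < diffuse:
        r = rng.random()              # the chosen individual has a fresh label
    else:
        acc = diffuse
        for loc, mass in atoms.items():
            acc += mass
            r = loc
            if u < acc:
                break
    new_atoms = {loc: (1.0 - x) * mass for loc, mass in atoms.items()}
    new_atoms[r] = new_atoms.get(r, 0.0) + x
    return new_atoms, (1.0 - x) * diffuse


def simulate(n_jumps, sample_x, rng):
    """Run n_jumps jumps starting from Lebesgue measure; sample_x(rng) draws a litter size."""
    atoms, diffuse = {}, 1.0
    for _ in range(n_jumps):
        atoms, diffuse = jump(atoms, diffuse, sample_x(rng), rng)
    return atoms, diffuse
```

Each jump preserves total mass one, and after $k$ jumps the diffuse part has mass $\prod_{i \le k}(1 - X_i)$.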
, then the sequence of processes $(\rho^n)_{n\in\mathbb N}$, where each $\rho^n$ is governed by the respective $\nu_n$, converges in distribution, in the sense of weak convergence of finite-dimensional marginals, to the process $\rho$ corresponding to the collection $(\pi_{s,t})_{s\le t}$ associated with the limiting Λ-coalescent. If $\Lambda(0) = \Lambda(1) = 0$ and $\int_{(0,1)} x\,\nu(dx) < \infty$, then the convergence can be strengthened to almost sure convergence.

Mutations in the sample
In a population genetics setting, it is natural to introduce mutations along the lineages and ask how the individuals of a sample are partitioned into different families according to their genotype. We assume that mutations always give rise to new types of individuals never seen before in the population (the so-called infinite alleles model), and that when tracing the lineages backwards in time, there is a constant intensity $\mu$ per lineage for a mutation to occur, i.e. if we draw the family tree of the sample, the mutations constitute a homogeneous Poisson process with intensity $\mu$ along each branch, see Figure 1 for an example. A quantity of interest is $q(a)$, $a = (a_1, a_2, \dots)$, the probability of observing a partition with $a_i$ families of size $i$. When we trace the genealogy of the $n$ lineages of the sample backwards in time, the probability that a mutation occurs first is $\mu n/(\mu n + \lambda_n)$, and the probability that a collision of $k$ lineages happens first is $\binom{n}{k}\lambda_{n,k}/(\mu n + \lambda_n)$, for $k = 2, \dots, n$. By the Markov property of the Λ-coalescent, we can condition on the type of event that happens first, and obtain a recursion for $q(a)$. Möhle [6] was the first to provide this recursion for Λ- (and even Ξ-) coalescents. Let $e_k$ be the $k$th unit vector in $\mathbb R^\infty$ and $\lambda_n := \sum_{k=2}^{n} \binom{n}{k}\lambda_{n,k}$. Then

$$q(a) = \frac{\mu n}{\mu n + \lambda_n}\,q(a - e_1) + \sum_{k=2}^{n} \frac{\binom{n}{k}\lambda_{n,k}}{\mu n + \lambda_n} \sum_{j=1}^{n-k+1} \frac{j(a_j + 1)}{n - k + 1}\,q(a + e_j - e_{j+k-1}), \tag{2}$$

with $q(e_1) = 1$ and $q(a) := 0$ if some $a_i < 0$.
The parts of this formula should be interpreted as follows. If a mutation occurs first, the rest of the sample is described by $a - e_1$. If a merger of $k$ lineages occurs first, and it occurs in a family represented by $j + k - 1$ lineages, then after that merger, the sample will consist of $n - k + 1$ lineages and be described by $a + e_j - e_{j+k-1}$. In particular, there will now be $a_j + 1$ families of size $j$. The probability that the merger of $k$ lineages affected a family of size $j + k - 1$ is given by $j(a_j + 1)/(n - k + 1)$, since, by exchangeability, the merger could have resulted in any of the $j$ lineages in any of the $a_j + 1$ families with equal probability. We refer to Möhle [6] and Dong et al. [2] for more detailed discussions. The latter paper extends coalescent processes so that their blocks become frozen when they encounter a mutation, and then do not partake in the further evolution of the process. Marking frozen blocks with enclosing delimiters, one can write out the path of this process for the realization in Figure 1. The partition into families is obtained when all blocks are frozen.
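The recursion for $q(a)$ described above can be evaluated numerically. The sketch below assumes the form in which a mutation occurs first with probability $\mu n/(\mu n + \lambda_n)$ and a $k$-merger with probability $\binom{n}{k}\lambda_{n,k}/(\mu n + \lambda_n)$, followed by the factor $j(a_j+1)/(n-k+1)$. The merger rates are passed in as a function `lam(n, k)`; that interface and the name `make_q` are hypothetical.

```python
from functools import lru_cache
from math import comb


def make_q(lam, mu):
    """Build q(a) for merger rates lam(n, k) and per-lineage mutation rate mu.

    a is a tuple (a_1, ..., a_L) with a_i families of size i, where L >= n.
    """
    @lru_cache(maxsize=None)
    def q(a):
        if any(x < 0 for x in a):
            return 0.0                      # impossible configuration
        n = sum(i * x for i, x in enumerate(a, start=1))
        if n <= 1:
            return 1.0                      # empty sample or a single individual
        total = mu * n + sum(comb(n, k) * lam(n, k) for k in range(2, n + 1))
        # Mutation first: one singleton family is removed from the sample.
        value = mu * n / total * q((a[0] - 1,) + a[1:])
        # Merger of k lineages first, inside a family of size j + k - 1.
        for k in range(2, n + 1):
            weight = comb(n, k) * lam(n, k) / total
            if weight == 0.0:
                continue
            for j in range(1, n - k + 2):
                if a[j + k - 2] == 0:
                    continue
                b = list(a)
                b[j - 1] += 1
                b[j + k - 2] -= 1
                value += weight * j * (a[j - 1] + 1) / (n - k + 1) * q(tuple(b))
        return value

    return q
```

In the Kingman case (`lam(n, k)` equal to 1 for `k == 2` and 0 otherwise) the values agree with the Ewens sampling formula with $\theta = 2\mu$; for example, with $\mu = 1/2$ a sample of size two consists of two singletons with probability $\theta/(\theta+1) = 1/2$.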

Regenerative composition structures
Most results in this section are from Gnedin and Pitman [3]. A partition of $n \in \mathbb N$ is an unordered collection of natural numbers $\{n_1, \dots, n_k\}$ such that $n_1 + \dots + n_k = n$. An ordered partition is called a composition, and we say that $n_1, \dots, n_k$ are its parts. A composition structure $C$ is a sequence $(C_n)_{n\in\mathbb N}$ of random compositions of $n$ such that if $n$ balls are distributed into an ordered series of boxes according to $C_n$, then $C_{n-1}$ is obtained by discarding one of the balls picked uniformly at random, and deleting an empty box in case one is created. A composition structure is regenerative if for all $n \ge m \ge 1$, given that the first part is $m$, the remaining composition of $n - m$ is distributed as $C_{n-m}$.
We will see that one can obtain regenerative compositions with the appropriate sampling procedure. Let $(V_i)_{i\in\mathbb N}$ be i.i.d. $U(0, 1)$ and let $(V_{in})_{i\in[n]}$ be the ordered sample of $(V_i)_{i\in[n]}$, meaning $V_{1n} \le \dots \le V_{nn}$. Given a closed set $S \subseteq [0, 1]$, we can construct a composition $C_n$ as follows. Partition $[n]$ into blocks of consecutive integers by letting $j$ and $j + 1$ be in different blocks if $[V_{jn}, V_{j+1,n}] \cap S \neq \emptyset$. Let the parts of $C_n$ be given by the sizes of the blocks in increasing order of their elements, see Figure 2. We will in general also allow a random closed set $\mathcal S \subseteq [0, 1]$, independent of $(V_i)_{i\in\mathbb N}$, where the construction is carried out given the realization $\mathcal S = S$. We then say that $C_n$ is obtained by sampling from $\mathcal S$.
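The sampling procedure can be sketched in code under one simplifying assumption that is not part of the text: the closed set $S$ is represented through the finitely many maximal open intervals ("gaps") of its complement in $[0, 1]$. Two consecutive order statistics then belong to the same block exactly when they fall in the same gap. The function name is hypothetical.

```python
def composition_from_gaps(vs, gaps):
    """Composition of len(vs) obtained by sampling from S = [0, 1] minus the gaps.

    gaps: disjoint maximal open intervals (lo, hi) of the complement of S.
    Consecutive order statistics share a block iff no point of S lies in the
    closed interval between them, i.e. iff they lie in the same gap.
    """
    def gap_index(v):
        for idx, (lo, hi) in enumerate(gaps):
            if lo < v < hi:
                return idx
        return None                  # v is a point of S itself

    parts = []
    prev = None
    for v in sorted(vs):
        g = gap_index(v)
        if parts and g is not None and g == prev:
            parts[-1] += 1           # same gap as the previous point
        else:
            parts.append(1)          # a point of S was crossed: new block
        prev = g
    return parts
```

With gaps $(0.1, 0.3)$ and $(0.5, 0.9)$ and sample points $0.05, 0.2, 0.25, 0.6, 0.7, 0.95$, the resulting composition is $(1, 2, 2, 1)$.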
This is the part of $S$ strictly to the right of $D(S, z)$ scaled back to $[0, 1]$, see Figure 2. We say that a random closed set $\mathcal S \subseteq [0, 1]$ is multiplicatively regenerative if for each $z \in [0, 1)$, given $D(\mathcal S, z) < 1$, the set $\mathcal S^{(z)}$ is independent of $[0, D(\mathcal S, z)] \cap \mathcal S$, and has the same distribution as $\mathcal S$.

Figure 2: An illustration of sampling with $V_1, \dots, V_7$ from $S$ resulting in $(n_1, n_2, n_3, n_4) = (1, 1, 2, 3)$, and how to construct $S^{(z)}$ from $S$.
Let $\{(\tau_i, X_i)\}_{i\in\mathbb N}$ be a Poisson process on $\mathbb R_+ \times (0, 1)$ with intensity measure $dt \otimes \nu(dx)$ for a measure $\nu$ with $\int_{(0,1)} x\,\nu(dx) < \infty$, and let $\mu \ge 0$ be a constant. The notation here is intentionally similar to the one in the previous sections of this paper, but we assume for the moment no relation to these. We call the process $Z = (Z_t)_{t\ge 0}$ a multiplicative subordinator with characteristics $(\mu, \nu)$ if

$$Z_t = 1 - e^{-\mu t} \prod_{i:\,\tau_i \le t} (1 - X_i)$$

for all $t \ge 0$. The name is justified by the property that $(1 - Z_{t'})/(1 - Z_t)$ has the same distribution as $1 - Z_{t'-t}$ and is independent of $(Z_u)_{0\le u\le t}$ for $t' > t$. We obtain an ordinary subordinator by the transformation $Z_t \mapsto -\log(1 - Z_t)$.
Let R be the closed range of the multiplicative subordinator Z. Proposition 3 collects some results of [3].

Proposition 3
The closed range R of Z is multiplicatively regenerative, and conversely, all multiplicatively regenerative sets can be seen as the range of some multiplicative subordinator, whose characteristics are determined up to a positive constant. Sampling from R produces a regenerative composition structure C , and all regenerative composition structures can be obtained by sampling from a regenerative set.
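Since we have these relations between regenerative composition structures, multiplicative subordinators, and multiplicatively regenerative sets, we also say that $(\mu, \nu)$ are the characteristics of the regenerative composition structure $C$ of the proposition. In particular, the probability of the first part having size $m$ in $C_n$ is $q(n : m) = \Phi(n : m)/\Phi(n)$, where

$$\Phi(n : m) = \mu n\,\mathbf 1(m = 1) + \binom{n}{m} \int_{(0,1)} x^m (1 - x)^{n-m}\,\nu(dx)$$

and $\Phi(n) = \sum_{m=1}^{n} \Phi(n : m)$. We see that the characteristics $(\mu, \nu)$ and $(c\mu, c\nu)$, $c > 0$, produce the same regenerative composition structure. We will need more detailed results about the first part of a regenerative composition $C_n$, $n \ge 2$. It can have size one if either $V_{1n} \in R$, or $V_{1n} \notin R$ and $R \cap [V_{1n}, V_{2n}] \neq \emptyset$. The expressions for the following probabilities are taken from the proof of Theorem 5.2 on p. 457 of [3]:

$$q(n : 1)' := P(V_{1n} \in R) = \frac{\mu n}{\Phi(n)}, \tag{4}$$

$$q(n : 1)'' := P\big(V_{1n} \notin R,\ R \cap [V_{1n}, V_{2n}] \neq \emptyset\big) = \frac{n}{\Phi(n)} \int_{(0,1)} x (1 - x)^{n-1}\,\nu(dx), \tag{5}$$

so that $q(n : 1) = q(n : 1)' + q(n : 1)''$.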
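The law of the first part, $q(n : m) = \Phi(n : m)/\Phi(n)$, is easy to tabulate once the integrals $\int x^m(1-x)^{n-m}\nu(dx)$ are available. The sketch below assumes, purely for illustration, that $\nu(dx) = \Lambda(dx)/x^2$ with $\Lambda$ a Beta$(\alpha, \beta)$ distribution and $\alpha > 1$, so that the integrals are ratios of Beta functions. The helper names and the scaling argument `c` are hypothetical.

```python
from math import comb, exp, lgamma


def beta_moment(m, n, alpha, beta):
    """int x^m (1-x)^(n-m) nu(dx) for nu(dx) = Lambda(dx)/x^2, Lambda = Beta(alpha, beta).

    Assumed example family; finite for all 1 <= m <= n when alpha > 1.
    """
    def log_beta_fn(p, q):
        return lgamma(p) + lgamma(q) - lgamma(p + q)

    return exp(log_beta_fn(m - 2 + alpha, n - m + beta) - log_beta_fn(alpha, beta))


def phi(n, m, mu, moment, c=1.0):
    """Phi(n : m) for characteristics (c * mu, c * nu)."""
    return c * (mu * n * (m == 1) + comb(n, m) * moment(m, n))


def first_part_law(n, mu, moment, c=1.0):
    """List [q(n : 1), ..., q(n : n)] of first-part probabilities."""
    weights = [phi(n, m, mu, moment, c) for m in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

Calling `first_part_law` with different values of `c` returns the same probabilities, which illustrates that $(\mu, \nu)$ and $(c\mu, c\nu)$ give the same composition structure.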

Möhle [7, Theorem 3.1] showed
Proposition 4 The recursion (2) cannot be solved by a partition obtained by disregarding the order of the parts of a regenerative composition structure, unless Λ has all its mass in either 0 or 1.

Mutations in the population
For a population without mutations, ρ t → δ e in distribution as t → ∞, where e has distribution U [0, 1] and is called the primitive Eve [1, Proposition 1 and Definition 4], so that all of the population belongs to the primitive Eve's family. This is a sort of genetic drift where, by chance, some genotype eventually makes up the whole population. When mutations are possible, no such absorbing state exists since new mutations appear, and we can hope for the existence of a non-trivial stationary distribution of ρ.
We shall now investigate what happens with $\rho$, describing the evolution of the population forwards in time, when individual lineages mutate at constant rate $\mu$. The heuristic interpretation will be that a constant mutation rate $\mu$ erodes all families at the same rate. The mutated lineages are unique and each one only takes up an infinitesimal fraction of the whole population until they possibly increase their size to a positive fraction of the population by a jump. They could also experience yet another mutation, but that does not matter in this setting since we are not interested in the actual genotypes; all that matters is that they differ. In the case with finite intensity of births of new litters, the jump mechanism will be the same as in (1), but for $t$ between two consecutive jump times, say $\sigma$ and $\tau$, we will have

$$\rho_t = e^{-\mu(t-\sigma)}\,\rho_\sigma + \big(1 - e^{-\mu(t-\sigma)}\big)\,\lambda, \tag{6}$$

where $\lambda$ is the Lebesgue measure on $[0, 1]$.
To make this rigorous, we will proceed in several steps. We will first study the litters without any genealogical relationships. Since a family consists of litters, claiming that it erodes at a constant rate implies that its litters also must do so at the same rate. We will describe the composition of a population consisting of eroding litters and "mutants", or singletons, with a probability measure on $[0, \infty)$. The process describing the evolution of the population, still disregarding possible family ties between litters, will then be shown to converge to a stationary distribution. After that, we will impose a genealogy on the litters, meaning a partial order describing who is a descendant of whom. This will enable us to define $\rho$ as we want. Finally, Theorem 2 validates our construction by stating that a sample from this population would have the same sampling distribution as from a Λ-coalescent with mutations.
We will for the rest of this section assume that $\Lambda(0) = \Lambda(1) = 0$ and

$$\int_{(0,1)} x^{-1}\,\Lambda(dx) < \infty.$$

We let $\nu(dx) := \Lambda(dx)/x^2$, as in Section 2, and thus $\int_{(0,1)} x\,\nu(dx) < \infty$. Let $\{(\tau_i, X_i, U_i)\}_{i\in\mathbb N}$ be a Poisson process on $\mathbb R \times (0, 1) \times (0, 1)$ with intensity measure $dt \otimes \nu(dx) \otimes du$, whose points denote the times of birth and the sizes of the litters born in the population, together with an auxiliary random variable for each litter, to be used later. The litters are indexed in decreasing order of size, and in case of ties in decreasing order of the auxiliary random variables. Thus $i < j$ need not imply $\tau_i < \tau_j$.

The dynamics in (1), when there are no mutations, imply that the fraction of the population at time $t$ that belongs to a litter $i$, born at time $\tau_i \le t$ with original size $X_i$, is given by

$$X_i \prod_{j:\,\tau_i < \tau_j \le t} (1 - X_j),$$

since the size of the litter must be scaled down at the birth of each subsequent litter. If the size of each litter, and thus also the size of each family, is furthermore eroded at a constant rate $\mu$, the size at time $t$ of litter $i$ becomes

$$X_i(t) := \mathbf 1(\tau_i \le t)\,X_i\,e^{-\mu(t-\tau_i)} \prod_{j:\,\tau_i < \tau_j \le t} (1 - X_j),$$

where $\mathbf 1(\cdot)$ is the indicator function. We can describe the sizes and the ages of the litters at time $t$ with the random probability measure $\hat\rho_t$ on $\mathbb R_+$, defined by its distribution function

$$F(s : \hat\rho_t) := 1 - e^{-\mu s} \prod_{i:\,t-s \le \tau_i \le t} (1 - X_i)$$

for $s \ge 0$ and $F(s : \hat\rho_t) := 0$ for $s < 0$. The atoms of $\hat\rho_t$ now have sizes $X_i(t)$ and positions $t - \tau_i$, provided $\tau_i \le t$, corresponding to the current sizes and ages at time $t$ of the litters.

By the homogeneity of the Poisson process, $\hat\rho = (\hat\rho_t)_{t\in\mathbb R}$ is a stationary process. Since the process depends on all litters born before time $t$, it does not describe the composition of the population into litters if we want the process to start at time 0 with no litters. In that case we must use a cut-off, so that there are no litters older than $t$ at time $t$. This is described by the process $\hat\rho' = (\hat\rho'_t)_{t\ge 0}$ of random probability measures on $\mathbb R_+$, with distribution functions

$$F(s : \hat\rho'_t) := 1 - e^{-\mu s} \prod_{i:\,(t-s)\vee 0 \le \tau_i \le t} (1 - X_i)$$

for $s \ge 0$ and $F(s : \hat\rho'_t) := 0$ for $s < 0$, so that $\hat\rho'_0(ds) = \mu e^{-\mu s}\,ds$.
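This process is not stationary, but it converges in distribution to $\hat\rho_0$.
Proof Define $\hat\rho''_t$, $t \ge 0$, by its distribution function

$$F(s : \hat\rho''_t) := 1 - e^{-\mu s} \prod_{i:\,-(s\wedge t) \le \tau_i \le 0} (1 - X_i)$$

for $s \ge 0$ and $F(s : \hat\rho''_t) := 0$ for $s < 0$. By the homogeneity of the Poisson process, $\hat\rho''_t$ has the same distribution as $\hat\rho'_t$. For $s \le t$, we have $F(s : \hat\rho''_t) = F(s : \hat\rho_0)$. For $s > t$, both $F(s : \hat\rho''_t)$ and $F(s : \hat\rho_0)$ lie between $F(t : \hat\rho_0)$ and 1, so that $\sup_s |F(s : \hat\rho''_t) - F(s : \hat\rho_0)| \le 1 - F(t : \hat\rho_0) \to 0$ as $t \to \infty$. Hence $\hat\rho''_t \to \hat\rho_0$ almost surely, and therefore $\hat\rho'_t \xrightarrow{d} \hat\rho_0$. Thus $\hat\rho$ and $\hat\rho'$ have the same limiting distribution, and we choose to work with the former process, since it is stationary.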
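The description of eroding litters can be checked numerically for a finite list of litters. The sketch below assumes the product forms suggested by the text: a litter born at $\tau_i$ with original size $X_i$ is scaled by $1 - X_j$ at each later birth and eroded by $e^{-\mu(t - \tau_i)}$, and the mass with age at most $s$ at time $t$ is $1 - e^{-\mu s}\prod(1 - X_i)$ over litters born in $[t - s, t]$. The list-of-pairs representation and the function names are hypothetical.

```python
from math import exp


def litter_size(points, i, mu, t):
    """Size at time t of litter i, where points[i] = (birth time, original size)."""
    tau_i, x_i = points[i]
    if tau_i > t:
        return 0.0                        # not yet born
    size = x_i * exp(-mu * (t - tau_i))   # erosion by mutation at rate mu
    for tau_j, x_j in points:
        if tau_i < tau_j <= t:
            size *= 1.0 - x_j             # scaled down by each later litter
    return size


def age_cdf(points, mu, t, s):
    """Mass of the population at time t (litters and singletons) with age at most s."""
    if s < 0:
        return 0.0
    prod = 1.0
    for tau, x in points:
        if t - s <= tau <= t:
            prod *= 1.0 - x
    return 1.0 - exp(-mu * s) * prod
```

The jump of `age_cdf` at age $t - \tau_i$ equals `litter_size(points, i, mu, t)`, so the atoms of the measure sit at the litters' ages with the litters' current sizes.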

Theorem 1
The composition of a sample from $\hat\rho_0$, according to litters of increasing age, is regenerative with characteristics $(\mu, \nu)$.
Proof By construction, $s \mapsto F(s : \hat\rho_0)$ is a multiplicative subordinator with characteristics $(\mu, \nu)$, and the order of the parts of the regenerative composition obtained by sampling from the closure of its range corresponds to increasing age of the litters.
In the light of Proposition 4, the theorem might be a bit surprising. What we really want is not the composition into litters, but the partition into families, so we must somehow collect different litters into families. This will destroy the regenerative property of the composition into litters.
We will now define how the litters are related to each other. We do this by sampling from $\hat\rho$. Let $R_t$ be the closed range of $s \mapsto F(s : \hat\rho_t)$. The complement of $R_t$ in $[0, 1]$ is a union of disjoint open intervals, $\bigcup_i I_{i,t}$, with

$$I_{i,t} := \big(F((t - \tau_i)- : \hat\rho_t),\ F(t - \tau_i : \hat\rho_t)\big)$$

for $\tau_i \le t$, so that interval $I_{i,t}$ corresponds to litter $i$. Note that litters $i$ with $\tau_i > t$, i.e. litters not yet born at time $t$, have $I_{i,t} = \emptyset$. See Figure 3.
We say that litter $i$ originates from litter $j$ if $U_i \in I_{j,\tau_i-}$, and in that case we write $j \prec' i$, see Figure 3. There is for each $i$ at most one $j$ such that $j \prec' i$. Let $I_0 := \{i \in \mathbb N : \nexists j,\ j \prec' i\}$. This is the set of litters which are not descendants of any other litter, but descendants of singletons; thus their genotypes are unique at their times of birth. We call these litters roots. We define $\prec$ by $j \prec i$ if there exist $k_1, \dots, k_n$ such that $j \prec' k_1 \prec' \dots \prec' k_n = i$. The sequence $k_1, \dots, k_n$ is then unique. Furthermore, we set $j \preceq i$ if $j \prec i$ or $i = j$. There can be at most one root $j$ for each $i$ such that $j \preceq i$, and in that case we write $\alpha(i) = j$, and say that $j$ is the root of $i$. What is not immediately obvious is that each $i \in \mathbb N$ has a root (almost surely).
Lemma 2 Each i ∈ N has a root almost surely.
Proof Define recursively $I_n := \{i : \exists j \in I_{n-1},\ j \prec' i\}$, for $n \ge 1$, and let the height of a fixed litter $i$ be defined by $H_i := n$ if $i \in I_n$ and $H_i := \infty$ if $\nexists n \in \mathbb N : i \in I_n$. We need to show that $H_i$ is finite almost surely.
The height $H_i$ is a function of $\{(\tau_k, X_k, U_k)\}_{k:\,\tau_k < \tau_i}$. The event $\{i \prec' l\}$ is likewise measurable with respect to $\{(\tau_k, X_k, U_k)\}_{k:\,\tau_i \le \tau_k < \tau_l}$. Thus the events $\{H_i = h\}$ and $\{i \prec' l\}$ are independent for all $i, l$ with $\tau_i < \tau_l$, and $h \in \mathbb N \cup \{\infty\}$.
In the last equality we used (4) with $n = 1$. Thus $\hat p_h = (1 - q(1 : 1)')^h$ and therefore $H_i$ has a geometric distribution with parameter $q(1 : 1)'$ and is finite almost surely.

By our interpretation of the relation $\prec$ as a genealogical relation, we should let all litters with the same root have the same genotype. We define $R_i$, the genotype of litter $i$, by $R_i := U_{\alpha(i)}$. Now we can finally define

$$\rho_t := \sum_i X_i(t)\,\delta_{R_i} + \Big(1 - \sum_i X_i(t)\Big)\lambda.$$

Note that this is a stationary version, and $\rho_0 \not\equiv \lambda$. In the finite intensity case, $\rho$ behaves as in (6) between jumps, just as we wanted, and at the time of a jump, the new litter chooses its genotype from the population at the moment before the jump, just as in (1). At a fixed time $t$, $\rho_t$ represents the population in the sense that a sample from the population will have a partition with distribution as given by $\sim_{\rho_t}$. An i.i.d. sample $(r_i)_{i\in[n]}$ from a realization of $\rho_t$ can be interpreted as the genotypes of individuals $i = 1, \dots, n$ in a sample from the population at time $t$. The value of an $r$ with distribution $\rho_t$ can either be one of $R_1, R_2, \dots$, or, with probability $1 - \sum_i X_i(t)$, it is uniformly drawn from $[0, 1]$.
The justification for the construction is given by Theorem 2.

Theorem 2
The partition of a sample from ρ 0 according to families has the same distribution as the partition according to genotypes of a sample from a Λ-coalescent with mutations, i.e. its distribution is given by the recursion (2).
Proof We assume the sample size $n \ge 2$ and that the sample is created by first sampling from $R_0$ with the i.i.d. uniform random variables $(V_i)_{i\in[n]}$, and then collecting the litters into families. Note that $(V_{in})_{i=j,\dots,n}$, when disregarding their order, are i.i.d. $U(v, 1)$, given $V_{jn} > v$. We will use the notation from Section 4. Consider the realization $V_{1n} = v$. Three possibilities exist.
1. $v \in R_0$. This happens with probability $q(n : 1)'$. Then 1 is a singleton and is thus in a family of its own.
2. $V_{1n} \notin R_0$ and $[V_{1n}, V_{2n}] \cap R_0 \neq \emptyset$. This happens with probability $q(n : 1)''$. Then 1 is in a litter of its own, say litter $k$, and what family litter $k$ belongs to is determined by the uniform random variable $U_k$ relative to $R^{(v)}_0$.
3. $[V_{1n}, V_{mn}] \cap R_0 = \emptyset$ and either $m = n$, or $2 \le m < n$ and $[V_{mn}, V_{m+1,n}] \cap R_0 \neq \emptyset$. Then $m$ individuals belong to the same litter, say litter $k$. This happens with probability $q(n : m)$. What family this litter belongs to is determined by the uniform random variable $U_k$ relative to $R^{(v)}_0$.

In case 1., we immediately find that the first part of our composition has size 1. The distribution of the rest of the sample is determined by $(V_{in})_{i=2,\dots,n}$ and $R^{(v)}_0$. By the regenerative property, the distribution of the rest of the sample will be the same as if sampled with $(V_i)_{i\in[n-1]}$ from $R_0$.
In case 2., the lineages of the sample can be represented by $U_k$ and $(V_{in})_{i=2,\dots,n}$, and their partition is obtained by sampling from $R^{(v)}_0 = R_{\tau_k-}$, which by the regenerative property yields the same result in distribution as sampling with $(V_i)_{i\in[n]}$ from $R_0$.
In the third case, we know that the lineages represented by $(V_{in})_{i\in[m]}$ have coalesced since they originate from a common litter, say litter $k$, but we do not know to which family they belong. This is determined by the realization of $U_k$ relative to $R^{(v)}_0$, which, if $2 \le m < n$, together with the realization of $(V_{in})_{i=m+1,\dots,n}$ determines the further coalescing of lineages. As in case 2., the distribution will be the same as if we sample with $(V_i)_{i\in[n-m+1]}$ from $R_0$.
The argument is now similar to the one in Möhle [6] and Dong et al. [2], with the main difference that our case 2. above does not add any information about the final partition, whereas they only have either mutations/freezing (our case 1.) or collisions (our case 3.) happening at each stage. We thus get the recursion

$$q(a) = q(n : 1)'\,q(a - e_1) + q(n : 1)''\,q(a) + \sum_{m=2}^{n} q(n : m) \sum_{j=1}^{n-m+1} \frac{j(a_j + 1)}{n - m + 1}\,q(a + e_j - e_{j+m-1}).$$

Solving for $q(a)$ and using $\lambda_{n,m} = \int_{(0,1)} x^m (1 - x)^{n-m}\,\nu(dx)$ gives the recursion (2).

Figure 4: Illustration of sampling with $(V_i)_{i\in[7]}$ from the regenerative set that corresponds to $\hat\rho_0$. The arrows indicate how the litters are related to each other. Compare with Figures 2 and 3.
We illustrate the procedure of the proof with Figure 4. The procedure amounts to moving from left to right and noting how the arrows hit $R_0$ or its complement. First, $V_4$ is alone in its interval, corresponding to case 2. in the proof. At this point we cannot say anything about the final partition since that litter may be related to the other individuals in our sample. Next, $V_7$ hits $R_0$ so that it is in a family of its own (case 1.). The next event is that both $V_2$ and $V_5$ fall in the same interval, corresponding to a merger of their lineages and case 3. in the proof. After that, we have a case 2. for the lineage of 2 and 5. Next, we find that the litter of individual 4 is a root. Then lineages 1, 3 and 6 coalesce. The penultimate event is that the litters of lineages 2 and 5, and 1, 3 and 6, are related to the same litter, and thus these lineages coalesce. The final event is finding that this litter also is a root. Thus the partition is $\{\{1, 2, 3, 5, 6\}, \{4\}, \{7\}\}$, just as in the example of Figure 1. The order of the collisions and mutations is also the same as in that example.
Remark 2 Our construction of $\rho$ requires $\nu$ to be a measure on $(0, 1)$ with $\int_{(0,1)} x\,\nu(dx) < \infty$. This excludes a large class of Λ-coalescents. The moment condition is necessary when we want to construct the multiplicative subordinator $F(s : \hat\rho_0)$ (whose properties we use repeatedly) from the point process $\{(\tau_i, X_i)\}_{i\in\mathbb N}$. Nevertheless, it might be possible to obtain a convergence result analogous to the one of Proposition 1, but we have not been able to do so.