Trees within trees: Simple nested coalescents

We consider the compact space of pairs of nested partitions of $\mathbb N$, where by analogy with models used in molecular evolution, we call "gene partition" the finer partition and "species partition" the coarser one. We introduce the class of nondecreasing processes valued in nested partitions, assumed Markovian and with exchangeable semigroup. These processes are called simple when each partition undergoes only one coalescence event at a time (but possibly at the same time). Simple nested exchangeable coalescent (SNEC) processes can be seen as the extension of $\Lambda$-coalescents to nested partitions. We characterize the law of SNEC processes as follows. In the absence of gene coalescences, species blocks undergo $\Lambda$-coalescent type events, and in the absence of species coalescences, gene blocks lying in the same species block undergo i.i.d. $\Lambda$-coalescents. Simultaneous coalescences of the gene and species partitions are governed by an intensity measure $\nu_s$ on $(0,1]\times {\mathcal M}_1 ([0,1])$ providing the frequency of species merging and the law from which the frequencies of genes merging in each coalescing species block are drawn (independently). As an application, we also study the conditions under which a SNEC process comes down from infinity.


Introduction
In the framework of population biology, one can see asexual organisms, but also DNA sequences or even species, as replicating particles. The genealogical ascendance of co-existing replicating particles can always be represented by a tree whose tips are labelled by the names of these particles [25,27,42].
Even if species are not strictly speaking replicating particles, ancestral relationships between species are also usually represented by a tree whose nodes are interpreted as speciation events, i.e., the emergence of two or more species from one single species. The inference of the so-called gene tree of contemporary DNA sequences from their comparison has a decade-long history. It is considered as a field in its own right, called molecular phylogenetics [19,32], which relies heavily on the theory of Markov processes. (This can be misleading, but the species tree, much more often than the gene tree, is called a phylogeny.) When one type of replicating particle is physically embedded in another type of particle, like a virus in its host, their common history can be depicted as a tree within a tree [14,29,34]: tree of dividing parasites inside the tree of dividing hosts, tree of paralogous genes (i.e. distinct DNA segments resulting from gene duplication and coding for similar functions) inside the gene family tree, gene tree inside the species tree. In many such cases, biologists are more interested in the coarser tree rather than in the finer tree. Typically, the finer tree is a gene tree and is inferred thanks to methods developed in molecular phylogenetics. One of the current methodological challenges in quantitative biology is to devise fast statistical algorithms able to also infer the coarser tree. When the genes are sampled from infecting pathogens of the same species (Influenza, HIV...), the coarser tree is the epidemic transmission process [21,45]. When the genes are sampled from (any kind of) different species, the coarser tree is the species tree [22,33,43]. It is often required to use several gene trees nested in the same species tree to infer the latter.
In terms of stochastic modeling, the standard strategy is to define the two nested trees in a hierarchical model referred to as the multispecies coalescent model [12,36] (see also [3,18] for recent surveys on general coalescent theory and applications to population genetics). First, the species tree is fixed or drawn from some classic probability distribution (e.g., pure-birth process stopped at some fixed time, viewed as present time). Second, each gene sequence is assigned to the contemporary species it is (supposed to be) sampled from. Recall that each contemporary species is in correspondence with a tip of the species tree. Third, conditional on the species tree, each gene lineage can then be traced backwards in time inside the species tree, starting from the tip species harboring it and traveling through its ancestral species successively. In addition, gene lineages are assumed to coalesce according to the censored Kingman coalescent [24], i.e., each pair of lineages lying in the same species independently coalesces at constant rate.
In the case when the species tree is also distributed as a Kingman coalescent, this two-type coalescent process is a Markov process as time runs backward, which we call the nested Kingman coalescent (or 'Kingman-in-Kingman') [8,11,28]. Our goal here is to display a much richer class of Markov models for trees within trees, called simple nested exchangeable coalescent (SNEC) processes, where multiple species lineages can merge into one single species lineage, and where simultaneously, within those merging species, multiple gene lineages can merge into one single gene lineage. To make this more precise, we show in the next display some valid and invalid coalescence events from an initial state where six genes, labeled from 1 to 6, are grouped by pairs in three species lineages. We represent this situation in the next display by a pair of partitions $(\pi^s, \pi^g)$, as in the left-hand side of the display. Event (A) is valid because the first two species merge and simultaneously, within these species, genes labeled 1, 2 and 3 coalesce. On the contrary, event (B) is not a valid transition because there are two distinct gene coalescences (1 with 2, and 3 with 4), which is proscribed, and event (C) is not valid because the gene coalescence (5 with 6) is outside the species coalescence. In brief, SNEC processes are the generalization of Λ-coalescents to processes valued, not in partitions of $\mathbb N$, but in pairs of nested partitions of $\mathbb N$. The class of Λ-coalescents [35,37], for which only one coalescence event can occur at a time, is a subclass of Markov, exchangeable processes with possibly non-binary nodes, called Ξ-coalescents, where several coalescence events can be simultaneous [4,38].
Non-binary nodes in species trees can be interpreted as unresolved nodes (a sequence of binary nodes following each other too closely in time for their order to be inferred correctly) or radiation events (periods of frequent speciations due to the opening of new ecological opportunities that can be exploited by different, new species). In gene trees, non-binary nodes are increasingly recognized as a conspicuous sign of natural selection both by biologists [30,44] and by mathematicians and physicists [1,10,13,16,31,41] ; it is also well understood that non-binary nodes could be consequences of bottlenecks as well as large variance in offspring distributions [17,40]. The class of SNEC processes includes all these features. They can distinguish unresolved nodes (sequence of stochastically close, binary coalescences) from radiations (multiple merger in the species tree). Under the interpretation of non-binary nodes as a result of natural selection, SNEC processes can model the appearance of alleles responsible for positive selection (multiple merger in the gene tree) or for divergent adaptation (multiple merger simultaneously in the gene tree and in the species tree).
From a mathematical point of view as well, SNEC processes open up the door to many possible new investigations. For example some of us are currently studying the speed of coming down from infinity of SNEC processes [8,28] as well as similar extensions [15] to fragmentation processes [4]. It will be interesting to investigate how the nested trees generated by SNEC processes can be cast in the frameworks of multilevel measure-valued processes [7,11] and flows of bridges [5,6] as well as of exchangeable combs [20,26]. It would also be natural to study the extension of Ξ-coalescents to nested partitions.
Organization of the article. In Section 2, we introduce some notation, and give examples of nested coalescent processes whose distributions are characterized by four parameters. Section 3 formally defines our object of study, the SNEC processes. We prove our main result in Section 4, and show in Section 5 how SNEC processes can be constructed from a collection of Poisson point processes.
Finally, Section 6 gives a necessary and sufficient condition under which SNEC processes come down from infinity.
Statement of results and notation

Statement of results and examples
An exchangeable partition is a random partition of $\mathbb N$ whose law is invariant under permutations of $\mathbb N$ (with finite support). A Λ-coalescent is a Markov process valued in the exchangeable partitions of $\mathbb N$, typically starting from the partition $0_\infty$ of $\mathbb N$ into singletons, and such that only one coalescence event can occur at a time. The generator of a Λ-coalescent $R = (R(t), t \ge 0)$ is characterized by a non-negative number $a$ (the Kingman coefficient) and a σ-finite measure $\nu$ on $(0,1]$ such that $\int_{(0,1]} x^2\,\nu(dx) < \infty$. The finite measure $x^2\,\nu(dx)$ is usually denoted $\Lambda(dx)$, hence the name Λ-coalescent.
We can now draw the parallel with the results obtained in this paper. We want to define a Markov process R = ((R s (t), R g (t)), t ≥ 0) valued in exchangeable bivariate, nested partitions of N, in the sense that the gene partition R g (t) is finer than the species partition R s (t) for all t a.s.
We now have to allow for coalescences in both the gene partition and the species partition. To this aim, we will consider a doubly indexed array of 0's and 1's $Z = (X_i, (Y_{ij}, j \ge 1))_{i \ge 1}$. The goal is to give a characterization and a Poissonian construction of $R$ under the assumptions that the semigroup of $R$ is exchangeable and that both $R^s$ and $R^g$ undergo only one coalescence at a time (but possibly at the same time), as detailed in the forthcoming Definition 2. Roughly speaking, and similarly as before, $X_i$ will determine whether the $i$-th species block participates in the coalescence in the species partition $R^s$, and $Y_{ij}$ whether the $j$-th gene block of the $i$-th species block participates in the coalescence in the gene partition $R^g$.
Let us start with the Kingman-type coalescences. Let $K^s_{i,i'}$ be the (Dirac) law of the array $Z$ with only zero entries except $X_i = X_{i'} = 1$, and let $K^g_{i;j,j'}$ be the (Dirac) law of the array $Z$ with only zero entries except $X_i = Y_{ij} = Y_{ij'} = 1$.

Let us carry on with multiple gene mergers without simultaneous species coalescences. Let $x \in (0,1]$ and $i \in \mathbb N$. Let $P^g_{i,x}$ be the distribution of the array $Z$ with only zero entries except at row $i$, where $X_i = 1$ and the $(Y_{ij}, j \ge 1)$ are i.i.d. Bernoulli($x$) r.v.'s. Let us define

Finally, let us consider multiple species mergers, with possible simultaneous gene mergers. Let $x \in$

Our main result is that the law of any simple nested exchangeable coalescent (SNEC) process $R$ is characterized by
• two non-negative real numbers $a_s$ and $a_g$;
• a σ-finite measure $\nu_g$ on $(0,1]$;
• a σ-finite measure $\nu_s$ on $(0,1] \times \mathcal M_1([0,1])$.

Note that coagulations of the Kingman type cannot occur simultaneously in the species partition and in the gene partition.
We now give a couple of examples of SNEC processes.
If $\nu_s(dp, d\mu) = \nu'_s(dp)\,\delta_{\delta_0}(d\mu)$, species and genes never coalesce simultaneously and the nested coalescent is a multispecies coalescent (see Introduction), where the species tree is given by the Λ-coalescent with coagulation measure $\nu'_s$ and Kingman coefficient $a_s$, while the genes in the same species block undergo independent Λ-coalescents with coagulation measure $\nu_g$ and Kingman coefficient $a_g$. In particular, when $\nu'_s$ and $\nu_g$ are zero, the SNEC process is a nested Kingman coalescent (Kingman-in-Kingman).
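To make the nested Kingman example concrete, the following Python sketch simulates a Kingman-in-Kingman process on finitely many genes with a Gillespie-type scheme. It is an illustration under stated assumptions rather than the construction used in this paper: we assume that each pair of species blocks merges at rate `a_s` and that each pair of gene blocks lying in the same species block merges at rate `a_g`. The function name `simulate_nested_kingman` and the list-of-lists representation of a nested partition are ours.

```python
import random

def simulate_nested_kingman(genes_per_species, a_s, a_g, rng):
    """Simulate a nested Kingman coalescent on finitely many genes.

    Assumptions (ours): every pair of species blocks merges at rate a_s,
    and every pair of gene blocks in the same species merges at rate a_g.
    genes_per_species must be a non-empty list of positive integers.
    Returns a list of (time, state) pairs, one per coalescence event, where
    a state is a list of species, each species a list of gene blocks.
    """
    labels = iter(range(1, sum(genes_per_species) + 1))
    species = [[frozenset([next(labels)]) for _ in range(g)]
               for g in genes_per_species]
    t, history = 0.0, []
    while len(species) > 1 or len(species[0]) > 1:
        b = len(species)
        rate_s = a_s * b * (b - 1) / 2
        rates_g = [a_g * len(sp) * (len(sp) - 1) / 2 for sp in species]
        total = rate_s + sum(rates_g)
        if total == 0:
            break  # no further event is possible
        t += rng.expovariate(total)
        if rng.random() * total < rate_s:
            # Species coalescence: a uniformly chosen pair of species merges.
            i, j = sorted(rng.sample(range(b), 2))
            species[i] = species[i] + species.pop(j)
        else:
            # Gene coalescence inside one species, chosen with probability
            # proportional to its number of gene pairs.
            i = rng.choices(range(b), weights=rates_g)[0]
            sp = species[i]
            x, y = sorted(rng.sample(range(len(sp)), 2))
            merged = sp[x] | sp.pop(y)
            sp[x] = merged
        history.append((t, [list(sp) for sp in species]))
    return history

# Six genes grouped by pairs in three species, as in the Introduction.
history = simulate_nested_kingman([2, 2, 2], a_s=1.0, a_g=1.0,
                                  rng=random.Random(1))
```

By construction each recorded state is a nested partition, since gene blocks are only ever merged inside a common species block.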
Whenever $\nu_s$ is not of the form $\nu_s(dp, d\mu) = \nu'_s(dp)\,\delta_{\delta_0}(d\mu)$, species blocks and gene blocks can coalesce simultaneously. For example if $\nu_s(dp, d\mu) = \nu'_s(dp)\,\delta_{\delta_x}(d\mu)$ for $x \in (0,1]$, at each species coalescence event, a proportion $x$ of the gene blocks contained in the species blocks participating in the coalescence event are simultaneously merged together. In particular, if $x = 1$, the gene tree coincides with the species tree on lineages situated after a species coalescence event. Recall that there are conditions (see (3.7)) for $\nu_s$ to be a correct SNEC measure, which in this case translate to which is simply equivalent to Otherwise the simplest sort of measure $\nu_s$ can be obtained by parameterizing its second component. For instance, writing $\mu_{a,b}$ for the Beta$(a,b)$ distribution, with density $\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,x^{a-1}(1-x)^{b-1}$ on $[0,1]$, we can consider $\nu_s$ of the form $\nu_s(dp, d\mu) = \nu'_s(dp, da, db)\,\delta_{\mu_{a,b}}(d\mu)$.
In this case, the condition (3.7a) reads $\int \nu'_s(dp, da, db)\,p^2 < \infty$, and (3.7b) becomes which can be rewritten Note that the idea to use a Beta distribution here is inspired by the Λ-coalescent setting [35], where Beta distributions appear as natural candidates for the parametrization of the measure Λ, as the coalescence rate of each $k$-tuple of blocks among a total number of $b$ blocks is expressed in the form $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$.
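For a concrete instance of the Beta parametrization in the classical Λ-coalescent setting, recall that a given $k$-tuple among $b$ blocks merges at rate $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$ [35]. When Λ is the Beta$(\alpha,\beta)$ probability distribution, this integral is a ratio of Beta functions, $\lambda_{b,k} = B(k-2+\alpha,\, b-k+\beta)/B(\alpha,\beta)$. The sketch below evaluates it; the parameter names `alpha` and `beta` (chosen to avoid a clash with the number of blocks $b$) and the function names are ours.

```python
from math import exp, lgamma

def log_beta(x: float, y: float) -> float:
    """Logarithm of the Euler Beta function B(x, y), for x, y > 0."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def beta_coalescent_rate(b: int, k: int, alpha: float, beta: float) -> float:
    """Rate lambda_{b,k} when Lambda is the Beta(alpha, beta) probability law.

    Uses lambda_{b,k} = B(k - 2 + alpha, b - k + beta) / B(alpha, beta),
    valid for 2 <= k <= b and alpha, beta > 0.
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if alpha <= 0 or beta <= 0:
        raise ValueError("need alpha, beta > 0")
    return exp(log_beta(k - 2 + alpha, b - k + beta) - log_beta(alpha, beta))

# Uniform Lambda (alpha = beta = 1): the rates satisfy the consistency
# relation lambda_{b,k} = lambda_{b+1,k} + lambda_{b+1,k+1}.
lhs = beta_coalescent_rate(2, 2, 1.0, 1.0)
rhs = beta_coalescent_rate(3, 2, 1.0, 1.0) + beta_coalescent_rate(3, 3, 1.0, 1.0)
```

Working with `lgamma` rather than `gamma` keeps the computation stable when $b$ is large.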

Notation
For any $n \in \bar{\mathbb N} := \mathbb N \cup \{+\infty\}$, let $\mathcal P_n$ be the set of partitions of $[n]$. A partition $\pi$ is called simple if at most one of its non-empty blocks is not a singleton. We denote the set of simple partitions of $[n]$ by $\mathcal P'_n$, that is, $\mathcal P'_n := \{\pi \in \mathcal P_n : \#\{i : |\pi_i| \ge 2\} \le 1\}$, where $\pi_1, \pi_2, \ldots$ denote the blocks of $\pi$ ordered by their least element and $|\pi_i|$ stands for the number of elements in the block $\pi_i$. Recall that a partition $\pi$ can be viewed as an equivalence relation, in the sense that $i \overset{\pi}{\sim} j$ if and only if $i$ and $j$ belong to the same block of the partition $\pi$. If $\pi^g$ and $\pi^s$ belong to $\mathcal P_n$, we will say that the bivariate partition $\pi = (\pi^s, \pi^g)$ is nested (or equivalently that $\pi^g$ is finer than $\pi^s$) when $i \overset{\pi^g}{\sim} j \Rightarrow i \overset{\pi^s}{\sim} j$ for all $i, j \in [n]$. Note that this defines a natural partial order on $\mathcal P_n$, and we write $\pi^g \preceq \pi^s$ if $(\pi^s, \pi^g)$ is nested.
The set of nested partitions of [n] is denoted in the sequel by N n . We will sometimes use the notation The notation (π s , π g ) owes to our modeling inspiration (see Introduction) where gene lineages are enclosed into species lineages.
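The definitions of simple and nested partitions can be checked mechanically on finite examples. The following Python sketch does so; the representation of a partition as a list of frozensets ordered by least element, and the function names, are our own conventions. Both arguments of `is_nested` are assumed to be partitions of the same set with non-empty blocks.

```python
from typing import FrozenSet, Iterable, List

Partition = List[FrozenSet[int]]

def make_partition(blocks: Iterable[Iterable[int]]) -> Partition:
    """Return the (non-empty) blocks as frozensets, ordered by least element."""
    return sorted((frozenset(b) for b in blocks), key=min)

def is_simple(pi: Partition) -> bool:
    """True when at most one block of pi is not a singleton."""
    return sum(1 for block in pi if len(block) >= 2) <= 1

def is_nested(pi_s: Partition, pi_g: Partition) -> bool:
    """True when every block of pi_g is contained in some block of pi_s."""
    return all(any(g <= s for s in pi_s) for g in pi_g)

pi_s = make_partition([[1, 2, 3, 4], [5, 6]])
pi_g = make_partition([[1, 2, 3], [4], [5], [6]])
nested = is_nested(pi_s, pi_g)       # genes {1,2,3} lie inside species {1,2,3,4}
simple_gene = is_simple(pi_g)        # only {1,2,3} is not a singleton
simple_species = is_simple(pi_s)     # two blocks of size >= 2
```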
Given a nested partition, we can use the coagulation operator Coag (more details in Chapter 3 of Bertoin [4]) to write the species partition in terms of the labels of the gene partition. Recall that if $\pi \in \mathcal P_n$ and $\tilde\pi \in \mathcal P_m$ with $m \ge |\pi|$, then $\pi' = \mathrm{Coag}(\pi, \tilde\pi)$ is defined as the partition in $\mathcal P_n$ whose blocks are $\pi'_j = \bigcup_{i \in \tilde\pi_j} \pi_i$. For every $n \in \bar{\mathbb N}$, let $\pi = (\pi^s, \pi^g)$ be an element of $\mathcal N_n$ and write $m = |\pi^g|$. The unique partition $\tilde\pi \in \mathcal P_m$ such that $\pi^s = \mathrm{Coag}(\pi^g, \tilde\pi)$ is called the link partition of $\pi$. We sometimes say that $\pi$ is linked by $\tilde\pi$. To illustrate the previous definition, observe that the nested partition defined in Example 1 has We can next get an element of $\mathcal P_{n_1} \times \mathcal P_{n_2}$ through the coagulation of two pairs of partitions. More precisely, if $(\pi^1, \tilde\pi^1) \in \mathcal P_{n_1} \times \mathcal P_{n'_1}$ and $(\pi^2, \tilde\pi^2) \in \mathcal P_{n_2} \times \mathcal P_{n'_2}$ with $n'_1 \ge |\pi^1|$ and $n'_2 \ge |\pi^2|$, then $(\mathrm{Coag}(\pi^1, \tilde\pi^1), \mathrm{Coag}(\pi^2, \tilde\pi^2))$ is well defined and is an element of $\mathcal P_{n_1} \times \mathcal P_{n_2}$. If we denote $\pi = (\pi^1, \pi^2)$ and $\tilde\pi = (\tilde\pi^1, \tilde\pi^2)$, we will say that the pair $(\pi, \tilde\pi)$ is admissible and denote the latter operation by $\mathrm{Coag}_2(\pi, \tilde\pi)$. In the following we will sometimes call the partition $\tilde\pi$ the recipe partition.
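The coagulation operator and the link partition are easy to compute on finite partitions. The sketch below implements Coag in the standard way (blocks of the first partition are merged when their indices, in the order of least elements, share a block of the recipe) and recovers the link partition of a nested pair. The function names `coag` and `link_partition` and the frozenset representation are ours; `link_partition` assumes its input is nested.

```python
def coag(pi, recipe):
    """Coag(pi, recipe): the j-th new block is the union of the blocks pi_i
    for i in the j-th block of recipe (blocks of pi indexed from 1 in the
    order of their least elements). Indices beyond |pi| are ignored."""
    pi = sorted((frozenset(b) for b in pi), key=min)
    merged = []
    for idx_block in recipe:
        idx = [i for i in idx_block if i <= len(pi)]
        if idx:
            merged.append(frozenset().union(*(pi[i - 1] for i in idx)))
    return sorted(merged, key=min)

def link_partition(pi_s, pi_g):
    """Partition of {1, ..., |pi_g|} grouping the indices of the gene blocks
    contained in a common species block (pi_g is assumed finer than pi_s)."""
    genes = sorted((frozenset(b) for b in pi_g), key=min)
    link = []
    for s in pi_s:
        s = frozenset(s)
        link.append(frozenset(j + 1 for j, g in enumerate(genes) if g <= s))
    return sorted(link, key=min)

pi_s = [{1, 2, 3, 4}, {5, 6}]
pi_g = [{1, 2, 3}, {4}, {5}, {6}]
link = link_partition(pi_s, pi_g)   # gene blocks 1,2 in one species; 3,4 in the other
recovered = coag(pi_g, link)        # gives back the species partition
```

The last line illustrates the defining property of the link partition, namely that coagulating the gene partition with it returns the species partition.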
In the sequel, we are interested in the coagulation of a nested partition, say $\pi = (\pi^s, \pi^g)$, with a pair of simple partitions $\tilde\pi = (\tilde\pi^s, \tilde\pi^g)$. Nevertheless, we should observe that the resulting partition $\mathrm{Coag}_2(\pi, \tilde\pi)$ is not necessarily nested. For instance, if we coagulate the partition $\pi$ of Example 1, $\mathrm{Coag}(\pi^s, \tilde\pi^s)$. In order to preserve the nested property while coagulating a nested partition, we need to pay attention to the way the gene blocks merge together and to whether they respect the species structure. To this end, for any $n \in \bar{\mathbb N}$ and $\pi \in \mathcal N_n$, we can define the set $\mathcal P(\pi) \subset (\mathcal P'_n)^2$ of simple recipe partitions permitting a consistent merger of species and genes, i.e.
Finally, the natural partial order on partitions can be extended to bivariate partitions by defining $(\pi^{1,s}, \pi^{1,g}) \preceq (\pi^{2,s}, \pi^{2,g}) \iff \pi^{1,s} \preceq \pi^{2,s}$ and $\pi^{1,g} \preceq \pi^{2,g}$. This partial order allows us to see coalescent processes as nondecreasing processes in the space of nested partitions.

Simple nested exchangeable coalescents
In order to describe the joint dynamics of the species and gene partitions, we will now define a nondecreasing process with values in the nested partitions, called a nested coalescent process. In this work we are only interested in simple nested coalescents, in the sense that at any jump event, called a coalescence event, all blocks undergoing a modification merge into one single block. Simple exchangeable coalescent processes were first introduced independently by Pitman [35] and Sagitov [37], and are usually called Λ-coalescents in the literature (see Introduction). Here we use the term simple as in [4], to denote the analog of a Λ-coalescent process in the case of (nested) bivariate partitions.
Note that for any partition π ∈ P ∞ and any injection σ : For bivariate partitions we define in the same way σ(π s , π g ) := (σ(π s ), σ(π g )). the restriction (Π(t) |n , t ≥ 0) would not be a Markov process, as the jump rates of Π(t) |n depend on the whole partition Π(t). Invariance under injections ensures us that processes can be consistently defined, i.e. that (Π(t) |n , t ≥ 0) will always be a Markov process. It will also be useful in forthcoming proofs to consider invariance under injections rather than only permutations.
Since we consider processes with values in the space P ∞ , let us endow it with the natural topology generated by the sets of the form {π ′ ∈ P ∞ , π ′ |n = π} for n ∈ N and π ∈ P n . It is readily checked that this topology is metrizable and makes P ∞ compact. Also, note that the product topology on P 2 ∞ , and that induced on N ∞ also makes them compact.
ii) The process (R(t), t ≥ 0) evolves with simple coalescence events, that is for any time t ≥ 0 such iii) The semigroup of the process (R(t), t ≥ 0) is exchangeable, in the sense that for any t, t ′ ≥ 0 and any injection σ : N → N, To start the analysis of SNEC processes we would like to make some observations related to Definition 2. First note that R is a N ∞ -valued process such that for every t, t ′ ≥ 0, the conditional distribution of R(t + t ′ ) given R(t) = π is the law of Coag 2 (π,π), whereπ ∈ P(π), hence the law ofπ depends on t ′ but also on π. Also, it will be clear from our main result (see We now turn to investigate the transitions of the restrictions of a SNEC to finite partitions, which relies on the following lemma. ii) For ̺, π ∈ N n , the rate from ̺ to π is zero if π can not be obtained from a simple coalescence event; iii) The Markov chain (R |n (t), t ≥ 0) is exchangeable, in the sense that for any t, t ′ ≥ 0, ̺, π ∈ N n and σ permutation of n, the rate from ̺ to π is equal to that from σ(̺) to σ(π).
Proof. Let R be a SNEC in N ∞ and let n ∈ N. Let us prove that R |n satisfies the claimed properties.
Let $\varrho \in \mathcal N_n$. Pick $\varrho^\star \in \mathcal N_\infty$ such that $\varrho^\star_{|n} = \varrho$, and which contains an infinite number of species blocks, each of which contains an infinite number of gene blocks, each of them being an infinite subset of $\mathbb N$, so for any $t, t' \ge 0$, Since this is valid for any $\varrho'$ such that $\varrho'_{|n} = \varrho$, this conditional distribution depends only on $\{R_{|n}(t) = \varrho\}$, which proves that $R_{|n}$ is a Markov process. Now the assumption that $R$ has càdlàg paths ensures that the process $R_{|n}$ stays for a positive time in each visited state a.s. Therefore $R_{|n}$ is a continuous-time Markov chain. Now statements i)-iii) are easily deduced from Definition 2.
Conversely, let R = (R(t), t ≥ 0) be a process with values in N ∞ such that for all n ∈ N, R |n is a Markov chain satisfying i) -iii) of the lemma. Then i) and ii) of Definition 2 follow immediately, and it remains to check that for any injection σ : N → N, the equality in distribution (3.2) holds.
Let σ : N → N be an injection and fix n ∈ N. Define N = max{σ(1), σ(2), . . . , σ(n)}, and consider Now notice that for any t ≥ 0 and any π ∈ N ∞ , σ(π) |n = σ(π |N ) |n , which enables us to write, for any t, t ′ ≥ 0, The passage to the second line in the last display is a consequence of iii) of the lemma, and we used the fact that restrictions are Markov chains, i.e.
Since n is arbitrary in we have shown (3.2), concluding the proof.
This key lemma enables us to give the following first properties of SNEC processes.

Proposition 4. Let R be a SNEC.
• If the process R starts from an exchangeable nested partition R(0), then for any t ≥ 0, R g (t) and R s (t) are exchangeable partitions.
• The process R is a Feller process, so in particular it satisfies the strong Markov property.
• Conditional on R(t), ifR(t) denotes the link partition of R(t) then for any t, t ′ ≥ 0, the dis- Another property is that the process (R s (t), t ≥ 0) is a simple exchangeable coalescent process, but we do not prove it at this point as it will be clear from Theorem 5.
Proof. The first point of the proposition is immediate considering iii) of Definition 2.
As for the second point, recall that N ∞ is endowed with the topology generated by the sets of the form {π ∈ N ∞ , π |n = π}, for n ∈ N, π ∈ N n . It is easy to see that this topology is metrized by We need to show that for any continuous (then bounded) function f : N ∞ → R, the function P t f : By definition the process is càdlàg so we have almost surely f (R(t)) → f (R(0)) so clearly by taking expectations P t f (π) → f (π) as t → 0. Now to show that P t f is continuous, consider n ∈ N and let { π 1 , . . . , π k } be an enumeration of N n . We pick π 1 , . . . , π k ∈ N ∞ such that π i |n = π i , and define For t > 0 and π, π ′ ∈ N ∞ , we have Now suppose π |n = π ′ |n . Since f n depends only on π |n and by Lemma 3 the process R |n has the same distribution under P π or P π ′ , we have the equality E π f n (R(t)) = E π ′ f n (R(t)), and plugging that into showing that P t f is continuous.
For the third point of the proposition, R g (t + t ′ ) is clearly of the form Coag(R g (t),π g ), whereπ = (π s ,π g ) is a random recipe partition whose distribution depends on R(t) and t ′ . Let us show that the conditional distribution ofπ given R(t) is invariant under the action of permutations preservingR(t).
Without loss of generality, we can work under the conditioning {R(t) = (̺, 0 ∞ )}, where ̺ is any partition and 0 ∞ is the partition into singletons, so that for all π, we have Coag(R g (t), π) = π. In particular, note that in this case we have R g (t + t ′ ) =π g , andR(t) = ̺. Let σ be a permutation such that σ(̺) = ̺. The problem then reduces to showing that which is now an immediate consequence of iii) in Definition 2.
Let us now investigate the transition rates of the Markov chains $R_{|n}$ appearing in Lemma 3, for every $n \in \mathbb N$. To this end, fix $n \in \mathbb N$, let $\varrho \in \mathcal N_\infty$ and $\pi \in \mathcal N_n$ with $\pi \neq \varrho_{|n}$, and denote the jump rate of $R_{|n}$ from $\varrho_{|n}$ to $\pi$ by $q_n(\varrho, \pi) := \lim_{t \to 0+} \frac{1}{t}\, P_\varrho(R_{|n}(t) = \pi)$, where $P_\varrho(\cdot) = P(\cdot \mid R(0) = \varrho)$. The index $n$ is not necessary in the notation as it can be read from the partition $\pi$. However we keep it as it eases reading. Recall that $q_n(\varrho, \pi)$ depends on $\varrho$ only through $\varrho_{|n}$. As remarked in Lemma 3, $q_n(\varrho, \pi)$ equals zero if $\pi$ is not obtained from $\varrho_{|n}$ by coagulating blocks according to a partition in $\mathcal P(\varrho_{|n})$, that is, by merging some species blocks of $\varrho^s_{|n}$ into one and some gene blocks of the new species into one. Also observe that the rates do not depend on the sizes of the gene blocks in the starting configuration, so there is no loss of generality in assuming that $\varrho^g = 0_\infty$, the trivial partition made of singletons. Of course changing the starting partition $\varrho$ has some effect on the arrival partition $\pi$. This is why we will need to write transition rates in another way, giving more emphasis to the dependence of the coagulation mechanism upon the starting partition.
Fix $n \in \bar{\mathbb N}$ and suppose that $R_{|n}$ starts from $n$ singleton gene blocks allocated into $b$ species. Since labels of genes do not affect the transition rates, we will record the number of genes in each species in a vector $g = (g_1, \ldots, g_b)$. This vector suffices to describe the starting position. Indeed $|g| = b$ gives the number of species and $\sum_{i=1}^b g_i = n$ gives the number of genes. Now the coagulation mechanism will be described by two terms. We will say that a gene block participates in the coalescence event if it merges with other gene blocks. We will say that a species Note that not all such arrays $(g, s, c)$ code for observable coalescence events, so we will define a restricted set of arrays of interest for our study. First, note that one needs to have $\sum_i s_i \ge 2$ in order to observe a species merger. If $\sum_i s_i = 1$, then there is a gene coalescence if and only if $\sum_{i,j} c_{ij} \ge 2$. Also, we will restrict ourselves to the arrays $(g, s, c)$ such that $\sum_{i,j} c_{ij} \neq 1$, because a sole gene coalescing is not distinguishable from no gene coalescing.
Formally, we consider finite arrays (g, s, c) satisfying the assumptions and i,j We denote by C the set of arrays (g, s, c) satisfying (H1), (H2) and (H3).
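The informal constraints discussed above can be turned into a small checker. The Python sketch below encodes our reading of them, which is an assumption and not a verbatim transcription of (H1), (H2) and (H3): the shapes of $s$ and $c$ must match $g$, gene blocks may participate only inside participating species, the total number of participating gene blocks is never exactly one, and either at least two species merge or exactly one species is involved with at least two of its gene blocks merging. The function name `is_observable_event` is ours.

```python
def is_observable_event(g, s, c):
    """Check whether (g, s, c) codes for an observable simple coalescence.

    g: list of gene counts per species; s: 0/1 flags per species;
    c: list of rows of 0/1 flags, row i having g[i] entries.
    The conditions below are our reading of the informal discussion.
    """
    b = len(g)
    if len(s) != b or len(c) != b:
        return False
    if any(x not in (0, 1) for x in s):
        return False
    for i in range(b):
        if len(c[i]) != g[i] or any(x not in (0, 1) for x in c[i]):
            return False
        if s[i] == 0 and any(c[i]):
            return False  # a gene cannot participate outside a merging species
    k = sum(s)
    total = sum(sum(row) for row in c)
    if total == 1:
        return False  # a sole participating gene is not observable
    if k >= 2:
        return True   # species merger, with or without a gene merger
    return k == 1 and total >= 2  # gene merger inside a single species

# Event (A) of the Introduction: species 1 and 2 merge, genes 1, 2, 3 coalesce.
event_a = is_observable_event([2, 2, 2], [1, 1, 0], [[1, 1], [1, 0], [0, 0]])
# Event (C): genes 5 and 6 coalesce outside the merging species.
event_c = is_observable_event([2, 2, 2], [1, 1, 0], [[0, 0], [0, 0], [1, 1]])
```

Event (B) of the Introduction, with two distinct gene coalescences, cannot even be written in this encoding, since a single array $c$ describes a single group of merging gene blocks.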
We then denote the transition rate of $R_{|n}$ from a partition described by $g$ (such that $\sum_i g_i = n$) to a new partition obtained by merging species and genes according to $s$ and $c$ by $q_{b,k}(g, s, c)$.
Here again the indices $b$ and $k$ are not necessary, but they make it easy to read off the coalescence event at the species level ($k = \sum_i s_i$ species merging among $b = |g|$). We stress that we consider only arrays $(g, s, c) \in \mathcal C$ when we study the rates $q_{b,k}(g, s, c)$, and that these quantities uniquely determine the law of a SNEC $R$, since they completely describe the rates associated with each finite-state continuous-time Markov chain $R_{|n}$.
We introduce a notation that we will use in the next result for ease of writing. For µ a probability on [0, 1], consider any probability space where Z 1 , Z 2 , . . . are i.i.d. with distribution µ and denote the expectation E µ . Now take a vector (g i , i ∈ S) of integers, where S is a finite subset of N. We define This can be thought of as the probability that a random array (c ij , i ∈ S, 1 ≤ j ≤ g i ) does not satisfy (H3), where conditional on (Z i , i ∈ S) the variables (c ij ) are independent, and for all i, j, c ij = 1 with probability Z i . We can now state our main result.

and such that for any array $(g, s, c) \in \mathcal C$ such that $|g| = b$, $\sum_i s_i = k$ and $\sum_j c_{ij} = l_i$,

where the functional U is defined in (3.6) and I = I(g, s, c), in the case k = 1, is the unique index in {1, 2, . . . , b} such that s I = 1.
Furthermore, this correspondence between laws of SNEC processes and quadruplets (a s , a g , ν s , ν g ) satisfying (3.7) and (3.8) is bijective.

Remark 6. We will show the surjective part of the theorem's last statement in Section 5, using an explicit Poissonian construction. For now we prove the existence and uniqueness of the characteristics $(a_s, a_g, \nu_s, \nu_g)$.

Proof of Theorem 5
Consider a SNEC process R = ((R s (t), R g (t)), t ≥ 0) with values in N ∞ and recall its jump rates q n (̺, π) defined in (3.5). Also recall the alternative notation q b,k (g, s, c). Here, g is a vector of size b such that g i = n, s is a vector having the same size as g with coordinates in {0, 1} such that and such that the transition rate of the Markov chain R |n from ̺ |n to π ∈ N n is given by Furthermore, for any permutation σ : N → N, Note that we write µ ̺ (Π ∈ A) instead of µ ̺ (A) because we implicitly work on the canonical space N ∞ and we denote by Π the generic element of N ∞ .
Proof. Let $n < m$. We first note that since $R_{|m}$ and $R_{|n} = (R_{|m})_{|n}$ are Markov chains, the transition rates can be expressed, for any $\pi \in \mathcal N_n \setminus \{\varrho_{|n}\}$, This implies that if $(A_n)_{n \ge 1}$ is a family of pairwise disjoint elements of $\mathcal A$ such that $\bigcup_n A_n \in \mathcal A$, then at most a finite number of the $A_n$ are non-empty (because $\bigcup_n A_n$ is compact and the $A_n$ are open, so this cover has a finite subcover), and countable additivity reduces to finite additivity. Therefore Carathéodory's extension theorem applies, hence the existence of a measure $\mu_\varrho$ on $\mathcal N_\infty \setminus \{\varrho\}$ satisfying (4.11).
Considering µ ̺ as a measure on N ∞ such that µ ̺ ({̺}) = 0, we check easily (4.10) by noticing that Furthermore, for any n, π ∈ N n \ {̺ |n } and σ : N → N permutation, we have by the exchangeability property (3.2) of a SNEC, that which proves that (4.12) holds on A. Since the topology of N ∞ is generated by A, the proof is complete.
The latter lemma implies that there exists a family of exchangeable measures on N ∞ characterizing (i.e. acting as an analog of a Markov kernel for continuous-space pure-jump Markov chains) the SNEC process R. Furthermore, since we are dealing with a simple coalescent, it is clear from the characterization (4.11) that µ ̺ is simple in the sense that it is supported by all the possible bivariate partitions obtained from a simple coalescence from ̺. To put it simply, The measure µ ̺ can be translated as a measure on arrays of random variables in {0, 1}. Informally, we can associate to each species in ̺ a 1 entry if it participates in the coalescence and a 0 entry otherwise. Inside the species participating to the coalescence event, we can also associate a 1 entry to the genes participating in the coalescence event and a 0 entry otherwise. To tally with the definition of the SNEC we will need a certain partial exchangeability structure for this array. This picture can be formalized as follows. Let ((X i , (Y ij , j ∈ N)), i ∈ N) be an array of Bernoulli random variables and denote by Z i the i-th line vector (X i , (Y ij , j ∈ N)). We say that this array is hierarchically is invariant under any permutation over the j's.
We also naturally extend this definition to measures on the space ({0, 1} × {0, 1} N ) N . We say that such a measure µ is hierarchically exchangeable if it is invariant both under the permutations of the rows, and the permutations within a row.
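One simple way to produce a hierarchically exchangeable array, in the spirit of the parametrization by a species frequency $p$ and a law $\mu$ for the gene frequencies, is to draw the rows independently: $X_i$ is Bernoulli($p$), and on $\{X_i = 1\}$ a frequency $q_i$ is drawn from $\mu$ and the $Y_{ij}$ are i.i.d. Bernoulli($q_i$). The Python sketch below samples a finite corner of such an array. The function name, the truncation to a finite array, the convention that rows with $X_i = 0$ are identically zero, and the choice of a Beta law for $\mu$ in the usage lines are our assumptions.

```python
import random

def sample_merger_array(p, sample_q, n_species, n_genes, rng):
    """Sample a finite array (X_i, (Y_ij)) with i.i.d. rows.

    p: probability that a species block participates;
    sample_q: function rng -> q in [0, 1], drawing a gene frequency from mu;
    rows with X_i = 0 are set to zero (our convention).
    """
    X, Y = [], []
    for _ in range(n_species):
        x = 1 if rng.random() < p else 0
        q = sample_q(rng) if x == 1 else 0.0
        X.append(x)
        Y.append([1 if x == 1 and rng.random() < q else 0
                  for _ in range(n_genes)])
    return X, Y

rng = random.Random(0)
# Gene frequencies drawn from a Beta(2, 3) law, chosen only for illustration.
X, Y = sample_merger_array(0.5, lambda r: r.betavariate(2.0, 3.0), 5, 4, rng)
```

Because the rows are i.i.d. and the entries within a row are i.i.d. given $q_i$, the law of the array is invariant under permutations of the rows and under permutations within a row.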
Then the exchangeability property of $\mu_\varrho$ (4.12) implies that $\nu$ is a hierarchically exchangeable measure on $(\{0, 1\} \times \{0, 1\}^{\mathbb N})^{\mathbb N}$, and (4.10) implies that where $0$ denotes the null array on $(\{0, 1\} \times \{0, 1\}^{\mathbb N})^{\mathbb N}$. Also, note that the map $(\mu_\varrho, \varrho \in \mathcal N_\infty) \mapsto \nu$ is one-to-one. Indeed, we can conversely define, for any $Z$ and any nested partition $\varrho \in \mathcal N_\infty$, the nested partition $C(\varrho, Z) \in \mathcal N_\infty$ obtained from $\varrho$ by merging exactly the blocks that participate in the coalescence, where
• The $i$-th block of $\varrho^s$ participates iff $X_i = 1$;
• The $j$-th block in $\varrho^g$ of the $i$-th block of $\varrho^s$ participates iff $X_i = 1$ and $Y_{ij} = 1$.
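On a finite nested partition, the map $Z \mapsto C(\varrho, Z)$ described by the two rules above can be sketched as follows. The representation of a nested partition as a list of species, each species being a list of gene blocks, and the function name `apply_merger` are ours; species and gene blocks are indexed in the order of their least elements, and `Y[i]` is assumed to have at least as many entries as the $i$-th species has gene blocks.

```python
def apply_merger(species, X, Y):
    """Merge the species blocks with X[i] == 1 into one species and, inside
    them, the gene blocks with Y[i][j] == 1 into one gene block."""
    # Normalise: gene blocks as frozensets, everything ordered by least element.
    species = [sorted((frozenset(gb) for gb in sp), key=min) for sp in species]
    species.sort(key=lambda sp: min(min(gb) for gb in sp))
    untouched = []          # species that do not participate
    merged_species = []     # non-participating gene blocks of merging species
    merged_gene = frozenset()
    for i, sp in enumerate(species):
        if X[i] == 1:
            for j, gb in enumerate(sp):
                if Y[i][j] == 1:
                    merged_gene |= gb
                else:
                    merged_species.append(gb)
        else:
            untouched.append(sp)
    if merged_gene:
        merged_species.append(merged_gene)
    result = untouched + ([merged_species] if merged_species else [])
    result = [sorted(sp, key=min) for sp in result]
    result.sort(key=lambda sp: min(min(gb) for gb in sp))
    return result

# Event (A) of the Introduction: six genes grouped by pairs in three species;
# the first two species merge and genes 1, 2 and 3 coalesce.
start = [[{1}, {2}], [{3}, {4}], [{5}, {6}]]
after = apply_merger(start, X=[1, 1, 0], Y=[[1, 1], [1, 0], [0, 0]])
```

Since gene blocks are merged only inside the newly formed species block, the output is again nested.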
With this definition, µ ̺ is obtained as the push-forward of ν by the map Z → C(̺, Z). Now recall the alternative notation q b,k (g, s, c) for the transition rate of R |n (where n = i g i ) from a nested partition with b species blocks and g 1 , . . . , g b gene blocks inside them, to a nested partition obtained by merging k species blocks according to the vector s and gene blocks inside those species according to the array c. For any array (g, s, c) ∈ C, note that (4.11) translates in terms of our push-forward ν in the following way: Indeed, the first line is quite straightforward and comes from our representation of coalescence events by those arrays (g, s, c) ∈ C (see Section 3) which basically means that blocks participating in a coalescence event are those associated with a 1. However in the case when c = 0, there is an additional probability to observe the coalescence of species blocks associated to s with no coalescence of gene blocks (the case when all the Y ij 's are 0 is included in the first term), which is when exactly one of the Y ij 's is equal to 1. This gives rise to the second line of (4.16).
We now have to establish a de Finetti representation of hierarchically exchangeable arrays to express the measure of an event of the form Note that we consider random measures in the following, but only on Borel spaces $(S, \mathcal S)$ (i.e. spaces isomorphic to a Borel subset of $\mathbb R$ endowed with the Borel σ-algebra), which will enable us to use de Finetti's theorem [23]. For this we write $\mathcal M_1(S)$ for the space of probability measures on $S$, which is endowed with the σ-algebra generated by the maps $\mu \mapsto \mu(B)$ for all $B \in \mathcal S$. The spaces $(S, \mathcal S)$ that we consider will be for instance $[0, 1]$ with its Borel sets or $\{0, 1\}^{\mathbb N}$ equipped with the product σ-algebra, which are clearly Borel spaces.  , $\mu_0$, $\mu_1$)) such that for all $n \ge 1$ (4.17)

Proof. Let us first observe that if a sequence $(X, (Y_j, j \in \mathbb N))$ satisfies Hypothesis (A2), then, conditional on $X = x \in \{0, 1\}$, the sequence $(Y_j, j \in \mathbb N)$ is exchangeable. We can thus apply de Finetti's theorem: conditional on $X = x$ there is a unique probability measure $\mu_x$ giving the distribution of the asymptotic frequency $q$ of the variables $(Y_j, j \in \mathbb N)$, and conditional on $q$ they are i.i.d. Bernoulli with parameter $q$. This implies that, for any $\{0, 1\}$-valued finite sequence $(x, y_1, y_2, \dots, y_k)$, Also observe that since $X$ is binary, there exists $p \in [0, 1]$ such that P( As a consequence of Hypothesis (A1), we can apply de Finetti's theorem once again: there exists a unique law $\tilde\Lambda$ on $\mathcal M_1(\{0, 1\}^{\mathbb N})$ such that the law of ( Furthermore, we have seen that $\tilde\mu$ can be expressed as in (4.18).

Now let $F$ stand for the measurable mapping such that $F(\tilde\mu) = (p, \mu_0, \mu_1) \in E'$ and let $\Lambda$ be the push-forward of $\tilde\Lambda$ by the mapping $F$. We obtain that if $A$ and $(B_i, i \in A)$ are finite subsets of $\mathbb N$, This ends the proof.
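The two-level sampling scheme described in the proof above (draw each $X_i$ as a Bernoulli($p$) variable, then a gene frequency $q_i$ from $\mu_{X_i}$, then the $Y_{ij}$ as i.i.d. Bernoulli($q_i$) variables) can be sketched as follows. This is only an illustrative sketch for a fixed triple $(p, \mu_0, \mu_1)$, truncated to a finite array; the function name `sample_hierarchical_array` and the convention of passing $\mu_0$ and $\mu_1$ as sampler callables are assumptions made for the example, not notation from the text.

```python
import random


def sample_hierarchical_array(p, sample_mu0, sample_mu1, n_rows, n_cols, rng):
    """Sample a finite block (X_i, Y_ij) of a hierarchically exchangeable array.

    p          : probability that a row (species block) has X_i = 1
    sample_mu0 : hypothetical callable rng -> q in [0, 1], law mu_0 (used when X_i = 0)
    sample_mu1 : hypothetical callable rng -> q in [0, 1], law mu_1 (used when X_i = 1)
    """
    X, Y = [], []
    for _ in range(n_rows):
        x = 1 if rng.random() < p else 0
        # The row frequency q is drawn from mu_x, conditionally on X_i = x.
        q = sample_mu1(rng) if x == 1 else sample_mu0(rng)
        X.append(x)
        # Conditionally on q, the entries of the row are i.i.d. Bernoulli(q).
        Y.append([1 if rng.random() < q else 0 for _ in range(n_cols)])
    return X, Y


if __name__ == "__main__":
    rng = random.Random(1)
    X, Y = sample_hierarchical_array(
        0.3, lambda r: 0.0, lambda r: r.random(), n_rows=5, n_cols=8, rng=rng
    )
    for x, row in zip(X, Y):
        print(x, row)
```

In the usage above, $\mu_0 = \delta_0$ and $\mu_1$ is uniform on $[0, 1]$, so rows with $X_i = 0$ contain only zeros.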
This result is almost enough to express (4.16) but one has to be careful because the measure ν might not be finite. However, it is σ-finite because by (4.15), and those events have finite measure. The idea behind the following lemma is to make use of those events and hierarchical exchangeability to express ν as a limit of finite measures which, thanks to an application of Proposition 8, have a representation under the form (4.17).
Let us introduce some notation that will enable us to make this argument formal. For a fixed vector $(g, s, c) \in C$, such that $|g| = b$, let us examine the event and its measure $\nu(A)$. Let us define, for all $n \ge 1$ the shifted random array Recall that the array $Z$ encodes which species blocks and which gene blocks are participating in a coalescence. Therefore the event $A \cap \{Z^b \neq 0\}$ indicates that there are merging species blocks outside of the first $b$ blocks. In fact we will see that this implies that there are infinitely many such merging blocks (a random proportion $p$ of them), and that within each of these blocks, a random proportion $q$ of gene blocks also participates in the coalescence event. The following technical lemma makes this statement rigorous.

Proof. We define some events that will be used to express $\nu(A \cap \{Z^b \neq 0\})$.
Note that $(g, s, c)$ satisfies (H1) and (H2), so we have , $g_1, \dots, g_b$). Now because $\nu$ satisfies (4.15), necessarily $\nu(A) < \infty$, which implies that is a finite hierarchically exchangeable measure on $(\{0, 1\} \times \{0, 1\}^{\mathbb N})^{\mathbb N}$. The de Finetti representation (Proposition 8) implies that on the event $A$, $Z^b$ is either 0, or has an infinite number of entries with value 1. In particular, has at least two entries at 1}, therefore, there is the equality where the union is increasing. Therefore, where we used the hierarchical exchangeability of $\nu$ to get the second equality. Now we know from (4.15) and because $\nu$ is exchangeable that the measure We can simplify this expression since $\nu$ is supported by the set {∀i ∈ N, This implies that $\Lambda_n$-a.e. the measure $\mu_0$ is $\delta_0$, the Dirac measure at 0. Therefore we write $\tilde\Lambda_n$ for the push-forward measure on $E := (0, 1] \times M([0, 1])$ of $\Lambda_n$ by the map $(p, \mu_0, \mu_1) \mapsto (p, \mu_1)$. We now have To be able to pass to the limit, let us check that the sequence of measures $(\tilde\Lambda_n)$ is increasing. Indeed, recall that $\Lambda_n$ is obtained from two applications of de Finetti's theorem to the exchangeable array $Z^n$, so the asymptotic parameters $p$ and $\mu$ appearing in (4.21) are a deterministic, measurable functional of $Z^n$. Let us write this functional $F(Z^n) = (p, \mu)$, so now $\Lambda_n$ is simply the measure But $p$ and $\mu$ are asymptotic quantities of the array $Z^n$, which do not depend on the first row of $Z^n$, so $F(Z^{n+1}) = F(Z^n)$ and we have where the passage from the second to the third line is simply because $B_n \subset B_{n+1}$. Therefore there is a limiting measure $\nu_s$ on $E$ such that so we recover (4.20). To prove the uniqueness of this measure, consider any measure $\nu'_s$ on $E$ such that (4.20) holds. Then we have simply where the first equality is by definition and the second because we assumed that (4.20) holds for $\nu'_s$. Taking limits on both sides yields $\nu_s(dp, d\mu) = \nu'_s(dp, d\mu)$.

Let us now examine $\nu(A \cap \{Z^b = 0\})$. Recall that the event $A \cap \{Z^b = 0\}$ indicates that there are no other merging species blocks than those within the first $b$ blocks. The next lemma shows that this implies that we are either in a Kingman-type coalescence (a pair of species blocks merges, which occurs at rate $a_s$, or a pair of gene blocks within one species merges, which occurs at rate $a_g$), or in a multiple gene coalescence within a single species block (in which case a random proportion $q$ of gene blocks merges).
- Suppose first $c_{1,1} = c_{1,2} = 1$. This means that the first two gene blocks of the first species block coalesce while the first two species blocks coalesce. Then we note that for any $n, i \ge 2$, disjoint. Therefore for any $n \ge 2$,

• In the case $k = 1$, suppose that $s_1 = 1$. On the event we have simply $Z^1 = 0$, and then the measure is an exchangeable measure on $\{0, 1\}^{\mathbb N}$ such that for all $n \in \mathbb N$, $\nu'\big(\sum_{j=1}^n Y_j \ge 2\big) < \infty$. Therefore (see for instance Bertoin [4]) there exist a unique constant $a_g \ge 0$ and a unique measure $\nu_g$ on $(0, 1]$ satisfying (3.8) such that $\nu'$ can be written for any vector $(y_1, y_2, \dots, y_n) \in \{0, 1\}^n \setminus \{0\}$ such that $l := \sum_i y_i \ge 2$.
Putting all the previous considerations together yields (4.22). Now it remains to put together Lemma 9 and Lemma 10. Recall that we restricted the rate function $q$ to arrays in $C$, i.e. satisfying (H1) to (H3). The reason for assuming (H3) is that then we can always write $q_{b,k}(g, s, c)$ as in (4.16), that is Using the previous two lemmas to decompose the two lines on the events $\{Z^b \neq 0\}$ and $\{Z^b = 0\}$, we obtain the formula (3.9), concluding the proof of Theorem 5.

Poissonian construction
The goal of the present section is to show how any simple nested exchangeable coalescent can be constructed from a Poisson point process. Consider two real coefficients $a_s, a_g \ge 0$ and two measures: $\nu_s$ on $(0, 1] \times \mathcal M_1([0, 1])$, satisfying (3.7), and $\nu_g$ on $(0, 1]$, satisfying (3.8). Recall the measures $K_s$, $K_g$, $P^g_x$ and $P^s_{x,\mu}$ introduced in Section 2, and the measure $\nu(dZ)$ defined on the space E of doubly indexed arrays of 0's and 1's Z = (X, Note that $\nu$ characterizes the distribution of the SNEC through the relation (4.16). The key idea of the construction is that $\nu$ necessarily satisfies (4.15), which is easily shown using exchangeability and conditions (3.7) and (3.8). First, note that $\nu(Z = 0) = 0$ is immediate from our definitions, and that a straightforward union bound yields therefore we need only check that these two quantities are finite. Now by definition, we have Integrating the last two lines with respect to $\nu_g$ and $\nu_s$ and summing, we see that (3.7) and (3.8) imply that both $\nu(X_1 = X_2 = 1)$ and $\nu(X_1 = Y_{1,1} = Y_{1,2} = 1)$ are finite, proving (4.15).
Now to start the construction of our process, consider an initial partition π 0 ∈ N ∞ . Let M be a Poisson point process on (0, ∞) × E with intensity dt ⊗ ν(dZ). We will construct on the same probability space the processes R n = (R n (t), t ≥ 0), for n ∈ N thanks to M .
Recall that for any Z = (X, (Y i , i ≥ 1)) = (X i , Y ij , i, j ≥ 1) and any nested partition π ∈ N n , we denote by C(π, Z) the nested partition of N n obtained from π by merging exactly the blocks that participate in the coalescence, where
• The i-th block of π s participates iff X i = 1;
• The j-th block in π g of the i-th block of π s participates iff X i = 1 and Y ij = 1.
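The action of the map C(π, Z) on a finite nested partition can be sketched as follows. This is a minimal illustration under stated assumptions: the nested partition is represented as a list of species blocks, each being a list of gene blocks (Python sets of integers), blocks are ordered by their least element, indices are 0-based, and the function name `coalesce` is hypothetical.

```python
def coalesce(pi, X, Y):
    """Apply the coalescence encoded by Z = (X, Y) to the nested partition pi.

    pi : list of species blocks; each species block is a list of gene blocks (sets).
    X  : 0/1 list, X[i] = 1 iff the i-th species block participates.
    Y  : 0/1 rows, Y[i][j] = 1 iff the j-th gene block of species i participates
         (only taken into account when X[i] = 1).
    """
    untouched = []          # species blocks that do not participate
    pooled_genes = []       # gene blocks of participating species that stay as they are
    merged_gene = set()     # union of all participating gene blocks
    for i, species in enumerate(pi):
        if i < len(X) and X[i] == 1:
            for j, gene in enumerate(species):
                if j < len(Y[i]) and Y[i][j] == 1:
                    merged_gene |= gene
                else:
                    pooled_genes.append(set(gene))
        else:
            untouched.append([set(g) for g in species])
    new_species = pooled_genes + ([merged_gene] if merged_gene else [])
    result = untouched + ([new_species] if new_species else [])
    # Re-order gene blocks and species blocks by their least element.
    for species in result:
        species.sort(key=min)
    result.sort(key=lambda species: min(min(g) for g in species))
    return result


if __name__ == "__main__":
    pi = [[{1}, {2}], [{3}, {4}], [{5}]]
    # Species blocks 0 and 1 merge; their first gene blocks merge as well.
    print(coalesce(pi, [1, 1, 0], [[1, 0], [1, 0], [0]]))
```

All participating species blocks are merged into a single species block and all participating gene blocks into a single gene block, which is the "simple" (one coalescence at a time in each partition) situation considered in the text.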
Fix $n \in \mathbb N$, and let $M_n$ denote the subset of $M$ consisting of points $(t, Z)$ such that $\sum_{i=1}^n X_i \ge 2$ or $\exists i \le n$, $X_i \sum_{j=1}^n Y_{ij} \ge 2$. Because of (4.15), there are only finitely many such points with $t$ in any compact subset of $[0, +\infty)$. Therefore one can label the atoms of the set $M_n := \{(t_k, Z^{(k)}), k \in \mathbb N\}$ in increasing order, i.e. such that $0 \le t_1 \le t_2 \le \dots$

We set $R^n(t) = (\pi_0)_{|n}$ for $t \in [0, t_1)$. Then define recursively These processes are consistent in $n$ as we show in the following result.

Proposition 11. For every $t \ge 0$, the sequence of random bivariate partitions $(R^n(t), n \in \mathbb N)$ is consistent. If we denote by $R(t)$ the unique partition of N ∞ such that $R_{|n}(t) = R^n(t)$ for every $n \in \mathbb N$, then the process $R = (R(t), t \ge 0)$ is a SNEC started from $\pi_0$, with rates given as in Theorem 5.
The proof uses similar arguments as in the proof of consistency of exchangeable coalescents given in Proposition 4.5 of [4].
Recall that we defined $M_n$ as the subset of $M$ consisting of points $(t, Z)$ such that $\sum_{i=1}^n X_i \ge 2$ or $\exists i \le n$, $X_i \sum_{j=1}^n Y_{ij} \ge 2$. Fix $n \ge 2$ and write $(t_1, Z^{(1)})$ for the first atom of $M_n$ on $(0, \infty) \times$ E. Plainly, $R^{n-1}(t) = R^n_{|n-1}(t) = (\pi_0)_{|n-1}$ for every $t \in [0, t_1)$.

Consider first the case when $\sum_{i=1}^{n-1} X^{(1)}_i \ge 2$ or $\exists i \le n-1$, $X^{(1)}_i \sum_{j=1}^{n-1} Y^{(1)}_{ij} \ge 2$. Then $(t_1, Z^{(1)})$ is also the first atom of $M_{n-1}$ and by definition and using (5.24), $R^{n-1}(t_1) = R^n_{|n-1}(t_1)$.

Consider now the complementary case, when $\sum_{i=1}^{n-1} X^{(1)}_i \le 1$ and for all $i \le n-1$, $X^{(1)}_i \sum_{j=1}^{n-1} Y^{(1)}_{ij} \le 1$. This implies that at time $t_1$, there is no species (resp. gene) coalescence among the first $n-1$ species (resp. gene) blocks of $R^n(t_1-)$. Therefore the coalescence event in $R^n$ at time $t_1$ leaves the first $n-1$ blocks of $R^n(t_1-)^s$ or $R^n(t_1-)^g$ unchanged, though there may be a coalescence involving the $n$-th block (in that case, necessarily a singleton $\{n\}$) and one of the first $n-1$ blocks. So finally $R^n(t_1)_{|n-1} = R^n(t_1-)_{|n-1} = R^{n-1}(t_1)$.
In both cases we have $R^n(t_1)_{|n-1} = R^{n-1}(t_1)$, and by induction on the successive jumps of the process $R^n$ the same holds at every later jump, so that for all $t \ge 0$, $R^n(t)_{|n-1} = R^{n-1}(t)$. This shows the existence of $R$ such that for all $n$, $R_{|n} = R^n$.

From this Poissonian construction $R^n$ is a Markov process, and by definition the arrays $Z^{(i)}_{|[n]^2}$ are hierarchically exchangeable, which implies that $R^n$ is an exchangeable process. By construction $R^n(t)$ is nested for all $t$, and the only jumps of the process $R^n$ are coalescence events. According to Lemma 3, the process $R$ is a SNEC process. Because the arrays $Z$, where $(t, Z) \in M$, are the same arrays that appear in the proof of Theorem 5, the jump rates of $R^n$ are those given in Theorem 5.
It is obvious from Proposition 11 that (R s (t), t ≥ 0) is a simple coalescent process, with Kingman coefficient a s and coagulation measure ν s satisfying (2.1) which is the push-forward of ν s (dp, dµ) by the application (p, µ) → p. Let us call this univariate coalescent the (marginal) species coalescent of the SNEC process R. Now, notice that under an initial condition with a unique species block (i.e., R s is constant to the coarsest partition 1 ∞ ), the process (R g (t), t ≥ 0) also behaves as a simple coalescent process, with Kingman coefficient a g and coagulation measure ν g defined by ν s (dp, dµ)p µ(B).
We call the simple coalescent thus defined the (marginal) gene coalescent of the SNEC process R.
Equivalently, in terms of Λ-coalescents, the marginal species coalescent is a Λ s -coalescent with Λ s defined by These two marginal processes allow us to express properties of the initial bivariate SNEC process.
Consider an initial state $\varrho_0 \in \mathcal N_\infty$ with infinitely many species blocks, each containing infinitely many gene blocks. In a way analogous to the one-dimensional case, recalling that $|R^g(t)| \ge |R^s(t)|$ for all $t \ge 0$, we will say that a SNEC comes down from infinity (CDI) if for all $t > 0$ In the univariate case, the question of which Λ-coalescents come down from infinity was settled in [39], with the following necessary and sufficient condition for coming down from infinity: Note that the previous condition holds as soon as Λ has an atom at 0 (Λ({0}) is the Kingman coefficient of the process). An equivalent criterion (see [6], and [2] for a probabilistic proof) is the integrability of $1/\psi$ near $+\infty$, where We will now see that in the case of simple nested coalescents, we can give a general characterization of the different CDI properties of a SNEC process, depending only on the properties of the marginal species and marginal gene coalescents.
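For reference, the two univariate criteria mentioned above are usually stated as follows in the literature on Λ-coalescents. The notation $\lambda_{b,k}$, $\gamma_b$ below is the common one and is an assumption here; it may differ from the parametrization (in terms of coagulation measures) used elsewhere in this paper. The first criterion is stated for Λ with no atom at 1.

```latex
% Rate at which a given k-tuple among b blocks merges:
\lambda_{b,k} = \int_{[0,1]} x^{k-2} (1-x)^{b-k} \, \Lambda(dx), \qquad 2 \le k \le b,
% Rate at which the number of blocks decreases, starting from b blocks:
\gamma_b = \sum_{k=2}^{b} (k-1) \binom{b}{k} \lambda_{b,k},
% Criterion in terms of the rates:
\text{CDI} \iff \sum_{b \ge 2} \gamma_b^{-1} < \infty .
% Equivalent criterion in terms of \psi:
\psi(q) = \int_{[0,1]} \left( e^{-qx} - 1 + qx \right) x^{-2} \, \Lambda(dx),
\qquad
\text{CDI} \iff \int^{\infty} \frac{dq}{\psi(q)} < \infty .
```

As a sanity check, for the Kingman coalescent ($\Lambda = \delta_0$) one gets $\lambda_{b,2} = 1$ and $\lambda_{b,k} = 0$ for $k \ge 3$, so $\gamma_b = b(b-1)/2$ and $\sum_{b \ge 2} \gamma_b^{-1} = 2 < \infty$.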
First notice that if the marginal gene coalescent does not come down from infinity, then any species block with infinitely many gene blocks at some time $t$ keeps infinitely many gene blocks at any later time $t' \ge t$. Also, in any case the process $R^s$ has the distribution of the marginal species coalescent, so whether the number of species comes down from infinity is decided by the univariate criterion applied to the marginal species coalescent. A simple example of a SNEC process coming down from infinity is the nested Kingman coalescent ('Kingman in Kingman'), given by its marginal rates $a_s, a_g > 0$, defined so that each pair of species coalesces at rate $a_s$ independently of the others, and each pair of genes within the same species coalesces at rate $a_g$ independently of the rest. Since the marginal coalescents are precisely two Kingman coalescents, they both come down from infinity.
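The nested Kingman coalescent described above can be simulated from a finite initial state with a standard Gillespie-type scheme. The sketch below only tracks the number of gene blocks in each species block, not the blocks themselves; the function name `simulate_nested_kingman` and this count-based representation are choices made for the illustration, and starting from finitely many blocks is of course only a truncation of the infinite initial state considered in the text.

```python
import random


def simulate_nested_kingman(gene_counts, a_s, a_g, rng):
    """Simulate block counts of a nested Kingman coalescent until absorption.

    gene_counts : initial number of gene blocks in each species block
    a_s, a_g    : pairwise coalescence rates of species / genes (both > 0)
    Returns a list of (time, tuple of gene counts per species block).
    """
    counts = list(gene_counts)
    t = 0.0
    history = [(t, tuple(counts))]
    while len(counts) > 1 or counts[0] > 1:
        b = len(counts)
        species_rate = a_s * b * (b - 1) / 2
        gene_rates = [a_g * g * (g - 1) / 2 for g in counts]
        gene_total = sum(gene_rates)
        total = species_rate + gene_total
        t += rng.expovariate(total)
        if gene_total == 0 or rng.random() * total < species_rate:
            # A uniformly chosen pair of species blocks merges; gene blocks are pooled.
            i, j = sorted(rng.sample(range(b), 2))
            counts[i] += counts[j]
            del counts[j]
        else:
            # A pair of gene blocks merges inside a species chosen proportionally to its rate.
            i = rng.choices(range(b), weights=gene_rates)[0]
            counts[i] -= 1
        history.append((t, tuple(counts)))
    return history


if __name__ == "__main__":
    for time, state in simulate_nested_kingman([4, 4, 4], 1.0, 1.0, random.Random(0)):
        print(f"{time:.4f}", state)
```

Each species merger reduces the number of species blocks by one and each gene merger reduces the total number of gene blocks by one, so a run started from $b$ species blocks and $n$ gene blocks in total has exactly $(b - 1) + (n - 1)$ jumps.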
Note that the Bolthausen-Sznitman coalescent [9] (denoted U -coalescent in [35] because the measure Λ is uniform on [0, 1]) satisfies the conditions of the peculiar case ii). So for a SNEC R defined by a Kingman gene coalescent evolving within a species U -coalescent, at each positive time the number of gene blocks within a species block is infinite (if the initial state ̺ 0 has an infinite number of species blocks).
Case iii) can easily be obtained by considering a "slow" species coalescent, such as a $\delta_x$-coalescent for $x \in (0, 1)$, or any $\beta(a, b)$-coalescent with $a > 1$, $b > 0$ (that is, a Λ-coalescent with $\Lambda(dx) = \frac{x^{a-1}(1-x)^{b-1}}{B(a,b)}\,dx$).

Proof. i) Suppose both marginal coalescents come down from infinity, and consider an initial state $\varrho \in \mathcal N_\infty$ with infinitely many species blocks, each containing infinitely many gene blocks.
In addition, Π is right-continuous, so |Π(δ)| ↑ ∞ as δ → 0. Therefore, one can choose δ > 0 small enough, and then ε > 0 such that P(|Π(δ)| ≤ A) < η and e(ε) := is a marginal gene coalescent (and so CDI by assumption), which is independent of T i , and there is the following equality between processes, for u < T i+1 − T i , Finally, we have for any t > 0, and any initial ̺ ∈ N ∞ , which concludes the proof.