Pathwise construction of tree-valued Fleming-Viot processes

In a random complete and separable metric space that we call the lookdown space, we encode the genealogical distances between all individuals ever alive in a lookdown model with simultaneous multiple reproduction events. We construct families of probability measures on the lookdown space and on an extension of it that allows us to include the case with dust. From this construction, we read off the tree-valued $\Xi$-Fleming-Viot processes and deduce path properties. For instance, these processes usually have a.s. c\`adl\`ag paths with jumps at the times of large reproduction events. In the case of coming down from infinity, the construction on the lookdown space also allows us to read off a process with values in the space of measure-preserving isometry classes of compact metric measure spaces, endowed with the Gromov-Hausdorff-Prohorov metric. This process has a.s. c\`adl\`ag paths with additional jumps at the extinction times of parts of the population.


Introduction
Similarly to the neutral measure-valued Fleming-Viot process, which is a model for the evolution of the type distribution in a large neutral haploid population, the tree-valued Fleming-Viot process models the evolution of the distribution of the genealogical distances between randomly sampled individuals. The lookdown model of Donnelly and Kurtz [11,12] provides a pathwise construction of the measure-valued Fleming-Viot process and more general measure-valued processes. In this article, we give a pathwise construction of the tree-valued Fleming-Viot process from the lookdown model.
Using the construction from the lookdown model, we study path properties. The classical tree-valued Fleming-Viot process is known to have continuous paths in the space of equivalence classes of metric measure spaces, endowed with the Gromov-weak topology. This topology emphasizes the typical genealogical distances in a sample from the population. As initially suggested to the author by G. Kersting and A. Wakolbinger, we also consider a process whose state space is endowed with a stronger topology, the Gromov-Hausdorff-Prohorov topology, and we study its path regularity. This process, which we may call the tree-valued evolving Kingman coalescent, also highlights the overall structure of the population. The paths of the tree-valued evolving Kingman coalescent have jumps at the times when the shape of the whole genealogical tree changes due to the extinction of old families.
Let us first continue to consider the case that the genealogy of the population at a fixed time can be described by Kingman's coalescent. Kingman's coalescent is a classical model in population genetics. It is a Markov process $(\pi_t, t \geq 0)$ with values in the space of partitions of $\mathbb{N}$ such that each pair of blocks merges into one block independently with rate $1$. Evans [14] studies Kingman's coalescent as the completion $S$ of $\mathbb{N}$ with respect to the random ultrametric $d$ in which the distance between $i, j \in \mathbb{N}$ is twice the time until $i$ and $j$ are elements of the same block. On an event of probability $1$, the partition $\pi_\varepsilon$ is finite for all $\varepsilon > 0$, hence $(S, d)$ is compact and the asymptotic frequencies $|B|$ of the blocks $B$ of $\pi_\varepsilon$, which exist as a consequence of Kingman's correspondence, sum up to $1$. Let us now endow the random metric space $(S, d)$ with a probability measure; this illustrates one aim of the present article, in which we consider genealogies that evolve in time. On an event of probability $1$, a probability measure $m^{(\varepsilon)}$ on $(S, d)$ is given by $m^{(\varepsilon)} = \sum_{B \in \pi_\varepsilon} |B| \, \delta_{\min B}$. It follows directly from this construction that a. s., the probability measures $m^{(\varepsilon)}$ converge in the Prohorov metric to a probability measure $m$ as $\varepsilon$ decreases to zero, and that a. s., also the probability measures $n^{-1} \sum_{i=1}^n \delta_i$ on $(S, d)$ converge in the Prohorov metric to $m$ as $n$ tends to infinity. Conditionally given the coalescent, let $x_1, x_2, \ldots$ be an $m$-iid sequence in $S$. By exchangeability, the unconditional distributions of the matrices $(d(x_i, x_j))_{i,j \in \mathbb{N}}$ and $(d(i, j))_{i,j \in \mathbb{N}}$ are equal, see Proposition 4. That is, the genealogy of an $m$-iid sample from $S$ is again described by Kingman's coalescent.
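The pairwise-merging dynamics and the resulting ultrametric can be sketched in a short simulation. This is a minimal illustration restricted to $n$ initial blocks; the function and variable names are our own choices, not notation from the text.

```python
import random

def kingman_distances(n, rng=random.Random(1)):
    """Simulate Kingman's n-coalescent: each pair of blocks merges at rate 1.
    Returns d with d[i][j] twice the time until i and j lie in the same block."""
    blocks = [{i} for i in range(n)]
    t = 0.0
    d = [[0.0] * n for _ in range(n)]
    while len(blocks) > 1:
        k = len(blocks)
        # with k blocks the total merging rate is k(k-1)/2
        t += rng.expovariate(k * (k - 1) / 2)
        i, j = sorted(rng.sample(range(k), 2))   # uniformly chosen pair of blocks
        for a in blocks[i]:
            for b in blocks[j]:
                d[a][b] = d[b][a] = 2 * t        # distance = twice coalescence time
        blocks[i] |= blocks[j]
        del blocks[j]
    return d

d = kingman_distances(6)
# d is an ultrametric, so the completion embeds as the set of leaves of a tree
assert all(d[a][c] <= max(d[a][b], d[b][c]) + 1e-9
           for a in range(6) for b in range(6) for c in range(6))
```

The final assertion checks the strengthened triangle inequality, which here follows from the fact that $d(i,j)$ only depends on the time at which the lineages of $i$ and $j$ merge.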
To study genealogies that evolve in time, we use the lookdown model of Donnelly and Kurtz [12] as an underlying infinite population model. We still focus on the situation in which the genealogy at each fixed time is described by Kingman's coalescent. We now sketch the population model, which is studied in more detail and generality in Sections 4 and 5. The time axis is $\mathbb{R}_+$, and there are countably infinitely many levels which are labeled by $\mathbb{N}$. At each time, there is one particle on each level. For every pair $i < j$ of levels independently, the following reproduction events occur at the times given by a Poisson process with rate $1$: the particles on levels $j, j+1, \ldots$ increase their levels by $1$, and the particle on level $i$ has one offspring on level $j$. We use an appropriate Poisson random measure $\eta$ to encode all reproduction events in the lookdown model. The lookdown model contains a genealogical structure. We define $\mathbb{R}_+ \times \mathbb{N}$ as the set of individuals and call $(t, i) \in \mathbb{R}_+ \times \mathbb{N}$ the individual on level $i$ at time $t$. Given a suitable matrix $\rho = (\rho_{i,j})_{i,j \in \mathbb{N}}$ which encodes the genealogical distances between the individuals at time $0$, a semi-metric $\rho$ can be defined on the space $\mathbb{R}_+ \times \mathbb{N}$. If $r \in \mathbb{R}_+$ is the time of the most recent common ancestor of the particle on level $i \in \mathbb{N}$ at time $s \in \mathbb{R}_+$ and the particle on level $j \in \mathbb{N}$ at time $t \in \mathbb{R}_+$, we set $\rho((s, i), (t, j)) = s - r + t - r$. If these particles have different ancestors at time $0$, we set $\rho((s, i), (t, j)) = s + t + \rho_{k,\ell}$, where $k$ and $\ell$ denote the levels of these ancestors. Let us identify the elements of $\mathbb{R}_+ \times \mathbb{N}$ with $\rho$-distance zero and take the completion to obtain a random complete and separable metric space. We call this random metric space the lookdown space and we denote it by $(Z, \rho)$. For $t \in \mathbb{R}_+$, let $X_t$ be the closure of $\{t\} \times \mathbb{N}$ in $(Z, \rho)$. Then $(X_t, \rho \wedge t)$ is essentially the random ultrametric space $(S, d \wedge t)$.
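A finite restriction of these dynamics is easy to simulate, since the restriction of the lookdown model to the first $n$ levels is autonomous. The sketch below tracks, for each of the first $n$ levels, the time-$0$ level of its ancestor; the function name and the truncation scheme are our own, illustrative choices.

```python
import random

def lookdown(n, horizon, rng=random.Random(7)):
    """Restriction of the lookdown model to the first n levels up to `horizon`.
    anc[k] is the time-0 level of the ancestor of the particle now at level k.
    At an (i, j)-event the particle on level i places an offspring on level j,
    and the particles on levels j, j+1, ... are pushed up by one level."""
    anc = list(range(n))
    events = []
    for i in range(n):
        for j in range(i + 1, n):
            t = rng.expovariate(1.0)       # rate-1 Poisson process for the pair (i, j)
            while t < horizon:
                events.append((t, i, j))
                t += rng.expovariate(1.0)
    for _, i, j in sorted(events):
        anc = anc[:j] + [anc[i]] + anc[j:]  # offspring of level i appears on level j
        anc = anc[:n]                       # particle pushed above level n is discarded
    return anc

print(lookdown(5, 2.0))
```

Note that level $0$ is never displaced (every event has $j \geq 1$), which reflects the fact that lower levels persist longer; this is the mechanism behind the exchangeability and consistency properties used later in the text.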
We define probability measures on $(Z, \rho)$ by $m^n_t = n^{-1} \sum_{i=1}^n \delta_{(t,i)}$ for $n \in \mathbb{N}$ and $t \in \mathbb{R}_+$. The Poisson random measure $\eta$ which encodes the reproduction events in the lookdown model is realized a. s. in such a way that, under an appropriate assumption on $\rho$, there exists a family $(m_t, t \in \mathbb{R}_+)$ of probability measures on $(Z, \rho)$ with $\lim_{n \to \infty} \sup_{t \in [0,T]} d^Z_{\mathrm P}(m^n_t, m_t) = 0$ for all $T \in \mathbb{R}_+$, where $d^Z_{\mathrm P}$ denotes the Prohorov metric on $(Z, \rho)$. This is part of Theorem 1 in Section 7, which is the central result in this article. We show that the map $t \mapsto m_t$ is a. s. continuous in the weak topology on the space of probability measures on $(Z, \rho)$. Moreover, we show that the map $t \mapsto X_t$ is càdlàg in the Hausdorff distance on the space of subsets of $(Z, \rho)$, and that jumps occur at the times at which old families die out. A formal definition of these times is given in Section 7.
From this construction, we read off stochastic processes. Let us first recall some state spaces; details and references can be found in Section 2. A metric measure space is a complete and separable metric space endowed with a probability measure on the Borel sigma-algebra. Metric measure spaces $(X, r, m)$ and $(X', r', m')$ are called equivalent if $\operatorname{supp} m$ and $\operatorname{supp} m'$ are measure-preserving isometric. There exists a space $M'$ of equivalence classes of metric measure spaces, and a space $M$ of measure-preserving isometry classes of compact metric measure spaces. The Gromov-Prohorov distance $d_{\mathrm{GP}}$ is a complete and separable metric on $M'$. The Gromov-Hausdorff-Prohorov distance $d_{\mathrm{GHP}}$ is greater than or equal to the Gromov-Prohorov distance; it is a complete and separable metric on $M$.
By taking equivalence classes or measure-preserving isometry classes, not only can random metric measure spaces be considered as random variables with values in Polish state spaces, but processes whose states do not display the labels of the levels can also be obtained from the lookdown model. The $M'$-valued process $(\chi_t, t \in \mathbb{R}_+)$, where $\chi_t$ denotes the equivalence class of $(Z, \rho, m_t)$, is a tree-valued Fleming-Viot process when the initial state, which is given by the matrix $\rho$, can be considered as a tree (Section 8). In the article of Greven, Pfaffelhuber, and Winter [20], the tree-valued Fleming-Viot process is studied by a different approach: tree-valued processes are read off from finite population models and shown to converge in distribution to the solution of a well-posed martingale problem as population size tends to infinity. In Remark 2.20 of [20], a. s. convergence at fixed times of tree-valued processes read off from restrictions of the lookdown model is considered. In the present article, we construct the tree-valued Fleming-Viot process pathwise by studying limits of probability measures on the lookdown space (Theorems 1 and 2). The lookdown space and the probability measures on it are determined by a realization of the Poisson random measure $\eta$ and the matrix $\rho$. To show measurability, we also consider the process $((\rho((t, i), (t, j)))_{i,j \in \mathbb{N}}, t \in \mathbb{R}_+)$ and we apply Proposition 3 from Section 3.
In Section 9, we study the process $(\mathcal{X}_t, t \in \mathbb{R}_+)$, where $\mathcal{X}_t$ denotes the measure-preserving isometry class of the compact metric measure space $(X_t, \rho, m_t)$. We call this process the $M$-valued evolving Kingman coalescent. While the tree-valued Fleming-Viot process has a. s. continuous paths in $(M', d_{\mathrm{GP}})$, the $M$-valued evolving Kingman coalescent has a. s. càdlàg paths in $(M, d_{\mathrm{GHP}})$ with jumps when old families die out (Theorem 6). As also $\operatorname{supp} m_t = X_t$ for all $t \in \mathbb{R}_+$ a. s., there exists a function that maps, on an event of probability $1$, the realizations of the tree-valued Fleming-Viot process to the realizations of the $M$-valued evolving Kingman coalescent (Corollary 3).
Kingman's coalescent is generalized by the class of $\Lambda$-coalescents, which was introduced by Pitman [34] and Sagitov [36]. The distribution of such a coalescent is parametrized by a finite measure $\Lambda$ on $[0, 1]$. In a $\Lambda$-coalescent, multiple blocks may coalesce into one block. Analogously to the Kingman case, a random ultrametric $d$ on $\mathbb{N}$ can be associated with a $\Lambda$-coalescent; we denote the metric completion of $(\mathbb{N}, d)$ again by $(S, d)$. For some $\Lambda$, the $\Lambda$-coalescent contains dust, that is, a. s. each block does not coalesce with any other block for a positive time. In the case with dust, a probability measure $m$ on $(S, d)$ cannot be constructed as in the Kingman case sketched at the beginning of the introduction. The simplest example of such a coalescent is the so-called star-shaped coalescent, in which all blocks coalesce into a single block after an exponentially distributed time. Here a. s., the probability measures $n^{-1} \sum_{i=1}^n \delta_i$ on $(S, d)$ converge vaguely to the zero measure as $n$ tends to infinity.
A realization of a $\Lambda$-coalescent can be represented as a tree, see Figure 1. First we assume that the coalescent contains dust. Let us consider the subtree spanned by the branchpoints $z(1), z(2), \ldots$ of the coalescent tree, which are defined in Figure 1. Let $(S', d)$ be the completion of $\{z(i) : i \in \mathbb{N}\}$ with respect to the metric defined by the edge lengths. We define a partition $\Pi$ of $\mathbb{N}$ such that $i, j \in \mathbb{N}$ are in the same block if and only if $z(i) = z(j)$, that is, the external branch that ends in leaf $i$ and the external branch that ends in leaf $j$ begin in the same branchpoint. It can be shown that the partition $\Pi$ has a. s. no singleton blocks, see Lemma 7. On an event of probability $1$, we can define a probability measure $m = \sum_{B \in \Pi} |B| \, \delta_{z(\min B)}$ on $S'$. The asymptotic frequencies $|B|$ of the blocks $B$ of $\Pi$ exist and sum up to $1$ a. s. by exchangeability. In this way, we obtain a metric measure space that describes the subtree spanned by the branchpoints of the coalescent tree. To consider also the external branches, we define a probability measure $\mu = \sum_{B \in \Pi} |B| \, \delta_{(z(\min B), u_{\min B})}$ on $S' \times \mathbb{R}_+$, where $u_i$ is the length of the external branch of the coalescent tree that ends in leaf $i \in \mathbb{N}$. In particular, the projection of $\mu$ to $\mathbb{R}_+$ is the empirical measure of the lengths of the external branches. The triple $(S', d, \mu)$ is a random marked metric measure space; this concept is recalled in Section 2. Conditionally given the coalescent, let $(x_i, u_i)_{i \in \mathbb{N}}$ be a $\mu$-iid sequence. Then the unconditional distributions of the random matrices $(u_i + d(x_i, x_j) + u_j)_{i,j \in \mathbb{N}}$ and $(u_i + d(z(i), z(j)) + u_j)_{i,j \in \mathbb{N}}$ are equal (Proposition 4). That is, if we sample iid branchpoints $x_1, x_2, \ldots$ from the coalescent tree according to the projection of $\mu$ to $S'$, and consider pairwise distinct leaves $\ell_1, \ell_2, \ldots$, where $\ell_i$ is a leaf in which an external branch that begins in $x_i$ ends, then the genealogy of these leaves is described by a $\Lambda$-coalescent.
The probability measure $m$ is a. s. the weak limit of the probability measures $m_n = n^{-1} \sum_{i=1}^n \delta_{z(i)}$ on $S'$ as $n$ tends to infinity. The probability measure $m_n$ can be obtained from the uniform measure on the first $n$ leaves by moving the mass of each leaf to the nearest branchpoint in the coalescent tree. In the Kingman case, and more generally for $\Lambda$-coalescents without dust, the external branches of the coalescent tree all have length zero a. s., hence $z(i)$ is the leaf labeled by $i$ for all $i \in \mathbb{N}$, and $S'$ coincides with the space of leaves $S$. In this sense, the constructions in the cases with and without dust are consistent. A probability measure on $S \times \mathbb{R}_+$ can be defined by $\mu = m \otimes \delta_0$ in the case without dust. We give a similar construction for genealogies that evolve in time.

Figure 1: A realization of a $\Lambda$-coalescent with dust can be represented as a tree with edge lengths. The leaves are labeled by $\mathbb{N}$. The block of the coalescent at time $t$ that contains $i \in \mathbb{N}$ is given by the labels of the leaves of the subtree that contains the leaf labeled by $i$ and that is rooted in depth $t$ below the top. In the illustration, we see the restriction to the subtree induced by the first 7 leaves. On the right-hand side, the states of the coalescent process, restricted to partitions of $\{1, \ldots, 7\}$, are given. Time goes downwards. Starting from any leaf $i \in \mathbb{N}$, the block $\{i\}$ does not coalesce with any other block for a positive time $u_i$. This time corresponds to the length of the external branch that ends in leaf $i$. We denote the branchpoint in which it starts by $z(i)$. In the illustration, the external branches are drawn in red, and the subtree spanned by the branchpoints is drawn in black. In this example, there are leaves $j > 7$ with $z(6) = z(j)$.
The class of Λ-coalescents is generalized by the class of Ξ-coalescents which was introduced by Schweinsberg [37] and Möhle and Sagitov [31]. In [37], such a coalescent is constructed from a Poisson random measure that is characterized by a measure Ξ on the infinite simplex. This generalizes the construction of the Λ-coalescent from a Poisson random measure in [34]. All these coalescents can be obtained as the genealogies at fixed times in appropriate lookdown models, see Donnelly and Kurtz [12] and Birkner et al. [6]. The completion S of the random ultrametric space of the leaves of the Ξ-coalescent tree can be constructed analogously to the Kingman case. This space is totally bounded, hence compact, if and only if the Ξ-coalescent comes down from infinity, that is, it has finitely many blocks at each positive time.
In the present article, we work with a lookdown model in which particles may have multiple offspring, even simultaneously, according to a reproduction mechanism underlying the flow of partitions of Foucart [18]. Following [6,18], we use essentially the Poisson random measure from [37] to encode the reproduction events. The lookdown model contains dust if and only if there are at most finitely many reproduction events in bounded time intervals in which particles on a given level reproduce. To include the case with dust, we define a random semi-metric $\rho$ on the disjoint union $(\mathbb{R}_+ \times \mathbb{N}) \sqcup \mathbb{N}$ of the space $\mathbb{R}_+ \times \mathbb{N}$ of all individuals at all times and the set $\mathbb{N}$ which represents ancestors of the individuals at time $0$. For instance, if the genealogy at time $0$ is given by a $\Xi$-coalescent with dust, these ancestors are the branchpoints of the coalescent tree at time $0$. In the case without dust, we can assume that these ancestors are identified with the individuals at time $0$. In any case, we identify the elements with $\rho$-distance zero and take the metric completion to obtain a random metric space which we may again call the lookdown space and which we denote by $(Z, \rho)$.
On an event of probability $1$, there exists for each $t \in \mathbb{R}_+$ a probability measure $\mu_t$ on $Z \times \mathbb{R}_+$ that can be defined similarly to the probability measure $\mu$ on $S' \times \mathbb{R}_+$ (Theorem 1). We generalize the tree-valued Fleming-Viot process to a process with values in the space $M$ of equivalence classes of marked metric measure spaces, endowed with the marked Gromov-Prohorov metric. This state space is also recalled in Section 2. We call the process $(\chi_t, t \in \mathbb{R}_+)$, where $\chi_t$ denotes the equivalence class of the marked metric measure space $(Z, \rho, \mu_t)$, the $M$-valued $\Xi$-Fleming-Viot process. In the case without dust, we can analogously define an $M'$-valued $\Xi$-Fleming-Viot process. These processes a. s. jump at the times of multiple reproduction events. We generalize the $M$-valued evolving Kingman coalescent to the $M$-valued evolving $\Xi$-coalescent for those $\Xi$ for which the $\Xi$-coalescent comes down from infinity.
The construction on the lookdown space allows us to read off path properties of these processes. We analyze atomicity (Theorem 1) and jump times (Theorems 1, 2, 6, and Proposition 8). Furthermore, we show convergence to equilibrium (Theorems 3 and 6). In Theorems 4 and 5, we show that the $M$- and the $M'$-valued $\Xi$-Fleming-Viot processes solve well-posed martingale problems which generalize the martingale problem from [20]. The construction of the probability measures on the lookdown space relies on exchangeability properties which we study in Section 5. We use these exchangeability results in particular in Section 10, where we adapt techniques of Donnelly and Kurtz [12] to obtain convergence results on families of partitions (Section 6) and on probability measures on the lookdown space (Section 7). In the case without dust, we use the flow of partitions $(\Pi_{s,t}, 0 \leq s \leq t)$, where for each $t \in (0, \infty)$, the partition-valued process $(\Pi_{(t-s)-,t}, s \in [0, t))$ is the $\Xi$-coalescent that describes the genealogy of the individuals at time $t$ in the lookdown model back until time $0$. For the construction in the case with dust, we introduce a family $(\Pi_t, t \in \mathbb{R}_+)$ of partitions where for each $t \in \mathbb{R}_+$, the partition $\Pi_t$ can be defined from the genealogy of the individuals at time $t$ similarly to the partition $\Pi$.
In the remainder of the introduction, we discuss more relations to the literature. Pfaffelhuber and Wakolbinger [32] study the lookdown graph in two-sided time. This object, which captures the genealogical structure in the lookdown model, is obtained from the space $\mathbb{R} \times \mathbb{N}$ of all individuals at all times, and from ancestral lineages. The lookdown graph can be interpreted as a random semi-metric space; it is closely related to the lookdown space. A lookdown construction of the measure-valued $\Xi$-Fleming-Viot process is given by Birkner et al. [6]. Another way to construct the measure-valued $\Lambda$-Fleming-Viot process is through the flow of bridges of Bertoin and Le Gall [4]. For flows of partitions, we refer to Foucart [18] and Labbé [26]. In [26], the asymptotic block frequencies in the flow of partitions are considered to study connections between flows of bridges, flows of partitions, and the lookdown model.
Evolving genealogies have also been described by functionals of evolving coalescents. The tree length of the evolving Kingman coalescent is studied as a stochastic process with jumps by Pfaffelhuber, Wakolbinger, and Weisshaupt [33]. This process and its jumps are further studied in the lookdown model by Dahmer, Knobloch, and Wakolbinger [7]. The external length in a certain class of evolving coalescents without dust is studied by Kersting, Schweinsberg, and Wakolbinger [24]. Here the external length is compensated and rescaled, and also time is rescaled, as population size tends to infinity.
Greven, Pfaffelhuber, and Winter [19] show that the metric measure spaces associated with finite $\Lambda$-coalescents converge in the Gromov-weak topology if and only if there is no dust. Biehler and Pfaffelhuber [5] show that the $\Lambda$-coalescent measure tree is compact if and only if the $\Lambda$-coalescent comes down from infinity. We recall that it is an open problem to give a necessary and sufficient condition in terms of the measure $\Xi$ for the $\Xi$-coalescent to come down from infinity, see [22,28,37]. Marked metric measure spaces are studied by Depperschmidt, Greven, and Pfaffelhuber [8] and applied by the same authors to construct the tree-valued Fleming-Viot process with mutation and selection [9,10]; there the marks encode types. In the present article, we consider only the neutral situation and we do not work with types; we use marks to decompose genealogical trees. In the context of measure-valued spatial $\Lambda$-Fleming-Viot processes with dust, a skeleton structure appears in the work of Véber and Wakolbinger [38]. By methods which differ from those in the present article, Depperschmidt, Greven, and Pfaffelhuber [10] show in particular that a. s., the states of the tree-valued Fleming-Viot process are non-atomic. We also mention the work of Athreya, Löhr, and Winter [2], where in particular the Gromov-weak topology and the Gromov-Hausdorff-Prohorov topology are compared, and of Kliem and Löhr [25], where marked metric measure spaces are further studied.
Finally, let us relate our construction of the marked metric measure space associated with a $\Lambda$-coalescent to the theory of continuum random trees of Aldous [1]. Consider a family $(R(k), k \in \mathbb{N})$, where $R(k)$ is a random graph-theoretic tree with edge lengths that has $k$ leaves. For $k \in \mathbb{N}$, let $(L^k_1, \ldots, L^k_k)$ be a uniform reordering of the leaves of $R(k)$. The family is called leaf-tight in [1] if $\min_{2 \leq j \leq k} d(L^k_1, L^k_j)$ converges to zero in probability as $k \to \infty$, where $d$ denotes the metric given by the edge lengths of $R(k)$. We disregard the additional assumption in [1] that the $R(k)$ are so-called proper $k$-trees. The leaf-tight property holds for a consistent family $(R(k), k \in \mathbb{N})$ of $\Lambda$-coalescent trees in the case without dust. The metric measure space associated with the $\Lambda$-coalescent can be seen as a generalized continuum random tree such that $R(k)$ is distributed as the subtree induced by $k$ iid sampled leaves. Now assume that $(R(k), k \in \mathbb{N})$ is a consistent family of $\Lambda$-coalescent trees with dust. For $k \in \mathbb{N}$ and $j \leq k$, let $I^k_j$ be the nearest branchpoint to $L^k_j$ in $R(k)$. It can be seen as in Lemma 7 in Section 6 that $\min_{2 \leq j \leq k} d(I^k_1, I^k_j)$ converges to zero in probability as $k \to \infty$. This property is weaker than the leaf-tight property in [1]. Let $R'(k)$ be the tree obtained by pruning the leaves and external branches of $R(k)$. The marked metric measure space associated with the $\Lambda$-coalescent can be seen as a generalized continuum random tree associated with the family $(R'(k), k \in \mathbb{N})$ to which the leaves and external branches which were pruned away are reattached as marks.

Some notation

We denote the set of partitions of $\mathbb{N}$ by $\mathcal{P}$ and, for $n \in \mathbb{N}$, the space of partitions of $[[n]]$ by $\mathcal{P}_n$. We define $\gamma_n : \mathcal{P} \to \mathcal{P}_n$, $\pi \mapsto \{B \cap [[n]] : B \in \pi\} \setminus \{\emptyset\}$. We endow $\mathcal{P}$ with the metric $d(\pi, \pi') = 2^{-\inf\{k \in \mathbb{N} : \gamma_k(\pi) \neq \gamma_k(\pi')\}}$ for $\pi, \pi' \in \mathcal{P}$.

Moreover, we denote by $\mathcal{P}^n$ the space of those partitions in $\mathcal{P}$ in which the first $n$ integers are not all in different blocks, that is, $\mathcal{P}^n = \{\pi \in \mathcal{P} : \gamma_n(\pi) \neq \{\{1\}, \ldots, \{n\}\}\}$. A larger subspace of $\mathcal{P}$ is the space of those partitions in which at least one of the first $n$ integers belongs to a non-singleton block; we denote this space by $\hat{\mathcal{P}}^n = \{\pi \in \mathcal{P} : \pi(i) \neq \{i\} \text{ for some } i \in [[n]]\}$. Furthermore, we write $\mathcal{P}'_n = \mathcal{P}_n \setminus \{\{\{1\}, \ldots, \{n\}\}\}$ and we set $\mathcal{P}'_0 = \emptyset$. A partition $\pi \in \mathcal{P}$ induces an equivalence relation $\sim_\pi$ on $\mathbb{N}$ such that $i \sim_\pi j$ if and only if $i$ and $j$ are in the same block of $\pi$. We denote the block of $\pi \in \mathcal{P}$ that contains $i \in \mathbb{N}$ by $\pi(i)$, and we write $\pi(0) = \{i \in \mathbb{N} : \{i\} \in \pi\}$ for the union of all singleton blocks of $\pi$. We denote the set of integers that are a minimal element of a block of $\pi$ by $M(\pi) = \{\min B : B \in \pi\}$. For $\pi \in \mathcal{P} \cup \bigcup_{n \in \mathbb{N}} \mathcal{P}_n$ and $i \in [[\#\pi]]$, we denote the $i$-th block of $\pi$ by $B(\pi, i)$, where the blocks are ordered increasingly according to their smallest elements. For $i \in \mathbb{N}$ with $i > \#\pi$, we set $B(\pi, i) = \emptyset$. The coagulation of partitions $\pi, \pi' \in \mathcal{P}$ is the partition $\operatorname{Coag}(\pi, \pi') \in \mathcal{P}$ defined by $B(\operatorname{Coag}(\pi, \pi'), i) = \bigcup_{j \in B(\pi', i)} B(\pi, j)$. We denote by $|B|$ the asymptotic frequency $|B| = \lim_{n \to \infty} n^{-1} \#(B \cap [[n]])$ of a set $B \subset \mathbb{N}$, provided this limit exists. We denote the relative frequency of $B$ among the first $n$ integers by $|B|_n = n^{-1} \#(B \cap [[n]])$. For a partition $\pi \in \mathcal{P}$, we write $|\pi|_1 = \sum_{B \in \pi} |B|$, provided that the asymptotic frequencies of the blocks of $\pi$ exist. The partition $\pi$ is said to have proper frequencies if $|\pi|_1 = 1$.
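The restriction $\gamma_n$, the ordering of blocks by their smallest elements, and the coagulation operator can be made concrete in a few lines. Blocks are represented as Python sets; the helper names are ours, not notation from the text.

```python
def gamma(pi, n):
    """Restriction gamma_n: intersect each block with {1, ..., n}, drop empty sets."""
    s = set(range(1, n + 1))
    return [b & s for b in pi if b & s]

def blocks_sorted(pi):
    """Blocks B(pi, 1), B(pi, 2), ... ordered increasingly by smallest elements."""
    return sorted(pi, key=min)

def coag(pi, pi2):
    """Coagulation: the i-th block of Coag(pi, pi2) is the union of the blocks
    B(pi, j) over j in B(pi2, i)."""
    b = blocks_sorted(pi)
    return [set().union(*(b[j - 1] for j in blk)) for blk in blocks_sorted(pi2)]

pi = [{1, 3}, {2}, {4, 5}]    # B(pi,1) = {1,3}, B(pi,2) = {2}, B(pi,3) = {4,5}
pi2 = [{1, 2}, {3}]           # merge the first two blocks of pi, keep the third
assert coag(pi, pi2) == [{1, 2, 3}, {4, 5}]
assert gamma(pi, 2) == [{1}, {2}]
```

Note that `pi2` plays the role of a partition of the block indices, exactly as in the definition of $\operatorname{Coag}$.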
For sets $X$ and $Y$, we denote by $\mathrm{pr}_Y$ the projection from $X \times Y$ to $Y$. For a metric space $(X, r)$, we write $U^X_\varepsilon(x) = \{y \in X : r(x, y) < \varepsilon\}$ and $B^X_\varepsilon(x) = \{y \in X : r(x, y) \leq \varepsilon\}$. For a measurable space $E$, we denote by $M_b(E)$ the space of bounded measurable functions $E \to \mathbb{R}$.
We say that a property which may depend on $t \in \mathbb{R}_+$ holds "for all $t \in \mathbb{R}_+$ a. s." if there exists an event of probability $1$ that does not depend on $t$ on which the property holds for all $t$ simultaneously. For the weaker statement, where an event of probability $1$ that depends on $t$ is sufficient, we write that the property holds "a. s. for all $t \in \mathbb{R}_+$".

Some state spaces
We review some parts of the theory of metric measure spaces and marked metric measure spaces for which we refer to Greven, Pfaffelhuber, and Winter [19], Gromov [21], Depperschmidt, Greven, and Pfaffelhuber [8], Evans and Winter [17], and Miermont [30]. In Section 2.2, we interpret marked distance matrices as decompositions of distance matrices.

Metric measure spaces and marked metric measure spaces
A metric measure space (mm-space) is a triple $(X, r, m)$ which consists of a complete and separable metric space $(X, r)$ and a probability measure $m$ on the Borel sigma-algebra on $(X, r)$. Two mm-spaces are defined to be equivalent if there exists a measure-preserving isometry between the supports of the two measures. An $\mathbb{R}_+$-marked metric measure space (we always say marked metric measure space, or mmm-space) is a triple $(X, r, \mu)$ which consists of a complete and separable metric space $(X, r)$ and a probability measure $\mu$ on the Borel sigma-algebra on $X \times \mathbb{R}_+$. We endow $X \times \mathbb{R}_+$ with the $\infty$-product metric $d_{X \times \mathbb{R}_+}$ given by $d_{X \times \mathbb{R}_+}((x, u), (x', u')) = r(x, x') \vee |u - u'|$ for $x, x' \in X$ and $u, u' \in \mathbb{R}_+$. Two marked metric measure spaces $(X, r, \mu)$ and $(X', r', \mu')$ are defined to be equivalent if there exists an isometry $\varphi$ between the supports of the measures $\mathrm{pr}_X(\mu)$ and $\mathrm{pr}_{X'}(\mu')$ such that the map $\tilde{\varphi} : \operatorname{supp} \mu \to X' \times \mathbb{R}_+$, $(x, u) \mapsto (\varphi(x), u)$ satisfies $\tilde{\varphi}(\mu) = \mu'$.

The Prohorov metric on the space of probability measures on a separable metric space $(Z, d)$ is given by
$d^Z_{\mathrm P}(m, m') = \inf_\nu \inf\{\varepsilon > 0 : \nu\{(x, y) : d(x, y) \geq \varepsilon\} \leq \varepsilon\}, \quad (2)$
where the outer infimum is taken over all couplings $\nu$ of the probability measures $m$ and $m'$ on $Z$, see [13, Theorem 3.1.2]. We will often deduce from a coupling an upper bound for the Prohorov metric. Two metric measure spaces $(X, r, m)$ and $(X', r', m')$ can be compared by the Gromov-Prohorov distance $d_{\mathrm{GP}}((X, r, m), (X', r', m')) = \inf\{d^Z_{\mathrm P}(\iota(m), \iota'(m'))\}$.
Here and in the next display, the infimum is taken over all complete and separable metric spaces $Z$ and isometric embeddings $\iota : X \to Z$ and $\iota' : X' \to Z$. The marked Gromov-Prohorov distance between two marked metric measure spaces $(X, r, \mu)$ and $(X', r', \mu')$ is defined by $d_{\mathrm{MGP}}((X, r, \mu), (X', r', \mu')) = \inf\{d^{Z \times \mathbb{R}_+}_{\mathrm P}(\tilde{\iota}(\mu), \tilde{\iota}'(\mu'))\}$, where we set $\tilde{\iota} = (\iota, \mathrm{id})$, $\tilde{\iota}' = (\iota', \mathrm{id})$ with the identity map $\mathrm{id} : \mathbb{R}_+ \to \mathbb{R}_+$, and where $Z \times \mathbb{R}_+$ is endowed with the $\infty$-product metric. The distortion of a relation $R \subset X \times X'$ between two metric spaces $(X, r)$ and $(X', r')$ is defined by $\operatorname{dis} R = \sup\{|r(x, y) - r'(x', y')| : (x, x'), (y, y') \in R\}$. In the proof of Proposition 3 in Section 3, we use also the following characterization of the marked Gromov-Prohorov distance.

Proposition 1. Let $(X, r, \mu)$ and $(X', r', \mu')$ be marked metric measure spaces. Then $d_{\mathrm{MGP}}((X, r, \mu), (X', r', \mu'))$ is the infimum of all $\varepsilon > 0$ such that there exist a relation $R \subset X \times X'$ and a coupling $\nu$ of $\mu$ and $\mu'$ with $\frac{1}{2} \operatorname{dis} R \leq \varepsilon$ and $\nu(\hat{R}) \geq 1 - \varepsilon$, where $\hat{R} = \{((x, u), (x', u')) : (x, x') \in R,\ |u - u'| \leq \varepsilon\}$.

Proof. This can be seen as an adaptation of Proposition 6 in [30]. Here we sketch the proof of the upper bound for $d_{\mathrm{MGP}}((X, r, \mu), (X', r', \mu'))$. Let $\varepsilon > 0$ and assume $R$, $\hat{R}$, and $\nu$ with $\frac{1}{2} \operatorname{dis} R \leq \varepsilon$ and $\nu(\hat{R}) \geq 1 - \varepsilon$ are given as in the assertion. A metric $d$ on the disjoint union $Z = X \sqcup X'$ can be defined by $d(x, x') = \inf\{r(x, y) + \varepsilon + r'(y', x') : (y, y') \in R\}$ for $x \in X$ and $x' \in X'$, together with $d = r$ on $X$ and $d = r'$ on $X'$, and we endow $Z \times \mathbb{R}_+$ with the $\infty$-product metric. Let $\varphi : X \to Z$ and $\varphi' : X' \to Z$ be the canonical embeddings. Moreover, let $\tilde{\varphi}(x, u) = (\varphi(x), u)$ and $\tilde{\varphi}'(x', u) = (\varphi'(x'), u)$ for $x \in X$, $x' \in X'$, and $u \in \mathbb{R}_+$. Then the coupling $\nu$ induces a coupling $\tilde{\nu}$ of $\tilde{\varphi}(\mu)$ and $\tilde{\varphi}'(\mu')$ that assigns mass at least $1 - \varepsilon$ to pairs of points at distance at most $\varepsilon$ in $Z \times \mathbb{R}_+$. The coupling characterization (2) of the Prohorov metric and the definition of the marked Gromov-Prohorov distance imply $d_{\mathrm{MGP}}((X, r, \mu), (X', r', \mu')) \leq \varepsilon$.
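The way a coupling yields an upper bound for the Prohorov metric, as in the cited coupling characterization, can be illustrated for discrete measures on the real line. The function below is our own construction (it uses a strict-inequality variant of the characterization); for equally weighted pairs, the infimum is attained on the finite candidate grid that the function scans.

```python
def prohorov_bound(pairs):
    """For a coupling nu given by n equally weighted pairs (x, y) of reals,
    return the smallest eps with nu({(x, y): |x - y| > eps}) <= eps.
    Every coupling of two measures bounds their Prohorov distance this way."""
    n = len(pairs)
    gaps = [abs(x - y) for x, y in pairs]
    # the optimal eps is either a gap value or a multiple of 1/n
    for eps in sorted(set(gaps) | {k / n for k in range(n + 1)}):
        if sum(g > eps for g in gaps) / n <= eps:
            return eps

# marginals: delta_0 and (3/4) delta_0 + (1/4) delta_1, coupled monotonically
assert prohorov_bound([(0, 0), (0, 0), (0, 0), (0, 1)]) == 0.25
```

The monotone coupling in the example is optimal here: the Prohorov distance between $\delta_0$ and $\frac34\delta_0 + \frac14\delta_1$ is indeed $\frac14$.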

Distance matrices
Denote by $D$ the cone of all semi-metrics on $\mathbb{N}$, which are written as matrices with index set $\mathbb{N}^2$. Define $\hat{D} = D \times \mathbb{R}^{\mathbb{N}}_+$. For $n \in \mathbb{N}$, let $D_n$ be the set of all semi-metrics on $[[n]]$, written as matrices with index set $[[n]]^2$, and let $\hat{D}_n = D_n \times \mathbb{R}^n_+ = \{(r, u) : r \in D_n, u \in \mathbb{R}^n_+\}$.
We call the elements of $D$ and $D_n$ distance matrices, and the elements of $\hat{D}$ and $\hat{D}_n$ marked distance matrices. We define the restrictions $\gamma_n : D \to D_n$, $\rho \mapsto (\rho_{i,j})_{i,j \in [[n]]}$ and $\hat{\gamma}_n : \hat{D} \to \hat{D}_n$, $(\rho, u) \mapsto (\gamma_n(\rho), (u_1, \ldots, u_n))$. We endow $D$ with the complete metric given by $d(\rho, \rho') = \sum_{n=1}^\infty 2^{-n} (|\gamma_n(\rho) - \gamma_n(\rho')| \wedge 1)$ for $\rho, \rho' \in D$. In the present article, we will use marked distance matrices to decompose distance matrices. This motivates the definition of the map $\hat{D} \to D$ that sends a marked distance matrix $(\rho, u)$ to the distance matrix $((u_i + \rho_{i,j} + u_j) 1_{\{i \neq j\}})_{i,j \in \mathbb{N}}$. For a metric measure space $(X, \rho, m)$, we write $R^{(X,\rho)} : X^{\mathbb{N}} \to D$, $(x_i)_{i \in \mathbb{N}} \mapsto (\rho(x_i, x_j))_{i,j \in \mathbb{N}}$. The distance matrix distribution of $(X, \rho, m)$ is defined as $R^{(X,\rho)}(m^{\otimes \mathbb{N}})$, the pushforward measure on $D$ under $R^{(X,\rho)}$ of the distribution of an infinite $m$-iid sequence in $X$. Similarly, for a marked metric measure space $(X, r, \mu)$, we write $R^{(X,r)} : (X \times \mathbb{R}_+)^{\mathbb{N}} \to \hat{D}$, $((x_i, u_i))_{i \in \mathbb{N}} \mapsto ((r(x_i, x_j))_{i,j \in \mathbb{N}}, (u_i)_{i \in \mathbb{N}})$. The marked distance matrix distribution of $(X, r, \mu)$ on $\hat{D}$ is defined as $R^{(X,r)}(\mu^{\otimes \mathbb{N}})$. Equivalent (marked) metric measure spaces have the same (marked) distance matrix distributions. By Gromov's reconstruction theorem, two (marked) metric measure spaces with the same (marked) distance matrix distribution are equivalent. Proposition 5 in Section 3 implies Gromov's reconstruction theorem.
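The metric on distance matrices can be evaluated numerically once a truncation depth is fixed. In this sketch, distance matrices are represented as functions $\rho(i,j)$, the series is truncated, and the maximum norm is used for $|\cdot|$ on finite matrices; these representational choices are our own assumptions.

```python
def dist_matrix_metric(rho1, rho2, depth=20):
    """d(rho1, rho2) = sum_{n>=1} 2^{-n} (|gamma_n(rho1) - gamma_n(rho2)| /\ 1)
    for distance matrices given as functions rho(i, j) with i, j >= 1,
    truncated at `depth`; |.| is taken as the maximum norm (an assumption)."""
    total = 0.0
    for n in range(1, depth + 1):
        diff = max(abs(rho1(i, j) - rho2(i, j))
                   for i in range(1, n + 1) for j in range(1, n + 1))
        total += 2.0 ** -n * min(diff, 1.0)
    return total

# two ultrametrics on N: all off-diagonal distances 1 versus all distances 2
r1 = lambda i, j: 0.0 if i == j else 1.0
r2 = lambda i, j: 0.0 if i == j else 2.0
assert 0.499 < dist_matrix_metric(r1, r2) < 0.5   # close to sum_{n>=2} 2^{-n} = 1/2
```

The $n = 1$ term vanishes because $1 \times 1$ distance matrices are always zero, so the two matrices above differ only from depth $2$ onwards.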

The spaces of equivalence classes of (marked) metric measure spaces
We now define the space of equivalence classes of metric measure spaces. For set-theoretical reasons, let us first define the set $\{(X, r, m) : (X, r, m)$ metric measure space with $X \subset \mathbb{R}\}$. For a metric measure space $(X, r, m)$, we denote by $[[X, r, m]]$ the intersection of this set with the equivalence class of $(X, r, m)$, and we define the set $M' = \{[[X, r, m]] : (X, r, m)$ metric measure space$\}$. Each $\chi \in M'$ is characterized by its distance matrix distribution, denoted by $\nu_\chi$, which is well-defined through representants. A sequence $(\chi_n, n \in \mathbb{N}) \subset M'$ converges to $\chi \in M'$ in the Gromov-weak topology if and only if the associated distance matrix distributions $(\nu_{\chi_n}, n \in \mathbb{N})$ converge weakly to $\nu_\chi$. It is shown in [19] that the Gromov-Prohorov distance $d_{\mathrm{GP}}$ defines a complete and separable metric on $M'$ which metrizes the Gromov-weak topology on $M'$. We always endow $M'$ with $d_{\mathrm{GP}}$. Similarly, we define the set $\{(X, r, \mu) : (X, r, \mu)$ marked metric measure space with $X \subset \mathbb{R}\}$. For a marked metric measure space $(X, r, \mu)$, we denote by $[[X, r, \mu]]$ the intersection of this set with the equivalence class of $(X, r, \mu)$, and we define the set $M = \{[[X, r, \mu]] : (X, r, \mu)$ marked metric measure space$\}$. Also each element $\chi \in M$ is characterized by its marked distance matrix distribution, again denoted by $\nu_\chi$. A sequence $(\chi_n, n \in \mathbb{N}) \subset M$ converges to $\chi \in M$ in the marked Gromov-weak topology if and only if the associated marked distance matrix distributions $(\nu_{\chi_n}, n \in \mathbb{N})$ converge weakly to $\nu_\chi$. It is shown in [8] that the marked Gromov-Prohorov distance $d_{\mathrm{MGP}}$ defines a complete and separable metric on $M$ which metrizes the marked Gromov-weak topology on $M$. We always endow $M$ with $d_{\mathrm{MGP}}$.
We say that $\chi \in M$ contains atoms or is purely atomic, respectively, if this holds for the probability measure $\mu$ in a representant $(X, r, \mu)$ of $\chi$. Furthermore, we will also work with the space $M'$; the map from $M'$ to $M$ that sends $[[X, r, m]]$ to $[[X, r, m \otimes \delta_0]]$ is an isometry. We come to the so-called polynomials on $M$ and $M'$. For $n \in \mathbb{N}$, let $C_n$ be the space of continuously differentiable functions $\phi : \hat{D}_n \to \mathbb{R}$ with compact support. For $\phi \in C_n$, we denote also by $\phi$ the function $\phi \circ \hat{\gamma}_n : \hat{D} \to \mathbb{R}$, and we call the function $\Phi : M \to \mathbb{R}$, $\Phi(\chi) = \nu_\chi \phi$ the polynomial associated with $\phi$. The degree of $\Phi$ is the smallest possible $k$ such that $\Phi = \nu_\cdot \tilde{\phi}$ for some $\tilde{\phi} \in C_k$. We set $C = \bigcup_{n \in \mathbb{N}} C_n$ and we denote by $\Pi$ the set of all polynomials associated with some $\phi \in C$. Since $C_n$ is convergence determining in $\hat{D}_n$ for all $n \in \mathbb{N}$, the set of functions $\Pi$ generates the marked Gromov-weak topology on $M$. It follows from a result of Le Cam [27] that $\Pi$ is convergence determining in $M$, see Löhr [29].
Similarly, let C n be the space of continuously differentiable functions from D n to R with compact support, and let C = ∪ n∈N C n . For φ ∈ C n , we denote also by φ the function φ ∘ γ n : D → R, and we call the function Φ : M → R, Φ(χ) = ν χ (φ) the polynomial associated with φ. We denote by Π the set of all polynomials associated with some φ ∈ C . The set C n is convergence determining in D n , hence Π is convergence determining in M .
In this article, we work with general (marked) metric measure spaces. In applications, these spaces are often tree-like. We address this situation in a series of remarks alongside the main text. We begin with Remark 1 below, which is followed by Remarks 3, 6, 9, and 12 in subsequent sections.
Remark 1. We briefly address the relation to R-trees. For details, see Evans [15] and the references therein. A metric space (X, r) is 0-hyperbolic if and only if the four-point condition r(x, y) + r(z, t) ≤ max{r(x, z) + r(y, t), r(y, z) + r(x, t)} for all x, y, z, t ∈ X is satisfied. A metric space is an R-tree if it is 0-hyperbolic and connected. Every 0-hyperbolic space is a subspace of an R-tree. Moreover, ultrametric spaces are 0-hyperbolic, and every ultrametric space is the set of leaves of an R-tree. An ultrametric on a set X is a metric r that satisfies the strengthened triangle inequality max{r(x, y), r(y, z)} ≥ r(x, z) for all x, y, z ∈ X. This implies that if two balls in an ultrametric space intersect, one contains the other.
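The two conditions of Remark 1 can be checked mechanically on finite metric spaces. The following is a minimal sketch; the helper names (`metric_from_pairs`, `is_zero_hyperbolic`, `is_ultrametric`) are ours and not part of the text.

```python
from itertools import permutations

def metric_from_pairs(pairs):
    """Build a full symmetric distance dict from upper-triangle entries
    pairs[(x, y)] = d(x, y); diagonal entries are set to 0."""
    pts = {p for pair in pairs for p in pair}
    d = {(x, x): 0.0 for x in pts}
    for (x, y), v in pairs.items():
        d[(x, y)] = d[(y, x)] = float(v)
    return d

def is_zero_hyperbolic(d):
    """Four-point condition of Remark 1:
    r(x,y) + r(z,t) <= max(r(x,z) + r(y,t), r(y,z) + r(x,t))."""
    pts = sorted({p for pair in d for p in pair})
    return all(d[(x, y)] + d[(z, t)]
               <= max(d[(x, z)] + d[(y, t)], d[(y, z)] + d[(x, t)]) + 1e-12
               for x, y, z, t in permutations(pts, 4))

def is_ultrametric(d):
    """Strengthened triangle inequality: r(x,z) <= max(r(x,y), r(y,z))."""
    pts = sorted({p for pair in d for p in pair})
    return all(d[(x, z)] <= max(d[(x, y)], d[(y, z)]) + 1e-12
               for x in pts for y in pts for z in pts)
```

For instance, a path metric on four points is 0-hyperbolic (a line is an R-tree) but not ultrametric, while a balanced tree of leaves at equal height yields an ultrametric.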
It follows from the definition of equivalence classes of mm-spaces and mmm-spaces, respectively, that the corresponding subspaces are closed.

The space of isometry classes of compact metric measure spaces and the Gromov-Hausdorff-Prohorov metric
The Hausdorff distance between subsets X and Y of a metric space (Z, d) is defined by d H (X, Y ) = inf{ε > 0 : X ⊂ Y^ε and Y ⊂ X^ε}, where A^ε = {z ∈ Z : d(z, A) < ε} denotes the ε-neighborhood of a subset A ⊂ Z. The Gromov-Hausdorff-Prohorov distance between two metric measure spaces (X, r, m) and (X', r', m') is defined by d GHP ((X, r, m), (X', r', m')) = inf { d H (ι(X), ι'(X')) ∨ d P (ι * m, ι' * m') }, where the infimum is over all complete and separable metric spaces Z and isometric embeddings ι : X → Z and ι' : X' → Z, and d P denotes the Prohorov distance on Z. Obviously, the Gromov-Hausdorff-Prohorov distance is larger than or equal to the Gromov-Prohorov distance. A correspondence R ⊂ X × X' between two sets X and X' is a relation such that for every x ∈ X, there exists x' ∈ X' with (x, x') ∈ R, and for every x' ∈ X', there exists x ∈ X with (x, x') ∈ R. The distortion of a correspondence R between two semi-metric spaces (X, r) and (X', r') is dis R = sup{|r(x, y) − r'(x', y')| : (x, x'), (y, y') ∈ R}. We also need the following characterization of the Gromov-Hausdorff-Prohorov distance.
Proposition 2 (Proposition 6 in [30]). Let (X, r, m) and (X', r', m') be compact metric measure spaces. Then d GHP ((X, r, m), (X', r', m')) is the infimum of all ε > 0 such that there exist a correspondence R ⊂ X × X' and a coupling ν of m and m' with ½ dis R ≤ ε and ν(R) ≥ 1 − ε.
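Proposition 2 turns the Gromov-Hausdorff-Prohorov distance into an optimization over correspondences and couplings. The hypothetical helpers below sketch the resulting upper bound for finite spaces; `distortion` and `ghp_upper_bound` are our names, and only one given correspondence and coupling are evaluated, so the true distance may be smaller.

```python
def distortion(R, dX, dY):
    """dis R = sup over (x, y), (x2, y2) in R of |dX(x, x2) - dY(y, y2)|,
    for finite spaces given as distance dicts on ordered pairs."""
    return max(abs(dX[(x, x2)] - dY[(y, y2)])
               for (x, y) in R for (x2, y2) in R)

def ghp_upper_bound(R, dX, dY, nu):
    """Smallest eps permitted by Proposition 2 for the GIVEN correspondence
    R and coupling nu (a dict mapping pairs to masses): d_GHP <= this."""
    mass_on_R = sum(m for pair, m in nu.items() if pair in R)
    return max(0.5 * distortion(R, dX, dY), 1.0 - mass_on_R)
```

For two two-point spaces at inner distances 1 and 1.5, the diagonal correspondence with the diagonal coupling certifies d GHP ≤ 0.25.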
For a metric measure space (X, r, m), we denote by [X, r, m] the intersection of M with the measure-preserving isometry class of (X, r, m). For two metric measure spaces (X, r, m) and (X', r', m') to be measure-preserving isometric, we require that there exists a measure-preserving isometry from X to X', not only from supp m to supp m' as in Section 2.1. We define the set M = {[X, r, m] : (X, r, m) compact metric measure space}.
The Gromov-Hausdorff-Prohorov distance d GHP defines a complete and separable metric on M, see [17,30]. We always endow M with d GHP . For X ∈ M, we define the distance matrix distribution ν X on D through representatives as in Section 2.2.

Construction from distance matrices
We have seen how a (marked) distance matrix can be sampled from a (marked) metric measure space. In this section, we build a (marked) metric measure space from an admissible (marked) distance matrix. We show in Proposition 3 below that this construction defines measurable functions with values in M, M , and M, respectively. In Proposition 4, we show that if a random admissible marked distance matrix is jointly exchangeable, it has the marked distance matrix distribution of the associated marked metric measure space. Moreover, we reconstruct in Proposition 5 a marked metric measure space from its marked distance matrix distribution and thus obtain an alternative proof of the Gromov reconstruction theorem.
We point out that Vershik [39] studies complete and separable metric spaces as distance matrices and gives conditions for the distribution of a random matrix to be the distance matrix distribution of a metric measure space or of a compact metric measure space, respectively. For the theory of jointly exchangeable arrays, such as jointly exchangeable matrices, the interested reader is also referred to the monograph of Kallenberg [23].
Sometimes we work with semi-metric spaces. Given a semi-metric space (X, ρ), we can identify the elements at ρ-distance zero to obtain a metric space (X', ρ). For each element x ∈ X and each subset A ⊂ X, we also refer to the associated element or subset of X', writing x ∈ X' and A ⊂ X' in slight abuse of notation. We define the completion of the semi-metric space (X, ρ) as the completion of the associated metric space (X', ρ).
For each (r, u) ∈ D, the matrix r defines a semi-metric on N. We denote (in this paragraph) the completion of (N, r) by (X, r). We say that (r, u) is admissible if the probability measures n −1 Σ i=1..n δ (i,u i ) converge weakly to a probability measure µ on X × R + as n tends to infinity. Let D * be the space of admissible marked distance matrices (r, u). We define ψ : D * → M as the function that maps (r, u) to [[X, r, µ]]. Similarly, a matrix ρ ∈ D' defines a semi-metric on N. We consider the completion (X', ρ) of (N, ρ). We say ρ is admissible if the probability measures n −1 Σ i=1..n δ i on (X', ρ) converge weakly to a probability measure m on (X', ρ) as n tends to infinity. We denote by D'* the space of admissible distance matrices ρ. We define ψ 0 : D'* → M as the function that maps ρ to [[X', ρ, m]]. The metric space (X', ρ) is compact if and only if it is totally bounded, that is, if lim n→∞ sup k∈N min i∈[[n]] ρ i,k = 0. (3) We denote by D CDI the space of those ρ ∈ D' that satisfy condition (3). We write D * CDI = D'* ∩ D CDI , and we denote by υ : D * CDI → M the function that maps ρ to [X', ρ, m]. We remark that (r, u) ∈ D * implies r ∈ D'*.
Proposition 3. The functions ψ, ψ 0 , and υ are measurable.
Proof. For n ∈ N and ρ' ∈ D n , a semi-metric on [[n]] is defined by ρ'. We identify the elements of [[n]] at ρ'-distance zero to obtain a metric space (X', ρ'). Let υ n : D n → M be the function that maps ρ' to [X', ρ', n −1 Σ i=1..n δ i ]. The function υ n is continuous. To show this, let ρ'' ∈ D n ; another semi-metric on [[n]] is defined by ρ'', and we identify the elements at ρ''-distance zero to obtain a metric space (X'', ρ''). With the coupling of the uniform measures along the identity and the correspondence {(i, i) : i ∈ [[n]]}, Proposition 2 yields d GHP (υ n (ρ'), υ n (ρ'')) ≤ ½ max i,j∈[[n]] |ρ' i,j − ρ'' i,j |. Now let ρ ∈ D * CDI be arbitrary. A semi-metric on N is defined by ρ. Let (X, ρ) be the completion of (N, ρ). For each n ∈ N, define the probability measure m n = n −1 Σ i=1..n δ i . Then lim n→∞ d H ([[n]], X) = 0 and lim n→∞ d X P (m n , m) = 0 for a probability measure m on X. As it holds υ(ρ) = [X, ρ, m] and υ n (γ n (ρ)) = [[[n]], ρ, m n ], we have lim n→∞ d GHP (υ n (γ n (ρ)), υ(ρ)) = 0. This implies measurability of υ, as a pointwise limit of continuous functions is measurable. Measurability of ψ and ψ 0 can be proved analogously by using Proposition 1.
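The total-boundedness criterion behind D CDI can be explored numerically: a greedy ε-net among the first n points bounds how many ε-balls are needed to cover them. This is a minimal sketch; the function name and the greedy strategy are our choices, not part of the text.

```python
def greedy_eps_net(rho, n, eps):
    """Greedy eps-net among the first n points of a semi-metric matrix
    rho (rho[i][j] = distance between points i and j).  The completion
    of (N, rho) is totally bounded, hence compact, iff for every eps > 0
    the sizes of such nets stay bounded as n grows."""
    centers = []
    for i in range(n):
        # keep point i as a new center if it is eps-far from all centers
        if all(rho[i][c] > eps for c in centers):
            centers.append(i)
    return centers
```

For points spaced 0.1 apart on a line, the net at radius 0.25 keeps every third point, and its size grows linearly in n, reflecting non-compactness of the unbounded line.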
The group S ∞ of bijections of N onto itself acts on D and on D' as follows: for (r, u) ∈ D and p ∈ S ∞ , we set p(r, u) = ((r p(i),p(j) ) i,j∈N , (u p(i) ) i∈N ), and for ρ ∈ D', we set p(ρ) = (ρ p(i),p(j) ) i,j∈N . For n ∈ N, the actions on D n and on D' n of the group S n of permutations of [[n]] are defined analogously. A random variable R in D (in D') is called jointly exchangeable if for all p ∈ S ∞ , the random variables p(R) and R are equal in distribution. The following proposition may be compared with Lemma 8 of Vershik [39].
Proposition 4. Let (r, u) be a jointly exchangeable random variable with values in D * . Let (r', u') be a random variable with values in D and conditional distribution ν ψ(r,u) given ψ(r, u). Then (r', u') ∈ D * a. s., and (r', u') is distributed as (r, u).
Proof. Let n ∈ N, φ ∈ C n , and let (X, r, µ) be a representative of ψ(r, u). We compute E[φ(γ n (r', u'))] = E[φ(γ n (r, u))] by a chain of equalities. The second equality follows from the definition of D * and by dominated convergence. For the third equality, we use that summands where i 1 , . . . , i n are not pairwise distinct vanish in the limit, and that for all other summands, the expectation in the second line equals by exchangeability the expectation in the third line.
This proposition implies the Gromov reconstruction theorem. Indeed, for two marked metric measure spaces (X, r, µ) and (X', r', µ') with the same marked distance matrix distribution ν, and for a random variable R with distribution ν, we obtain ψ(R) = [[X, r, µ]] a. s. and ψ(R) = [[X', r', µ']] a. s., hence the two marked metric measure spaces are equivalent. In the proof of Theorem 1 in Section 7, we will argue similarly to the proof below. The simplicity of the proof below lies in constructing an equivalent marked metric measure space from the marked distance matrix distribution instead of comparing two marked metric measure spaces, cf. the proofs of the Gromov reconstruction theorem in Chapter 3½ of Gromov [21] and in Theorem 4 of Vershik [39]. A Gromov reconstruction theorem for marked metric measure spaces is stated in [8, Theorem 1].
Proof. Let (X', r', µ') be a representative of χ. W. l. o. g. we assume supp µ' = X' and that (r, u) is the marked distance matrix of a µ'-iid sequence ((x i , u i ), i ∈ N). A semi-metric on N is defined by r. We denote by (X, r) the completion of (N, r). The isometry that maps x i to i for all i ∈ N can be extended to an isometry ϕ from X' to X.

Deterministic lookdown genealogies
In Section 4.1, we construct the lookdown space from the genealogy at time 0 and from a deterministic point measure that encodes the reproduction events. Moreover, we define the matrices of the genealogical distances between the individuals at each time, and we introduce a decomposition of the genealogical distances which we encode by marked distance matrices. In Section 4.2, we characterize the evolution of the genealogical distances and of the marked distance matrices by integral equations. Stochastic equations for the lookdown model, in particular for the ancestral level and the type process, are studied in the articles of Donnelly and Kurtz [12] and Birkner et al. [6].

Genealogical distances and the lookdown space
A simple point measure is a purely atomic measure whose atoms all have mass 1. Let N be the space of simple point measures η on (0, ∞) × P that satisfy η((0, t] × P n ) < ∞ for all t ∈ (0, ∞) and n ∈ N.
Let η ∈ N . The point measure η encodes the reproduction events in a realization of the lookdown model. In this model, there are countably infinitely many levels which are labeled by N; each level is occupied by one particle at each time t ∈ R + . The interpretation of a point (t, π) of η is that the following reproduction event occurs at time t: First the particles on the levels i ∈ N with i > #π are removed. Then for each i ∈ [[#π]], the particle on level i moves to level min B(π, i) and has offspring on all other levels in B(π, i). This can be seen as the population model which underlies the flow of partitions in [18]. If for each point (t, π) of η, the partition π contains only one non-singleton block, there are no simultaneous multiple reproduction events.
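A sketch of a single reproduction event, restricted to finitely many levels, may clarify the mechanism: after the event, level j descends from the old level i whose block contains j. The function below is a hypothetical helper for illustration, not part of the formal construction.

```python
def ancestor_map(blocks, n):
    """Parent level (before the event) of each level 1..n after a lookdown
    reproduction event: the particle on level i moves to min B(pi, i) and
    has offspring on the other levels of B(pi, i), so level j descends
    from the old level i with j in the i-th block.  Blocks are ordered by
    their least elements, as in the text."""
    blocks = sorted((set(B) for B in blocks), key=min)
    return {j: i for i, B in enumerate(blocks, start=1) for j in B if j <= n}
```

For the event with blocks {1, 3}, {2}, {4}, the particle on level 3 moves to level 4, so level 4 descends from old level 3, while level 3 carries an offspring of old level 1.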
For all 0 ≤ s ≤ t, each particle at time t has an ancestor at time s. We denote by A s (t, i) the level of the ancestor at time s of the particle on level i at time t, chosen such that s → A s (t, i) is càdlàg. We define individuals as elements of the set R + × N, and we call (t, i) ∈ R + × N the individual on level i at time t. We define (s, A s (t, i)) as the ancestor at time s of the individual (t, i). The ancestral level can also be defined in terms of integral equations. For all π ∈ P and i ∈ N, there exists a unique α(π, i) ∈ N with π(i) = B(π, α(π, i)). Clearly, for n ∈ N and π ∈ P \ P n , it holds α(π, i) = i for all i ∈ [[n]]. Let (t, i) ∈ R + × N and let (L s , s ∈ [0, t]) be the càdlàg solution of the associated integral equation with L 0− = i. By the assumption that η ∈ N , the solution is piecewise constant; existence and uniqueness follow as the equation determines the jumps. Then we set A s (t, i) = L t−s for s ∈ [0, t]. In later parts of this article, we also need the quantity D t (s, i), which we define for (s, i) ∈ R + × N and t ∈ [s, ∞) as the lowest level that is occupied at time t by a descendant of the individual (s, i), that is, D t (s, i) = inf{j ∈ N : A s (t, j) = i} (with inf ∅ = ∞). See also [32] for the forward level process and lines of ascent in the lookdown graph. The map t → D t (s, i) is non-decreasing and right continuous. For t ∈ (s, ∞), we denote the left limits by D t− (s, i) = lim r↑t D r (s, i). The lines of ascent do not cross; in particular, it holds D t (s, i) ≤ D t (s, j) for all i ≤ j and all 0 ≤ s ≤ t. Moreover, for τ = inf{t ∈ [s, ∞) : D t (s, i) = ∞}, the following implication holds by right continuity of t → D t (s, i): if both τ < ∞ and D τ − (s, i) < ∞, then there exists π ∈ P with #π < ∞ and η{(τ, π)} > 0. This will be crucial in the proof of Proposition 8. Particles can also be removed due to accumulations of finite jumps of t → D t (s, i); in that case it holds D τ − (s, i) = ∞.
Remark 2. A lookdown model with a reproduction mechanism that is different in the case with simultaneous multiple reproduction events is studied by Birkner et al. [6]. In the encoding by the point measure η, a point (t, π) of η corresponds to the following reproduction event at time t: Let i 1 < i 2 < . . . be the increasing enumeration of π(0) ∪ (N \ M (π)). For each j ∈ N, the particle on level i j moves to the j-th lowest level in π(0) if this level exists, else the particle is removed. For each non-singleton block B ∈ π, the particle on level min B remains on its level and has one offspring on each level in B \ {min B}. Here the lines of ascent may cross: In a reproduction event encoded by a point (t, π) of η with 1 ∼ π 2, 3 ∈ M(π), {3} ∉ π, and {4} ∈ π, there exists s ∈ (0, t) such that the individual (s, 3) is the ancestor at time s of the individual (t, 3), and (s, 2) is the ancestor at time s of (t, 4). This lookdown model is addressed in Remark 7 in Section 6.
We are interested in the genealogical distances between all individuals, which may live at different times. Let ρ = (ρ i,j ) i,j∈N ∈ D'. For i, j ∈ N, we interpret ρ i,j as the genealogical distance between the individuals (0, i) and (0, j). We define the genealogical distance between two individuals (s, i) and (t, j) by ρ((s, i), (t, j)) = s + t − 2 sup{v ∈ [0, s ∧ t] : A v (s, i) = A v (t, j)} if the ancestral levels of (s, i) and (t, j) coincide at some time, and by ρ((s, i), (t, j)) = s + t + ρ A 0 (s,i),A 0 (t,j) otherwise. Then ρ is a semi-metric on R + × N. For each t ∈ R + , we denote the matrix of the genealogical distances between the individuals at time t by ρ(η, ρ, t) = (ρ((t, i), (t, j))) i,j∈N . It holds ρ(η, ρ, 0) = ρ. We will refer to this construction through the map ρ : N × D' × R + → D', (η, ρ, t) → ρ(η, ρ, t). Let us interject at this point that we will call the metric space obtained as the completion of the semi-metric space (R + × N, ρ) the lookdown space associated with η and (ρ, 0) ∈ D. By now, we have discussed everything we need from this subsection to construct the tree-valued Ξ-Fleming-Viot process in the case without dust and the tree-valued evolving Ξ-coalescent. We continue by decomposing genealogical distances.
Let R = (r, u) ∈ D, and assume ρ = ρ̂(R), that is, R encodes a decomposition of the genealogical distances at time 0, which are given by ρ. For each time t ∈ R + , we define a marked distance matrix R(η, R, t) = ((r t,i,j ) i,j∈N , (u t,i ) i∈N ) as follows: For i ∈ N, we set u t,i = t − sup({s ∈ (0, t] : η({s} × P(A s (t, i))) > 0} ∪ {−u A 0 (t,i) }), where we introduce the notation P(j) = {π ∈ P : {j} ∉ π} for j ∈ N. Then we define r t,i,j = ρ(η, ρ, t) i,j − u t,i − u t,j for i ≠ j and r t,i,i = 0. We will refer to this construction through the map R : N × D × R + → D, (η, R, t) → R(η, R, t). We obtain ρ(η, ρ, t) = ρ̂(R(η, R, t)). The quantity u t,i is the time back from the individual (t, i) until the ancestral lineage is involved in a reproduction event in which it belongs to a non-singleton block, if there is such an event; else u t,i is defined from the given marks at time 0. In Corollary 2 in Section 6, we show the relation of the decomposition given by the marked distance matrix R(η, R, t) to external branches of genealogical trees. By construction and as η ∈ N , for all n ∈ N, the path t → γ n (ρ(η, ρ, t)) jumps at the times s ∈ (0, ∞) with η({s} × P n ) > 0, and between these jumps, the distances (ρ(η, ρ, t)) i,j for i, j ∈ [[n]] with i ≠ j grow linearly in t with slope 2. It follows that the path t → ρ(η, ρ, t) is càdlàg. We denote the left limit at a time t ∈ (0, ∞) by ρ(η, ρ, t−) = lim s↑t ρ(η, ρ, s). A further implication of η ∈ N is that the left limits u t−,i = lim s↑t u s,i exist for all t ∈ (0, ∞) and i ∈ N. Indeed, if there exists s ∈ (0, t) with η([s, t) × (P(i) ∪ P i )) = 0, then it holds u s,i + t − s = u t−,i ; else it holds u t−,i = 0. It follows that also the left limits R(η, R, t−) := lim s↑t R(η, R, s) exist for all t ∈ (0, ∞).
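The two mechanisms of this subsection, linear growth of pairwise distances with slope 2 between events and resetting along ancestral levels at an event, can be sketched for finitely many levels as follows. `evolve` and `reproduce` are our illustrative names; the matrices are plain nested lists and the sketch is not part of the formal construction.

```python
def evolve(rho, dt):
    """Between reproduction events, off-diagonal genealogical distances
    grow linearly in t with slope 2 (both lineages age); the diagonal
    stays 0."""
    n = len(rho)
    return [[0.0 if i == j else rho[i][j] + 2.0 * dt for j in range(n)]
            for i in range(n)]

def reproduce(rho, blocks):
    """At a reproduction event, level j inherits the lineage of the old
    level whose block contains j (blocks ordered by least element);
    individuals in a common block are at genealogical distance 0."""
    blocks = sorted((set(B) for B in blocks), key=min)
    parent = {j: i for i, B in enumerate(blocks) for j in B}  # 0-based old level
    n = len(rho)
    return [[0.0 if parent[i + 1] == parent[j + 1]
             else rho[parent[i + 1]][parent[j + 1]]
             for j in range(n)] for i in range(n)]
```

For instance, after the event with blocks {1, 2}, {3}, levels 1 and 2 are at distance 0 while their distance to level 3 is the old distance from level 1 to level 3; running time forward then increases all off-diagonal entries at slope 2.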
Figure 2. Time goes upwards and levels go from the left to the right. In the lower part, the metric space obtained from N, endowed with the semi-metric obtained from r, is symbolized. For each i ∈ N, the junction between the individual (0, i) and its ancestor i has length u i . Individuals that are in the same block in a reproduction event have genealogical distance zero and are identified; in the figure, they are connected by horizontal lines. In this example, there are no simultaneous multiple reproduction events. The genealogical distances between the individuals (t, i) and their respective ancestors z(t, i) equal u t,i ; they are represented by red lines. The genealogical distance between any two individuals is the sum of the lengths of the vertical parts of the path from one individual to the other, plus the distance in the metric space obtained from r if this space has to be traversed.

We want to embed the metric space obtained from (R + × N, ρ) into a complete and separable metric space (Z, ρ) and assign to each individual (t, i) ∈ R + × N an element z(t, i) of Z. Then we endow the space (R + × N) ⊔ N, that is, the disjoint union of R + × N and N, with an extension of the semi-metric ρ on R + × N which we define by
for (t, i) ∈ R + × N and j ∈ N. We identify the elements at ρ-distance zero and take the completion, which we denote by (Z, ρ). We call (Z, ρ) the lookdown space associated with η and R. Illustrations of lookdown spaces are given in Figures 2 and 3. Note that for t ∈ (0, ∞) and i ∈ N, the limits (t−, i) := lim s↑t (s, i) exist in Z.
Remark 3. If the semi-metric on N obtained from a distance matrix ρ ∈ D' yields an ultrametric space when elements with distance zero are identified, then also the distance matrices ρ(η, ρ, t) for all t ∈ R + have this property. If the marked distance matrix (r, u) ∈ D satisfies that the metric space obtained from N endowed with the semi-metric obtained from r is 0-hyperbolic, then this also holds for the marked distance matrices R(η, (r, u), t) for all t ∈ R + . In this case, (Z, ρ) itself is 0-hyperbolic.
Figure 3. Parts of the boundary of the lookdown space arise as limits of levels, cf. [32]: for instance, at time s the limit lim r↑s D r (0, 3) is part of the boundary. Further elements of the boundary are obtained from Cauchy sequences at fixed times.

Integral equations
We give alternative characterizations of the family of distance matrices (ρ(η, ρ, t), t ∈ R + ) and the family of marked distance matrices (R(η, R, t), t ∈ R + ) from the last subsection. We will analyze these families for a Poisson random measure η in Section 5.
We have P n = P \ {π ∈ P : σ n (π) ∼ ∅}. It follows from the properties of the path t → ρ(η, ρ, t) that between the reproduction events, the matrix (r t,i,j ) i,j∈[[n]] is constant in t and the entries of the vector (u t,i ) i∈[[n]] grow linearly with slope 1. It follows that the map t → R(η, R, t) is càdlàg, and (R(η, R, t), t ∈ R + ) is the unique solution (R t , t ∈ R + ) of a system of integral equations, one for each n ∈ N, where 1 n denotes the vector in R n + whose entries are all 1.

A model in two-sided time
In the population model from Section 4.1, we can use R as the time axis instead of R + . Let η be a simple point measure on R×P that satisfies η([s, t]×P n ) < ∞ for all s ≤ t and n ∈ N. Let again N be the set of levels. We interpret the points (t, π) of η in the same way as in Section 4.1. There is no difference in the definition of the ancestral level A s (t, i), except that all s ≤ t are now allowed.
where u i is the time back from the individual (t, i) until the ancestral lineage is involved in a reproduction event, and r i,j is 0 or a genealogical distance between ancestors at depths u i and u j . Thereby we impose the further assumption on η that these quantities be finite. Then it holds ρ̄(η, t) ∈ D' and R̄(η, t) ∈ D for all t ∈ R. We denote by N̄ the space of the point measures η that satisfy these assumptions. We refer to this construction through the maps ρ̄ : N̄ × R → D' and R̄ : N̄ × R → D. For each t ∈ R and η ∈ N̄ , we obtain an ultrametric space from the semi-metric on N given by ρ̄(η, t).

Evolution of genealogical distances in the lookdown model
The population model discussed in Section 4.1 and the integral equations from Section 4.2 will be driven by a Poisson random measure η which we discuss in Section 5.2. We describe the resulting stochastic processes (ρ(η, ρ, t), t ∈ R + ) and (R(η, R, t), t ∈ R + ) by martingale problems in Section 5.3. In Section 5.4, we show exchangeability properties of the marked distance matrix R(η, R, τ ) for various stopping times τ . The argument builds on the preservation of exchangeability properties in a single reproduction event, which is discussed in Section 5.1 below. Preservation of exchangeability of type configurations is studied in [6,12].

Preservation of exchangeability properties in a single reproduction event
For n ∈ N, we define the action of the permutation group S n on the set S n by p(σ) = {p(B) : B ∈ σ} for p ∈ S n and σ ∈ S n . We define the action of the group S ∞ on P by p(π) = {p(B) : B ∈ π} for p ∈ S ∞ and π ∈ P. We say a random variable σ in S n is exchangeable if the equality in distribution p(σ) d = σ holds for all p ∈ S n . Similarly, a random variable π in P is called exchangeable if p(π) d = π for all p ∈ S ∞ . For instance, if a random partition π in P is exchangeable, then σ n (π) is exchangeable.
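The group action p(π) = {p(B) : B ∈ π} is straightforward to state in code; the following is a hypothetical one-line illustration for finite partitions, with blocks returned ordered by least element.

```python
def act(p, pi):
    """Action p(pi) = {p(B) : B in pi} of a permutation p, given as a
    dict i -> p(i), on a partition pi given as a collection of blocks;
    the image blocks are sorted by their least elements."""
    return sorted((sorted(p[i] for i in B) for B in pi), key=min)
```

For example, the transposition of 1 and 2 maps the partition {{1, 3}, {2}} to {{1}, {2, 3}}.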
For b, n ∈ N 0 with b + n ≥ 1, we define the subgroup S b,n of S b+n of those permutations that map [[b]] onto itself. For b ∈ N and n ∈ N 0 , we further use a map γ b defined analogously.
Lemma 1. Let n ∈ N and b ∈ N 0 . Let R be a jointly (b, n)-exchangeable random variable with values in D b+n and let σ be an independent (b, n)-exchangeable random variable with values in S b,n . Then σ(R) is jointly (b, n)-exchangeable.
In the special case b = 0, Lemma 1 states that σ(R) is jointly exchangeable if R is jointly exchangeable and σ is exchangeable. This can be seen as a generalization of Lemma 4.3 of Bertoin [3].
The assertion follows as p̃(R) and R are equal in distribution.

The Ξ-lookdown model
Kingman's correspondence is a one-to-one correspondence between the distributions of exchangeable random partitions of N and the probability measures on the simplex ∆ = {x = (x 1 , x 2 , . . .) : x 1 ≥ x 2 ≥ . . . ≥ 0, Σ k x k ≤ 1}. For x ∈ ∆, the paintbox partition associated with x is constructed as follows: Partition the unit interval into subintervals of lengths x 1 , x 2 , . . . and a remaining dust interval of length 1 − Σ k x k , let (U i , i ∈ N) be iid uniform random variables on the unit interval, and let i and j be in a common block if and only if U i and U j fall into a common subinterval that is not the dust interval. This construction defines a probability kernel κ from ∆ to P. Conversely, every exchangeable random partition π in P has distribution ∫ ∆ ν(dx)κ(x, ·) for some distribution ν on ∆. Here x is the random vector in ∆ of the asymptotic frequencies of the blocks of π. We now define a Poisson random measure η on (0, ∞) × P, for which we refer to Schweinsberg [37] and Foucart [18]. This Poisson random measure will drive the population model from Section 4.1. We denote the space of finite measures on ∆ by M f (∆). Let Ξ ∈ M f (∆) be arbitrary, and decompose Ξ = Ξ 0 + aδ 0 where a = Ξ{0}. For i, j ∈ N with i ≠ j, we denote by π(i, j) the partition in P that contains the block {i, j} and apart from that only singleton blocks. We define a σ-finite measure H Ξ on P by H Ξ (dπ) = ∫ ∆ |x| −2 2 Ξ 0 (dx) κ(x, dπ) + a Σ 1≤i<j δ π(i,j) (dπ), where |x| 2 2 = Σ k x k 2 . From now on, let η always be a Poisson random measure on (0, ∞) × P with intensity dt H Ξ (dπ). We may construct the Poisson random measure η in two steps. First, we define a Poisson random measure ξ 0 on (0, ∞) × ∆ with intensity dt |x| −2 2 Ξ 0 (dx). Assume Ξ 0 (∆) > 0 and let ((τ i , y i ), i ∈ N) be a measurable enumeration of the points of ξ 0 . Let π i for i ∈ N be P-valued random variables that are conditionally independent given ((τ i , y i ), i ∈ N) such that π i has conditional distribution κ(y i , ·) given ((τ i , y i ), i ∈ N). Then a Poisson random measure η 0 on (0, ∞) × P is given by η 0 = Σ i∈N δ (τ i ,π i ) . In case Ξ 0 (∆) = 0, we set η 0 = 0. We define an independent Poisson random measure η K on (0, ∞) × P with intensity dt a Σ 1≤i<j δ π(i,j) (dπ).
By the properties of Poisson random measures, η 0 + η K is a Poisson random measure with intensity dt H Ξ (dπ). We may assume η = η 0 + η K ; then ξ 0 can be recovered from η through the asymptotic frequencies, where for π ∈ P, we denote by |π| ↓ the vector in ∆ of the asymptotic frequencies of the blocks of π, provided they exist. For n ∈ N and σ ∈ S n \ {∅}, the Poisson process (η((0, t] × σ −1 n (σ)), t ∈ (0, ∞)), which will count the reproduction events encoded by a partition in σ −1 n (σ), has rate λ n,σ := H Ξ (σ −1 n (σ)), which can be computed explicitly by integrating over ∆, where ℓ = #σ, and k 1 , . . . , k ℓ ≥ 1 are the sizes of the subsets in σ in arbitrary order. For this computation, we consider the paintbox partition π associated with x ∈ ∆. In the context from the beginning of this subsection, i, j ∈ [[n]] are elements of a common subset in σ n (π) if and only if U i and U j fall into a common subinterval that is not the dust interval. In particular, i ∉ ∪σ n (π) if and only if U i falls into the dust interval. For π ∈ P n , the Poisson process (η((0, t] × γ −1 n (π)), t ∈ (0, ∞)), whose jump times will be the times of reproduction events encoded by a partition in γ −1 n (π), has rate λ π := H Ξ (γ −1 n (π)) = Σ σ∈S n : σ∼π λ n,σ .
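The paintbox kernel κ admits a direct simulation: partition [0, 1) into subintervals of lengths x 1 , x 2 , . . ., sample iid uniforms, and group indices by subinterval, leaving dust indices as singletons. The following is a minimal sketch with our function name; blocks are returned ordered by least element.

```python
import random

def paintbox(x, n, rng=random):
    """Sample the restriction to [n] of the paintbox partition kappa(x, .)
    for x in the simplex: i ~ j iff U_i and U_j fall into a common
    subinterval of lengths x[0], x[1], ...; the leftover is the dust."""
    cum, intervals = 0.0, []
    for xk in x:
        intervals.append((cum, cum + xk))
        cum += xk
    blocks = {}
    for i in range(1, n + 1):
        u = rng.random()
        k = next((idx for idx, (a, b) in enumerate(intervals) if a <= u < b),
                 None)  # None: U_i falls into the dust interval
        if k is None:
            blocks[('dust', i)] = {i}   # dust indices stay singletons
        else:
            blocks.setdefault(k, set()).add(i)
    return sorted(blocks.values(), key=min)
```

In the degenerate case x = (1, 0, 0, . . .) the partition has a single block; for x = 0 every index falls into the dust and the partition consists of singletons only.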
Remark 4. Let π ∈ P n , let k 1 , . . . , k r ≥ 2 be the sizes of the non-singleton blocks of π in arbitrary order, and let s = n − k 1 − . . . − k r . Then it holds λ π = λ n;k 1 ,...,k r ;s in the notation of Schweinsberg [37]. From the finiteness of these rates shown in [37], it follows that λ π < ∞ for all π ∈ P n . This implies η ∈ N a. s. From now on, we always consider the population model from Section 4 driven by the Poisson random measure η.
We may now call it the Ξ-lookdown model. The times at which two fixed levels are in the same block in a reproduction event are the jump times of a Poisson process with rate λ {{1,2}} .
Remark 5. If Ξ is concentrated on {x ∈ ∆ : x 2 = 0}, then a. s. no simultaneous multiple reproduction events occur. The measure Ξ is then determined by a finite measure Λ on [0, 1]. In the particular case Ξ = δ 0 , there are a. s. only binary reproduction events. In the case without simultaneous multiple reproduction events, it holds λ n,σ = ∫ [0,1] x k−2 (1 − x) n−k Λ(dx) for σ ∈ S n \ {∅} with σ = {B} and k = #B. The rates λ n,σ for σ ∈ S n with #σ > 1 are equal to zero. Furthermore, λ 1,{{1}} = ∫ [0,1] x −1 Λ(dx) is the rate at which a fixed level is in a non-singleton block in a reproduction event. Let M dust be the subset of those Ξ ∈ M f (∆) for which λ 1,{{1}} is finite. In case Ξ ∈ M dust , we say the Ξ-lookdown model contains dust. We set M nd = M f (∆) \ M dust . Clearly, Ξ ∈ M dust implies Ξ{0} = 0. In case Ξ ∈ M dust , it follows from λ 1,{{1}} < ∞ that λ n,σ < ∞ for all σ ∈ S n \ {∅}. Thus, Ξ ∈ M dust implies η ∈ N dust a. s. In contrast, Ξ ∈ M nd implies u t = 0 for all t ∈ (0, ∞) and R ∈ D a. s., where (r t , u t ) = R(η, R, t). In any case, the path t → R(η, R, t) is càdlàg a. s. The lookdown space in Figure 2 could be the lookdown space associated with a typical realization of an appropriate Poisson random measure η with Ξ ∈ M dust and R ∈ D. Figure 3 illustrates an example with Ξ ∈ M nd and R = (ρ, 0) ∈ D.
For every a. s. finite K-stopping time τ , the point measure η τ is distributed as η and independent of K τ . We will call this the strong Markov property of the Poisson random measure η; it is a well-known property of Poisson random measures. We show that for arbitrary ρ ∈ D' and R ∈ D, the stochastic processes (ρ(η, ρ, t), t ∈ R + ) and (R(η, R, t), t ∈ R + ) are Markovian. Let s, t ∈ R + . By construction, we have R(η, R, t + s) = R(η t , R(η, R, t), s), and R(η, R, t) is K t -measurable. By the properties of conditional expectations, we have E[φ(R(η, R, t + s)) | K t ] = E[φ(R(η, R, t + s)) | R(η, R, t)] for measurable bounded φ : D → R. This yields the Markov property of (R(η, R, t), t ∈ R + ). The Markov property of (ρ(η, ρ, t), t ∈ R + ) can be seen analogously. Now we describe these processes by martingale problems. We say a process (X t , t ∈ R + ) solves the martingale problem (Ω, C) if for all f ∈ C, the process f (X t ) − ∫ 0 t Ωf (X s )ds is a martingale with respect to the natural filtration of (X t , t ∈ R + ).
For martingale problems, we refer to the monograph of Ethier and Kurtz [13]. We denote by 1 the vector in R N + whose entries are all 1, and we write 2 = 2(1{i ≠ j}) i,j∈N .
In the following propositions, we use for φ ∈ C' and ρ' ∈ D' the notation ⟨∇φ, 2⟩(ρ') for the derivative of φ at ρ' in the direction of the matrix 2, and the analogous notations for φ ∈ C and (r', u') ∈ D.
Proposition 6. Define the operator Ω 1 through its action on the polynomials, for n ∈ N, φ ∈ C' n , and ρ' ∈ D'. Then for each ρ ∈ D', the stochastic process (ρ(η, ρ, t), t ∈ R + ) solves the martingale problem (Ω 1 , C') with initial state ρ.
Proposition 7. Define the operator Ω 2 analogously, for n ∈ N, φ ∈ C n , and (r', u') ∈ D. Then for each R ∈ D, the stochastic process (R(η, R, t), t ∈ R + ) solves the martingale problem (Ω 2 , C) with initial state R.
Proof. Let n ∈ N and φ ∈ C n . By the definition of λ n,σ , the relevant jump rates are finite for all (r', u') ∈ D. As it also holds σ n (π)(γ n (r', u')) = γ n (σ n' (π)(γ n' (r', u'))) for all n' ≥ n and π ∈ P, the operator Ω repr 2 is well-defined. Let R ∈ D. For t ∈ R + , we write R t = R(η, R, t). By the integral representation (6), we obtain a chain of equalities; for the second equality, we use the compensation formula for Poisson random measures.
For the third equality, we use that the process (R s , s ∈ R + ) has a. s. only countably many jumps. As this process is also Markovian, the assertion follows.
Proof of Proposition 6. This follows analogously to Proposition 7, we use the integral representation (5).
The following lemma builds on Lemma 1. For b, n ∈ N 0 with b + n ≥ 1, we say a random variable R' is (jointly) (b, n)-exchangeable conditionally given a sigma-algebra F if it holds P(R' ∈ B|F) = P(p(R') ∈ B|F) a. s. for all p ∈ S b,n and all measurable subsets B of the state space of R'.
Proof. For k ∈ N, we define a ξ 0 -measurable random time τ k which assumes countably many values by τ k = (j + 1)/k on the event {τ ∈ [j/k, (j + 1)/k)} for j ∈ N 0 . For n ∈ N, φ ∈ C n , and p ∈ S n , the required equality of expectations at time τ k holds by the preceding lemmas. We let k tend to infinity. The assertion follows as t → R t is càdlàg a. s. To prove the assertion for R τ − , we replace τ k with an analogous approximation from the left.
We also consider exchangeability properties at stopping times with respect to two different filtrations. For b ∈ N 0 and n ∈ N, we define equivalence classes on D b+n by [R'] b,n = {p(R') : p ∈ S b,n } for R' ∈ D b+n .
Lemma 5. Let b ∈ N 0 and n ∈ N. Assume that γ b+n (R) is jointly (b, n)-exchangeable. Let τ be a finite H b,n -stopping time. Then the marked distance matrix γ b+n (R τ ) is jointly (b, n)-exchangeable.
Proof. We show that for each t ∈ R + , the marked distance matrix γ b+n (R t ) is (b, n)-exchangeable conditionally given H b,n t . The assertion then follows for stopping times which assume countably many values, and by an approximation argument as in the proof of Corollary 1 also for all finite stopping times.
We enlarge the spaces D b+n and D b,n by a coffin state ∂. Let K be the probability kernel from D b,n to D b+n such that it holds K(∂, {∂}) = 1, and such that for all R' ∈ D b+n \ {∂}, the probability measure K([R'] b,n , ·) is the uniform distribution on the representatives of [R'] b,n .
Let τ' = inf{t > 0 : η((0, t] × P b ) > 0}, and set R̃ t = γ b+n (R t ) for t < τ', and R̃ t = ∂ for t ≥ τ'. By assumption, K is a regular conditional distribution of γ b+n (R) given [γ b+n (R)] b,n . For all t ∈ R + , Lemma 3 implies that K is a regular conditional distribution of R̃ t given [R̃ t ] b,n , where we set [∂] b,n = ∂. We apply Theorem 2 of Rogers and Pitman [35] to the Markov process (R̃ t , t ∈ R + ), the measurable function D b+n → D b,n , R' → [R'] b,n , and the probability kernel K to obtain that for each t ∈ R + , the random variable R̃ t has the same conditional distribution given H b,n t as given [R̃ t ] b,n ; this follows from equation (1) in [35]. This implies the assertion.
Furthermore, for n ∈ N, we define equivalence classes on D by [R'] n = {p(R') : p ∈ S ∞ with p(i) = i for all i > n} for R' ∈ D. For t ∈ R + , we denote by J n t the sigma-algebra generated by ([R s ] n , s ∈ [0, t]). A filtration is defined by J n = (J n t , t ∈ R + ). We set D n = {[R'] n : R' ∈ D}.
Lemma 6. Let n ∈ N and assume that γ n (R) is jointly exchangeable. Let τ be a finite J n -stopping time. Then γ n (R τ ) is jointly exchangeable.
Proof. We show that for each t ∈ R + , the marked distance matrix γ n (R t ) is jointly exchangeable conditionally given J n t . It follows from Lemma 3 that γ n (R t ) is jointly exchangeable conditionally given [R t ] n . Let K be the probability kernel from D n to D such that for each R' ∈ D, the probability measure K([R'] n , ·) is the uniform distribution on the representatives of [R'] n . Then K is a regular conditional distribution of R given [R] n , and of R t given [R t ] n . Now we apply Theorem 2 of [35] to the Markov process (R t , t ∈ R + ), the measurable function D → D n , R' → [R'] n , and the probability kernel K to obtain that for each t ∈ R + , the marked distance matrix R t has the same conditional distribution given [R t ] n as given J n t .

Two families of partitions
In Section 7, we will construct families of probability measures on the lookdown space using two different families of partitions of N. In case Ξ ∈ M nd , we will use the flow of partitions; for the complementary case Ξ ∈ M dust , we introduce here another family of partitions. We recall in particular that the definitions of Ξ, η, η 0 , ξ 0 , η K , ρ, R, ρ t , ρ t− , R t = (r t , u t ), and R t− = (r t− , u t− ) from Section 5.4 are in force. The ancestral level A s (t, i) is defined in terms of the point measure η of reproduction events. Let us first read off the flow of partitions from the marked distance matrices from Section 4. For 0 ≤ s ≤ t, we define a random partition Π s,t on N by i ∼ j :⇔ A s (t, i) = A s (t, j) for all i, j ∈ N with i ≠ j. That is, i and j are in the same block of Π s,t if and only if A s (t, j) = A s (t, i); recall that the paths s → A s (t, i) are càdlàg by definition. By the construction in Section 4.1, the so-called cocycle property Π r,t = Coag(Π s,t , Π r,s ) for all r ≤ s ≤ t holds. For all 0 ≤ s ≤ t, the random partition Π s,t is exchangeable. This follows from Lemma 3, where we may assume w. l. o. g. (as Π s,t is η-measurable) that R is jointly exchangeable. We define another family (Π t , t ∈ R + ) of partitions of N as follows: for t ∈ R + , we set i ∼ Π t j ⇔ u t,i = u t,j < t and r t,i,j = 0 for all i, j ∈ N with i ≠ j. The random partition Π t is exchangeable. Again, this follows from Lemma 3, where we use that Π t is η-measurable.
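The cocycle property can be checked on finite restrictions. The following Python sketch is illustrative only: the ancestor maps are hypothetical toy data, and partitions of {0, ..., 5} stand in for partitions of N. It verifies Π r,t = Coag(Π s,t , Π r,s ) for ancestor maps composed as A r (t, ·) = A r (s, A s (t, ·)).

```python
def partition_from_labels(labels):
    """Group indices carrying equal labels into blocks, sorted by least element."""
    blocks = {}
    for i, a in enumerate(labels):
        blocks.setdefault(a, []).append(i)
    return sorted(blocks.values())

def coag(pi, pi_prime):
    """Coagulation Coag(pi, pi_prime): merge the blocks of pi whose indices
    lie in a common block of pi_prime."""
    return sorted(
        sorted(i for b_idx in bp for i in pi[b_idx]) for bp in pi_prime
    )

# Toy ancestor maps on levels 0..5; composition gives A_r(t, .),
# mirroring the lookdown construction.
A_st = [0, 0, 1, 2, 2, 3]          # A_s(t, i)
A_rs = [0, 1, 1, 0, 2, 3]          # A_r(s, i)
A_rt = [A_rs[a] for a in A_st]     # A_r(t, i) = A_r(s, A_s(t, i))

Pi_st = partition_from_labels(A_st)
labels = sorted(set(A_st))         # ancestral levels at time s that occur
Pi_rs = partition_from_labels([A_rs[a] for a in labels])
assert partition_from_labels(A_rt) == coag(Pi_st, Pi_rs)   # cocycle property
```

Here the blocks of Π s,t are indexed in the order of their ancestral labels, which is what makes the coagulation by Π r,s (restricted to the occurring labels) well defined in this toy setting.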

Proof.
A proof is required only for the assertion that {i ∈ N : u t,i ≥ t} ⊃ Π t (0) for all t ∈ R + a. s. First, we show that this inclusion holds a. s. for a fixed t ∈ R + . The assumption Ξ ∈ M dust implies Ξ{0} = 0, hence η K = 0 a. s. As the sequence (u t,i , i ∈ N) is exchangeable and as it holds P(u t,1 ≥ t) = e −tλ 1,{{1}} > 0, the de Finetti theorem implies P(u t,i ≥ t for infinitely many i) = 1.
For all n ∈ N with Ξ(∆ >1/n ) > 0 and all k ∈ N, the random partition π n,k is exchangeable and independent of τ n,k by the definitions of τ n,k and η 0 , and by the coloring theorem for Poisson random measures. Let B n,k = {A τ n,k (t, i) : i ∈ N, u t,i ≥ t − τ n,k , τ n,k ≤ t}. As τ n,k is a K-stopping time, as π n,k is K τ n,k -measurable, as τ n,k is independent of π n,k , and as B n,k is measurable with respect to (η τ n,k , τ n,k ), the random set B n,k is independent of π n,k . Also, the definition of u t implies that for i, j ∈ N with u t,i ∧ u t,j ≥ t − τ n,k and i ≠ j, it holds A τ n,k (t, i) ≠ A τ n,k (t, j) on the event {τ n,k ≤ t}. On this event, the random set B n,k is a. s. infinite as it holds u t,i ≥ t − τ n,k for infinitely many i ∈ N a. s. A. s. on the event {τ n,k ≤ t}, the intersection B ∩ B n,k is infinite for each block B ∈ π n,k with #B ≥ 2. This follows as conditionally given |π n,k | ↓ , the partition π n,k is distributed as the paintbox partition associated with |π n,k | ↓ .
For each i ∈ N, the following holds a. s. on the event {u t,i < t}. By definition of u t , there exist random n, k such that t − u t,i = τ n,k and A τ n,k (t, i) ∈ B ∩ B n,k for a non-singleton block B of π n,k . Hence, there exists a random j ∈ N \ {i} with A τ n,k (t, j) = A τ n,k (t, i) and A τ n,k (t, j) ∈ B ∩ B n,k . This implies u t,j = u t,i < t and r t,i,j = 0, which proves the assertion for fixed t, hence also for all t ∈ Q + simultaneously on an event E of probability 1.
On the event {η ∈ N dust } of probability 1, there exists for all t ∈ R + and i ∈ N a time q(t, i) ∈ (t, ∞) ∩ Q with η((t, q(t, i)] × P (i) ) = 0. Now i ∈ Π t (0) implies i ∈ Π q(t,i) (0) and i = A t (q(t, i), i), hence u q(t,i),i ≥ q(t, i) and u t,i ≥ t on the event {η ∈ N dust } ∩ E. This proves the assertion.
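Kingman's correspondence and the paintbox representation used in the proof above can be illustrated numerically. The sketch below uses hypothetical frequencies and is not part of the construction: it samples the restriction of a paintbox partition to finitely many levels and checks that the empirical block frequencies approximate the given frequency vector, the leftover mass turning into singleton (dust) blocks.

```python
import random

def paintbox_labels(freqs, n, rng):
    """Restriction of the paintbox partition with frequency vector `freqs`
    to levels 0..n-1; leftover mass becomes dust, i.e. singleton blocks."""
    labels = []
    for i in range(n):
        u, acc, lab = rng.random(), 0.0, -(i + 1)  # unique label: singleton
        for k, s in enumerate(freqs):
            acc += s
            if u < acc:
                lab = k
                break
        labels.append(lab)
    return labels

rng = random.Random(1)
freqs = [0.5, 0.3]                    # proper frequencies; mass 0.2 is dust
labels = paintbox_labels(freqs, 20000, rng)
for k, s in enumerate(freqs):
    frac = sum(1 for a in labels if a == k) / len(labels)
    assert abs(frac - s) < 0.02       # asymptotic frequency of block k is s_k
```

By the law of large numbers, every block with positive frequency is infinite in the full partition, which is the mechanism behind the a. s. infiniteness of the intersections B ∩ B n,k in the proof.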
Remark 6. We state Corollary 2 below and this remark to give the following interpretation of the decomposition of the distance matrix ρ t that is given by the marked distance matrix R t . Assume for the moment that R d ∼ ν χ for some χ ∈ M 0 such that α −1 (χ) is ultrametric. Then the space N endowed with the semi-metric given by ρ t is the space of leaves of an R-tree for all t ∈ R + a. s., cf. Remarks 1 and 3. For all i, j ∈ N with i ∼ Πt j, the external branches of this R-tree that end in leaf i and in leaf j, respectively, begin in the same internal node. The vector u t encodes the lengths of the external branches. The matrix r t encodes the distances between the nodes in which the external branches begin. In Corollary 2, the assumption on χ is not needed.
Corollary 2. Assume Ξ ∈ M dust and R d ∼ ν χ for some χ ∈ M. Then it holds u t,i = 1 2 inf j∈N\{i} ρ t,i,j for all t ∈ R + and i ∈ N a. s.
Proof. From the construction in Section 4.1, we have u t,i ≤ 1 2 ρ t,i,j for all t ∈ R + and i, j ∈ N with i ≠ j. From Lemma 7, we obtain for all t ∈ R + and i ∈ N a. s. Let (X, r, µ) be a representative of χ and let ((x i , u i ), i ∈ N) be a µ-iid sequence. Then t → B t is non-increasing, and B t is infinite for all t ∈ R + a. s. As B t is also independent of R, it holds inf{r(x i , x j ) ∨ |u i − u j | : j ∈ B t , j ≠ i} = 0 for all t ∈ R + and i ∈ N a. s. Hence, on an event of probability 1, there exists for each ε > 0, t ∈ R + , and i ∈ N with u t,i ≥ t an integer j such that j ≠ i, u t,j ≥ t and r t,i,j ∨ |u t,i − u t,j | < ε, hence ρ t,i,j = u t,i + r t,i,j + u t,j < 2u t,i + 2ε.
for all T ∈ R + and b ∈ N 0 . The paths t → |Π t (b)| are càdlàg a. s. A result similar to Lemma 8 appears in [26, Proposition 2.13]. We defer the core of the proofs of Lemmata 8 and 9 to Lemmata 14 and 15 in Section 10.
Proof of Lemma 8. By η-measurability of the random variables in the assertion, we may assume w. l. o. g. that R is jointly exchangeable. We choose f in Lemma 14 such that for all t ∈ R + and i ∈ N. This implies that the regularity property (15) holds a. s., and that lim s ′ ↑t |Π s,s ′ (b)| = c t for some c t ∈ [0, 1] for each t ∈ (s, ∞) a. s. To show the regularity property (16), let ε > 0, and choose on an event of probability 1 a sufficiently large integer n 0 such that ||Π s,t (b)| n − |Π s,t (b)|| < ε for all t ∈ [s, s + T ] and n ≥ n 0 . Then it holds lim sup s ′ ↑t |Π s,s ′ (b)| n ≤ c t + ε and lim inf s ′ ↑t |Π s,s ′ (b)| n ≥ c t − ε for all n ≥ n 0 and t ∈ [s, s + T ]. It follows c t = |Π s,t− (b)| for all t ∈ (s, ∞) a. s.
Proof of Lemma 9. We assume w. l. o. g. that R is jointly exchangeable. Let b ∈ N. We choose f in Lemma 15 such that for all t ∈ R + and i ∈ N. Then it holds X n (t) = |Π t (b)| n and X(t) = |Π t (b)| for all n ∈ N and t ∈ R + a. s., with X n (t) and X(t) as defined in Lemma 15. As η ∈ N dust a. s., the processes t → |Π t (b)| n are càdlàg for all n ∈ N a. s. Lemma 15 implies the assertion.
It remains to consider the case b = 0. By Lemma 7, it holds Π t (0) = {i ∈ N : u t,i ≥ t} for all t ∈ R + a. s. Lemma 15 now implies the convergence (17). By Lemma 7 and as η ∈ N dust a. s., the paths t → |Π t (0)| n are càdlàg a. s. for all n ∈ N. Hence, by (17), also the paths t → |Π t (0)| are càdlàg a. s. By Lemma 7 and as η ∈ N dust a. s., it holds lim s↑t |Π s (0)| n = |Π t− (0)| n for all t ∈ (0, ∞) and n ∈ N a. s. The uniform convergence (17) implies the assertion on the left limits.
Proof. We assume w. l. o. g. that R is jointly exchangeable.
Step 1. There exists an event of probability 1 on which it holds that for all t ∈ [s + ε, ∞) with #Π s,t = ∞, the partition Π s,t contains no singleton blocks. This follows as the partition Π s,s+ε contains no singletons a. s. For t ∈ [s + ε, ∞), an implication of #Π s,t = ∞ is that #Π s+ε,t = ∞, by equation (4) also that A s+ε (t, N) = N, and on the event that Π s,s+ε contains no singletons, also that Π s,t contains no singletons.
Step 3. For all k ∈ N, the process (|Π s,t (k)|, t ∈ [s, ∞)) is adapted and has a. s. càdlàg paths by Lemma 8. Hence, ϑ ε,k is a stopping time with respect to the usual augmentation of J n for all n ∈ [[k]]. As ϑ ε,j ≥ ϑ ε,k for integers j ≥ k, it follows that ϑ ε = sup k∈N ϑ ε,k is a stopping time with respect to the usual augmentation of J n for all n ∈ N.
Proof. We proceed similarly to the proof of Lemma 10. We assume w. l. o. g. that R is jointly exchangeable.
Step 1. For all n ∈ N with Ξ(∆ >1/n ) > 0 and all k ∈ N, the partition Π̃ τ n,k − is exchangeable by Corollary 1. By definition, it contains at most one singleton block. It follows from Kingman's correspondence that Π̃ τ n,k − has proper frequencies a. s.
Step 2. Let ε > 0. We set for k ∈ N. We deduce from Lemma 9 that ϑ ε,k is a stopping time with respect to the usual augmentation of J n for all n ∈ [[k]]. Then we define ϑ ε = sup k∈N ϑ ε,k , which is for all n ∈ N a stopping time with respect to the usual augmentation of J n . Let T ∈ R + . The partition Π̃ ϑ ε ∧T is exchangeable by Lemma 6. We deduce as in step 1 that it has proper frequencies a. s.
Step 3. We conclude as in the proof of Lemma 10, using Lemma 9.
Lemma 12 below will be used in the proof of Theorem 1(iii) in the next section. We denote by M CDI the subset of those Ξ ∈ M f (∆) for which #Π s,t < ∞ for all 0 ≤ s < t a. s. (18) As condition (18) implies that Π s,t has proper frequencies for all 0 ≤ s < t a. s., it holds M CDI ⊂ M nd . An equivalent condition for Ξ ∈ M CDI is that the Ξ-coalescent comes down from infinity a. s., that is, it holds #Π s,1 < ∞ for all s ∈ [0, 1) a. s. Indeed, it can be deduced from this condition that #Π s,t < ∞ for all rational 0 ≤ s < t a. s. as it holds η t d = η for all t ∈ R + . Condition (18) then follows from the cocycle property. The random time ϑ is a stopping time with respect to the usual augmentation of J n for all n ∈ N. By Lemma 6, the distance matrix γ n (ρ ϑ∧T ) is jointly exchangeable for each T ∈ [s + ε, ∞) and n ∈ N. Hence, the partition Π s,ϑ∧T is exchangeable and by the assumption Ξ ∈ M CDI , it holds #Π s,ϑ∧T < ∞ a. s. Kingman's correspondence now implies that each block of Π s,ϑ∧T has a positive asymptotic frequency a. s. By Lemma 8, on an event of probability 1, this also holds true for all partitions Π s,t with t in a right neighborhood of ϑ ∧ T . By definition of ϑ, it follows P(ϑ < T ) = 0. The assertion follows as T and ε can be chosen arbitrarily.
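For Kingman's coalescent, coming down from infinity can be made quantitative: each transition from k to k − 1 blocks takes an exponential waiting time with rate k(k − 1)/2, and the expected time to reach b blocks stays bounded as the initial number of blocks grows. The following sketch carries out this standard Kingman computation only; it is not a statement about a general Ξ ∈ M CDI .

```python
# Expected time for Kingman's coalescent to go from n blocks down to b blocks:
# the k-to-(k-1) merger takes an Exp(k(k-1)/2) waiting time, with mean 2/(k(k-1)).
def expected_time(n, b):
    return sum(2.0 / (k * (k - 1)) for k in range(b + 1, n + 1))

# The sum telescopes: 2/(k(k-1)) = 2/(k-1) - 2/k, so the mean is 2/b - 2/n,
# which stays bounded as n -> infinity: the coalescent comes down from infinity.
for n in (10, 100, 10000):
    assert abs(expected_time(n, 1) - (2.0 - 2.0 / n)) < 1e-12
```

The bound 2/b for the expected time to reach b blocks, uniform in the starting number n, is the elementary analogue of condition (18).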
Remark 7. If the reproduction events are defined as in Remark 2, then the cocycle property (10) does not hold in general as the lines of ascent may cross in the case with simultaneous multiple reproduction events. The main results in this article nevertheless hold. In the proof of Lemma 10, it suffices to show that a. s., the partitions Π s,τ k − and Π s,ϑε∧(s+2T ) are finite or contain no singleton blocks.

The construction on the lookdown space
In this section, we endow the lookdown space with a family of probability measures. We study path properties of this family in the Prohorov metric. Moreover, we study path properties of a family of subsets of the lookdown space in the Hausdorff metric. In Theorem 1 below, we consider mathematical objects that are defined from realizations of the random variables η and R on an event of probability 1. Stochastic processes are read off from this construction in Sections 8 and 9.
For t ∈ (0, ∞) and s ∈ [0, t), we say a family of age at least t − s dies out at time t if it holds A s (t, N) ⊊ ⋂ s ′ ∈[s,t) A s (s ′ , N). We denote the set of times at which old families die out by Θ ′ . In the following, (Z, ρ) is always the lookdown space associated with the random variables η and R, that is, the completion of R + × N with respect to the semi-metric ρ obtained from η and R as in Section 4.1. Also z(t, i) for (t, i) ∈ R + × N is defined as in Section 4.1. For t ∈ R + , let X t be the closure of {t} × N in Z. For t ∈ R + and n ∈ N, we define the probability measure m n t on (Z, ρ) and the probability measure µ n t on the space Z × R + , which we endow with the metric d Z×R + ((z 1 , u 1 ), (z 2 , u 2 )) = ρ(z 1 , z 2 ) ∨ |u 2 − u 1 |. In case Ξ ∈ M nd , it holds u t = 0, hence µ n t = m n t ⊗ δ 0 , for all t ∈ (0, ∞) and n ∈ N a. s.
Then there exists an event of probability 1 on which all the following assertions hold:
(i) There exists a family (µ t , t ∈ R + ) of probability measures on Z × R + such that for all T ∈ R + . The map t → µ t is càdlàg in the weak topology. The set Θ, defined in (11), is the set of jump times. It holds R t ∈ D * and [[Z, ρ, µ t ]] = ψ(R t ) for all t ∈ R + .
(ii) The following holds if Ξ ∈ M nd : There exists a family (m t , t ∈ R + ) of probability measures on Z with µ t = m t ⊗ δ 0 for all t ∈ R + . The measure m t contains atoms for all t ∈ Θ, and the left limit m t− contains no atoms for all t ∈ (0, ∞). It holds ρ t ∈ D * and [[Z, ρ, m t ]] = ψ 0 (ρ t ) for all t ∈ R + .
(iii) The following holds if Ξ ∈ M CDI and α −1 (χ) ∈ M c : For all t ∈ R + , it holds ρ t ∈ D CDI , the set X t ⊂ Z is compact, and it holds supp m t = X t . The map t → X t is càdlàg in d Z H with set of jump times Θ ′ . For all t ∈ Θ ′ , the set X t and the left limit X t− are not isometric.
Theorem 1 above remains valid in case Ξ ∈ M nd ⊃ M CDI if the assumption that χ ∈ M 0 , R d ∼ ν χ , and α −1 (χ) ∈ M c is replaced by the assumption that ρ d ∼ ν X for some X ∈ M and R = (ρ, 0). In case Ξ ∈ M CDI , it follows from Theorem 1 that on an event of probability 1, for each t ∈ Θ ′ \ Θ, the set X t− \ supp m t− ⊂ X t− is equal to X t− \ X t ; this is the part of X t− which dies out at time t. This part is accounted for by the measure-preserving isometry class of (X t− , ρ, m t− ), but not by the equivalence class of (X t− , ρ, m t− ). In the next proposition, we determine the intersection of the sets of jump times Θ and Θ ′ .
Proposition 8. It holds Θ ∩ Θ ′ = Θ f a. s.

Now equation (21) implies the assertion.
Proof of Theorem 1 (beginning). The statements in this proof hold on an event of probability 1; we mostly omit 'a. s.' We denote the closure of the subset N of (Z, ρ) by A. Let the marked metric measure space (X, r, µ) be a representative of χ with supp X (µ) = X, and let (x, u) be a µ-iid sequence that is independent of η. We may assume R = (X,r) (x, u). For n ∈ N, we define a probability measure on X × R + . By the Glivenko-Cantelli theorem, the probability measures µ n converge weakly to µ as n tends to infinity. The map {x i : i ∈ N} → A, x i → i can be extended to an isometry ϕ from (X, r) to (A, ρ). We deduce as in the proof of Proposition 5: The probability measures µ n 0 converge weakly to the probability measure µ 0 := φ̃(µ) on A × R + as n tends to infinity, where φ̃(x, u) = (ϕ(x), u) for (x, u) ∈ X × R + . It holds R ∈ D * and [[Z, ρ, µ 0 ]] = ψ(R).
First, we assume Ξ ∈ M dust . For t ∈ R + , we define By Lemma 11, we may define a family (µ t , t ∈ R + ) of probability measures on Z × R + by for t ∈ (0, ∞). As |Π 0 (0)| = 1, equation (22) also holds for t = 0. We show that the probability measures µ n t converge to µ t . Let ε > 0 and b ∈ N. For each J ⊂ [[b]] and t ∈ R + , we define a subset C J t of Z × R + by where an empty intersection equals Z × R + . A partition of A × R + is given by Let T ∈ R + . We bound each term on the right-hand side of inequality (23). We begin with the last term. By Lemma 11, we may choose such that sup t∈[0,T ] i∈M (Πt): The second-to-last sum on the right-hand side of inequality (23) converges to zero uniformly in t ∈ [0, T ] as n tends to infinity by Lemma 9. Now we show that the second term on the right-hand side of inequality (23) converges to zero uniformly in t ∈ [0, T ]. We fix J ⊂ [[b]]. In Lemma 15, we choose f such that Y i (t) satisfies for all n ∈ N. Lemma 15 implies that X n (t), hence also µ n t (C J t ), converges to a limit X(t) uniformly in t ∈ [0, T ] as n tends to infinity.
For fixed t ∈ R + , let (k 1 , k 2 , . . .) be the increasing enumeration of {i ∈ N : u t,b+i ≥ t}. This set is a. s. infinite as shown in the proof of Lemma 7. For i ∈ N, let k ′ i = A 0 (t, b + k i ) a. s. As (k 1 , k 2 , . . .) and (k ′ 1 , k ′ 2 , . . .) are η-measurable, each of these sequences is independent of (x, u). Thus the law of large numbers implies This yields As it holds lim n→∞ n/k n = lim n→∞ |Π t (0)| k n = |Π t (0)| a. s. by Lemma 7, we have a. s. for each t ∈ R + . Equality (24) also holds for all t ∈ Q + on an event of probability 1. It holds θ −1 s (C J s ) = θ −1 t (C J t ) for all s, t ∈ R + with s ≤ t and η((s, t] × P (b) ) = 0, hence the map t → µ 0 (θ −1 t (C J t )) is càdlàg a. s. Also the map t → |Π t (0)| is càdlàg a. s. by Lemma 9. As η ∈ N dust a. s., the map t → X n (t) is càdlàg a. s. for all n ∈ N, hence also t → X(t) is càdlàg a. s. It follows that equation (24) holds simultaneously for all t ∈ R + on an event of probability 1, hence X(t) = µ t (C J t ) for all t ∈ R + a. s. As the random set {A 0 (T, i) : i ∈ N, u T,i ≥ T } is infinite and independent of (x, u), it follows that {(x A 0 (T,i) , u A 0 (T,i) ) : i ∈ N, u T,i ≥ T } is a. s. dense in supp µ, hence {(A 0 (T, i), u A 0 (T,i) ) : i ∈ N, u T,i ≥ T } is a. s. dense in supp µ 0 . It follows that the infimum B of all b ∈ N with Now inequality (23) implies that on an event of probability 1, it holds lim sup As ε > 0 and b ∈ N were arbitrary, it follows lim n→∞ d P (µ n t , µ t ) = 0 a. s.
As η ∈ N dust , the maps t → µ n t are càdlàg for all n ∈ N, hence the map t → µ t is càdlàg. As T ∈ R + was arbitrary, it holds R t ∈ D * and [[Z, ρ, µ t ]] = ψ(R t ) for all t ∈ R + . Here we use that the completion of N with respect to the semi-metric obtained from r t is isometric to the closure of {z(t, i) : i ∈ N} in Z.
We state the left limits explicitly. By Lemma 11, we may define a family of probability measures on Z × R + by for each t ∈ (0, ∞). There exists for each t ∈ (0, ∞) and ε > 0 an integer such that i∈M there exists a coupling ν of the probability measures µ s and µ t− such that The coupling characterization of the Prohorov metric implies By Lemma 9, it follows that µ s converges weakly to µ t− as s ↑ t. For t / ∈ Θ, it holds Π t− = Π t , z(t−, i) = z(t, i), and u t−,i = u t,i for all i ∈ N, hence µ t− = µ t .
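The coupling characterization used above states that the Prohorov distance between two probability measures is at most ε whenever some coupling puts mass at most ε on pairs at distance greater than ε. For finitely supported measures on the line this bound is easy to evaluate. The following sketch uses toy data, not the measures µ s and µ t− of the proof; it shows that an atom displaced a long way contributes only its (small) mass to the bound.

```python
def coupling_prohorov_bound(pairs, weights):
    """Smallest candidate eps with coupling-mass{|x - y| > eps} <= eps;
    by the coupling characterization, the Prohorov distance between the
    two marginals of the coupling (pairs, weights) is at most eps."""
    candidates = sorted({0.0} | {abs(x - y) for x, y in pairs})
    for eps in candidates:
        mass = sum(w for (x, y), w in zip(pairs, weights) if abs(x - y) > eps)
        if mass <= eps:
            return eps
    return max(abs(x - y) for x, y in pairs)

# Atoms of mu paired with atoms of mu'; one atom of mass 0.02 jumps far away.
pairs = [(0.0, 0.01), (1.0, 1.02), (2.0, 2.015), (3.0, 13.0)]
weights = [0.33, 0.33, 0.32, 0.02]
eps = coupling_prohorov_bound(pairs, weights)
assert abs(eps - 0.02) < 1e-9   # the far-moving atom only costs its mass
```

This is the mechanism behind the left-limit estimate: atoms that move only slightly contribute their displacement, while a part of the population of small mass may move arbitrarily far without spoiling the bound.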
To prove uniform convergence in a neighborhood of 0 in Theorem 1 in the case Ξ ∈ M nd , we use the following application of Lemma 14.
Then it holds lim
Proof. In Lemma 14, choose f such that for all t ∈ R + and i ∈ N. The assertion follows as it holds X n (t) ≤ ((b + 1 + n)/n) |A J t | b+1+n ≤ X n (t) + (b + 1)/n and X(t) = |A J t | for all t ∈ R + a. s. in this case, where X n (t) and X(t) are defined in Lemma 14.
Proof of Theorem 1 (continuation). Now we assume Ξ ∈ M nd . The assumption χ ∈ M 0 implies u 0 = 0, hence ρ 0 = r 0 ∈ D * and µ 0 = m 0 ⊗ δ 0 for a probability measure m 0 on Z which is the weak limit of the probability measures m n 0 as n → ∞. We show that there exists for each t ∈ (0, ∞) a probability measure m t on Z such that lim n→∞ sup t∈[0,T ] d Z P (m n t , m t ) = 0 and that the map t → m t is càdlàg with set of discontinuity times Θ. We already noted that u t = 0 for all t ∈ (0, ∞). The assertion in the theorem may then be deduced by defining probability measures µ t on Z × R + by µ t = m t ⊗ δ 0 .
We work on an event of probability 1 on which in particular the assertions in Lemma 8 and Lemma 10 hold simultaneously for all s ∈ Q + . We define for each t ∈ (0, ∞), s ∈ (0, t) ∩ Q, and n ∈ N a probability measure m (n,s) t . Let T ∈ (ε, ∞); now we show uniform convergence in [τ, T ]. Let s 0 = 0 and s j = τ + (j − 1)ε for j ∈ N. By Lemma 10, we may choose for each j ∈ N 0 an integer ℓ j such that for all t ∈ [s j+1 , s j+2 ]. We set ℓ = max{ℓ j : j ∈ N 0 , s j+1 ≤ T }. By Lemma 8, there exists an integer n ′ such that for all n ≥ n ′ , it holds |Π s j ,t (i)| n > 1 − ε for all t ∈ [s j+1 , s j+2 ] and j ∈ N 0 with s j+1 ≤ T . For all t ∈ [τ, T ] and k, n ≥ n ′ , it holds, as the Prohorov distance is bounded from above by the total variation distance, where j ∈ N 0 is such that t ∈ [s j+1 , s j+2 ]. By Lemma 8, this expression converges to ε uniformly in t ∈ [τ, T ] as n, k → ∞. Furthermore, it holds for all n, k ∈ N and t ∈ [τ, T ]. Hence, there exists a family (m t , t ∈ R + ) of probability measures on the complete space Z such that lim n→∞ sup t∈[0,T ] d Z P (m n t , m t ) = 0.
As the completion of N with respect to the semi-metric obtained from ρ t is isometric to X t , the above implies in particular ρ t ∈ D * and [[Z, ρ, m t ]] = [[X t , ρ, m t ]] = ψ 0 (ρ t ) for all t ∈ R + . As it holds u t = 0 for all t ∈ R + , we also obtain R t ∈ D * and [[Z, ρ, µ t ]] = ψ(R t ).
As η ∈ N , the maps t → m n t are càdlàg for all n ∈ N, hence the map t → m t is càdlàg. For n ∈ N and t ∈ (0, ∞), we define the probability measure Then it holds m n t− = w-lim s↑t m n s . Let m t− = w-lim s↑t m s . From the uniform convergence shown above, it follows that m n t− converges weakly to m t− as n → ∞. Note that it holds d Z P (m n t− , m n t ) ≤ 1/n for all t ∈ (0, ∞) \ Θ and all n ∈ N a. s. This implies m t− = m t for all t ∈ (0, ∞) \ Θ a. s.
For all k, n ∈ N with Ξ(∆ >1/n ) > 0, it holds ρ τ n,k −,i,j > 0 a. s. for all i ≠ j as η ∈ N a. s., and there exists a. s. an i ∈ N such that |{j ∈ N : ρ τ n,k ,i,j = 0}| = |π n,k (i)| > 0. Let χ ′ = ψ 0 (ρ τ n,k ). On an event of probability 1, we sample an m τ n,k -iid sequence x in Z and we set ρ ′ = (ρ(x i , x j )) i,j∈N . Then ρ ′ is a D-valued random variable with conditional distribution ν χ ′ given χ ′ . As ρ τ n,k is jointly exchangeable, it follows from Proposition 4 that ρ τ n,k and ρ ′ are equal in distribution. It follows that a. s., there exists an i with ρ(x i , x j ) = 0 for infinitely many j, hence m τ n,k contains an atom a. s. Similarly, it can be shown that m τ n,k − is non-atomic. We deduce that the set of discontinuity times of the map t → m t is Θ = {τ n,k : n, k ∈ N, Ξ(∆ >1/n ) > 0}.
We may postpone the proof of part (iii) and look into the proof of the first parts of Theorem 1 to obtain descriptions of the sampling measure µ t for t ∈ R + .
We begin with the case Ξ ∈ M dust . The random variables R and η are realized a. s. in such a way that for each t ∈ R + , one can sample from the lookdown space Z according to the probability measure Z (µ t ) as follows: With probability given by the asymptotic frequency of the individuals at time t whose ancestral lineages do not coalesce with other ancestral lineages within the time interval (0, t], we sample according to Z (µ 0 ). For each block in a reproduction event at a time τ ′ in (0, t], we draw the individual on, say, the lowest level in this block (which is identified with the individuals on all other levels in this block, as they have genealogical distance zero) with probability given by the asymptotic frequency of the individuals at time t that descend from this block and whose ancestral lineages do not coalesce with any other ancestral lineages in the time interval (τ ′ , t]. This description of Z (µ t ) is a consequence of equation (22).
The assumption Ξ ∈ M CDI implies #Π s,t < ∞ for all t ∈ (0, ∞) and s ∈ (0, t) a. s., hence the measures m (n,s) t converge as n tends to infinity for all t ∈ (0, ∞) and s ∈ (0, t) ∩ Q a. s. It also holds d Z P (m (n,s) t , m n t ) ≤ 2(t − s) for all n ∈ N; this implies (26) with m̃ t = m t . By the coupling characterization of the Prohorov metric, we have, using m s (U Z 2(s−r) (s, i)) = |Π r,s (i)| and m t (U Z 2(t−r) (t, i)) = |Π r,t (i)|, that the metric measure spaces (X t , ρ, m n t ) and (X t , ρ t , n −1 Σ n i=1 δ i ) are measure-preserving isometric. By the construction in the beginning of this proof, the metric measure spaces (X, r, X (µ)) and (X 0 , ρ, m 0 ) are measure-preserving isometric with supp m 0 = X 0 , and the assumptions imply that X 0 is compact. We obtain [X t , ρ, m t ] = υ(ρ t ), supp m t = X t , and β([[Z, ρ, m t ]]) = [X t , ρ, m t ] for all t ∈ R + .
We come to the regularity properties. We begin with right continuity. Let t ∈ R + and ε > 0. By construction, it holds X s ⊂ B Z s−t (X t ) for all s ≥ t. As X t is compact and as {t} × N is dense in X t , there exists n ∈ N such that X t ⊂ B Z ε (X n t ).
This proves right continuity of the map t → X t in d H .
For t ∈ (0, ∞) \ Θ ′ , it holds X t = X t− . To show this, let x ∈ X t− . Then there exists a sequence ((s k , i k ) : k ∈ N) in (0, t) × N ⊂ Z with 0 < s 1 < s 2 < . . . which converges to x. For each k ∈ N, there exists ℓ ∈ N such that ρ((s n , i n ), (s ℓ , i ℓ )) < 2(s ℓ − s k ) for all n ≥ ℓ. This implies A s k (s n , i n ) = A s k (s ℓ , i ℓ ) for all n ≥ ℓ. We set j k = A s k (s ℓ , i ℓ ). Then it holds j k = A s k (s n , j n ) for all n ≥ k, hence j k ∈ ⋂ s∈[s k ,t) A s k (s, N). By our assumption on t, this implies j k ∈ A s k (t, N), hence j ′ k := D t (s k , j k ) < ∞ for all k ∈ N. The sequence ((s k , j k ) : k ∈ N) converges to x as it holds ρ((s k , j k ), x) = lim n→∞ ρ((s k , j k ), (s n , i n )) ≤ lim n→∞ ρ((s k , i k ), (s n , i n )) = ρ((s k , i k ), x) for all k ∈ N. Also the sequence ((t, j ′ k ) : k ∈ N) converges to x as it holds ρ((t, j ′ k ), (s k , j k )) = t − s k .
Theorem 2. Let χ ∈ M and assume R d ∼ ν χ . If Ξ ∈ M nd , assume χ ∈ M 0 . Then the M-valued Ξ-Fleming-Viot process with initial state χ given by (χ t , t ∈ R + ) = (ψ(R t ), t ∈ R + ) a. s.
is Markov and has a. s. càdlàg paths in (M, d MGP ). The set of jump times is a. s. equal to Θ.
Let (χ n , n ∈ N) be a sequence in M with lim n→∞ d MGP (χ n , χ) = 0. For n ∈ N, let R n be a random variable with values in D and distribution ν χ n , independent of η. Then the Feller property ψ(R(η, R n , t)) d → ψ(R(η, R, t)) is satisfied for all t ∈ R + . If Ξ ∈ M nd , let χ ′ = α −1 (χ) ∈ M ′ and define an M ′ -valued Ξ-Fleming-Viot process with initial state χ ′ by In this case, it then holds χ t = α(χ ′ t ) for all t ∈ R + a. s.
Remark 10. Also the atomicity properties of the families of probability measures (µ t , t ∈ R + ) and (m t , t ∈ R + ) given in Theorem 1(i) and 1(ii) carry over to the processes (χ t , t ∈ R + ) and (χ ′ t , t ∈ R + ).
Proof of Theorem 2. We use Theorem 1. By definition of the marked Gromov-Prohorov metric, the process (χ t , t ∈ R + ) = ([[Z, ρ, µ t ]], t ∈ R + ) has càdlàg paths in (M, d MGP ) and no discontinuity times outside Θ a. s. The left limits are given by χ t− = [[Z, ρ, µ t− ]] for all t ∈ (0, ∞) a. s. As the atomicity properties given in Theorem 1(i) and 1(ii) are properties of the equivalence classes, it follows χ t− ≠ χ t for all t ∈ Θ a. s. In case Ξ ∈ M nd , Theorem 1(ii) implies χ t = α(χ ′ t ) for all t ∈ R + a. s.
For each t ∈ R + , the marked distance matrix R t is jointly exchangeable by Lemma 3. Let R̃ be a D * -valued random variable with conditional distribution ν χ t given χ t . Then, by Proposition 4, the marked distance matrices R t and R̃ are equal in distribution. The Markov property of (χ t , t ∈ R + ) now follows from an application of Theorem 2 of Rogers and Pitman [35] to the D * -valued Markov process (R t , t ∈ R + ), the measurable function ψ : D * → M, and the probability kernel K from M to D * given by K(χ, ·) = ν χ for χ ∈ M.
It remains to show equation (27). Let t ∈ R + . As the map D → D, R ′ → R(η, R ′ , t) is continuous, we have the convergence in distribution R(η, R n , t) d → R(η, R, t) as n → ∞. For every φ ∈ C with associated polynomial Φ, it holds by Proposition 4, as the marked distance matrices R(η, R n , t) and R(η, R, t) are jointly exchangeable. As the set of polynomials is convergence determining in M, we deduce ψ(R(η, R n , t)) d → ψ(R(η, R, t)) in M as n → ∞.
We show that the M-valued Ξ-Fleming-Viot process converges to an equilibrium. Let η be a Poisson random measure on R × P with intensity dt H Ξ (dπ). It can be shown similarly to Theorem 1 that R̃(η, t) ∈ D * for all t ∈ R a. s. if Ξ(∆) > 0. In the following definition, we consider a generalization of the Λ-coalescent measure tree of [19].
Theorem 3. Assume Ξ(∆) > 0 and let (χ t , t ∈ R + ) be an M-valued Ξ-Fleming-Viot process with initial state χ ∈ M. Then χ t converges in distribution in the marked Gromov-weak topology to a Ξ-coalescent measure tree as t tends to infinity.
Let n ∈ N and φ ∈ C n with associated polynomial Φ. By comparing (28) and (29), we obtain for all t ∈ R + as both sides of this equation are independent of R and R̃(η, 0). As χ t d = χ 0 , the above implies where the right-hand side converges to zero as t tends to infinity by dominated convergence. The assertion follows as the set of polynomials is convergence determining.
In [11], convergence to stationarity of measure-valued Fleming-Viot processes is also proved by a coupling argument.

Martingale problems
We denote by D M ([0, ∞)) and D M ′ ([0, ∞)) the spaces of càdlàg paths in M and M ′ , respectively, which we endow with the Skorohod metric.
Theorem 4. Assume Ξ ∈ M dust , let χ ∈ M, and let (χ t , t ∈ R + ) be a càdlàg version of an M-valued Ξ-Fleming-Viot process with initial state χ. Define an operator Ω : for φ ∈ C with associated polynomial Φ and χ ∈ M, where Ω 2 is defined from Ξ in Proposition 7. Then the law of (χ t , t ∈ R + ) is the unique solution in D M ([0, ∞)) of the martingale problem (Ω, Π) with initial state χ.
Theorem 5. Assume Ξ ∈ M nd , let χ ′ ∈ M ′ , and let (χ ′ t , t ∈ R + ) be a càdlàg version of an M ′ -valued Ξ-Fleming-Viot process with initial state χ ′ . Define an operator Ω ′ : for φ ∈ C with associated polynomial Φ and χ ′ ∈ M ′ , where Ω 1 is defined from Ξ in Proposition 6. Then the law of (χ ′ t , t ∈ R + ) is the unique solution in D M ′ ([0, ∞)) of the martingale problem (Ω ′ , Π ′ ) with initial state χ ′ .
Proof of Theorem 4. For χ ∈ M, we denote the probability measure of our probability space by P χ and the associated expectation by E χ when we wish to indicate that R has distribution ν χ . Under P χ , we may assume (χ t , t ∈ R + ) = (ψ(R t ), t ∈ R + ) a. s. Denote by F = (F t , t ∈ R + ) the filtration induced by the process (χ t , t ∈ R + ). Let φ ∈ C with associated polynomial Φ, fix 0 ≤ s ≤ t, and A ∈ F s . By the Markov property of (χ y , y ∈ R + ), it holds with We have The first equality follows from Fubini and the definitions of Φ and Ω. For the second equality, we apply Lemma 3 and Proposition 4. Proposition 7 implies that expression (31) equals zero. It follows that the process (χ t , t ∈ R + ) solves the martingale problem (Ω, Π 1 ). We prove uniqueness using a function-valued dual process. This method is applied in the context of tree-valued Fleming-Viot processes in [9]; another dual process is used in [20]. We fix n ∈ N and work with a dual process with state space C n . We define the function ι 3 : S n → (C n → C n ), σ → (φ → (φ • ι 2 (σ))).
For σ ∈ S n , we write also σ instead of ι 3 (σ) for brevity. We define an independent process (φ t , t ∈ R + ) as the Markov process with càdlàg paths in C n such that • for each σ ∈ S n \ {∅} at rate λ n,σ , the process jumps from φ to σ(φ), • and between jumps, the process evolves deterministically according to φ t+s (r , u ) = φ t (r , u + 1 n s) for s, t ∈ R + and (r , u ) ∈ D n .
For π ∈ P n , we write also π instead of ι 4 (π) for brevity. We prove uniqueness analogously to the proof of Theorem 4. We define an independent process (φ 0 t , t ∈ R + ) as the Markov process with càdlàg paths in C n such that • for each π ∈ P n at rate λ π , the process jumps from φ ∈ C n to π(φ), • and between jumps, the process evolves deterministically according to for s, t ∈ R + and ρ ∈ D n .

The M-valued evolving Ξ-coalescent
By Theorem 1, we can define an M-valued stochastic process as follows.
Definition 5. Assume Ξ ∈ M CDI , let X ∈ M, and assume ρ d ∼ ν X . We say the stochastic process given by (υ(ρ t ), t ∈ R + ) on an event of probability 1 is an M-valued evolving Ξ-coalescent starting from X .
The following definition is analogous to Definition 4.
Definition 6. Assume Ξ ∈ M CDI . We call a random variable with values in M that is distributed as υ(ρ(η, 0)) an M-valued Ξ-coalescent.
Theorem 6. Assume Ξ ∈ M CDI , let X ∈ M, and assume ρ d ∼ ν X . Then the M-valued evolving Ξ-coalescent starting from X given by is Markov with a. s. càdlàg paths in (M, d GHP ). The set of discontinuity times is a. s. equal to Θ ∪ Θ ′ .
Let (X n , n ∈ N) be a sequence in M with lim n→∞ d GP (X n , X ) = 0. For n ∈ N, let ρ n be a random variable with values in D and distribution ν X n that is independent of η. Then the continuity property (32) is satisfied for all t ∈ (0, ∞). Furthermore, X t converges in distribution in the Gromov-Hausdorff-Prohorov topology to an M-valued Ξ-coalescent as t tends to infinity.
Proof. We work in the context of Theorem 1. It holds (X_t, t ∈ R_+) = ([X_t, ρ, m_t], t ∈ R_+) a. s. This implies the path regularity. We denote the left limit at a time t ∈ (0, ∞) by X_{t−} = lim_{s↑t} X_s. On an event of probability 1, we have X_t ≠ X_{t−} for all t ∈ Θ, as X_t and X_{t−} are not isometric for these t by Theorem 1(iii). Moreover, we have X_t ≠ X_{t−} for all t ∈ Θ' a. s. by Theorem 1(ii). The Markov property follows as in Theorem 2.
To show equation (32), let t, ε > 0 and n ∈ N. Let (Z', ρ') be the lookdown space associated with η and (ρ^n, 0) ∈ D. Let X'_t ⊂ Z' be the closure of the set of individuals at time t therein, and define a probability measure m'_t on Z' analogously to m_t. Then there exists a. s. a measure-preserving homeomorphism h between (X_t, ρ, m_t) and (X'_t, ρ', m'_t). The correspondence R = {(x, h(x)) : x ∈ X_t} ⊂ X_t × X'_t has distortion max{|ρ^n_{i,j} − ρ_{i,j}| : i, j ∈ A_0(t, N)}. With the coupling ν(dx dx') = m_t(dx) δ_{h(x)}(dx') of m_t and m'_t, Proposition 2 yields a bound on P(d_GHP(υ(ρ(η, ρ^n, t)), X_t) ≥ ε) for all k ∈ N, as it holds [X'_t, ρ', m'_t] = υ(ρ(η, ρ^n, t)) a. s. We may assume w. l. o. g. that the distance matrices ρ^n converge in probability. We let n and then k tend to infinity.

The proof of the last assertion is similar to the proof of Theorem 3. Let the random variable X' be an M-valued Ξ-coalescent. Let ρ'' be a D-valued random variable that is independent of η with conditional distribution ν_{X'} given X'. Let (X'_t, t ∈ R_+) = (υ(ρ(η, ρ'', t)), t ∈ R_+) a. s.
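The step via Proposition 2 uses the standard estimate bounding the Gromov-Hausdorff-Prohorov distance in terms of a correspondence and a coupling; the following is a sketch in a commonly used formulation (the exact constants in Proposition 2 may differ):

```latex
\[
  d_{\mathrm{GHP}}\bigl((X, r, m), (X', r', m')\bigr)
  \;\le\; \max\Bigl\{\tfrac{1}{2}\operatorname{dis} R,\;
      \nu\bigl((X \times X') \setminus R\bigr)\Bigr\},
\]
% where R \subset X \times X' is a correspondence with distortion
%   \operatorname{dis} R
%     = \sup\{\, |r(x,y) - r'(x',y')| : (x,x'), (y,y') \in R \,\},
% and \nu is a coupling of m and m'.  For the graph correspondence
% R = \{(x, h(x))\} and the coupling \nu(dx\,dx') = m(dx)\,\delta_{h(x)}(dx'),
% the mass term vanishes and only the distortion term remains.
```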
Then X'_t is an M-valued Ξ-coalescent for all t ∈ R_+, and it holds X_t = X'_t on the event {diam X'_t < 2t}. By the Markov property, as P(diam X'_1 < 2) > 0, and as {diam X'_1 < 2} is independent of ρ'', the random time τ = inf{t ≥ 0 : diam X'_t < 2t} is geometrically bounded. The assertion follows as it holds diam X'_t < 2t for all t > τ.
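The geometric bound on τ can be made explicit as follows; this is a sketch, writing p for the positive probability of the event from the proof:

```latex
\[
  p \;=\; \mathbf{P}(\operatorname{diam} X'_1 < 2) \;>\; 0,
  \qquad
  \mathbf{P}(\tau > k) \;\le\; (1 - p)^k
  \quad \text{for all } k \in \mathbb{N},
\]
% by applying the Markov property at the integer times 1, ..., k and
% using the independence noted in the text; in particular, \tau is
% a.s. finite with geometrically decaying tail.
```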
Remark 12. It follows from Remark 3 that the subspaces X_t of the lookdown space are ultrametric for all t ∈ R_+ if X_0 is ultrametric. As ultrametric spaces are closely related to R-trees, Definition 5 can be used to define a process with values in the space of measure-preserving isometry classes of weighted R-trees. Alternatively, the lookdown space can be extended to an R-tree so that this process can be read off directly.
10 Uniform convergence in the lookdown model

Donnelly and Kurtz [12] prove that the measure-valued processes whose states are the empirical measures of the types on the first n levels in the lookdown model converge a. s. as n tends to infinity. In this section, we adapt their techniques to prove Lemmata 14 and 15 below. In the proofs, we use the exchangeability properties from Section 5.4. We apply Lemma 14 in the proofs of Lemmata 8 and 13, and we apply Lemma 15 in the proofs of Lemma 9 and Theorem 1.
We work with the stochastic processes (U(t), t ∈ R_+) and (V(t), t ∈ R_+). By equation (12), the random variable U(t) is finite for all t ∈ R_+ a. s.; it equals the sum of the squared asymptotic block frequencies in the reproduction events up to time t. If Ξ ∈ M_dust, then also the random variable V(t) is finite for all t ∈ R_+ a. s.

Lemma 14. Impose the assumption on f that |X_n(t) − X_n(s)| 1{η((s, t] × P_b) = 0} ≤ n^{-1} N_{b+1+n}(s, t] for all n ∈ N and 0 ≤ s < t. Then there exists a process (X(t), t ∈ R_+) with lim_{n→∞} sup_{t∈[0,T]} |X_n(t) − X(t)| = 0 a. s. for all T ∈ R_+.
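The defining display for U was lost in extraction; the following is a hedged reconstruction from the description of U(t) as the sum of the squared asymptotic block frequencies, in notation (the blocks B of the partition π marking a reproduction event, and their asymptotic frequencies |B|) that may differ from the conventions used elsewhere in the text:

```latex
\[
  U(t) \;=\; \sum_{(s,\pi)\,:\; s \le t} \;\sum_{B \in \pi} |B|^2 ,
\]
% where the outer sum ranges over the reproduction events (s, \pi)
% of the point configuration \eta up to time t, and |B| denotes the
% asymptotic frequency of the block B.
```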
Lemma 15. Assume Ξ ∈ M_dust and that R is jointly exchangeable. Let b ∈ N and let f be a function from R_+ × R_+^{2b} × R_+ to {0, 1}. For t ∈ R_+ and i, n ∈ N, define Y_i(t) = f(t, (r_{t,j,b+i}, u_{t,j})_{j∈[[b]]}, u_{t,b+i}) and X_n(t) = n^{-1} Σ_{i=1}^n Y_i(t).
The following proof is an adaptation of the proofs of Lemmata 3.4 and 3.5 of Donnelly and Kurtz [12] and of Lemma 3.2 of Birkner et al. [6].
As it holds up to null events for all k ∈ N_0, the above implies the following. For k ∈ N_0, it holds ᾱ_k ≥ β_k ∧ α_{k+1} a. s. on the event under consideration. Indeed, the intersection of this event with {ᾱ_k < β_k ∧ α_{k+1}} is a null event, as it holds there |X_n(ᾱ_k) − X_n(α_k)| < 3ε, whereas we have |X_n(ᾱ_k) − X_n(α_k)| ≥ 4ε a. s. by definition of ᾱ_k and right continuity. Similarly, it holds β̄_k ≥ α_{k+1} a. s. on the event. By Lemma 16 below and the Markov inequality, it holds Σ_{ℓ∈N} P(N_{b+1+ℓ}(0, α_1^ℓ] > ε) < ∞.
By the strong Markov property of (N_{b+1+ℓ}(0, t], t ∈ R_+) at α_k and by the assumption on f, it follows that there exist δ_ℓ which do not depend on n such that Σ_{ℓ=1}^∞ δ_ℓ < ∞ and P(sup_t |X_n(t) − X(t)| > 4ε, U(T) ≤ c, E_ℓ) < δ_ℓ for all ℓ ∈ N. By Corollary 1 and the de Finetti Theorem, there exists an event of probability 1 on which the limits X(t) = lim_{n→∞} X_n(t) exist for all t ∈ Q_+. The assertion follows by letting c tend to infinity.
Lemma 16. For ℓ ∈ N, let α_1^ℓ be defined as in the proof of Lemma 14. Then there exists a constant C such that E[(N_2(0, α_1^ℓ])^4] ≤ C ℓ^{-2} for all ℓ ∈ N.
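Lemma 16 feeds into the summability claim in the proof of Lemma 14 via the Markov inequality; the following sketch spells out this step under the fourth-moment bound of the lemma (with N_2 standing in for N_{b+1+ℓ} after the exchangeability reduction used in the text):

```latex
\[
  \mathbf{P}\bigl(N_2(0, \alpha_1^\ell] > \varepsilon\bigr)
  \;\le\; \varepsilon^{-4}\,
    \mathbf{E}\bigl[(N_2(0, \alpha_1^\ell])^4\bigr]
  \;\le\; C\,\varepsilon^{-4}\,\ell^{-2},
\]
% which is summable over \ell \in \mathbb{N}, so the Borel-Cantelli
% lemma applies to the events {N_2(0, \alpha_1^\ell] > \varepsilon}.
```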
The proof extends the argument presented on p. 44 in [6], where additional assumptions on Ξ are required to ensure that the process used there in place of (U(t), t ∈ R_+) is finite.
The last inequality follows from the definitions of (V(t), t ∈ R_+) and α_1^ℓ. The assertion for Y_i(t) = 1{u_{t,i} ≥ t} can be proved by setting b = 0, P^{(0)} = ∅, and defining E as the sure event. Alternatively, we can choose b = 1 and f such that f(t, (r_{t,j,b+i}, u_{t,j})_{j∈[[b]]}, u_{t,b+i}) = 1{u_{t,b+i} ≥ t} for t ∈ R_+ and i ∈ N.