A representation for exchangeable coalescent trees and generalized tree-valued Fleming-Viot processes

We give a de Finetti type representation for exchangeable random coalescent trees (formally described as semi-ultrametrics) in terms of sampling iid sequences from marked metric measure spaces. We apply this representation to define versions of tree-valued Fleming-Viot processes from a $\Xi$-lookdown model. As state spaces for these processes, we use, besides the space of isomorphy classes of metric measure spaces, also the space of isomorphy classes of marked metric measure spaces and a space of distance matrix distributions. This allows us to include the case with dust in which the genealogical trees have isolated leaves.


Introduction
Some background on coalescent trees, ultrametrics, and metric measure spaces
In population genetics, coalescents are common models for the genealogy of a sample from a population. The Kingman coalescent [33] is a partition-valued process in which each individual of the sample forms its own block at time 0, and as we look into the past, each pair of blocks merges independently at constant rate. These blocks stand for the families of individuals that have a common ancestor at given times in the past. Generalizations of the Kingman coalescent include the Λ-coalescent (Pitman [44], Sagitov [46], Donnelly and Kurtz [16]) where multiple blocks are allowed to merge to a single block at the same time, and the Ξ-coalescent (Möhle and Sagitov [41], Schweinsberg [47]) where several clusters of blocks may also merge simultaneously.
A (semi-)ultrametric ρ is a (semi-)metric that satisfies the strong triangle inequality max{ρ(x, y), ρ(y, z)} ≥ ρ(x, z). A realization of a coalescent for an infinite sample can be expressed as a càdlàg path $(\pi_t, t \in \mathbb{R}_+)$ with values in the space of partitions of N such that $\pi_t$ is a coarsening of $\pi_s$ for all s ≤ t. We assume that for each pair of integers, there is a time t such that the elements of this pair are in a common block of $\pi_t$. Then $(\pi_t, t \in \mathbb{R}_+)$ can equivalently be expressed as a semi-ultrametric ρ on N such that for all $t \in \mathbb{R}_+$ and i, j ∈ N,
$$\rho(i, j) \le 2t \quad \text{if and only if } i \text{ and } j \text{ are in the same block of } \pi_t, \tag{1.1}$$
and (1.1) yields a one-to-one correspondence between these càdlàg paths and the semi-ultrametrics on N, cf. [21, Example 3.41] and [22, p. 262].
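To illustrate the correspondence (1.1), the following minimal sketch restricts a coalescent to finitely many individuals and reads off the semi-ultrametric as twice the first time at which two individuals share a block. The function names and the encoding of the path as a list of (time, partition) pairs are assumptions made for this example only, and the final partition is assumed to consist of a single block.

```python
from itertools import combinations


def semi_ultrametric_from_coalescent(n, events):
    """Return rho on {0, ..., n-1} with rho(i, j) = 2 * (first time i and j share a block).

    `events` is a list of (time, partition) pairs with increasing times, where
    each partition is a list of sets covering range(n) and coarsens the
    previous one. This encoding is a hypothetical choice for the sketch.
    """
    rho = [[0.0] * n for _ in range(n)]
    merged = [[i == j for j in range(n)] for i in range(n)]
    for t, partition in events:
        for block in partition:
            for i, j in combinations(sorted(block), 2):
                if not merged[i][j]:
                    merged[i][j] = merged[j][i] = True
                    rho[i][j] = rho[j][i] = 2.0 * t
    return rho


def is_semi_ultrametric(rho, tol=1e-12):
    """Check symmetry, zero diagonal, non-negativity, and the strong triangle inequality."""
    n = len(rho)
    for x in range(n):
        if rho[x][x] != 0:
            return False
        for y in range(n):
            if rho[x][y] < 0 or abs(rho[x][y] - rho[y][x]) > tol:
                return False
            for z in range(n):
                if max(rho[x][y], rho[y][z]) + tol < rho[x][z]:
                    return False
    return True


# Four individuals: {0,1} merge at time 0.5, {2,3} at time 1.0, everything at time 2.0.
events = [
    (0.5, [{0, 1}, {2}, {3}]),
    (1.0, [{0, 1}, {2, 3}]),
    (2.0, [{0, 1, 2, 3}]),
]
rho = semi_ultrametric_from_coalescent(4, events)
print(rho[0][1], rho[2][3], rho[0][2], is_semi_ultrametric(rho))
```

The sketch only covers the finite restriction; the one-to-one correspondence stated above concerns càdlàg paths of partitions of N.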
Evans [20] studies the completion of the random ultrametric space associated with the Kingman coalescent, which he endows with a probability measure such that the mass on each ball is given by the asymptotic frequency of the corresponding family, and a class of more general coalescents is studied by Berestycki et al. [3].
Remark 1.1. Let us briefly recall the well-known correspondence between ultrametric spaces and real trees, to which we will refer to explain main concepts in this article. A real tree is a metric space (T, d) that is tree-like in the sense that (i) no subspace is homeomorphic to the unit circle, and (ii) for each x, y ∈ T, there exists an isometry ι from the real interval [0, d(x, y)] to T with ι(0) = x and ι(d(x, y)) = y, see e. g. Evans [21] for an overview. An ultrametric space (X, ρ) can be isometrically embedded into the real tree (T, d) that is obtained by identifying the elements with distance zero of the semi-metric space $(\mathbb{R}_+ \times X, d)$ given by
$$d((s, i), (t, j)) = \max\{\rho(i, j) - s - t, |s - t|\}.$$
Then T equals the set U 0 in [22, p. 262] with X = N, and the metrics d here and in [22, p. 262] coincide up to a factor 2. Clearly, (X, ρ) is isometric to the subspace {0} × X of the leaves of (T, d). For a semi-ultrametric space (X, ρ), we identify the elements with distance zero to obtain an ultrametric space which we associate with a real tree (T, d) as above. A related embedding of an ultrametric space is given in [29, Section 6].
As in Remark 1.1, a semi-ultrametric on N can be considered as an infinite tree whose leaves are labeled by the elements of N. Often these labels are not relevant, for instance, when they only record the order in which iid samples from a population are drawn. To remove the labels, we could pass to the isometry class. However, the asymptotic block frequencies in the coalescent given by an ultrametric on N are not determined by the isometry class, as one may apply an infinite permutation without changing the isometry class. To retain just this information besides the metric structure, we can take a measure-preserving isometry class of the completion of the ultrametric space that is endowed with a probability measure that charges each ball with the asymptotic frequency of the corresponding block, if such a probability measure exists. This probability measure can equivalently be described as the weak limit of the uniform probability measures on the individuals 1, . . . , n, as n → ∞. Then we obtain the description by isomorphy classes of metric measure spaces of Greven, Pfaffelhuber, and Winter [25] that was applied to Λ-coalescents in the dust-free case. We speak of the dust-free case if the semi-ultrametric space has no isolated points, which means that the coalescent tree has no isolated leaves. Greven, Pfaffelhuber, and Winter [25] also show that their approach is not directly applicable to Λ-coalescents with dust. The most elementary example for the case with dust is the star-shaped coalescent which starts in the partition into singleton blocks which all merge into a single block at some instant. The associated ultrametric on N induces the discrete topology. Here the uniform probability measures on 1, . . . , n do not converge weakly as they converge vaguely to the zero measure.
A triple (X, r, µ) that consists of a complete and separable metric space (X, r) and a probability measure µ on the Borel sigma algebra on X is called a metric measure space. For a metric measure space (X, r, µ), one can consider the matrix $(r(x(i), x(j)))_{i,j \in \mathbb{N}}$ of the distances between µ-iid samples $(x(i))_{i \in \mathbb{N}}$. The distribution of $(r(x(i), x(j)))_{i,j \in \mathbb{N}}$ is called the distance matrix distribution of (X, r, µ). By the Gromov reconstruction theorem (see Theorem 4 of Vershik [50]), there exists a measure-preserving isometry between the supports of the measures of any two metric measure spaces that have the same distance matrix distribution, in which case we call them isomorphic.
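For a metric measure space with finitely many points, a finite section of the distance matrix of µ-iid samples can be simulated directly. The sketch below is only an illustration under this finiteness assumption; the function name and the representation of the space by a distance table and a weight vector are hypothetical.

```python
import random


def sample_distance_matrix(dist, weights, n, rng):
    """Draw n iid points from a finite metric measure space and return their distances.

    `dist[a][b]` is the distance between points a and b, and `weights[a]` is the
    mass that the probability measure puts on point a (assumed encoding).
    """
    idx = rng.choices(range(len(weights)), weights=weights, k=n)
    return [[dist[a][b] for b in idx] for a in idx]


# A three-point ultrametric space with masses 1/2, 1/4, 1/4.
dist = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
weights = [0.5, 0.25, 0.25]
matrix = sample_distance_matrix(dist, weights, 5, random.Random(0))
for row in matrix:
    print(row)
```

Points that are sampled twice produce off-diagonal zeros, so the sampled matrix is in general only a semi-metric on the sample indices.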
We view a random semi-metric ρ on N as the random matrix $(\rho(i, j))_{i,j \in \mathbb{N}}$, and we call it exchangeable if $(\rho(i, j))_{i,j \in \mathbb{N}}$ is distributed as $(\rho(p(i), p(j)))_{i,j \in \mathbb{N}}$ for each (finite) permutation p of N. Under an appropriate condition which we interpret as dust-freeness in Remark 3.13, Vershik [50, Theorem 5] associates with any typical realization of an exchangeable (and ergodic) random semi-metric on N a metric measure space whose distance matrix distribution is the distribution of this semi-metric. In the next subsection, we discuss an extension of such a representation to the case with dust.

The sampling representation
We give a representation for all exchangeable random semi-ultrametrics on N in terms of sampling from random marked metric measure spaces. Marked metric measure spaces are introduced in Depperschmidt, Greven, and Pfaffelhuber [12]. An $(\mathbb{R}_+)$-marked metric measure space is a triple (X, r, m) that consists of a complete and separable metric space (X, r) and a probability measure m on the Borel sigma algebra on the product space $X \times \mathbb{R}_+$. The marked distance matrix distribution of a marked metric measure space (X, r, m) is defined as the distribution of $((r(x(i), x(j)))_{i,j \in \mathbb{N}}, (v(i))_{i \in \mathbb{N}})$ where $(x(i), v(i))_{i \in \mathbb{N}}$ is an m-iid sequence in $X \times \mathbb{R}_+$. Marked metric measure spaces with the same marked distance matrix distribution are called isomorphic.
In the present article, we use marked metric measure spaces to obtain from a random variable $(\tilde r, \tilde v)$ that has the marked distance matrix distribution of a marked metric measure space an exchangeable semi-metric $\tilde\rho$ on N by
$$\tilde\rho(i, j) = (\tilde r(i, j) + \tilde v(i) + \tilde v(j))\, 1\{i \neq j\}.$$
We call the distribution of $(\tilde\rho(i, j))_{i,j \in \mathbb{N}}$ the distance matrix distribution of the marked metric measure space. The basic result in this article (stated in Theorem 3.9 below) is that every exchangeable semi-ultrametric ρ on N can be represented as the outcome of a two-stage random experiment, where we have the isomorphy class χ of a random marked metric measure space in the first stage, and we sample $(\rho(i, j))_{i,j \in \mathbb{N}}$ from this marked metric measure space according to its distance matrix distribution in the second stage.
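The passage from a marked distance matrix to a semi-metric can be sketched on a finite section as follows. The function name `alpha` follows the notation used later in the article for this map; the list-of-lists encoding is an assumption of the example. As a test case we take the star-shaped coalescent in which all singletons merge at time t = 1.5: here a natural choice of marked metric measure space is a single point carrying the mark t, so that all sampled distances vanish and all marks equal t.

```python
def alpha(r, v):
    """Add the marks v(i) + v(j) to the distances r(i, j) off the diagonal."""
    n = len(v)
    return [
        [0.0 if i == j else r[i][j] + v[i] + v[j] for j in range(n)]
        for i in range(n)
    ]


# Star-shaped coalescent with merger time t: r vanishes, every mark equals t.
t = 1.5
n = 3
r = [[0.0] * n for _ in range(n)]
v = [t] * n
print(alpha(r, v))  # every off-diagonal entry equals 2 * t
```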
We construct χ realization-wise from the exchangeable semi-ultrametric ρ: the key idea is to decompose the tree that is associated with a realization of ρ into the external branches and the remaining subtree. Here we define that an external branch consists only of the leaf if that leaf corresponds to an integer that has ρ-distance zero to another integer. In the marked metric measure space, the marks encode the external branch lengths, and the metric space describes the remaining subtree. We call the semi-ultrametric dust-free if the external branches all have length zero a. s. In this case, the marked metric measure space can also be replaced by a metric measure space (as in Corollary 3.12). We prove Theorem 3.9 in Section 10. In Section 2, we formulate the decomposition at the external branches in terms of semi-ultrametrics.
The representation for exchangeable semi-ultrametrics from Theorem 3.9 can also be seen in the more general but less explicit contexts of the ergodic decomposition (Section 3.5) and the Aldous-Hoover-Kallenberg representation (see e. g. [31,Section 7]). In the representation result outlined above, the distance matrix distribution of the isomorphy class χ of the marked metric measure space is the ergodic component in whose support the realization ρ lies. The ergodic component is also characterized by χ itself, or in the dust-free case by the isomorphy class of a metric measure space. The finite analog of the aforementioned ergodic decomposition is that a (discrete) random tree whose leaves are labeled exchangeably can be obtained by first drawing the random unlabeled tree and then sampling the labels of the leaves uniformly without replacement.
We mention that Evans, Grübel, and Wakolbinger [22] also decompose real trees into the external branches and the remaining subtree to give a representation of the elements of the Doob-Martin boundary of Rémy's algorithm in terms of sampling from a weighted real tree and an additional structure. In [22,Section 7], a sampling representation for exchangeable ultrametrics is considered (see Remark 10.9).

Evolving genealogies
In Section 4, we lay the foundation for our study of evolving genealogies by considering a general time-homogeneous Markov process with values in the space of semi-ultrametrics on N; this process describes evolving leaf-labeled trees. Assuming that the state at each time is exchangeable, we map this process realization-wise to the processes of the ergodic components. We express these ergodic components as (isomorphy classes of) metric measure spaces and marked metric measure spaces, and as distance matrix distributions, respectively. Here we use the representation result for exchangeable semi-ultrametrics. This approach characterizes the processes of the ergodic components up to null events only at countably many time points, i. e. as versions, as we discuss in Remark 4.4. Using the criterion of Rogers and Pitman [45,Theorem 2], we deduce that these image processes are also Markovian, and we describe them by well-posed martingale problems. This is an example of Markov mapping in the sense of Kurtz [36], and Kurtz and Nappo [37].
In Sections 5-6, we study a concrete Markov process with values in the space of semi-ultrametrics, namely the process given by the evolving genealogical trees in a lookdown model with simultaneous multiple reproduction events. Lookdown models were introduced by Donnelly and Kurtz [15,16] to represent measure-valued processes along with their genealogy, see also e. g. Etheridge and Kurtz [18] and Birkner et al. [8]. A lookdown model can be seen as a (possibly) infinite population model in which each individual at each time is assigned a level. The role of this level is model-inherent, namely to order the individuals such that the restriction of the model to the first finitely many levels is well-behaved (i. e. only finitely many reproduction events are visible in bounded time intervals) and that the modeled quantity (e. g. types, genealogical distances) is exchangeable. In [16] and in the present article, the level is the rank among the individuals at the respective time according to the time of the latest descendant. Although the levels in finite restrictions of the lookdown model differ from the labels in the Moran model, the processes of the unlabeled genealogical trees coincide, which is used to study the length of the genealogical trees in Pfaffelhuber, Wakolbinger, and Weisshaupt [43] and Dahmer, Knobloch, and Wakolbinger [11].
In Section 7, we remove the labels from the evolving genealogical trees in the infinite lookdown model by applying the result from Section 4 to the process from Sections 5-6. We call the processes of the ergodic components tree-valued Fleming-Viot processes, regardless of which one of the three state spaces we use. The tree-valued Fleming-Viot process with values in the space of isomorphy classes of metric measure spaces is introduced in the case with binary reproduction events (which is associated with the Kingman coalescent) by Greven, Pfaffelhuber, and Winter [26] as the solution of a well-posed martingale problem that is the limit in distribution of corresponding processes read off from finite Moran models. In [26, Remark 2.20], a construction of (a version of) this process from the lookdown model of Donnelly and Kurtz [15] is outlined. The aim in the present article regarding tree-valued Fleming-Viot processes is the generalization to the case with dust. We remark that tree-valued Fleming-Viot processes with mutation and selection are studied in Depperschmidt, Greven, and Pfaffelhuber [13,14] where the states are isomorphy classes of marked metric measure spaces and the marks encode allelic types. In the present article, the marks encode lengths of external branches. We consider only the neutral case, and we describe genealogies without using types.
In Section 8, we show continuity properties of the semigroups of tree-valued Fleming-Viot processes and that the domains of the martingale problems for them are cores. In Section 9, we show that tree-valued Fleming-Viot processes converge in distribution to equilibrium.
While we construct versions of tree-valued Fleming-Viot processes in the present article using the representation result, the full sample paths are constructed by techniques specific to the lookdown model in the companion article [28].

Additional related literature
Aldous [1] represents consistent families of finite trees that satisfy a "leaf-tight" property by random measures on $\ell_1$ (and random subsets of $\ell_1$). Kingman's coalescent is given as an example in [1]. The "leaf-tight" property corresponds to the absence of dust. A representation for exchangeable hierarchies in terms of sampling from random weighted real trees is given by Forman, Haulk, and Pitman [23]. There are many other representation results for exchangeable structures in the literature. For instance, by the Dovbysh-Sudakov theorem, see Austin [2] for a proof based on a representation for exchangeable random measures, jointly exchangeable arrays that are non-negative definite can be represented in terms of sampling from the space [...]
The genealogy in the lookdown model is further studied in Pfaffelhuber and Wakolbinger [42]. Kliem and Löhr [34] further study marked metric measure spaces. In their article, tree-valued Λ-Fleming-Viot processes in the dust-free case are also mentioned. Kliem and Winter [35] use marked metric measure spaces to describe trait-dependent branching processes. In the context of measure-valued spatial Λ-Fleming-Viot processes with dust, Véber and Wakolbinger [49] work with a skeleton structure. Functionals of coalescents like external branch lengths have also been studied, see for example [40]. Also the time evolution of such functionals has been studied for evolving coalescents, see for example [10,32].
Bertoin and Le Gall [5,6,7] represent Ξ-coalescents in terms of sampling from flows of bridges from which they also construct measure-valued Fleming-Viot processes. They also consider mass coalescents. Mass coalescents (see e. g. Chapter 4.3 in Bertoin [4]) also describe genealogies without labeling individuals. In Section 12, we construct the Fleming-Viot process with values in the space of distance matrix distributions from the dual flow of bridges. We also mention the work of Labbé [38] where relations between the lookdown model and flows of bridges are studied.

Distance matrices and their decompositions
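We write N = {1, 2, 3, . . .}. Let U denote the space of semi-ultrametrics on N and let D denote the space of semi-metrics on N. We view U and D as subspaces of $\mathbb{R}^{\mathbb{N}^2}$ in that we do not distinguish between a semi-metric ρ and the distance matrix $(\rho(i, j))_{i,j \in \mathbb{N}}$. We endow $\mathbb{R}^{\mathbb{N}^2}$ with a complete and separable metric that induces the product topology when R is equipped with the Euclidean topology. Using the map
$$\alpha : D \times \mathbb{R}_+^{\mathbb{N}} \to D, \quad (r, v) \mapsto \big((r(i, j) + v(i) + v(j))\, 1\{i \neq j\}\big)_{i,j \in \mathbb{N}},$$
we define the space
$$\hat U = \{(r, v) \in D \times \mathbb{R}_+^{\mathbb{N}} : \alpha(r, v) \in U\}$$
whose elements we call decomposed semi-ultrametrics or marked distance matrices. As above, we view $D \times \mathbb{R}_+^{\mathbb{N}}$ and Û as subspaces of $\mathbb{R}^{\mathbb{N}^2} \times \mathbb{R}^{\mathbb{N}}$ which we endow with a complete and separable metric that induces the product topology.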
We define the function
$$\Upsilon : U \to \mathbb{R}_+^{\mathbb{N}}, \quad \Upsilon(\rho)(i) = \tfrac{1}{2} \inf_{j \in \mathbb{N} \setminus \{i\}} \rho(i, j),$$
and we denote by β the function that maps a semi-ultrametric ρ ∈ U to the decomposed semi-ultrametric (r, v) ∈ Û that is given by v = Υ(ρ) and
$$r(i, j) = (\rho(i, j) - v(i) - v(j))\, 1\{i \neq j\}.$$
The interpretation of these functions is given in Remark 2.2 below from which it follows that r is a tree-like semi-metric (i. e., r is 0-hyperbolic, see e. g. [21]). Alternatively, it can be easily checked that r satisfies the triangle inequality. The function α retrieves the semi-ultrametric from a decomposed semi-ultrametric. For instance, α ∘ β is the identity map on U.
Remark 2.1. Let us agree on the following notation. When we identify the elements of a semi-metric space (X, ρ) that have ρ-distance zero to obtain a metric space (X′, ρ), we refer by each element x ∈ X also to the associated element of X′. Furthermore, we define the metric completion of the semi-metric space (X, ρ) as the metric completion of (X′, ρ).
Remark 2.2. Let ρ ∈ U, (r, v) = β(ρ), and let (T, d) be the real tree associated with ρ as in Remark 1.1 with X = N. Then v(i) = Υ(ρ)(i) can be interpreted as the length, and (i, v(i)) as the starting vertex of the external branch that ends in the leaf (i, 0) of T. Here we define that this external branch consists only of the leaf if there exists k ∈ N \ {i} with ρ(i, k) = 0. Furthermore, the map ϕ(i) = (i, v(i)) from (N, r) to (T, d) is distance-preserving.
In this sense, the map β : ρ ↦ (r, v) decomposes the coalescent tree that is given by ρ into the external branches with lengths v and the subtree spanned by their starting vertices whose mutual distances are given by r. More generally, any element of Û can be seen as a decomposed coalescent tree.
We call a semi-ultrametric ρ ∈ U dust-free if Υ(ρ) = 0, that is, if all external branches in the associated tree have length zero so that there are no isolated leaves.
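The decomposition at the external branches can be tried out on a finite semi-ultrametric. The sketch below assumes that an external branch length is half the distance to the nearest other leaf, and that the remaining distances are obtained by subtracting the two branch lengths; the names `upsilon`, `beta`, and `alpha` mirror the notation of this section, but the finite encoding is a hypothetical simplification. In particular, on a finite section the minimum is taken over finitely many leaves only, so the values may exceed the infimum over all of N.

```python
def upsilon(rho):
    """External branch lengths: half the distance to the nearest other leaf (needs n >= 2)."""
    n = len(rho)
    return [0.5 * min(rho[i][j] for j in range(n) if j != i) for i in range(n)]


def beta(rho):
    """Decompose rho into distances r between branch starting points and branch lengths v."""
    n = len(rho)
    v = upsilon(rho)
    r = [
        [0.0 if i == j else rho[i][j] - v[i] - v[j] for j in range(n)]
        for i in range(n)
    ]
    return r, v


def alpha(r, v):
    """Recompose: add the branch lengths back onto the off-diagonal distances."""
    n = len(v)
    return [
        [0.0 if i == j else r[i][j] + v[i] + v[j] for j in range(n)]
        for i in range(n)
    ]


# Leaves 0,1 merge at height 0.5, leaves 2,3 at height 1.0, both pairs at height 2.0.
rho = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 2], [4, 4, 2, 0]]
r, v = beta(rho)
print(v)                     # external branch lengths
print(r[0][1], r[0][2])      # distances between starting vertices
print(alpha(r, v) == rho)    # recomposition recovers rho
```

In this example the starting vertices of the branches of leaves 0 and 1 coincide, which is why their r-distance is zero although the leaves themselves are distinct.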
Sampling from marked metric measure spaces

Preliminaries
Recall the definitions of metric measure spaces, marked metric measure spaces, and their (marked) distance matrix distributions from Sections 1.1 and 1.2. Also recall that two metric measure spaces are said to be isomorphic if they have the same distance matrix distributions. We denote the set of isomorphy classes of metric measure spaces by M and we endow it with the Gromov-weak topology in which metric measure spaces converge if and only if their distance matrix distributions converge. Greven, Pfaffelhuber, and Winter [25] showed that M is then a Polish space.
Analogously, two marked metric measure spaces are said to be isomorphic if they have the same marked distance matrix distributions. We denote the set of isomorphy classes of marked metric measure spaces by M̂ and we endow it with the marked Gromov-weak topology in which marked metric measure spaces converge if and only if their marked distance matrix distributions converge weakly. This makes M̂ a Polish space, as shown by Depperschmidt, Greven, and Pfaffelhuber [12].
We denote the distance matrix distribution of the isomorphy class of a metric measure space χ ∈ M by $\nu^\chi$. We denote the marked distance matrix distribution of χ′ ∈ M̂ by $\nu^{\chi'}$, so that $\alpha(\nu^{\chi'})$ is the distance matrix distribution of χ′, in accordance with the definition in Section 1.2. (We denote by $\varphi(\xi) = \xi \circ \varphi^{-1}$ the pushforward measure of a measure ξ on a measurable space E under a measurable function ϕ on E.)
Remark 3.1. We call a marked metric measure space (X, r, m) dust-free if the probability measure m is of the form $m = \mu \otimes \delta_0$ for a probability measure µ on the Borel sigma algebra on X. Then the distance matrix distribution $\alpha(\nu^{(X,r,\mu \otimes \delta_0)})$ equals the distance matrix distribution $\nu^{(X,r,\mu)}$ of the metric measure space (X, r, µ). We call (X, r, µ) the metric measure space associated with the dust-free marked metric measure space $(X, r, \mu \otimes \delta_0)$.
Let $S_\infty$ denote the group of finite permutations on N. We define the action of $S_\infty$ on D and $D \times \mathbb{R}_+^{\mathbb{N}}$, respectively, by $p(\rho) = (\rho(p(i), p(j)))_{i,j \in \mathbb{N}}$ and $p(r, v) = ((r(p(i), p(j)))_{i,j \in \mathbb{N}}, (v(p(i)))_{i \in \mathbb{N}})$ for $p \in S_\infty$, ρ ∈ D, $(r, v) \in D \times \mathbb{R}_+^{\mathbb{N}}$. A random variable, for instance with values in D or $D \times \mathbb{R}_+^{\mathbb{N}}$, is called exchangeable if its distribution is invariant under the action of the group $S_\infty$.
Remark 3.2. Exchangeable random variables with values in D or $D \times \mathbb{R}_+^{\mathbb{N}}$ can be seen as jointly exchangeable arrays, see e. g. [31, Section 7]. Also recall that the definition of exchangeability does not change when $S_\infty$ is replaced with the group of all bijections from N to itself, as the finite restrictions determine the distribution of a random variable in D or $D \times \mathbb{R}_+^{\mathbb{N}}$.
Remark 3.3. The coalescents associated by (1.1) with the exchangeable semi-ultrametrics on N form a larger class of processes than the so-called exchangeable coalescents defined in e. g. Section 4.2.2 of Bertoin [4]. For example, the coalescent process associated with an exchangeable semi-ultrametric on N need not be Markovian.
[...] is a closed subspace of M̂. It contains the marked metric measure spaces with ultrametric distance matrix distribution. Following e. g. [25,26] and Remark 1.1, we call the elements of U trees. Also the elements of Û may be called trees (as in Remark 10.8 below). Proposition 3.4 below states that a. e. realization of a Û-valued random variable with the marked distance matrix distribution of a marked metric measure space in Û is the decomposition of a semi-ultrametric by the map β from Section 2. As a consequence, the isomorphy class of a marked metric measure space in Û is determined already by its distance matrix distribution.
Remark 3.5. We call a semi-ultrametric ρ ∈ U dust-free if Υ(ρ) = 0. It can be seen as a consequence of Proposition 3.4 that (the isomorphy class of) a marked metric measure space (X, r, m) in Û is dust-free (as defined in Remark 3.1) if and only if a random variable with distribution $\alpha(\nu^{(X,r,m)})$ is a. s. dust-free. In particular, a random variable with the distance matrix distribution of a metric measure space is a. s. dust-free.

Marked metric measure spaces from marked distance matrices
In this subsection, we define functions by which we construct a (marked) metric measure space from a (marked) distance matrix. An interpretation of these functions is given in Remark 3.8 below. In Remark 3.16, we state their role in the context of the ergodic decomposition.
First we define the function ψ : D → M that maps ρ ∈ D to the isomorphy class of the metric measure space (X, ρ, µ), given as follows: (X, ρ) is the metric completion of (N, ρ). The probability measure µ is defined as the weak limit of the probability measures $n^{-1} \sum_{i=1}^{n} \delta_i$ as n tends to infinity, if this weak limit exists. If the limit does not exist, we define µ arbitrarily, let us set $\mu = \delta_1$. Furthermore, we denote by $D^*$ the subset of distance matrices ρ ∈ D such that the weak limit in the definition above exists.
Analogously, we define the function ψ̂ : $D \times \mathbb{R}_+^{\mathbb{N}}$ → M̂ that maps (r, v) to the isomorphy class of the marked metric measure space (X, r, m), where (X, r) is the metric completion of the semi-metric space (N, r) and m is the weak limit of the probability measures $n^{-1} \sum_{i=1}^{n} \delta_{(i, v(i))}$ on $X \times \mathbb{R}_+$ if this weak limit exists, else we set $m = \delta_{(1,0)}$. We denote by $\hat D^*$ the subset of marked distance matrices $(r, v) \in D \times \mathbb{R}_+^{\mathbb{N}}$ such that the weak limit in the definition above exists.
We call µ and m in the definitions of ψ and ψ̂ also sampling measures.
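The n-th empirical measure that appears in the definition of the marked sampling measure can be computed on a finite section of a marked distance matrix. The following sketch is only an illustration: it identifies indices with r-distance zero, represents each class by its smallest index, and returns the masses of the resulting atoms. The function name and the dictionary output are assumptions of this example, and the weak limit as n tends to infinity is not computed.

```python
from collections import Counter


def empirical_marked_measure(r, v, n):
    """Masses of the n-th empirical measure (1/n) * sum of point masses at (i, v(i)).

    Indices at r-distance zero are identified; each class is represented by its
    smallest index. Keys of the result are (representative, mark) pairs.
    """
    representatives = []
    counts = Counter()
    for i in range(n):
        rep = next((k for k in representatives if r[k][i] == 0), None)
        if rep is None:
            representatives.append(i)
            rep = i
        counts[(rep, v[i])] += 1
    return {atom: c / n for atom, c in counts.items()}


# Star-shaped coalescent with merger time 1.5: one point of X carries all the mass.
n = 5
r_star = [[0.0] * n for _ in range(n)]
v_star = [1.5] * n
print(empirical_marked_measure(r_star, v_star, n))
```

For the star-shaped example the empirical measures do not depend on n, in contrast to the uniform measures on the leaves themselves, which is the point of moving the external branch lengths into the marks.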
Remark 3.8 (An interpretation of ψ and ψ̂). For $\rho \in D^* \cap U$, the probability measure in the ultrametric metric measure space ψ(ρ) charges each ball with the asymptotic frequency of the corresponding block of the coalescent which is associated with ρ by (1.1).
Similarly, for $(r, v) \in \hat D^* \cap \hat U$, let (X, r, m) be the representative of ψ̂(r, v) from the definition of ψ̂. We consider the completion (T̂, d̂) of the real tree (T, d) associated with (r, v) as in Remark 2.2, and the extension ϕ̂ : X → T̂ of the isometry ϕ from Remark 2.2. Then the image measure $\mu := \hat\varphi(m(\cdot \times \mathbb{R}_+))$ charges each region of T̂ with the asymptotic frequency of the integers that label the leaves of T that are the endpoints of external branches that begin in that region.

The sampling representation
The basic result in this paper is stated in Theorem 3.9 below. Here we consider an exchangeable random semi-ultrametric ρ on N, and we assert existence of a random variable χ with values in the space of isomorphy classes of marked metric measure spaces that has the following property: Let ρ′ be a random variable whose conditional distribution given χ is the distance matrix distribution of χ. Then the random variables ρ and ρ′ have the same (unconditional) distribution. (In the language of the theory of random measures, this means that the distribution of ρ is equal to the first moment measure $E[\alpha(\nu^\chi)]$ of the random measure $\alpha(\nu^\chi)$.)
Theorem 3.9. Let ρ be an exchangeable U-valued random variable. Let χ = ψ̂ ∘ β(ρ). Let ρ′ be a U-valued random variable whose conditional distribution given χ is $\alpha(\nu^\chi)$. Then:
(i) $\beta(\rho) \in \hat D^*$ a. s.,
(ii) ρ and ρ′ are equal in distribution,
(iii) χ = ψ̂ ∘ β(ρ′) a. s.
Assertion (i) above states that for a typical realization of ρ and its decomposition β(ρ), the sampling measure m in the definition of ψ̂(β(ρ)) in Subsection 3.3 is the weak limit of the uniform probability measures therein. Assertion (iii) states that the realization of χ can typically be reconstructed from the realization of ρ′. We interpret the reconstruction map ψ̂ ∘ β in terms of the ergodic decomposition in Remark 3.16. We prove Theorem 3.9 in Section 10.4. We give two proofs of Theorem 3.9(i). In one of them, the de Finetti theorem yields the aforementioned sampling measure m as the directing measure of an exchangeable sequence.
We also note the following uniqueness property which is proved in Section 10.3.
Proposition 3.11. Let χ and χ′ be Û-valued random variables. Let ρ be a U-valued random variable with conditional distribution $\alpha(\nu^\chi)$ given χ, and let ρ′ be another U-valued random variable with conditional distribution $\alpha(\nu^{\chi'})$ given χ′. Then ρ and ρ′ are equal in distribution if and only if χ and χ′ are equal in distribution.
(In terms of first-moment measures, Proposition 3.11 says that χ and χ′ are equal in distribution if and only if $E[\alpha(\nu^\chi)] = E[\alpha(\nu^{\chi'})]$.)
The aim of the present paper is the treatment of the case with dust. In the dust-free case, we need not decompose the semi-metric ρ by the map β. Instead, we can work directly with the map ψ from Subsection 3.3. Theorem 3.9 then reduces to the setting of metric measure spaces as follows:
Corollary 3.12. Let ρ be an exchangeable U-valued random variable that is a. s. dust-free. Let χ = ψ(ρ). Let ρ′ be a U-valued random variable whose conditional distribution given χ is $\nu^\chi$. Then:
(i) $\rho \in D^*$ a. s.,
(ii) ρ and ρ′ are equal in distribution,
(iii) χ = ψ(ρ′) a. s.
Proof. This is immediate from Theorem 3.9 and Remarks 3.1, 3.5, and 3.6.
Remark 3.13. The assertions of Corollary 3.12 are closely related to Vershik [50]: Condition (4) in [50, Theorem 5] is a necessary and sufficient condition for an exchangeable (and ergodic) random semi-metric to have the distance matrix distribution of a metric measure space. By Remark 3.5, the marked metric measure space χ in Theorem 3.9 is a. s. dust-free if and only if ρ is a. s. dust-free. Hence, for a semi-ultrametric ρ, condition (4) in [50] is equivalent to dust-freeness. In the dust-free case, the metric measure space associated with χ as in Remark 3.1 is the completion of a typical realization of the semi-metric, endowed with the probability measure given by the asymptotic block frequencies of the associated coalescent (as in Remark 3.8). This can also be deduced from [50,Equation (9)]. Assertion (iii) can be proved by Proposition 10.5 below which is related to [50] as stated in Remark 10.6.

Interpretation as ergodic decomposition
In this subsection, we interpret the representation from Theorem 3.9 as the ergodic decomposition of an exchangeable distribution on the semi-ultrametrics on N.
We denote by U the space of exchangeable probability distributions on U, and we endow U with the Prohorov metric $d_P$ which is complete and separable. We will also consider the subspace $U^{\mathrm{erg}} = \{\xi \in U : \xi = \alpha(\nu^{(X,r,m)})$ for some marked metric measure space $(X, r, m)\}$ of distance matrix distributions of marked metric measure spaces. The sets $U^{\mathrm{erg}}$ and Û are in one-to-one correspondence by Proposition 3.4. Hence, also the elements of $U^{\mathrm{erg}}$ can be seen as trees.
We define the invariant sigma algebra I on U as the sigma algebra that is generated by those Borel sets B ⊂ U that satisfy $B = \{(\rho(p(i), p(j)))_{i,j \in \mathbb{N}} : \rho \in B\}$ for all finite permutations $p \in S_\infty$. A distribution ξ on U is called ergodic (with respect to the action of the group $S_\infty$ of finite permutations) if ξ(I) ∈ {0, 1} for all I ∈ I.
Proposition 3.14. The distance matrix distribution $\alpha(\nu^{(X,r,m)})$ of a marked metric measure space (X, r, m) is invariant and ergodic with respect to the action of the group of finite permutations.
Proof. This is analogous to [50, Lemma 7]. For I ∈ I, the Borel set $\tilde I \subset (X \times \mathbb{R}_+)^{\mathbb{N}}$ that is given by [...] is invariant under finite permutations, that is, [...] From the ergodicity of an m-iid sequence $(x(i), v(i))_{i \in \mathbb{N}}$, we obtain [...]
Proof. By Theorem 3.9(ii), each element of U is a mixture of elements of $U^{\mathrm{erg}}$. The assertion follows by Proposition 3.14 and as the ergodic distributions in U are extreme in the convex set U (see e. g. [31, Lemma A1.2]).
Remark 3.16. Theorem 3.9 decomposes the distribution of the exchangeable U-valued random variable ρ′ into ergodic components in the sense of e. g. Theorem A1.4 in Kallenberg [31]. The function ζ [...] is a decomposition map in the sense of Varadarajan [48, Section 4] so that typically, ζ(ρ′) is the ergodic component in whose support the realization ρ′ lies. Note that this ergodic component is characterized by the isomorphy class χ = ψ̂ ∘ β(ρ′) of a marked metric measure space, and in the dust-free case also by the isomorphy class ψ(ρ′) of a metric measure space. Some further references on the ergodic decomposition are given e. g. in [31, p. 475].
By the following proposition, $(U^{\mathrm{erg}}, d_P)$ is Polish, which will be applied in [27].
Proof. Let $(\rho_n, n \in \mathbb{N})$ be a sequence of U-valued random variables that converges in distribution to some U-valued random variable ρ. Assume that for each n ∈ N, the distribution of $\rho_n$ lies in $U^{\mathrm{erg}}$. Then $\rho_n$ has ergodic distribution by Proposition 3.14. Lemma 7.35 of [31] says that $\rho_n$ is dissociated, which means that for any disjoint $I_1, \ldots, I_k \subset \mathbb{N}$, the restrictions $(\rho_n(i, j))_{i,j \in I_1}, \ldots, (\rho_n(i, j))_{i,j \in I_k}$ are independent. As this property is preserved under the limit in distribution, it also holds for ρ, and another application of Lemma 7.35 of [31] yields that ρ has ergodic distribution. The assertion follows by Proposition 3.15.

Application to tree-valued processes
Using the functionψ from Section 3.3, we map a Markov process (ρ t , t ∈ R + ) whose states are exchangeable U-valued random variables to a process with values in the space of isomorphy classes of marked metric measure spaces. At each time, the state of the image process is the marked metric measure space from the representation (Theorem 3.9) of the state of the U-valued process. We also consider the process of the distance matrix distributions of these marked metric measure spaces. In the dust-free case, we can also work with isomorphy classes of metric measure spaces and the map ψ as in Corollary 3.12.
In the proof of Theorem 4.1 below, we use the criterion of Rogers and Pitman [45,Theorem 2] to show that also the image processes are Markovian. A martingale problem for the U-valued process (ρ t , t ∈ R + ) or theÛ-valued process (β(ρ t ), t ∈ R + ) yields a martingale problem for the respective image process.
The so-called polynomials and marked polynomials, introduced in [12,25], have been used as domains of martingale problems in e. g. [13,14,26]. We recall them here, adapting the definition to our present use of the marks. The uniform continuity of the derivative in the definitions of C n and Ĉ n below will turn out to be useful in [27].
For n ∈ N, we write [n] = {1, . . . , n}, and we denote by γ n the restriction from R N 2 × R N to R n 2 × R n , $\gamma_n(r, v) = ((r(i,j))_{i,j\in[n]}, (v(i))_{i\in[n]})$. We denote also by γ n the restriction from R N 2 to R n 2 , γ n (ρ) = (ρ(i, j)) i,j∈[n] . Let C n denote the set of bounded differentiable functions R n 2 → R with bounded uniformly continuous derivative. For φ ∈ C n , we denote also by φ the function φ • γ n : R N 2 → R, and we call the function U → R, χ → ν χ φ the polynomial associated with φ. (Here and at other places, we use the notation $\xi f = \int \xi(dx) f(x)$ for a measure ξ and an integrable function f , and we view measures also as functionals on spaces of integrable functions.) Similarly, we denote by Ĉ n the set of bounded differentiable functions R n 2 × R n → R with uniformly continuous derivative. For φ ∈ Ĉ n , we denote also by φ the function φ • γ n : R N 2 × R N → R, and we call the function Û → R, χ → ν χ φ the marked polynomial associated with φ. (Usually, the argument (r, v) of a function φ ∈ Ĉ will be a marked distance matrix.) We write $C = \bigcup_{n\in\mathbb{N}} C_n$ and $\hat C = \bigcup_{n\in\mathbb{N}} \hat C_n$. We denote the set of polynomials by $\Pi = \{U \to \mathbb{R},\ \chi \mapsto \nu^{\chi}\phi : \phi \in C\}$, the set of marked polynomials by $\hat\Pi = \{\hat U \to \mathbb{R},\ \chi \mapsto \nu^{\chi}\phi : \phi \in \hat C\}$, and we define the set of test functions $\mathcal{C} = \{U^{\mathrm{erg}} \to \mathbb{R},\ \xi \mapsto \xi\phi : \phi \in C\}$.
For a metric space E, let M b (E) denote the set of bounded measurable functions E → R. For a subset D ⊂ M b (E) and an operator G : D → M b (E), we mean by a solution of the martingale problem (G, D) a progressive E-valued process (X t , t ∈ R + ) such that for every f ∈ D, the process $\bigl(f(X_t) - \int_0^t Gf(X_s)\,ds,\ t \in \mathbb{R}_+\bigr)$ is a martingale with respect to the filtration induced by (X t , t ∈ R + ), cf. Ethier and Kurtz [19, p. 173].
. Then the following two assertions hold: for all φ ∈Ĉ with associated polynomial Φ, and all χ ∈Û.

Assertion (iii) below holds under the additional assumption that
, then (χ t , t ∈ R + ) solves the martingale problem (B, Π), given by for all φ ∈ C with associated polynomial Φ, and all χ ∈ U.
The proof of Theorem 4.1 can be found in Section 10.5.
Remark 4.2. The process (β(ρ t ), t ∈ R + ) in Theorem 4.1 is Markov. This follows as (ρ t , t ∈ R + ) is Markov by assumption and as ρ t is determined by β(ρ t ) via ρ t = α(β(ρ t )) so that for all s ≤ t ≤ u and bounded measurable f :Û → R. This is an example for Dynkin's criterion [17,Theorem 10.13] for a function of a Markov process to be Markov.
Remark 4.3. In Theorem 4.1, if ρ t is dust-free for some t ∈ R + , thenχ t is (by Theorem 3.9 and Remark 3.5 the isomorphy class of a) dust-free marked metric measure space, χ t is the (isomorphy class of the) metric measure space associated (as in Remark 3.1) with (any representative of)χ t , and we have ξ t = ν χt . The process (χ t , t ∈ R + ) is relevant only in the dust-free case: If ρ t is not dust-free, then ψ(ρ t ) is just the arbitrary element of M from the definition of ψ in Section 3.3.
Remark 4.4. In Theorem 4.1, we characterize only versions of the processes (χ t , t ∈ R + ), (χ t , t ∈ R + ), and (ξ t , t ∈ R + ). That is, we do not make assertions on the full sample paths but only on the states at countably many times. From Theorem 3.9, we obtain β(ρ t ) ∈D * (and in the dust-free case also ρ t ∈ D * by Corollary 3.12) only for a fixed time t (or countably many t) on an event of probability 1. This means that the uniform probability measures on the starting vertices of the external branches that end in the first n leaves of the tree associated with the semi-ultrametric ρ t are shown to converge only at countably many times t on an event of probability 1. For β(ρ t ) ∈D * , a realizationχ t =ψ(β(ρ t )) can be considered as an ergodic component. At the other times t, we do not exclude thatχ t =ψ(β(ρ t )) is just the arbitrary element of M with probability measure δ (1,0) in the definition ofψ in Section 3.3.
Theorem 4.1 yields in particular the semigroups of the processes (χ t , t ∈ R + ), (χ t , t ∈ R + ), and (ξ t , t ∈ R + ). Also the martingale problems in Theorem 4.1 characterize only versions of these processes.
For the particular example of the process (ρ t , t ∈ R + ) in Sections 5 -9, it is shown in [28] that β(ρ t ) ∈D * (and ρ t ∈ D * in the dust-free case) also holds simultaneously for all t ∈ R + on an event of probability 1 (see Theorems 3.1(i) and 3.10(i), and Remarks 4.4 and 4.13 in [28]). This allows one to construct the full sample paths (Section 4 in [28]). These results are obtained in [28] by techniques specific to the lookdown model.
Remark 4.5. Theorem 4.1 is an example of Markov mapping. To show that the image processes (ψ(β(ρ t )), t ∈ R + ), (ξ t , t ∈ R + ), and (ψ(ρ t ), t ∈ R + ) are Markovian, we use the simple criterion of Rogers and Pitman [45,Theorem 2] as this criterion is formulated in terms of the abstract semigroups of the processes, which fits our assumption that (ρ t , t ∈ R + ) is a general time-homogeneous Markov process whose states ρ t are exchangeable.
A criterion for the Markov property of the image processes in terms of martingale problems is given in Corollary 3.5 of Kurtz [36] which requires more assumptions, including uniqueness for the martingale problem for (ρ t , t ∈ R + ) and existence of solutions of the martingale problems for the image processes. Corollary 3.5 of [36] would also yield uniqueness for the martingale problems for the image processes.
In the present paper, we use martingale problems only to provide additional characterizations of the processes under consideration. In Proposition 7.1, we show uniqueness for the martingale problems for the image processes directly by duality for the concrete examples from Section 7.
Remark 4.6. In particular in Sections 8 -9, 11.3 and in [27], we need convergence determining (or at least separating) sets of test functions. As in [12,25,39], the sets Π andΠ are convergence determining in U andÛ, respectively. The argument from [39, Corollary 2.8] also applies for C : The algebra C generates the product topology on R N 2 . By a theorem due to Le Cam, see e. g. [39, Theorem 2.7] and the references therein, it follows that C is convergence determining in U. Hence, C generates the weak topology on U erg . AsΠ is an algebra (see [12,25]) and by definition of U erg , also C is an algebra. Again by [39,Theorem 2.7], it follows that C is convergence determining in U erg .
Remark 4.7. The set of polynomials Π ′ = {Û → R, χ → α(ν χ )φ : φ ∈ C} is separating onÛ. This follows from Propositions 3.4 and 10.5 as in the proof of Proposition 3.11. Nevertheless, we work with the spaceΠ of test functions onM as Π ′ is not convergence determining; a counterexample can be constructed from [25, Example 2.12(ii)].

Genealogy in the lookdown model
In this section, we define a Markov process (ρ t , t ∈ R + ) to which we will later apply Theorem 4.1. In Subsection 5.1, we read off a realization of such a process from a population model that is driven by a deterministic point measure η. In Subsection 5.2, we let η be a Poisson random measure, and we study further properties of (ρ t , t ∈ R + ) in Subsection 5.3. We remark that for the lookdown model of Donnelly and Kurtz [15], the process of the evolving genealogical distances and its martingale problem are considered in Remark 2.20 of Greven, Pfaffelhuber, and Winter [26].

The deterministic construction
We denote by P the set of partitions of N. We endow P with the topology in which a sequence of partitions converges if and only if the sequences of their finite restrictions converge. For n ∈ N, we denote by P n the set of partitions of [n] = {1, . . . , n}. We denote the restriction map from P to P n by γ n , that is, γ n (π) = {B ∩ [n] : B ∈ π} \ {∅}. Recall that other restriction maps, e. g. from R N 2 to R n 2 , are also denoted by γ n . Moreover, we denote by 0 n = {{1}, . . . , {n}} the partition in P n that consists of singletons only, and by P^n = {π ∈ P : γ n (π) ≠ 0 n } the set of partitions of N in which the first n integers are not all in different blocks. Furthermore, for π ∈ P, we denote by B 1 (π), B 2 (π), . . . the enumeration of the blocks of π with min B 1 (π) < min B 2 (π) < . . .. For i ∈ N, we denote by π(i) the integer j that satisfies i ∈ B j (π).
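The notation of this paragraph can be made concrete on finite partitions. The following is a minimal sketch, assuming that a partition is stored as a Python set of frozensets of positive integers; the function names `restrict`, `ordered_blocks`, `block_index` and `merges_within_first` are hypothetical and chosen here only for illustration.

```python
def restrict(partition, n):
    """gamma_n: intersect every block with [n] and drop the empty intersections."""
    base = set(range(1, n + 1))
    return {frozenset(block & base) for block in partition if block & base}


def ordered_blocks(partition):
    """Enumerate the blocks B_1, B_2, ... by increasing minimal element."""
    return sorted(partition, key=min)


def block_index(partition, i):
    """pi(i): the 1-based index j such that i lies in B_j(pi)."""
    for j, block in enumerate(ordered_blocks(partition), start=1):
        if i in block:
            return j
    raise ValueError("i is not covered by the partition")


def merges_within_first(partition, n):
    """True if and only if gamma_n(pi) differs from the all-singletons partition 0_n."""
    return any(len(block) > 1 for block in restrict(partition, n))


# Illustrative partition of [5] (an assumption of this sketch, not taken from the text).
pi = {frozenset({1, 4}), frozenset({2}), frozenset({3, 5})}
```

For this `pi`, the restriction to [3] consists of singletons only, whereas the restriction to [4] contains the block {1, 4}.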
We use a lookdown model as the population model. In this model, there are countably infinitely many levels which are labeled by N, and each level is occupied by one particle at each time t ∈ R + . The particles undergo reproduction events which are encoded by a simple point measure η on (0, ∞) × P. A simple point measure is a purely atomic measure whose atoms all have mass 1. Let us impose a further assumption on η, namely η((0, t] × P n ) < ∞ for all t ∈ (0, ∞) and n ∈ N. (5.1) The interpretation of a point (t, π) of η is that the following reproduction event occurs: At time t−, the particles on the levels i ∈ N with i > #π are removed. At time t, for each i ∈ [#π], the particle that was on level i at time t− assumes level min B i (π) and has offspring on all other levels in B i (π). Thus, the level of a particle is non-decreasing as time evolves. Condition (5.1) means that for each n ∈ N, only finitely many particles jump away from the first n levels in bounded time intervals.
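The effect of a single reproduction event on the levels can be sketched as follows. This is an illustration only, assuming a finite partition of [N] in place of a partition of N; particles whose level exceeds the number of blocks are reported as `None`, which here stands for "leaves the first N levels". The helper names are hypothetical.

```python
def ordered_blocks(partition):
    """Blocks as sorted lists, enumerated by increasing minimal element."""
    return sorted((sorted(block) for block in partition), key=lambda b: b[0])


def new_level(partition, i):
    """Level at time t of the particle that was on level i at time t-.

    Returns min B_i(pi) if i <= #pi, and None otherwise.
    """
    blocks = ordered_blocks(partition)
    if i > len(blocks):
        return None
    return blocks[i - 1][0]


def ancestor_levels(partition):
    """Map each level j at time t to the level pi(j) occupied at time t- by the
    particle itself (if j = min B_{pi(j)}) or by its parent (otherwise)."""
    blocks = ordered_blocks(partition)
    return {j: idx for idx, block in enumerate(blocks, start=1) for j in block}


# Illustrative event on the first five levels (an assumption of this sketch).
event = [{1, 3}, {2}, {4, 5}]
```

In this example the particle on level 3 moves up to level 4, in line with the statement that levels are non-decreasing in time.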
For all 0 ≤ s ≤ t, each particle at time t has an ancestor at time s. We denote by A s (t, i) the level of the ancestor at time s of the particle on level i at time t such that the maps Remark 5.1. We will use that the trajectories of the particles are non-crossing in the following sense: For any times s ≤ t and particles x, y on levels i x ≤ i y at time s ∈ R + , particle x is still alive if particle y is still alive, in which case the particles x and y occupy levels j x ≤ j y . In particular, if infinitely many particles at time s survive until time t, then all particles at time s survive until time t.
We are interested in the process of the genealogical distances between the particles that live at the respective times. Let ρ 0 ∈ R N 2 . (We can assume ρ 0 ∈ U here, but differentiability will be more elementary in the larger space, as a matter of taste.) We interpret ρ 0 (i, j) as the genealogical distance between the particles on levels i and j at time 0. We define the genealogical distance between the particles on levels i and j at time t by In words, the genealogical distance between two particles at a fixed time is twice the time back to their most recent common ancestor, if such an ancestor exists, else it is given by the genealogical distance between the ancestors at time zero.
Remark 5.2. If ρ 0 ∈ U, then ρ t ∈ U for each t ∈ R + . Indeed, a semi-metric ρ on N is a semi-ultrametric if and only if for each s ∈ R + , an equivalence relation ∼ on N is given by i ∼ j :⇔ ρ(i, j) ≤ s. If this property holds for ρ 0 , then the definition of ρ t readily yields that it also holds for ρ t .
We also describe the process (ρ t , t ∈ R + ) in a more formal way which will be useful for the description by martingale problems in Section 5.2. With each partition π ∈ P n we associate a transformation R n 2 → R n 2 , which we also denote by π, by $\pi(\rho) = (\rho(\pi(i), \pi(j)))_{i,j\in[n]}$. Here π(i) denotes the integer k such that i is in the k-th block, when blocks are ordered according to their minimal elements. Note that for each reproduction event encoded by a point (s, π) ∈ η, the corresponding jump of the process (ρ t , t ∈ R + ) can be described by $\gamma_n(\pi)(\gamma_n(\rho_{s-})) = \gamma_n(\rho_s)$. (5.3) In particular, γ n (π) = 0 n if π ∈ P \ P^n , and 0 n acts as the identity on R n 2 . By assumption (5.1), there are only finitely many reproduction events in bounded time intervals that result in a jump of the process (γ n (ρ t ), t ∈ R + ). Between such jumps, the genealogical distances grow linearly with slope 2, that is, ρ t (i, j)+2s = ρ t+s (i, j) for distinct i, j ∈ [n] and t, s ∈ R + with η((t, t + s] × P^n ) = 0.
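The jump-and-growth description of (γ n (ρ t ), t ∈ R + ) can be sketched in a few lines. The code below is a hedged illustration, not the construction of the paper: it assumes finitely many events given as (time, partition of [n]) pairs, stores distance matrices as nested lists, and uses the hypothetical names `jump`, `grow` and `evolve`.

```python
def block_index(partition, i):
    """pi(i): 1-based index of the block containing i, blocks ordered by minimum."""
    for k, block in enumerate(sorted(partition, key=min), start=1):
        if i in block:
            return k
    raise ValueError("i is not covered by the partition")


def jump(partition, rho):
    """Transformation associated with pi: entry (i, j) becomes rho(pi(i), pi(j))."""
    n = len(rho)
    idx = [block_index(partition, i) - 1 for i in range(1, n + 1)]
    return [[rho[idx[i]][idx[j]] for j in range(n)] for i in range(n)]


def grow(rho, s):
    """Linear growth with slope 2 of all off-diagonal entries over a time span s."""
    n = len(rho)
    return [[rho[i][j] + (2 * s if i != j else 0.0) for j in range(n)]
            for i in range(n)]


def evolve(rho0, events, t):
    """Distance matrix at time t, given events as (time, partition of [n]) pairs."""
    rho, now = rho0, 0.0
    for time, partition in sorted(events, key=lambda e: e[0]):
        if time > t:
            break
        rho = jump(partition, grow(rho, time - now))
        now = time
    return grow(rho, t - now)


# Illustrative data (assumptions of this sketch): three levels, one binary merger.
rho0 = [[0.0, 10.0, 10.0], [10.0, 0.0, 10.0], [10.0, 10.0, 0.0]]
events = [(1.0, [{1, 2}, {3}])]
rho = evolve(rho0, events, 1.5)
```

In this example, levels 1 and 2 find a common ancestor at time 1, so their distance at time 1.5 is twice 0.5; the other pairs have no common ancestor after time 0, so their distance is 2 · 1.5 plus the initial distance of the ancestors.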
Remark 5.3. Schweinsberg [47] constructs the Ξ-coalescent analogously from a point measure. The population model described in this section can be seen as the population model that underlies the dual flow of partitions in Foucart [24]. A lookdown model with a reproduction mechanism that is different in the case with simultaneous multiple reproduction events is studied by Birkner et al. [8]. In this model, a partition π ∈ P encodes the following reproduction event: Let i 1 < i 2 < . . . be the increasing enumeration of the integers that either form singletons or are non-minimal elements of blocks of π. For each j ∈ N, the particle on level i j moves to the level given by the j-th lowest singleton of π if π has at least j singletons, else the particle is removed. For each non-singleton block B ∈ π, the particle on level min B remains on its level and has one offspring on each level in B \ {min B}. Here the trajectories of the particles may cross: Consider a partition π ∈ P such that 1 and 2 are in the same block, 4 forms a singleton, and 3 is the minimal element of a non-singleton block. If the reproduction event encoded by π occurs at time t ∈ (0, ∞), then there exists s ∈ (0, t) such that the particle on level 3 at time s is on level 3 also at time t, and the particle on level 2 at time s jumps to level 4 at time t. Such a crossing cannot occur in our population model by Remark 5.1.

The Ξ-lookdown model
The population model from Subsection 5.1 will now be driven by a Poisson random measure η on (0, ∞) × P as in Schweinsberg [47], Bertoin [4], and Foucart [24]. To define this Poisson random measure, we briefly recall Kingman's correspondence. For a full account, see e. g. [4, Section 2.3.2]. Kingman's correspondence is a one-to-one correspondence between the distributions of the exchangeable random partitions of N and the probability measures on the simplex $\Delta = \{x = (x_1, x_2, \ldots) : x_1 \ge x_2 \ge \ldots \ge 0,\ |x|_1 \le 1\}$, where $|x|_1 = \sum_{i\in\mathbb{N}} x_i$. Every x ∈ ∆ can be interpreted as a partition of [0, 1] into subintervals of lengths x 1 , x 2 , . . ., and possibly another interval of length 1 − |x| 1 which may be called the dust interval. Let U 1 , U 2 , . . . be iid uniform random variables with values in [0, 1]. The paintbox partition associated with x is the exchangeable random partition of N where two different integers i and j are in the same block if and only if U i and U j fall into a common subinterval that is not the dust interval. This construction defines a probability kernel κ from ∆ to P. Conversely, every exchangeable random partition π in P has distribution $\int_\Delta \xi(dx)\,\kappa(x, \cdot)$ for some distribution ξ on ∆. Here x is the random vector in ∆ of the asymptotic frequencies of the blocks of π.
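The paintbox construction is easy to sample on finitely many integers. The following sketch assumes that x is given as a finite list of block frequencies with sum at most 1 (the remaining mass is the dust interval); the function name `paintbox` and its signature are hypothetical.

```python
import random


def paintbox(x, n, rng):
    """Sample the restriction to [n] of the paintbox partition associated with x.

    x   : finite list of interval lengths x_1, x_2, ... with sum <= 1
    rng : a random.Random instance supplying the uniform variables U_1, ..., U_n
    """
    blocks = {}       # index of a subinterval -> integers whose uniform fell into it
    singletons = []   # integers whose uniform fell into the dust interval
    for i in range(1, n + 1):
        u = rng.random()
        acc, label = 0.0, None
        for k, length in enumerate(x):
            acc += length
            if u < acc:
                label = k
                break
        if label is None:
            singletons.append({i})
        else:
            blocks.setdefault(label, set()).add(i)
    return sorted(list(blocks.values()) + singletons, key=min)


rng = random.Random(0)
sample = paintbox([0.5, 0.3], 6, rng)   # dust interval of length 0.2
```

With x = [1.0] there is no dust and all integers share one block; with an empty list every integer falls into the dust interval and forms a singleton.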
Let Ξ be a finite measure on ∆. We decompose $\Xi = \Xi_0 + \Xi\{0\}\delta_0$, (5.4) where 0 = (0, 0, . . .) denotes the zero sequence in ∆. For i, j ∈ N with i ≠ j, we denote by K i,j the partition in P that contains the block {i, j} and apart from that only singleton blocks. We define a σ-finite measure H Ξ on P by $H_\Xi(d\pi) = \int_\Delta \kappa(x, d\pi)\,|x|_2^{-2}\,\Xi_0(dx) + \Xi\{0\} \sum_{1\le i<j} \delta_{K_{i,j}}(d\pi)$, where $|x|_2^2 = \sum_{i\in\mathbb{N}} x_i^2$. Let η be a Poisson random measure on (0, ∞) × P with intensity dt H Ξ (dπ). Note that $\kappa(x, P^n) \le \binom{n}{2} |x|_2^2$ for all x ∈ ∆ and n ≥ 2. This follows as in the paintbox partition associated with x, the probability that two fixed integers belong to the same block is $|x|_2^2$. The random point measure η thus satisfies condition (5.1) a. s. as $E[\eta((0,t] \times P^n)] = t\,H_\Xi(P^n) \le t \binom{n}{2} \Xi(\Delta) < \infty$ for all t ∈ R + and n ∈ N. Hence, we can and will define the population model from Subsection 5.1 from almost every realization of η and every ρ 0 ∈ R N 2 . We also let ρ 0 be an R N 2 -valued random variable that is independent of η. We define the R N 2 -valued process (ρ t , t ∈ R + ) realization-wise from the Poisson random measure η and the random initial state ρ 0 as in the preceding subsection.
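The bound on the paintbox probability of a merger among the first n integers is a union bound over pairs, and for n = 3 it can be compared with an exact value. The sketch below assumes a finite frequency vector x; the inclusion-exclusion formula 3 Σ x_i² − 2 Σ x_i³ for n = 3 is a computation added here for illustration, and the function names are hypothetical.

```python
from math import comb


def pair_probability(x):
    """Probability that two fixed integers share a block of the paintbox: sum of x_i^2."""
    return sum(xi ** 2 for xi in x)


def merger_probability_three(x):
    """Exact probability that at least two of three fixed integers share a block.

    Inclusion-exclusion over the three pairs: any two of the pair events
    intersect in the event that all three integers share one block.
    """
    p2 = sum(xi ** 2 for xi in x)
    p3 = sum(xi ** 3 for xi in x)
    return 3 * p2 - 2 * p3


def union_bound(x, n):
    """Upper bound: number of pairs in [n] times the pair probability."""
    return comb(n, 2) * pair_probability(x)


x = [0.5, 0.3, 0.1]   # illustrative frequencies with a dust interval of length 0.1
exact = merger_probability_three(x)
bound = union_bound(x, 3)
```

For two equal blocks without dust, three integers must produce a collision, so the exact probability equals 1 while the bound equals 1.5.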
Proof. The description around equation (5.3) implies that for 0 ≤ t < t ′ and each n ∈ N, the conditional expectation of γ n (ρ t ′ ) given (ρ s , s ≤ t) is measurable with respect to ρ t and the restriction of η to (t, t ′ ] × P. The assertion follows as n ∈ N was arbitrary and as the restrictions of a Poisson random measure to disjoint subsets are independent. For each n ∈ N and π ∈ P n \ {0 n }, the rate at which reproduction events encoded by partitions in γ −1 n (π) = {π ′ ∈ P : γ n (π ′ ) = π} occur in the lookdown model is given by λ π = H Ξ (γ −1 n (π)). The rates λ π are calculated explicitly in (6.4) and (6.3) in Section 6.2. Remark 5.5. The quantity λ π is the coagulation rate q π in Section 4.2.1 of Bertoin [4]. It is related to the quantity λ n;k 1 ,...,kr;s from Schweinsberg [47] by λ π = λ n;k 1 ,...,kr;s , where k 1 , . . . , k r denote the sizes of the non-singleton blocks of π, and s = n − k 1 − . . . − k r . This can be seen by a comparison of equations (6.4) and (6.3) with equation (11) in [47]. In particular, equation (18) in [47] implies that η satisfies a. s. condition (5.1).
In the next proposition, we state a martingale problem for the process (ρ t , t ∈ R + ).
Recall the set C from Section 4. For φ ∈ C and ρ ∈ R N 2 , we write for n ∈ N, φ ∈ C n , and ρ ∈ R N 2 . Then the stochastic process (ρ t , t ∈ R + ) solves the martingale problem (A, C).
Proposition 5.6 follows from the discussion above and the description of the process (γ n (ρ t ), t ∈ R + ) around equation (5.3). As in [26], the operator A grow reflects the growth of the genealogical distances between the reproduction events that affect them. The operator A repr stands for the jumps of the genealogical distances in reproduction events, as described by equation (5.3). We omit a formal proof of Proposition 5.6.
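For orientation, the two parts of the operator can be written out in a form that matches the verbal description: growth of each off-diagonal distance with slope 2, and jumps by the transformation associated with π ∈ P n at rate λ π . The display below is a sketch under the assumption that A = A grow + A repr on functions φ ∈ C n ; the partial-derivative notation is introduced here and is not taken from the text.

```latex
\begin{align*}
A\phi(\rho) &= A_{\mathrm{grow}}\phi(\rho) + A_{\mathrm{repr}}\phi(\rho), \\
A_{\mathrm{grow}}\phi(\rho) &= 2 \sum_{\substack{i,j\in[n] \\ i\neq j}}
    \frac{\partial \phi}{\partial \rho(i,j)}\bigl(\gamma_n(\rho)\bigr), \\
A_{\mathrm{repr}}\phi(\rho) &= \sum_{\pi \in P_n \setminus \{0_n\}}
    \lambda_\pi \Bigl( \phi\bigl(\pi(\gamma_n(\rho))\bigr) - \phi\bigl(\gamma_n(\rho)\bigr) \Bigr).
\end{align*}
```

The factor 2 in the growth part reflects the slope of the distances between jumps, and the sum in the reproduction part is finite because P n is a finite set.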
Remark 5.7. That the solutions of the martingale problems in Proposition 5.6 and in Proposition 6.4 below are unique can be shown by the approach from Section 11.3. We do not use this assertion in the present paper.

Properties of the genealogy at a fixed time
We consider the process (ρ t , t ∈ R + ) from Subsection 5.2. To apply Theorem 4.1, we need exchangeability of the random variable ρ t for each t ∈ R + . Proposition 5.8. Let t ∈ R + and assume that ρ 0 is exchangeable. Then ρ t is exchangeable.
We prove Proposition 5.8 in Section 11.1. s , s ∈ [0, t)). The distance matrix ρ t ∧ (2t) can be retrieved from (Π (t) s , s ∈ [0, t)) by ρ t (i, j) ∧ (2t) = 2 inf{s ∈ [0, t] : i and j are in the same block of Π (t) s , or s = t} As Ξ-coalescents are exchangeable, it follows that the random variable (ρ t (i, j) ∧(2t)) i,j∈N is exchangeable. We remark that the collection of partitions (Π is the dual flow of partitions from Foucart [24] in one-sided time. We also remark that preservation of exchangeability in the lookdown model is studied in e. g. [8,15,16].
For the application of Theorem 4.1, it is also of interest whether the states ρ t are a. s. dust-free. Proposition 5.10 formulates the criterion from [47,Proposition 30] in our present context. We call the finite measure Ξ on ∆ dust-free if

Decomposition of the genealogical distances
To apply Theorem 4.1(i) to the process (ρ t , t ∈ R + ) from Section 5.2, we need to describe theÛ-valued process (β(ρ t ), t ∈ R + ) by a martingale problem. A version of this process that readily yields a description by a martingale problem is read off from the lookdown model in this section. We define such a process in Subsection 6.1 for a deterministic point measure η that drives the population model. In Subsection 6.2, we let η again be the Poisson random measure.

The deterministic construction
The quantity v t (i) is the time back until an ancestor of the particle on level i at time t is involved in a reproduction event in which it belongs to a non-singleton block, if there is such an event, else v t (i) is defined from v 0 . We let ρ 0 = α(r 0 , v 0 ) and define the process (ρ t , t ∈ R + ) from η and ρ 0 as in Subsection 5.1. We set for t ∈ R + and i, j ∈ N. Then (r t , v t ) can be thought of as a decomposition of the distance matrix ρ t in the sense of Section 2. In this decomposition, we remove from the genealogical tree at time t the part between any leaf i and the most recent reproduction event on the ancestral lineage of this leaf, and we encode the length of this part as the mark v t (i).
Remark 6.1. Consider for this remark the following change (compared to our definition from Section 5.1) in the definition of the reproduction event encoded by a point (t, π) ∈ η: For each non-singleton block B i (π), the reproducing particle on level i at time t− dies and is replaced at time t by its offspring on all the levels in B i (π). Then the quantity v t (i) is the age of the particle on level i at time t if this holds for t = 0. Condition (6.2) below ensures that the times at which the particles on a fixed level are replaced do not accumulate.
Analogously to Section 5.1, we give another description of the process ((r t , v t ), t ∈ R + ). Let S n be the set of semi-partitions of [n], that is, the set of systems of nonempty disjoint subsets of [n]. Every partition is also a semi-partition. However, in a semi-partition, there can be missing elements, that is, elements of [n] that are not contained in the union ∪σ of the blocks of σ. By "blocks" we mean the subsets of [n] that are the elements of σ. From every semi-partition σ ∈ S n , a partition π is obtained by inserting a singleton block for each missing element. We call π the partition associated with σ, and we define σ(i) = π(i) for each i ∈ [n], where π(i) is defined in Section 5.1. In order that equation (6.1) below hold, we associate with each element σ of S n a transformation R n 2 × R n → R n 2 × R n , which we also denote by σ, by σ(r, v) We define the function that removes all singleton blocks from a partition of N and restricts the semi-partition obtained in this way to a semi-partition of [n]. For each reproduction event encoded by a point (s, π) ∈ η, the corresponding jump of the process ((r t , v t ), t ∈ R + ) can be described by ς n (π)(γ n (r s− , v s− )) = γ n (r s , v s ). (6.1) Here we cannot use the restriction γ n (π) (of π to [n]) instead of ς n (π) as we cannot read off from γ n (π) which singleton blocks in γ n (π) are also singleton blocks in π.
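The difference between the restriction γ n (π) and the semi-partition ς n (π) can be seen on a small example. The sketch below assumes a finite partition stored as a set of frozensets; the names `restrict`, `semi_partition` and `associated_partition` are hypothetical.

```python
def restrict(partition, n):
    """gamma_n: intersect every block with [n] and drop the empty intersections."""
    base = set(range(1, n + 1))
    return {frozenset(block & base) for block in partition if block & base}


def semi_partition(partition, n):
    """varsigma_n: remove the singleton blocks of pi, then restrict to [n]."""
    base = set(range(1, n + 1))
    return {frozenset(block & base) for block in partition
            if len(block) > 1 and block & base}


def associated_partition(sigma, n):
    """Insert a singleton block for every element of [n] that is missing in sigma."""
    covered = set().union(*sigma)
    missing = {frozenset({i}) for i in range(1, n + 1) if i not in covered}
    return set(sigma) | missing


# Illustrative partition (an assumption of this sketch): 2 is a singleton of pi,
# while 1 and 3 belong to non-singleton blocks that reach beyond the first 3 levels.
pi = {frozenset({1, 5}), frozenset({2}), frozenset({3, 4})}
sigma = semi_partition(pi, 3)
```

Here γ 3 (π) consists of three singletons and does not reveal that levels 1 and 3 took part in a reproduction event, whereas ς 3 (π) = {{1}, {3}} records exactly this, with 2 as the missing element.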
We remark that P̂^n is the set of partitions of N in which not all of the first n integers form singleton blocks, hence it is strictly larger than the set P^n . Only reproduction events that are encoded by a partition in P̂^n affect the decomposed genealogical distances on the first n levels (γ n (r t , v t ), t ∈ R + ). If η satisfies the condition η((0, t] × P̂^n ) < ∞ for all t ∈ (0, ∞) and n ∈ N, (6.2) then there are only finitely many reproduction events in bounded time intervals that result in a jump of the process (γ n (r t , v t ), t ∈ R + ). Between such jumps, the matrix r t is constant, and the entries of the vector v t grow linearly with slope 1, that is, v t (i) + s = v t+s (i) for i ∈ [n] and t, s ∈ R + with η((t, t + s] × P̂^n ) = 0.
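The slope-1 growth of the marks is consistent with the slope-2 growth of the genealogical distances. The sketch below checks this on a small example, assuming that the map α acts by ρ(i, j) = v(i) + r(i, j) + v(j) for i ≠ j and ρ(i, i) = 0; this form is inferred from the relation between ρ, r and v used in the proof of Proposition 3.4 and is an assumption of the sketch. The names `alpha` and `grow_marks` are hypothetical.

```python
def alpha(r, v):
    """Assumed form of alpha: add the two marks to the distance, zero on the diagonal."""
    n = len(v)
    return [[(v[i] + r[i][j] + v[j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def grow_marks(v, s):
    """Between jumps, every mark grows linearly with slope 1; r stays constant."""
    return [vi + s for vi in v]


# Illustrative decomposed distances on three levels (assumptions of this sketch).
r = [[0.0, 2.0, 4.0], [2.0, 0.0, 4.0], [4.0, 4.0, 0.0]]
v = [1.0, 0.5, 0.0]
s = 0.25

before = alpha(r, v)
after = alpha(r, grow_marks(v, s))
```

Every off-diagonal entry of `after` exceeds the corresponding entry of `before` by 2s, since both marks of a pair grow by s.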

Stochastic evolution
Now let η be the Poisson random measure from Section 5.2 whose distribution is characterized by some finite measure Ξ on ∆. Consider the population model from Subsection 6.1 driven by the Poisson random measure η. For each n ∈ N and σ ∈ S n \ {∅}, the rate at which reproduction events encoded by a partition in ς −1 n (σ) ⊂ P occur is given by where ℓ = #σ, and k 1 , . . . , k ℓ ≥ 1 are the sizes of the subsets in σ in arbitrary order, and Ξ 0 is defined as in (5.4). For the last equality, we consider the paintbox partition π associated with x ∈ ∆: With the notation from the beginning of Section 5.2, integers i, j ∈ [n] are elements of a common subset in ς n (π) if and only if U i and U j fall into a common subinterval that is not the dust interval. In particular, i ∉ ∪ς n (π) if and only if U i falls into the dust interval.
Note that the rates λ π for π ∈ P n \ {0 n }, which we discussed already in Remark 5.5, satisfy where the union and the sum are over all semi-partitions σ ∈ S n with the same nonsingleton blocks as π. In (6.4), we also use the restriction map γ n : P → P n . From equations (6.3) and (6.4), we see that λ {{1,2}} = Ξ(∆) < ∞ and λ π < ∞ for all π ∈ P n \ {0 n }, where 0 n = {{1}, . . . , {n}}. This implies η((0, t] × P n ) < ∞ a. s. for all t ∈ (0, ∞). That is, condition (5.1) is a. s. satisfied, as stated in Section 5.2. The condition (5.7) for Ξ to be dust-free is the condition that λ 1,{{1}} = ∞. That is, each particle reproduces with infinite rate if and only if Ξ is dust-free. Hence, if Ξ is not dust-free, then almost every realization of η satisfies condition (6.2). Moreover, if Ξ is not dust-free, then λ n,σ < ∞ for all n ∈ N and σ ∈ S n \ {∅} as a consequence of equation ( The rates λ n,σ for σ ∈ S n with #σ > 1 are equal to zero in this case. Now we consider the R N 2 × R N -valued process ((r t , v t ), t ∈ R + ) from Subsection 6.1, driven by the Poisson random measure η. The initial state is defined as a R N 2 ×R N -valued random variable (r 0 , v 0 ) that is independent of η.
Proof. This follows by the same argument as for Proposition 5.4.
Recall the setĈ from Section 4. For (r, v) ∈ R N 2 × R N and φ ∈Ĉ n , we write From the discussion above and the description of the process (γ n (r t , v t ), t ∈ R + ) around equation (6.1), we deduce the next proposition.
The operatorÂ grow accounts for the growth of the marks v t which is described in the end of Subsection 6.1. The operatorÂ repr stands for the jumps of the decomposed genealogical distances in reproduction events which are described by equation (6.1). We omit a formal proof of Proposition 6.4.
Finally, we consider again the process (ρ t , t ∈ R + ) which is defined from the Poisson random measure η and the initial state ρ 0 = α(r 0 , v 0 ) as in Section 5.1. We assume ρ 0 ∈ U. Then ρ t ∈ U for all t ∈ R + by Remark 5.2. Moreover, the construction in Subsection 6.1 and the definition of the map α in Section 2 yield ρ t = α(r t , v t ) and (r t , v t ) ∈Û for all t ∈ R + . We further assume that (r 0 , v 0 ) = β(ρ 0 ). (6.5) Then by the following proposition, the decomposition (r t , v t ) of the semi-ultrametric ρ t is the one given by the map β from Section 2, namely the decomposition into the external branches and the remaining subtree.
We prove Proposition 6.5 in Section 11.2.
Proof. This is immediate from Propositions 6.3, 6.4 and 6.5. The Markov property can alternatively be seen from Proposition 5.4 and Remark 4.2.

Tree-valued Fleming-Viot processes
In this section, we apply Theorem 4.1 to the process (ρ t , t ∈ R + ) from Section 5.2. By Remark 5.2, we can consider (ρ t , t ∈ R + ) as a U-valued process. We call all the image processes in Theorem 4.1 tree-valued Fleming-Viot processes. To distinguish them, we also call them U-,Û, and U erg -valued Ξ-Fleming-Viot processes. Proposition 7.1 below states that the martingale problems for the tree-valued Fleming-Viot processes have unique solutions.

Processes with values in the space of metric measure spaces
In this subsection, we consider a finite measure Ξ on ∆ that is dust-free. Let χ ∈ U, and let (ρ t , t ∈ R + ) be the U-valued Markov process from Section 5.2 that is defined in terms of Ξ and an initial state ρ 0 with distribution ν χ . We define a U-valued Ξ-Fleming-Viot process (χ t , t ∈ R + ) with initial state χ ∈ U by χ t = ψ(ρ t ). As a justification for this name, we note that χ 0 = ψ(ρ 0 ) = χ a. s. by Corollary 3.12(iii) and Remark 3.5. By Theorem 4.1 and Propositions 5.4, 5.6, 5.8, and 5.10, the process (χ t , t ∈ R + ) is Markovian and solves the martingale problem (B, Π), where the generator B is defined by BΦ(χ) = ν χ (Aφ) for φ ∈ C with associated polynomial Φ ∈ Π, and χ ∈ U. Here A is the generator defined in Proposition 5.6. The martingale problem (B, Π) is a generalization of the martingale problem in Theorem 1 of Greven, Pfaffelhuber, and Winter [26].

Processes with values in the space of marked metric measure spaces
Let Ξ be a general finite measure on the simplex ∆. Letχ ∈Û, let ρ 0 be a U-valued random variable with distribution α(νχ), and let the U-valued Markov process (ρ t , t ∈ R + ) be defined, as in Section 5.2, from Ξ and the initial state ρ 0 . We define aÛ-valued Ξ-Fleming-Viot process (χ t , t ∈ R + ) with initial state χ ∈Û byχ t =ψ(β(ρ t )) for t ∈ R + .
If Ξ is dust-free, then for each t ∈ (0, ∞) by Remark 4.3 and Proposition 5.10, the marked metric measure spaceχ t is a. s. dust-free, andχ t is determined a. s. by the associated metric measure space χ t .

Processes with values in the space of distance matrix distributions
Let (χ t , t ∈ R + ) be the process from Section 7.2, where Ξ is a general finite measure on the simplex ∆. We define a U erg -valued Ξ-Fleming-Viot process (ξ t , t ∈ R + ) with initial state α(νχ 0 ) ∈ U erg by ξ t = α(νχ t ). Again by Theorem 4.1 and Propositions 5.4, 5.6 and 5.8, it follows that (ξ t , t ∈ R + ) is Markovian and solves the martingale problem (C, C ), where the generator C is defined by CΨ(ξ) = ξ(Aφ) for all ξ ∈ U erg , φ ∈ C, and Ψ ∈ C : ξ ′ → ξ ′ φ. Here the generator A is defined as in Proposition 5.6.
That a martingale problem is well-posed means that a solution exists whose finite-dimensional distributions are uniquely determined by the initial state. A proof of Proposition 7.1 by duality is given in Section 11.3.

Some semigroup properties
In this section, we state Feller continuity of tree-valued Ξ-Fleming-Viot processes, and that the domains of the martingale problems for them are cores. We considerÛ-valued Ξ-Fleming-Viot processes in detail; analogous results hold for the other processes from Section 7.
Let Ξ be a finite measure on the simplex ∆. For χ ∈Û, let (χ t , t ∈ R + ) under the probability measure P χ with associated expectation E χ be theÛ-valued Ξ-Fleming-Viot process from Section 7.2 with initial state χ. We denote by C b (E) the set of bounded continuous R-valued functions on a metric space E. We endow C b (E) with the supremum norm.
The results in this section rely on the following lemma which we prove in Section 11.4 using the lookdown construction.
As a corollary, we obtain the Feller continuity of aÛ-valued Ξ-Fleming-Viot process, namely that its semigroup preserves the set of bounded continuous functions.
Proof. This follows from Lemma 8.1 as the setΠ of marked polynomials is convergence determining. Here we use the definition of convergence in distribution inÛ.
LetL denote the closure ofΠ in C b (Û) with respect to the supremum norm. For application in [27], we note two more corollaries of Lemma 8.1. The first of them states that the semigroup of aÛ-valued Ξ-Fleming-Viot process can be restricted to a semigroup onL that is strongly continuous.
is an element ofL. Moreover, Proof. The first assertion follows from Lemma 8.1 and the definition ofL. As (χ t , t ∈ R + ) solves the martingale problem (B,Π) from Section 7.2, for all t ∈ R + and Φ ∈Π. The second assertion follows asBΦ is bounded and by definition ofL.
The next corollary says that the semigroup onL of aÛ-valued Ξ-Fleming-Viot process is generated by the closure of the operatorB with domainΠ, see [19,Chapter 1] for the definitions. Let L be the closure of Π in C b (U) and let L ′ be the closure of C in C b (U erg ), with respect to the supremum norm. In the same way as above, it can be shown: The semigroup on L ′ of a U erg -valued Ξ-Fleming-Viot process is strongly continuous and generated by the closure of the operator C with domain C from Section 7.3. If Ξ is dust-free, then the semigroup on L of a U-valued Ξ-Fleming-Viot process is strongly continuous and generated by the closure of the operator B with domain Π from Section 7.1. Continuity properties analogous to Proposition 8.2 also hold.

Convergence to equilibrium
Let Ξ be a finite measure on the simplex ∆ with Ξ(∆) > 0. We show convergence to equilibrium for the Û-valued process (β(ρ_t), t ∈ R_+) from Section 6.2. From this, we deduce in Proposition 9.1 that the tree-valued Ξ-Fleming-Viot process from Section 7.2 also converges to equilibrium. In the same way, it can be shown that the other processes from Section 7 converge to equilibrium.
We define stationary processes and use a coupling argument. Analogously to Section 5.2, letη be a Poisson random measure on R × P with intensity dt H Ξ (dπ). This Poisson random measure drives a population model in two-sided time (with time axis R) where the reproduction events and the ancestral levelsĀ s (t, i) are defined as in Section 5.1. Then we define the stationary U-valued process (ρ t , t ∈ R) of the genealogical distances byρ On an event of probability 1, all these distances are finite. This follows from the assumption that Ξ(∆) > 0. Thatρ t is indeed a semi-ultrametric for each t ∈ R can be seen as in Remark 5.2. Clearly, ρ t is exchangeable, which follows from exchangeability of the Ξ-coalescent as in Remark 5.9 or can be shown as in the proof of Proposition 5.8.
Proposition 9.1. The Û-valued random variable χ_t converges in distribution to a Ξ-coalescent measure tree as t → ∞.

Proofs of the general results
In Subsection 10.1, we prove Proposition 3.4, which is needed for the proof of the uniqueness result (Proposition 3.11) in Subsection 10.3. We prove the sampling representation (Theorem 3.9) in Subsections 10.2-10.4. Theorem 4.1 gives the application to tree-valued processes and is proved in Subsection 10.5.

Proof of Proposition 3.4
The proof of this result from Section 3.2 relies on the fact that in a separable metric space, an iid sequence with respect to a probability measure on the Borel sigma algebra has no isolated elements.
We write ρ = α(r, v). We show that v = Υ(ρ) a. s. from which the assertion follows by definition of the map β. Let ε > 0 and i ∈ N. By separability, X × R + can be covered by countably many balls of diameter ε. This implies and that there exists a random j ∈ N \ {i} with By inequality (10.1) and the definition of ρ, it follows that Using the definition of the map Υ, we deduce v(i) + 2ε ≥ 1 2 ρ(i, j) ≥ Υ(ρ)(i) a. s. For the converse inequality, we first note that 2v(i) ≤ v(i) + v(j) + 2ε + r(i, j) = ρ(i, j) + 2ε (10.2) by inequality (10.1) and the definition of ρ. Moreover, for all k ∈ N \ {i, j}, we obtain Here we use inequality (10.2) for the first and inequality (10.1) for the fifth and sixth step, the definition of ρ for the third and fifth step, and ultrametricity for the second step. By definition of the map Υ, we obtain As ε > 0 and i ∈ N were arbitrary, it follows that Υ(ρ) = v a. s.
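For a finite semi-ultrametric, the decomposition ρ = α(r, v) that this proof works with can be checked numerically. The following is a minimal sketch only. It assumes, as the inequalities in the proof suggest, that Υ(ρ)(i) = ½ inf_{j≠i} ρ(i, j) and that α(r, v)(i, j) = v(i) + r(i, j) + v(j) for i ≠ j; the authoritative definitions are those of Section 2, and the function names `upsilon`, `decompose`, and `recompose` are hypothetical.

```python
import numpy as np

def upsilon(rho):
    """Assumed form of the map Upsilon: half the distance to the nearest other point."""
    off = rho + np.diag(np.full(len(rho), np.inf))  # mask the zero diagonal
    return 0.5 * off.min(axis=1)

def decompose(rho):
    """Split rho into (r, v) such that rho(i, j) = v(i) + r(i, j) + v(j) for i != j."""
    v = upsilon(rho)
    r = rho - v[:, None] - v[None, :]
    np.fill_diagonal(r, 0.0)
    return r, v

def recompose(r, v):
    """Assumed form of the map alpha: add the two external branch lengths back."""
    rho = r + v[:, None] + v[None, :]
    np.fill_diagonal(rho, 0.0)
    return rho

# Ultrametric of a small coalescent tree: leaves 0 and 1 merge at height 1,
# and leaf 2 joins them at height 3.
rho = np.array([[0.0, 2.0, 6.0],
                [2.0, 0.0, 6.0],
                [6.0, 6.0, 0.0]])
r, v = decompose(rho)
```

In this toy example, v records the external branch lengths and r the distances between the starting vertices of the external branches; recomposing (r, v) returns ρ.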

Measurability of the construction of (marked) metric measure spaces
In this subsection, we show Proposition 3.7 from Section 3.3. We only discuss measurability of the map ψ̂ : D × R_+^N → M̂ therein. Measurability of the map ψ : D → M follows along the same lines.
Recall that the Prohorov distance between two probability measures µ and µ ′ on the Borel sigma algebra on a metric space (Z, d Z ) is given by where the first infimum is over all couplings ξ of the probability measures µ and µ ′ . We also use the marked Gromov-Prohorov distance d mGP which metrizes the marked Gromov-weak topology onM, see [12]. It is defined by for marked metric measure spaces (X, r, m) and (X ′ , r ′ , m ′ ). Here the infimum is over all isometric embeddings ϕ : X → Z and ϕ ′ : X ′ → Z into complete and separable metric spaces (Z, d Z ). The space Z × R + is endowed with the product metric analogously for X × R + and X ′ × R + . The mapsφ : X ×R + → Z ×R + andφ ′ : We writeD = D × R N + . For n ∈ N, we denote bŷ D n = {(r, v) ∈ R n 2 + × R n + : r(i, i) = 0, r(i, j) = r(j, i), r(i, j) + r(j, k) ≥ r(i, k) for all i, j, k ∈ [n]} the space of decomposed semimetrics on [n] which we view as a subspace of R n 2 × R n . We denote byψ n :D n →M the function that maps (r, v) ∈D n to the isomorphy class of the marked metric measure space ([n], r, n −1 n i=1 δ (i,v(i)) ), here we also identify the elements of the semi-metric space ([n], r) with distance zero. Proof. W. l. o. g. we can assume thatD is endowed with the metric d that is given by For (r, v), (r ′ , v ′ ) ∈D n , we define a probability measure ξ on (D) 2 as the distribution of ((r( . Then ξ(· × D) = νψ n(r,v) and ξ(D × ·) = νψ n(r ′ ,v ′ ) . For the coupling characterization (10.4) implies d P (νψ n(r,v) , νψ n(r ′ ,v ′ ) ) ≤ c + ξ{(y, y ′ ) ∈D 2 : d(y, y ′ ) > c} = c.
Continuity of ψ̂_n follows by definition of the marked Gromov-weak topology.
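The coupling bound used in the preceding proof can be illustrated for finitely supported measures. The sketch below relies on the standard fact that any coupling ξ with ξ{d > ε} ≤ ε yields d_P ≤ ε, and computes the smallest such ε for one given coupling. It is an upper bound for that coupling only, not an optimisation over all couplings, and the function name and array layout are assumptions made for illustration.

```python
import numpy as np

def prohorov_bound_from_coupling(K, D):
    """Return inf{eps >= 0 : K{(y, y') : D(y, y') > eps} <= eps}.

    K[a, b] is the mass the coupling puts on the pair (y_a, y'_b),
    D[a, b] >= 0 is the distance between y_a and y'_b.
    """
    K = np.asarray(K, dtype=float).ravel()
    D = np.asarray(D, dtype=float).ravel()
    levels = np.unique(D[K > 0])              # distances that carry mass
    starts = np.concatenate(([0.0], levels))
    ends = np.concatenate((levels, [np.inf]))
    for a, b in zip(starts, ends):
        # On [a, b) the tail mass K{D > eps} is constant and equals K{D > a}.
        tail = K[D > a].sum()
        candidate = max(a, tail)
        if candidate < b:
            return float(candidate)

# Identity coupling of a measure with itself: the bound is 0.
same = prohorov_bound_from_coupling([[0.5, 0.0], [0.0, 0.5]],
                                    [[0.0, 1.0], [1.0, 0.0]])
# Two unit point masses at distance 0.5: the bound is 0.5.
shifted = prohorov_bound_from_coupling([[1.0]], [[0.5]])
```

If all mass of the coupling sits on pairs at distance at most c, as in the proof above, the bound returned is at most c.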
Proof of Proposition 3.7. Let (r, v) ∈D * and let (X, r) be the metric completion of (N, r). We endow the product space X × R + with the metric d X×R + ((x, v), , m) = 0 for a probability measure m on X × R + . Asψ(r, v) equals the isomorphy class of (X, r, m), and asψ n (γ n (r, v)) equals the isomorphy class of (X, r, n −1 n i=1 δ (i,v(i)) ) for each n ∈ N, the definition of the marked Gromov-Prohorov metric implies that lim n→∞ d mGP (ψ(r, v),ψ n (γ n (r, v))) = 0.
For (r, v) ∈ D̂ \ D̂*, the image ψ̂(r, v) is constant by definition. Using Lemma 10.1 and Lemma 10.2 below, we deduce measurability of ψ̂.

Proof. We represent D̂* by countable unions and intersections of measurable sets. The assertion on D follows along the same lines by removing the marks v.
For (r, v) ∈ D̂, let (X, r) be the metric completion of (N, r). We endow the product space X × R_+ with the metric d_{X×R_+}((x, v), (x′, v′)) = r(x, x′) ∨ |v − v′| and define for n ∈ N the probability measures m_n = n^{-1} Σ_{i=1}^n δ_{(i,v(i))} on X × R_+. The assertion (r, v) ∈ D̂* is equivalent to the assertion that (m_n, n ∈ N) is a Cauchy sequence with respect to the Prohorov metric on X × R_+. Hence, 1{∃j ∈ F with r(i, j) ∨ |v(i) − v(j)| < ε} +ε}.

Resampling from marked metric measure spaces
We will use the statements from this section to prove assertions (ii) and (iii) of Theorem 3.9. In the end of this section, we also prove Proposition 3.11 from Section 3.4.
The following proposition can be compared with Lemma 8 of Vershik. We construct a marked metric measure space from a marked distance matrix. When we sample according to its marked distance matrix distribution, the assertion is that we arrive at a random variable that has the same distribution as the marked distance matrix with which we started. Recall the functionsψ, ψ and the setsD * ,D from Section 3.3. Proof of Proposition 10.3. Let n ∈ N and let φ : R n 2 + × R n + → R be bounded and continuous. Let (X, r, m) be the representative ofψ(r, v) as in the definition ofψ. We have Here the assumption (r, v) ∈D * ensures that m is the weak limit of the uniform probability measures 1 k k ℓ=1 δ (ℓ,v(ℓ)) on X × R + . This yields the second equality by dominated convergence. For the third equality, we use that summands where ℓ 1 , . . . , ℓ n are not pairwise distinct vanish in the limit, and that for all other summands, the expectation in the second line equals by exchangeability the expectation in the third line.
In the next proposition, we start with a marked metric measure space and sample (r, v) according to its marked distance matrix distribution. The marked metric measure space that we construct from any typical realization of (r, v) turns out to be isomorphic to the marked metric measure space with which we started.
Proposition 10.5. Let χ ∈ M̂ and let (r, v) be a D × R_+^N-valued random variable with distribution ν^χ. Then (r, v) ∈ D̂* a. s. and ψ̂(r, v) = χ a. s.

Remark 10.6. Proposition 10.5 is essentially Vershik's proof [50, Theorem 4] of the Gromov reconstruction theorem (where metric measure spaces are considered, cf. also [12, Theorem 1] for marked metric measure spaces). The present formulation focuses on the map ψ̂ that will be used in the proofs of Theorems 3.9(iii) and 4.1 below.
Proof of Proposition 10.5. Let (X′, r′, m′) be a representative of χ. W. l. o. g. we assume that the closed support of the probability measure m′(· × R_+) is the whole space X′, and that (r, v) = ((r′(x(i), x(j)))_{i,j∈N}, v) for an m′-iid sequence (x, v). We denote by (X, r) the completion of (N, r). We endow X′ × R_+ with the product metric, and analogously X × R_+. As the sequence (x(i))_{i∈N} is a. s. dense in X′, the isometry that maps x(i) to i for all i ∈ N can a. s. be extended to a (surjective) isometry ϕ from X′ to X. An isometry ϕ̂ from X′ × R_+ to X × R_+ is a. s. given by (x, v′) ↦ (ϕ(x), v′). By the Glivenko-Cantelli theorem, the probability measures m′_n := n^{-1} Σ_{i=1}^n δ_{(x(i),v(i))} on X′ × R_+ converge weakly to m′ a. s. As ϕ̂ is continuous, the probability measures m_n := n^{-1} Σ_{i=1}^n δ_{(i,v(i))} = ϕ̂(m′_n) on X × R_+ converge weakly to m := ϕ̂(m′) a. s. This implies (r, v) ∈ D̂* a. s. and that ψ̂(r, v) equals the isomorphy class of (X, r, m) a. s. The second assertion follows as ϕ̂ is a. s. a measure-preserving isometry from X′ × R_+ to X × R_+, which implies that (X′, r′, m′) and (X, r, m) have a. s. the same marked distance matrix distribution.
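The sampling step in this proof can be tried out on a toy space. The sketch below draws an m′-iid sequence of (point, mark) pairs from a finite marked metric measure space, forms the marked distance matrix (r, v), and compares the empirical weights with m′. The three-point space, its marks, and the function name are hypothetical choices for illustration.

```python
import numpy as np

def sample_marked_distance_matrix(dist, atoms, weights, n, rng):
    """Draw n iid (point, mark) pairs and return (r, v, idx).

    dist    : distance matrix r' of the finite space X'
    atoms   : list of (point index, mark) pairs carrying the mass of m'
    weights : masses of the atoms (they sum to one)
    """
    idx = rng.choice(len(atoms), size=n, p=weights)
    points = np.array([atoms[k][0] for k in idx])
    v = np.array([atoms[k][1] for k in idx], dtype=float)
    r = dist[np.ix_(points, points)]        # r(i, j) = r'(x(i), x(j))
    return r, v, idx

dist = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 2.0],
                 [2.0, 2.0, 0.0]])
atoms = [(0, 0.5), (1, 0.5), (2, 1.5)]      # (point of X', mark in R_+)
weights = np.array([0.2, 0.3, 0.5])
rng = np.random.default_rng(0)
r, v, idx = sample_marked_distance_matrix(dist, atoms, weights, 2000, rng)
empirical = np.bincount(idx, minlength=len(atoms)) / len(idx)
```

For large n, the vector `empirical` is close to `weights`, which is the finite-space counterpart of the weak convergence of the empirical measures used above.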
Remark 10.8 (Marked metric measure spaces and weighted real trees). Let χ ∈ Û, and let (r, v) be a Û-valued random variable with the marked distance matrix distribution of χ. By Proposition 10.5, we have (r, v) ∈ D̂* a. s., hence we can associate with any typical realization of (r, v) a complete and separable weighted real tree (T, d, µ) as in Remark 3.8. As in Proposition 3.14, the random marked distance matrix (r, v) is ergodic with respect to the action of the group of finite permutations. This yields that the measure-preserving isometry class of the weighted real tree (T, d, µ) is an a. s. constant random variable. Its typical realization can be associated with χ.
Proof of Proposition 3.11. Let (r, v) be a random variable with conditional distribution ν^χ given χ. Then we can assume ρ = α(r, v). Propositions 3.4 and 10.5 imply χ = ψ̂ ∘ β(ρ) a. s. Hence, the distribution of ρ determines the distribution of χ uniquely, which is the "only if" assertion. The other direction clearly holds as the distribution of χ determines the distribution of ρ uniquely.

Proof of the sampling representation
We give two proofs of Theorem 3.9(i) from Section 3.4 that build on a common part, namely statement (10.7) below. The plan for the first proof is the following: We partition the completion of the tree (T, d) associated with the semi-ultrametric ρ (as in Remark 1.1) into small subsets. Into each of these subsets, we place an atom whose mass is given by the asymptotic frequency of those integers that label the leaves of T that are the endpoints of the external branches that begin in this subset. By exchangeability, these asymptotic frequencies exist, and (10.7) yields that they add up to one. We obtain an atomic probability measure on the product space of the metric completion of the tree and the mark space R_+ by defining the R_+-component as the distance to the top of the coalescent tree. Using the coupling characterization (10.4) of the Prohorov metric, we show that this probability measure converges as the subsets become infinitely small, and that the limit measure coincides with the limit of the uniform measures in the definition of D̂*.
As a slight difference to the description in the preceding paragraph, we will work with the space (X, r) that corresponds to the completion of the space only of the starting vertices of the external branches, but we will occasionally recall the relation to the whole tree. We will use definitions also from Section 2.
Let ε > 0. As the distribution of the random variable v(i) has at most countably many atoms, there exists a deterministic sequence 0 < h (ε) 1 < h (ε) 2 < . . . that increases to infinity and that satisfies h for all i, j, n ∈ N. We set h (ε) n ) for n ∈ N. We define an equivalence relation ∼ ε on N such that two distinct integers i, j are equivalent if and only if there exists n ∈ N with v(i), v(j), 1 2 ρ(i, j) ∈ I ε n .
To show transitivity, we consider i, j, k ∈ N with i ≠ k, i ∼_ε j, and j ∼_ε k. Then there exists n ∈ N with v(i), v(j), v(k), ρ(i, j)/2, ρ(j, k)/2 ∈ I^ε_n. As v(i) ≤ ρ(i, k)/2 ≤ (ρ(i, j) ∨ ρ(j, k))/2 by definition of Υ and ultrametricity, it follows that ρ(i, k)/2 ∈ I^ε_n and hence i ∼_ε k.
Note that the definitions in Section 2 imply for i ∼ ε j. (That is, in the context of Remark 2.2, the starting points of external branches that end in leaves (0, i), (0, j) of T with i ∼ ε j have distance smaller than 2ε.) In the next two paragraphs, we prove the following claim: A. s., the partition of N given by ∼ ε contains no singleton blocks. (10.7) For each i, n ∈ N the sequence (1{v(j) ∈ I ε n , ρ(i, j)/2 ∈ I ε n }, j ∈ N \ {i}) is exchangeable. By the de Finetti theorem, it is conditionally iid. Hence, on the event that there exists j ∈ N \ {i} with v(j) ∈ I ε n and ρ(i, j)/2 ∈ I ε n , there exists a. s. another (in fact, infinitely many) such j in N \ {i}.
For j ∈ N, the definition of Υ and condition (10.5) imply the existence of (random) n ∈ N and i ∈ N \ {j} such that v(j) ∈ I ε n and ρ(i, j)/2 ∈ I ε n a. s. As shown in the preceding paragraph, there exists a. s. an integer k ∈ N \ {i, j} with v(k) ∈ I ε n and ρ(i, k)/2 ∈ I ε n . From v(k) ≤ ρ(j, k)/2 ≤ (ρ(i, j) ∨ ρ(i, k))/2, it follows that ρ(j, k)/2 ∈ I ε n a. s. This proves (10.7). Now we show that the asymptotic frequencies exist and add up to one. For A ⊂ N and k ∈ N, we denote the relative frequency by |A| k = k −1 #(A ∩ [k]) and the asymptotic frequency by |A| = lim k→∞ |A| k , provided the limit exists. As the random partition given by ∼ ε is exchangeable, the asymptotic frequencies of its blocks exist a. s. by Kingman's correspondence. Let B ε (i) denote the equivalence class of i ∈ N with respect to ∼ ε , and let M ε = {j ∈ N : j = min B ε (i) for some i ∈ N} be the set of minimal elements of the equivalence classes of ∼ ε . As the exchangeable partition given by ∼ ε has no singleton blocks a. s., it has proper frequencies by Kingman's correspondence, that is, Consequently, on an event of probability 1, a probability measure m ε on the product sigma algebra on X × R + is given by (Into each of the subsets of (X, r) given by ∼ ε , the first component of the measure m ε lays an atom with mass given by the asymptotic frequency of the integers that label the corresponding leaves in T .) Let ε 1 > ε 2 > . . . > 0 with lim ℓ→∞ ε ℓ = 0. For each ℓ ∈ N, we replace ε with ε ℓ everywhere in this proof until now, and we use the notations introduced so far. We also assume that for k ≤ ℓ, the sequence (h By Fatou's lemma and as a. s., the partition given by ∼ ε ℓ has proper frequencies, it follows that |B ε k (i)| = |B ε ℓ (i 1 )| + |B ε ℓ (i 2 )| + . . . a. s.
Using equation (10.8), we deduce A. s., a coupling of m ε k and m ε ℓ is given by the probability measure Indeed, as equation (10.9) implies K is a. s. a coupling of m ε k and m ε ℓ . In words, the probability measure m ε ℓ can be obtained by splitting each atom of m ε k into fragments. Let us sample a point (j, v(j)) according to m ε ℓ , and let (i, v(i)) be the point such that the atom of m ε ℓ at (j, v(j)) is one of the fragments of the atom of m ε k at (i, v(i)). Then the pair ((i, v(i)), (j, v(j))) has distribution K.
For every pair (i, j) that appears in the sum in equation (10.10), we have i ∼ ε k j, hence |v(i) − v(j)| < ε k and r(i, j) < 2ε k . Hence, the coupling characterization of the Prohorov metric (10.4) yields a. s. for all k ≤ ℓ, when X × R + is endowed with the product metric d X×R + that is given by As a consequence, on an event of probability 1, the sequence (m ε ℓ , ℓ ∈ N) in the space of probability measures on the complete space X × R + is Cauchy, we denote its limit by m. Consider for n, ℓ ∈ N also the probability measure m ε ℓ n on X × R + , given by As there exists a. s. a coupling K ′ of the probability measures m ε ℓ n and m ε ℓ with K ′ {(y, y)} = m ε ℓ n {y} ∧ m ε ℓ {y} for all y ∈ X × R + , the coupling characterization of the Prohorov metric (10.4)  This shows assertion (i). Assertion (ii) follows from assertion (i) and Proposition 10.3. Proposition 10.5 implies assertion (iii).
The idea for the second proof of Theorem 3.9(i) is to construct directly by the de Finetti theorem a sampling measure on a subspace of the metric completion of the coalescent tree associated with ρ. To this aim, we fix by conditioning the closure of the subspace of the starting vertices of the external branches that end in the leaves labeled by the odd integers. By (10.7), this subspace contains a. s. the sequence of the starting vertices of the external branches associated with the even integers, and this sequence is exchangeable. For a related result, see also Forman, Haulk, and Pitman [23], where trees are embedded into ℓ 1 .
Remark 10.9. The second proof given below goes in a direction that is similar to the argument in Section 7 of [22] for the construction of the sampling measure µ on the real tree S = Γ(T). That the equality Γ(T) = Γ(T − ) = Γ(T + ) on p. 268 in [22] holds for the embedding of Γ(T − ) and Γ(T + ) into Γ(T) can be seen from (10.7) as in the proof below as Γ(T), Γ(T − ), and Γ(T + ) then correspond to X, X 1 , and X 2 therein. The real tree Γ(T − ) can then be endowed with a measure like X 1 is endowed with µ 1 . Note that the starting vertices of the external branches and the subtree spanned by them are called the points of attachment and the core, respectively, in [22].
We remark that the second last paragraph of the proof below shows that the isomorphy class of the weighted real tree (S, µ) is a. s. equal to ψ(r) where (r, v) = β(d) and d is the exchangeable ultrametric on N from [22,Section 7], which corresponds to ρ below. This equality can also be deduced from Theorem 3.9, Remark 3.6, as ψ(r) is a. s. constant by the ergodicity assumption in [22], and from the Gromov reconstruction theorem.
Second proof of Theorem 3.9(i). Let (r, v) = β(ρ). We construct the first component of the sampling measure, showing r ∈ D * a. s.
We denote by N 1 the odd, and by N 2 the even integers. Let (X, r) denote the metric completion of (N, r). A. s. by (10.6) and (10.7), there exists for each i ∈ N 2 an integer j ∈ N 1 with r(i, j) < 2ε. As ε can be chosen arbitrarily small, it follows that i is a. s. contained in the closure X 1 of the subset N 1 of (X, r), hence X 1 = X a. s. (Recall from Remark 2.2 that N corresponds here to the set of starting vertices of the external branches in the coalescent tree (T, d) associated with ρ.) (This is the length of the external branch that ends in the leaf (0, i) in the subtree spanned by the leaves with labels in N 1 .) By exchangeability of the sequence (ρ(i, j) : j ∈ N \ {i}) and by definition of v = Υ(ρ), it follows that v 1 (i) = v(i) a. s. Let ρ 1 = (ρ(i, j)) i,j∈N 1 be the restriction of ρ to N 1 . We define the random variable r 1 = (r 1 (i, j)) i,j∈N 1 by By definition of r in Section 2, it follows that r 1 = (r(i, j)) i,j∈N 1 a. s. Let Λ be a regular conditional distribution of ρ given ρ 1 . Then for a. a. ρ 1 , under Λ(ρ 1 , ·), the complete and separable metric space (X 1 , r) is a. s. constant as r 1 is ρ 1 -measurable.
Moreover, the sequence 2, 4, 6, . . . of the even integers, viewed as a sequence in (X 1 , r), is exchangeable under Λ(ρ 1 , ·) for a. a. ρ 1 . To see this, we use that the Borel sigma algebra on (X 1 , r) is generated by the balls around the elements of N 1 ⊂ X 1 . Let n ∈ N, and let B 2 , . . . , B 2n be some finite intersections of such balls. Note that {2 ∈ B 2 , . . . , 2n ∈ B 2n } can be written as an intersection of events of the form {ρ(i, j) < c}, where i ∈ N 2 , j ∈ N 1 and c ∈ (0, ∞). Using this, the uniqueness lemma, and the elementary fact that the conditional distribution of ρ given its restriction ρ 1 is invariant under permutations that leave N 1 fixed, we obtain the claimed exchangeability.
For this exchangeable sequence, the de Finetti theorem yields, Λ(ρ 1 , ·)-a. s. for a. a. ρ 1 , a sampling measure µ 1 on (X 1 , r) that is the weak limit of the probability measures µ 1 n := n −1 n i=1 δ 2i on (X 1 , r). By the same argument as above, also the closure X 2 of the subset N 2 in (X, r) equals X a. s. On the event of probability 1 on which N 2 is a dense subset of X 2 = X = X 1 , an isometry ϕ : X 1 → X 2 is given by ϕ(i) = i for i ∈ N 2 . As also the weak limit of the image measures ϕ(µ 1 n ) on (X 2 , r) exists a. s., we have shown (r(2i, 2j)) i,j∈N ∈ D * a. s. This implies r ∈ D * a. s. as r and (r(2i, 2j)) i,j∈N are equal in distribution by exchangeability of r.
That (r, v) ∈D * can be shown analogously by considering the sequence (i, v(i)) i∈N 2 in the space X 1 × R + which we endow with the metric d

Proof of Theorem 4.1
The following property is central in the proof of Theorem 4.1. and If the assumptions of Theorem 4.1 hold and ρ t is a. s. dust-free, then Proof. This is immediate from Theorem 3.9, the definition of ξ t , and Corollary 3.12.
Remark 10.11. In the context of Theorem 4.1(i), let (P_t, t ∈ R_+) denote the semigroup on M_b(Û) of the Markov process (β(ρ_t), t ∈ R_+), and let (Q_t, t ∈ R_+) denote the semigroup on M_b(Û) of the Markov process (χ_t, t ∈ R_+). Let K denote the probability kernel from U to Û, given by K(χ, ·) = ν^χ for χ ∈ Û. Then Proposition 10.10 yields the intertwining relation Q_t K = K P_t, which is condition (b) in [45, Theorem 2]. Many papers have appeared on intertwining of Markov processes; a classical one is, for instance, [9].
The proof of (iii) is also analogous. We apply [45, Theorem 2] to the process (ρ_t, t ∈ R_+), the measurable map ψ : U → U, and the probability kernel from U to U given by (χ, B) ↦ ν^χ(B). We use the assumption that ρ_t is a. s. dust-free in the application of Proposition 10.10 and Remark 10.7.

Proofs related to the lookdown model
This section contains the remaining proofs of the results from Sections 5-8.

Exchangeability in the lookdown model
To prove Proposition 5.8, we show in Lemma 11.1 below that exchangeability of the genealogical distances is preserved in single reproduction events. Then we construct the genealogical distance matrix ρ t at time t, restricted to the first n ∈ N particles, from the initial state ρ 0 and the reproduction events before time t that affect the genealogical distances between the first n individuals. Here we use the description of the process (γ n (ρ t ), t ∈ R + ) by its jumps and the evolution between the jumps from the end of Section 5.1.
For n ∈ N, we define the action of the group S n of permutations of [n] on the set P n of partitions of [n], and on R n 2 , respectively, by p(π) = {p(B) : B ∈ π} and p(ρ) = (ρ(p(i), p(j))) i,j∈[n] (11.1) for each p ∈ S n , π ∈ P n , ρ ∈ R n 2 . A random variable with values for instance in P n or in R n 2 is called exchangeable if its distribution is invariant under the action of S n .
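The two actions in (11.1) are easy to experiment with for small n. The following minimal sketch represents a permutation p of [n] as a list and uses 0-based indices, so that {0, ..., n−1} plays the role of [n]; the function names are hypothetical.

```python
import numpy as np

def act_on_partition(p, blocks):
    """p(pi) = {p(B) : B in pi}: relabel every element of every block."""
    return {frozenset(p[i] for i in B) for B in blocks}

def act_on_matrix(p, rho):
    """p(rho) = (rho(p(i), p(j)))_{i,j}: permute rows and columns simultaneously."""
    p = np.asarray(p)
    return rho[np.ix_(p, p)]

p = [1, 2, 0]                                  # the cycle 0 -> 1 -> 2 -> 0
pi = {frozenset({0, 1}), frozenset({2})}
rho = np.array([[0.0, 2.0, 6.0],
                [2.0, 0.0, 6.0],
                [6.0, 6.0, 0.0]])
p_pi = act_on_partition(p, pi)                 # blocks {1, 2} and {0}
p_rho = act_on_matrix(p, rho)
```

A random ρ is exchangeable in the sense above when `act_on_matrix(p, rho)` has the same distribution as `rho` for every permutation p.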
Lemma 11.1. Let n ∈ N, let π be an exchangeable random partition of [n], and let ρ be an exchangeable random variable with values in R n 2 . Assume that π and ρ are independent. Then the random variable π(ρ) is exchangeable.
Lemma 11.1 can be seen as a generalization of Lemma 4.3 of Bertoin [4].
Proof. Let p ∈ S n . For each partition π ′ ∈ P n , the blocks of π ′ are in one-to-one correspondence with the blocks of p(π ′ ) via the bijection that maps a block B ∈ π ′ to the block p(B) ∈ p(π ′ ). Also, the blocks of π ′ are in one-to-one correspondence with the integers in [n] that are the minimal elements of the blocks of π ′ . The same holds for the blocks of p(π ′ ) and their minimal elements. It follows that the minimal elements of the blocks of π ′ are in one-to-one correspondence with the minimal elements of the blocks of p(π ′ ). We extend this one-to-one correspondence arbitrarily to a bijection from [n] to itself which we denote by f (π ′ ). This defines a map f : P n → S n which satisfies for all π ′ ∈ P n and i ∈ [n]. This equation holds as π ′ (i), by its definition in Section 5.1, is a minimal element of a block of π ′ and as p(π ′ )(p(i)) is the minimal element of the corresponding block of π ′ . By equation (11.1) and the definition (5.2) of the transformation on R n 2 associated with each element of P n , equation (11.2) implies for all π ′ ∈ P n and ρ ′ ∈ R n 2 . By assumption, p(π) and π are equal in distribution. As the distribution of f (π ′ )(ρ) is the same for all π ′ ∈ P n , namely equal to the distribution of ρ, it follows that f (π)(ρ) and π are independent, and that f (π)(ρ) is equal in distribution to ρ. This implies that π(ρ) and p(π) (f (π)(ρ)) are equal in distribution as also ρ and π are independent by assumption. By equation (11.3), it follows that π(ρ) and p −1 (π(ρ)) are equal in distribution, which yields the assertion.
Proof of Proposition 5.8. Let n ∈ N. For s ∈ R_+, we define the map λ_s : R^{n^2} → R^{n^2}, ρ′ ↦ ρ′ + 2_n s, where 2_n = 2(1{i ≠ j})_{i,j∈[n]}. We will use the map λ_s to account for the linear growth of the genealogical distances between reproduction events.
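As a quick illustration of the growth map λ_s, the sketch below adds 2s to every off-diagonal entry of a distance matrix. It assumes that the matrix 2_n vanishes on the diagonal, so that ρ(i, i) = 0 is preserved; the names `grow` and `is_ultrametric` are hypothetical.

```python
import numpy as np

def grow(rho, s):
    """Assumed form of lambda_s: add 2*s to all distances between distinct indices."""
    n = rho.shape[0]
    return rho + 2.0 * s * (1.0 - np.eye(n))

def is_ultrametric(rho, tol=1e-12):
    """Check rho(i, k) <= max(rho(i, j), rho(j, k)) for all i, j, k."""
    lhs = rho[:, None, :]                                  # rho(i, k)
    rhs = np.maximum(rho[:, :, None], rho[None, :, :])     # max(rho(i, j), rho(j, k))
    return bool(np.all(lhs <= rhs + tol))

rho = np.array([[0.0, 2.0, 6.0],
                [2.0, 0.0, 6.0],
                [6.0, 6.0, 0.0]])
later = grow(rho, 1.5)
```

Under this assumption the maps form a semigroup, λ_s ∘ λ_t = λ_{s+t}, and they preserve the strong triangle inequality.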

Equality of decompositions
To prove Proposition 6.5, we use the following lemma. Its meaning is that if the ancestral lineage of an individual i at time t can be traced back until a most recent reproduction event on that lineage, then there exists a. s. another individual k at time t that descends from this reproduction event.
Lemma 11.2. Assume that Ξ is not dust-free. Let t ∈ (0, ∞) and i ∈ N. Then a. s. on the event {v t (i) < t}, there exists an integer k ∈ N \ {i} with v t (i) = 1 2 ρ t (i, k).
Proof. Recall the process (Π (t) s , s ∈ R + ) from Remark 5.9. We work on the intersection of {v t (i) < t} with the event of probability 1 on which condition (6.2) is satisfied, v t (i) > 0, and for each s ∈ (0, t) ∩ Q, the partition Π (t) s contains infinitely many blocks if it contains singletons. The latter event indeed has probability 1 by Kingman's correspondence and as t is a. s. not the time of a reproduction event.
At time t − v_t(i), a reproduction event occurs that is encoded by a partition in which the block that contains A_{t−v_t(i)}(t, i) contains some other element j. This follows from the definition of v_t(i) in Section 6.1 and as η((0, t] × P^i) < ∞ by condition (6.2), which means that the reproduction events in which particles on levels not larger than i reproduce do not accumulate.
Moreover, by condition (6.2), there exists a time s , s] ×P j ) = 0, which implies that the particle on level j at time t − v t (i) is still on level j at time s.
By definition of v t (i), the partition Π s has infinitely many blocks. This means that infinitely many particles at time s survive until time t. Remark 5.1 implies that all particles at time s survive until time t. Therefore, the particle that was on level j at the times t − v t (i) and s is on some level k at time t. The most recent common ancestor of the particles on levels i and k at time t lives at time t − v t (i), hence 1 2 ρ t (i, k) = v t (i).

Proof of Proposition 6.5. Let t ∈ (0, ∞) and i ∈ N. We have to show that v t (i) = Υ(ρ t )(i) a. s.
From the definitions of the reproduction events in Section 5.1 and of the quantity v t (i) in Section 6.1, it follows that for each s ∈ (t − v t (i) ∧ t, t], only the particle on level i at time t descends from the particle on level A s (t, i) at time s. The definitions of Υ in Section 2 and of ρ t in Section 5.
In the case that Ξ is dust-free, we have Υ(ρ t ) = 0 a. s. by Proposition 5.10, hence also v t (i) = 0 a. s. Now we assume that Ξ is not dust-free. Lemma 11.2 yields Υ(ρ t )(i) ≤ v t (i) a. s. on the event {v t (i) < t}.
We claim that on the event {v t (i) ≥ t}, all individuals at time 0 have descendants at time t. This can be seen as follows: For each s ∈ (0, t), the exchangeable partition Π Hence, as Υ(ρ 0 ) = v 0 by assumption (6.5),

Uniqueness for the martingale problems for tree-valued Fleming-Viot processes
Proof of Proposition 7.1. We consider the martingale problem (B, Π); the proofs for the other martingale problems are analogous. It remains to show uniqueness of the solution discussed in Section 7. We use a function-valued dual process. This method is applied in the context of tree-valued Fleming-Viot processes in [13]; another dual process is used in [26]. We fix n ∈ N and work with a dual process with state space C_n. With each element π of P_n, we also associate a transformation C_n → C_n, which we also denote by π, by π(φ)(ρ) = φ(π(ρ)), ρ ∈ R^{n^2}, φ ∈ C_n.
From this definition, we have B(ν · φ)(χ ′ ) = B ↓ ν χ ′ (φ) for all φ ∈ C n and χ ′ ∈ U, where ν · φ is the polynomial associated with φ. For all t ∈ R + and all polynomials Φ ∈ Π of degree at most n, it follows from Theorem 4.4.11 in [19] that E[Φ(χ t )] is equal for all solutions ((χ t , t ∈ R + ); P ) of the martingale problem (B, Π) with initial state χ 0 . As n ∈ N was arbitrary and the space Π of polynomials is separating, the uniqueness assertion follows from Theorem 4.4.2 in [19].

Proof of Lemma 8.1
Using the lookdown construction, we show that the semigroup of a Û-valued Ξ-Fleming-Viot process preserves the set of marked polynomials.
Proof of Lemma 8.1. Let N denote the space of simple point measures on (0, ∞)×P. Let t ∈ R + and n ∈ N. Note that in the construction in Sections 5.1 and 6.1, the restriction γ n (r t , v t ) depends only on the simple point measure η and the restriction γ n (r 0 , v 0 ) of the initial state. We may thus define the function g n : R n 2 × R n × N → R n 2 × R n that maps the restriction γ n (r 0 , v 0 ) of the initial state and the point measure η to γ n (r t , v t ). Note that when the simple point measure is fixed, g n is a differentiable function on R n 2 × R n with bounded uniformly continuous derivative.
Let φ ∈ Ĉ_n. We define the function where η is now the Poisson random measure from Section 5.2. By dominated convergence and the mean value theorem, the function f is also differentiable with bounded uniformly continuous derivative, and we obtain that f ∈ Ĉ_n. Let Φ be the marked polynomial associated with φ. For χ ∈ Û, let (r_0, v_0) be a random variable with the marked distance matrix distribution of χ, and let (r_t, v_t) be defined from (r_0, v_0) and the independent Poisson random measure η as in Section 6.2. From Propositions 6.5 and 10.10, and as we may assume that the Û-valued Ξ-Fleming-Viot process (χ_s, s ∈ R_+) from Section 8 satisfies χ_t = ψ̂(r_t, v_t) a. s., we obtain that E_χ[Φ(χ_t)] = ν^χ f for all χ ∈ Û. Hence, χ ↦ E_χ[Φ(χ_t)] is in Π̂.

Construction from the flow of bridges
In this section, we construct a U erg -valued Ξ-Fleming-Viot process from the dual flow of bridges of Bertoin and Le Gall [5].
A random non-decreasing right-continuous function F : [0, 1] → [0, 1] with exchangeable increments and F(0) = 0, F(1) = 1 is called a bridge. We view a bridge as a random variable with values in the space of càdlàg paths [0, 1] → [0, 1], which we endow with the Skorohod metric. The dual flow of bridges is a collection F = (F_{s,t}, s < t) of bridges that satisfies the following properties (see [5, Section 5.1]):
(i) For every s < t < u, F_{t,u} ∘ F_{s,t} = F_{s,u} a. s.
(ii) The law of F s,t depends only on t − s. For s 1 < s 2 < . . . < s n , the bridges F s 1 ,s 2 , F s 2 ,s 3 , . . . , F s n−1 ,sn are independent.
(iii) F 0,0 is the identity function. For every x ∈ [0, 1], the random variable F 0,t (x) converges to x in probability as t decreases to zero.
For each s < t, it is also assumed that F s,t is a. s. not the identity function. The interpretation is that the individuals of a continuous population are represented by the elements of the interval [0, 1]. For each s ≤ t, the individuals in a subinterval (x 1 , x 2 ] at time s have descendants at time t that are a. s. the elements of (F s,t (x 1 ), F s,t (x 2 )], see [7].
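To make the notion of a bridge concrete, the sketch below builds the simplest non-trivial example, F(x) = (1 − a)x + a·1{U ≤ x} with U uniform on [0, 1], and composes two independent copies. This single-jump form is an assumption chosen for illustration, not the flow constructed in [5]; it only shows that a composition of bridges is again a non-decreasing map of [0, 1] fixing 0 and 1, as property (i) requires.

```python
import numpy as np

def single_jump_bridge(a, u):
    """Bridge with one jump of size a at the location u and linear drift 1 - a."""
    def F(x):
        x = np.asarray(x, dtype=float)
        return (1.0 - a) * x + a * (u <= x)
    return F

rng = np.random.default_rng(1)
F1 = single_jump_bridge(0.3, rng.uniform())   # plays the role of F_{s,t}
F2 = single_jump_bridge(0.5, rng.uniform())   # plays the role of F_{t,u}
grid = np.linspace(0.0, 1.0, 1001)
composed = F2(F1(grid))                       # F_{t,u} composed with F_{s,t}
```

In the interpretation given above, the jump of size a corresponds to a single individual at time s whose descendants make up a fraction a of the population at time t.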
In [5, Section 3], Kingman's correspondence is extended so as to represent distributions of Ξ-coalescents in terms of sampling from flows of bridges. Let F be a dual flow of bridges, and let V = (V i , i ∈ N) be an iid sequence of uniform [0, 1]-valued random variables, independent of F . This iid sequence is interpreted as a sequence of random samples from the population at some time t ∈ R. For each s ∈ R + , a partitionπ (t) s is defined such that any integers i, j ∈ N are in the same block ofπ s , s ∈ R + ) obtained in this way is a version of a Ξ-coalescent of Schweinsberg [47].
For each $t \in \mathbb{R}$, there exists an event of probability 1 on which for all $s \le s' \in \mathbb{Q}_+$, the partition $\tilde\pi^{(t)}_{s'}$ can be obtained by merging blocks of the partition $\tilde\pi^{(t)}_s$. We can thus define a.s. an ultrametric $\tilde\rho_t$ by
$$\tilde\rho_t(i, j) = 2 \inf\{s \in \mathbb{Q}_+ : i \text{ and } j \text{ are in the same block of } \tilde\pi^{(t)}_s\}.$$
The assumption that for each $r < s$, the bridge $F_{r,s}$ is a.s. not the identity function implies that the infimum in the definition of $\tilde\rho_t(i, j)$ is a.s. not over the empty set.
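The passage from a coarsening family of partitions to an ultrametric can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the countable index set $\mathbb{Q}_+$ is replaced by a finite grid of times, so the infimum becomes a minimum, and only the integers $1, \ldots, n$ are considered. The function name and the dictionary representation are hypothetical.

```python
from itertools import combinations


def ultrametric_from_partitions(partitions, n):
    """Compute rho(i, j) = 2 * min{s : i and j share a block of partitions[s]}.

    `partitions` maps each time s of a finite grid to a list of blocks (sets)
    of {1, ..., n}; blocks are assumed to only merge as s increases.
    Pairs that never share a block on the grid get the value None.
    """
    times = sorted(partitions)
    rho = {}
    for i, j in combinations(range(1, n + 1), 2):
        merge_time = next(
            (s for s in times if any(i in b and j in b for b in partitions[s])),
            None,
        )
        rho[(i, j)] = None if merge_time is None else 2 * merge_time
    return rho
```

For the grid $\{0, 0.5, 1.25\}$ with partitions $\{\{1\},\{2\},\{3\}\}$, $\{\{1,2\},\{3\}\}$ and $\{\{1,2,3\}\}$, the sketch gives $\rho(1,2) = 1$ and $\rho(1,3) = \rho(2,3) = 2.5$, which satisfies the strong triangle inequality.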
Moreover, we define a.s. a random variable $\tilde\xi_t$ with values in the space $(U, d_P)$ of exchangeable distributions on $U$ such that $\tilde\xi_t$ is a regular conditional distribution of $\tilde\rho_t$ given the collection of bridges $(F_{t-s,t}, s \in \mathbb{Q}_+)$. For the existence of this regular conditional distribution, see e.g. [30, Theorem 6.3].
Analogously to Sections 7.2 and 7.3, for a finite measure Ξ on $\Delta$, a stationary $U^{\mathrm{erg}}$-valued Ξ-Fleming-Viot process $(\xi_t, t \in \mathbb{R})$ is given by $\xi_t = \alpha(\nu^{\psi \circ \beta(\rho_t)})$, where $(\rho_t, t \in \mathbb{R})$ is defined as in Section 9. We note that a stationary $U^{\mathrm{erg}}$-valued Ξ-Fleming-Viot process can be read off from the dual flow of bridges:
Theorem 12.1. There exists a finite measure Ξ on $\Delta$ such that the process $(\tilde\xi_t, t \in \mathbb{R})$ is a version of a stationary $U^{\mathrm{erg}}$-valued Ξ-Fleming-Viot process.
For the proof of Theorem 12.1, we show that $(\tilde\xi_t, t \in \mathbb{R})$ is a Markov process and has the transition kernel of a $U^{\mathrm{erg}}$-valued Ξ-Fleming-Viot process. In the following, we fix $u \in \mathbb{R}_+$. First, we define for each finite measure Ξ on $\Delta$ a probability kernel $\Lambda_\Xi$ from $U$ to $U$ such that for each $\xi \in U$, the distribution $\Lambda_\Xi(\xi, \cdot)$ is the distribution of a random variable $\rho$ which we define as follows. Let $\rho'$ be a random variable with distribution $\xi$. Let $\rho''$ be an independent $U$-valued random variable that is distributed as the random ultrametric associated with a Ξ-coalescent. That is, $\rho''$ shall be distributed as the random variable $\bar\rho_u$ mentioned above, cf. Remark 5.9. We define a partition $\pi$ of $\mathbb{N}$ such that $i$ and $j$ are in the same block of $\pi$ if and only if $\rho''(i, j) < 2u$. Let $B_1(\pi), B_2(\pi), \ldots$ be the blocks of $\pi$, ordered increasingly according to their smallest element. For $i \in \mathbb{N}$, let $A(i)$ be the integer $j$ such that $i \in B_j(\pi)$.
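The partition $\pi$ and the labels $A(i)$ can be illustrated by a short Python sketch, restricted to the integers $1, \ldots, n$. It assumes that `rho` is a function implementing a semi-ultrametric, so that the relation $\rho''(i, j) < 2u$ is an equivalence relation and it suffices to compare $i$ with one representative of each block. The function names are hypothetical.

```python
def blocks_from_ultrametric(rho, n, u):
    """Blocks of {1, ..., n} with i ~ j iff rho(i, j) < 2 * u.

    `rho` is a function of two integers, assumed to be a semi-ultrametric.
    The returned blocks are ordered by their smallest element.
    """
    blocks = []
    for i in range(1, n + 1):
        for block in blocks:
            if rho(i, min(block)) < 2 * u:
                block.add(i)
                break
        else:
            # i is the smallest element of a new block; blocks stay ordered.
            blocks.append({i})
    return blocks


def block_labels(blocks):
    """Return the map A with A[i] = j if and only if i lies in the j-th block."""
    return {i: j for j, block in enumerate(blocks, start=1) for i in block}
```

For example, if $\rho''(1,3) = 1$ and $\rho''(1,2) = \rho''(2,3) = 3$, then for $u = 1$ the blocks are $\{1, 3\}$ and $\{2\}$, so $A(1) = A(3) = 1$ and $A(2) = 2$.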