Existence of mark functions in marked metric measure spaces

We give criteria for the existence of a so-called mark function in the context of marked metric measure spaces (mmm-spaces). If an mmm-space admits a mark function, we call it a functionally-marked metric measure space (fmm-space). This is not a closed property in the usual marked Gromov-weak topology, and thus we put particular emphasis on the question of under which conditions it carries over to a limit. We obtain criteria for deterministic mmm-spaces as well as random mmm-spaces and mmm-space-valued processes. As an example, our criteria are applied to prove that the tree-valued Fleming-Viot dynamics with mutation and selection from [Depperschmidt, Greven, Pfaffelhuber, Ann. Appl. Probab. '12] admits a mark function at all times, almost surely. Thereby, we fill a gap in a former proof of this fact, which used a wrong criterion. Furthermore, the subspace of fmm-spaces, which is dense and not closed, is investigated in detail. We show that there exists a metric that induces the marked Gromov-weak topology on this subspace and is complete. Therefore, the space of fmm-spaces is a Polish space. We also construct a decomposition into closed sets which are related to the case of uniformly equicontinuous mark functions.

useful state space for tree-valued stochastic processes and Polish when equipped with the Gromov-weak topology. That the Gromov-weak topology actually coincides with the one induced by the □_λ-metric was shown in [Löh13].
Important examples for the use of mm-spaces within probability theory are individual-based populations X with given mutual genealogical distances r between individuals. Here, r can for instance measure the time to the most recent common ancestor (MRCA) (cf. [DGP12, (2.7), Remark 3.3]), where the resulting metric space is ultrametric. Another possibility is the number of mutations back to the MRCA (cf. [KW14]) where the resulting space is not ultrametric. Finally, there is a sampling measure ν on the space X which models population density. This means that the state of the process is an mm-space (X, r, ν). Such individual-based models are often formulated for infinite population size (with diffuse measures ν) but obtained as the high-density limit of approximating models with finite populations (where ν is typically the uniform distribution on all individuals).
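To make the finite-population picture concrete, the following minimal sketch builds an mm-space (X, r, ν) from a toy genealogy, with r the time to the MRCA and ν the uniform distribution on the living individuals. The tree, the names `PARENT`, `SPLIT_TIME`, `LEAVES`, `NOW` and both helper functions are hypothetical and serve only as an illustration; they are not taken from [DGP12].

```python
from itertools import combinations, permutations

# Hypothetical toy genealogy: every internal node stores the time at which it split.
PARENT = {"root": None, "a": "root", 1: "a", 2: "a", 3: "root"}
SPLIT_TIME = {"root": 0.0, "a": 2.0}
LEAVES = [1, 2, 3]   # individuals alive at time NOW
NOW = 3.0


def lineage(v, parent):
    """Return the ancestral line of v, starting at v and ending at the root."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def mrca_distances(leaves, parent, split_time, now):
    """Distance between two individuals = time back from `now` to their MRCA."""
    r = {(x, x): 0.0 for x in leaves}
    for x, y in combinations(leaves, 2):
        on_x = set(lineage(x, parent))
        # the first node on the line of y that also lies on the line of x is the MRCA
        mrca = next(v for v in lineage(y, parent) if v in on_x)
        r[(x, y)] = r[(y, x)] = now - split_time[mrca]
    return r


def is_ultrametric(r, points):
    """Check the strong triangle inequality r(x,z) <= max(r(x,y), r(y,z))."""
    return all(r[(x, z)] <= max(r[(x, y)], r[(y, z)])
               for x, y, z in permutations(points, 3))


r = mrca_distances(LEAVES, PARENT, SPLIT_TIME, NOW)
nu = {x: 1.0 / len(LEAVES) for x in LEAVES}   # uniform sampling measure
```

In this toy example the two siblings 1 and 2 are at distance 1, both are at distance 3 from individual 3, and the strong triangle inequality holds, as expected for MRCA-based distances.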
For encoding more information about the individuals, such as an (allelic) type or location (which may change over time), marked metric measure spaces (mmm-spaces) and the corresponding marked Gromov-weak topology (mGw-topology) have been introduced in [DGP11]. For a fixed complete, separable metric space (I, d) of marks, the sampling measure ν is replaced by a measure µ on X × I, which models population density in combination with mark distribution.
A natural question in this context is whether or not every point of the limiting population X has a single mark almost surely, that is, does genetic distance zero imply the same type/location? Put differently, we ask whether µ factorizes into a "population density" measure ν on X and a mark function κ : X → I assigning each individual its mark. If this is the case, we call the mmm-space functionally-marked (fmm-space). This property is often desirable and one might want to consider the space of fmm-spaces, rather than mmm-spaces, as the state space. Unfortunately, the subspace of fmm-spaces is not closed in the mGw-topology, which means that limits of finite-population models that are constructed as fmm-spaces might not admit mark functions themselves. It is therefore of interest whether the space of fmm-spaces with the marked Gromov-weak topology is a Polish space (that is, a "good" state space). Here, we show in Theorem 2.2 that this is indeed the case. We also provide criteria for checking whether an mmm-space admits a mark function. For limiting populations, they are given in terms of the approximating mmm-spaces. We derive such criteria for deterministic spaces (Theorem 3.1), random spaces (Theorem 3.7) and mmm-space-valued processes (Theorem 3.9 and Theorem 3.11).
An important example of such a high-density limit of approximating models with finite populations is the tree-valued Fleming-Viot dynamics. In the neutral case, it is constructed in [GPW13] using the formalism of mm-spaces. In [DGP12], (allelic) types, encoded as marks of mmm-spaces, are included in order to model mutation and selection. For this process, the question of existence of a mark function has already been posed. [DGP12, Remark 3.11] and [DGP13, Theorem 6] state that it admits a mark function at all times, almost surely. The given proof, however, contains a gap, because it relies on the criterion claimed in [DGP13, Lemma 7.1], which is wrong in general, as we show in Example 4.1. We fill this gap by applying our criteria and showing in Theorem 4.3 that the claim is indeed true and the tree-valued Fleming-Viot process with mutation and selection (TFVMS) admits a mark function at all times, almost surely. We also show in Theorem 4.4 that the same arguments apply to the Λ-version of the TFVMS in the neutral case, that is, where selection is not present.
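For a finitely supported measure, the factorization question can be checked directly. The sketch below is only an illustration under simplifying assumptions: µ is given as a dictionary `{(x, u): mass}`, distinct keys x denote points at positive distance, and the function names `factorize` and `mark_function` are hypothetical. It splits µ into ν and the kernel x ↦ K x and returns a mark function exactly when every K x is a Dirac measure.

```python
from collections import defaultdict


def factorize(mu):
    """Split mu = {(x, u): mass} into the marginal nu on X and the kernel K_x on I."""
    nu = defaultdict(float)
    for (x, _u), mass in mu.items():
        nu[x] += mass
    kernel = defaultdict(dict)
    for (x, u), mass in mu.items():
        if mass > 0:
            kernel[x][u] = mass / nu[x]   # conditional mark distribution at x
    return dict(nu), dict(kernel)


def mark_function(mu):
    """Return kappa as a dict if every K_x is a Dirac measure, otherwise None."""
    _nu, kernel = factorize(mu)
    kappa = {}
    for x, k_x in kernel.items():
        if len(k_x) != 1:
            return None               # x carries at least two marks
        kappa[x] = next(iter(k_x))
    return kappa


# Each individual has exactly one mark: a mark function exists.
mu_fct = {("x1", "A"): 0.5, ("x2", "B"): 0.5}
# The individual x1 carries two marks: no mark function.
mu_mixed = {("x1", "A"): 0.25, ("x1", "B"): 0.25, ("x2", "B"): 0.5}
```

Here `mark_function(mu_fct)` returns the mark function, while `mark_function(mu_mixed)` returns `None`. The difficulty treated in this paper does not show up in such finite examples; it arises when passing to limits.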
Intuitively, the existence of a mark function in the case of the TFVMS holds because mutations are large but rare in the approximating sequence of tree-valued Moran models. Hence, as genealogical distance becomes small, the probability that any mutation happened at all in the close past becomes small as well (recall that distance equals time to the MRCA). In contrast, in [KW14], where evolving phylogenies of trait-dependent branching with mutation and competition are under investigation, mutations happen at a high rate but are small which justifies the hope for the existence of a mark function also for the limiting model. Our criteria are also suited for this kind of situation.
Outline. The paper is organized as follows. In the subsections of the introduction we first introduce notations and basic results for the Prohorov metric for finite measures, then give a short introduction to the space of marked metric measure spaces (mmm-spaces) M I and the marked Gromov-weak topology as well as the marked Gromov-Prohorov metric d mGP on it. We continue with defining the so-called functionally-marked metric measure spaces (fmm-spaces) M fct I ⊆ M I and finally investigate the case of equicontinuous mark functions as an illustrative example. We emphasize that the restriction of the marked Gromov-Prohorov metric d mGP to M fct I is not complete. In Section 2, we therefore show that there exists another metric on M fct I that induces the marked Gromov-weak topology and is complete. As one sees in Subsection 1.4, the situation becomes easy if we restrict to a subspace of M I containing spaces with uniformly equicontinuous mark functions. We introduce in Subsection 2.2 several related subspaces capturing some aspect of equicontinuity, and obtain a decomposition of M fct I into closed sets. This decomposition is used to prove Polishness of M fct I and in Section 3 to formulate criteria for the existence of mark functions.
Section 3 gives criteria for the existence of mark functions. Based on the construction of the complete metric and the decomposition of M fct I , we derive in Subsection 3.1 criteria to check if an mmm-space admits a mark function, especially in the case where it is given as a marked Gromov-weak limit. We then transfer the results in Subsection 3.2 to random mmm-spaces and in Subsection 3.3 to M I -valued stochastic processes.
To conclude, Section 4 gives examples. We first show that the criterion in [DGP13] is wrong in general by means of counterexamples. Our criteria are then applied in Subsection 4.1 to prove the existence of a mark function for the tree-valued Fleming-Viot dynamics with mutation and selection. To this goal we verify the necessary assumptions for a sequence of approximating tree-valued Moran models. In Subsection 4.2 we show that a similar strategy applies if we replace the tree-valued Moran models by so-called tree-valued Λ-Cannings models. Finally, in Subsection 4.3, a future application to evolving phylogenies of trait-dependent branching with mutation and competition is indicated.

Notations and prerequisites
In this paper, let all topological spaces be equipped with their Borel σ-algebras. We use the following notation throughout the article. (1.1)
For ϕ : E → F measurable, with F some other Polish space, denote the image measure of µ under ϕ by ϕ_* µ := µ ∘ ϕ^{-1}. Finally, for the product space X := E × F, the canonical projection operators from X onto E and F are denoted by π E and π F, respectively.
Definition 1.2 (Prohorov metric). For finite measures µ 0 , µ 1 on a metric space (E, r), the Prohorov metric is defined as
It is well-known that the Prohorov metric metrizes the weak convergence of measures if and only if the underlying metric space is separable. The following equivalent expression for the Prohorov metric turns out to be useful.
Remark 1.3 (coupling representation of the Prohorov metric). Let (E, r) be a separable metric space and µ 1 , µ 2 ∈ M 1 (E). For a finite measure ξ on E 2 , we denote the marginals as ξ 1 := ξ(· × E) and ξ 2 := ξ(E × ·). It is well-known (see, e.g., [EK05, Theorem III.1.2]) that
We obtain from this equation 2 to obtain equality in the above. Following the ideas of the proof of the representation (1.3) in [EK05], the representation (1.4) for the Prohorov metric d Pr (µ 0 , µ 1 ) is easily seen to hold true for measures µ 1 , µ 2 ∈ M f (E) as well, which are not necessarily probabilities.
From (1.4), we can easily deduce the following lemma, which we use below.
Lemma 1.4 (rectangular lemma). Let (E, r) be a separable metric space, ε, δ > 0, and µ 1 ,

The space of marked metric measure spaces (mmm-spaces)
In this subsection, we recall the space M I of marked metric measure spaces, and the marked Gromov-Prohorov metric d mGP , which induces the marked Gromov-weak topology on it. This space, (M I , d mGP ), will be the basic space used in the rest of the paper. These concepts have been introduced in [DGP11], and are based on the corresponding non-marked versions introduced in [GPW09]. In contrast to [DGP11], we allow the measures of the marked metric measure spaces to be finite, that is, we do not restrict ourselves to probability measures only. Because a sequence of finite measures converges weakly if and only if their total masses and the normalized measures converge, or the masses converge to zero, this straightforward generalization requires only minor modifications (compare [LVW14, Section 2.1], where this generalization is done for metric measure spaces without marks).
In what follows, fix a complete, separable metric space (I, d), called the mark space. It is the same for all marked metric measure spaces in M I .
Definition 1.5 (mmm-spaces, M I ). (i) An (I)-marked metric measure space (mmm-space) is a triple (X, r, µ) such that (X, r) is a complete, separable metric space and µ ∈ M f (X × I), where X × I is equipped with the product topology.
(1.8) With a slight abuse of notation, we identify an mmm-space with its equivalence class and write X = (X, r, µ) ∈ M I for both mmm-spaces and equivalence classes thereof.
Next we recall the marked Gromov-weak topology from [DGP11, Section 2.2] that turns M I into a Polish space (cf. [DGP11, Theorem 2]). To this end, we first recall the following definition.
Definition 1.6 (marked distance matrix distribution). Let X := (X, r, µ) ∈ M I and (1.10)
The marked distance matrix distribution of X is defined by (1.11)
The marked Gromov-weak topology is the one induced by the map X ↦ ν X .
Definition 1.7 (marked Gromov-weak topology). Let X , X 1 , X 2 , . . . ∈ M I . We say that (X n ) n∈N converges to X in the marked Gromov-weak topology, X n
Finally, let us recall the marked Gromov-Prohorov metric from [DGP11, Section 3.2], which is complete and metrizes the marked Gromov-weak topology, as shown in [DGP11, Proposition 3.7].
Definition 1.8 (marked Gromov-Prohorov metric, d mGP ). (1.13) where the infimum is taken over all complete, separable metric spaces (E, r) and isometric embeddings ϕ i : X i → E, and ϕ̃ i is as in (1.7), i = 1, 2. The Prohorov metric d Pr is the one on M f (E × I), based on the metric r̃ = r + d on E × I, metrizing the product topology. d mGP is called the marked Gromov-Prohorov metric.
A direct consequence of the fact that d mGP induces the marked Gromov-weak topology is the following characterization of marked Gromov-weak convergence obtained in [DGP11, Lemma 3.4].
Lemma 1.9 (embedding of marked Gromov-weakly converging sequences). Let X n = (X n , r n , µ n ) ∈ M I for n ∈ N ∪ {∞}. Then (X n ) n∈N converges to X ∞ marked Gromov-weakly if and only if there is a complete, separable metric space (E, r) and isometric embeddings ϕ n : X n → E such that for ϕ̃ n as in (1.7), (1.14)
In the present article we investigate criteria for the existence of a mark function for X , that is (cf. [DGP13, Section 3.3]), a measurable function κ : X → I such that
Obviously, X admits a mark function if and only if K x is a Dirac measure for ν-almost every x. Recall that the complete, separable mark space (I, d) is fixed once and for all.
It is easy to see that X n → X in the marked Gromov-weak topology.

The equicontinuous case
It directly follows from Lemma 1.11 that the subspace M fct I is not closed in M I , meaning that if (X n ) n∈N is a sequence in M I converging marked Gromov-weakly to X , and all X n admit a mark function, this need not be the case for X . In applications, however, the limit X is often not known explicitly, and it would be important to have (sufficient) criteria for the existence of a mark function in terms of the X n alone. An easy possibility is Lipschitz equicontinuity: if all X n admit a mark function that is Lipschitz continuous with a common Lipschitz constant L > 0, the same is true for X (see [Pio11]). More generally, this holds for uniformly equicontinuous mark functions as introduced below. We briefly discuss the equicontinuous case in this subsection, because it is straightforward and illustrates the main ideas.
Recall that a modulus of continuity is a function h : for all x, y ∈ X.
Note that for every modulus of continuity h, there exists another modulus of continuity h ′ ≥ h which is increasing and continuous with respect to the topology of the one-point compactification of R + . Therefore, we can restrict ourselves without loss of generality to moduli of continuity from (1.17)
For h ∈ H and a metric space (X, r), we define (1.18)
Note that f : X → I is h-uniformly continuous if and only if ((x, f (x)), (y, f (y))) ∈ A X h for all x, y ∈ X, and that A X h is a closed set in (X × I) 2 with product topology.
The next lemma states that a marked metric measure space (X, r, µ) admits an h-uniformly continuous mark function if and only if a pair of independent samples from µ is almost surely in A X h . Furthermore, if a sequence with h-uniformly continuous mark functions converges marked Gromov-weakly, the limit space also admits an h-uniformly continuous mark function.
This preliminary result is quite restrictive because of the condition to have the same modulus of continuity for all occurring spaces. In fact, the mark function of the tree-valued Fleming-Viot dynamics considered in Subsection 4.1 is not even continuous.
At the heart of the following generalisation to measurable mark functions lies the fact that measurable functions are "almost continuous" by Lusin's celebrated theorem (see for instance [Bog07, Theorem 7.1.13]). We here give a version tailored to our setup:
Lusin's theorem. Let X, Y be Polish spaces, µ a finite measure on X, and f : X → Y a measurable function. Then, for every ε > 0, there exists a compact set K ε ⊆ X such that µ(X \ K ε ) < ε and f | K ε is continuous.
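On a finite space, h-uniform continuity of a mark function can be tested pair by pair. The sketch below rests on an explicit assumption: we read the set A X h as the set of pairs ((x, u), (y, v)) with d(u, v) ≤ h(r(x, y)), which is the usual meaning of h-uniform continuity and matches the characterization stated above. The function names and the example data are hypothetical.

```python
from itertools import combinations


def in_A_h(p, q, r, d, h):
    """Assumed membership test for A_h^X: marks differ by at most h(distance)."""
    (x, u), (y, v) = p, q
    return d(u, v) <= h(r(x, y))


def is_h_uniformly_continuous(points, kappa, r, d, h):
    """kappa is h-uniformly continuous iff all pairs of graph points lie in A_h^X."""
    return all(in_A_h((x, kappa[x]), (y, kappa[y]), r, d, h)
               for x, y in combinations(points, 2))


def euclid(a, b):
    """Absolute-value distance, used both on X and on the mark space I."""
    return abs(a - b)


points = [0.0, 0.5, 1.0]
kappa = {0.0: 0.0, 0.5: 1.0, 1.0: 2.0}   # kappa(x) = 2x, Lipschitz constant 2

lipschitz_2 = is_h_uniformly_continuous(points, kappa, euclid, euclid, lambda t: 2 * t)
lipschitz_1 = is_h_uniformly_continuous(points, kappa, euclid, euclid, lambda t: t)
```

With the modulus h(t) = 2t the check succeeds, with h(t) = t it fails, in line with the Lipschitz case mentioned at the beginning of this subsection.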

The space of fmm-spaces is Polish
M fct I is not a closed subspace of M I in the marked Gromov-weak topology, and hence the restriction of the marked Gromov-Prohorov metric d mGP to M fct I is not complete. In this section, we show that there exists another metric on M fct I that induces the marked Gromov-weak topology and is complete. This shows that M fct I is a Polish space in its own right.

A complete metric on the space of fmm-spaces
For a measure ξ on I, we define
Proof (first part). As seen before, X = (X, r, ν ⊗ K) ∈ M I admits a mark function if and only if K x is a Dirac measure for ν-almost every x ∈ X, which is the case if and only if β(X ) = 0. Hence β −1 (0) = M fct I . Because M fct I is dense in M I by Lemma 1.11, no X ∈ M I \ β −1 (0) can be a continuity point of β. Thus cont(β) ⊆ β −1 (0).
We defer the proof of the inclusion β −1 (0) ⊆ cont(β) to Subsection 2.2, because it requires a technical estimate on β derived in Proposition 2.7.
In view of (2.3), we can use standard arguments to construct a complete metric on M fct I that metrizes the marked Gromov-weak topology. Namely, consider the sets
where the closure is in the marked Gromov-weak topology. Then, due to Proposition 2.1, F m is disjoint from M fct I , and M fct
Because F m is also closed by definition, we obtain
(2.6) with F m defined in (2.4). Note that ρ m is a continuous function on M I because F m is a closed set. Let
hold. We have to show that the marked Gromov-weak convergence already implies (2.8). This, however, follows from the continuity of the ρ m .
It remains to show that d fGP is a complete metric on M fct I . Consider a d fGP -Cauchy sequence (X n ) n∈N in M fct I . By completeness of d mGP on M I , it converges marked Gromov-weakly to some X = (X, r, µ) ∈ M I . Furthermore, for every fixed m ∈ N, (2.6) implies that 1/ρ m (X n ) converges as n → ∞, and hence
we denote the open ball in M I with respect to d mGP .
The following corollary gives formal criteria for a limiting space to admit a mark function, which are useful only together with estimates on β.
Corollary 2.3. Let (X n ) n∈N be a sequence in M I which converges marked Gromov-weakly to X . Then the following are equivalent:
.": follows directly from the definition of ρ m .
"(iii). ⇔ (iv).": Using monotonicity in δ we obtain
where, in the third equivalence, we renamed δ to ε and ε to δ.

A decomposition of M fct I into closed sets and estimates on β
In this subsection, we derive some estimates on β and use them to complete the proof of Proposition 2.1. Furthermore, we construct a decomposition of M fct I into closed sets which are related to the sets M h I .
As we have seen in Subsection 1.4, the situation becomes easy if we restrict to the uniformly equicontinuous case, that is, to the subspace M h I for some h ∈ H as in Definition 1.12. We introduce in what follows several related subspaces capturing some aspect of equicontinuity. In analogy to the definition of A X h in (1.18), we use for a metric space (X, r), and δ, ε > 0 the notation (2.12)
Note that A X δ,ε is a closed set. For every h ∈ H, using monotonicity and continuity of h, we observe that (2.13)
Definition 2.4 (M δ,ε I , M δ,ε I , M h I ). Let δ, ε > 0 and h ∈ H. We define M δ,ε I := {(X, r, µ) ∈ M I : µ ⊗2 (A X δ,ε ) = ‖µ ⊗2 ‖}, (2.14)
We have the following stability of M δ,ε I with respect to small perturbations in the marked Gromov-Prohorov metric.
In order to complete the proof of Proposition 2.1 with the help of Proposition 2.7, we first observe that, as a consequence of Lusin's theorem, every functionally marked metric measure space is an element of M h I for some h ∈ H. Together with Lemma 2.9 below, this means that we have a nice (though uncountable) decomposition of M fct I into closed sets.
Conversely, let X = (X, r, ν, κ) ∈ M fct I . According to Lusin's theorem, we find for every ε > 0 a compact set K ε ⊆ X, and a modulus of continuity h ε ∈ H, such that ν(X \ K ε ) ≤ ε and κ| K ε is h ε -uniformly continuous. In particular, (2.20)
We may assume without loss of generality that ε ↦ h ε (δ) is decreasing and right-continuous for every δ > 0. We define h(δ) := inf{ε > 0 : h ε (δ) < ε} ∈ R + ∪ {∞}. (2.21)
Clearly, h(δ) converges to 0 as δ ↓ 0 because h ε ∈ H. Furthermore, h_{h(δ)}(δ) ≤ h(δ), and hence (2.20) with ε = h(δ) implies X ∈ M h I .
Proof of Proposition 2.1 (completion). We still have to show continuity of β in X ∈ β −1 (0). Due to Lemma 2.8, there is h ∈ H with X ∈ M h I . Now Proposition 2.7 yields for δ > 0 the estimate $\sup_{\hat X \in B^{M_I}_\delta(X)} \beta(\hat X) \le (h(2\delta) + 2\delta)(2 + \|\mu\| + \delta)$, which converges to 0 as δ ↓ 0.
It directly follows from Proposition 2.7(iii) that the marked Gromov-weak closure of M h I is contained in M fct I . In fact, M h I is even marked Gromov-weakly closed, which will be used in the proof of Theorem 3.11 below.
Lemma 2.9 (closedness of M h I ). M δ,ε I is marked Gromov-weakly closed in M I for every δ, ε > 0. In particular, M h I is closed for every h ∈ H.
Proof. Fix ε, δ > 0 and let (X n ) n∈N be a sequence in M δ,ε I converging marked Gromov-weakly to some X = (X, r, µ) ∈ M I . Using Lemma 1.9, we may assume that X n , n ∈ N, and X are subspaces of a common separable metric space (E, r), such that µ n → µ weakly on E × I. By definition of M δ,ε I , we find µ ′ n ≤ µ n with ‖µ ′ n − µ n ‖ ≤ ε, such that (µ ′ n ) ⊗2 is supported by A E δ,ε for all n ∈ N. Since (µ ′ n ) n∈N is tight, we may assume, by passing to a subsequence, that µ ′

Criteria for the existence of mark functions
Based on the construction of the complete metric and the decomposition M fct I = ⋃ h∈H M h I into closed sets obtained in Section 2, we now derive criteria to check if a marked metric measure space admits a mark function, especially in the case where it is given as a marked Gromov-weak limit. We then transfer the results to random mmm-spaces and M I -valued stochastic processes.

Deterministic criteria
Our main criterion for deterministic spaces is a direct consequence of the results in Section 2. Recall that H is the set of moduli of continuity defined in (1.17).
for all n with d mGP (X , X n ) < δ.
We will use Theorem 3.1 in the following form.
Corollary 3.2. Let X n = (X n , r n , ν n , κ n ) ∈ M fct I converge marked Gromov-weakly to X ∈ M I . Let Y n,δ ⊆ X n be measurable, and h ∈ H. Then X ∈ M fct I if the following two conditions hold for every δ > 0.
Remark 3.3. To obtain X ∈ M fct I , it is clearly enough to show in Theorem 3.1 and Corollary 3.2, (3.1) respectively (3.2) and (3.3) only for δ = δ m for a sequence (δ m ) m∈N with δ m ↓ 0 as m → ∞.
We illustrate the rôle of the exceptional set X n \ Y n,δ , and the importance of its dependence on δ, with a simple example.
Remark 3.5 (equicontinuous case). If, in Corollary 3.2, Y n,δ = Y n does not depend on δ, then (3.3) means that κ n is h-uniformly continuous on Y n . Consequently, the mark function of X is in this case h-uniformly continuous. If we restrict to Y n = X n for all n, we recover part (ii) of Lemma 1.13.
Corollary 3.6. Let X n = (X n , r n , ν n , κ n ) ∈ M fct I and assume that X n converges to X = (X, r, µ) ∈ M I marked Gromov-weakly. Further assume that for n ∈ N, δ > 0, there are measurable sets Z n,δ ⊆ X n , such that lim δ↓0 lim inf n→∞ ν n (X n \ Z n,δ ) +
where diam is the diameter of a set. Then X admits a mark function, that is, X ∈ M fct I .

Random fmm-spaces
The following theorem is a randomized version of Theorem 3.1. It is our main criterion for M I -valued random variables.
Theorem 3.7 (random fmm-spaces as limits in distribution). Let (X n ) n∈N be a sequence of M I -valued random variables which converges in distribution (w.r.t. marked Gromov-weak topology) to an M I -valued random variable X . Further assume that for every ε > 0, there exists a modulus of continuity h ε ∈ H such that lim sup
Then X admits almost surely a mark function, that is, X ∈ M fct I almost surely. If additionally X n = (X n , r n , ν n , κ n ) ∈ M fct I almost surely, we can replace (3.8) by the following condition. There are random measurable sets Y ε n,δ ⊆ X n , n ∈ N, δ > 0, satisfying (3.3), such that lim sup
Remark 3.8. In (3.9), we need not worry about measurability of the "event" B n,δ := {ν n (X n \ Y ε n,δ ) ≤ h ε (δ)} due to the choice of Y ε n,δ . (3.9) is to be understood in the sense of inner measure, that is, we require that there are measurable sets C n,δ ⊆ B n,δ with lim sup δ↓0 lim sup n→∞ P(C n,δ ) ≥ 1 − ε.
Proof. The second statement follows in the same way as Corollary 3.2. We divide the proof of the main part into two steps. First, we show X ∈ M fct I if, instead of (3.8), even $P\bigl(\bigcap_{m\in N} \{X_n \in M_I^{\delta_m, h_\varepsilon(\delta_m)} \text{ for infinitely many } n\}\bigr) \ge 1 - \varepsilon$ (3.10) holds for a sequence δ m = δ m (ε) ↓ 0 as m → ∞. In the second step, we show that, given (3.8), we can modify h ε to ĥ ε ∈ H such that (3.10) holds with h ε replaced by ĥ ε .
Step 1. By Skorohod's representation theorem, we may assume that the X n are coupled such that they converge almost surely to X in the marked Gromov-weak topology. (3.10) implies that with probability at least 1 − ε, for all m ∈ N, X n ∈ M δm,hε(δm) I for infinitely many n. By Theorem 3.1 and Remark 3.3, this means that the probability that X admits a mark function is at least 1 − ε. Because ε is arbitrary, this implies X ∈ M fct I almost surely.

Fmm-space-valued processes
Let J ⊆ R + be a (closed, open or half-open) interval and consider a stochastic process X = (X t ) t∈J with values in M I and càdlàg paths, where M I is equipped with the marked Gromov-weak topology. We say that X is an M fct I -valued càdlàg process if
where X t− is the left limit of X at t (X ℓ− := X ℓ if ℓ is the left endpoint of J). In the following, we give sufficient criteria for X to be an M fct I -valued càdlàg process. We are particularly interested in the situation where X is the limit of M fct I -valued processes X n . Unsurprisingly, if the set of P-measure smaller than or equal to ε in Theorem 3.7 is independent of t, the result is true for all t simultaneously, almost surely. The modulus of continuity may also depend on t in a continuous way; or be arbitrary if the limiting process has continuous paths:
Theorem 3.9. Let J ⊆ R + be an interval, and X n = (X n t ) t∈J , n ∈ N, a sequence of M I -valued càdlàg processes converging in distribution to an M I -valued càdlàg process X = (X t ) t∈J . Assume that for every t, ε > 0, there exists h t,ε ∈ H such that
Then either of the following two conditions implies that X is an M fct I -valued càdlàg process, that is, (3.15) holds.
(i) X has continuous paths a.s.
As above, if X n is M fct I -valued, (3.16) can be replaced by the existence of random measurable sets Y n t,ε,δ ⊆ X n t satisfying (3.3), such that
Proof. Due to the Skorohod representation theorem, we may assume that X n → X almost surely in the Skorohod topology. For condition (i) respectively (ii) we obtain
(i) If X has continuous paths a.s., the convergence in Skorohod topology implies uniform convergence of X n t (ω) on J a.s. with respect to d mGP . Hence we have X n t → X t marked Gromov-weakly as n → ∞ for all t ∈ J, almost surely, and we can proceed as in the proof of Theorem 3.7.
(ii) There are (random) continuous w n : J → J, converging to the identity uniformly on compacta, such that X n w n (t) → X t for all t ∈ J, almost surely. We can use the moduli of continuity ĥ t,ε (δ) := h t,ε (δ) + δ and proceed as in the proof of Theorem 3.7. Note here that, due to continuity of h t,ε (δ) in t, there is for every compact subinterval J′ of J an N J′,ε,δ such that ĥ t,ε (δ) ≥ h w n (t),ε (δ) for all n ≥ N J′,ε,δ and t ∈ J′.
The same arguments apply for left limits with w n − such that X n w n − (t) → X t− .
To use Theorem 3.9, we have to check in (3.16) or (3.17) a condition for uncountably many t simultaneously, which is often much more difficult than for every t individually. One situation where it is easy to pass from individual t to all t simultaneously is the case where the moduli of continuity h t,ε actually do not depend on t and ε (see Corollary 3.13). The independence of ε, however, is a strong requirement. Therefore, we relax it to not blowing up too fast as ε ↓ 0, where the "too fast" is determined by the following modulus of càdlàgness of the limiting process.
Theorem 3.11. Fix an interval J ⊆ R + . Let X = (X t ) t∈J and X n = (X n t ) t∈J , n ∈ N, be M I -valued càdlàg processes such that X n converges in distribution to X . Furthermore, assume that there is a dense set Q ⊆ J and w ε , h ε ∈ H, such that for all ε > 0 lim sup
Then X is an M fct I -valued càdlàg process, that is, (3.15) holds.
Recall the decomposition M I \ M fct I = ⋃ m∈N F m with F m defined in (2.4). The basic idea of the proof is to use the following lemma about càdlàg paths to show that, almost surely, the path of X avoids F m . The proof of the lemma follows easily using the triangle inequality.
Lemma 3.12. Let J be an interval, (E, r) a metric space, and e = (e t ) t∈J ∈ D E (J) a càdlàg path admitting modulus of càdlàgness w ∈ H. Let F ⊆ E be any set, δ > 0, and Q ⊆ J such that for all t ∈ J there are t 1 , t 2 ∈ Q with t 1 ≤ t ≤ t 2 ≤ t 1 + δ. Then r(e t , F ) > w(δ) ∀t ∈ Q ⟹ e t ∉ F and e t− ∉ F ∀t ∈ J.
Due to the Skorohod representation theorem, we may assume that X n → X almost surely in Skorohod topology. In order to simplify notation, we assume J = [0, 1] and Q = ⋃ k∈N Q k with Q k = {i 2^{-k} : i = 0, . . . , 2^k}. It is enough to show for every ε > 0, m ∈ N and F m as defined in (2.4) that
To show (3.24), fix ε > 0 and m ∈ N, and let X t = (X t , r t , µ t ). Because X has càdlàg paths, we find K = K(ε) < ∞ such that P sup (3.25)
According to (3.21) and (3.23), we can choose k ∈ N big enough such that for h :
Assume without loss of generality that w ε (2^{-k}) ≤ 1. Now Proposition 2.7(iii) implies that, whenever
Combining (3.20) and Lemma 3.12, we obtain (3.28)
Using (3.25), (3.27), and (in the last step) (3.26), we conclude
Thus (3.24) holds for all ε > 0, and P({∃t ∈ [0, 1] : X t ∉ M fct I }) = sup m∈N p m = 0 follows.
If, in Theorem 3.11, we can choose the modulus of continuity h ε = h ∈ H, independent of ε, such that (3.19) holds, we do not need to check (3.20) and (3.21).
Corollary 3.13 (ε-independent modulus of continuity). Assume that X n = (X n t ) t∈J converges in distribution to an M I -valued càdlàg process X , and Q ⊆ J is dense.

Examples
The (neutral) tree-valued Fleming-Viot dynamics is constructed in [GPW13] using the formalism of metric measure spaces. In [DGP12], (allelic) types, encoded as marks of marked metric measure spaces, are included in order to be able to model mutation and selection. [DGP12, Remark 3.11] and [DGP13, Theorem 6] state that the resulting tree-valued Fleming-Viot dynamics with mutation and selection (TFVMS) admits a mark function at all times, almost surely. The given proof, however, contains a gap, because it relies on the criterion claimed in [DGP13, Lemma 7.1], which is wrong in general (see Example 4.1). The reason why the criterion may fail is a lack of homogeneity of the measure ν, in the sense that there are parts with high and parts with low mass density. Consequently, if we condition two samples to have distance less than ε, the probability that they are from the high-density part tends to one as ε ↓ 0, and we do not "see" the low-density part. This phenomenon occurs if ν has an atom but is not purely atomic. We also give two non-atomic examples, one a subset of Euclidean space, and the other one ultrametric.

The tree-valued Fleming-Viot dynamics with mutation and selection
In the following, we prove the existence of a mark function for the TFVMS by verifying the assumptions of Theorem 3.9 for a sequence of approximating tree-valued Moran models. Due to the Girsanov transform given in [DGP12, Theorem 2], it is enough to consider the neutral case, that is, without selection.
Recall the definition of a tree-valued Moran model with mutation (TMMM), X N t = (X N t , r N t , ν N t , κ N t ), with population U N = {1, . . . , N }, N ∈ N and sampling measure $\nu^N_t = \frac{1}{N}\sum_{k=1}^{N} \delta_k$ from [DGP12, Subsections 2.1-2.3]. The dynamics involve resampling, mutation, selection and tree-growth. Recall in particular that the distance between two individuals is defined to be twice the time to the most recent common ancestor (MRCA) (cf. [DGP12, (2.7)]). The dynamics of the underlying Moran model with mutation (MMM) rely on resampling involving every pair of particles at rate γ > 0 and mutation for every particle at rate ϑ ≥ 0 according to a fixed stochastic kernel β(·, ·) on I.
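The neutral dynamics just described can be simulated in a few lines. The following is a rough sketch under several assumptions that are ours and not part of [DGP12]: the population starts at time 0 from N identical individuals at mutual distance 0, the direction of each resampling event is chosen uniformly at random, and the mutation kernel is passed in as a hypothetical function `mutate(u, rng)`. Besides types and distances, the sketch records which individuals have an ancestor hit by a mutation since time 0.

```python
import random


def simulate_neutral_moran(n, gamma, theta, t_end, mutate, init_type, seed=0):
    """Toy neutral Moran model with mutation, run up to time t_end.

    Returns (types, r, mutated): current types, the matrix of distances
    (twice the time to the MRCA, with a common ancestor assumed at time 0),
    and flags telling whether the ancestral line was hit by a mutation.
    """
    rng = random.Random(seed)
    types = [init_type] * n
    mutated = [False] * n
    r = [[0.0] * n for _ in range(n)]
    resampling_rate = gamma * n * (n - 1) / 2   # every unordered pair at rate gamma
    mutation_rate = theta * n                   # every individual at rate theta
    total_rate = resampling_rate + mutation_rate
    t = 0.0
    while True:
        wait = rng.expovariate(total_rate)
        step = min(wait, t_end - t)
        # tree growth: distances between distinct individuals grow at speed 2
        for i in range(n):
            for j in range(n):
                if i != j:
                    r[i][j] += 2.0 * step
        if wait >= t_end - t:
            break
        t += wait
        if rng.random() < resampling_rate / total_rate:
            # resampling: `child` is replaced by an offspring of `parent`
            parent, child = rng.sample(range(n), 2)
            types[child] = types[parent]
            mutated[child] = mutated[parent]
            for j in range(n):
                r[child][j] = r[j][child] = r[parent][j]
            r[child][child] = 0.0
        else:
            # mutation: a single individual draws a new type from the kernel
            k = rng.randrange(n)
            types[k] = mutate(types[k], rng)
            mutated[k] = True
    return types, r, mutated


types, r, mutated = simulate_neutral_moran(
    n=20, gamma=1.0, theta=0.5, t_end=1.0,
    mutate=lambda u, rng: rng.random(), init_type=0.5, seed=1)
unmutated_fraction = sum(not m for m in mutated) / len(mutated)
```

By construction, any two individuals without a mutation on their ancestral lines carry the same type, which is the mechanism exploited in the proof of Theorem 4.3 below.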
Next recall the graphical construction of the MMM from [DGP12, …]. Let $(M^{t_0,N}_t)_{t \ge t_0}$, $M^{t_0,N}_t \subset U_N$ with $M^{t_0,N}_{t_0} = \emptyset$, be the process that records the individuals of the population at time $t$ with an ancestor at a time $t_0 < s \le t$ involved in a mutation event. By a coupling argument this process can be constructed by means of the Poisson point processes $(\eta^{k,\ell}_{\mathrm{res}}, \eta^{k}_{\mathrm{mut}},\ k, \ell \in U_N)$ as follows:
… (4.4)
Let $\xi^N_t := \frac{1}{N} \# M^{t_0,N}_{t_0+t}$ be the proportion of individuals at time $t_0 + t$, $t \ge 0$, whose ancestors have mutated after the (for the moment fixed) time $t_0$.
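The process of individuals carrying a post-$t_0$ mutation in their ancestry can be simulated directly. The sketch below is an illustration under assumptions, not the coupling (4.4): it takes $t_0 = 0$, lets each individual mutate at rate ϑ and each unordered pair resample at rate γ, and uses the hypothetical name simulate_xi. It returns the jump times and values of the fraction #M/N.

```python
import random


def simulate_xi(n, gamma, theta, t_end, seed=0):
    """Return [(time, xi)] where xi is the fraction of individuals having an
    ancestor (possibly themselves) hit by a mutation in (0, time].
    """
    if n < 1 or gamma < 0 or theta < 0:
        raise ValueError("need n >= 1 and nonnegative rates")
    rng = random.Random(seed)
    mutated = [False] * n              # indicator of membership in M
    resample_rate = gamma * n * (n - 1) / 2
    mutation_rate = theta * n
    total = resample_rate + mutation_rate
    path = [(0.0, 0.0)]                # M is empty at the start
    if total == 0:
        return path
    t = 0.0
    while True:
        t += rng.expovariate(total)
        if t >= t_end:
            return path
        if rng.random() * total < mutation_rate:
            # Mutation: the affected individual joins M.
            mutated[rng.randrange(n)] = True
        else:
            # Resampling: the offspring inherits its parent's status.
            child, parent = rng.sample(range(n), 2)
            mutated[child] = mutated[parent]
        path.append((t, sum(mutated) / n))
```

Note that the fraction is not monotone in time: a resampling event can replace a marked individual by the offspring of an unmarked one.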
As the construction of the TFVMS in [DGP12] is only given for a compact type-space I, we make the same assumption. Note, however, that our proof itself does not use compactness and is therefore valid for non-compact I, provided that the TFVMS is the limit of the corresponding Moran models, and there exists a Girsanov transform allowing us to reduce to the neutral case.
Theorem 4.3 (the TFVMS admits a mark function). Let $I$ be compact and $\mathcal{X} = (\mathcal{X}_t)_{t \ge 0}$ be the tree-valued Fleming-Viot dynamics with mutation and selection as defined in [DGP12]. Then
$$\mathbb{P}\big(\mathcal{X}_t \in \mathbb{M}^{\mathrm{fct}}_I \text{ for all } t > 0\big) = 1. \tag{4.12}$$
In particular, $(\mathcal{X}_t)_{t > 0}$ is an $\mathbb{M}^{\mathrm{fct}}_I$-valued càdlàg process.
Proof. Let $\mathcal{X}_t = (X_t, r_t, \mu_t)$. By [DGP12, Theorem 2] there exists a Girsanov transform that enables us to assume without loss of generality that selection is not present. Let $\delta > 0$ be fixed. Recall that the distance between two individuals is twice the time to the MRCA. Hence, $r^N_t(x, y) < \delta$ implies that at time $t - \delta/2$ the individuals have a common ancestor. Further recall that $U_N = \{1, \dots, N\}$, and that $(M^{t_0,N}_t)_{t \ge t_0}$ with $M^{t_0,N}_{t_0} = \emptyset$ records the individuals of the population at time $t$ with an ancestor at a time $t_0 < s \le t$ involved in a mutation event (cf. (4.4)).
Fix an arbitrary time horizon $T > 0$ and $i \in \mathbb{N}$, $i \le 2T/\delta$. Using the notation of Theorem 3.7, for $t \in [i\delta/2, (i+1)\delta/2)$ arbitrary, let $(Y_t)^{\varepsilon}_{N,\delta} := U_N \setminus M^{(i-1)\delta/2,\,N}_t$. If $x, y \in (Y_t)^{\varepsilon}_{N,\delta}$ satisfy $r^N_t(x, y) < \delta$, then they have a common ancestor at time $t_0 := (i-1)\delta/2 \le t - \delta/2$, and after this point in time no mutation occurred along their ancestral lineages. In particular, $d(\kappa^N_t(x), \kappa^N_t(y)) = 0$. Moreover, by Lemma 4.2 we obtain for every $a > 0$,
… (4.14)
For $\varepsilon > 0$ arbitrary, we use this inequality with $a := \sqrt{\varepsilon^{-1}\, 2TC\delta}$, together with $\nu^N_t \le 1$ for $t < \delta/2$, to see that the prerequisites of Theorem 3.9 are satisfied for $h_{t,\varepsilon} \in H$ with …

For fixed $N$, it is elementary to construct a finite, random (ultra-)metric measure space encoding the random genealogy of the Λ-coalescent, where the distance is defined as the time to the MRCA. In [GPW09, Theorem 4], existence and uniqueness of a Gromov-weak limit in distribution, as $N \to \infty$, is proven to be equivalent to the so-called "dust-free" property, namely $\int_0^1 y^{-1}\, \Lambda(dy) = \infty$. The resulting limit is called the Λ-coalescent measure tree.
Now, replace the tree-valued Moran models considered in Subsection 4.1 and [DGP12] by so-called tree-valued Λ-Cannings models. That is, leave the mutation and selection parts as they are and change the resampling part of the Moran models as follows: for $k = 2, \dots, N$, at rate $\binom{N}{k} \lambda_{N,k}$ a block of $k$ individuals is chosen uniformly at random among the $N$ individuals of the population. Upon such a resampling event, all individuals in this block are replaced by an offspring of a single individual which is chosen uniformly from this block. Note that the genealogy (disregarding types) of the resulting Λ-Cannings model with $N$ individuals is dual to the Λ-coalescent starting with $N$ blocks. We call any limit point (in path space) of the tree-valued Λ-Cannings processes, as $N$ tends to infinity and Λ is fixed, a tree-valued Λ-Fleming-Viot process (TLFV).
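The block-resampling mechanism can be illustrated for one concrete family of measures. The sketch below rests on assumptions that are not taken from the text above: it presumes that $\lambda_{N,k}$ denotes the usual Λ-coalescent rate $\int_0^1 y^{k-2}(1-y)^{N-k}\,\Lambda(dy)$, and it specializes to Λ = Beta(2 − a, a) with 0 < a < 2, for which this integral equals B(k − a, N − k + a)/B(2 − a, a). All function names are hypothetical.

```python
import math
import random


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b) for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_lambda_rate(n, k, a):
    """Assumed rate lambda_{n,k} for Lambda = Beta(2 - a, a), 0 < a < 2."""
    if not (0 < a < 2 and 2 <= k <= n):
        raise ValueError("need 0 < a < 2 and 2 <= k <= n")
    return math.exp(log_beta(k - a, n - k + a) - log_beta(2 - a, a))


def block_rate(n, k, a):
    """Total rate at which some block of size k is resampled."""
    return math.comb(n, k) * beta_lambda_rate(n, k, a)


def cannings_event(types, k, rng):
    """One resampling event: a uniformly chosen block of k individuals is
    replaced by offspring of one uniformly chosen member of the block."""
    block = rng.sample(range(len(types)), k)
    parent = rng.choice(block)
    new_types = list(types)
    for i in block:
        new_types[i] = types[parent]
    return new_types
```

For this Beta family the dust-free condition $\int_0^1 y^{-1}\,\Lambda(dy) = \infty$ holds exactly when a ≥ 1, since the integrand behaves like $y^{-a}$ near zero.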
In the neutral case, existence and uniqueness of such a limit point follow as a special case of the forthcoming work [GKW14]. Here, we show that, whenever limit points exist, all of them admit mark functions.
Theorem 4.4 (the TLFV admits a mark function). Suppose there is no selection, that is $\alpha = 0$, and $\mathcal{X} = (\mathcal{X}_t)_{t \ge 0}$ is a tree-valued Λ-Fleming-Viot process with mutation. Then
$$\mathbb{P}\big(\mathcal{X}_t \in \mathbb{M}^{\mathrm{fct}}_I \text{ for all } t > 0\big) = 1. \tag{4.17}$$
Proof. By passing to a subsequence if necessary, we may assume that the Λ-Cannings models converge in distribution to $\mathcal{X}$. We proceed as in Subsection 4.1. Again, let $(M^{t_0,N}_t)_{t \ge t_0}$, $M^{t_0,N}_t \subset U_N$ with $M^{t_0,N}_{t_0} = \emptyset$, be the process that records the individuals of the population at time $t$ with an ancestor at a time $t_0 < s \le t$ involved in a mutation event, and let $\xi^N_t := \frac{1}{N} \# M^{t_0,N}_{t_0+t}$ be the proportion of individuals at time $t_0 + t$, $t \ge 0$, whose ancestors have mutated after the (for the moment fixed) time $t_0$. By definition, $(\xi^N_t)_{t \ge 0}$ is a (continuous-time) Markov jump process on $[0, 1]$ with $\xi^N_0 = 0$ and generator
…
where, using $\binom{n}{i} = \frac{n}{i} \binom{n-1}{i-1}$ for $i \ge 1$,
… (4.20)
Recall that … Therefore, …

Future application: Evolving phylogenies of trait-dependent branching
In [KW14] the results of the present paper will be applied in the context of evolving genealogies to establish the existence of a mark function with the help of Theorem 3.9. These genealogies are random marked metric measure spaces, constructed as the limit of approximating particle systems. The individual birth and death rates in the N-th approximating population depend on the present trait of the individuals alive and are of order O(N). At each birth event, mutation happens with a fixed probability. Each individual is assigned mass 1/N. The metric under consideration is genetic distance: in the N-th approximating population, genetic distance is increased by 1/N at each birth with mutation. Hence, the genetic distance of two individuals is counted in terms of births with mutation backwards in time to the MRCA rather than in terms of the time to the MRCA. Because exponential times are used to model birth and death events in this (therefore non-ultrametric) setup, the analysis of the modulus of continuity of the trait history of a particle, in combination with the evolution of its genetic age, plays a major role in establishing tightness of the approximating systems and existence of a mark function. [Kli14, Lemma 3.9] yields a control on the modulus of continuity by transferring the model to the context of historical particle systems. In a first step, time is related to genetic distance by means of the modulus of continuity. The extent of the change of trait of an individual in a small amount of time (recall (3.9) and (3.3)) can then be controlled by means of the modulus of continuity of its trait path in combination with a control on the height of the largest jump during this period of time. This can in turn be ensured by appropriate assumptions on the mutation transition kernels of the approximating systems.