Branching trees I: Concatenation and infinite divisibility

The goal of this work is to decompose random populations with a genealogy into subfamilies of a given degree of kinship and to obtain a notion of infinitely divisible genealogies. We model the genealogical structure of a population by (equivalence classes of) ultrametric measure spaces (um-spaces) as elements of the Polish space U, which we recall. To analyze the family structure in this coding we introduce an algebraic structure on um-spaces (a consistent collection of semigroups). This allows us to obtain, for every depth h, a path of collections of subfamilies of fixed kinship h (described as ultrametric measure spaces) as a measurable functional of the genealogy. Random elements in the semigroup are studied, in particular infinitely divisible random variables. Here we define infinite divisibility of random genealogies as the property that, for every h, the h-tops can be represented as a concatenation of independent identically distributed h-forests. We obtain a Lévy-Khintchine representation of this object and a corresponding representation via a concatenation of points of a Poisson point process of h-forests. Finally, the case of discrete and marked um-spaces is treated, which allows us to apply the results to both individual-based and, most importantly, spatial populations. The results have various applications. In particular, the genealogical (U-valued) Feller diffusion and the genealogical (U^V-valued) super random walk are treated on the basis of the present work in [DG18b] and [GRG]. In part II of this paper we go in a different direction and refine the study in the case of continuum branching populations, give a refined analysis of the Laplace functional and give a representation in terms of a Cox process on h-trees rather than forests.


Introduction
We are interested in random genealogies, for example those arising from an evolving branching population. We use a concept of genealogy which is based on ultrametric measure spaces (more precisely, their equivalence classes) and which is suited to describing the evolution of this structure in time by martingale problems, aiming eventually at spatial situations. In particular, we follow a different point of view than the literature describing genealogies of populations by labeled trees, see for example [Nev86], [NP89], [LG89], [LJ91], [Abr92] continuing up to the present, or modeling genealogies as measure R-trees, see [EPW06,Gro99]; we comment later on the relations. For more information on our approach to genealogies see the survey article [DG19a].
An important role in this research project will be played by branching processes, for example the genealogy of a continuous state Feller branching diffusion or spatial versions thereof such as the super random walk, and in this paper we lay the foundations to study such objects. For more on these processes see [DG19b,GRG]. The question is to consider, for varying depths of kinship, decompositions of the population into subfamilies, specifying their genealogy as well as their size, and to see whether we can give a cluster representation of the genealogy, i.e. can we view the genealogy of the sub-populations as a Cox point process on the space of genealogies modeled as ultrametric measure spaces?
The term family decomposition appears frequently in the literature on branching processes, see e.g. [DG96], [Daw93] based on the historical process [DP91], but it does not always correspond exactly to genealogies. Here we obtain in general an interpretation in terms of genealogies, using a specific concept of "genealogy", even for continuum state processes, as the limit of the obvious meaning it has in discrete Galton-Watson processes.
In order to study this type of question we follow [EPW06,GPW09] and code the genealogy as equivalence classes of ultrametric measure spaces (and, in the case of a population distributed in space where individuals have a location, marked ones [DGP11]), turning the genealogy of the time-t population into a random variable with values in a Polish space. A survey on this approach is found in [DG19a]. Here the distance describes the degree of kinship and is twice the time back to the most recent common ancestor, which in fact gives an ultrametric. Now we can fix a degree of kinship, say h > 0, and decompose the space into open 2h-balls. This induces a number of ultrametric measure spaces, each describing the genealogy of a subfamily, and we can obtain one ultrametric measure space by connecting the subspaces to a forest, giving distance 2h to points in different balls. This h-concatenation of the ultrametric measure spaces then describes the 2h-family decomposition. In particular, we can use the algebraic structure of the concatenation and the framework of ultrametric measure spaces to characterize the decomposition. This allows us to carry out calculations to obtain properties of the whole path of 2h-family decompositions for h ∈ (0, diam(space)/2). However, since we take equivalence classes of such objects, some care is needed.
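The decomposition into open 2h-balls and the h-concatenation can be illustrated on a finite ultrametric measure space. The following is only a minimal sketch: a space is represented by a distance matrix and a list of point masses (a representative, not an equivalence class), and the function names `open_ball_classes`, `restrict`, `concatenate` and `truncate` are hypothetical helpers introduced here, not notation from the paper. Truncation is taken to mean replacing r by min(r, 2h), which is an assumption matching the description "giving distance 2h to points in different balls".

```python
def open_ball_classes(dist, radius):
    """Partition point indices into open balls of the given radius.

    For an ultrametric, r(x, y) < radius is an equivalence relation, so it is
    enough to compare each point with one representative per class.
    """
    classes = []
    for x in range(len(dist)):
        for cls in classes:
            if dist[x][cls[0]] < radius:
                cls.append(x)
                break
        else:  # no existing ball contains x, so x opens a new one
            classes.append([x])
    return classes


def restrict(dist, mass, idx):
    """Sub-space (distance matrix, masses) carried by the points in idx."""
    return [[dist[i][j] for j in idx] for i in idx], [mass[i] for i in idx]


def concatenate(spaces, h):
    """h-concatenation: disjoint union with distance 2h between components."""
    n = sum(len(m) for _, m in spaces)
    dist = [[2 * h] * n for _ in range(n)]
    mass, offset = [], 0
    for d, m in spaces:
        for i in range(len(m)):
            for j in range(len(m)):
                dist[offset + i][offset + j] = d[i][j]
        mass.extend(m)
        offset += len(m)
    return dist, mass


def truncate(dist, h):
    """Assumed form of the h-truncation of the metric: min(r, 2h)."""
    return [[min(d, 2 * h) for d in row] for row in dist]


# Four individuals: two sibling pairs whose common ancestor lies further back.
dist = [[0, 1, 4, 4],
        [1, 0, 4, 4],
        [4, 4, 0, 1],
        [4, 4, 1, 0]]
mass = [0.5, 0.5, 1.0, 2.0]
h = 1

balls = open_ball_classes(dist, 2 * h)
subfamilies = [restrict(dist, mass, idx) for idx in balls]
forest = concatenate(subfamilies, h)
```

In this toy example the two open 2h-balls are the two sibling pairs, and concatenating the restricted subfamilies reproduces the truncated space, which is the finite analogue of the 2h-family decomposition described above.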
The h-concatenation is a binary operation and we show that it defines a topological semigroup. Further topological and algebraic properties are established, and we notice a close relationship in structure between our results and arguments and those for a Cartesian semigroup (M, ⊞) on metric measure spaces introduced by Evans and Molchanov [EM16], even though the binary operations used in their paper and in ours are completely different (but the algebraic properties are similar).
One of the most important questions for a semigroup is whether it is atomic, i.e. whether every element can be written as a product of irreducible elements, and furthermore whether it is a unique factorization domain, i.e. whether the representation is unique up to order. We show that the answer to both questions is yes, obtaining for a fixed depth a well-defined family decomposition of an ultrametric measure space and finally also a well-defined path of family decompositions of varying depth. Starting with a fixed ultrametric measure space this defines a càdlàg path of family decompositions. This structure is much richer and more informative than having just the semigroup structure of U. If we decrease the parameter h starting from the diameter, we obtain a succession of refinements of the family decomposition as h decreases to zero, with the original ultrametric measure space as the limit. There are more applications of this structure, which we exploit in [GRG].
We establish that the association of the path of decompositions is measurable and hence is indeed suitable to give rise to a legitimate random variable. The next task is to turn to random genealogies. The goal of the research program is to understand better the probabilistic structure of this random path for genealogies of evolving populations.
We begin by giving a concept of infinite divisibility on the level of random genealogies in our coding. This lifts the concept one has for the population size process (an R_+-valued process) or, if we are interested for example in the Dawson-Watanabe process, that for measure-valued processes. Since this should be related to properties of the family decomposition, we want to use the algebraic structure from above (h-forests with the concatenation operation). There is a quite general concept of infinite divisibility for random variables with values in semigroups, which we follow here ([BCR84]) but now lift to a whole consistent collection of semigroups. We shall discuss in Section (2.9) the subtle relation to the classical theory of semigroups and negative definite functions, i.e. harmonic analysis.
We then establish a Lévy-Khintchine formula for the Laplace functional. It implies a representation of the forest of concatenated 2h-forests corresponding to the random state of the genealogy as a concatenation of independent subfamily forests, which are generated by a Poisson point process of 2h-forests for every h in (0, t], for a random ultrametric measure space of diameter 2t, and which are truncation consistent. As an example we conclude by showing that the states of genealogies of branching processes are infinitely divisible if the initial state has this property. The results also hold for the genealogies of spatial models and similarly for models where individuals carry a type. Examples are super random walks or Dawson-Watanabe processes as well as multitype branching processes, provided they exist as um-space valued processes (an issue addressed in forthcoming work [DG19b]). This allows us to establish here a Lévy-Khintchine formula for genealogies for this very important class of spatial population models.

Perspectives
The key results in this paper can be used to get information about concrete stochastic systems. In [DG19b] we apply our results to discuss at length the example of the U-valued Feller diffusion and its spatial version, the genealogical (U-valued) super random walk, i.e. the genealogy of this classical model represented by ultrametric measure spaces. We also use the results in [GRG] to study criteria for generators which allow us to obtain a "branching property" from the form of the generator; this works very well in our context of genealogical and/or historical information.
Finally, in part II of this paper [DGG] we apply the developed techniques to the study of continuum mass branching populations and give a more refined analysis of the present Lévy-Khintchine representation for that special case. The key point there will be to move to a representation of the 2h-tops in terms of a Cox process on 2h-trees rather than forests as in this part I, i.e. splitting according to the ancestors at depth h, that is, a representation by the prime elements of the semigroup of h-forests, i.e. elements of U(h), rather than by Poisson point processes on forests. This is related to the Cox cluster representation of historical processes associated with spatial branching processes, see [DP91], which is developed in [GRG]. This will also allow us to understand better the structure of the random genealogy by generating it dynamically.
Further tasks are to apply the present theory also to genealogies of α-stable processes and to modify concepts and theory to deal with genealogies that appear in the form of algebraic trees as developed by Löhr and Winter [LW18,LMW18].
Outline. In Section (2) we collect all of the important concepts and results. Section (3) contains all of the proofs concerning the structure of the state spaces and semigroups. Section (4) proves statements on random variables with values in U. Proofs of the infinite divisibility results can be found in Section (5). Finally, an appendix contains basic facts on ultrametric measure spaces and on boundedly finite measures.

Basic concepts and Results
In this section we first introduce in Subsection (2.1) the state space of our genealogy-valued random variables. Subsection (2.2) introduces decompositions into disjoint open balls with weights, providing the precise meaning of the concept of family decomposition. Subsection (2.3) gives some key analytical instruments, namely (truncated) polynomials and properties of quantities related to family decompositions. In Subsection (2.4) we pass to random ultrametric measure spaces and introduce Laplace functionals. In Subsection (2.5) we relate infinitely divisible random "trees" with Poisson point processes of "forests", giving a version of the Lévy-Khintchine formula, and then we discuss particular cases with additional properties: in Subsection (2.6) an application to branching, with an example of a concrete branching process in Subsection (2.7), and in Subsection (2.8) individual-based (discrete) populations and, more important for our main goal of understanding spatial populations, generalizations to genealogies of spatial populations modeled by marked metric measure spaces. Subsection (2.9) discusses the relation to harmonic analysis and Subsection (2.10) gives an outline of the proof sections.

Ultrametric measure spaces
The building blocks of our description of genealogies are ultrametric measure spaces. An ultrametric measure space is a triple (U, r, µ), with U a set, r an ultrametric on U such that (U, r) is a Polish space, and µ a finite Borel measure on (U, B(U)), with B denoting the Borel σ-algebra. Here U describes the individuals of a population, r the genealogical distance between individuals, and µ is a multiple (by the population size) of the sampling measure, a probability measure on (U, B(U)).
We say that two such triples (U, r, µ) and (U', r', µ') are equivalent if there is a measure-preserving isometry between the supports of µ and µ'. The equivalence class of a triple (U, r, µ) and its total mass are denoted by (2.1) u = [U, r, µ] and ū = µ(U).
Extending the space of ultrametric probability measure spaces from [GPW09] to finite measures, see in particular Section 2.4 in [Glö12] or [ALW16], the basic state space of our random variables is the following.

Definition 2.1 (Ultrametric measure spaces). Define the space (2.2) U := set of isomorphy classes of ultrametric measure spaces with finite measure, endowed with the Gromov-weak topology. This topology on U is metrized by the Gromov-Prokhorov metric d_GPr such that U is a Polish space, see [GPW09], [Glö12], [DG19a].
Recall that in this topology sequences converge iff all distance matrix measures (see (2.22)) converge weakly to the corresponding measure of a limit element in U. The null measure on a measure space and the null tree [{1}, r, 0] are here both denoted simply by 0. ♦

We refer the reader to Appendix (A) for detailed definitions and to [Daw93] for facts on spaces of measures and corresponding weak topologies.
Remark 2.2 (Ultrametric spaces and trees). Recall that any ultrametric space (U, r) (with finite diameter) can be embedded isometrically into a (rooted) R-tree such that the leaves of the R-tree correspond to the elements of U. In particular, we are then able to talk about a most recent common ancestor of two points of U, which is the unique point in the R-tree such that its distance to each of the two elements of U is half their distance in (U, r).
By the above remark this can be viewed as transforming the time back to the most recent common ancestor. It has the same topology as u but a different geometry. A particular example is τ_a(r) = a r, written (2.4) a u := τ_a^*(u) for a ≥ 0 and u ∈ U.

Family decomposition: the semigroup of h-forests
We want to decompose an ultrametric measure space into disjoint open balls of a fixed radius, say h > 0, corresponding to subfamilies which are descendants of a single MRCA time h back, recall Remark (2.2). To do this we need some notation.
as the set of ultrametric measure spaces which only realize distances in A. We give more definitions of sets and a binary operation:

Definition 2.4 (Forests, trees and concatenation). Let h ≥ 0.
(a) Define the subset of h-forests U(h)^⊔ as the set of elements of U which only realize distances in [0, 2h].
(b) For u = [U, r_U, µ] and v = [V, r_V, ν] in U(h)^⊔ define the h-concatenation u ⊔^h v := [U ⊎ V, r_U ⊔^h r_V, µ + ν], where ⊎ is the disjoint union of sets, (r_U ⊔^h r_V)|_{U×U} = r_U, (r_U ⊔^h r_V)|_{V×V} = r_V, and for x ∈ U, y ∈ V: (r_U ⊔^h r_V)(x, y) = 2h. We write ⊔ if h > 0 has been fixed.
(c) Define the subset of h-trees U(h) as the set of elements of U which only realize distances in [0, 2h). Forests and trees are endowed with the relative topology from U. ♦

Remark 2.5. For an ultrametric measure space with diameter 2h we always have a decomposition into countably many open h-balls. Each of these balls generates an ultrametric measure space by restriction. This decomposition is unique. We then expect that the equivalence classes of ultrametric measure spaces generated by this decomposition induce a unique decomposition of the corresponding equivalence class of the full space. Here, however, we based this on a selection of representatives for all these objects, which have a unique decomposition. We want to lift this to the equivalence classes. To see that this is feasible, the simplest way is to proceed with purely algebraic and topological arguments. ♣

Remark 2.6. Recalling Definition (2.1) we observe: (a) Note that U(h)^⊔ and U(h) are separable metric spaces in their restricted topologies and the former is the completion of the latter, hence Polish, see Proposition (3.2).

With each forest we associate a path of decompositions into subfamilies. We define for that purpose the h-truncation.

We can now associate with every element of U or U(t) an object which is a key object in applications, but which also contains the key mathematical structure of U needed to study infinite divisibility of U-valued random variables and which allows us to make precise the concept of family decompositions.

♦
The first result tells us that any forest can be seen as a unique (at most countable) concatenation of trees i.e. elements of U.
The result also clarifies the algebraic structure of the concatenation operation. Recall that an element x which is not the identity e in a semigroup is called irreducible if x = yz implies that x = y or y = e, and prime if it divides a concatenation only if it divides one of the factors, which is in general a stronger property. We will need the concept of Delphic semigroups, which we recall.
Remark 2.11. Delphic semigroups were introduced by [Ken68] and further studied in [Dav68]. They are commutative, topological semigroups where the set of divisors of any element is compact and there is a continuous homomorphism into ([0, ∞), +) with trivial kernel, which is here given by u ↦ ū. ♣

Remark 2.12. Note that we can define concatenations for countable index sets by taking limits of the concatenations of finitely many elements from the index set and requiring that the limit exists for all choices of the finite subsets. ♣

We prove in Section (3.1) the following:

Theorem 2.13 (Semigroup structure, truncation consistency).
(a) The algebraic structure (U(h)^⊔, ⊔^h) is a Delphic semigroup, see Remark (2.11). The set of irreducible elements is U(h) and any u ∈ U(h)^⊔ has a unique (up to order) decomposition (2.12) u = ⊔^h_{i∈I} u_i, where I is a countable index set and u_i ∈ U(h) \ {0}. Via τ(h') we can associate with u, for each h' ∈ [0, h], a path of h'-decompositions.
are the h- respectively h'-decomposition of u, and similarly for countable decompositions.
In other words, for every h there exists for a forest of diameter t a unique family decomposition with kinship degree h. As a consequence we can associate with every forest with diameter t a path of family decompositions, where the "time-index" is the degree of kinship h ∈ [0, t]. The different decompositions are consistent in the sense that they are successive truncations.

Remark 2.14. The semigroup (M_1, ⊞) from [EM16] has the property that an element can be decomposed into countably many prime elements. Since both semigroups allow a unique factorization into prime elements, they can be understood as free semigroups with a certain set of generators, the prime elements. We do not know of a "natural" isomorphism between the two semigroups which is continuous. However, in the sequel we only use that many arguments carry over since they only use the algebraic structure. ♣

The previous result next allows us to formalize the idea of a family decomposition for an ultrametric measure space, which is the key object for the analysis of the genealogies of branching populations.
For that purpose we want to extract the h-top, consisting of the decomposition into the open 2h-balls of an element u ∈ U, and in addition an object containing the ancestral relations of the different 2h-balls, given by a second element of U(h) which is complementary to the h-top, namely the so-called h-trunk.

(b) Suppose u(h) = ⊔^h_{i∈I} u_i as in (2.12) with an at most countable index set I and u_i ∈ U(h) \ {0}, and write u_i = [U_i, r_i, µ_i] for i ∈ I. The h-trunk of u is defined as the ultrametric space

♦
Having Theorem (2.13) we can regard the h-top of u as a collection of elements in U(h) and we will make use of this identification frequently.
Remark 2.16 (Family decompositions). Recall the connection between ultrametric measure spaces and R-trees explained in Remark (2.2). In particular, we can view u as the leaves of a family tree for a population. Then Theorem (2.13) applied to u(h) says that a population represented by u has a unique family decomposition of depth h, that is, a decomposition into subfamilies u_i, i ∈ I, where within each subfamily all individuals share a common ancestor whose death dates back at most time h. We will speak of the h-family decomposition into subfamilies of individuals whose degree of kinship is at most h. In a similar way we can think of I as the set of ancestors who lived at time h back in time. The metric of the trunk encodes the genealogy of the h-ancestors. If the diameter of u is 2t, then we can view u as the population alive at time t and Theorem (2.13) applied to each h ∈ (0, t] induces a (unique) collection of family decompositions The h-trunk has the property that see Proposition (3.25). Together with Theorem (2.13) this allows us to say that we can approximate any u ∈ U in a natural way by um-spaces consisting of at most countably many points, namely the h-trunks for h ↓ 0. ♣

The following result tells us that the operation τ(h), called h-truncation, is continuous.
Remark 2.18. It is also possible to establish, using the explicit definition of the Gromov-Prokhorov distance, that the first mapping is a non-expansive map. ♣

Note that in ultrametric spaces two open balls are either contained in each other or disjoint. Hence it makes sense to speak of the number of open balls of radius h in u. By Theorem (2.13) this number is unique. Since it is important, we introduce a special notation for this number.
Definition 2.19 (Number of h-balls). Let u ∈ U and let I be the index set belonging to the decomposition of u(h) as given in Theorem (2.13). Then we set #_h u := #I. ♦

The following is analogous to Lemma 2.4(a) in [EM16] and is proved in Section (3.2).
The number of open 2h-balls #_h is measurable. It is an additive functional on (U(h)^⊔, ⊔^h), that is, #_h(u ⊔^h v) = #_h u + #_h v.

Remark 2.21. Note however that the map #_h is neither upper semi-continuous nor lower semi-continuous. This can be seen from the following counterexample. Take u_n =

Another example of a measurable functional is what we call the path of tops, associating to every depth h the h-subfamilies in a measure description, see Section (3.4). This functional describes the fragmentation of the tree into smaller sub-trees as we get closer to the top of the tree.
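The ball count and its additivity under concatenation can be checked by hand on small finite spaces. The sketch below is only an illustration under simplifying assumptions: spaces are given by distance matrices of representatives in which every point carries positive mass (so masses are omitted), and `count_open_balls` and `concatenate` are hypothetical helper names, not notation from the paper.

```python
def count_open_balls(dist, h):
    """Number of open balls of radius 2h in a finite ultrametric space.

    A point opens a new ball exactly when it is at distance >= 2h from the
    representatives of all balls found so far.
    """
    reps = []
    for x in range(len(dist)):
        if all(dist[x][y] >= 2 * h for y in reps):
            reps.append(x)
    return len(reps)


def concatenate(d_u, d_v, h):
    """Distance matrix of the h-concatenation of two h-forests."""
    n_u, n_v = len(d_u), len(d_v)
    top = [row + [2 * h] * n_v for row in d_u]
    bottom = [[2 * h] * n_u + row for row in d_v]
    return top + bottom


h = 1
u = [[0, 1], [1, 0]]                    # a single open 2h-ball
v = [[0, 2, 2], [2, 0, 1], [2, 1, 0]]   # two open 2h-balls
w = concatenate(u, v, h)

balls_u = count_open_balls(u, h)
balls_v = count_open_balls(v, h)
balls_w = count_open_balls(w, h)
balls_w_fine = count_open_balls(w, 0.25)  # smaller depth, finer decomposition
```

Here the count for the concatenation is the sum of the counts of the two factors, and lowering the depth from h = 1 to h = 0.25 refines the decomposition into more balls, in line with the fragmentation picture described above.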

For a population and its genealogy a natural concept is that of a subpopulation and its sub-genealogy. This induces an order on these objects, which is also the natural order associated with the semigroup structure. Indeed, we can equip the space of h-forests with a partial order such that if u, v are h-forests and u is a sub-forest of v, then u is smaller than v. More formally: We skip the index h on ≤_h if no ambiguities may occur. Different interesting partial orders on metric measure spaces are developed in [GRG] and [GR16], the latter being less restrictive.

Analytical tools for h-forests: polynomials and their truncation
The key objects to study the h-forests are distance matrices, monomials and polynomials, objects which we define next. We begin with the finite sub-trees of size m of an element u ∈ U. For m = 1 we set ν^{1,u} := ū := µ(U), the total mass.

♦
Note that we cannot define here a nice distance matrix measure on D_∞ since µ need not be a probability measure. For that we have to consider (ū, ν), with ν = R^{∞,(U,r)}(µ̂) and µ̂ = ū^{-1} µ; the normalized measure then allows us to define a distance matrix distribution on a sampling sequence, which provides the full information on the genealogy.
The finite sub-trees with m leaves can be described by the following test functions.
Definition 2.24 (Polynomials). For m ≥ 1 and φ ∈ C b (D m ), define the monomial The elements of the algebra generated by Π are called polynomials, the corresponding set A(Π).

We denote special classes of monomials for h > 0 as follows: (2.24) (2.26) This extends to polynomials by linearity. ♦

, since we require the test functions in monomials to be continuous. Truncated polynomials are the key to studying the semigroup (U(h)^⊔, ⊔^h). We list some of the important properties in the next theorem.

(c) Elements of U(h) are identified by truncated monomials:

Proof of Theorem 2.27. The proofs are given via three statements and their proofs, which can be found in Propositions (3.8) for (a) and (b), (3.10) for (c), and (3.11) for (d).
Item (c) states that in order to identify an element in U(h) it suffices to know the distance matrices only at distances strictly less than 2h; moreover, this even suffices to determine the topology of the space.
Theorem (2.27) implies that non-negative, truncated monomials on U(h) are monotone with respect to the partial order introduced in Definition (2.22).
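For finite spaces the behaviour of truncated monomials under concatenation can be computed directly. The sketch below rests on two assumptions that are not spelled out in the text above: a monomial is taken in the form usual in the um-space literature, namely the integral of a test function φ of the pairwise distances of m sampled points against the m-fold product of µ, and "truncated" is read as "φ vanishes as soon as one of the distances reaches 2h". The names `monomial`, `truncated_phi` and `concatenate` are hypothetical helpers.

```python
import math
from itertools import combinations, product


def monomial(dist, mass, m, phi):
    """Sum over ordered m-tuples of points of (product of masses) * phi(distances)."""
    total = 0.0
    for pts in product(range(len(mass)), repeat=m):
        weight = math.prod(mass[p] for p in pts)
        pair_dists = [dist[a][b] for a, b in combinations(pts, 2)]
        total += weight * phi(pair_dists)
    return total


def truncated_phi(h):
    """A continuous test function that vanishes once some distance reaches 2h."""
    def phi(pair_dists):
        return math.prod(max(0.0, 2 * h - d) for d in pair_dists)
    return phi


def concatenate(u, v, h):
    """h-concatenation of two finite spaces given as (distance matrix, masses)."""
    (d_u, m_u), (d_v, m_v) = u, v
    top = [row + [2 * h] * len(m_v) for row in d_u]
    bottom = [[2 * h] * len(m_u) + row for row in d_v]
    return top + bottom, m_u + m_v


h = 1
u = ([[0, 1], [1, 0]], [1.0, 2.0])
v = ([[0]], [3.0])
w = concatenate(u, v, h)
phi = truncated_phi(h)

# Second- and third-order truncated monomials of the factors and of w.
values = {m: (monomial(*u, m, phi), monomial(*v, m, phi), monomial(*w, m, phi))
          for m in (2, 3)}
```

In this example every tuple that mixes points of u and v contains a pair at distance 2h and therefore contributes nothing, so the value on the concatenation is the sum of the values on the factors. For non-negative φ this also makes the monotonicity with respect to the sub-forest order plausible.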

Analytical tools for random h-forests: Laplace transforms
We now consider random elements in U, which we generically denote by the capital letters U, V. As in the classical settings of random measures or real-valued random variables, the Laplace transform is a powerful tool for random trees as well. In the abstract setting of the semigroup, given a (semi-)character χ : U(h)^⊔ → [0, 1] (or into the complex numbers with modulus not greater than 1), we define a characteristic function M_1(U(h)^⊔) → [0, 1], µ ↦ ∫ µ(du) χ(u). In particular, these will play a role when we introduce infinitely divisible random trees.
Definition 2.29 (Laplace transform). The Laplace functional L_U : A_+(Π) → [0, 1] of a random um-space U is defined by L_U(Φ) := E[exp(−Φ(U))] for Φ ∈ A_+(Π).

(a) Let U, U' ∈ U(h)^⊔ be random h-forests. Then,
(b) Let U, U_n, n ∈ N, be random h-forests. Then,

This result will be established in Section (4.1). Note that we do require polynomials in the previous result. It is open whether we could use just monomials above, at least in the first claim; see Remark (4.3) for more.
So, Laplace transforms of the truncated polynomials are a powerful tool for the analysis of the semigroup (U(h)^⊔, ⊔^h). Further properties showing the importance of Laplace transforms on semigroups are given in Section 5 of [DMZ08].

Infinite Divisibility
The next step is to identify the random forests for which we can represent the h-tops and the path of h-tops via Poisson point processes on U(h)^⊔. Note that this is not a pure semigroup question, for which one has an abstract theory, compare Section (2.9); rather, we have an R_+-indexed collection of nested semigroups related via truncation maps, along which we want to decompose our law. The key concept is therefore the following.
Definition 2.31 (Infinite divisibility). Suppose t > 0. A random um-space U taking values in U which is not identically 0 is called infinitely divisible if for all h > 0 (or t-infinitely divisible if for all h ∈ (0, t]) and n ∈ N we find i.i.d. U

Note that by Theorem (2.27b), (2.30) this is equivalent to saying that for all h > 0 the Laplace functional of the h-treetop factorizes for every n ∈ N:

Given an infinitely divisible U and observing only total population sizes, we should get back to the classical concept of infinite divisibility of non-negative R-valued random variables. Indeed, the following is proved in Section (5).
(a) If U is infinitely divisible (or t-infinitely divisible for a t > 0), the total mass Ū is infinitely divisible in the classical sense.
(b) Conversely, let X be an infinitely divisible random variable taking values in R_+ such that its log-Laplace transform has the form −log E[exp(−sX)] = ∫_{(0,∞)} (1 − e^{−sx}) ν(dx) for s ≥ 0, where ν is the Lévy measure, that is, it is a σ-finite measure on (0, ∞) such that ∫_{(0,∞)} (1 ∧ x) ν(dx) < ∞. Then, for each h > 0, there exists a random element U_h taking values in U(h)^⊔ such that U_h is h-infinitely divisible and Ū_h has the same distribution as X.
Remark 2.33. The choice in (b) is not unique. A particular example is to take a star-tree with diameter 2h with a measure µ, namely [N, 2h, µ]. Here µ arises by taking the atoms of the Poisson point process on (0, ∞) with intensity ν, labeled by N in decreasing size, this size giving the density of µ w.r.t. counting measure. ♣

Note that only (0, ∞)-valued infinitely divisible random variables can occur as the total mass of an infinitely divisible random tree. Recall that their Laplace transform has no other part than the one we use on the r.h.s. of (2.32). For the representation of the U-valued state we will discuss this important aspect in greater detail in Section (2.9).
Remark 2.34. The reader might wonder why we do not define infinite divisibility using that ν^u is a measure on distance matrices generating a sampling sequence, following the approach of Kallenberg's theory of random measures. However, we first note that if U is not a random ultrametric probability space, we cannot define ν^U. This would be necessary to return to the setup of random measures and apply this classical theory. Another possible definition would be to consider for a random U the collection of random measures {ν^{U,m}, m ∈ N} and to require infinite divisibility of the random measure ν^{U,m} defined in (2.22) for every m ∈ N. By equation (2.31) it turns out that the latter is implied by our definition, but it is not very convenient to work with, since we cannot easily define the needed PPP on U from their representation. Altogether this means that the theory of random measures on (R_+)^{\binom{N}{2}} cannot be used.
be the canonical t-concatenation of (U_i)_{i∈{1,...,M}}. We then refer to P as a compound Poisson t-forest with parameters θ and λ, in short, a CPF_t(θ, λ). Note that if P is a CPF_t(θ, λ), then P ∈ U(t)^⊔, that is, every CPF_t(θ, λ) is a random t-forest. By construction a CPF is infinitely divisible, since we can decompose M =^d M_1 + · · · + M_n for (M_i)_{1≤i≤n} i.i.d. and Poiss(θ/n). ♣

This is however not the only possibility; the general case is a Poisson point process on forests, as our main result shows. It generalizes, as in the classical setting of infinite divisibility on R, to limits of CPF's, i.e. allowing in (2.33) a countable concatenation of independent random elements which are not necessarily identically distributed.
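The construction of a compound Poisson t-forest can be mimicked for finite spaces: draw a Poisson(θ) number M of components, sample them independently from a common law, and t-concatenate them. This is only a sketch under stated assumptions: the component law `sample_two_leaf_tree` is a hypothetical stand-in for λ chosen purely for illustration, the Poisson sampler uses Knuth's multiplication method (adequate for moderate θ), and spaces are represented by (distance matrix, masses) of a representative.

```python
import math
import random


def sample_poisson(theta, rng):
    """Poisson(theta) variate via Knuth's multiplication method."""
    limit = math.exp(-theta)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def sample_two_leaf_tree(t, rng):
    """Hypothetical stand-in for the law lambda: two leaves at distance < 2t."""
    d = 2 * t * rng.random()
    masses = [rng.expovariate(1.0), rng.expovariate(1.0)]
    return [[0.0, d], [d, 0.0]], masses


def concatenate(spaces, t):
    """t-concatenation: disjoint union with distance 2t between components."""
    n = sum(len(m) for _, m in spaces)
    dist = [[2.0 * t] * n for _ in range(n)]
    mass, offset = [], 0
    for d, m in spaces:
        for i in range(len(m)):
            for j in range(len(m)):
                dist[offset + i][offset + j] = d[i][j]
        mass.extend(m)
        offset += len(m)
    return dist, mass


def sample_cpf(theta, t, sample_component, rng):
    """One draw of a compound Poisson t-forest with i.i.d. components."""
    m = sample_poisson(theta, rng)
    return concatenate([sample_component(t, rng) for _ in range(m)], t)


rng = random.Random(0)
dist, mass = sample_cpf(3.0, 1.0, sample_two_leaf_tree, rng)
```

Concatenating n independent draws with parameter θ/n gives, in law, a draw with parameter θ, which is the finite-space counterpart of the splitting of M into i.i.d. Poiss(θ/n) summands used above.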
Recall that on a Polish space E, where we have defined bounded sets together with a point infinitely far away, M^#(E) denotes the set of boundedly finite measures on E, which we will consider here for the space E = U(h)^⊔ \ {0} with the point 0 infinitely far away, see the discussion before Proposition (B.4). Recall that for h = ∞ we get E = U \ {0}.
Theorem 2.37 (Lévy-Khintchine representation of U-valued random variables). An infinitely divisible random ultrametric measure space U allows for a Lévy-Khintchine representation of its Laplace functional; more precisely, there exists a unique In either case, We refer to λ_h as the h-Lévy measure and to λ_∞ as the Lévy measure.
An interesting question is to find the class of random variables for which the Lévy measure is in fact concentrated on the trees, i.e. on U(h) \ {0}, rather than on the forests U(h)^⊔ as in the above statement. This will be addressed in part II, where the concept of Markov random trees is introduced for U-valued random variables.
Our goal was to decompose our tree into identically distributed independent pieces such that the collection is decomposed in a consistent way. Indeed, the relation (2.34) says that the treetop U(t) can be seen as a Poisson number of t-forests. Then (2.35) says that U(h) is a concatenation of the same Poisson number of objects, now h-forests, and these h-forests are simply the h-truncations of the t-forests. This is also well understood in the case of the CPF_t(θ, λ), where we have λ_h(dv) = θ ∫ λ(du) 1(u(h) ∈ dv), see Proposition (4.5).
In [DG19b] we determine the Lévy measure in the case of the U-valued Feller diffusion explicitly and give various representations for the ingredients of the decomposition described above.
Definition 2.38 (Treetop canonical measure). For U t-infinitely divisible, the measure λ_t is called the treetop canonical measure and, for h < t, λ_h is the canonical measure at depth h. ♦

Remark 2.39. The equation (2.35) already exhibits aspects of the path of treetop decompositions. However, since we decompose here into forests and not into trees, we will want to refine the analysis and go beyond the information coded in the semigroup related to a fixed h. ♣

The last theorem allows us to give a reformulation of infinite divisibility in the sense of a Poisson cluster representation: If U is t-infinitely divisible, then there exists a Poisson point process on U(t)^⊔ such that the h-truncations of the points form a Poisson point process N_h with Lévy measure λ_h with (2.37).
The role of the Lévy measure is underlined by the following convergence criterion analogous to Theorem 13.14 in [Kal02], which we prove in Section (5.3).

Theorem 2.41 (Convergence and infinite divisibility). Let
There are various classes of infinitely divisible random trees, as is the case for classical R-valued or even Banach-space-valued random variables, which are characterized by different types of properties and which correspond to special forms of the Lévy measure; see Section (2.9) for references. Among these are in our case certain random trees which arise in branching processes. In the next subsections we discuss what the appropriate concepts are in the realm of random trees, i.e. random ultrametric measure spaces.

Genealogies of branching processes and infinite divisibility
We now give a first idea of how we can work with the collection of semigroups of (2.11) to study branching processes on the level of genealogies, and we will show that branching processes always have infinitely divisible marginals.
(b) Now, let (Q t ) t≥0 be a semigroup of probability kernels on U × B(U). We say that the semigroup (Q t ) t≥0 has the branching property if for all h ≥ 0: The Markov process generated by a semigroup and an initial value has the branching property if its semigroup has the branching property.
The Markov process describing the genealogy of the Feller continuum state branching diffusion is such a process, see Section 2.7.
Clearly, the convolution defined above induces a semigroup structure on the probability measures on U(h). As a first application of the convolution we may formulate the following consequence of Theorem (2.41), which is already stated for groups in Theorem IV.4.1 of [Par75]. It is a classical statement that continuous state branching processes have marginal distributions which are infinitely divisible distributions on [0, ∞). We can derive the following result in our context.

Theorem 2.44 (U−valued branching processes have infinitely divisible marginals).
Let h > 0 and π ∈ M 1 (U(h) ) be h-infinitely divisible. Suppose (Q t ) t≥0 is a semigroup which has the branching property. Assume that (U t ) t≥0 is the stochastic process induced by
Another key feature of branching processes is that the processes can be realized jointly for different initial values. This feature is also preserved in the setting of h-forests, as the next proposition shows. Recall the partial order ≤ h from Definition (2.22) and denote the corresponding stochastic order by h .
Proposition 2.45 (Joint realization of branching processes). Let U = (U t ) t≥0 , V = (V t ) t≥0 be branching processes with the same semigroup such that U(h)

Examples of U-valued branching processes
Among R + -valued processes, the continuous state branching processes have infinitely divisible one-dimensional marginals when started in a fixed point. Certainly the most prominent example of a continuous state branching process is the Feller diffusion (X t ) t≥0 , the solution of dX t = √(bX t ) dW t with X 0 = x ∈ [0, ∞), where b > 0 and (W t ) t≥0 is a standard Brownian motion. The marginal distribution of this process is infinitely divisible, and its Lévy measure is given via the density . This diffusion is the many individuals-small mass-rapid branching limit of individual based binary critical branching in continuous time.
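To make the R + -valued Feller diffusion concrete, the following is a minimal simulation sketch using an Euler-Maruyama discretisation of dX t = √(bX t ) dW t . The scheme, the truncation at zero, the step size and the function name `simulate_feller` are assumptions made for illustration and are not part of the paper; the discretisation only approximates the diffusion.

```python
import math
import random

def simulate_feller(x0, b, t_end, n_steps, rng):
    """Euler-Maruyama path of dX = sqrt(b X) dW, with 0 treated as absorbing."""
    dt = t_end / n_steps
    x = x0
    path = [x0]
    for _ in range(n_steps):
        if x > 0.0:
            # Gaussian increment with variance b * x * dt.
            x = x + math.sqrt(b * x * dt) * rng.gauss(0.0, 1.0)
            # Truncation at zero is an assumption of this scheme, not of the SDE.
            x = max(x, 0.0)
        path.append(x)
    return path

rng = random.Random(0)
paths = [simulate_feller(1.0, 1.0, 1.0, 200, rng) for _ in range(2000)]
mean_end = sum(p[-1] for p in paths) / len(paths)
```

Since the Feller diffusion is a martingale, `mean_end` should stay close to the initial value 1.0, up to Monte Carlo and discretisation error.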
We obtain the genealogy as the limit of the Galton-Watson genealogy, both taken as U-valued random variables. For the growing Galton-Watson tree we can explicitly read off the ultrametric (twice the time back to the most recent common ancestor), and we take the counting measure on the leaves as sampling measure. Then one passes to the rapid branching-small mass limit, see [Glö12] for the proof of tightness and convergence. Indeed, we can rigorously define a U-valued diffusion by a martingale problem, which describes the genealogy of a population as an equivalence class of ultrametric measure spaces [DG19b] and is the limit of the U-valued critical Galton-Watson process; see [DG19a] for a survey.
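Reading off the ultrametric from a tree can be sketched as follows. This is a simplified illustration that assumes discrete generations (the text works in continuous time) and a single ancestor in generation 0; the encoding of the tree by parent indices and the name `genealogical_distances` are hypothetical. The distance between two individuals of the last generation is twice the number of generations back to their most recent common ancestor, and the sampling measure is the counting measure on the last generation.

```python
def genealogical_distances(parents):
    """Distance matrix of the last generation: twice the time back to the MRCA.

    parents[g][i] is the index (in generation g) of the parent of individual i
    in generation g + 1; generation 0 consists of a single ancestor.
    """
    depth = len(parents)
    n = len(parents[-1])
    r = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            a, b, g = x, y, depth
            # Walk both ancestral lines back until they meet.
            while a != b:
                a, b = parents[g - 1][a], parents[g - 1][b]
                g -= 1
            r[x][y] = 2 * (depth - g)
    return r

def is_ultrametric(r):
    """Check the strong triangle inequality for all triples."""
    n = len(r)
    return all(r[i][k] <= max(r[i][j], r[j][k])
               for i in range(n) for j in range(n) for k in range(n))

# Hypothetical tree: 1 ancestor, then 2, 3 and 4 individuals.
parents = [[0, 0], [0, 0, 1], [0, 1, 1, 2]]
r = genealogical_distances(parents)
mass = [1] * len(r)  # counting measure on the leaves
```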
We recall the operator for the martingale problem of the U-valued Feller diffusion from [DG19b]. We need the concept of a polynomial to get the domain of the operator. Fix n ∈ N and φ ∈ C 1 b (R ( n 2 ) , R). Then define for an equivalence class of an ultrametric measure space [U, r, µ] the function and the action of the operator denoted Ω ↑ by and Ω ↑ Φ n,φ (0) = 0. The operators on the r.h.s. are given by For a = 0 we have the critical Feller diffusion. Note that the martingale problem for (Ω ↑ , Π(C 1 b )), where Π(A) denotes the polynomials where φ is chosen from A, has a unique solution, see [DG19b] and this solution has the branching property and infinitely divisible marginal distributions if this holds initially. Therefore we have with this process the key example where the results of this paper apply.
This U-valued process is studied in great detail in [DG19b], where its Lévy measures on U (h) are identified based on the present work, and the branching property of the process starting in a fixed element is shown via a new generator criterion in [GRG]. In fact, in [DG19b] we are able to derive an explicit representation of the Lévy measure on U \ {0} using U 1 -valued coalescents and the R-valued Feller diffusion. Nice results can also be obtained for the sub- and supercritical case as well as with added immigration. Further material can be found in the survey article [DG19a].

Generalizations: discrete and marked setting
In this subsection we treat two other situations where genealogies are modeled with special versions or extensions of the space U.
A suitable concept of infinite divisibility remains important for stochastic population models such as the genealogies of individual based Galton-Watson processes, a special case with its own features. More precisely, in order to describe genealogies of stochastically evolving populations it is also important to cover individual based models, where population sizes are natural numbers or multiples thereof.
On the other hand, spatial models such as super random walks, or just continuum state multitype branching, are important models requiring an extension of our approach. If populations are geographically structured, i.e. individuals have a location in a geographic space (some complete separable metric space Ω), then we have to replace U by a more general object; similarly if individuals are of different types from some set K, or if we have both.
However, the idea in the spatial case is similar for populations where individuals carry a type or are at a geographic location. We want to decompose, for h > 0, the population into subpopulations which have a common ancestor at most time h in the past. The new aspect is that each of these subpopulations now consists of individuals which are located in space or carry a type. In the many individuals-small mass limit we describe this attribute of individuals by a measure on the geographic space or the type space (or both), which gives the population size in a geographic set A or in a set of types B, for A, B varying over the measurable subsets of geographic space or type space. This means that our decomposition is as before into balls, but these subpopulations now have additional structure, which however has no impact on whether an individual belongs to a genealogically defined subpopulation or not. Of course this has to be made rigorous.
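For a finite marked population the decomposition just described can be sketched as follows. This is an illustration under assumptions, not the rigorous construction: individuals are indexed 0, ..., n−1, `r` is an ultrametric distance matrix (twice the time back to the most recent common ancestor), `marks` are locations, and subfamilies are taken to be the classes of the relation r(x, y) < 2h. The function name and the data are hypothetical. Each subfamily is returned as a measure on the mark space, and the marks play no role in deciding who belongs to which subfamily.

```python
from collections import defaultdict

def marked_subfamilies(r, mass, marks, h):
    """Split a finite marked ultrametric population into classes of r < 2h.

    Returns one dict per subfamily, mapping each mark to the mass carried
    at that mark. In an ultrametric space r < 2h is an equivalence relation,
    so comparing with one representative per class is enough.
    """
    reps, families = [], []
    for i in range(len(mass)):
        for idx, j in enumerate(reps):
            if r[i][j] < 2.0 * h:
                families[idx][marks[i]] += mass[i]
                break
        else:
            # No existing class is close enough: open a new subfamily.
            reps.append(i)
            fam = defaultdict(float)
            fam[marks[i]] += mass[i]
            families.append(fam)
    return [dict(f) for f in families]

# Hypothetical population of four individuals with marks in Z.
r = [[0, 4, 4, 6], [4, 0, 2, 6], [4, 2, 0, 6], [6, 6, 6, 0]]
mass = [1.0, 1.0, 1.0, 1.0]
marks = [0, 1, 1, 0]
shallow = marked_subfamilies(r, mass, marks, h=1.5)
deep = marked_subfamilies(r, mass, marks, h=2.5)
```

Changing `marks` changes the measures attached to the subfamilies but not the partition into subfamilies.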
Topology of the state space for spatial models. In order to incorporate genealogies in spatial models it is necessary to generalize the concept of ultrametric measure spaces to marked ultrametric measure spaces. How to do this has been developed in [DGP11] and, for infinite total population size, in [GSW16]. We recall the idea.
Consider a Polish mark space with a metric (V, r V ) which is fixed. We shall assume that: (2.45) (V, r) is a topological group, with neutral element 0.
The reader might think here for example of V = Z d . Then let (U × V, r, ν) be the new object where (U, r) is a Polish space with specified metric r and ν a finite Borel measure on B(U × V ).
Then define (U′ × V, r′, ν′) and (U × V, r, ν) as equivalent if there exists a map ϕ̃ from supp µ, where µ(·) = ν(· × V ), to supp µ′ which is an (r, r′)-isometry such that ϕ : U × V → U′ × V satisfies ϕ((u, v)) = (ϕ̃(u), v) and ϕ * (ν) = ν′. The equivalence class of (U × V, r, ν) is denoted [U × V, r, ν]. The set of all such equivalence classes of V-marked ultrametric measure spaces is denoted U V . This set is endowed with the V-marked Gromov weak topology (see [DGP11]), which makes U V a Polish space. Namely, the space is equipped with the marked Gromov-Prokhorov metric, see Section 1.2 in [GSW16], which generates the topology. The reader might think of this as follows. Write ν = ν̄ ν̂ with ν̄ = ν(U × V ). A sequence converges if first the ν̄ n converge and second the space generated by the samples taken i.i.d. with ν̂ n converges as a finite marked metric space for all sample sizes; the latter condition is dropped if ν̄ n → 0. This concept also allows us to consider boundedly finite measures on U V , which in turn allows us to define generalized Poisson point processes on U V , see Section 2.4 in [DVJ03], which we need for the Lévy-Khintchine representation. In that setup we replace U \ {0} used here so far by U V \ {0}, where now the 0 space is defined similarly as before as [
Remark 2.46. Here we have to address the choice of definition for the isomorphy classes of marked ultrametric measure spaces. If we have a population which is concentrated on a closed subset V′ of the space V, the question is whether this is an element of U V which we have to distinguish from the element of U V′ where the support of ν is in U × V′.
Another choice to define the isomorphy ϕ would be to say (2.51) The former choice is the one usually taken, i.e. requiring the equation in (2.49) to hold for all i ∈ supp(µ) and all v ∈ V. Note, however, that the latter choice leads to taking a quotient w.r.t. a certain closed subspace, and we can simply work with the quotient topology. However, if one wants to think about random genealogies, it is not convenient to work with this concept of isomorphy. ♣
We again define a (distance matrix, mark)-measure on (R) ( n 2 ) × V n with its Borel σ-algebra as the push-forward corresponding to the map This measure is denoted ν u,n .
The polynomials now take the form The algebra generated by these monomials is separating, see [GSW16]. Based on these polynomials on U V we define the Laplace transform again via (2.27) and we use the same notation.
In the sequel we will choose as mark space V the geographic space Ω, for example Ω = Z d or R d . In the case of such a mark space (different from the usual multitype situation, where V may be a finite set) it is necessary to also allow infinite measures ν in [U × V, r, ν]. However, one then has to restrict to boundedly finite measures. Namely, suppose Ω can be obtained as Ω n ↑ Ω with Ω n ⊆ Ω and Ω n finite or bounded; take for example Ω = Z d or R d . Then we require that ν | U ×A is finite for every finite (bounded) A. These Ω n are chosen for example as [−n, n] d ∩ Z d in the case of Ω = Z d or, more generally on a Polish space with fixed metric, as balls around a fixed point. Then the equivalence classes are defined by requiring that all restrictions to the sub-populations U × Ω n are equivalent in the sense defined above (2.46). We then still obtain a Polish space U V (for V = Ω) by introducing the Ω-marked Gromov weak # -topology, see [GSW16] for the details. Roughly, we define the topology again via the convergence in (A.4), now using the polynomials for the marked case with g having bounded support.
The semigroup structure of the state spaces. Next we have to introduce the semigroup structure for the discrete and the spatial case.
The discrete h-forests form a sub-semigroup of the h-forests U(h), consisting of those with integer-valued measures (or multiples of those). The binary operation is the h-concatenation from (2.7).
The marked h-forests form a sub-semigroup of the marked um-spaces, consisting of those marked um-spaces with genealogical distance bounded by 2h. For the marked setting we define the binary operation, the concatenation, as follows. Denote by µ, ν the extensions from U resp. W to U W , then set We see in particular that we just lift the operation of concatenation on (U, r) to (U × V, r ⊗ r V ) and the addition of measures from B(U ) to B(U × V ). Note that the measures on a space form a topological semigroup, and so does the genealogical part, as we saw. Therefore the operation on U V inherits much of the structure, and this is easily seen.
(a) Let a > 0. Define the set r a pseudo-ultrametric on U and call U(h, 1) the set of discrete h-forests.
(b) Let V be a Polish space. The set of marked h-forests is defined as Combinations of the two definitions in U V (h, a) are defined analogously. ♦
We then obtain with our setup:
Proposition 2.48 (Semigroup properties).
(a) The adapted Theorem (2.13) holds for the two semigroups above. In particular it is factorial for h > 0 (i.e. we have (2.12)).
Furthermore we have for h ≥ 0: forms a topological semigroup, i.e. h is continuous as function of two variables.

Note that (c) follows from Theorem (2.13), as does the second part of (b), while closedness is straightforward. Therefore we will later, in Section 5.4, essentially have to verify (a).
Remark 2.49. We note two facts. The h-subfamily decomposition means now (as we can deduce having proved uniqueness of the decomposition) that the h-balls in which we decompose now lead to [
Finally, the generalizations involve the set of truncated (marked) polynomials, see Definition 3.7 in [DGP13] and [GSW16] for more detail, which we denote: For the Lévy measures we now have measures on U V \ {0}. Then we can generalize our results. All results above hold once we make the indicated changes in the statements:
We can apply this to the states of the genealogies of the super random walk, the spatial version of the Feller diffusion, which is a well-known measure-valued process (see [Daw93]). More precisely, we talk about the genealogy (modeled as a U V -valued process) of the Markov process X(t) = (x i (t)) i∈Ω , for Ω a countable abelian group, given by the SSDE with (w i (t)) t≥0 i.i.d. standard Brownian motions, a a homogeneous summable transition matrix on Ω × Ω and b > 0. The corresponding U Ω -valued process of genealogies is treated in [DG19b] and [GRG] in great detail. However, once we can construct the latter, we can apply the theorem to the above object.

Discussion: Relation to negative definite functions [DMZ08]
We have here a collection of semigroups indexed by h which are all consistent with respect to the additional operation of truncation (besides concatenation and scalar multiplication); this consistency gives the interesting features presented here as well as those worked out in [GRG]. Nevertheless we can focus on a particular h and see what one can get from abstract theory for that object alone. That means we now relate our setup to the general theory of characters and semigroups, i.e. to harmonic analysis.
The semigroup (K, +) := (U(h) , h ) can be seen as an example of a convex cone as defined in Section 2 of [DMZ08] (here their relation 2.5 does not hold). The multiplication by positive scalars there works as follows: a[U, r, µ] = [U, r, aµ] for a ∈ (0, ∞) and [U, r, µ] ∈ K. The origin coincides with the neutral element 0, and K is a normed cone w.r.t. the Gromov-Prokhorov metric d GPr . We have shown in Theorem (2.27) that the semigroup (K, +) possesses a strictly separating class of homomorphisms.
The Laplace transform is an important tool for the analysis of random elements taking values in K, as presented in Section 5 of [DMZ08]. In particular, as already known from classical results on positive definite functions [BCR84], infinitely divisible random elements allow a representation of the Laplace functional via a Lévy-Khintchine formula. Our formula (2.34), however, has some special features in comparison to general Lévy-Khintchine formulas. We list these features below, after briefly explaining the relation to positive definite functions.
For an infinitely divisible random element in U(h) one can check that the map on the semigroup EΠ h is negative definite, see Section 5.2 of [DMZ08]. Note that EΠ h is a subset of the semigroup homomorphisms from K to [0, 1], denoted byK in that reference. Then Theorem 4.3.19 in [BCR84] establishes the existence of a type of Lévy-Khintchine formula for the map defined in (2.59); compare (6.5) in Section 6.1 of [DMZ08].
What is the relation of our setup and our Lévy-Khintchine formula (2.34) to the one obtained for fixed h?
• In general the Lévy measure will be a measure on the bidual space of K, see Section 7.2 of [DMZ08]. In our Theorem (2.37), we see that it is actually supported on K (more precisely on ι(K), if ι denotes the injection of a space into its bidual). It is not clear to us how to obtain this on abstract grounds in our case.
• There is no quadratic form part. This comes from the fact that EΠ h has no involution except the identity, see Theorem 4.3.20 in [BCR84] or (6.6) in [DMZ08].
• There is no linear term. We have no easy explanation for that; recall that we deal with ultrametric measure spaces here. The result follows from the fact that in (5.22) we can show that π h = 0. Another argument would be to use a result analogous to Lemma 5.8 of [EM16], which states that all non-decreasing continuous functions are constant, and its application in their Section 9.

Outline proof section
The proofs are presented in three sections and an appendix. In Sections (3) and (4) we prepare the ground by first establishing the results on properties of the state space and the semigroup structure, and then the results on the properties of probability laws on these structures. Section (5) contains the proofs of the main results on infinite divisibility. More technical points are collected in the appendix.

Proofs of statespace description and semigroup results
We collect here in five subsections the main technical ingredients for the proofs of our theorems: topological basics concerning our state spaces and key objects of tree description, trunks, and evaluations of polynomials reading off tree tops only (Sections (3.2)-(3.5)), and the algebraic structure of our subfamily decomposition (Sections (3.1), (3.3)). We work here with polynomials rather than the metric structure. In this subsection we follow [EM16] and leave out some of the proofs, since this reference provides detailed proofs of rather similar statements that carry over with minor modifications. Before we start this section we give a quick remark on [EM16].
Remark 3.1. In Definition (2.4) we defined a one-parameter family of semigroups Another binary operation , leading to a very different tree, is defined in [EM16]. For The operation ⊕ on the metric was explained in [EM16]. The operation g coincides with the definition when g = 1 and it is easy to see that this defines a semigroup isomorphic to (M 1 , 1 ). One may also augment the space and consider M ≤g = {x ∈ M :x ≤ g}. Even though we get very different trees from this operation, the algebraic properties are very similar. ♣
For Proof. This result is established as Lemmas 2.1 and 2.2 in [EM16].
Proof. It is elementary to show that the binary continuous operation h on U(h) defines a semigroup with the neutral element 0. Likewise commutativity is obvious.
Recall the partial order from Definition (2.22) and the modulus of mass distribution v δ (·, h) from [GPW09]: Then we can show the following lemma.
Proof. The first claim is obvious.
2h (x)). Likewise for z ∈ V and therefore: (c) is a trivial consequence of (b).
To show (d) see the following: Since A is compact, we know by Proposition (B.2) that for all h > 0, ε > 0 there is a δ(h, ε) > 0 s.t.

For any n ∈ N there are v n ∈ A and w n ∈ v∈A {u : u ≤ v} with u n w n = v n . This allows to deduce sub-sequential limits v n → v ∞ ∈ A (by compactness) and w n → w ∞ ∈ U(h) (by pre-compactness). By continuity of (in Lemma (3.3) (a)) we deduce, u ∞ w ∞ = v ∞ . Thus u ∞ ∈ v∈A {u : u ≤ v}. Similar arguments lead to the second statement. (e) is a consequence of (d)'s second statement.
An element u ∈ U(h) is called irreducible if u ≠ 0 and v ≤ u for v ∈ U(h) implies that v is either 0 or u. We characterize the set of irreducible elements; that it is a measurable subset of U(h) was shown in Lemma (3.2). This is analogous to Proposition 5.1 in [EM16].
Before we give a proof, we note that the semigroup is also sequentially Delphic in the sense of [Dav68], since U(h) is first countable (being a metric space).
(A): The total mass mapping ∆ : converges to a limit v (∈ U(h) by closedness, see Lemma (3.2)). We want to establish that v = 0, which is stronger than what Kendall requires in (C), but also shows, by his Theorem II, that the only infinitely divisible element in the semigroup is 0. In order for v(i) to converge, the sequence needs to be tight in the Gromov-weak topology. However, for any δ > 0 we find an i(δ) with c(i ) < δ for i ≥ i(δ) and thus . In order to satisfy the tightness criterion in Proposition (B.4) we need to have that v̄ i < ε for all large i, and thus v̄ i → 0. This means that the limit satisfies v̄ = 0 and hence v = 0. Thus we have established that U(h) is a Delphic semigroup.
By Theorem III in Kendall's article and with Lemma 6 we know that for any u ∈ U(h) a representation as in (3.9) exists; he calls irreducible elements "indecomposable".
Proof of Theorem 2.30. From Proposition 3.7 above we get part (a) of the theorem. Part (b) follows since the truncation provides an h -decomposition, so that the uniqueness shown in (a) gives the claim.

Properties of truncated monomials
We begin studying truncation and monomials.
Proposition 3.8 (Truncated monomials of concatenation). Let h > 0 and u i ∈ U(h) , i ∈ I for a finite or countable set I. Let u = i∈I u i be the h-concatenation of (u i ) i∈I as in Definition (2.4). Then, for every Φ ∈ Π, This establishes Theorem (2.27) (a) and (b).
Proof. Suppose that Φ = Φ m,φ ∈ Π is a monomial. Let u i = (U i , r i , µ i ) for i ∈ I. Recall that for u = (U, r, µ): µ = i∈I µ i and thus where we used that r(x, y) = 2h whenever x and y are not contained in the same U i , i ∈ I. The other equality follows by Φ h ( u (h)) = Φ h (u).
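The mechanism of this proof can be checked on finite examples. The sketch below is an illustration under assumptions, not the paper's definition verbatim: a finite um-space is encoded as a pair (distance matrix, list of masses), the truncated monomial is taken to be the monomial with φ multiplied by the indicator that all sampled distances are below 2h, and the number of h-subfamilies is counted as the number of classes of the relation r(x, y) < 2h. All function names and data are hypothetical. Because points from different summands are at distance exactly 2h, every contributing sample lies inside a single summand, which gives additivity.

```python
import math
from itertools import combinations, product

def concatenate(u, v, h):
    """h-concatenation of finite h-forests: distance 2h across the summands."""
    (ru, mu), (rv, mv) = u, v
    n, k = len(mu), len(mv)
    r = [[0.0] * (n + k) for _ in range(n + k)]
    for i in range(n + k):
        for j in range(n + k):
            if i < n and j < n:
                r[i][j] = ru[i][j]
            elif i >= n and j >= n:
                r[i][j] = rv[i - n][j - n]
            else:
                r[i][j] = 2.0 * h
    return r, mu + mv

def truncated_monomial(u, m, phi, h):
    """Sum of phi(distances) over m-samples whose distances are all < 2h."""
    r, mass = u
    total = 0.0
    for xs in product(range(len(mass)), repeat=m):
        dists = [r[xs[i]][xs[j]] for i, j in combinations(range(m), 2)]
        if all(d < 2.0 * h for d in dists):
            weight = 1.0
            for x in xs:
                weight *= mass[x]
            total += weight * phi(dists)
    return total

def count_open_balls(u, h):
    """Number of classes of r < 2h, via one representative per class."""
    r, mass = u
    reps = []
    for i in range(len(mass)):
        if all(r[i][j] >= 2.0 * h for j in reps):
            reps.append(i)
    return len(reps)

h = 1.0
u = ([[0.0, 1.0], [1.0, 0.0]], [1.0, 2.0])
v = ([[0.0, 0.5, 1.5], [0.5, 0.0, 1.5], [1.5, 1.5, 0.0]], [0.5, 1.0, 1.0])
phi = lambda dists: math.exp(-sum(dists))
uv = concatenate(u, v, h)
lhs = truncated_monomial(uv, 3, phi, h)
rhs = truncated_monomial(u, 3, phi, h) + truncated_monomial(v, 3, phi, h)
```

In this example `lhs` and `rhs` agree up to rounding, and `count_open_balls(uv, h)` equals the sum of the counts for `u` and `v`.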
In the previous result we did not require Φ ∈ Π h ; it sufficed to have Φ ∈ Π. Recall that # h (u) is defined as the number of prime elements of u (h). Let # h (u) be the (unique) number of open balls of radius h in u, (it is easy to see that one can recover # h (u) from the distance matrix distributions): for u ∈ U, (3.14) Now we can prove Proposition (2.20) which states the additivity and measurability of the map # h : U → N 0 .

Proof of Proposition 2.20. Measurability is clear, since 1(ν m,u ([0, 2h) ( m 2 ) ) > 0) is measurable for all m ∈ N. It is true that for m ∈ N: This establishes the claim, since taking m > # h (u) + # h (v) does not allow one to find a pair (k 1 , k 2 ) such that the right hand side is positive, whereas m ≤ # h (u) + # h (v) allows at least one positive summand on the right hand side choosing k 1 ≤ # h (u) and
The previous lemma directly implies the following result.
Above we saw that truncated monomials are homomorphisms on our topological semigroup. The next two results state that the class of truncated monomials allows one to identify elements and that their initial topology on U(h) coincides with the induced topology from (U, d GPr ).
Proof. The second statement is an easy consequence of the first one. For the first statement necessity is obvious.
Fix m ≥ 1. We will deduce an expression for ν u,m (A) for A ∈ B(R ( m 2 ) ) relying only on values of ν u,m | [0,2h) ( m 2 ) . This derivation can then also be done for ν u′,m , giving the result. It suffices to check the equality for for some a ij , b ij ∈ R. However, since ν u,m and ν u′,m only have positive mass on D m ∩ ([0, 2h) ∪ {2h}) ( m 2 ) , we can further restrict to A of the form in (3.18) with the additional property that: Define a permutation π on {1, . . . , m} depending on A in the following inductive way: let π(1) = 1 and let (i 2 , . . . , i l 1 ) be the ordered collection of indices i ∈ {2, . . . , m} with A 1i ⊂ [0, 2h). Define π(i j ) = j. If there are any indices left (that is, if l 1 < m), then take the smallest one, that is m 1 = inf({1, . . . , m} \ {1, i 2 , . . . , i l 1 }), and let {i l 1 +1 , . . . , i l 2 } be those indices with A m 1 , * ⊂ [0, 2h). Define π(i j ) = j. Continue until no indices are left. The considerations with the permutation π allow us to give a "subtree decomposition" π * A of A of the form in (3.19).
We now define a symbol V as follows. By symmetry of ν, we know that ν u,m (A) = ν u,m (π * A), where π * A = {r : r π −1 (i)π −1 (j) ∈ A}. Thus we may work with the rearranged A now. Then by (3.19), Thus we may work with π * A instead of A. We can w.l.o.g. restrict to A of the form and requiring for p ∈ {1, . . . , k} that Then we have, restricted to D m : (3.26) Now we can use the inclusion-exclusion formula to obtain: (3.27) But this is a formulation where, on the right hand side, for any two chosen points either their distance is less than 2h or there is no restriction at all. This allows us to calculate ν u,m (A) using ν u,n | [0,2h) ( n 2 ) , 1 ≤ n ≤ m only. That is what we had to show.
In the next proposition we show that the class of truncated continuous monomials is also convergence-determining. The metric d GPr is defined in (A.5).
For sufficiency note that Theorem 5 in [GPW09] also tells us that it suffices to show that ν un,m ⇒ ν u,m as n → ∞ for any m ∈ N; convergence here means weak convergence of measures on [0, ∞) ( m 2 ) . Fix m ∈ N. By assumption we know that φ, ν un,m → φ, ν u,m for all φ ∈ C b ([0, ∞) ( m 2 ) , [0, ∞)) which are equal to zero outside of [0, 2h) ( m 2 ) . Therefore, we know that for n → ∞: We need to extend that convergence to [0, ∞) ( m 2 ) . Theorem 2.4 in [Bil09] states that it suffices to show convergence of the measures only for sets of the form (3.18) which have a boundary with ν m,u -measure zero. For such sets we can derive (3.27) for u and also for u n , n ∈ N. The argument in Billingsley's proof, applied to the measures ν 1,u , . . . , ν m,u , allows us to deduce weak convergence.
The following quantitative estimate plays a role.
Lemma 3.12. For s < t, u = [U, r, µ] ∈ U and a monomial Φ = Φ m,φ ∈ Π:
Proof. The result is obtained via direct calculation: Next note that the set in the indicator goes to the empty set as s ↑ t, hence continuity follows.
These results allow us to give a proof of Proposition (2.17).
Proof of Proposition 2.17. Let u n → u in d GPr . Then by Theorem 5 in [GPW09]: (3.35) In the last step we applied Proposition (3.11). The continuity in h follows from Lemma 3.12.
This is equal to zero if q < # h (u) = k by the very definition of # h . Otherwise choose l out of the k trees without replacement and choose the mn 1 , . . . , mn l points x i from these sub-trees respectively. The choice is made without regard to the order, so we also need to take permutations into account, leading us to: by part (a).

Uniqueness of the factorization: Proof of Proposition 3.17
The question to address later is the uniqueness of the factorization in (3.9).
Lemma 3.15. Let X be a set and P a family of functions on X which separates points. Suppose P = ⋃ n i=1 A i for n ∈ N. If none of A 2 , . . . , A n separates two given points in X, then A 1 separates these two points in X.
Proof. It suffices to show the statement for n = 2, since it can be extended by considering Suppose that x i , y i ∈ X, i ∈ {1, 2} with x 2 = y 2 and (3.47) That means A 2 does not separate these two points in X. We claim that A 1 = P \ A 2 needs to separate these points, i.e. x 1 = y 1 . Define A 1 = {φ ∈ P : φ(x 2 ) = φ(y 2 )} ⊂ P \ A 2 = A 1 . Note that A 1 is not empty since otherwise φ(x 2 ) = φ(y 2 ) for any φ ∈ P , which implies x 2 = y 2 since P separates points; this would be a contradiction. Fix φ 1 ∈ A 1 . Then it is true that for any ψ ∈ A 2 : Therefore φ 1 + ψ ∈ A 1 , which is a subset of A 1 . Thus, Therefore ψ(x 1 ) = ψ(y 1 ) for any ψ ∈ A 2 . Altogether φ(x 1 ) = φ(y 1 ) for any φ ∈ A 1 ∪ A 2 = P . Thus x 1 = y 1 . (a) P separates points in U.
Proof. (a): This is a standard argument and we omit it. (b): Since Φ ∈ P we know that Φ(u) ∈ [ū m 1 , 2ū m 2 ]. This allows to give the bounds.
Therefore define the sets of monomials By (3.53) we know that P = A 1 ∪ · · · ∪ A k . Lemma (3.16) (a) says that P is separating and by Lemma (3.15) we know that at least one of the sets A i , say A i * , must be separating two given points. Thus, u 1 = v i * . (b): Suppose u 1 is irreducible and divides w + z. By Lemma (3.7) we know that w = i∈I 1 w i and z = j∈I 2 z j for some irreducible w i , z j and at most countable sets I 1 and I 2 , which we assume to be disjoint. Thus, w + z = i∈I v i for I = I 1 ∪ I 2 and v i = w i 1(i ∈ I 1 ) + z i 1(i ∈ I 2 ). By (a) we know that there is i * ∈ I such that u 1 = v i * . If i * ∈ I 1 , then u 1 |w and if i * ∈ I 2 , then u 1 |z. This is all we needed to show.  We obtain the following easy lemma.
Lemma 3.19. Let u ∈ U and h > 0. Then for Φ ∈ Π: the last equation under the assumption that u = i≤# h (u) u i .
Proof. Recall (3.55) and use the fact that Φ h is an h -homomorphism.
Proposition 3.20 (Countable support of ν 2,u ). For every ultrametric measure space u ∈ U and ε > 0 the measure ν 2,u | [ε,∞) has a countable support, i.e. there exist x n ∈ [ε, ∞) and m n > 0, n ∈ I for a countable index set I such that
Proof. Recall Remark 2.9 and consider τ (ε)(u). By Theorem 2.13 there is a unique prime factorization of τ (ε)(u) into countably many elements: for any element y i ∈ U i and it suffices to only take one x i from each prime element U i . Then we can calculate for A ⊂ [ε, ∞) with a countable sum: (3.60)

Paths of family decompositions
For u ∈ U(h) , we can observe the path of family decompositions u(s) for s ∈ [0, h]. We denote the space of càdlàg paths from I ⊆ R + to E by D(I, E), equipped with the Skorohod J 1 topology. It is convenient to work with measure-valued representations.
The h-family decomposition whose existence and uniqueness is guaranteed by Theorem 2.13 naturally induces a point measure on subfamilies. This measure represents the set of h-subfamilies in an equivalent way. The measure is in general not finite, but it is boundedly finite, that is, it is finite on bounded subsets. We denote the set of boundedly finite measures on a metric space (E, d), by N # (E), and equip it with the weak # -topology, see [DVJ03].
In our case E = U(h) \ {0}, the set of h-trees, is equipped with the metric One can view Θ h also as a map on U via the composition with h-truncation: This map is a bi-measurable bijection (see Lemma 3.18). We have the following result:
Proposition 3.21 (Path of tops measurable). Let t > 0. The mapping is measurable.
This is important when we want to consider the evolution of the family decomposition of a random forest of diameter 2t as h varies, that is, a stochastic process taking values in U(h) .
Proof. Like any càdlàg function, the function (Θ t−s (u)) s∈[0,t) can be approximated in the Skorohod topology via a step function Thus, it suffices to check measurability of the mapping U → N # (U(t) \ {0}), u → Θ r (u) for any r ∈ (0, t). But this is obvious by the following two points. First, the mapping U → U(r) , u → u(r) is continuous by Proposition 2.17 and second, the mapping Θ r : U(r) → N # (U(r) \ {0}) is measurable by Lemma 3.18. We also used that U(r) is a measurable subset of U (t).
Remark 3.23. However, the mapping is not continuous. Consider for example u n = [{a, b, c}, r(a, b) = 2 − n −1 , r(a, c) = 2 + n −1 , δ a + δ b + δ c ] for n ∈ N ∪ {∞}. Then u n → u ∞ in d GPr . However, it is not true that the paths of the tops converge in the Skorohod J 1 topology. In [Gri17] a finer metric on U is defined under which the mapping is continuous.
4). So we only need to show that the only limit point of that sequence is Θ t . Suppose Θ was another limit point for the sequence t n → t. Let Φ = Φ m,φ ∈ Π + :

The first expression vanishes by definition of the limit, so we consider the second expression further using Lemmas 3.19 and 3.12: But [t n , t) ↓ ∅ and thus the right-hand side approaches zero. So the expression on the left-hand side of (3.65) is arbitrarily small. By Proposition (B.6) we obtain that Θ t is the unique limit point of {Θ tn : n ∈ N}.

(b): First, note that (Θ sn (u)) n∈N is tight in M # (U(t) ) by Proposition (B.4). A similar argument as before shows, with the help of Proposition (B.6), that there is only one limit point. Let Φ = Φ m,φ ∈ Π + . For the increasing sequence s n we get for 1 ≤ k ≤ n: Since ν 2,u is a finite measure, we can make the right-hand side arbitrarily small by taking k ∈ N sufficiently large.

(c): This is a consequence of (a) and (b).
We obtain a result about the mass-fragmentation of an ultrametric measure space. Consider the Polish space of decreasing numerical sequences which was defined in [Ber06]: if we assume that u(h) = ⊔ i∈N u i and that the trees u i ∈ U(h) are size-ordered w.r.t. their mass. The topology on S ↓ is given by the ℓ 1 distance.
Corollary 3.24 (Mass fragmentation of the top). The mapping is measurable.
Proof. We already know by Proposition 3.21 that u → (Θ t−s (u)) s∈[0,t) is measurable. By [EK86, Exercise 3.11.13] it suffices to show that N # (U(t) \ {0}) → S ↓ , Σ i δ u i → (ū 1 , ū 2 , . . . ) is continuous. The topology on N # (U(t) \ {0}) is that of boundedly finite convergence; that means that a sequence converges if all restrictions to bounded sets (i.e. sets of the form {u : ū ≥ ε}) converge. The topology on S ↓ is that of ℓ 1 and thus the continuity is obvious.

The non-continuity issue in Remark 3.23 is not resolved for the path of the mass fragmentation in Corollary 3.24: the same counterexample shows that the mapping is not continuous. The question whether this mapping is invertible also has a negative answer. For example, the spaces u i = [{a, b, c, d}, r i , δ a + δ b + δ c + 2δ d ], i = 1, 2, with r 1 (a, b) = 1, r 1 (a, c) = 2, r 1 (a, d) = 3 and r 2 (a, b) = 1, r 2 (a, c) = 3, r 2 (c, d) = 2 (and the necessary extensions to ultrametrics) lead to the same mass-fragmentation process.
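The non-invertibility example can be checked by hand or with a few lines of code. The following is a minimal sketch under stated assumptions: the distances not listed in the text are filled in by the ultrametric inequality (this completion is ours), the families at level s are taken to be the classes of the relation r(x, y) < s, and the helper names `blocks` and `fragmentation` are hypothetical.

```python
def blocks(points, dist, s):
    """Partition points into the classes of the relation r(x, y) < s.

    In an ultrametric space this relation is an equivalence relation,
    so collecting everything close to one representative is enough.
    """
    remaining = list(points)
    out = []
    while remaining:
        x = remaining.pop(0)
        block = [x] + [y for y in remaining if dist[frozenset((x, y))] < s]
        remaining = [y for y in remaining if y not in block]
        out.append(sorted(block))
    return out


def fragmentation(points, dist, mass, s):
    """Size-ordered (decreasing) block masses at level s."""
    return sorted((sum(mass[p] for p in b) for b in blocks(points, dist, s)),
                  reverse=True)


points = ["a", "b", "c", "d"]
mass = {"a": 1, "b": 1, "c": 1, "d": 2}

# Distances from the text, completed to ultrametrics (assumed completion).
r1 = {frozenset(k): v for k, v in {
    ("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 2,
    ("a", "d"): 3, ("b", "d"): 3, ("c", "d"): 3}.items()}
r2 = {frozenset(k): v for k, v in {
    ("a", "b"): 1, ("a", "c"): 3, ("b", "c"): 3,
    ("c", "d"): 2, ("a", "d"): 3, ("b", "d"): 3}.items()}

levels = [0.5, 1, 1.5, 2, 2.5, 3, 3.5]
for s in levels:
    print(s, fragmentation(points, r1, mass, s), fragmentation(points, r2, mass, s))
```

At every level the two mass sequences agree, although between levels 2 and 3 the underlying families differ ({a, b, c}, {d} for r 1 versus {a, b}, {c, d} for r 2), which is exactly why the mass-fragmentation path cannot determine the space.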

Properties of trunks
Proposition 3.25 (Approximation by trunk). Let u ∈ U. Then lim h↓0 u(h) = u in the Gromov-weak topology.

Proof. Let Φ = Φ m,φ ∈ Π. Since φ is bounded and ν m,u is a finite measure, dominated convergence implies (3.77) lim h↓0 Φ(u(h)) = Φ(u). Since this works for every Φ ∈ Π, lim h↓0 u(h) = u in the Gromov-weak topology by Theorem 5 in [GPW09].

Proofs for probability measures on the space U
In this section we study random h-forests and hence make heavy use of the algebraic and order structure of the semigroup of h-forests from the previous section. The central objects are the Laplace transform and the stochastic order.
Then for every polynomial Φ = Φ (m,φ) ∈ A(Π) we have the representation For the first result on general Laplace transforms we may work just with monomials.
Restricted to um-spaces with positive mass π is continuous. Take A 1 ∈ B(R + ) and A 2 ∈ B(U 1 ). Then Hence, for anyΦ ∈ B b (R + ),Φ m,φ ∈Π. Recall thatΠ is separating for M 1 (U 1 ) (see Proposition 2.6. in [GPW09] for this reconstruction theorem) and C b (R + ) is separating for M 1 (R + ). Hence, by [EK86][Prop. 3.4.6], S := {u →Φ(ū)Φ(û) :Φ ∈ C b (R + ) ,Φ ∈Π} is separating for R + × U 1 . Therefore, (4.14) implies: Remark 4.3. One should note the difference to Proposition (4.1): there we only required to know about monomials in the Laplace transform, whereas in the truncated setting we actually need polynomials. Due to truncation we lack information about the joint distribution of the different sub-trees. Let (X 1 , X 2 ), (Y 1 , Y 2 ) be random variables taking values in the cone E = {(x 1 , x 2 ) ∈ R 2 : x 1 ≥ x 2 ≥ 0}. Suppose U = ({a, b}, r(a, b) = 3, X 1 δ a + X 2 δ b ) and V = ({a, b}, r(a, b) = 3, Y 1 δ a + Y 2 δ b ) . Then for h = 1, we observe the contribution of ν m,U to 0 namely (X m 1 + X m 2 ) and of ν m,V being (Y m 1 + Y m 2 ). So requiring We do not know whether this is sufficient to state: (X 1 , X 2 ) d = (Y 1 , Y 2 ). However, it seems possible that there are similar examples where the restriction of Laplace transforms to monomials does not suffice to determine the laws. This question seems to be related to inverse problems of Radon type in the cone E. Even though there are results which state injectivity of restrictions of Radon transforms for compactly supported measures, see [Kri09], we do not think that injectivity in (4.17) holds in general. ♣

Example: compound Poisson forest
We continue by calculating the Laplace transform in some examples. This will allow us to deduce the Laplace transform of the compound Poisson forest from Example (2.36) and to verify the Lévy-Khintchine formula in this example.
Proof. Using Proposition (3.8) and the independence between U 0 and (U i ) i∈N we obtain (4.19)

Proposition 4.5 (Laplace transform of CPF). Let P be a CPF h (θ, λ). Then, for all Φ ∈ Π + ,

Proof. By Proposition (4.4), inserting the generating function of Poiss(θ) gives the claim.
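The last step uses only the probability generating function of the Poisson law: if N ∼ Poiss(θ), then E[z^N] = exp(−θ(1 − z)) for z ∈ [0, 1]. The sketch below compares the truncated series with the closed form numerically. Here z is our shorthand for the value of the Laplace transform of a single tree; the function names are hypothetical and the parameter values are arbitrary.

```python
import math


def poisson_pgf_series(theta, z, terms=200):
    """Truncated series sum_n exp(-theta) * theta**n / n! * z**n."""
    total = 0.0
    term = math.exp(-theta)  # n = 0 term
    for n in range(terms):
        total += term
        term *= theta * z / (n + 1)  # ratio of consecutive terms
    return total


def poisson_pgf_closed(theta, z):
    """Closed form exp(-theta * (1 - z)) of the generating function."""
    return math.exp(-theta * (1.0 - z))


theta, z = 2.5, 0.3
print(poisson_pgf_series(theta, z), poisson_pgf_closed(theta, z))
```

Replacing z by the Laplace transform of one tree then gives an exponential of an integral of 1 − exp(−Φ), which is the Lévy-Khintchine shape for the compound Poisson forest.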

Stochastic order and applications
Recall that in h-forests the notion of sub-trees induces a partial order, see Definition (2.22). This partial order induces a stochastic partial order for random h-forests. For a general treatment of stochastic orders we refer to [KKO77] and [SD83]. We denote by bmB(U) the bounded, measurable and monotone functions on U.
Definition 4.6. Let h > 0. Suppose U and V are random variables taking values in U(h). Then we say that U ⪯ h V if E[f(U)] ≤ E[f(V)] for all f ∈ bmB(U(h)). ♦

Our first remark is implied by classical results in [KKO77].
Remark 4.7. Suppose U n is an h-subtree of V n for each n and U n =⇒ U, V n =⇒ V. Then there are U' d= U, V' d= V on a common probability space such that U' can be embedded as a subtree into V'. ♣

Proposition 4.8 (Tightness via domination). If {U n : n ∈ N} ⊆ U is tight and V n ⪯ h U n for all n ∈ N for some h > 0, then {V n : n ∈ N} is tight.
Proof of Proposition 4.8. By Theorem 1 in [KKO77] we may assume that, for each n, U n and V n are defined on the same probability space and V n ≤ U n almost surely. We denote by 2) (compare also [GPW09][Proposition 8.1]) we have to show that the total masses are tight and that for each ε > 0 there exists δ > 0 and C > 0 such that But obviously w Vn ([C, ∞)) ≤ w Un ([C, ∞)) almost surely and also by Lemma (3.5) v δ (V n ) ≤ v δ (U n ). By [GPW09][Proposition 8.1], (4.22) holds for {U n } and hence also for {V n }. As for the total mass, obviouslyV n ≤Ū n almost surely and hence tightness follows from the tightness of {Ū n }.
Example 4.9. Note that the partial order on U(h) induces a partial order on the path space D([0, ∞), U(h) ) in the following way: Hence we also obtain a partial stochastic order on M 1 (D([0, ∞), U(h) )). Hence by the same reasoning as before if U = (U t ) t≥0 , V = (V t ) t≥0 , U n = (U n t ) t≥0 , V n = (V n t ) t≥0 , n ∈ N are stochastic processes taking values in U(h) and U n V n for all n. By Strassen's result Lemma 13 in [Str65], U V and we can find a probability space (Ω, A, P) such that P{U t ≤ V t ∀t ≥ 0} = 1. That is we have an almost surely path-wise embedding of the dominated process into the dominating process. A typical example is a branching process starting in the zero tree but with different masses, as we shall see in the next section. ♣

Proof of Theorem 2.41
We now prove Theorem (2.41), which says that weak limits of infinitely divisible random trees are again infinitely divisible and that, moreover, the Lévy measures converge.
Proof of Theorem 2.41. By assumption for each m and n there is U Also, Note that, by Skorohod embedding, we can choose U m : m ∈ N}. So let (m k ) be a sub-sequence and assume that (m k l ) is a further sub-sequence which converges to some U (n) so that Taking into account (4.24) and (4.25),

This shows that all convergent sub-sequences of sub-sequences have the same limit, which implies convergence of the sequence itself, that is, for each n, It also shows that U is infinitely divisible and λ (m) h =⇒ λ h as m → ∞ as boundedly finite measures, see Corollary (B.5).

Proof of infinite divisibility and related results
This section proves the Lévy-Khintchine formula in the first subsection and the further claims from Subsection (2.5) related to it.

Proof of total mass results
We first prove Proposition (2.32). We use for (b) ⇒ (a) facts which are proved further below.
Proof of Proposition 2.32.
(a) This is obvious if we consider the polynomials Φ(u) = λū for λ > 0. Then (2.31) tells us that the total mass Ū can be written as the sum of n i.i.d. copies of Ū (h,n) .
(b) Recall that for a measure ν we denote by ν̄ the total mass and by ν̂ the normalized measure. By [Kle07][Satz 16.5] there are ν (n) ∈ M f (R + ) such that, if we denote θ (n) := ν (n) (R + ), Recall the definition of the compound Poisson forest from Example (2.36) and let with a ⊗ e being the δ-measure on the element of U arising by multiplying the mass of e ∈ U 1 with a > 0: Then Ū (n) ⇒ X. Moreover, ν (n) ⇒ ν as boundedly finite measures, with ν the Lévy measure of X: We prove below that (U (n) ) n∈N is tight. Let V be a weak limit point of U (n) . Since U (n) is t-infinitely divisible for any n ∈ N, we know that V is t-infinitely divisible by Theorem (2.41) and the Lévy measures of the U (n) converge. Thus,

Finally we prove that (U (n) ) n∈N is tight. By Proposition (B.4) we need to verify equations (B.4) and (B.5). Let ε > 0. For C = 2h + 1 we have ν 2,U (n) ([2h + 1, ∞)) = 0. The existence of an M in (B.5) follows using that Ū (n) is tight (see (5.2)). So we are left with the modulus of continuity, i.e. (B.4). We calculate: (5.6) And since (1∧x)π(dx) is a finite measure on (0, ∞), we may choose δ so small that the last expression is less than the given ε. So we have verified (B.4) and we know that (U (n) ) n∈N is tight.

The Lévy-Khintchine formula (Proof of Theorem 2.37)
We now have to prove the Lévy-Khintchine formula. In this section we will denote the law of the random tree U by P ∈ M 1 (U) and that of its n-th root at depth h > 0 i.e. U (1,n) h by P h n ∈ M 1 (U(h) ) and hence: We consider nP h n , a sequence of boundedly finite measures and prove it converges to a limit, the excursion law. With this fact we conclude later the proof in (5.49)-(5.51) easily. Hence the key to the proof is to show tightness of {nP h n , n ∈ N} and the uniqueness of the limit points.
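Why nP h n is the natural object can be seen on the level of total masses. If ψ(λ) = −log E[exp(−λŪ)], the n-th root has Laplace transform exp(−ψ(λ)/n), so integrating 1 − exp(−λū) against nP h n gives n(1 − exp(−ψ(λ)/n)), which increases to ψ(λ). The sketch below illustrates this for a Gamma-distributed total mass; the Gamma law and the function names are illustrative assumptions, not taken from the text.

```python
import math


def psi_gamma(lam, shape=2.0, rate=1.0):
    """Laplace exponent -log E[exp(-lam * X)] for X ~ Gamma(shape, rate)."""
    return shape * math.log1p(lam / rate)


def scaled_root_functional(lam, n, psi=psi_gamma):
    """n * (1 - E[exp(-lam * X_n)]), X_n being the n-th root of X.

    The n-th root has Laplace exponent psi / n; expm1 keeps the
    difference 1 - exp(-psi/n) accurate for large n.
    """
    return -n * math.expm1(-psi(lam) / n)


lam = 1.0
for n in (1, 10, 100, 10**4, 10**6):
    print(n, scaled_root_functional(lam, n), psi_gamma(lam))
```

The printed values approach ψ(λ) from below, mirroring on the real line the convergence of nP h n to the excursion law.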
It will be necessary to use tightness criteria for sequences of measures on U or U \ {0} and to define what we mean by weak convergence of boundedly finite measures; we refer the reader to Section (B). The strategy is to show first the tightness of measures on U(h) \ {0} (in the Gromov weak # −topology, see Appendix (B)) and then the uniqueness of a limit point in two steps.
Step 1 (Tightness) We want to establish the existence of an excursion of the process arising from following the path of h-truncated states. A first step is the tightness of the marginal distributions. First, we verify the claim that lim sup n→∞ ∫ nP h n (du) (ū ∧ 1) < ∞: We know that Ū is a non-negative infinitely divisible random variable by Proposition (2.32). Therefore there exist c 1 ≥ 0 and a measure ν on (0, ∞) with ∫ 0 ∞ (1 ∧ h) ν(dh) < ∞, s.t. (see [Daw93], Theorem 3.3.1).
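For orientation, the classical Lévy-Khintchine representation of a non-negative infinitely divisible random variable, written with the drift c 1 and the Lévy measure ν from above, reads as follows. This is the standard textbook form and is given here as an assumption about the display intended at this point, not as a quotation.

```latex
-\log \mathbb{E}\bigl[e^{-\lambda \bar{U}}\bigr]
  \;=\; c_1 \lambda \;+\; \int_0^\infty \bigl(1 - e^{-\lambda x}\bigr)\,\nu(dx),
  \qquad \lambda \ge 0,
\qquad\text{where}\qquad
\int_0^\infty (1 \wedge x)\,\nu(dx) < \infty .
```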
Since the law P (du) ∈ M 1 (U) is tight, for any ε we can find a δ = δ(ε) such that by Proposition (B.4):

Step 2 (Uniqueness) Now we need to show that there is only one limit point of the sequence (nP h n 1(· ≠ 0)) n∈N .
The final proof is then done with the next Lemma (5.3) below. Since the excursion measure is a measure on truncated trees we need first some preparation to get its uniqueness, recall here Theorem (2.30).
For the uniqueness problem, we need to show that for any polynomial Φ ∈ A(Π + ) the Laplace transforms coincide. For m = (m 1 , . . . , m l ) ∈ N l (repetitions allowed) let and D m = D m 1 +...+m l ∩ [0, 2h) ( m 2 ) . To evaluate a function φ : D m (h) → R we need a vector of measures. We write a vector ν of measures in the following form: (5.16) ν = (ν 1 , . . . , ν l ) : where l ∈ N is fixed and X 1 , . . . , X l are Polish and ν i ∈ M f (X i ), 1 ≤ i ≤ l. We call the set of such objects M(X 1 , . . . , X l ) = M(X 1 ) × · · · M(X l ). If we exclude the case that the measure attains the value zero in any coordinate we write Define ν m,u = (ν m 1 ,u , . . . , ν m l ,u ), u ∈ U and use the following notation for a polynomial Even though the above looks close to the desired result we have to realize that this does not mean we have this on the level of elements in U yet.
Proof. Let U be the realization of a random variable with law P (·) and U (i,n) h , 1 ≤ i ≤ n, i.i.d. copies of random elements in U(h) with law P h n (·). Then by (5.7): That means for any m ∈ N: Thus the measure vector ν m,U h restricted to [0, 2h) ( m 2 ) is infinitely divisible and by the extension to vector measures of Proposition 6.1 in [Kal83], see Section 3.1. in [GR91], there We have to show that the linear term vanishes.
In Proposition 6.1 of [Kal83] it is also stated how to calculate π m h using the law P h n . Since we want to show that π m h = 0 we directly use φ k (r) = 1(r ij < h), 1 ≤ k ≤ l. If u ∈ U we write u(h) = ⊔ i∈I u i for u i ∈ U(h) and a countable index set I. First note that the following inequality holds for δ ∈ (0, 1) and m ∈ N: Suppose m 1 = min m k and m 2 = max m k . Then we can calculate, starting from Kallenberg's formula (given next), as follows: As we have seen in the proof of Lemma (5.1) in (5.14), for any ε > 0 we can choose δ so small that (5.24) is less than ε uniformly in n ∈ N. Therefore π m h = 0. Then use (5.22) to get: This shows the two parts of the statement.
Next comes the existence of the excursion measure.
If we want to identify the cases in which the measure λ h is finite, the following observation is helpful:

Lemma 5.4 (Total weight Lévy measure). Let P ∈ M 1 (U) and assume (5.7). Then which is finite iff P (u = 0) > 0.

To get the second equality set now m = 1 and φ ≡ a to get:
Letting a → ∞ we get the claim.
Then for 0 < h' < h:

Proof of Lemma 5.5.
But we can rewrite the left hand side in this equation to get . Use Corollary (B.6) to deduce the claim.
For the special case t = ∞ we need to establish the existence of λ ∞ and we will do that in the next lemma.
Proof. We need to verify that the sequence is tight and that the set of limit points contains only a single object. Let us first do the uniqueness and assume there are λ ∞ , λ' ∞ ∈ M # (U \ {0}) with two sequences h n ↑ ∞ and h' n ↑ ∞ as n → ∞: Then for H > 0 and n sufficiently large with Lemma (5.5), But this means that both measures coincide on U(H) for any H > 0, and that suffices since then expectations of all polynomials Φ m,φ (which by definition have compactly supported φ) coincide.

It remains to show the tightness of the sequence (λ h ) h>1 using Proposition (B.4); 1 was chosen arbitrarily. First, v δ (u, h' ) = v δ (u(1), h' ) gives with Lemma (5.5) that for h' < 2 and h > 1: (5.42) The measure λ 1 as a single measure is tight and therefore, for any ε > 0, we can choose δ such that the last quantity is bounded by ε. Since ū = ū(1) we can show (B.5) and the only thing left to show is the first part of (B.4). Note that by (2.37), Lemma (3.5) (a) for h > 0: Hence the exponential transforms satisfy: (5.45) Using this for h = N allows us to derive the following inequalities: (5.49) The next to last inequality holds since L[U] is tight, allowing us to choose N sufficiently large.
Finally, we can give a proof of the Lévy-Khintchine representation.
Proof of Theorem 2.37. We have shown the main things already, we just put them together again. Let Φ = Φ m,φ , see (5.18) be a polynomial. Then,

Proofs of related results
Proof of Corollary 2.40. Denote by V the right hand side of (2.37). We calculate the truncated Laplace transform of V (h). Use first Proposition (3.8) and then that N λ h is a Poisson process: (5.55) But this equals the Laplace transform of U (h). Since Theorem (2.30) tells us that Laplace transforms uniquely determine the law restricted to U(h) , we can conclude that If u ∈ U is fixed and µ ∈ M 1 (R + ), then if X ∼ µ we write µ ⊗ u for the law of the random measure arising as X· sampling measure of u, which means the measure on U induced by taking the element u and multiplying its mass with the random variable X.
Proof. Assume that CP F ((θ (n) , ν (n) ⊗ e)) = [U n , r n , µ n ]. It is obvious that the total masses of the collection of CPF's is tight. Hence by the tightness criterion in Theorem 3 in [GPW09], we only have to show that (n) and N (n) ∼ Poiss(θ (n) ) and ⊥ ⊥ i X (n) i ⊥ ⊥ N for each n. Denote the density of Poiss(θ (n) ) by (p We are now in a position to prove Theorem (2.44).
Proof of Theorem 2.44. Let n ∈ N. Suppose that V ∼ π and V n takes values in U(h) such that L V (Φ) = (L Vn (Φ)) n for all Φ ∈ A(Π h ). We know that the kernel Q t has the branching property, i.e. fulfills (2.39). By Theorem (2.30) this is equivalent to a relation for Laplace transforms. Using this statement we obtain for Φ ∈ A(Π t+h ): if we set U n as the um-space in U(t+h) with distribution given by P(V n ∈ dv 1 ) Q t (v 1 , d·).
Proof of Proposition 2.45. By assumption there is w ∈ U(h) such that v = u ⊔ h w. Denote the semigroup by Q t . Then for all f ∈ bmB(U) (bounded and B-measurable), (5.80) This shows that U t ⪯ h V t . Now, the claim follows from Theorem 1 in [KKO77].

Proof of extensions
We have already seen in Section 2.8 that the basic concepts used in this paper, concatenation and truncation, carry over to spatial models. We now need to verify that these operations have the same algebraic and topological structure. First we have to prove that we obtain topological semigroups. The second point is to prove the Lévy-Khintchine formula; for that we have to show that all properties needed to obtain the Lévy-Khintchine representation hold. We proceed in two subsections.

Proof of Proposition 2.48
To treat marked metric measure spaces, we first consider the case where µ is a finite measure and then extend to the general case by an argument based on the finite one.
In the finite measure case, note first that the h-truncation and the concepts of h-marked forests and trees refer to the genealogical part only and not to the marks; we have only replaced (U, r) by (U × V, r ⊗ r V ) and µ ∈ M(U, B(U)) by ν ∈ M(U × V, B(U × V )) in the definitions. Furthermore, concatenation involves the marks only via the fact that now two measures on U × V rather than on U are in focus, where however V is a fixed object for all elements of our state space U V . Using that measures on a fixed measure space are a topological group and that the projections on U are topological groups, one verifies in a straightforward way that we again have a topological semigroup. Therefore the strategy for the U-valued case can be used to get the marked case of the proposition, where essentially (a) has to be proven. Namely, we have to return to Section 3.1, where Theorem 2.13, the non-spatial version of our present claim, is proved, and see by inspection that we can repeat these arguments with the given observations for the lifted objects.
We need here only one extra information, namely a marked compactness criterion in proving the conditions for a Delphic semigroup. The compactness in the marked case requires in addition that the measures of the subset of U in question projected on the marks are tight, see Theorem 3 in [DGP11]. Here we talk about measures being all bounded by one given finite measure. Hence tightness is immediate.
This means that all arguments carry over if we modify the statements on the concatenation as indicated in Section 2.8.
Remark 5.8. Once we have the proof of Theorem 2.13, we can see that the decomposition is also obtained via lifting. Namely, apply the kernel κ to the µ i arising in the genealogical decomposition of [U, r, µ] obtained on U, where κ is the transition probability from U to V induced by ν on U × V, by µ i = µ| U i ⊗ κ. Once we have proved the marked version of our statement, we know that we have exactly this representation of the decomposition. ♣

In the case where µ is not finite on U × V , we have assumed that µ is boundedly finite, i.e. finite on sets U × A with A a bounded mark set. In particular, we have the following approximation by elements of U with finite µ. We consider the restriction of the state U to U × V n , denoted U n , with V n ↑ V and V n finite resp. bounded. Then the restricted random states U n fit our theory as explained in the previous paragraph.
Since the object in U V can be identified with convergent sequences of the elements in U Vn namely the restriction of the set U × V to elements (u, v) with u ∈ U and v ∈ V n , the result carries then over using as well the definition of the convergence (convergence of polynomials).

Proof of Theorem 2.51
We first have to argue for the propositions preparing the Lévy-Khintchine representation.
We have here the Propositions 2.30, 2.37, 2.41 which collect the properties of truncated polynomials and the corresponding properties of the Laplace-transform. Note we consider here polynomials based on ϕ · χ where ϕ depends on the distances and χ on the marks. Due to this product structure, we can use the results we have on the non-spatial case to lift them for χ ≡ const to the marked case and similarly for ϕ ≡ const, the well known statements for measure-valued states in M(V, B(V )) can be lifted to our situation. Hence we have to argue that the extension can be done for the joint distribution of marks and distances.
Here we note that the h-truncation we consider here is affecting only the distances and not the marks, which are locations or types. Therefore transferring the propositions to the marked case is straight forward and suppressed here.
Remark 5.9. Occasionally it is useful to use marks which depend explicitly on the ultrametric structure, for example marking individuals by their ancestral paths, which depend explicitly on genealogical information. This interesting but complicated situation is not treated here, as it would require a different form of concatenation and truncation. It will be taken up in [GRG]. ♣

Now we turn to the Lévy-Khintchine representation. We start with elements from U V for V a bounded set, so that the measures ν are finite.
We can now consider the projection of U V onto U by [U × V, r ⊗ r V , µ ⊗ κ] → [U, r, µ]. For the image we obtain from our results the Lévy-Khintchine representation via a measure λ h resp. λ ∞ on U(h) \ {0} resp. U \ {0}. We now have to lift this representation to U V .
What is the additional structure we have to deal with? We have to bring into play the infinite divisibility of the sampling measure on U × V . Clearly the projection onto V leads to a random measure which is infinitely divisible for all h-truncations (the projection is the same for all h) and has a Lévy-Khintchine representation by the classical theory of random measures, see [Kal83], but we have to obtain the joint distribution of marks and genealogy. However, we can follow the steps of our proof on U very closely here as well.
Return to the proof in Section 5.2 and go through the argument. We just replace polynomials and Laplace transforms on U by the ones on U V , and the measure nP h n , from which we obtained λ h as a limit for n → ∞ before, is now a measure in M # (U V \ {0}). The tightness statements now require, as we saw above, an additional condition on the mark component: the projections of the measures on the marks need to be tight. However, for finite measures we have the same structure of addition and order, so the same arguments work here as in Section 3.1. Note also that for random measures this representation is well known, see [Kal83]. Since we consider here as mark spaces finite or bounded sets which are fixed, and the measures are finite, this condition is satisfied. Then there is no obstacle to repeating the proof step by step, replacing U by U × V and U \ {0} by U V \ {0}. We leave further details to the reader.
Having the Lévy-Khintchine representation for this case, we can continue with the general case. By construction of U V , in the case of infinite sampling measures a state is nothing but the sequence of all its restrictions to a sequence of bounded mark spaces exhausting the full space; in fact these restrictions converge to the state U. This allows us to handle the remaining claim, the Lévy-Khintchine formula.
The restriction is defined by mapping The image can be mapped 1 − 1 and isometric and measure preserving onto where r Vn is defined as restriction of r V to V n × V n and ν n is the image measure of ν| Vn . For the latter we can apply the previous results on the marked case with bounded mark sets and finite sampling measures. Namely by the very definition of the topology, the U n approximate the U, see Section 1.2. in [GSW16]. Therefore consider the corresponding Lévy -measures λ n h , more precisely corresponding to the populations U n in the finite measure spaces to corresponding U × V n . They give elements of U Vn . They can be extended to elements in U V (we embed for that the n-population from U n × V n in U n × V ). We have to show that these elements converge as n → ∞ to a limit measure λ

This convergence takes place since (λ n h ) n∈N form a projective family on U V . Consequently both sides of the Lévy-Khintchine representations for given n converge to a limit, which gives the Lévy-Khintchine representation of the U V -valued random variable via the limit measure λ ∞ h .
..., April 9, 2019, 3:26, INFDIV-lastrevision˙5.tex

Appendices

A Ultrametric measure spaces
Our random variables take values in the space of metric measure spaces, see [GPW09], and later also in marked metric measure spaces, first introduced in [DGP11] based on [GPW09] and generalized in [GSW16]. We briefly review the definitions and topological facts used.
Definition A.1 (Topology of state space). (i) We call (X, r, µ) a metric measure space (mm space) if (a) (X, r) is a complete separable metric space, (b) µ is a finite measure on the Borel subsets of X.
(ii) We define an equivalence relation on the collection of mm spaces as follows: two mm spaces (X, r X , µ X ) and (Y, r Y , µ Y ) are equivalent if and only if there exists a measurable map ϕ : X → Y such that (A.1) r X (x 1 , x 2 ) = r Y (ϕ(x 1 ), ϕ(x 2 )) ∀x 1 , x 2 ∈ supp(µ X ) and µ Y = µ X ∘ ϕ −1 , i.e. ϕ restricted to supp(µ X ) is an isometry onto its image and ϕ is measure preserving.
We denote the equivalence class of an mm space (X, r X , µ) by [X, r X , µ], if it does not create confusion we refer to [X, r X , µ] as an mm space too.
(iii) We denote the collection of (equivalence classes of) mm spaces by M. We use Gothic type letters x, y, . . . to denote generic elements of M. ♦

If (X, r, µ) is an mm space, we interpret X as the set of individuals, r as genealogical distance and, after normalization, µ̂ := µ̄ −1 µ as the sampling measure.
We next define a topology on M. The main idea is to extend the Gromov-weak topology from [GPW09] to the setting of finite measures on X, see Section 2.4. in [Glö12] for details. where d Z Pr is the Prokhorov distance of finite measures on the metric space Z. The metric space (M, d GPr ) is complete and separable. This can be shown as in Section 5 of [GPW09] and several variants of the topology can be found in [ALW16], [Glö12]. ♣ We are especially interested in the set of ultrametric measure spaces.
Definition A.4. We say that the metric measure space (U, r, µ) is ultrametric if r(u, w) ≤ r(u, v) ∨ r(v, w) for all u, v, w ∈ U except on a set of µ-measure 0. ♦

Definition (2.1) can now be rephrased as: (A.6) U = {u ∈ M : u is ultrametric} .
The set U is a closed subset of M and therefore U is a Polish space, see Lemma 2.3 in [GPW13].
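For a finite space in which every point carries positive mass, the condition in Definition A.4 reduces to checking the strong triangle inequality on all triples. The following is a minimal sketch of such a check; the function name `is_ultrametric` and the dictionary encoding of r are our own choices. It uses the three-point space of Remark 3.23 with n = 4, where r(b, c) = 2 + 1/n is the value forced by ultrametricity.

```python
from itertools import permutations


def is_ultrametric(points, r):
    """Check r(u, w) <= max(r(u, v), r(v, w)) for all ordered triples."""
    def d(x, y):
        return 0.0 if x == y else r[frozenset((x, y))]
    return all(d(u, w) <= max(d(u, v), d(v, w))
               for u, v, w in permutations(points, 3))


points = ["a", "b", "c"]
n = 4
# Space u_n from Remark 3.23; r(b, c) is the ultrametric completion.
r_good = {frozenset(("a", "b")): 2 - 1 / n,
          frozenset(("a", "c")): 2 + 1 / n,
          frozenset(("b", "c")): 2 + 1 / n}
# Changing r(b, c) to 2 breaks the strong triangle inequality.
r_bad = dict(r_good)
r_bad[frozenset(("b", "c"))] = 2.0

print(is_ultrametric(points, r_good), is_ultrametric(points, r_bad))
```

In the second space r(a, c) = 2.25 exceeds max(r(a, b), r(b, c)) = 2, so it is a metric space but not an ultrametric one.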

B.1 Tightness
As in [GPW09] we want to establish a criterion for relatively compact subsets of U and afterwards a suitable tightness criterion for finite measures on U and boundedly finite measures on U \ {0}. The main difference to their article is that we are working with finite sampling measures instead of probability measures. Since the notation is consistent with that of [GPW09] if we only consider probability measures on the metric spaces, we chose not to change notation. However, we extend the notation a bit by allowing an additional parameter h in the next definition. Using the same steps as in [GPW09], it is not very difficult to establish the following analogue of Proposition 7.1 of that article in the non-normalized setting, see also Remark 2.5 in [DGP11].

Remark B.3. If we need to give a compact subset A of U(h) for a fixed h > 0, it suffices to verify that for any h' ∈ (0, h), ε > 0 we can find a δ(h', ε) > 0 s.t. sup u∈A v δ (u, h') < ε and that sup u∈A ū < ∞. The condition on the diameter is obsolete since the diameter is bounded by 2h.

Proof. Let S n = {u ∈ U : ū ≥ n −1 }. "⇒": By [DVJ03, Proposition A2.6.IV] we know that {m| Sn : m ∈ A} is tight as a family of finite measures on S n . First, sup m∈A m(S n ) < ∞ for any n ∈ N, which directly implies (B.5). Additionally, (B.4) follows from a similar argument as in [GPW09] and Proposition (B.2).

"⇐": Let ε > 0. For any n ∈ N choose δ n , C n and M < ∞ such that {u : ν 2,u ([C, ∞)) + µ u (x : µ u (B 2h (x)) < δ n ) ≤ ε2 −n } which is a pre-compact set by Proposition (B.2), since ν 2,u is uniformly bounded by M 2 on this set, for any ε' > 0 one can find an n ∈ N s.t. ε2 −n < ε'. Choosing C' = C n shows that ν 2,u ([C', ∞)) < ε' uniformly. For the same n ∈ N take δ' = δ n to obtain uniformly And therefore A is relatively compact by [DVJ03, Proposition A2.6.IV].

B.2 Separation of measures
Boundedly-finite measures in ultrametric spaces appear in the Lévy-Khintchine formula.
In that context the next proposition gives a helpful statement.