Regenerative tree growth: structural results and convergence

We introduce regenerative tree growth processes as consistent families of random trees with n labelled leaves, n ≥ 1, with a regenerative property at branch points. This framework includes growth processes for exchangeably labelled Markov branching trees, as well as non-exchangeable models such as the alpha-theta model, the alpha-gamma model and all restricted exchangeable models previously studied. Our main structural result is a representation of the growth rule by a sigma-finite dislocation measure kappa on the set of partitions of the natural numbers, extending Bertoin's notion of exchangeable dislocation measures from the setting of homogeneous fragmentations. We use this representation to establish necessary and sufficient conditions on the growth rule under which we can apply results by Haas and Miermont for unlabelled and not necessarily consistent trees, establishing self-similar random trees and self-similar residual mass processes as scaling limits. While previous studies exploited some form of exchangeability, our scaling limit results here only require a regularity condition on the convergence of asymptotic frequencies under kappa, in addition to a regular variation condition.


Introduction to regenerative tree growth processes
For each n ≥ 1, denote by T_n the set of rooted leaf-labelled combinatorial trees with no degree-2 vertices and n + 1 degree-1 vertices, one of which is called the root, the others leaves. We distinguish the leaves by labels 1, . . . , n. Vertices of degree 3 or higher are called branch points. Consider a family T_n, n ≥ 1, of random trees in T_n. For n ≥ 2, we refer to the vertex adjacent to the root as the first branch point. It induces the first split, a random partition Π_n = (Π_{n,1}, . . . , Π_{n,K_n}) of the label set [n] := {1, . . . , n} into the label sets of the subtrees above the branch point, the connected components of the tree with the first branch point removed. Here, we put the blocks Π_{n,i} of Π_n in the order of their least elements. For illustration, we order subtrees by their least labels to uniquely choose plane tree representatives.
We suppose that the family (T_n, n ≥ 1) is consistent in the sense that removal of leaf n + 1 (and the resulting degree-2 vertex, if any) from T_{n+1} yields T_n. Reversing this removal gives a tree growth step from n to n + 1. A consistent family (T_n, n ≥ 1) is called a tree growth process. For B ⊆ [n], let T_B be the set of trees with #B leaves labelled by B, so that T_{[n]} = T_n. Let T_{n,B} ∈ T_B be the reduced subtree of T_n spanned by the root of T_n and the leaves in B, and let T̃_{n,B} ∈ T_{[#B]} be the image of T_{n,B} after relabelling of leaves by the increasing bijection from B to [#B].

Figure 1: Illustration of the regenerative tree growth step

Definition 1 We call a tree growth process (T_n, n ≥ 1) regenerative if for each n ≥ 2, conditionally given that the first split of T_n is Π_n = (B_1, . . . , B_k), the relabelled subtrees T̃_{n,B_i}, 1 ≤ i ≤ k, above the first branch point are independent copies of T_{#B_i}.
While this property is well-known for many tree growth processes, the goals of the present paper are to provide general structural results and to study implications for continuum tree asymptotics in this general framework. In the terminology of [6], the trees in a regenerative tree growth process as defined here are "consistent labelled Markov branching trees". The exchangeable case, where the distribution of T n is invariant under all permutations of labels, was initiated by Aldous [3], who posed the problem of providing a Kingman-type representation in this case. Bertoin's [4] theory of homogeneous fragmentations solved that problem as explained in [17]. Then [17,18] studied tree growth processes associated with fragmentation processes. Natural non-exchangeable tree growth processes were described in terms of simple growth rules that admit regenerative descriptions based on the first split and its subtrees, see particularly [5,10,28], as reviewed in Examples 16 and 17 below. We remark, however, that not all natural models fall into our current framework. For example, if T n is uniform on T n then (T n , n ≥ 1) is not a regenerative tree growth process because it is not consistent (see [27] for weak limits). Haulk and Pitman [19] give de Finetti representations for exchangeable tree growth processes that are not necessarily regenerative. An important consequence of Definition 1 is that all regenerative tree growth processes admit descriptions in terms of a growth rule (cf. Figure 1).

Proposition 2
In the tree growth step from n to n + 1 for n ≥ 2, there are the following disjoint events, G_{n,i} for i = 0, . . . , K_n + 1, where K_n ≥ 2 is the number of blocks of the first split of T_n:
• G_{n,0}: leaf n + 1 is attached to a new branch point between the root and the first branch point of T_n;
• G_{n,i}, 1 ≤ i ≤ K_n: label n + 1 is inserted into the ith block of the first split;
• G_{n,K_n+1}: leaf n + 1 is attached to the first branch point, as a singleton block of the first split.
A tree growth process (T n , n ≥ 1) is regenerative if and only if P(G n,0 | T n ) = P(G n,0 ) does not depend on T n and P(G n,i | T n ) = P(G n,i | Π n ), 1 ≤ i ≤ K n + 1, only depends on the partition Π n of the first split. In the event G n,i , 1 ≤ i ≤ K n , label n + 1 is inserted into the ith subtree of T n of size #Π n,i following the same rule, up to relabelling by the increasing bijection from Π n,i ∪ {n + 1} to [#Π n,i + 1].
See Appendix A for a proof of this proposition.
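To make the growth rule of Proposition 2 concrete, here is a small Python sketch of the growth step for a hypothetical toy rule (an illustrative assumption, not one of the models of this paper) in which, given the first split (B_1, . . . , B_K) of [n], the events G_{n,0} and G_{n,K+1} each receive weight 1 and G_{n,i} receives weight #B_i; since the weights depend on the tree only through the first split, the resulting process is regenerative.

```python
import random

# A tree is either a leaf label (int) or a branch point, represented as the
# list of its subtrees; the root edge is implicit above the outermost node.

def leaves(t):
    return [t] if isinstance(t, int) else [x for sub in t for x in leaves(sub)]

def sorted_children(children):
    # keep subtrees in the order of their least labels, as in the text
    return sorted(children, key=lambda sub: min(leaves(sub)))

def toy_weights(blocks):
    # Hypothetical rule: weight 1 for G_{n,0}, #B_i for G_{n,i}, 1 for G_{n,K+1};
    # only the first split enters, as Proposition 2 requires.
    return [1] + [len(b) for b in blocks] + [1]

def grow(t, new_label, rng):
    if isinstance(t, int):                      # subtree with a single leaf:
        return sorted_children([t, new_label])  # create a cherry above it
    blocks = [leaves(sub) for sub in t]         # blocks of the first split
    w = toy_weights(blocks)
    i = rng.choices(range(len(w)), weights=w)[0]
    if i == 0:                                  # G_{n,0}: new branch point below
        return sorted_children([t, new_label])
    if i == len(blocks) + 1:                    # G_{n,K+1}: new singleton block
        return sorted_children(t + [new_label])
    t = list(t)                                 # G_{n,i}: recurse into ith subtree
    t[i - 1] = grow(t[i - 1], new_label, rng)
    return t

rng = random.Random(7)
tree = 1                                        # T_1: the single leaf 1
for n in range(2, 9):
    tree = grow(tree, n, rng)
print(sorted(leaves(tree)))                     # the label set is always [1..8]
```

Whatever the random choices, the leaf labels after the step from n to n + 1 are exactly [n + 1], reflecting consistency of the family.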
One of our main results is that regenerative tree growth rules are (almost) in one-to-one correspondence with σ-finite measures on P, the set of partitions of N = {1, 2, 3, . . . }. Before stating this, let us introduce the notation P_π = {Γ ∈ P : Γ_{[n]} = π}, where Γ_{[n]} ∈ P_n is the partition whose blocks are the non-empty blocks of (Γ_i ∩ [n], i ≥ 1). Often we will abuse this notation and, for a partition π = (B_1, . . . , B_k), write P_{B_1,...,B_k} instead of P_π. The most common occurrence of this will be the use of P_{[n]} instead of P_{1_{[n]}}, where 1_{[n]} denotes the partition of [n] into a single block. We equip P with the σ-algebra generated by {P_π, π ∈ P_n, n ≥ 1}, which is also the Borel σ-algebra generated by the metric d(Γ, Γ′) = exp(−inf{n ≥ 1 : Γ_{[n]} ≠ Γ′_{[n]}}).
Theorem 3 (i) Let (g n , n ≥ 2) be a regenerative growth rule such that g j (0) < 1 for all j ≥ 2.
We remark that part (ii) of this theorem shows how the relation between (g_n, n ≥ 2) and κ fails to be one-to-one: if κ produces a regenerative growth rule (g_n, n ≥ 2) by (1), then any constant multiple of κ produces the same growth rule (g_n, n ≥ 2) by (1). If κ as in part (ii) and (g_n, n ≥ 2) are related by (1), we call κ a dislocation measure for (g_n, n ≥ 2). Many of the asymptotic properties of a regenerative tree growth process can be obtained by analysing the asymptotic properties of the associated measure κ. In fact, the two most important considerations turn out to be the growth rate of λ_n and the regularity of the convergence of asymptotic frequencies under κ. Let us expand on the second point. For Γ ∈ P and n ≥ 1, consider the relative block sizes of Γ_{[n]} and their decreasing rearrangement. If the limit as n → ∞ of the ith largest relative block size, or of the relative size n^{−1}#(Γ_i ∩ [n]) of the ith block, exists, it is denoted by |Γ|↓_i and |Γ_i|, respectively, and we say that an asymptotic frequency exists for that part. If |Γ_i| exists for all i we say Γ has asymptotic frequencies, while if |Γ|↓_i exists for all i we say Γ has asymptotic ranked frequencies. Moreover, if the asymptotic (ranked) frequencies exist and sum to 1 κ-a.e., we say they are proper. If Γ has asymptotic ranked frequencies, then |Γ|↓ = (|Γ|↓_i, i ≥ 1) naturally lives in the space

S↓ = {s = (s_1, s_2, . . .) : s_1 ≥ s_2 ≥ · · · ≥ 0, Σ_{i≥1} s_i ≤ 1}.    (2)

We will equip S↓ with the topology of pointwise convergence (which is also the topology of ℓ^p convergence for any p > 1). We also introduce S↓_1 = {s ∈ S↓ : Σ_{i≥1} s_i = 1}. We can then prove the following theorem, the background for which will be fully developed later.
Theorem 4 Let (T_n, n ≥ 1) be a regenerative tree growth process associated with a dislocation measure κ. Assume that κ-a.e. Γ ∈ P has asymptotic ranked frequencies in S↓_1 \ {(1, 0, . . .)}, define ν to be the push-forward of κ under Γ → |Γ|↓ and suppose that ∫_{S↓}(1 − s_1) ν(ds) < ∞ and λ_n = κ(P \ P_{[n]}) = n^γ ℓ(n) for some slowly varying function ℓ and γ > 0. If (3) holds, then T•_n/(n^γ ℓ(n)) → T_{γ,ν} in distribution, as n → ∞, in the rooted Gromov-Hausdorff-Prokhorov (GHP) sense, where T_{γ,ν} is a self-similar fragmentation tree with characteristics (γ, ν) and T•_n is the tree obtained from T_n by delabelling the leaves, considered as a metric measure space with the graph metric and the uniform probability measure on the leaves.
We remark that when considering T • n purely as a tree we treat it as an element of the set T • n of rooted unlabelled trees with n leaves and no degree-2 vertices.
This theorem provides conditions for the existence of a scaling limit of T•_n, where the label structure of T_n has been forgotten. However, the leaf labels are an integral part of the tree growth processes under consideration here, so it is natural to ask what happens to the labels. Ideally, one would like a notion of labelled continuum trees to serve as scaling limits of regenerative tree growth processes, just as there is a notion of ordered continuum trees that serve as scaling limits of ordered Galton-Watson trees [2]. However, the appropriate notion is elusive, so we content ourselves with studying the leaf {1} and the structure of the path from the root to this leaf. We obtain several results relating the convergence of the residual mass process of the leaf {1} to the existence of a scaling limit of the whole tree. Here, by the residual mass process of {1} we mean the Markov chain in m ≥ 0 starting from X^{(n)}_0 = n, decreasing to X^{(n)}_1 = #Π_{n,1} and further according to successive splits of the block containing 1, until M_n = inf{m ≥ 0 : X^{(n)}_m = 1}, when label 1 becomes a singleton. We set X^{(n)}_m = 0, m > M_n. The limiting processes are decreasing self-similar Markov processes in [0, ∞), which Lamperti [20] represented in terms of subordinators ξ, as

X_t = exp(−ξ_{τ(t)}), where τ(t) = inf{u ≥ 0 : ∫_0^u e^{−γξ_r} dr > t},    (4)

with the convention X_t = 0 for t ≥ ∫_0^∞ e^{−γξ_r} dr.

Theorem 5 Let (T_n, n ≥ 1) be a regenerative tree growth process with dislocation measure κ and X^{(n)} the residual mass process of {1} in T_n. Assume that the first block Γ_1 of κ-a.e. Γ ∈ P has an asymptotic frequency |Γ_1| ∈ (0, 1) and define Λ as the push-forward of κ under Γ → −log(|Γ_1|).
Suppose that λ_n = κ(P \ P_{[n]}) = n^γ ℓ(n) for some slowly varying function ℓ and γ > 0. If (5) holds, then X^{(n)}_{⌊λ_n t⌋}/n → X_t in distribution, as n → ∞, in the Skorohod sense as functions of t ≥ 0, where X is a self-similar Markov process and E(e^{−sξ_r}) = exp(−r ∫_{(0,∞)}(1 − e^{−sy}) Λ(dy)) in Lamperti's representation (4). Moreover, letting A_n be the absorption time of X^{(n)} at 0, the above convergence in distribution holds jointly with the convergence of A_n/λ_n to the absorption time of X at 0.
Assuming that κ-a.e. Γ ∈ P has asymptotic ranked frequencies, the remaining conditions of Theorem 5 are stronger than those of Theorem 4; in particular, (5) implies (3) (see the proof of Theorem 28). In Example 19 we construct a regenerative tree growth process where the conditions of Theorem 4 are satisfied, but the conditions of Theorem 5 are not. We note again that leaf {1} is generally not typical (i.e. uniformly random), and a heuristic interpretation of the last part of Theorem 5 is that the natural conditions that imply the convergence of the residual mass process of leaf {1} are strong enough to imply that the residual mass process of a typical leaf converges as well.
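The time change in Lamperti's representation can be computed in closed form when the subordinator path is piecewise constant. The following sketch evaluates X_t = exp(−ξ_{τ(t)}) for an illustrative deterministic jump sequence (an assumption for exposition, not data from the paper): ξ jumps by log 2 at times 1, 2, 3, . . ., so with γ = 1 the additive functional gains 2^{−k} on [k, k + 1) and the absorption time is 1 + 1/2 + 1/4 + · · · = 2.

```python
import math

# Evaluate X_t = exp(-xi_{tau(t)}) from Lamperti's representation, given a
# piecewise-constant subordinator path as a list of jumps (time, size).
# The additive functional I(u) = int_0^u exp(-gamma * xi_r) dr is piecewise
# linear, so tau(t) = inf{u : I(u) > t} is located stretch by stretch.
def lamperti_X(t, jumps, gamma=1.0):
    r, xi, I = 0.0, 0.0, 0.0
    for tj, y in jumps:
        slope = math.exp(-gamma * xi)
        if I + slope * (tj - r) > t:
            return math.exp(-xi)        # tau(t) falls in this constant stretch
        I += slope * (tj - r)
        r, xi = tj, xi + y
    return 0.0  # t lies beyond the absorption time of the truncated path

# Illustrative path: jumps of size log 2 at times 1, 2, 3, ...
path = [(k, math.log(2)) for k in range(1, 60)]
print(lamperti_X(0.5, path))   # 1.0 (no jump of xi before tau(0.5))
print(lamperti_X(1.25, path))  # ~0.5
print(lamperti_X(2.5, path))   # 0.0 (past the absorption time 2)
```

The run of values 1, 1/2, 1/4, . . . illustrates how the subordinator's jumps become the multiplicative jumps of the residual mass limit.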
In the other direction, there is a natural strengthening of the hypotheses of Theorem 4 that implies that the conclusions of Theorem 5 are satisfied.

Corollary 6
If, in addition to the hypotheses of Theorem 4, including (3), we assume that the relative block sizes n^{−1}#(Γ_1 ∩ [n]) converge to |Γ_1| in a suitable integrated sense under κ, then, with the notation of Theorem 5, X^{(n)}_{⌊λ_n t⌋}/n → X_t in distribution, as n → ∞, in the Skorohod sense as functions of t ≥ 0, and this convergence in distribution happens jointly with the convergence of λ_n^{−1} A_n to the absorption time of X at 0.

When κ-a.e. Γ ∈ P has asymptotic ranked frequencies, Theorem 5, combined with previous results about the residual mass process of a typical leaf, provides a description of how leaf {1} differs from a typical leaf. To see this, let (U^{(n)}_k, k ≥ 0) be the residual mass process of a leaf picked uniformly at random from T_n, for n ≥ 1. Under the assumptions of Theorem 4, Lemma 28 in [16] implies that U^{(n)}, suitably rescaled, converges to a decreasing self-similar Markov process whose subordinator has an explicit Laplace exponent. It is easy to check that this agrees with the expression for E(e^{−sξ_r}) in Theorem 5 when κ is exchangeable (see Example 12), but these two expressions may differ in general. Thus we have identified the scaling limit of the residual mass process of {1} and the scaling limit of the residual mass process of a uniform leaf in terms of subordinators whose Laplace exponents we know explicitly in terms of κ. This provides insight into the difference between what the tree looks like from {1}'s perspective versus that of a typical leaf.

The remainder of this paper is organized as follows. In Section 2 we give a detailed analysis of the laws of T_n. The proof of Theorem 3 can be found here, along with a number of other structural results. Section 3 is devoted to examples. One of the nice aspects of the theory presented in this paper is that it gives a coherent framework for many particular models that have been studied previously in the literature. As a result, we are able to give simplified proofs of a number of previously known results about these models.
Moreover, our framework makes it easy to specify regenerative growth processes with desired asymptotic properties, and this allows us to construct examples illustrating what can go wrong if some of the hypotheses of our theorems are dropped. Section 4 provides the necessary background to understand the precise meaning of our statements about scaling limits. We define the limit objects T_{γ,ν} and the GHP topology, and provide the main results from the literature on which our present theorems are built. In Sections 5 and 6 respectively, we provide the proofs of Theorems 4 and 5, based on general convergence criteria by Haas and Miermont [15,16] for (not necessarily consistent) Markov branching models and non-increasing Markov chains. In fact, we prove results that are stronger, but somewhat more technical, than Theorems 4 and 5, which then follow as immediate consequences. Section 7 gives some pointers to further problems and related work.
Laws of regenerative growth processes

Explicit formulas in terms of the growth rule
The regenerative nature of the growth processes, conditioned on the partition at the first split, shows that much of the analysis of these processes can be reduced to analysing the laws of the partition at the first split of T_n, i.e. the splitting rules p_n, n ≥ 2. We first find the splitting rule in terms of (g_n, n ≥ 2) and then obtain a formula for the law of T_n. Write g_n(π, i) for the probability of the event G_{n,i} given that the first split of T_n is π; by Proposition 2, g_n(π, 0) = g_n(0) does not depend on π. From the growth rule, we obtain for all π = (B_1, . . . , B_k) ∈ P_n \ {1_{[n]}} a system of equations (6) relating the splitting rules to the growth rule. Using the natural convention g_1(0) = 1, the solution to these equations can be written as

p_n(π) = p_n(B_1, . . . , B_k) = g_{min B_2 − 1}(0) ∏_{j = min B_2}^{n−1} g_j(π_{[j]}, I_j),    (7)

where I_j is the index of the block of π_{[j+1]} containing j + 1 and π_{[j]} is the vector of non-empty B_i ∩ [j]. The RHS of this formula is the probability of successively creating a new first branch point when min B_2 is added and inserting all higher labels such that the resulting partition at the first split is π. By the regenerative property of T_n, we can write tree probabilities as a product over branch points; for a tree T ∈ T_n, we identify each vertex with the set B of labels in the subtrees above this vertex, write π(B) for the partition of the split at B, and π̃(B) for the partition of [#B] obtained when relabelling π(B) by the increasing bijection from B to [#B]:

P(T_n = T) = ∏_{branch points B of T} p_{#B}(π̃(B)),    (8)

where I_j(B) = i if j + 1 ∈ π̃(B)_i (the insertion indices appearing when (7) is applied to each factor), and where π̃(B) = (π̃(B)_1, . . . , π̃(B)_{k(B)}).
The residual mass process of the leaf {1} can be described in terms of p_n. Recall that the residual mass process is a Markov chain in m ≥ 0 starting from X^{(n)}_0 = n. Its successive decrements, down to absorption at 0, form a composition C_n = (C_{n,1}, C_{n,2}, . . .) of n, with C_{n,m} = X^{(n)}_{m−1} − X^{(n)}_m and final part 1.

Proposition 7 In a regenerative tree growth process, the family (C_n, n ≥ 1) of compositions is regenerative in the sense that, conditionally given C_{n,1} = j, the remaining composition (C_{n,2}, C_{n,3}, . . .) of n − j has the same distribution as C_{n−j}. The entries of the transition probability matrix of the residual mass process are determined by the splitting rules, the chain stepping from n to n − j with probability Σ p_n(π), the sum extending over all π = (B_1, . . . , B_k) ∈ P_n \ {1_{[n]}} with #B_1 = n − j.

This is a straightforward consequence of Definition 1. We stress that we have consistency in the sense that C_n can be obtained from C_{n+1} by reducing one part of C_{n+1} by 1 (the one corresponding to label n + 1 in T_{n+1}), but (C_n, n ≥ 1) is not sampling consistent in the sense of [11], as this part is not a size-biased pick from C_{n+1}, in general. In special cases, versions of this proposition are in the literature; in the exchangeable (sampling consistent) case, it is implicit in Bertoin's [4] study of tagged particles and explicit in [18].
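The regenerative structure of the residual mass chain and its composition can be sketched directly. The following Python snippet uses a hypothetical toy splitting rule under which the size of the block containing 1 is uniform on {1, . . . , n − 1} (an illustrative assumption, not a rule derived in the paper); the point is only the mechanics of reading off C_n from X^{(n)}.

```python
import random

def first_block_size(n, rng):
    # Toy stand-in for the marginal of #Pi_{n,1} under the splitting rule.
    return rng.randint(1, n - 1)

def residual_mass_and_composition(n, rng):
    X = [n]
    while X[-1] > 1:
        X.append(first_block_size(X[-1], rng))       # regenerative step
    X.append(0)                                      # label 1 becomes a singleton
    C = [X[m - 1] - X[m] for m in range(1, len(X))]  # decrements; last part is 1
    return X, C

rng = random.Random(1)
X, C = residual_mass_and_composition(20, rng)
print(X, C, sum(C))   # the parts of C are positive and sum to n = 20
```

By the regenerative property, conditioning on the first part C_{20,1} = j leaves a composition distributed like C_{20−j}, which the recursion above realizes pathwise.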

The dislocation measure
Recall the notation P_n for the set of partitions π = (B_1, . . . , B_k) of [n] = {1, . . . , n} with blocks indexed in increasing order of their least elements, and the notation P for the set of all partitions Γ = (Γ_i, i ≥ 1) of N, with blocks ordered by their least elements and Γ_i = ∅ if there are fewer than i blocks. Theorem 3 relates growth rules g_n and splitting rules p_n on P_n \ {1_{[n]}} to dislocation measures κ on P. Taking our cues from the exchangeable case, cf. [4], one thing we want from our dislocation measures is to be able to use them to embed regenerative tree growth processes in continuous time, making the trees the genealogical trees of continuous-time fragmentation processes. This κ is to provide rates for the first split of [n], n ≥ 2, which allow us to consistently embed the evolution of blocks in T_n, n ≥ 1, into continuous time (see Theorem 10). Observe that the rate λ_n of the first split of [n] can then be thinned by the event that this split also splits [n − 1], an event with probability 1 − g_{n−1}(0), where (g_n, n ≥ 2) is the growth rule of the regenerative tree growth process, so that we need λ_n(1 − g_{n−1}(0)) = λ_{n−1}, n ≥ 3, and hence

λ_n = λ_2 ∏_{j=2}^{n−1} (1 − g_j(0))^{−1}, n ≥ 3.    (9)

Note that g_j(0) = 1 for any j ≥ 2 means that all insertions in a subtree with j leaves are made below the first split; if scaling limits of (T_n, n ≥ 1) exist at all, such subtrees with j leaves will collapse in the scaling as n → ∞. We will exclude such behaviour in the sequel and make the following assumption:

(A) g_j(0) < 1 for all j ≥ 2.

Proposition 8 Consider a regenerative tree growth rule (g_n, n ≥ 2) satisfying Assumption (A) with splitting rule (p_n, n ≥ 2) given by (7), and let λ_2 > 0 be arbitrary. With λ_n, n ≥ 3, defined by (9), define

κ(P_π) = λ_n p_n(π), π ∈ P_n \ {1_{[n]}}, n ≥ 2.    (10)

Then κ extends uniquely to a measure on P.
Proof. This is essentially the same as the analogous result for exchangeable fragmentations, cf. [4]. The consistency built into (9) and (10) ensures that κ is well-defined and finitely additive on the sets P_π, π ∈ P_n \ {1_{[n]}}, n ≥ 2, and it clearly extends to a countably additive measure on the ring generated by these sets. Carathéodory's Extension Theorem provides the unique extension to the σ-ring these sets generate in P. It is then straightforward to check that this σ-ring is a σ-algebra and, in fact, is the Borel σ-algebra on P.
Note that we can condition κ on splitting [n] and write p_n as p_n(π) = κ(P_π | P \ P_{[n]}) = κ(P_π)/λ_n, π ∈ P_n \ {1_{[n]}}.

Proof of Theorem 3. Theorem 3 is a direct consequence of (6) and Proposition 8.
Let us discuss how Bertoin's [4] notion of a P-valued homogeneous fragmentation process finds a natural extension where his exchangeable dislocation measure is replaced by a dislocation measure in the sense defined above.
For a refining process Π, we define genealogical trees T_n ∈ T_n, n ≥ 1, using the representation above (8): T_n has as branch points and leaves all blocks Π_i(t), i ≥ 1, t ≥ 0, visited by the restriction Π_{[n]} of Π to [n]. Every regenerative tree growth process can be represented by a nice refining P-valued process:

Theorem 10 For each dislocation measure κ as defined after Theorem 3, there exists a P-valued Feller process Π = (Π(t), t ≥ 0) such that the genealogical trees T_n of the restrictions Π_{[n]} of Π to [n], n ≥ 1, form a regenerative tree growth process associated with dislocation measure κ.
Proof. We will use κ in a Poissonian construction based on independent P-valued Poisson point processes (Ξ^{(i)}(t), t ≥ 0), i ≥ 1, with intensity measure κ. Roughly, we construct Π with Π(0) = 1_N such that for all i and t, the ith block of Π(t−) is split at time t according to the partition Ξ^{(i)}(t), relabelled by the increasing bijection. More precisely, we build consistent P_n-valued continuous-time Markov chains (Π_{[n]}(t), t ≥ 0) from the successive times S_{[n]}(k), k ≥ 1, at which a Poisson point splits the restriction to [n]: starting from M_{[n]}(0) = 1_{[n]} and, if S_{[n]}(k + 1) < ∞, letting M_{[n]}(k + 1) be the partition obtained from M_{[n]}(k) by replacing its ith block B by the blocks of the image of Ξ^{(i)}(S_{[n]}(k + 1)) restricted to [#B], under the increasing bijection from [#B] to B. Since Π is uniquely determined by (Π_{[n]}, n ≥ 1), standard properties of Poisson point processes and of the space P complete the proof.
By Theorem 3, a growth rule (g n , n ≥ 2) determines a measure κ only up to a multiplicative factor λ 2 > 0. This is reflected in the fragmentation processes Π of Theorem 10 in the fact that the genealogical trees T n , n ≥ 1, are unaffected by (linear) time changes of Π.
From the consistency of (T_n, n ≥ 1), it is clear that there is a unique branch point of T_n where 1 and 2 are separated into different blocks. Moreover, as n varies, the partitions at this branch point define a partition of some random subset of N, whose distribution when relabelled by the increasing bijection is described by the splitting rule conditioned on partitions that restrict to ({1}, {2}), hence by κ( · | P_{{1},{2}}). In the Poissonian construction, this partition after relabelling is Ξ^{(1)}(S_{[2]}(1)). More generally, while there may be Poisson points that do not induce branch points of T_n, n ≥ 1, e.g. when κ is finite or when κ can produce blocks of finite size, those points Ξ^{(i)}(S_{[n]}(k)) used in the Poissonian construction describe the partition at a branch point of T_m for all m ≥ n. The partition at every branch point, separating labels j and ℓ, say, has a distribution that is absolutely continuous with respect to κ.
The Poissonian construction formulated here differs from Bertoin's [4, Section 3.1.3] in the relabelling by increasing bijections: In the exchangeable case this yields the same processes, in distribution. A notable consequence is that under assumptions that ensure that there are always infinitely many blocks and that they are all infinite, we can recover the (Ξ (i) , i ≥ 1) from Π in our setting. It is now possible to explore some more of Bertoin's fragmentation theory [4,Chapter 3] in our extended generality, notably erosion effects, stopping lines and extended branching properties, and, under conditions that ensure the existence of asymptotic frequencies, also self-similar partition-valued fragmentation processes. More generally, it would be interesting to characterise Markov processes (with a suitable branching property) whose genealogical trees are regenerative.

Unlabelled Markov branching trees
Our scaling limit results take advantage of recent progress on scaling limits of rooted unlabelled Markov branching trees, which we now introduce. For n ≥ 1, let T•_n be the image of T_n under the map that delabels the leaves of a tree. Define P•_n = {(n_1, . . . , n_p) : p ≥ 1, n_1 ≥ · · · ≥ n_p ≥ 1, n_1 + · · · + n_p = n}, the set of partitions of the integer n. Let (p•_n, n ≥ 2) be a sequence such that for each n, p•_n is a probability function on P•_n \ {(n)}. A sequence (T•_n, n ≥ 1) of random variables such that T•_n ∈ T•_n is called a Markov branching model based on (p•_n, n ≥ 2) if for each n ≥ 2, the law of T•_n is the same as the law of the tree T constructed as follows: choose (N_1, . . . , N_p) according to p•_n; conditionally given that (N_1, . . . , N_p) = (n_1, . . . , n_p), let (T_1, . . . , T_p) be a vector of independent trees such that T_i is distributed as T•_{n_i}, 1 ≤ i ≤ p; the tree T is then obtained by identifying the roots of T_1, . . . , T_p as a single vertex and attaching a new root to this vertex.
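The recursive construction can be sketched as follows, with a deterministic toy splitting rule p•_n concentrated on the near-even binary split (an illustrative assumption, not a rule from the paper):

```python
def sample_split(n):
    # Toy p_n: all mass on the near-even binary split of n (n >= 2).
    return (n - n // 2, n // 2)   # n_1 >= n_2 >= 1, n_1 + n_2 = n

def markov_branching_tree(n):
    if n == 1:
        return "leaf"
    parts = sample_split(n)
    # independent subtrees, roots identified as one vertex, new root attached
    return [markov_branching_tree(m) for m in parts]

def num_leaves(t):
    return 1 if t == "leaf" else sum(num_leaves(s) for s in t)

print(num_leaves(markov_branching_tree(7)))  # 7: the split sizes always sum to n
```

Replacing `sample_split` by a random draw from an arbitrary p•_n (conditioned on (N_1, . . . , N_p) as in the text) gives the general Markov branching model.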
The following proposition is an immediate consequence of these definitions.

Proposition 11
If (T_n, n ≥ 1) is a regenerative tree growth process with associated dislocation measure κ and (T•_n, n ≥ 1) is the sequence of trees such that T•_n is obtained from T_n by delabelling the leaves, then (T•_n, n ≥ 1) is a Markov branching model based on the functions

p•_n(n_1, . . . , n_p) = Σ_{π ∈ P_n : (#π)↓ = (n_1,...,n_p)} p_n(π),

where we write (#π)↓ for the decreasing rearrangement of the block sizes of π.

Examples
An important motivation for our results is that they allow a unified treatment of previously studied models. In this section we discuss these models, recall or construct their dislocation measures and demonstrate how our Theorems 4 and 5 apply. We also develop some further examples that explore the conditions (3) and (5) that appear in Theorems 4 and 5. Before proceeding with the examples, we introduce paintbox partitions, which are a recurring theme in the construction of dislocation measures. For s ∈ S↓, where S↓ is defined in (2), we define Kingman's paintbox κ_s as the distribution of the random partition Π of N where, given independent uniform random variables U_1, U_2, . . . on [0, 1], i, j ∈ N are in the same block if i = j or if U_i and U_j fall into the same interval (s_1 + · · · + s_{m−1}, s_1 + · · · + s_m], m ≥ 1; indices whose uniform variable falls into the remaining dust interval of length s_0 = 1 − Σ_{m≥1} s_m form singleton blocks. Note that the Strong Law of Large Numbers implies that κ_s-a.e. Γ ∈ P has asymptotic ranked frequencies |Γ|↓ = s.
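The paintbox construction is straightforward to simulate. The sketch below takes the uniform variables as an explicit argument so that the output is deterministic; the particular values of s and U are illustrative assumptions.

```python
def paintbox_partition(s, U):
    """Partition of [len(U)] from Kingman's paintbox with ranked masses s:
    i and j share a block iff U_i and U_j fall in the same s-interval;
    indices landing in the dust of mass s_0 = 1 - sum(s) become singletons."""
    bounds, acc = [], 0.0
    for m in s:
        bounds.append((acc, acc + m))
        acc += m
    blocks, singletons = {}, []
    for i, u in enumerate(U, start=1):
        for k, (lo, hi) in enumerate(bounds):
            if lo <= u < hi:
                blocks.setdefault(k, []).append(i)
                break
        else:
            singletons.append([i])        # dust: singleton block
    parts = list(blocks.values()) + singletons
    return sorted(parts, key=min)         # order blocks by least elements

print(paintbox_partition((0.5, 0.3), [0.1, 0.6, 0.45, 0.95, 0.2]))
# [[1, 3, 5], [2], [4]]
```

With i.i.d. uniform U of growing length, the relative block sizes converge to s by the Strong Law of Large Numbers, as noted in the text.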
Example 12 (Exchangeable models [4,17]) Bertoin classified all exchangeable dislocation measures, i.e. measures that are invariant under the action of permutations of N on P, giving an integral representation

κ = c Σ_{j≥1} δ_{ε^{(j)}} + ∫_{S↓} κ_s( · ) ν(ds),

where c ≥ 0, ε^{(j)} is the partition with blocks {j} and N \ {j}, and ν is a measure on S↓ with ν({(1, 0, . . .)}) = 0 and ∫_{S↓}(1 − s_1) ν(ds) < ∞. The splitting rules (p_n, n ≥ 2) associated with Bertoin's exchangeable dislocation measures κ via (10) give rise to the consistent exchangeably labelled Markov branching trees of [17]. If λ_n = κ(P \ P_{[n]}) = n^γ ℓ(n) for some γ > 0 and some slowly varying function ℓ, then (3) also holds. It is also easy to verify the condition of Theorem 5 in this case.
There are simpler growth rules, which in general lead to models that are not fully exchangeable. Before we present these simpler growth rules, we mention a large class of models that retain a weak form of exchangeability and for which scaling limits have been obtained.
Example 14 (Restricted exchangeable models [6]) Restricted exchangeable dislocation measures are defined by an integral representation in terms of coefficients c_j, k_j ≥ 0 and measures ν_j on S↓, j ≥ 1; we refer to [6] for a full discussion. This class includes all exchangeable dislocation measures, for c_j = c, k_j = 0 and ν_j = ν for all j ≥ 1. The splitting rules (p_n, n ≥ 2) associated with restricted exchangeable dislocation measures κ give rise to the consistent restricted exchangeable labelled Markov branching trees of [6].
Consider the case where c_j = k_j = 0 and λ_n = κ(P \ P_{[n]}) = n^γ ℓ(n) for some slowly varying function ℓ. The push-forward of κ under Γ → |Γ|↓ is then determined by the measures ν_j, j ≥ 1. Suppose that for each j ≥ 1, ν_j has its support in S↓_1 \ {(1, 0, . . .)} and that ∫_{S↓}(1 − s_1) ν(ds) < ∞. Assuming further that ν_j = ν_m for all j ≥ m, for some m ≥ 1, as in [6, Theorem 7] where scaling limits were established for convergence in probability, we deduce that (3) holds from the exchangeable case and by dominated convergence, because only the measure κ_{ν_m} = ∫_{S↓} κ_s( · ) ν_m(ds) is infinite.
One of the early families of regenerative tree growth processes to be studied was Ford's alpha-model. It has also been a main driver for much of the literature on scaling limits of Markov branching trees, both for general models and for further models with special structure.
Example 15 (Ford's alpha-model [10]) This family is parametrized by α ∈ [0, 1] as follows. For each edge e of T_n, give e weight α if both of its vertices are internal and weight 1 − α if one of its vertices is a leaf. Choose an edge with probability proportional to its weight and attach leaf n + 1 to a new branch point inserted between the two vertices of the selected edge. From this description it is easy to check that (T_n, n ≥ 1) is a family of binary trees that forms a regenerative tree growth process. Moreover, for π = (B_1, B_2) we have g_n(π, 0) = α/(n − α) and g_n(π, i) = (#B_i − α)/(n − α), i = 1, 2. This model was introduced in [10] as a model on cladograms that interpolates between the Yule model (α = 0), the uniform model (α = 1/2), and the comb (α = 1).
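A subtree with m leaves carries total edge weight m − α (its m leaf edges of weight 1 − α, its m − 1 internal edges of weight α, including the edge above it), so one growth step can be simulated by a weighted descent. The recursive encoding below is our own convention, a minimal sketch rather than the paper's notation.

```python
import random

# A binary tree is a leaf label (int) or a pair [left, right]. Each subtree
# stands for the edge above it; a subtree with m leaves has total weight
# m - alpha (leaf edges: 1 - alpha each; internal edges, incl. its top: alpha).

def leaves(t):
    return [t] if isinstance(t, int) else leaves(t[0]) + leaves(t[1])

def order(pair):
    return sorted(pair, key=lambda s: min(leaves(s)))  # children by least label

def grow(t, new_label, alpha, rng):
    if isinstance(t, int):                 # single leaf edge: insert here
        return order([t, new_label])
    m = len(leaves(t))
    u = rng.random() * (m - alpha)
    if u < alpha:                          # the internal edge above t, weight alpha
        return order([t, new_label])
    u -= alpha
    if u < len(leaves(t[0])) - alpha:      # descend into first subtree
        return order([grow(t[0], new_label, alpha, rng), t[1]])
    return order([t[0], grow(t[1], new_label, alpha, rng)])

rng = random.Random(3)
tree = 1                                   # T_1
for n in range(2, 9):
    tree = grow(tree, n, 0.5, rng)         # alpha = 1/2: the uniform model
print(sorted(leaves(tree)))                # [1, 2, 3, 4, 5, 6, 7, 8]
```

The three weights α, #B_1 − α, #B_2 − α at the top of the descent sum to n − α, matching g_n(π, 0) + g_n(π, 1) + g_n(π, 2) = 1.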
The alpha-model is a restricted exchangeable model of binary trees that admits (at least) two natural extensions: the alpha-gamma model, which is restricted exchangeable but not binary, and the alpha-theta model, which is binary but not, in general, restricted exchangeable. The details of these models are our next two examples.
All the examples we have given so far satisfy the hypotheses of Theorem 4. In fact, in these examples both tree convergence and residual mass process convergence were previously known to hold: the exchangeable case is [17, Proposition 7], the particle labelled 1 in the restricted exchangeable case is [6, Proposition 28], and the particle labelled 1 in the alpha-theta model has likewise been treated in the literature. The following example constructs a dislocation measure κ with an atom at a partition Γ(j) ∈ P_{[j−1],{j}} for some Γ(j) ∈ P_{[j−1],{j}}, j ≥ 2.

(b) For (3) to fail, first let Γ(j)_{[n],1} approach frequency 1/2, applying Step A_{1/2} for n < a_j, and then apply Step A_{x^{(j)}} for n ≥ a_j. Then we will have |Γ(j)_1| = x^{(j)} for all j ≥ 2, but for all n sufficiently large, the restricted relative frequencies #Γ(j)_{[n],1}/n stay close to 1/2 for too much κ-mass. Intuitively, the approximating trees have too many even branch points splitting into two equal-sized subtrees, making trees wide and low, while the proposed limiting distribution produces uneven branch points leading to thin and high trees with higher probability. Gromov-Hausdorff convergence fails if total heights do not converge [8].
In our next example, we show that the hypotheses of Theorem 5 are strictly stronger than the hypotheses of Theorem 4.

Example 19
In the general setting of Example 18, consider Γ(j)_{[n],1} that first approaches (the wrong!) frequency 1 − x^{(j)}, applying Step A_{1−x^{(j)}} for n < a_j, so that #Γ(j)_{[n],1}/n ≈ 1 − x^{(j)}. Then we apply Step A_{x^{(j)}} for n ≥ a_j to achieve |Γ(j)_1| = x^{(j)}. We call these partitions "evil". If we did this for all j ≥ 2, too many partitions would have intermediate frequencies around 1/2 when restricted to [n] and tree convergence might fail. Note that while at 1 − x^{(j)}, the block not containing 1 has frequency x^{(j)} and is the larger block size that appears in the tree convergence criterion, while frequency 1 − x^{(j)} is relevant for the residual mass process.
2. Given (m, E_m, j_m), release the smallest evil partition e_m = min E_m by setting a_{e_m} = j_m.
The criterion (5) of Theorem 5 for residual mass process convergence is not satisfied, because for every n ≥ 3/(1 − x^{(3)}), there are evil partitions of weight at least λ_3 − λ_2 that have a frequency #Γ(j)_{[n],1}/n ≈ 1 − x^{(j)}, smaller by more than 1/4 than their limit frequency x^{(j)}, since x^{(j)} − (1 − x^{(j)}) > 1/4 for all j ≥ 3, and this cannot be offset by partitions that exceed their limit frequencies, by the argument in Example 18(a).

Background
In this section we present the background information needed to understand the statements of our results on scaling limits of random trees. Since the proofs of our results do not require any technical details about the constructions in this section we keep the discussion light and heuristic at times, referring to the existing literature for details.

Trees as metric measure spaces
The trees under discussion in this paper can naturally be considered as metric spaces with the graph metric. That is, the distance between two vertices is the number of edges on the path connecting them. Let (T, d, root) be a tree equipped with the graph metric. For a > 0, we define aT to be the metric space (T, ad, root), i.e. the metric is scaled by a. Moreover, the trees we are dealing with are rooted, so we consider (T, d, root) as a pointed metric space with the root as the point. Additionally, we let µ_T be the uniform probability measure on the leaves of T. If we have a random tree T, this gives rise to a random pointed metric measure space (T, d, root, µ_T). To make this last statement rigorous, we clearly need to put a topology on pointed metric measure spaces. This is hard to do in general, but the pointed metric measure spaces that come from the trees we are discussing are compact, and this simplifies matters.
Let M_w be the set of equivalence classes of compact pointed metric measure spaces (equivalence here being up to point and measure preserving isometry). We endow M_w with the pointed Gromov-Hausdorff-Prokhorov metric (see [16]). Fix (X, d, ρ, µ), (X′, d′, ρ′, µ′) ∈ M_w and define

d_GHP((X, d, ρ, µ), (X′, d′, ρ′, µ′)) = inf_{(M,δ)} inf_{φ,φ′} max{ δ_H(φ(X), φ′(X′)), δ(φ(ρ), φ′(ρ′)), δ_P(φ_*µ, φ′_*µ′) },

where the first infimum is over metric spaces (M, δ), the second infimum is over isometric embeddings φ and φ′ of X and X′ into M, δ_H is the Hausdorff distance on compact subsets of M, and δ_P(φ_*µ, φ′_*µ′) is the Prokhorov distance between the push-forward φ_*µ of µ by φ and the push-forward φ′_*µ′ of µ′ by φ′. It is worth noting briefly that the definitions of M_w and d_GHP as just given do not make formal sense in Zermelo-Fraenkel set theory with the axiom of choice (ZFC); one might just as well try metrizing the set of all sets. Nonetheless, it is not hard to formalize the heuristic definitions we have given. For example, one can use the fact that every separable metric space can be isometrically embedded in ℓ∞ to find an honest set M_w in ZFC such that every compact pointed metric measure space is isometric, by point and measure preserving isometry, to exactly one element of M_w, and then do everything internally in this set.
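The Prokhorov distance δ_P entering d_GHP can be computed by brute force on a tiny finite metric space, directly from its definition δ_P(µ, ν) = inf{ε > 0 : µ(A) ≤ ν(A^ε) + ε and ν(A) ≤ µ(A^ε) + ε for all A}. The sketch below (our own; the three-point space and all names are illustrative, and the grid search is only feasible for very small spaces) enumerates all subsets A and their closed ε-enlargements.

```python
# Brute-force Prokhorov distance between two probability measures on a
# common finite metric space {a, b, c}.
from itertools import chain, combinations

points = ["a", "b", "c"]
d = {("a", "b"): 1.0, ("a", "c"): 2.0, ("b", "c"): 1.0}

def dist(x, y):
    return 0.0 if x == y else d.get((x, y), d.get((y, x)))

mu = {"a": 0.5, "b": 0.5, "c": 0.0}
nu = {"a": 0.2, "b": 0.5, "c": 0.3}

def mass(m, A):
    return sum(m[x] for x in A)

def enlarge(A, eps):
    # closed eps-enlargement A^eps = {x : dist(x, A) <= eps}
    return {x for x in points if any(dist(x, y) <= eps for y in A)}

def subsets(xs):
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def prokhorov(mu, nu, grid=1000):
    for k in range(grid + 1):
        eps = k / grid
        if all(mass(mu, A) <= mass(nu, enlarge(A, eps)) + eps and
               mass(nu, A) <= mass(mu, enlarge(A, eps)) + eps
               for A in subsets(points)):
            return eps
    return 1.0

print(prokhorov(mu, nu))   # 0.3, forced by the set A = {a}
```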
Scaling limits of discrete trees are elements of M_w that are tree-like metric spaces. An R-tree is a complete metric space (T, d) with the following properties:
• For v, w ∈ T, there exists a unique isometry φ_{v,w}: [0, d(v, w)] → T with φ_{v,w}(0) = v and φ_{v,w}(d(v, w)) = w.
• For v, w ∈ T, every continuous injective path from v to w has the same range as φ_{v,w}; we call this range the segment [[v, w]].

Definition 21 A continuum tree is an R-tree (T, d, ρ, µ) with a choice of root ρ and probability measure µ such that µ is non-atomic, µ(L(T)) = 1, where L(T) denotes the set of leaves of T, and for every non-leaf vertex w, µ({v ∈ T : w ∈ [[ρ, v]]}) > 0. A continuum random tree (CRT) is an (M_w, d_GHP)-valued random variable that is almost surely a continuum tree. The continuum random trees we will be interested in are those associated with self-similar mass fragmentation processes.

Self-similar mass fragmentations
We are now prepared to introduce self-similar mass fragmentations and their genealogical trees. Suppose γ > 0 and let ν be a σ-finite measure on S↓ such that ν({(1, 0, 0, …)}) = 0, ∫_{S↓} (1 − s_1) ν(ds) < ∞ and ν(Σ_i s_i < 1) = 0. Heuristically, a self-similar mass fragmentation with characteristics (γ, ν) is an S↓-valued Markov process (F(t), t ≥ 0) such that F(0) = (1, 0, 0, …) and such that a block of size x splits into blocks xs = (xs_1, xs_2, …) at rate x^{−γ} ν(ds). A rigorous construction of such processes can be found in [4], though we remark that there is a slight difference in notation: our index γ of self-similarity corresponds to the index −γ in [4]. The idea of the genealogical tree of a self-similar mass fragmentation is to construct an R-tree that keeps track of the sizes of the blocks of the fragmentation process as time progresses.
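The heuristic description above can be simulated directly. The sketch below (our own illustration, not the construction of [4]) runs a binary fragmentation in which a block of mass x splits at rate x^{−γ} into (ux, (1 − u)x) with u uniform on (1/2, 1) playing the role of a normalized binary dislocation measure; blocks below a cutoff are frozen, since for γ > 0 small blocks split ever faster.

```python
# Event-driven simulation of a binary self-similar mass fragmentation.
import heapq, random

def fragment(t_max, gamma=0.5, cutoff=1e-3, seed=1):
    random.seed(seed)
    # priority queue of (next split time, block mass); start from the unit block
    events = [(random.expovariate(1.0), 1.0)]
    frozen = []
    while events:
        t, x = heapq.heappop(events)
        if t > t_max:
            frozen.append(x)                     # block survives to time t_max
            continue
        u = 0.5 + 0.5 * random.random()          # ranked binary split, u >= 1/2
        for child in (u * x, (1 - u) * x):
            if child < cutoff:
                frozen.append(child)             # freeze tiny blocks
            else:
                rate = child ** (-gamma)         # self-similar splitting rate
                heapq.heappush(events, (t + random.expovariate(rate), child))
    return sorted(frozen, reverse=True)          # ranked state F(t_max)

blocks = fragment(t_max=2.0)
print(len(blocks), round(sum(blocks), 6))        # total mass is conserved
```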
The following summarizes the parts of Theorem 1 and Lemma 5 in [14] that we will need.
The Brownian continuum random tree introduced by Aldous [2] as the scaling limit of conditioned Galton-Watson trees is an example of a self-similar fragmentation tree.

Definition 23
The Brownian CRT is the 1/2-self-similar random tree with binary dislocation measure ν given by

ν(s_1 ∈ dx) = (2/(π x³ (1 − x)³))^{1/2} dx, x ∈ [1/2, 1),

supported on sequences of the form (x, 1 − x, 0, 0, …).

One of our main tools will be the general theory of scaling limits of unordered Markov branching trees. In particular, we make use of the following theorem.
Theorem 24 (Theorem 5 in [16]) Let (T•_n, n ≥ 1) be a Markov branching model based on (p•_n, n ≥ 2) as in Section 2.3. Suppose that there is a characteristic pair (γ, ν) with γ > 0 and ν satisfying the conditions at the start of Section 4.2, as well as a function ℓ: (0, ∞) → (0, ∞), slowly varying at ∞, such that, in the sense of weak convergence of finite measures on S↓, we have

n^γ ℓ(n) (1 − s_1) p̄•_n(ds) → (1 − s_1) ν(ds),   (13)

where p̄•_n is the push-forward of the measure on P•_n with probability function p•_n onto S↓ by the map (n_1, …, n_p) ↦ (n_1/n, …, n_p/n, 0, 0, …).
If we view T•_n as a random element of M_w with the graph distance and the uniform probability measure on its leaves, then we have the convergence in distribution

(n^γ ℓ(n))^{−1} T•_n → T_{γ,ν}, as n → ∞,

with respect to the rooted Gromov-Hausdorff-Prokhorov topology, where T_{γ,ν} is the (γ, ν)-self-similar fragmentation tree.

Scaling limits of regenerative tree growth processes
While every dislocation measure κ on P gives rise to a regenerative tree growth process, not every such process has a scaling limit. Examples without scaling limit include the (0, θ)-tree growth process studied in [28, Proposition 13], where the growth is logarithmic and the branching structure degenerates under logarithmic scaling. In view of Proposition 11, it makes sense to try interpreting the hypotheses of Theorem 24 in terms of κ. In particular, let us examine the LHS of (13). From Proposition 11 we see that for a bounded continuous function f on S↓,

∫_{S↓} f(s) n^γ ℓ(n) (1 − s_1) p̄•_n(ds)
 = Σ_{n_1 ≥ ··· ≥ n_k : n_1 + ··· + n_k = n} n^γ ℓ(n) p•_n(n_1, …, n_k) (1 − n_1/n) f(n_1/n, …, n_k/n, 0, …)
 = (n^γ ℓ(n)/λ_n) ∫_P (1 − (#Γ^{[n]})↓_1/n) f((#Γ^{[n]})↓/n) κ(dΓ),

where now we write (#π)↓ for the decreasing rearrangement of the block sizes of π, with an infinite string of zeros appended (whereas in our previous usage (#π)↓ was a finite vector). Given this expression and the convergence (13) we need to establish, natural assumptions on κ become that λ_n = n^γ ℓ(n) for some γ > 0 and ℓ(n) slowly varying at ∞, and that

∫_P (1 − (#Γ^{[n]})↓_1/n) f((#Γ^{[n]})↓/n) κ(dΓ) → ∫_{S↓} (1 − s_1) f(s) ν(ds), as n → ∞.

Of course, for this last equation to have hope of holding, we must assume that κ-a.e. Γ ∈ P has asymptotic ranked frequencies. This holds for exchangeable and restricted exchangeable κ, and when κ is partially exchangeable in the sense of [26]. Let us, though, clarify the relationship between the existence of asymptotic frequencies and the existence of asymptotic ranked frequencies.
Lemma 25 is inessential to the remainder of our results, but for completeness we include a proof in Appendix B. Note that asymptotic (ranked) frequencies need not be in S↓_1 and that |Γ_i| may vanish. The partition Γ = ({2^{i−1}, …, 2^i − 1}, i ≥ 1) is an example where |Γ_i|, i ≥ 1, all exist (and vanish), but |Γ|↓_1 does not exist.
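The failure of |Γ|↓_1 to exist in this example can be checked numerically. The sketch below (our own illustration) computes the largest restricted block frequency |Γ^{[n]}|↓_1 for the partition with blocks {2^{i−1}, …, 2^i − 1}: along n = 2^k it equals 1/2 exactly, while along n = 3·2^{k−1} it approaches 1/3, so the limit does not exist even though every individual block frequency tends to 0.

```python
# Largest restricted block frequency for Gamma with blocks {2^(i-1), ..., 2^i - 1}.

def top_ranked_frequency(n):
    # block i restricted to [n] has size min(n, 2^i - 1) - 2^(i-1) + 1 when nonempty
    sizes, i = [], 1
    while 2 ** (i - 1) <= n:
        sizes.append(min(n, 2 ** i - 1) - 2 ** (i - 1) + 1)
        i += 1
    return max(sizes) / n

highs = [top_ranked_frequency(2 ** k) for k in range(10, 16)]         # n = 2^k
lows  = [top_ranked_frequency(3 * 2 ** k // 2) for k in range(10, 16)]  # n = 3*2^(k-1)
print(min(highs), max(highs))   # exactly 1/2 along this subsequence
print(min(lows), max(lows))     # close to 1/3 along this subsequence
```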
We can now give our main result on the existence of scaling limits of regenerative growth processes, which contains the statement of Theorem 4.
Proof. If (i) holds, we obtain (ii) as a rearrangement of the special case f = 1. Now assume (ii). Let us prove (iii). The main difficulty arises from the possibility that κ(P_{[m]}) = ∞. For all m ≥ 1, we split the relevant integral into the contribution of P_{[m]} and the contribution of P \ P_{[m]}. Since κ(P \ P_{[m]}) < ∞ and κ-a.e. Γ has asymptotic ranked frequencies, an application of dominated convergence shows that, for each fixed m, the second term vanishes as n → ∞. From the triangle inequality, we see that the limit in (iii) is ∫_P (1 − |Γ|↓_1) κ(dΓ).
Residual mass processes in regenerative tree growth processes

Let (T_n, n ≥ 1) be a regenerative tree growth process and (X^{(n)}_m, m ≥ 0) the associated residual mass processes of label 1 in T_n, n ≥ 1, with transition probabilities as identified in Proposition 7, with λ_n = κ({Γ ∈ P : Γ^{[n]}_1 = [n]}). The existence of a scaling limit for trees T•_n as studied in Section 5 does not imply the existence of a scaling limit for associated residual mass processes X^{(n)}, in general (see Example 19). In this section, we study scaling limits X^{(n)}_{⌊λ_n t⌋}/n → X_t, as n → ∞. Since, for fixed n ≥ 1, (X^{(n)}_m, m ≥ 0) is a non-increasing Markov chain with X^{(n)}_0 = n, we can make use of the general theory of self-similar scaling limits for such chains that was recently developed in [15].

Theorem 28 Suppose that there exists a sequence (a_n, n ≥ 0) of the form a_n = n^γ ℓ(n) for some γ > 0 and a slowly varying function ℓ, as well as a non-zero finite measure µ on [0, 1], such that

a_n Σ_{m=0}^{n} (1 − m/n) p_{n,m} δ_{m/n} → µ, as n → ∞,

in the sense of weak convergence of finite measures on [0, 1], where p_{n,m} denotes the transition probability of the residual mass chain from n to m. Then we have the convergence in distribution X^{(n)}_{⌊a_n t⌋}/n → X_t in the Skorokhod sense, where X is a self-similar Markov process and in Lamperti's representation (4), we have E(e^{−sξ_r}) = exp(−rψ(s)) with ψ(s) = ∫_{[0,1]} (1 − x^s)/(1 − x) µ(dx). Moreover, letting A_n be the absorption time of X^{(n)} at 0, the above convergence in distribution happens jointly with the convergence of a_n^{−1} A_n to the absorption time at 0 of the limiting process.
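Lamperti's representation (4) can be illustrated by a small simulation (our own sketch, with our own helper names): a decreasing self-similar Markov process is X_t = exp(−ξ_{τ(t)}), where ξ is a subordinator and τ inverts s ↦ ∫_0^s exp(−γ ξ_u) du, so that a block of size x runs its subordinator clock at speed x^{−γ}. For illustration we take a compound Poisson subordinator with rate 1 and Exp(1) jumps, and discretize the time change on an Euler grid.

```python
# Euler-grid sketch of Lamperti's representation X_t = exp(-xi_{tau(t)}).
import math, random

def lamperti_path(t_max, gamma=0.5, ds=1e-3, s_max=50.0, seed=7):
    random.seed(seed)
    s, xi, clock = 0.0, 0.0, 0.0
    next_jump = random.expovariate(1.0)       # first jump time of the subordinator
    path = []                                 # samples of X_t = exp(-xi_{tau(t)})
    while clock < t_max and s < s_max:        # s_max caps the subordinator clock
        path.append(math.exp(-xi))
        clock += math.exp(-gamma * xi) * ds   # dt = exp(-gamma * xi_s) ds
        s += ds
        while s >= next_jump:                 # compound Poisson jumps, Exp(1) sizes
            xi += random.expovariate(1.0)
            next_jump += random.expovariate(1.0)
    return path

xs = lamperti_path(t_max=3.0)
print(len(xs), xs[0], xs[-1])                 # starts at 1, non-increasing
```

Note that for this subordinator ∫_0^∞ exp(−γ ξ_u) du is a.s. finite, so the real clock accumulates only a finite absorption time; the cap s_max keeps the loop finite when t_max exceeds it.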
We note that the statement of Theorem 5 is contained in the statement of this theorem.
Residual mass convergence and tree convergence are not equivalent. The last part of Theorem 28 shows that under the conditions for residual mass convergence in that theorem, we only need to assume the existence of asymptotic ranked frequencies to also obtain tree convergence. Example 19 demonstrates that residual mass convergence does not follow from tree convergence. In the following corollary we explore additional conditions in the tree convergence setting of Theorem 26 under which we also obtain residual mass convergence. Roughly speaking, condition (ii) below expresses the following intuition: we need label 1 to lie in the asymptotically largest block most of the time, and on the corresponding set {|Γ|↓_1 = |Γ_1|} of infinite κ-measure, |Γ^{[n]}|↓_1 and |Γ^{[n]}_1| approach their common limit |Γ|↓_1 = |Γ_1| in a sufficiently regular way. The following statement includes Corollary 6.

Corollary 29
In the setting of Theorem 26 (ii), the block Γ_1 containing 1 of κ-a.e. Γ ∈ P has an asymptotic frequency in (0, 1). With Λ as in Theorem 28, the following are equivalent: If condition (ii) holds, then X^{(n)}_{⌊λ_n t⌋}/n → X_t in distribution, as n → ∞, in the Skorokhod sense as functions of t ≥ 0, and this convergence holds jointly with the convergence of λ_n^{−1} A_n to the absorption time of X at 0, where our notation is as in Theorem 28.

Further problems and related work
Due to the coupling of (T_n, n ≥ 1) in a regenerative tree growth process, we expect that the convergence in distribution in Theorems 26 and 28 can be strengthened to convergence in probability or even to almost sure convergence in all cases discussed here. We have proved tree convergence in probability in the exchangeable case [17], and in the restricted exchangeable case [6] provided that ν_j = ν_m, j ≥ m, but the general case including the alpha-theta model remains open.
In the alpha-theta model [28] and the (restricted) exchangeable [17,6] cases, we have established a two-stage almost sure convergence to a self-similar tree T by passing via reduced subtrees of T n and of T spanned by the first k labelled leaves and letting first n → ∞ and then k → ∞. More specifically, we have embedded (T n , n ≥ 1) in T as discrete trees with edge lengths.
The basic embedding problem is to find a random leaf in a self-similar tree (T, µ) that induces a given decreasing self-similar Markov process as residual mass process, i.e. as the process that is parametrised by distance from the root along the path to the random leaf and that records, for each point on the path, the µ-mass in the subtree above that point. Another interesting structure is the joint distribution of two residual mass processes (see [28, 29]). When embedded in the same tree, they coincide up to a branch point and then evolve independently. In [29], we use the term fragmenter for exponential subordinators (e^{−ξ_s}, s ≥ 0), which are time-changed in Lamperti's representation (4), and bifurcator for pairs of fragmenters that coincide up to an exponential time and then evolve independently. In [29] we investigate the fact that not all fragmenters appear as residual mass processes of typical (uniformly random) leaves. We introduce the notion of Markovian embedding in an exchangeable fragmentation process and show that for every (pure-jump) fragmenter X there is a unique exchangeable dislocation measure κ such that X has a Markovian embedding into an associated exchangeable fragmentation process.
In [28,29], we study an autonomous description of the evolution of reduced subtrees, viewed as weighted trees equipped with an (atomic) measure on the branches. We refer to a single branch with an atomic measure as a string of beads, see also [25] for related structures. We refer to the evolution of reduced subtrees as bead splitting. In [29], we study certain binary bead splitting processes that evolve by size-biased branching, i.e. where an atom (a bead) is selected at random according to the measure on the branches and replaced by a (rescaled independent) copy of a given string of beads. We study the convergence of bead-splitting processes to self-similar CRTs.

A Proof of Proposition 2
First consider a regenerative tree growth rule, i.e. a sequence of transition probability matrices g_n from P_n \ {1_{[n]}} to {0, …, n + 1} with g_n(π, 0) independent of π and g_n(π, i) = 0 if π has strictly fewer than i − 1 blocks. For n = 1 and n = 2 the regenerative property is trivial. Consider the induction hypothesis that the growth rule gives rise to distributions Q_m on T_m, m ≤ n, and hence to Q_B on T_B after relabelling via the increasing bijection [m] → B, for all B ⊂ N with #B = m, such that conditionally given a first split Π_n = (B_1, …, B_k), the subtrees above the first split are independent, and the ith subtree T_{n,B_i} has conditional distribution Q_{B_i}, 1 ≤ i ≤ k. For the induction step, note that conditionally given Π_n = (B_1, …, B_k), the tree growth step from n to n + 1 specifies Q_{n+1} on each of the events G_{n,i}:
• G_{n,0}: here, Π_{n+1} = ([n], {n + 1}) is not related to Π_n; we will get back to this;
• G_{n,i}, 1 ≤ i ≤ k: here, Π_{n+1} = (B_1, …, B_{i−1}, B_i ∪ {n + 1}, B_{i+1}, …, B_k); regenerative growth in the ith subtree preserves the conditional independence of subtrees, and the induction hypothesis also yields that T_{n+1,B_j} = T_{n,B_j} has conditional distribution Q_{B_j} for j ≠ i, while T_{n+1,B_i∪{n+1}} has conditional distribution Q_{B_i∪{n+1}} obtained from Q_{B_i} via the growth rule applied to B_i with #B_i ≤ n − 1.
Also, (8) holds and determines P(T n = t | T n−1 ) as required, since T n determines T n−1 .
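The recursive structure of this proof can be mirrored in code. The sketch below (our own toy rule, not one of the paper's named models) grows a leaf-labelled tree step by step: conditionally on the first split (B_1, …, B_k) of a subtree of size n, the new label starts a new first branch point with probability 1/(n + 1) (event G_{n,0}) and otherwise enters subtree i with probability |B_i|/(n + 1), where it is inserted by the same rule (event G_{n,i}).

```python
# Toy regenerative tree growth; trees are nested lists, leaves are ints.
import random

def size(t):
    return 1 if isinstance(t, int) else sum(size(c) for c in t)

def leaves(t):
    return [t] if isinstance(t, int) else [x for c in t for x in leaves(c)]

def grow(t, new_label):
    n = size(t)
    if isinstance(t, int) or random.random() < 1 / (n + 1):
        return [t, new_label]                 # event G_{n,0}: first split ([n], {n+1})
    r = random.random() * n                   # else subtree i with prob |B_i| / n
    out = list(t)
    for i, c in enumerate(out):
        if r < size(c):
            out[i] = grow(c, new_label)       # event G_{n,i}: recurse into subtree i
            break
        r -= size(c)
    return out

random.seed(3)
t = 1
for label in range(2, 9):
    t = grow(t, label)
print(sorted(leaves(t)))                      # labels 1..8 all present
```

The recursion applies the same rule inside each subtree relative to its own label set, which is exactly the regenerative property established in the proof.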

B Proof of Lemma 25
We consider the set c_0 = {(s_1, s_2, …) ∈ [0, 1]^N : lim_{i→∞} s_i = 0}, which is equipped with the uniform norm ||·||_∞. This set is clearly closed when considered as a subset of ℓ∞ and thus is a complete metric space. Let F: c_0 → c_0 be the map defined by F(s) = s↓, that is, F is the map that takes a sequence to its non-increasing rearrangement. Our first step is to prove that F is continuous, since this immediately implies that if |Γ^{[n]}| converges uniformly, say to (y_i)_{i≥1}, then |Γ^{[n]}|↓ converges to the non-increasing rearrangement of (y_i)_{i≥1}. Fix ε > 0 and s ∈ c_0. Without loss of generality, we may assume that ε < sup_i s_i and ε ∉ {s_i, i ≥ 1}. Let B_s = {i ≥ 1 : s_i > ε}. The fact that s ∈ c_0 implies that #B_s < ∞. Observe that F(s) is equal to the sequence obtained by concatenating the non-increasing rearrangement of (s_i : i ∈ B_s) with the non-increasing rearrangement of (s_i : i ∉ B_s). Suppose that s_n → s. For sufficiently large n we have B_{s_n} = B_s. Since ranking is continuous on R^{#B_s}, it follows that lim_{n→∞} (s↓_{n,1}, …, s↓_{n,#B_s}) = (s↓_1, …, s↓_{#B_s}), and also that sup_{i>#B_s} s↓_i + lim sup_{n→∞} sup_{i>#B_s} s↓_{n,i} ≤ 2ε.
As a result we have lim sup_{n→∞} ||F(s) − F(s_n)||_∞ ≤ 2ε, and the continuity of F follows. We now prove the opposite direction. To that end, assume that |Γ^{[n]}|↓ → (x_i)_{i≥1} pointwise. We will prove that |Γ^{[n]}| converges in c_0; the proof of the previous part then identifies the limit. Since |Γ^{[n]}|↓ is non-increasing for each n, with sums uniformly bounded by 1, this implies that |Γ^{[n]}|↓ → (x_i)_{i≥1} uniformly. If x_1 = 0 we are done, so we assume that x_1 > 0. Let ε > 0 be given, and without loss of generality suppose that ε < x_1. By Fatou's lemma we have Σ_{i≥1} x_i ≤ 1 < ∞ and, consequently, we can choose K so that Σ_{i≥K+1} x_i < ε. Let ε_1 ∈ (0, ε) be small enough for the estimates below, in particular smaller than every non-zero difference |x_i − x_j|, 1 ≤ j ≤ K, i ≥ 1. Since |Γ^{[n]}|↓ → (x_i)_{i≥1} uniformly, we can choose N > 1/ε_1 such that sup_i ||Γ^{[n]}|↓_i − x_i| < ε_1 for all n ≥ N. For each n ≥ N let σ_n: N → N be a bijection such that (|Γ^{[n]}|_{σ_n(i)})_{i≥1} = |Γ^{[n]}|↓. Note that we have used the fact that |Γ^{[n]}| has only finitely many non-zero entries to obtain this bijection. Since N > 1/ε_1, for all i ≥ 1 and n ≥ N we have ||Γ^{[n]}|_i − |Γ^{[n+1]}|_i| ≤ 1/(n + 1) < ε_1.
By our choice of ε_1, for any 1 ≤ j ≤ K and any i ≥ 1 such that x_j ≠ x_i, the corresponding entries are separated by more than ε_1. Combining these estimates, we see that for n ≥ N the rearranging bijections can only interchange indices whose entries are within a few multiples of ε_1 of each other, so that sup_i ||Γ^{[n]}|_i − |Γ^{[N]}|_i| is small. We are not quite done, since σ_N depends on ε_1. Note, however, that the above inequality implies for n, m ≥ N that sup_i ||Γ^{[n]}|_i − |Γ^{[m]}|_i| is bounded by a constant multiple of ε. This shows that (|Γ^{[n]}|)_{n≥1} is a Cauchy sequence in the complete metric space c_0 and, therefore, converges uniformly.
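The decomposition used in the continuity argument is easy to check numerically. The sketch below (our own illustration; the sequence s is arbitrary) verifies that ranking the entries above a level ε and the entries below it separately, then concatenating, gives the same result as ranking the whole finitely supported sequence, and that a small uniform perturbation moves F(s) only a little.

```python
# The non-increasing rearrangement F(s) = s_down and the head/tail split.

def F(s):
    return sorted(s, reverse=True)   # non-increasing rearrangement

s = [0.4, 0.05, 0.3, 0.02, 0.15, 0.08, 0.0, 0.0]
eps = 0.1
head = sorted([x for x in s if x > eps], reverse=True)    # entries in B_s
tail = sorted([x for x in s if x <= eps], reverse=True)   # entries outside B_s
print(F(s) == head + tail)   # True: the concatenation IS the ranked sequence

# continuity check: a 1% perturbation moves F(s) by at most 1% of each entry
d = max(abs(a - b) for a, b in zip(F(s), F([x * 1.01 for x in s])))
print(d < 0.01)              # True
```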