Local limits of Markov Branching trees and their volume growth

We are interested in the local limits of families of random trees that satisfy the Markov branching property, which is fulfilled by a wide range of models. Loosely, this property entails that given the sizes of the sub-trees above the root, these sub-trees are independent and their distributions only depend upon their respective sizes. The laws of the elements of a Markov branching family are characterised by a sequence of probability distributions on the sets of integer partitions which describes how the sizes of the sub-trees above the root are distributed. We prove that under some natural assumption on this sequence of probabilities, as their sizes go to infinity, the trees converge in distribution to an infinite tree which also satisfies the Markov branching property. Furthermore, when this infinite tree has a single path from the root to infinity, we give conditions to ensure its convergence in distribution, under appropriate rescaling of its distance and counting measure, to a self-similar fragmentation tree with immigration. In particular, this allows us to determine how, in this infinite tree, the "volume" of the ball of radius $R$ centred at the root asymptotically grows with $R$. Our unified approach will allow us to develop various new applications, in particular to different models of growing trees and cut-trees, and to recover known results. An illustrative example lies in the study of Galton-Watson trees: the distribution of a critical Galton-Watson tree conditioned on its size converges to that of Kesten's tree when the size grows to infinity. If, furthermore, the offspring distribution has finite variance, then under adequate rescaling, Kesten's tree converges to Aldous' self-similar CRT and the total size of the first $R$ generations asymptotically behaves like $R^2$.


Introduction
The focus of this work is to study the asymptotic behaviour of sequences of random trees which satisfy the Markov branching property, first introduced by Aldous in [6, Section 4] and later extended, for example, in [17, 30, 31]. See Haas [28] for an overview of this general model and Lambert [40] for applications to models used in evolutionary biology. Our study will therefore encompass various models, such as Galton-Watson trees conditioned on their total progeny or their number of leaves, certain models of cut-trees (see Bertoin [12, 13, 14]) or recursively built trees (see Rémy [45], Chen, Ford and Winkel [19], Haas and Stephenson [32]), as well as models of phylogenetic trees (Ford's α-model [24] and Aldous' β-splitting model [6]).
Informally, a sequence (T n ) n of random trees satisfies the Markov branching property if for all n, T n has "size" n, and conditionally on the event "T n has p sub-trees above its root with respective sizes n 1 ≥ · · · ≥ n p ", these sub-trees are independent and for each i = 1, . . . , p, the i-th largest sub-tree is distributed like T ni . The sequence of distributions of (T n ) n is characterised by a family q = (q n ) n of probability distributions, referred to as "first-split distributions" (see the next paragraph), where q n is supported by the set of partitions of the integer n. We will detail two different constructions of Markov branching trees corresponding to a given sequence q for two different notions of size: the number of leaves or the number of vertices.
Let (q n ) n be a sequence of first-split distributions. A tree with n leaves with distribution in the associated Markov branching family is built with the following process.
Consider a cluster of n identical particles and, with probability q n (λ 1 , . . . , λ p ), split it into p smaller clusters containing λ 1 , . . . , λ p particles respectively. For each i = 1, . . . , p, independently of the other sub-clusters, split the i-th cluster according to q λi . When a sub-cluster contains only 1 particle, either let it give birth to a new sub-cluster which also contains a single particle, with probability q 1 (1) < 1, or let the particle "die", with probability 1 − q 1 (1). Repeat this procedure until all the particles are dead. The genealogy of these splits may be encoded as a tree with n leaves (which correspond to the deaths of the particles). We will denote by MB L,q n the distribution of such a tree.

Figure 1: Example of a tree with 7 leaves (in red) and first split equal to (5, 2).
A Markov branching tree with a given number of vertices, say n, is built with a slightly different procedure and we will call MB q n its distribution. Section 2.2.1 will rigorously detail the constructions of both MB q n and MB L,q n . Rizzolo [46] considered a more general notion of size and described the construction of corresponding Markov branching trees.
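The recursive splitting procedure described above translates directly into code. The following Python sketch encodes trees as nested lists (a leaf is `[]`) and assumes a user-supplied sampler `sample_split(n)` drawing a partition of n from q n ; the encoding, the helper names and the toy first-split family are ours, for illustration only.

```python
import random

def sample_mb_leaf_tree(n, sample_split):
    """Sample a tree with n leaves from MB^{L,q}_n (nested-list encoding).

    sample_split(n) must draw a partition (lambda_1, ..., lambda_p) of n
    from q_n; for n == 1 it returns (1,) with probability q_1(1) and the
    empty tuple otherwise, in which case the particle dies and the
    current vertex is a leaf.
    """
    parts = sample_split(n)
    if not parts:  # the lone particle dies: this vertex is a leaf
        return []
    # split into sub-clusters and fragment each one independently
    return [sample_mb_leaf_tree(p, sample_split) for p in parts]

def count_leaves(t):
    """Number of leaves of a nested-list tree."""
    return 1 if not t else sum(count_leaves(c) for c in t)

# Toy first-split family (hypothetical, not from the paper): a cluster of
# n >= 2 particles splits uniformly into two non-empty blocks; a single
# particle dies immediately, i.e. q_1(1) = 0.
def toy_split(n):
    if n == 1:
        return ()
    k = random.randint(1, n - 1)
    return (max(k, n - k), min(k, n - k))

t = sample_mb_leaf_tree(7, toy_split)
```

Whatever the first-split family, the sampled tree has exactly n leaves by construction, since each of the n initial particles dies exactly once.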
One way of looking at the behaviour of large trees is through the local limit topology.
For a given tree t and R ≥ 0, we denote by t| R the subset of vertices of t at graph distance at most R from its root. We will say that a sequence t n converges locally to a limit tree t ∞ if for any radius R, t n | R = t ∞ | R for sufficiently large n. There is considerable literature on the study of the local limits of certain classes of random trees or, more generally, of graphs. For instance, see Abraham and Delmas [1, 2].

EJP 22 (2017), paper 95.

Background on trees
Let U := ∪ n≥0 N n be the set of finite words on N, with the conventions N = {1, 2, 3, . . . } and N 0 = {∅}. We then call a plane tree or ordered rooted tree any non-empty subset t ⊂ U such that:
• The empty word ∅ belongs to t; it will be thought of as its "root",
• If u = (u 1 , . . . , u n ) is in t, then its parent pr(u) := (u 1 , . . . , u n−1 ) is also in t,
• For all u in t, there exists a finite integer c u (t) ≥ 0 such that ui := (u 1 , . . . , u n , i) is in t iff 1 ≤ i ≤ c u (t). We will say that c u (t) is the number of children of u in t.
Let T ord be the set of plane trees. Observe that if t is an infinite plane tree, this definition requires the number of children of each of its vertices to be finite.
Plane trees are endowed with a total order which is of limited interest to us. Because of this, we define an equivalence relation on T ord to allow us to consider as identical two trees which have the same "shape" but different vertex orderings.
Say that two plane trees t and t′ are equivalent (written t ∼ t′) iff there exists a bijection σ : t → t′ such that σ(∅) = ∅ and for all u ∈ t \ {∅}, pr[σ(u)] = σ[pr(u)]. Finally, set T := T ord / ∼. From now on, unless otherwise stated, we will only consider unordered trees, i.e. by "tree" we will mean an element of T.
Let t be a tree. We say that a vertex u on t is a leaf if it has no children, i.e. if c u (t) = 0. Define #t as the total number of vertices of t and # L t as its number of leaves. For any positive integer n, let T n and T L n be the sets of finite trees with n vertices and n leaves respectively. Moreover, write T ∞ for the set of infinite trees.
We will use the following operations on trees:
• Let t 1 , . . . , t d be trees; their concatenation is the tree [[t 1 , . . . , t d ]] obtained by attaching each of their respective roots to a new common root, see Figure 2,
• Let t and s be two trees and u be a vertex of t; set t ⊗ (u, s) the grafting of s on t at u, i.e. the tree obtained by glueing the root of s on u, see Figure 3,
• Fix a tree t, a non-repeating family (u i ) i∈I of vertices of t, and a family of trees (s i ) i∈I ; let t ⊗ i∈I (u i , s i ) be the tree obtained by grafting s i on t at u i for each i in I.
For all n ≥ 0, let b n be the branch of length n, i.e. the tree with n + 1 vertices, exactly one of which is a leaf. Similarly, define the infinite branch b ∞ and let (v n ) n≥0 be its vertices, where v 0 is its root and for all n ≥ 0, v n = pr(v n+1 ).
The local limit topology If t is a tree, we may endow it with the graph distance d gr where for all u and v in t, d gr (u, v) is defined as the number of edges in the shortest path between u and v. For any non-negative integer R, we will write t| R for the closed ball of radius R centred at the root of t, that is the tree t| R := {u ∈ t : d gr (∅, u) ≤ R}.
The local distance between two given trees t and s is defined as
d loc (t, s) := exp ( − sup{R ≥ 0 : t| R = s| R } ).
The function d loc is an ultra-metric on T and the resulting metric space (T, d loc ) is Polish.
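With a nested-list encoding of finite trees (a leaf is `[]`), the ball t| R and the local distance can be computed directly. The following sketch is ours, not the paper's: it compares trees as unordered trees by recursively sorting children into a canonical form.

```python
import math

def canon(t):
    """Canonical form of a nested-list tree: sort children recursively,
    so two trees are equal as unordered trees iff their forms coincide."""
    return sorted((canon(c) for c in t), key=repr)

def restrict(t, R):
    """The ball t|_R: keep only vertices at graph distance <= R from the root."""
    return [] if R == 0 else [restrict(c, R - 1) for c in t]

def d_loc(t, s):
    """Local distance exp(-sup{R >= 0 : t|_R = s|_R}) between finite trees."""
    if canon(t) == canon(s):
        return 0.0  # the sup is infinite for equal finite trees
    R = 0
    while canon(restrict(t, R + 1)) == canon(restrict(s, R + 1)):
        R += 1
    return math.exp(-R)
```

For instance, the branches b 2 and b 3 agree up to radius 2 and differ at radius 3, so their local distance is e^{-2}; two plane trees with the same shape but different child orderings are at distance 0.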
The following well-known criterion for convergence in distribution with respect to the local limit topology will be useful. See for instance [2, Section 2.2] for a proof (which relies on [16,Theorem 2.3] and the fact that d loc is an ultra-metric).
Lemma 2.1. Let T n , n ≥ 1 and T be T-valued random variables. Then, T n → T in distribution with respect to d loc iff for all t ∈ T and R ≥ 0, P[T n | R = t| R ] → P[T | R = t| R ] as n tends to infinity.

Partitions of integers
As discussed in the Introduction, Markov branching trees are closely related to "partitions of integers". This section thus aims to introduce a few notions on these objects which will be useful for our forthcoming purposes.
We endow P with an ultra-metric distance defined similarly to d loc . For all λ and µ in P, let
d P (λ, µ) := exp ( − sup{K ≥ 0 : λ ∧ K = µ ∧ K} ).
(i) The function d P is an ultra-metric on P.
(ii) The metric space (P, d P ) is Polish.
Proof. (ii) Observe that P ⊂ ∪ n≥0 (N ∪ {∞}) n and is as a result both countable and separable. Therefore, it only remains to show that it is complete.
Let (λ n ) n be a Cauchy sequence with respect to d P . By assumption, there exists an increasing sequence (n K ) K such that for all K ≥ 0, λ n ∧ K = λ m ∧ K when n, m ≥ n K . In particular, there exists a constant p ≥ 0 such that p(λ nK ) = p for all K. Furthermore, notice that for all i = 1, . . . , p, the sequence [λ nK (i) ∧ K] K is non-decreasing. For each i = 1, . . . , p, set λ(i) := sup K λ nK (i) ∧ K ≤ ∞. Clearly, λ := [λ(1), . . . , λ(p)] is in P and is such that d P (λ n , λ) → 0 when n → ∞. This proves that (P, d P ) is indeed complete.

Lemma 2.4. Let (Λ n ) n≥1 and Λ be P-valued random variables. Then, Λ n converges to Λ in distribution with respect to d P iff for all λ in P <∞ and all K ≥ 0, we have P[Λ n ∧ K = λ ∧ K] → P[Λ ∧ K = λ ∧ K] as n → ∞.
Proof. Use the same arguments as in the proof of Lemma 2.1 (recall that d P is an ultra-metric and use [16, Theorem 2.3]).

Remark 2.5. Elements of P <∞ are closely related to elements of T. Indeed, if t is a finite tree which can be written as the concatenation of p trees t 1 , . . . , t p , i.e. t = [[t 1 , . . . , t p ]], then the decreasing rearrangement of #t 1 , . . . , #t p is a partition of n when t has n + 1 vertices (the root plus n descendants). We will write Λ(t) := (#t 1 , . . . , #t p ) ↓ , where (x 1 , . . . , x p ) ↓ stands for the decreasing rearrangement of (x 1 , . . . , x p ), and call Λ(t) the partition at the root or first split of t.
Similarly, if we consider leaves instead of vertices, then Λ L (t) := (# L t 1 , . . . , # L t p ) ↓ is a partition of n when t has n leaves.
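The first split Λ(t), and its leaf-counting analogue Λ L (t), are straightforward to compute in a nested-list encoding of finite trees; the following sketch (the encoding and names are ours) returns the decreasing rearrangement of the sizes of the sub-trees above the root.

```python
def count_vertices(t):
    """Total number of vertices of a nested-list tree (root included)."""
    return 1 + sum(count_vertices(c) for c in t)

def count_leaves(t):
    """Number of leaves; a childless root counts as one leaf."""
    return 1 if not t else sum(count_leaves(c) for c in t)

def first_split(t, size=count_vertices):
    """Partition at the root: decreasing rearrangement of the sizes of
    the sub-trees above the root. With size=count_vertices this is
    Lambda(t); with size=count_leaves it is Lambda_L(t)."""
    return tuple(sorted((size(c) for c in t), reverse=True))
```

For a tree with n + 1 vertices, the vertex-counting first split is a partition of n, while the leaf-counting one is a partition of the number of leaves, as in Remark 2.5.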
In this article, we will often have to consider sequences of random partitions Λ n ∈ P n that weakly converge to a limit partition Λ ∞ ∈ P ∞ such that m ∞ (Λ ∞ ) = 1 a.s. In this particular setting, the weak convergence can be characterised as follows.
Lemma 2.6. For all 1 ≤ n ≤ ∞, let q n be a probability measure on P n and assume that q ∞ (m ∞ = 1) = 1. Then, q n ⇒ q ∞ with respect to d P iff for all λ in P <∞ , writing |λ| for the sum of the blocks of λ, we have q n (n − |λ|, λ) → q ∞ (∞, λ) as n → ∞.
As a result and thanks to Lemma 2.4, we get that q n ⇒ q ∞ .

Finite Markov branching trees
We will now follow [30, Section 1.2] and define two types of families of probability measures on the set of finite unordered rooted trees satisfying the Markov branching property discussed in the Introduction.
Informally, for a given sequence q = (q n ) of probability measures respectively supported by P n (referred to as "first-split distributions" in the Introduction), we want to define a sequence MB q = (MB q n ) n of probability measures on the set of finite trees where:
• For all n, MB q n is supported by the set of trees with n vertices,
• A tree T with distribution MB q n is such that:
  - The decreasing rearrangement Λ(T ) of the sizes of the sub-trees above its root is distributed according to q n−1 ,
  - Conditionally on Λ(T ) = (λ 1 , . . . , λ p ), the p sub-trees of T above its root are independent with respective distributions MB q λ1 , . . . , MB q λp .
Similarly, if q = (q n ) n is a sequence of probability measures respectively on P n , we will define a sequence MB L,q satisfying the same Markov branching property where we count leaves instead of vertices to measure the size of a tree.
Markov branching tree with n vertices First of all, fix N , an infinite subset of N with 1 ∈ N . This set will index the possible numbers of vertices of the trees we want to generate. Let q = (q n−1 ) n∈N be a sequence of probability measures such that q 0 (∅) = 1, and for all n in N with n ≥ 2, q n−1 is supported by the set {λ ∈ P n−1 : λ i ∈ N , i = 1, . . . , p(λ)}.
Remark 2.7. This last condition comes from the fact that if T is distributed according to MB q n , the blocks of Λ(T ) need to be in N because the distributions of the corresponding sub-trees belong to the family (MB q k ) k∈N .

We now detail a recursive construction for MB q . Let MB q 1 ({∅}) = 1 and, for n ≥ 2 in N , proceed by induction as follows:
• Let Λ have distribution q n−1 ,
• Conditionally on Λ = (λ 1 , . . . , λ p ) ∈ P n−1 , let (T 1 , . . . , T p ) be independent random trees such that T i is distributed according to MB q λi for each 1 ≤ i ≤ p,
• Let MB q n be the distribution of the concatenation [[T 1 , . . . , T p ]].

Markov branching tree with n leaves Similarly, fix an infinite subset N of N such that 1 ∈ N (corresponding to the possible numbers of leaves of the trees we will generate) and let q = (q n ) n∈N be such that:
• For all n > 1 in N , q n is a probability measure supported by the set {λ ∈ P n : λ i ∈ N , i = 1, . . . , p(λ)}.
To define MB L,q , we will proceed by the same recursive method used for MB q : first choose how the size is split between the children sub-trees of the root, and then generate the said sub-trees adequately. However, if for some n in N we have q n (n) = 1, the recursion will be endless. For this reason, we also require that for all n in N , q n (n) < 1 (i.e. with positive probability, a tree "splits" into smaller trees).
Let MB L,q 1 be the distribution of a branch of geometric length with parameter 1 − q 1 (1), i.e. MB L,q 1 (b k ) = q 1 (1) k [1 − q 1 (1)] for all k ≥ 0. For n > 1, we do as follows:
• Let T 0 be a branch with geometric length with parameter 1 − q n (n) and call U its leaf,
• Let Λ have distribution q n conditioned on the event {m n = 0},
• Conditionally on Λ = (λ 1 , . . . , λ p ), let (T 1 , . . . , T p ) be independent random trees respectively distributed according to MB L,q λi for 1 ≤ i ≤ p,
• Graft the concatenation of these trees on the leaf U of T 0 , i.e. set T := T 0 ⊗ (U, [[T 1 , . . . , T p(Λ) ]]), and let MB L,q n be the distribution of T .
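The recursive constructions above can be sketched in code in the same style: sample the first split, generate the sub-trees independently, and concatenate them. Here is a hypothetical Python version for the vertex-count family MB q (nested-list encoding; `sample_split` and the toy first-split family are our own illustrations, not from the paper).

```python
import random

def sample_mb_tree(n, sample_split):
    """Sample a tree with n vertices from MB^q_n (nested-list encoding).

    sample_split(m) must draw a partition of m from q_m for m >= 1;
    since q_0 is the Dirac mass at the empty partition, a tree of
    size 1 is a single root with no children.
    """
    assert n >= 1
    if n == 1:
        return []  # q_0(empty partition) = 1: the root is a leaf
    parts = sample_split(n - 1)  # the first split follows q_{n-1}
    return [sample_mb_tree(p, sample_split) for p in parts]

def count_vertices(t):
    """Total number of vertices of a nested-list tree (root included)."""
    return 1 + sum(count_vertices(c) for c in t)

# Toy first-split family (hypothetical): with probability 1/2 keep a
# single block of size m, otherwise split m uniformly into two blocks.
def toy_split(m):
    if m == 1 or random.random() < 0.5:
        return (m,)
    k = random.randint(1, m - 1)
    return (max(k, m - k), min(k, m - k))
```

Since the blocks of a partition of n − 1 sum to n − 1, an induction on n shows the sampled tree always has exactly n vertices.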

Infinite Markov branching trees
Using the same principle as before (split the number of vertices above the root and generate independent sub-trees with corresponding sizes), we will define a probability measure supported by the set of infinite trees which satisfies a version of the Markov branching property. Let N and q = (q n−1 ) n∈N satisfy the conditions exposed in the construction of the sequence MB q . In order to lighten notations, for any finite decreasing sequence of integers λ = (λ 1 , . . . , λ p ), we define MB q λ as the distribution of the concatenation of independent MB q λi -distributed trees. More precisely:
• Let MB q ∅ be the Dirac measure on the tree with a single vertex (its root), namely MB q ∅ = δ {∅} ,
• For any λ ∈ P <∞ with p = p(λ) > 0 and λ i ∈ N for i = 1, . . . , p, let (T 1 , . . . , T p ) be independent trees with respective distributions MB q λi for i = 1, . . . , p, and set MB q λ as the distribution of their concatenation.
Consider q ∞ , a probability measure on P ∞ supported by the set {λ ∈ P ∞ : λ i ∈ N ∪ {∞}, i = 1, . . . , p(λ)}, and let Λ follow q ∞ . Let T • be a Galton-Watson tree with offspring distribution the law of m ∞ (Λ). Conditionally on T • , let (Λ u , T u ) u∈T • be independent pairs such that:
• Λ u has the same distribution as Λ conditioned on the event m ∞ (Λ) = c u (T • ),
• Conditionally on Λ u , the tree T u is distributed according to MB q λ̃ , where λ̃ denotes the sequence of finite blocks of Λ u .
Then, for every vertex u in T • , graft the corresponding tree T u on T • at u. Let T be the tree hence obtained, i.e. set T := T • ⊗ u∈T • (u, T u ), and write MB q,q∞ ∞ for its distribution.

Remark 2.8.
• Suppose that q ∞ (m ∞ = 1) = 1. In this case, the construction of MB q,q∞ ∞ is much simpler: the tree T • is simply the infinite branch and the family (Λ vn , T vn ) n≥0 is i.i.d. In particular, T a.s. has a unique infinite spine, i.e. a unique infinite non-backtracking path originating from the root.
• A tree T with distribution MB q,q∞ ∞ satisfies the Markov branching property: conditionally on Λ(T ), the sub-trees of T above its root are independent and their respective distributions are either MB q,q∞ ∞ or in the family (MB q n ) n∈N , depending on their sizes.
• The same exact construction can be used to define a measure MB L,q,q∞ ∞ .
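In the single-spine case of the first point of Remark 2.8, the ball of radius R of T can be generated directly: walk down the infinite branch and graft independent finite trees at each spine vertex, truncating everything at height R. Here is a sketch with our own nested-list encoding and a hypothetical sampler `sample_graft()`, which draws the finite sub-trees grafted beside the spine at one vertex.

```python
def spine_ball(R, sample_graft):
    """Ball of radius R around the root of the infinite Markov branching
    tree in the single-spine case: an infinite branch v_0, v_1, ...
    carrying i.i.d. grafts, truncated at height R.

    sample_graft() returns a list of finite nested-list trees, the
    sub-trees hanging off one spine vertex beside the spine.
    """
    def restrict(t, r):
        # keep only vertices within graph distance r of t's root
        return [] if r == 0 else [restrict(c, r - 1) for c in t]

    ball = []  # children of the spine vertex v_R (all truncated away)
    for depth in range(R - 1, -1, -1):  # v_{R-1} down to the root v_0
        # a graft at v_depth may only extend to height R - depth - 1
        side = [restrict(g, R - depth - 1) for g in sample_graft()]
        ball = [ball] + side  # the first child continues the spine
    return ball

def height(t):
    """Height (maximal depth) of a nested-list tree."""
    return 0 if not t else 1 + max(height(c) for c in t)
```

By construction the ball always reaches height exactly R along the spine, whatever the graft distribution.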

Local limits of Markov-branching trees
Let q be the sequence of first-split distributions associated to a Markov branching family MB q (respectively MB L,q ). Suppose q ∞ is a probability measure on P ∞ supported by the set of sequences λ such that for all i = 1, . . . , p(λ), λ i is either infinite or in N . The aim of this section is to expose suitable conditions on q and q ∞ such that MB q n converges weakly to MB q,q∞ ∞ (or MB L,q n ⇒ MB L,q,q∞ ∞ ) for the local limit topology.

Theorem 2.9. Suppose that when n goes to infinity, q n converges weakly to q ∞ with respect to the topology induced by d P . Then, with respect to d loc , MB q n ⇒ MB q,q∞ ∞ (respectively MB L,q n ⇒ MB L,q,q∞ ∞ ).
In many cases, the infinite trees we will consider will have a unique infinite spine, which corresponds to q ∞ (m ∞ = 1) = 1 and the particular construction mentioned in Remark 2.8. In this situation, we may use Theorem 2.9 alongside Lemma 2.6 to get the following corollary.
Corollary 2.10. Assume that q ∞ is such that q ∞ (m ∞ = 1) = 1 and suppose that for any λ in P <∞ we have q n (n − |λ|, λ) → q ∞ (∞, λ). Then, MB q n ⇒ MB q,q∞ ∞ (or MB L,q n ⇒ MB L,q,q∞ ∞ ) with respect to the local limit topology.
Proof of Theorem 2.9. For all n in N ∪ {∞}, let T n follow MB q n and Λ n−1 follow q n−1 . To prove this theorem, we will use Lemma 2.1 and proceed by induction on R. First, it clearly holds that for every tree t, t| 0 = {∅} = T n | 0 = T ∞ | 0 a.s.
For the induction step, write t| R+1 as the concatenation of d sub-trees t 1 , . . . , t d above the root, and compare them with the sub-trees of T n above its root up to permutation, where S d denotes the set of permutations of {1, . . . , d}. There exists a subset S of S d such that for any σ ∈ S d there is a unique τ ∈ S satisfying t σ·i = t τ ·i as elements of T for all i = 1, . . . , d. Observe that S only depends on t and the (arbitrary) labelling of its sub-trees. We may then express P[T n | R+1 = t| R+1 ] thanks to the Markov branching property. Our induction assumption implies in particular that for all i = 1, . . . , d and s in T with height R or less, the function P → [0, 1], λ → P[T λi | R = s] 1 p(λ)=d is continuous. As a result, P[T n | R+1 = t| R+1 ] may be expressed as the integral against q n−1 of a finite sum of bounded continuous functions. Therefore, since q n−1 ⇒ q ∞ , we get P[T n | R+1 = t| R+1 ] → P[T ∞ | R+1 = t| R+1 ], which completes the induction. We proceed in the same way to prove the claim on MB L,q trees.
In the next proposition, we prove that the condition "q n ⇒ q ∞ " in Theorem 2.9 is optimal for MB q trees.

Proposition 2.11. Let q = (q n−1 ) n∈N be the sequence of first-split distributions associated to a family MB q of Markov branching trees with given numbers of vertices. If there exists a probability measure q ∞ on P ∞ such that MB q n converges weakly to MB q,q∞ ∞ for the local limit topology, then q n−1 ⇒ q ∞ in the sense of the d P topology.
Proof. Observe that for all K ≥ 0 and t, s ∈ T, if t| K = s| K then Λ(t) ∧ K = Λ(s) ∧ K. As a result, d P [Λ(t), Λ(s)] ≤ d loc (t, s) which proves in particular that Λ : T → P is a continuous function. Consequently, since for all possibly infinite n, Λ(T n ) has distribution q n−1 , in the sense of the d P topology we have q n−1 ⇒ q ∞ when n → ∞.

Background on scaling limits
In this section, we will introduce the framework needed to consider the scaling limits of both finite and infinite Markov branching trees as well as the corresponding limiting objects: self-similar fragmentation trees with or without immigration. Afterwards, we will also give a few useful results on point processes related to our models of trees.

R-trees and the GHP topology
To talk about scaling limits of discrete trees, we need to introduce a continuous analogue. We use the framework of R-trees. An R-tree (or real tree) is a metric space (T, d) such that for all x and y in T :
• there is a unique isometric map φ x,y : [0, d(x, y)] → T such that φ x,y (0) = x and φ x,y [d(x, y)] = y,
• for every continuous injective map φ : [0, 1] → T with φ(0) = x and φ(1) = y, the image of φ coincides with that of φ x,y .
This roughly means that any two points in an R-tree can be continuously joined by a single path, up to its reparametrisation, which is akin to the acyclic nature of discrete trees.
To compare two such objects, we will use the Gromov-Hausdorff-Prokhorov distance. More precisely, we will follow the definition from [4] and extend it in a way similar to that of [3].
For any metric space (X, d) let M f (X) be the set of all finite non-negative Borel measures on X and M(X) be the set of all non-negative and boundedly finite Borel measures on X, i.e. non-negative Borel measures µ on X such that µ(A) < ∞ for all measurable bounded A ⊂ X.
A pointed metric space is a 3-tuple (X, d, ρ) where (X, d) is a metric space and ρ ∈ X is a fixed point, which we will call its root. For any x ∈ X, set |x| := d(ρ, x) the height of x in (X, d, ρ), and let |X| := sup x∈X |x| be the height of X.
We will call pointed weighted metric space any 4-tuple X = (X, d, ρ, µ) where (X, d) is a metric space, ρ ∈ X is its root and µ is a boundedly finite Borel measure on X.
Remark 3.1. If X is a pointed weighted metric space, we will implicitly write X = (X, d X , ρ X , µ X ) unless otherwise stated.
Two pointed weighted metric spaces X and Y will be called GHP-isometric if there exists a bijective isometry Φ : X → Y such that Φ(ρ X ) = ρ Y and the push-forward of µ X by Φ is µ Y . Let K be the set of GHP-isometry classes of compact pointed weighted metric spaces.

Comparing compact metric spaces
Let X and Y be two pointed weighted compact metric spaces. A correspondence between X and Y is a measurable subset C of X × Y which contains (ρ X , ρ Y ) and such that for any x ∈ X there exists y ∈ Y with (x, y) ∈ C and, conversely, for any y ∈ Y there is x ∈ X such that (x, y) ∈ C. We will denote by C(X, Y) (or C(X, Y ) with a slight abuse of notation) the set of all pointed correspondences between X and Y. For any C ∈ C(X, Y), its distortion is defined as
dis X,Y C := sup { |d X (x, x′) − d Y (y, y′)| : (x, y), (x′, y′) ∈ C }.
When the setting is clear, we will simply write dis C := dis X,Y C. Observe that dis C ≤ 2 (|X| ∨ |Y |) < ∞ and that dis C ≥ | |X| − |Y | |.
For any finite Borel measure π on X × Y , we define its discrepancy with respect to µ X and µ Y as
D(π; µ X , µ Y ) := ‖µ X − p X ∗ π‖ TV + ‖µ Y − p Y ∗ π‖ TV ,
where ‖ · ‖ TV is the total variation norm, p X : (x, y) ∈ X × Y → x and p Y : (x, y) ∈ X × Y → y are the canonical projections from X × Y to X and Y respectively, and p X ∗ π, p Y ∗ π denote the corresponding push-forwards of π. The definition of the total variation norm and the triangular inequality give D(π; µ X , µ Y ) ≥ |µ X (X) − µ Y (Y )|. Following [4, Section 2.1], we define the Gromov-Hausdorff-Prokhorov distance (or GHP distance for short) between two pointed weighted compact metric spaces X and Y as
d GHP (X, Y) := inf { (1/2) dis C + D(π; µ X , µ Y ) + π(C c ) : C ∈ C(X, Y), π ∈ M f (X × Y ) }.
In particular, | |X| − |Y | | ∨ |µ X (X) − µ Y (Y )| ≤ 2 d GHP (X, Y). Therefore, the functions K → R + , X → |X| and X → µ X (X) are both continuous with respect to d GHP .
As was mentioned in [4, Section 2.1], d GHP is a well-defined distance on K and (K, d GHP ) is both complete and separable and thus, Polish. Furthermore, it was also noted that d GHP gives rise to the same topology as the GHP distance defined in [3].
Rescaling compact metric spaces For all m ≥ 0, let 0 (m) := ({∅}, d, ∅, mδ ∅ ) ∈ K be the degenerate metric space only made out of its root, on which a mass m is put. For a pointed weighted metric space X and any non-negative real numbers a and b, we will write (aX, bµ X ) := (X, ad X , ρ X , bµ X ). When X is in K and µ X (X) = m, we will use the convention (0X, µ X ) = 0 (m) (which makes sense since (εX, µ X ) converges to 0 (m) as ε goes to 0 with respect to d GHP ).

Lemma 3.3. Let X and Y be two elements of K. For any non-negative real numbers a, b, c and d:
(i) d GHP ( (aX, bµ X ), (cX, dµ X ) ) ≤ |a − c| |X| + |b − d| µ X (X),
(ii) d GHP ( (aX, bµ X ), (aY, bµ Y ) ) ≤ (a ∨ b) d GHP (X, Y).

Proof. (i) Consider the diagonal correspondence C := {(x, x) : x ∈ X} and let π be the image of bµ X under x → (x, x). Then D(π; bµ X , dµ X ) = |b − d| µ X (X) and π(C c ) = 0.
(ii) For every correspondence C ∈ C(X, Y ), we clearly have dis (aX,bµ X ),(aY,bµ Y ) C = a dis X,Y C. No less clearly, for any finite measure π on X × Y , D(bπ; bµ X , bµ Y ) = b D(π; µ X , µ Y ) and (bπ)(C c ) = b π(C c ), which gives the claimed bound.

Concatenated compact metric spaces Let (X i ) i∈I be a countable family of pointed weighted metric spaces with X i = (X i , d i , ρ i , µ i ). Define (X, d, ρ, µ) where:
• X is the disjoint union of the spaces X i in which the roots ρ i , i ∈ I, are all identified with a single point ρ,
• d(x, y) := d i (x, y) if x and y both belong to X i , and d(x, y) := d i (x, ρ i ) + d j (ρ j , y) if x ∈ X i and y ∈ X j with i ≠ j,
• µ is the measure defined for all Borel sets A by µ(A) := Σ i∈I µ i (A ∩ X i ).
With a slight abuse of notation, we will consider (X, d) to be the quotient metric space X/ ∼ d where x ∼ d y iff d(x, y) = 0. For each i in I, we will also identify X i with its image in X by the quotient map. Write X =: ⟨X i ; i ∈ I⟩.

Remark 3.5. If (T i ) i∈I is a countable family of weighted R-trees, then ⟨T i ; i ∈ I⟩ is clearly an R-tree itself.
Lemma 3.6. Let (X i ) i≥1 be a sequence of elements of K. Then ⟨X i ; i ≥ 1⟩ is an element of K iff the height |X i | of X i goes to 0 as i goes to infinity and Σ i≥1 µ i (X i ) is finite.
Proof. Set X := ⟨X i ; i ≥ 1⟩ and for all x in X and positive r, denote the open ball of X centred at x with radius r by B X (x, r) := {y ∈ X : d X (x, y) < r}. Similarly, for all i ≥ 1 and x ∈ X i , write B i (x, r) := {y ∈ X i : d i (x, y) < r}. Clearly, the measure µ X is finite iff the sum Σ i≥1 µ i (X i ) is.
If |X i | → 0, then in particular, for all positive ε, there exists an integer n such that ∪ i>n X i ⊂ B X (ρ X , ε). Moreover, since X i is compact for all i = 1, . . . , n, we can find a finite ε-cover of X i , i.e. a finite subset A i of X i such that X i ⊂ ∪ x∈Ai B i (x, ε). Set A := {ρ X } ∪ A 1 ∪ · · · ∪ A n . Observe that it is finite and that X ⊂ ∪ x∈A B X (x, ε). Since this holds for all positive ε, it follows that X is compact.
If lim sup |X i | > 0, then there exists a positive ε such that |X i | > ε for infinitely many indices i. As a result, X cannot have a finite ε-cover, which implies that it is not compact.
Set C := ∪ i≥1 C i , which is a correspondence between X and Y. Let (x, y) and (x′, y′) be in C. If both (x, y) and (x′, y′) are in C i for some i, then clearly |d X (x, x′) − d Y (y, y′)| ≤ dis C i . Otherwise, if (x, y) ∈ C i and (x′, y′) ∈ C j with i ≠ j, then using the definition of d X and d Y as well as the triangular inequality, we get |d X (x, x′) − d Y (y, y′)| ≤ dis C i + dis C j . For all n ≥ 0, define the finite Borel measure π (n) on X × Y by π (n) (A) := Σ 1≤i≤n π i [A ∩ (X i × Y i )]. Moreover, the discrepancy of π (n) with respect to µ X and µ Y satisfies D(π (n) ; µ X , µ Y ) ≤ Σ 1≤i≤n D(π i ; µ Xi , µ Yi ) + Σ i>n [µ Xi (X i ) + µ Yi (Y i )]. In light of Lemma 3.6, there exists n such that Σ i>n [µ Xi (X i ) + µ Yi (Y i )] < ε, which holds for all positive ε.

Extension to locally compact R-trees
Let X = (X, d X , ρ X , µ X ) be a locally compact pointed weighted metric space such that µ X is a boundedly finite measure. For all r > 0, let X| r := (X| r , d X , ρ X , µ X | r ) where X| r := {x ∈ X : |x| ≤ r} is the closed ball with radius r centred at ρ X and µ X | r := 1 X|r µ X is the restriction of µ X to X| r . Observe that if r ≤ R, clearly (X| R )| r = (X| r )| R = X| r . We also define ∂ r X := {x ∈ X : |x| = r}.
For any two locally compact pointed weighted metric spaces X and Y, we define the extended Gromov-Hausdorff-Prokhorov distance between them as
D GHP (X, Y) := ∫ 0 ∞ e −r [ 1 ∧ d GHP (X| r , Y| r ) ] dr.
This definition closely resembles that of the GHP distance on locally compact metric spaces defined and studied in [3].
Remark 3.8. Let X and Y be two weighted locally compact pointed metric spaces; for all R ≥ 0, the balls X| R and Y| R may be compared with d GHP .

Let T be the set of GHP-isometry classes of locally compact rooted R-trees endowed with a boundedly finite Borel measure, and T c be that of compact weighted and rooted R-trees (i.e. T c = K ∩ T).

(ii) Suppose d GHP (T n | r , T| r ) → 0 for all r ≥ 0 with µ T (∂ r T ) = 0. Since µ T is a boundedly finite measure, the set {r > 0 : µ T (∂ r T ) > 0} is at most countable. As a result, the sequence of functions r → 1 ∧ d GHP (T n | r , T| r ), n ≥ 1, converges to 0 almost everywhere on [0, ∞). Lebesgue's dominated convergence theorem then ensures that D GHP (T n , T) → 0.
Assume D GHP (T n , T) → 0 and let r > 0 be such that µ T (∂ r T ) = 0. For every subsequence (n k ) k , there exists a sub-subsequence along which 1 ∧ d GHP (T nk | t , T| t ) → 0 for almost every t ≥ 0. In particular, there exists R > r such that, along this sub-subsequence, d GHP (T nk | R , T| R ) → 0. Recall that d GHP is topologically equivalent to the metric on K studied in [3]. Therefore, in light of the proof of [3, Proposition 2.10], if τ n , n ≥ 1 and τ are compact R-trees such that d GHP (τ n , τ ) → 0, then for all r > 0 such that µ τ (∂ r τ ) = 0, d GHP (τ n | r , τ | r ) → 0.
As a result, d GHP (T nk | r , T| r ) → 0 along this sub-subsequence. From every subsequence (n k ) k we can thus extract a sub-subsequence along which d GHP (T nk | r , T| r ) → 0, which is equivalent to saying that d GHP (T n | r , T| r ) → 0 as n → ∞.
(iii) Since a criterion similar to (ii) holds for the metric studied in [3], this metric is topologically equivalent to D GHP . As a result and thanks to Theorem 2.9 and Corollary 3.2 in [3], it follows that (T, D GHP ) is completely metrisable and separable, i.e. it is Polish.

Continuum grafting
and µ is the measure defined for all Borel sets A by µ(A) := Σ i∈I µ τi (A ∩ τ i ). The function G grafts the trees τ i at height u i for each i ∈ I on R + , which can be thought of as an infinite (continuous) branch. It is quite obvious that the weighted pointed metric space hence obtained is an R-tree.

Lemma 3.10. Let ((τ i , µ τi )) i≥1 be a sequence of compact weighted R-trees and (u i ) i≥1 be a sequence of non-negative real numbers. Then the weighted R-tree T := G [ (u i , τ i ) i≥1 ] belongs to T iff, for all K ≥ 0 and ε > 0, the set {i ≥ 1 : u i ≤ K, |τ i | ≥ ε} is finite and Σ i : ui≤K µ τi (τ i ) < ∞.

Proof. For all x in T and positive r, denote by B T (x, r) := {y ∈ T : d T (x, y) < r} the open ball of T centred at x with radius r and, similarly, for all i ≥ 1 and x ∈ τ i , write B i (x, r) := {y ∈ τ i : d i (x, y) < r}. Under the above conditions, the measure µ T is boundedly finite and we only need to prove that T is locally compact.
Fix K ≥ 0 and let ε be positive. For all i ≥ 1, because τ i is compact, there exists a finite subset A i of τ i such that τ i ⊂ ∪ x∈Ai B i (x, ε). To build an ε-cover of T | K , first observe that if i is such that u i ≤ K and |τ i | < ε/2, then τ i is contained in some open ball with radius ε centred at some nε for 0 ≤ n ≤ K/ε. Moreover, by assumption, there are only finitely many indices i with u i ≤ K and |τ i | ≥ ε/2. Therefore, if we let A := {nε : 0 ≤ n ≤ K/ε} ∪ ∪ {A i : u i ≤ K, |τ i | ≥ ε/2}, then A is finite and T | K ⊂ ∪ x∈A B T (x, ε). As a result, T | K has a finite ε-cover for all positive ε, which means that it is compact.
⇒ Suppose the set {i ≥ 1 : u i ≤ K, |τ i | ≥ ε} is infinite for some K ≥ 0 and positive ε. In particular, we can find an increasing sequence (i n ) n with u in ≤ K and |τ in | ≥ ε for all n. For each n ≥ 1, let x n be in τ in and such that ε/2 < d in (ρ in , x n ) ≤ ε. If n ≠ m, the definition of the metric on T gives d T (x n , x m ) > ε. Therefore, (x n ) n has no Cauchy sub-sequence, which implies that T | K+ε is not compact and that T ∉ T. Conversely, if {i ≥ 1 : u i ≤ K, |τ i | ≥ ε} is finite for all K ≥ 0 and ε > 0, the first part of the proof shows that T | K is compact for all K ≥ 0, so that T is locally compact.

Remark 3.11. In the following, when we consider discrete trees, we will see them as R-trees by replacing their edges by segments of length 1.

Fragmentation trees
In this section, we will present a few results on certain classes of T c -valued and T-valued random variables: self-similar fragmentation trees (introduced in [29]) and self-similar fragmentation trees with immigration (see [27]).
These processes can be seen as the evolution of the fragmentation of an object of mass 1 into smaller objects which will each, in turn, split themselves apart independently from one another, at a rate proportional to their mass to the power α.
It was shown in [8, 9] that the distribution of a self-similar fragmentation process is characterised by a 3-tuple (α, c, ν) where α is the aforementioned self-similarity index, c ≥ 0 is a so-called erosion coefficient which accounts for a continuous decay in the mass of each particle, and ν is a dislocation measure on S ↓ ≤1 , i.e. a σ-finite measure such that ∫ S ↓ ≤1 (1 − s 1 ) ν(ds) < ∞ and ν({1}) = 0. Informally, at any given time, each particle with mass say x will, independently from the other particles, split into smaller fragments of respective masses xs 1 , xs 2 , . . . at rate x α ν(ds).
We will be interested in fragmentation processes with negative self-similarity index −γ < 0 and with no erosion, i.e. with c = 0. Furthermore, we will require the dislocation measure ν to be non-trivial, i.e. ν(S ↓ ≤1 ) > 0, and conservative, that is to satisfy ν ( Σ i≥1 s i < 1 ) = 0. Therefore, the fragmentation processes we will consider will be characterised by a fragmentation pair (γ, ν) and we will refer to them as (γ, ν)-fragmentation processes.
Under these assumptions, each particle will split into smaller ones which will in turn break down faster, thus speeding up the global fragmentation rate. Let X be a (γ, ν)-fragmentation process and set τ 0 := inf{t ≥ 0 : X(t) = 0} the first time at which all the mass has been turned to dust. It was shown in [10, Proposition 2] that τ 0 is a.s. finite and in [25,Section 5.3] that it has exponential moments, i.e. that there exists a > 0 such that E exp(aτ 0 ) < ∞.
Furthermore, a T_c-valued random variable that encodes the genealogy of the fragmentation of the initial object was defined in [29]. This random R-tree is called a (γ, ν)-self-similar fragmentation tree. We will denote the distribution of (T, d, ρ, µ) by T_{γ,ν}.

Remark 3.12.
• More general self-similar fragmentation trees, where both the assumptions "c = 0" and "ν is conservative" are dropped, were defined and studied in [49].
• Let T be a (γ, ν)-self-similar fragmentation tree and m > 0. The tree (m^γ T, m µ_T) encodes the genealogy of a (γ, ν)-self-similar fragmentation process started from a single object with mass m.

Classical examples
It was observed in [9] that the Brownian tree, which was introduced in [5], may be described as a self-similar fragmentation tree with fragmentation pair (1/2, ν_B), where ν_B is the Brownian dislocation measure. Another important example of fragmentation trees is the family of α-stable trees from [23], where α belongs to (1, 2). Indeed, a result from [43] states that the α-stable tree is a (1 − 1/α, ν_α)-self-similar fragmentation tree, where ν_α is the α-stable dislocation measure, expressed in terms of the ranked jumps on [0, 1] of a 1/α-stable subordinator (see [43]).
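For concreteness, the Brownian case can be made explicit. Up to the multiplicative constant, which varies between normalisation conventions in the literature, the Brownian dislocation measure is the binary measure

```latex
\nu_B(s_1 \in dx) \;=\; \sqrt{\frac{2}{\pi\, x^3 (1-x)^3}}\; dx,
\qquad x \in [1/2, 1), \qquad s_2 = 1 - s_1,\quad s_3 = s_4 = \dots = 0,
```

so that each dislocation splits a fragment into exactly two pieces of masses xs₁ and x(1 − s₁), and the Brownian tree is the associated (1/2, ν_B)-fragmentation tree.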

Scaling limits of Markov branching trees
Self-similar fragmentation trees bear a close relationship with Markov branching trees. Let ι be the map sending a partition λ ∈ P_n to λ/n = (λ₁/n, λ₂/n, . . . , 0, . . . ) ∈ S↓.

Theorem 3.13.
• Let (q_n)_{n∈N} be the sequence of first-split distributions of a Markov branching family MB^{L,q} and, for all adequate n ≥ 1, set q̄_n := q_n ∘ ι^{−1}. Suppose there exist a fragmentation pair (γ, ν) and a slowly varying function ℓ such that, for the weak convergence of finite measures on S↓, n^γ ℓ(n) (1 − s₁) q̄_n(ds) ⇒ (1 − s₁) ν(ds). For all n ∈ N, let T_n have distribution MB^{L,q}_n and set µ_n := Σ_{u∈L(T_n)} δ_u, the counting measure on the leaves of T_n.
• Let (q_{n−1})_{n∈N} be the sequence associated to a Markov branching family MB_q. Assume that there exist a fragmentation pair (γ, ν) and a slowly varying function ℓ, with either γ < 1, or γ = 1 and ℓ(n) → 0, such that n^γ ℓ(n) (1 − s₁) q̄_n(ds) ⇒ (1 − s₁) ν(ds). For each n ∈ N, let T_n be a MB^q_n tree and endow it with its counting measure µ_n.
Under either set of assumptions, with respect to the GHP topology on T_c, (ℓ(n) T_n / n^γ, µ_n / n) converges in distribution to a (γ, ν)-fragmentation tree.

The following useful result on the heights of Markov branching trees also holds.
Lemma 3.14. Suppose that (q_n)_{n∈N} satisfies the assumptions of Theorem 3.13 with respect to a given fragmentation pair (γ, ν) and a slowly varying function ℓ. Then, for any p ≥ 0, the heights of the rescaled trees ℓ(n) T_n / n^γ, n ∈ N, are bounded in L^p.

Proof. Clearly, T_s is an R-tree and its total mass is µ_s(T_s) = Σ_{i≥1} s_i µ_i(T_i) = ‖s‖, which is finite. It only remains to show that it is compact or, in light of Lemma 3.6, that s_i^γ |T_i| converges to 0 as i grows to infinity. Since ‖s‖ is summable, for any positive ε,
Σ_{i≥1} P(s_i^γ |T_i| > ε) ≤ ε^{−1/γ} Σ_{i≥1} s_i E[|T_i|^{1/γ}] < ∞,
where we have used Markov's inequality and the fact that |T_i|^{1/γ} ∈ L¹ (see Lemma 3.14). Borel-Cantelli's lemma then allows us to deduce that s_i^γ |T_i| → 0 a.s. as i → ∞.
Proof. For all n ≥ 0, in light of Lemmas 3.3 and 3.7, Consequently, there is a non-negative constant C such that for all integer n and s in S ↓ , Hence, for all s and r in S ↓ and any n ≥ 1 As a result,

Fragmentation trees with immigration
We say that a non-negative Borel measure I on S↓ is an immigration measure if it satisfies ∫_{S↓} (1 ∧ ‖s‖) I(ds) < ∞. Fix an immigration measure I such that I(S↓) > 0 and let (γ, ν) be a fragmentation pair. Let Σ = Σ_{n≥1} δ_{(u_n,s_n)} be a Poisson point process on R₊ × S↓ with intensity du ⊗ I(ds), independent of a family (X^{(n,k)}, n ≥ 1, k ≥ 1) of i.i.d. (γ, ν)-fragmentation processes. Define the S↓-valued process X as follows: X(t) is the decreasing rearrangement of all the terms of the sequences s_{n,k} X^{(n,k)}(s_{n,k}^{−γ}(t − u_n)), for n such that u_n ≤ t and k ≥ 1. We call X a fragmentation process with immigration with parameters (γ, ν, I). As with pure fragmentation processes, the genealogy of these immigrations and fragmentations can be encoded as an infinite weighted R-tree (see [27]), say (T^{(I)}, d, ρ, µ), such that if, for all t ≥ 0, we consider the closures of the bounded connected components of {x ∈ T^{(I)} : d(ρ, x) > t}, then the process of the decreasingly ordered µ-masses of these components is a (γ, ν, I)-fragmentation process with immigration. Let T^I_{γ,ν} be the distribution of (T^{(I)}, d, ρ, µ).
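The integrability condition on I can be probed numerically. The sketch below (a hypothetical stand-in, not the paper's construction) samples the atoms of a Poisson point process with the unary self-similar intensity du ⊗ x^{−1−γ} dx: infinitely many atoms arrive below any height K, yet the total immigrated mass is a.s. finite because ∫ (1 ∧ x) x^{−1−γ} dx < ∞ when γ ∈ (0, 1).

```python
import random

def poisson(lam, rng):
    """Poisson(lam) via unit-rate renewal counting (fine for moderate lam)."""
    t, n = 0.0, -1
    while t <= lam:
        t += rng.expovariate(1.0)
        n += 1
    return n

def sample_immigration_atoms(gamma, K, eps, rng):
    """Atoms (u_i, x_i) with u_i <= K and eps <= x_i <= 1 of a Poisson point
    process with intensity du (x) x**(-1-gamma) dx -- the unary self-similar
    immigration intensity, a toy stand-in for a general measure I.  Atoms of
    mass > 1 are ignored for simplicity, and atoms of mass < eps are dropped:
    their total mass has finite mean K * eps**(1-gamma) / (1-gamma), which is
    why the mass grafted below height K is a.s. finite."""
    rate = K * (eps ** -gamma - 1.0) / gamma      # mean number of kept atoms
    atoms = []
    for _ in range(poisson(rate, rng)):
        u = rng.uniform(0.0, K)
        # inverse-CDF sample from the density prop. to x**(-1-gamma) on [eps, 1]
        v = rng.random()
        x = (eps ** -gamma - v * (eps ** -gamma - 1.0)) ** (-1.0 / gamma)
        atoms.append((u, x))
    return atoms
```

With γ = 1/2, K = 1 and eps = 10⁻⁴, the expected total mass of the kept atoms is K (1 − eps^{1−γ}) / (1 − γ) = 1.98, which a Monte Carlo average reproduces.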

Point process construction
The construction of (γ, ν)-fragmentation trees with immigration I described in [27] can be expressed using Poisson point processes, concatenated (γ, ν)-fragmentation trees and the continuum grafting function G from the end of Section 3.1.2. The random tree T^{(I)} thus obtained has distribution T^I_{γ,ν}.
Observe that for all K ≥ 0, we can write the total mass grafted on the infinite branch at height less than K as an integral against the point process Σ: ∫ 1_{u≤K} ‖s‖ Σ(du, ds). Since ∫∫ (1 ∧ 1_{u≤K} ‖s‖) du I(ds) ≤ K ∫ (1 ∧ ‖s‖) I(ds) < ∞, we may use Campbell's theorem (see [38, Section 3.2]) and claim that ∫ 1_{u≤K} ‖s‖ Σ(du, ds) < ∞ a.s. The second condition of Lemma 3.10 is thus met. Moreover, conditionally on Σ, the expected number of indices i ≥ 1 such that u_i ≤ K and |T_i| > ε is Σ_{i≥1} 1_{u_i≤K} P(|T_{s_i}| > ε), which is, according to Campbell's formula, a.s. finite. Consequently, using Borel-Cantelli's lemma, we deduce that conditionally on Σ, with probability one, there are finitely many indices i ≥ 1 such that u_i ≤ K and T_i is higher than ε. It follows from Lemma 3.10 that T^{(I)} is a.s. T-valued.

Self-similar immigration measures
We will say that an immigration measure I with I(S↓) > 0 is self-similar with positive index γ (or simply, γ-self-similar) if for all c > 0 and measurable F : S↓ → R₊, c ∫ F(s) I(ds) = ∫ F(c^{1/γ} s) I(ds).

Proposition 3.17. An immigration measure I is γ-self-similar iff γ ∈ (0, 1) and there exist a positive constant K as well as an S↓₁-valued random variable X such that for all measurable F : S↓ → R₊, ∫ F(s) I(ds) = K ∫₀^∞ E[F(x X)] x^{−1−γ} dx.

Proof. Set Z := I(‖s‖ ≥ 1) and let X be a σ-distributed random variable. Now, for any measurable g : S↓₁ → R₊ and t > 0, because I is self-similar, we get the desired identity. Since this identity holds for any t > 0 and measurable g : S↓₁ → R₊, and because I({0}) = 0, it follows that I may be written in the desired way. Finally, because I is an immigration measure, it must integrate s ↦ 1 ∧ ‖s‖, which implies that γ belongs to (0, 1).
The point process construction of fragmentation trees with immigration may be used to prove this next proposition.

Proposition 3.18. Suppose I is a γ-self-similar immigration measure and let ν be a dislocation measure. If (T, µ) denotes a (γ, ν, I)-fragmentation tree with immigration, then for any positive m, (m^γ T, m µ) has the same distribution as (T, µ).

Relationship to compact fragmentation trees
Let (γ, ν) be a fragmentation pair and I an immigration measure with I(S↓) > 0. Theorem 17 in [27] states that under suitable conditions, if (T, µ_T) denotes a (γ, ν)-self-similar fragmentation tree, then (m^γ T, m µ_T) converges to T^I_{γ,ν} in distribution as m → ∞ with respect to the extended GHP topology.
For instance, Theorem 11 (iii) in [5] states that if (T, µ_T) is a standard Brownian tree, then, as m → ∞, (m^{1/2} T, m µ_T) converges in distribution to Aldous' "self-similar CRT". This result was reformulated in terms of fragmentation trees in [27, Section 1.2]: (m^{1/2} T, m µ_T) converges in distribution as m → ∞ to a (1/2, ν_B, I_B)-fragmentation tree with immigration, where ν_B is the Brownian dislocation measure (see Section 3.2.1) and the Brownian immigration measure I_B is defined for all measurable f : S↓ → R₊.
Page 20/53 http://www.imstat.org/ejp/
We will call a (1/2, ν_B, I_B)-fragmentation tree with immigration a Brownian tree with immigration. As mentioned in the Introduction, this tree will appear in many of our applications.
Set α ∈ (1, 2) and recall the notations used to define ν_α in Section 3.2.1; in particular, ∆ denotes the decreasing rearrangement of the jumps on [0, 1] of a 1/α-stable subordinator. In [27, Section 5.1], it was observed that if (T, µ_T) is an α-stable tree, then, as m → ∞, (m^{1−1/α} T, m µ_T) converges in distribution to a fragmentation tree with immigration whose fragmentation pair is (1 − 1/α, ν_α). These trees coincide with the α-stable Lévy trees with immigration introduced in [22, Section 1.2].

Convergence of point processes
With the notations used in Section 3.2.2, let Π := Σ_{i≥1} δ_{(u_i, s_i, T_i)}. It is a Poisson point process on R₊ × S↓ × T_c with intensity du ⊗ I*(ds, dτ), where the measure I* on S↓ × T_c is defined as follows: let (τ_i, µ_i)_{i≥1} be a sequence of i.i.d. (γ, ν)-fragmentation trees and, for any s in S↓, similarly to Section 3.2.1, let τ_s be the tree obtained by gluing the rescaled trees s_i^γ τ_i at a common root and endowing the result with the measure Σ_{i≥1} s_i µ_i. Moreover, recall the construction of Markov branching trees with a unique infinite spine: a tree T with distribution MB^{q,q∞}_∞ can be built in the following way: consider the infinite branch and, for all n ≥ 0, graft a tree T_n at height n (where the sequence (T_n)_{n≥0} is i.i.d.), such that Λ_n := Λ(T_n) has distribution q_* = q_∞(∞, ·) and, conditionally on Λ_n = λ ∈ P_{<∞}, T_n has distribution MB^q_λ. As a result, T is characterised by the point process Σ_{n≥0} δ_{(n,Λ_n,T_n)} (or simply by Σ_{n≥0} δ_{(n,T_n)}).
Therefore, when considering scaling limits of such trees, it seems natural to take a step back and instead consider the convergence of the underlying point processes on R + × S ↓ × T c . We will follow the spirit of [27, Section 2.1.2] and introduce a topology on the set of such point measures adequate for our forthcoming purposes.
Let R be the set of integer-valued Radon measures on R₊ × S↓ × T_c which integrate the function (u, s, τ) ↦ 1_{u≤K} ‖s‖ for all K ≥ 0 and are such that µ(R₊ × {0} × T_c) = 0.

Remark 3.19.
Recall that, as an immigration measure, I integrates the function s ∈ S↓ ↦ 1 ∧ ‖s‖. Campbell's theorem (see [38, Section 3.2]) therefore ensures that Π, the Poisson point process associated to a T^I_{γ,ν} tree, a.s. belongs to R. Let F be the set of continuous functions F : R₊ × S↓ × T_c → R₊ such that for some K ≥ 0, 0 ≤ F(u, s, τ) ≤ ‖s‖ 1_{u≤K} for all (u, s, τ). If µ_n, n ≥ 1, and µ are elements of R, we will say that µ_n → µ iff for all F ∈ F, ∫ F dµ_n → ∫ F dµ. Appendix A7 of [35] ensures that when endowed with the topology induced by this convergence, R is a Polish space. Moreover, Theorems 4.2 and 4.9 of [35] give the following criterion for convergence in distribution of elements of R.

Proposition 3.20 ([35]). Let ξ_n, n ≥ 1, and ξ be R-valued random variables. Then ξ_n converges to ξ in distribution with respect to the topology on R iff for all F ∈ F, L_{ξ_n}(F) → L_ξ(F).

Scaling limits of infinite Markov-branching trees
In this section, we will state and prove our main result on scaling limits of infinite Markov branching trees as well as its corollary on their volume growth.
Let N be an infinite subset of N containing 1 and let q = (q_{n−1})_{n∈N} be a sequence of first-split distributions where, for each n, q_{n−1} is supported by {λ ∈ P_{n−1} : λ_i ∈ N, i = 1, . . . , p(λ)}. Recall from Section 2.2.1 that the associated Markov branching family MB_q is well defined. Furthermore, let q_∞ be a probability measure on P_∞ supported by the set {(∞, λ) : λ ∈ P_{<∞}, λ_i ∈ N, i = 1, . . . , p(λ)}. In this way, the probability measure MB^{q,q∞}_∞ on T_∞ is also well defined and a.s. yields trees with a unique infinite spine. To lighten notations, let q_* := q_∞(∞, ·), which is a probability measure on P_{<∞}.
In the remainder of this section, we will assume that:
(S) There exist some γ > 0 and a dislocation measure ν on S↓ such that, for the weak convergence of finite measures on S↓, n^γ (1 − s₁) q̄_n(ds) ⇒ (1 − s₁) ν(ds). In particular, Theorem 3.13 and Lemma 3.14 hold.
(I) There exists an immigration measure I on S↓ such that if Λ has distribution q_* and q^{(R)} denotes the distribution of Λ/R^{1/γ}, then, for the weak convergence of finite measures on S↓, R (1 ∧ ‖s‖) q^{(R)}(ds) ⇒ (1 ∧ ‖s‖) I(ds).
As a result, the immigration measure I is γ-self-similar, as defined in Section 3.2.2, and Proposition 3.18 therefore holds for (γ, ν, I)-fragmentation trees with immigration.
Let T be a fixed element of T. We define its volume growth function as V_T : R₊ → R₊, R ↦ µ_T(T|_R). In other words, V_T(R) is the mass or volume of the closed ball T|_R.

Theorem 4.2. Let T have distribution MB^{q,q∞}_∞. Under Assumptions (S) and (I), (T/R, µ_T/R^{1/γ}) converges in distribution, as R → ∞, to a (γ, ν, I)-fragmentation tree with immigration, with respect to the GHP topology on T.

Once Theorem 4.2 is proved, we will be interested in the volume growth processes associated to these trees.

Proposition 4.3. Let T have distribution MB^{q,q∞}_∞ and let (T, µ_T) be a (γ, ν, I)-fragmentation tree with immigration. Then, the volume growth function of (T/R, µ_T/R^{1/γ}) converges in distribution to that of (T, µ_T) with respect to the topology of uniform convergence on compact subsets of R₊. In particular, µ_T(T|_R)/R^{1/γ} converges in distribution to µ_T(T|_1).

We may adapt the proofs of Theorem 4.2 and Proposition 4.3 to get the following theorem.
Theorem 4.4. Suppose T is distributed according to MB^{q,q∞}_∞ and is endowed with its counting measure µ_T. Under (S) and (I'), (T/R, µ_T/R^α) converges in distribution to the infinite branch R₊ endowed with the random measure µ = Σ_{i≥1} ‖s_i‖ δ_{u_i}, where {(u_i, s_i); i ≥ 1} are the atoms of a Poisson point process Σ on R₊ × S↓ with intensity du ⊗ I(ds). The tree (R₊, µ) encodes the genealogy of a pure immigration process. Furthermore, µ_T(T|_R)/R^α converges in distribution to µ([0, 1]) = ∫_{[0,1]×S↓} ‖s‖ Σ(du, ds).
Similarly, if T is distributed according to MB^{L,q,q∞}_∞ and is endowed with the counting measure on its leaves, the same results hold under (S) and (I').
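The self-similarity of Proposition 3.18 already explains the normalisation R^{1/γ} in Proposition 4.3. Writing (T', µ') := (m^γ T, m µ_T), which has the same distribution as (T, µ_T), and choosing m = R^{1/γ},

```latex
V_{\mathcal{T}}(R) \;=\; \mu_{\mathcal{T}}(\mathcal{T}|_R)
\;\overset{d}{=}\; \mu'(\mathcal{T}'|_{R})
\;=\; m\,\mu_{\mathcal{T}}\bigl(\mathcal{T}|_{R/m^{\gamma}}\bigr)
\;=\; R^{1/\gamma}\, \mu_{\mathcal{T}}(\mathcal{T}|_1)
\;=\; R^{1/\gamma}\, V_{\mathcal{T}}(1),
```

where the middle equality uses that the ball of radius R in m^γ T is the rescaled ball of radius R/m^γ in T. In particular, for γ = 1/2 (the Brownian tree with immigration arising from Kesten's tree), the volume of the ball of radius R grows like R², as announced in the abstract.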
To prove Theorem 4.2, we will first study the convergence of the underlying point processes in Section 4.1 which will give us more leeway to manipulate the corresponding trees and end the proof in Section 4.2. Section 4.3 will then focus on proving Proposition 4.3.

Convergence of the associated point processes
Since (T_c, d_GHP) is Polish, in light of Assumption (S), Theorem 3.13 and Skorokhod's representation theorem, we can find an i.i.d. sequence [(T_{i,n})_{n∈N}, T_i]_{i≥1}, where for each i ≥ 1, the family ((T_{i,n})_{n∈N}, T_i) of random trees is such that:
• T_i is a (γ, ν)-self-similar fragmentation tree,
• (T_{i,n}/n^γ, µ_{T_{i,n}}/n) =: T̄_{i,n} a.s. converges to T_i as n → ∞.

Proof. Assume the contrary, i.e. that there exist a sequence (s^{(n)})_{n≥1} in K and a positive constant c such that Σ_{i>n} s_i^{(n)} > c for all n ≥ 1. Since K is compact, we can find a subsequence (s^{(n_k)})_k and s ∈ K such that s^{(n_k)} → s.

Let G : S↓ × T_c → R₊ be such that for all s, s' in S↓ and τ, τ' in T_c, |G(s, τ) − G(s', τ')| ≤ ‖s − s'‖ + d_GHP(τ, τ'). Further assume that G(s, ·) ≤ 1 ∧ ‖s‖ for any s ∈ S↓. Finally, set g : S↓ → R₊ the function defined by g(s) := E[G(s, T_s)].
Proof. Clearly, g(s) ≤ 1 ∧ ‖s‖. Moreover, for any s and r in S↓, the difference |g(s) − g(r)| may be bounded using Lemma 3.16. Therefore, g is continuous, and Assumption (I) applies to it. Consequently, it will be sufficient to prove that ∆_R → 0 as R → ∞. For all n ≥ 1, the required bound follows from Lemma 3.7 and Remark 3.2. Let ε > 0 be fixed. As a result of Assumption (I), the sequence of finite measures (R (1 ∧ ‖s‖) q^{(R)}(ds))_{R≥1} is tight and so there exists a compact subset K of S↓ such that sup_{R≥1} R ∫ (1 ∧ ‖s‖)(1 − 1_K(s)) q^{(R)}(ds) < ε. Moreover, as a compact subset, K is bounded, i.e. sup_{s∈K} ‖s‖ = C < ∞.
For all n ≥ 1, recall that T̄_{1,n} and T_1 are endowed with probability measures. Remark 3.2 therefore ensures that d_GHP(T̄_{1,n}, T_1) ≤ 2 ∨ |T̄_{1,n}| ∨ |T_1|, which, thanks to Lemma 3.14, is integrable. For all δ > 0, in light of Lemma 4.6, there exists an integer m_{K,δ}, which depends only on K and δ, such that sup_{s∈K} Σ_{i>m_{K,δ}} s_i < δ. Then for all R ≥ 1 and λ ∈ P_{<∞} with λ/R^{1/γ} ∈ K, a first bound holds, where h₁ is defined as in Lemma 3.14, and a similar second bound follows. In summary, for all λ in P_{<∞} such that λ/R^{1/γ} belongs to K, we get a bound involving some finite constant B independent of ε, η, δ and K, hence valid for all positive ε, δ, η and any R ≥ 1. Let δ be such that (2δ + δ^γ) B < ε and set η < ε/[(C + C^γ) m_{K,δ}]. Because of Assumption (I), we therefore get a bound on the lim sup whose right-hand side, by the monotone convergence theorem, vanishes when ε decreases to 0. This proves that ∆_R → 0, which concludes this proof.

We will now endeavour to prove that the point processes associated to adequately rescaled Markov branching trees with a unique infinite spine converge in distribution to the point process associated to fragmentation trees with immigration. Let Π be a Poisson point process on R₊ × S↓ × T_c with intensity du ⊗ I*(ds, dτ), where I* is the measure defined at the beginning of Section 3.3. For all K ≥ 0, Campbell's theorem (see [38, Section 3.2]) ensures that Π a.s. satisfies the integrability conditions necessary to belong to the set R of point measures on R₊ × S↓ × T_c defined in Section 3.3.
Let T have distribution MB^{q,q∞}_∞. By construction of Markov branching trees with a unique infinite spine (see Remark 2.8), there exists a sequence (Λ_n, T_n)_{n≥0} of i.i.d. random variables such that T = b_∞((v_n, T_n); n ≥ 0), where Λ_n is distributed according to q_* and, conditionally on Λ_n = λ, T_n has distribution MB^q_λ. For all R ≥ 1, let Π_R be the point process associated to (T/R, µ_T/R^{1/γ}), i.e. the R-valued random variable defined for all measurable f : R₊ × S↓ × T_c → R₊.

Lemma 4.9.
With respect to the topology on R introduced in Section 3.3, Π R converges to Π in distribution as R goes to infinity.
Proof. In light of Proposition 3.20, it will be enough to prove that for any function F in the set F, the Laplace transform of Π_R evaluated at F converges to that of Π. Fix such an F in F and recall that it is continuous and that there exists K ≥ 0 such that 0 ≤ F(u, s, τ) ≤ ‖s‖ 1_{u≤K} for all (u, s, τ). Campbell's theorem for Poisson point processes gives L_Π(F) = exp(− ∫ (1 − e^{−F(u,s,τ)}) du ⊗ I*(ds, dτ)).
For all R ≥ 1 and u ≥ 0, set ϕ_R and ϕ accordingly. Using these notations, we may write log L_Π(F) = − ∫₀^K ϕ(u) du and, thanks to the i.i.d. nature of the sequence (Λ_n, T_n)_{n≥0}, a similar expression holds for log L_{Π_R}(F) for all R ≥ 1. The functions ϕ_R, R ≥ 1, and ϕ all have support in [0, K] and are continuous (in light of the dominated convergence theorem). Observe that 0 ≤ 1 − e^{−F(u,s,τ)} ≤ 1 ∧ ‖s‖. From Corollary 4.8, we know that for all fixed u ≥ 0, ϕ_R(u) → ϕ(u) as R → ∞, and that the sequence (ϕ_R)_{R≥1} is uniformly bounded by a finite constant, say C. Let ε be positive. It also follows from Corollary 4.8 that there exists an adequate compact subset A of S↓ × T_c. Recall that F is continuous; hence there exists δ > 0 such that for any (u, s, τ) and (u', s', τ') in the compact set [0, K] × A, if |u − u'| + ‖s − s'‖ + d_GHP(τ, τ') < δ, then |F(u, s, τ) − F(u', s', τ')| < ε. As a result, and because x ↦ e^{−x} is 1-Lipschitz continuous on R₊, for all R ≥ 1 and u, v in [0, K] with |u − v| < δ, in light of Corollary 4.8 and the monotone convergence theorem, we get the desired estimate. This ensures that the sequence (ϕ_R)_{R≥1} is equicontinuous on [0, K]. It follows from the Arzelà-Ascoli theorem that ϕ_R converges uniformly to ϕ, and in turn that the corresponding integrals converge. Observe that for all R ≥ 1, log L_{Π_R}(F) may be decomposed accordingly. Recall that sup_{R≥1, u≥0} ϕ_R(u) ≤ C. Therefore, because the function [0, 1) → R₊, x ↦ − log(1 − x) − x increases with x, a uniform control holds for any R ≥ C and n ≥ 0. Finally, as Riemann sums of the continuous function ϕ, the remaining sums converge. In summary, log L_{Π_R}(F) → log L_Π(F) when R → ∞.

Proof of Theorem 4.2
Now that we know that the underlying point processes converge, we can prove convergence of the trees themselves.
Recall that the topology we defined on R in Section 3.3 makes it a Polish topological space. As such, Skorokhod's representation theorem holds for R-valued random variables. In particular, because of Lemma 4.9, there exist versions of these point processes, converging almost surely, such that for any R we may let Π_R be the corresponding random element of R. Let (u_i, s_i)_{i≥1} be the atoms of Π projected on R₊ × S↓ and set Σ := Σ_{i≥1} δ_{(u_i,s_i)}. By definition of the intensity measure of Π, there exists a family {T_{i,j}; i, j ≥ 1} of i.i.d. (γ, ν)-fragmentation trees, independent of Σ, describing the trees grafted at the atoms through G, the continuum grafting function defined in Section 3.1.2; recall that the resulting tree is a (γ, ν)-fragmentation tree with immigration I (see Section 3.2.2). For all ε > 0, let T^{(I)}_ε be the tree obtained from T^{(I)} by cutting away all sub-trees grafted on the spine with mass less than ε. Observe that, because of the definition of the function G, the measure on T^{(I)}_ε is the restriction of that of T^{(I)}.
For all R, set τ^{(R)} := b_∞((v_n, τ^{(R)}_n); n ≥ 0) and denote its counting measure by µ_{τ^{(R)}}. Observe that τ^{(R)} is distributed according to MB^{q,q∞}_∞. Let T^{(R)} := (R^{−1} τ^{(R)}, R^{−1/γ} µ_{τ^{(R)}}) be the rescaled infinite Markov branching tree associated to Π_R. Moreover, for all positive ε, let T^{(R)}_ε be the tree obtained by removing from T^{(R)} all the sub-trees grafted on its spine with mass less than ε. The tree T^{(R)}_ε is clearly a subset of T^{(R)} and it is endowed with the restriction of µ_{T^{(R)}}.
In this section we will endeavour to prove Theorem 4.2. In order to do so, we will use the following criterion for convergence in distribution.
Lemma 4.10. Let X, (X_n)_n, (X^{(k)})_k and (X^{(k)}_n)_{n,k} be random variables with values in a metric space (E, d) such that:
(i) for each k, X^{(k)}_n converges in distribution to X^{(k)} as n → ∞,
(ii) X^{(k)} converges in distribution to X as k → ∞,
(iii) for all η > 0, lim_{k→∞} lim sup_{n→∞} P(d(X^{(k)}_n, X_n) > η) = 0.
Then X_n converges to X in distribution.
Remark 4.11. Condition (i) is akin to finite-dimensional convergence of X n to X and Conditions (ii) and (iii) to tightness of (X n ) n .
In our setting, the sequence (T^{(R)}; R ∈ N) of rescaled MB^{q,q∞}_∞ trees will play the role of (X_n)_n and the limit variable X will be T^{(I)}, a (γ, ν)-fragmentation tree with immigration I. The intermediate family (X^{(k)}_n)_{n,k} will be replaced by (T^{(R)}_ε; R ≥ 1) with ε → 0 along some countable subset of (0, ∞). Similarly, we will consider T^{(I)}_ε trees instead of (X^{(k)})_k.
Let C^{(I)}_ε be the correspondence between T^{(I)}|_K and T^{(I)}_ε|_K.

Proof. We will proceed in a way similar to the proof of Lemma 4.12. For all R ≥ 1 and ε > 0, define the correspondence C^{(R)}_ε between T^{(R)} and T^{(R)}_ε, which induces a correspondence between T^{(R)}|_K and T^{(R)}_ε|_K, and let π^{(R)}_{ε,K} be the restriction of π^{(R)}_ε to T^{(R)}|_K × T^{(R)}_ε|_K. Then, for any non-negative K, the distortion of this correspondence may be bounded. Further observe that, thanks to Lemma 3.14, we can find a finite constant h such that the heights are controlled for all n ≥ 0, R ≥ 1 and i = 1, . . . , p(Λ_n). In light of Assumption (I), the conclusion follows.

The next result is both intuitive and easy to prove. Its proof will therefore be left to the reader.

Lemma 4.14. Fix a positive integer n and let G_n be the restriction of G to (R₊ × T_c)^n; G_n is a continuous function for the product topology.

Lemma 4.15. Let K ≥ 0 and ε > 0 be fixed. Almost surely, on the event {Π_R → Π}, for any continuous F : R₊ × S↓ × T_c → R₊, ∫ 1_{u≤K} 1_{‖s‖≥ε} F dΠ_R → ∫ 1_{u≤K} 1_{‖s‖≥ε} F dΠ.

Proof. Let ϕ and ϕ_n, n ≥ 1, be the functions from R₊ × S↓ × T_c to R₊ defined for all (u, s, τ) by ϕ(u, s, τ) := 1_{u≤K} 1_{‖s‖≥ε} and by ϕ_n a continuous approximation of ϕ. Observe that for all n ≥ 1, ϕ_n is continuous and that for n large enough, ε ϕ_n F is an element of F. Therefore, everywhere on the event {Π_R → Π}, ∫ ϕ_n F dΠ_R → ∫ ϕ_n F dΠ for any fixed n ≥ 1. Furthermore, ϕ_n ↓_n ϕ, so the monotone convergence theorem yields inf_{n≥1} ∫ ϕ_n F dΠ = ∫ ϕ F dΠ and, for all R ≥ 1, inf_{n≥1} ∫ ϕ_n F dΠ_R = ∫ ϕ F dΠ_R. As a result, the upper limit is controlled on {Π_R → Π}. Similarly, if we let ψ(u, s, τ) := 1_{u<K} 1_{‖s‖>ε}, there exists a sequence (ψ_n)_n of continuous functions such that ψ_n ↑_n ψ and, for n large enough, ε ψ_n F is in F. The same kind of argument applies.

Proof. Observe that for any K ≥ 0, Π({(u, s, τ) : u = K}) = 0 a.s., which implies that with probability 1, the convergence holds for any continuous bounded F. Moreover, the (finite) measures 1_{u≤K, ‖s‖≥ε} Π(du, ds, dτ) and 1_{u≤K, ‖s‖≥ε} Π_R(du, ds, dτ), R ≥ 1, may be written as finite sums of Dirac measures.
As a result, almost surely, the atoms of 1_{u≤K, ‖s‖≥ε} Π_R(du, ds, dτ) converge to those of 1_{u≤K, ‖s‖≥ε} Π(du, ds, dτ) when R → ∞. Lemma 4.14 then ensures that T^{(R)}_ε|_K a.s. converges to T^{(I)}_ε|_K. Since this holds for any K ≥ 0, Proposition 3.9 allows us to conclude.

Volume growth of infinite Markov branching trees
We now turn to the proof of Proposition 4.3. Recall that if T ∈ T is fixed, then V_T, the volume growth function of T, is given by V_T(R) = µ_T(T|_R). Notice that V_T is a non-negative, non-decreasing càdlàg function. Recall also the following classical fact: if (f_n)_n is a sequence of monotone functions from a compact interval I to R such that f_n → f point-wise for some continuous function f, then f_n → f uniformly on I.
With these notations, we may write V_{T^{(I)}} as a sum of the volume growth functions of the grafted sub-trees, which is a.s. finite, as already noticed. As a result, and in light of the Weierstrass M-test, the restriction of V_{T^{(I)}} to the compact interval [0, K] is a series which a.s. converges uniformly on [0, K]. Proposition 1.9 in [11] implies that the volume growth function of (γ, ν)-fragmentation trees is a.s. continuous. In particular, with probability one, V_{T_{i,j}} is continuous for all i and j.

Unary immigration measures
Before concluding this section, we will state a useful criterion to prove Assumption (I) when the limit immigration measure is unary, i.e. when it is supported by the set {(s, 0, 0, . . . ) : s > 0}. In light of Remark 4.1, we will only study self-similar unary immigration measures.
Let γ ∈ (0, 1). Proposition 3.17 ensures that any unary γ-self-similar immigration measure may be written as c I^{un}_γ, where c is a positive constant and I^{un}_γ is the measure defined by ∫_{S↓} F(s) I^{un}_γ(ds) = ∫₀^∞ F((x, 0, 0, . . . )) x^{−1−γ} dx for measurable F : S↓ → R₊.

Proof. By assumption, for all ε > 0, there exists an integer N such that the required bound holds for all n ≥ N.

Proof. The main idea of this proof is to show that the tail of Λ is asymptotically negligible when its first component is large, or more precisely, that R E[1 ∧ (‖Λ‖ − Λ₁)/R^{1/γ}] converges to 0 when R goes to infinity. Since Λ fulfils the assumptions of Lemma 4.18, and in light of Fatou's lemma and the assumption on the probability tail of Λ₁, the first claim follows. Now observe that if a, b, x and y are four real numbers, then a ∧ x + b ∧ y ≤ (a + b) ∧ (x + y); in particular, the analogous bound holds for all ε ∈ (0, 1). Let f : S↓ → R₊ be a Lipschitz-continuous function bounded by 1 and set g(x) := f((x, 0, 0, . . . )) for all x ≥ 0. There exists a constant K ≥ 0 such that for all x and y in S↓, |f(x) − f(y)| ≤ K ‖x − y‖. Used jointly with our assumption on Λ and Lemma 4.18, this concludes the proof.
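Assuming the density form x^{−1−γ} dx for the mass of the single immigrated fragment (one natural choice of normalisation), the change of variables y = c^{1/γ} x verifies the self-similarity relation of Section 3.2.2:

```latex
\int_0^{\infty} F\bigl((c^{1/\gamma}x, 0, 0, \dots)\bigr)\, x^{-1-\gamma}\, dx
\;=\; \int_0^{\infty} F\bigl((y, 0, 0, \dots)\bigr)\,
      c^{\frac{1+\gamma}{\gamma}}\, y^{-1-\gamma}\, c^{-1/\gamma}\, dy
\;=\; c \int_0^{\infty} F\bigl((y, 0, 0, \dots)\bigr)\, y^{-1-\gamma}\, dy,
```

since c^{(1+γ)/γ − 1/γ} = c. Note also that ∫₀^∞ (1 ∧ x) x^{−1−γ} dx < ∞ precisely when γ ∈ (0, 1), matching the constraint of Proposition 3.17.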

Applications
In this section, we will develop applications of our three main results (Theorems 2.9, 4.2 and Proposition 4.3) to various models of random trees which satisfy the Markov branching property. With our unified approach, we will recover known results and get new ones.

Galton-Watson trees
Let ξ be a probability measure on Z₊ with mean 1 and ξ(1) < 1 (critical regime). We will be interested in unordered Galton-Watson trees with offspring distribution ξ, the law of which we will write GW_ξ. For each positive integer n such that GW_ξ(T_n) > 0, let GW^n_ξ be the measure GW_ξ conditioned on the set T_n of trees with n vertices. Similarly, if n satisfies GW_ξ(T_{L,n}) > 0, define GW^{L,n}_ξ as GW_ξ conditioned on the set T_{L,n} of trees with n leaves. Moreover, let d := gcd{n − 1 : GW_ξ(T_n) > 0} and d_L := gcd{n − 1 : GW_ξ(T_{L,n}) > 0}.
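The conditioned measure GW^n_ξ can be sampled directly by rejection, which is a convenient way to experiment with the model (a minimal sketch for a hypothetical critical binary offspring law; the helper names are ours):

```python
import random

def sample_gw(offspring, rng, max_size):
    """Sample a Galton-Watson tree as a list of children-lists
    (tree[v] = indices of the children of v, root = 0); returns None if the
    tree exceeds max_size vertices."""
    children = [[]]
    stack = [0]
    while stack:
        v = stack.pop()
        for _ in range(offspring(rng)):
            children.append([])
            w = len(children) - 1
            children[v].append(w)
            stack.append(w)
            if len(children) > max_size:
                return None
    return children

def sample_gw_conditioned(n, offspring, rng):
    """Rejection sampling of GW^n_xi: resample until the tree has exactly n
    vertices.  Only practical for moderate n, since the acceptance
    probability decays polynomially (like n**-1.5 with finite variance)."""
    while True:
        t = sample_gw(offspring, rng, max_size=n)
        if t is not None and len(t) == n:
            return t

# critical binary offspring distribution: xi(0) = xi(2) = 1/2
binary = lambda rng: 2 * rng.randrange(2)
```

With this binary ξ, possible sizes are the odd integers (so d = 2), and a conditioned tree with 21 vertices always has 10 internal vertices and 11 leaves.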
Kesten's tree
Let ξ̂ be the size-biased distribution of ξ, that is, ξ̂(k) = k ξ(k) for all k ≥ 0. By assumption, the mean of ξ is 1, so ξ̂ is a probability measure. We define GW^∞_ξ as the distribution of Kesten's tree, which is obtained as follows:
• Let (X_n)_{n≥0} be a sequence of i.i.d. random variables such that X_n + 1 follows ξ̂,
• Independently of this sequence, let (T_{n,k}; n ≥ 0, k ≥ 1) be i.i.d. GW_ξ trees; Kesten's tree is then obtained by grafting, for each n ≥ 0, the trees T_{n,1}, . . . , T_{n,X_n} at the n-th vertex of an infinite spine.
If T is a GW_ξ tree, the conditional distribution of T given {|T| ≥ n} converges to GW^∞_ξ as n → ∞; Kesten's tree can thus, in a way, be considered as a GW_ξ tree conditioned to have infinite height.
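The two-step description above translates directly into a simulation. The sketch below truncates the spine at a hypothetical height (a practical device, not part of the formal definition) and uses a critical geometric offspring law ξ(k) = 2^{−k−1}, for which the size-biased law ξ̂(k) = k 2^{−k−1} is that of G₁ + G₂ + 1 with G₁, G₂ i.i.d. geometric:

```python
import random

def geom_half(rng):
    """Critical offspring law xi(k) = 2**-(k+1) (geometric, mean 1)."""
    k = 0
    while rng.randrange(2) == 0:
        k += 1
    return k

def size_biased_minus_one(rng):
    """X such that X + 1 follows the size-biased law xi_hat(k) = k * xi(k).
    For the geometric xi above, xi_hat is the law of G1 + G2 + 1."""
    return geom_half(rng) + geom_half(rng)

def gw_total_progeny(rng, cap=100_000):
    """Total progeny of an unconditioned GW_xi tree, via the random walk
    representation (capped for safety: critical total progeny has infinite
    mean, so rare draws can be huge)."""
    alive, total = 1, 0
    while alive > 0 and total < cap:
        alive += geom_half(rng) - 1
        total += 1
    return total

def truncated_kesten(height, rng):
    """Sizes of the GW_xi trees grafted on the first `height` spine vertices
    of Kesten's tree: spine vertex n carries X_n i.i.d. GW_xi sub-trees."""
    return [[gw_total_progeny(rng) for _ in range(size_biased_minus_one(rng))]
            for _ in range(height)]
```

Each spine vertex carries on average E[ξ̂] − 1 = σ² extra sub-trees, which is the mechanism behind the R² volume growth discussed later.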
This tree also appears as the local limit of conditioned critical Galton-Watson trees under various types of conditioning, see [2]. In particular, it was first proved in [36] (in terms of Galton-Watson processes) and in [7] (in terms of trees) that if ξ is critical and has finite variance, then GW n ξ ⇒ GW ∞ ξ . In [20], it was shown that under the same assumptions, GW L,n ξ ⇒ GW ∞ ξ . In both cases, the finite variance assumption may be dropped, see [33] and [2].
The local limits of Galton-Watson trees conditioned on their size with offspring distributions with mean less than 1 were studied in [34], [33] and [1]. See also [50] for the study of the local limits of multi-type critical Galton-Watson trees.
Using Theorem 2.9, we will recover the following proposition in Section 5.1.1.

Proposition 5.2.
In the sense of the d loc topology, GW n ξ and GW L,n ξ both converge weakly towards GW ∞ ξ . Afterwards, we will study scaling limits of Kesten's tree in the spirit of Theorem 4.2.
Recall the descriptions of the Brownian tree with immigration and of α-stable Lévy trees with immigration from Section 3.2.2.

Proposition 5.3. Let T have distribution GW^∞_ξ.
(i) If ξ has finite variance σ² and d = 1, then (T/R, µ_T/R²), where µ_T is the counting measure on the vertices of T, converges in distribution to a (suitably scaled) Brownian tree with immigration.
(i') If ξ has finite variance σ² and d_L = 1, then the analogous convergence holds when T is endowed with the counting measure on its leaves.
(ii) Stable case: Suppose that ξ(n) ∼ c n^{−1−α} as n → ∞ for some positive constant c and α ∈ (1, 2). Then (T/R, µ_T/R^{α/(α−1)}) converges in distribution to a (suitably scaled) α-stable Lévy tree with immigration.

Remark 5.4. Both (i) and (ii) were proved in [22] and (i') seems to be a new, if predictable, result.
We also mention that under the assumptions of (ii), (T/R, µ^L_T/R^{α/(α−1)}) should converge in distribution to (T_α, (c k_α)^{1/(α−1)} ξ(0) µ_α). We won't prove this statement, as Assumption (S) hasn't been proved in this case and doing so would require quite a bit of computation. The scaling limits of Galton-Watson trees with such an offspring distribution conditioned on their number of leaves were however studied in [39]. Section 5.1.2 will focus on the finite variance case, first on (i) and then on (i'). We will prove Proposition 5.3 in the stable case (ii) in Section 5.1.3.

Markov branching property and local limits
Let N := {n ≥ 1 : GW_ξ(T_n) > 0}. Proposition 37 in [30] states that the sequence of probability measures (GW^n_ξ)_{n∈N} satisfies the Markov branching property, i.e. we have GW^n_ξ = MB^q_n for all adequate n, with q_{n−1} defined for all λ = (λ₁, . . . , λ_p) in P_{n−1} in terms of a GW_ξ tree T. Similarly, if we let N_L := {n ≥ 1 : GW_ξ(T_{L,n}) > 0}, then in light of [46, Lemma 8], the family (GW^{L,n}_ξ)_{n∈N_L} of probability measures satisfies the Markov branching property, and the associated sequence q^L of first-split distributions such that GW^{L,n}_ξ = MB^{L,q^L}_n is given, for all n in N_L and λ = (λ₁, . . . , λ_p) in P_n, in terms of a GW_ξ tree T as well.
A Kesten tree with distribution GW^∞_ξ can be seen as an infinite Markov branching tree with distribution MB^{q,q∞}_∞, where q_∞ is defined for any λ = (λ₂, . . . , λ_p) in P_{<∞} accordingly. The distribution of Kesten's tree may also be rewritten in this form.

Proof of Proposition 5.2. Let λ = (λ₂, . . . , λ_p) be an element of P_{<∞}. If there exists 2 ≤ i ≤ p such that λ_i − 1 isn't divisible by d, then for all n ∈ N, q_{n−1}(n − 1 − ‖λ‖, λ) = 0 = q_∞(∞, λ). Otherwise, for n ∈ N large enough, in light of Lemma 5.5, q_{n−1}(n − 1 − ‖λ‖, λ) → q_∞(∞, λ). Similarly, as n goes to infinity, q^L_n(n − ‖λ‖, λ) → q^L_∞(∞, λ). Since these hold for any λ in P_{<∞}, we end this proof by using Corollary 2.10.

Scaling limits, finite variance
In the remainder of this section, (T_i)_{i≥1} will denote i.i.d. Galton-Watson trees with offspring distribution ξ, (Y_n)_{n≥1} i.i.d. ξ-distributed random variables and, for all n ≥ 1, S_n := Y₁ + · · · + Y_n − n. We will also consider N, a random variable independent of both (T_i)_i and (Y_n)_n and such that N + 1 follows ξ̂.
The following so-called Otter-Dwass formula, or cyclic lemma (see [44, Chapter 6] for instance), will be the cornerstone of many forthcoming computations.

Lemma 5.6 (Otter-Dwass' formula). With these notations, for all k ≥ 1 and n ≥ 1, P(#T₁ + · · · + #T_k = n) = (k/n) P(S_n = −k).

Let q_* be the probability distribution on P_{<∞} defined by q_* = q_∞(∞, ·). Let Λ follow q_* and recall that it has the same distribution as (#T₁, . . . , #T_N)↓.
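Otter-Dwass' formula P(#T₁ + · · · + #T_k = n) = (k/n) P(S_n = −k) can be verified exactly in a small hypothetical case: for the critical binary offspring law ξ(0) = ξ(2) = 1/2, the total-progeny distribution satisfies a simple convolution recursion, and P(S_n = −k) is an explicit binomial probability since each Y_i is 0 or 2.

```python
from fractions import Fraction
from math import comb

# Critical binary offspring: xi(0) = xi(2) = 1/2 (an illustrative test case).
N = 25
p = [Fraction(0)] * (N + 1)              # p[n] = P(#T = n)
p[1] = Fraction(1, 2)                    # the root has no child
for n in range(2, N + 1):
    # the root has two children whose sub-trees carry a and n-1-a vertices
    p[n] = Fraction(1, 2) * sum(p[a] * p[n - 1 - a] for a in range(1, n - 1))

def walk_prob(n, k):
    """P(S_n = -k) where S_n = Y_1 + ... + Y_n - n and Y_i is 0 or 2,
    each with probability 1/2."""
    if n < k or (n - k) % 2 != 0:
        return Fraction(0)
    j = (n - k) // 2                     # number of steps with Y_i = 2
    return Fraction(comb(n, j), 2 ** n)
```

For instance p[3] = 1/8 and (1/3) P(S₃ = −1) = (1/3)(3/8) = 1/8, and the identity holds term by term for k = 1 and k = 2 over the whole range computed.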
In this paragraph, we'll assume that the variance σ 2 of ξ is finite and that d = 1. Recall that the Brownian tree with immigration is a (1/2, ν B , I B )-fragmentation tree with immigration. It was proved in [30, Section 5.1] that Assumption (S) of Theorem 4.2 is fulfilled for γ = 1/2 and ν = σ/2 · ν B . To prove Proposition 5.3, it will therefore be sufficient to show that Assumption (I) is satisfied for γ = 1/2 and I = σ/2 · I B . For all R ≥ 1, let q (R) be the distribution of Λ/R 2 .
Proposition 5.7. In the sense of weak convergence of finite measures on S↓, R (1 ∧ ‖s‖) q^{(R)}(ds) converges, as R goes to infinity, towards (1 ∧ ‖s‖) (σ/2) · I_B(ds). Since I_B is unary, in order to prove Proposition 5.7, it will be enough to show that Λ satisfies the assumptions of Proposition 4.19. The next two lemmas will prove that both are met.

Scaling limits, stable case
In this paragraph, we'll suppose that there exist α ∈ (1, 2) and a positive constant c such that n 1+α ξ(n) → c when n → ∞.
It was proved in [30, Section 5.2] that the family q = (q_n)_{n∈N} of first-split distributions associated to (GW^n_ξ)_{n∈N} satisfies Assumption (S) of Theorem 4.2 for γ = 1 − 1/α and ν = (c k_α)^{1/α} · ν_α. Proposition 5.3 (ii) will therefore be a consequence of the next proposition. For all R ≥ 1, write q^{(R)} for the distribution of R^{−α/(α−1)} Λ. More accurately, the convergence holds in the Skorokhod topology. This, in conjunction with Skorokhod's representation theorem, implies that there exists a sequence (X_n)_{n≥0}, defined for all n ≥ 1, which a.s. converges to (a version of) ∆.
Let F : S ↓ → R + be a Lipschitz continuous function such that F (s) ≤ 1 ∧ s, and define f in terms of F . The dominated convergence theorem ensures that the function f is continuous. It is clearly bounded by 1, and the constant K · (c k α ) appearing in the corresponding bound is larger than the Lipschitz constant of F . We will now endeavour to prove that this last quantity goes to 0 when R → ∞. For all s in S ↓ , let s ∧ 1 be the sequence (s i ∧ 1) i≥1 ; then, for any x and y in S ↓ , a straightforward comparison holds. In light of Lemma 4.18, n E[1 ∧ (#T 1 /n α )] converges to [(c k α ) 1/α Γ(2 − 1/α)] −1 . The i.i.d. nature of the sequence then ensures that E[ ∆ ∧ 1 2 ] is also finite. Moreover, since it converges, the sequence (m 1+1/α P[#T 1 = m]) m≥1 is bounded by a finite constant, say Q. Consequently, the sequence (E[ X n − X n ∧ 1 β ]) n≥1 is bounded. Since this holds for all β < 1/α, if ε is positive and such that (1 + ε)β < 1/α, then the sequence ( (X n − X n ∧ 1) − (∆ − ∆ ∧ 1) β ) n≥1 is bounded in L 1+ε . Because it converges to 0 almost surely, its mean also goes to 0 as n tends to infinity.
For all β < 1/α and ε > 0, there exist a finite constant C and a finite integer n ε such that the corresponding bound holds for all n ≥ 1. Using the same arguments as in the proof of Lemma 4.18, it is easy to prove that a similar estimate holds for any κ > α − 1. Since this holds for any positive ε, the claimed convergence follows, and we conclude with Lemma 3.21.

Cut-trees
Let τ be a finite labelled tree. If τ is made of a single vertex, let its cut-tree Cut (τ ) be the tree with a single vertex. Otherwise, define the cut-tree of τ as the (unordered) binary tree Cut (τ ) obtained by the following recursive process: remove an edge of τ chosen uniformly at random, thereby splitting τ into two subtrees τ ′ and τ ′′ , and let the two subtrees of the root of Cut (τ ) be Cut (τ ′ ) and Cut (τ ′′ ). With this definition, if τ has n vertices, then Cut (τ ) has n leaves. The cut-tree of τ represents the genealogy of its dismantling when we remove edge after edge, until all have been deleted.
Figure 5: A labelled tree τ and its cut-tree (the edges of τ are labelled in the order they are removed).
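A minimal Python sketch of this dismantling may help make the recursion concrete; the representation choices (trees as edge sets, cut-trees as nested pairs) are ours, for illustration only. Removing a uniform edge splits the tree into two components whose cut-trees become the two subtrees of the root.

```python
import random

def split(vertices, edges):
    """Connected components of (vertices, edges), by depth-first search."""
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b); adj[b].append(a)
    comps, seen = [], set()
    for v in vertices:
        if v in seen:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u); stack.extend(adj[u])
        seen |= comp; comps.append(comp)
    return comps

def cut_tree(vertices, edges):
    """Cut(tau): remove a uniform edge, recurse on the two pieces; the
    vertices of tau become the leaves of the resulting binary tree."""
    if len(vertices) == 1:
        return next(iter(vertices))
    e = random.choice(sorted(edges))
    rest = edges - {e}
    left, right = split(vertices, rest)
    e_left = {f for f in rest if f[0] in left}
    return (cut_tree(left, e_left), cut_tree(right, rest - e_left))

def leaves(t):
    return 1 if not isinstance(t, tuple) else sum(leaves(s) for s in t)

# on a 5-vertex path, Cut always has 5 leaves, whatever the edge order
path = {(i, i + 1) for i in range(1, 5)}
assert leaves(cut_tree(set(range(1, 6)), path)) == 5
```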
Cut-trees were introduced in [12] as a means of generalising the study of the number of cuts necessary to isolate a marked vertex or a finite number of marked vertices. In this section, we will study the local and scaling limits of two models of cut-trees, studied in [12] and [14], which both satisfy the Markov branching property. Also see [15] and [21] for the study of the cut-trees of conditioned Galton-Watson trees.

Cut-trees of Cayley trees
A Cayley tree of size n ≥ 1 is a labelled tree τ n chosen uniformly at random in the set of trees with n labelled vertices (for convenience, with labels 1 through n). It is well-known that, viewed as an unlabelled tree, τ n has the same distribution as an unordered Galton-Watson tree with offspring law Poisson (1) conditioned to have n vertices. For all n ≥ 1, let T n := Cut (τ n ) be the cut-tree of a Cayley tree of size n.
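As a quick illustration of the labelled-tree count behind this model, Cayley's formula n^{n−2} can be checked via the Prüfer bijection; the decoder below is a standard textbook algorithm (not part of the paper) and verifies the case n = 4 by decoding all 4^2 = 16 sequences into 16 distinct labelled trees.

```python
import heapq
from itertools import product

def prufer_to_tree(seq, n):
    """Decode a Prufer sequence over {1,...,n} (length n-2) into the
    edge set of the corresponding labelled tree on {1,...,n}."""
    deg = {v: 1 for v in range(1, n + 1)}
    for v in seq:
        deg[v] += 1
    lvs = [v for v in deg if deg[v] == 1]
    heapq.heapify(lvs)
    edges = set()
    for v in seq:
        u = heapq.heappop(lvs)          # smallest current leaf
        edges.add((min(u, v), max(u, v)))
        deg[v] -= 1
        if deg[v] == 1:                 # v occurs no further in seq
            heapq.heappush(lvs, v)
    a, b = sorted(lvs)                  # exactly two vertices remain
    edges.add((a, b))
    return frozenset(edges)

# Cayley's formula: 4^(4-2) = 16 distinct labelled trees on 4 vertices
trees = {prufer_to_tree(s, 4) for s in product(range(1, 5), repeat=2)}
assert len(trees) == 16
```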
Let (ϑ n ) n≥0 be a sequence of i.i.d. unconditioned GW Poisson (1) trees. Let T ∞ be the tree obtained by attaching, for each n ≥ 0, the cut-tree of ϑ n by an edge to the vertex at height n of an infinite branch. The aim of this section will be to prove the next two results.
Proposition 5.11. When n → ∞, T n converges to T ∞ in distribution with respect to the local limit topology.
Proposition 5.12. Endow T ∞ with the counting measure µ ∞ on its leaves. Then, as R goes to infinity, (T ∞ /R, µ ∞ /R 2 ) converges to (T B , 1/2 · µ B ) in distribution with respect to the D GHP topology, where (T B , µ B ) denotes the Brownian tree with immigration.
EJP 22 (2017), paper 95.
Markov branching property
It was stated in [12] that (T n ) satisfies the Markov branching property and, more specifically, that the distribution of T n is MB L,q n where the associated first-split distributions are given by q 1 (1) = 1 and, for all n ≥ 2, q n (p ≠ 2) = 0 together with an explicit expression for q n (n − k, k) when 1 ≤ k < n/2. The tree T ∞ can be described as an infinite Markov branching tree with distribution MB L,q,q∞ ∞ where the probability measure q ∞ is defined by q ∞ (p ≠ 2) = q ∞ (m ∞ ≠ 1) = 0 and, for all positive k, q ∞ (∞, k) = P[#ϑ = k] where ϑ is a GW Poisson (1) tree. Recall that the size of ϑ has the Borel distribution with parameter 1; therefore, for any positive k, q ∞ (∞, k) = k k−1 e −k /k!.
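These Borel(1) weights k^{k−1} e^{−k}/k! can be checked numerically: they sum to 1 (slowly, because of the heavy k^{−3/2} tail) and Stirling's formula gives the tail equivalent k^{−3/2}/√(2π). The constants in this sketch are standard facts about the Borel(1) law rather than claims from the text.

```python
import math

def borel_pmf(k):
    """q_infty(infty, k) = k^(k-1) e^(-k) / k!, in log-space for large k."""
    return math.exp((k - 1) * math.log(k) - k - math.lgamma(k + 1))

# the weights form a probability distribution, but the tail is heavy:
# mass of order (2/(pi*n))**0.5 lies beyond n, so partial sums converge slowly
total = sum(borel_pmf(k) for k in range(1, 200_001))
assert 0.99 < total < 1.0

# Stirling: borel_pmf(k) ~ k**(-1.5) / sqrt(2*pi) as k grows
ratio = borel_pmf(100_000) * 100_000**1.5 * math.sqrt(2 * math.pi)
assert abs(ratio - 1) < 0.01
```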
Local limits
For any k ≥ 1, Stirling's approximation gives q n (n − k, k) → q ∞ (∞, k) when n → ∞. We may then use Corollary 2.10 and thus prove Proposition 5.11.

Cut-trees of uniform recursive trees
A recursive tree with n vertices is a labelled tree (with labels 1 through n) such that the labels on the shortest path from 1 to any given leaf are increasing. For all n ≥ 1, let τ n denote a labelled tree chosen uniformly at random among the set of recursive trees with n vertices and call T n its cut-tree.
Define a probability measure π on N by π(n) = 1/[n(n + 1)] and let (X n , ϑ n ) n≥0 be a sequence of i.i.d. variables where, for each n, X n follows π and, conditionally on X n = ℓ, ϑ n is a uniform recursive tree with ℓ vertices. Define T ∞ as the tree obtained by attaching, for each n ≥ 0, the cut-tree of ϑ n by an edge to the vertex at height n of an infinite branch.
Proposition 5.13. In the sense of the local limit topology, T n converges in distribution to T ∞ when n → ∞.
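Both ingredients are easy to check in a few lines: the sum of 1/[n(n + 1)] telescopes to 1, so π is indeed a probability measure, and the usual sampler for a uniform recursive tree attaches each vertex i to a uniformly chosen vertex with smaller label, which forces labels to increase along paths from the root. The code below is an illustrative sketch of ours.

```python
import random

# pi(n) = 1/(n(n+1)) telescopes: sum_{n=1}^{N} pi(n) = 1 - 1/(N+1) -> 1
N = 10_000
partial = sum(1 / (n * (n + 1)) for n in range(1, N + 1))
assert abs(partial - (1 - 1 / (N + 1))) < 1e-12

def uniform_recursive_tree(n, rng=random):
    """parent[i]: vertex i attaches uniformly to one of 1..i-1 (root is 1)."""
    return {i: rng.randint(1, i - 1) for i in range(2, n + 1)}

parent = uniform_recursive_tree(50)
# labels increase along every root-to-leaf path, so the tree is recursive
assert all(parent[i] < i for i in parent)
assert len(parent) == 49            # n - 1 edges
```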
It was observed in [13] and [14] that the sequence (T n ) n≥1 is Markov branching. Moreover, we may deduce from [13, Section 2] the expression of the respective distributions q n of Λ L (T n ). Clearly, q 1 (1) = 1 and, for n ≥ 2, if X denotes a random variable with distribution π, then for all k ≤ n/2, q n (n − k, k) = P[X = k | X < n] + P[X = n − k | X < n] 1 {k ≠ n/2} .
The tree T ∞ may also be described as an infinite Markov branching tree with distribution MB L,q,q∞ ∞ where the measure q ∞ is given by q ∞ (p ≠ 2) = q ∞ (m ∞ ≠ 1) = 0 and, for all k ≥ 1, q ∞ (∞, k) = π(k).
If k is a fixed integer, then q n (n − k, k) clearly converges to q ∞ (∞, k). We conclude the proof of Proposition 5.13 with Corollary 2.10.
Remark 5.14. It was shown in [14] that (n/ log n) −1 T n converges to the real interval [0, 1] rooted at 0 and endowed with the Lebesgue measure. However, Assumption (S) does not hold here, so this scaling limit cannot be derived from our general results.

The α-γ model
In this section, we will study trees generated according to the algorithm of the α-γ model described in [19]. This algorithm was introduced as an interpolation between various models of sequentially growing trees: Rémy's algorithm [45], used to generate uniform binary trees with any number of leaves; Marchal's algorithm [42], which gives the n-dimensional marginals of Duquesne-Le Gall's stable trees (the discrete trees spanned by n leaves chosen uniformly at random in a stable tree); and Ford's α-model [24], used for instance in phylogeny.
Let 0 ≤ γ ≤ α ≤ 1. Start with T 1 := {∅}, the trivial tree, and T 2 := {∅, (1), (2)}, a tree with two leaves attached to its root. Then for n ≥ 3, conditionally on the tree T n−1 :
• Assign to each edge of T n−1 (considered as a planted tree, i.e. a tree in which a phantom edge has been attached under the root) the weight 1 − α if the edge ends with a leaf, or γ otherwise,
• Also assign to each non-leaf vertex u the weight [c u (T n−1 ) − 1]α − γ,
• Pick an edge or a vertex in T n−1 with probability proportional to these weights:
– If an edge was picked, place a new vertex at its middle and attach a new leaf to it,
– If a vertex was selected, attach a new leaf to it,
and let T n be the tree thus obtained. We will also call AG n α,γ its distribution for all n ≥ 1 and 0 ≤ γ ≤ α ≤ 1.
Remark 5.15. As mentioned at the beginning of this section, some particular choices of parameters give previously studied algorithms:
• When α = γ = 1/2, we get Rémy's algorithm [45],
• If β ∈ (1, 2), taking α = 1/β and γ = 1 − α gives Marchal's algorithm [42],
• When α = γ, this algorithm coincides with that of Ford's α-model [24].
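For illustration, the growth step can be implemented directly; the representation below (integer vertex ids, with the phantom edge identified with the edge arriving at the root) is our own choice, not the paper's. Summing the weights shows that a tree with n leaves has total weight n − α, a convenient invariant the sketch uses as a sanity check.

```python
import random

def step(children, parent, alpha, gamma, rng=random):
    """One growth move of the (alpha, gamma) algorithm. `children` maps a
    vertex id to its (possibly empty) list of children; `parent` maps it to
    its parent id (None for the root, whose incoming edge is the phantom one).
    Each vertex id also stands for the edge arriving at it from above."""
    items = []
    for u, kids in children.items():
        items.append((u, False, (1 - alpha) if not kids else gamma))  # edge into u
        if kids:                                                      # internal vertex u
            items.append((u, True, (len(kids) - 1) * alpha - gamma))
    r = rng.uniform(0, sum(w for *_, w in items))
    for u, is_vertex, w in items:
        r -= w
        if r <= 0:
            break
    leaf = max(children) + 1
    if is_vertex:                      # graft a new leaf directly on u
        children[u].append(leaf)
        children[leaf], parent[leaf] = [], u
    else:                              # subdivide the edge into u, then graft
        mid = leaf + 1
        p = parent[u]
        if p is not None:
            children[p][children[p].index(u)] = mid
        children[mid], parent[mid] = [u, leaf], p
        parent[u] = mid
        children[leaf], parent[leaf] = [], mid

def total_weight(children, alpha, gamma):
    # per leaf: 1 - alpha; per internal vertex: gamma + (c-1)alpha - gamma
    return sum(((1 - alpha) if not k else gamma + (len(k) - 1) * alpha - gamma)
               for k in children.values())

# start from T_2 and grow; check the invariant "total weight = #leaves - alpha"
children, parent = {0: [1, 2], 1: [], 2: []}, {0: None, 1: 0, 2: 0}
for _ in range(20):
    step(children, parent, alpha=0.7, gamma=0.3)
n_leaves = sum(1 for k in children.values() if not k)
assert n_leaves == 22
assert abs(total_weight(children, 0.7, 0.3) - (n_leaves - 0.7)) < 1e-9
```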
Finally, conditionally on (X n , Y n,k , τ n,k ; n ≥ 0, k ≥ 0), define T ∞ as the tree obtained by grafting, for each n ≥ 0, the concatenation of τ n,i , 0 ≤ i ≤ X n , at height n on an infinite branch, and denote by AG ∞ α,γ its distribution.
We will start our study of the α-γ model by proving the next proposition with the help of Theorem 2.9. Similar results were already proved in [47] for α = γ, and in [18, Lemma 3.8] for any 0 < γ ≤ α ≤ 1.
Proposition 5.17. For any 0 < γ ≤ α ≤ 1, the probability measure AG n α,γ converges weakly to AG ∞ α,γ as n grows to ∞ in the sense of the local limit topology.
We will then study the scaling limits of these infinite trees: Section 5.3.2 will focus on the case 0 < γ < α < 1 and Section 5.3.3, on α = γ.

Markov branching property and local limits
Proposition 1 in [19] states that the sequence (AG n α,γ ) n satisfies the Markov branching property. Moreover, the sequence q = (q n ) n of associated first-split distributions, i.e. such that q n is the law of Λ L (T n ) for all n ≥ 1, is given by q 1 (∅) = 1 and, for any n ≥ 2, by an explicit formula for q n (λ), λ = (λ 1 , . . . , λ p ) ∈ P n , with the conventions Γ(0) = ∞ and Γ(0)/Γ(0) = 1 (which will be used throughout this section).
We conclude with Corollary 2.10.
To prove this claim, we may proceed as in the proof of Proposition 5.10. The only significant difference is that the constant β used near the end of that proof must now belong to the open interval (γ, α).
Scaling limits of Ford's α-model
Let α ∈ (0, 1). Results from [31, Section 5.2] ensure that (T n ) n satisfies Assumption (S) when n → ∞. Furthermore, q ∞ is a.s. binary and Stirling's approximation ensures that q ∞ (∞, n) is equivalent to [α/Γ(1 − α)] n −1−α when n → ∞. Consequently, if Λ is such that (∞, Λ) follows q ∞ and q (R) denotes the distribution of Λ/R 1/α , Proposition 4.19 proves that Assumption (I) is also satisfied.
When α = 1
In this case, the algorithm's output is deterministic: for each n ≥ 2, a tree T n with distribution AG n 1,1 is simply a branch of length n − 1 upon which a single leaf has been grafted at each non-leaf vertex (a "comb" of length n). Similarly, an infinite tree with distribution AG ∞ 1,1 is the "infinite comb", obtained by attaching a single leaf to each vertex of the infinite branch.
As a result, if T has distribution AG ∞ 1,1 and µ T denotes the counting measure on the set of its leaves, then clearly, (T /R, µ T /R) converges as R → ∞ to the metric space R + rooted at 0 and endowed with the usual Lebesgue measure.
When α = 0
Observe that q n (n − k, k) = (2 − 1 {k=n/2} )/(n − 1). Then, for all K ≥ 1 and n large enough, the q n -probability that the smallest part of the split is at most K is at most 2K/(n − 1), which implies Λ L (T n ) → (∞, ∞) a.s. when n → ∞. Theorem 2.9 then ensures that T n converges in distribution to the complete infinite binary tree (in which every vertex has 2 children). Moreover, since T n ⊂ T n+1 a.s., this convergence happens almost surely. For any α in (0, 1) and n ∈ N ∪ {∞}, denote by q (α) n the first-split distribution (with respect to the number of leaves) associated to a tree with distribution AG n α,α . Now, fix 0 < α < β < 1 and consider a tree T with distribution MB L,q (α) ,q (β) ∞ ∞ endowed with µ T , the counting measure on the set of its leaves. We may deduce from previous results that (q (α) n ) n≥1 and q (β) ∞ satisfy Assumptions (S) and (I). As a result, (T /R, µ T /R 1/β ) converges in distribution to the metric space R + rooted at 0 and endowed with a random measure µ = Σ i≥1 s i δ ui , where Σ i≥1 δ (ui,si) is a Poisson point process on R + × S ↓ with intensity measure du ⊗ I (F) β (ds).

Local limits
In this paragraph, we will focus on the study of the local limits of T n . We will once again rely on the Markov branching nature of the model and on Theorem 2.9.
Proposition 5.23.
β ≥ −1: In the sense of the local limit topology, T n converges in distribution to the infinite binary tree.
β ∈ (−2, −1): Let X follow the beta-geometric distribution with parameters (2 + β, −1 − β) (see Section 5.3). Define q ∞ , a probability measure on P ∞ , by q ∞ (∞, k) = P[X = k − 1] for any k ≥ 1 and q ∞ (λ) = 0 if p(λ) ≠ 2 or m ∞ (λ) ≠ 1. With these notations, T n converges in distribution to MB L,q,q∞ ∞ with respect to the local limit topology.
Lemma 2.4 then ensures that q n ⇒ δ (∞,∞) . It follows from Theorem 2.9 that T n converges in distribution to the (deterministic) infinite binary tree.

Scaling limits
We will now study the scaling limits of the β-splitting model when β ∈ (−2, −1) with the help of Theorem 4.2. Let Λ denote a random integer such that (∞, Λ) has distribution q ∞ and, for all R ≥ 1, let q (R) be the distribution of Λ/R 1/(−1−β) . Just like in Section 5.3.3, Stirling's approximation and Proposition 4.19 ensure that Assumption (I) is met for γ = −1 − β and the corresponding immigration measure I.

k-ary growing trees
Let k ≥ 2 be an integer. In this section, we will study a model of k-ary trees, i.e. trees in which vertices have either 0 or k children, described in [32]. This model is yet another generalisation of Rémy's algorithm [45] (which corresponds to k = 2).
The following algorithm produces a sequence (T n ) n≥0 of k-ary trees such that for all n, T n has n internal vertices (vertices that are not leaves) or, equivalently, kn + 1 vertices, or (k − 1)n + 1 leaves. First, let T 0 be the trivial tree {∅} and, for n ≥ 1, conditionally on T n−1 :
• Pick an edge of T n−1 (considered as a planted tree) uniformly at random,
• Place a new vertex on that edge and attach k − 1 new leaves to it,
and call T n the resulting tree. We will denote the distribution of T n by GT n k .
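A direct implementation of this algorithm (with our own representation: each vertex id stands for the edge above it, the phantom edge included) lets one check the vertex and leaf counts stated above.

```python
import random

def kary_tree(k, n, seed=0):
    """Grow n steps of the k-ary algorithm: pick a uniform edge of the
    planted tree, subdivide it with a new internal vertex, and attach
    k - 1 new leaves to that vertex."""
    rng = random.Random(seed)
    children, parent, nxt = {0: []}, {0: None}, 1
    for _ in range(n):
        u = rng.choice(sorted(children))   # uniform edge = its lower endpoint
        mid, nxt = nxt, nxt + 1
        p = parent[u]
        if p is not None:
            children[p][children[p].index(u)] = mid
        children[mid], parent[mid] = [u], p
        parent[u] = mid
        for _ in range(k - 1):             # the k - 1 new leaves
            children[mid].append(nxt)
            children[nxt], parent[nxt] = [], mid
            nxt += 1
    return children

tree = kary_tree(3, 5)
assert len(tree) == 3 * 5 + 1                                # kn + 1 vertices
assert sum(1 for c in tree.values() if not c) == 2 * 5 + 1   # (k-1)n + 1 leaves
assert all(len(c) in (0, 3) for c in tree.values())          # every vertex has 0 or k children
```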
Proposition 5.26. In the sense of the local limit topology, GT n k converges weakly to GT ∞ k when n goes to ∞. The proof of this result relies on Corollary 2.10. Let Π be a (k − 1)-dimensional Dirichlet variable with parameters (1/k, . . . , 1/k). The aim of Section 5.5.2 will be to prove the next proposition.
Proof. Let (Y n ) n≥1 be i.i.d. and such that, conditionally on ∆, Y n is multinomial with parameters (1; ∆). Moreover, set Z n := Y 1 + · · · + Y n . The law of large numbers ensures that Z n /n converges almost surely to ∆. Let N be independent of ∆ and (Z n ) n and have the beta-geometric distribution with parameters (1/k, 1 − 1/k). Observe that X has the same distribution as Z N .
Define g : R + → R + by g(t) := E[G(t ∆)]. The dominated convergence theorem implies that g is continuous, and it clearly satisfies g(t) ≤ 1 ∧ t. Lemma 4.18 then ensures that R E[g(N/R k )] → k Γ(1 − 1/k) −1 ∫ 0 ∞ t −1−1/k g(t) dt. Since Z n /n a.s. converges to ∆ and because (Z n /n) − ∆ ≤ 2, we can use the dominated convergence theorem to state that for all positive ε, there exists n ε such that E[ (Z n /n) − ∆ ] < ε as soon as n ≥ n ε . Therefore, if K is the Lipschitz constant of G, the corresponding bound follows, where we have used Lemma 4.18. This last quantity in turn converges to 0 when ε → 0, which proves the desired result.
Proof of Proposition 5.27. Recall that if Λ is such that (∞, Λ) follows q • ∞ , then Λ is distributed like X ↓ . We may then deduce from Lemma 5.