Scaling limits for some random trees constructed inhomogeneously

We define some new sequences of recursively constructed random combinatorial trees, and show that, after properly rescaling graph distance and equipping the trees with the uniform measure on vertices, each sequence converges almost surely to a real tree in the Gromov-Hausdorff-Prokhorov sense. The limiting real trees are constructed via line-breaking the real half-line with a Poisson process having rate $(\ell+1)t^\ell dt$, for each positive integer $\ell$, and the growth of the combinatorial trees may be viewed as an inhomogeneous generalization of R\'emy's algorithm.


Introduction
Understanding the structure of large random trees and graphs is an important topic of much recent interest in mathematics, statistics, and science. Random trees appear in population genetics and computer science, and statistical data with network structure is now generated in many fields. One important approach to studying a large random discrete structure is to determine its limiting behavior as its size tends to infinity; in particular, the structure may converge in a suitable sense to a limit object. Two well-known illustrations of this approach are the classical functional central limit theorem and the recently developed notion of dense graph limits (so-called graphons). In this paper we are interested in a third setting that has been an important and active research area for the last 25 years: continuum tree limits of combinatorial (i.e., graph-theoretic) trees; here trees are viewed as measured metric spaces and convergence is in the Gromov-Hausdorff-Prokhorov (GHP) topology. Necessary background on the GHP topology is provided in Section 2, but roughly speaking, two measured metric spaces are close in the GHP topology if they can be isometrically embedded into a common metric space so that both their Hausdorff distance and the Lévy-Prokhorov distance between the push-forwards of their measures are small.
To clarify the upcoming discussion, we first mention how trees are viewed as metric spaces; see Evans [16] for a more thorough treatment. Throughout the article, trees are not embedded in the plane (i.e., they are unordered). A compact metric space $(\mathcal{T}, d_{\mathrm{len}})$ is a real tree if the following two properties hold for every $x, y \in \mathcal{T}$.
(1) There is a unique isometric map $f_{x,y}$ from $[0, d_{\mathrm{len}}(x,y)]$ into $\mathcal{T}$ such that $f_{x,y}(0) = x$ and $f_{x,y}(d_{\mathrm{len}}(x,y)) = y$. (2) If $g$ is a continuous injective map from $[0,1]$ into $\mathcal{T}$ such that $g(0) = x$ and $g(1) = y$, then we have $g([0,1]) = f_{x,y}([0, d_{\mathrm{len}}(x,y)])$. We call the metric $d_{\mathrm{len}}$ the intrinsic length metric on $\mathcal{T}$. For every $x, y \in \mathcal{T}$, we call $f_{x,y}$ a (non-graph-theoretic) path in $\mathcal{T}$, and denote by $|f_{x,y}| := d_{\mathrm{len}}(x,y)$ the intrinsic length of the path. For ease of notation, write $\mathcal{T}$, instead of $(\mathcal{T}, d_{\mathrm{len}})$, for a real tree. A leaf of a real tree $\mathcal{T}$ is a point $x \in \mathcal{T}$ such that $\mathcal{T} \setminus \{x\}$ is connected.
To emphasize the difference from real trees, we call graph-theoretic trees combinatorial trees. Given a combinatorial tree $T$, let $v(T)$ be the vertex-set of $T$, denote by $d_{\mathrm{gr}}$ the graph distance on $T$, and view $T$ as the metric space $(v(T), d_{\mathrm{gr}})$. All edges and paths in a combinatorial tree are understood in the graph-theoretic sense (i.e., an edge has length 1, and the length of a path is the number of edges in it). We often consider rooted trees, which are pairs $(T, u)$, where $T$ is a combinatorial (resp. real) tree, and $u$ is a distinguished vertex (resp. point) of $T$. We call $u$ the root of $(T, u)$.
The fundamental results for tree convergence in our setting are due to Aldous [4,5,6], who constructed and studied a limit object now called the Brownian continuum random tree (BCRT). Aldous showed that the BCRT is the (1) limit as the number of vertices tends to infinity of certain random combinatorial trees with rescaled edge-lengths (more specifically, the combinatorial trees are those formed from a critical Galton-Watson branching process with finite variance offspring distribution and conditioned on their numbers of vertices), (2) limit of a Poisson line-breaking construction, (3) real tree with contour process equal in distribution to Brownian excursion, and (4) real tree having a certain finite-dimensional distribution on k-leaf trees obtained as subtrees spanned by the root and k leaves chosen independently according to a mass measure.
The general versions of the constructions of Items (1) and (2) are most important for this paper. Focusing on Item (1), an important class of combinatorial trees that converge to CRTs are those given by various recursive constructions. In these models, a growing sequence of random combinatorial trees $(\mathrm{T}(n) : n \in \mathbb{N})$ is defined so that $\mathrm{T}(n+1)$ is constructed conditional on $\mathrm{T}(n)$ by adding vertices and edges according to specified random rules. Examples of such constructions are Rémy's algorithm for recursively constructing uniformly chosen leaf-labeled full binary trees [32]; Marchal's generalization of Rémy's algorithm [23]; Ford's α-model and generalizations, Chen, Ford & Winkel [12]; and others: Haas & Stephenson [21], Pitman & Winkel [29], Pitman, Rizzolo & Winkel [31], Pitman & Winkel [30], Rembart & Winkel [33].
The (sometimes Poisson) line-breaking constructions of Item (2) start with a sequence of growing random real trees $(\mathcal{T}_k : k \in \mathbb{N})$, and then a real tree is defined to be the closure of the union of the sequence. The sequence is recursively constructed: given $\mathcal{T}_k$, we create $\mathcal{T}_{k+1}$ by attaching the end of a branch of a random length to a randomly chosen point of $\mathcal{T}_k$. To describe the Poisson line-breaking construction of the scaled BCRT, let $C_1, C_2, \ldots$ be the points of an inhomogeneous Poisson process on $(0, \infty)$ with intensity measure $2t\,dt$. Then we set $\mathcal{T}_1$ to be a single branch of length $C_1$, and recursively construct $\mathcal{T}_{k+1}$ from $\mathcal{T}_k$ by attaching the end of a branch of length $C_{k+1} - C_k$ to a uniform point of $\mathcal{T}_k$. The closure of the union of this sequence is a compact metric space with a measure supported on the leaves that is the weak limit of the uniform measures on the trees of the sequence. An important remark for our purposes is that it is possible to embed Rémy's algorithm into this Poisson line-breaking construction of the BCRT, and this embedding can be used to show that uniformly chosen full binary trees with rescaled edge-lengths converge to the BCRT as the number of vertices goes to infinity. Similarly, Marchal's algorithm can be embedded into a line-breaking construction of stable trees, and this embedding can be used to show convergence of Marchal's trees with rescaled edge-lengths to continuum stable trees [17].
In this paper, we extend these ideas by defining a new family of sequences of growing recursively constructed combinatorial trees in the spirit of Rémy's algorithm and show these sequences of trees can be embedded into appropriate Poisson line-breaking constructions. We use this embedding to show that the sequence of combinatorial trees, equipped with the uniform measure on vertices, almost surely converges in the GHP topology to the closure of the union of the line-breaking constructed trees equipped with a probability measure supported on the leaves. Curien & Haas [13] recently systematically studied the trees that appear as limits, and determined useful properties regarding compactness, boundedness, asymptotic height, and Hausdorff dimension. See also the recent works of Amini, Devroye, Griffiths & Olver [11] and Haas [18] for related constructions and discussions.
1.1. Main result. For each $\ell \in \mathbb{N} := \{1, 2, 3, \ldots\}$, we define a sequence of growing random combinatorial trees endowed with the uniform probability measure and a real tree limit. To ease notation, fix $\ell \in \mathbb{N}$.
Construction of the combinatorial trees. Consider growing a sequence of random combinatorial trees $(\mathrm{T}(n) : n \in \mathbb{N})$ in the following inhomogeneous manner. Let $\mathrm{T}(0)$ be a single (graph-theoretic) edge, call one endpoint a leaf, denoted by $L_1$, and call the other endpoint the root of $\mathrm{T}(0)$, denoted by $v_0$. For each $n \in \mathbb{N}$, given $\mathrm{T}(n-1)$, insert a new vertex $v_n$ in the interior of a uniformly chosen edge of $\mathrm{T}(n-1)$. If $\ell$ divides $n$, then, at the same time as $v_n$ appears, insert an edge connecting $v_n$ and a new leaf, denoted by $L_{1+n/\ell}$. The resulting tree $\mathrm{T}(n)$ is rooted at $v_0$. Note that for $k \in \mathbb{N}$, $\mathrm{T}(k\ell-1)$ has $k$ leaves and $\mathrm{T}(k\ell)$ has $k+1$ leaves. In addition, for all $n \in \mathbb{N}$, let $\nu_n$ be the uniform probability measure over $v(\mathrm{T}(n))$. Note that for $\ell \ge 2$ there are degree-2 vertices in the trees and that the case $\ell = 1$ coincides with Rémy's algorithm [32].
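The growth rule above can be sketched in a few lines of Python. This is only an illustrative simulation, not code from the paper: the function name `grow_tree`, the integer vertex labels (0 for the root $v_0$, 1 for the leaf $L_1$) and the edge-list representation are assumptions made for the example.

```python
import random

def grow_tree(n_steps, ell, seed=None):
    """Grow T(0), ..., T(n_steps) and return (edges, leaves) of the last tree.

    Hypothetical representation: vertices are integers, 0 is the root v_0,
    1 is the leaf L_1, and the tree is stored as a list of edges (a, b).
    """
    rng = random.Random(seed)
    edges = [(0, 1)]   # T(0): a single edge joining the root and L_1
    leaves = [1]
    next_id = 2
    for n in range(1, n_steps + 1):
        # Insert v_n in the interior of a uniformly chosen edge (a, b).
        i = rng.randrange(len(edges))
        a, b = edges[i]
        v = next_id
        next_id += 1
        edges[i] = (a, v)
        edges.append((v, b))
        # If ell divides n, attach a new leaf to v_n at the same time.
        if n % ell == 0:
            leaf = next_id
            next_id += 1
            edges.append((v, leaf))
            leaves.append(leaf)
    return edges, leaves

if __name__ == "__main__":
    edges, leaves = grow_tree(n_steps=12, ell=3, seed=0)
    print(len(edges), "edges,", len(leaves), "leaves")
```

With $\ell = 3$ and $n = 12 = 4\ell$, the sketch returns a tree with $4 + 1$ leaves, in line with the leaf count noted in the construction.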
Construction of the limiting real trees. The limiting real trees have been recently studied [13] and are generalizations of the line-breaking construction for the BCRT described above and due to Aldous [6]. Given $a = (a_k : k \in \mathbb{N}) \subset \mathbb{R}_+$, we construct a sequence of random real trees $(\mathcal{T}^a_k : k \in \mathbb{N})$ by starting with $\mathcal{T}^a_1$, which is made of a single branch of length $a_1$. For integer $k \ge 2$, we recursively construct $\mathcal{T}^a_k$ from $\mathcal{T}^a_{k-1}$ by attaching the end of a branch of length $a_k$ to a point chosen uniformly from $\mathcal{T}^a_{k-1}$. For all $k \in \mathbb{N}$, root $\mathcal{T}^a_k$ at an arbitrarily fixed end of the initial branch. Furthermore, let $\mathcal{T}^a$ be the closure of $\mathcal{T}^a_k$ as $k \to \infty$. Next, write $C_0 = 0$, and let $C_1, C_2, \ldots$ be the times in $(0, \infty)$ of an inhomogeneous Poisson process of rate $(\ell+1)t^\ell\,dt$. For all $k \in \mathbb{N}$, write $\mathcal{T}_k = \mathcal{T}_k^{(C_k - C_{k-1} : k \in \mathbb{N})}$. Finally, let $\mathcal{T}$ be the completion of $\mathcal{T}_k$ as $k \to \infty$, which is a random real tree with intrinsic length metric $d_{\mathrm{len}}$. Curien & Haas [13] show that the limit tree $\mathcal{T}$ is almost surely compact and has a natural "uniform" probability measure supported on the leaves; see Theorems 1.5 and 1.7 below. Note that the case $\ell = 1$ is exactly the Poisson line-breaking construction of the BCRT.
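For readers who wish to experiment, the Poisson times $C_1, C_2, \ldots$ can be sampled by a standard time change: the mean measure of the process is $\int_0^t (\ell+1)s^\ell\,ds = t^{\ell+1}$, so if $\Gamma_1 < \Gamma_2 < \cdots$ are the points of a unit-rate Poisson process, then $C_k = \Gamma_k^{1/(\ell+1)}$ has the required law. The sketch below uses this fact; the function names `poisson_points` and `branch_lengths` are hypothetical and chosen only for this example.

```python
import random
from itertools import accumulate

def poisson_points(k, ell, seed=None):
    """Sample C_1 < ... < C_k for the intensity (ell + 1) * t**ell dt.

    Uses the time change C_j = Gamma_j ** (1 / (ell + 1)), where Gamma_j
    are the arrival times of a unit-rate homogeneous Poisson process.
    """
    rng = random.Random(seed)
    gammas = accumulate(rng.expovariate(1.0) for _ in range(k))
    return [g ** (1.0 / (ell + 1)) for g in gammas]

def branch_lengths(points):
    """Return the branch lengths a_j = C_j - C_{j-1}, with C_0 = 0."""
    previous = [0.0] + points[:-1]
    return [c - p for p, c in zip(previous, points)]

if __name__ == "__main__":
    pts = poisson_points(k=5, ell=2, seed=1)
    print(pts)
    print(branch_lengths(pts))
```

The branch lengths produced this way are the inputs $a_k = C_k - C_{k-1}$ of the line-breaking construction; attaching each branch at a uniform point of the current tree is not implemented in this sketch.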
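We can now state our main result. For any $a > 0$, write $a \cdot d_{\mathrm{gr}}$ for the metric given by $(a \cdot d_{\mathrm{gr}})(x, y) = a \cdot d_{\mathrm{gr}}(x, y)$. For the remainder of the paper, define $\alpha = \alpha(\ell) = \frac{\ell}{\ell+1}$ and $c = c(\ell) = \frac{\ell^{\alpha}}{\ell+1}$.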
Theorem 1.1. There is a probability space where we can construct copies of $(\mathrm{T}(n) : n \in \mathbb{N})$ and $(\mathcal{T}_k : k \in \mathbb{N})$ such that the following holds. There almost surely exists a probability measure $\mu$ supported by the leaves of $\mathcal{T}$ such that $\big(v(\mathrm{T}(n)), \frac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \nu_n\big) \to (\mathcal{T}, d_{\mathrm{len}}, \mu)$ almost surely for the Gromov-Hausdorff-Prokhorov topology as $n \to \infty$.
The proof of Theorem 1.1 follows from three main steps. First, we can embed the combinatorial trees into the Poisson line-breaking trees (Proposition 1.2). The embedding follows from beta-gamma algebra and is similar in spirit to that described in [17, Proposition 3.7] for Marchal's algorithm. Second, we can use the embedding to show that for $\mathrm{T}_k(n)$ defined to be the subtree of $\mathrm{T}(n)$ spanned by the root and the first $k$ leaves, $\mathrm{T}_k(n)$ is close to $\mathcal{T}_k$ even for growing $k$ (Proposition 1.3). Essentially this requires careful analysis of distances and masses in the combinatorial tree, which in turn boils down to understanding a time-inhomogeneous Pólya urn model studied by Peköz, Röllin & Ross [27,28], where distributional convergence results complementary to this paper are derived. Note also that once the correspondence to the urn model is made (in Section 5.1), the choice of the scaling constant $c$ agrees with that of [28, Proposition 2.1]; in our work, $c$ is chosen to cancel the leading term in (4.7). Finally, we show that what is left over in $\mathrm{T}(n)$ outside of $\mathrm{T}_k(n)$ is sufficiently small (Proposition 1.4). This tightness argument requires careful analysis of two Pólya urn models and an understanding of exchangeable random "decorated" masses.
The layout of the remainder of the paper is as follows. We present the three key propositions and a detailed proof outline in the last subsection of this introduction. In Section 2, we provide necessary background on GHP convergence, and then we prove the three propositions in Sections 3, 4, and 5. We conclude this subsection of the introduction with a few remarks contextualizing our result and discussing further work.
For related work, as previously discussed, there is much interest in limits of recursively constructed trees. However, typically the models considered have some nice consistency properties such as Markov branching (see [19] for a recent review), perhaps with some consistent leaf-labeling, e.g., a regenerative structure as in [31], or having fully exchangeable leaf-labels. By consideration of small cases, it is clear that the leaf-labeling in our models is not exchangeable and the combinatorial trees do not have the Markov branching property so we cannot directly apply the general theory developed for such models. Also, it is unusual for recursively built combinatorial tree models of the kind studied here to allow for degree-2 vertices and this case is excluded from some studies. Having GHP convergence results for an example falling outside the general theory is interesting in its own right, but may also lead to further natural classes of models and general theory.
There are many avenues for future study. The most obvious open problem is to provide a description analogous to Items (3) and (4) above for the limit trees. Moreover, there are many other decompositions and properties of recursively defined trees and their limits that are important and appear in the CRT literature; what are the analogs of these in our setting? Note that our combinatorial trees provide one path to understanding properties of the limit trees.
1.2. Proof outline of Theorem 1.1. First, we examine the topologies of the combinatorial trees and the real trees. The idea is to embellish the real trees with random vertices so that the resulting trees, equipped with the graph distance, have the same law as the combinatorial trees.
Embellished trees. Write $\mathcal{T}(0) = \mathcal{T}_1$. For each $k \in \mathbb{N}$ and $i \in \{1, \ldots, \ell-1\}$, let $\mathcal{T}((k-1)\ell+i)$ be obtained by inserting a vertex at a point chosen uniformly with respect to the normalized Lebesgue measure on $\mathcal{T}((k-1)\ell+i-1)$. Let $\mathcal{T}'_k$ be formed by inserting a vertex uniformly in $\mathcal{T}((k-1)\ell+\ell-1)$, and define $\mathcal{T}(k\ell)$ by attaching a branch of length $C_{k+1} - C_k$ to this last inserted vertex. We call $\mathcal{T}(1), \mathcal{T}(2), \ldots$ the embellished trees, rooted at the same point as $\mathcal{T}_1$. This construction is analogous to that of the combinatorial trees.
All vertices inserted in the above manner are called the embellished vertices. For all $k \in \mathbb{N}$ and $i \in \{1, \ldots, \ell-1\}$, if we forget about the embellished vertices, then $\mathcal{T}((k-1)\ell+i)$ with the intrinsic length metric has the same law as the real tree $\mathcal{T}_k$. A leaf of the embellished tree $\mathcal{T}((k-1)\ell+i)$ is the corresponding leaf of $\mathcal{T}_k$. A vertex of the embellished tree $\mathcal{T}(n)$ is either an embellished vertex, a leaf, or the root. Denote by $v(\mathcal{T}(n))$ the set of vertices of $\mathcal{T}(n)$. We view $\mathcal{T}(n)$ as the union of the (non-graph-theoretic) branches and the vertices, i.e., a hybrid of the real tree and the combinatorial tree.
For all integers $n, k$ with $n \ge (k-1)\ell$, let $\mathcal{T}_k(n)$ be the subtree of the embellished tree $\mathcal{T}(n)$ spanned by the root and the first $k$ leaves (in the order of appearance). Analogously, write $\mathrm{T}_k(n)$ for the subtree of $\mathrm{T}(n)$ spanned by the root and the first $k$ leaves.
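The embellished trees give a way to couple $\mathrm{T}(n)$ and $\mathcal{T}(n)$ as follows. Recall that we often write $\mathrm{T}_k(n) = (v(\mathrm{T}_k(n)), d_{\mathrm{gr}})$ and $\mathcal{T}_k = (\mathcal{T}_k, d_{\mathrm{len}})$.
Proposition 1.2. There is a probability space where we can construct copies of $(\mathrm{T}(n) : n \in \mathbb{N})$, $(\mathcal{T}(n) : n \in \mathbb{N})$, and $(\mathcal{T}_k : k \in \mathbb{N})$ such that $(v(\mathcal{T}_k(n)), d_{\mathrm{gr}}) = \mathrm{T}_k(n)$ and $(\mathcal{T}_k(n), d_{\mathrm{len}}) = \mathcal{T}_k$, for all integers $k, n$ with $n \ge (k-1)\ell$, equalities considered up to isometry-equivalence.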
The proof of Proposition 1.2, given in Section 3, relies on the fact that, when vertices are inserted into the embellished tree, branches are fragmented into Dirichlet-distributed lengths. Proposition 1.2 gives us a direct coupling to compare the rescaled graph-theoretic path-lengths of $\mathrm{T}_k(n)$ and the corresponding intrinsic path-lengths of $\mathcal{T}_k$, which leads to our next result showing that the combinatorial trees spanned by a subset of leaves and the analogous subtree of the limit tree are close.
Before stating the result, we need some facts and notation. Firstly, as discussed in greater detail just below and in Section 2, all the metric spaces appearing in this paper are compact, and so in fact we can define a distance on such metric spaces (modulo isometry-equivalence), denoted $d_{\mathrm{GHP}}$, which induces the GHP topology. Next, for all $k \in \mathbb{N}$, let $\mu_k$ be the normalized Lebesgue length measure on $\mathcal{T}_k$. For all integers $k, n$ with $n \ge (k-1)\ell$, write $\nu_{k,n}$ for the uniform probability measure over $v(\mathrm{T}_k(n))$. For two sequences $g(n), f(n)$, write $g(n) = \Omega(f(n))$ if there exists $C > 0$ such that $g(n) \ge C f(n)$ for all $n$, and $g(n) = o(f(n))$ if $g(n)/f(n) \to 0$ as $n \to \infty$. , which says that the number of vertices along a path in $\mathcal{T}_{k(n)}(n)$ has order $c^{-1} n^{\alpha}$ times the Lebesgue length of the path, and that the vertices are regularly distributed.
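The order $c^{-1} n^{\alpha}$ can be motivated by a back-of-the-envelope count in the embellished trees. The following is only a heuristic sketch and not part of the proofs: it assumes the approximation $C_j \approx j^{1/(\ell+1)}$ (reasonable because $C_j^{\ell+1}$ is the $j$-th point of a unit-rate Poisson process), ignores all fluctuations, and uses an illustrative segment of length $s$ that is already present in $\mathcal{T}_1$; the symbol $s$ is not notation from the paper.

```latex
% Each of the steps (j-1)\ell+1, ..., j\ell inserts a vertex uniformly on a
% tree of total length C_j, so the segment receives each such vertex with
% probability s / C_j.
\begin{align*}
\#\{\text{vertices on the segment at time } n\}
  &\approx \sum_{j=1}^{\lfloor n/\ell \rfloor} \ell \cdot \frac{s}{C_j}
   \approx \ell s \sum_{j=1}^{\lfloor n/\ell \rfloor} j^{-1/(\ell+1)} \\
  &\approx \ell s \cdot \frac{(n/\ell)^{\alpha}}{\alpha}
   = s \cdot \frac{\ell+1}{\ell^{\alpha}} \cdot n^{\alpha}
   = s \, c^{-1} n^{\alpha},
\end{align*}
% using 1 - 1/(\ell+1) = \alpha, \ \alpha^{-1} = (\ell+1)/\ell, and c = \ell^{\alpha}/(\ell+1).
```

Under these assumptions, multiplying graph distances by $c\,n^{-\alpha}$ is exactly the normalization that turns vertex counts along a path back into lengths.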
Next, to ensure that $\mathrm{T}_k(n)$ is close to $\mathrm{T}(n)$, we need a tightness property of the sequence $(\mathrm{T}_k(n) : k \in \mathbb{N}, n \ge k\ell)$, i.e., the Hausdorff distance between $\mathrm{T}_{k(n)}(n)$ and $\mathrm{T}(n)$ is diminishing, and the Lévy-Prokhorov distance between their uniform probability measures also vanishes in the limit. Recall that $\nu_n$ is the uniform probability measure over $v(\mathrm{T}(n))$.
Proposition 1.4. Suppose $k : \mathbb{N} \to \mathbb{N}$ satisfies $k(n) = \Omega(n^{1/100})$ and $k(n) = o(n^{1/3})$, and assume now $\ell \ge 2$. Then, almost surely as $n \to \infty$, $d_{\mathrm{GHP}}\big( (v(\mathrm{T}_{k(n)}(n)), \frac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \nu_{k(n),n}), (v(\mathrm{T}(n)), \frac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \nu_n) \big) \to 0$.
Note the restriction in Proposition 1.4 to $\ell \ge 2$, which stems from Lemma 5.4 and in particular the proof of Lemma 5.12. The restriction is due to balancing asymptotic terms, and probably some version of the proposition and these lemmas holds for $\ell = 1$, but convergence in this case is well-covered in the literature and so it is enough for us to consider $\ell \ge 2$. All other lemmas and propositions in the paper hold for $\ell = 1$.
To establish Proposition 1.4, we deduce a height-bound for the subtrees of $\mathrm{T}(n)$ pendant to $\mathrm{T}_{k(n)}(n)$, and we also show that subtrees pendant to $\mathrm{T}_{k(n)}(n)$ are "uniformly asymptotically negligible" (a similar property is used in Addario-Berry & Wen [3], Wen [36]). That is, Lemmas 5.1 and 5.2 imply that with probability $1 - o(1)$, the maximal height of the subtrees of $\mathrm{T}(n)$ pendant to $\mathrm{T}_{k(n)}(n)$ has order $o(n^{\alpha})$ (yielding GH convergence), and Lemma 5.12 implies that the maximal size of the subtrees has order $o\big(n \cdot k(n)^{-8/(3(\ell+1))}\big)$. By projecting the masses of pendant subtrees onto $\mathrm{T}_{k(n)}(n)$, we can deduce a bound on the relevant Lévy-Prokhorov distance. Details are given in Section 5.
As shown in the next several results of [13], $\mathcal{T}$ is almost surely compact, which allows for the convergence to hold in the GHP topology instead of, say, the local GHP topology.
Theorem 1.5 ([13]). Suppose that there exists $\alpha' \in (0, 1]$ such that for $a := (a_k : k \in \mathbb{N}) \subset \mathbb{R}_+$ we have $a_k \le k^{-\alpha' + o(1)}$ and $\sum_{i=1}^{k} a_i = k^{1-\alpha'+o(1)}$ as $k \to \infty$. Then $\mathcal{T}^a$ is almost surely a compact real tree.
Fact 1.6 ([13]). If $a_k = C_k - C_{k-1}$ for $k \in \mathbb{N}$, then almost surely $a := (a_k : k \in \mathbb{N})$ satisfies the assumption in Theorem 1.5 for $\alpha' := \frac{\ell}{\ell+1}$.
Theorem 1.7 ([13, Theorem 4]). Almost surely, there exists a probability measure $\mu$ supported by the leaves of $\mathcal{T}$ such that $\mu_k \to \mu$ weakly as $k \to \infty$.
With these results we can now prove Theorem 1.1.
Proof of Theorem 1.1. The case $\ell = 1$ is just the well-known almost sure GHP convergence of uniform ordered binary trees with uniform measure to the BCRT (e.g., Curien & Haas [14, Theorem 5]), so we assume $\ell \ge 2$. We work on the probability space where the equalities of Proposition 1.2 hold, and condition on the a.s. event that $\mathcal{T}$ is compact and $\mu$ exists, where $\mu$ is the uniform probability measure supported by the leaves of $\mathcal{T}$.

Gromov-Hausdorff-Prokhorov topology
In this section we review the definition of GHP distance and the topology it induces, referring the reader to the papers by Miermont [...].
We first give the standard and intuitive definition of GHP distance. A measured metric space is a triple $(V, d, \nu)$ where $(V, d)$ is a metric space and $\nu$ is a finite non-negative Borel measure on $V$. Let $Z := (Z, \delta)$ be a metric space. Given non-empty $A \subset Z$ and $\varepsilon > 0$, the $\varepsilon$-neighborhood of $A$ is $A^{\varepsilon} := A^{\varepsilon}_{\delta} := \{x \in Z : \exists y \in A, \delta(x, y) < \varepsilon\}$. The Hausdorff distance $\delta_H$ between two non-empty subsets $X, Y$ of $Z$ is $\delta_H(X, Y) = \inf\{\varepsilon > 0 : X \subset Y^{\varepsilon} \text{ and } Y \subset X^{\varepsilon}\}$. Next, denote by $\mathcal{P}(Z)$ the collection of all finite non-negative Borel measures on the measurable space $(Z, \mathcal{B}(Z))$, where $\mathcal{B}(Z)$ denotes the Borel σ-algebra of $Z$. The Lévy-Prokhorov distance $\delta_P : \mathcal{P}(Z)^2 \to [0, \infty)$ between two measures $\nu$ and $\nu'$ on $Z$ is $\delta_P(\nu, \nu') = \inf\{\varepsilon > 0 : \nu(A) \le \nu'(A^{\varepsilon}) + \varepsilon \text{ and } \nu'(A) \le \nu(A^{\varepsilon}) + \varepsilon, \ \forall A \in \mathcal{B}(Z)\}$.
We can now define the standard metric used to define the Gromov-Hausdorff-Prokhorov topology. For two measured metric spaces $\mathrm{V} = (V, d, \nu)$ and $\mathrm{V}' = (V', d', \nu')$, let $d^{\circ}_{\mathrm{GHP}}(\mathrm{V}, \mathrm{V}') = \inf \max\{\delta_H(\varphi(V), \varphi'(V')), \ \delta_P(\varphi_*\nu, \varphi'_*\nu')\}$, where the infimum is over all metric spaces $Z$ and all isometries $\varphi, \varphi'$ from $V, V'$ into $Z$, and where $\varphi_*\nu$ and $\varphi'_*\nu'$ denote push-forward measures. On the space of measured metric spaces modulo isometry-equivalence (measured metric spaces $(V, d, \nu)$ and $(V', d', \nu')$ are isometry-equivalent if there exists a measurable bijective isometry $\Phi : V \to V'$ such that $\Phi_*\nu = \nu'$), $d^{\circ}_{\mathrm{GHP}}$ is a metric that induces the GHP topology.
The definition above can be difficult to use, so we now state some alternative notions and results for showing GHP convergence. For $(V, d)$ and $(V', d')$ two metric spaces, a correspondence between $V$ and $V'$ is a set $R \subset V \times V'$ such that for every $x \in V$, there is $x' \in V'$ with $(x, x') \in R$, and vice versa. We write $\mathcal{R}(V, V')$ for the set of correspondences between $V$ and $V'$. The distortion of any $R \in \mathcal{R}(V, V')$ with respect to $d$ and $d'$ is $\mathrm{dis}(R; d, d') = \sup\{|d(x, y) - d'(x', y')| : (x, x') \in R, (y, y') \in R\}$. Furthermore, let $M(V, V')$ be the set of finite non-negative Borel measures on $V \times V'$. Denote by $p$ and $p'$ the projections from $V \times V'$ to $V$ and $V'$, respectively. Let $\nu$ and $\nu'$ be finite non-negative Borel measures on $(V, d)$ and $(V', d')$, respectively. The discrepancy of $\pi \in M(V, V')$ with respect to $\nu$ and $\nu'$ is $D(\pi; \nu, \nu') = \|\nu - p_*\pi\| + \|\nu' - p'_*\pi\|$, where $\|\cdot\|$ denotes the total variation for a signed measure. Given measured metric spaces $\mathrm{V} = (V, d, \nu)$ and $\mathrm{V}' = (V', d', \nu')$, we define the Gromov-Hausdorff-Prokhorov distance by $d_{\mathrm{GHP}}(\mathrm{V}, \mathrm{V}') = \inf \max\{\tfrac{1}{2}\,\mathrm{dis}(R; d, d'), \ D(\pi; \nu, \nu'), \ \pi(R^c)\}$, where the infimum is over all $R \in \mathcal{R}(V, V')$ and $\pi \in M(V, V')$.
Writing $\mathcal{K}$ for the set of all compact measured metric spaces modulo isometry-equivalence, $(\mathcal{K}, d_{\mathrm{GHP}})$ is a Polish space; see Abraham, Delmas & Hoscheit [1]. GHP convergence refers to convergence in this space. (It can be shown that $d^{\circ}_{\mathrm{GHP}}$ and $d_{\mathrm{GHP}}$ induce the same topology on $\mathcal{K}$.) The Gromov-Hausdorff distance between two metric spaces $(V, d)$ and $(V', d')$ is given by $d_{\mathrm{GH}}((V, d), (V', d')) = \inf \tfrac{1}{2} \cdot \mathrm{dis}(R; d, d')$, where the infimum is over all $R \in \mathcal{R}(V, V')$.
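For finite metric spaces, the distortion of a given correspondence is a finite maximum, and half of it is an upper bound on the Gromov-Hausdorff distance, since the latter is an infimum over all correspondences. The sketch below illustrates this; the function names `is_correspondence` and `distortion` and the toy spaces are assumptions made for the example, not objects from the paper.

```python
def is_correspondence(R, V, V_prime):
    """Check that every point of V and of V_prime appears in some pair of R."""
    return {x for x, _ in R} == set(V) and {xp for _, xp in R} == set(V_prime)

def distortion(R, d, d_prime):
    """Return max |d(x, y) - d'(x', y')| over all (x, x'), (y, y') in R."""
    return max(
        abs(d(x, y) - d_prime(xp, yp))
        for (x, xp) in R
        for (y, yp) in R
    )

if __name__ == "__main__":
    # Toy example: a path with three vertices versus a two-point space.
    V = [0, 1, 2]
    V_prime = [0.0, 2.0]
    dist = lambda a, b: abs(a - b)
    R = [(0, 0.0), (1, 0.0), (2, 2.0)]
    assert is_correspondence(R, V, V_prime)
    # Half the distortion of any correspondence bounds d_GH from above.
    print("GH upper bound:", 0.5 * distortion(R, dist, dist))
```

This is the way correspondences are used in the sequel: one exhibits a specific correspondence and bounds its distortion, rather than computing the infimum.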
Moreover, $(C_k : k \in \mathbb{N})$ are the points of an inhomogeneous Poisson process on $(0, \infty)$ with intensity $(\ell+1)t^{\ell}\,dt$.
Next, recall the definition of the embellished tree $\mathcal{T}(n)$ from Section 1. A (non-graph-theoretic) path, $S$, in an embellished tree $\mathcal{T}(n)$ is defined as the corresponding path in the underlying real tree, and the path-length, $|S|$, is equal to the intrinsic length of $S$. An edge in an embellished tree is a path between two adjacent vertices, and the edge-length refers to the path-length of the edge. We show that the rescaled edge-lengths of the embellished tree $\mathcal{T}(n)$ are Dirichlet distributed.
Recall that, for all $k \in \mathbb{N}$, $\mathcal{T}'_k$ is the embellished tree $\mathcal{T}(k\ell)$ without the latest branch (i.e., the $(k+1)$-th branch, of length $C_{k+1} - C_k$), but it includes the embellished vertex to which the $(k+1)$-th branch is to be attached. For all $k \in \mathbb{N}$ and $i \in \{0, \ldots, \ell-1\}$, let $E_{k,i}(0), \ldots, E_{k,i}((k-1)(\ell+1)+i)$ be the edge-lengths of the embellished tree $\mathcal{T}((k-1)\ell+i)$, in the order of appearance. For $i = \ell$, let $E_{k,i}(0), \ldots, E_{k,i}((k-1)(\ell+1)+i)$ be the edge-lengths of $\mathcal{T}'((k-1)\ell+i)$. Finally, let $E_{k+1,0}((\ell+1)k) = C_{k+1} - C_k$. If two edges appear at the same time, the one closer to the root has a smaller index.
Proof. We prove by induction on i and k. Let U " Uniformp0, 1q, independent of everything else. E 1,0 p0q " Dirp1q is trivial. For k " 1 and i " 1, we may assume that . Now suppose that the claim holds for some k P N and i P t0, . . . , ℓ´1u. We are about to insert a vertex uniformly over T ppk´1qℓ`iq, for the normalized Lebesgue length measure. Let V " Uniformp0, 1q, independent of everything else. Conditioned on selecting the edge with length E k,i pjq to insert such a vertex, for an appropriate j, by Lemma 3.1 we have that the pk´1qpℓ`1q`i`2 dimensional vector 1 The above holds regardless of the choice of j, so it also holds without conditioning and the claim follows for k and i`1.
Next, we show that the claim holds for k`1 and i " 0 as well. Recall that T pkℓq is obtained from attaching a branch of length C k`1´Ck to T 1 k . So It follows from Fact 3.3 that we may write We have proved that the claim holds for k and i " ℓ: Then by Fact 3.2, The lemma follows by induction.
Recall that $v(\mathcal{T}(n))$ is the union of the embellished vertices, the leaves, and the root, and recall the definition of the combinatorial tree $\mathrm{T}(n)$ from Section 1.
Proof. We prove the claim by induction on $n$. For $n = 0$, $(v(\mathcal{T}(0)), d_{\mathrm{gr}}) = \mathrm{T}(0)$ (both consist of an edge). Now assume that, for $n \in \mathbb{N}$ such that $\ell$ does not divide $n$, $(v(\mathcal{T}(n-1)), d_{\mathrm{gr}}) = \mathrm{T}(n-1)$. We are about to insert a vertex into $\mathcal{T}(n-1)$ and $\mathrm{T}(n-1)$ respectively. It follows from Lemmas 3.1 and 3.5 that the new vertex has equal probability to land on any edge of $\mathcal{T}(n-1)$. This holds true for the insertion into $\mathrm{T}(n-1)$ as well, by construction. It then follows from the induction hypothesis that $(v(\mathcal{T}(n)), d_{\mathrm{gr}})$ and $\mathrm{T}(n)$ have the same law. We may and shall assume that $(v(\mathcal{T}(n)), d_{\mathrm{gr}}) = \mathrm{T}(n)$. Next, we show that the claim also holds for $n \in \mathbb{N}$ such that $\ell$ divides $n$, assuming that $(v(\mathcal{T}(n-1)), d_{\mathrm{gr}}) = \mathrm{T}(n-1)$. After inserting a vertex into both $\mathcal{T}(n-1)$ and $\mathrm{T}(n-1)$ as above, we additionally attach a new branch to the last inserted vertex. The resulting trees are $\mathcal{T}(n)$ and $\mathrm{T}(n)$. It is easily seen that their laws are the same, and we may view them as equal. The lemma then follows by induction.
This lemma immediately yields Proposition 1.2. Recall that $\mathcal{T}_k(n)$ is the subtree of $\mathcal{T}(n)$ spanned by the root and the first $k$ leaves.
Proof of Proposition 1.2. By the constructions above, pT k : k P Nq has the same distribution as ppT k ppk´1qℓq, d len q : k P Nq " ppT k ppk´1qℓ`1q, d len q : k P Nq " . . . We may and shall assume that for all integers k, n with n ě pk´1qℓ. In the product of the probability spaces where (3.2) and (3.3) hold respectively, the proposition easily follows.

Almost sure convergence for subtrees with finite leaves
Hereafter, we work in the probability space where the equalities of Proposition 1.2 hold. In this section we prove Proposition 1.3, which requires the following lemmas.
Corollary 4.2. Let k : N Ñ N be such that kpnq α " opn α´1{4 q and kpnq Ñ 8. Then for sufficiently large n, with probability greater than 1´2 tn{ℓu ř m"kpnq Cm ă`n ℓ˘α´1 α`5 n 1{4¯. We defer the straightforward proofs of Lemma 4.1 and Corollary 4.2 to Section 6. For all k P N, let Bk be the Borel σ-algebra of T k (i.e., of pT k pnq, d len q, in view of Proposition 1.2). Note that conditioned on T k , we do not know any information of the embellished vertices. Given S P Bk , write |S| " µ k pSq¨C k ; so when S is a path, |S| is the intrinsic path-length. For all integer n ě pk´1qℓ, let M pS, nq be the number of vertices of T k pnq on S, and write x M pS, nq " c n α¨M pS, nq.
Fact 4.3. Fix k P N and S P Bk . Then for all integer j ě k, given T k and C j , and given C k , C k`1 , . . ., the variables M pS, jℓq´M pS, pj´1qℓq ( jěk are independent. We first define a nice event, then prove an exponential bound given such an event. For all integers k, n with n ě kℓ, define the event F k,n as F k,n :" . (4.1) Given k : N Ñ N such that kpnq Ñ 8 and kpnq " opn 1{2 q, by Corollary 4.2, with sufficiently large n, Given an event F , the notation F c denotes the complement of F . The next result is the key to the results of this section, which says that the rescaled number of vertices falling into a subset S of the tree has the same asymptotics as |S|.
where the last inequality holds because $k = k(n) = \Omega((\log n)^{10})$. Then it follows by (4.3) that Notice that the second term $e^{-k^{1/3}}$ in the bound does not depend on $\varepsilon$.
Lemma 4.4 easily leads to the Gromov-Hausdorff (GH) version of Proposition 1.3; see (4.9) for the argument. To extend the result to GHP convergence (see Section 2), we need to consider the measures on the trees. First we simplify notation; given appropriate $k : \mathbb{N} \to \mathbb{N}$, write $\mathcal{T}_n = \mathcal{T}_{k(n)}(n)$. We need to bound the minimal discrepancy with respect to the uniform probability measures $\nu_{k(n),n}$ on $(v(\mathcal{T}_n), d_n)$ and $\mu_{k(n)}$ on $(\mathcal{T}_n, d_{\mathrm{len}})$. To accomplish that, we show that $\nu_{k(n),n}$ and $\mu_{k(n)}$ are close.
Corollary 4.5. Fix ε ą 0. Let k : N Ñ N be such that kpnq " Ω`plog nq 10˘a nd kpnq " opn 1{10 q. Then for all S P Bk pnq , as n Ñ 8, where the rate of decay does not depend on S.
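Proof. Fix sufficiently large $n \in \mathbb{N}$ and write $k = k(n)$. Let $S \in \mathcal{B}_k$. Recall that $|S| = \mu_k(S) \cdot C_k$, $|\nu_{k,n}(v(S))| \le 1$, and $\widehat{M}(S, n) = \frac{c}{n^{\alpha}} \cdot M(S, n) = \frac{c}{n^{\alpha}} \cdot \nu_{k,n}(v(S)) \cdot |v(\mathcal{T}_n)|$. Next, note that, conditioned on $\mathcal{T}_k$, the event $\{|\nu_{k,n}(v(S)) - \mu_k(S)| > 2\varepsilon\}$ is a subset of the union of the events $\Big\{ \Big| \nu_{k,n}(v(S)) \cdot \frac{c \cdot |v(\mathcal{T}_n)|}{n^{\alpha}} - \mu_k(S) \cdot C_k \Big| > \varepsilon \cdot C_k \Big\} \cup \Big\{ \Big| \frac{c \cdot |v(\mathcal{T}_n)|}{n^{\alpha}} - C_k \Big| \cdot \nu_{k,n}(v(S)) > \varepsilon \cdot C_k \Big\}$.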
On the event F k,n from (4.1), C k ą 1. It then follows from the triangle inequality that the last inequality follows by applying Lemma 4.4 twice. Now, replacing ε by ε 2¨11k 2 ℓ`1 in the above inequality and noticing that given the event F k,n , ε The lemma then follows from that k " kpnq " Ωpplog nq 10 q and kpnq " opn 1{10 q.
It may be helpful to recall the definitions relating to GHP convergence in Section 2 before reading the next proof.
Proof of Proposition 1.3. For most of the proof we fix a large enough $n$ and write $k = k(n)$ for simplicity, unless we consider varying $n$. Let $\varepsilon_n = k^{-\frac{1}{\ell+1}}$. Since the total length of $\mathcal{T}_n$ is $C_k$, we may cover $\mathcal{T}_n$ by $M_n := \lceil \varepsilon_n^{-1} C_k \rceil$ balls, denoted by $B_{n,1}, \ldots, B_{n,M_n}$, with diameter at most $\varepsilon_n$. Let $A_{n,1} = B_{n,1}$, and for $i > 1$, let $A_{n,i} = B_{n,i} \setminus \bigcup_{j < i} A_{n,j}$. Then $\{A_{n,1}, \ldots, A_{n,M_n}\}$ is a covering of $(\mathcal{T}_n, d_{\mathrm{len}})$ by disjoint sets with diameter at most $\varepsilon_n$.
Next, define S n " Ť Mn i"1 vpA n,i qˆA n,i , then S n is a correspondence between vpT n q and T n . Moreover, for each 1 ď i ď M n , let w i be the element of vpA n,i q such that w i is closest to the root of T n . The distortion of S n can be bounded as follows: dis n :" dispS n ; d n , d len q " sup |d n px, yq´d len px 1 , y 1 q| : px, x 1 q P S n , py, y 1 q P S n ( " max 1ďiďjďMn sup |d n px, yq´d len px 1 , y 1 q| : px, x 1 q P vpA n,i qˆA n,i , py, y 1 q P vpA n,j qˆA n,j ( ď max 1ďiďjďMn sup |d n pw i , w j q´d len pw i , w j q|`d n pw i , xq`d n pw j , yq d len pw i , x 1 q`d len pw j , y 1 q : px, x 1 q P vpA n,i qˆA n,i , py, y 1 q P vpA n,j qˆA n,j ( ď max 1ďiďjďMn |d n pw i , w j q´d len pw i , w j q|`2ε n`2 c n α sup 1ďiďMn vpA n,i q. Now, given x, y P vpT n q, write rx, yq for the path in T n from x (included) to y (excluded). So d n px, yq " x M prx, yq, nq and d len px, yq " |rx, yq|. So Recall the definition of the event F k,n from (4.1) and note that on this event, we have M n ď m n :" ℓ`1´1`1 0k´1 4¯U . Then for any ε ą 0, it follows that P pdis n ą 4ε n`3 εq ď P pdis n ą 2ε n`3 ε, F k,n q`P`F c P´ˇˇx M pA n,i , nq´|A n,i |ˇˇą ε, F k,n¯ff`P`F c k,n˘.
To show GHP convergence, we follow [2, Proof of Proposition 4.8] and define $\pi^{\circ}_n$ on the product space $v(\mathcal{T}_n) \times \mathcal{T}_n$ as follows. Given $1 \le i \le M_n$, for Borel sets $X \subset v(A_{n,i})$ of $(v(\mathcal{T}_n), d_n)$ and $Y \subset A_{n,i}$ of $(\mathcal{T}_n, d_{\mathrm{len}})$, define $\pi^{\circ}_n(X, Y) = \frac{\nu_{k,n}(X) \cdot \mu_k(Y)}{\max\{\nu_{k,n}(v(A_{n,i})), \mu_k(A_{n,i})\}}$.
For i ‰ j, let πnpvpA n,i q, A n,j q " 0; so πnpS c n q " 0. (4.10) Such rectangles XˆY form a π-system generating the product σ-algebra, so πn extends uniquely to a measure π n on the product σ-algebra of pvpT n q, d n q and pT n , d len q. Now we derive the discrepancy D n :" Dpπ n ; ν k,n , µ k q of π n with respect to ν k,n and µ k . Note that π n pvpA n,i q, A n,i q " min tν k,n pvpA n,i qq, µ k pA n,i qu. Writing p and p 1 for the projections of vpT n qˆT n to the first and the second coordinates respectively, an easy calculation shows that D n " }ν k,n´p˚πn }`}µ k´p Pˆ|ν k,n pvpA n,i qq´µ k pA n,i q| ą ε m n , F k,nˇTk˙ff`P`F c k,n˘. We now use the notation kpnq to emphasize that kpnq changes with n and note that m n ă 11kpnq 2 ℓ`1 and kpnq " opn 1{10 q. Summing over n P N and applying Corollary 4.5 then yields that P´|ν k,n pvpA n,i qq´µ k pA n,i q| ą ε{M n , F k,nˇTk¯ff ď m n¨o pn´3q, and so this combined with Lemma 4.4 to bound ÈpF c k,n q yields ÿ nPN P pD n ą εq ď ÿ nPN´m n¨o pn´3q`e´k pnq 1{3¯ă 8.
Since we are working in the probability space where the equalities of Proposition 1.2 hold, the proof is completed.

Tightness property
In Section 5.1, we describe how the combinatorial tree $\mathrm{T}(n)$ relates to an infinite-colors Pólya urn, which helps us analyse the heights and sizes of subtrees in $\mathrm{T}(n)$. In Section 5.2, we establish Proposition 1.4, with the proofs of several lemmas deferred to the subsequent subsections.

5.1. An infinite-colors Pólya urn. At time 0, an urn contains only one ball of color 1. At time $n \in \mathbb{N}$, pick a ball from the urn uniformly at random, and return the ball to the urn along with another ball of the same color. In addition, if $\ell$ divides $n$, and if the urn contains balls of colors $1, \dots, k-1$, then an additional ball of color $k$ is added to the urn. For $n, k \in \mathbb{N}$ with $n \ge (k-1)\ell$, let $U_k(n)$ be the number of balls of color $k$ at time $n$, and let $M_k(n) = U_1(n) + \dots + U_k(n)$. Note that at time $k\ell$, there are $(\ell+1)k+1$ balls of colors $1, \dots, k+1$ (the extra 1 accounts for the initial ball of color 1), and there is only 1 ball of color $k+1$.
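The urn dynamics just described can be simulated directly. The following is a minimal sketch, not part of the original argument; the function name `simulate_urn` and its parameters are hypothetical names chosen for illustration. It checks the ball count $(\ell+1)k+1$ at time $k\ell$ stated above.

```python
import random


def simulate_urn(n_steps, ell, seed=0):
    """Simulate the infinite-colors urn for n_steps draws.

    Returns a list `counts` with counts[c - 1] = number of balls of color c.
    (Hypothetical helper, written only to illustrate the dynamics.)
    """
    rng = random.Random(seed)
    counts = [1]  # time 0: a single ball of color 1
    for n in range(1, n_steps + 1):
        # Draw a ball uniformly at random: color index c has weight counts[c].
        color = rng.choices(range(len(counts)), weights=counts)[0]
        counts[color] += 1  # return it with one more ball of the same color
        if n % ell == 0:
            counts.append(1)  # ell divides n: add one ball of a new color
    return counts


if __name__ == "__main__":
    ell, k = 3, 4
    counts = simulate_urn(k * ell, ell)
    # At time k*ell there are (ell+1)*k + 1 balls, of colors 1, ..., k+1.
    print(len(counts), sum(counts), counts[-1])
```

In this sketch, $U_k(n)$ corresponds to `counts[k - 1]` and $M_k(n)$ to `sum(counts[:k])`.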
Recall the construction of the combinatorial tree $\mathrm{T}(n)$ from Section 1. For all $k \in \mathbb{N}$, $v_{k\ell}$ is a branchpoint, i.e., a vertex with degree at least 3. For all $k, n \in \mathbb{N}$ with $n \ge (k-1)\ell$, we call the (graph-theoretic) path in $\mathrm{T}(n)$ from $v_{k\ell}$ to the leaf $L_{1+k}$ branch $k$. The length of a path in $\mathrm{T}(n)$ is the number of (graph-theoretic) edges in it. Note that for $k, n \in \mathbb{N}$ with $n \ge (k-1)\ell$, the lengths of branches $1, \dots, k$ in $\mathrm{T}(n)$ have the same law as $(U_1(n), \dots, U_k(n))$, and $(M_1(n), \dots, M_k(n))$ has the same law as the numbers of edges in $(T_1(n), \dots, T_k(n))$. We may and shall use $U_k(n)$ to denote the length of branch $k$ in $\mathrm{T}(n)$.

5.2. Outline and proof for Proposition 1.4. We first outline the essential step in proving the GH version of Proposition 1.4: obtaining a height bound for the subtrees of $\mathrm{T}(n)$ pendant to $T_k(n)$, where $T_k(n)$ is the subtree of $\mathrm{T}(n)$ spanned by the root and the first $k$ leaves. To accomplish this, we express the height bound of the subtrees in terms of $\sum_{i=k(n)+1}^{\lfloor n/\ell \rfloor + 1} \frac{(C_i - C_{i-1})\, U_i(n)}{C_i}$ in Lemma 5.1, and then deduce a bound for this sum in Lemma 5.2. Write $\mathcal{F}_{k,n}$ for the $\sigma$-algebra generated by $C_k, \dots, C_{\lfloor n/\ell \rfloor + 1}, U_{k+1}(n), \dots, U_{\lfloor n/\ell \rfloor + 1}(n)$. For all $i \in \mathbb{N}$ write $\Delta C_i = C_i - C_{i-1}$.
Lemma 5.1. Fix $n, k \in \mathbb{N}$ with $n \ge k\ell$, and let $u$ be a uniformly chosen vertex from $v(\mathrm{T}(n)) \setminus v(T_k(n))$. Then for positive $\lambda \le \big( \max_{k+1 \le i \le \lfloor n/\ell \rfloor + 1} U_i(n) \big)^{-1}$,
Lemma 5.2. Let $\varepsilon \in (0, 1)$. Let $k : \mathbb{N} \to \mathbb{N}$ be such that $k(n) = \Omega(n^{1/100})$ and $k(n) =$
We defer the proofs of Lemmas 5.1 and 5.2 to Sections 5.3 and 5.4, respectively. They lead to a tightness property of $(T_k(n) : k \in \mathbb{N}, n \ge k\ell)$, i.e., the GH version of Proposition 1.4. To wit, for all $k, n \in \mathbb{N}$ with $n \ge k\ell$, let
\[ D_{k,n} = \frac{c}{n^{\alpha}} \cdot \max\{ d_{\mathrm{gr}}(w, T_k(n)) : w \in v(\mathrm{T}(n)) \setminus v(T_k(n)) \}. \]
It follows from the definition of the GH distance that
\[ d_{\mathrm{GH}}\Big( \big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big), \big(v(\mathrm{T}(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big) \Big) \le D_{k,n}. \]
We use Lemmas 5.1 and 5.2 below to show that, for an appropriate increasing sequence $k(n)$, $D_{k(n),n} \to 0$ a.s.; details are given in the proof of Proposition 1.4.
Next, we take the measures into consideration, and examine the GHP convergence in Proposition 1.4. We start by stating a fact about the GHP distance between subspaces that follows in a straightforward way from the constructions and definitions; more general statements appear in [3, Fact 6.4] and [36, Fact 8.6]. For all $k, n \in \mathbb{N}$ with $n \ge (k-1)\ell$, let $\bar{\nu}_{k,n}$ be the projection of the uniform probability measure $\nu_n$ of $v(\mathrm{T}(n))$ onto $v(T_k(n))$, i.e., for any $w \in v(T_k(n))$, write $\bar{w}$ for the maximal subset of $v(\mathrm{T}(n))$ such that the removal of $w$ disconnects $\bar{w}$ from $T_k(n)$, and let $\bar{\nu}_{k,n}(w) = \nu_n(\bar{w} \cup \{w\})$. Write $\widehat{\mathrm{T}}(n) = \big(v(\mathrm{T}(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \nu_n\big)$, $\mathrm{T}_k(n) = \big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \bar{\nu}_{k,n}\big)$, and $\widehat{\mathrm{T}}_k(n) = \big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}, \nu_{k,n}\big)$.
Fact 5.3. For all $k, n \in \mathbb{N}$ with $n \ge (k-1)\ell$,
Upon showing that $D_{k(n),n} \to 0$ a.s. for an appropriate $k(n)$, to prove Proposition 1.4 it suffices to bound $d_{\mathrm{GHP}}\big(\mathrm{T}_k(n), \widehat{\mathrm{T}}_k(n)\big)$. Note that $\mathrm{T}_k(n)$ and $\widehat{\mathrm{T}}_k(n)$ differ only in their measures, so $d_{\mathrm{GHP}}\big(\mathrm{T}_k(n), \widehat{\mathrm{T}}_k(n)\big) = d_{k,n}(\bar{\nu}_{k,n}, \nu_{k,n})$, where $d_{k,n}$ denotes the Lévy-Prokhorov distance on the metric space $\big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big)$. We show the following lemma in Section 5.5.
We can now make the discussion above into a precise proof.
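Proof of Proposition 1.4. Fix a sufficiently large $n$ and write $k = k(n)$, until near the end of the proof when we let $n$ vary. It then follows from Fact 5.3 that
\[ d_{\mathrm{GHP}}\big(\widehat{\mathrm{T}}(n), \widehat{\mathrm{T}}_k(n)\big) \le d_{\mathrm{GHP}}\big(\widehat{\mathrm{T}}(n), \mathrm{T}_k(n)\big) + d_{\mathrm{GHP}}\big(\mathrm{T}_k(n), \widehat{\mathrm{T}}_k(n)\big) \le D_{k,n} + d_{k,n}(\bar{\nu}_{k,n}, \nu_{k,n}), \]
where $\mathrm{T}_k(n)$ carries the projected measure $\bar{\nu}_{k,n}$ and $\widehat{\mathrm{T}}_k(n)$ carries $\nu_{k,n}$.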
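Now, we deduce a bound for $D_{k,n}$. Set $\varepsilon = \alpha/4$, and define the event
Choose $u$ uniformly at random from $v(\mathrm{T}(n)) \setminus v(T_k(n))$, and take $\lambda(n) = n^{-\alpha} k^{\alpha - \varepsilon} > 0$ in Lemma 5.1, noticing that, on the event $E_n$, $\lambda(n) \le$ . It then follows from Markov's inequality and Lemma 5.1 that
\begin{align*}
P\big( d_{\mathrm{gr}}(u, T_k(n)) \ge n^{\alpha} k^{-\alpha + 2\varepsilon} \big)
&\le P\big( d_{\mathrm{gr}}(u, T_k(n)) \ge n^{\alpha} k^{-\alpha + 2\varepsilon},\, E_n \big) + P(E_n^c) \\
&\le \frac{E\big[ \mathbf{1}[E_n] \cdot E\big[ \exp(\lambda(n) \cdot d_{\mathrm{gr}}(u, T_k(n))) \,\big|\, \mathcal{F}_{k,n} \big] \big]}{\exp(\lambda(n)\, n^{\alpha} k^{-\alpha + 2\varepsilon})} + P(E_n^c) \\
&\le \frac{\exp(\lambda(n)\ell + 5\lambda(n) \cdot n^{\alpha} k^{-\alpha + \varepsilon})}{\exp(\lambda(n) \cdot n^{\alpha} k^{-\alpha + 2\varepsilon})} + P(E_n^c) \\
&\le \exp(\ell + 5 - k^{\varepsilon}) + P(E_n^c).
\end{align*}
Recall that $\varepsilon = \alpha/4$, so $-\alpha/2 = -\alpha + 2\varepsilon$. It then follows from a union bound that
\[ \sum_{w \in v(\mathrm{T}(n)) \setminus v(T_k(n))} P\big( d_{\mathrm{gr}}(w, T_k(n)) \ge n^{\alpha} k^{-\alpha + 2\varepsilon} \big) \le 2n \cdot \{ \exp(\ell + 5 - k^{\varepsilon}) + P(E_n^c) \}. \]
Now we use the notation $k(n)$ and sum over $n$ on both sides of the above inequality:
\[ \sum_{n \in \mathbb{N}} P\big( D_{k(n),n} \ge c \cdot k(n)^{-\alpha/2} \big) \le \sum_{n \in \mathbb{N}} 2n \cdot \{ \exp(\ell + 5 - k(n)^{\varepsilon}) + P(E_n^c) \}. \]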
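The last step of the Markov-inequality chain above is pure exponent bookkeeping. The following sketch spells it out for the choice $\lambda(n) = n^{-\alpha} k^{\alpha - \varepsilon}$, under the assumption (made here only for the illustration) that $k \le n$, so that $\lambda(n) \le 1$.

```latex
\begin{align*}
\lambda(n)\, n^{\alpha} k^{-\alpha+\varepsilon}
  &= n^{-\alpha} k^{\alpha-\varepsilon} \cdot n^{\alpha} k^{-\alpha+\varepsilon} = 1, \\
\lambda(n)\, n^{\alpha} k^{-\alpha+2\varepsilon}
  &= n^{-\alpha} k^{\alpha-\varepsilon} \cdot n^{\alpha} k^{-\alpha+2\varepsilon} = k^{\varepsilon}, \\
\lambda(n)\, \ell
  &= (k/n)^{\alpha}\, k^{-\varepsilon}\, \ell \le \ell
  \qquad \text{(assuming } k \le n\text{)},
\end{align*}
and therefore
\[
\frac{\exp\big(\lambda(n)\ell + 5\lambda(n)\, n^{\alpha} k^{-\alpha+\varepsilon}\big)}
     {\exp\big(\lambda(n)\, n^{\alpha} k^{-\alpha+2\varepsilon}\big)}
\le \exp\big(\ell + 5 - k^{\varepsilon}\big).
\]
```

With $\varepsilon = \alpha/4$, the threshold $n^{\alpha} k^{-\alpha + 2\varepsilon}$ equals $n^{\alpha} k^{-\alpha/2}$, which after rescaling by $c/n^{\alpha}$ gives the level $c \cdot k^{-\alpha/2}$ appearing in the final sum.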
It remains to prove Lemmas 5.1, 5.2, and 5.4, which we do in Sections 5.3, 5.4, and 5.5, respectively.

Height bound.
In this subsection we prove Lemma 5.1. Recall from Section 1 that $T'_k$ is obtained by inserting $\ell$ vertices uniformly over $T((k-1)\ell)$, i.e., $T'_k$ is $T(k\ell)$ without the $(k+1)$-th branch. Now, for all $k \in \mathbb{N}$, let $X_k$ be the last inserted vertex of $T'_k$; so $X_k$ has the uniform law over $T'_k$ with respect to the Lebesgue measure. Next, we construct a sequence of new embellished trees $(T^{\circ}_k : k \in \mathbb{N})$, coupled with a sequence of vertices $(X^{\circ}_k : k \in \mathbb{N})$, such that $(T^{\circ}_k, X^{\circ}_k) \overset{d}{=} (T'_k, X_k)$. Our construction is a variant of the one in [13, Section 1.2].
A coupling. Let $(W_k, V_k : k \in \mathbb{N})$ be i.i.d. Uniform$(0, 1)$ variables. We construct $T^{\circ}_1$ by (1) inserting $\ell - 1$ vertices at uniform points over a branch of length $C_1$; and (2) letting $X^{\circ}_1$ be the point at distance $V_1 C_1$ from a fixed endpoint $X^{\circ}_0$ of the branch. Given the pairs $(T^{\circ}_i, X^{\circ}_i)$ for $i = 1, \dots, k$ for some $k \in \mathbb{N}$, we construct $(T^{\circ}_{k+1}, X^{\circ}_{k+1})$ as follows. Note that, before $(T^{\circ}_{k+1}, X^{\circ}_{k+1})$ is constructed, we do not yet know whether to view $X^{\circ}_k$ as a vertex (in the upcoming case (a)) or just a point (case (b)). The reason to emphasize the difference between vertices and points is to align with the distribution of $T'_k$, which is viewed as a union of a real tree and vertices: each of the last $\ell$ vertices in $T'_k$ has probability $1/\ell$ of becoming a junction vertex in $T(k\ell)$, but a random point has probability 0 of becoming a junction. Recall that $\Delta C_{k+1} = C_{k+1} - C_k$.
(a) If $W_{k+1} \le \frac{\Delta C_{k+1}}{C_{k+1}}$, then let $T^{\circ}_{k+1}$ be obtained from $T^{\circ}_k$ by (1) attaching a branch of length $\Delta C_{k+1}$ to $X^{\circ}_k$; (2) inserting a vertex, denoted by $X^{\circ}_{k+1}$, in the latest branch at distance $\Delta C_{k+1} V_{k+1}$ from $X^{\circ}_k$; and (3) inserting $\ell - 1$ vertices at random points of the existing tree, uniform for the Lebesgue measure. We view $X^{\circ}_k$ as a vertex in this case.
(b) If $W_{k+1} > \frac{\Delta C_{k+1}}{C_{k+1}}$, then let $T^{\circ}_{k+1}$ be obtained from $T^{\circ}_k$ by (1) inserting a vertex at a random point of $T^{\circ}_k$, uniform for the Lebesgue measure; (2) attaching a branch of length $\Delta C_{k+1}$ to this last inserted vertex of $T^{\circ}_k$; and (3) inserting $\ell - 1$ vertices at random points of the existing tree, uniform for the Lebesgue measure. Let $X^{\circ}_{k+1} = X^{\circ}_k$, viewed as a random point rather than a vertex. Note that the projection of $X^{\circ}_j$ to $T^{\circ}_k$ is $X^{\circ}_k$ for all integers $j \ge k \ge 1$.
Lemma 5.5. For all $k \in \mathbb{N}$, $X^{\circ}_k$ and $X_k$ are respectively uniform over $T^{\circ}_k$ and $T'_k$ for the Lebesgue measure, and $(T^{\circ}_k, X^{\circ}_k) \overset{d}{=} (T'_k, X_k)$.
Proof. First note that $X_k$, the last inserted vertex of $T'_k$, is uniform over $T'_k$ for the Lebesgue measure. Next, we show by induction on $k$ that $X^{\circ}_k$ has the uniform law over $T^{\circ}_k$ for the Lebesgue measure. The base case $k = 1$ is immediate. Suppose that $X^{\circ}_k$ is uniform over $T^{\circ}_k$ for some $k \in \mathbb{N}$. With probability $\frac{\Delta C_{k+1}}{C_{k+1}}$, $X^{\circ}_{k+1}$ lands at a uniform location of branch $k+1$, and with the complementary probability it equals $X^{\circ}_k$, which is uniform on $T^{\circ}_k$. Hence $X^{\circ}_{k+1}$ has the uniform law over $T^{\circ}_{k+1}$ for the Lebesgue measure.
Furthermore, it is easily seen that the $(k+1)$-th branch is attached to a uniform point of $T^{\circ}_k$, for the Lebesgue measure, for all $k \in \mathbb{N}$ (step (1) of case (a): $X^{\circ}_k$ is uniform over $T^{\circ}_k$; step (2) of case (b): the last inserted vertex of $T^{\circ}_k$ is uniform over $T^{\circ}_k$). Moreover, given $T^{\circ}_k$, the first vertex to be inserted has probability $\frac{\Delta C_{k+1}}{C_{k+1}}$ of landing at a uniform location of the $(k+1)$-th branch (step (2) of case (a)), and with the complementary probability lands at a uniform point of $T^{\circ}_k$ (step (1) of case (b)). Also, the next $\ell - 1$ vertices to be inserted are uniform over the existing tree (step (3) in both cases). We thus have $T^{\circ}_{k+1} \overset{d}{=} T'_{k+1}$. It follows by induction that $T^{\circ}_k \overset{d}{=} T'_k$ for all $k \in \mathbb{N}$. Since $X^{\circ}_k$ and $X_k$ are respectively uniform over these two trees for the Lebesgue measure, we have $(T^{\circ}_k, X^{\circ}_k) \overset{d}{=} (T'_k, X_k)$.
For ease of notation, fix $k, n \in \mathbb{N}$ with $n \ge k\ell$, and let $m$ be the largest integer such that $n \ge m\ell$. For all integers $1 \le i \le m$, given that $W_i \le \frac{\Delta C_i}{C_i}$, write $S_i$ for the path $[X^{\circ}_{i-1}, X^{\circ}_i)$ in $T^{\circ}_m$, and write $M^{\circ}(S_i, m)$ for the number of vertices on $S_i$ in $T^{\circ}_m$. Let $T^{\circ}_{k,m}$ (resp. $T'_{k,m}$) be the subtree of $T^{\circ}_m$ (resp. $T'_m$) spanned by the root and the first $k$ leaves. Denote by $E^{\circ}$ the event that $X^{\circ}_m \notin T^{\circ}_{k,m}$, and analogously denote by $E$ the event that $X_m \notin T'_{k,m}$. Recall from Section 5.1 that $U_i(m\ell)$ is the length of the $i$-th branch in $T^{\circ}_m$ (it can also be viewed as the number of balls of color $i$ at time $m\ell$ in the Pólya urn model therein). Let $\mathcal{L}$ denote law.
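Lemma 5.6. We have $\mathcal{L}\big( d_{\mathrm{gr}}(X_m, T'_{k,m}) \,\big|\, E \big) \overset{d}{=}$ ; when $m = k$ the summation is 0.
Proof. Without loss of generality, assume that $m > k$. It follows from Lemma 5.5 and the constructions of $T'_{k,m}$ and $T^{\circ}_{k,m}$ that
\[ \mathcal{L}\big( d_{\mathrm{gr}}(X_m, T'_{k,m}) \,\big|\, E \big) \overset{d}{=} \mathcal{L}\big( d_{\mathrm{gr}}(X^{\circ}_m, T^{\circ}_{k,m}) \,\big|\, E^{\circ} \big). \tag{5.2} \]
Moreover, it follows from the construction of $T^{\circ}_{k,m}$ that $\mathcal{L}\big( d_{\mathrm{gr}}(X^{\circ}_m, T^{\circ}_{k,m}) \,\big|\, E^{\circ} \big) = \sum_{i=k+1}^{m}$ Now, due to the definition of $U_i(\cdot)$ and the fact that the $X^{\circ}_i$ are placed uniformly according to normalized Lebesgue measure, we have Together with (5.2) and (5.3) we may conclude the proof.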
Proof of Lemma 5.1. This proof is an easy generalization of the argument in [13, Section 1.2]. Note that $\lambda \cdot U_i(m\ell) \le 1$ for all $k+1 \le i \le m$. Applying Lemma 5.6 and using the bound $e^x \le 1 + x + x^2$ for $0 \le x \le 1$, we have for $\lambda > 0$, Notice that $0 \le \lambda \le 1$, so $e^{\lambda} - 1 \le \lambda + \lambda^2 \le 2\lambda \le 2\lambda U_i(m\ell)$ and $\lambda e^{\lambda} \le 3\lambda$. Thus, the above quantity is bounded by Finally, for a tree $T$, write $e(T)$ for the set of edges of $T$. Given the event $E$, it follows from Lemma 3.5 and the first assertion of Lemma 5.5 that $X_m$ is on an edge uniformly chosen from $e(T'_m) \setminus e(T'_{k,m})$. Now, recall that $u$ is uniform over $v(\mathrm{T}(n)) \setminus v(T_k(n))$, where $\mathrm{T}(n) = (T(n), d_{\mathrm{gr}})$. Since $n < m\ell + \ell$, it follows that
\[ \mathcal{L}\big( d_{\mathrm{gr}}(u, T_k(n)) \,\big|\, \mathcal{F}_{k,n} \big) \le_{\mathrm{st}} \mathcal{L}\big( \ell + d_{\mathrm{gr}}(X_m, T'_{k,m}) \,\big|\, E, \mathcal{F}_{k,n} \big), \]
where $\le_{\mathrm{st}}$ denotes stochastic domination. The lemma easily follows.

Moment bound for Pólya urn.
In this subsection, we prove Lemma 5.2 under the framework of Section 5.1. Denote by P`b w ; m˘the distribution of white balls in a classical Pólya urn after m completed draws, starting with b black and w white balls. Denote by P ℓ Im`b w ; m˘the number of white balls after m completed steps in the Pólya urn with immigration, starting with b black and w white balls: at the nth step, a ball is picked at random from the urn and returned along with an additional ball of the same color; additionally, if n is a multiple of ℓ, then a black ball is added after the n:th draw and return. We use the notation L p¨q to denote the law of some random variable. Proof. From [27, Lemma 4.1], for Y " P ℓ Im`1 w ; t˘and integer q ą 0, Setting T " t t´1 ℓ u, we calculate E rY pY`1q¨¨¨pY`qpℓ`1q´1qs Now setting w " pℓ`1qk and t " n´kℓ and noting that with this choice of parameters, we find E rM k pnqpM k pnq`1q¨¨¨pM k pnq`qpℓ`1q´1qs ď ppℓ`1qpk´s`qqq qˆ1`q pℓ`1q`ℓ`n ℓ`1 ℓ˙q ℓ˜1`q pℓ`1q 1`pn´1q ℓ`1 ℓ¸ℓ . Lemma 5.9. Fix k, n, q P N with n ě kℓ. There is a constant c " cpq, ℓq such that for all positive integer j ď qpℓ`1q, Proof. For j ď qpℓ`1q, Jensen's (or Hölder's) inequality implies Using (5.5) of Lemma 5.8 now implies E rM k pnqpM k pnq`1q¨¨¨pM k pnq`qpℓ`1q´1qs ď ck q n qℓ , and the result for j ď qpℓ`1q easily follows from this and (5.6).
Lemma 5.10. Fix k, n, p P N with n ě kℓ. There is a constant c " cpp, ℓq such that E rU k pnq p s ď c´n k¯p ℓ{pℓ`1q and E "ˆ∆ Proof. Recall from (5.4) of Lemma 5.7 that L`U k pnq|M k pnq˘" P`p k´1qpℓ`1q Let the random variable B " Betar1, pk´1qpℓ`1qs be independent of M k pnq. Conditional on B and M k pnq, let XpM k pnq, Bq be binomial with parameters M k pnq´pk´1qpℓ`1q´1 and B. By the de Finetti representation of the classical Pólya urn, we have L`U k pnq|M k pnq˘" L`p1`XpM k pnq, Bqq|M k pnq˘.
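The de Finetti representation invoked above can be checked exactly on small cases. The sketch below is only an illustration (the function names are hypothetical): it computes the law of the number of white balls added to a classical Pólya urn after $m$ draws, starting from $b$ black and $w$ white balls, and compares it with a Binomial$(m, B)$ mixture with $B \sim \mathrm{Beta}(w, b)$. The proof uses the case $w = 1$, $b = (k-1)(\ell+1)$.

```python
from fractions import Fraction
from math import comb, factorial


def polya_white_added_pmf(b, w, m):
    """Exact law of the number of white balls ADDED in m draws of a classical
    Polya urn started with b black and w white balls."""
    pmf = [Fraction(1)]  # pmf[j] = P(j white balls added so far)
    for step in range(m):
        total = b + w + step  # one ball is added at every draw
        new = [Fraction(0)] * (len(pmf) + 1)
        for j, p in enumerate(pmf):
            white = w + j
            new[j + 1] += p * Fraction(white, total)      # white drawn
            new[j] += p * Fraction(total - white, total)  # black drawn
        pmf = new
    return pmf


def beta_fn(x, y):
    """Beta function B(x, y) for positive integers x, y."""
    return Fraction(factorial(x - 1) * factorial(y - 1), factorial(x + y - 1))


def beta_binomial_pmf(b, w, m):
    """Exact law of Binomial(m, B) with B ~ Beta(w, b)."""
    return [comb(m, j) * beta_fn(w + j, b + m - j) / beta_fn(w, b)
            for j in range(m + 1)]


if __name__ == "__main__":
    k, ell, m = 3, 2, 5
    b, w = (k - 1) * (ell + 1), 1
    print(polya_white_added_pmf(b, w, m) == beta_binomial_pmf(b, w, m))
```

Exact rational arithmetic is used so that the two laws can be compared for equality rather than up to floating-point error.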
Hölder's inequality implies that for non-negative x, y and positive integer p, px`yq p ď 2 p´1 px p`yp q, and so starting from (5.7), we have Now note that, if L pY q " BipN, qq, then for positive integer p, and denoting Stirling numbers of the second kind by p j ( (and note these are non-negative), So from (5.8), condition on M k pnq (noting that M k pnq is independent of B) to find Standard formulas for beta moments imply where c " cpℓ, jq is a constant. Taking the expectation on both sides of (5.9), together with Lemma 5.9 and (5.10), yields that, for some c " cpp, ℓq, .
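The display expressing binomial moments through Stirling numbers of the second kind is presumably the standard identity $E[Y^p] = \sum_{j=0}^{p} \left\{ {p \atop j} \right\} N(N-1)\cdots(N-j+1)\, q^j$ for $\mathcal{L}(Y) = \mathrm{Bi}(N, q)$; this is our reading of the text rather than a quotation. A minimal exact check of that identity, with hypothetical helper names:

```python
from fractions import Fraction
from math import comb


def stirling2(p, j):
    """Stirling number of the second kind, via the standard recurrence."""
    if p == j:
        return 1
    if j == 0 or j > p:
        return 0
    return j * stirling2(p - 1, j) + stirling2(p - 1, j - 1)


def falling(N, j):
    """Falling factorial N (N-1) ... (N-j+1)."""
    out = 1
    for i in range(j):
        out *= N - i
    return out


def binomial_moment_direct(N, q, p):
    """E[Y^p] for Y ~ Binomial(N, q), summed over the probability mass function."""
    return sum(Fraction(y) ** p * comb(N, y) * q**y * (1 - q) ** (N - y)
               for y in range(N + 1))


def binomial_moment_stirling(N, q, p):
    """E[Y^p] written in terms of factorial moments and Stirling numbers."""
    return sum(stirling2(p, j) * falling(N, j) * q**j for j in range(p + 1))


if __name__ == "__main__":
    N, q = 7, Fraction(2, 5)
    for p in range(1, 5):
        print(p, binomial_moment_direct(N, q, p) == binomial_moment_stirling(N, q, p))
```

Since every term of the Stirling expansion is non-negative, each one can be bounded separately, which is how the identity is typically combined with moment bounds on $N$ and $q$.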
(5.10) then leads to E "´∆ C k C k¯2 p  ď ck´2 p . By Cauchy-Schwarz and the inequalities in the previous two displays, , .
is a subset of the union of events , over i " kpnq`1, . . . , t n ℓ u`1 and for a sufficiently small constant c α . This is because on the complement of this union, the sum is no greater than one. Next, we use Lemma 5.10, noting that kpnq " Opn ℓ{p2ℓ`1q q, to find for integer q ą 2{pεpℓ`1qq, where c is a constant depending only on q, ℓ. Multiplying by n both sides of the above inequalities and then summing over n P N yields Since kpnq " Ωpn 1{100 q, we can choose q large enough so that ř nPN n 2¨k pnq´ε qpℓ`1q{2`1 is finite.

Convergence of measures.
In this subsection, we prove Lemma 5.4. Fix integers $k, n$ with $n \ge (k-1)\ell$ unless specified otherwise. Recall from Section 1 that $v_i$ denotes the vertex inserted in a uniform edge of the combinatorial tree $\mathrm{T}(i-1)$, and, if $\ell$ divides $i$, $v_i$ is a branchpoint (i.e., there is a new edge attached to $v_i$ at the time $v_i$ appears). Note that $\{v_0, v_1, \dots, v_{k\ell}, L_1, \dots, L_k\} \subset v(T_k(n))$, where $v_0$ is the root and $L_1, \dots, L_k$ are the first $k$ leaves. Hereafter, conditioned on $|v(T_k(n))| = (\ell+1)k + m + 1$ for some appropriate integer $m := m(T_k(n))$, list the internal vertices of $T_k(n)$ as $(v_1, \dots, v_{k\ell}, v_{i_1}, \dots, v_{i_m})$, in the order of appearance; the other $k+1$ vertices of $T_k(n)$ are the leaves and the root. For convenience, denote $i_0 = k\ell$. Given $w \in v(T_k(n))$, recall that $\bar{w}$ is the maximal subset of $v(\mathrm{T}(n))$ such that the removal of $w$ disconnects $\bar{w}$ from $T_k(n)$; so for $1 \le i \le k\ell - 1$, $\bar{v}_i = \emptyset$. For all integers $0 \le j \le m$, let $T_{i_j}$ be the subtree of $\mathrm{T}(n)$ restricted to $\bar{v}_{i_j} \cup \{v_{i_j}\}$, and let $n_{i_j} = |v(T_{i_j})|$. If $\ell$ divides $i_j$, then $\bar{v}_{i_j} \ne \emptyset$ and $n_{i_j} > 1$; otherwise, $\bar{v}_{i_j} = \emptyset$ and $n_{i_j} = 1$.
Next, list the vertices $v_{i_0}, v_{i_1}, \dots, v_{i_m}$ in the breadth-first search order of $T_k(n)$, as $w_0, w_1, \dots, w_m$. So there is a bijection $f : \{i_0, \dots, i_m\} \to \{0, \dots, m\}$ such that, in $T_k(n)$, $v_{i_j}$ is identified with $w_{f(i_j)}$ and $T_{i_j}$ is attached to $w_{f(i_j)}$. Now, let $\sigma := (\sigma(i_j) : 0 \le j \le m)$ be a uniformly chosen random permutation of $i_0, \dots, i_m$. We construct a random tree $\mathrm{T}(n, \sigma)$ by identifying the vertex $w_j$ of $T_k(n)$ with the vertex $v_{\sigma(i_j)}$ of $T_{\sigma(i_j)}$, for each $0 \le j \le m$. Since each $v_{i_j}$ is inserted in a uniform edge of the existing tree $T_k(i_j - 1)$, uniformly permuting the attaching locations of the $T_{i_j}$ does not change the law of the resulting tree. It thus follows that $\mathrm{T}(n, \sigma) \overset{d}{=} \mathrm{T}(n)$. Write $|\mathbf{n}_{m,k,n}|_2 = \big( \sum_{j=0}^{m} n_{i_j}^2 \big)^{1/2}$ and $N_n = |v(\mathrm{T}(n))| = n + \lfloor n/\ell \rfloor + 2$. Recall that $m := |v(T_k(n))| - (\ell+1)k - 1$ is the number of internal vertices of $T_k(n)$ other than $\{v_1, \dots, v_{k\ell}\}$.
Lemma 5.11. Let k : N Ñ N be such that kpnq " Ωpplog nq 10 q and kpnq " opn 1{2 q. For all n P N, denote the event .
Fix a sufficiently large n and write k " kpnq, m " mpnq " |vpT k pnqq|´pℓ`1qk´1. Then for any V Ă vpT k pnqq and t ą 4cpℓ`4q¨n α k 1 ℓ`1 , Pˆ|ν k,n pV q´ν k,n pV q| ą 2t Proof. For the duration of the proof, let V Ă vpT k pnqq, and write V 1 " V Ş tv i 0 , . . . , v im u, N " N n " |vpTpnqq|. Recall that σ " pσpi j q : 0 ď j ď mq is a uniform permutation of i 0 , . . . , i m . Write σ j " σpi j q for each 0 ď j ď m. By definition, the second equality follows from the fact that Tpn, σq ř vσ j PV 1 n σ j . Note that, exchangeability resulting from the uniform permutation σ only exhibits through pn σ j : 0 ď j ď mq, but not V zV 1 . Let n 1 " ř m j"0 n i j " N´pℓ`1qk. To use exchangeability to deduce the tail bound for |ν k,n pV q´ν k,n pV q|, we first show that it is close tǒˇˇˇř vσ j PV 1 nσ j n 1´| V 1 | m`1ˇ, and then employ exchangeability to bound the latter. Indeed, by the triangle inequality, writing M " M n " |vpT k pnqq| and noting that ν k,n pV q " |V | M , we have |ν k,n pV q´ν k,n pV q| Hence, even conditionally, Pˆˇˇν k,n pV q´ν k,n pV qˇˇą 2t NˇˇˇˇF nď P¨ˇˇÿ vσ j PV 1 We now compute (5.15). It is convenient to keep in mind that we have chosen n sufficiently large, and the variables k " kpnq, M " M n " |vpT k pnqq|, m " mpnq " M´pℓ`1qk´1, N " N n " |vpTpnqq|, depend on n. Note that |V zV 1 | ď pℓ`1qk`1 and t ą 4cpℓ`4q¨n α k 1 ℓ`1 , so t ą 2|V zV 1 | for large enough n. It follows that Given the event F n , it is easily seen that 1 2c¨n α k 1 ℓ`1 ď M ď 2 c¨n α k 1 ℓ`1 . Also, we have N " N n " Θpnq and n 1 " N´pℓ`1qk, so Note also that |V zV 1 | ď pℓ`1qk`1, |V 1 | ď M´pℓ`1qk, and N ď np1`1{ℓq`2, it then follows by the triangle inequality that on the event F n ,ˇˇˇ| where the last inequality is due to M ě 1 where the last equality is due to the fact that α ě 1 2 , and when α " 1 2 we have ℓ " 1. Combined with (5.16), we have It remains to bound (5.14), shown below. 
The rest of the proof follows a similar argument as in [3, Lemma 5.3], so we only point out the differences, and refer the reader to that work for omitted explanations. Let r 0 , . . . , r m be independent random variables with uniform law over ti 0 , . . . , i m u. Recall that n 1 " ř m j"0 n i j . It follows by symmetry that Taking a -4t |n m,k,n | 2 2 and applying Markov's inequality as in [25, Theorem 2.5] gives a Hoeffding-type inequality for ř vσ j PV 1 n σ j : for any t ą 0, " expˆ´2 t 2 |n m,k,n | 2

2˙
; the first inequality follows from Markov's inequality; the second one is due to [7,Proposition 20.6] and [35,Theorem 2]; the last inequality follows from a straightforward calculation as in [25,Lemma 2.6]. Together with (5.14) and (5.17), we may conclude the proof.
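The Hoeffding-type bound used above concerns the total weight $\sum_{v_{\sigma_j} \in V'} n_{\sigma_j}$ carried by a fixed set of positions under a uniform permutation, i.e., the total weight of a uniformly random subset of fixed size. As a sanity check on small inputs, the sketch below compares the exact one-sided tail, obtained by enumerating all subsets, with the classical bound $\exp(-2t^2 / |\mathbf{n}|_2^2)$. The function names and the weights are hypothetical, and the constants are those of the classical inequality rather than of the displays in the proof.

```python
from itertools import combinations
from math import exp


def exact_upper_tail(weights, s, t):
    """P(S - E[S] >= t), where S is the total weight of a uniformly random
    s-element subset of positions, computed by full enumeration."""
    mean = s * sum(weights) / len(weights)
    subsets = list(combinations(weights, s))  # subsets of positions, not values
    hits = sum(1 for sub in subsets if sum(sub) - mean >= t)
    return hits / len(subsets)


def hoeffding_bound(weights, t):
    """Classical one-sided Hoeffding bound exp(-2 t^2 / sum of squared weights)."""
    return exp(-2 * t**2 / sum(x * x for x in weights))


if __name__ == "__main__":
    weights = [1, 1, 1, 4, 1, 7, 1, 1, 2, 1]  # toy subtree sizes n_{i_j}
    for t in (1, 2, 4, 6, 8):
        print(t, exact_upper_tail(weights, 4, t), hoeffding_bound(weights, t))
```

The enumeration is exponential in the population size, so it is only meant for toy inputs; it illustrates why the bound degrades when a single subtree size dominates $|\mathbf{n}|_2$, which is the role of the maximal subtree size $S_{k,n}$ introduced next.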
Given a graph $G$ and a subgraph $G'$ of $G$, write $G - G'$ for the components obtained by removing all edges and vertices of $G'$ from $G$. For all integers $k, n$ with $n \ge (k-1)\ell$, let
\[ S_{k,n} = S(T_k(n)) = \max\{ |v(T)| : T \in \mathrm{T}(n) - T_k(n) \}. \]
The proof of Lemma 5.12 is deferred to Section 5.6.
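In the remainder of the proof we fix a large $n \in \mathbb{N}$ and write $k = k(n)$, unless we consider varying $n$. By an argument similar to that in Proposition 1.3, we find a covering $\{B_{n,1}, \dots, B_{n,M_n}\}$ of $\big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big)$ by sets of diameter at most $\varepsilon_n$. Let $A_{n,1} = B_{n,1}$, and for $i > 1$, let $A_{n,i} = B_{n,i} \setminus \bigcup_{j < i} B_{n,j}$. Then $\{A_{n,1}, \dots, A_{n,M_n}\}$ forms a disjoint cover of $\big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big)$ by sets of diameter at most $\varepsilon_n$. This paragraph follows a similar argument as in [3, Corollary 6.2], and we refer the reader to that work for omitted details. Recall that $d_{k,n}$ denotes the Lévy-Prokhorov distance on $\big(v(T_k(n)), \tfrac{c}{n^{\alpha}} \cdot d_{\mathrm{gr}}\big)$, and write $\bar{\nu}_{k,n}$ for the projected measure defined in Section 5.2. We claim that
\[ \{ d_{k,n}(\bar{\nu}_{k,n}, \nu_{k,n}) > \varepsilon_n \} \subset \Big\{ |\bar{\nu}_{k,n}(A_{n,j}) - \nu_{k,n}(A_{n,j})| > \frac{\varepsilon_n}{M_n} \text{ for some } 1 \le j \le M_n \Big\}; \]
a quick proof is as follows. Suppose that $d_{k,n}(\bar{\nu}_{k,n}, \nu_{k,n}) > \varepsilon_n$. Then there exists a set $S \subset v(T_k(n))$ such that either $\bar{\nu}_{k,n}(S^{\varepsilon_n}) < \nu_{k,n}(S) - \varepsilon_n$ or $\nu_{k,n}(S^{\varepsilon_n}) < \bar{\nu}_{k,n}(S) - \varepsilon_n$. Since $\{A_{n,1}, \dots, A_{n,M_n}\}$ is a disjoint cover of $T_k(n)$, there exists $j$ such that either $\bar{\nu}_{k,n}(S^{\varepsilon_n} \cap A_{n,j}) < \nu_{k,n}(S \cap A_{n,j}) - \varepsilon_n / M_n$ or $\nu_{k,n}(S^{\varepsilon_n} \cap A_{n,j}) < \bar{\nu}_{k,n}(S \cap A_{n,j}) - \varepsilon_n / M_n$.
So $S \cap A_{n,j} \ne \emptyset$. Since the diameter of $A_{n,j}$ is at most $\varepsilon_n$, we have $A_{n,j} \subset S^{\varepsilon_n}$. So, either $\bar{\nu}_{k,n}(A_{n,j}) < \nu_{k,n}(A_{n,j}) - \varepsilon_n / M_n$ or $\nu_{k,n}(A_{n,j}) < \bar{\nu}_{k,n}(A_{n,j}) - \varepsilon_n / M_n$.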
Hence the claim. Next, let as before We now easily obtain that P pd k,n pν k,n , ν k,n q ą ε n , F n q ď Pˆ|ν k,n pA n,j q´ν k,n pA n,j q| ą ε n M n for some 1 ď j ď M n , F n˙. Furthermore, let N n " |vpTpnqq| " n`tn{ℓu`2, m " mpnq " |vpT k pnqq|´pℓ`1qk´1, and write |n n | 2 "´ř m j"0 n 2 i j¯1
Finally, recalling that $m_n = \Theta\big( k(n)^{\frac{13}{12(\ell+1)}} \big)$ and $k(n) = \Omega(n^{1/100})$, summing over $n$ and applying the Borel-Cantelli lemma yields the almost sure convergence.
5.6. Maximal occupation of an infinite-colors Pólya urn. In this subsection we prove Lemma 5.12 by viewing the sizes of the subtrees as the occupations of a modified version of the infinite-colors Pólya urn, introduced below.
At time 0, the urn contains $b \in \mathbb{N}$ black balls and $w \in \mathbb{N}$ balls of color 1. At time $t \in \mathbb{N}$, a ball is chosen at random from the urn and returned along with an additional ball of the same color. Additionally, if $\ell$ divides $t$ and the urn has black balls and balls of colors $\{1, \dots, p\}$, then (i) if the chosen ball is black, add a ball of color $p+1$; or (ii) if the chosen ball is non-black, add a ball of the same color as the chosen ball. For each $i \in \mathbb{N}$, let $U_i(t; b, w)$ be the number of color-$i$ balls and let $U_0(t; b, w)$ be the number of black balls in the urn after $t \in \mathbb{N}$ draws. Let $M_i(t; b, w) = U_1(t; b, w) + \dots + U_i(t; b, w)$, noticing that the sum does not include $U_0(t; b, w)$. Let $S_i(b, w)$ be the random time at which color $i$ appears.
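The modified urn can be simulated in a few lines. The following is a minimal sketch under our reading of the rules above (the function name and parameters are hypothetical): every draw adds one ball of the drawn color, and at multiples of $\ell$ one extra ball is added, of a new color if the drawn ball was black and of the drawn color otherwise.

```python
import random


def simulate_modified_urn(t_steps, ell, b, w, seed=0):
    """Return (black, colors) after t_steps draws, where colors[i - 1] is the
    number of balls of color i. Illustrative sketch only."""
    rng = random.Random(seed)
    black, colors = b, [w]
    for t in range(1, t_steps + 1):
        r = rng.randrange(black + sum(colors))  # index of the drawn ball
        if r < black:
            chosen = None  # a black ball was drawn
            black += 1
        else:
            r -= black
            chosen = 0
            while r >= colors[chosen]:  # locate the color of the drawn ball
                r -= colors[chosen]
                chosen += 1
            colors[chosen] += 1
        if t % ell == 0:
            if chosen is None:
                colors.append(1)     # (i) black drawn: create color p + 1
            else:
                colors[chosen] += 1  # (ii) non-black drawn: same color again
    return black, colors


if __name__ == "__main__":
    ell, k, n = 2, 3, 26
    black, colors = simulate_modified_urn(n - k * ell, ell, (ell + 1) * k, 1)
    print(black, colors, max(colors))
```

In this sketch `max(colors)` plays the role of $\max_{i} U_i(n - k\ell; (\ell+1)k, 1)$, and the total number of balls after $t$ draws is $b + w + t + \lfloor t/\ell \rfloor$.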
Recalling the definition (5.18), the key relation between this urn model and the quantities appearing in Lemma 5.12 is
\[ S_{k,n} := \max\{ |v(T)| : T \in \mathrm{T}(n) - T_k(n) \} \overset{d}{=} \max_{i \in \mathbb{N}} U_i\big(n - k\ell;\, (\ell+1)k,\, 1\big). \]
We prove the lemma by deducing moment bounds of S k,n from moment bounds of the U i .
For the moment-bounds on U i pt; b, wq (and M i pt; b, wq), we consider the following auxiliary 2-colors Pólya urn, which is equivalent to the above urn by regarding color 1 as white and t2, 3, . . .u-colors as black. At time 0, the urn contains b P N black balls and w P N white balls. At time t P N, a ball is chosen at random from the urn and returned along with an additional ball of the same color. Additionally, if ℓ divides t, then an additional ball of the chosen color is added. Let W pt; b, wq be the number of white balls in the urn recalling the definition of d t´1,p´1 from (5.25). For m ď t, let γ m,t´1 " t´1 ź j"mˆ1`p p1`1 rℓ|pj`1qs q n j˙, where ś t´1 j"t p¨q :" 1. Applying the inequality above recursively, we find E rW p t s ď w p γ 0,t´1`t´1 ÿ i"0 d i,p´1 e p n i γ i`1,t´1 . (5.27) To bound and simplify γ m,t´1 , write t 1 " t t ℓ u, m 1 " t m ℓ u and using the inequality 1`x ď e x for all x P R, γ m,t´1 ď t´1 ź j"ℓm 1ˆ1`p p1`1 rℓ|pj`1qs q n jď t 1 ź r"m 1ˆ1`2 p b`w`rpℓ`1q˙t 1 b`w`rpℓ`1q¸.
Since ş a 0 dx γ`βx " 1 β log´1`a β γ¯, t 1 ď t ℓ , and (using b`w ě 1) there is a constant f 1 ℓ not depending on b, w, m, t such that αpb`wq`m αpb`wq`ℓm 1 ď f 1 ℓ , Taylor expansion then yields that For the rest of the proof we assume that k ą p3{2q 4 , then p1`