Cost functionals for large (uniform and simply generated) random trees

Additive tree functionals make it possible to represent the cost of many divide-and-conquer algorithms. We give an invariance principle for such tree functionals for the Catalan model (a random tree uniformly distributed among the full binary ordered trees with a given number of internal nodes) and for simply generated trees (including a random tree uniformly distributed among the ordered trees with a given number of nodes). In the Catalan model, this relies on the natural embedding of binary trees into the Brownian excursion and then on elementary second moment computations. We recover results first given by Fill and Kapur (2004) and then by Fill and Janson (2009). In the simply generated case, this relies on the convergence of conditioned Galton-Watson trees towards stable Lévy trees. We recover results first given by Janson (2003 and 2016) in the quadratic case and give a generalization to the stable case.


Introduction
Trees have many applications in various fields, such as computer science for data structures, or biology for genealogical or phylogenetic trees of extant species. Related to those applications, the study of large trees has attracted some attention. In this paper, we shall consider asymptotics for additive functionals of large trees corresponding to the Catalan model and some simply generated trees.
1.1. A finite measure indexed by a tree. Let $\mathbb{T}$ denote the set of all rooted finite ordered trees. For $t \in \mathbb{T}$, let $|t|$ be the number of nodes of $t$; for a node $v \in t$, let $t_v$ denote the sub-tree of $t$ above $v$ (see (11) in Section 2.1 for a precise definition). We consider the following unnormalized non-negative finite measure $A_t$:
(1) $A_t(f) = \sum_{v \in t} |t_v| \, f\left(|t_v| / |t|\right)$,
where $f$ is a measurable real-valued function defined on $[0, 1]$. We are interested in the asymptotic distribution of $A_t(f)$ when $t$ belongs to a certain class of trees and $|t|$ goes to infinity. We shall consider two classes of trees: the binary trees (and more precisely the Catalan model) and some simply generated trees. We give some examples related to the measure $A_t$ which are commonly used in the analysis of trees. In what follows, for a tree $t \in \mathbb{T}$, we denote by $\emptyset$ its root and by $d$ the usual graph distance on $t$. For $v, w \in t$, we say that $w$ is an ancestor of $v$ and write $w \preccurlyeq v$ if $d(\emptyset, v) = d(\emptyset, w) + d(w, v)$. For $u, v \in t$, we denote by $u \wedge v$ the most recent common ancestor of $u$ and $v$: $u \wedge v$ is the only element of $t$ such that $w \preccurlyeq u$ and $w \preccurlyeq v$ implies $w \preccurlyeq u \wedge v$.
The measure A t is also related to other additive functionals in the particular case of binary trees, see Section 1.2.
1.2. Additive functionals and toll functions for binary trees. Additive functionals on binary trees make it possible to represent the cost of algorithms such as "divide and conquer", see Fill and Kapur [21]. For $t \in \mathbb{T}$ a full binary tree, we shall denote by 1 (resp. 2) the left (resp. right) child of the root. Thus $t_1$ (resp. $t_2$) will be the left (resp. right) sub-tree of the root of $t$. A functional $F$ on binary trees is called an additive functional if it satisfies the following recurrence relation:
(3) $F(t) = F(t_1) + F(t_2) + b_{|t|}$,
with $F(\{\emptyset\}) = b_1$, where $(b_n, n \in \mathbb{N}^*)$ is the toll function; equivalently, $F(t) = \sum_{v \in t} b_{|t_v|}$. In the particular case where the toll function is a power function, that is $b_n = n^\beta$ for $n \in \mathbb{N}^*$ and some $\beta > 0$, we get $F(t) = \sum_{v \in t} |t_v|^\beta = |t|^{\beta - 1} A_t(x^{\beta - 1})$. In such cases, the asymptotic study of the measure $A_t$ will provide the asymptotics of the additive functionals.
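The identity between a power toll functional and the measure $A_t$ can be checked numerically on a small tree. The sketch below is illustrative only: it assumes $A_t(f) = \sum_{v \in t} |t_v| f(|t_v|/|t|)$ and $F(t) = \sum_{v \in t} |t_v|^\beta$, and the nested-tuple encoding of trees as well as the names `subtree_sizes`, `A` and `F_power` are hypothetical choices made for this example.

```python
def subtree_sizes(t):
    """Return the list of |t_v| over all nodes v of t.

    Assumed encoding: a tree is the tuple of its children, a leaf is ().
    """
    sizes = []

    def visit(node):
        size = 1 + sum(visit(child) for child in node)
        sizes.append(size)
        return size

    visit(t)
    return sizes


def A(t, f):
    """A_t(f) = sum over v of |t_v| * f(|t_v| / |t|) (assumed definition)."""
    sizes = subtree_sizes(t)
    n = max(sizes)  # the root sub-tree is the whole tree, so this is |t|
    return sum(s * f(s / n) for s in sizes)


def F_power(t, beta):
    """Additive functional with toll b_n = n**beta, unrolled over all nodes."""
    return sum(s ** beta for s in subtree_sizes(t))


leaf = ()
t = ((leaf, leaf), ((leaf, leaf), leaf))  # a full binary tree with 9 nodes
beta = 1.5
lhs = F_power(t, beta)
rhs = max(subtree_sizes(t)) ** (beta - 1) * A(t, lambda x: x ** (beta - 1))
print(lhs, rhs)  # the two values agree up to rounding
```

The recursion is fine for small trees; for very deep trees an explicit stack would be needed.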
We say that v ∈ t is a leaf if |t v | = 1.We denote by L(t) the set of leaves of t and, when |t| > 1, by t * = t \ L(t) the tree t without its leaves.We stress that the additive functional considered in [21] is exactly (5) However the asymptotics will be the same as the one for F when the toll function is a power function, see Remark 3.4.We complete the examples of the previous section for binary trees.
• The Sackin index (or external path length) of a tree $t$, used to study the balance of the tree, is similar to the total path length of $t$ when one considers only the leaves: $S(t) = \sum_{w \in L(t)} d(\emptyset, w)$. Using that for a full binary tree we have $|t| = 2|L(t)| - 1$, we deduce that $2S(t) = \left(\sum_{v \in t} |t_v|\right) - 1 = A_t(1) - 1$.
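As a quick check of the relation between the Sackin index and $A_t(1)$, here is a minimal Python sketch on a five-node full binary tree. The nested-tuple encoding (a tree is the tuple of its children, a leaf is `()`) and the function name are assumptions made for illustration.

```python
def sackin_and_sizes(t, depth=0):
    """Return (Sackin index, sum of |t_v| over the nodes, number of nodes)."""
    if not t:  # a leaf contributes its depth to the Sackin index
        return depth, 1, 1
    sackin, size_sum, nodes = 0, 0, 1
    for child in t:
        s, a, n = sackin_and_sizes(child, depth + 1)
        sackin += s
        size_sum += a
        nodes += n
    # add |t_v| for the current node v, which is the node count of its sub-tree
    return sackin, size_sum + nodes, nodes


leaf = ()
t = (leaf, (leaf, leaf))  # root with a leaf and a cherry: 5 nodes, 3 leaves
S, A1, size = sackin_and_sizes(t)
print(S, A1, size)  # Sackin index, A_t(1), |t|
```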

1.3. Main results on the asymptotics of additive functionals in the Catalan model.
We consider the Catalan model: let $T_n$ be a random tree uniformly distributed among the set of full binary ordered trees with $n$ internal nodes (and thus $n + 1$ leaves), which has cardinality $C_n = (2n)!/[(n!)^2 (n + 1)]$. Recall that $T_n$ is a (full binary) Galton-Watson tree (also known as simply generated tree) conditioned on having $n$ internal nodes. It is well known, see Takács [50], Aldous [7,8] and Janson [29], that $|T_n|^{-3/2} P(T_n)$ converges in distribution, as $n$ goes to infinity, towards $2 \int_0^1 B_s \, ds$, where $B = (B_s, s \in [0, 1])$ is the normalized positive Brownian excursion. This result, see Corollary 3.9, can be seen as a consequence of the convergence in distribution of $T_n$ (in fact of its contour process), properly scaled, towards the Brownian continuum tree whose contour process is $B$, see [7] and Duquesne [14], or Duquesne and Le Gall [15] in the setting of Brownian excursion. For a combinatorial approach, which can be extended to other families of trees, see also Fill and Kapur [22,23] or Fill, Flajolet and Kapur [19].
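The cardinality of the Catalan model can be verified by brute force for small $n$. The following sketch enumerates full binary ordered trees by splitting the internal nodes between the left and right sub-trees; the tuple encoding and the function names are illustrative assumptions.

```python
from math import comb


def full_binary_trees(n):
    """Yield all full binary ordered trees with n internal nodes.

    Assumed encoding: a leaf is (), an internal node is (left, right).
    """
    if n == 0:
        yield ()
        return
    for k in range(n):  # k internal nodes go to the left sub-tree
        for left in full_binary_trees(k):
            for right in full_binary_trees(n - 1 - k):
                yield (left, right)


def catalan(n):
    """C_n = (2n)! / ((n!)^2 (n + 1)), computed with exact integers."""
    return comb(2 * n, n) // (n + 1)


counts = [sum(1 for _ in full_binary_trees(n)) for n in range(7)]
print(counts)
```

Sampling uniformly from this list gives $T_n$ for small $n$; the enumeration grows like $4^n$, so it is only meant as a check.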
In [21], the authors considered the toll functions $b_n = n^\beta$ with $\beta > 0$ and they proved that, with a suitable scaling, the corresponding additive functional $F_\beta(T_n) = |T_n|^{\beta - 1} A_{T_n}(x^{\beta - 1})$ converges in distribution to a limit, say $Y_\beta$. The distribution of $Y_\beta$ is characterized by its moments. (In [18,21], the authors also considered the toll function $b_n = \log(n)$.) See also Janson and Chassaing [32] for asymptotics of the Wiener index, which is a consequence of the joint convergence in distribution of $(A_{T_n}(1), A_{T_n}(x))$ with a suitable scaling, and Blum, François and Janson [10] for the convergence of the Sackin and Colless indexes. In Theorem 3.1 (take $\alpha = 2$), we prove that, in the Catalan model, the random measure $|T_n|^{-3/2} A_{T_n}$ converges weakly a.s., as $n$ goes to infinity, to a random measure $2\Phi_B$, built on the Brownian normalized excursion $B$, see (18) with $h = B$. Using the notation $T_{n,v} = (T_n)_v$ for $v \in T_n$, this proves in particular the following a.s. convergence:
(8) $|T_n|^{-3/2} \sum_{v \in T_n} |T_{n,v}| \, f\left(|T_{n,v}| / |T_n|\right) \to 2\Phi_B(f)$ as $n \to +\infty$,
simultaneously for all real-valued continuous functions $f$ defined on $[0, 1]$. Notice that Theorem 3.1 is more general, as the convergences hold jointly for all measurable real-valued functions $f$ defined on $[0, 1]$ such that $f$ is continuous on $(0, 1]$ and $\sup_{x \in (0,1]} x^a |f(x)|$ is finite for some $a < 1/2$. This covers the case of toll functions $b_n = n^\beta$ with $\beta > 1/2$ in [21], which corresponds to the so-called "global" regime. The limit $2\Phi_B(x^{\beta - 1})$ gives a representation of $Y_\beta$ for $\beta > 1/2$, which, thanks to Corollary 3.2, corresponds when $\beta \geq 1$ to the one announced in Fill and Janson [20], expressed in terms of $m_B(s, t) = \inf_{u \in [s \wedge t, s \vee t]} B(u)$. In the "local" regime, that is $\beta \in (0, 1/2]$, according to Corollary 3.2 and Lemma 2.1, the convergence (8) is not relevant as $\Phi_B(x^{\beta - 1}) = +\infty$ a.s.; see [21] for the relevant normalization. The proof of Theorem 3.1 relies on the natural embedding of $T_n$ into the Brownian excursion, see [8] and Le Gall [37], so that the convergence in distribution of the random measure $|T_n|^{-3/2} A_{T_n}$ or of the additive functionals $F_\beta$ (which holds simultaneously for all $\beta > 1/2$) is then an a.s. convergence. We also give the fluctuations for this a.s. convergence, see Proposition 3.5. In Remark 3.3, we provide, as a direct consequence of Theorem 3.1, the joint convergence of the total path length and of the Wiener, Sackin, Colless and cophenetic indexes defined in Sections 1.1 and 1.2.
Remark 1.1. The method presented in this section, based on the embedding of $T_n$ into a Brownian excursion, cannot be extended directly to other models of trees such as binary search trees, recursive trees or simply generated trees.
Concerning binary search trees (or the random permutation model, or Yule trees), see [46] and [48] for the convergence of the external path length (which corresponds in our setting to the Sackin index), [42] for the toll function $b_n = n^\beta$, [43] for the Wiener index (and [29] for simply generated trees), [10] (and [25] for other trees) for the Sackin and Colless indexes, and [18] for the shape function.
Concerning recursive trees, see [40,13] for the convergence of the total path length and [43] for the Wiener index. In the setting of recursive trees, (3) is a stochastic fixed point equation, which can be analyzed using the approach of [49].
Remark 1.2. One can replace the toll function $b_{|t|}$ in (3) by a function of the tree, say $b(t)$. For example, if one considers $b(t) = 1_{\{t = t_0\}}$, with $t_0$ a given tree, then the corresponding additive functional gives the number of occurrences of the motif $t_0$. The case of "local" toll functions $b$ (with finite support or fast decreasing rate) has been considered in the study of fringe trees, see [5], [12,24] for binary search trees, [31] for simply generated trees and [27] for binary search trees and recursive trees.
See [28] for the study of the phase transition in the asymptotics of additive functionals with toll functions $b_n = n^\beta$ on binary search trees between the "local" regime (corresponding to $\beta \leq 1/2$) and the "global" regime ($\beta > 1/2$). The same phase transition is observed for the Catalan model, see [21]. Our main result, see Theorem 3.1, concerns specifically the "global" regime.
1.4. Main results on the asymptotics of additive functionals for simply generated trees. We consider a weight sequence $\mathbf{p} = (\mathbf{p}(k), k \in \mathbb{N})$ on $\mathbb{R}_+$ with generating function $g_{\mathbf{p}}$. We assume that $g_{\mathbf{p}}$ has a positive radius of convergence, $g_{\mathbf{p}}(0) \neq 0$, $g''_{\mathbf{p}} \neq 0$ and $\mathbf{p}$ is generic, that is, there exists a positive root to the equation $g_{\mathbf{p}}(q) = q g'_{\mathbf{p}}(q)$. A simply generated tree of size $p \in \mathbb{N}^*$ with weight sequence $\mathbf{p}$ is a random tree $\tau^{(p)}$ such that the probability of $\tau^{(p)}$ being equal to $t$, with $|t| = p$, is proportional to $\prod_{v \in t} \mathbf{p}(k_v(t))$, where $k_v(t)$ is the number of children of the node $v$ in $t$. According to Section 2.5, since $\mathbf{p}$ is generic, without loss of generality we can assume that $\mathbf{p}$ is a critical probability distribution ($g_{\mathbf{p}}(1) = g'_{\mathbf{p}}(1) = 1$), so that $\tau^{(p)}$ is distributed as a Galton-Watson (GW) tree $\tau$ with offspring distribution $\mathbf{p}$ conditioned on $|\tau| = p$. Global convergence of scaled GW trees to Lévy trees has been studied in Le Gall and Le Jan [39] and in [15] using the convergence of the contour process.
Assume $\mathbf{p}$ belongs to the domain of attraction of a symmetric stable distribution of Laplace exponent $\psi(\lambda) = \kappa \lambda^\gamma$ with $\gamma \in (1, 2]$ and $\kappa > 0$. Then, the convergence of $\tau^{(p)}$, properly scaled, to the normalized Lévy tree holds according to [14]. This result is recalled in Section 7.3. We recall that the normalized Lévy tree is a real tree coded by the normalized positive excursion $H$ of the height function. Under the hypothesis of Theorem 7.3, there exists a sequence $(a_p, p \in \mathbb{N}^*)$ such that we have the following convergence in distribution, see Corollary 3.8:
(9) $\frac{a_p}{p^2} A_{\tau^{(p)}}(f) \to \Phi_H(f)$ as $p \to +\infty$,
simultaneously for all real-valued continuous functions $f$ defined on $[0, 1]$. The convergence (9) has to be understood along the infinite sub-sequence of $p$ such that $P(|\tau| = p) > 0$. The proof relies on the fact that one can approximate $A_t(x^k)$, for $k \in \mathbb{N}^*$, by an elementary continuous functional of the contour process of $t$, see Section 7.2. Then, we use the convergence of the contour process of $\tau^{(p)}$ to the contour process of $H$ to conclude. We also provide the first moment of $\Phi_H(x^{\beta - 1})$, see Lemma 3.10, and conjecture that $\beta = 1/\gamma$ corresponds to the phase transition between the "global" and "local" regimes in this setting.
Remark 1.3. We make the following comments.
• Assume that $\mathbf{p}$ has finite variance, say $\sigma^2$. Then one can take $a_p = \sqrt{p}$ and $H$ is equal to $(2/\sigma) B$, which corresponds to $\psi(\lambda) = \sigma^2 \lambda^2 / 2$. By scaling, or using that the limit in Theorem 3.1 does not depend on $\alpha$, we deduce that $\Phi_{cB} = c \, \Phi_B$ for $c > 0$. We can then rewrite (9) as:
(10) $p^{-3/2} A_{\tau^{(p)}}(f) \to \frac{2}{\sigma} \Phi_B(f)$ in distribution as $p \to +\infty$,
where the convergence holds simultaneously for all real-valued continuous functions $f$ defined on $[0, 1]$ and along the infinite sub-sequence of $p$ such that $P(|\tau| = p) > 0$.
• If one considers the binary offspring distribution $\mathbf{p}$ such that $\mathbf{p}(2) + \mathbf{p}(0) = 1$ (recall that $1 > \mathbf{p}(0) > 0$ by assumption), one gets that $\tau^{(2n+1)}$ is uniformly distributed among the full binary trees with $n$ internal nodes (and $n + 1$ leaves), that is, $\tau^{(2n+1)}$ is distributed as $T_n$, see the Catalan model studied in Section 1.3. Take $\mathbf{p}(0) = 1/2$ to get the critical case, and notice that $\sigma = 1$ in (10). The convergence (10), with $p = 2n + 1$, is then a weaker version of (8) (convergence in distribution instead of a.s. convergence, and continuous functions on $[0, 1]$ instead of continuous functions on $(0, 1]$ with possible blow up at $0+$).
• If one considers the (shifted) geometric distribution $\mathbf{p}(k) = q(1 - q)^k$ for $k \in \mathbb{N}$ with $q \in (0, 1)$, one gets that $\tau^{(p)}$ is uniformly distributed among the rooted ordered trees with $p$ nodes. Take $\mathbf{p}(0) = q = 1/2$ to get the critical case, and notice that $\sigma^2 = 2$ in (10).
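The values of the variance used in these two examples can be confirmed with a few lines of Python. This is only an arithmetic check: the dictionary encoding of the offspring laws and the truncation of the geometric law at 200 terms (whose neglected tail is of order $2^{-200}$) are assumptions of this sketch.

```python
def mean_var(pmf):
    """Mean and variance of an offspring law given as {k: probability}."""
    mean = sum(k * w for k, w in pmf.items())
    second = sum(k * k * w for k, w in pmf.items())
    return mean, second - mean ** 2


# Critical binary law: p(0) = p(2) = 1/2.
binary = {0: 0.5, 2: 0.5}

# Critical shifted geometric law p(k) = q (1 - q)^k with q = 1/2, truncated.
geometric = {k: 0.5 ** (k + 1) for k in range(200)}

print(mean_var(binary))     # mean 1, variance 1
print(mean_var(geometric))  # mean close to 1, variance close to 2
```

Both laws are critical (mean 1). The binary law has variance 1, and the geometric law has variance 2, that is, standard deviation $\sqrt{2}$.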
1.5. Organization of the paper. Section 2 is devoted to the definition of the main objects used in this paper (ordered rooted discrete trees using Neveu's formalism, real trees defined by a contour function, the Brownian tree whose contour function is a Brownian normalized excursion, the embedding of the discrete binary trees from the Catalan model into the Brownian tree, and simply generated random trees). We present our main result about the Catalan model in Section 3.1 on the convergence (8), see Theorem 3.1 and Corollary 3.2. (The proofs are given in Sections 4 and 5.) The corresponding fluctuations are stated in Proposition 3.5. (The proof is given in Section 6.) Section 3.2 is devoted to the main results concerning the convergence of $A_\tau$ when $\tau$ is a simply generated tree, see Corollaries 3.8 and 3.9. (Their proofs are provided in Section 7.) Some technical results are gathered in Section 8.

Notations and a preliminary result
Let $I$ be an interval of $\mathbb{R}$ with positive Lebesgue measure. We denote by $B(I)$ the set of real-valued measurable functions defined on $I$. We denote by $C(I)$ (resp. $C_+(I)$) the set of real-valued (resp. non-negative) continuous functions defined on $I$. For $f \in B(I)$, we denote by $\|f\|_\infty$ the supremum norm and by $\|f\|_{\mathrm{esssup}}$ the essential supremum of $|f|$ over $I$. The two suprema coincide when $f$ is continuous.
2.1. Ordered rooted discrete trees. We recall Neveu's formalism [44] for ordered rooted discrete trees, which we shall simply call trees. We set $\mathcal{U} = \bigcup_{n \geq 0} (\mathbb{N}^*)^n$, the set of finite sequences of positive integers, with the convention $(\mathbb{N}^*)^0 = \{\emptyset\}$. For $n \geq 0$ and $u \in (\mathbb{N}^*)^n \subset \mathcal{U}$, we set $|u| = n$ the length of $u$. Let $u, v \in \mathcal{U}$. We denote by $uv$ the concatenation of the two sequences, with the convention that $uv = u$ if $v = \emptyset$ and $uv = v$ if $u = \emptyset$. We say that $v$ is an ancestor of $u$ (in a large sense) and write $v \preccurlyeq u$ if there exists $w \in \mathcal{U}$ such that $u = vw$. If $v \preccurlyeq u$ and $v \neq u$, then we shall write $v \prec u$. The set of ancestors of $u$ is the set $\bar{A}_u = \{v \in \mathcal{U}; \, v \preccurlyeq u\}$. The most recent common ancestor of a subset $s$ of $\mathcal{U}$, denoted by $m(s)$, is the unique element $v$ of $\bigcap_{u \in s} \bar{A}_u$ with maximal length. We consider the lexicographic order on $\mathcal{U}$: for $u, v \in \mathcal{U}$, we set $v < u$ either if $v \prec u$ or if $v = wjv'$ and $u = wiu'$ with $w = m(\{v, u\})$, $u', v' \in \mathcal{U}$ and $j < i$ for some $i, j \in \mathbb{N}^*$. A tree $t$ is a subset of $\mathcal{U}$ that satisfies:
• $\emptyset \in t$.
• If $u \in t$, then $\bar{A}_u \subset t$.
• For every $u \in t$, there exists $k_u(t) \in \mathbb{N}$ such that, for every $i \in \mathbb{N}^*$, $ui \in t$ if and only if $1 \leq i \leq k_u(t)$.
Let $u \in t$. The integer $k_u(t)$ represents the number of offspring of the node $u$. The node $u$ is called a leaf (resp. internal node) if $k_u(t) = 0$ (resp. $k_u(t) > 0$). The node $\emptyset$ is called the root of $t$. We define the sub-tree $t_u \in \mathbb{T}$ of $t$ "above" $u$ as:
(11) $t_u = \{v \in \mathcal{U}, \, uv \in t\}$.
We denote by $|t| = \mathrm{Card}(t)$ the number of nodes of $t$ and we say that $t$ is finite if $|t| < +\infty$. Let $d_t$ denote the usual graph distance on $t$. In particular, we have $d_t(\emptyset, u) = |u|$ for $u \in t$. When the context is clear, we shall write $d$ for $d_t$.
We denote by $\mathbb{T}$ the set of finite trees and by $\mathbb{T}^{(p)} = \{t \in \mathbb{T}, \, |t| = p\}$ the set of trees with $p$ nodes, for $p \in \mathbb{N}^*$. Let us recall that, for a tree $t \in \mathbb{T}$, we have
(12) $\sum_{u \in t} k_u(t) = |t| - 1$.
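Neveu's formalism translates directly into code if one represents a word of $\mathcal{U}$ as a Python tuple of positive integers and a tree as a set of such tuples. The sketch below is one possible encoding, not a construction from the paper; the function names `is_tree`, `children_count` and `subtree` are hypothetical. It checks the tree axioms, computes $k_u(t)$ and $t_u$, and verifies (12) on an example.

```python
def is_tree(t):
    """Check the axioms of a tree in Neveu's formalism for a finite set of tuples."""
    if () not in t:
        return False
    for u in t:
        if any(i < 1 for i in u):
            return False
        if u and u[:-1] not in t:  # parent present, hence all ancestors
            return False
        if u and u[-1] > 1 and u[:-1] + (u[-1] - 1,) not in t:
            return False  # children of a node must be labelled 1..k
    return True


def children_count(t, u):
    """k_u(t): the number of children of the node u in t."""
    k = 0
    while u + (k + 1,) in t:
        k += 1
    return k


def subtree(t, u):
    """t_u = {v : uv in t}, the sub-tree of t above u."""
    return {w[len(u):] for w in t if w[:len(u)] == u}


t = {(), (1,), (2,), (2, 1), (2, 2), (3,)}
print(is_tree(t), children_count(t, ()), subtree(t, (2,)))
print(sum(children_count(t, u) for u in t), len(t) - 1)  # both sides of (12)
print(sorted(t))  # tuple comparison agrees with the lexicographic order on words
```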

2.2. Real trees. We recall the definition of a real tree, see [17]. A real tree is a metric space $(\mathcal{T}, d)$ which satisfies the following two properties for every $x, y \in \mathcal{T}$:
(i) There exists a unique isometric map $f_{x,y}$ from $[0, d(x, y)]$ into $\mathcal{T}$ such that $f_{x,y}(0) = x$ and $f_{x,y}(d(x, y)) = y$.
(ii) If $\varphi$ is a continuous injective map from $[0, 1]$ into $\mathcal{T}$ such that $\varphi(0) = x$ and $\varphi(1) = y$, then we have $\varphi([0, 1]) = f_{x,y}([0, d(x, y)])$.
Equivalently, a metric space $(\mathcal{T}, d)$ is a real tree if and only if $\mathcal{T}$ is connected and $d$ satisfies the four point condition:
$d(s, t) + d(x, y) \leq \max\left(d(s, x) + d(t, y), \, d(s, y) + d(t, x)\right)$ for all $s, t, x, y \in \mathcal{T}$.
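The four point condition is easy to test on finite metric spaces. The sketch below is an illustration under stated assumptions: it uses the graph distance on the vertices of a small discrete tree (a finite subset of a real tree, so the condition must hold) and, as a counterexample, the graph distance on a cycle of length 4. The function names are hypothetical.

```python
from itertools import product


def four_point_ok(points, d):
    """Check d(s,t) + d(x,y) <= max(d(s,x) + d(t,y), d(s,y) + d(t,x)) for all quadruples."""
    return all(
        d(s, t) + d(x, y) <= max(d(s, x) + d(t, y), d(s, y) + d(t, x))
        for s, t, x, y in product(points, repeat=4)
    )


def tree_dist(u, v):
    """Graph distance between two nodes written as words: |u| + |v| - 2 |u ^ v|."""
    k = 0
    while k < min(len(u), len(v)) and u[k] == v[k]:
        k += 1
    return len(u) + len(v) - 2 * k


def cycle_dist(i, j):
    """Graph distance on the cycle with vertices 0, 1, 2, 3."""
    return min((i - j) % 4, (j - i) % 4)


nodes = [(), (1,), (2,), (1, 1), (1, 2), (2, 1)]
print(four_point_ok(nodes, tree_dist))      # tree metric: condition holds
print(four_point_ok(range(4), cycle_dist))  # cycle: fails for opposite pairs
```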
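A rooted real tree is a real tree $(\mathcal{T}, d)$ with a distinguished element $\emptyset$ called the root. For $x, y \in \mathcal{T}$, we denote by $[[x, y]]$ the range of the map $f_{x,y}$ described above. Let $x, y \in \mathcal{T}$. We denote by $x \wedge y$ their most recent common ancestor, which is the only $z \in \mathcal{T}$ such that $[[\emptyset, z]] = [[\emptyset, x]] \cap [[\emptyset, y]]$. The out-degree $d_x(\mathcal{T})$ of $x$ is the number of connected components of $\mathcal{T} \setminus \{x\}$ which do not contain the root. We say $x$ is a leaf (resp. branching point) if $d_x(\mathcal{T}) = 0$ (resp. $d_x(\mathcal{T}) \geq 2$). We say $\mathcal{T}$ is binary if $d_x(\mathcal{T}) \in \{0, 1, 2\}$ for all $x \in \mathcal{T}$.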
For $h \in C_+([0, 1])$, we define its minimum over the interval with bounds $s, t \in [0, 1]$:
(13) $m_h(s, t) = \inf_{u \in [s \wedge t, s \vee t]} h(u)$.
We shall also use the length of the excursion of $h$ above level $r$ straddling $s$ defined by:
(14) $\sigma_{r,s}(h) = \int_0^1 dt \, 1_{\{m_h(s, t) \geq r\}}$.
For $\beta > 0$, we set:
(15) $Z^h_\beta = \int_0^1 ds \int_0^{h(s)} dr \, \sigma_{r,s}(h)^{\beta - 1}$.
Let $h \in C_+([0, 1])$ be such that $m_h(0, 1) = 0$. For every $x, y \in [0, 1]$, we set $d_h(x, y) = h(x) + h(y) - 2 m_h(x, y)$. It is easy to check that $d_h$ is symmetric and satisfies the triangle inequality. The relation $\sim_h$ defined on $[0, 1]^2$ by $x \sim_h y \Leftrightarrow d_h(x, y) = 0$ is an equivalence relation. Let $\mathcal{T}_h = [0, 1] / \sim_h$ be the corresponding quotient space. The function $d_h$ on $[0, 1]^2$ induces a function on $\mathcal{T}_h^2$, which we still denote by $d_h$, and which is a distance on $\mathcal{T}_h$. It is not difficult to check that $(\mathcal{T}_h, d_h)$ is then a compact real tree. We denote by $p_h$ the canonical projection from $[0, 1]$ into $\mathcal{T}_h$. Thus, the metric space $(\mathcal{T}_h, d_h)$ can be viewed as a rooted real tree by setting $\emptyset = p_h(0)$. The image of the Lebesgue measure on $[0, 1]$ by $p_h$ is a measure $\mu_h$ on $\mathcal{T}_h$.
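The pseudo-distance $d_h$ can be explored on a discretized contour function. In the sketch below, $h$ is sampled at integer times along the contour of a small discrete tree; since the function is piecewise linear between these times, the minimum over an interval with integer bounds is attained at an integer time. The list `h` and the function names are assumptions made for this example.

```python
def m(h, i, j):
    """m_h(i, j): minimum of the sampled contour function between times i and j."""
    lo, hi = min(i, j), max(i, j)
    return min(h[lo:hi + 1])


def d(h, i, j):
    """d_h(i, j) = h(i) + h(j) - 2 m_h(i, j)."""
    return h[i] + h[j] - 2 * m(h, i, j)


# Contour of a tree whose root has two children; the first child has two children.
h = [0, 1, 2, 1, 2, 1, 0, 1, 0]
times = range(len(h))

print(d(h, 2, 4))  # distance between the two grandchildren
print(d(h, 1, 3))  # two visits of the same node are identified (distance 0)
print(d(h, 2, 7))  # distance between a grandchild and the second child

triangle_ok = all(
    d(h, i, k) <= d(h, i, j) + d(h, j, k)
    for i in times for j in times for k in times
)
print(triangle_ok)
```

Times at distance 0 from each other form the equivalence classes of $\sim_h$, which here are exactly the nodes of the discrete tree.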

2.3. The Brownian continuum random tree $\mathcal{T}$. Let $B = (B_t, 0 \leq t \leq 1)$ be a positive normalized Brownian excursion. Informally, $B$ is just a linear standard Brownian path started from the origin and conditioned to stay positive on $(0, 1)$ and to come back to 0 at time 1. For $\alpha > 0$, let $e = \sqrt{2/\alpha} \, B$ and let $\mathcal{T}_e$ denote the associated real tree, called the Brownian continuum random tree. (We recall that the associated branching mechanism is $\psi(\lambda) = \alpha \lambda^2$.) The continuum random tree introduced in [6] corresponds to $\alpha = 1/2$ and the Brownian tree associated to the normalized Brownian excursion corresponds to $\alpha = 2$. We shall keep the parameter $\alpha$ so that the two previous cases are easy to read off the results. See [38] for properties of the Brownian continuum random tree. In particular, $\mu_e(dx)$-a.s. $x$ is a leaf, and a.s. $\mathcal{T}_e$ is binary.
We shall forget to stress the dependence in e in the notations, when there is no ambiguity, so that for example we simply write T , µ, σ r,s and Z β for respectively T e , µ e , σ r,s (e) which is defined in (14) and Z e β which is defined in (15).For r ≥ 0 and s ∈ [0, 1], we also have: which is the mass of the sub-tree of T containing p(s) and at distance r from the root.The next result is a consequence of Lemma 3.10 in Section 3.2 (with H = e, γ = 2 and κ = α).

2.4.
The discrete binary tree from the Brownian tree.A marked tree t = (t, (h v , v ∈ t)) is a tree t ∈ T with a label on each node.The label h v ∈ (0, +∞) will be interpreted as the length of the branch from below v. (Notice, there is a branch below the root.)We define the concatenation of two marked trees t(i) = (t (i) , (h ] the random real tree spanned by the n + 1 leaves p g (t 1 ), . . ., p g (t n+1 ) with root ∅.We define recursively the associated marked tree t , where intuitively t(G n ) is similar to T g (G n ) but with the branch lengths equal to 1 and no branch below the root, and Let e be the Brownian excursion defined in Section 2.3.Let (U n , n ∈ N * ) be a sequence of independent random variables uniform on [0, 1], independent of e.In particular (p(U n ), n ∈ N * ) are a.s.distinct leaves of T .Let (U 1,n , . . ., U n+1,n ) be the a.s.increasing reordering of (U 1 , . . ., U n+1 ) and set G n = (e; (U 1,n , . . ., U n+1,n )).We write the random real tree spanned by the n + 1 leaves p(U 1 ), . . ., p(U n+1 ) and the root and Tn = (  2.5.Simply generated random tree.We consider a weight sequence p = (p(k), k ∈ N) of non-negative real numbers such that k∈N p(k) > p(1) + p(0) and p(0) > 0. For t ∈ T, we define its weight as: We set w(T (p) ) = t∈T (p) w(t).For p ∈ N * such that w(T (p) ) > 0, a simply generated tree taking values in T (p) with weight sequence p is a T (p) -random variable τ (p) whose distribution is characterized by, for all t ∈ T (p) : Let g p be the generating function of p: g p (θ) = k∈N θ k p(k) for θ > 0. 
From now on, we assume there exists $\theta > 0$ such that $g_{\mathbf{p}}(\theta)$ is finite. For $q > 0$ such that $g_{\mathbf{p}}(q) < +\infty$, let $\mathbf{p}_q$ be the probability distribution with generating function $\theta \mapsto g_{\mathbf{p}}(q\theta)/g_{\mathbf{p}}(q)$. According to [33], see also [3], the distribution of the GW tree $\tau$ with offspring distribution $\mathbf{p}_q$ conditioned on $\{|\tau| = p\}$ is the distribution of $\tau^{(p)}$ and thus does not depend on $q$. It is easy to check that there exists at most one positive root, say $q_{\mathbf{p}}$, of the equation $g_{\mathbf{p}}(q) = q g'_{\mathbf{p}}(q)$. We say that $\mathbf{p}$ is generic (for the total progeny) if such a root $q_{\mathbf{p}}$ exists and non-generic otherwise. In particular, all weight sequences such that there exists $q > 0$ with $g_{\mathbf{p}}(q)$ finite and $g_{\mathbf{p}}(q) < q g'_{\mathbf{p}}(q)$ (that is, $\mathbf{p}_q$ is a super-critical offspring distribution) are generic.
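From now on, we shall assume that $\mathbf{p}$ is generic. Without loss of generality, by replacing $\mathbf{p}$ by the probability distribution with generating function $\theta \mapsto g_{\mathbf{p}}(q_{\mathbf{p}} \theta)/g_{\mathbf{p}}(q_{\mathbf{p}})$, we will assume that $\mathbf{p}$ is a critical probability distribution, that is: $\sum_{k \in \mathbb{N}} \mathbf{p}(k) = \sum_{k \in \mathbb{N}} k \, \mathbf{p}(k) = 1$. We recall that $\tau^{(p)}$ is distributed as a critical GW tree $\tau$ with offspring distribution $\mathbf{p}$ conditioned on $\{|\tau| = p\}$, since for every finite tree $t$, $P(\tau = t) = w(t)$.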
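The reduction to a critical probability distribution can be illustrated for a weight sequence with finite support. The sketch below is a minimal illustration, not a general solver: it assumes weights given as a list `[p(0), p(1), ...]` with `p(0) > 0` and some positive weight at an index at least 2, finds the positive root of $g(q) = q g'(q)$ by bisection (the function $q \mapsto g(q) - q g'(q) = \sum_k (1 - k) q^k p(k)$ is non-increasing on $(0, +\infty)$), and then tilts the weights. The names `generic_root` and `tilt` are hypothetical.

```python
def generic_root(weights, tol=1e-12):
    """Positive root q of g(q) = q g'(q) for a finitely supported weight sequence."""
    def h(q):
        return sum((1 - k) * w * q ** k for k, w in enumerate(weights))

    lo, hi = 0.0, 1.0
    while h(hi) > 0:  # h(0+) = p(0) > 0 and h decreases to -infinity
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def tilt(weights, q):
    """Probability distribution with generating function theta -> g(q theta) / g(q)."""
    g = sum(w * q ** k for k, w in enumerate(weights))
    return [w * q ** k / g for k, w in enumerate(weights)]


weights = [2.0, 0.0, 1.0]  # binary weights: p(0) = 2, p(2) = 1
q = generic_root(weights)
critical = tilt(weights, q)
mean = sum(k * w for k, w in enumerate(critical))
print(q, critical, mean)  # q is close to sqrt(2); the tilted law is (1/2, 0, 1/2)
```

With these weights the tilted law is the critical binary offspring distribution, so the corresponding simply generated trees are the full binary trees of the Catalan model.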
Local limits for critical GW trees conditioned on having a large total progeny go back to [33] for the generic case (infinite spine case) and [30] for the non-generic case (condensation case), see also [3,4] and references therein for more general conditionings. Scaling limits or global limits for GW trees conditioned on having a large total progeny have been studied in [15] for forests (that is, collections of GW trees) and in [14,35] for critical GW trees in the domain of attraction of Lévy trees, see also [34] for more general conditionings of GW trees and [36] for non-generic cases.

Main results
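For $t \in \mathbb{T}$, we define the unnormalized measure $A_t$ on $[0, 1]$ by:
(17) $A_t(f) = \sum_{v \in t} |t_v| \, f\left(|t_v| / |t|\right)$.
For $h \in C_+([0, 1])$, we also consider the measure $\Phi_h$ on $[0, 1]$ defined by:
(18) $\Phi_h(f) = \int_0^1 ds \int_0^{h(s)} dr \, f(\sigma_{r,s}(h))$.
We endow the space of non-negative finite measures on $[0, 1]$ with the topology of weak convergence.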
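To get a feeling for the measure $\Phi_h$, one can approximate it by Riemann sums for a deterministic $h$. The sketch below assumes $\Phi_h(f) = \int_0^1 ds \int_0^{h(s)} dr \, f(\sigma_{r,s}(h))$, where $\sigma_{r,s}(h)$ is the Lebesgue measure of $\{t : m_h(s, t) \geq r\}$, and uses the tent function $h(s) = \min(s, 1 - s)$, for which $\sigma_{r,s}(h) = 1 - 2r$, so that $\Phi_h(1) = 1/4$ and $\Phi_h(x) = 1/6$. The grid sizes and the function name `phi` are arbitrary choices for this illustration.

```python
from bisect import bisect_left


def phi(h, f, n=100, levels=50):
    """Riemann-sum approximation of Phi_h(f) on a midpoint grid of [0, 1]."""
    grid = [(i + 0.5) / n for i in range(n)]
    hs = [h(s) for s in grid]
    total = 0.0
    for i in range(n):
        # mins[j] approximates m_h(s_i, t_j), the minimum of h between s_i and t_j
        mins = [0.0] * n
        cur = hs[i]
        for j in range(i, -1, -1):
            cur = min(cur, hs[j])
            mins[j] = cur
        cur = hs[i]
        for j in range(i, n):
            cur = min(cur, hs[j])
            mins[j] = cur
        mins.sort()
        dr = hs[i] / levels
        for l in range(levels):
            r = (l + 0.5) * dr
            # fraction of grid times t with m_h(s_i, t) >= r
            sigma = (n - bisect_left(mins, r)) / n
            total += f(sigma) * dr
    return total / n


def tent(s):
    return min(s, 1.0 - s)


mass = phi(tent, lambda x: 1.0)      # close to 1/4
first_moment = phi(tent, lambda x: x)  # close to 1/6
print(mass, first_moment)
```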
3.1. Catalan model. Let $\alpha > 0$ and recall $e = \sqrt{2/\alpha} \, B$, where $B = (B_t, t \in [0, 1])$ denotes the normalized Brownian excursion. We also recall that the discrete binary tree $T_n$, defined in Section 2.4 from the Brownian tree $\mathcal{T}_e$, is uniformly distributed among the full ordered rooted binary trees with $n$ internal nodes. In particular, we have $|T_n| = 2n + 1$. For $n \in \mathbb{N}^*$, we define the weighted random measure $A_n$ on $[0, 1]$ by:
(19) $A_n = |T_n|^{-3/2} A_{T_n}$.
The next result is proved in Section 5.
Notice the fluctuations for the a.s.convergence towards Z β with β ≥ 1, given in Corollary 3.2, have an asymptotic variance (up to a multiplicative constant) given by Z 2β .
Remark 3.7.The contribution to the fluctuations is given by the error of approximation of A n,1 (f ) by A n,2 (f ), see notations from the proof of Theorem 3.1.This corresponds to the fluctuations coming from the approximation of the branch lengths (h n,v , v ∈ T n ) by their mean, which relies on the explicit representation on their joint distribution given in Lemma 4.1.In particular, there is no other contribution to the fluctuations from the approximation of the continuum tree T by the sub-tree T [n] .

3.2. Simply generated trees model. We keep notations from Section 2.5 on simply generated random trees. We assume that the weight sequence $\mathbf{p} = (\mathbf{p}(k), k \in \mathbb{N})$ of non-negative real numbers, such that $\sum_{k \in \mathbb{N}} \mathbf{p}(k) > \mathbf{p}(1) + \mathbf{p}(0)$ and $\mathbf{p}(0) > 0$, is generic. As stated in Section 2.5, without loss of generality, we will assume that $\mathbf{p}$ is a critical probability distribution, that is: $\sum_{k \in \mathbb{N}} \mathbf{p}(k) = \sum_{k \in \mathbb{N}} k \, \mathbf{p}(k) = 1$. The next result is a direct consequence of [14] on the convergence of the contour process of random discrete trees, see Corollary 7.5 given in Section 7. We keep notations and definitions of Section 7, with $H$ the normalized excursion of the height function associated to the branching mechanism $\psi$.
Corollary 3.8. Let $\mathbf{p}$ be a critical probability distribution on $\mathbb{N}$, with $1 > \mathbf{p}(1) + \mathbf{p}(0) \geq \mathbf{p}(0) > 0$, which belongs to the domain of attraction of a symmetric stable distribution of Laplace exponent $\psi(\lambda) = \kappa \lambda^\gamma$ with $\gamma \in (1, 2]$ and $\kappa > 0$, and renormalizing sequence $(a_p, p \in \mathbb{N}^*)$. Let $\tau$ be a GW tree with offspring distribution $\mathbf{p}$, and $\tau^{(p)}$ be distributed as $\tau$ conditionally on $\{|\tau| = p\}$. We have the following convergence in distribution:
(20) $\frac{a_p}{p^2} A_{\tau^{(p)}} \to \Phi_H$ as $p \to +\infty$,
where we endow the space of non-negative measures with the topology of weak convergence and where the convergence is taken along the infinite sub-sequence of $p$ such that $P(|\tau| = p) > 0$.
We set for β > 0 and t ∈ T: Corollary 3.9.Under the hypothesis and notations of Corollary 3.8, we have the following convergence in distribution for all β ≥ 1, with Z H β given by (15) and where the convergence is taken along the infinite sub-sequence of p such that P(|τ | = p) > 0.
The proof of the first part of the next Lemma is given in Section 8.6.The second part, which is the representation formula, is a direct consequence of the deterministic Lemma 8.6 in Section 8.5 (with β = a + 1).Lemma 3.10.Assume the height function H is associated to the Laplace exponent ψ(λ) = κλ γ with γ ∈ (1, 2] and κ > 0. We have that a.s.for all 1/γ ≥ β > 0, Z H β = +∞, that a.s.for all β > 1/γ, Z H β is finite and We also have the representation formulas Remark 3.11.We conjecture that for simply generated trees, under the hypothesis of Corollary 3.8, there is a phase transition at β = 1/γ between a "global" regime (β > 1/γ), where the convergence (21) holds, and a "local" regime (β ≤ 1/γ) where the convergence ( 21) is irrelevant as the right-hand-side of ( 21) is a.s.infinite.
Using the Skorohod representation theorem, notice that all the convergences in distribution of Corollary 3.9 hold simultaneously.
Remark 3.12. If $\mathbf{p}$ has finite variance, say $\sigma^2$, then one can take $a_p = \sqrt{p}$ in Corollaries 3.8 and 3.9, and $H$ is equal to $(2/\sigma) B$, which corresponds to $\psi(\lambda) = \sigma^2 \lambda^2 / 2$, see Remarks 7.2 and 7.4. By scaling, or using that the limit in Theorem 3.1 does not depend on $\alpha$, we deduce that in this case $\Phi_H = \frac{2}{\sigma} \Phi_B$ and $Z^H_\beta = \frac{2}{\sigma} Z^B_\beta$ in Corollaries 3.8 and 3.9.

Preliminary Lemmas
Recall T is the real tree coded by the excursion e, see Section 2.3 and T [n] is the (smallest) sub-tree of T e containing n + 1 leaves picked uniformly at random and the root, see Section 2.4.Recall (T n , (h n,v , v ∈ T n )) denote the corresponding marked tree.Intuitively, for v ∈ T n , h n,v is the length of the branch below the branching point with label v in T [n] (when keeping the order on the leaves).We recall, see [8], [45] (Theorem 7.9) or [15], that the density of (h n,v , v ∈ T n ) is, conditionally on T n , given by: where L n = v∈Tn h n,v denotes the total length of T [n] .Notice that the edge-lengths have an exchangeable distribution and are independent of the shape tree T n .Furthermore, elementary computations give that (h n,v , v ∈ T n ), with v ∈ T n ranked in the lexicographic order, has, conditionally on T n and L n , the same distribution as (L n ∆ 1 , . . ., L n ∆ 2n+1 ), where ∆ 1 , . . ., ∆ 2n+1 represents the lengths of the 2n + 1 intervals obtained by cutting [0, 1] at 2n independent uniform random variables on [0, 1] and independent of L n .We thus deduce the following elementary Lemma.
Lemma 4.1. Conditionally on $T_n = t$, the random vector $(h_{n,v}, v \in t)$ has the same distribution as $(L_n E_v / S_t, v \in t)$, where $(E_u, u \in \mathcal{U})$ are independent exponential random variables with mean 1, independent of $T_n$ and $L_n$, and $S_t = \sum_{v \in t} E_v$.
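The representation of Lemma 4.1 rests on the classical fact that normalized independent exponential variables have the same law as the spacings of uniform points on $[0, 1]$. The following simulation sketch compares the two constructions for a tree with $2n + 1 = 5$ nodes and a fixed total length; the fixed value of the total length, the seed, the sample size and the function names are assumptions made for illustration only.

```python
import random


def branch_lengths(num_nodes, total_length, rng):
    """Sample (L E_v / S) with E_v i.i.d. exponential of mean 1 and S their sum."""
    e = [rng.expovariate(1.0) for _ in range(num_nodes)]
    s = sum(e)
    return [total_length * x / s for x in e]


def spacings(num_nodes, total_length, rng):
    """Sample L times the spacings of num_nodes - 1 uniform points on [0, 1]."""
    cuts = sorted(rng.random() for _ in range(num_nodes - 1))
    pts = [0.0] + cuts + [1.0]
    return [total_length * (b - a) for a, b in zip(pts, pts[1:])]


rng = random.Random(0)
num_nodes, total_length, samples = 5, 3.0, 20000
mean_exp = sum(branch_lengths(num_nodes, total_length, rng)[0]
               for _ in range(samples)) / samples
mean_unif = sum(spacings(num_nodes, total_length, rng)[0]
                for _ in range(samples)) / samples
print(mean_exp, mean_unif)  # both close to total_length / num_nodes
```

Only the first coordinate is compared here; by exchangeability the same mean is expected for every branch.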
According to [2], we have that a.s.lim n→+∞ L n / √ n = 1/ √ α.We then deduce from Lemma 4.1 that (2n + 1) √ α h n,∅ / √ n converges in distribution towards E ∅ as n goes to infinity.Intuitively, we get that 2 √ αn E[h n,∅ ] is of order 1, for v ∈ T n .Recall the random measure A n is defined in (19).We introduce the random measure: There exists a finite constant C such that for all f ∈ B([0, 1]) and n ∈ N * , we have: Proof.Let a ∈ [0, 1/2) and f ∈ B([0, 1]).Using (55) in the Appendix, we deduce that for all n ∈ N * , we have 1 (44) in Lemma 8.1, we deduce that: Intuitively, h n,v is of the same order of its expectation.Since the random variables (h n,v , v ∈ T n ) are exchangeable, we deduce that h n,v is of the same order as E[h n,∅ ].Based on this intuition, we define the random measure A 2,n as follows.For f ∈ B([0, 1]), we set: There exists a finite constant C such that for all f ∈ B([0, 1]) and n ∈ N * , we have: ) and Using that (h n,v , v ∈ T n ) is exchangeable, elementary computations give: 44) and ( 45) in Lemma 8.1 and (56) in Lemma 8.5, we get: for some finite constant c which does not depend on n and f .
Lemma 4.4.Let a ∈ [0, 1/2).For all f ∈ B([0, 1]) and n ∈ N * , we have: We deduce that: According to (54), we have 2 We define N n,r,U k as the number of leaves of the sub-tree T [n] which are distinct from p(U k ) and such that their most recent common ancestor with p(U k ) is at distance further than r from the root.More precisely, using the definition (13) of m, we have: In particular, we deduce from the construction of T [n] and T n that for 1 ≤ k ≤ n + 1: Recall that, for v ∈ T n , L n,v denotes the set of leaves of T n with ancestor v and L(T n ) = L n,∅ denotes the set of leaves of T n .We deduce that: where we used (24) for the last equality.Notice that by construction, conditionally on e and U k , the random variable N n,r,U k is binomial with parameter (n, σ r,U k ).For this reason, we consider the following approximation of A 3,n (f ).For f ∈ B([0, 1]) non-negative, we set: Lemma 4.5.We have the following properties.
(i) For a ∈ (0, 1), there exists a finite constant C(a) such that if f ∈ B([0, 1]) is locally Lipschitz continuous on (0, 1], we have for all n ∈ N * : (ii) If a ∈ (−1/2, 0], there exists a finite constant C(a) such that we have for all n ∈ N * : Remark 4.6.We can extend (i) of Lemma 4.5 to get that for uniformly Hölder continuous function f with exponent λ > 1/2, we have ).This allows to extend Proposition 3.5 to such functions.
In particular, for all $a \in (-1/2, 0]$, a.s. for all $k \in \mathbb{N}$, we have $\lim_{n \to +\infty} A_n(x^{a+k}) = \sqrt{2\alpha} \, \Phi_e(x^{a+k})$. Since on $[0, 1]$ the convergence of moments implies the weak convergence of measures, we deduce that a.s. the random measure $A_n(x^a \, \bullet)$ converges weakly towards $\sqrt{2\alpha} \, \Phi_e(x^a \, \bullet)$. By taking a dense subset of $a$ in $(-1/2, 0]$ and using monotonicity, we deduce that a.s. for all $a \in (-1/2, 0]$ the random measure $A_n(x^a \, \bullet)$ converges weakly towards $\sqrt{2\alpha} \, \Phi_e(x^a \, \bullet)$. This ends the proof of Theorem 3.1.

6.1.
A preliminary stable convergence.Let (E v , v ∈ U) be independent exponential random variables with mean 1 and independent of e.Let f ∈ C([0, 1]).We set for v ∈ T n : (32) We have the following lemma.
Lemma 6.1.Let f ∈ C([0, 1]) be locally Lipschitz continuous on (0, 1] such that x a f esssup is finite for some a ∈ (0, 1).We have the following stable convergence: where G is a standard Gaussian random variable independent of e.
For x ≥ 0, we have Thanks to Theorem 3.1, we have: and v∈Tn We deduce that a.s.lim n→+∞ E e −λZn(f ) |T n = exp (λ 2 √ 2α Φ e (xf 2 )/2).Let K > 0, and consider the event B K = n∈N {A n (xf 2 ) ≤ K}.Since on B K , the term E e −λZn(f ) |T n is bounded by exp(λ 2 K/2), we deduce from dominated convergence that for any continuous bounded function g on the set of finite measure on [0, 1] (endowed with the topology of the weak convergence), we have: where G is a standard Gaussian random variable independent of e.We deduce that the convergence in distribution (33) holds conditionally on B K .Since A n (xf 2 ) is finite for every n and converges a.s. to a finite limit, we get that for any ε > 0, there exists K ε finite such that P(B Kε ) ≥ 1 − ε.Then use Lemma 6.2 below to conclude that (33) holds for f non-negative.
In the general case, we set $f_+ = \max(0, f)$ and $f_- = \max(0, -f)$ so that $f = f_+ - f_-$. Notice that $f_+$ and $f_-$ are non-negative and continuous. We have proved that (33) holds with $f$ replaced by $\lambda_+ f_+ + \lambda_- f_-$ for any $\lambda_+ \geq 0$ and $\lambda_- \geq 0$. Since $f_+ f_- = 0$, this implies the following convergence in distribution: where $G_+$ and $G_-$ are independent standard Gaussian random variables independent of $e$.
Then, using again that $f_+ f_- = 0$, we obtain that, conditionally on $e$, where $G$ is a standard Gaussian random variable independent of $e$. We deduce that (33) holds. This ends the proof.

Lemma 6.2. Let $(\Gamma_\varepsilon, \varepsilon > 0)$ be a family of events such that $\lim_{\varepsilon\to 0} P(\Gamma_\varepsilon) = 1$. Let $(W_n, n \in \mathbb{N})$ and $W$ be random variables taking values in a Polish space $M$. Assume that for all $\varepsilon > 0$, conditionally on $\Gamma_\varepsilon$, the sequence $(W_n, n \in \mathbb{N})$ converges in distribution towards $W$. Then $(W_n, n \in \mathbb{N})$ converges in distribution towards $W$.
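The estimate behind this lemma can be sketched as follows. This is one standard way to carry it out, under the reading (our assumption) that the hypothesis means $E[g(W_n) \mid \Gamma_\varepsilon] \to E[g(W) \mid \Gamma_\varepsilon]$ for every bounded continuous $g$; the symbol $\|g\|_\infty$ is our notation for the supremum of $|g|$.

```latex
% Sketch under the stated reading of "conditionally on \Gamma_\varepsilon".
For $g$ bounded continuous on $M$ and $\varepsilon > 0$ with
$P(\Gamma_\varepsilon) > 0$,
\begin{align*}
|E[g(W_n)] - E[g(W)]|
  &\le \bigl|E[(g(W_n) - g(W))\,\mathbf{1}_{\Gamma_\varepsilon}]\bigr|
     + 2\,\|g\|_\infty\, P(\Gamma_\varepsilon^c) \\
  &= P(\Gamma_\varepsilon)\,
     \bigl|E[g(W_n) \mid \Gamma_\varepsilon] - E[g(W) \mid \Gamma_\varepsilon]\bigr|
     + 2\,\|g\|_\infty\,\bigl(1 - P(\Gamma_\varepsilon)\bigr).
\end{align*}
Letting $n \to +\infty$ and then $\varepsilon \to 0$ gives
$\limsup_{n \to +\infty} |E[g(W_n)] - E[g(W)]| = 0$.
```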
Proof. Let $g$ be a real-valued bounded continuous function defined on $M$. It is enough to prove that $\lim_{n\to+\infty} |E[g(W_n)] - E[g(W)]| = 0$. By hypothesis, we have that for all $\varepsilon > 0$: We get: )]| = 0. This ends the proof.

6.2. Proof of Proposition 3.5. We deduce Proposition 3.5 directly from Lemmas 6.3 and 6.4 below.
Using notations from Section 4, we set:

Lemma 6.3. Let $f \in C([0, 1])$ be locally Lipschitz continuous on $(0, 1]$ such that $\|x^a f\|_{\mathrm{esssup}}$ is finite for some $a \in (0, 1)$. We have the following convergence in probability:

Proof. We keep notations from Section 4. We have: where Using Lemmas 4.2, 4.4 and 4.5 part (i), we deduce the following convergence in probability: We study the convergence of $\Delta_{5,n}$. We set: By conditioning with respect to $e$, we deduce that:

Using the definition of
From the a.s. convergence of $A_{4,n}(|f|)$ towards a finite limit, see Lemma 4.7, we deduce that a.s. $\lim_{n\to+\infty} \Delta_{6,n} = 0$. Since E 1 0 ds e(s) 2 is finite, see [47], we deduce from (34) that $\lim_{n\to+\infty} E[\Delta_{7,n}^2] = 0$. We obtain that: Then, we collect all the convergences together to get the result.

Now, we study the convergence in distribution of $\Delta_n(f)$.
Lemma 6.4. Let $f \in C([0, 1])$ be locally Lipschitz continuous on $(0, 1]$ such that $\|x^a f\|_{\mathrm{esssup}}$ is finite for some $a \in (0, 1)$. We have the following convergence in distribution: where $G$ is a standard Gaussian random variable independent of $e$.
Proof. According to Lemma 4.1, we get that $(\Delta_n(f), A_n)$ is distributed as $(\tilde\Delta_n(f), A_n)$ where: and $S_t = \sum_{v\in t} E_v$ for $t \in T$, with $\tilde L_n$ a random variable distributed as $L_n$, and thus with density given by (52), independent of $T_n$, and $(E_u, u \in U)$ independent exponential random variables with mean 1, independent of $\tilde L_n$ and $T_n$. So it is enough to prove (35) with $\Delta_n$ replaced by $\tilde\Delta_n$.
Recall the definition (32) of $Z_n(f)$. Since $\tilde L_n$ is independent of $(E_u, u \in U)$ and $T_n$, we get: Thanks to Corollary 8.4 with $\alpha = \gamma = 1$ and $\beta = 0$, we have that: Using (54), we get: We deduce that $\lim_{n\to+\infty} \kappa_{1,n} = 0$ in probability. Using (53) and Corollary 8.4 (three times), we get: We deduce that $\lim_{n\to+\infty} \kappa_{2,n} = 0$ in probability. We deduce from the law of large numbers that $\lim_{n\to+\infty} S_{T_n}/|T_n| = 1$ in probability. According to [2], we have that a.s. $\lim_{n\to+\infty} L_n/\sqrt{n} = 1/\sqrt{\alpha}$. Since $\tilde L_n$ is distributed as $L_n$, this implies the following convergence in probability: $\lim_{n\to+\infty} \tilde L_n/\sqrt{n} = 1/\sqrt{\alpha}$. We obtain that: We deduce that $(2\sqrt{\alpha}\,\tilde\Delta_n(f), A_n)$ has the same limit in distribution as $(-Z_n(f), A_n)$ as $n$ goes to infinity. Then use Lemma 6.1 to get that (35) holds with $\Delta_n$ replaced by $\tilde\Delta_n$. This ends the proof of the lemma.

Proof of Corollary 3.8
Before stating the proof, we recall the definition of the contour process of a discrete rooted ordered tree, see [15].

7.1. Contour process. Let $t \in T$ be a finite tree. The contour process $C^t = (C^t(s), s \in [0, 2|t|])$ is defined as the distance to the root of a particle visiting continuously each edge of $t$ at speed one (where all edges are of length 1) according to the lexicographic order of the nodes. More precisely, we set $\emptyset = u(0) < u(1) < \ldots < u(|t| - 1)$ the nodes of $t$ ranked in the lexicographic order. By convention, we set $u(|t|) = \emptyset$.
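The verbal description of the contour exploration can be turned into a few lines of code. The sketch below is illustrative only: it assumes a tree encoded as a nested list of children (an encoding of ours, with `[]` for a leaf) and returns the heights at integer times $0, 1, \ldots, 2(|t| - 1)$; the paper's convention extends the process up to time $2|t|$, which this sketch does not reproduce.

```python
def contour(tree, depth=0):
    """Heights visited at integer times by the contour exploration.

    `tree` is the list of the subtrees attached to the current node,
    in lexicographic (left to right) order.
    """
    heights = [depth]
    for child in tree:
        heights.extend(contour(child, depth + 1))  # walk up into the child
        heights.append(depth)                      # come back down to the node
    return heights


def size(tree):
    """Number of nodes of the tree, root included."""
    return 1 + sum(size(child) for child in tree)


# Hypothetical example: a root with two children, the first of which has two children.
t = [[[], []], []]
c = contour(t)
```

Each edge is crossed once upwards and once downwards, so the list has $2(|t| - 1) + 1$ entries; linear interpolation between them gives a continuous path.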
We set 0 = 0, |t|+1 = 2 and for k ∈ {1, . . ., We define for $k \in \{0, \ldots, |t| - 1\}$: For $u \in t$, we define $I_u$ the time interval during which the particle explores the edge attached below $u$. More precisely for $k \in \{1, \ldots, |t| - 1\}$, we set:

7.2. Elementary functionals of finite trees. Let $t \in T$ be a finite tree and $k \in \mathbb{N}^*$. For $u = (u_1, \ldots, u_k) \in t^k$, we define $m(u) = m(\{u_1, \ldots, u_k\})$ the most recent common ancestor of $u_1, \ldots, u_k$. We consider the following elementary functional of a tree, defined for $t \in T$: We have: v∈t which we obtain from the following equalities For $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$, denote by $(x_{(1)}, \ldots, x_{(k)})$ its order statistic which is uniquely defined by , with $\delta_z$ the Dirac mass at $z$. Recall notation $m_h(s, t)$, see (13), for the minimum of $h$ over the interval with bounds $s$ and $t$. We set: We have the following lemma.
Lemma 7.1. We have for $t \in T$ and $k \in \mathbb{N}^*$:

Proof. For $u = (u_1, \ldots, u_k) \in t^k$, we have the following generalization of (36): (Notice that $m_{C^t}(x_{(1)}, x_{(k)}) = d(\emptyset, m(u))$ as soon as $m(u) \prec u_i$ for all $i \in \{1, \ldots, k\}$.) We deduce that: By summing over $u \in t^k$, we get: Use the change of variable $2y = x$ to get (40).
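As a sanity check on the role played by most recent common ancestors, the sketch below verifies by brute force the counting identity $\sum_{v \in t} |t_v|^k = \sum_{u \in t^k} (d(\emptyset, m(u)) + 1)$. It holds because a node $v$ is an ancestor (in the wide sense) of all of $u_1, \ldots, u_k$ exactly when it is an ancestor of $m(u)$. The encoding of nodes as tuples of child indices and all function names are our own illustrative choices, and the normalisation of the functionals in (37) and (38) may differ.

```python
from itertools import product


def mrca(nodes):
    """Most recent common ancestor: longest common prefix of the node labels."""
    prefix = []
    for coords in zip(*nodes):
        if len(set(coords)) != 1:
            break
        prefix.append(coords[0])
    return tuple(prefix)


def subtree_size(tree, v):
    """Number of nodes of `tree` having v as an ancestor (v itself included)."""
    return sum(1 for w in tree if w[:len(v)] == v)


# Hypothetical example: root (), its children (0,) and (1,), and two children of (0,).
tree = [(), (0,), (1,), (0, 0), (0, 1)]
k = 3
# Left-hand side: sum over nodes v of |t_v|^k.
lhs = sum(subtree_size(tree, v) ** k for v in tree)
# Right-hand side: sum over k-tuples u of (depth of m(u)) + 1.
rhs = sum(len(mrca(u)) + 1 for u in product(tree, repeat=k))
```

The depth of a node is the length of its label, so `len(mrca(u))` is the graph distance $d(\emptyset, m(u))$.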
7.3. Convergence of contour processes. We assume that $p$ is a probability distribution on $\mathbb{N}$ such that $1 > p(1) + p(0) \geq p(0) > 0$ and which is critical (that is, $\sum_{k\in\mathbb{N}} k\,p(k) = 1$). We also assume that $p$ is in the domain of attraction of a symmetric stable distribution of Laplace exponent $\psi(\lambda) = \kappa\lambda^\gamma$ with $\gamma \in (1, 2]$ and $\kappa > 0$, and renormalizing sequence $(a_p, p \in \mathbb{N}^*)$ of positive reals: if $(U_k, k \in \mathbb{N}^*)$ are independent random variables with the same distribution $p$, and $W_p = \sum_{k=1}^{p} U_k - p$, then $W_p/a_p$ converges in distribution, as $p$ goes to infinity, towards a random variable $X$ with Laplace exponent $-\psi$ (that is, $E[e^{-\lambda X}] = e^{\psi(\lambda)}$ for $\lambda \geq 0$). Notice this convergence implies that:

(41) $\lim_{p\to+\infty} a_p/p = 0$.
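For orientation, here is how these hypotheses read in the finite-variance case. This worked example is ours and assumes in addition that $p$ has finite variance $\sigma^2 \in (0, +\infty)$.

```latex
% Worked example (assumption: p critical with finite variance \sigma^2).
By the central limit theorem, with $a_p = \sqrt{p}$,
\[
  \frac{W_p}{a_p} = \frac{1}{\sqrt{p}} \sum_{k=1}^{p} (U_k - 1)
  \xrightarrow[p \to +\infty]{(d)} X \sim \mathcal{N}(0, \sigma^2),
  \qquad
  E\bigl[e^{-\lambda X}\bigr] = e^{\sigma^2 \lambda^2 / 2}.
\]
This is the case $\gamma = 2$, $\kappa = \sigma^2 / 2$ of
$\psi(\lambda) = \kappa \lambda^{\gamma}$, and (41) holds since
$a_p / p = p^{-1/2} \to 0$.
```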
The main theorem in Duquesne [14] on the functional convergence in distribution of the contour process, stated when $p$ is aperiodic, can easily be extended to the case where $p$ is periodic. (Indeed, the aperiodicity hypothesis is mainly used in Lemma 4.5 in [14], which is based on Gnedenko's local limit theorem. Since the latter holds a fortiori for lattice distributions in the domain of attraction of a stable law, this allows one to extend the result to such periodic distributions, as soon as one uses sub-sequences on which the conditional probabilities are well defined.) It will be stated in this more general version, see Theorem 7.3 below. Since the contour process is continuous as well as its limit, the convergence in distribution holds on the space $C([0, 1])$ of real continuous functions endowed with the supremum norm.
The process $H$, see [14] for a construction of $H$, is the so-called normalized excursion of the height process, introduced in [39], of a Lévy tree with branching mechanism $\psi$.
where $\sigma_{r,s}(H)$ is the length of the excursion of the height process $H$ above $r$ straddling $s$ defined in (14) and where the convergence is taken along the infinite sub-sequence of $p$ such that $P(|\tau| = p) > 0$.
We deduce from their proofs, using the Skorohod representation theorem, that all the convergences in distribution of Corollary 7.5 hold simultaneously for all $k \in \mathbb{N}^*$.
Proof. Recall notation $m_h(s, t)$ and $\sigma_{r,s}(h)$ given in (13) and (14). We shall take limits along the infinite sub-sequence of $p$ such that $P(|\tau| = p) > 0$.
Recall definitions (37) of $D_k(t)$ and (39) of $\tilde D_k(t)$. Thanks to Lemma 7.1 and (41), which imply that $(p^{-(k+1)} a_p (D_k(\tau^{(p)}) - \tilde D_k(\tau^{(p)})), p \in \mathbb{N}^*)$ converges in probability towards 0, and to (38), we see that the proof of the corollary is complete as soon as we obtain that for all $k \in \mathbb{N}^*$: (42) lim We deduce from Theorem 7.3 the following convergence in law:

ds H(s).
This gives (42) for $k = 1$. Thanks to equality (60) with $a = k - 1$, we have that for $k \geq 2$ and $t \in T$: We deduce from Theorem 7.3 the following convergence in law for all $k \in \mathbb{N}^*$ such that $k \geq 2$: Then use (57) from Lemma 8.6 to get (42). This ends the proof.

and $k \in \mathbb{N}^*$, there exists a finite constant $C_{k,\beta}$ such that for all $n \in \mathbb{N}^*$, (43) (Notice that (43) is stated in [21] with $T^*_{n,v} = T_{n,v} \setminus L(T_{n,v})$ instead of $T_{n,v}$; but using that $|T_{n,v}| = 2|T^*_{n,v}| + 1$ it is elementary to get (43).) The following lemma, which plays a key role in the proofs of Lemmas 4.2 and 4.3, is a direct consequence of these upper bounds.

Lemma 8.1. For all $a \in [0, 1/2)$ and $f \in B([0, 1])$, we have for $k \in \mathbb{N}^*$:

Proof. Let $k \in \mathbb{N}^*$. Using (43), we have: which gives (44). Moreover, we also have: and we get (45).

8.2. A lemma for binomial random variables. We give a lemma used for the proof of Lemma 4.5.
Lemma 8.2. Let $X$ be a binomial random variable with parameter $(n, p) \in \mathbb{N}^* \times (0, 1)$. (i) For $a \in (0, 1]$, we have ]) be locally Lipschitz continuous and $b \in (0, 1)$. Then we have:

Proof. We prove (i). Let $a \in (0, 1]$. Let $X$ be a binomial random variable with parameter $(n, p)$. An elementary computation gives that: Using Jensen's inequality and (46), we get We prove (ii). Let $b \in (0, 1)$. We have f We decompose the right-hand side term into two parts: We shall use the following key inequality: for all $x, y > 0$ and $0 < b < 1$, we have: For the first term of the right-hand side of (48), using (49), we have p Hence, we get: For the second term of the right-hand side of (48), using (49) again, we get: Using (47), (48), (50) and (51), we get the expected result.
8.3. Some results on the Gamma function. We give here some results on the moments of Gamma random variables.
We directly deduce the following result.
Elementary computations on the branch length of T [n]. We keep notations from Section 4. Recall that the density of $(h_{n,v}, v \in T_n)$ is, conditionally on $T_n$, given by (23).
Recall that $L_n = \sum_{v\in T_n} h_{n,v}$ denotes the total length of T [n]. It is easy to deduce that the density of $L_n$, conditionally on $T_n$, is given by: In particular, the random variable $L_n$ is independent of $T_n$. The first two moments of $L_n$ are given by According to [26], we have that $(n + 1)^{s-1} \leq \Gamma(n + s)/\Gamma(n + 1) \leq n^{s-1}$ for $n \in \mathbb{N}^*$ and $s \in [0, 1]$. Hence, we obtain: Using that $L_n = \sum_{v\in T_n} h_{n,v}$ and that, conditionally on $T_n$, the random variables $(h_{n,v}, v \in T_n)$ are exchangeable, we deduce that $E[h_{n,\emptyset}] = E[L_n]/(2n + 1)$ and thus: We finish with a result on the covariance of the branch lengths, used in Lemma 4.3. We define The lemma is then a consequence of these equalities and (55).
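The Gamma-ratio bounds quoted from [26] above are easy to check numerically. The sketch below does so on a finite grid of values of $n$ and $s$; the function names, the grid, and the small relative tolerance (which absorbs rounding at the endpoints $s = 0$ and $s = 1$, where the bounds are attained) are our own choices.

```python
import math


def gamma_ratio(n, s):
    """Gamma(n + s) / Gamma(n + 1), computed through log-Gamma for stability."""
    return math.exp(math.lgamma(n + s) - math.lgamma(n + 1))


def bounds_hold(n, s, rel_tol=1e-9):
    """Check (n+1)^(s-1) <= Gamma(n+s)/Gamma(n+1) <= n^(s-1) up to rounding."""
    ratio = gamma_ratio(n, s)
    lower = (n + 1) ** (s - 1)
    upper = n ** (s - 1)
    return lower * (1 - rel_tol) <= ratio <= upper * (1 + rel_tol)


# Grid check: n = 1, ..., 200 and s = 0, 0.05, ..., 1.
checked = all(
    bounds_hold(n, j / 20)
    for n in range(1, 201)
    for j in range(21)
)
```

A grid check of this kind is of course no substitute for the proof in [26]; it only guards against transcription errors in the exponents.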
8.6. Proof of the first part of Lemma 3.10 (finiteness of $Z^H_\beta$ and (22)). We use the setting of [15] on Lévy trees. Let $H$ be the height function of a stable Lévy tree with branching mechanism $\psi(\lambda) = \kappa\lambda^\gamma$, with $\gamma \in (1, 2]$ and $\kappa > 0$. Let $N$ be the excursion measure of the height process and set $\sigma = \inf\{s > 0, H(s) = 0\}$ for the duration of the excursion so that: $N[1 - e^{-\lambda\sigma}] = \psi^{-1}(\lambda)$ for all $\lambda > 0$. Let $N^{(a)}[\bullet] = N[\bullet \mid \sigma = a]$ be the distribution of the excursion of the height process with duration $a$. In particular, we shall prove the result of Lemma 3.10 under $N^{(1)}$. We recall that:

In this proof only, we shall write $m$ for $m_H$ defined by (13). We extend the definitions (14) and (15) for $\beta > 0$.
associated marked tree. For $1 \leq k \leq n + 1$, we denote by $u(U_k)$ the leaf in $T_n$ corresponding to the leaf $p(U_k)$ in T [n]. See Figure 1 for an example with $n = 4$. It is well known that $T_n$ is uniform among the discrete full binary ordered trees with $n$ internal nodes.
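For small $n$, the set of full binary ordered trees with $n$ internal nodes can be listed explicitly. The sketch below enumerates it recursively and records two standard facts: there are Catalan-many such trees, and each has $n + 1$ leaves and $2n + 1$ nodes. The encoding (a leaf is `None`, an internal node is a pair of subtrees) and the function names are ours.

```python
from math import comb


def full_binary_trees(n):
    """All full binary ordered trees with n internal nodes.

    A leaf is encoded as None and an internal node as a pair (left, right).
    """
    if n == 0:
        return [None]
    trees = []
    for i in range(n):  # i internal nodes go to the left subtree
        for left in full_binary_trees(i):
            for right in full_binary_trees(n - 1 - i):
                trees.append((left, right))
    return trees


def count_nodes(tree):
    """Total number of nodes, leaves included."""
    return 1 if tree is None else 1 + count_nodes(tree[0]) + count_nodes(tree[1])


def count_leaves(tree):
    """Number of leaves."""
    return 1 if tree is None else count_leaves(tree[0]) + count_leaves(tree[1])


n = 4
trees = full_binary_trees(n)
catalan = comb(2 * n, n) // (n + 1)  # n-th Catalan number
```

Picking an element of `trees` uniformly at random gives, for this small $n$, a sample from the uniform distribution described above; the enumeration grows exponentially and is only meant for illustration.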
be the set of leaves of $T_n$ with ancestor $v$, and $|L_{n,v}|$ be its cardinality. Notice that the number of leaves of $T_{n,v}$ is exactly $|L_{n,v}|$. We now approximate the multiplying factor $|T_{n,v}|$ in $A_{2,n}$ by twice the number of leaves in $T_{n,v}$, as $2|L_{n,v}| = |T_{n,v}| + 1$. For this reason, we set for $f \in B([0, 1])$:

Corollary 7.5. Under the hypothesis and notations of Theorem 7.3, we have the following convergences in distribution for all $k \in \mathbb{N}^*$: $\lim_{p\to+\infty} \frac{a_p}{p^{k+1}} \sum_{v\in\tau^{(p)}}$