Vertices with fixed outdegrees in large Galton-Watson trees

We are interested in nodes with fixed outdegrees in large conditioned Galton-Watson trees. We first study the scaling limits of processes coding the evolution of the number of such nodes in different explorations of the tree (lexicographical order and contour order) starting from the root. We give necessary and sufficient conditions for the limiting processes to be centered, thus measuring the linearity defect of the evolution of the number of nodes with fixed outdegrees. This extends results by Labarbe & Marckert in the case of the contour-ordered counting process of leaves in uniform plane trees. Then, we extend results obtained by Janson concerning the asymptotic normality of the number of nodes with fixed outdegrees.


Introduction
Much attention has been recently given to the fine structure of large random trees. In this paper, we focus particularly on the distribution of vertex degrees in large conditioned Galton-Watson trees, and on how they are spread out in these trees.

Motivations
The study of scaling limits of Galton-Watson trees (in short, GW trees) with critical offspring distribution (that is with mean 1) conditioned by their number of vertices has been initiated by Aldous [4,5,6]. Aldous showed that the scaling limit of large critical GW trees with finite variance is the so-called Brownian continuum random tree (CRT). As a side result, he proved the convergence of their properly rescaled contour functions, which code the trees, to the Brownian excursion. This result was extended by Duquesne, who showed that the scaling limits of critical GW trees, when the offspring distribution has infinite variance and is in the domain of attraction of a stable law, are α-stable trees (with α ∈ (1, 2]), which were introduced by Le Gall & Le Jan [26] and Duquesne & Le Gall [12]. From a more discrete point of view, Abraham and Delmas [2,1] extended the work of Kesten [19] and Janson [16] by describing in full generality the local limits of critical GW trees conditioned to have a fixed large number of vertices.
The number of vertices with a fixed outdegree in large conditioned critical GW trees with finite variance was studied by Kolchin [20], who showed that it is asymptotically normal. This topic has recently attracted renewed interest. Minami [30] established that these convergences hold jointly under an additional moment condition, which was later lifted by Janson [17]. Rizzolo [34] considered more generally GW trees conditioned on having a given number of vertices with outdegree in a given set. One of the motivations for studying these quantities is that there is a variety of random combinatorial models coded by GW trees in which vertex degrees represent a quantity of interest. For example, in [3], vertex degrees code the sizes of 2-connected blocks in random maps and, in [22], vertex degrees code the sizes of faces in dissections. Also, Labarbe & Marckert [24] studied the evolution of the number of leaves in the contour process of a large uniform plane tree.

Evolution of vertices with fixed outdegrees
Our first contribution concerns scaling limits of processes coding the evolution of vertices with fixed outdegrees in different explorations of large GW trees starting from the root. We shall explore the tree in two ways by using either the contour process (which was considered by Labarbe & Marckert [24]), or the lexicographical order.
In order to state our result, we need to introduce some background and notation (see Section 2 for formal definitions). An offspring distribution µ, which is a probability distribution on Z_+, is said to be critical if it has mean 1. To simplify notation, we set µ_i = µ(i) for i ≥ 0. If T is a plane tree and A ⊂ Z_+, we say that a vertex of T is an A-vertex if its outdegree (that is, its number of children) belongs to A. We define N_A(T) as the number of A-vertices in T, and we set µ_A = Σ_{i∈A} µ_i to simplify notation. We say that T is a µ-GW tree if it is a GW tree with offspring distribution µ. For the sake of simplicity, we will always implicitly assume that the support of the offspring distribution µ is non-lattice (a subset A ⊂ Z is lattice if there exist b ∈ Z and d ≥ 2 such that A ⊂ b + dZ), so that a µ-GW tree conditioned on having n vertices is well defined for every n sufficiently large (but all the results carry through to the lattice setting with mild modifications).
For n ≥ 1, we denote by T n a µ-GW tree conditioned to have n vertices.
Let T be a plane tree with n vertices. To define the contour function (C_t(T), 0 ≤ t ≤ 2n) of T, imagine a particle that explores the tree from left to right, starting from the root and moving at unit speed along the edges. For 0 ≤ t ≤ 2(n − 1), C_t(T) is defined as the distance to the root of the position of the particle at time t, and we set C_t(T) = 0 for t ∈ [2(n − 1), 2n] (see Fig. 1 for an example). For every 0 ≤ t ≤ 1, let N^A_{⌊2nt⌋}(T) be the number of distinct A-vertices visited by C(T) up to time ⌊2nt⌋. In particular, N^A_{2n}(T) = N_A(T).
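To make the contour exploration concrete, here is a small Python sketch (not from the paper; the nested-list tree encoding is our own convention) that computes the heights C_0, . . . , C_{2(n−1)} visited by the contour particle at integer times.

```python
# A minimal sketch of the contour function of a plane tree given as nested
# lists of children: [] is a leaf, [c1, c2] a vertex with two subtrees.

def contour(tree):
    """Return the list of heights [C_0, ..., C_{2(n-1)}] of the contour."""
    heights = []

    def visit(node, h):
        heights.append(h)          # arrive at `node` at height h
        for child in node:
            visit(child, h + 1)    # go down one edge into the subtree ...
            heights.append(h)      # ... and come back up to `node`
    visit(tree, 0)
    return heights

# Example: root with two children, the first of which has one child (n = 4).
t = [[[]], []]
print(contour(t))  # [0, 1, 2, 1, 0, 1, 0]
```

A tree with n vertices yields 2(n − 1) + 1 heights, matching the 2(n − 1) unit-speed edge traversals of the definition.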
When µ is the geometric distribution of parameter 1/2 (so that T_n is uniformly distributed on the set of all plane trees with n vertices) and A = {0}, Labarbe & Marckert showed that the rescaled contour function and the recentered leaf-counting process converge jointly in distribution in C([0, 1], R²), where e is the normalized Brownian excursion, B is a Brownian motion independent of e and C([0, 1], R²) is the space of continuous R²-valued functions on [0, 1] equipped with the uniform topology.
In words, the counting process N^{{0}}(T_n) behaves linearly at the first order, and has centered Brownian fluctuations. Let us now introduce the notation we shall use. If T is a plane tree with n vertices, we denote by (v_i(T))_{0≤i≤n−1} the vertices of T ordered in the lexicographical order (also known as the depth-first order). The Lukasiewicz path (W_i(T))_{0≤i≤n} of T is defined by W_0(T) = 0 and W_{i+1}(T) = W_i(T) + k_{v_i(T)}(T) − 1 for 0 ≤ i ≤ n − 1 (see Fig. 1 for an example). For t ∈ [0, n], we set W_t(T) = W_{⌊t⌋}(T). For t ∈ [0, 1], we define K^A_{⌊nt⌋}(T) as the number of A-vertices visited by W(T) up to time ⌊nt⌋ (in other words, K^A_{⌊nt⌋}(T) is the number of A-vertices among the first ⌊nt⌋ vertices of T in the lexicographical order). In the next result, convergences hold in distribution in the space D([0, 1], R²) of càdlàg processes on [0, 1] equipped with the Skorokhod J1 topology (for technical reasons it is simpler to work with càdlàg processes; see [15, Chap. VI] for background).

Theorem 1.1. Let µ be a critical distribution with finite variance σ² > 0 and let T_n be a µ-GW tree conditioned to have exactly n vertices. Let A ⊂ Z_+ be such that µ_A > 0. Then the following assertions hold:

(i) The following convergence holds in distribution, where B is a standard Brownian motion independent of e (see Fig. 2 for a simulation; in the figure, the second process evolves asymptotically as half of the first one plus an independent Brownian motion).

(ii) The following convergence holds in distribution, jointly with that of (i).

As was previously mentioned, assertion (ii) of Theorem 1.1, in the particular case where A = {0} and µ is the geometric offspring distribution of parameter 1/2, was proved by Labarbe & Marckert [24]. It turns out that for leaves, the fluctuations of the counting process N^{{0}}(T_n) are always centered, irrespective of the offspring distribution. However, the fluctuations are different when one considers other outdegrees, or the lexicographical order instead of the contour visit counting process.

Let us briefly comment on the strategy of the proof of Theorem 1.1, which is different from the approach of Labarbe & Marckert (who rely on explicit formulas for the number of paths with ±1 steps and various constraints). We start by working with the Lukasiewicz path and establish Theorem 1.1 (i) by combining a general formula giving the joint distribution of outdegrees in GW trees in terms of random walks (Section 3) with absolute continuity arguments and the Vervaat transform. Theorem 1.1 (ii) is then a rather direct consequence of (i), obtained by relating the contour exploration to the depth-first exploration (see in particular Lemma 4.3).

EJP 25 (2020), paper 64.
In Section 4.3, we extend Theorem 1.1 (ii) when we only take into account the k-th time we visit a vertex with outdegree i (with k, i integers such that 1 ≤ k ≤ i + 1). To this end, we give a description of the structure of branches in the tree using binomial-tail inequalities, which could be of independent interest.
Finally, an extension of this theorem to offspring distributions with infinite variance can be found in Section 6.
Asymptotic normality of the number of vertices with fixed outdegree

Our next contribution is to extend the joint asymptotic normality of the number of vertices with a fixed outdegree in large conditioned critical GW trees obtained by Janson [17], by counting vertices whose outdegree belongs to a fixed subset of Z_+ and by allowing a more general conditioning. Indeed, we shall focus on µ-GW trees conditioned to have n B-vertices, for a fixed B ⊂ Z_+ (we shall always implicitly restrict ourselves to values of n such that this conditioning makes sense).

Theorem 1.2. Let µ be a critical offspring distribution with positive finite variance and let A, B be subsets of Z_+ such that µ_B > 0. For n ≥ 1, let T^B_n be a µ-GW tree conditioned to have n B-vertices. Then:

(i) the expectation of N_A(T^B_n) grows linearly, at speed µ_A/µ_B;

(ii) the recentered and rescaled quantity in (1.1) converges in distribution to a centered Gaussian random variable with variance δ²_{A,B}; in addition, δ_{A,B} = 0 if and only if µ_A = 0 or µ_{A\B} = µ_{B\A} = 0;

(iii) the convergences (1.1) hold jointly for A ⊂ Z_+, in the sense that for every j ≥ 1 the corresponding limit is a Gaussian vector.
As previously mentioned, this extends results of Kolchin [20], Minami [30] and Janson [17]. The main idea is, roughly speaking, to combine the general formula of Section 3 giving the joint distribution of outdegrees in GW trees in terms of random walks (which was already used in the proof of Theorem 1.1) with various local limit estimates (Section 5). As we will see (cf. (5.2)), in the case A = Z_+ we have δ²_{A,B} = γ²_B/µ³_B (with γ_B defined as in Theorem 1.1, replacing A by B). Also, the proof of Theorem 1.2 (ii) gives a way to compute δ_{A,B} explicitly (see Example 5.6 for the explicit values of the variances and covariances in the cases B = Z_+ and B = {a} for some a ∈ Z_+). See Section 6 for discussions concerning other offspring distributions.
Our approach, based on a multivariate local limit theorem, applies more generally when µ is in the domain of attraction of a stable law. In this case, it allows us to prove the convergence of T B n (properly renormalized) towards a Lévy tree, thus generalizing [21, Theorem 8.1] which was stated only under the condition that B or Z + \B is finite.
These new results can be found in Section 6.

Background on trees and their codings
We start by recalling some definitions and useful well-known results concerning Galton-Watson trees and their coding by random walks (we refer to [25] for details and proofs).

Plane trees
We first define plane trees using Neveu's formalism [31]. First, let N^* = {1, 2, . . .} be the set of all positive integers, and let U = ∪_{n≥0} (N^*)^n be the set of finite sequences of positive integers, with (N^*)^0 = {∅} by convention. By a slight abuse of notation, for k ∈ Z_+, we write an element u of (N^*)^k as u = u_1 · · · u_k, with u_1, . . . , u_k ∈ N^*. For k ∈ Z_+, u = u_1 · · · u_k ∈ (N^*)^k and i ∈ Z_+, we denote by ui the element u_1 · · · u_k i ∈ (N^*)^{k+1} and by iu the element i u_1 · · · u_k ∈ (N^*)^{k+1}. A tree T is a subset of U satisfying the following three conditions: (i) ∅ ∈ T (the tree has a root); (ii) if u = u_1 · · · u_n ∈ T, then u_1 · · · u_k ∈ T for all k ≤ n (these elements are called ancestors of u); (iii) for every u ∈ T, there exists a nonnegative integer k_u(T) such that, for every i ∈ N^*, ui ∈ T if and only if 1 ≤ i ≤ k_u(T) (k_u(T) is called the number of children, or the outdegree, of u). The elements of T are called the vertices of T. The set of all ancestors of a vertex u is called the ancestral line of u, by analogy with genealogical trees. We denote by |T| the total number of vertices of T.
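The three conditions of Neveu's formalism can be checked mechanically. The following Python sketch (the helpers `is_tree` and `outdegree` are hypothetical, not from the paper) encodes a vertex as a tuple of positive integers, with the root as the empty tuple.

```python
# Plane trees a la Neveu: a vertex is a tuple of positive integers,
# the root is the empty tuple ().

def outdegree(T, u):
    """k_u(T): number of children of u, i.e. the largest k with u k in T."""
    k = 0
    while u + (k + 1,) in T:
        k += 1
    return k

def is_tree(T):
    """Check Neveu's conditions (i)-(iii) for a finite set T of tuples."""
    if () not in T:
        return False                                   # (i) root present
    for u in T:
        if any(u[:k] not in T for k in range(len(u))):
            return False                               # (ii) ancestors present
    # (iii) children of u are exactly u1, ..., u k_u(T): rebuild T from outdegrees
    children = {u + (i,) for u in T for i in range(1, outdegree(T, u) + 1)}
    return T == children | {()}

# Root with two children; the first child has one child.
T = {(), (1,), (2,), (1, 1)}
print(is_tree(T), outdegree(T, ()))   # True 2
```

For instance, {(), (2,)} fails condition (iii) because vertex (2,) exists while (1,) does not.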
The lexicographical order ≺ on U is defined as follows: ∅ ≺ u for all u ∈ U \ {∅}, and, for u, w ≠ ∅ with u = u_1 u' and w = w_1 w' where u_1, w_1 ∈ N^*, we write u ≺ w if and only if u_1 < w_1, or u_1 = w_1 and u' ≺ w'. The lexicographical order on the vertices of a tree T is the restriction of the lexicographical order on U; for every 0 ≤ k ≤ |T| − 1 we write v_k(T), or v_k when there is no confusion, for the (k + 1)-th vertex of T in the lexicographical order. Recall from the introduction the definition of the Lukasiewicz path (W_i(T))_{0≤i≤n} of a tree T with n vertices.

Galton-Watson trees

Let µ be an offspring distribution with mean at most 1 such that µ(0) + µ(1) < 1 (implicitly, we always make this assumption to avoid degenerate cases). A GW tree T with offspring distribution µ (also called a µ-GW tree) is a random variable taking values in the space of all finite plane trees, characterized by the fact that P(T = T) = Π_{u∈T} µ_{k_u(T)} for every finite plane tree T. We also always implicitly assume that gcd{i ∈ Z_+ : µ_i > 0} = 1, so that P(|T| = n) > 0 for every n sufficiently large (µ is then said to be aperiodic). All the results can be adapted to the periodic setting with mild modifications.
A key tool in the study of GW trees is the fact that their Lukasiewicz path is, roughly speaking, a killed random walk, which allows one to obtain information on GW trees from the study of random walks. More precisely, let S be the random walk started at S_0 = 0 whose jumps are i.i.d. with distribution given by P(S_1 = i) = µ_{i+1} for i ≥ −1, so that the jumps take values in {−1, 0, 1, . . .} (we keep the dependence of S on µ implicit). The proof of the following lemma can be found in [25].

Lemma 2.1. Let µ be an offspring distribution with mean at most 1 and let T_n be a µ-GW tree conditioned on having n vertices. Then (W_i(T_n))_{0≤i≤n} has the same distribution as (S_i)_{0≤i≤n} conditionally given the event {S_n = −1 and S_i ≥ 0 for all 0 ≤ i ≤ n − 1}.
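Lemma 2.1 suggests a naive but instructive way to sample the Lukasiewicz path of T_n: sample the unconditioned walk and reject until the event {S_n = −1, S_i ≥ 0 for i < n} occurs. A hedged Python sketch, for the critical Geometric(1/2) offspring law µ_i = 2^{−(i+1)} (rejection sampling is only practical for small n):

```python
import random

# Sketch of Lemma 2.1: the Lukasiewicz path of a conditioned GW tree is a
# random walk with jump law mu(.+1), conditioned to first hit -1 at time n.

def offspring(rng):
    """One sample from the Geometric(1/2) law mu_k = 2^{-(k+1)} on {0,1,...}."""
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k

def lukasiewicz_path(n, rng):
    """Path (S_0, ..., S_n) of a mu-GW tree conditioned on n vertices,
    sampled by rejection."""
    while True:
        path = [0]
        for _ in range(n):
            path.append(path[-1] + offspring(rng) - 1)   # jump X - 1, X ~ mu
        if path[n] == -1 and min(path[:n]) >= 0:         # conditions of Lemma 2.1
            return path

rng = random.Random(2020)
print(lukasiewicz_path(8, rng))   # stays >= 0 before time 8, ends at -1
```

The rejection probability degrades polynomially in n (the walk must end exactly at −1 and stay nonnegative), which is why the paper works with local limit estimates instead.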

Several useful ingredients
We finally gather two very useful ingredients. The first one is the joint convergence in distribution, after rescaling, of the contour process (which was defined in the introduction) and of the Lukasiewicz path of a critical GW tree with finite variance, conditioned to have n vertices, towards the same Brownian excursion.

Theorem 2.2 (Marckert and Mokkadem [29], Duquesne [10]). Let µ be a critical offspring distribution with finite positive variance σ². Then the following convergence holds jointly in distribution:

where e has the law of the normalized Brownian excursion.
This result is due to Marckert and Mokkadem [29] under the assumption that µ has a finite exponential moment; the general case can be deduced from Duquesne [10]. The second ingredient is a local limit theorem (Theorem 2.3). When Supp(S_1) is non-lattice, observe that one can take b = 0 and h = 1 in that result. This theorem admits the following generalization in the multivariate setting (see e.g. [35, Theorem 6.1]). In the multivariate case in dimension j ≥ 1, we say that a random variable Y ∈ Z^j is aperiodic if there is no strict sublattice of Z^j containing the support of Y. Furthermore, S_j denotes the set of symmetric positive definite matrices of dimension j.

Theorem 2.4. Let (Y_i)_{i≥1} be i.i.d. random variables in Z^j such that the covariance matrix Σ of Y_1 is positive definite. Assume in addition that Y_1 is aperiodic, and denote by M the mean of Y_1. Finally, define T_n := Y_1 + · · · + Y_n for n ≥ 1. Then, as n → ∞, uniformly for x ∈ R^j such that P(T_n = x) > 0:

This theorem can easily be adapted when Y_1 is not aperiodic. However, for convenience, we shall restrict ourselves to this case in what follows.

Joint distribution of outdegrees in GW trees
The first steps of the proofs of Theorems 1.1 and 1.2 both reformulate events on trees in terms of events on random walks, whose probabilities are easier to estimate. In this direction, in this section, we give a general formula for the joint distribution of outdegrees in GW trees in terms of random walks (Proposition 3.1) and establish technical estimates (Lemma 3.4) which will be later used several times.

A joint distribution
The following proposition is a key tool in the study of the outdegrees of a µ-GW tree T, as it allows one to study the joint distribution of (N_{Z_+}(T), N_B(T)).

Proposition 3.1. Let (S_i)_{i≥0} be a random walk starting from 0, whose jumps are independent and distributed according to µ(· + 1), and let (J^B_i)_{i≥0} be the walk starting from 0 which counts the jumps of S belonging to B − 1 (that is, J^B_i is the number of indices 1 ≤ j ≤ i such that S_j − S_{j−1} + 1 ∈ B). Then, for every n ≥ 1 and k ≥ 0:

To see this, notice that [21, Equation (2)] provides exactly the result in the case B = {0}, and that the same argument works for a general subset B.
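Our reading of the (partly garbled) definition of (J^B_i), stated here as an assumption, is that it counts among the first i steps of S those whose offspring value x (jump = x − 1) lies in B, so that J^B_i has the Binomial(i, µ_B) law used later in Section 5. Under that assumption, a quick simulation sketch:

```python
import random

# Illustrative pair (S_i, J^B_i): S has jump law mu(.+1) and J^B counts jumps
# with offspring value in B.  Offspring law here: critical Geometric(1/2).

def walk_pair(n, B, rng):
    """Return the paths (S_i)_{0<=i<=n} and (J^B_i)_{0<=i<=n}."""
    S, J = [0], [0]
    for _ in range(n):
        x = 0
        while rng.random() < 0.5:       # x ~ Geometric(1/2) on {0, 1, ...}
            x += 1
        S.append(S[-1] + x - 1)         # jump x - 1 >= -1
        J.append(J[-1] + (x in B))      # count jumps with offspring value in B
    return S, J

# Sanity check: E[J^B_n] should be close to n * mu_B; here mu_{{0}} = 1/2.
rng = random.Random(0)
n, trials = 200, 300
avg = sum(walk_pair(n, {0}, rng)[1][-1] for _ in range(trials)) / trials
print(avg / n)   # should be close to 0.5
```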
The following asymptotics, which can be derived from a local limit theorem (see e.g. [34] or [21, Theorem 8.1]), will be useful throughout the paper:

A technical estimate
We introduce other probability measures as follows: for C ⊂ Z_+ with µ_C > 0, let p_C be the probability measure defined by p_C(i) = (µ_{i+1}/µ_C) 1_{i+1∈C}. We let m_C be the expectation of p_C and σ²_C be its variance. The following identities will be useful.

Lemma 3.3. Assume that µ is critical and has finite positive variance σ². Then the following identities hold:

In particular, observe that γ_B is well-defined by (iii). Furthermore, if #Supp(µ) ≥ 3, then at least one of the variances σ²_B and σ²_{B^c} is positive, which implies by (iii) that γ_B > 0.
Proof. For (i), simply write out the quantity, which is equal to 0 since µ is critical. The second assertion is clear, while the proof of the last one is similar to that of the first one and is left to the reader.

Let us keep the notation of Proposition 3.1; in particular, recall the walk (S, J^B). The following estimate will play an important role.

Lemma 3.4. Let µ be an aperiodic critical offspring distribution with positive finite variance σ² such that #Supp(µ) ≥ 3, and let B ⊂ Z_+ be such that µ_B > 0 and µ_{B^c} > 0. Assume in addition that p_B or p_{B^c} is aperiodic. Fix a ∈ R and let (a_n) be a sequence of integers such that a_n/√n → a as n → ∞. Then the following assertions hold as n → ∞, uniformly for c in a compact subset of R:

Observe that (ii) is a straightforward consequence of (i) and Proposition 3.1, and (i) itself follows from the multivariate local limit theorem (Theorem 2.4).

Proof of Lemma 3.4 (i). The idea is to apply Theorem 2.4 to a sequence of i.i.d. variables; the relevant variable Y_1 is aperiodic as well. Furthermore, the mean and the covariance matrix of Y_1 are respectively equal to: where σ² is the variance of µ. In particular, det Σ = σ²γ²_B > 0. On the other hand, as µ is non-lattice, for n large enough, uniformly for c in a compact subset of R, P(S_n = a_n, J^B_n = k_n(c)) > 0. An easy computation, with the help of Lemma 3.3 (ii), gives the desired result.

Evolution of outdegrees in an exploration of a Galton-Watson tree
The aim of this section is to establish Theorem 1.1. Recall from the introduction that if T is a tree and A ⊂ Z_+, C(T) denotes the contour function of T and, for 0 ≤ t ≤ 1, N^A_{⌊2nt⌋}(T) denotes the number of distinct A-vertices visited by C(T) up to time ⌊2nt⌋, while K^A_{⌊nt⌋}(T) denotes the number of A-vertices among the first ⌊nt⌋ vertices of T in the depth-first search (or, equivalently, lexicographical) order.
We assume here that A ⊂ Z_+ is such that µ_A > 0. We keep the notation of Section 3.1, and denote in particular by m_A the expectation of a random variable with law given by p_A(i) = (µ_{i+1}/µ_A) 1_{i+1∈A} for i ∈ Z.
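For concreteness, the quantities µ_A, m_A and σ²_A can be computed numerically. A small sketch for the critical Geometric(1/2) law (the truncation parameter `imax` is ours):

```python
# mu_i = 2^{-(i+1)} (critical Geometric(1/2)); p_A(i) = mu_{i+1}/mu_A for i+1 in A.

def mu(i):
    return 0.5 ** (i + 1)

def stats_A(A, imax=200):
    """Return (mu_A, m_A, sigma2_A): the mass of A, and the mean and variance
    of p_A, truncating the support of mu at imax."""
    As = [a for a in A if 0 <= a <= imax]
    mu_A = sum(mu(a) for a in As)
    m_A = sum((a - 1) * mu(a) for a in As) / mu_A       # E[i] with i = a - 1
    s2_A = sum((a - 1 - m_A) ** 2 * mu(a) for a in As) / mu_A
    return mu_A, m_A, s2_A

print(stats_A({0}))          # (0.5, -1.0, 0.0): leaves give a deterministic jump -1
print(stats_A(range(201)))   # mass ~ 1, mean ~ 0 (criticality), variance ~ 2
```

Note that m_{Z_+} = 0 is exactly the criticality of µ, consistent with Lemma 3.3 (i).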

Depth-first exploration
In this section, we study the evolution of the number of A-vertices in conditioned GW trees for the depth-first search, and establish in particular Theorem 1.1 (i). Throughout this section, we fix a critical distribution µ with finite positive variance σ 2 , and we let T n denote a µ-GW tree conditioned on having n vertices.
The idea of the proof of Theorem 1.1 (i) is the following. By Lemma 2.1, the convergence of Theorem 1.1 (i) can be restated in terms of the random walk (S_i)_{0≤i≤n} (with jump distribution given by P(S_1 = i) = µ_{i+1} for i ≥ −1). We first establish a result for the "bridge" version, where one works conditionally given the event {S_n = −1} (Lemma 4.1), and then conclude by using the so-called Vervaat transform.
To simplify notation, for every t ≥ 0, we set S_t = S_{⌊t⌋} and J^A_t = J^A_{⌊t⌋}.

Lemma 4.1. The following convergence holds in distribution:

where B^br is a standard Brownian bridge and B is a standard Brownian motion independent of B^br.
Proof. We first check that the corresponding unconditioned statement holds, namely that the following convergence holds in distribution: where B is a standard Brownian motion and B' a standard Brownian motion independent of B. To this end, by [18, Theorem 16.14], it is enough to check that the one-dimensional convergence holds for t = 1. By Lemma 3.4 (i), uniformly for a, b in a compact subset of R: It is standard (see e.g. [8, Theorem 7.8]) that this implies the convergence in distribution of (S_n/√n, (J^A_n − nµ_A)/√n). We now establish (4.1) by using an absolute continuity argument. We fix u ∈ (0, 1), a bounded continuous functional F : D([0, u], R²) → R, and to simplify notation we set: Then, setting φ_n(i) = P(S_n = i), we have: An application of the local limit theorem (Theorem 2.3) allows us to write, as n → ∞: where q_t denotes the density at time t of a centered Brownian motion with variance σ². Therefore, by (4.2), as n → ∞: where the last identity follows from standard absolute continuity properties of the Brownian bridge (see e.g. [33, Chapter XII]).
The convergence (4.3) shows in particular that, conditionally given S_n = −1, the process converges on [0, u] for every u < 1; it remains to show that the process is tight on [0, 1] conditionally given S_n = −1. To this end, notice that by time-reversal the process (Ŝ_i, Ĵ_i)_{0≤i≤n} := (S_n − S_{n−i}, J^A_n − J^A_{n−i})_{0≤i≤n} has the same distribution as (S_i, J^A_i)_{0≤i≤n} (and this also holds conditionally given S_n = −1). Now, by Lemma 3.4 (i) and the local limit theorem, uniformly for b in a compact subset of R, one deduces that the process is tight on [u, 1] conditionally given S_n = −1. This allows us to conclude that this process is actually tight on [0, 1], and in addition this identifies the convergence of the finite-dimensional marginal distributions. In order to deduce Theorem 1.1 (i) from the bridge version of Lemma 4.1, we now use the Vervaat transformation, whose definition is recalled here.
We shall also need the notation g_1(ω) := inf{t ∈ [0, 1] : ω(t) = inf_{[0,1]} ω} for the first time at which ω attains its infimum on [0, 1]. The shifted function ω^{(g_1(ω))} is usually called the Vervaat transform of ω.

Lemma 4.2. Set τ := g_1(B^br). Then the pair (B^{br,(τ)}, B^{(τ)}) has the same distribution as (e, B), where B is a standard Brownian motion independent of e.

Proof. Since B and B^br are independent, it readily follows that B^{(τ)} has the law of a standard Brownian motion and is independent of (τ, B^br), and therefore is independent of B^{br,(τ)}. On the other hand, B^{br,(τ)} has the law of e (see e.g. [36]). The result follows.
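In the discrete setting, the Vervaat transform simply re-reads the increments of a bridge cyclically from the first time its minimum is attained. A small sketch (hypothetical helper, not from the paper):

```python
# Discrete Vervaat transform: cyclically shift the increments of a bridge
# path at the first time of its minimum.

def vervaat(path):
    """Given (S_0, ..., S_n) with S_n = -1, return the shifted excursion path."""
    n = len(path) - 1
    incs = [path[i + 1] - path[i] for i in range(n)]
    tau = min(range(n + 1), key=lambda i: (path[i], i))  # first argmin of the path
    out = [0]
    for d in incs[tau:] + incs[:tau]:                    # read increments cyclically
        out.append(out[-1] + d)
    return out

bridge = [0, 1, 0, -1, 0, -1]
print(vervaat(bridge))   # [0, 1, 0, 1, 0, -1]: nonnegative before the final step
```

Applied to the Lukasiewicz bridge of Lemma 4.1, this produces a path that stays nonnegative before time n and first hits −1 at time n, i.e. the Lukasiewicz path of a tree.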
Proof of Theorem 1.1 (i). We keep the notation of Lemma 4.2, and we let (S^{br,n}, J^n) = (S^{br,n}_{nt}, J^n_{nt})_{0≤t≤1} be a random variable distributed as (S_{nt}, J^A_{nt} − ntµ_A)_{0≤t≤1} conditionally given S_n = −1. We set τ_n = g_1(S^{br,n}). It is well known (see e.g. [36]) that S^{br,n,(τ_n)} has the same distribution as (W_{nt}(T_n))_{0≤t≤1}, so that we may consider the pair (S^{br,n,(τ_n)}, J^{n,(τ_n)}). Since B^br and B are almost surely continuous at τ, by Lemma 4.1 and standard continuity properties of the Vervaat transform, this rescaled pair converges in distribution. By Lemma 4.2, the limiting process has the same distribution as the one appearing in Theorem 1.1 (i) (whose first coordinate is σe_t and whose drift involves Σ_{i∈A} (i − 1)µ_i), and this completes the proof.

Contour exploration
We are now interested in the evolution of the number of A-vertices in conditioned GW trees for the contour visit counting process, and establish in particular Theorem 1.1 (ii). The idea of the proof is to obtain a relation between the counting process N A for the contour process and the counting process K A for the depth-first search order.
In this direction, if T is a tree with n vertices, for every 0 ≤ k ≤ 2n − 2, we denote by b_k(T) the number of distinct vertices visited by the contour process C(T) up to time k. We set b_k(T) = b_{2n−2}(T) for k ≥ 2n − 2, and b_t(T) = b_{⌊t⌋}(T) for t ≥ 0. It turns out that the following simple deterministic relation holds between b(T) and C(T).

Lemma 4.3. For every 0 ≤ k ≤ 2n − 2, we have b_k(T) = (k + C_k(T))/2 + 1.

Proof. We show that the result holds for k = 0, and that if it holds at some time 0 ≤ k ≤ 2n − 3 then it holds at time k + 1. For 0 ≤ k ≤ 2n − 2, let u_k be the vertex visited by the contour process at time k. First, at time k = 0, the root is the only vertex visited and b_0(T) = 1 = (0 + C_0(T))/2 + 1. Now assume that the result holds up to some time 0 ≤ k ≤ 2n − 3. The vertex u_{k+1} is visited for the first time at time k + 1 if and only if the contour process goes up between u_k and u_{k+1}, that is, if and only if C_{k+1}(T) = C_k(T) + 1; in that case both sides of the identity increase by 1, and otherwise both sides are unchanged. In both cases, the formula is also valid at time k + 1.
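The identity in the lemma above (which, following the inductive proof, reads b_k(T) = (k + C_k(T))/2 + 1) can be checked mechanically. A Python sketch using a nested-list tree encoding (our own convention, not the paper's):

```python
# Verify b_k(T) = (k + C_k(T)) / 2 + 1 on a hand-built plane tree, where
# b_k is the number of distinct vertices visited by the contour up to time k.

def contour_steps(tree):
    """Return the heights C_0, ..., C_{2(n-1)} and, in parallel, the ids of
    the vertices visited at each step of the contour exploration."""
    heights, visited, counter = [], [], [0]

    def visit(node, h):
        my_id = counter[0]; counter[0] += 1
        heights.append(h); visited.append(my_id)
        for child in node:
            visit(child, h + 1)
            heights.append(h); visited.append(my_id)
    visit(tree, 0)
    return heights, visited

t = [[[], []], []]                       # 5 vertices
C, vis = contour_steps(t)
for k in range(len(C)):
    b_k = len(set(vis[:k + 1]))          # distinct vertices visited up to time k
    assert b_k == (k + C[k]) // 2 + 1    # k and C_k have the same parity
print("relation verified for all k")
```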
We are now in position to establish Theorem 1.1 (ii).

Proof of Theorem 1.1 (ii). By (4.4) and (4.5), it follows that the convergence holds in distribution, jointly with (4.4). Since µ_A m_A = Σ_{i∈A} (i − 1)µ_i, this completes the proof.

Extension to multiple passages
In Theorem 1.1 (ii), the process N^A counts A-vertices the first time they are visited by the contour exploration. In this section, we are interested in what happens when vertices are instead counted at later visit times. In this direction, if T is a tree, for all integers i ≥ 0, 1 ≤ k ≤ i + 1 and 0 ≤ ℓ ≤ 2|T|, we denote by N^{i,k}_ℓ(T) the number of vertices of outdegree i visited at least k times by the contour exploration of T between times 0 and ℓ. Finally, for i ≥ 0, we set N^i = N^{{i}} to simplify notation. As before, we fix a critical distribution µ with finite positive variance σ², and we let T_n denote a µ-GW tree conditioned on having n vertices.

Theorem 4.4. We have:

where B is a standard Brownian motion independent of e.

The main ingredient of the proof is a relation between N^i_ℓ(T) and N^{i,j}_ℓ(T), for which we need to introduce some notation. If T is a tree, for u ∈ T and 1 ≤ j ≤ i, we denote by A^{i,j}_u(T) the number of ancestors of u in T with i children whose j-th child is also an ancestor of u. For 0 ≤ t ≤ 2|T| − 2, denote by u_t(T) the vertex visited at time t by the contour exploration. Then, for every 0 ≤ ℓ ≤ 2|T| − 2, observe that:

because i-vertices of T that have been visited at least once up to time ℓ, but not yet k times, are necessarily ancestors of u_ℓ(T). Indeed, all the subtrees attached to a strict ancestor of u_ℓ(T) have either been completely visited or not visited at all (except the subtrees containing u_ℓ(T)).

The following result, which is of independent interest, will allow us to control the asymptotic behaviour of A^{i,j}_u(T_n) when the height of u is large enough. See [28] for other bounds on A^{i,j}(T_n) under an additional finite exponential moment assumption. For a nonnegative sequence (r_n), we write r_n = o_e(n) if there exist C, ε > 0 such that r_n ≤ C e^{−n^ε} for every n ≥ 1.

Proposition 4.5. We have P(∃u ∈ T_n, ∃j ∈ {1, . . . , i} : |u| ≥ n^{1/10}, . . .) = o_e(n).

Before proving this bound, let us explain how Theorem 4.4 follows.
We now get into the proof of Proposition 4.5.
Proof of Proposition 4.5. First, observe that if T is a nonconditioned µ-GW tree, then the probability P(∃u ∈ T_n, ∃j ∈ {1, . . . , i} : |u| ≥ n^{1/10}, . . .) can be bounded in terms of expectations under the law of T. In order to compute these expectations, let us mention the existence of the local limit T^* of the trees T_n. This limit is defined as the random variable on the set of infinite trees satisfying, for any r ≥ 0: where B_r denotes the ball of radius r centered at the root for the graph distance (all edges of the tree having length 1). T^* is an infinite tree called Kesten's tree, made of a unique infinite branch on which i.i.d. µ-GW trees are planted (see [19] for details). The local behaviour of the trees T_n can be deduced from the properties of this infinite tree; in particular, a standard size-biasing identity à la Lyons-Pemantle-Peres [27] (see [11, Eq. (23)] for a precise statement) gives: where U_k(T^*) denotes the vertex of the unique infinite branch of T^* at height k. In particular, this expectation does not depend on j.
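Kesten's tree can be simulated along its spine: it is classical (Lyons-Pemantle-Peres [27]) that the offspring numbers along the infinite branch are i.i.d. with the size-biased law µ̂(k) = k µ_k, a probability measure precisely because µ is critical. A hedged sketch for the Geometric(1/2) law (the sampler is our own illustration, not from the paper):

```python
import random

# Spine of Kesten's tree T*: offspring along the infinite branch are i.i.d.
# with the size-biased law mu_hat(k) = k * mu_k; here mu_k = 2^{-(k+1)}.

def size_biased_offspring(rng, kmax=200):
    """Sample from mu_hat(k) = k * 2^{-(k+1)}, k >= 1, by inversion."""
    u, cum = rng.random(), 0.0
    for k in range(1, kmax + 1):
        cum += k * 0.5 ** (k + 1)
        if u < cum:
            return k
    return kmax

rng = random.Random(64)
spine = [size_biased_offspring(rng) for _ in range(10)]
print(spine)   # offspring numbers along the spine; note they are all >= 1
```

The spine vertices always have at least one child (µ̂(0) = 0), which is what keeps the branch infinite; the remaining children carry independent unconditioned µ-GW trees.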
Finally, let us remark that the estimate of Proposition 4.5 is strong enough to yield the following refinement of Theorem 4.4 (whose proof is left to the reader): the following convergence holds in distribution, where B is a standard Brownian motion.

Asymptotic normality of outdegrees in large Galton-Watson trees
The main goal of this Section is to prove Theorem 1.2 (i) and (ii). We fix a critical offspring distribution µ with finite positive variance σ 2 , and A, B ⊂ Z + such that µ B > 0. If T is a tree, recall that N A (T ) is the number of A-vertices in T , and that T B n is a µ-GW tree conditioned to have n B-vertices. In the sequel, T is a nonconditioned µ-GW tree. We also assume for technical convenience that p B and p B c are both aperiodic (but the results carry through in the general setting with mild modifications).

Expectation of N_A(T^B_n)

Our goal here is to prove Theorem 1.2 (i). To this end, for every n ≥ 1, define the interval I_n := [n/µ_B − n^{3/4}, n/µ_B + n^{3/4}]. The proof relies on the following estimates.
Lemma 5.1. We have:

Let us first explain how Theorem 1.2 (i) follows. The contribution of the event {N_{Z_+}(T^B_n) ∉ I_n} is o_e(n) by Lemma 5.1 (i). In order to bound the first term in the sum of (5.1), remark that we can bound: This last quantity tends to 0 as n → ∞, by Lemma 5.1 (ii) and since sup I_n/n → 1/µ_B. In order to complete the proof, it remains to observe that, since N_{Z_+}(T^B_n) ≥ n, Lemma 5.1 (i) implies that P(N_{Z_+}(T^B_n) ∉ I_n) → 0.
We now use the fact that, for any k, J^B_k has a binomial distribution with parameters (k, µ_B). Remark that, if k ∉ I_n, then |n − kµ_B| ≥ k^{3/5}. Hence, by Hoeffding's inequality, for k ∉ I_n: Therefore Σ_{k ∉ I_n, k ≥ n} P(J^B_k = n) = o_e(n).
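The Hoeffding step can be sanity-checked numerically: for a binomial variable such as J^B_k ∼ Bin(k, p), the bound P(|Bin(k, p) − kp| ≥ t) ≤ 2 exp(−2t²/k) can be compared with the exact tail (illustrative values, not from the paper):

```python
import math

# Exact two-sided binomial tail versus the Hoeffding bound 2 exp(-2 t^2 / k).

def binom_tail(k, p, t):
    """Exact P(|Bin(k, p) - k p| >= t), by summing the probability mass."""
    total = 0.0
    for j in range(k + 1):
        if abs(j - k * p) >= t:
            total += math.comb(k, j) * p ** j * (1 - p) ** (k - j)
    return total

k, p, t = 100, 0.5, 15
exact = binom_tail(k, p, t)
bound = 2 * math.exp(-2 * t ** 2 / k)
print(exact, "<=", bound)
```

With t of order k^{3/5}, the bound is exp(−2k^{1/5}) up to constants, which is why the sum over k ∉ I_n is o_e(n) in the sense defined above.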
which grows at most polynomially in n by Lemma 3.4 (ii). The second assertion now follows from the fact that, by virtue of (3.1) (applied with B = Z_+), it suffices to check that P(|N_A(T)/n − µ_A/µ_B| ≥ n^{−1/5}, N_{Z_+}(T) = k) = o_e(n) when k ∈ I_n. By Proposition 3.1: When k ∈ I_n, this last quantity is bounded from above by P(|J^A_k − kµ_A| ≥ n^{4/5} − µ_A n^{3/4}), which is o_e(n) since J^A_k has a binomial distribution with parameters (k, µ_A). This proves (ii).

Asymptotic normality of N_A(T^B_k)

The first step is to establish the following local version of Theorem 1.2 when A = Z_+.

Proposition 5.2. As k → ∞, uniformly for y in a compact subset of R:

It is standard that this implies the following asymptotic normality:

(5.2)

Proof of Proposition 5.2. By Lemma 3.4 (ii), we have, as n → ∞, uniformly for c in a compact subset of R: By using (3.1), we have: Then observe that, for y ∈ R, as n, k → ∞, it is equivalent to write n = k/µ_B + y√k, up to negligible terms. This completes the proof.
We are now in position to establish Theorem 1.2 (ii), which will be a consequence of the following estimate.

Lemma 5.3. Let A, B ⊂ Z_+ be such that the quantities µ_{A∩B}, µ_{A\B}, µ_{B\A}, µ_{A^c∩B^c} are all positive. Then there exist σ²_{A,B} > 0 and C_{A,B} ∈ R such that, for fixed u, v ∈ R ∪ {+∞, −∞} with u < v and y ∈ R, we have, as k → ∞:

(5.4) with limit ∫_u^v (2πσ²_{A,B})^{−1/2} exp(−(z − C_{A,B} y)²/(2σ²_{A,B})) dz.
Proof of Theorem 1.2 (ii), using Lemma 5.3. First assume that the four quantities µ_{A∩B}, µ_{A\B}, µ_{B\A}, µ_{A^c∩B^c} are all positive. Fix u < v. For y ∈ R and k ∈ Z_+, set: and remark that the probability of interest can be written in terms of f_k. Also, for y, z ∈ R, define g(y, z) by: Observe that ∫_{R²} g(y, z) dy dz = 1. Then, by Proposition 5.2 and Lemma 5.3, f_k(y) converges pointwise, as k → ∞, to ∫_u^v g(y, z) dz. Hence, by Fatou's lemma and the Fubini-Tonelli theorem: By the Portmanteau theorem, if (X_k) is a sequence of real-valued random variables such that, for every u < v, lim inf_{k→∞} P(u < X_k < v) ≥ P(u < X < v) for a certain random variable X, then X_k converges in distribution to X. This implies that: We leave to the reader the case where at least one of the quantities µ_{A∩B}, µ_{A\B}, µ_{B\A}, µ_{A^c∩B^c} vanishes, which is treated in the same way. In particular, one gets that δ²_{A,B} > 0 except when µ_A = 0 or µ_{A\B} = µ_{B\A} = 0. This establishes the asymptotic normality of (N_A(T) | N_B(T) = k), together with an expression of the limiting variance.
The proof of Lemma 5.3 is based on the following result, whose proof is a direct adaptation of the proof of Lemma 3.4 in the multivariate setting.
Lemma 5.4. Fix a ∈ R, and let (B_1, . . . , B_j) be a partition of Z_+ satisfying µ_{B_i} > 0 for all i ∈ {1, . . . , j}. Assume in addition that at least one of the laws p_{B_1}, . . . , p_{B_j} is aperiodic. For 1 ≤ i ≤ j and c_i ∈ R, define n_i(c_i) := ⌊nµ_{B_i} + c_i√n⌋. Then there exists a symmetric positive definite matrix Σ := Σ(B_1, . . . , B_j) ∈ S_j(R) such that the following assertions hold, uniformly for (c_1, . . . , c_j) in a compact subset of R^j satisfying in addition Σ_{i=1}^j n_i(c_i) = n:

(i) Let (a_n) be a sequence of integers such that a_n/√n → a. Then, as n → ∞, the probability P(S_n = a_n, J^{B_1}_n = n_1(c_1), . . . , J^{B_j}_n = n_j(c_j)) admits a local limit expansion in terms of Σ and (a, c_1, . . . , c_{j−1}).

(ii) With the same notation, as n → ∞, we have:

Proof of Lemma 5.3. Let us fix y ∈ R. First, write: where the last asymptotic equivalent follows from Proposition 5.2.
In order to prove that this quantity has a limit as k → ∞, and to compute it, it is enough to study the map g_k obtained by summing over all possible values ℓ of N_{A∩B}(T). The idea is that, if ℓ is far from its expectation (namely, kµ_{A∩B}/µ_B), then the corresponding term q_ℓ is small; on the other hand, we control q_ℓ by Lemma 5.4 when ℓ is close to its expectation. More specifically, we compare ℓ to binomial variables B_i ∼ Bin(k/µ_B, µ_{A_i}). Thus, using Hoeffding's inequality, we obtain a bound (5.5) on the contribution of the values of ℓ far from kµ_{A∩B}/µ_B. On the other hand, by Lemma 5.4 (ii), there exist an invertible matrix Σ ∈ S_4(R) and a constant C_1 > 0 such that an estimate (5.6) holds uniformly for h ∈ R, for x := ((ℓ − kµ_{A∩B}/µ_B)/√k, y, h, 0). By Equations (5.5) and (5.6), summing over all ℓ ∈ Z_+, we get a limit as k → ∞ involving a constant C_2 depending on A and B, and constants B_{A,B}, C_{A,B} depending on A and B. Since this limiting function is integrable, by uniform convergence, the corresponding integrals converge for any u, v ∈ R ∪ {−∞, +∞}, with a limit involving a constant C̃(y) depending only on y (and on A, B). By taking u = −∞ and v = +∞, one sees that C̃(y) does not in fact depend on y. Hence, there exists σ²_{A,B} > 0 such that, for every y ∈ R, C̃(y) = 1/√(2πσ²_{A,B}). Furthermore, taking again u = −∞ and v = +∞, the value of the right-hand side is 1, which shows that B_{A,B} = 1. Finally, we conclude that the desired convergence holds for every y ∈ R and u < v, which completes the proof of Lemma 5.3.
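The Hoeffding step can be illustrated numerically. The following sketch (with arbitrary illustrative parameters m and p, standing in for k/µ_B and µ_{A_i}; none of these values come from the paper) compares the exact binomial deviation probability with the bound 2 exp(−2mt²):

```python
import math

def binom_pmf(m, p, j):
    # exact P(Bin(m, p) = j), using exact integer binomial coefficients
    return math.comb(m, j) * p**j * (1 - p)**(m - j)

def binom_dev_tail(m, p, t):
    # exact P(|Bin(m, p)/m - p| >= t)
    return sum(binom_pmf(m, p, j) for j in range(m + 1) if abs(j / m - p) >= t)

def hoeffding_bound(m, t):
    # Hoeffding's inequality: P(|Bin(m, p)/m - p| >= t) <= 2 exp(-2 m t^2)
    return 2.0 * math.exp(-2.0 * m * t * t)

# illustrative parameters only
m, p = 400, 0.3
gaps = {t: (binom_dev_tail(m, p, t), hoeffding_bound(m, t)) for t in (0.05, 0.1, 0.15)}
```

This exponential decay is what makes the contribution of the values of ℓ far from kµ_{A∩B}/µ_B negligible in the sum.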
Finally, we briefly present the proof of Theorem 1.2 (iii), which is based again on Lemma 5.4 (ii).
Let us consider the tree T^B_k for a certain B ⊂ Z_+, and let A_1, . . . , A_j ⊂ Z_+. These sets induce a partition E of Z_+, generated by B, A_1, . . . , A_j and their complements. The probability of interest can then be written as a sum indexed by a finite set I_n ⊂ Z_+^{|E|}. We can now rewrite this probability in terms of random walks and use Lemma 5.4 (ii) in order to obtain the asymptotic normality of the quantities under study.
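The partition generated by B, A_1, . . . , A_j can be made concrete by grouping outdegrees according to their membership pattern; the sketch below (illustrative sets only, restricted to a finite window of Z_+) computes these atoms:

```python
def partition_atoms(universe, sets):
    """Group the elements of `universe` by their membership pattern in `sets`.

    The nonempty groups are exactly the atoms of the partition of `universe`
    generated by the given sets and their complements.
    """
    atoms = {}
    for x in universe:
        signature = tuple(x in s for s in sets)
        atoms.setdefault(signature, set()).add(x)
    return atoms

# Illustrative example on a finite window of Z_+:
# B = even outdegrees, A1 = leaves, A2 = outdegrees at most 2.
window = set(range(10))
B, A1, A2 = {0, 2, 4, 6, 8}, {0}, {0, 1, 2}
atoms = partition_atoms(window, [B, A1, A2])
```

The atoms are pairwise disjoint and cover the window, which is what allows the probability to be rewritten as a finite sum indexed by the atom counts.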

Several extensions
We now present some possible extensions of Theorems 1.1 and 1.2 to other types of offspring distributions. A natural one is the extension of these results to offspring distributions µ in the domain of attraction of a stable law. We first properly define this notion, before explaining how the two above-mentioned theorems can be generalized in this broader framework. The second extension that we present is the case of subcritical non-generic laws, where the offspring distribution is no longer critical. In this case, we asymptotically observe in the random tree T_n a condensation phenomenon, in which one vertex has macroscopic degree. See e.g. [16, Example 19.33] for more context.

Stable offspring distributions
Let us first provide some background. We say that a function L : R*_+ → R*_+ is slowly varying if, for any c > 0, L(cx)/L(x) → 1 as x → ∞. For α ∈ (1, 2], we say that a critical distribution µ belongs to the domain of attraction of an α-stable law if either µ has finite variance (in which case α = 2), or there exists a slowly varying function L such that P(X ≥ x) = x^{−α} L(x), where X is a random variable of law µ. In this case, for any sequence (D_n)_{n≥1} of positive numbers satisfying (6.2), we have the following joint convergence.

Theorem 6.1. Let α ∈ (1, 2] and let µ be a critical distribution with infinite variance in the domain of attraction of an α-stable law. Let (D_n)_{n≥1} be a sequence satisfying (6.2). Then, there exist two nondegenerate random processes X^(α), H^(α), depending only on α, such that the following convergences hold jointly:
(ii) The following convergence holds in distribution, jointly with that of (i). Here, B denotes a standard Brownian motion independent of (X^(α), H^(α)).
The processes X^(α) and H^(α) depend only on α, and are the continuous-time analogues of, respectively, the Lukasiewicz path and the contour function of the so-called α-stable tree (see Fig. 3 for a picture, and [12] for more details). This stable tree is a random compact metric space introduced by Duquesne and Le Gall [12], known to be the scaling limit of the sequence of size-conditioned µ-Galton-Watson trees (T_n) when µ is in the domain of attraction of an α-stable law. Notably, when α = 2, X^(2) = H^(2) = e.

Figure 3: An approximation of the α-stable tree and the processes X^(α) and H^(α), for α = 1.6.
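For reference, the two discrete codings mentioned here can be computed explicitly from the depth-first sequence of outdegrees of a plane tree; the following sketch (a standard construction, with illustrative function names) builds both:

```python
def lukasiewicz_path(outdegrees):
    # Lukasiewicz path of a plane tree given its outdegrees in
    # depth-first (lexicographical) order: each step is (outdegree - 1).
    path = [0]
    for c in outdegrees:
        path.append(path[-1] + c - 1)
    return path

def contour_function(outdegrees):
    # Contour function: the height of the corner visited at each step of the
    # exploration around the tree (2n - 1 values for a tree with n vertices).
    children = [[] for _ in outdegrees]
    remaining = list(outdegrees)
    stack = [0]
    for v in range(1, len(outdegrees)):
        while remaining[stack[-1]] == 0:  # backtrack to the next open vertex
            stack.pop()
        children[stack[-1]].append(v)
        remaining[stack[-1]] -= 1
        stack.append(v)
    heights = []
    def walk(v, h):
        heights.append(h)
        for w in children[v]:
            walk(w, h + 1)
            heights.append(h)
    walk(0, 0)
    return heights

# Example: root with two children, the first of which has a single child (a leaf).
outdeg = [2, 1, 0, 0]
path = lukasiewicz_path(outdeg)   # stays >= 0 and hits -1 exactly at the last step
cont = contour_function(outdeg)
```

The Lukasiewicz path of a tree with n vertices has n + 1 values and first hits −1 at its final step, while the contour function has 2n − 1 values; these are the discrete objects whose rescaled versions converge to X^(α) and H^(α).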
Note that, setting σ² = ∞ in the definition of γ_A given in Theorem 1.1, we obtain exactly γ_A = µ_A(1 − µ_A), so that Theorem 6.1 is indeed the natural generalization of the finite variance case. An interesting feature of the infinite variance case is that the two marginals of the limiting processes are independent.
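The slow-variation condition appearing in the definition of the domain of attraction is easy to probe numerically; in the sketch below (arbitrary illustrative choices of L, not taken from the paper), L(x) = log x is slowly varying while L(x) = x^0.1 is not:

```python
import math

def sv_ratio(L, c, x):
    # defining ratio of slow variation: L(c x) / L(x) should tend to 1 as x grows
    return L(c * x) / L(x)

# L(x) = log x is slowly varying: the ratio approaches 1 as x grows.
slow = [sv_ratio(math.log, 2.0, 10.0 ** e) for e in (3, 6, 12)]
# L(x) = x^0.1 is NOT slowly varying: the ratio stays near 2^0.1, away from 1.
fast = [sv_ratio(lambda x: x ** 0.1, 2.0, 10.0 ** e) for e in (3, 6, 12)]
```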
On the other hand, the results of Theorem 1.2 still hold in this case.

Theorem 6.2. Let µ be a critical offspring distribution with infinite variance in the domain of attraction of a stable law, and let A, B be subsets of Z_+ such that µ_B > 0. For n ≥ 1, let T^B_n be a µ-GW tree conditioned to have n B-vertices. Then:
(ii) there exists δ_{A,B} ≥ 0 such that the convergence holds in distribution, where N(0, δ²_{A,B}) is a centered Gaussian random variable with variance δ²_{A,B}. In addition, δ_{A,B} = 0 if and only if µ_A = 0 or µ_{A\B} = µ_{B\A} = 0;
(iii) the convergences (1.1) hold jointly for A ⊂ Z_+, in the sense that, for every j ≥ 1, the corresponding j-dimensional vector converges in distribution to a Gaussian vector.
These two generalizations can be obtained by slightly adapting the proofs of Theorems 1.1 and 1.2. Let us only explain the most important changes in these proofs, which consist in generalizing Theorems 2.2 and 2.4 to the stable framework.

Theorem 6.3 (Duquesne [12]). Let α ∈ (1, 2], and let µ be a critical distribution in the domain of attraction of an α-stable law. Let (D_n)_{n≥1} be a sequence satisfying (6.2).
Then, the following convergence holds jointly in distribution.

The other ingredient is a multivariate local limit theorem in the stable case: when the first coordinate of a random vector is in the domain of attraction of a stable law and has infinite variance, while all other coordinates have finite variance, the random vector satisfies a local limit theorem. In addition, the first coordinate of the limiting object is independent of all the others, which are themselves distributed as a Gaussian vector.

Theorem 6.4 (Resnick & Greenwood [32], Hahn & Klass [13], Doney [9]). Let α ∈ (1, 2].
Assume that the first coordinate Y^(1) of the random vector Y = (Y^(1), . . . , Y^(j)) is in the domain of attraction of an α-stable law µ and has infinite variance, and that the covariance matrix Σ of the vector (Y^(2), . . . , Y^(j)) is invertible. Then, as n → ∞, uniformly for x := (x^(1), . . . , x^(j)) in a compact subset of R^j satisfying P(T_n = x) > 0, the probability P(T_n = x) is asymptotically equivalent, up to deterministic normalization, to the product of g(x^(1)) and a Gaussian density with covariance Σ evaluated at x̄, where g is the density of µ and x̄ := (x^(2), . . . , x^(j)).
Let us explain how we obtain this result by combining the results of [9], [13] and [32]. We first focus on the case j = 2. When α ∈ (1, 2), [32, Theorem 3] states that the convergences of the two marginals of T_n hold, and that obtaining these two convergences separately is enough to get Theorem 6.4. The same theorem states in addition that the two limiting marginals are independent.
On the other hand, when j = 2, α = 2 and µ has infinite variance, Theorem 3 in [13] shows that T_n converges in distribution to a bivariate normal variable, and that the first coordinate of the limiting distribution is independent of the second (the constant γ_n that appears in the statement of [13, Theorem 3] can be proved to be 0, so that the renormalization matrix A_n appearing in this theorem is diagonal). This, coupled with [9, Theorem 1] (which, roughly speaking, states that a bivariate central limit theorem implies a local limit theorem), implies Theorem 6.4 in the case α = 2, j = 2. Although these results are only stated for j = 2 (with the exception of [13, Theorem 3], which is generalized in [13, Theorem 5]), they still hold for j ≥ 3 with mild modifications.
The proof of Theorem 6.1 follows the proof of Theorem 1.1 in the finite variance case, applying Theorem 6.4 to the random vector (S_1, J^A_1). In order to generalize the results of Theorem 1.2 to the infinite variance case, we apply Theorem 6.4.

Convergence of T^A_n to the stable tree
We finish the study of the stable case by proving the convergence of the conditioned trees (T^A_n), properly renormalized, to the stable tree, for any A ⊂ Z_+ satisfying µ_A > 0. More precisely, the multivariate Theorem 2.4, along with Proposition 3.1, allows us to obtain the following asymptotics, which generalize [21, Theorem 8.1 (i)].

Proposition 6.5. Let α ∈ (1, 2], and let µ be in the domain of attraction of an α-stable law with infinite variance. Let A ⊂ Z_+ be such that µ_A > 0 and µ_{A^c} > 0, and let T be a µ-GW tree. Then, there exists a constant C depending only on µ and A such that the following holds as n → ∞, for the values of n such that P(N_A(T) = n) > 0.

Note that our bivariate approach allows us to prove this for all A ⊂ Z_+, while [21, Theorem 8.1 (i)] holds only when A or Z_+\A is finite. An immediate corollary of Proposition 6.5 is the joint convergence of the contour function and the Lukasiewicz path of the conditioned tree T^A_n.

Corollary 6.6. Restricting ourselves to the values of n such that P(N_A(T) = n) > 0, the Lukasiewicz path and the contour function of T^A_n, properly rescaled, converge jointly in distribution to (X^(α), H^(α)).
The proof of this corollary follows exactly the proof of [21, Theorem 8.1 (ii)]. In particular, this convergence implies the convergence in distribution of the tree T^A_n, viewed as a metric space for the graph distance and properly renormalized, towards the α-stable tree for the Gromov-Hausdorff distance (see e.g. [25, Section 2] for details).

Subcritical non-generic offspring distributions
We now focus on the case where µ is subcritical (that is, with mean strictly less than 1) and µ_k ∼ ck^{−β} as k → ∞, with fixed c > 0 and β > 2, and B = Z_+. This is an interesting case, as a condensation phenomenon occurs (see [16, 23]): a unique vertex with macroscopic degree, comparable to the total size of the tree, emerges. Then the following asymptotic normality holds.

Theorem 6.7. Assume that µ is an offspring distribution such that µ_k ∼ ck^{−β} as k → ∞, with fixed c > 0 and β > 2, and denote by T_n a µ-GW tree conditioned to have n vertices. Let k ≥ 1 and let A_1, A_2, . . . , A_k ⊂ Z_+ be finite. Then we have the joint convergence in distribution ((N_{A_i}(T_n) − nµ_{A_i})/√n)_{1≤i≤k} → (Z_{A_1}, . . . , Z_{A_k}), where Z_{A_i} ∼ N(0, µ_{A_i}(1 − µ_{A_i})) and, for i ≠ j, Cov(Z_{A_i}, Z_{A_j}) = µ_{A_i∩A_j} − µ_{A_i}µ_{A_j}.
The result follows.
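The limiting covariance structure of Theorem 6.7 is explicit and can be evaluated directly; the sketch below uses an arbitrary truncated power law with β = 3 (an illustrative choice, not taken from the paper):

```python
# An illustrative subcritical power-law offspring distribution:
# mu_k proportional to (k + 1)^(-beta), truncated at a large K.
beta, K = 3.0, 10_000
weights = [(k + 1.0) ** (-beta) for k in range(K + 1)]
total = sum(weights)
mu = [w / total for w in weights]

def mu_of(S):
    # mu_A = sum of mu_k over outdegrees k in A
    return sum(mu[k] for k in S if 0 <= k <= K)

def limit_cov(A, B):
    # covariance of the limit in Theorem 6.7: mu_{A cap B} - mu_A mu_B;
    # for A = B this reduces to the variance mu_A (1 - mu_A)
    return mu_of(A & B) - mu_of(A) * mu_of(B)

A1 = {0}        # leaves
A2 = {0, 1, 2}  # outdegree at most 2
cov_matrix = [[limit_cov(X, Y) for Y in (A1, A2)] for X in (A1, A2)]
```

Note that the same formula µ_{A∩B} − µ_A µ_B also gives the variance when A = B, since µ_{A∩A} − µ_A² = µ_A(1 − µ_A).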

Conjecture
We have seen that the conclusions of Theorem 6.7 hold for µ with infinite variance in the domain of attraction of a stable law and for µ a subcritical power law.
We believe that these conclusions should hold for any critical µ with infinite variance, as well as for any subcritical µ with no exponential moment. In particular, we should get, for any A ⊂ Z_+, that (N_A(T_n) − nµ_A)/√n converges in distribution to N(0, µ_A(1 − µ_A)). However, in the general case, nothing is known about the scaling limits of such GW trees (see [16] for detailed arguments and counterexamples), and no general local limit theorem exists, which prevents us from directly generalizing our methods.
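In the finite variance case, where asymptotic normality is already known (with a limiting variance that in general differs from µ_A(1 − µ_A)), the centering at nµ_A and the √n scale can at least be probed by a toy Monte Carlo. The sketch below (illustrative parameters only: Geometric(1/2) offspring, A = {0}, i.e. leaves) samples conditioned trees via the cycle lemma:

```python
import random

def geometric_offspring(rng):
    # critical offspring law P(k) = 2^(-(k+1)) on {0, 1, 2, ...}; mean 1, variance 2
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k

def conditioned_outdegrees(n, rng):
    # Outdegree sequence (up to cyclic shift) of a Geometric(1/2) GW tree
    # conditioned to have n vertices: by the cycle lemma, it suffices to
    # condition n i.i.d. offspring numbers on summing to n - 1, and the count
    # of any fixed outdegree is invariant under the cyclic shift.
    while True:
        xs = [geometric_offspring(rng) for _ in range(n)]
        if sum(xs) == n - 1:
            return xs

rng = random.Random(2020)
n, samples, mu_leaf = 60, 300, 0.5   # mu_0 = 1/2 for this offspring law
fluctuations = [(conditioned_outdegrees(n, rng).count(0) - n * mu_leaf) / n ** 0.5
                for _ in range(samples)]
mean_fluct = sum(fluctuations) / samples   # should be close to 0
```

Since the number of leaves is invariant under cyclic rotation of the outdegree sequence, no explicit rotation step is needed here; the empirical fluctuations are of order 1 on the √n scale, consistent with a central limit theorem.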