Non-fringe subtrees in conditioned Galton--Watson trees

We study $S(\mathcal T_{n})$, the number of subtrees in a conditioned Galton--Watson tree of size $n$. With two very different methods, we show that $\log(S(\mathcal T_{n}))$ has a Central Limit Law and that the moments of $S(\mathcal T_{n})$ are of exponential scale.


Definitions
1.1. Subtrees. We consider only rooted trees. We denote the node set of a rooted tree $T$ by $V(T)$, and the number of nodes by $|T| = |V(T)|$. We denote the root of $T$ by $o = o(T)$. We regard the edges of a rooted tree as directed away from the root.
A (general) subtree of a rooted tree $T$ is a subgraph $T'$ that is a tree. $T'$ is necessarily an induced subgraph, so we may identify it with its node set $V' = V(T')$; hence we can also define a subtree as any set of nodes that forms a tree; in other words, any non-empty connected subset $V'$ of the node set $V(T)$.
Note that a subtree $T'$ of $T$ has a unique node $o'$ of smallest depth in $T$, and that all edges in $T'$ are directed away from $o'$. We define $o'$ to be the root of $T'$. Thus every subtree $T'$ is itself a rooted tree, with the direction of any edge agreeing with the direction in $T$.
A fringe subtree is a subtree $T'$ that contains all children of any node in it, i.e., if $v \in V' = V(T')$ then $w \in V'$ for every child $w$ of $v$. Equivalently, a fringe subtree is the tree $T_v$ consisting of all descendants (in $T$) of some node $v \in V(T)$ (which becomes the root of $T_v$). Hence the number of fringe subtrees of $T$ equals the number of nodes of $T$.
Fringe subtrees are studied in many papers; often they are simply called subtrees. To avoid confusion, we call the general subtrees studied in the present paper non-fringe subtrees. (This is a minor abuse of notation, since fringe subtrees are examples of non-fringe subtrees; the name should be interpreted as "not necessarily fringe".) A root subtree of a rooted tree $T$ is a non-fringe subtree $T'$ that contains the root $o(T)$ (which then becomes the root of $T'$ too). Equivalently, a root subtree is a non-empty set $V' \subseteq V(T)$ such that if $v \in V'$, then the parent of $v$ also belongs to $V'$.
We write $\mathcal S(T)$ and $\mathcal R(T)$ for the sets of non-fringe subtrees and root subtrees of $T$, and $S(T) := |\mathcal S(T)|$ and $R(T) := |\mathcal R(T)|$ for their numbers. Note that a non-fringe subtree of $T$ is a root subtree of a unique fringe subtree $T_v$. Hence,
$$S(T) = \sum_{v \in T} R(T_v). \tag{1.1}$$
Furthermore, for any $v \in T$, $R(T_v) \le R(T)$, since we obtain an injective map $\mathcal R(T_v) \to \mathcal R(T)$ by adding to each tree $T' \in \mathcal R(T_v)$ the unique path from $o$ to $v$. Consequently, using (1.1),
$$R(T) \le S(T) \le |T|\, R(T). \tag{1.2}$$
1.2. Conditioned Galton-Watson trees. A Galton-Watson tree $\mathcal T$ is a tree in which each node is given a random number of child nodes, where the numbers of child nodes are drawn independently from the same distribution $\xi$, which is often called the offspring distribution. (We use $\xi$ to denote both the offspring distribution and a random variable with this distribution.) Galton-Watson trees were implicitly introduced by Bienaymé [1] and Watson and Galton [10] for modeling the evolution of populations. A conditioned Galton-Watson tree $\mathcal T_n$ is a Galton-Watson tree conditioned on having size $n$. It is well-known that $\mathcal T_n$ encompasses many random tree models. For example, if $\mathbb P(\xi = i) = 2^{-i-1}$, i.e., $\xi$ has a geometric $1/2$ distribution, then $\mathcal T_n$ is a uniform random ordered tree of size $n$. Similarly, if $\mathbb P(\xi = 0) = \mathbb P(\xi = 2) = 1/2$, then $\mathcal T_n$ is a uniform random full binary tree of size $n$.
As a result, the properties of $\mathcal T_n$ have been well studied; see, e.g., [7] and the references there. For fringe and non-fringe subtrees of $\mathcal T_n$, see [8; 4; 2; 3].
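As a concrete check of the definitions in Section 1.1, the following minimal sketch counts root subtrees and non-fringe subtrees of small rooted trees. It assumes that a tree is encoded as a nested tuple of its children (a hypothetical encoding chosen only for illustration), and it uses the elementary observation that a root subtree either omits the fringe tree of a child or contains one of its root subtrees, so that $R(T)$ is the product of $1 + R(T_w)$ over the children $w$ of the root. The function names are illustrative.

```python
from math import prod

# A rooted tree is encoded as a tuple of its children's trees; a leaf is ().

def num_root_subtrees(tree):
    """R(T): each child's fringe tree is either left out or contributes one of its root subtrees."""
    return prod(1 + num_root_subtrees(child) for child in tree)

def fringe_subtrees(tree):
    """Yield the fringe subtree T_v for every node v of T."""
    yield tree
    for child in tree:
        yield from fringe_subtrees(child)

def num_subtrees(tree):
    """S(T): every non-fringe subtree is a root subtree of exactly one fringe subtree T_v."""
    return sum(num_root_subtrees(t) for t in fringe_subtrees(tree))

def size(tree):
    """|T|: the number of nodes, which equals the number of fringe subtrees."""
    return sum(1 for _ in fringe_subtrees(tree))

path3 = (((),),)   # path on three nodes
cherry = ((), ())  # root with two leaf children
for t in (path3, cherry):
    r, s, n = num_root_subtrees(t), num_subtrees(t), size(t)
    assert r <= s <= n * r  # R(T) <= S(T) <= |T| R(T)
    print(n, r, s)
```

For the path on three nodes the connected node sets are the three single nodes, the two edges and the whole path, which agrees with the count returned by `num_subtrees`.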
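1.3. Simply generated trees. Let $(w_i)_{i \ge 0}$ be a given sequence of nonnegative numbers, with $w_0 > 0$. For a tree $T$, let $D^+(v)$ be the out-degree (number of children) of a node $v \in T$, and define the weight of $T$ by
$$w(T) := \prod_{v \in T} w_{D^+(v)}. \tag{1.3}$$
Let $\mathcal T^{[s]}_n$ be a tree chosen at random from all ordered trees of size $n$ with probability proportional to their weights. In other words,
$$\mathbb P\bigl(\mathcal T^{[s]}_n = T\bigr) = \frac{w(T)}{\sum_{|T'| = n} w(T')}. \tag{1.4}$$
We call $\mathcal T^{[s]}_n$ a simply generated tree with weight sequence $(w_i)_{i \ge 0}$, and we call
$$\Phi(z) := \sum_{i \ge 0} w_i z^i \tag{1.5}$$
its generator.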
Note that the conditioned Galton-Watson tree $\mathcal T_n$ with the offspring distribution $\xi$ is the same as $\mathcal T^{[s]}_n$ with the weight sequence $(\mathbb P(\xi = i))_{i \ge 0}$. In this case, the generator $\Phi(z)$ is just the probability generating function of $\xi$. Hence, simply generated trees generalize conditioned Galton-Watson trees. On the other hand, given a sequence $(w_i)$ with generator $\Phi(z)$, any sequence with a generator $a\Phi(bz)$ with $a, b > 0$ yields the same $\mathcal T^{[s]}_n$, and in many cases $a$ and $b$ can be chosen such that the new generator is a probability generating function, and then $\mathcal T^{[s]}_n$ is a conditioned Galton-Watson tree. Consequently, simply generated trees and conditioned Galton-Watson trees are essentially the same, and in the sequel we use the notation $\mathcal T_n$ for both. In particular, see, e.g., [7, Section 4], a simply generated tree with generator $\Phi(z)$ is equivalent to a conditioned Galton-Watson tree with offspring distribution $\xi$ satisfying $\mathbb E \xi = 1$ and $\mathbb E e^{t\xi} < \infty$ for some $t > 0$, if and only if $\Phi(z)$ has a positive radius of convergence $R \in (0, \infty]$ and
$$\lim_{z \nearrow R} \frac{z\Phi'(z)}{\Phi(z)} > 1. \tag{1.6}$$
Although the two formulations are equivalent under our conditions, the formulation with simply generated trees is sometimes more convenient, since it gives more flexibility in choosing a convenient $\Phi$; see for example Section 4.1.
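The claim that the generators $\Phi(z)$ and $a\Phi(bz)$ yield the same random tree can be checked by brute force for a small size. The sketch below enumerates all ordered trees with $n = 4$ nodes and compares the two normalized weight distributions exactly. The weight sequence and the constants $a$, $b$ are arbitrary values chosen for illustration, and the tuple encoding of trees and all function names are assumptions of this sketch.

```python
from fractions import Fraction
from math import prod

def forests(k):
    """Yield all ordered forests (tuples of ordered trees) with k nodes in total."""
    if k == 0:
        yield ()
        return
    for s in range(1, k + 1):
        for first in ordered_trees(s):
            for rest in forests(k - s):
                yield (first,) + rest

def ordered_trees(n):
    """Yield all ordered rooted trees with n nodes, each encoded as the tuple of its child trees."""
    yield from forests(n - 1)

def weight(tree, w):
    """w(T): the product over all nodes v of w[out-degree of v]."""
    return w[len(tree)] * prod(weight(child, w) for child in tree)

def distribution(n, w):
    """Probability of each ordered tree of size n, proportional to its weight."""
    weights = {t: weight(t, w) for t in ordered_trees(n)}
    total = sum(weights.values())
    return {t: x / total for t, x in weights.items()}

n = 4
w = [Fraction(1), Fraction(2), Fraction(3), Fraction(1)]  # illustrative weights w_0, ..., w_3
a, b = Fraction(1, 3), Fraction(2, 5)                     # illustrative positive constants
w_tilted = [a * b**i * wi for i, wi in enumerate(w)]      # coefficients of a * Phi(b z)

assert distribution(n, w) == distribution(n, w_tilted)
```

The equality is exact because a tree with $n$ nodes has $n - 1$ edges, so every tree of size $n$ has its weight multiplied by the same factor $a^n b^{n-1}$, which cancels in the normalization.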
1.4. Some further notation. If $v$ and $w$ are nodes in a tree $T$, then $v \prec w$ denotes that $v$ is an ancestor of $w$.
We write $T' \subseteq T$ when $T'$ is a non-fringe (general) subtree of $T$, and $T' \subseteq_r T$ when $T'$ is a root subtree of $T$.
For a formal power series $f(z) := \sum_n f_n z^n$, we let $[z^n] f(z) := f_n$.

Main results
We give two types of results in this paper, proved by two different methods. First, both R(T n ) and S(T n ) have an asymptotic log-normal distribution, as conjectured by Luc Devroye (personal communication).
Theorem 2.1. Let T n be a random conditioned Galton-Watson tree of order n, defined by some offspring distribution ξ with E ξ = 1 and 0 < Var ξ < ∞. Then there exist constants µ, σ 2 > 0 such that, as n → ∞, where N (0, σ 2 ) denotes the normal distribution with mean 0 and variance σ 2 . Furthermore, The proof is given in Section 3, and is based on a general theorem in [8].
It is in principle possible to calculate µ and σ 2 in Theorem 2.1, at least numerically, see Remark 3.5. Secondly, if we also assume that ξ has a finite exponential moment (a mild assumption satisfied by all standard examples), then we can use generating functions and singularity analysis to obtain asymptotics for the mean and higher moments of R(T n ).
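Theorem 2.2. Let $\mathcal T_n$ be as in Theorem 2.1, and assume further that $\mathbb E e^{t\xi} < \infty$ for some $t > 0$. Assume further that if $R \le \infty$ is the radius of convergence of the probability generating function $\Phi(z) := \mathbb E z^{\xi}$, then $\Phi'(R) := \lim_{z \nearrow R} \Phi'(z) = \infty$. Then there exist sequences of numbers $\gamma_m > 0$ and $1 < \tau_1 < \tau_2 < \dots$ such that for any fixed $m \ge 1$, as $n \to \infty$,
$$\mathbb E R(\mathcal T_n)^m \sim \gamma_m \tau_m^n. \tag{2.5}$$
We will later use the formulation of simply generated trees. In this language, Theorem 2.2 has the following, equivalent, formulation.
Theorem 2.3. Let $\mathcal T_n$ be a simply generated tree with generator $\Phi(z)$. Let $R \le \infty$ be the radius of convergence of $\Phi(z)$. Assume that $R > 0$ and that
$$\lim_{z \nearrow R} \frac{z\Phi'(z)}{\Phi(z)} > 1, \tag{2.6}$$
$$\Phi'(R) := \lim_{z \nearrow R} \Phi'(z) = \infty. \tag{2.7}$$
Then (2.5) holds.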
The proof of Theorems 2.2-2.3 is given in Section 4. We first (Sections 4.1-4.2) illustrate the argument by studying the simple case of full binary trees, where we do explicit calculations. (Similar explicit calculations could presumably be performed, e.g., for full d-ary trees, or for ordered trees.) Then we give the proof for the general case in Section 4.3. Note that the condition (2.6) is the same as (1.6); however, we need also the extra condition (2.7). The latter condition is a weak assumption that is satisfied in most applications, and in particular if R = ∞, or if Φ(R) = ∞. Nevertheless, this extra condition (or some other) is necessary; we give in Section 4.4 an example showing that Theorems 2.2-2.3 are not valid without (2.7).
For moments of $S(\mathcal T_n)$, (1.2) gives the same exponential growth $\tau_m^n$, but possibly also a polynomial factor. In fact, there is no such polynomial factor, and $\mathbb E S(\mathcal T_n)^m$ and $\mathbb E R(\mathcal T_n)^m$ differ asymptotically only by a constant factor, as shown by the following theorem, proved in Section 5.
Theorem 2.4. Let $\mathcal T_n$ be as in Theorem 2.2 or 2.3. Then, for any $m \ge 1$,
$$\mathbb E S(\mathcal T_n)^m \sim \gamma'_m \tau_m^n, \tag{2.8}$$
where $\tau_m$ is as in (2.5) and $\gamma'_m > 0$. More generally, for $m, \ell \ge 0$,
$$\mathbb E\bigl[S(\mathcal T_n)^m R(\mathcal T_n)^\ell\bigr] \sim \gamma'_{m,\ell}\, \tau_{m+\ell}^n, \tag{2.9}$$
for some $\gamma'_{m,\ell} > 0$. The constants $\gamma'_{m,\ell}$ can be calculated explicitly, see (5.29).
Remark 2.5. We can express (2.1) and (2.2) by saying that $R(\mathcal T_n)$ and $S(\mathcal T_n)$ have the asymptotic distribution $LN(n\mu, n\sigma^2)$. Note that if $W \sim LN(n\mu, n\sigma^2)$ exactly, so $W = e^Z$ with $Z \sim N(n\mu, n\sigma^2)$, then the moments of $W$ are given by
$$\mathbb E W^m = \mathbb E e^{mZ} = e^{mn\mu + m^2 n\sigma^2/2} = e^{(m\mu + m^2\sigma^2/2)n}. \tag{2.10}$$
We may compare this to Theorem 2.2 and ask whether
$$\tau_m \overset{?}{=} e^{m\mu + m^2\sigma^2/2}. \tag{2.11}$$
It seems natural to guess that equality holds in (2.11); however, we show in Remark 4.3 that it does not hold, at least not for all $m$, in the case of full binary trees. We therefore conjecture that, in fact, equality never holds in (2.11). This may seem surprising; however, note that the same happens in the simpler case $Y = e^X$ with $X \sim \mathrm{Bi}(n, p)$, with $p$ fixed. Then $Y$ is asymptotically $LN(np, np(1-p))$ in the sense above, but $\mathbb E Y^m = \mathbb E e^{mX} = (1 + p(e^m - 1))^n$, which in general differs from $e^{(mp + m^2 p(1-p)/2)n}$.
In other words, $F(T)$ is an additive functional with toll function $f(T) := \log\bigl(1 + R(T)^{-1}\bigr)$, see e.g. [8, §1].
For any tree T , and any node v ∈ T , the path from the root o to v is a root subtree. Hence, R(T ) |T |, (3.4) and as a consequence, In particular, we have the deterministic bound |f (T n )| 1/n. This bound implies that the conditions of [ Consider a tree T . Denote the depth and out-degree (number of children) of a node v ∈ T by d(v) and D + (v). Fix a node v ∈ T , write d = d(v), and let the path where α j is the product of R(T w ) + 1 over all children w = v j+1 of v j . Note that each R(T w ) 1, and thus and Then repeated applications of (3.7) (i.e., induction on d) yield the expansion we have Define also and note that γ * * (v) γ * (v) by (3.8)-(3.12). Now, let T ′ be a modification of T , where the subtree T v is replaced by some tree T ′ v , but all other parts of T are left intact. Then all α j , β(v j ), γ(v), γ * (v) and γ * * (v) are the same for T ′ as for T . Hence, if we further assume that Next, fix an ℓ 2 be such that P(ξ = ℓ) > 0. Let T a be a tree where the root o and two of its children have out-degrees ℓ, and all other nodes have out-degree 0 (i.e., they are leaves). Similarly, let T b be a tree where o, one of its children, and one of its grandchildren have out-degree ℓ, and all other nodes have out-degree 0. Then both T a and T b are trees of order 3ℓ + 1, and both are attained with positive probability by T 3ℓ+1 . Furthermore, a simple calculation using (3.1) shows that and thus R(T a ) > R(T b ). Consequently, the random variable R(T 3ℓ+1 ) is not a.e. equal to a constant. Fix also a large constant A, to be chosen later, and say that a node v ∈ T is good if |T v | = 3ℓ+1 and γ * * (v) A. Define the core T * of T as the subtree obtained by marking all good nodes in T and then deleting all descendants of them. Note that adding back arbitrary trees of order 3ℓ + 1 at each marked node of T * yield a tree T ′ of the same order as T , and with the same good nodes, because |T v | and γ * * (v) are unchanged for every v ∈ T * . 
It follows that the random tree T n , conditioned on its core T * n = T * , consists of T * with an added tree T v at each good (i.e., marked) node of T * , and that these added trees T v all have order 3ℓ + 1 and are independent copies of T 3ℓ+1 . Now suppose (in order to obtain a contradiction) that σ 2 = 0; then (2.1) and (3.2) We show in Lemma 3.1 below that there exists a constant c > 0 such that, for large n, T n has with probability 1/2 at least cn good nodes. Hence, (3.18) holds also if we condition on the existence of at least cn good nodes. Condition further on the core T * n , and among the possible cores T * of T n with at least cn good nodes, choose one that minimizes P |F (T n ) − nµ| > √ n | T * n = T * . For each n, fix this choice T * = T * (n), and note that Let m be the number of good (i.e., marked) nodes in T * = T * (n) and label these v 1 , . . . , v m . Condition on T * n = T * . Then, as noted above, T n consists of T * with a tree T i added at v i , for each i, and these trees T 1 , . . . , T m are m independent copies of T 3ℓ+1 . Let X i := R(T i ); thus X 1 , . . . , X m are i.i.d. random variables with some fixed distribution. Furthermore, repeated applications of (3.1) show that R(T n ) is a function (depending on T * (n)) of X 1 , . . . , X m . Hence, by (3.2), we have, still conditioning on T * n = T * , for some function g n . Consequently, writing Y m := g n (X 1 , . . . , X m ), we have by (3.19) as n → ∞. Recalling that m cn, this implies We now obtain the sought contradiction from (3.40) in Lemma 3.4 below. (To be precise, we use a relabelling. We have m = m(n) → ∞ as n → ∞; we may select a subsequence with increasing m and consider this sequence only, relabelling g n as g m .) Note that in this application of Lemma 3.4, S is a finite set of integers (the range of R(T 3ℓ+1 )). 
The conditions of Lemma 3.4 are satisfied: by (3.16)-(3.17), we can find s such that 0 < P(X 1 s) < 1; furthermore, (3.39) holds (under the stated condition) with δ : A by the definition of good vertices and This completes the proof that σ 2 > 0, given the lemmas below.
Lemma 3.1. With notations as above, there exist $A < \infty$ and $c > 0$ such that, for large $n$, $\mathbb P\bigl(\mathcal T_n \text{ has at least } cn \text{ good nodes}\bigr) \ge 1/2$.
Proof. Note first that if P(ξ = 1) = 0, and thus T n has no nodes of out-degree 1, then this is easy. In this case, (3.14) yields for any c < P(|T | = 3ℓ + 1). In general, (3.24) still holds, but there is no uniform bound on γ * * (v), as is shown by the case of a long path, and it remains to show that γ * * (v) is bounded for sufficiently many nodes. We define, for a given tree T and any pair of nodes v, w with v w, (3.25) We then can rewrite (3.14) as and we define also the dual sum We may also note, although we do not use this explicitly, that ζ(v) has a natural interpretation: π(v, w) is the probability that a random walk, started at v and at each step choosing a child uniformly at random, will pass through w. Hence, ζ(v) is the expected length of this random walk.
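The random-walk interpretation of $\zeta$ can be illustrated on small trees. The defining formulas for $\pi(v, w)$ and $\zeta(v)$ are not legible in this copy, so the sketch below assumes, as the interpretation suggests, that $\pi(v, w)$ is the product of $1/D^+(u)$ over the nodes $u$ on the path from $v$ to $w$ (excluding $w$) and that $\zeta(v)$ is the sum of $\pi(v, w)$ over $v$ and all its descendants $w$. The tuple encoding of trees and the function names are also assumptions made for illustration.

```python
from fractions import Fraction

# A rooted tree is encoded as a tuple of its children's trees; a leaf is ().

def zeta(tree):
    """Expected number of nodes visited by a walk that starts at the root and
    moves to a uniformly random child until it reaches a leaf."""
    if not tree:
        return Fraction(1)
    return 1 + sum(zeta(child) for child in tree) / len(tree)

def zeta_by_paths(tree, prob=Fraction(1)):
    """Sum, over the root and all its descendants w, of the probability that the walk passes through w."""
    total = prob
    for child in tree:
        total += zeta_by_paths(child, prob / len(tree))
    return total

path3 = (((),),)          # path on three nodes
cherry = ((), ())         # root with two leaf children
mixed = ((), ((), ()))    # root with a leaf child and a cherry child
for t in (path3, cherry, mixed):
    assert zeta(t) == zeta_by_paths(t)
    print(zeta(t))
```

The two functions agree because the expected number of visited nodes is the sum of the visiting probabilities; on a path the walk is deterministic, which is the extreme case mentioned in the proof.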
If the root o of T has D children v 1 , . . . , v D , and the corresponding fringe trees are denoted T 1 , . . . , T D , then We apply this with T = T n , the conditioned Galton-Watson tree. Note that conditioned on the root degree D, and the sizes n i := |T i | of the subtrees, each T i is a conditioned Galton-Watson tree T n i . Consequently, (3.28) yields for some constant C 1 and all n. We prove this by induction, assuming that (3.30) holds for all smaller n. Note also that if |T | = 1, then ζ(T ) = 1. Hence, by (3.29) and the induction hypothesis, if D 1 := |{i : n i = 1}|, the number of children of o that are leaves, then and hence where D 1 and D are calculated for the random tree T n . As n → ∞, the distribution of the pair (D, D 1 ) converges to the (D,D 1 ), the same quantities for the random limiting infinite treeT , see for example [7, Section 5 and Theorem 7.1]. Hence, using bounded convergence, Thus there exists a constant C 2 such that for all n 1, Consequently, by Markov's inequality, with probability Hence, if we choose A := 6C 2 /c, then (3.37) implies that at most 3C 2 n/A = cn/2 nodes w in T n satisfy γ * * (w) > A, and hence at least n 3ℓ+1 (T n ) − cn/2 nodes are good. This and (3.24) show that, with probability 2 3 + o(1), T n has at least cn/2 good nodes. Remark 3.2. As the proof shows, the probability 1/2 in Lemma 3.1 can be replaced by any number < 1. We conjecture that in fact, for suitable A and c, the probability tends to 1. Remark 3.3. If we assume that the offspring distribution ξ has an exponential moment, so that its probability generating function has radius of convergence > 1, then one can alternatively derive (3.30) and (3.36), and precise asymptotics, using generating functions. We leave this to the reader.
1, and assume that there is a number s and a δ > 0 such that 0 < P(X 1 s) < 1 and that g m y 1 , . . . , y j−1 , y ′ j , y j+1 , . . . , y m g m y 1 , . . . , y j−1 , y j , y j+1 , . . . , y m + δ, (3.39) for any m, j m, y 1 , . . . , y m ∈ S and y ′ j ∈ S, such that y j s and y ′ j > s. Then, for any constant B and any sequence µ m , Proof. First, by replacing g m by g m − µ m , we may assume that µ m = 0. If (3.40) does not hold, then, by restricting attention to a subsequence, we may assume P |Y m | B √ m → 1, as m → ∞. Let N m := |{i : X i > s}|. Thus N m has a binomial distribution Bi(m, p), where p := P(X 1 > s) ∈ (0, 1). Fix a large number K > 0, and define the events By the central limit theorem for the binomial distribution, P(E + m ) → q and P(E − m ) → q for some q > 0, and thus our assumption implies that Hence we can find integers n + m and n − m with 0 (3.44) On the other hand, (3.39) and the construction imply that Choosing K = Bδ −1 , we obtain a contradiction with (3.44). (3.46) Again, we do not know any closed form expression, but numerical calculation should be possible.

Moments of the number of root subtrees
In this section we prove Theorem 2.3, using generating functions and the language of simply generated trees; note that this also shows the equivalent Theorem 2.2. In Sections 4.1 and 4.2, we study a simple example of simply generated trees to illustrate the main idea behind Theorem 2.3; in this example we derive explicit formulas for some generating functions. The proof for the general case is postponed to Section 4.3; it uses the same argument (but in general we do not find explicit formulas).

4.1. An example: full binary trees. Consider as an example the simply generated tree $\mathcal T_n$ with the generator $\Phi(z) := 1 + z^2$. Then $\mathcal T_n$ is a uniformly random full binary tree of order $n$ (provided $n$ is odd; otherwise, such trees do not exist). Note that $\Phi(z)$ satisfies the conditions of Theorem 2.3. (Note that we have chosen a generator that is not a probability generating function; the corresponding offspring distribution $\xi$ has probability generating function $\frac12(1 + z^2)$, and thus $\mathbb P(\xi = 0) = \mathbb P(\xi = 2) = \frac12$; this generator would lead to similar calculations and the same final result.)
A combinatorial class is a finite or countably infinite set on which a size function with range $\mathbb Z_{\ge 0}$ is defined. For a combinatorial class $\mathcal D$ and an element $\delta \in \mathcal D$, let $|\delta|$ denote its size. The generating function of $\mathcal D$ is defined by
$$D(z) := \sum_{\delta \in \mathcal D} z^{|\delta|} = \sum_{n \ge 0} d_n z^n, \tag{4.1}$$
where $d_n$ denotes the number of elements in $\mathcal D$ with size $n$. It encodes all the information of $(d_n)_{n \ge 0}$ and is a powerful tool to get asymptotic approximations of $d_n$.
Let $\mathcal Z = \{\bullet\}$ denote the combinatorial class of a single node, which contains only one element $\bullet$ since we are considering unlabelled trees. Let $|\bullet| = 1$. Then the generating function of $\mathcal Z$ is simply $z$. Let $\mathcal F_0$ denote the combinatorial class of full binary trees. For $T \in \mathcal F_0$, we let $|T|$ be the total number of nodes in $T$. Since $T$ is a full binary tree, it must be either a single node, or a node together with a left subtree $T_1$ and a right subtree $T_2$, with $T_1, T_2 \in \mathcal F_0$. This can be formalized by the symbolic language developed by Flajolet and Sedgewick [6, p. 67] as
$$\mathcal F_0 = \mathcal Z + \mathcal Z \times \mathcal F_0 \times \mathcal F_0, \tag{4.2}$$
where $+$ denotes "or" and $\times$ denotes "combined with". Let $F_0(z)$ denote the generating function of $\mathcal F_0$, i.e.,
$$F_0(z) = \sum_{n \ge 1} a_n z^n, \tag{4.3}$$
where $a_n$ is the number of full binary trees of order $n$. Then the definition (4.2) directly translates into the functional equation
$$F_0(z) = z + z F_0(z)^2, \tag{4.4}$$
with the explicit solution
$$F_0(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z}. \tag{4.5}$$
To compute $\mathbb E R(\mathcal T_n)$, we consider a pair $(T, T')$ in which $T$ is a full binary tree and $T'$ is a root subtree of $T$ painted with color 1. Let $\mathcal F_1$ be the combinatorial class of such partially colored full binary trees, with $|(T, T')| = |T|$. Let $F_1(z)$ be the generating function of $\mathcal F_1$, i.e.,
$$F_1(z) = \sum_{n \ge 1} a^{(1)}_n z^n, \tag{4.6}$$
where $a^{(1)}_n$ is the number of such pairs with $|T| = n$. Then, for any (odd) $n$,
$$\mathbb E R(\mathcal T_n) = a^{(1)}_n / a_n. \tag{4.7}$$
For a tree $T$ in $\mathcal F_1$, its root $o$ is always colored. Every subtree $T_v$ where $v$ is a child of $o$ (so $d(v) = 1$) can be either itself a partially colored tree (an element of $\mathcal F_1$) or an uncolored tree (an element of $\mathcal F_0$). Thus, we have the following symbolic specification:
$$\mathcal F_1 = \mathcal Z + \mathcal Z \times (\mathcal F_0 + \mathcal F_1) \times (\mathcal F_0 + \mathcal F_1). \tag{4.8}$$
Consequently, using (4.4),
$$F_1(z) = z + z\bigl(F_0(z) + F_1(z)\bigr)^2 = F_0(z) + 2zF_0(z)F_1(z) + zF_1(z)^2,$$
with the explicit solution
$$F_1(z) = \frac{1 - 2zF_0(z) - \sqrt{\bigl(1 - 2zF_0(z)\bigr)^2 - 4zF_0(z)}}{2z} = \frac{\sqrt{1 - 4z^2} - \sqrt{2\sqrt{1 - 4z^2} - 1 - 4z^2}}{2z}.$$
For the second and higher moments we argue similarly. For $m \ge 1$, we consider an $(m+1)$-tuple $(T, T'_1, \dots, T'_m)$ in which $T$ is a full binary tree and $T'_1, \dots, T'_m$ are $m$ root subtrees of $T$ with $T'_i$ painted with color $i$. (Note that $T'_1, \dots, T'_m$ are not necessarily distinct. Note also that a node may have several colors.) Let $\mathcal F_m$ be the combinatorial class of such partially $m$-colored trees. Let $|(T, T'_1, \dots, T'_m)| = |T|$.
Let F m (z) be the generating function of F m , i.e., Then, for any (odd) n, E R(T n ) m = a (m) n /a n . (4.14) Consequently, for the corresponding generating functions, which determines every F m (z) by recursion, solving a quadratic equation in each step. Equivalently, and perhaps more conveniently, For example, for m = 2, and Explicitly, we obtain from (4.17) or (4.18) 4.2. Singularity analysis: full binary trees. Let ρ m be the radius of convergence of F m (z); then ρ m is a singularity of F m (z) (of square-root type). We see from (4.5) that and thus Since full binary trees can only have odd number of nodes, we have a 2m = 0 for m 0. For odd n, applying singular analysis to (4.5) gives For the second moment, (4.19) similarly yields where λ 2 . = 1.883418 is a constant. Then by (4.12) It is not difficult to prove by induction that there exist sequences of numbers λ m > 0 and ρ 0 > ρ 1 > · · · such that for every fixed m 1, This is (2.5) with γ m = λ m /λ 0 and τ m = ρ 0 /ρ m = (2ρ m ) −1 . In particular, We do not have a closed form of ρ m or τ m for m 3. Table 1 gives the numerical values of τ m and ρ m for m up to 10.  Table 1. Numerical values of τ m and ρ m for full binary trees. According to Maple, these polynomials are irreducible over the rationals; moreover, the polynomial in (4.36) is irreducible over Q(ρ 1 ) and the polynomial in (4.37) is irreducible over Q(ρ 1 , ρ 2 ). In particular, we have a strictly increasing sequence of fields Q ⊂ Q(ρ 1 ) ⊂ Q(ρ 1 , ρ 2 ) ⊂ Q(ρ 1 , ρ 2 , ρ 3 ). We expect that this continues for larger m as well, and that the fields Q(ρ 1 , . . . , ρ m ) form a strictly increasing sequence for 0 m < ∞.
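The generating-function approach of Section 4.1 can be checked against brute-force enumeration for small $n$. The sketch below assumes the recursion $F_m(z) = z\,\Phi\bigl(\sum_{k=0}^{m} \binom{m}{k} F_k(z)\bigr)$ with $\Phi(z) = 1 + z^2$, which is our reading of the symbolic specification (each child's fringe tree carries some subset of the $m$ colors); for $m = 1$ this is exactly the specification described in the text. The encoding of trees as nested tuples and all function names are illustrative.

```python
from math import comb

def poly_mul(a, b, N):
    """Product of two coefficient lists of length N + 1, truncated after degree N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                c[i + j] += ai * b[j]
    return c

def root_subtree_series(m_max, N):
    """Coefficients up to z^N of F_0, ..., F_{m_max} for Phi(z) = 1 + z^2,
    assuming F_m = z * Phi(sum_k C(m, k) F_k)."""
    F = []
    for m in range(m_max + 1):
        # H collects the already known terms C(m, k) F_k with k < m.
        H = [sum(comb(m, k) * F[k][n] for k in range(m)) for n in range(N + 1)]
        Fm = [0] * (N + 1)
        for _ in range(N + 1):  # fixed-point iteration; each pass fixes more coefficients
            S = [h + f for h, f in zip(H, Fm)]
            sq = poly_mul(S, S, N)
            Fm = [0] + [(1 if n == 0 else 0) + sq[n] for n in range(N)]
        F.append(Fm)
    return F

def full_binary_trees(n):
    """Yield all full binary trees with n nodes; a leaf is (), an internal node is (left, right)."""
    if n == 1:
        yield ()
    for k in range(1, n - 1, 2):
        for left in full_binary_trees(k):
            for right in full_binary_trees(n - 1 - k):
                yield (left, right)

def R(tree):
    """Number of root subtrees: product of 1 + R(child) over the children."""
    r = 1
    for child in tree:
        r *= 1 + R(child)
    return r

N = 9
F = root_subtree_series(2, N)
for n in range(1, N + 1, 2):
    trees = list(full_binary_trees(n))
    assert F[0][n] == len(trees)
    for m in (1, 2):
        assert F[m][n] == sum(R(t) ** m for t in trees)
    print(n, F[1][n] / F[0][n])  # E R(T_n) = a_n^(1) / a_n
```

Exact integer arithmetic is used throughout, so agreement between the series coefficients and the enumerated sums is an equality rather than a numerical approximation.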
and thus We expect that the same holds for other conditioned Galton-Watson trees, but we have no general proof. Equivalently, ρ 3 = ρ 3 2 ρ −3 1 ρ 0 . However, in the case of full binary trees, we have noted in Remark 4.1 that ρ 3 / ∈ Q(ρ 1 , ρ 2 ) = Q(ρ 0 , ρ 1 , ρ 2 ), so (4.41) is impossible. In fact, a numerical calculation, using the values in Table 1, yields in this case    We prove this claim by induction. (The base case m = 0 is well-known, see [6,Theorem VI.6,p. 404], and follows by minor modifications of the argument below.) Note first that, by (4.48), F m (z) R when 0 < z < ρ m , and thus, letting z ր ρ m , Next, by (4.47), and, in particular, Since F m (z) has only nonnegative coefficients, it has a singularity at ρ m . This singularity can arise in one of three ways: In fact, if neither (i) nor (ii) holds, then ρ m < ∞, s m < ∞ and Ψ m is analytic in a neighbourhood of (ρ m , s m ). If also (iii) does not hold, then F m (z) is analytic in a neighbourhood of ρ m by (4.49) and the implicit function theorem, which contradicts that F m (z) has a singularity at ρ m . We will show that (i) and (ii) are impossible; thus (iii) is the only possibility.
Suppose now that (ii) holds. Then, as z ր ρ m , and thus (4.57) implies We have shown that ρ m < ρ m−1 ρ k for every k < m, and thus (4.46) shows that H m is analytic at ρ m , and H m (ρ m ) < ∞. Hence, in this case (4.59) yields, using (4.60), which contradicts the assumption (2.6). We have thus reached a contradiction in both cases, which shows that (ii) cannot hold, so We now apply [6, Theorem VII.3, p. 468], noting that the conditions are satisfied by the results above, in particular (4.63), (4.50) and (4.62). This theorem shows that F m (z) has a square-root singularity at ρ m , and that its coefficients satisfy where λ m > 0 is a constant. (In the periodic case, as usual we consider only n such that T n exists.) It follows from (4.44) that Letting γ m = λ m /λ 0 and τ m = ρ 0 /ρ m , we have shown (2.5). This prove Theorem 2.3, and thus also the equivalent Theorem 2.2.

4.4. A counterexample. The following example shows that Theorem 2.3 does not hold without the condition (2.7).
Consequently, we can find a < a 0 such that the simply generated tree with generator (4.66) does not have F 1 with a singularity of the type above. Hence, in this case, ρ 1 is instead given by (ii) in Section 4.3, i.e., F 1 (ρ 1 ) = 1, which by (4.48) implies We have shown that ∂Ψ 1 ∂w ρ 1 , F 1 (ρ 1 ) < 1, and thus it follows from (4.54) and Φ ′ (1) < ∞ that lim zրρ 1 F ′ 1 (z) < ∞. Hence the singularity of F 1 at ρ 1 is not of square root type, and the asymptotic formula (2.5) cannot hold.
We leave it as an open problem to find the asymptotics of E R(T n ) and higher moments in this case.

General subtrees
In Section 4 we considered root subtrees. Estimates for general non-fringe subtrees follow from (1.2), but more precise results can be obtained by introducing the corresponding generating functions
$$G_m(z) := \sum_{T} w(T)\, S(T)^m z^{|T|}, \tag{5.1}$$
cf. (4.43); note that $G_0(z) = F_0(z)$. For simplicity, we first study the case $m = 1$ in detail, and as in Section 4, we consider first the example of full binary trees. We assume throughout this section that the assumptions of Theorem 2.3 hold.
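5.1. The mean, full binary trees. Let $\mathcal G_1$ be the combinatorial class of pairs of trees $(T, T')$ such that $T'$ is a subtree of $T \in \mathcal F_0$. In other words, an element of $\mathcal G_1$ is a full binary tree with one non-fringe subtree colored. Such a partially colored tree is either a full binary tree with a root subtree colored (an element of $\mathcal F_1$), or an uncolored root together with a left (right) uncolored subtree (an element of $\mathcal F_0$) and a partially colored right (left) subtree (an element of $\mathcal G_1$). Thus $\mathcal G_1$ has the specification
$$\mathcal G_1 = \mathcal F_1 + \mathcal Z \times \mathcal F_0 \times \mathcal G_1 + \mathcal Z \times \mathcal G_1 \times \mathcal F_0. \tag{5.2}$$
Therefore $G_1(z)$, the generating function of $\mathcal G_1$, satisfies $G_1(z) = F_1(z) + 2zF_0(z)G_1(z)$ and is thus given by
$$G_1(z) = \frac{F_1(z)}{1 - 2zF_0(z)}. \tag{5.4}$$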
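The decomposition behind the specification of $\mathcal G_1$ can be verified on small full binary trees. In the sketch below, the totals $a_n$ (number of trees) and $a^{(1)}_n$ (sum of $R(T)$) are obtained by enumeration, the coefficients $g_n$ are then computed from the coefficient form of $G_1 = F_1 + 2zF_0G_1$, and the result is compared with the directly computed sum of $S(T)$ over all trees of size $n$. The tuple encoding of trees and all function names are assumptions made for illustration.

```python
def full_binary_trees(n):
    """Yield all full binary trees with n nodes; a leaf is (), an internal node is (left, right)."""
    if n == 1:
        yield ()
    for k in range(1, n - 1, 2):
        for left in full_binary_trees(k):
            for right in full_binary_trees(n - 1 - k):
                yield (left, right)

def R(tree):
    """Number of root subtrees: product of 1 + R(child) over the children."""
    r = 1
    for child in tree:
        r *= 1 + R(child)
    return r

def S(tree):
    """Number of non-fringe subtrees: root subtrees of T plus all subtrees inside the children's fringe trees."""
    return R(tree) + sum(S(child) for child in tree)

def check_mean_recursion(N):
    """Return g_n = [z^n] G_1 for odd n <= N, checking it against brute force."""
    a, a1, g = {}, {}, {}
    for n in range(1, N + 1, 2):
        trees = list(full_binary_trees(n))
        a[n] = len(trees)
        a1[n] = sum(R(t) for t in trees)
        # coefficient of z^n in F_1(z) + 2 z F_0(z) G_1(z)
        g[n] = a1[n] + 2 * sum(a[k] * g[n - 1 - k] for k in range(1, n - 1, 2))
        assert g[n] == sum(S(t) for t in trees)
    return g

g = check_mean_recursion(9)
print(g)
```

Dividing $g_n$ by $a_n$ gives $\mathbb E S(\mathcal T_n)$ for the uniform full binary tree of odd size $n$.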
5.2. The mean, general trees. For simply generated trees with the generator Φ(x), (5.2) and (5.4) can be generalized to and . (5.10) Note that for any m 0 and 0 < z < ρ m , by (4.53) and (4.50), Together with F 0 (z) = F 0 (z), this shows that the denominator in (5.10) is non-zero for |z| < ρ 0 , and in particular for |z| ρ 1 . Thus G 1 (z) has the same dominant singularities with |z| = ρ 1 as F 1 (z), and it follows that, as n → ∞, Remark 5.1. In the periodic case, when F 1 (z) has k 1 singularities on the circle |z| = ρ 1 , it is easily verified that 1 − zΦ ′ F 1 (z) is a power series in z k and thus has the same value at all these singularities, cf. the full binary case above. 5.3. Higher moments. The generating functions G m for higher moments of S(T n ) can be found recursively by similar methods. The recursion becomes more complicated than for F m , however. We introduce the generating functions for mixed moments of the numbers of root subtrees and general subtrees Note that G m,0 = G m and G 0,ℓ = F ℓ . It follows from (1.2) (or the recursions below) that G m,ℓ (z) has the same radius of convergence ρ m+ℓ as F m+ℓ (z). Consider first, as examples, the cases with m+ℓ = 2. G 1,1 (z) is the generating function of the combinatorial class G 1,1 consisting of triples (T, T ′ , T ′′ ) where T ′ is a subtree and T ′′ a root subtree of T , counted with weights w(T ) determined by Φ(z) by (1.3) and (1.5).
Let (T, T ′ , T ′′ ) ∈ G 1,1 . Denote the children of the root o ∈ T by v 1 , . . . , v D , and let T 1 , . . . , T D be the corresponding fringe subtrees of T . The subtree T ′ is either a root subtree, and then (T, T ′ , T ′′ ) ∈ F 2 , or it is a subtree of one of the fringe trees T j . Furthermore, the root subtree T ′′ is determined by choosing for each fringe subtree T j either a root subtree or nothing. Hence, in the case (T, T ′ , T ′′ ) / ∈ F 2 , for some j 0 D, we choose either (T j 0 , T ′ j 0 , T ′′ j 0 ) ∈ G 1,1 or (T j 0 , T ′ j 0 ) ∈ G 1,0 ; at the same time, we choose for each j = j 0 either (T j , T ′′ j ) ∈ G 0,1 = F 1 or just T j ∈ G 0,0 = F 0 . Consequently, and thus Consequently, Similarly, G 2,0 is the class of triples (T, T ′′ 1 , T ′′ 2 ) where both T ′′ 1 and T ′′ 2 are general subtrees of T . The case when both T ′′ 1 and T ′′ 2 are root trees gives F 2 , and the case where, say, T ′′ 1 , is a root tree but T ′′ 2 is not gives G 1,1 \ F 2 , found above. Finally, if neither T ′′ 1 nor T ′′ 2 is a root tree, then they are determined by one subtree in a fringe tree T j 1 and one subtree in T j 2 , where j 1 and j 2 may be equal or not. This leads to, arguing as in (5.14)-(5.15), and thus Singularity analysis of (5.16) and (5.18) show that where all denominators are positive by (5.11).
The argument is easily extended to higher powers, and in principle any $G_{m,\ell}(z)$ can be found recursively by this method; however, the formulas become more and more complicated, and we see no simple general formula. On the other hand, we are really only interested in the singular parts, and thus we can ignore most terms. Furthermore, $\alpha_{m,\ell} := \varphi_{m,\ell}(\rho_{m+\ell})$ satisfies the recursion $\alpha_{0,\ell} = 1$ and, for $m \ge 1$, .
Proof. The case m = 0 is trivial, with φ 0,ℓ (z) = 1 and ψ 0,ℓ (z) = 0. Thus, let m 1. The combinatorial class G m,ℓ consists of sequences where T is a tree, counted with weight w(T ), each T ′ i is a subtree and each T ′′ j is a root subtree. Let k be the number of T ′ 1 , . . . , T ′ m that are not root subtrees. The case k = 0 gives F m+ℓ . Suppose 1 k m. Let again T 1 , . . . , T D be the fringe trees rooted at the children of the root o of T . Further suppose that the k non-root subtrees go into p 1 of T 1 , · · · , T D , which we call T j 1 , · · · , T jp . Then there are D p ways to select T j 1 , · · · , T jp . Suppose further that k i of the k non-root subtrees go into T j i . Given p positive integers k 1 , · · · , k p with k 1 + · · · + k p = k, there are k k 1 ,··· ,kp ways of choosing how the k non-root subtrees are divided among T j i , · · · , T jp . While T jr contains k r marked general subtrees, it also contains i m + ℓ − k marked root subtrees, which can be chosen in m+ℓ−k i ways. For any T j such that j / ∈ {j 1 , · · · , j p }, it too contains up to m + ℓ − k marked root subtrees. Hence, fixing k, p, k 1 , · · · , k p , D, we have the following term that contributes to G m,ℓ (z) (5.24) Summing over D 1 gives (5.25) In (5.25), m + ℓ − k m + ℓ − 1, and thus Φ (p) F m+ℓ−k (z) has radius of convergence at least ρ m+ℓ−1 . Similarly, k j + i m + ℓ with equality only if k j = k, and thus j = p = 1, and also i = m+ℓ−k; hence, except in the latter case, (5.25) has radius of convergence ρ m+ℓ−1 . Consequently, collecting all terms and lumping most of them together, taking into account that there are m k ways to choose the k non-root subtrees among T ′ 1 , · · · , T ′ m , where the denominator is non-zero for |z| < ρ m+ℓ−1 ρ ℓ by (5.11). The results follow by induction in m. Example 5.3. 
As an example of the recursion (5.26), note that by (5.15) and (5.17), G 2,0 (z) = F 2 (z) + 2zG 1,1 (z)Φ ′ F 1 (z) + 2zG 1,0 (z)Φ ′ F 1 (z) , (5.30) Returning again to the full binary trees, we find from (5.29)-(5.32) by Maple Therefore, as might be expected, we have a strong but not perfect correlation.
6. Average size of root subtrees n is the probability generating function of X n , where X n = |T ′ | for a pair (T, T ′ ) where T ′ is a root subtree of T , chosen uniformly at random from all such pairs in F 1 with |T | = n. F 1 (z, u) can be computed in the same way as in Section 4.