On the asymptotic internal path length and the asymptotic Wiener index of random split trees

The random split tree introduced by Devroye (1999) is considered. We derive a second order expansion for the mean of its internal path length and furthermore obtain a limit law by the contraction method. As assumptions we need that the splitter has a Lebesgue density and puts mass in every neighborhood of 1. We use properly stopped homogeneous Markov chains, for which limit results in total variation distance as well as renewal theory are applied. Furthermore, we extend this method to obtain the corresponding results for the Wiener index.


Introduction
The random split tree introduced by Devroye (1999) is a general tree model which, for special choices of its parameters, covers various random trees that are fundamental in computer science for their use as data structures, e.g. binary search trees, quadtrees, m-ary search trees, simplex trees and tries. Many characteristic quantities of these trees, such as node depths, height, path length or other distance measures between nodes, describe the complexity of algorithms that make use of the trees. For this reason, the asymptotic behavior of such quantities is studied in the probabilistic analysis of algorithms. Whereas such characteristic quantities are often studied one by one for each tree, Devroye's idea was to derive universal results valid for the whole class of his split tree model.
We recall the definition of the split tree from Devroye (1999). Four parameters b, s, s_0, s_1 ∈ N_0 are given, where b ≥ 2 is the branching factor and s > 0 is the vertex capacity, and s_0 and s_1 satisfy the two conditions 0 ≤ s_0 ≤ s and 0 ≤ bs_1 ≤ s + 1 − s_0.
Furthermore, a random vector V = (V_1, . . . , V_b) ∈ [0, 1]^b with Σ_{k=1}^b V_k = 1 is given. The random split tree of size n is obtained by distributing n balls to the nodes of the infinite b-ary tree according to the following procedure. For a node u of the b-ary tree let C(u) denote the number of balls already assigned to this node and N(u) the number of balls assigned to any node in the subtree rooted at u. For each node u take an independent copy V^(u) = (V^(u)_1, . . . , V^(u)_b) of V. Adding a ball to a tree rooted at u proceeds as follows:
a) If u is not a leaf (i.e. C(u) < N(u)), choose child i with probability V^(u)_i, increment N(u) by 1 and recursively add the ball to the subtree rooted at child i.
b) If u is a leaf and C(u) = N(u) < s, then add the ball to u, increment C(u) and N(u) by 1, and stop.
c) If u is a leaf but C(u) = N(u) = s, the node is split: we set N(u) = s + 1 and C(u) = s_0, place s_0 ≤ s randomly selected balls at u, give s_1 randomly selected balls to each of the b children of u and set C(v) = N(v) = s_1 for all children v of u. After that, we add each of the remaining s + 1 − s_0 − bs_1 ≥ 0 balls one by one, randomly and independently, to the subtree rooted at child i with probability V^(u)_i, applying the procedure recursively.
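For illustration, the insertion procedure a)–c) can be implemented directly. The following sketch (our own code with self-chosen names, not part of the original analysis) builds a split tree and checks that all n balls are accounted for, using the binary search tree parameters b = 2, s = s_0 = 1, s_1 = 0 with a uniform splitter.

```python
import random

class Node:
    """Node of the infinite b-ary tree; children are created lazily at split time."""
    def __init__(self):
        self.C = 0           # balls stored at this node
        self.N = 0           # balls stored in the subtree rooted here
        self.children = None
        self.V = None        # split vector V^(u), drawn when the node splits

def add_ball(u, b, s, s0, s1, draw_split):
    """Add one ball to the subtree rooted at u, following rules a)-c)."""
    while True:
        if u.children is not None:                  # rule a): internal node
            u.N += 1
            u = random.choices(u.children, weights=u.V)[0]
        elif u.N < s:                               # rule b): leaf with room
            u.C += 1
            u.N += 1
            return
        else:                                       # rule c): full leaf splits
            u.V = draw_split()
            u.children = [Node() for _ in range(b)]
            u.N = s + 1                             # all s+1 balls stay in this subtree
            u.C = s0
            for child in u.children:
                child.C = child.N = s1
            for _ in range(s + 1 - s0 - b * s1):    # leftover balls go down recursively
                i = random.choices(range(b), weights=u.V)[0]
                add_ball(u.children[i], b, s, s0, s1, draw_split)
            return

def split_tree(n, b, s, s0, s1, draw_split):
    root = Node()
    for _ in range(n):
        add_ball(root, b, s, s0, s1, draw_split)
    return root

def total_balls(u):
    return u.C + (0 if u.children is None else sum(total_balls(c) for c in u.children))

# Binary search tree: b = 2, s = s0 = 1, s1 = 0, V uniform on [0, 1].
def draw():
    u = random.random()
    return [u, 1.0 - u]

random.seed(1)
root = split_tree(500, 2, 1, 1, 0, draw)
assert total_balls(root) == 500 and root.N == 500
```

The final assertion checks the bookkeeping invariant that N at the root counts every ball ever inserted.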
Usually one assumes that V_i =_d V_1 =: V for all i = 2, . . . , b, where V is called the splitter and its distribution the splitting distribution. Here =_d denotes that the left and right hand sides have identical distributions. Whenever the functional under consideration is independent of the ordering of the tree, this assumption means no loss of generality; this can be seen by a random permutation argument already stated in Devroye (1999). In this paper we need some additional assumption. General assumption: Throughout this paper we assume that the distribution of V has a Lebesgue density f_V and that its distribution function satisfies F_V(x) < 1 for all x < 1.
As mentioned in the beginning, the random split tree models many common random trees. For instance, choosing s = s_0 = b − 1 for some b ≥ 2, s_1 = 0 and V = min{U_1, . . . , U_{b−1}}, where U_1, . . . , U_{b−1} are independent random variables uniformly distributed on [0, 1], one gets the random b-ary search tree. The random median-of-(2k + 1) binary search tree can be realized by setting b = 2, s = 2k, s_0 = 1, s_1 = k and V = median(U_1, . . . , U_{2k+1}). Also some digital data structures are covered by the split tree model. For V uniformly distributed on the deterministic set {p_1, . . . , p_b}, s = 1 and s_1 = 0 one obtains in the case s_0 = 0 the trie and in the case s_0 = 1 the digital search tree. In Table 1 of Devroye (1999) more examples of important tree models are listed together with the corresponding choices of the parameters. The general assumption, and with it the results of this paper, holds true for many of these examples, such as random binary search trees, random b-ary search trees, random quadtrees, random median-of-(2k + 1) binary search trees, random simplex trees, (extended) AB trees and random m-grid trees. The results are, however, not applicable to the common digital data structures such as tries and digital search trees.
The depth of the n-th ball in a random split tree, denoted by D_n, is the number of edges on the path from the ball to the root of the tree. The internal path length of balls in the split tree with n balls, denoted by P_n, is the sum of the depths of all n balls. The asymptotic expansion of the expectation of P_n was investigated for m-ary search trees in Mahmoud (1986), for random quadtrees by Flajolet et al. (1995) and for the median-of-(2k + 1) binary search tree by Chern and Hwang (2001) and Rösler (2001). In Holmgren (2010) the internal path length of random split trees is considered under the assumption that the splitting distribution is non-lattice; the first term and an upper bound on the second term of the asymptotic mean are derived using renewal theory. Limit theorems for the distribution of the path length are proved for the random binary search tree in Régnier (1989) and Rösler (1991) and for the random recursive tree in Dobrow and Fill (1999). Using the contraction method, Neininger and Rüschendorf (1999, Theorem 5.1) showed a universal limit theorem for the internal path length of random split trees under the assumption that the asymptotic expansion of the expectation of the internal path length is of the form

E[P_n] = µ^{−1} n log n + cn + o(n)   (1)

as n → ∞ for constants µ > 0 and c ∈ R. Therefore, it is of interest to characterize all splitting distributions providing an asymptotic expectation of the form (1). The first result of this paper is the following.
Theorem 1.1. Let P_n denote the internal path length in a random split tree of size n with branching factor b where the one-dimensional marginal distribution V of the splitting vector fulfills the general assumption. Then, with µ := −bE[V log V], there exists a constant c_p ∈ R with

E[P_n] = µ^{−1} n log n + c_p n + o(n)   (n → ∞).

To state the result which follows from the combination of the limit theorem from Neininger and Rüschendorf (1999) with Theorem 1.1 we introduce some notation. By M_{0,2} we denote the set of centered probability measures on R with finite second moments. We denote the distribution of a random variable X by L(X) or P^X. The Wasserstein metric ℓ_2 on M_{0,2} is defined by

ℓ_2(P, Q) := inf{ ∥X − Y∥_2 : L(X) = P, L(Y) = Q },

where the L_2-norm ∥·∥_2 is given by ∥X∥_2 = (E[X²])^{1/2}. For random variables X and Y we set ℓ_2(X, Y) := ℓ_2(L(X), L(Y)). It is well known that convergence with respect to the metric ℓ_2 is equivalent to weak convergence together with convergence of the second moments (see e.g. Bickel and Freedman (1981)).
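For the binary search tree this expansion can be made explicit: there µ = 1/2, and the exact mean E[P_n] = 2(n + 1)H_n − 4n, with H_n the n-th harmonic number, is classical. The following Monte Carlo sketch (our own illustration, not part of the original text) compares a simulation against this exact formula.

```python
import random

def bst_path_length(keys):
    """Insert keys into a binary search tree and return the sum of node depths,
    i.e. the internal path length P_n for the split-tree BST (one ball per node)."""
    root = None          # each node is a list [key, left, right]
    total = 0
    for x in keys:
        if root is None:
            root = [x, None, None]
            continue
        depth, node = 0, root
        while True:
            side = 1 if x < node[0] else 2
            depth += 1
            if node[side] is None:
                node[side] = [x, None, None]
                break
            node = node[side]
        total += depth
    return total

random.seed(7)
n, reps = 1000, 200
est = sum(bst_path_length([random.random() for _ in range(n)])
          for _ in range(reps)) / reps
H = sum(1.0 / k for k in range(1, n + 1))
exact = 2 * (n + 1) * H - 4 * n      # classical mean path length of a random BST
assert abs(est - exact) / exact < 0.05
```

The exact formula already exhibits the shape of Theorem 1.1: 2n log n plus a linear term.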
Corollary 1.2. Let P_n denote the internal path length in a random split tree of size n where the one-dimensional marginal distribution of the splitting vector (V_1, . . . , V_b) fulfills the general assumption. Define X_n := (P_n − E[P_n])/n. Then the following holds true: b) In particular, the convergence in a) implies c) Exponential moments exist and converge.

The Wiener index of a random split tree is defined as the sum of the distances between all unordered pairs of balls, where the distance between two balls is the minimum number of edges connecting the nodes to which the balls are associated. For trees, the two-dimensional vector consisting of the Wiener index and the internal path length satisfies a recursion formula similar to that of the internal path length alone. Using this recursion formula, Neininger (2002) proved a limit theorem for the Wiener index of the random binary search tree and the random recursive tree by means of the multivariate contraction theorem. In a final remark, Neininger (2002) mentioned that a limit theorem for the Wiener index of the general split tree can be proved in a similar way once the asymptotic expansion of its expectation is determined sufficiently well. We prove this asymptotic expansion and use the contraction method to obtain the limit theorem for the Wiener index of random split trees which fulfill the general assumption.
Theorem 1.4. Let W_n denote the Wiener index in a random split tree of size n with branching factor b where the one-dimensional marginal distribution V of the splitting vector fulfills the general assumption. Then there exists a constant c_w ∈ R with

E[W_n] = µ^{−1} n² log n + c_w n² + o(n²)   (n → ∞).

We denote by M²_{0,2} the set of centered probability measures on R² with finite second moments. The Wasserstein metric ℓ_2 on the set M²_{0,2} is defined similarly to the one-dimensional case.
Theorem 1.5. Let (W n , P n ) denote the vector consisting of the Wiener index and the internal path length of a random split tree of size n with branching factor b where the one-dimensional marginal distribution of the splitting vector (V 1 , . . . , V b ) fulfills the general assumption. Then the following holds true: a) We have as n → ∞, 2 ), and X (1) , . . . , X (b) , D, Z are independent.
Remark 1.6. The constant µ = −bE[V log V ] in the first order terms of the expectations of the internal path length and of the Wiener index appears already in the results about the height and depth in Devroye (1999). There, the explicit values of this constant for the individual splitting distributions are given in Table 2.
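As a small numerical illustration (our own sketch, not part of the original text), the constant µ = −bE[V log V] can be evaluated by quadrature from the splitting density. For the binary search tree (V uniform, b = 2) this gives µ = 1/2, and for the median-of-3 binary search tree (density 6x(1 − x), b = 2) it gives µ = 7/12.

```python
import math

def mu(density, b, m=200000):
    """Approximate mu = -b * E[V log V] = -b * int_0^1 f_V(x) x log(x) dx
    by the midpoint rule with m subintervals."""
    h = 1.0 / m
    s = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        s += density(x) * x * math.log(x)
    return -b * s * h

# Binary search tree: V uniform on [0, 1]  ->  mu = 1/2.
assert abs(mu(lambda x: 1.0, 2) - 0.5) < 1e-4
# Median-of-3 BST: V = median of three uniforms, density 6x(1-x)  ->  mu = 7/12.
assert abs(mu(lambda x: 6.0 * x * (1.0 - x), 2) - 7.0 / 12.0) < 1e-4
```

Both closed-form values can be checked by elementary integration of x^k log x.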
Remark 1.7. Besides the internal path length for the balls considered here, there is also the internal path length for the nodes where the depths of all nodes are summed up. Since there can be up to s balls in one node, these two path lengths may differ. In Holmgren (2010), the relation between the two versions is investigated. Let N n denote the number of nodes in the random split tree with n balls. Assuming that the distribution of − log V is non-lattice, P (V = 1) = P (V = 0) = 0 and for some constant α > 0 and ε > 0, Holmgren (2010) showed that Theorem 1.1 implies the similar asymptotic behavior for the internal path length for the nodes in that random split tree. This finally yields the general limit theorem for the internal path length for the nodes in split trees which additionally fulfil equation (3). For instance, Mahmoud and Pittel (1989) showed the stronger result E[N n ] = αn + O(n 1−ε ) in the case of the b-ary search tree. It seems that there are no results on the corresponding alternative version of the Wiener index in terms of the node-to-node distances.
The internal path length and the Wiener index have also been considered for random trees that do not belong to the class of split trees. A universal limit law for the path length of simply generated trees is proved in Janson (2003), where the limit distribution is given as a function of the Brownian excursion; furthermore, the moments of the limit are derived. For the class of random increasing trees, which covers in particular the random recursive tree and the plane-oriented recursive tree, the second order asymptotics of the expectation of the internal path length are derived in Bergeron et al. (1992). In Munsonius and Rüschendorf (2010) the asymptotic behavior of the expectation and a limit theorem for the internal path length of random b-ary trees with weighted edges are proved. By special choices of the edge weights, the analogous results are obtained for the class of random linear recursive trees, which encompasses in particular the random plane-oriented recursive tree. Tail bounds for the Wiener index of random binary search trees have been considered by Ali Khan and Neininger (2007).

For a random split tree with n balls we denote by I_n = (I_{n,1}, . . . , I_{n,b}) the vector of the sizes of the subtrees rooted at the children of the root, i.e. the numbers of balls assigned to nodes in these subtrees. By the construction of the split tree it follows that, conditionally given V, the vector I_n is distributed as (s_1, . . . , s_1) plus a multinomial vector with η_n trials and success probabilities (V_1, . . . , V_b); in particular,

I_{n,1} given V_1 is distributed as s_1 + Bin(η_n, V_1),   (4)

where we set η_n := n − s_0 − bs_1. Throughout this paper, Bin(m, x) denotes a random variable with binomial distribution with parameters m ∈ N and x ∈ [0, 1]. The proofs of Theorem 1.1 and Theorem 1.4 are based on a method developed in Bruhn (1996) for recurrences where the toll function is bounded. In Section 2, we recall definitions and results of Bruhn (1996) and extend his method to the case of an unbounded toll function. We check the conditions of this method in the case of the random split tree in Section 3.
Section 4 is devoted to the application in the case of the internal path length and the proof of Theorem 1.1. In Section 5 we give the proofs of Theorem 1.4 and Theorem 1.5 concerning the Wiener index.
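The conditional distribution of the subtree sizes can be illustrated by direct sampling. In the sketch below (our own code), each of the η_n = n − s_0 − bs_1 free balls is sent to child i with probability V_i, which is our reading of the multinomial representation behind (4); the identity s_0 + Σ_i I_{n,i} = n then holds deterministically.

```python
import random

def subtree_sizes(n, b, s0, s1, V):
    """Draw I_n = (I_{n,1}, ..., I_{n,b}) given the split vector V:
    each of the eta_n = n - s0 - b*s1 free balls goes to child i w.p. V_i."""
    eta = n - s0 - b * s1
    counts = [s1] * b
    for _ in range(eta):
        i = random.choices(range(b), weights=V)[0]
        counts[i] += 1
    return counts

random.seed(3)
n, b, s0, s1 = 1000, 3, 1, 0
u = sorted(random.random() for _ in range(b - 1))
V = [u[0], u[1] - u[0], 1.0 - u[1]]   # uniform spacings: a random probability vector
I = subtree_sizes(n, b, s0, s1, V)
# The s0 balls at the root plus the subtree sizes account for all n balls.
assert s0 + sum(I) == n
assert all(x >= s1 for x in I)
```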
Acknowledgement. The author is grateful to Ralph Neininger for several pointers to the literature and for comments on previous versions of this paper, and to Nicolas Broutin for helpful discussions and for making a preliminary manuscript of the paper Broutin and Holmgren (2011) on the internal path length of split trees available to him. Furthermore, he thanks an anonymous referee for valuable suggestions for improving the paper.

The setting of Bruhn
Starting from recursion formulas of the form

H_n = r(n) + Σ_{k=0}^{n−1} ν_n({k}) H_k,

where ν_n is a probability measure on {0, . . . , n − 1} for all n ∈ N, the main idea of Bruhn (1996) is to define a homogeneous Markov chain (S_t)_{t∈N_0} whose transition probabilities are given for n > 0 by

P(S_{t+1} = −log k | S_t = −log n) = ν_n({k})

and P(S_1 = 1 | S_0 = 1) = 1. Now, let σ(n_1) := inf{t | S_t > −log n_1} be the stopping time at which the Markov chain exceeds −log n_1, for n_1 ∈ N. Then Bruhn proved the representation formula given in the following lemma. (Since the PhD thesis of Bruhn does not seem to be available in English, the proofs of Bruhn (1996) are stated in Appendix B.) We denote by Y_t := S_t − S_{t−1} the increments of S. For x ∈ E we write P_x(·) in short for P(· | S_0 = x) and correspondingly E_x[·] for the expectation with respect to the measure P_x. We denote by F_x the distribution function of the first increment Y_1 under P_x.

Lemma 2.1. Let (H_n) be a sequence of real numbers satisfying H_n = r(n) + Σ_{k=0}^{n−1} ν_n({k}) H_k for some function r. Then, with the notations above, for any n_1 ∈ N,

H_n = E_{−log n}[H_{exp(−S_{σ(n_1)})}] + E_{−log n}[ Σ_{t=0}^{σ(n_1)−1} r(exp(−S_t)) ].   (5)

To analyze the Markov chain (S_t)_{t∈N} we consider in the following a general state space E ⊂ R.
In the case of an AR-process, the theorem of dominated convergence implies that the integrability condition is equivalent to ∫ t dF̄_a(t) < ∞ for some a ∈ R, where F̄_a(t) := inf_{x≤a} F_x(t).
The first summand in (5) can be handled by considering the distribution of S σ(n 1 ) . The following key result is implicitly given in Rösler (2001) in a more general setting. The essential part of Rösler (2001) which gives the proof is stated in Appendix A in a self-contained way. For probability measures P and Q, let d TV (P, Q) denote their total variation distance. Moreover, we define τ (d) := inf{t : S t ≥ d}.
Lemma 2.4. Let (S_t)_{t∈N} be an AR-process which fulfills the integrability condition with a discrete state space.

The asymptotic behavior of the second summand in (5) can be analyzed by using the elementary renewal theorem. Since the Markov chain (S_t)_{t∈N} is not a renewal process, we couple it with three renewal processes using the functions F, F̄_a and F_a. Because of the convergence lim_{x→−∞} F_x(t) = F(t), the functions F̄_a and F_a are again distribution functions.
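The elementary renewal theorem used here states that the expected number of steps a renewal process with positive increments needs to exceed a level d behaves like d/E[Y]. A small simulation (our own illustration, with exponential increments as an arbitrary choice) makes this concrete.

```python
import random

def steps_to_exceed(d, draw_increment):
    """tau(d) = inf{t : S_t >= d} for the renewal process S_t = Y_1 + ... + Y_t."""
    s, t = 0.0, 0
    while s < d:
        s += draw_increment()
        t += 1
    return t

random.seed(11)
mean_y = 2.0
draw = lambda: random.expovariate(1.0 / mean_y)   # positive increments, E[Y] = 2
d = 200.0
est = sum(steps_to_exceed(d, draw) for _ in range(2000)) / 2000.0
# Elementary renewal theorem: E[tau(d)] / d  ->  1 / E[Y] = 0.5 as d grows.
assert abs(est / d - 1.0 / mean_y) < 0.05
```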
Considering the AR-process (S_t) from above, there exists a sequence of independent random variables (U_r)_{r∈N}, uniformly distributed on [0, 1], such that the coupled increments are monotone as a → −∞; both sequences converge almost surely to Ỹ_r. Finally, we define the following stopping times for a, d ∈ R: Using the renewal process (S̃_t)_{t∈N}, Bruhn (1996) shows the following result. (The proof is given in Appendix B.)

Lemma 2.5 (Bruhn (1996), Lemma 3.4). Consider an AR-process (S_t) with the notations above. Then there exist a real number a* and a positive real number û(a*) such that for all measurable functions l : R → R_+, all real numbers y, z and all x ∈ E with x < y < z < a* we have

To investigate also recurrences where the toll function r is not bounded, as is the case for the Wiener index, we complement the results of Bruhn by the following lemma and corollary.
Lemma 2.6. It holds for all decreasing continuous functions l : R → R_+ and

Proof. First, we consider the sequence (S̄^(a)_t). By the construction we know that for each s, t ∈ N the mappings a → Ȳ^(a)_s and a → S̄^(a)_t − S̄^(a)_0 are decreasing and converge almost surely to Ỹ_s and S̃_t − S̃_0 as a → −∞. This yields that for d ∈ R the mapping a → γ̄^(a)(d) is increasing and bounded from above by γ̄(d). It is easy to see that γ̄^(a)(d) → γ̄(d) almost surely as a → −∞. Since γ̄^(a)(d) ∈ N for all a ∈ R and l is continuous, we obtain as a → −∞ almost surely Furthermore, the left-hand side is increasing as a → −∞ and where we use that l is decreasing. The positivity of Ỹ_s ensures by Gut (1988, Chapter II, Theorem 3.1) that E[γ̄(d)] < ∞ and the claim follows for the first sum.
With the same arguments, we have almost sure convergence as a → −∞, and the left-hand side is decreasing. The monotone convergence theorem provides lim_{a→−∞} E[Y^(a)_t] = E[Ỹ_t] > 0, hence E[Y^(a)_t] > 0 for a ∈ R small enough, and the elementary renewal theorem (see e.g. Gut, 1988, Section II.4) implies E[γ^(a)(d)] < ∞. So, the claim follows from (8) by the monotone convergence theorem. ✷ Choosing l(x) = exp(−αx) with α > 0 yields the following result.
Corollary 2.7. For α, d > 0 there exists a constant c ∈ R such that for each ε > 0 there exists n 0 ∈ N with for all n ≥ n 0 .

Proof. By construction we have for
For ε > 0, Lemma 2.6 provides a * ∈ R such that for all a < a * we have We choose n 0 such that − log n 0 + d ≤ a * . Since we have for n ≥ n 0 the claim follows using Lemma 2.6 once more. ✷

Recurrences for the random split tree
We consider a random split tree with the notation introduced in Section 1 and set

ν_n({k}) := (bk/n) P(I_{n,1} = k) + (s_0/n) 1_{{k = n − s_0}}.

This defines a probability measure ν_n on the set {0, . . . , n − s_0}, which is seen by summing up all its values. For the rest of the paper, we consider the Markov chain (S_t)_{t∈N} from Section 2 where the transition probabilities are given by this special choice of ν. In this section, we prove that for this choice the conditions of the lemmata of the previous section are fulfilled.
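The normalization of ν_n can be checked concretely in the binary search tree case, where I_{n,1} is uniformly distributed on {0, . . . , n − 1} (a classical fact, obtained by integrating the binomial probabilities over the uniform splitter). The sketch below (our own code, not part of the original text) verifies numerically that ν_n sums to 1.

```python
def nu(n, b=2, s0=1, s1=0):
    """nu_n({k}) = (b k / n) P(I_{n,1} = k) + (s0/n) 1{k = n - s0},
    in the binary search tree case, where I_{n,1} is uniform on {0, ..., n-1}:
    P(I_{n,1} = k) = int_0^1 C(eta,k) v^k (1-v)^(eta-k) dv = 1/(eta+1)."""
    eta = n - s0 - b * s1
    p = 1.0 / (eta + 1)
    out = {}
    for k in range(s1, eta + s1 + 1):
        out[k] = b * k / n * p
    out[n - s0] = out.get(n - s0, 0.0) + s0 / n
    return out

m = nu(50)
# nu_n is a probability measure: sum_k (b/n) k P(I_{n,1}=k) + s0/n
#   = (b E[I_{n,1}] + s0)/n = n/n = 1.
assert abs(sum(m.values()) - 1.0) < 1e-12
```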

The distribution of the subtree size
In doing so, we frequently use the fact that the size of the first subtree, properly rescaled, converges.
Lemma 3.1. For ε > 0 we have In particular, this yields

Proof. Starting from the distribution of I_{n,1} given in (4), we obtain by Bernstein's inequality Since |I_{n,1}/n − V| ≤ 1, this yields for the expectation

✷
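The concentration statement of Lemma 3.1 can be illustrated by simulation. The sketch below (our own code, not part of the original analysis) samples I_{n,1} = s_1 + Bin(η_n, V) with V uniform and shows the mean absolute deviation |I_{n,1}/n − V| shrinking, consistent with a rate of order n^{−1/2}.

```python
import random

def mean_abs_dev(n, s0=1, s1=0, b=2, reps=500):
    """Monte Carlo estimate of E|I_{n,1}/n - V| with V uniform on [0,1]
    and I_{n,1} = s1 + Bin(eta_n, V), where eta_n = n - s0 - b*s1."""
    eta = n - s0 - b * s1
    total = 0.0
    for _ in range(reps):
        v = random.random()
        i1 = s1 + sum(1 for _ in range(eta) if random.random() < v)
        total += abs(i1 / n - v)
    return total / reps

random.seed(5)
d100, d2500 = mean_abs_dev(100), mean_abs_dev(2500)
# Consistent with a rate of order n^{-1/2}: roughly a factor 5 between n=100 and n=2500.
assert d2500 < d100 / 2
```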
At this point, we prove some asymptotic expansions needed later.

Proof. It is Furthermore, we have by Lemma 3.1 that I_{n,1}/n → V in probability. Since x → x^k log x is bounded on the interval [0, 1], we obtain for k = 1, 2 This implies the corresponding asymptotics for E[I_{n,1}^k log(I_{n,1}/n)]. On the other hand we have E[I_{n,1}^k log(I_{n,1}/n)] = E[I_{n,1}^k log I_{n,1}] − E[I_{n,1}^k] log n.

The Markov chain for the random split tree
Now, we consider the Markov chain from Section 2 with the transition probabilities given by ν_n({k}) = (bk/n) P(I_{n,1} = k) + (s_0/n) 1_{{k = n − s_0}}.

Lemma 3.3. The process (S_t)_{t∈N_0} is an AR-process and the corresponding set of distributions {F_x} fulfills the integrability condition.
Proof. Since ν_n is a probability measure on the set {0, . . . , n − s_0} we have Y_t > 0 for all t. For x = −log n we have by dominated convergence and Lemma 3.1 for any y ∈ R Moreover, we obtain with Fubini's theorem This yields 0 < ∫ t dF(t) < ∞. It remains to show the integrability condition, which means ∫ t dF̄_a(t) < ∞ for some a ∈ R, where F̄_a(t) := inf_{x≤a} F_x(t). Using again Fubini's theorem we obtain

Proof. In the previous proof we have already shown that (S_t)_{t∈N} is an AR-process which fulfills the integrability condition. The state space E = {−log n | n ∈ N} ∪ {1} is discrete. It remains to show conditions (7). Let x = −log n and y = −log m with m < n. It is We will show that there exist 0 < ᾱ < β̄ < 1 such that for n large enough

0 < Σ_{k=⌈ᾱn⌉+s_1}^{⌊β̄n⌋+s_1}

For k = cn + o(n) with c ∈ (0, 1) and n → ∞ we have Hence, inequality (11) and equation (10) will imply for some ε > 0. The condition |x − y| ≤ K is equivalent to m ≥ e^{−K}n. By the general assumption, the distribution of V has a Lebesgue density f_V. Thus, there exists z̃ ∈ (0, 1) with f_V(z̃) > 0. Theorem 3 in Section 1.7.2 of Evans and Gariepy (1992) (a corollary of the Lebesgue–Besicovitch differentiation theorem) implies that we can find a non-empty interval (α, β) ⊂ (0, 1) and ε_1 > 0 such that λ({z ∈ (α, β) | f_V(z) < ε_1}) = 0, with λ the Lebesgue measure. Now we can choose ε_2 > 0 and K > 0 with ᾱ := α + ε_2 < e^{−K}(β − ε_2) =: β̄. We will show that for n large enough, for all k ∈ [ᾱn + s_1, β̄n + s_1] ∩ N and for all l ∈ [e^{−K}n, n] ∩ N it holds First, we consider the function g : z → z^{k−s_1}(1 − z)^{η_l−k+s_1}. Integration by parts yields For k = cη_l + s_1 the function g attains its maximum at ẑ = c, is increasing on the interval [0, c] and decreasing on [c, 1]. Therefore, we have for any Stirling's formula yields Considering the derivative of g̃_c in a neighborhood of 0, we obtain g̃_c(x) < g̃_c(0) ≤ 1 for all x ≠ 0 with |x| small enough.
More precisely, for all c ∈ [ᾱ, β̄] and ε_3 > 0 small enough we have g̃_c(ε_3)/g̃_c(0) ∈ (0, C) for some constant C < 1. Thus, for ε_3 > 0 small enough and l large enough we have Together with (12), this implies for some 0 < ε_3 < ε_2, l large enough and We obtain for any k ∈ [ᾱn + s_1, β̄n + s_1] ∩ N and l ∈ [e^{−K}n, n] ∩ N, when n is large enough, This finally yields (11). As in the proof of Lemma 3.3 we see that Since e^{−K} < 1, the general assumption F_V(x) < 1 for all x < 1 implies bE[V 1_{{V ≥ e^{−K}}}] > 0. This shows the second condition and the proof is finished. ✷

The internal path length
After these preliminaries, we are now able to prove Theorem 1.1. To this end we have to show that the sequence H_n := (E[P_n] − µ^{−1} n log n)/n converges. The internal path length P_n satisfies a recursive representation (see e.g. Neininger and Rüschendorf, 1999, equation (50)) from which we get

This recursion formula implies
with t(n) = (1/n)(n − s_0 − µ^{−1} n log n + bµ^{−1} E[I_{n,1} log I_{n,1}]) and ν_n({k}) as in the previous section. From the result about the mean of the depth in Devroye (1999) we know H_n ≤ C log n for some constant C > 0. Therefore, we have for any δ_1 ∈ (0, 1) Furthermore, because of n = bE[I_{n,1}] + s_0, we have The function x → x log x is Hölder continuous. Using this, the rate of convergence of E[|I_{n,1}/n − V|] from Lemma 3.1 and Jensen's inequality, we obtain t(n) = O(n^{−δ_2}) for some δ_2 > 0. Taking all this into account, we get

H_n = r(n) + Σ_k ν_n({k}) H_k,   (13)

where r(n) = O(n^{−δ}) for some δ ∈ (0, 1].

Proof of Theorem 1.1. Equation (13) shows that the condition of Lemma 2.1 is fulfilled. Thus, we start with the representation of H_n = (E[P_n] − µ^{−1} n log n)/n from there and show that (H_n)_{n∈N} is a Cauchy sequence. Let ε > 0 be given. For the second term in (5) we keep in mind that we have already shown |r(n)| ≤ Cn^{−δ} for some constant 0 < C < ∞ and δ ∈ (0, 1]. We define l : R → R_+ by l(x) := exp(δx). As in the proof of Theorem 4.2 in Bruhn (1996) we obtain with Lemma 2.5 for n_1 ∈ N with −log n_1 ≤ a* Since ∫_{−∞}^0 l(t) dt < ∞, we can choose n_1 ∈ N such that for all n, m > n_1 Considering the first term in (5), we set a(n_1, n) := E_{−log n}[H_{exp(−S_{σ(n_1)})}] and claim that there exists n_0 such that for all n, m ≥ n_0 we have |a(n_1, n) − a(n_1, m)| ≤ ε/2. It is Since n_1 is fixed, we have sup_{k∈{0,...,n_1}} |H_k| ≤ C < ∞ for some constant C ∈ R. Lemma 2.4 in combination with Lemma 3.4 yields the claim. Taking everything into account, we obtain for all n, m ≥ max{n_0, n_1} This shows that (H_n)_{n∈N} is a Cauchy sequence and thus it converges. ✷

Proof of Corollary 1.2. Parts a), c) and d) of Corollary 1.2 are immediate consequences of Theorem 1.1 and Neininger and Rüschendorf (1999, Theorem 5.1). To prove part b), we use that convergence with respect to the ℓ_2-metric implies convergence of the second moments.
Thus, we obtain as a consequence of part a) that lim_{n→∞} E[X_n²] = E[X²]. Using the distributional fixed-point equation characterizing X, we have where we used the independence of (V_1, . . . , V_b) and (X^(1), . . . , X^(b)) as well as the fact that E[X^(k)] = 0 for all k. Since µ = −bE[V_i log V_i] for all i = 1, . . . , b and E[X²] = E[(X^(k))²] =: σ², the claim follows. ✷

The Wiener index
We now turn to the investigation of the Wiener index. To handle the Wiener index similarly to the internal path length, we first need a recursion formula for it. The Wiener index is the sum of the distances between all unordered pairs of balls in the tree. Let ∆_{k,l} denote the distance between the balls k and l. Then we have W_n = Σ_{k<l} ∆_{k,l}.
Subdividing the sum into the sum over all pairs where both balls are located in the same subtree and the sum over all other pairs, we obtain a decomposition in which W^(i)_{I_{n,i}} denotes the Wiener index of the i-th subtree T_{n,i}, being of size I_{n,i}. For k ∈ T_{n,i} and l ∈ T_{n,j} with i ≠ j we have ∆_{k,l} = D^(i)_k + D^(j)_l + 2, where D^(i)_k is the depth of the ball k with respect to the subtree T_{n,i}. By the symmetry of ∆_{k,l} we can sum up only the first part D^(i)_k + 1, but over all ordered pairs of balls, and we obtain Σ_{i≠j} Σ_{l∈T_{n,j}} Σ_{k∈T_{n,i}} (D^(i)_k + 1). The summation over k ∈ T_{n,i} yields P^(i)_{I_{n,i}} + I_{n,i}, where P^(i)_{I_{n,i}} denotes the internal path length of the i-th subtree T_{n,i}. Since there are altogether n − I_{n,i} balls not lying in T_{n,i}, we finally obtain the recursion formula for the Wiener index of the random split tree with n balls:

W_n = Σ_{i=1}^{b} W^(i)_{I_{n,i}} + Σ_{i=1}^{b} (n − I_{n,i})(P^(i)_{I_{n,i}} + I_{n,i}).   (14)

Proof of Theorem 1.4. Starting from equation (14) and taking expectations yields Substituting the results from Lemma 3.2 in (16) provides (17). We set H_n := (E[W_n] − µ^{−1} n² log n)/n.
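The recursion formula just derived, W = Σ_i W^(i) + Σ_i (n − I_i)(P^(i) + I_i), can be checked on any fixed tree. The sketch below (our own illustrative code, not part of the original analysis) computes the Wiener index of a small ball-labelled binary tree once by brute force over all pairs of balls and once via the root decomposition.

```python
def balls_with_paths(tree, path=()):
    """A tree is (ball_count, [children]); return one root-relative path per ball."""
    count, children = tree
    out = [path] * count
    for i, child in enumerate(children):
        out += balls_with_paths(child, path + (i,))
    return out

def common(p, q):
    k = 0
    while k < len(p) and k < len(q) and p[k] == q[k]:
        k += 1
    return k

def wiener_brute(tree):
    """Sum of tree distances over all unordered pairs of balls."""
    paths = balls_with_paths(tree)
    return sum(len(paths[a]) + len(paths[b]) - 2 * common(paths[a], paths[b])
               for a in range(len(paths)) for b in range(a + 1, len(paths)))

def path_length(tree):
    return sum(len(p) for p in balls_with_paths(tree))

def size(tree):
    return len(balls_with_paths(tree))

# A small binary split tree with one ball per occupied node (s0 = 1 at the root).
t = (1, [(1, [(1, []), (1, [])]),
         (1, [(1, []), (0, [])])])
n = size(t)
rec = (sum(wiener_brute(sub) for sub in t[1])
       + sum((n - size(sub)) * (path_length(sub) + size(sub)) for sub in t[1]))
assert wiener_brute(t) == rec
```

For this example both computations give 32, and the identity holds for any tree, including pairs involving the balls stored at the root (which contribute only through the other ball's term).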
To prove Theorem 1.4 it suffices to show that for each ε > 0 there exist a constant c ∈ R and n_0 ∈ N such that H_n/n ∈ (c − ε, c + ε) for all n ≥ n_0.
So, let ε > 0 be given. Substituting H n in (17) and using Lemma 3.2 yields We start again with the second term and split it in the following way r(exp(−S t )).
For the second summand we obtain by Lemma 2.5, with l(x) := exp(d − x) and n_1 large enough such that −log n_1 ≤ a*, a bound with some constant C. We choose d large enough such that Ce^{−d+3} < ε/3. For this d, Corollary 2.7 yields n̄_0 ∈ N such that for all n ≥ n̄_0 for some constant c. As in the proof of Theorem 1.1, the first summand in (18) is a Cauchy sequence, i.e. there exists ñ_0 ∈ N such that for all n ≥ ñ_0 we have (1/n) E_{−log n}[H_{exp(−S_{σ(n_1)})}] < ε/3.
Altogether, we have seen that for n_1 ∈ N with −log n_1 ≤ a* there exists n_0 ∈ N such that for all n ≥ n_0 we have the claimed bound with the constant c in (19). Thus, the claim follows. ✷

Proof of Theorem 1.5. We define w_n := E[W_n] = µ^{−1} n² log n + c_w n² + o(n²), p_n := E[P_n] = µ^{−1} n log n + c_p n + o(n) and X_n := ((W_n − w_n)/n², (P_n − p_n)/n)^T.
For i ∈ {1, . . . , b} let X^(i)_n be an independent copy of X_n. Since the subtrees of the random split tree are independent conditioned upon their sizes, we obtain from (14) for the standardized vector X_n the following recursion formula This yields with By similar arguments we have b^(n) In order to use the contraction method as in Neininger (2001, Theorem 4.1) it suffices to show that for n → ∞ where ∥·∥_op is the operator norm. By Lemma 3.1 we know that I_n/n converges in probability to V := (V_1, . . . , V_b), the splitting vector. By equations (20) and (21) By the boundedness of the function x → x log x on [0, 1] and as I_{n,i}/n ∈ [0, 1], there exists a constant C such that Thus, we get the uniform integrability of (b^(n)_2)² and consequently the convergence of b^(n) with respect to the ℓ_2-metric. Similar arguments yield the convergence of A^(n)_i with respect to the ℓ_2-metric to A*_i, where we used Bernstein's inequality. It remains to show (24). Solving the characteristic equation for the matrix (A*_i)^T A*_i, we obtain that its eigenvalue λ(V_i) of larger absolute value is given by The claim for the asymptotic behavior of the variance of W_n follows directly from the first part, since convergence with respect to the ℓ_2-metric implies convergence of the second moments. ✷

We give the essential parts of Rösler (2001) which prove Lemma 2.4.
Proof of Lemma 2.4. Let a ∈ R_−. We use the notation ∆(a) := lim Since the function is increasing and non-negative, the limit for x_0 → −∞ exists. We will show that ∆(a) ≤ (1 − ε̄)∆(a) + δ for some ε̄ > 0 and all δ > 0; then the claim follows. Let δ > 0 be an arbitrary number. Since the process S fulfills the integrability condition and S_{τ(y)} − y ≤ S_{τ(y)} − S_{τ(y)−1}, there exists for some constant C Thus, there exists K_1 ≥ K such that for all y < x_1 Furthermore, we have for this K_1 The distribution of the Markov chain S on the state space E is given by the kernel Let S^(a) be the process S stopped at the moment when it exceeds a ∈ E. The kernel κ_a corresponding to the process S^(a) is then given by κ_a(x, A) = κ(x, A) for x ≤ a and κ_a(x, A) := 1_A(x) for x > a, for all A ⊂ E.
The property c) follows from assumption (7) and the fact that For (x, y) ∈ E² let Z^(x,y) = (U^(x,y), V^(x,y)) be the Markov chain generated by the kernel ϱ which starts in (x, y). We define the stopping time θ(a) := inf{t | Z^(x,y)_t ∈ (a, ∞) × (a, ∞)}.
Using this coupling we obtain for any K_2 > 0 and z, y < a In the last step we used that P_u(S_{τ(a)} = w) − P_v(S_{τ(a)} = w) = 0 for u = v. As seen in equation (25) and using property a) of the coupling, there exists by the integrability condition K_2 > K such that for all y < x_1 − K and y < z < y + K

P(Z^(z,y)_1 ∉ (−∞, y + K_2]²) ≤ κ_a(z, (−∞, y + K_2]^c) + κ_a(y, (−∞, y + K_2]^c).

After these preliminaries, we now turn to ∆(a). It is for x < y < a − K With the results in (25), (26), (27) and (28) as well as property c) of the kernel ϱ this finally yields ∆(a) ≤ (1 − ε̄)∆(a) + δ, as claimed.

Proofs of Bruhn (1996)

Proof of Lemma 2.1. For n ≤ n_1 the claim follows immediately since σ(n_1) = 0. For n > n_1 equation (5)

Proof of Lemma 2.5. We use the notation from Section 2 and define for x ∈ R_− the function u_x by u_x(a) := E_x[|{t : S_t ∈ (a, a + 1]}|]. By the monotone convergence theorem we have lim_{a→−∞} E[Y^(a)_t] = E[Ỹ_t] > 0. Thus, there exists a* ∈ R such that for all a < a* it is E[Y^(a)_t] > 0. For x, n, a < a* and k ∈ N it holds

P_x(|{t : S_t ∈ (n − 1, n]}| ≥ k) = ∫_{(n−1,n]} P_y(S_{k−1} ≤ n) dP

Since E[Y^(a)_t] > 0, the elementary renewal theorem (see e.g. Gut, 1988, Section II.4) provides û(a) < ∞. Furthermore, the function a → û(a) is decreasing as a → −∞, i.e. û(a) ≤ û(a*) for all a < a*. So we finally obtain for a function l : R → R_+, y, z ∈ R and x ∈ E with x < y < z < a*