On the tails of the limiting Quicksort distribution

We give asymptotics for the left and right tails of the limiting Quicksort distribution. The results agree with, but are less precise than, earlier non-rigorous results by Knessl and Szpankowski.


Introduction
Let X_n be the number of comparisons used by the algorithm Quicksort when sorting n distinct numbers, initially in a uniformly random order. Equivalently, X_n is the internal path length in a random binary search tree with n nodes. (See e.g. Knuth [7].) Then, for n ≥ 1,

X_n =^d X_{U_n − 1} + X^*_{n − U_n} + n − 1, (1.1)

where =^d denotes equality in distribution, and, on the right, U_n is distributed uniformly on the set {1, . . . , n}, X^*_j =^d X_j, X_0 = 0, and U_n, X_0, . . . , X_{n−1}, X^*_0, . . . , X^*_{n−1} are all independent. (Thus, (1.1) can be regarded as a recursive definition of the distribution of X_n.) It is well-known, and easy to show from (1.1), that

E X_n = 2(n + 1)H_n − 4n ∼ 2n ln n, (1.2)

where H_n := ∑_{k=1}^{n} k^{−1} is the n-th harmonic number. Moreover, it was proved by Régnier [9] and Rösler [10], using different methods, that the normalized variables

Z_n := (X_n − E X_n)/n (1.3)

converge in distribution to some limiting random variable Z, as n → ∞.
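As a sanity check on (1.1) and (1.2), the recurrence can be simulated directly. The following Python sketch (an illustration, not part of the paper; n and the number of repetitions are arbitrary) samples X_n via (1.1) and compares the empirical mean with the exact formula 2(n + 1)H_n − 4n.

```python
import random

def quicksort_comparisons(n, rng):
    # Sample X_n via (1.1): a uniform pivot rank U_n splits the n keys into
    # subproblems of sizes U_n - 1 and n - U_n, at a cost of n - 1 comparisons.
    if n <= 1:
        return 0  # X_0 = X_1 = 0
    u = rng.randint(1, n)  # pivot rank, uniform on {1, ..., n}
    return (n - 1) + quicksort_comparisons(u - 1, rng) + quicksort_comparisons(n - u, rng)

def mean_formula(n):
    # E X_n = 2(n + 1) H_n - 4n, with H_n the n-th harmonic number, cf. (1.2)
    h_n = sum(1.0 / k for k in range(1, n + 1))
    return 2 * (n + 1) * h_n - 4 * n

rng = random.Random(1)
n, reps = 100, 5000
empirical = sum(quicksort_comparisons(n, rng) for _ in range(reps)) / reps
print(empirical, mean_formula(n))  # the two values agree to within about 1%
```

The recursion mirrors (1.1) exactly, so the agreement of the two printed values is a direct check of (1.2).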
There is no simple description of the distribution of Z, but various results have been shown by several different authors. For example, Z has an everywhere finite moment generating function, and thus all moments are finite [10], with E Z = 0 and Var Z = 7 − (2/3)π²; furthermore, Z has a density which is infinitely differentiable [11; 2]. Moreover, the recurrence relation (1.1) yields in the limit a distributional identity, which can be written as

Z =^d U Z′ + (1 − U) Z″ + g(U), (1.4)

where U, Z′ and Z″ are independent, U ∼ U(0, 1) is uniform, Z′, Z″ =^d Z, and g is the deterministic function

g(u) := 2u ln u + 2(1 − u) ln(1 − u) + 1. (1.5)

Furthermore, Rösler [10] showed that (1.4) together with E Z = 0 and Var Z < ∞ determines the distribution of Z uniquely; see further [3]. The identity (1.4) is the basis of much of the study of Z, including the present work.

Date: 28 August, 2015; minor revision 26 September, 2015.
Partly supported by the Knut and Alice Wallenberg Foundation.
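The identity (1.4) also suggests a way to sample Z approximately: unfold the fixed point for a finite number of generations and cut off the remainder. The sketch below (an illustration, not from the paper; the depth and sample count are arbitrary) uses this to check the stated moments E Z = 0 and Var Z = 7 − (2/3)π² numerically.

```python
import math, random

def g(u):
    # the deterministic function g in (1.5)
    return 1.0 + 2.0 * u * math.log(u) + 2.0 * (1.0 - u) * math.log(1.0 - u)

def sample_Z(depth, rng):
    # Approximate sample of Z: unfold (1.4) for `depth` generations and drop
    # the remainder, whose variance shrinks geometrically (by the factor
    # E[U^2 + (1 - U)^2] = 2/3 per generation).
    if depth == 0:
        return 0.0
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return u * sample_Z(depth - 1, rng) + (1.0 - u) * sample_Z(depth - 1, rng) + g(u)

rng = random.Random(1)
zs = [sample_Z(9, rng) for _ in range(4000)]
mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / len(zs)
# mean should be near 0 and var near 0.4203, up to truncation and sampling error
print(mean, var, 7.0 - 2.0 * math.pi ** 2 / 3.0)
```

Note that the truncation does not bias the mean, since E g(U) = 0, while it reduces the variance by a factor (2/3)^depth, consistent with the uniqueness of the fixed point shown by Rösler [10].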
In the present paper we study the asymptotics of the tail probabilities P(Z ≤ −x) and P(Z ≥ x) as x → ∞. Using non-rigorous methods from applied mathematics (assuming an as yet unverified regularity hypothesis), Knessl and Szpankowski [6] found very precise asymptotics of both the left tail and the right tail. Their result for the left tail is an asymptotic expansion (1.6), as x → ∞, where c_1, c_2, c_3 are some constants (c_1 is explicit in [6], but not c_2). For the right tail, they give a more complicated expression, which by ignoring higher order terms implies, for example, the estimate (1.7). It has been a challenge to justify these asymptotics rigorously, and so far very little progress has been made. Some rigorous upper bounds were given by Fill and Janson [4], in particular the bound (1.8), valid for x ≥ 303, with the same leading term (in the exponent) as (1.7), and the bound (1.9) for the left tail, which is much weaker than (1.6). The present paper also falls short of the (non-rigorous) asymptotics (1.6)–(1.7) from [6], but we show, by simple methods, the following results, which at least show that the leading terms in the top exponents in (1.6)–(1.7) are correct.
We show the lower bounds in Sections 3 and 4, and the upper bounds in Sections 5 and 6. The lower bounds are proved by direct arguments using the identity (1.4); the upper bounds are proved by the standard method of first estimating the moment generating function.
Remark 1.2. The right inequality in (1.11) follows from the more precise (1.8), where an explicit value is given for the implicit constant; we include this part of (1.11) for completeness. (The proof in Section 6 actually yields a better constant than (1.8) for large x, see (6.10).) We expect that, similarly, the implicit constants in the other parts of (1.10)–(1.11) could be replaced by explicit bounds, using more careful versions of the arguments and estimates below. However, in order to keep the proofs simple, we have not attempted this.

Remark 1.3. We consider only the limiting random variable Z, and not Z_n or X_n for finite n. Of course, the results for Z imply corresponding results for the tails P(Z_n ≤ −x) and P(Z_n ≥ x) for n sufficiently large (depending on x), but we do not attempt to give any explicit results for finite n. For some bounds for finite n, see [5] and (for large deviations) [8].
Remark 1.4. Although we do not work with Z_n for finite n, the proofs below of the lower bounds can be interpreted for finite n, saying that we can obtain Z_n ≤ −x with roughly the given probability (for large n) by considering the event that in the first Θ(x) generations, all splits are close to balanced (with proportions 1/2 ± x^{−1/2}, say); similarly, to obtain Z_n ≥ x we let there be one branch of length Θ(x) where all splits are extremely unbalanced (with at most a fraction (x ln x)^{−1} on the other side). The fact that we require an exponential number of splits to be extreme for the left tail, but only a linear number for the right tail, can be seen as an explanation of the difference between the two tails, with the left tail doubly exponential and the right tail roughly exponential.
Preliminaries

Let ψ(t) := E e^{tZ} be the moment generating function of Z. As said above, Rösler [10] showed that ψ(t) is finite for every real t. The distributional identity (1.4) yields, by conditioning on U, the functional equation

ψ(t) = ∫_0^1 ψ(ut) ψ((1 − u)t) e^{t g(u)} du. (2.1)

We may replace Z by the right-hand side of (1.4); hence we may without loss of generality assume the equality (not just in distribution)

Z = U Z′ + (1 − U) Z″ + g(U). (2.2)
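The functional equation (2.1) lends itself to a numerical illustration: estimate ψ at the needed arguments by Monte Carlo from one sample set, and approximate the u-integral by a midpoint rule. The sketch below is illustrative only (the sampler truncates (1.4) at a fixed depth, and the choices of t, depth, sample count and grid size are arbitrary).

```python
import math, random

def g(u):
    # the function g in (1.5)
    return 1.0 + 2.0 * u * math.log(u) + 2.0 * (1.0 - u) * math.log(1.0 - u)

def sample_Z(depth, rng):
    # approximate sample of Z obtained by truncating the identity (1.4)
    if depth == 0:
        return 0.0
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return u * sample_Z(depth - 1, rng) + (1.0 - u) * sample_Z(depth - 1, rng) + g(u)

rng = random.Random(2)
zs = [sample_Z(8, rng) for _ in range(3000)]

def psi_hat(t):
    # Monte Carlo estimate of psi(t) = E e^{tZ}
    return sum(math.exp(t * z) for z in zs) / len(zs)

t, m = 0.5, 200
lhs = psi_hat(t)
# midpoint rule for the integral on the right-hand side of (2.1)
rhs = sum(psi_hat((k + 0.5) / m * t) * psi_hat((1.0 - (k + 0.5) / m) * t)
          * math.exp(t * g((k + 0.5) / m)) for k in range(m)) / m
print(lhs, rhs)  # the two sides agree up to discretization and sampling error
```

Since ψ(0) = 1 and ψ is smooth with all moments finite, a moderate grid suffices here; the check is of course no substitute for the exact identity (2.1).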

Right tail, upper bound
As said in the introduction, (1.8) was proved in [4]. Nevertheless we give for completeness a proof of the upper bound in (1.11), similar to the proof in Section 5. (It is also similar to the proof in [4] but simpler, partly because we do not keep track of all constants and do not try to optimize; nevertheless, it yields a slight improvement of (1.8) for large x, see (6.10) below.)

Lemma 6.1. There exists a > 0 such that for all t ≥ 0,

ψ(t) ≤ exp(e^t + at). (6.1)

Note that [4, Corollary 4.3] shows the bound ψ(t) ≤ exp(2e^t) for t ≥ 5.02, which is explicit, but weaker for large t.
Proof of upper bound in (1.11). For x ≥ 0 and any t ≥ 0, by Lemma 6.1,

P(Z ≥ x) ≤ e^{−tx} E e^{tZ} = e^{−tx} ψ(t) ≤ exp(−tx + e^t + at). (6.9)

We take t = ln x (assuming x ≥ 1) and obtain

P(Z ≥ x) ≤ exp(−x ln x + x + O(ln x)), x ≥ 1. (6.10)

(The optimal choice of t is actually ln(x − a), but this leads to the same result up to o(1) in the exponent, which is absorbed by the error term O(ln x).)

Acknowledgement. I thank David Belius and Jim Fill for helpful comments.
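The choice t = ln x can be checked numerically. In the sketch below (an illustration only; the constant a from Lemma 6.1 is not explicit, so a = 2 is an arbitrary stand-in), the exponent −tx + e^t + at from (6.9) is evaluated at t = ln x and at the exact minimizer t = ln(x − a), obtained by solving e^t + a = x; the two values agree up to o(1), and both equal −x ln x + x + O(ln x).

```python
import math

def exponent(t, x, a):
    # the exponent in the Chernoff-type bound (6.9): P(Z >= x) <= exp(-t x + e^t + a t)
    return -t * x + math.exp(t) + a * t

x, a = 1000.0, 2.0  # a is a placeholder; Lemma 6.1 only asserts that some a exists
f_simple = exponent(math.log(x), x, a)   # the convenient choice t = ln x
f_opt = exponent(math.log(x - a), x, a)  # the exact minimizer t = ln(x - a)
leading = -x * math.log(x) + x           # leading terms in (6.10)
print(f_simple - f_opt, f_simple - leading)  # tiny gap; remainder is a ln x = O(ln x)
```

At t = ln x the exponent is exactly −x ln x + x + a ln x, so the O(ln x) error term in (6.10) is precisely the a ln x contribution (for this illustrative a).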