From Ball's cube slicing inequality to Khinchin-type inequalities for negative moments

We establish a sharp moment comparison inequality between an arbitrary negative moment and the second moment for sums of independent uniform random variables, which extends Ball's cube slicing inequality.


Introduction
Ball's celebrated cube slicing inequality, established in [3], states that the maximal volume of a cross-section of the centred cube $[-1,1]^n$ in $\mathbb{R}^n$ by a hyperplane (a subspace of codimension 1) equals $2^{n-1}\sqrt{2}$, attained by the hyperplane with normal vector $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0, \ldots, 0)$ (see also [4]). Khinchin-type inequalities provide moment comparisons, typically for weighted sums of independent identically distributed (i.i.d.) random variables. The classical one concerns symmetric random signs and goes back to the work [18] of Khinchin. Such inequalities are instrumental in studying unconditional convergence and are used extensively in (functional) analysis and geometry, particularly in the (local) theory of Banach spaces. We refer to [2,13,20,23,25,26,30,31,34,35] for further background and references (in particular, [2] provides a detailed historical account of Khinchin inequalities with sharp constants).
The main motivation for this article and its starting point is a fact well known to experts: Ball's inequality can be viewed as a Khinchin-type inequality (the dual question of extremal-volume hyperplane projections of convex bodies is also linked to Khinchin-type inequalities, see for example [5,6,9]). An elementary derivation can be sketched as follows. For a unit vector $a = (a_1, \ldots, a_n)$ in $\mathbb{R}^n$, let $f$ be the density of $X = \sum_{k=1}^n a_k U_k$, where $U_1, \ldots, U_n$ are i.i.d. uniform on $[-1,1]$. Then the $(n-1)$-volume of the cross-section of the cube $[-1,1]^n$ by the hyperplane $a^\perp$ perpendicular to $a$ is
$$\mathrm{Vol}_{n-1}\big([-1,1]^n \cap a^\perp\big) = 2^{n-1} f(0).$$
On the other hand, for every symmetric unimodal bounded random variable $X$ with density $f$, we have
$$f(0) = \lim_{p \to 1^-} \frac{1-p}{2}\, \mathbb{E}|X|^{-p} \qquad (1)$$
($X$ is called symmetric if it has the same distribution as $-X$). Thus Ball's inequality, put probabilistically, says that for every unit vector $a$ in $\mathbb{R}^n$, we have
$$\lim_{p \to 1^-} (1-p)\, \mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p} \le \lim_{p \to 1^-} (1-p)\, \mathbb{E}\Big|\tfrac{U_1+U_2}{\sqrt{2}}\Big|^{-p} = \sqrt{2}. \qquad (2)$$
Our main result shows in particular that this inequality holds not only in the limit, but for every $p \in (p_0, 1)$, where $p_0 = 0.793\ldots$. To view this inequality as an actual moment comparison, let $\xi_1, \xi_2, \ldots$ be i.i.d. random vectors in $\mathbb{R}^3$ uniform on the centred Euclidean unit sphere $S^2$. As a result of Archimedes' hat-box theorem and rotational invariance, the left-hand side can be rewritten as $\mathbb{E}\|\sum_{k=1}^n a_k \xi_k\|^{-1}$, where $\|\cdot\|$ stands for the standard Euclidean norm on $\mathbb{R}^3$ (see Lemma 3 below). We thus have the following identity for a unit vector $a$ in $\mathbb{R}^n$:
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-1} = \lim_{p \to 1^-} (1-p)\, \mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p}. \qquad (3)$$
For a generalisation, see Proposition 3.2 in [24]. As a result, we can rephrase Ball's inequality as the following sharp $L_{-1}$–$L_2$ Khinchin-type inequality: for every $n$ and all reals $a_1, \ldots, a_n$,
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-1} \le \sqrt{2}\, \Big(\sum_{k=1}^n a_k^2\Big)^{-1/2}.$$
We extend this to a sharp $L_{-p}$–$L_2$ moment comparison for $p \in (0,1)$ with arbitrary matrix-valued coefficients (Corollary 4 below). We refer to [2,20,23,27] for sharp results for positive moments.
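Before proceeding, here is a quick Monte Carlo sanity check of the probabilistic reformulation (ours, for illustration only; the function names are ad hoc): estimates of $\mathbb{E}\|\sum_k a_k \xi_k\|^{-1}$ for unit vectors $a$ stay below $\sqrt2 \approx 1.4142$, with near-equality for $a = (\frac{1}{\sqrt2}, \frac{1}{\sqrt2})$.

```python
import numpy as np

rng = np.random.default_rng(0)

def neg_first_moment(a, n_samples=400_000):
    """Monte Carlo estimate of E || sum_k a_k xi_k ||^{-1}, xi_k uniform on S^2."""
    a = np.asarray(a, dtype=float)
    xi = rng.standard_normal((n_samples, len(a), 3))
    xi /= np.linalg.norm(xi, axis=2, keepdims=True)   # normalise rows onto S^2
    s = np.einsum("k,skd->sd", a, xi)                 # the weighted sums
    return np.mean(1.0 / np.linalg.norm(s, axis=1))

r = 1 / np.sqrt(2)
for a in [(r, r), (1.0,), (0.6, 0.8), (0.5, 0.5, 0.5, 0.5)]:
    print(a, round(neg_first_moment(a), 4))  # all <= sqrt(2) ~ 1.4142
```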
We describe our results in the next section and then present our proofs, preceded by a short overview of them. We conclude with a summary highlighting possible future work. Throughout, $\langle x, y\rangle = \sum_{j=1}^d x_j y_j$ denotes the standard scalar product on $\mathbb{R}^d$, and $\|x\| = \sqrt{\langle x, x\rangle}$ is the Euclidean norm, whose unit sphere and closed unit ball are denoted by $S^{d-1}$ and $B_2^d$, respectively. Moreover, $e_j$ is the $j$-th vector of the standard basis, whose $j$-th coordinate is 1 and the rest are 0.

Results
Let $U_1, U_2$ be i.i.d. random variables uniform on $[-1,1]$ and let $Z$ be a standard Gaussian random variable (mean 0, variance 1). For $p \in (0,1)$, we define the constants
$$c_2(p) = \mathbb{E}\Big|\tfrac{U_1+U_2}{\sqrt2}\Big|^{-p} \qquad\text{and}\qquad c_\infty(p) = \mathbb{E}\Big|\tfrac{Z}{\sqrt3}\Big|^{-p},$$
and set
$$C_p = \max\{c_2(p),\, c_\infty(p)\}. \qquad (4)$$
By comparing $c_2(p)$ and $c_\infty(p)$, as done in Lemma 7 from Section 4 below, in fact we have
$$C_p = \begin{cases} c_\infty(p), & p \in (0, p_0],\\ c_2(p), & p \in [p_0, 1),\end{cases}$$
where $p_0$ is the unique $p \in (0,1)$ such that $c_2(p) = c_\infty(p)$. Our main result is the following $L_{-p}$–$L_2$ Khinchin-type inequality for sums of symmetric uniform random variables.
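For illustration, both constants admit closed forms, which are our own direct computations (from the triangular density of $U_1+U_2$ for $c_2$ and from the Gaussian density for $c_\infty$): $c_2(p) = \frac{2^{1-p/2}}{(1-p)(2-p)}$ and $c_\infty(p) = (\tfrac32)^{p/2}\, \Gamma(\tfrac{1-p}{2})/\sqrt{\pi}$. A few lines of code then locate $p_0$:

```python
import numpy as np
from scipy.special import gamma
from scipy.optimize import brentq

def c2(p):
    # E|(U1+U2)/sqrt(2)|^{-p}: closed form from the triangular density of U1+U2
    return 2 ** (1 - p / 2) / ((1 - p) * (2 - p))

def c_inf(p):
    # E|Z/sqrt(3)|^{-p}: the CLT limit of E|n^{-1/2}(U1+...+Un)|^{-p} (Var U = 1/3)
    return 1.5 ** (p / 2) * gamma((1 - p) / 2) / np.sqrt(np.pi)

p0 = brentq(lambda p: c2(p) - c_inf(p), 0.1, 0.99)
print(p0)  # ~0.793, the unique crossing point of c_2 and c_inf on (0, 1)
```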
Theorem 1. Let $p \in (0,1)$ and let $C_p$ be defined by (4). Let $U_1, U_2, \ldots$ be i.i.d. random variables uniform on $[-1,1]$. For every $n$ and all reals $a_1, \ldots, a_n$, we have
$$\mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p} \le C_p\, \Big(\sum_{k=1}^n a_k^2\Big)^{-p/2}. \qquad (5)$$
Remark 2. Applying (5) to $n = 2$, $a_1 = a_2 = \frac{1}{\sqrt2}$ and to large $n$, $a_1 = \cdots = a_n = \frac{1}{\sqrt n}$ (with the aid of the central limit theorem) shows that the value of $C_p$ in (5) is sharp.
Moments of the Euclidean norm of weighted sums of independent random vectors uniform on $S^{d+1}$ and on $B_2^d$, $d \ge 1$, are proportional (see Proposition 4 in [23], or its generalisation, Theorem 4 in [2]). We recall a special case of this result relevant for us and, for convenience, sketch its proof (particularly because the proofs available in the literature treat the case of positive moments, but of course they repeat verbatim for negative moments).
Lemma 3 (Proposition 4, [23]). Let $\xi_1, \xi_2, \ldots$ be i.i.d. random vectors uniformly distributed on the unit sphere $S^2$ in $\mathbb{R}^3$. Let $U_1, U_2, \ldots$ be i.i.d. random variables uniform on $[-1,1]$. For a vector $a = (a_1, \ldots, a_n)$ in $\mathbb{R}^n$ and $p \in (-\infty, 1)$, we have
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-p} = (1-p)\, \mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p}.$$
Proof. We reproduce here an argument utilising rotational invariance from [23], attributed to Latała. Let $\theta$ be a random vector uniform on $S^2$, independent of all the other variables. By rotational invariance, for a vector $x$ in $\mathbb{R}^3$, we have
$$\mathbb{E}_\theta |\langle x, \theta\rangle|^{-p} = \|x\|^{-p}\, \mathbb{E}|\theta_1|^{-p},$$
where $\theta_1$ denotes the first component of $\theta$, so
$$\|x\|^{-p} = \big(\mathbb{E}|\theta_1|^{-p}\big)^{-1}\, \mathbb{E}_\theta |\langle x, \theta\rangle|^{-p}.$$
Applying this to $x = \sum_{k=1}^n a_k \xi_k$ and taking the expectation gives
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-p} = \big(\mathbb{E}|\theta_1|^{-p}\big)^{-1}\, \mathbb{E}\Big|\sum_{k=1}^n a_k \langle \xi_k, \theta\rangle\Big|^{-p}.$$
By the rotational invariance of $a_k \xi_k$, conditionally on $\theta$ the variables $\langle \xi_k, \theta\rangle$ have the same joint distribution as the $\langle \xi_k, e_1\rangle$. However, $\theta$ is a unit vector and, by Archimedes' hat-box theorem, the random variables $\langle \xi_k, e_1\rangle$ are i.i.d. uniform on $[-1,1]$; therefore
$$\mathbb{E}|\theta_1|^{-p} = \int_0^1 t^{-p}\, dt = \frac{1}{1-p}.$$
Putting these equations together finishes the proof.
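A minimal Monte Carlo check of the identity of Lemma 3 (ours, illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
p, a, N = 0.5, np.array([0.3, 0.5, 0.8]), 500_000

xi = rng.standard_normal((N, len(a), 3))
xi /= np.linalg.norm(xi, axis=2, keepdims=True)          # uniform on S^2
lhs = np.mean(np.linalg.norm(np.einsum("k,skd->sd", a, xi), axis=1) ** -p)

U = rng.uniform(-1, 1, (N, len(a)))
rhs = (1 - p) * np.mean(np.abs(U @ a) ** -p)
print(lhs, rhs)  # the two estimates agree up to Monte Carlo error
```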
It follows from Lemma 3 that (5) is equivalent to
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-p} \le (1-p)\, C_p\, \Big(\sum_{k=1}^n a_k^2\Big)^{-p/2}. \qquad (6)$$
We extend this to matrix-valued coefficients using isometric embeddings into $L_p$ spaces (Orlicz–Szarek's argument, see Remark 3 in [35]). This offers a sharp version of the very general result of Gorin and Favorov from [12] (see Corollary 2 therein) in the case of uniform vectors on $S^2$ and the $L_{-p}$–$L_2$ moment comparison. For a matrix $A$, $\|A\|_{\mathrm{HS}}$ stands for its Hilbert–Schmidt norm.
Corollary 4. Let $p \in (0,1)$ and let $C_p$ be defined by (4). Let $\xi_1, \xi_2, \ldots$ be i.i.d. random vectors uniform on the unit sphere $S^2$ in $\mathbb{R}^3$. For every $n$ and all real $3 \times 3$ matrices $A_1, \ldots, A_n$, we have
$$\mathbb{E}\Big\|\sum_{k=1}^n A_k \xi_k\Big\|^{-p} \le C_p\, \Big(\sum_{k=1}^n \|A_k\|_{\mathrm{HS}}^2\Big)^{-p/2}. \qquad (7)$$
Remark 5. Both (6) and (7) are sharp. The constant in (7) is larger than the one in (6). The former, specialised to the case when each matrix $A_k$ is proportional to the matrix
$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix},$$
becomes (5), which is sharp (Remark 2).
Remark 6. A sharp reversal of (6) (and analogously of (7)) is immediate from convexity:
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-p} \ge \Big(\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^2\Big)^{-p/2} = \Big(\sum_{k=1}^n a_k^2\Big)^{-p/2}.$$
By (1), the case $p = 1$ of this inequality gives yet another simple proof of Hadwiger's and Hensley's result (see [14] and [16]; see also Theorem 2 in [3]).

Proof overview
Haagerup's work [13] can perhaps be seen as a landmark in the pursuit of sharp Khinchin-type inequalities. Later, Nazarov and Podkorytov in [31] offered an informative exposition of [13] (and [3]), developing novel tools which allowed for significant simplifications of the most technically demanding parts of [13] (as well as of [3]). We shall closely follow their approach, which comprises two main steps. (For other works which used techniques from [31] to establish sharp Khinchin-type inequalities, we refer for instance to [20,29].)
Step I (Section 5.2). We prove (5) in the case that all weights $a_k$ are "small", that is, for sequences $a = (a_k)_{k=1}^n$ with $\max_{k \le n} |a_k| \le \frac{1}{\sqrt2}\big(\sum_{k=1}^n a_k^2\big)^{1/2}$ (call it Case A). This in turn is accomplished by a Fourier-analytic expression for negative moments (used for instance in [12]), which allows us to leverage independence. As in [3], by the use of Hölder's inequality, the following integral inequality allows us to finish the whole argument: for every $s \ge 2$,
$$s^{p/2} \int_0^\infty \Big|\frac{\sin t}{t}\Big|^s t^{p-1}\, dt \le \max\Big\{2^{p/2} \int_0^\infty \Big(\frac{\sin t}{t}\Big)^2 t^{p-1}\, dt,\ \lim_{\sigma\to\infty} \sigma^{p/2} \int_0^\infty \Big|\frac{\sin t}{t}\Big|^\sigma t^{p-1}\, dt\Big\}. \qquad (8)$$
This inequality is an extension of Ball's integral inequality from [3] and is proved with the methods of [31]. For other refinements and extensions of Ball's integral and cube slicing inequalities, see [8,17,21,22,28].
Step II (Section 5.3). With the aid of the result of Step I, we use induction on $n$ to prove a certain strengthening of (5) for all sequences $a = (a_k)_{k=1}^n$, in order to handle those which do not satisfy Case A, that is, those with a "large" weight (call those Case B). Were (8) true for all $s \ge 1$, this step would have been spared. In [31] the inductive step is possible thanks to an algebraic identity obtained by averaging with respect to one random sign. In our setting, for uniform $[-1,1]$ random variables, such an identity does not seem to present itself. To overcome this obstacle, we work with $S^2$-uniform random vectors, for which certain algebraic identities allowing for induction are much more natural. For Ball's inequality (2) (the case $p = 1$), this step was taken care of in [3] by a simple projection argument, but its analogue for $p < 1$ is not sufficient (see Remark 21 at the end of Section 5.3).
We remark that in the range $p \in (0, p_0)$, when $C_p = c_\infty(p)$ and the extremizing sequence is $a_1 = \cdots = a_n = \frac{1}{\sqrt n}$ with $n \to \infty$, it is only Case A which admits equality (attained asymptotically as $n \to \infty$), whereas in the range $p \in (p_0, 1)$, when $C_p = c_2(p)$ and the extremizing sequence is $a_1 = a_2 = \frac{1}{\sqrt2}$, $n = 2$, both Cases A and B admit equality (in Case B when taking $n = 2$ and $a_1 = \frac{1}{\sqrt{2-\delta}}$, $a_2 = \frac{1}{\sqrt{2+\delta}}$, $\delta \to 0^+$), and hence both Steps I and II have to be subtle enough to overcome this difficulty.
As a final comment here, convexity-type arguments leading to more precise results, such as Schur-convexity of moments of sums with a fixed number of summands $n$ (see [1,2,7,9,11,15,19,23,33]), do not seem to be available here. One of the obstacles is, for instance, the fact that the function $t \mapsto \mathbb{E}|U_1 + \sqrt{t}\, U_2|^{-p}$ is not convex/concave on the whole half-line $(0, +\infty)$ (it is concave on $(0,1)$ and convex on $(1, +\infty)$).
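This change of convexity is easy to confirm numerically. The closed form for $\mathbb{E}|U_1 + x U_2|^{-p}$ below is our own computation from the trapezoidal density of $U_1 + x U_2$ (for $0 < x \le 1$), extended by scaling to $x > 1$; a second difference then reveals the transition at $t = 1$:

```python
import numpy as np

p = 0.5

def m(x):
    # E|U1 + x*U2|^{-p} for 0 < x <= 1 (trapezoidal density of U1 + x*U2)
    return ((1 + x) ** (2 - p) - (1 - x) ** (2 - p)) / (2 * x * (1 - p) * (2 - p))

def g(t):
    # E|U1 + sqrt(t)*U2|^{-p}; for x = sqrt(t) > 1 rescale: |U1 + x U2| = x|U2 + U1/x|
    x = np.sqrt(t)
    return m(x) if x <= 1 else x ** -p * m(1 / x)

for t in [0.2, 0.5, 0.8, 1.5, 2.0, 4.0]:
    h = 1e-4
    second = (g(t + h) - 2 * g(t) + g(t - h)) / h ** 2
    print(t, "concave" if second < 0 else "convex")  # concave on (0,1), convex on (1,oo)
```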

Technical lemmas
We gather several elementary but technical results needed in our proofs. The first one explains the comparison between the constants $c_2(p)$ and $c_\infty(p)$ arising from the two different extremizing sequences of weights $a_k$ in our Khinchin inequality.
Lemma 7. Let $p_0 = 0.793\ldots$ be the unique solution of $c_2(p) = c_\infty(p)$ in $(0,1)$. Then $c_2(p) < c_\infty(p)$ for $p \in (0, p_0)$ and $c_2(p) > c_\infty(p)$ for $p \in (p_0, 1)$.
The proof reduces the comparison of $c_2(p)$ and $c_\infty(p)$ to a concavity claim. In view of the claim (after taking the logarithm and noting that a linear function intersects a strictly concave function at most twice), the proof of the lemma is finished.
To prove the claim, we let $u = \frac{1-p}{2}$. We now show that the right-hand side is positive on $(0, \frac12)$; call it $h_1(u)$ and note that its derivative has the same sign as an elementary expression which is positive there. This shows that $h$ is strictly convex on $(0, \frac12)$.

The next three lemmas are elementary facts about the functions showing up in the calculations of Step I (Section 5.2), needed to prove the integral inequality (8).
Lemma 8. For every $t \neq 0$, we have $\big|\cos t - \frac{\sin t}{t}\big| \le \frac65$.
Proof. Since both $\cos t$ and $\frac{\sin t}{t}$ are even, it suffices to consider positive $t$. By the Cauchy–Schwarz inequality, we have $\big|\cos t - \frac{\sin t}{t}\big| \le \sqrt{1 + \frac{1}{t^2}}$, which is at most $\frac65$ for $t \ge \frac{5}{\sqrt{11}}$, so it suffices to consider $t \le \frac{5}{\sqrt{11}}$. It remains to note that $\frac{5}{\sqrt{11}} < \frac{\pi}{2}$ and that on $(0, \frac{\pi}{2})$ we have $\cos t \le \frac{\sin t}{t} \le 1$, hence $\big|\cos t - \frac{\sin t}{t}\big| = \frac{\sin t}{t} - \cos t < 1 - 0 = 1$.
The following lemma is an important step in the proof of (8). Essentially, it is a consequence of the convexity of sums of exponential functions.
Proof. For $m = 1, 2, \ldots$, let $R_m(p)$ denote the quantity in question. Each $R_m(p)$ can be written as a sum of convex (exponential) functions of $p$, and thus $R_m(p)$ is convex.
Case $m \ge 2$. Computing $R_m'(0)$ in terms of the quantities $b_m$, we check directly that $b_2 > 2.7$ and $b_3 > 17$. For $m \ge 4$, we use the standard estimate $m! > \sqrt{2\pi m}\, (m/e)^m$ together with an elementary bound on $\log\big(\pi(m + \tfrac32)\big)$. Therefore, $R_m'(0) > 0$ for every $m \ge 2$ and, by convexity, $R_m(p)$ is increasing. For $m \ge 2$, the resulting right-hand side is bounded below by its value at $m = 2$, which is greater than $1.4$.
The final lemma in this section lies at the heart of the base case of the inductive argument from Step II (Section 5.3).
Lemma 12. For every $x \in (0,1)$, the function $p \mapsto h(p, x)$ is strictly concave and decreasing on $[0,2]$. In particular, $h(p, x) \le h(0, x)$ for every $p \in [0,2]$.
Proof. First we show concavity. Fix $x \in (0,1)$. A direct computation shows that $\frac{\partial^2}{\partial p^2} h(p, x)$ is a strictly convex function of $p$, being of the form $A a^p + B b^p - C$ with positive $a, A, b, B, C$. Therefore, in order to show that $\frac{\partial^2}{\partial p^2} h(p, x)$ is negative for $p \in (0,2)$, it suffices to check that it is nonpositive at the endpoints $p = 0$ and $p = 2$.
At $p = 2$, using $\log_2 \frac{3 - x^2}{2} < 1$ and $3 - x^2 > 2$, we bound the expression from above. The resulting right-hand side at $x = 0$ is $0$, so it suffices to show that it is decreasing in $x$.

Fourier-analytic formula
The following important Fourier-analytic formula for negative moments is the starting point of our proof.
Lemma 13 (Lemma 3 in [12]). For a random vector $X$ in $\mathbb{R}^d$ and $p \in (0, d)$, we have
$$\mathbb{E}\|X\|^{-p} = b_{p,d} \int_{\mathbb{R}^d} \phi_X(t)\, \|t\|^{p-d}\, dt,$$
provided that the integral on the right-hand side exists, where $\phi_X(t) = \mathbb{E}\, e^{i\langle t, X\rangle}$ is the characteristic function of $X$, $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$ and $b_{p,d} = 2^{-p} \pi^{-d/2}\, \frac{\Gamma((d-p)/2)}{\Gamma(p/2)}$.
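As a quick numerical sanity check of Lemma 13 (ours), take $d = 1$ and $X = U_1 + U_2$, whose characteristic function is $(\frac{\sin t}{t})^2$; the closed form $\mathbb{E}|U_1+U_2|^{-p} = \frac{2^{1-p}}{(1-p)(2-p)}$ used for comparison is our own computation from the triangular density:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

p = 0.7
b_p1 = 2 ** -p * np.pi ** -0.5 * gamma((1 - p) / 2) / gamma(p / 2)

# RHS of Lemma 13: b_{p,1} * int_R (sin t / t)^2 |t|^{p-1} dt, truncated at 2000*pi
integral = 2 * sum(
    quad(lambda t: (np.sin(t) / t) ** 2 * t ** (p - 1), k * np.pi, (k + 1) * np.pi)[0]
    for k in range(2000)
)
print(b_p1 * integral)                     # Fourier side of the formula
print(2 ** (1 - p) / ((1 - p) * (2 - p)))  # E|U1+U2|^{-p}, triangular density
```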
Using this formula, we have
$$\mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p} = b_{p,1} \int_{\mathbb{R}} \prod_{k=1}^n \frac{\sin(a_k t)}{a_k t}\, |t|^{p-1}\, dt = 2 b_{p,1} \int_0^\infty \prod_{k=1}^n \frac{\sin(a_k t)}{a_k t}\, t^{p-1}\, dt. \qquad (9)$$
The proof proceeds using completely different arguments depending on whether or not there is a large weight $a_k$.

All weights are small
Our goal here is the following special case of Theorem 1.
Theorem 14. Let $p \in (0,1)$ and let $C_p$ be defined by (4). Then (5) holds for every $n$ and all reals $a_1, \ldots, a_n$ satisfying $\max_{k \le n} |a_k| \le \frac{1}{\sqrt2}\big(\sum_{k=1}^n a_k^2\big)^{1/2}$.
For the proof, we can assume that $\sum_{k=1}^n a_k^2 = 1$ and, by symmetry, additionally that each $a_k$ is positive. Thus, in this case, $0 < a_k \le \frac{1}{\sqrt2}$ for every $k$. Recall (9). By Hölder's inequality with exponents $s_k = a_k^{-2}$ (so that $\sum_k 1/s_k = 1$ and each $s_k \ge 2$), since $\int_0^\infty \big|\frac{\sin(a_k t)}{a_k t}\big|^{s_k} t^{p-1}\, dt = a_k^{-p} \int_0^\infty \big|\frac{\sin u}{u}\big|^{s_k} u^{p-1}\, du$, we get
$$\mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p} \le 2 b_{p,1} \prod_{k=1}^n \Big(\int_0^\infty \Big|\frac{\sin(a_k t)}{a_k t}\Big|^{s_k} t^{p-1}\, dt\Big)^{a_k^2} = \prod_{k=1}^n \Psi_p(s_k)^{a_k^2} \le \sup_{s \ge 2} \Psi_p(s), \qquad (10)$$
where we define
$$\Psi_p(s) = 2 b_{p,1}\, s^{p/2} \int_0^\infty \Big|\frac{\sin t}{t}\Big|^s t^{p-1}\, dt. \qquad (11)$$
The next step is to maximise $\Psi_p(s)$ over $s \ge 2$. The answer varies depending on the value of $p$ and is given by either $s = 2$ or $s \to \infty$.
Lemma 15. Let $p \in (p_0, 1)$. Then $\Psi_p(s) \le \Psi_p(2) = c_2(p)$ for every $s \ge 2$.
Lemma 16. Let $p \in (0, p_0)$. Then $\Psi_p(s) \le \lim_{\sigma\to\infty} \Psi_p(\sigma) = c_\infty(p)$ for every $s \ge 2$.
Taking these lemmas for granted for a moment, we can finish the proof as follows. Suppose that $p \in (p_0, 1)$. Then, combining (9), (10) and Lemma 15, we have
$$\mathbb{E}\Big|\sum_{k=1}^n a_k U_k\Big|^{-p} \le \sup_{s \ge 2} \Psi_p(s) = \Psi_p(2) = c_2(p),$$
obtaining "half" of (5), namely the case $C_p = c_2(p)$. Of course, we proceed identically for $p \in (0, p_0)$, using Lemma 16 to obtain the other half. Therefore, to finish the proof of Theorem 14, it remains to prove Lemmas 15 and 16.
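A numerical look at $\Psi_p$ (our own check, truncating the integral in (11)) illustrates the dichotomy of Lemmas 15 and 16: for $p = 0.9 > p_0$ the supremum over $s \ge 2$ sits at $s = 2$, while for $p = 0.5 < p_0$ the values increase towards the CLT value $c_\infty(p)$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def psi(p, s):
    # Psi_p(s) of (11); the integral is truncated at 500*pi
    b_p1 = 2 ** -p * np.pi ** -0.5 * gamma((1 - p) / 2) / gamma(p / 2)
    integral = sum(
        quad(lambda t: abs(np.sin(t) / t) ** s * t ** (p - 1),
             k * np.pi, (k + 1) * np.pi)[0]
        for k in range(500)
    )
    return 2 * b_p1 * s ** (p / 2) * integral

for p in (0.9, 0.5):
    print(p, [round(psi(p, s), 4) for s in (2, 3, 5, 10, 40)])
# p = 0.9: maximal at s = 2 (Lemma 15); p = 0.5: increasing towards c_inf (Lemma 16)
```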
Proof of Lemma 15. Recalling the definition (11) of $\Psi_p$, by a change of variables, the inequality $\Psi_p(s) \le \Psi_p(2)$ is equivalent to
$$s^{p/2} \int_0^\infty \Big|\frac{\sin t}{t}\Big|^s t^{p-1}\, dt \le 2^{p/2} \int_0^\infty \Big(\frac{\sin t}{t}\Big)^2 t^{p-1}\, dt,$$
which can be thought of as Ball's integral inequality with the weight $t^{p-1}$ (Ball's inequality corresponds to the case $p = 1$, see [3,31]). For the proof, we rewrite the right-hand side as $2^{p/2}\int_0^\infty g_p(t)^2\, t^{p-1}\, dt$ with a Gaussian function $g_p(t) = \exp(-\sigma_p^2 t^2)$, $\sigma_p > 0$, defined such that for every $s$,
$$s^{p/2} \int_0^\infty g_p(t)^s\, t^{p-1}\, dt = \frac{\Gamma(p/2)}{2\, \sigma_p^p}.$$
We emphasize that this identity holds for every $s$ with $\sigma_p$ depending only on $p$, and that this is why the Gaussian function $g_p$ is a good function to compare $\frac{\sin t}{t}$ with. Our goal is then to show that
$$\int_0^\infty \Big|\frac{\sin t}{t}\Big|^s t^{p-1}\, dt \le \int_0^\infty g_p(t)^s\, t^{p-1}\, dt \quad \text{for every } s \ge 2, \qquad (12)$$
where $\sigma_p$ in the definition of $g_p$ is such that there is equality for $s = 2$ in (12). We remark that the equality for $s = 2$ is equivalent to
$$\mathbb{E}|U_1 + U_2|^{-p} = \mathbb{E}|2\sigma_p Z|^{-p},$$
where $U_1, U_2$ are i.i.d. uniform $[-1,1]$ random variables and $Z$ is a standard Gaussian random variable (because $(\frac{\sin t}{t})^2$ is the characteristic function of $U_1 + U_2$ and $g_p(t)^2$ is the characteristic function of $2\sigma_p Z$). This allows us to explicitly compute $\sigma_p$:
$$\sigma_p = \frac12 \left(\frac{\mathbb{E}|Z|^{-p}}{\mathbb{E}|U_1+U_2|^{-p}}\right)^{1/p} = \frac{1}{\sqrt6} \left(\frac{c_\infty(p)}{c_2(p)}\right)^{1/p}.$$
To prove (12), we use the following "lemma on distribution functions" from [31]. Recall that, given a non-negative function $h\colon X \to [0, +\infty)$ on a measure space $(X, \mu)$, its distribution function is the non-increasing function $H\colon (0, +\infty) \to [0, \infty]$ defined by
$$H(y) = \mu\big(\{x \in X : h(x) > y\}\big).$$

Lemma 17 ([31]). Let $f$ and $g$ be two non-negative measurable functions on a measure space $(X, \mu)$ and let $F$, $G$ be the distribution functions of $f$ and $g$, respectively. If $F(y)$ and $G(y)$ are finite for every $y > 0$ and there is some point $y_0 > 0$ such that $F(y) \ge G(y)$ for all $0 < y < y_0$ and $F(y) \le G(y)$ for all $y > y_0$, then the function
$$s \mapsto \frac{1}{s\, y_0^s} \int_X (f^s - g^s)\, d\mu$$
is non-increasing on $(0, \infty)$. In particular, if $\int_X (f^{s_0} - g^{s_0})\, d\mu = 0$ for some $s_0 > 0$, then $\int_X (f^s - g^s)\, d\mu \le 0$ for every $s \ge s_0$.
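To see Lemma 17 in action on a toy example (ours): on $X = (0, \infty)$ with Lebesgue measure, take $f(t) = e^{-t}$ and $g(t) = e^{-t^2}$, so that $F(y) = \ln\frac1y$ and $G(y) = \sqrt{\ln\frac1y}$ cross exactly once, at $y_0 = 1/e$, with $F \ge G$ below $y_0$; the normalised difference is then non-increasing in $s$ and vanishes at $s_0 = 4/\pi$.

```python
import numpy as np
from scipy.integrate import quad

y0 = 1 / np.e

def D(s):
    # (1 / (s * y0^s)) * int_0^inf (f^s - g^s) dt, with f = exp(-t), g = exp(-t^2)
    diff = quad(lambda t: np.exp(-s * t) - np.exp(-s * t * t), 0, np.inf)[0]
    return diff / (s * y0 ** s)

for s in (0.5, 1.0, 4 / np.pi, 2.0, 4.0, 8.0):
    print(round(s, 3), round(D(s), 5))  # non-increasing; zero at s0 = 4/pi
```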
Let $\mu$ be the Borel measure on $(0, +\infty)$ with $d\mu(t) = t^{p-1}\, dt$, and let $F$ and $G_p$ be the distribution functions (with respect to $\mu$) of $f(t) = \big|\frac{\sin t}{t}\big|$ and $g_p(t) = \exp(-\sigma_p^2 t^2)$, respectively.
By Lemma 17, to establish the validity of (12) it suffices to show that the difference $F - G_p$ changes sign on $(0,1)$ exactly once (since both $f$ and $g_p$ are bounded by 1, both $F$ and $G_p$ vanish on $[1, +\infty)$). Notice that for $t \in (0, \pi)$ we have $\frac{\sin t}{t} \le e^{-t^2/6}$. Moreover, thanks to Lemma 7, our assumption $p \in (p_0, 1)$ is equivalent to $c_\infty(p) < c_2(p)$, which in turn, by the definition of $\sigma_p$, is equivalent to $\sigma_p^2 < \frac16$. Thus, $f(t) \le e^{-t^2/6} < e^{-\sigma_p^2 t^2} = g_p(t)$ for $t \in (0, \pi)$. Consequently,
$$F(y) < G_p(y) \quad \text{for every } y \in (y_1, 1), \qquad (13)$$
where $y_m = \max_{t \in [m\pi, (m+1)\pi]} f(t)$, $m = 1, 2, \ldots$, is the decreasing sequence of successive maxima of $f$, as in [31]. Since $\int_0^\infty 2y\big(F(y) - G_p(y)\big)\, dy = \int_0^\infty \big(f(t)^2 - g_p(t)^2\big)\, d\mu(t) = 0$ by the choice of $\sigma_p$, the difference $F - G_p$ changes its sign at least once on $(0,1)$. Therefore, to prove that this happens exactly once, it suffices to prove that $G_p - F$ is strictly increasing on $(0, y_1)$; since $G_p'$ and $F'$ are negative, this is equivalent to $|F'| > |G_p'|$ on every interval $(y_{m+1}, y_m)$, $m \ge 1$.
To this end, fix an integer $m \ge 1$ and $y \in (y_{m+1}, y_m)$. Note that there is one solution, call it $t_0 = t_0(y)$, of the equation $f(t) = y$ on $(0, \pi)$, and for every $k = 1, \ldots, m$ there are two solutions $t_k^- = t_k^-(y)$ and $t_k^+ = t_k^+(y)$ on $(k\pi, (k+1)\pi)$. We can then write
$$F(y) = \frac1p\Big(t_0(y)^p + \sum_{k=1}^m \big(t_k^+(y)^p - t_k^-(y)^p\big)\Big).$$
Differentiating with respect to $y$, we get
$$F'(y) = t_0^{p-1}\, \frac{dt_0}{dy} + \sum_{k=1}^m \Big(\big(t_k^+\big)^{p-1} \frac{dt_k^+}{dy} - \big(t_k^-\big)^{p-1} \frac{dt_k^-}{dy}\Big).$$
With the aid of Lemma 8 we then have
$$|F'(y)| \ge \frac56\Big(t_0^p + \sum_{k=1}^m \big((t_k^-)^p + (t_k^+)^p\big)\Big).$$
Since $t_k^\pm \ge k\pi$ for every $k \ge 1$ and, by Lemma 9, $t_0 > 2$, it follows that
$$|F'(y)| \ge \frac56\Big(2^p + 2\sum_{k=1}^m (k\pi)^p\Big) > |G_p'(y)|. \qquad (14)$$
We remark that this estimate is valid for all $p \in (0,1)$.
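The choice of $\sigma_p$ can be verified numerically (our check): with $\sigma_p = \frac{1}{\sqrt6}\big(c_\infty(p)/c_2(p)\big)^{1/p}$ there is equality in (12) at $s = 2$, the normalised Gaussian integrals $s^{p/2}\int_0^\infty g_p^s\, d\mu$ are constant in $s$, and (12) holds at a few sampled $s > 2$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

p = 0.9
c2 = 2 ** (1 - p / 2) / ((1 - p) * (2 - p))                  # E|(U1+U2)/sqrt2|^{-p}
cinf = 1.5 ** (p / 2) * gamma((1 - p) / 2) / np.sqrt(np.pi)  # E|Z/sqrt3|^{-p}
sigma = (cinf / c2) ** (1 / p) / np.sqrt(6)

def I_f(s):  # int_0^inf |sin t / t|^s t^{p-1} dt, truncated at 500*pi
    return sum(quad(lambda t: abs(np.sin(t) / t) ** s * t ** (p - 1),
                    k * np.pi, (k + 1) * np.pi)[0] for k in range(500))

def I_g(s):  # int_0^inf exp(-s sigma^2 t^2) t^{p-1} dt, closed form
    return gamma(p / 2) / (2 * (s * sigma ** 2) ** (p / 2))

print(I_f(2), I_g(2))                                         # equal at s = 2
print([round(s ** (p / 2) * I_g(s), 6) for s in (2, 7, 31)])  # independent of s
print([I_f(s) < I_g(s) for s in (3, 6, 12)])                  # inequality (12)
```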
Proof of Lemma 16. Finding the limit is standard. For instance, if the limit is taken along even integers $s$, it follows from Lemma 13 combined with the central limit theorem. In general, a simple analytic argument goes as follows: letting $K_s(t) = \big(\frac{\sin(t/\sqrt s)}{t/\sqrt s}\big)^s$, we have $\Psi_p(s) = 2 b_{p,1} \int_0^\infty |K_s(t)|\, t^{p-1}\, dt$; splitting the integration at a fixed threshold and using the pointwise convergence $K_s(t) \to e^{-t^2/6}$, we get $\Psi_p(s) \to 2 b_{p,1} \int_0^\infty e^{-t^2/6}\, t^{p-1}\, dt = \mathbb{E}\big|\tfrac{Z}{\sqrt3}\big|^{-p} = c_\infty(p)$. From this point onwards, we repeat the proof of Lemma 15 with $f(t) = \big|\frac{\sin t}{t}\big|$ and $g(t) = \exp(-t^2/6)$, that is, $g = g_p$ with $\sigma_p$ set to be constant, equal to $\frac{1}{\sqrt6}$; the analogue of (12) is
$$\int_0^\infty \Big|\frac{\sin t}{t}\Big|^s t^{p-1}\, dt \le \int_0^\infty g(t)^s\, t^{p-1}\, dt \quad \text{for every } s \ge 2. \qquad (16)$$
For $s = 2$, inequality (16) is equivalent to $c_2(p) \le c_\infty(p)$, which holds true and is in fact strict for every $p \in (0, p_0)$ (Lemma 7). We next look at the sign changes of the difference $F - G$ of the distribution functions $F$, $G$ of $f$ and $g$, respectively. If there is no sign change, we are immediately done (in view of the identity $\int (f^s - g^s)\, d\mu = \int_0^\infty s y^{s-1} (F(y) - G(y))\, dy$). Thus, in view of Lemma 17, it remains to check that $F - G$ changes sign at most once. Since (13) holds here as well (with $G_p$ replaced by $G$), as in Lemma 15 it suffices to check that $|F'| > |G'|$ on every interval $(y_{m+1}, y_m)$, $m \ge 1$. As in the proof of Lemma 15, we have inequality (14) with $\sigma_p^p$ replaced by $6^{-p/2}$. Since $6^{-p/2} > \pi^{-1/2}\, 2^{1/2-p}$ for every $p \in (0,1)$, Lemma 11 allows us to finish the proof.

There is a large weight
We finish the proof of Theorem 1 by following the inductive approach from [31]. It crucially relies on strengthening the right-hand side of (5), to allow the induction on the number of summands $n$ to work. To this end, for $x \ge 0$ we define
$$\varphi_p(x) = C_p\, (1 + x)^{-p/2} \qquad\text{and}\qquad \Phi_p(x) = \begin{cases} 2\varphi_p(1) - \varphi_p(2 - x), & 0 \le x \le 1,\\ \varphi_p(x), & x > 1. \end{cases}$$
By this construction, the graph of $\Phi_p$ on $[0,1]$ is the graph of $\varphi_p$ on $[1,2]$ reflected about the point $(1, \varphi_p(1))$. In particular, to the left of $x = 1$, $\Phi_p$ and $\varphi_p$ share the common tangent line at $x = 1$. Consequently, since $\varphi_p$ is convex, $\Phi_p(x) \le \varphi_p(x)$ for every $x$. By homogeneity and rotational invariance, (5) is equivalent to the statement that for every $n \ge 2$ and all vectors $v_2, \ldots, v_n$ in $\mathbb{R}^3$,
$$\mathbb{E}\Big|\langle e_1, \xi_1\rangle + \sum_{k=2}^n \langle v_k, \xi_k\rangle\Big|^{-p} \le \varphi_p\Big(\sum_{k=2}^n \|v_k\|^2\Big).$$
We shall inductively show a strengthening. As will be clear from the proof, it is natural to run the inductive argument for spherically symmetric random vectors $\xi_k$.
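With $\varphi_p$ and $\Phi_p$ as just defined, the reflection construction and its two advertised properties (common tangent at $x = 1$, and $\Phi_p \le \varphi_p$) can be checked in a few lines (ours, for illustration):

```python
import numpy as np
from scipy.special import gamma

p = 0.9
c2 = 2 ** (1 - p / 2) / ((1 - p) * (2 - p))
cinf = 1.5 ** (p / 2) * gamma((1 - p) / 2) / np.sqrt(np.pi)
C_p = max(c2, cinf)

def phi(x):
    return C_p * (1 + x) ** (-p / 2)

def Phi(x):
    # reflection of phi restricted to [1,2] about the point (1, phi(1)) for x <= 1
    return 2 * phi(1.0) - phi(2.0 - x) if x <= 1 else phi(x)

xs = np.linspace(0.0, 3.0, 301)
print(all(Phi(x) <= phi(x) + 1e-12 for x in xs))  # True: Phi <= phi everywhere
h = 1e-6                                          # one-sided slopes at x = 1 agree:
print((Phi(1.0) - Phi(1.0 - h)) / h, (phi(1.0 + h) - phi(1.0)) / h)
```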
Theorem 18. For every $p \in (0,1)$, every $n \ge 2$ and all vectors $v_2, \ldots, v_n$ in $\mathbb{R}^3$, we have
$$\mathbb{E}\Big|\langle e_1, \xi_1\rangle + \sum_{k=2}^n \langle v_k, \xi_k\rangle\Big|^{-p} \le \Phi_p\Big(\sum_{k=2}^n \|v_k\|^2\Big). \qquad (17)$$
Since $\langle v_k, \xi_k\rangle$ has the same distribution as $\|v_k\| U_k$ and $\Phi_p \le \varphi_p$, (17) gives (5).
Proof of Theorem 18. We use induction on $n$. The case $n = 2$ is the content of the following lemma, whose proof we defer for now.
Lemma 19. Let $p \in (0,1)$. For every $v \in \mathbb{R}^3$, we have
$$\mathbb{E}\big|\langle e_1, \xi_1\rangle + \langle v, \xi_2\rangle\big|^{-p} \le \Phi_p\big(\|v\|^2\big). \qquad (18)$$
For the inductive step, let $n \ge 3$, let $v_2, \ldots, v_n \in \mathbb{R}^3$ and set $x = \sum_{k=2}^n \|v_k\|^2$. We consider three cases.
Case (a): $\|v_k\| > 1$ for some $2 \le k \le n$. Then $x > 1$, so (17) coincides with the homogeneous estimate
$$\mathbb{E}\Big|\sum_{k=1}^n \langle v_k, \xi_k\rangle\Big|^{-p} \le C_p \Big(\sum_{k=1}^n \|v_k\|^2\Big)^{-p/2}, \qquad (19)$$
where $v_1 = e_1$. Let $v_1^*, \ldots, v_n^*$ be a rearrangement of $v_1, \ldots, v_n$ such that $\|v_k^*\| \ge \|v_{k+1}^*\|$ for every $k = 1, \ldots, n-1$, and let $v_k' = v_k^*/\|v_1^*\|$ for every $k = 1, \ldots, n$, so that $\|v_1'\| = 1$ and $\|v_k'\| \le 1$ for $k = 2, \ldots, n$. Then, due to the homogeneity of (19) and the fact that $\langle v_1', \xi_1\rangle$ has the same distribution as $\langle e_1, \xi_1\rangle$, it is enough to prove
$$\mathbb{E}\Big|\langle e_1, \xi_1\rangle + \sum_{k=2}^n \langle v_k', \xi_k\rangle\Big|^{-p} \le C_p \Big(1 + \sum_{k=2}^n \|v_k'\|^2\Big)^{-p/2},$$
which is handled by the next cases.
Case (b): $\|v_k\| \le 1$ for every $2 \le k \le n$ and $x \ge 1$. Then again (17) coincides with the homogeneous estimate (5). Moreover, we have
$$\max_{k \le n} \|v_k\| \le 1 \le \frac{1}{\sqrt2}\, (1 + x)^{1/2},$$
so this case reduces to Theorem 14, where all the weights are small.
Case (c): $x < 1$ (then automatically $\|v_k\| \le 1$ for every $k$). Let $Q$ be a uniform random rotation of $\mathbb{R}^3$, independent of the $\xi_k$; since $(\xi_{n-1}, \xi_n)$ has the same distribution as $(\xi_{n-1}, Q\xi_{n-1})$, we have $\langle v_{n-1}, \xi_{n-1}\rangle + \langle v_n, \xi_n\rangle \stackrel{d}{=} \langle v_{n-1} + Q^\top v_n, \xi_{n-1}\rangle$, and $\|v_{n-1} + Q^\top v_n\|^2 = \|v_{n-1}\|^2 + \|v_n\|^2 + 2\langle v_{n-1}, Q^\top v_n\rangle$. By the inductive hypothesis applied to the sequence $(v_2, \ldots, v_{n-2}, v_{n-1} + Q^\top v_n)$ (conditioned on the value of $Q$), we get
$$\mathbb{E}\Big|\langle e_1, \xi_1\rangle + \sum_{k=2}^n \langle v_k, \xi_k\rangle\Big|^{-p} \le \mathbb{E}_Q\, \Phi_p\big(x + 2\langle v_{n-1}, Q^\top v_n\rangle\big);$$
thus, by the symmetry of the distribution of $\langle v_{n-1}, Q^\top v_n\rangle$,
$$\mathbb{E}\Big|\langle e_1, \xi_1\rangle + \sum_{k=2}^n \langle v_k, \xi_k\rangle\Big|^{-p} \le \mathbb{E}_Q\, \frac{\Phi_p\big(x + 2\langle v_{n-1}, Q^\top v_n\rangle\big) + \Phi_p\big(x - 2\langle v_{n-1}, Q^\top v_n\rangle\big)}{2}.$$
We shall now need a lemma about the concavity of $\Phi_p$, whose proof we also defer.
Lemma 20. Let $p \in (0,1)$. For every $a, b \ge 0$ with $\frac{a+b}{2} \le 1$, we have
$$\Phi_p(a) + \Phi_p(b) \le 2\, \Phi_p\Big(\frac{a+b}{2}\Big).$$
This lemma applied to $a = x + 2\langle v_{n-1}, Q^\top v_n\rangle$ and $b = x - 2\langle v_{n-1}, Q^\top v_n\rangle$ (which satisfy $a, b \ge 0$ and $\frac{a+b}{2} = x < 1$) finishes the proof of the inductive step.
It remains to show the lemmas we have used.
Proof of Lemma 19. First note that if $\|v\| > 1$ then, due to rotational invariance and the fact that $\xi_1, \xi_2$ are i.i.d.,
$$\mathbb{E}\big|\langle e_1, \xi_1\rangle + \langle v, \xi_2\rangle\big|^{-p} = \|v\|^{-p}\, \mathbb{E}\big|\langle e_1, \xi_1\rangle + \langle u, \xi_2\rangle\big|^{-p}$$
for any vector $u$ with $\|u\| = \|v\|^{-1} < 1$. This shows that the desired inequality for $v$ follows from the one for $u$: indeed,
$$\|v\|^{-p}\, \Phi_p\big(\|v\|^{-2}\big) \le \|v\|^{-p}\, \varphi_p\big(\|v\|^{-2}\big) = \varphi_p\big(\|v\|^2\big) = \Phi_p\big(\|v\|^2\big).$$
Hence, it is sufficient to prove the lemma in the case $\|v\| \le 1$.
Fix $v \in \mathbb{R}^3$ with $x = \|v\| \le 1$. To compute explicitly the left-hand side of (18), recall that for any $w \in \mathbb{R}^3$, $\langle w, \xi\rangle$ has the same distribution as $\|w\| U$, where $\xi$ and $U$ are uniformly distributed on $S^2$ and $[-1,1]$, respectively. Then we have
$$\mathbb{E}\big|\langle e_1, \xi_1\rangle + \langle v, \xi_2\rangle\big|^{-p} = \mathbb{E}|U_1 + x U_2|^{-p} = \frac{(1+x)^{2-p} - (1-x)^{2-p}}{2x\, (1-p)(2-p)}.$$
Recalling the definition of $c_2(p)$ and of $\Phi_p$ on $[0,1]$, we thus get that (18) becomes
$$\frac{(1+x)^{2-p} - (1-x)^{2-p}}{2x\, (1-p)(2-p)} \le 2\varphi_p(1) - \varphi_p(2 - x^2) = C_p\big(2^{1-p/2} - (3 - x^2)^{-p/2}\big)$$
for every $0 < x \le 1$. Note that we can write this as the inequality $h(p, x) \le 0$ for the function $h(p, x)$ of Lemma 12, so Lemma 12 finishes the proof.
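Since the base case reduces to an explicit one-dimensional inequality, it is easy to test numerically (our check; since $c_2(p) \le C_p$ and the bracket above is positive, replacing $C_p$ by $c_2(p)$ only strengthens the claim, so we verify the stronger version):

```python
import numpy as np

def lhs(p, x):
    # E|U1 + x U2|^{-p} for 0 < x <= 1, from the trapezoidal density
    return ((1 + x) ** (2 - p) - (1 - x) ** (2 - p)) / (2 * x * (1 - p) * (2 - p))

def rhs(p, x):
    # Phi_p(x^2) on [0,1] with C_p replaced by c_2(p)
    c2 = 2 ** (1 - p / 2) / ((1 - p) * (2 - p))
    return c2 * (2 ** (1 - p / 2) - (3 - x * x) ** (-p / 2))

for p in (0.3, 0.6, 0.9):
    xs = np.linspace(1e-3, 1.0, 1000)
    worst = max(lhs(p, x) - rhs(p, x) for x in xs)
    print(p, worst <= 1e-9)  # True: (18) holds, with equality exactly at x = 1
```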
Proof of Lemma 20. We can assume without loss of generality that $a < b$. If $b \le 1$, the desired inequality follows from the concavity of $\Phi_p$ on $[0,1]$. So assume that $b > 1$. Then, using $a < b$, $\frac{a+b}{2} \le 1$ and the fact that the derivative $\Phi_p'$ is decreasing on $[0,1]$, one checks that $2\Phi_p\big(\frac{a+b}{2}\big) - \Phi_p(a) - \Phi_p(b)$ is a decreasing function of $a$. To prove the desired inequality it therefore suffices to show that this expression is nonnegative for the maximal value of $a$, that is, $a = a_0 = 2 - b$. Since $b > 1$, we have $a_0 < 1$, and by the definition of $\Phi_p$,
$$\Phi_p(a_0) + \Phi_p(b) = 2\varphi_p(1) = 2\, \Phi_p\Big(\frac{a_0 + b}{2}\Big);$$
that is, the desired inequality is in fact an equality in this case.
Remark 21. Let $p \in (0,1)$ and let $X$ be a rotationally invariant random vector in $\mathbb{R}^3$. For every nonzero vector $y$ in $\mathbb{R}^3$, observing that $\|X + y\|^2$ has the same distribution as $\|X\|^2 + \|y\|^2 + 2\|X\|\, \|y\|\, U$, where $U$ is uniform on $[-1,1]$, independent of $X$, we have
$$\mathbb{E}\|X + y\|^{-p} = \mathbb{E}\, \frac{\big(\|X\| + \|y\|\big)^{2-p} - \big|\|X\| - \|y\|\big|^{2-p}}{2(2-p)\, \|X\|\, \|y\|}.$$
In particular, by the concavity of $t \mapsto t^{1-p}$, the right-hand side above can be bounded from above and, combined with independence, this gives, in the case of a large weight $\max_{k \le n} |a_k| > \frac{1}{\sqrt2}\big(\sum_{k=1}^n a_k^2\big)^{1/2}$,
$$\mathbb{E}\Big\|\sum_{k=1}^n a_k \xi_k\Big\|^{-p} \le 2^{p/2}\, \Big(\sum_{k=1}^n a_k^2\Big)^{-p/2}.$$
For $p = 1$, this immediately gives (2) in this case, and Theorem 18 is not needed (this corresponds to the simple projection argument from [3] handling this case). For $p < 1$, this argument yields the nonsharp constant $2^{p/2}$ instead of $(1-p)\, C_p$.

Proof of Corollary 4
Let $G = (G_1, G_2, G_3)$ be a standard Gaussian random vector in $\mathbb{R}^3$ (mean 0, covariance $I$), independent of the sequence $(\xi_k)_{k=1}^n$. Then for every vector $x$ in $\mathbb{R}^3$, since $\langle x, G\rangle$ has the same distribution as $\|x\| G_1$, we have
$$\|x\|^{-p} = \alpha_p\, \mathbb{E}|\langle x, G\rangle|^{-p}, \qquad \alpha_p = \big(\mathbb{E}|G_1|^{-p}\big)^{-1}. \qquad (20)$$
Therefore,
$$\mathbb{E}\Big\|\sum_{k=1}^n A_k \xi_k\Big\|^{-p} = \alpha_p\, \mathbb{E}\Big|\sum_{k=1}^n \langle \xi_k, A_k^\top G\rangle\Big|^{-p}.$$
Using this and inequality (5) (applied conditionally on $G$, since given $G$ the variables $\langle \xi_k, A_k^\top G\rangle$ have the same distribution as $\|A_k^\top G\|\, U_k$), we obtain
$$\mathbb{E}\Big\|\sum_{k=1}^n A_k \xi_k\Big\|^{-p} \le \alpha_p\, C_p\, \mathbb{E}_G \Big(\sum_{k=1}^n \|A_k^\top G\|^2\Big)^{-p/2}.$$
Rewriting the sum of squares using the second moment, applying Minkowski's inequality (with the negative exponent $-\frac2p$) and using (20) again (together with the convexity of $(\lambda_1, \lambda_2, \lambda_3) \mapsto \mathbb{E}\big(\sum_i \lambda_i G_i^2\big)^{-p/2}$, which yields $\mathbb{E}_G \|A^\top G\|^{-p} \le \alpha_p^{-1} \|A\|_{\mathrm{HS}}^{-p}$ for every $3\times3$ matrix $A$), we get
$$\Big(\mathbb{E}_G \Big(\sum_{k=1}^n \|A_k^\top G\|^2\Big)^{-p/2}\Big)^{-2/p} \ge \sum_{k=1}^n \big(\mathbb{E}_G \|A_k^\top G\|^{-p}\big)^{-2/p} \ge \alpha_p^{2/p} \sum_{k=1}^n \|A_k\|_{\mathrm{HS}}^2,$$
which plugged into the previous estimate gives (7).
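Finally, a Monte Carlo check (ours) of the identity (20) and of inequality (7) for an arbitrary choice of matrices:

```python
import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(2)
p = 0.9
c2 = 2 ** (1 - p / 2) / ((1 - p) * (2 - p))
cinf = 1.5 ** (p / 2) * gamma((1 - p) / 2) / np.sqrt(np.pi)
C_p = max(c2, cinf)

# identity (20) for a fixed x: ||x||^{-p} = alpha_p * E|<x, G>|^{-p}
x = np.array([1.0, -2.0, 0.5])
G = rng.standard_normal((500_000, 3))
alpha_p = 1 / np.mean(np.abs(G[:, 0]) ** -p)
print(np.linalg.norm(x) ** -p, alpha_p * np.mean(np.abs(G @ x) ** -p))

# inequality (7) for random 3x3 coefficient matrices
N, n = 500_000, 4
A = rng.standard_normal((n, 3, 3))
xi = rng.standard_normal((N, n, 3))
xi /= np.linalg.norm(xi, axis=2, keepdims=True)
S = np.einsum("kij,skj->si", A, xi)                 # sum_k A_k xi_k
lhs = np.mean(np.linalg.norm(S, axis=1) ** -p)
rhs = C_p * np.sum(np.linalg.norm(A, axis=(1, 2)) ** 2) ** (-p / 2)
print(lhs, rhs, lhs <= rhs)                         # (7) holds on this sample
```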

Conclusion
Continuing a long line of work, and particularly addressing some questions raised in [2], we have established a sharp Khinchin-type $L_{-p}$–$L_2$ moment comparison inequality for $p \in (0,1)$ for weighted sums of independent random variables uniform on $[-1,1]$, equivalently of uniform random vectors on the unit sphere $S^2$ in $\mathbb{R}^3$. In this case, this provides a sharp version of the very general results from [12].
We have not tried to optimise the various technical numerical estimates, which would certainly allow one to extend our results to $p \in (0, p_1)$, $p_1 = 1.38$ (the negative moments of order $-p$ for $S^2$-uniform vectors exist for all $p < 2$). The arguments seem robust enough to handle the cases of $S^r$-uniform vectors for other values of $r$ (most notably the case $r = 1$, corresponding to Steinhaus random variables, as well as the case $r = 3$, which would provide extensions of the polydisc slicing inequality of Oleszkiewicz and Pełczyński from [32], just as our result extends Ball's cube slicing inequality from [3]). Moreover, the question of a sharp $L_p$–$L_2$, $p \in (0,1)$, moment comparison for $\sum_{k=1}^n a_k U_k$ remains open, with a natural conjecture that
$$\inf_n \inf_{a \in \mathbb{R}^n} \frac{\mathbb{E}\big|\sum_{k=1}^n a_k U_k\big|^p}{\big(\sum_{k=1}^n a_k^2\big)^{p/2}} = \lim_{n\to\infty} \mathbb{E}\Big|\sum_{k=1}^n n^{-1/2}\, U_k\Big|^p$$
(see Question 5 and Proposition 15 in [10]). All this is the topic of ongoing and future work.