A note on suprema of canonical processes based on random variables with regular moments

We derive two-sided bounds for expected values of suprema of canonical processes based on random variables with moments growing regularly. We also discuss a Sudakov-type minoration principle for canonical processes.


Introduction and Main Results
In many problems arising in probability theory and its applications one needs to estimate the supremum of a stochastic process. In particular, it is very useful to be able to find two-sided bounds for the mean of the supremum. The modern approach to this challenge is based on chaining methods; see the monograph [15].
In this note we study the class of so-called canonical processes of the form $X_t = \sum_{i=1}^{\infty} t_i X_i$, where $X_i$ are independent random variables. If $X_i$ are standardized, i.e. have mean zero and variance one, then the above series converges a.s. for $t \in \ell^2$ and we may try to estimate $\mathbb{E}\sup_{t\in T} X_t$ for $T \subset \ell^2$. To avoid measurability questions we either assume that the index set $T$ is countable or define in a general situation $\mathbb{E}\sup_{t\in T} X_t := \sup\{\mathbb{E}\sup_{t\in F} X_t : F \subset T \text{ finite}\}$.
It is also more convenient to work with the quantity $\mathbb{E}\sup_{s,t\in T}(X_t - X_s)$ rather than $\mathbb{E}\sup_{t\in T} X_t$. Observe however that if the set $T$ or the variables $X_i$ are symmetric then
$$\mathbb{E}\sup_{s,t\in T}(X_s - X_t) = \mathbb{E}\sup_{s\in T} X_s + \mathbb{E}\sup_{t\in T}(-X_t) = 2\,\mathbb{E}\sup_{t\in T} X_t.$$
In the case when $X_i$ are i.i.d. $\mathcal{N}(0,1)$ r.v.s, $X_t$ is the canonical Gaussian process. Moreover, any centered separable Gaussian process has the Karhunen-Loève representation of such form (see e.g. Corollary 5.3.4 in [10]). In the Gaussian case the behaviour of the supremum of the process is related to the geometry of the metric space $(T, d_2)$, where $d_2$ is the $\ell^2$-metric $d_2(s,t) = (\mathbb{E}|X_s - X_t|^2)^{1/2}$. The celebrated Fernique-Talagrand majorizing measure bound (cf. [2,13]) may be expressed in the form
$$\frac{1}{C}\gamma_2(T) \le \mathbb{E}\sup_{t\in T} X_t \le C\gamma_2(T),$$
where here and in the sequel $C$ denotes a universal constant,
$$\gamma_2(T) := \inf\sup_{t\in T}\sum_{n=0}^{\infty} 2^{n/2}\Delta_2(\mathcal{A}_n(t)),$$
the infimum runs over all admissible sequences of partitions $(\mathcal{A}_n)_{n\ge 0}$ of the set $T$, $\mathcal{A}_n(t)$ is the unique set in $\mathcal{A}_n$ which contains $t$, and $\Delta_2$ denotes the $\ell^2$-diameter. An increasing sequence of partitions $(\mathcal{A}_n)_{n\ge 0}$ of $T$ is called admissible if $\mathcal{A}_0 = \{T\}$ and $|\mathcal{A}_n| \le N_n := 2^{2^n}$ for $n \ge 1$.
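To make the definition of the $\gamma_2$-functional concrete, the following minimal sketch evaluates the chaining sum $\sup_{t\in T}\sum_n 2^{n/2}\Delta_2(\mathcal{A}_n(t))$ for one hand-picked admissible sequence of partitions of a small finite set in the plane. Since $\gamma_2(T)$ is an infimum over all admissible sequences, the value obtained is only an upper bound for $\gamma_2(T)$. The function names and the example set are illustrative assumptions, and the code does not check that the partitions are nested.

```python
import math
from itertools import combinations

def admissible_bound(n):
    """Cardinality bound for the n-th partition: 1 for n = 0, N_n = 2^(2^n) for n >= 1."""
    return 1 if n == 0 else 2 ** (2 ** n)

def l2_diameter(points):
    """Euclidean diameter of a finite set of points (0 for a single point)."""
    return max((math.dist(s, t) for s, t in combinations(points, 2)), default=0.0)

def chaining_sum(partitions):
    """sup_t sum_n 2^(n/2) * Delta_2(A_n(t)) for one sequence of partitions.

    partitions[n] is a list of blocks, each block a list of points (tuples).
    The last listed partition is expected to consist of singletons, so that
    all later levels contribute zero to the sum.
    """
    for n, blocks in enumerate(partitions):
        assert len(blocks) <= admissible_bound(n), "too many blocks at this level"
    points = [t for block in partitions[0] for t in block]
    best = 0.0
    for t in points:
        total = 0.0
        for n, blocks in enumerate(partitions):
            block = next(b for b in blocks if t in b)  # the block A_n(t)
            total += 2 ** (n / 2) * l2_diameter(block)
        best = max(best, total)
    return best

# Hypothetical example: the four vertices of the unit square.
T = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
partitions = [
    [T],               # A_0 = {T}
    [[t] for t in T],  # A_1: four singletons, allowed since N_1 = 4
]
upper = chaining_sum(partitions)  # an upper bound for gamma_2(T)
```

Here only the level $n = 0$ contributes, so the sum equals the diameter of the square.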
In [14] Talagrand derived two-sided bounds for suprema of the canonical processes based on i.i.d. symmetric r.v.s $X_i$ such that $\mathbb{P}(|X_i| > t) = \exp(-|t|^p)$, $1 \le p < \infty$. This result was later extended in [7] to the case of variables with (not too rapidly decreasing) log-concave tails, i.e. to the case when $X_i$ are symmetric, independent, $\mathbb{P}(|X_i| \ge t) = \exp(-N_i(t))$, $N_i : [0,\infty) \to [0,\infty)$ are convex and $N_i(2t) \le \gamma N_i(t)$ for $t > 0$ and some constant $\gamma$. The aim of this note is to find two-sided bounds for suprema for a more general class of canonical processes.
For a general process $(X_t)_{t\in T}$ one needs to study a family of metrics instead of a single one. We define
$$d_p(s,t) := \|X_s - X_t\|_p, \quad p \ge 1,\ s,t \in T,$$
where for a real random variable $Y$ and $p \ge 1$, $\|Y\|_p := (\mathbb{E}|Y|^p)^{1/p}$ denotes the $L_p$-norm of $Y$. Following ideas of Talagrand, we define the functional
$$\gamma_X(T) := \inf\sup_{t\in T}\sum_{n=0}^{\infty}\Delta_{2^n}(\mathcal{A}_n(t)),$$
where $\Delta_p$ denotes the diameter with respect to the distance $d_p$ and, as in the case of the $\gamma_2$-functional, the infimum runs over all admissible sequences of partitions $(\mathcal{A}_n)$ of the set $T$.
It is not hard to show (as was noted independently by Mendelson …) that
$$\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \le C\gamma_X(T). \tag{1}$$
To reverse bound (1) we need some regularity assumptions. We express them for canonical processes in terms of the growth of moments of the variables $X_i$. It is easy to check that for a symmetric variable $Y$ with a log-concave tail $\exp(-N(t))$, $\|Y\|_p \le C\frac{p}{q}\|Y\|_q$ for $p \ge q \ge 2$. Moreover, the additional condition $N(2t) \le \gamma N(t)$ yields $\|Y\|_{\beta p} \ge 2\|Y\|_p$ for $p \ge 2$ and a constant $\beta$ which depends only on $\gamma$. This motivates the following definitions.
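Definition 1. For $\alpha \ge 1$ we say that moments of a random variable $X$ grow $\alpha$-regularly if
$$\|X\|_p \le \alpha\frac{p}{q}\|X\|_q \quad \text{for } p \ge q \ge 2.$$
Definition 2. For $\beta < \infty$ we say that moments of a random variable $X$ grow with speed $\beta$ if
$$\|X\|_{\beta p} \ge 2\|X\|_p \quad \text{for } p \ge 2.$$
The class of all standardized random variables with the $\alpha$-regular growth of moments will be denoted by $\mathcal{R}_\alpha$ and with moments growing with speed $\beta$ by $\mathcal{S}_\beta$.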
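As a numerical illustration of the two conditions, namely $\|X\|_p \le \alpha\frac{p}{q}\|X\|_q$ for $p \ge q \ge 2$ and $\|X\|_{\beta p} \ge 2\|X\|_p$ for $p \ge 2$, the sketch below tests them on a grid for the symmetric exponential variable with $\mathbb{P}(|X| > t) = e^{-t}$, for which $\mathbb{E}|X|^p = \Gamma(p+1)$. Both conditions are invariant under rescaling of $X$, so the variable is not standardized here. The grid, the choice $\beta = 4$ and the helper names are assumptions made for the example; a grid check is of course not a proof.

```python
import math

def exp_norm(p):
    """L_p norm of a symmetric exponential variable with P(|X| > t) = exp(-t).

    E|X|^p = Gamma(p + 1), hence ||X||_p = Gamma(p + 1)^(1/p).
    """
    return math.exp(math.lgamma(p + 1.0) / p)

def smallest_alpha(norm, grid):
    """Smallest alpha with ||X||_p <= alpha * (p/q) * ||X||_q for all grid points p >= q."""
    return max(norm(p) * q / (p * norm(q)) for q in grid for p in grid if p >= q)

def has_speed(norm, beta, grid):
    """Check ||X||_{beta p} >= 2 ||X||_p for every p in the grid."""
    return all(norm(beta * p) >= 2.0 * norm(p) for p in grid)

grid = [2.0 + 0.5 * k for k in range(197)]   # p = 2, 2.5, ..., 100
alpha_hat = smallest_alpha(exp_norm, grid)    # close to 1 on this grid
speed_ok = has_speed(exp_norm, 4.0, grid)     # speed beta = 4 holds on this grid
```

On this grid the ratio $\|X\|_p/p$ is nonincreasing in $p$, so the smallest admissible $\alpha$ is attained at $p = q$.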
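Theorem 1. Let $X_t = \sum_{i=1}^{\infty} t_i X_i$, $t \in \ell^2$, be the canonical process based on independent standardized r.v.s $X_i$ with moments growing $\alpha$-regularly with speed $\beta$ for some $\alpha \ge 1$ and $\beta > 1$. Then for any $T \subset \ell^2$,
$$\frac{1}{C(\alpha,\beta)}\gamma_X(T) \le \mathbb{E}\sup_{s,t\in T}(X_s - X_t) \le C\gamma_X(T).$$
Here and in the sequel $C(\alpha,\beta)$ denotes a constant which depends only on $\alpha$ and $\beta$ (and which may differ at each occurrence). The above result easily yields the following comparison result for suprema of processes.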
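Corollary 2. Let $X_t$ be as in Theorem 1. Then for any nonempty $T \subset \ell^2$ and any process $(Y_t)_{t\in T}$ such that $\|Y_s - Y_t\|_p \le \|X_s - X_t\|_p$ for $p \ge 1$ and $s,t \in T$, we have
$$\mathbb{E}\sup_{s,t\in T}(Y_s - Y_t) \le C(\alpha,\beta)\,\mathbb{E}\sup_{s,t\in T}(X_s - X_t).$$
Proof. The assumption implies $\gamma_Y(T) \le \gamma_X(T)$ and the result immediately follows by the lower bound in Theorem 1 and estimate (1) used for the process $Y$.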
In fact one may show a stronger result.
Corollary 3. Let $X_t$ and $Y_t$ be as in Corollary 2. Then for $u \ge 0$,
Another consequence of Theorem 1 is the following striking bound for suprema of some canonical processes.
Corollary 4. Let $X_t$ be as in Theorem 1 and $T \subset \ell^2$ be such that $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) < \infty$. Then there exist $t^1, t^2, \ldots \in \ell^2$ such that $T - T \subset \mathrm{conv}\{\pm t^n : n \ge 1\}$ and $\|X_{t^n}\|_{\log(n+2)} \le C(\alpha,\beta)\,\mathbb{E}\sup_{s,t\in T}(X_s - X_t)$.
Remark 1. The reverse statement easily follows by the union bound and Chebyshev's inequality. Namely, for any canonical process $(X_t)_{t\in\ell^2}$ and any nonempty set $T \subset \ell^2$ such that $T - T \subset \mathrm{conv}\{\pm t^n : n \ge 1\}$ and $\|X_{t^n}\|_{\log(n+2)} \le M$ one has $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \le CM$. For details see the argument after Corollary 1.2 in [1].
where $(e_n)$ is the canonical basis of $\ell^2$. Then obviously $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) = 2$; moreover, for any $A \subset T$ with cardinality at least 2 we have $\Delta_{2^k}(A) \ge \Delta_2(A) = \sqrt{2}$, hence $\gamma_X(T) = \infty$. Therefore one cannot reverse bound (1) for Bernoulli processes, so some assumption on a nontrivial speed of growth of moments is necessary in Theorem 1. However, Corollary 4 holds for Bernoulli processes and we believe that in that statement the assumption of the $\beta$-speed of the growth of moments is not needed.
The crucial step in deriving lower bounds for suprema of stochastic processes is the Sudakov minoration principle. Following [8] (see also [11]) we say that a process $(X_t)_{t\in S}$ satisfies the Sudakov minoration principle with constant $\kappa > 0$ if for any $p \ge 1$, $u > 0$ and $T \subset S$ with $|T| \ge e^p$ such that $\|X_s - X_t\|_p \ge u$ for all distinct $s,t \in T$, we have
$$\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \ge \kappa u.$$
Theorem 5. Suppose that $X_1, X_2, \ldots$ are independent standardized r.v.s and moments of $X_i$ grow $\alpha$-regularly for some $\alpha \ge 1$. Then the canonical process $X_t = \sum_{i=1}^{\infty} t_i X_i$, $t \in \ell^2$, satisfies the Sudakov minoration principle with constant $\kappa(\alpha)$, which depends only on $\alpha$.
In fact the assumption on regular growth of moments is necessary for the Sudakov minoration principle in the i.i.d. case.
Proposition 6. Suppose that a canonical process $X_t = \sum_{i=1}^{\infty} t_i X_i$, $t \in \ell^2$, based on i.i.d. standardized random variables $X_i$ satisfies the Sudakov minoration principle with constant $\kappa > 0$. Then moments of $X_i$ grow $C/\kappa$-regularly.
The next simple observation shows that (under mild regularity assumptions) the Sudakov minoration is necessary for reversing bound (1).

Remark 3.
Suppose that for any finite $T \subset \ell^2$ we have $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \ge \kappa\gamma_X(T)$. Assume moreover that for any $p \ge 1$ and $t \in \ell^2$, $\|X_t\|_{2p} \le \gamma\|X_t\|_p$. Then $X$ satisfies the Sudakov minoration principle with constant $\kappa/\gamma$.
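Proof. Let $p \ge 1$ and $T \subset \ell^2$ of cardinality at least $e^p$ be such that $\|X_s - X_t\|_p \ge u$ for any $s,t \in T$, $s \ne t$. Let $2^k \le p < 2^{k+1}$ and let $(\mathcal{A}_n)$ be an admissible sequence of partitions of the set $T$. Since $|\mathcal{A}_k| \le 2^{2^k} < e^p \le |T|$, there is $A \in \mathcal{A}_k$ which contains at least two points of $T$. Hence, using $X_s - X_t = X_{s-t}$ and the assumption $\|X_t\|_{2p} \le \gamma\|X_t\|_p$,
$$\sup_{t\in T}\sum_{n=0}^{\infty}\Delta_{2^n}(\mathcal{A}_n(t)) \ge \Delta_{2^k}(A) \ge \frac{1}{\gamma}\Delta_{2^{k+1}}(A) \ge \frac{1}{\gamma}\Delta_p(A) \ge \frac{u}{\gamma}.$$
Taking the infimum over admissible sequences gives $\gamma_X(T) \ge u/\gamma$, so $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \ge \kappa\gamma_X(T) \ge \kappa u/\gamma$.
In fact in the i.i.d. case we do not need the regularity assumption $\|X_t\|_{2p} \le \gamma\|X_t\|_p$.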
Then $(X_t)_{t\in\ell^2}$ satisfies the Sudakov minoration principle with constant $\kappa/2$. In particular, moments of $X_i$ grow $C/\kappa$-regularly.
Methods developed to prove Theorem 5 also enable us to establish the following comparison of weak and strong moments of the canonical processes based on variables with regular growth of moments.
Theorem 8. Let $X_t$ be as in Theorem 5. Then for any nonempty $T \subset \ell^2$ and $p \ge 1$,
This paper is organized as follows. In the next section we gather some general facts. In Section 3 we study the class $\mathcal{R}_\alpha$ and show that variables in this class have tails comparable to variables with log-concave tails. Based on this observation we establish the Sudakov minoration principle (Theorem 5). We finish that section with the proofs of Theorem 8 and Proposition 6. Section 4 is devoted to reversing bound (1). We obtain further regularity properties of the tails of variables from the class $\mathcal{R}_\alpha \cap \mathcal{S}_\beta$ and then prove Theorem 1 as well as Corollaries 3 and 4. We close Section 4 by proving Proposition 7.

Notation
By $(\varepsilon_i)$ we denote a Bernoulli sequence, i.e. a sequence of i.i.d. symmetric r.v.s taking values $\pm 1$. We assume that the variables $\varepsilon_i$ are independent of other r.v.s. By the letter $C$ we denote universal constants. The value of the constant $C$ may differ at each occurrence. Whenever we want to fix the value of an absolute constant we use the letters $C_1, C_2, \ldots$. We write $C(\alpha)$ (resp. $C(\alpha,\beta)$, etc.) for constants depending only on the parameter $\alpha$ (resp. $\alpha, \beta$, etc.).

Preliminaries
In this section we gather basic facts used in the sequel. We start with the contraction principle for Bernoulli processes (see e.g. [9,Theorem 4.4]).
where $F : \mathbb{R}_+ \to \mathbb{R}_+$ is a convex function. In particular,
Moreover, for a nonempty subset $T$ of $\mathbb{R}^n$,
The next lemma is a standard symmetrization argument (see e.g. [9, Lemma 6.3]).
Lemma 10 (Symmetrization). Let $X_i$ be independent standardized r.v.s and $(\varepsilon_i)$ be a Bernoulli sequence independent of $(X_i)$. Define two canonical processes
and for any $T \subset \ell^2$,
Let us also recall the Paley-Zygmund inequality (cf. [4, Lemma 0.2.1]), which goes back to the work [12] on trigonometric series.
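Lemma 11 (Paley-Zygmund inequality). For any nonnegative random variable $S$ and $\lambda \in (0,1)$,
$$\mathbb{P}(S \ge \lambda\,\mathbb{E}S) \ge (1-\lambda)^2\frac{(\mathbb{E}S)^2}{\mathbb{E}S^2}. \tag{5}$$
The next lemma shows that convolution preserves (up to a universal constant) the property of the $\alpha$-regular growth of moments.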
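The Paley-Zygmund inequality of Lemma 11, in its standard form $\mathbb{P}(S \ge \lambda\,\mathbb{E}S) \ge (1-\lambda)^2(\mathbb{E}S)^2/\mathbb{E}S^2$, can be sanity-checked exactly on a finite distribution. The sketch below uses rational arithmetic to compare both sides for several values of $\lambda$; the particular distribution and the function name are assumptions chosen only for illustration.

```python
from fractions import Fraction

def paley_zygmund_sides(values, probs, lam):
    """Return (P(S >= lam * ES), (1 - lam)^2 (ES)^2 / ES^2) for a finite nonnegative S."""
    mean = sum(v * p for v, p in zip(values, probs))
    second = sum(v * v * p for v, p in zip(values, probs))
    lhs = sum(p for v, p in zip(values, probs) if v >= lam * mean)
    rhs = (1 - lam) ** 2 * mean ** 2 / second
    return lhs, rhs

# Hypothetical nonnegative variable: values 0, 1, 4, 9 with the given probabilities.
values = [Fraction(v) for v in (0, 1, 4, 9)]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

# Both sides of the inequality for lambda = 0.1, 0.2, ..., 0.9.
checks = [paley_zygmund_sides(values, probs, Fraction(k, 10)) for k in range(1, 10)]
```

Each pair in `checks` holds the exact left-hand and right-hand sides, so the inequality can be verified without floating-point error.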
Proof. We are to show that $\|S\|_p \le C\alpha\frac{p}{q}\|S\|_q$ for $p \ge q \ge 2$. By Lemma 10 we may assume that the r.v.s $X_i$ are symmetric. Moreover, by monotonicity of moments, it is enough to consider only the case when $p$ and $q$ are even integers and $p \ge 2q$. In [6] it was shown that for $r \ge 2$,
Therefore it is enough to prove the following claim.
Claim. Suppose that Y is a symmetric r.v. with moments growing α-regularly. Let p, q be positive even integers such that p ≥ 2q and To show the claim first notice that On the other hand, Since α ≥ 1 we obviously have The α-regularity of moments of Y yields which completes the proof of the claim and of the lemma.
We finish this section with the observation that will allow us to compare regular r.v.s with variables with log-concave tails.
Lemma 13. Let a nondecreasing function $f$ :
where $t_0 \ge 0$, $c \ge 2$ are some constants. Then there is a function $g$ :
and $g(ct_0) = 0$.
Proof. For $t \ge ct_0$ we set
Then $g$ is convex on $[ct_0, \infty)$ as an integral of a nondecreasing function. For $t \ge x \ge ct_0$ we have $\sup_{ct_0 \le y \le x} f(y/c)/y \le f(t)/t$, as $f(\lambda y)/(\lambda y) \ge f(y/c)/y$ for $y \ge ct_0$ and $\lambda \ge 1$. Thus
Moreover, for $t \ge ct_0$

Sudakov minoration principle
The main goal of this section is to prove Theorem 5. The strategy of the proof is to reduce the problem involving random variables with moments growing regularly to the case of random variables with log-concave tails, for which the minoration is known (see [7, Theorem 1]). The relevant result can be restated as follows.
Theorem 14. Let $X_t = \sum_{i=1}^{\infty} t_i X_i$, $t \in \ell^2$, be the canonical process based on independent symmetric r.v.s $X_i$ with log-concave tails. Then $(X_t)_{t\in\ell^2}$ satisfies the Sudakov minoration principle with a universal constant $\kappa_{\mathrm{lct}} > 0$.
Remark 4. Since we may normalize $X_i$ we do not need to assume that they have variance one. It suffices to have $\sup_i \mathrm{Var}(X_i) < \infty$ in order that $X_t$ is well defined for $t \in \ell^2$.
The mentioned reduction hinges on the idea that the tail functions of random variables with regular growth of moments ought to be close to log-concave functions as, conversely, log-concave random variables are regular.
Proof. Fix $\alpha \ge 1$. We begin by showing that there is a constant $\kappa_\alpha$ such that for any
When $\|X\|_\infty < \infty$ it is enough to prove this assertion for $t < (1-1/e)\|X\|_\infty$ as, providing
So, fix $\lambda \ge 1$ and $1 - 1/e \le t < (1-1/e)\|X\|_\infty$. There exists $q \ge 2$ such that $t = (1-1/e)\|X\|_q$. Pick also $p \ge q$ so that $\lambda = p/q$. By the Paley-Zygmund inequality (5) and by the assumption that $X \in \mathcal{R}_\alpha$ we obtain
On the other hand, setting $\kappa_\alpha = e\,b_\alpha(1-1/e)^{-1}\alpha$, with the aid of the assumption that $X \in \mathcal{R}_\alpha$ and Chebyshev's inequality, we get
Joining inequalities (8) and (9) we get (7) with $\kappa_\alpha = \frac{4e^2}{e-1}\alpha^3$. By virtue of this sublinear property (7), Lemma 13 applied to $f = N$, $c = \kappa_\alpha$, and $t_0 = 1 - 1/e$ finishes the proof, providing the constants
Proof of Theorem 5. We fix $p \ge 2$ and $T \subset \ell^2$ such that $|T| \ge e^p$ and $\|X_s - X_t\|_p \ge u$ for all distinct $s,t \in T$. We are to show that $\mathbb{E}\sup_{s,t\in T}(X_s - X_t) \ge \kappa_\alpha u$ for a constant $\kappa_\alpha$ which depends only on $\alpha$. By Lemma 10 we may assume that the r.v.s $X_i$ are symmetric. Proposition 15 yields that the tail functions $N_i(t) := -\ln\mathbb{P}(|X_i| > t)$ of the variables $X_i$ are controlled by the convex functions $M_i(t)$, apart from $t \le T_\alpha$, i.e. we have $M_i(t) \le N_i(t) \le M_i(L_\alpha t)$ only for $t \ge T_\alpha$. To gain control also for $t \le T_\alpha$, define the symmetric random variables $\tilde{X}_i = (\mathrm{sgn}\,X_i)\max\{|X_i|, T_\alpha\}$, so that their tail functions $\tilde{N}_i(t) = -\ln\mathbb{P}(|\tilde{X}_i| > t)$,
This allows us to construct a sequence $Y_1, Y_2, \ldots$ of independent symmetric r.v.s with log-concave tails given by $\mathbb{P}(|Y_i| > t) = e^{-M_i(t)}$ such that
Define the canonical processes
where the first inequality follows by contraction principle (3) as $|Y_i| \ge |\tilde{X}_i| \ge |X_i|$. Hence we can apply Theorem 14 to the canonical process $(Y_t)$ and obtain $2\,\mathbb{E}\sup$
To finish the proof it suffices to show that $\mathbb{E}\sup_{t\in T} X_t$ majorizes $\mathbb{E}\sup_{t\in T} Y_t$. Clearly,
Recall that by the definition of $\tilde{X}_i$, $|\tilde{X}_i - X_i| = \bigl|T_\alpha - |X_i|\bigr|\mathbf{1}_{\{|X_i| \le T_\alpha\}} \le T_\alpha$.
As a consequence, the supremum of the canonical process $\mathbb{E}\sup_{t\in T}(\tilde{X}_t - X_t)$ is bounded by the supremum of the Bernoulli process $\mathbb{E}\sup_{t\in T}\sum_i t_i T_\alpha\varepsilon_i$. Indeed, using the symmetry of the distribution of the variables $\tilde{X}_i - X_i$ and contraction principle (4),
Since $X_i \in \mathcal{R}_\alpha$ we get by Hölder's inequality,
$$1 = \mathbb{E}X_i^2 \le (\mathbb{E}|X_i|)^{2/3}(\mathbb{E}X_i^4)^{1/3} \le (\mathbb{E}|X_i|)^{2/3}(2\alpha)^{4/3},$$
and thus $\mathbb{E}|X_i| \ge (2\alpha)^{-2}$. Hence by Jensen's inequality and by (13),
Finally, notice that, by virtue of contraction principle (4), the second inequality of (11) implies that
Estimates (12), (14) and (15) yield
Proof of Theorem 8. Using a symmetrization argument we may assume that the variables $X_i$ are symmetric. Let the variables $\tilde{X}_i$, $Y_i$ and the related canonical processes be as in the proof of Theorem 5. Since the variables $Y_i$ have log-concave tails, by [5] we get
We showed above that
Finally the contraction principle together with the bounds
We conclude this section with the proof of Proposition 6, showing that in the i.i.d. case the Sudakov minoration principle and the $\alpha$-regular growth of moments are equivalent.
Proof of Proposition 6. Let us fix $p \ge q \ge 2$ and for $1 \le m \le n$ consider the following subset of $\ell^2$
Then $|T| = \binom{n}{m} \ge (n/m)^m \ge e^p$ if $n \ge me^{p/m}$. Moreover, for any $s,t \in T$, $s \ne t$, say with $s_j \ne t_j$, we have $\|X_s - X_t\|_p \ge \|X_j\|_p$. Thus the Sudakov minoration principle yields for any $n \ge me^{p/m}$,
where $(X_1^*, X_2^*, \ldots, X_n^*)$ is the nonincreasing rearrangement of $(|X_1|, |X_2|, \ldots, |X_n|)$. We have
Integration by parts shows that
Combining this with (16) we get (recall that $q \ge 2$ and the constant $C$ may differ at each occurrence)
Taking $m = \lceil p/q \rceil$ and $n = \lceil me^{p/m} \rceil$ we find that $n^{1/q}m^{1-1/q} \le 4ep/q$. Hence
which finishes the proof.
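The counting step in the proof above uses two elementary facts: $\binom{n}{m} \ge (n/m)^m \ge e^p$ whenever $n \ge me^{p/m}$, and $n^{1/q}m^{1-1/q} \le 4ep/q$ for the choice $m = \lceil p/q \rceil$, $n = \lceil me^{p/m} \rceil$. The following sketch checks both numerically for integer pairs $p \ge q \ge 2$ in a small range. The range, the tolerance and the function names are assumptions made for the example, and a finite check does not replace the proof.

```python
import math

def proposition6_parameters(p, q):
    """Return (m, n) with m = ceil(p/q) and n = ceil(m * e^(p/m))."""
    m = math.ceil(p / q)
    n = math.ceil(m * math.exp(p / m))
    return m, n

def check_counting_step(p, q, tol=1e-9):
    """Check the three inequalities of the counting step for given p >= q >= 2."""
    m, n = proposition6_parameters(p, q)
    # binom(n, m) >= (n/m)^m, compared exactly in integers
    binom_ok = math.comb(n, m) * m ** m >= n ** m
    # (n/m)^m >= e^p, compared on the log scale with a small tolerance
    size_ok = m * math.log(n / m) >= p - tol
    # n^(1/q) * m^(1 - 1/q) <= 4 e p / q
    bound_ok = n ** (1.0 / q) * m ** (1.0 - 1.0 / q) <= 4.0 * math.e * p / q
    return binom_ok and size_ok and bound_ok

all_ok = all(
    check_counting_step(p, q)
    for q in range(2, 9)
    for p in range(q, 41)
)
```

The first comparison is done in exact integer arithmetic because $\binom{n}{m}$ can be very large for the values of $n$ involved.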

Lower bounds for suprema of canonical processes
As in the case of the Sudakov minoration principle the proof of the lower bound in Theorem 1 is based on the corresponding result for the canonical processes built on variables with log-concave tails. Theorem 3 in [7] (see also Theorem 10.2.7 and Exercise 10.2.14 in [15]) implies the following result.
Theorem 16. Let $X_t = \sum_{i=1}^{\infty} t_i X_i$, $t \in \ell^2$, be the canonical process based on independent symmetric r.v.s $X_i$ with log-concave tails. Assume moreover that there exists $\gamma$ such that $N_i(2t) \le \gamma N_i(t)$ for all $i$ and $t > 0$, where $N_i(t) = -\ln\mathbb{P}(|X_i| > t)$. Then there exists a constant $C_{\mathrm{lct}}(\gamma)$, which depends only on $\gamma$, such that for any $T \subset \ell^2$,
Remark 5. Theorem 3 in [7] and Theorem 10.2.7 in [15] were formulated in a slightly different language. In particular, the latter states that there exist $r > 2$, an admissible sequence of partitions $(\mathcal{A}_n)$ and numbers $j_n(A)$ for $A \in \mathcal{A}_n$ such that $\varphi_{j_n(A)}(s,s') \le 2^{n+1}$ for all $s,s' \in A$ and
(For the definition of $\varphi$ see [15]; it precedes the statement of Theorem 10.2.7.) However, the condition $\varphi_{j_n(A)}(s,s') \le 2^{n+1}$ yields that $\|X_s - X_{s'}\|_{2^n} \le C2^n r^{-j_n(A)}$ (see [3] for the i.i.d. case and Example 3 in [6] for the general situation), so $\Delta_{2^n}(\mathcal{A}_n(t)) \le C2^n r^{-j_n(\mathcal{A}_n(t))}$ and
Proposition 17. Let $\alpha \ge 1$, $\beta > 1$. For any $r > 1$ there exists a constant $C(\alpha,\beta,r)$ such that for $X \in \mathcal{R}_\alpha \cap \mathcal{S}_\beta$ we have
where $N(t) := -\ln\mathbb{P}(|X| > t)$.
Remark 6. Taking in (18) $t = 2$, which corresponds to $q = 2$, we find that $N(s) \le 2(\ln 2 + 2\beta^k\ln(2\alpha))$ for $s < 2^{k-1}$, which means that the tail distribution function of a variable $X \in \mathcal{R}_\alpha \cap \mathcal{S}_\beta$ at a certain value $s$ is bounded by a constant depending not on the distribution of $X$ but only on the parameters $\alpha$, $\beta$ and, of course, the value of $s$.
Proof of Theorem 1. In view of (1) we are to address only the lower bound on $\mathbb{E}\sup_{t\in T} X_t$. A symmetrization argument shows that we may assume that the variables $X_i$ are symmetric. Given symmetric $X_i$, let $Y_i$ be the random variables defined as in the proof of Theorem 5, i.e. the $Y_i$ are independent symmetric r.v.s having log-concave tails $\mathbb{P}(|Y_i| > t) = e^{-M_i(t)}$. Moreover, let $L_\alpha$, $T_\alpha$ be the constants as in Proposition 15. Due to Proposition 17 for $r = 2L_\alpha$ we know that the functions $N_i(t) := -\ln\mathbb{P}(|X_i| > t)$ satisfy
where $\gamma = \gamma(\alpha,\beta) := C(\alpha,\beta,2L_\alpha)$. What then can be said about $M_i$? Using (6) we find that for $t \ge \tilde{T}_\alpha := \max\{2, T_\alpha\}$
which means that the $M_i$ are almost of moderate growth, namely for $t_\alpha := L_\alpha\tilde{T}_\alpha$ we have
Therefore, we improve the function $M_i$ by putting on the interval $[0, t_\alpha]$ an artificial linear piece $t \mapsto \lambda(i,\alpha)t$, where $\lambda(i,\alpha) := M_i(t_\alpha)/t_\alpha$. In other words, take the numbers $p(i,\alpha) := \mathbb{P}(|Y_i| > t_\alpha) = e^{-M_i(t_\alpha)}$ and let $U_i$ be a sequence of independent random variables with the following symmetric truncated exponential distribution,
which are in addition independent of the sequences $(X_i)$ and $(Y_i)$. Define
Then $\tilde{M}_i$ are convex functions of moderate growth, i.e.
where $\tilde{\gamma} = \tilde{\gamma}(\alpha,\beta) := \max\{2, \gamma\}$. Thus Theorem 16 can be applied to the canonical process $Z_t := \sum_i t_i Z_i$ and we get
What is left is to compare both the suprema and the functionals $\gamma$ of the processes $(X_t)$ and $(Z_t)$. The former is easy, because we have $M_i(t) \le \tilde{M}_i(t)$, $t \ge 0$, which allows us to take samples such that $|Y_i| \ge |Z_i|$, and consequently, thanks to contraction principle (4), $\mathbb{E}\sup_{t\in T} Z_t \le \mathbb{E}\sup_{t\in T} Y_t$. Joining this with estimates (15) and (14) we derive
For the latter, we would like to show $C(\alpha,\beta)\gamma_Z \ge \gamma_X$. It is enough to compare the metrics, i.e. to prove that $C(\alpha,\beta)\|Z_s - Z_t\|_p \ge \|X_s - X_t\|_p$ for $p \ge 1$. We proceed as in the proof of Theorem 5. We have
In the proof of Theorem 5 it was established that $\|Y_s - Y_t\|_p \ge \|X_s - X_t\|_p$. For the second term we use the symmetry of the variables $Y_i - Z_i$, contraction principle (3), and the fact that
Now we compare $\|Z_s - Z_t\|_p$ with moments of increments of the Bernoulli process. By Jensen's inequality we get
Combining (19), (20), and (21) yields
To finish it suffices to prove that $\mathbb{E}|Z_i| \ge c_{\alpha,\beta}$ for some positive constant $c_{\alpha,\beta}$ which depends only on $\alpha$ and $\beta$. This is a cumbersome yet simple calculation. Recall the distributions of the variables $Y_i$ and $U_i$, the fact that they are independent, and observe that
The last expression is nonincreasing with respect to $M_i(t_\alpha)$. Since $M_i(t_\alpha) \le N_i(t_\alpha)$ (see (6)), we are done provided that we can bound $N_i(t_\alpha)$ from above. Thus, Remark 6 completes the proof.
Proof of Corollary 3. Proposition 20 in [8] yields for p ≥ 1, where the third inequality follows by Theorem 1. Hence by Chebyshev's inequality we obtain Theorem 8 (used for the set T − T ) and Lemma 12 yield for p ≥ q ≥ 1, sup s,t∈T Hence, by the Paley-Zygmund inequality we get for q ≥ 1, Applying the above estimate with q = p/ ln(2C 2 (α)) we get e −p for p ≥ ln(2C 2 (α)).
Proof of Proposition 7. Fix $p \ge 1$ and $T \subset \ell^2$ such that $|T| \ge e^p$ and $\|X_s - X_t\|_p \ge u$ for distinct points $s,t \in T$. For $t^1, t^2 \in T$ define a new point in $\ell^2$ by $t(t^1,t^2) := (t^1_1, t^2_1, t^1_2, t^2_2, \ldots)$. Put also $\tilde{T} := \{t(t^1,t^2) : t^1, t^2 \in T\}$. It is not hard to see that $\|X_s - X_t\|_p \ge u$ for $t,s \in \tilde{T}$, $t \ne s$.
Choose an integer $k$ such that $2^k \le p < 2^{k+1}$ and let $(\mathcal{A}_n)$ be an admissible sequence of partitions of the set $\tilde{T}$. Since $|\tilde{T}| = |T|^2 \ge e^{2p} > 2^{2^{k+1}}$, there is $A \in \mathcal{A}_k$ which contains at least two points of $\tilde{T}$. Hence