A remark on moment-dependent phase transitions in high-dimensional Gaussian approximations

In this article, we study the critical growth rates of dimension below which Gaussian critical values can be used for hypothesis testing but beyond which they cannot. We are particularly interested in how these growth rates depend on the number of moments that the observations possess.

It is crucial to know the critical growth rate of the dimension $d$ as a function of the sample size $n$ below which $\rho_n(\mathcal{A})$ vanishes asymptotically. In particular, by Remark 2 in Zhang and Wu (2017), there exist distributions with $m$ moments such that $\lim_{n\to\infty} \rho_n(\mathcal{H}) = 1$ as soon as $\limsup_{n\to\infty} d/n^{m/2-1+\varepsilon} > 0$ for some $\varepsilon \in (0, \infty)$, where $\mathcal{H}$ denotes the class of sets $\{w \in \mathbb{R}^d : \max_{1\le j\le d} w_j \le t\}$, $t \in \mathbb{R}$. On the other hand, it is known (and a simple consequence of, e.g., Theorem 2 in Chernozhukov et al. (2023a), as detailed in Theorem 2.2 below) that $\lim_{n\to\infty} \rho_n(\mathcal{R}) = 0$ uniformly over a large family of distributions with bounded $m$th moments if there exists an $\varepsilon \in (0, \infty)$ such that $\lim_{n\to\infty} d/n^{m/2-1-\varepsilon} = 0$, where $\mathcal{R}$ denotes the class of hyperrectangles in $\mathbb{R}^d$. Because $\mathcal{H} \subseteq \mathcal{R}$, it follows that for Gaussian approximations over $\mathcal{H}$ and $\mathcal{R}$ a critical phase transition occurs at $d = n^{m/2-1}$: as $d$ passes this threshold from below, the limiting GAE jumps from zero to one.

We emphasize the following consequence of this phase transition. A primary reason for the surge of interest in high-dimensional Gaussian approximations is that they justify the use of Gaussian critical values for hypothesis testing and for the construction of confidence sets based on the statistic $\max_{1\le j\le d} S_{nj}$.¹ For this purpose it is enough that the Gaussian approximation be valid at the critical values $c_d(\alpha)$ of the targeted size $\alpha \in (0, 1)$ of the test only, rather than uniformly over $\mathcal{H}$. That is, for fixed $\alpha \in (0, 1)$ and $c_d(\alpha)$ satisfying $P(\max_{1\le j\le d} Z_j > c_d(\alpha)) \to \alpha$, only

$$\lim_{n\to\infty} \left[ P\left( \max_{1\le j\le d} S_{nj} > c_d(\alpha) \right) - P\left( \max_{1\le j\le d} Z_j > c_d(\alpha) \right) \right] = 0 \qquad (3)$$

needs to hold, but not the stronger property

$$\lim_{n\to\infty} \rho_n(\mathcal{H}) = 0 \qquad (4)$$

typically focused on in the literature. In particular, even though there exist distributions with $m$ moments such that (4) fails to be true when $\limsup_{n\to\infty} d/n^{m/2-1+\varepsilon} > 0$, by the mentioned result in Zhang and Wu (2017), the statistically important approximation in (3) could still hold. This would open the door to Gaussian critical values being valid even for $d$ growing much faster than $n^{m/2-1}$. We show that this is not the case: already the distributions constructed in Zhang and Wu (2017) satisfy

$$\limsup_{n\to\infty} \left[ P\left( \max_{1\le j\le d} S_{nj} > c_d(\alpha) \right) - \alpha \right] = 1 - \alpha$$

as soon as $\limsup_{n\to\infty} d/n^{m/2-1+\varepsilon} > 0$ for some $\varepsilon \in (0, \infty)$.²
Thus, the phase transition also takes place at the critical values, irrespective of the choice of $\alpha \in (0, 1)$. From a statistical perspective, our results imply that the asymptotic size of max-type tests based on critical values obtained from Gaussian approximations jumps from a desired level $\alpha \in (0, 1)$ to 1 when $d$ passes the above threshold: a complete breakdown in size control rather than only a slight inflation.
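For concreteness, the location of the threshold $d = n^{m/2-1}$ can be tabulated for a few moment counts. The helper below is our own illustration, not part of the paper:

```python
def critical_dimension(n: int, m: float) -> float:
    """Critical growth rate d = n**(m/2 - 1) at which the phase
    transition for the Gaussian approximation occurs."""
    return n ** (m / 2 - 1)

# With m = 4 moments the threshold is d = n; with m = 6 it is d = n**2.
# Heavier-tailed data (smaller m) tolerate far less dimension growth.
for m in (4, 6, 8):
    print(m, critical_dimension(1000, m))
```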
We emphasize here that we do not show the dependence on $d$ of many quantities introduced above (e.g., $S_n$ or $X_i$, $\mathcal{H}$ and $\mathcal{R}$, etc.). Furthermore, the dependence of $d$ on $n$ is notationally suppressed. This is done to simplify the presentation. All proofs are given in the Appendix.
² The sequence $\sqrt{2\log(d)}$ used to reveal the breakdown of the Gaussian approximation in (2) by Zhang and Wu (2017) satisfies $P\big(\max_{1\le j\le d} Z_j > \sqrt{2\log(d)}\big) \to 0$ and thus implies an asymptotic size of zero.
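The footnote's claim can be checked numerically. The following sketch (our own illustration, standard library only) evaluates $P(\max_{1\le j\le d} Z_j > \sqrt{2\log d}) = 1 - \Phi(\sqrt{2\log d})^d$, using independence of the coordinates of $Z$:

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_max_exceeds(d: int) -> float:
    """P(max_{1<=j<=d} Z_j > sqrt(2 log d)) for i.i.d. standard normals,
    using P(max <= t) = Phi(t)**d."""
    t = math.sqrt(2.0 * math.log(d))
    return 1.0 - std_normal_cdf(t) ** d

# The probability decreases toward zero as d grows, so critical values
# at the level sqrt(2 log d) correspond to asymptotic size zero.
for d in (10**2, 10**4, 10**6):
    print(d, prob_max_exceeds(d))
```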

The phase transition at Gaussian critical values
To present our results, from now on let $Z$ be a random vector in $\mathbb{R}^d$ distributed as $N_d(0_d, I_d)$, and for $\alpha \in (0, 1)$ denote by $c_d(\alpha)$ a sequence (of critical values) satisfying

$$P\left( \max_{1\le j\le d} Z_j > c_d(\alpha) \right) \to \alpha. \qquad (5)$$

The distributions $P_m$ on $\mathbb{R}$ used in the following theorem are given in explicit form in (11) in Section 4.1. Recall that $S_{nj} = n^{-1/2} \sum_{i=1}^{n} X_{ij}$ for $j = 1, \ldots, d$.

Theorem 2.1. Let $m \in (2, \infty)$, $\alpha \in (0, 1)$, and let $c_d(\alpha)$ be a sequence that satisfies (5). There exist i.i.d. random vectors $X_1, \ldots, X_n$ with independent entries $X_{ij} \sim P_m$, and $P_m$ depending neither on $n$ nor $d$, having mean zero, variance one, and finite $m$th absolute moment, such that if for some $\varepsilon \in (0, \infty)$ it holds that

$$\limsup_{n\to\infty} d/n^{m/2-1+\varepsilon} > 0, \qquad (6)$$

then

$$\limsup_{n\to\infty} \left[ P\left( \max_{1\le j\le d} S_{nj} > c_d(\alpha) \right) - \alpha \right] = 1 - \alpha. \qquad (7)$$

The consequences of (7) relative to (2) and $\rho_n(\mathcal{H}) \to 0$ for hypothesis testing are discussed further in Section 3 below.
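Since the coordinates of $Z$ are independent standard normals, $P(\max_{1\le j\le d} Z_j \le c) = \Phi(c)^d$, so one admissible choice in (5) is the exact value $c_d(\alpha) = \Phi^{-1}\big((1-\alpha)^{1/d}\big)$. A small sketch (our own illustration, standard library only) computes it by bisection:

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_critical_value(d: int, alpha: float) -> float:
    """c_d(alpha) with P(max_{1<=j<=d} Z_j > c) = alpha for Z ~ N_d(0, I_d),
    i.e. the (1 - alpha)**(1/d) quantile of one standard normal,
    located by bisection on the monotone cdf."""
    target = (1.0 - alpha) ** (1.0 / d)
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if std_normal_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = gaussian_critical_value(d=1000, alpha=0.05)
print(c)  # slightly above sqrt(2 * log(1000)) for this alpha
```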
Let $m \in [4, \infty)$, and let $c$ and $C$ be such that $0 < c \le C^{2/m} < \infty$. Denote by $\mathcal{P}(m,c,C)$ the class of distributions such that the $X_i$ are i.i.d., with entries having mean zero, covariance matrix $\Sigma$, $\min_{1\le j\le d} E X_{1j}^2 \ge c$, and $\max_{1\le j\le d} E|X_{1j}|^m \le C$. The following theorem, which provides sufficient conditions for Gaussian approximations to hold when $d/n^{m/2-1-\varepsilon} \to 0$ for some $\varepsilon \in (0, \infty)$, is a special case of Theorem 2 in Chernozhukov et al. (2023a).

Theorem 2.2. If there exists an $\varepsilon \in (0, \infty)$ such that $d/n^{m/2-1-\varepsilon} \to 0$, then for $c$ and $C$ such that $0 < c \le C^{2/m} < \infty$ one has

$$\lim_{n\to\infty} \sup_{P \in \mathcal{P}(m,c,C)} \left| P\left( \max_{1\le j\le d} S_{nj} > c_d(\alpha) \right) - \alpha \right| = 0.$$

Together, Theorems 2.1 and 2.2 reveal a critical phase transition in the asymptotic behavior of the Gaussian approximation error at the critical values $c_d(\alpha)$ at $d = n^{m/2-1}$.
We next discuss the consequences of this for hypothesis testing.

Consequences for high-dimensional hypothesis testing
To appreciate the statistical importance of our results, assume that the mean $\mu \in \mathbb{R}^d$ of the $X_i$ is unknown and that the $X_{ij}$ possess $m \in (2, \infty)$ moments. One then frequently wishes to test

$$H_0 : \mu = 0_d \quad \text{against} \quad H_1 : \mu \neq 0_d.$$

A canonical test of $H_0$ with targeted asymptotic size $\alpha \in (0, 1)$ is

$$\varphi_n = \mathbb{1}\left\{ \max_{1\le j\le d} S_{nj} > c_d(\alpha) \right\},$$

where $c_d(\alpha)$ satisfies (5). If $d/n^{m/2-1-\varepsilon} \to 0$ for some $\varepsilon \in (0, \infty)$, it follows from Theorem 2.2 that

$$\lim_{n\to\infty} \sup_{P \in \mathcal{P}(m,c,C)} \left| E_P \varphi_n - \alpha \right| = 0,$$

that is, the asymptotic size of $\varphi_n$ over $\mathcal{P}(m,c,C)$ is $\alpha$ as desired. However, since the distributions used in Theorem 2.1 satisfy $H_0$, it follows from (7) that for $c \le 1$ and $C$ sufficiently large there exists a $P \in \mathcal{P}(m,c,C)$ such that, as soon as $\limsup_{n\to\infty} d/n^{m/2-1+\varepsilon} > 0$ for some $\varepsilon \in (0, \infty)$,

$$\limsup_{n\to\infty} E_P \varphi_n = 1,$$

that is, the asymptotic size of $\varphi_n$ jumps to one once $d$ exceeds the phase transition threshold.
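A minimal Monte Carlo sketch of such a max-type test, under our own illustrative parameters and light-tailed (Gaussian) data, where no size distortion occurs: the critical value is the exact Gaussian one, $c_d(\alpha) = \Phi^{-1}\big((1-\alpha)^{1/d}\big)$, valid for independent coordinates.

```python
import math
import random

def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_critical_value(d: int, alpha: float) -> float:
    """Exact c_d(alpha) for independent coordinates, by bisection."""
    target = (1.0 - alpha) ** (1.0 / d)
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if std_normal_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def max_test_rejects(sample, c: float) -> bool:
    """phi_n = 1{max_j S_nj > c} with S_nj = n**-0.5 * sum_i X_ij."""
    n, d = len(sample), len(sample[0])
    return max(
        sum(row[j] for row in sample) / math.sqrt(n) for j in range(d)
    ) > c

rng = random.Random(0)
n, d, alpha, reps = 50, 10, 0.1, 2000
c = gaussian_critical_value(d, alpha)
rejections = sum(
    max_test_rejects([[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)], c)
    for _ in range(reps)
)
print(rejections / reps)  # empirical size; near alpha for Gaussian data
```

For Gaussian data each $S_{nj}$ is exactly standard normal, so the rejection probability equals $\alpha$ exactly; the breakdown described above requires heavy tails and $d$ growing past the threshold, which a toy simulation of this size cannot reach.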
Thus, (7) is important as it shows that Gaussian approximations of the cdf of $\max_{1\le j\le d} S_{nj}$ by that of $\max_{1\le j\le d} Z_j$ break down not in statistically irrelevant regions but precisely at the quantiles $c_d(\alpha)$ of the latter, which are used as critical values for testing. Had the approximations broken down at sequences for which (5) would converge to zero or one (as in the construction of Zhang and Wu (2017)), this would be of less importance for testing. Similarly, it is alarming that the right-hand side of (7) is not merely positive but equal to $1 - \alpha$, implying an asymptotic size of one rather than "only" something slightly exceeding $\alpha$.
One can actually show (slightly adapting the proof of Theorem 2.1) that (6) also implies the analogue of (7) with $\max_{1\le j\le d} S_{nj}$ replaced by $\max_{1\le j\le d} |S_{nj}|$ (and $c_d(\alpha)$ chosen accordingly). Hence an identical observation to the one above holds for tests based on $\max_{1\le j\le d} |S_{nj}|$.

Appendix

4.1 The family of distributions used in establishing the phase transition

Denote by $\mu_{G_m}$ the (Lebesgue-Stieltjes) probability measure with cdf $G_m$, which possesses $m$ moments. We note also that for no $\delta \in (0, \infty)$ does $\mu_{G_m}$ possess $m + \delta$ moments. In addition, $\mu_{G_m}$ has mean zero by virtue of being symmetric about the origin. It will be convenient to work with a version of $\mu_{G_m}$ whose second moment equals one. Thus, we define the distribution $P_m$ in (11) as the image measure of $\mu_{G_m}$ under the rescaling achieving this normalization, and we denote the corresponding cdf by $F_m$.

4.2 Proof of Theorem 2.1

Recall the statement of Theorem 2.1: let $m \in (2, \infty)$, $\alpha \in (0, 1)$, and let $c_d(\alpha)$ be a sequence that satisfies (5); then there exist i.i.d. random vectors $X_1, \ldots, X_n$ with independent entries $X_{ij} \sim P_m$, with $P_m$ depending neither on $n$ nor $d$ and having mean zero, variance one, and finite $m$th absolute moment, such that if (6) holds for some $\varepsilon \in (0, \infty)$, then (7) follows.

Fix $m \in (2, \infty)$, $\alpha \in (0, 1)$, $\varepsilon \in (0, \infty)$, and let the $X_{ij}$ be i.i.d. across $i = 1, \ldots, n$ and $j = 1, \ldots, d(n)$ with common distribution $P_m$ defined in (11) [for the sake of clarity we make explicit the dependence of $d = d(n)$ on $n$ in the course of the proof]. Thus, the $X_{ij}$ have mean zero, variance one, and finite absolute $m$th moment. Denote by $F_{n,m}$ the (common) cdf of $\sum_{i=1}^{n} X_{ij}$, $j = 1, \ldots, d(n)$. By (6) there exists a subsequence $n'$, say, and a $c \in (0, 1)$ along which $d(n')/(n')^{m/2-1+\varepsilon} \ge c$. To establish (7) we can assume without loss of generality that $n$ runs along this subsequence, and it remains to show that $P\big(\max_{1\le j\le d(n)} S_{nj} \le c_{d(n)}(\alpha)\big)$ tends to zero along the chosen subsequence.

To this end, by, e.g., Theorem 1.5.3 in Leadbetter et al. (1983), one has that $c_{d(n)}(\alpha)$ eventually equals $\sqrt{2 \ln(d(n))}\,(1 - a(n))$ for a non-negative sequence $a(n)$ converging to zero [here we suppress that $a(n)$ depends on $\alpha$]. Clearly, for any $\delta \in (0, \varepsilon)$ one has for $n$ sufficiently large that $d(n) \ge n^{m/2-1+\delta}$, so that eventually $n \le [d(n)]^{1/(m/2-1+\delta)}$. By Theorem 1.9 in Nagaev (1979), cf. in particular equation (1.25b), a lower bound on $1 - F_{n,m}\big(\sqrt{n}\, c_{d(n)}(\alpha)\big)$ follows for $n$ sufficiently large. Recalling that $c_{d(n)}(\alpha) = \sqrt{2 \ln(d(n))}\,(1 - a(n))$ and noting that $n \le [d(n)]^{1/(m/2-1+\delta)}$, one gets that for every $M > 0$ the denominator in this lower bound is eventually no greater than $d(n)/M$. Therefore, by (13), one has that

$$\limsup_{n\to\infty} P\Big(\max_{1\le j\le d(n)} S_{nj} \le c_{d(n)}(\alpha)\Big) \le e^{-M}$$

along the chosen subsequence. Letting $M \to \infty$ in combination with (12) yields (7).

4.3 Proof of Theorem 2.2

As in the proof of Theorem 2.1, we make the dependence of $d = d(n)$ on $n$ explicit. By assumption there exists a $C \in (0, \infty)$ such that $E|X_{1j}|^m \le C$ for all $j = 1, \ldots, d(n)$. Therefore,

$$E \max_{1\le j\le d(n)} |X_{1j}|^m \le \sum_{j=1}^{d(n)} E|X_{1j}|^m \le C\, d(n),$$

implying that one can choose $q = m$ and $B_n = [C\, d(n)]^{1/m}$ in Theorem 2 of Chernozhukov et al. (2023a). Thus, as the remaining conditions of that theorem are satisfied as well, a simple calculation reveals that if $d(n)/n^{m/2-1-\varepsilon} \to 0$ for some $\varepsilon > 0$, then

$$\lim_{n\to\infty} \sup_{P \in \mathcal{P}(m,c,C)} \left| P\Big(\max_{1\le j\le d(n)} S_{nj} > c_{d(n)}(\alpha)\Big) - \alpha \right| = 0,$$

where $c_d(\alpha)$ is the sequence of critical values from (5), i.e., they are based on Gaussian critical values. In particular, if there exists an $\varepsilon \in (0, \infty)$ such that $d/n^{m/2-1-\varepsilon} \to 0$, it indeed follows from Theorem 2.2 that the asymptotic size of the test $\varphi_n$ from Section 3 over $\mathcal{P}(m,c,C)$ equals $\alpha$.
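As a reading aid, and not part of the formal argument, the mechanism behind the proof of Theorem 2.1 can be sketched heuristically. Assume (our simplification) that the tail of $P_m$ is of order $x^{-m}$ up to logarithmic factors, and use the Nagaev-type approximation $P(\sum_{i=1}^n X_{i1} > x) \approx n\, P(X_{11} > x)$ for large $x$:

```latex
% Heuristic only: constants and logarithmic factors are suppressed.
P\!\left(\max_{1\le j\le d(n)} S_{nj} \le c_{d(n)}(\alpha)\right)
   = \Big[F_{n,m}\big(\sqrt{n}\,c_{d(n)}(\alpha)\big)\Big]^{d(n)},
\qquad
1 - F_{n,m}\big(\sqrt{n}\,c_{d(n)}(\alpha)\big)
   \gtrsim n\,P\!\left(X_{11} > \sqrt{2 n \ln(d(n))}\right)
   \asymp n^{1 - m/2}\big(\ln(d(n))\big)^{-m/2}.
```

Multiplying the tail bound by $d(n) \ge n^{m/2-1+\delta}$ gives $d(n)\, n^{1-m/2} (\ln d(n))^{-m/2} \ge n^{\delta} (\ln d(n))^{-m/2} \to \infty$, so the $d(n)$-fold product above tends to zero, matching the conclusion of the proof.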