Limit theorems for mixed-norm sequence spaces with applications to volume distribution

Let $p, q \in (0, \infty]$ and $\ell_p^m(\ell_q^n)$ be the mixed-norm sequence space of real matrices $x = (x_{i, j})_{i \leq m, j \leq n}$ endowed with the (quasi-)norm $\Vert x \Vert_{p, q} := \big\Vert \big( \Vert (x_{i, j})_{j \leq n} \Vert_q \big)_{i \leq m} \big\Vert_p$. We shall prove a Poincar\'e-Maxwell-Borel lemma for suitably scaled matrices chosen uniformly at random in the $\ell_p^m(\ell_q^n)$ unit balls $\mathbb{B}_{p, q}^{m, n}$, and obtain both central and non-central limit theorems for their $\ell_p(\ell_q)$-norms. We use those limit theorems to study the asymptotic volume distribution in the intersection of two mixed-norm sequence balls. Our approach is based on a new probabilistic representation of the uniform distribution on $\mathbb{B}_{p, q}^{m, n}$.


Introduction and main results
The asymptotic theory of convex bodies is intimately linked to probability theory, whose methods and ideas have been key elements in obtaining numerous deep results of both analytic and geometric flavour. It has led to the development of a quite powerful quantitative methodology in geometric functional analysis and has allowed a qualitatively new picture of high-dimensional spaces and structures to form. The role of convexity in high-dimensional spaces is similar to the role of independence in probability and guarantees a certain regularity of the otherwise complex structure of a high-dimensional space. One of the most classical results of stochastic-geometric and high-dimensional flavour is probably the Poincaré-Maxwell-Borel lemma, which asserts that any fixed number of coordinates of a vector chosen uniformly at random from the boundary of the unit Euclidean ball $\mathbb{B}_2^n$ is approximately Gaussian (see, e.g., [5]); in a more modern spirit there is the pioneering work of V. D. Milman on the concentration-of-measure phenomenon, which has led to several major breakthroughs (see, e.g., [1,18]). The arguably most prominent example of the last two decades is Klartag's central limit theorem for convex bodies, showing that the marginals of a high-dimensional isotropic and log-concave random vector are approximately Gaussian distributed [16]. Besides Klartag's central limit theorem for convex bodies, a number of other (weak) limit theorems have been obtained for various geometric quantities in the last decades, demonstrating their regularity and universality; we refer to the survey [22] for references. Several of those results have led to a deeper understanding of the volume distribution in high-dimensional convex bodies.
The motivation of the present paper is essentially twofold and will be elaborated upon in view of classical and preceding works before presenting our main results.
Motivation 1: Poincaré-Maxwell-Borel type results. Having its roots in kinetic gas theory, and going back to Maxwell and later Poincaré and Borel, it is observed that the first $k$ coordinates of a random point on the $(n-1)$-dimensional Euclidean sphere $\mathbb{S}_2^{n-1}$ are asymptotically independent and Gaussian as $n$ tends to infinity; to be precise,
$$\lim_{n \to \infty} d_{\mathrm{TV}}\big( \mathcal{L}\big( \sqrt{n}\, (X_1^n, \ldots, X_k^n) \big), \mathcal{L}\big( (Z_1, \ldots, Z_k) \big) \big) = 0,$$
where $d_{\mathrm{TV}}$ denotes the total variation distance, $(X_i^n)_{i \leq n}$ is sampled uniformly from $\mathbb{S}_2^{n-1}$, $Z_1, \ldots, Z_k$ are independent standard Gaussian variables, and $k \in \mathbb{N}$ is fixed. We refer to Diaconis and Freedman [5, Section 6] for a more detailed account and give a more detailed statement in Proposition 10 below. In [5], Diaconis and Freedman prove an analogous result for the simplex and the exponential distribution. Generalizations to the $\ell_p$-sphere were obtained by Mogul'skiĭ [19], where the point was distributed according to the normalized Hausdorff measure, and by Rachev and Rüschendorf [23] for the cone probability measure. The latter authors exploited a probabilistic representation relating the $p$-generalized Gaussian distribution to the $\ell_p$-balls, allowing one to make the transition from a random vector with dependent coordinates to one with independent ones. Naor and Romik [20] showed that the normalized Hausdorff measure and the cone probability measure are asymptotically equal (their equality for $p \in \{1, 2, \infty\}$ irrespective of dimension being long known prior), thereby unifying the previous results. A further generalization to Orlicz balls (and even beyond) was undertaken recently by Johnston and Prochno [9]. We stress that all the results cited have been proved for the total variation distance of probability measures. In the present article we only consider the weak topology on probability measures (equivalently: convergence in distribution of random variables).
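As a purely illustrative aside (not part of the formal development; all function names are ours), the Poincaré-Maxwell-Borel phenomenon is easy to observe numerically: a uniform point on $\mathbb{S}_2^{n-1}$ can be sampled by normalizing a standard Gaussian vector, and the scaled first coordinate $\sqrt{n}\, X_1^n$ then exhibits approximately standard Gaussian moments.

```python
import math
import random

def uniform_on_sphere(n, rng):
    """Sample uniformly from the Euclidean unit sphere S^{n-1}_2 by
    normalizing a standard Gaussian vector (rotational invariance)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

rng = random.Random(0)
n, N = 200, 5000
# sqrt(n) * (first coordinate) is approximately standard Gaussian for large n
samples = [math.sqrt(n) * uniform_on_sphere(n, rng)[0] for _ in range(N)]
mean = sum(s for s in samples) / N
var = sum(s * s for s in samples) / N
print(round(mean, 2), round(var, 2))  # sample mean near 0, sample second moment near 1
```

With a fixed seed the sample moments land close to those of a standard Gaussian, in line with the statement above (the convergence proved in the literature is, of course, in total variation, which this sketch does not measure).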
Motivation 2: Schechtman-Schmuckenschläger type results. Instigated by a question of V. D. Milman, Schechtman and Zinn [27] found an upper bound on the volume left over from an $\ell_p$-ball after cutting out a dilated $\ell_q$-ball; incidentally the authors utilized the same stochastic representation as did Rachev and Rüschendorf (see above). A few years after, Schechtman and Schmuckenschläger [26] used that probabilistic representation in order to investigate the limit of the volume of the cut-out portion in the very same setting as before, revealing the following threshold behaviour: below a certain critical dilation factor depending only on $p$ and $q$ the limit is zero, and above it the limit is one, provided the $\ell_p$-ball has unit volume. More formally, writing $\mathbb{D}_p^n$ for the $n$-dimensional unit-volume $\ell_p$-ball, the volume $v_n(\mathbb{D}_p^n \cap t\, \mathbb{D}_q^n)$ converges to zero or one according as the dilation factor $t$ lies below or above the threshold. About a decade later, Schmuckenschläger [28,29] determined the asymptotics at the threshold itself and found the limit to be $1/2$ by proving a central limit theorem that revealed this behaviour. We refer to Proposition 11 below for the precise statement. More recently, Kabluchko, Prochno, and Thäle [12,14] revisited the results of Schechtman and Schmuckenschläger, providing a unified framework and also generalizing the previous works in various directions, yet still treating $\ell_p$-balls and using the probabilistic representation. A further step was taken by Kabluchko and Prochno [11], studying the intersections of Orlicz balls and observing a similar thresholding behaviour; here much finer tools from large deviations theory and statistical mechanics were required, and it is not even known whether the limit at the threshold itself exists. Another generalization from $\ell_p$-balls to $\ell_p$-ellipsoids, i.e., axis-parallel-scaled balls (a case not covered by Orlicz balls), was recently obtained by Juhos and Prochno in [10]; the phenomenon of the threshold emerges again. The case of intersections of unit balls from classical random matrix ensembles has been treated by Kabluchko, Prochno, and Thäle in [13]. Let us point out that understanding the asymptotic volume of intersections of scaled unit balls naturally appears, for instance, when studying the curse of dimensionality for high-dimensional numerical integration problems [8]. Suspecting a universal behaviour among symmetric convex bodies, we tackle another generalization, namely finite-dimensional sequence spaces with mixed $\ell_p$-norms, and consider the asymptotic volume of the intersection of two balls: the thresholding behaviour is found to be valid also in this case, and for a wide range of parameters the limit in the critical case is determined. Unsurprisingly, owing to the larger set of parameters as compared to the $\ell_p$-balls, this limit's value is much more varied and the overall analysis is considerably more delicate.
Let us point out that the study of mixed-norm spaces is a classical one in approximation theory and geometric functional analysis; we refer, for instance, to the work of Schütt regarding the symmetric basis constant of these spaces [30], the characterization of mixed-norm subspaces of $L_1$ by Prochno and Schütt [21] and Schechtman [24], the work on non-existence of greedy bases for the mixed-norm spaces by Schechtman [25], and the study of volumetric properties of these spaces by Kempka and Vybíral [15], as well as the recent work of Mayer and Ullrich on the order of entropy numbers of mixed-norm unit balls [17].
We would like to add that, naturally, it would be interesting to consider even more general norms. The main hindrance, though, is that each of the results referenced in the motivation above, and others more, has required tools tailored to the specific problems; to the best of our knowledge there is no unified theory yet that would allow us to address such questions "in one fell swoop." Current research is conducted, e.g., for Schatten norms of not necessarily square matrices.

The mathematical setup
In order to be able to present our main results, we shall briefly introduce the most essential setup; more details can be found in Section 2 on notation and preliminaries.
For $p, q \in (0, \infty]$ and $m, n \in \mathbb{N}$ define the finite-dimensional mixed-norm sequence space $\ell_p^m(\ell_q^n)$ to be the space $\mathbb{R}^{m \times n}$ endowed with the $(m \cdot n)$-dimensional Lebesgue measure $v_{mn}$ and the quasinorm $\Vert x \Vert_{p,q} := \big\Vert \big( \Vert (x_{i,j})_{j \leq n} \Vert_q \big)_{i \leq m} \big\Vert_p$, where $x = (x_{i,j})_{i \leq m, j \leq n} \in \mathbb{R}^{m \times n}$ and $\Vert \cdot \Vert_p$ is the usual $\ell_p$-norm, that is, $\Vert y \Vert_p = \big( \sum_{i} |y_i|^p \big)^{1/p}$ for $p < \infty$ and $\Vert y \Vert_\infty = \max_i |y_i|$. In particular, we consider the unit balls $\mathbb{B}_{p,q}^{m,n} := \{ x \in \mathbb{R}^{m \times n} : \Vert x \Vert_{p,q} \leq 1 \}$; the $\ell_p$-unit ball and sphere in $\mathbb{R}^n$ are written $\mathbb{B}_p^n$ and $\mathbb{S}_p^{n-1}$, respectively; $\omega_p^n$ denotes the volume of $\mathbb{B}_p^n$.
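For concreteness, the quasinorm $\Vert \cdot \Vert_{p,q}$ is straightforward to evaluate; the following sketch (illustrative only, function names and conventions ours) treats $p = \infty$ or $q = \infty$ via maxima and checks the degenerate cases against the plain $\ell_p$-norm.

```python
import math

def lp_norm(v, p):
    """Usual l_p (quasi-)norm of a vector, with the convention p = infinity -> maximum."""
    if p == math.inf:
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def mixed_norm(x, p, q):
    """||x||_{p,q}: take the l_q norm of every row, then the l_p norm of those numbers."""
    return lp_norm([lp_norm(row, q) for row in x], p)

x = [[1.0, -2.0], [3.0, 4.0]]
print(mixed_norm(x, 2, 2))         # equals the l_2 norm of the flattened matrix: sqrt(30)
print(mixed_norm(x, math.inf, 1))  # maximal row l_1 norm: max(3, 7) = 7
```

The last two calls illustrate $\ell_p^m(\ell_p^n) \cong \ell_p^{mn}$ and the product structure $\mathbb{B}_{\infty,q}^{m,n} \cong (\mathbb{B}_q^n)^m$ mentioned later in the paper.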
We seek to characterize $\mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$, the uniform distribution on $\mathbb{B}_{p,q}^{m,n}$. Given a random matrix $X = (X_{i,j})_{i \leq m, j \leq n} \sim \mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$, define
$$R_i := \Vert (X_{i,j})_{j \leq n} \Vert_q \quad \text{and} \quad \Theta_i := (\Theta_{i,j})_{j \leq n} := R_i^{-1} (X_{i,j})_{j \leq n}, \quad i \leq m. \qquad (1)$$
The notations $R_i$ and $\Theta_i$, $\Theta_{i,j}$ are used throughout this article with the meaning given in (1); note that they actually depend on the parameters $p, q, m, n$, but we suppress this in our notation.
For $p \in (0, \infty)$ the $p$-generalized Gaussian distribution, or $p$-Gaussian distribution for short, is defined to be the probability measure $\gamma_p$ on $\mathbb{R}$ with Lebesgue-density
$$x \mapsto \frac{e^{-|x|^p / p}}{2 p^{1/p} \Gamma(1 + 1/p)};$$
for $p = \infty$ we set $\gamma_\infty := \mathrm{Unif}([-1, 1])$.
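Assuming the normalization $e^{-|x|^p/p} / (2 p^{1/p} \Gamma(1 + 1/p))$ for $p < \infty$ (the convention under which $\gamma_2 = N(0,1)$ and $\gamma_1$ is the standard Laplace distribution), the normalizing constant can be sanity-checked numerically; the following sketch (ours, purely illustrative) integrates the density by a plain midpoint rule.

```python
import math

def p_gaussian_density(x, p):
    """Density of gamma_p for p < infinity, normalized so that gamma_2 = N(0, 1):
    exp(-|x|^p / p) / (2 p^{1/p} Gamma(1 + 1/p))."""
    c = 2.0 * p ** (1.0 / p) * math.gamma(1.0 + 1.0 / p)
    return math.exp(-abs(x) ** p / p) / c

def integrate(f, a, b, steps=100000):
    """Plain midpoint Riemann sum; sufficient for a sanity check."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

for p in (0.5, 1.0, 2.0, 7.0):
    # each total mass should be approximately 1
    print(p, round(integrate(lambda t: p_gaussian_density(t, p), -60.0, 60.0), 4))
```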

Main results - a Schechtman-Zinn probabilistic representation
The first main result, which facilitates all computations and is essential to our proofs, is a probabilistic representation of the uniform distribution on $\mathbb{B}_{p,q}^{m,n}$, generalizing the classical results of Schechtman and Zinn [27] and Rachev and Rüschendorf [23]. Given the numerous applications of the classical probabilistic representation, the following result clearly is of independent interest.

Proposition 1. Let $p, q \in (0, \infty]$ and $m, n \in \mathbb{N}$, and let $X \sim \mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$. (a) The distribution of $(R_i)_{i \leq m}$ has a Lebesgue-density; therefore, $(R_i)_{i \leq m}$ can be represented in terms of independent random variables $U, \xi_1, \ldots, \xi_m$, where $U$ is distributed uniformly on $[0, 1]$ and $\xi_1, \ldots, \xi_m$ are $p/n$-Gaussian. (b) The random vectors $(R_i)_{i \leq m}, \Theta_1, \ldots, \Theta_m$ are all independent; each $\Theta_i$ is distributed according to the cone measure on $\mathbb{S}_q^{n-1}$, for $i \in [1, m]$, and therefore can be represented via an array $(\eta_{i,j})_{i \leq m, j \leq n}$ of independent $q$-Gaussian random variables.
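Part (b) reflects the classical fact that a vector of independent $q$-Gaussians normalized by its $\ell_q$-norm is distributed according to the cone measure on $\mathbb{S}_q^{n-1}$. A minimal sampling sketch (ours, purely illustrative; it uses the standard fact that $|\eta|^q / q$ is $\mathrm{Gamma}(1/q, 1)$-distributed for $\eta \sim \gamma_q$):

```python
import random

def sample_q_gaussian(q, rng):
    """eta ~ gamma_q via |eta|^q = q * G with G ~ Gamma(1/q, 1) and a random sign."""
    g = rng.gammavariate(1.0 / q, 1.0)
    return rng.choice((-1.0, 1.0)) * (q * g) ** (1.0 / q)

def sample_cone_direction(n, q, rng):
    """Direction distributed according to the cone measure on S^{n-1}_q:
    normalize a vector of independent q-Gaussians by its l_q norm."""
    eta = [sample_q_gaussian(q, rng) for _ in range(n)]
    norm_q = sum(abs(e) ** q for e in eta) ** (1.0 / q)
    return [e / norm_q for e in eta]

rng = random.Random(1)
theta = sample_cone_direction(6, 1.5, rng)
print(sum(abs(t) ** 1.5 for t in theta))  # lies on S^5_{1.5}: equals 1 up to rounding
```

Stacking $m$ such directions, together with the radii from part (a), gives a sampler for $\mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$ once the radial representation is spelled out.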
Main results - Poincaré-Maxwell-Borel principles
One type of limit theorem which we are considering is a Poincaré-Maxwell-Borel principle, that is, a statement about the limiting distribution of the first few coordinates of a random vector. In the following two theorems we shall always assume $(X_{i,j})_{i \leq m, j \leq n} \sim \mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$. Owing to the nature of the space $\ell_p^m(\ell_q^n)$, having two parameters for dimension, in the sequel limit theorems will usually be considered for three different regimes: firstly, letting $m \to \infty$ while keeping $n$ fixed; secondly, vice versa, keeping $m$ fixed while letting $n \to \infty$; and thirdly, letting $n \to \infty$ while treating $m$ as dependent on $n$ and going to infinity as well.
In order to keep the number of case distinctions at a minimum, for the parameter value $p = \infty$ we agree on the conventions stated below. For the formulation of our results we introduce the following quantities (whose superscripts denote indices, not powers): $M_p^\alpha$ for $p \in (0, \infty]$ and $\alpha \in (0, \infty)$, and $A_{q_1,q_2}$. We can now formulate the first Poincaré-Maxwell-Borel principle for the case where $m \to \infty$ while $n$ is fixed. By $\mathcal{L}(X)$ we denote the distribution, or law, of a random variable $X$.
Theorem A ($m \to \infty$, $n$ constant). Let $p, q \in (0, \infty]$, let $k, n \in \mathbb{N}$ be fixed, and let $\xi_1, \ldots, \xi_k$ be independent $p/n$-Gaussian random variables. (a) The following weak convergence holds true. The convergence is to be understood as convergence in probability in the space of probability measures on $\mathbb{R}$ and $\mathbb{R}^n$, respectively, endowed with the Lévy-Prokhorov metric; cf. Lemma 22.
We now formulate the second Poincaré-Maxwell-Borel principle for n → ∞ while m is either fixed or tends to infinity with n.
necessary), and let $(\eta_{i,j})_{i \leq k, j \leq l}$ be an array of independent $q$-Gaussian random variables.
(a) The following weak convergence holds true.
(b) The empirical measure satisfies the stated convergence, to be understood as convergence in probability in the space of probability measures on $\mathbb{R}$ endowed with the Lévy-Prokhorov metric; cf. Lemma 22.

Main results - weak limit theorems
Here we present three weak limit theorems for $\Vert X \Vert_{p_2, q_2}$, where $X \sim \mathrm{Unif}(\mathbb{B}_{p_1,q_1}^{m,n})$ and $(p_1, q_1) \neq (p_2, q_2)$ in general (for $(p_1, q_1) = (p_2, q_2)$ see Remark 3 below). We start with the case $m \to \infty$ while $n$ is fixed.
The following weak limit theorem covers the case where $n \to \infty$ while $m$ is fixed. We obtain both central and non-central limit behaviour, depending on the relation and values of the parameters $p_1$, $q_1$, and $q_2$.

Theorem D ($m$ fixed, $n \to \infty$). Let $p_1, q_1 \in (0, \infty]$ and $p_2, q_2 \in (0, \infty)$ with $(p_1, q_1) \neq (p_2, q_2)$, let $m \in \mathbb{N}$ be fixed, and for each $n \in \mathbb{N}$ let $X_n \sim \mathrm{Unif}(\mathbb{B}_{p_1,q_1}^{m,n})$.
(a) If $q_1 = q_2$, then, where $N$ is a standard Gaussian random variable, and $(q_2 M_{q_1}^{q_2})^2$.

Applications - asymptotic volume distribution in intersections of mixed-norm balls
Kempka and Vybíral [15] have studied the volume of unit balls in the mixed-norm sequence spaces. Our distributional limit theorems of Section 1.3 now allow us to obtain Schechtman-Schmuckenschläger-type results on the distribution of volume in the mixed-norm spaces; for that we write $r_{p,q}^{m,n} := v_{mn}(\mathbb{B}_{p,q}^{m,n})^{1/(mn)}$ (notice that $v_{mn}\big( (r_{p,q}^{m,n})^{-1} \mathbb{B}_{p,q}^{m,n} \big) = 1$), and for any $p_1, q_1, p_2, q_2 \in (0, \infty]$ and $t \in (0, \infty)$ we set
$$V_{m,n}(t) := v_{mn}\big( (r_{p_1,q_1}^{m,n})^{-1} \mathbb{B}_{p_1,q_1}^{m,n} \cap t\, (r_{p_2,q_2}^{m,n})^{-1} \mathbb{B}_{p_2,q_2}^{m,n} \big).$$
Clearly $V_{m,n}(t)$ also depends on $p_1, q_1, p_2, q_2$, but since those parameters are fixed, and since we wish to keep the notation simple, we will suppress them.
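The quantity $V_{m,n}(t)$ can be approximated by Monte Carlo in the product case $p_1 = p_2 = \infty$, where $\mathbb{B}_{\infty,q}^{m,n} \cong (\mathbb{B}_q^n)^m$, so that $r_{\infty,q}^{m,n} = (\omega_q^n)^{1/n}$ and rows can be sampled independently via the classical Schechtman-Zinn representation. This is only an illustrative sketch (helper names ours; it is not the paper's proof technique):

```python
import math
import random

def ball_volume(n, q):
    """omega_q^n = 2^n Gamma(1 + 1/q)^n / Gamma(1 + n/q), the volume of B^n_q."""
    return 2.0 ** n * math.gamma(1.0 + 1.0 / q) ** n / math.gamma(1.0 + n / q)

def sample_unif_lq_ball(n, q, rng):
    """Schechtman-Zinn sampling: U^{1/n} * eta / ||eta||_q with eta i.i.d. q-Gaussian."""
    eta = [rng.choice((-1.0, 1.0)) * (q * rng.gammavariate(1.0 / q, 1.0)) ** (1.0 / q)
           for _ in range(n)]
    norm = sum(abs(e) ** q for e in eta) ** (1.0 / q)
    u = rng.random() ** (1.0 / n)
    return [u * e / norm for e in eta]

def estimate_V(m, n, q1, q2, t, rng, trials=4000):
    """Monte Carlo estimate of V_{m,n}(t) in the product case p1 = p2 = infinity,
    where B^{m,n}_{inf,q} = (B^n_q)^m and r^{m,n}_{inf,q} = (omega_q^n)^{1/n}."""
    r1 = ball_volume(n, q1) ** (1.0 / n)
    r2 = ball_volume(n, q2) ** (1.0 / n)
    threshold = t * r1 / r2  # x in t * (r2^{-1}) B_2 iff max row q2-norm <= threshold
    hits = 0
    for _ in range(trials):
        rows = [sample_unif_lq_ball(n, q1, rng) for _ in range(m)]
        norm_inf_q2 = max(sum(abs(v) ** q2 for v in row) ** (1.0 / q2) for row in rows)
        if norm_inf_q2 <= threshold:
            hits += 1
    return hits / trials

rng = random.Random(2)
v_small = estimate_V(3, 4, 1.0, 2.0, 0.5, rng)
v_large = estimate_V(3, 4, 1.0, 2.0, 2.0, rng)
print(v_small, v_large)  # small dilations cut out almost nothing, large ones almost all
```

With a fixed seed the two estimates sit near the extremes $0$ and $1$, in line with the thresholding behaviour described in the introduction.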
2. The point $p_1 (m-1) \log(p_1/p_2)$ is the positive intersection point of the two gamma densities involved; since the density of $\Gamma\big( \frac{m-1}{2}, 2 \min\{1, \frac{p_1}{p_2}\} \big)$ takes strictly smaller values on $p_1 (m-1) \log(p_1/p_2)$. A simple estimate also yields $\lim_{p_1 \to \infty} \lim_{n \to \infty} V_{m,n}(A_{q_1,q_2}^{-1}) = 0$ in the case $q_1 = q_2$, so we have a kind of continuity here.
Concerning $t A_{q_1,q_2} = 1$, in the case where $\Phi$ denotes the CDF of the standard normal distribution and $\sigma$ is defined in Theorem E, (a).
Remark 8. We leave as an open problem the formulation of simple precise conditions under which the limit $M$ exists; one main obstacle is determining the exact asymptotics of $\mathrm{E}[\Vert \Theta_1 \Vert \ldots]$.

From the definition of $\Vert \cdot \Vert_{p,q}$ it is clear that any of the conditions $m = 1$, or $n = 1$, or $p = q$ reproduces the usual $\ell_p$-norm, and indeed it may be verified that all results presented in this paper are consistent with the previous results pertaining to $\ell_p$-spaces stated in the introduction.

Notation and preliminaries
In this section we shall introduce the notation used throughout this paper, provide some background information on mixed-norm spaces, and present and prove several technical results needed in the sequel.

Notation
We suppose that all random variables occurring in this paper are defined on a common probability space $(\Omega, \mathcal{A}, \mathrm{P})$. Expectations, in particular variances and covariances, are taken with respect to $\mathrm{P}$ and are denoted by $\mathrm{E}[\cdot]$, $\mathrm{Var}[\cdot]$ and $\mathrm{Cov}[\cdot, \cdot]$, respectively; for a finite-dimensional random vector $\mathrm{E}$ indicates the expectation vector and $\mathrm{Cov}$ the covariance matrix. A centred random variable has expectation zero.
Let X be an E-valued random variable, for some measurable space E, and let µ be a measure on E. We write X ∼ µ to express that X has law, or distribution, µ (equivalently, µ is the image measure of P under X); the law of X also is addressed as L(X).
If $E$ is a separable metric space and $X, X_1, X_2, \ldots$ are $E$-valued random variables, then almost sure convergence, convergence in probability, and convergence in distribution of the sequence $(X_n)_{n \geq 1}$ to $X$ are denoted by $\xrightarrow{\text{a.s.}}$, $\xrightarrow{P}$, and $\xrightarrow{d}$, respectively. The Euclidean space $\mathbb{R}^n$ is endowed with its Borel $\sigma$-algebra and the $n$-dimensional Lebesgue-volume $v_n$. For a Borel set $A \subset \mathbb{R}^n$ with $v_n(A) \in (0, \infty)$ let $\mathrm{Unif}(A)$ stand for the uniform distribution on $A$ with respect to $v_n$. For a vector $\mu \in \mathbb{R}^n$ (zero vector $0$) and a positive-semidefinite matrix $\Sigma \in \mathbb{R}^{n \times n}$ (unit matrix $I_n$) let $N(\mu, \Sigma)$ be the $n$-dimensional normal, or Gaussian, distribution with mean $\mu$ and covariance matrix $\Sigma$. $\mathcal{E}(1)$ denotes the standard exponential distribution.
The (measure-theoretic) indicator function of a set $A$ is written $\mathbb{1}_A$.
For a probability measure $\mu$ and an index set $I$, $\mu^{\otimes I} := \bigotimes_{i \in I} \mu$ denotes its $I$-fold product measure; in particular, $\mu^{\otimes n} := \bigotimes_{i=1}^n \mu$. Indices of vector coordinates or sequence terms are by default natural numbers starting at $1$; therefore an expression like $(x_i)_{i \leq n}$ is to be understood as $(x_1, x_2, \ldots, x_n)$. Likewise, interval notation is used for natural indices.
We are going to employ Landau notation in our proofs; in particular we will use $O$, $o$ and $\Theta$. We recall their definitions: $a_n = O(b_n)$ iff there is $C \in (0, \infty)$ with $|a_n| \leq C b_n$ for all $n$; $a_n = o(b_n)$ iff for every $\varepsilon > 0$ we have $|a_n| \leq \varepsilon b_n$ eventually; and $a_n = \Theta(b_n)$ iff $c\, b_n \leq |a_n| \leq C\, b_n$ eventually, for some $0 < c \leq C < \infty$; here $(a_n)_{n \geq 1}$ and $(b_n)_{n \geq 1}$ are real sequences, and $b_n \geq 0$ for all $n \in \mathbb{N}$. Mostly we will use $O(b_n)$ etc. as a stand-in for $a_n$ in formulas.

2.2 The $\ell_p$- and mixed-norm sequence spaces

$\ell_p$-spaces. For $n \in \mathbb{N}$ and $p \in (0, \infty]$ let $\ell_p^n$ denote the $n$-dimensional $\ell_p$-space, that is, $\mathbb{R}^n$ equipped with the quasinorm $\Vert \cdot \Vert_p$; this is a norm iff $n = 1$ or $p \geq 1$. The unit ball and unit sphere are written $\mathbb{B}_p^n$ and $\mathbb{S}_p^{n-1}$, resp.; the former's volume is $\omega_p^n = \frac{(2 \Gamma(1 + 1/p))^n}{\Gamma(1 + n/p)}$. On the sphere we introduce the normalized cone measure $\kappa_p^{n-1}$; it is the unique probability measure such that the following polar integration formula is valid (see, e.g., [20, Prop. 1]): for any measurable map $h \colon \mathbb{R}^n \to [0, \infty)$,
$$\int_{\mathbb{R}^n} h \, \mathrm{d}v_n = n \omega_p^n \int_0^\infty \int_{\mathbb{S}_p^{n-1}} h(r \theta) \, \kappa_p^{n-1}(\mathrm{d}\theta) \, r^{n-1} \, \mathrm{d}r.$$
The uniform distribution on $\mathbb{B}_p^n$ has a nice stochastic representation in terms of independent random variables with known distributions, having its roots in [27] and independently [23]. In order to formulate it, let $\gamma_p$ denote the $p$-generalized Gaussian distribution on $\mathbb{R}$; recall from the introduction that it is defined via its Lebesgue density $x \mapsto \frac{e^{-|x|^p/p}}{2 p^{1/p} \Gamma(1 + 1/p)}$. In particular, $\gamma_2 = N(0, 1)$ and $\gamma_\infty = \mathrm{Unif}([-1, 1])$. An easy calculation shows $M_p^\alpha = \int_{\mathbb{R}} |x|^\alpha \, \mathrm{d}\gamma_p(x)$ for $\alpha \in (0, \infty)$, where $M_p^\alpha$ has been defined in (3). Now let $X$ be a random vector in $\mathbb{R}^n$ and $p \in (0, \infty)$; then $X \sim \mathrm{Unif}(\mathbb{B}_p^n)$ iff there exist independent random variables $U \sim \mathrm{Unif}([0, 1])$ and $Y_1, \ldots, Y_n \sim \gamma_p$ such that
$$X = U^{1/n} \frac{(Y_1, \ldots, Y_n)}{\Vert (Y_1, \ldots, Y_n) \Vert_p}. \qquad (4)$$
Obviously $\Vert X \Vert_p = U^{1/n}$, and the direction $(Y_1, \ldots, Y_n) / \Vert (Y_1, \ldots, Y_n) \Vert_p$ is distributed according to $\kappa_p^{n-1}$; in (4) the norm and the direction already are independent. In order for the reader to compare the known results for $\ell_p$-balls with the new ones for $\ell_p(\ell_q)$-balls presented in Subsections 1.2-1.4 we give the precise statements here.
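The moment identity for $\gamma_p$ can be probed numerically; the closed form $M_p^\alpha = p^{\alpha/p}\, \Gamma((\alpha + 1)/p) / \Gamma(1/p)$ used below is the standard evaluation of the defining integral (our computation, consistent with $\gamma_2 = N(0, 1)$ having second moment one; function names are ours).

```python
import math

def gamma_p_density(x, p):
    """Density of gamma_p: exp(-|x|^p / p) / (2 p^{1/p} Gamma(1 + 1/p))."""
    return math.exp(-abs(x) ** p / p) / (2.0 * p ** (1.0 / p) * math.gamma(1.0 + 1.0 / p))

def moment_closed_form(p, alpha):
    """M_p^alpha = p^{alpha/p} Gamma((alpha + 1)/p) / Gamma(1/p) (standard evaluation)."""
    return p ** (alpha / p) * math.gamma((alpha + 1.0) / p) / math.gamma(1.0 / p)

def moment_numeric(p, alpha, a=60.0, steps=200000):
    """Midpoint Riemann sum for the integral of |x|^alpha against gamma_p."""
    h = 2.0 * a / steps
    total = 0.0
    for k in range(steps):
        x = -a + (k + 0.5) * h
        total += abs(x) ** alpha * gamma_p_density(x, p)
    return h * total

for p, alpha in ((1.0, 2.0), (2.0, 2.0), (3.0, 1.5)):
    # the two values should agree to several digits
    print(p, alpha, moment_closed_form(p, alpha), round(moment_numeric(p, alpha), 6))
```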
For the Poincaré-Maxwell-Borel principle recall the notion of total variation distance of probability measures: let $(E, \mathcal{E})$ be a measurable space and let $\mu$ and $\nu$ be probability measures on $E$; then their total variation distance is defined to be $d_{\mathrm{TV}}(\mu, \nu) := \sup_{A \in \mathcal{E}} |\mu(A) - \nu(A)|$. In particular, for $k \in \mathbb{N}$ fixed, the law of the first $k$ (suitably scaled) coordinates converges in total variation to a $k$-dimensional standard Gaussian. The $\ell_p$-versions of the weak limit theorems and the asymptotic volume of intersections reach back to [26, Theorem], [28, Theorem 2.1], and [29, Theorem 3.2]. The latter two papers introduced weak limit results, the first one more covertly by using the Berry-Esseen theorem, the second one directly. That thread was taken up in [12, Theorem 1.1] and subsequent works; there $N \sim N(0, 1)$ and the constant $e^{1/p - 1/q} p^{1/p} q^{1/q} (M_p^q)^{-1/q}$ appears.

$\ell_p(\ell_q)$-spaces. One possible generalization of $\ell_p^n$ is our object under investigation, the mixed-norm sequence space $\ell_p^m(\ell_q^n)$: let $m, n \in \mathbb{N}$ and $p, q \in (0, \infty]$, and endow the real space of matrices $\mathbb{R}^{m \times n}$ with the $(m \cdot n)$-dimensional Lebesgue-volume $v_{mn}$ and with the $\ell_p(\ell_q)$-quasinorm $\Vert (x_{i,j})_{i \leq m, j \leq n} \Vert_{p,q} := \big\Vert \big( \Vert (x_{i,j})_{j \leq n} \Vert_q \big)_{i \leq m} \big\Vert_p$.
Pictorially speaking, for $\Vert \cdot \Vert_{p,q}$ first take the $q$-norm along rows, then take the $p$-norm of the resulting numbers. For the sake of completeness, albeit irrelevant for the purpose of the present paper, we remark that $\Vert \cdot \Vert_{p,q}$ is a norm iff both $\Vert \cdot \Vert_p$ and $\Vert \cdot \Vert_q$ are norms. Also notice $\ell_p^1(\ell_q^n) \cong \ell_q^n$, $\ell_p^m(\ell_q^1) \cong \ell_p^m$, and $\ell_p^m(\ell_p^n) \cong \ell_p^{mn}$. The corresponding unit ball shall be written $\mathbb{B}_{p,q}^{m,n}$; in particular we have $\mathbb{B}_{\infty,q}^{m,n} \cong (\mathbb{B}_q^n)^m$, that is, the $m$-fold Cartesian product. The precise volume of $\mathbb{B}_{p,q}^{m,n}$ has been computed recently by Kempka and Vybíral [15], who have shown that
$$\omega_{p,q}^{m,n} := v_{mn}(\mathbb{B}_{p,q}^{m,n}) = 2^{-m} \omega_{p/n}^m (\omega_q^n)^m. \qquad (5)$$
A probabilistic representation of $\mathrm{Unif}(\mathbb{B}_{p,q}^{m,n})$ parallel to Equation (4) is precisely the content of Proposition 1. Higher-order mixed norms are introduced in the Appendix.

Auxiliary tools and results
First we state two of our main devices in dealing with convergence in distribution of random variables, presented in a form that fits our needs. The first is a combination of Slutsky's theorem proper, a consequence of [3, Theorem 3.1], and the continuous-mapping theorem [3, Theorem 2.7]; with a slight abuse of language we will refer to the present version as 'Slutsky's theorem.'

Proposition 12 (Slutsky's theorem). Let $E, F, G$ be separable metric spaces, let $X, X_1, X_2, \ldots$ be $E$-valued random variables, let $Y, Y_1, Y_2, \ldots$ be $F$-valued random variables, and let $f \colon \cdots$

The second allows us to handle remainder terms in Taylor expansions, hence we will call it the 'remainder lemma.' In general it appears to be well-known and widely used; nevertheless, as we cannot find a good reference, and for the convenience of the reader, we also provide a proof.
Lemma 13 (remainder lemma). Let $d, l \in \mathbb{N}$, let $R \colon \mathbb{R}^d \to \mathbb{R}$ be a function for which there exist ...

Proof. Let $\varepsilon \in (0, \infty)$; then for all those $n \in \mathbb{N}$ where $\beta_n \neq 0$ (for all others the following probability is zero already), ... By the premises there exist $n_0 \in \mathbb{N}$ and $C \in (0, \infty)$ such that $\frac{1}{|\alpha_n|} \leq C$ and $|\beta_n| \leq C |\alpha_n|^l$ for all $n \geq n_0$, and this implies ... Because of $\lim_{n \to \infty} \alpha_n Z_n = 0$ in probability, the claim follows.
In the remainder of this subsection we gather diverse auxiliary results together with their proofs.

Lemma 14. Let $p, q \in (0, \infty]$ and let ...

Proof. Case $q < \infty$: We have ..., and the claim follows from the SLLN. Case $p = q = \infty$: ..., where we have used independence for the second equality and identical distribution for the third. This proves ... for any $\varepsilon > 0$, in order to strengthen convergence in probability to almost sure convergence. So let $\varepsilon > 0$, w.l.o.g. $\varepsilon < 1$; then ..., and the proof is complete.

Lemma 15. Let $p \in (0, \infty]$, $q, r \in (0, \infty)$, and for each $n \in \mathbb{N}$ let $\xi_n \sim \gamma_{p/n}$.
1. We have the following asymptotics, as $n \to \infty$: ... and ..., and for any $\alpha \in (0, \infty)$ we have (here $E \sim \mathcal{E}(1)$) ...
2. We have the distributional limits ..., where $N \sim N(0, 1)$.

Proof.
1. (a) Recall the formula in Equation (3), and subsequently $\int_0^1 x^{q/n} \, \mathrm{d}x = \frac{1}{1 + q/n}$; the result then follows from the geometric series. Therewith we also get ..., and we use the geometric series again and the Cauchy product of series.
Let $\alpha \in (0, \infty)$. It suffices to show ..., where we have anticipated 2.(b), whose proof is independent. We may restrict ourselves to $\alpha \geq 1$; then $|x + y|^\alpha \leq 2^{\alpha - 1} (|x|^\alpha + |y|^\alpha)$ by Hölder's inequality, and so ... converges and hence is bounded; we must ensure ... Exploiting the Taylor expansion of the exponential function we have ..., where we know $R(x) = e^y \frac{x^2}{2}$ with some $y$ between $0$ and $x$, for any $x \in \mathbb{R}$ (Lagrangian form of the remainder term). In our case, since $|\xi_n| \sim \mathrm{Unif}([0, 1])$, we have $\frac{q \log|\xi_n|}{n} \leq 0$ almost surely, thus we can estimate ... In particular we can write $E := -\log|\xi_n| \sim \mathcal{E}(1)$; then we get ..., and clearly this last expression remains bounded in $n \in \mathbb{N}$.

(a)
We show the result for $q = p$ first. Let $n \in \mathbb{N}$ and $h \colon \mathbb{R} \to \mathbb{R}$ measurable and nonnegative; then ... this shows that $|\xi_n|^{p/n}$ follows a gamma distribution with shape parameter $\frac{n}{p}$ and scale parameter $\frac{p}{n}$. Then, because of the semigroup and scaling properties of the gamma distribution, there exists a sequence $(g_j)_{j \geq 1}$ of independent random variables, each having a gamma distribution with shape $\frac{1}{p}$ and scale $p$, such that ... The classical CLT yields ..., with $N \sim N(0, 1)$, where we have used $\mathrm{E}[g_1] = 1$ and $\mathrm{Var}[g_1] = p$. Since ..., this concludes the case $q = p$.
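The gamma representation in this step is easy to probe empirically: averaging $n$ independent $\mathrm{Gamma}(1/p, p)$ variables reproduces the law of $|\xi_n|^{p/n}$, and the fluctuations $\sqrt{n}(|\xi_n|^{p/n} - 1)$ should have variance close to $p$, matching $\mathrm{E}[g_1] = 1$ and $\mathrm{Var}[g_1] = p$. A seeded sketch (illustrative only, names ours):

```python
import random

rng = random.Random(3)
p, n, N = 2.5, 200, 2000

def scaled_gamma_sum(rng, p, n):
    """(1/n) * sum of n i.i.d. Gamma(1/p, scale p) variables; by the semigroup and
    scaling properties this has the law of |xi_n|^{p/n} for xi_n ~ gamma_{p/n}
    (a gamma distribution with shape n/p and scale p/n)."""
    return sum(rng.gammavariate(1.0 / p, p) for _ in range(n)) / n

vals = [scaled_gamma_sum(rng, p, n) for _ in range(N)]
mean = sum(vals) / N
# sqrt(n)(|xi_n|^{p/n} - 1) is asymptotically N(0, p) since E g_1 = 1, Var g_1 = p
fluct_var = sum(n * (v - 1.0) ** 2 for v in vals) / N
print(round(mean, 3), round(fluct_var, 2))  # near 1 and near p, respectively
```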
For general $q$ call $\Xi_n := \sqrt{n}(|\xi_n|^{p/n} - 1)$; then we have ... Taylor expansion gives ..., where the remainder satisfies $|R(x)| \leq M x^2$ with some $M > 0$ for all $x \in \mathbb{R}$ sufficiently small. From the case $q = p$ we know ..., where the remainder satisfies $|R(x)| \leq M x^2$ with some $M > 0$ for all $x \in \mathbb{R}$ sufficiently small. Rearrange, ... As before we know $\frac{q \log|\xi_n|}{n} \xrightarrow{P} 0$ as $n \to \infty$, so the remainder lemma yields $n R\big( \frac{q \log|\xi_n|}{n} \big) \xrightarrow{P} 0$. Thus follows the claim.
Lemma 16. Let $p, q \in (0, \infty)$ with $p \neq q$, and for each $n \in \mathbb{N}$ let $\xi_n \sim \gamma_{p/n}$; define ..., where $N \sim N(0, 1)$; and for any $\alpha \in (0, \infty)$ we have ...; in particular convergence of moments holds true.

Proof. First we prove the claimed weak convergence. From the proof of Lemma 15, 2.(a), we know ...; we also know ..., with $N \sim N(0, 1)$. Then we have $|\xi_n|^{p/n} = 1 + \frac{\Xi_n}{\sqrt{n}}$, and via Taylor expansion of $x \mapsto (1 + x)^{q/p}$ we get ..., where the remainder term satisfies $|R(x)| \leq M |x|^3$ for all $|x| \leq \frac{1}{2}$ with some $M > 0$. From Lemma 15, 1.(a), we know $M_{p/n}^{q/n} = 1 + \frac{q(q-p)}{\cdots}$ ...; the latter also implies, via Slutsky's theorem, ... Equally by Slutsky's theorem we get ..., thence with the remainder lemma ..., and another application of Slutsky's theorem leads to ... Now we prove the boundedness of moments. Let $\alpha > 0$, w.l.o.g. such that $\alpha \geq 1$ and $\frac{2 \alpha q}{p} \geq 1$. Let $n \in \mathbb{N}$; then ..., and we are going to show that either term on the right-hand side remains bounded as $n \to \infty$. For the first expectation on the right-hand side of (6) we use the same Taylor expansion as before and additionally apply the inequality ... We already know that the first three deterministic coefficients converge in $\mathbb{R}$; because of ..., eventually for all $n$, on the event ..., we have $\frac{|\Xi_n|}{\sqrt{n}} \leq \frac{1}{2}$, and then ... Now we attend to the second expectation on the right-hand side of (6). First we apply Hölder's inequality, ... The first factor on the right-hand side of (7) is dealt with rather crudely: we simply estimate ...; for the individual summands we see ... and ... The second factor on the right-hand side of (7) equals $\mathrm{P}\big[ n^{-3/4} \sum_{i=1}^n (g_i - 1) \geq 1 \big]^{1/2}$; because the moment generating function of $g_1$ is finite in a neighbourhood of $0$, the sequence of partial sums $\big( \sum_{i \leq n} (g_i - 1) \big)_{n \geq 1}$ satisfies a moderate deviations principle, hence by [4, Theorem 3.7.1], ... This implies that eventually ..., and in total we obtain ...

Proof. Actually we are going to show that convergence is in $L_2$, that is, ..., and from Lemma 15 we know ... In any case we have $\frac{1}{m} \leq 1$, and from Lemma 15 again we get ..., and this finishes the proof.
The next lemma states a moderate deviations result for $p$-Gaussian variables. Note that the case $p \geq q$ treated below actually is covered by the standard theory, because then the moment generating function is finite in a neighbourhood of zero.

Lemma 18. Let $p \in (0, \infty]$ and $q \in (0, \infty)$, and let $(\xi_n)_{n \geq 1} \sim \gamma_p^{\otimes \mathbb{N}}$. If $p < q$, then let $\beta \in \big( \frac{1}{2}, \frac{1}{2 - p/q} \big)$; else if $p \geq q$, then let $\beta \in \big( \frac{1}{2}, 1 \big)$. Then the moderate deviations of $\big( \sum_{i=1}^n (|\xi_i|^q - M_p^q) \big)_{n \geq 1}$ are determined by the following, where $t \in (0, \infty)$, ...

Proof. This follows easily from [6, Theorem 2.2] by plugging in $b_n = n^\beta$ and using the tail estimate for $\gamma_p$, to wit: if $p < \infty$, then ..., and if $p = \infty$, then $|\xi_1| \leq 1$ a.s. and hence $\mathrm{P}[|\xi_1| \geq x] = 0$ for any $x > 1$. Then condition (2.3) in [6] is equivalent to ..., and our indicated values for $\beta$ satisfy that. The rate function is stated explicitly in (2.7) of [6].
The following lemma slightly extends the results [12, Theorem 1.1] and [14, Theorem A]. The case of $X_n \sim \kappa_\infty^{n-1}$ and $q < \infty$ actually is addressed in [22, Theorem 4.4, 1.] and its subsequent remark; but the proof merely glosses over said case, in particular it is not mentioned how to handle $\Vert (\xi_i)_{i \leq n} \Vert_\infty$. For the sake of completeness, we provide a proof here.

Lemma 19. Let $q_1 \in (0, \infty]$ and $p, q_2 \in (0, \infty)$ with $q_1 \neq q_2$, and either let $X_n \sim \mathrm{Unif}(\mathbb{B}_{q_1}^n)$ for any $n \in \mathbb{N}$, or $X_n \sim \kappa_{q_1}^{n-1}$ for any $n \in \mathbb{N}$. Define $(Y_n)_{n \geq 1}$ by ..., where $N \sim N(0, 1)$ and ..., and in particular
$$\big( \mathrm{E}\big[ n^{p(1/q_1 - 1/q_2)} \Vert X_n \Vert_{q_2}^p \big] \big)_{n \geq 1} \to (M_{q_1}^{q_2})^{p/q_2}. \qquad (9)$$
... then by the CLT $(\Xi_n)_{n \geq 1} \xrightarrow{d} \sigma N$ with $N \sim N(0, 1)$ and $\sigma^2 := V_\infty^{q_2}$. Furthermore, as has already been glimpsed in the proof of Lemma 14, ... converges in distribution. Via the exponential series we have ..., where ... $\to 0$ in distribution and hence in probability; from the latter and the remainder lemma (with $l = 1$) there follows $(H_n)_{n \geq 1} \xrightarrow{P} 0$. Now we may write ..., and rearranging terms gives ..., where we have employed the Taylor expansion ... with the remainder term satisfying $|R_2(x, y)| \leq M_2 \Vert (x, y) \Vert^2$ in a suitable neighbourhood of $(0, 0)$. Notice ... Now, by what we already have proved, $(Z_n)_{n \geq 1} \xrightarrow{d} \sigma N$, and again via Slutsky this implies ... But then Taylor expansion yields ...; again the remainder term satisfies $|R_3(x)| \leq M x^2$, and the remainder lemma and Slutsky's theorem lead to the desired conclusion. The boundedness of moments in (8) is subtler to prove. Let $\alpha \geq 1$, and choose $\beta$ as in Lemma 18, but with $\beta \leq \frac{3}{4}$. We treat the case $X_n \sim \mathrm{Unif}(\mathbb{B}_{q_1}^n)$ only; the result for $\kappa_{q_1}^{n-1}$ follows by replacing $U$ with $1$ in what follows.
Case $q_1 < \infty$: Take $(\xi_i)_{i \geq 1} \sim \gamma_{q_1}^{\otimes \mathbb{N}}$ and $U \sim \mathrm{Unif}([0, 1])$ independent, and define ..., and we are going to show that either expectation on the right-hand side of (10) is bounded for $n \in \mathbb{N}$. For the first one, write ..., i.e., $R_4$ is the zeroth remainder term of Taylor's expansion, which may be bounded as follows: ..., where we have used independence of $U$ and $\{x_n, y_n\}$, and on $A_n$ the estimates $n^{-1/2} |x_n|, n^{-1/2} |y_n| \leq n^{\beta - 1}$ hold true, therefore eventually they are smaller than $\frac{1}{2}$ since $\beta < 1$. Now it is well known that ... for any fixed $y > 0$, so the first term within the parentheses in (11) behaves like $n^{-\alpha}$. For the second term notice that $(x_n)_{n \geq 1}$ and $(y_n)_{n \geq 1}$ satisfy the central limit theorem, and ... In order to tackle the second summand in (10), first apply Hölder's inequality to get ... With the union bound the first expectation on the right-hand side of (13) is further estimated ... Writing out, we have ..., where the asymptotics are argued by Lemma 18; an analogous result is obtained for $y_n$, hence ... The second expectation in (13) can be computed explicitly, because ... and the latter follows a certain gamma-distribution, which yields ..., and that converges to $1$ as $n \to \infty$ by (12). Finally the third expectation in (13) is bounded from above, up to a constant factor depending only on $\alpha$, by ... The $y_n$-term we have dealt with before (just replace $-\alpha$ by $\alpha$), and $\mathrm{E}[U^{3 p \alpha / n}] \leq 1$.
Similarly to ..., whose law is not known explicitly though; nevertheless all moments of $|\xi_1|^{q_2}$ are finite, and $(1 + \frac{x_n}{\sqrt{n}})_{n \geq 1} \to 1$ almost surely by the SLLN, and by [7, Theorem 10.2] convergence is valid also in the $L_p$-sense, which in its turn implies ... Taken together this amounts to $\limsup_{n \to \infty} \mathrm{E}[|Y_n|^\alpha \mathbb{1}_{A_n^c}] = 0$ and thus, returning to (10), ...

Case $q_1 = \infty$: We are not going to spell out the details here, since the line of reasoning is analogous to the first case. Take $U$ and $(\xi_n)_{n \geq 1}$ and define $x_n$ as before, but set $y_n := H_n$ as in the proof of the CLT for $\Vert X_n \Vert_{q_2}$ given above, so the representation reads ... The remainder of this case's proof is conducted with the obvious adaptations; in particular notice ...

3 Proofs of the Poincaré-Maxwell-Borel principles

In this section we present the proofs of the Poincaré-Maxwell-Borel principles, that is, Theorem A and Theorem B. We shall start with the probabilistic representation of Schechtman-Zinn type, which facilitates computations.

Proof of the probabilistic representation
In this subsection we present the proof of Proposition 1, which provides us with a probabilistic representation of the uniform distribution on the unit balls in mixed-norm sequence spaces. Let $h \colon \mathbb{R}^{m \times n} \to [0, \infty)$ be an arbitrary measurable function; then ..., or, writing $x$ in terms of its rows $x_1, \ldots, x_m$, ... Introduce polar coordinates for each row separately (notice that this corresponds to our decomposition $(X_{i,j}) \ldots$). Finally use that $(r_i \theta_i)_{i \leq m} \in \mathbb{B}_{p,q}^{m,n}$ iff $(r_i)_{i \leq m} \in \mathbb{B}_p^m$, plug in $\omega_{p,q}^{m,n} = 2^{-m} \omega_{p/n}^m (\omega_q^n)^m$ (see Equation (5)), and gather terms to arrive at ... Now we recognize that the last integral proves the claimed density of $(R_i)_{i \leq m}$, the claimed independence, and the claimed distribution of the $\Theta_i$. It remains to show the representation of $(R_i)_{i \leq m}$. To that end define ..., and by Schechtman and Zinn it can be written ... Transforming back to $(R_i)_{i \leq m}$ concludes the proof.
(a) Case $p < \infty$: We use Proposition 1 (a). By the SLLN (apply the lemma with $q = 1$ and $m = 1$), the right-hand side of (14) converges to $1$ in probability; therefore $(m^{1/p} R_i)_{i \le k}$ converges in distribution towards a constant and thus also in probability.
For a separable metric space $E$ let $M_1(E)$ denote the convex set of probability measures on $(E, \mathcal{B}(E))$ endowed with the topology of weak convergence of measures; then $M_1(E)$ is separable too. This topology on $M_1(E)$ may be metrized by, e.g., the Lévy-Prokhorov metric $d_{LP}$. We denote by $\mathrm{Lip}_b(E)$ the space of bounded, Lipschitz-continuous functions on $E$, equipped with the usual norm.

Proof. $\Rightarrow$: Let $f \in \mathrm{Lip}_b(E)$; then the map $\nu \mapsto \int_E f \, d\nu$ is continuous at $\mu$ w.r.t. $d_{LP}$, hence for any $\varepsilon > 0$ there exists $\delta > 0$ such that the required estimate holds for any $\nu \in M_1(E)$. The union bound then implies the claim.

Proof of Theorem A.
(a) We have the representation from above; the claim follows from Lemma 20 (a), together with the independence of $(R_i)_{i \le m}$ from $\Theta_1, \dots, \Theta_m$. For the sake of legibility we introduce a shorthand; the sequence converges in distribution and, because the latter is constant in $M_1(\mathbb{R})$, also in probability.
We apply Lemma 22. Let $f \in \mathrm{Lip}_b(\mathbb{R})$; then the last line converges a.s. to zero because the sums obey the SLLN, and thus also in probability.
Essentially the same argument is valid in the case where the Lipschitz constant is taken with respect to $\Vert \cdot \Vert_q$; then again the sums obey the SLLN and hence the desired convergence is implied.
Case $p = \infty$: Notice that by the stochastic representation we are dealing with independent random variables, and thus the convergence is immediate.

Proof of Theorem B.
(a) Recall $(m^{1/p} n^{1/q} X_{i,j})_{i \le m, j \le n} = (m^{1/p} R_i \cdot n^{1/q} \Theta_{i,j})_{i \le m, j \le n}$.
Lemma 20, (b) and (c), implies the convergence in distribution of $(m^{1/p} n^{1/q} X_{i,j})_{i \le k, j \le l}$ as claimed, where the joint convergence of $(R_i)_{i \le m}, \Theta_1, \dots, \Theta_m$ may be argued either by their independence or by Slutsky's theorem.
and because of $(D_{1,n})_{n \ge 1} \xrightarrow{P} 1$ the latter converges to zero as $n \to \infty$, irrespective of whether $m$ is fixed or diverges. We also have the laws of large numbers; choose $n_0$ such that the probability that the required estimates all hold true is at least, say, $1 - \varepsilon$. Let $n \ge n_0$; then on the same event, for any $i \in [1, m]$, the corresponding bound holds, and therewith the claim. Because this estimate holds for all $n \ge n_0$ with probability at least $1 - \varepsilon$, convergence in probability is established.
Case $p = \infty$: Again let $f \in \mathrm{Lip}_b(\mathbb{R})$; then, with the same notation and techniques as before, the remaining argument is the same as in the case $p < \infty$, only formally $C_{m,n} = 1$ throughout.

Proofs of the weak limit theorems
In this section we present the proofs of the weak limit theorems, that is, Theorem C, Theorem D, and Theorem E, as well as of Corollaries 4, 5, and 7.

Recall that Theorem C treats the regime m → ∞ while n is fixed.
Proof of Theorem C. Case $p_1 < \infty$: Appealing to Proposition 1, by the multivariate CLT we know the joint convergence with the corresponding covariance matrix. For brevity's sake we introduce the shorthand $\mu$; for the third equality we have performed the Taylor expansion whose remainder satisfies $|R(u, x, y)| \le M \Vert (u, x, y) \Vert_2^2$ in a suitable neighbourhood of $(0, 0, 0)$. Rearranging yields an expression that converges in probability to $(0, 0, 0)$ as $m \to \infty$, by appealing to Slutsky's theorem and the known distributional convergence of $(\Xi_m, H_m)$. The remainder lemma then implies $\big( \sqrt{m} \, R\big( \tfrac{\log(U)}{m}, \tfrac{\Xi_m}{\sqrt{m}}, \tfrac{H_m}{\sqrt{m}} \big) \big)_{m \ge 1} \xrightarrow{P} 0$. Since we also know $(m^{-1/2} \log(U))_{m \ge 1} \to 0$ almost surely and thus in probability, by Slutsky's theorem the right-hand side of the last display converges to the random variable $\sigma N$, where $N \sim \mathcal{N}(0, 1)$; a simple calculation shows that $\sigma^2$ is the desired variance.
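The Taylor step invoked here is the standard first-order expansion with quadratic remainder control; for reference, in generic form (our notation, matching the remainder bound above):

```latex
% For f \in C^2 in a neighbourhood of 0 in R^d, the first-order
% Taylor expansion with Lagrange remainder gives
f(z) = f(0) + \nabla f(0) \cdot z + R(z),
\qquad |R(z)| \le M \, \Vert z \Vert_2^2
% for all \Vert z \Vert_2 small enough, where M bounds the second
% derivatives of f near 0; applied with z = (u, x, y) this is the
% bound |R(u, x, y)| \le M \Vert (u, x, y) \Vert_2^2 used above.
```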
The regime for Theorem D is n → ∞ while m is fixed.
where we have introduced the Taylor polynomial expansion (the partial derivatives of first order w.r.t. $x_1, \dots, x_m$ are indeed zero), where again the remainder term satisfies the analogous bound, and we apply the usual argument: $((\Xi_{n,i})_{i \le m})_{n \ge 1}$ and $((H_{n,i})_{i \le m})_{n \ge 1}$ converge in distribution, hence from Slutsky's theorem we infer the convergence of $n^{1/4} \tfrac{\log(U)}{n}$ together with the other terms, where $(E_i)_{i \le m} \sim \mathcal{E}(1)^{\otimes m}$ is independent of $(N_{2,i})_{i \le m}$ introduced before; therewith the rest follows as before, with the obvious modification, and similarly for the remainder term.
(b) Here, as in the following case, $\Vert \Theta_i \Vert_{q_2} = \Vert \Theta_i \Vert_{q_1} = 1$. We perform the Taylor expansion of the same function as in (a), case $p_1 < \infty$, but restricted to $(y_i)_{i \le m} = 0$ and writing out second-order terms; to wit, $A = (a_{i,j})_{i,j \le m} \in \mathbb{R}^{m \times m}$ is given by $a_{i,i} = m - 1$ and $a_{i,j} = -1$ for all $i, j \in [1, m]$ with $i \ne j$, and the remainder term satisfies $|R(u, x)| \le M \Vert (u, x) \Vert_2^3$ with some $M > 0$ for all $\Vert (u, x) \Vert_2$ sufficiently small. We choose $(l, \alpha_n, \beta_n) := (3, n^{1/3}, n)$ for the remainder lemma; indeed, $\big( n^{1/3} \tfrac{\log(U)}{n}, \big( \tfrac{\Xi_{n,i}}{\sqrt{n}} \big)_{i \le m} \big) = \big( n^{-2/3} \log(U), n^{-1/6} (\Xi_{n,i})_{i \le m} \big)$ converges to $0$ in probability as $n \to \infty$, therefore the remainder lemma implies $\big( n R\big( \tfrac{\log(U)}{n}, \big( \tfrac{\Xi_{n,i}}{\sqrt{n}} \big)_{i \le m} \big) \big)_{n \ge 1} \xrightarrow{P} 0$. Additionally we have $\big( \tfrac{\log(U)^2}{n} \big)_{n \ge 1} \to 0$ almost surely and hence in probability. Thus via Slutsky's theorem it remains to argue that the right-hand side has the claimed distribution. That $-\log(U) \sim \mathcal{E}(1)$ is common lore. Since $(\xi_i)_{i \le m}$ is independent from $U$, $(N_{1,i})_{i \le m}$ can be assumed independent from $U$. The matrix $A$ is symmetric and has eigenvalues $m$ with multiplicity $m - 1$ and $0$ with multiplicity $1$, hence its spectral decomposition reads $A = O \, \mathrm{diag}(m, \dots, m, 0) \, O^T$ with orthogonal $O \in \mathbb{R}^{m \times m}$. The standard Gaussian distribution is orthogonally invariant, that is, $(N_i)_{i \le m} := O^T (N_{1,i})_{i \le m} \sim \mathcal{N}(0, I_m)$. Because $(N_i)_{i \le m}$ is still independent from $U$, we have finished.
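The spectrum of $A$ is easy to check: $A = m I_m - J_m$ with $J_m$ the all-ones matrix, and $J_m$ has eigenvalues $m$ (once) and $0$ ($m - 1$ times), which yields the claimed multiplicities. A quick numerical sketch (the value $m = 6$ is an arbitrary illustration):

```python
import numpy as np

m = 6
# A has m - 1 on the diagonal and -1 off the diagonal,
# i.e. A = m * I_m - J_m with J_m the all-ones matrix.
A = m * np.eye(m) - np.ones((m, m))
eigvals = np.sort(np.linalg.eigvalsh(A))
# Expected spectrum: 0 once, m with multiplicity m - 1.
print(np.allclose(eigvals, [0.0] + [float(m)] * (m - 1)))  # prints True
```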
(c) Using the same expansion as in (a), case $p_1 = \infty$, and restricting to $(y_i)_{i \le m} = 0$ as in (b), while naming $R'(x) := R(x, 0)$, the result follows, after a rearrangement, from $\tfrac{1}{p_1} (\Xi_{n,i})_{i \le m} \xrightarrow{d} \mathcal{E}(1)^{\otimes m}$, managing the remainder term as in (b) above.
In Theorem E we now consider $n \to \infty$ and $m = m(n) \to \infty$. The proof features the Lyapunov CLT: let $((Z_{n,i})_{i \le m})_{n \ge 1}$ be an array of $\mathbb{R}$-valued random variables with independent rows (i.e., for any $n \in \mathbb{N}$ the variables $Z_{n,1}, \dots, Z_{n,m}$ are independent), and call $s_n^2 := \sum_{i=1}^m \mathrm{Var}(Z_{n,i})$. As an aside, note that Lyapunov's condition actually implies Lindeberg's condition, which in its turn implies the CLT.
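For reference, the classical formulation of the Lyapunov CLT reads as follows (standard statement, in the notation just introduced):

```latex
% Lyapunov CLT: with s_n^2 := \sum_{i \le m} \mathrm{Var}(Z_{n,i}),
% suppose that for some \delta > 0
\lim_{n \to \infty} \frac{1}{s_n^{2+\delta}}
  \sum_{i=1}^{m} \mathbb{E}\big[ |Z_{n,i} - \mathbb{E} Z_{n,i}|^{2+\delta} \big] = 0 ;
% then
\frac{1}{s_n} \sum_{i=1}^{m} \big( Z_{n,i} - \mathbb{E} Z_{n,i} \big)
  \xrightarrow{d} \mathcal{N}(0, 1).
```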
Proof of Theorem E.
This time we have $m^{1/p_2} \Vert r^{m,n} \Vert_{\infty, q_1} \, \Vert r^{m,n} \Vert_{p_2, q_1}^{-1} = 1 = A_{q_1, q_1}$, and the noncritical cases follow as before. For the threshold value $t = 1$, Equation (21) applies.

Proof of Corollary 7. Case $q_1 = q_2$: Referring to Theorem E we have the corresponding expression for $V_{m,n}(t)$ in terms of $\sqrt{mn} \, m^{1/p_1 - 1/p_2}$.

Define the total variation distance $d_{TV}(\mu, \nu) := 2 \sup_{A \in \mathcal{B}(E)} |\mu(A) - \nu(A)|$; if $\mu$ and $\nu$ are absolutely continuous w.r.t. a common measure $\lambda$ on $E$ with densities $f$ and $g$, respectively, then $d_{TV}(\mu, \nu) = \int_E |f - g| \, d\lambda$ can be shown. Convergence w.r.t. $d_{TV}$ of the laws of random variables implies convergence in distribution. The following goes back to [23, Theorems 4.1, 4.5].
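The identity between the set-based and density-based forms of $d_{TV}$ follows by evaluating the supremum at $A = \{f > g\}$; a one-line sketch:

```latex
% With the normalization d_TV(mu, nu) = 2 sup_A |mu(A) - nu(A)|:
\mu(A) - \nu(A) = \int_A (f - g)\, d\lambda
  \le \int_{\{f > g\}} (f - g)\, d\lambda
  = \tfrac{1}{2} \int_E |f - g| \, d\lambda,
% with equality for A = \{f > g\}: since \int_E (f - g)\, d\lambda = 0,
% the positive and negative parts of f - g have equal mass;
% hence d_TV(\mu, \nu) = \int_E |f - g| \, d\lambda.
```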

$\xrightarrow[n \to \infty]{P} 0$; thus by the remainder lemma $\sqrt{n} \, R\big( \tfrac{\Xi_n}{\sqrt{n}} \big) \xrightarrow[n \to \infty]{P} 0$, and another application of Slutsky's theorem leads to the desired statement. (b) Using Taylor expansion of the exponential function as in the proof of 1.(b), convergence in distribution follows. Case $p = \infty$: Obvious because of $(R_i)_{i \le k} \stackrel{d}{=} (|\xi_i|^{1/n})_{i \le k}$. (b) Case $p < \infty$: By Lemma 17 we know both $\big( \tfrac{1}{m} \sum_{i=1}^m |\xi_i|^{p/n} \big)_{n \ge 1} \xrightarrow{P} 1$ and

Remark 21.
Statements (a) and (c) of Lemma 20 can be seen as consequences of Proposition 10; this is immediate for (c), and for (a) recall the corresponding identity from the proof of Proposition 1. Now take any $f \in \mathrm{Lip}_b(\mathbb{R})$ and consider $\tfrac{1}{mn} \sum_{i \le m} \sum_{j \le n} f(D_{i,n} \eta_{i,j}) - \mathbb{E}[f(\eta_{1,1})]$; we have to show that the probability of this expression being smaller than any positive number converges to one. So let $\varepsilon > 0$. Define $B_{m,n,\varepsilon} := \sum_{i=1}^m \mathbb{1}_{[|D_{i,n} - 1| \ge \varepsilon]}$; then $B_{m,n,\varepsilon}$ is binomially distributed with parameters $m$ and $\mathbb{P}[|D_{1,n} - 1| \ge \varepsilon]$, and there holds $\big( \tfrac{1}{m} B_{m,n,\varepsilon} \big)_{n \ge 1} \xrightarrow{P} 0$: indeed, let $\delta > 0$; then
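The convergence $( \tfrac{1}{m} B_{m,n,\varepsilon} )_{n \ge 1} \xrightarrow{P} 0$ can be completed via Markov's inequality, using the binomial distribution just identified together with $(D_{1,n})_{n \ge 1} \xrightarrow{P} 1$; a sketch:

```latex
% B_{m,n,eps} ~ Bin(m, p_n) with p_n := P[|D_{1,n} - 1| >= eps],
% so E[B_{m,n,eps}] = m p_n, and Markov's inequality gives
\mathbb{P}\Big[ \tfrac{1}{m} B_{m,n,\varepsilon} > \delta \Big]
  \le \frac{\mathbb{E}[B_{m,n,\varepsilon}]}{m \delta}
  = \frac{\mathbb{P}[\,|D_{1,n} - 1| \ge \varepsilon\,]}{\delta}
  \xrightarrow{n \to \infty} 0 .
```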

$\tfrac{\Xi_{n,1}}{\sqrt{n}} \xrightarrow[n \to \infty]{} 0$ in distribution and in probability; the remainder lemma then gives $\big( \sqrt{n} \, R\big( \tfrac{\log(U)}{n}, \big( \tfrac{\Xi_{n,i}}{\sqrt{n}} \big)_{i \le m}, \big( \tfrac{H_{n,i}}{\sqrt{n}} \big)_{i \le m} \big) \big)_{n \ge 1} \xrightarrow{P} 0$; we also have $\big( \tfrac{\log(U)}{\sqrt{n}} \big)_{n \ge 1} \to 0$ almost surely and in probability; and a final use of Slutsky's theorem leads to the desired result. Case $p_1 = \infty$: Now, according to Lemma 15, 2.(b),