On the Expectation of the Norm of Random Matrices with Non-Identically Distributed Entries

We give estimates for the expectation of the norm of random matrices with independent but not necessarily identically distributed entries.


Introduction and Notation
We study the order of magnitude of the expectation of the largest singular value, i.e. of the operator norm

$$\mathbb{E}\,\big\| (a_{i,j}\, g_{i,j})_{i,j=1}^n : \ell_2^n \to \ell_2^n \big\|_{2\to 2},$$

of random matrices with independent entries, where $a_{i,j} \in \mathbb{R}$, $i,j = 1,\dots,n$, are real coefficients, $g_{i,j}$, $i,j = 1,\dots,n$, are independent standard Gaussian random variables, and $\|\cdot\|_{2\to 2}$ is the operator norm on $\ell_2^n$. There are two cases with a complete answer. Chevet [2] showed for matrices satisfying $a_{i,j} = a_i b_j$ that the expectation is proportional to

$$\|a\|_2 \|b\|_\infty + \|a\|_\infty \|b\|_2,$$

where $\|a\|_2$ denotes the Euclidean norm of $a = (a_1,\dots,a_n)$ and $\|a\|_\infty = \max_{1\le i\le n} |a_i|$.

* Christian-Albrechts-Universität, Mathematisches Seminar, Kiel, Germany, email: lastname@math.uni-kiel.de.
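As a quick numerical illustration of Chevet's theorem (our addition, not part of the original argument), the following sketch estimates the expected operator norm by Monte Carlo and compares it with the expression $\|a\|_2\|b\|_\infty + \|a\|_\infty\|b\|_2$; the profile vectors `a` and `b`, the trial count, and the function names are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_opnorm(a, b, trials=100):
    """Monte Carlo estimate of E || (a_i b_j g_ij) ||_{2->2}."""
    n = len(a)
    coeff = np.outer(a, b)  # coefficient matrix a_i * b_j
    return np.mean([np.linalg.norm(coeff * rng.standard_normal((n, n)), 2)
                    for _ in range(trials)])

n = 100
a = 1.0 / np.sqrt(np.arange(1, n + 1))  # an arbitrary test profile
b = np.ones(n)

chevet = (np.linalg.norm(a) * np.abs(b).max()
          + np.abs(a).max() * np.linalg.norm(b))
print(expected_opnorm(a, b), chevet)  # agree up to an absolute constant
```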
For diagonal matrices with diagonal entries $d_1,\dots,d_n$ the expectation of the norm, $\mathbb{E}\max_{1\le i\le n} |d_i g_i|$, is of the order of the Orlicz norm $\|(d_1,\dots,d_n)\|_M$, where the Orlicz function is given by

$$M(s) = \sqrt{\tfrac{2}{\pi}} \int_0^s e^{-\frac{1}{2t^2}}\, dt \qquad [3].$$

This Orlicz norm is, up to a factor logarithmic in $n$, equal to the norm $\max_{1\le i\le n} |d_i|$.
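The following sketch (ours, for illustration only) evaluates this Orlicz norm numerically, using a midpoint rule for $M$ and bisection for the norm, and compares it with a Monte Carlo estimate of $\mathbb{E}\max_{1\le i\le n} |d_i g_i|$; the test vector and the discretization parameters are arbitrary.

```python
import numpy as np

def M(s, k=2000):
    """Midpoint-rule approximation of M(s) = sqrt(2/pi) * int_0^s exp(-1/(2 t^2)) dt."""
    if s <= 0.0:
        return 0.0
    t = (np.arange(k) + 0.5) * (s / k)  # interior nodes, avoids t = 0
    return np.sqrt(2.0 / np.pi) * np.sum(np.exp(-1.0 / (2.0 * t * t))) * (s / k)

def orlicz_norm(x, M, lo=1e-8, hi=1e8, iters=80):
    """||x||_M = inf { rho > 0 : sum_i M(|x_i| / rho) <= 1 }, bisection in log scale."""
    f = lambda rho: sum(M(abs(xi) / rho) for xi in x) - 1.0
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return hi

rng = np.random.default_rng(1)
d = 1.0 / np.arange(1, 51)  # an arbitrary diagonal
emax = np.mean([np.abs(d * rng.standard_normal(d.size)).max() for _ in range(2000)])
print(orlicz_norm(d, M), emax)  # same order of magnitude
```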
These two cases are of very different structure and seem to exhibit the essential phenomena that can occur concerning the structure of matrices. This leads us to conjecture that the expectation for arbitrary matrices is, up to a logarithmic factor, equal to

$$\max_{i=1,\dots,n} \big\|(a_{i,j})_{j=1}^n\big\|_2 + \max_{j=1,\dots,n} \big\|(a_{i,j})_{i=1}^n\big\|_2. \qquad (1)$$

Latała [4] showed for arbitrary matrices of independent mean-zero random variables $X_{i,j}$ that

$$\mathbb{E}\,\big\|(X_{i,j})_{i,j=1}^n\big\|_{2\to 2} \le c\left( \max_i \Big(\sum_j \mathbb{E}X_{i,j}^2\Big)^{1/2} + \max_j \Big(\sum_i \mathbb{E}X_{i,j}^2\Big)^{1/2} + \Big(\sum_{i,j} \mathbb{E}X_{i,j}^4\Big)^{1/4} \right).$$

Seginer [11] showed for any $n \times m$ random matrix $(X_{i,j})_{i,j=1}^{n,m}$ of independent identically distributed random variables that

$$\mathbb{E}\,\big\|(X_{i,j})\big\|_{2\to 2} \le c\left( \mathbb{E}\max_{1\le i\le n} \big\|(X_{i,j})_{j=1}^m\big\|_2 + \mathbb{E}\max_{1\le j\le m} \big\|(X_{i,j})_{i=1}^n\big\|_2 \right).$$

The largest singular value was first investigated in [12, 13]. The behavior of the smallest singular value has been determined in [1, 6, 7].
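For orientation (our addition), one can compare, for a sample coefficient profile, a Monte Carlo estimate of $\mathbb{E}\|(a_{i,j}g_{i,j})\|_{2\to 2}$ with the conjectured expression (1) and with Latała's bound; the sparse profile and the trial counts below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
# a sparse, non-homogeneous coefficient profile (arbitrary choice)
a = rng.uniform(0.0, 1.0, (n, n)) * (rng.random((n, n)) < 0.05)

row = np.sqrt((a**2).sum(axis=1)).max()  # max_i ||(a_ij)_j||_2
col = np.sqrt((a**2).sum(axis=0)).max()  # max_j ||(a_ij)_i||_2
fourth = (3.0 * (a**4).sum()) ** 0.25    # (sum_ij E (a_ij g_ij)^4)^{1/4}, E g^4 = 3

sim = np.mean([np.linalg.norm(a * rng.standard_normal((n, n)), 2)
               for _ in range(50)])
print(sim, row + col, row + col + fourth)  # conjecture (1); Latala's bound
```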
Theorem 1.1. There is a constant $c > 0$ such that for all $a_{i,j} \in \mathbb{R}$, $i,j = 1,\dots,n$, and all independent standard Gaussian random variables $g_{i,j}$, $i,j = 1,\dots,n$,

$$\mathbb{E}\,\big\|(a_{i,j}\, g_{i,j})_{i,j=1}^n\big\|_{2\to 2} \le c\,(\ln n)^{3/2}\left( \max_{i=1,\dots,n} \big\|(a_{i,j})_{j=1}^n\big\|_2 + \max_{j=1,\dots,n} \big\|(a_{i,j})_{i=1}^n\big\|_2 \right).$$

In the same way as we prove Theorem 1.1 we can show the similar formula

$$\mathbb{E}\,\big\|(a_{i,j}\, g_{i,j})_{i,j=1}^n\big\|_{2\to 2} \le c\,(\ln n)^{3/2}\left( \mathbb{E}\max_{i=1,\dots,n} \big\|(a_{i,j} g_{i,j})_{j=1}^n\big\|_2 + \mathbb{E}\max_{j=1,\dots,n} \big\|(a_{i,j} g_{i,j})_{i=1}^n\big\|_2 \right). \qquad (2)$$

This inequality generalizes to arbitrary random variables as in [4].
Thus the expectation of the norm is, up to a logarithmic factor, equal to (1). To obtain a better estimate from below, on the other hand, we investigate the expression (2). We show that the expression (2) is equivalent to the Musielak-Orlicz norm of the vector $(1,\dots,1)$, where the Orlicz functions are given through the coefficients $a_{i,j}$, $i,j = 1,\dots,n$. Our formula (Theorem 3.1) enables us to estimate the expectation of the operator norm from below efficiently in many cases.
Moreover, we do not know of any matrix where the expectation of the norm is not of the same order as (2).
A convex function $M : [0,\infty) \to [0,\infty)$ with $M(0) = 0$ is called an Orlicz function [8]. Let $M$ be an Orlicz function and $x \in \mathbb{R}^n$. Then the Orlicz norm of $x$, $\|x\|_M$, is defined by

$$\|x\|_M = \inf\Big\{ \rho > 0 : \sum_{i=1}^n M\Big(\frac{|x_i|}{\rho}\Big) \le 1 \Big\}.$$

If two Orlicz functions are equivalent, so are their norms: if $M_1(cs) \le M_2(s) \le M_1(Cs)$ for all $s \ge 0$, then for all $x \in \mathbb{R}^n$

$$c\,\|x\|_{M_1} \le \|x\|_{M_2} \le C\,\|x\|_{M_1}.$$

In addition, let $M_i$, $i = 1,\dots,n$, be Orlicz functions and let $x \in \mathbb{R}^n$. Then the Musielak-Orlicz norm of $x$, $\|x\|_{(M_i)_i}$, is defined by

$$\|x\|_{(M_i)_i} = \inf\Big\{ \rho > 0 : \sum_{i=1}^n M_i\Big(\frac{|x_i|}{\rho}\Big) \le 1 \Big\}.$$
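A minimal computational sketch of this definition (ours): since $\rho \mapsto \sum_i M_i(|x_i|/\rho)$ is decreasing in $\rho$, the infimum can be located by bisection; the bracketing interval and iteration count below are arbitrary.

```python
import numpy as np

def musielak_orlicz_norm(x, Ms, lo=1e-9, hi=1e9, iters=200):
    """||x||_{(M_i)} = inf { rho > 0 : sum_i M_i(|x_i| / rho) <= 1 }."""
    f = lambda rho: sum(Mi(abs(xi) / rho) for Mi, xi in zip(Ms, x)) - 1.0
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # bisection on a logarithmic scale
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return hi

# sanity check: for M_i(s) = s^2 the Orlicz norm is the Euclidean norm
x = [3.0, 4.0]
print(musielak_orlicz_norm(x, [lambda s: s * s] * 2))  # ~5.0
```

With identical $M_i = M$ this reduces to the Orlicz norm $\|x\|_M$.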

The upper estimate
In this section we are going to prove the upper estimate. We require the following known lemma; in a more general form see e.g. [10, Lemma 10].

Lemma 2.1. Let $\|\cdot\|_T$ be the norm on $\mathbb{R}^n$ whose unit ball is $B_T$. Then, for all $x \in \mathbb{R}^n$,

Proof. Let $x \in \mathbb{R}^n$, and let $x_1^*,\dots,x_n^*$ denote the decreasing rearrangement of the numbers $|x_1|,\dots,|x_n|$.
By our previous lemma we obtain the corresponding estimate. We now use the concentration of Gaussian measure for sums of independent Gaussian random variables: for every function $F : \mathbb{R}^n \to \mathbb{R}$ that is Lipschitz with constant $L$ and a standard Gaussian vector $g$,

$$\mathbb{P}\big( |F(g) - \mathbb{E}F(g)| > t \big) \le 2 \exp\Big( -\frac{K t^2}{L^2} \Big), \qquad t > 0, \qquad (4)$$

where $K = \frac{2}{\pi^2}$. The following lemma is an immediate consequence.

Lemma 2.2. For all $a_1,\dots,a_n \in \mathbb{R}$ and all $t > 0$,

$$\mathbb{P}\Big( \Big| \big\|(a_j g_j)_{j=1}^n\big\|_2 - \mathbb{E}\big\|(a_j g_j)_{j=1}^n\big\|_2 \Big| > t \Big) \le 2 \exp\Big( -\frac{K t^2}{\max_{1\le j\le n} a_j^2} \Big),$$

where $K$ is the constant from (4).
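A quick empirical check of the concentration inequality (4) in the setting of Lemma 2.2 (our addition): we take $F(g) = \|(a_j g_j)_{j=1}^n\|_2$, which is Lipschitz with constant $\max_j |a_j|$; the coefficients and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.uniform(0.1, 1.0, 500)   # arbitrary coefficients
L = np.abs(a).max()              # Lipschitz constant of g -> ||(a_j g_j)_j||_2
K = 2.0 / np.pi**2

samples = np.linalg.norm(a * rng.standard_normal((10000, a.size)), axis=1)
mean = samples.mean()
for t in (0.5, 1.0, 2.0):
    empirical = np.mean(np.abs(samples - mean) > t)
    print(t, empirical, 2.0 * np.exp(-K * t * t / L**2))  # empirical <= bound
```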

Please note that

where $C$ is an absolute constant. Furthermore, for $\beta$ such that $\frac{2K\beta^2}{\pi} = 3\ln(2n)$, with $K = \frac{2}{\pi^2}$ the constant from (4), we get the following (Proposition 2.3).

Proof. We shall apply Lemma 2.2. We may assume that $\max_{i=1,\dots,n} \sum_{j=1}^n a_{i,j}^2 = 1$. Therefore, for $\beta \in \mathbb{R}_{>0}$, we have, for $i = 1,\dots,n$,

Now we apply Lemma 2.2 and get

We have for all $\beta$ with $\beta \ge \pi$

Again, by (6), we have for all $\beta$ with $\beta \ge \pi$

We choose $\beta$ such that $3\ln(2n) = \frac{2K\beta^2}{\pi}$. Then

Proposition 2.4. Let $a_{i,j} \in \mathbb{R}$, $i,j = 1,\dots,n$, and let $g_{i,j}$, $i,j = 1,\dots,n$, be independent standard Gaussian random variables. Then

Proof. We divide the estimate of $\mathbb{E}\,\|(a_{i,j}\, g_{i,j})_{i,j=1}^n\|_{2\to 2}$ into two parts. Let $M$ be the set of all points with

Clearly,

Furthermore, by the Cauchy-Schwarz inequality and Proposition 2.3 we get

Besides, we obviously have

Altogether, this yields

Summing up, we get

Proof (Theorem 1.1). W.l.o.g. we assume that $0 \le a_{i,j} \le 1$, $i,j = 1,\dots,n$, and that there is a coordinate that equals $1$. For all $i,j = 1,\dots,n$ and $k \in \mathbb{N}$ we define

$$a_{i,j}^k = \begin{cases} a_{i,j} & \text{if } 2^{-k} < a_{i,j}^2 \le 2^{-k+1}, \\ 0 & \text{otherwise}, \end{cases}$$

and we put $G_k = (a_{i,j}^k\, g_{i,j})_{i,j=1}^n$. We denote by $\varphi(k)$ the number of nonzero entries of the matrix $(a_{i,j}^k)_{i,j=1}^n$ and we choose $\gamma$ such that $\sum_{i,j=1}^n a_{i,j}^2 \le 2^{\gamma}$. Since every nonzero entry of $(a_{i,j}^k)_{i,j=1}^n$ satisfies $(a_{i,j}^k)^2 > 2^{-k}$, we get $\varphi(k) \le 2^{k+\gamma}$. Therefore, the nonzero entries of $G_k$ are contained in a submatrix of size $2^{k+\gamma} \times 2^{k+\gamma}$. Taking this into account and applying the preceding estimates, we get

Since one of the coordinates of the matrix is $1$,

Therefore, there is a constant $c$ such that

The matrix $\sum_{k \le 2\gamma} G_k$ has at most

$$\sum_{k \le 2\gamma} 2^{k+\gamma} \le 2^{3\gamma+1} \qquad (7)$$

entries that are different from $0$. Therefore, all nonzero entries of $\sum_{k \le 2\gamma} G_k$ are contained in a square submatrix having less than (7) rows and columns. We may apply Proposition 2.4 and obtain the assertion with a proper constant $c$.
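The level-set decomposition in the proof above is easy to experiment with; the following sketch (ours) splits a coefficient matrix into the level matrices $(a_{i,j}^k)_{i,j=1}^n$, under the reading $2^{-k} < a_{i,j}^2 \le 2^{-k+1}$ of the levels as reconstructed above, and reports the counts $\varphi(k)$.

```python
import numpy as np

def dyadic_levels(a, kmax=40):
    """Level matrices a^k with a^k_ij = a_ij if 2^{-k} < a_ij^2 <= 2^{-k+1}, else 0."""
    return [np.where((a**2 > 2.0**-k) & (a**2 <= 2.0**(-k + 1)), a, 0.0)
            for k in range(1, kmax + 1)]

rng = np.random.default_rng(4)
a = rng.random((8, 8))          # entries in [0, 1), as in the proof's normalization
levels = dyadic_levels(a)

# the levels partition the entries (up to those with a_ij^2 below 2^{-kmax})
assert np.allclose(sum(levels), np.where(a**2 > 2.0**-40, a, 0.0))
print([int((lv != 0).sum()) for lv in levels[:8]])  # phi(k) for k = 1..8
```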
The lower estimate

Theorem 3.1. For all $i,j = 1,\dots,n$ let $a_{i,j} \in \mathbb{R}$ and let $g_{i,j}$ be independent standard Gaussian random variables. For all $s \in \mathbb{R}_{\ge 0}$ and for all $i = 1,\dots,n$ let the Orlicz function $N_i$ be given through the coefficients $(a_{i,j})_{j=1}^n$; respectively, for all $s \in \mathbb{R}_{\ge 0}$ and for all $j = 1,\dots,n$, let the corresponding Orlicz functions be given through the coefficients $(a_{i,j})_{i=1}^n$. Then

$$c_1 \,\big\|(1,\dots,1)\big\|_{(N_i)_i} \;\le\; \mathbb{E}\max_{1\le i\le n} \Big(\sum_{j=1}^n a_{i,j}^2\, g_{i,j}^2\Big)^{1/2} \;\le\; c_2\, \big\|(1,\dots,1)\big\|_{(N_i)_i},$$

where $c_1$ and $c_2$ are absolute constants.
The following example is an immediate consequence of Theorem 3.1. It covers Toeplitz matrices.

Example 3.2. Let $A$ be an $n \times n$ matrix whose entries are constant along diagonals, i.e. a Toeplitz matrix: for all $i,j = 1,\dots,n$ and $k = 1,\dots,2n-1$ we have $a_{i,j} = c_k$ whenever $j - i = k - n$.
We associate to a random variable $X$ an Orlicz function $M$ by (8). We have:

Lemma 3.3. There are strictly positive constants $c_1$ and $c_2$ such that for all $n \in \mathbb{N}$, all independent random variables $X_1,\dots,X_n$ with finite first moments and for all $x \in \mathbb{R}^n$,

$$c_1\, \|x\|_{(M_i)_i} \;\le\; \mathbb{E}\max_{1\le i\le n} |x_i X_i| \;\le\; c_2\, \|x\|_{(M_i)_i},$$

where $M_1,\dots,M_n$ are the Orlicz functions that are associated to the random variables $X_1,\dots,X_n$ by (8).
Lemma 3.3 is a generalization of the same result for identically distributed random variables [3]. It can be generalized from the $\ell_\infty$-norm to Orlicz norms.
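As an illustration (ours), Lemma 3.3 can be tested in a case where associated Orlicz functions are available by rescaling: for $X_i = \sigma_i g_i$ one may take $M_i(s) = M(\sigma_i s)$, with $M$ the Gaussian Orlicz function from the introduction, since $|x_i X_i| = |(\sigma_i x_i)\, g_i|$. The scales, weights, and sample sizes below are arbitrary.

```python
import numpy as np

def M(s, k=2000):
    """M(s) = sqrt(2/pi) * int_0^s exp(-1/(2 t^2)) dt, midpoint rule (as above)."""
    if s <= 0.0:
        return 0.0
    t = (np.arange(k) + 0.5) * (s / k)
    return np.sqrt(2.0 / np.pi) * np.sum(np.exp(-1.0 / (2.0 * t * t))) * (s / k)

def musielak_orlicz_norm(x, Ms, lo=1e-9, hi=1e9, iters=60):
    """||x||_{(M_i)} by logarithmic bisection (as in the earlier sketch)."""
    f = lambda rho: sum(Mi(abs(xi) / rho) for Mi, xi in zip(Ms, x)) - 1.0
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return hi

rng = np.random.default_rng(5)
n = 40
sigma = rng.uniform(0.2, 3.0, n)   # X_i = sigma_i * g_i, not identically distributed
x = rng.uniform(0.1, 1.0, n)

Ms = [(lambda s, sg=sg: M(sg * s)) for sg in sigma]  # M_i(s) = M(sigma_i * s)
emax = np.mean([np.abs(x * sigma * rng.standard_normal(n)).max()
                for _ in range(2000)])
print(musielak_orlicz_norm(x, Ms), emax)  # comparable, as Lemma 3.3 predicts
```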
We use the fact [9] that for all $s > 0$,

$$\frac{1}{\sqrt{2\pi}}\,\frac{s}{1+s^2}\,e^{-s^2/2} \;\le\; \mathbb{P}(g \ge s) \;\le\; \frac{1}{\sqrt{2\pi}}\,\frac{1}{s}\,e^{-s^2/2}.$$
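A direct numerical check of these standard tail estimates (our addition; the test points are arbitrary):

```python
from math import erf, exp, pi, sqrt

def gauss_tail(s):
    """P(g >= s) for a standard Gaussian g, via the error function."""
    return 0.5 * (1.0 - erf(s / sqrt(2.0)))

for s in (0.5, 1.0, 2.0, 4.0):
    lower = (1.0 / sqrt(2.0 * pi)) * (s / (1.0 + s * s)) * exp(-s * s / 2.0)
    upper = (1.0 / sqrt(2.0 * pi)) * (1.0 / s) * exp(-s * s / 2.0)
    assert lower <= gauss_tail(s) <= upper
    print(s, lower, gauss_tail(s), upper)
```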
Proof (Theorem 3.1). We apply Lemma 3.3 to the random variables

$$X_i = \Big(\sum_{j=1}^n a_{i,j}^2\, g_{i,j}^2\Big)^{1/2}, \qquad i = 1,\dots,n.$$

Now it is enough to show that $M_i \sim N_i$ for all $i = 1,\dots,n$. We have two cases. We consider first the case

$$s < \frac{1}{2\,\mathbb{E}\big(\sum_{j=1}^n a_{i,j}^2\, g_{i,j}^2\big)^{1/2}}.$$

The right-hand side inequality follows from (4). For the left-hand side inequality we can apply (11). Therefore,

By (10),

Thus the following holds: by (12) the first summand is of the order of the maximum over $j = 1,\dots,n$. We estimate the second summand. The second summand is less than or equal to