On the lower bound of the spectral norm of symmetric random matrices with independent entries

We show that the spectral radius of an $N\times N$ random symmetric matrix with i.i.d. bounded centered but non-symmetrically distributed entries is bounded from below by $2\sigma - o(N^{-6/11+\epsilon})$, where $\sigma^2$ is the variance of the matrix entries and $\epsilon$ is an arbitrarily small positive number. Combining with our previous result from [7], this proves that for any $\epsilon>0$ one has $$ \|A_N\| = 2\sigma + o(N^{-6/11+\epsilon}) $$ with probability going to 1 as $N \to \infty$.


Introduction
Wigner random matrices were introduced by E. Wigner about fifty years ago ([15], [16]) as a model to study the statistics of resonance levels for neutrons off heavy nuclei. Nowadays, there are many fruitful connections between Random Matrix Theory and Mathematical Physics, Probability Theory, Integrable Systems, Number Theory, Quantum Chaos, Theoretical Computer Science, Combinatorics, Statistics, and many other areas of science.
Let $A_N$ be a sequence of real symmetric Wigner random matrices with non-symmetrically distributed entries. In other words, $$ A_N = \frac{1}{\sqrt{N}} \big(a_{ij}\big)_{i,j=1}^{N}, $$ where the $a_{ij}$, $i \le j$, are i.i.d. random variables such that $$ \mathbb{E} a_{ij} = 0, \quad \mathbb{E} a_{ij}^2 = \sigma^2, \quad \mathbb{E} a_{ij}^3 = \mu_3, \quad |a_{ij}| \le C, \qquad (1) $$ where $C$ is some positive constant that does not depend on $N$. The common third moment $\mu_3$ is not necessarily zero, which allows us to study the case when the marginal distribution of the matrix entries is not symmetric. Let us denote by $\|A_N\|$ the spectral norm of the matrix $A_N$, $\|A_N\| = \max_{1\le i\le N} |\lambda_i|$, where $\lambda_1, \ldots, \lambda_N$ are the eigenvalues of $A_N$. Clearly, the eigenvalues of $A_N$ are real random variables. It was proved in [7] that for an arbitrarily small positive number $\varepsilon > 0$ the spectral norm of $A_N$ is bounded as $$ \|A_N\| \le 2\sigma + o(N^{-6/11+\varepsilon}) \qquad (2) $$ with probability going to 1. In this paper, we prove that $2\sigma + o(N^{-6/11+\varepsilon})$ is also a lower bound for $\|A_N\|$. The main result of the paper is the following
Theorem 1.1. Let $\|A_N\|$ denote the spectral norm of the matrix $A_N$ and $\varepsilon > 0$. Then $$ \|A_N\| \ge 2\sigma - o(N^{-6/11+\varepsilon}) \qquad (3) $$ with probability going to 1 as $N \to \infty$.
Combining the result of Theorem 1.1 with (2), we obtain
Theorem 1.2. Let $\|A_N\|$ denote the spectral norm of the matrix $A_N$ and $\varepsilon > 0$. Then $$ \|A_N\| = 2\sigma + o(N^{-6/11+\varepsilon}) \qquad (4) $$ with probability going to 1 as $N \to \infty$.
Remark 1.1. In fact, one does not need the assumption that the matrix entries are identically distributed as long as $\{a_{ij}, 1 \le i \le j \le N\}$ are independent, uniformly bounded, centered random variables with the same variance $\sigma^2$ off the diagonal. The proofs of the results of the present paper and of [7] still hold without any significant alterations since we only use the upper bounds $|\mathbb{E} a_{ij}^k| \le C^k$ on the third and higher moments, i.e. for $k \ge 3$, and not the exact values of these moments.
Remark 1.2. Similar results hold for Hermitian Wigner matrices as well. Since the proof is essentially the same, we will discuss only the real symmetric case in this paper.
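To make the hypotheses on the entries concrete, here is a minimal sketch that checks them for one hypothetical entry distribution. The two-point law (value 2 with probability 1/3, value -1 with probability 2/3), the constant C = 2, and the function name `moment` are illustrative assumptions and are not taken from the paper. The law is centered and bounded, yet its third moment is nonzero, so it is not symmetric.

```python
from fractions import Fraction

# Hypothetical entry law (an assumption for illustration only):
# a = 2 with probability 1/3, a = -1 with probability 2/3.
VALUES = {Fraction(2): Fraction(1, 3), Fraction(-1): Fraction(2, 3)}
C = 2  # uniform bound |a| <= C for this law


def moment(k):
    """Exact k-th moment E[a^k] of the two-point law above."""
    return sum(prob * value ** k for value, prob in VALUES.items())


mean = moment(1)          # centered: equals 0
variance = moment(2)      # sigma^2
third_moment = moment(3)  # mu_3, nonzero for a non-symmetric law
```

Exact rational arithmetic is used so that the centering condition can be checked without rounding error.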
We remark that $2\sigma$ is the right edge of the support of the Wigner semicircle law, and, therefore, it immediately follows from the classical result of Wigner ([15], [16], [2]) that for any $\delta > 0$, $$ P\big(\|A_N\| \ge 2\sigma - \delta\big) \to 1 \quad \text{as } N \to \infty. $$
A standard way to obtain an upper bound on the spectral norm is to study the asymptotics of $\mathbb{E}[\operatorname{Tr} A_N^{2s_N}]$ for integers $s_N$ proportional to $N^{\gamma}$, $\gamma > 0$. If one can show that $$ \mathbb{E}[\operatorname{Tr} A_N^{2s_N}] \le \mathrm{Const}_1\, N^{\gamma_1} (2\sigma)^{2s_N}, \qquad (5) $$ where $s_N = \mathrm{Const}\, N^{\gamma}(1 + o(1))$, and $\mathrm{Const}_1$ and $\gamma_1$ depend only on $\mathrm{Const}$ and $\gamma$, one can prove that $$ \|A_N\| \le 2\sigma + O(N^{-\gamma} \log N) \qquad (6) $$ with probability going to 1 by using the upper bound $\mathbb{E}[\|A_N\|^{2s_N}] \le \mathbb{E}[\operatorname{Tr} A_N^{2s_N}]$ and the Markov inequality. In particular, Füredi and Komlós in [3] were able to prove (6) for $\gamma \le 1/6$, and Vu [14] extended their result to $\gamma \le 1/4$. Both papers [3] and [14] treated the case when the matrix entries $\{a_{ij}\}$ are uniformly bounded. In [7], we were able to prove that $$ \mathbb{E}[\operatorname{Tr} A_N^{2s_N}] = \frac{N}{\pi^{1/2} s_N^{3/2}} (2\sigma)^{2s_N} (1 + o(1)) \qquad (7) $$ for $s_N = O(N^{6/11-\varepsilon})$ and any $\varepsilon > 0$, thus establishing (2). Again, we restricted our attention in [7] to the case of uniformly bounded entries. The proof relies on combinatorial arguments going back to [8], [9], and [10].
More is known if the matrix entries of a Wigner matrix have a symmetric distribution (so, in particular, the odd moments of the matrix entries vanish). In the case of a symmetric marginal distribution of the matrix entries, one can relax the condition that the $a_{ij}$ are uniformly bounded and assume that the marginal distribution is sub-Gaussian. It was shown by Tracy and Widom in [11] in the Gaussian (GOE) case that the largest eigenvalue deviates from the soft edge $2\sigma$ on the order $O(N^{-2/3})$ and the limiting distribution of the rescaled largest eigenvalue obeys the Tracy-Widom law ([11]): $$ \lim_{N \to \infty} P\big(\lambda_{\max} \le 2\sigma + \sigma x N^{-2/3}\big) = \exp\Big(-\frac{1}{2} \int_x^{\infty} \big[q(t) + (t - x) q^2(t)\big]\, dt\Big), $$ where $q(x)$ is the solution of the Painlevé II differential equation $q''(x) = x q(x) + 2q^3(x)$ with the asymptotics at infinity $q(x) \sim \mathrm{Ai}(x)$ as $x \to +\infty$. It was shown in [10] that this behavior is universal for Wigner matrices with sub-Gaussian and symmetrically distributed entries. Similar results hold in the Hermitian case (see [13], [10]).
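The Markov-inequality step used to turn a moment bound into an upper bound on the spectral norm can be spelled out as follows. This is only a sketch: $M_N$ stands for any available upper bound on $\mathbb{E}[\operatorname{Tr} A_N^{2s_N}]$ and $x > 0$ for a threshold, both being notation introduced here for illustration. Since the eigenvalues are real, $\|A_N\|^{2s_N} = \max_i \lambda_i^{2s_N} \le \sum_i \lambda_i^{2s_N} = \operatorname{Tr} A_N^{2s_N}$, and therefore
$$ P\big(\|A_N\| \ge x\big) = P\big(\|A_N\|^{2s_N} \ge x^{2s_N}\big) \le \frac{\mathbb{E}\big[\|A_N\|^{2s_N}\big]}{x^{2s_N}} \le \frac{\mathbb{E}\big[\operatorname{Tr} A_N^{2s_N}\big]}{x^{2s_N}} \le \frac{M_N}{x^{2s_N}}. $$
If $M_N$ is at most a fixed power of $N$ times $(2\sigma)^{2s_N}$, then taking $x = 2\sigma(1+t)$ gives a bound of the form $N^{\gamma_1}(1+t)^{-2s_N}$ up to a constant, which tends to zero as soon as $t\, s_N$ grows faster than $\log N$.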
It is reasonable to expect that in the nonsymmetric case, the largest eigenvalue will have the Tracy-Widom distribution in the limit as well.
The lower bounds on the spectral norm of a Wigner random matrix with non-symmetrically distributed entries were probably considered to be more difficult than the upper bounds. Let us again restrict our attention to the case when the matrix entries are uniformly bounded. It was claimed in [3] that the estimate of the type (5) for $\gamma \le 1/6$ immediately implies the lower bound […]. As noted by Van Vu in [14], "We do not see any way to materialize this idea." We concur with this opinion. In the next section, we show that (5) implies a rather weak estimate […] (8) for small $\delta > 0$ and sufficiently large $N$. Combining (8) with the concentration of measure inequalities for $\|A_N\|$ (see [4], [1]), one then obtains that for Wigner matrices with uniformly bounded entries […] (9), where $C$ is the same as in (1). The proof of Theorem 1.1 will be given in Section 3, where we establish the analogue of the law of large numbers for $\operatorname{Tr} A_N^{2s_N}$ with probability going to 1 as $N \to \infty$.

Preliminary lower bound
Without loss of generality, we can assume $\sigma = 1/2$. This conveniently sets the right edge of the Wigner semicircle law to be 1. Let us fix $0 < \delta < 6/11$, and denote $\Omega_N$ = […]. Choose $s_N$ to be an integer such that $s_N = N^{6/11-\varepsilon}(1 + o(1))$ and $2\delta/3 < \varepsilon < \delta$. Let us denote by $1_{\Omega}$ the indicator of the set $\Omega$ and by $\Omega^c$ the complement of $\Omega$. Then […], which is $o(1)$ as $N \to \infty$. Let us now partition $\Omega_N$ as the disjoint union $\Omega_N = \Omega^1 \sqcup$ […], where […]. As […], where in the last inequality we used (7) (for $\sigma = 1/2$) to get […]. Combining the above estimates and (7) (for $\sigma = 1/2$), we obtain for sufficiently large $N$ that […]. Therefore, […] (15).
It was shown by Alon, Krivelevich, and Vu ([1]), and Guionnet and Zeitouni ([4]) that for Wigner random matrices with bounded entries, the spectral norm is strongly concentrated around its mean. Indeed, the spectral norm is a 1-Lipschitz function of the matrix entries since […], where $HS$ denotes the Hilbert-Schmidt norm. Therefore, one can apply the concentration of measure results ([12], [6], [5]). In particular (see Theorem 1 in [1]), […] (16) uniformly in $N$ and $t$, where the constant $C$ is the same as in (1). Combining (15) and (16), we arrive at […] for sufficiently large $N$. The last inequality together with (16) then implies (9) (recall that we set $\sigma = 1/2$).
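For the reader's convenience, the standard chain of inequalities behind the Lipschitz property of the spectral norm is recalled here as a sketch. The symbols $A$ and $B$ denote two arbitrary real symmetric $N \times N$ matrices and are notation introduced only for this illustration:
$$ \big|\, \|A\| - \|B\| \,\big| \le \|A - B\| \le \|A - B\|_{HS} = \Big( \sum_{i,j=1}^{N} (A_{ij} - B_{ij})^2 \Big)^{1/2}. $$
The first inequality is the triangle inequality for the operator norm. The second holds because the largest squared eigenvalue of $A - B$ is at most the sum of all its squared eigenvalues, which equals $\operatorname{Tr}(A-B)^2$.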

Law of Large Numbers
The main technical result of this section is the following analogue of the Law of Large Numbers for $\operatorname{Tr} A_N^{2s_N}$.
Proposition 3.1 combined with (7) immediately implies that […] (19) with probability going to 1 as $N \to \infty$. To make (19) more precise, we can say that the ratio of the l.h.s. and the r.h.s. of (19) goes to 1 in probability as $N \to \infty$.
The main part of the proof of Proposition 3.1 is the following bound on the variance.
[…] where $\varepsilon$ is an arbitrarily small constant. Then there exists […] (20). The lemma is proven in the subsection below. Assuming Lemma 3.1, we obtain the proof of Proposition 3.1 via the Chebyshev inequality. Indeed, it follows from (20) and (7) that […].
To finish the proof of the main result of the paper, we fix an arbitrarily small positive constant $\delta > 0$ and choose another constant $\varepsilon$ in such a way that $0 < \varepsilon < \delta$. Setting $\sigma = 1/2$, we scale the eigenvalues in such a way that the right edge of the Wigner semicircle law is equal to 1. Let us denote, as before, […]. At the same time, Proposition 3.1 implies (see (19)) that $\operatorname{Tr} A_N^{2s_N} \ge N^{2/11}$ with probability going to 1. Therefore, […].
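The final step rests on the deterministic inequality $\operatorname{Tr} A_N^{2s_N} \le N \|A_N\|^{2s_N}$: if the norm were at most $1 - N^{-6/11+\delta}$, the trace could not exceed $N(1 - N^{-6/11+\delta})^{2s_N}$, which for $\varepsilon < \delta$ is far below $N^{2/11}$. The sketch below evaluates both quantities numerically. It assumes $s_N = N^{6/11-\varepsilon}$ exactly (ignoring the $1+o(1)$ factor and integrality); the function names and the sample parameter values are illustrative assumptions.

```python
def trace_cap_if_norm_small(n, delta, eps):
    """Upper bound N * (1 - N^(-6/11 + delta))^(2 s_N) on Tr A^(2 s_N),
    valid when the spectral norm is at most 1 - N^(-6/11 + delta).
    Assumes s_N = N^(6/11 - eps) (an idealization for illustration)."""
    s_n = n ** (6 / 11 - eps)
    norm_cap = 1 - n ** (-6 / 11 + delta)
    return n * norm_cap ** (2 * s_n)


def trace_floor(n):
    """Lower bound N^(2/11) on Tr A^(2 s_N) quoted in the text."""
    return n ** (2 / 11)


# Sample parameters (assumed): N = 10^6, delta = 0.3, eps = 0.1.
cap = trace_cap_if_norm_small(10 ** 6, 0.3, 0.1)
floor = trace_floor(10 ** 6)
```

For these sample values the cap is many orders of magnitude below the floor, so a small norm would contradict the trace lower bound.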

The proof of Lemma 3.1
We now turn our attention to the variance of the trace, which can be considered as follows.
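To express $\operatorname{Var} \operatorname{Tr} A_N^{2s_N}$ in terms of the matrix entries, we first write $\operatorname{Tr} A_N^{2s_N}$ as the sum of the products of matrix entries, namely we express $\operatorname{Tr} A_N^{2s_N}$ as the sum of the diagonal entries of the matrix $A_N^{2s_N}$. Therefore, $$ \operatorname{Tr} A_N^{2s_N} = \frac{1}{N^{s_N}} \sum_{i_0, i_1, \ldots, i_{2s_N-1} = 1}^{N} \ \prod_{k=0}^{2s_N-1} a_{i_k i_{k+1}}, $$ where we assume that $i_{2s_N} = i_0$. We can then rewrite $\mathbb{E} \operatorname{Tr} A_N^{2s_N}$ as the sum over the set of closed paths $P = \{i_0, i_1, \ldots, i_{2s_N-1}, i_0\}$ on the complete graph on the $N$ vertices $\{1, 2, \ldots, N\}$. In a similar fashion (again using the convention that $i_{2s_N} = i_0$ and $j_{2s_N} = j_0$), we can write $$ \operatorname{Var} \operatorname{Tr} A_N^{2s_N} = \frac{1}{N^{2s_N}} \sum_{P_1, P_2} \Big[ \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} \prod_{(j_k j_{k+1}) \in P_2} a_{i_k i_{k+1}} a_{j_k j_{k+1}} - \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} a_{i_k i_{k+1}} \, \mathbb{E} \prod_{(j_k j_{k+1}) \in P_2} a_{j_k j_{k+1}} \Big] $$ $$ = \frac{1}{N^{2s_N}} \sum^{\star}_{P_1, P_2} \Big[ \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} \prod_{(j_k j_{k+1}) \in P_2} a_{i_k i_{k+1}} a_{j_k j_{k+1}} - \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} a_{i_k i_{k+1}} \, \mathbb{E} \prod_{(j_k j_{k+1}) \in P_2} a_{j_k j_{k+1}} \Big], $$ where $P_1$ and $P_2$ are closed paths of length $2s_N$, $P_1 = \{i_0, i_1, \ldots, i_{2s_N-1}, i_0\}$ and $P_2 = \{j_0, j_1, \ldots, j_{2s_N-1}, j_0\}$. The starred summation symbol $\sum^{\star}_{P_1, P_2}$ in the last line of the previous array of equations means that the summation is restricted to the set of the pairs of closed paths $P_1$, $P_2$ of length $2s_N$ on the complete graph on $N$ vertices $\{1, 2, \ldots, N\}$ that satisfy the following two conditions: (i) $P_1$ and $P_2$ have at least one edge in common; (ii) each edge from the union of $P_1$ and $P_2$ appears at least twice in the union. Indeed, if $P_1$ and $P_2$ do not satisfy the conditions (i) and (ii), then the corresponding term in the expression for $\operatorname{Var} \operatorname{Tr} A_N^{2s_N}$ vanishes due to the independence of the matrix entries on and above the diagonal and the fact that the matrix entries have zero mean. Paths $P_1$, $P_2$ that satisfy (i) and (ii) are called correlated paths (see [8], [9]).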
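The path expansion of the trace and the definition of correlated pairs can be checked by brute force on a tiny example. The sketch below is only an illustration: it works with an unnormalized integer matrix (no $1/\sqrt{N}$ factor), uses 0-based vertex labels, stores a closed path as the list of its vertices with the closing step implied, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product


def trace_of_power(a, length):
    """Trace of a**length, computed by repeated matrix multiplication."""
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        power = [[sum(power[i][k] * a[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return sum(power[i][i] for i in range(n))


def trace_by_paths(a, length):
    """Same trace, as a sum over closed paths i_0 -> i_1 -> ... -> i_0."""
    n = len(a)
    total = 0
    for path in product(range(n), repeat=length):
        term = 1
        for k in range(length):
            term *= a[path[k]][path[(k + 1) % length]]
        total += term
    return total


def path_edges(path):
    """Unordered edges of a closed path given by its vertex list."""
    m = len(path)
    return [frozenset((path[k], path[(k + 1) % m])) for k in range(m)]


def is_correlated(p1, p2):
    """Conditions (i) and (ii): a common edge, and every edge of the
    union appearing at least twice in the union."""
    e1, e2 = path_edges(p1), path_edges(p2)
    shares_edge = bool(set(e1) & set(e2))
    counts = Counter(e1 + e2)
    return shares_edge and all(c >= 2 for c in counts.values())
```

Only pairs for which `is_correlated` returns `True` can contribute to the variance, which is what restricts the sum to the starred set.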
To estimate from above the contribution of the pairs of correlated paths, we construct for each such pair a new path of length $4s_N - 2$. Such a mapping from the set of the pairs of correlated paths of length $2s_N$ to the set of paths of length $4s_N - 2$ will not be injective. In general, a path of length $4s_N - 2$ might have many preimages. To construct the mapping, consider an ordered pair of correlated paths $P_1$ and $P_2$. Let us consider the first edge along $P_1$ which also belongs to $P_2$. We shall call such an edge the joint edge of the ordered pair of correlated paths $P_1$ and $P_2$.
We are now ready to construct the corresponding path of length $4s_N - 2$, which will be denoted by $P_1 \vee P_2$. We choose the starting point of $P_1 \vee P_2$ to coincide with the starting point of the path $P_1$. We begin walking along the first path until we reach the joint edge for the first time. At the left point of the joint edge we then switch to the second path. If the directions of the joint edge in $P_1$ and $P_2$ are opposite to each other, we walk along $P_2$ in the direction of $P_2$. If the directions of the joint edge in $P_1$ and $P_2$ coincide, we walk along $P_2$ in the direction opposite to that of $P_2$. In both cases, we make $2s_N - 1$ steps along $P_2$. In other words, we pass all $2s_N$ edges of $P_2$ except for the joint edge and arrive at the right point of the joint edge. There, we switch back to the first path and finish it. It follows from the construction that the new path $P_1 \vee P_2$ is closed since it starts and ends at the starting point of $P_1$. Moreover, the length of $P_1 \vee P_2$ is $4s_N - 2$ as we omit the joint edge twice during our construction of $P_1 \vee P_2$.
We now estimate the contribution of correlated pairs $P_1$, $P_2$ in terms of $P_1 \vee P_2$. Note that $P_1 \cup P_2$ and $P_1 \vee P_2$ have the same edges appearing the same number of times, save for one important exception. It follows from the construction of $P_1 \vee P_2$ that the number of appearances of the joint edge in $P_1 \cup P_2$ is bigger than the number of appearances of the joint edge in $P_1 \vee P_2$ by two (in particular, if the joint edge appears only once in both $P_1$ and $P_2$, it does not appear at all in $P_1 \vee P_2$). This observation will help us to determine the number of preimages $P_1$, $P_2$ of a given path $P_1 \vee P_2$ and relate the expectations associated to $P_1 \cup P_2$ and $P_1 \vee P_2$.
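The gluing construction just described can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: a closed path is stored as the list of its vertices with the closing step implied, and when the joint edge occurs several times in $P_2$ its first occurrence there is used (a choice the text does not specify). The names `edges`, `edge_counts` and `glue` are hypothetical.

```python
from collections import Counter


def edges(path):
    """Unordered edges of a closed path given by its vertex list."""
    n = len(path)
    return [frozenset((path[k], path[(k + 1) % n])) for k in range(n)]


def edge_counts(path):
    """Multiset of unordered edges of a closed path."""
    return Counter(edges(path))


def glue(p1, p2):
    """Build P1 v P2: follow P1 up to the joint edge, detour through all of
    P2 except the joint edge, then finish P1 after the joint edge."""
    n1, n2 = len(p1), len(p2)
    e2 = edges(p2)
    e2_set = set(e2)
    # Joint edge: the first edge along p1 that also belongs to p2.
    k = next((i for i, e in enumerate(edges(p1)) if e in e2_set), None)
    if k is None:
        raise ValueError("the paths have no edge in common")
    u, v = p1[k], p1[(k + 1) % n1]   # left and right points of the joint edge
    m = e2.index(frozenset((u, v)))  # assumed: first occurrence in p2
    if p2[m] == u:
        # Same direction u -> v in p2: read p2 backwards, starting from u.
        detour = [p2[(m - t) % n2] for t in range(n2)]
    else:
        # Opposite direction v -> u in p2: read p2 forwards, starting from u.
        detour = [p2[(m + 1 + t) % n2] for t in range(n2)]
    # The detour starts at u, makes n2 - 1 steps, and ends at v.
    walk = p1[:k + 1] + detour[1:] + p1[k + 2:]
    if k == n1 - 1:
        # The joint edge was the closing step of p1, so the walk already
        # returned to its start; drop the repeated starting vertex.
        walk = walk[:-1]
    return walk
```

For two paths of length $2s_N$ the result has $4s_N - 2$ vertices, and its edge multiset is that of $P_1 \cup P_2$ with two copies of the joint edge removed, in agreement with the counting in the text.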
Assume first that $P_1 \vee P_2$ is an even path. In this case, the arguments are identical to the ones used in [9] and [10]. For the convenience of the reader, we discuss below the key steps. To reconstruct $P_1$ and $P_2$ from $P_1 \vee P_2$, it is enough to determine three things: (i) the moment of time $t_s$ in $P_1 \vee P_2$ where one switches from $P_1$ to $P_2$, (ii) the direction in which $P_2$ is read, and (iii) the origin of $P_2$. The reader can note that the joint edge is uniquely determined by the instant $t_s$, since the two endpoints of the joint edge are respectively given by the vertices occurring in $P_1 \vee P_2$ at the moments $t_s$ and $t_s + 2s_N - 1$. It was proved in [9] (see Proposition 3) that the typical number of moments $t_s$ of possible switch is of the order of $\sqrt{s_N}$ (and not $s_N$). This follows from the fact that the random walk trajectory associated to $P_1 \vee P_2$ does not descend below the level $x(t_s)$ during a time interval of length at least $2s_N$. Given $t_s$, there are at most $2 \times 2s_N = 4s_N$ possible choices for the orientation and origin of $P_2$. From that, we deduce that the contribution of correlated pairs $P_1$, $P_2$ for which $P_1 \vee P_2$ is an even path is of the order of […], where the extra factor $1/N$ arises from the contribution of the erased joint edge. Clearly, this bound is negligible compared to the r.h.s. of (20).
We now consider the contribution of correlated paths $P_1$, $P_2$ such that $P_1 \vee P_2$ contains odd edges. To do so, we use the gluing procedure defined in [9]. Two cases can be encountered:
1. the joint edge of $P_1$ and $P_2$ appears in $P_1 \vee P_2$ exactly once (i.e. it appears in the union of $P_1$ and $P_2$ exactly three times).
2. all the odd edges of $P_1 \vee P_2$ are read at least three times.
In case 2, one can use the results established in [7] to estimate the contribution to $\mathbb{E}[\operatorname{Tr} A_N^{4s_N-2}]$ of paths $P_1 \vee P_2$ admitting odd edges, all of which are read at least 3 times. Therein it is proved, by using the same combinatorial machinery that proved (7), that […], where the starred sum is over the set of paths $P$ of length $4s_N - 2$ such that all odd edges (if any) are read in $P$ at least three times. We first note that the number of preimages of the path $P_1 \vee P_2$ under the described mapping is at most $8s_N^2$. Indeed, to reconstruct the pair $P_1$, $P_2$, we first note that there are at most $2s_N$ choices for the left vertex of the joint edge of $P_1$ and $P_2$ as we select it among the vertices of $P_1 \vee P_2$. Once the left vertex of the joint edge is chosen, we recover the right vertex of the joint edge automatically since all we have to do is to make $2s_N - 1$ steps along $P_1 \vee P_2$ to arrive at the right vertex of the joint edge. Once this is done, we completely recover $P_1$. To recover $P_2$, we have to choose the starting vertex of $P_2$ and its orientation. This can be done in at most $2s_N \times 2 = 4s_N$ ways. Thus, we end up with the upper bound $8s_N^2$ […] (25), where the sum is over the set of paths $P$ (i.e. $P = P_1 \vee P_2$) of length $4s_N - 2$ such that all odd edges are read in $P$ at least three times (i.e. $P$ does not contain edges that appear only once there). Using the results of [7], we can bound (25) from above by […].
Finally, we have to deal with case 1 (i.e. when the joint edge of $P_1$ and $P_2$ appears in $P_1 \vee P_2$ exactly once). Thus we need to be able to estimate the contribution […], where the two-starred sum is over the set of paths $P$ of length $4s_N - 2$ such that all odd edges but one are read in $P$ at least three times. For this, we need to modify the arguments in [7] to include the case when there is one single edge in the path. We refer the reader to the above paper for the notations we will use.
As we have already pointed out, in case 1 the path $P_1 \vee P_2$ has one single edge $(ij)$, which determines two vertices of the path $P_2$. This edge serves as the joint edge of $P_1$ and $P_2$. We recall from the construction of $P_1 \vee P_2$ that in this case, the joint edge appears three times in the union of $P_1$ and $P_2$. In other words, it either appears twice in $P_1$ and once in $P_2$, or it appears once in $P_1$ and twice in $P_2$. Without loss of generality, we can assume that the joint edge appears once in $P_1$ and twice in $P_2$. Let us recall that in order to construct $P_1 \vee P_2$, we first go along $P_1$, then switch to $P_2$ at the appropriate endpoint of the joint edge, then make $2s_N - 1$ steps along $P_2$, and, finally, switch back to $P_1$ at the other endpoint of the joint edge. Let the moment of the switch back to $P_1$ occur at time $t$ in $P_1 \vee P_2$. Call $P_3$ the path obtained from $P_1 \vee P_2$ by adding at time $t$ two successive occurrences of the (unordered) edge $(ij)$ in such a way that $P_3$ is still a path. Note that $P_3$ constructed in such a way is a path of length $4s_N$. Furthermore, it follows from the construction of $P_3$ and the definition of the joint edge that the last occurrence of $(ij)$ in $P_3$ is an odd edge and it necessarily starts a subsequence of odd edges (we refer the reader to the beginning of Section 2.1 in [7] for the full account of how we split the set of the odd edges into disjoint subsequences $S_i$, $i = 1, \ldots, J$, of odd edges). Assume that we are given a path $P_3$ with at least one edge read three times and where the last two occurrences of this edge take place in succession. The idea used in [7] is that a path $P_3$ with odd edges (seen at least 3 times) can be built from a path (or a succession of paths) with even edges by inserting at some moments the last occurrence of odd edges.
Given a path with even edges only, we first choose, as described in the Insertion Procedure in Sections 3 and 4 in [7], the set of edges that will be odd in $P_3$ and choose for each of them the moment of time where they are going to be inserted. To be more precise, we first recall that the set of odd edges can be viewed as a union of cycles. We then split these cycles into disjoint subsequences of odd edges to be inserted into the even path (or, in general, in the succession of paths). In [7], we used the (rough) estimate that there are at most $s_N$ possible choices for the moment of insertion of each subsequence of odd edges. The expectation corresponding to such a path can be examined as in [7], up to the following modification. One of the subsequences $S_k$ of odd edges described in Section 2.1 of [7] begins with the joint edge $(ij)$, and there are just two possible choices (instead of $s_N$) where one can insert that particular sequence of odd edges since the moment of the insertion must follow the moment of the appearance of $(ij)$. This follows from the fact that the edge $(ij)$ appears exactly three times in the path $P_3$, and the last two appearances are successive. As in [7], let us denote the number of the odd edges of $P_3$ by $2l$. Since $(ij)$ is an odd edge of the path $P_3$, there are at most $2l$ ways to choose the edge $(ij)$ from the odd edges of $P_3$. Once $(ij)$ is chosen, the number of the preimages $(P_1, P_2)$ is at most $4s_N$. Indeed, we need at most $2s_N$ choices to select the starting vertex of $P_2$ and at most two choices to select the orientation of $P_2$. Combining these remarks, we obtain that the computations preceding Subsection 4.1.2 of [7] yield that […] (26), where the sum in (26) is over the pairs of correlated paths for which case 1 takes place.
To apply the estimate (26), we have to obtain an upper bound on the typical number of odd edges in (26). Thus, we only need to slightly refine our estimates given in Subsection 4.1.2 of [7]. As the edge $(ij)$ appears three times in $P_3$ and only once in $P_1 \vee P_2$, the weight of $P_3$ is of the order $1/N$ of the weight of $P_1 \vee P_2$ (since each matrix entry is of the order of $N^{-1/2}$). Consider the path $P_1 \vee P_2$. Let $\nu_N$ be the maximal number of times a vertex occurs in the even path associated to $P_1 \vee P_2$. In particular, if we know one of the endpoints of an edge (say, the left one), the number of all possible choices for the other endpoint is bounded from above by $\nu_N$. Then the number of preimages of $P_1 \vee P_2$ is at most $\nu_N \times 4s_N$. Indeed, since $(ij)$ is the only single edge of $P_1 \vee P_2$ (i.e. the only edge appearing in $P_1 \vee P_2$ just once), there is no ambiguity in determining the joint edge $(ij)$ in $P_1 \vee P_2$. Then, there are at most $\nu_N$ choices to determine the place of the erased edge since we have to select one of the appearances of the vertex $i$ in $P_1 \vee P_2$, which can be done in at most $\nu_N$ ways. Finally, there are $2s_N$ choices for the starting vertex of $P_2$ and 2 choices for its orientation. As in [7], let us denote by $P'$ the even path obtained from $P_1 \vee P_2$ by the gluing procedure. The only modification from Subsection 4.1.2 of [7] is that the upper bound (39) on the number of ways to determine the cycles in the Insertion Procedure has to be multiplied by the factor $\nu_N^2 / s_N$. The reason for this modification is the following. In Section 4.1.2, we observed that the set of odd edges can be viewed as a union of cycles. In [7], these cycles repeat some edges of $P'$. We need to reconstruct the cycles in order to determine the set of odd edges. Note that to reconstruct a cycle we need to know only every other edge in the cycle.
For example, if we know the first and the third edges of the cycle, this uniquely determines the second edge of the cycle, as the left endpoint of the second edge coincides with the right endpoint of the first edge and the right endpoint of the second edge coincides with the left endpoint of the third edge, and so on. In [7], we used a trivial upper bound $2s_N$ on the number of ways to choose an edge in the cycle since each such edge appears in $P'$ and we have to choose it among the edges of $P'$. The difference with our situation is that one of the edges of $P_1 \vee P_2$, namely the joint edge $(ij)$, does not appear in $P'$. However, its endpoints $i$ and $j$ appear in $P'$ among its vertices. Therefore, we have at most $\nu_N^2$ choices for such an edge instead of the standard bound $2s_N$ that we used in [7]. Once the cycles are determined, one can split these cycles into disjoint sequences of odd edges to be inserted in $P'$. The total number of possible ways to insert these sequences is unchanged from Subsection 4.1.2 of [7]. These considerations immediately imply that the contribution to the variance from the correlated paths $P_1$, $P_2$ is at most of the order […] as long as $\nu_N < C s_N$ […] gives a negligible contribution, as it is extremely unlikely for any given vertex to appear many times in the path. We refer the reader to Section 4.1.2 of [7] where this case was analyzed.
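The remark that every other edge determines a cycle can be illustrated by a short sketch. It assumes, for illustration only, that the cycle has even length, that each edge is stored as an oriented pair (left endpoint, right endpoint), and that the known edges are those in positions 1, 3, 5, and so on; the function name `fill_cycle` is hypothetical.

```python
def fill_cycle(known_edges):
    """Rebuild a cycle of oriented edges from every other edge.

    known_edges lists the edges in odd positions (1st, 3rd, ...). Each
    missing edge runs from the right endpoint of the known edge before it
    to the left endpoint of the known edge after it (cyclically).
    """
    m = len(known_edges)
    cycle = []
    for j, (left, right) in enumerate(known_edges):
        next_left = known_edges[(j + 1) % m][0]
        cycle.append((left, right))       # known edge
        cycle.append((right, next_left))  # edge forced by its neighbours
    return cycle
```

For instance, from the first and third edges `(1, 2)` and `(3, 4)` of a four-edge cycle, the second edge must be `(2, 3)` and the fourth must be `(4, 1)`.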
Finally, one has to bound $$ \frac{1}{N^{2s_N}} \sum^{\star}_{P_1, P_2} \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} a_{i_k i_{k+1}} \ \mathbb{E} \prod_{(j_k j_{k+1}) \in P_2} a_{j_k j_{k+1}}, $$ where the sum is over the correlated pairs of paths. This can be done in the same way as we treated $$ \frac{1}{N^{2s_N}} \sum^{\star}_{P_1, P_2} \mathbb{E} \prod_{(i_k i_{k+1}) \in P_1} \prod_{(j_k j_{k+1}) \in P_2} a_{i_k i_{k+1}} a_{j_k j_{k+1}} $$ above. This finishes the proof of the lemma and gives us the proof of the main result.