Local semicircle law with imprimitive variance matrix

We extend the proof of the local semicircle law for generalized Wigner matrices given in [4] to the case when the matrix of variances has an eigenvalue $ -1 $. In particular, this result provides a short proof of the optimal local Marchenko-Pastur law at the hard edge (i.e. around zero) for sample covariance matrices $ \boldsymbol{\mathrm{X}}^\ast \boldsymbol{\mathrm{X}} $, where the variances of the entries of $ \boldsymbol{\mathrm{X}} $ may vary.


Model and results
The local semicircle law on the local distribution of eigenvalues of large Wigner matrices has been the basic technical input in the recent works on the Wigner-Dyson-Gaudin-Mehta universality (see [6] and references therein). The analysis was extended to generalized Wigner matrices [7,4], but always practically assuming that the matrix of variances is primitive, so that in particular $-1$ is not in its spectrum. This assumption naturally holds for random band matrices, which were the main motivation to generalize Wigner matrices in [7]. However, some important matrices with a certain block structure do not satisfy this condition. The most notable example is the $2N \times 2N$ matrix
$$
H \;=\; \begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix},
\qquad (1.1)
$$
where the $N \times N$ matrix $X$ has independent entries. The matrix $H$ is the linearization of the sample covariance matrix $X^* X$. In this paper we show how to remove the primitivity assumption in [4]. We consider a generalized $N \times N$ Hermitian or symmetric Wigner matrix $H = (h_{ij})_{i,j=1}^N$ with independent entries (up to the symmetry constraint $H = H^*$) such that
$$
\mathbb{E}\, h_{ij} \;=\; 0, \qquad s_{ij} \;:=\; \mathbb{E}\, |h_{ij}|^2 \;<\; \infty .
\qquad (1.2)
$$
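The spectral symmetry of the linearization can be checked numerically. The following numpy sketch (our own illustration, not part of the paper; all variable names are ours) builds $H$ from a random square $X$ and verifies that the eigenvalues of $H$ come in pairs $\pm\lambda$ whose squares are the eigenvalues of $X^* X$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50

# A random N x N matrix X with independent centred entries of variance 1/N
# (a toy instance; the paper allows non-constant variances a_ij).
X = rng.standard_normal((N, N)) / np.sqrt(N)

# Linearization H = [[0, X], [X^*, 0]] of the sample covariance matrix X^* X.
H = np.block([[np.zeros((N, N)), X], [X.T, np.zeros((N, N))]])

evals = np.sort(np.linalg.eigvalsh(H))

# The spectrum of H is symmetric: eigenvalues come in pairs +/- lambda ...
assert np.allclose(evals, -evals[::-1])

# ... and their squares are the eigenvalues of X^* X (each seen twice).
sq = np.sort(np.linalg.eigvalsh(X.T @ X))
assert np.allclose(np.sort(evals**2)[::2], sq)
```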
We assume that all moments are bounded, in the sense that
$$
\mathbb{E}\, |h_{ij}|^p \;\le\; C_p\, s_{ij}^{\,p/2},
$$
with constants $C_p$ independent of $N$. In order to avoid unnecessary clutter we have suppressed the $N$-dependence in the notation; e.g., we use $H$ and $S$ to refer to the sequences of matrices $H^{(N)} = (h_{ij}^{(N)})$ and $S^{(N)} = (s_{ij}^{(N)})$, respectively. Besides the natural constraints $S^T = S$ and $s_{ij} \ge 0$, we impose additional assumptions, denoted (A1)-(A3) below, on the variance matrix; in particular, $S$ is doubly stochastic and its spectrum has a gap below $+1$. This setup is similar to that in [4], except that here we explicitly allow $-1$ in the spectrum of $S$. This also allows us to consider $S$ which contain imprimitive irreducible components. The results in [4] practically excluded this case, since the estimates there became unstable; see Section 7 of [4]. The main observation of this paper is that this instability is not actually present. The relaxation of the irreducibility condition is elementary algebra (cf. Lemma 2.1 below), and this extension was already mentioned in [4]. However, the inclusion of $-1$ in the spectrum of $S$ requires a new algebraic identity, which is stated as Lemma 2.2 below. We will show how to incorporate this identity into the proof given in [4] with only minor modifications.
The condition (A2) guarantees that the diagonal elements of the resolvent matrix $G(z) := (H - z)^{-1}$ converge, as $N$ approaches infinity, to the Stieltjes transform $m(z)$ of the Wigner semicircle law, whose density is $\varrho(x) = \frac{1}{2\pi}\sqrt{\max\{4 - x^2,\, 0\}}$. In order to state this main result, we recall the concept of stochastic domination (Definition 2.1 in [4]). We say that a sequence of random variables $X = X^{(N)}$ is stochastically dominated by another sequence of random variables $Y = Y^{(N)}$, in notation $X \prec Y$, if for any $\varepsilon, D > 0$ there exists $N_0 = N_0(\varepsilon, D) < \infty$ such that
$$
\mathbb{P}\bigl( X > N^{\varepsilon}\, Y \bigr) \;\le\; N^{-D}, \qquad N \ge N_0 .
$$
If $X$ and $Y$ depend on some other parameters (such as $z$, or labels like $i, j$), then the definition is always understood uniformly in these parameters (i.e., $N_0$ depends only on $\varepsilon, D > 0$). The notation $X = O_\prec(Y)$ means the same as $|X| \prec Y$. Then for any fixed $\gamma > 0$, the local estimates hold uniformly in $z = E + i\eta \in \mathbf{D}(\gamma)$. Moreover, outside the spectrum the bound (1.11b) can be strengthened by introducing the distance $\kappa := \min\{\, |E - 2|,\, |E + 2|\, \}$ of $E$ from the spectral edges. This strengthened estimate is also uniform in $z = E + i\eta \in \mathbf{D}(\gamma)$ as long as the constraints $|E| \ge 2$ and $\eta\sqrt{\kappa + \eta} \ge M^{-1+\gamma}$ are satisfied.
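As an informal numerical illustration of the kind of estimate the local law provides (our own sketch, not part of the proof), one can compare the averaged resolvent trace of a plain Wigner matrix with the semicircle Stieltjes transform at a spectral scale only slightly above $1/N$; the scale $20/N$ and the tolerance below are ad hoc choices of ours.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000

# A GOE-like Wigner matrix: independent centred entries with E|h_ij|^2 = 1/N,
# so the variance matrix S is doubly stochastic (row sums equal 1).
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)

def m_sc(z):
    # Stieltjes transform of the semicircle density rho(x) = sqrt(4-x^2)/(2*pi);
    # the branch sqrt(z-2)*sqrt(z+2) keeps Im m(z) > 0 for Im z > 0.
    return (-z + np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

# Local scale: eta = 20/N, slightly above the smallest scale 1/N.
z = 0.3 + 1j * (20 / N)
G = np.linalg.inv(H - z * np.eye(N))
m_N = np.trace(G) / N

# The local law predicts |m_N - m_sc| = O_<(1/(N*eta)), here of order 1/20.
assert abs(m_N - m_sc(z)) < 0.2
```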
This theorem is a generalization of the following result.
In [4] the spectral gaps of $S$ were allowed to close at certain rates as $N \to \infty$. However, the estimates near $E = 0$ in [4] deteriorated as the smallest eigenvalue of $S$ approached $-1$, and in particular $-1$ was not allowed to belong to the spectrum. The condition (A3) rules out the closing of the upper gap, and hence the spectral domain $\mathbf{S}(\gamma)$, defined by formulas (2.14) and (2.17) in [4], has been replaced here by the simpler set $\mathbf{D}(\gamma)$. It is straightforward to extend Theorem 1.1 to the entire set $\mathbf{S}(\gamma)$ in the spirit of Theorem 2.3 in [4], but for brevity of this note we refrain from doing so. Theorem 1.1 directly implies a rigidity result for the increasingly ordered eigenvalues $(\lambda_\alpha)_{\alpha=1}^N$ of $H$ in terms of the $N$-quantiles $(\gamma_\alpha)_{\alpha=1}^N$ of the semicircle density:
$$
|\lambda_\alpha - \gamma_\alpha| \;\prec\; N^{-2/3} \bigl( \min\{ \alpha,\, N + 1 - \alpha \} \bigr)^{-1/3},
\qquad (1.13)
$$
where the relation $\prec$ absorbs factors $N^{\varepsilon}$ with $\varepsilon > 0$ arbitrary. See Theorem 7.6 in [4] for a proof in a more general setup and for estimates on the extreme eigenvalues.
We remark that there have been many results on local semicircle laws prior to [4], in fact most methods used in [4] stem from [7,8,9]. See [4] for a complete account of the history and for the most concise general proof.
Finally, we mention a simple application. The eigenvalues of $H$ in (1.1) generically come in pairs $\pm\lambda$ (see (2.7) below), and their squares $\lambda^2$ are the eigenvalues of the sample covariance matrices $XX^*$ and $X^*X$. We assume that the elements of the square matrix $X$ are independent and centred, and that their variances are chosen such that $S$ and $H$ satisfy (1.3)-(1.6) (note that $-1$ is then an eigenvalue of $S$). Under these conditions, Theorem 1.1 can be directly used to estimate the resolvent matrix elements and the trace of the resolvent of $X^*X$. Indeed, applying the Schur formula to the $N \times N$ block decomposition of $G(z) = (H - z)^{-1}$, we see that the diagonal blocks equal
$$
z\,\bigl( XX^* - z^2 \bigr)^{-1} \quad \text{and} \quad z\,\bigl( X^*X - z^2 \bigr)^{-1},
$$
so that $G(z)$ encodes the resolvent of $X^*X$ at the spectral parameter $w = z^2$. Thus Theorem 1.1 implies that the local Marchenko-Pastur law holds in the critical "hard-edge" case, when the limiting density is singular at the origin. Under the conditions on $X$ above, the corresponding estimates hold for any fixed $\gamma > 0$, uniformly in $w \in \mathbb{C}$ satisfying $|w| \le 100$ and $\operatorname{Im} w \ge |\operatorname{Re} w|\, M^{-1+\gamma}$.
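The Schur complement identity for the diagonal blocks is elementary and can be verified directly; the sketch below (ours, with arbitrary test values) checks both block formulas against a full numerical inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 30
X = rng.standard_normal((N, N)) / np.sqrt(N)
H = np.block([[np.zeros((N, N)), X], [X.T, np.zeros((N, N))]])

z = 0.1 + 0.5j
G = np.linalg.inv(H - z * np.eye(2 * N))

# Schur complement formula for the N x N diagonal blocks of G(z) = (H-z)^{-1}:
#   upper block = z (X X^* - z^2)^{-1},  lower block = z (X^* X - z^2)^{-1},
# so G(z) encodes the resolvent of X^* X at the spectral parameter w = z^2.
upper = z * np.linalg.inv(X @ X.T - z**2 * np.eye(N))
lower = z * np.linalg.inv(X.T @ X - z**2 * np.eye(N))

assert np.allclose(G[:N, :N], upper)
assert np.allclose(G[N:, N:], lower)
```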
The estimate (1.12) outside of the spectrum and the rigidity bound (1.13) can also be directly translated into similar statements for the sample covariance matrices.
We remark that the local Marchenko-Pastur law on the smallest local scale was first proven in [5] away from the critical case. The hard-edge case was considered independently in [3] and in [2], the latter providing an optimal error bound. Both works dealt with the case when the variances $\mathbb{E}|x_{ij}|^2$ are constant; the above corollary extends the result to the case of non-constant variances.

Two algebraic lemmas
Let us define, for arbitrary square matrices $M_1, \dots, M_k$, the diagonal block matrix
$$
\operatorname{Diag}(M_1, \dots, M_k) \;:=\; \begin{pmatrix} M_1 & & \\ & \ddots & \\ & & M_k \end{pmatrix}.
$$
General algebraic results for non-negative matrices yield the following decomposition when applied to $S$ satisfying the assumptions of Theorem 1.1.
Lemma 2.1. Suppose that $S$ is a symmetric doubly stochastic matrix that satisfies conditions (A1)-(A3). Then, after an appropriate permutation $P$ of the indices, $S$ has the block structure

$$
P S P^T \;=\; \operatorname{Diag}\bigl( S_1, \dots, S_p,\, \widetilde S_1, \dots, \widetilde S_q \bigr),
\qquad (2.2)
$$
where $S_\alpha$, $1 \le \alpha \le p$, and $\widetilde S_\beta$, $1 \le \beta \le q$, are irreducible doubly stochastic matrices, for some $p, q$. The spectra of the blocks $\widetilde S_\beta$ do not contain $-1$. The blocks $S_\alpha$ have both $+1$ and $-1$ as simple eigenvalues, and they have the structure
$$
S_\alpha \;=\; \begin{pmatrix} 0 & A_\alpha \\ A_\alpha^T & 0 \end{pmatrix}.
$$
Here $H_\alpha$ and $\widetilde H_\beta$ are independent generalized Wigner matrices satisfying $\mathbb{E}|h_{\alpha;ij}|^2 = s_{\alpha;ij}$ and $\mathbb{E}|\widetilde h_{\beta;ij}|^2 = \widetilde s_{\beta;ij}$, respectively. This decomposition means that it suffices to prove Theorem 1.1 for the irreducible components separately. The components $\widetilde H_\beta$ are already covered by Theorem 1.2. Hence, dropping the indices $\alpha \ge 1$, we are left to prove Theorem 1.1 in the case where $S$ is irreducible and
$$
H \;=\; \begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix}, \qquad
S \;=\; \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix},
\qquad (2.6)
$$
where the entries $x_{ij}$ of the square matrix $X$ are independent and satisfy $\mathbb{E}|x_{ij}|^2 = a_{ij}$. For the sake of convenience, we also redefine $N$ to equal the dimension of $A$, so that $H$ and $S$ are $2N \times 2N$ matrices. Using the special structure (2.6), it follows that if $\lambda \in \operatorname{Spec}(H)$ then also $-\lambda \in \operatorname{Spec}(H)$, and the corresponding eigenvectors are related in the following simple way:
$$
H \begin{pmatrix} u \\ w \end{pmatrix} = \lambda \begin{pmatrix} u \\ w \end{pmatrix}
\quad \Longleftrightarrow \quad
H \begin{pmatrix} u \\ -w \end{pmatrix} = -\lambda \begin{pmatrix} u \\ -w \end{pmatrix}.
\qquad (2.7)
$$
In particular, the same reasoning can be applied to the $\pm 1$ eigenvalues of $S$. Moreover, since $S$ is doubly stochastic and irreducible, the eigenvectors of $S$ belonging to the nondegenerate eigenvalues $\pm 1$ equal
$$
\mathbf{e} \;:=\; \frac{1}{\sqrt{2N}} \begin{pmatrix} \mathbf{1} \\ \mathbf{1} \end{pmatrix},
\qquad
\mathbf{f} \;:=\; \frac{1}{\sqrt{2N}} \begin{pmatrix} \mathbf{1} \\ -\mathbf{1} \end{pmatrix},
\qquad (2.8)
$$
where $\mathbf{1} := (1, \dots, 1)^T \in \mathbb{R}^N$. The new algebraic input is the following identity.

Lemma 2.2. Let $H$ be of the form (2.6) and set $G(z) := (H - z)^{-1}$ for $z \in \mathbb{C} \setminus \mathbb{R}$. Then
$$
\bigl( \mathbf{f},\, \operatorname{diag}(G(z)) \bigr) \;=\; 0,
\qquad (2.9)
$$
where $\operatorname{diag}(G) = (G_{11}, \dots, G_{2N,2N})$ and $\mathbf{f}$ is defined in (2.8).
Proof. Let us additionally assume that $0 \notin \operatorname{Spec}(H)$ and that, besides the pairs (2.7), there are no further degeneracies. Suppose $v := \binom{u}{w}$ and $\widetilde v := \binom{u}{-w}$ are the eigenvectors corresponding to the eigenvalues $\lambda$ and $-\lambda$ in (2.7), respectively. Since $\lambda \ne 0$ (so that $-\lambda \ne \lambda$), these eigenvectors are orthogonal, i.e., the first and the second blocks are balanced: $\|u\|^2 = \|w\|^2$. Now, let $v^{(\alpha)}$, $\alpha = 1, \dots, 2N$, be the $2N$ normalized eigenvectors of $H$, and let $u^{(\alpha)}$ and $w^{(\alpha)}$ contain the first and the last $N$ components of $v^{(\alpha)}$. Combining the spectral theorem with the balancing conditions $\|u^{(\alpha)}\|^2 = \|w^{(\alpha)}\|^2 = 1/2$ yields
$$
\sum_{i=1}^{N} G_{ii}(z)
\;=\; \sum_{\alpha} \frac{\|u^{(\alpha)}\|^2}{\lambda_\alpha - z}
\;=\; \sum_{\alpha} \frac{\|w^{(\alpha)}\|^2}{\lambda_\alpha - z}
\;=\; \sum_{i=N+1}^{2N} G_{ii}(z),
\qquad (2.10)
$$
which is equivalent to (2.9). Finally, a basic continuity argument shows that (2.10) remains valid without the extra assumptions concerning the degeneracies and the exclusion of $0$ from the spectrum of $H$.
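The identity of Lemma 2.2 is easy to test numerically. The following sketch (our own check, with arbitrary spectral parameters) verifies that $(\mathbf{f}, \operatorname{diag}(G(z)))$ vanishes for a random $H$ of the form (2.6).

```python
import numpy as np

rng = np.random.default_rng(3)
N = 40
X = rng.standard_normal((N, N)) / np.sqrt(N)
H = np.block([[np.zeros((N, N)), X], [X.T, np.zeros((N, N))]])

# The eigenvector f of S for the eigenvalue -1: +1 on the first block,
# -1 on the second, normalized.
f = np.concatenate([np.ones(N), -np.ones(N)]) / np.sqrt(2 * N)

for z in [0.3 + 0.05j, -1.2 + 0.01j, 2.5 + 1.0j]:
    G = np.linalg.inv(H - z * np.eye(2 * N))
    # Lemma 2.2: (f, diag G(z)) = 0, i.e. the two halves of the diagonal
    # resolvent entries have equal sums for every z.
    assert abs(f @ np.diag(G)) < 1e-10
```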
Translating the proof of Theorem 1.2 to cover case (2.6)
With the identity (2.9) at hand, we may translate the proof of Theorem 1.2 from [4] to the setting (2.6) without significant changes. To see this, recall that the $-1$ eigenvalue of $S$ enters the proofs in [4] only when one needs to bound the inverse of the operator $1 - m^2 S$. However, using the identity (2.9), one can show that in all such cases it suffices to restrict the analysis to the orthogonal complement of the eigendirection $\mathbf{f}$ corresponding to the eigenvalue $-1$. The lower spectral gap assumption (1.6) above $-1$ then guarantees that $(1 - m^2 S)^{-1}$ stays uniformly bounded on this subspace even when $m(z)^2$ becomes close to $-1$ (equivalently, $z \approx 0$). Since the 'bad direction' $\mathbf{f}$ plays no role in the analysis, the only necessary modification of [4] is to replace the operator norm $\Gamma(z)$ (cf. equation (2.11) in [4]) of $(1 - m^2 S)^{-1}$ restricted to $\mathbf{e}^\perp$ by the analogous norm $\widehat\Gamma(z)$ of $(1 - m^2 S)^{-1}$ restricted to the orthogonal complement of both $\mathbf{e}$ and $\mathbf{f}$. The estimate (A.3) of [4], $\widehat\Gamma(z) \le C(\rho) \log N$, remains valid, since the operator norm of $(1 - m^2 S)^{-1}$ from $\ell^2$ to itself is bounded by $1/(1 - \rho)$ on the complement of $\operatorname{span}\{\mathbf{e}, \mathbf{f}\}$. Here the logarithm comes from the fact that the relevant $\ell^\infty$-norm may exceed the $\ell^2$-norm by this factor (cf. p. 46 of [4]).
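The mechanism behind the bound on $\widehat\Gamma$ can be illustrated on a toy variance matrix. The sketch below (entirely ours; the matrix $A$, the gap parameter $t$, and the projection $P$ are illustrative choices) takes $m^2 = -1$, the worst case $z \approx 0$, where $1 - m^2 S = 1 + S$ is singular along $\mathbf{f}$, and checks that its inverse on $\operatorname{span}\{\mathbf{e}, \mathbf{f}\}^\perp$ is bounded by the reciprocal of the lower spectral gap.

```python
import numpy as np

N = 8
t = 0.5

# A doubly stochastic, symmetric A with spectrum {1, t}; then
# S = [[0, A], [A, 0]] is doubly stochastic with simple eigenvalues +1, -1
# and all other eigenvalues equal to +/- t (lower gap 1 - t above -1).
A = (1 - t) * np.full((N, N), 1 / N) + t * np.eye(N)
S = np.block([[np.zeros((N, N)), A], [A, np.zeros((N, N))]])

e = np.ones(2 * N) / np.sqrt(2 * N)                             # eigenvector for +1
f = np.concatenate([np.ones(N), -np.ones(N)]) / np.sqrt(2 * N)  # eigenvector for -1

# At z ~ 0 we have m(z)^2 ~ -1, so 1 - m^2 S ~ 1 + S: singular in the
# direction f, but invertible on span{e, f}^perp with norm <= 1/(1 - t).
m2 = -1.0
B = np.eye(2 * N) - m2 * S

P = np.eye(2 * N) - np.outer(e, e) - np.outer(f, f)  # projection onto {e,f}^perp
restricted = np.linalg.pinv(P @ B @ P)               # inverse on that subspace

assert abs(f @ B @ f) < 1e-12                        # B is singular along f
assert np.linalg.norm(restricted, 2) <= 1 / (1 - t) + 1e-9
```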
It remains to demonstrate why the inversion of $1 - m^2 S$ can always be restricted to the orthogonal complement of $\mathbf{f}$. This inversion was used to bound the random fluctuations $v_i := G_{ii} - m$ of the diagonal resolvent elements in terms of the small random error terms $\Upsilon_i = O_\prec(N^{-c})$ appearing in the self-consistent vector equation (cf. (5.9) in [4]). Under the assumption $|v_i| \prec \Lambda \prec N^{-c}$, with some control parameter $\Lambda$, and using $|m| \sim 1$, the self-consistent equation (3.3) takes the form (3.4). Recalling $(\mathbf{f}, \mathbf{e}) = 0$ and applying Lemma 2.2 yields
$$
(\mathbf{f}, v) \;=\; \bigl( \mathbf{f},\, \operatorname{diag}(G) \bigr) - m \sqrt{2N}\, (\mathbf{f}, \mathbf{e}) \;=\; 0.
\qquad (3.5)
$$
The identity (3.5) shows that the inversion of $1 - m^2 S$ can indeed be restricted to the complement of $\mathbf{f}$ in the case of (3.4). The inverse of $1 - m^2 S$ becomes unbounded also in the direction $\mathbf{e}$ when $m^2 \approx 1$. However, unlike with the direction $\mathbf{f}$, the inversion of $1 - m^2 S$ cannot be straightforwardly restricted to the complement of $\mathbf{e}$, since the average $[v]$ of $v$ is not small. For this reason the critical part $[v]$ was treated separately from the remainder $v - (\mathbf{e}, v)\, \mathbf{e} \in \operatorname{span}\{\mathbf{e}, \mathbf{f}\}^\perp$ in a more precise second order scalar equation in [4]. The remainder part satisfies a linearized vector equation, for which one again needs to invert $1 - m^2 S$. We now demonstrate that also in this case the component $\mathbf{f}$ is not present, due to Lemma 2.2. Indeed, in order to get from (6.19) to (6.20) in [4], one applies the fluctuation averaging estimate (4.14) (with the choice $t_{ij} = s_{ij}$) to bound the remainder $v - (\mathbf{e}, v)\, \mathbf{e}$. The crucial steps appear in the proof of (4.14), located at the end of the proof of Theorem 4.7 on p. 54 of [4], where a bound for the averaged fluctuations (3.8) is needed. Expressing (3.8) in vector form, we see that $1 - m^2 S$ can again be inverted in the subspace orthogonal to both the $+1$ and $-1$ eigendirections. Hence (3.10) yields exactly the fluctuation averaging bound (4.14) of [4] with $\Gamma$ replaced by $\widehat\Gamma$. Besides these observations and the replacement of $\Gamma$ by $\widehat\Gamma$, the proof from [4] can be carried out without further modifications.