Fluctuations of Functions of Wigner Matrices

We show that matrix elements of functions of $N\times N$ Wigner matrices fluctuate on a scale of order $N^{-1/2}$ and we identify the limiting fluctuation. Our result holds for any function $f$ of the matrix that has bounded variation and thus considerably relaxes the regularity requirement imposed in [7,11].

Introduction

The density of states of an $N \times N$ Wigner random matrix $H = H^{(N)}$ converges to the Wigner semicircular law [ ]. More precisely, for any continuous function $f$,
$$\frac{1}{N} \operatorname{Tr} f(H) = \frac{1}{N} \sum_{k=1}^{N} f(\lambda_k) \to \int f(x)\, \mu_{sc}(dx),$$
where $\lambda_1, \dots, \lambda_N$ are the (real) eigenvalues of $H$ and $\mu_{sc}(dx) := \frac{1}{2\pi} \sqrt{(4 - x^2)_+}\, dx$. It is well known that for regular functions $f$, the normalized linear eigenvalue statistics $\frac{1}{N} \operatorname{Tr} f(H)$ have an asymptotically Gaussian fluctuation on a scale of order $1/N$, see, for example, [ , , , , , , ] for different results in this direction, also for other random matrix ensembles. To our knowledge, this result under the weakest regularity condition on $f$ was proved in [ ]; for general Wigner matrices $f \in H^{1+\epsilon}$ was required, while for Wigner matrices with substantial GUE component $f \in H^{1/2+\epsilon}$ was sufficient. Notice that the order of the fluctuation $1/N$ is much smaller than $1/\sqrt{N}$, which would be predicted by the standard central limit theorem (CLT) if the eigenvalues were weakly dependent. The failure of the CLT on scale $1/\sqrt{N}$ is a signature of the strong correlations among the eigenvalues.

In this paper we investigate the individual matrix elements of $f(H)$. We will show that the semicircle law ( . ) holds also for any diagonal matrix element $f(H)_{ii}$ and not only for their average, $\frac{1}{N} \operatorname{Tr} f(H)$; however, the corresponding fluctuation is much larger: it is on scale $1/\sqrt{N}$. Moreover, the limiting distribution of the rescaled fluctuation is not necessarily Gaussian; it also depends on the distribution of the matrix element $h_{ii}$. Similar fluctuation results hold for the off-diagonal matrix elements $f(H)_{ij}$, $i \neq j$. As a regularity condition, we merely assume that $f$ is of bounded variation, $f \in BV$. We also prove an effective error bound of order $N^{-2/3}$ that we can improve to $N^{-1}$ if $f' \in L^\infty$, i.e. we provide a two-term expansion for each matrix element of $f(H)$.
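The two fluctuation scales discussed above can be observed numerically. The following is a minimal sketch, not part of the analysis: it assumes a real symmetric Gaussian Wigner matrix and the test function $f(x) = |x|$ (of bounded variation on compact intervals, with bounded derivative). The helper names `wigner` and `fluctuation_scales` are hypothetical and chosen for illustration only.

```python
import numpy as np

def wigner(N, rng):
    """Real symmetric Gaussian matrix with E h_ij = 0 and E h_ij^2 = 1/N off the diagonal
    (an assumed concrete instance of a Wigner matrix)."""
    A = rng.standard_normal((N, N))
    return (A + A.T) / np.sqrt(2 * N)

def fluctuation_scales(N, samples=200, seed=0, f=np.abs):
    """Empirical standard deviations of f(H)_11 and of (1/N) Tr f(H)."""
    rng = np.random.default_rng(seed)
    diag, trace = [], []
    for _ in range(samples):
        lam, U = np.linalg.eigh(wigner(N, rng))
        fl = f(lam)
        # f(H)_11 = sum_k f(lambda_k) |u_k(1)|^2, with u_k the k-th eigenvector
        diag.append(np.sum(fl * U[0, :] ** 2))
        # (1/N) Tr f(H) = (1/N) sum_k f(lambda_k)
        trace.append(np.mean(fl))
    return np.std(diag), np.std(trace)

if __name__ == "__main__":
    for N in (50, 100, 200):
        sd, st = fluctuation_scales(N)
        # If the scales are 1/sqrt(N) and 1/N, both rescaled columns stay of order one.
        print(N, sd * np.sqrt(N), st * N)
```

In such an experiment one would expect the rescaled quantities in the last two columns to remain roughly constant as $N$ grows, in line with the scales $1/\sqrt{N}$ and $1/N$.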
Similar results (with less precise error bounds) were obtained previously in [ ] for Gaussian random matrices and in [ , , ] for general Wigner matrices under much stronger regularity assumptions expressed through the Fourier transform $\hat f(\xi) := \int_{\mathbb{R}} e^{-i\xi x} f(x)\, dx$. The main novelty of the current work is thus to relax these regularity conditions to $f \in BV$. In addition, [ , , ] assumed that in the case of complex Hermitian matrices, the real and imaginary parts of the entries have equal variance. Our approach does not require this technical assumption. We also refer to [ ] where similar questions have been studied for more general statistics of the form $\operatorname{Tr}[f(H)A]$ for non-random matrices $A$ under the fairly strong regularity condition $\int (1 + |\xi|)^4 |\hat f(\xi)|\, d\xi < \infty$.
A special case of these questions is when the test function $f(x)$ is given by $\varphi_z(x) = (x - z)^{-1}$ for some complex parameter $z$ in the upper half plane, $\eta := \Im z > 0$. In fact, for $f$ which are analytic in a complex neighborhood of $[-2, 2]$, a simple contour integration shows that for the linear statistics it is sufficient to understand the resolvent of $H$, i.e., $\varphi_z(H) = (H - z)^{-1}$ for any fixed $z$ in the upper half plane. If $f$ is less regular, one may still express $f(H)$ as an integral of the resolvents over $z$, weighted by the $\partial_{\bar z}$-derivative of an almost analytic extension of $f$ to the upper half plane (Helffer-Sjöstrand formula). In this case, the integration effectively involves the regime of $z$ close to the real axis, so the resolvent $(H - z)^{-1}$ and its matrix elements need to be controlled even as $\eta \to 0$ simultaneously with $N \to \infty$. These results are commonly called local semicircle laws. They hold down to the optimal scale $\eta \gg 1/N$ with an optimal error bound of order $1/\sqrt{N\eta}$ for the individual matrix elements.

Motivated by this application, Sasha Sodin pointed out that this fluctuation can be related to the fluctuation of a single matrix element of the resolvent by the Markov correspondence, see [ ] for details. It is therefore natural to ask if one could use the fluctuation result from [ ] on the interlacing sequences to strengthen the existing results on the fluctuations of the matrix elements of the resolvent and hence of $f(H)$. In fact, not the result itself, but the core of the analysis in [ ] can be applied; this is the content of the current paper. We thank Sasha for asking this question and calling our attention to the problem of fluctuation of the matrix elements of $f(H)$ and to the previous literature [ , , , ]. Furthermore, he pointed out to us that the contour integral formula from Pleijel's paper [ ] could potentially replace the Helffer-Sjöstrand formula in our argument, with the aim of further reducing the regularity assumptions on $f$. We are very grateful to him for this insightful idea that we believe will have further applications.
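For analytic test functions, the reduction of $f(H)$ to the resolvent by contour integration can be checked numerically. The sketch below assumes $f = \exp$, a circle of radius 4 enclosing the spectrum, and a trapezoidal discretization of the contour; the function names are hypothetical. It illustrates only the analytic case, not the bounded variation setting of this paper.

```python
import numpy as np

def f_of_H_contour(H, f, radius=4.0, points=64):
    """Approximate f(H) = -(1/(2 pi i)) * integral over |z| = radius of f(z) (H - z)^{-1} dz
    by the trapezoidal rule; assumes the spectrum of H lies inside the circle."""
    N = H.shape[0]
    identity = np.eye(N)
    total = np.zeros((N, N), dtype=complex)
    for theta in 2 * np.pi * np.arange(points) / points:
        z = radius * np.exp(1j * theta)
        G = np.linalg.inv(H - z * identity)   # resolvent (H - z)^{-1}
        total += f(z) * z * G                 # dz = i z d(theta); constants combined below
    return -total / points

def f_of_H_spectral(H, f):
    """Reference value of f(H) from the spectral decomposition of a real symmetric H."""
    lam, U = np.linalg.eigh(H)
    return (U * f(lam)) @ U.T

def contour_demo(N=30, seed=0):
    """Maximal entrywise difference between the two evaluations of exp(H)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    H = (A + A.T) / np.sqrt(2 * N)
    return np.max(np.abs(f_of_H_contour(H, np.exp) - f_of_H_spectral(H, np.exp)))
```

Since the contour stays at distance of order one from the spectrum, only resolvents with $\eta$ of order one enter here; this is the reason why the analytic case does not require local laws.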

Main results
We consider complex Hermitian and real symmetric random $N \times N$ matrices $H = (h_{ij})_{i,j=1}^{N}$ whose entries are independent (up to the symmetry constraint $h_{ij} = \overline{h_{ji}}$) random variables satisfying
$$\mathbb{E}\, h_{ij} = 0, \qquad \mathbb{E}\, |h_{ij}|^2 = \frac{s_{ij}}{N} \qquad \text{and} \qquad \mathbb{E}\, |h_{ij}|^p \le \frac{\mu_p}{N^{p/2}}$$
for all $i, j, p$ and some absolute constants $\mu_p$. We assume that the matrix of variances $(s_{ij})$ is approximately stochastic, which guarantees that the limiting density of states is the Wigner semicircular law.
To formulate the error bound concisely we introduce the following commonly used (see, e.g., [ ]) notion of high probability bound. If $X = (X^{(N)}(u) \mid N \in \mathbb{N}, u \in U^{(N)})$ and $Y = (Y^{(N)}(u) \mid N \in \mathbb{N}, u \in U^{(N)})$ are families of random variables indexed by $N$, and possibly some parameter $u$, then we say that $X$ is stochastically dominated by $Y$ if for all $\epsilon, D > 0$ we have
$$\sup_{u \in U^{(N)}} \mathbb{P}\left[ X^{(N)}(u) > N^{\epsilon}\, Y^{(N)}(u) \right] \le N^{-D}$$
for large enough $N \ge N_0(\epsilon, D)$. In this case we use the notation $X \prec Y$. Moreover, if we have $|X| \prec Y$, we also write $X = \mathcal{O}_\prec(Y)$.
We further introduce a notion quantifying the rate of weak convergence of distributions. We say that a sequence of random variables $X_N$ converges in distribution to a random variable $X$ at a rate $r(N)$ if the characteristic functions satisfy
$$\mathbb{E}\, e^{itX_N} = \mathbb{E}\, e^{itX} + \mathcal{O}(r(N)),$$
where we allow the coefficient of the rate to be $t$-dependent uniformly for $|t| \le T$ for any fixed $T$. In particular, this implies that $\mathbb{E}\, \Phi(X_N) = \mathbb{E}\, \Phi(X) + \mathcal{O}(r(N))$ for any analytic function $\Phi$ with compactly supported Fourier transform.

Our main result for the diagonal entries of $f(H)$ is summarized in the following theorem. By permutational symmetry there is no loss in generality in studying $f(H)_{11}$. By considering real and imaginary parts separately, from now on we always assume that $f$ is real valued.

]) be some real-valued function of bounded variation and assume that
where ∆ f is a centered Gaussian random variable of variance and the V f,i and V (σ 2 ) f,1 are given by quadratic forms defined in ( . ). More precisely, ( . ) means that, to leading order and, weakly else for all k. The speed of convergence in the Lévy metric dL is given by with some constant depending on f .
The corresponding result for the off diagonal terms is as follows.
Theorem . . Under the assumptions of Theorem . , and, introducing the notation else holds for all k, l ∈ N. The analogues of ( . ) and ( . ) also hold for T The fluctuation results in Theorems . and . for test functions satisfying the stronger regularity assumption ( . ) and without explicit error terms have been proven in [ , ].
We also remark that ( . ) implies the joint asymptotic normality of the fluctuations of $f(H^{(N)})_{11}$ for several test functions. More precisely, for any $f \in BV$ we define $T^{(N)}_f$ via ( . ). Then for any given functions $f_1, f_2, \dots, f_k \in BV$, the random $k$-vector $(T^{(N)}_{f_1}, T^{(N)}_{f_2}, \dots, T^{(N)}_{f_k})$ converges weakly to a Gaussian vector with covariance given via the variance ( . ) using the parallelogram identity. A similar result holds for the joint distribution of the off-diagonal elements $f_k(H)_{12}$. One may specialize this result to the case when $f$ is a characteristic function and define the corresponding process $T^{(N)}_x$. Clearly, the finite dimensional marginals of the sequence of stochastic processes $\{T^{(N)}_x, x \in [-3, 3]\}$ are asymptotically Gaussian. The tightness remains an open question.

Pleijel's formula
Our main tool relating $f(H)_{ij}$ to the resolvent $G = G(z) = (H - z)^{-1}$ is summarized in the following proposition. We formulate it for general probability measures $\mu$ supported on some $[-K, K]$ and their Stieltjes transform $m_\mu(z) := \int (\lambda - z)^{-1}\, \mu(d\lambda)$. Later we will apply the proposition to the spectral measures of typical diagonal and off-diagonal entries. Here $df$ is understood as the (signed) Lebesgue-Stieltjes measure.
Before going into the proof, we present a special case of Proposition . . If $f = \mathbb{1}_{[x, x']}$, then ( . ) can be written as the path integral where $\gamma(x, x')$ is the chain indicated in Figure c. We also remark that for our purposes ( . ) is preferable to the Helffer-Sjöstrand representation, as used in [ ], since it requires considerably less regularity of $f$.
where $L(x)$ is a directed path as indicated in Figure a and $z_0 = x + i\eta_0$, $\eta_0 > 0$.

Figure . Integration paths
By the definition of the Lebesgue-Stieltjes integral for functions of bounded variation we have that By virtue of ( . ) we can write where $R(x)$ is the path indicated in Figure b and $|df|$ indicates the total variation measure of $df$. We then write out the inner integral as Since the last term is $x$-independent, it will vanish after integrating against $df$ since we assumed $f$ to be compactly supported. For the second term we find for any $\eta_0, M > 0$. For applications it turns out to be favorable to get rid of the real part, which we can do by noting that $2\Re m_\mu(z) = m_\mu(z) + m_\mu(\bar z)$ and therefore

We finally note that a variant of Proposition . could also be proven directly without appealing to the contour integration from [ ]. The key computation in that direction is summarized in the following lemma, which we establish here for later convenience.
Lemma . . Let $f \in BV([-L, L])$ be compactly supported and let $g$ be a function which is analytic away from the real axis and satisfies $g(\bar z) = \overline{g(z)}$. Then for any $\eta_0, M > 0$ we have that

Applying Lemma . to $g = m_\mu$ yields, modulo an error term, and taking the limit $\eta_0 \to 0$ makes the inner integral tend to $f(\lambda)$ in the $L^1$-sense. In this way we can establish a variant of Proposition . , albeit with a weaker error estimate.

Proof of Lemma . . This follows from the computation
where the first step follows from Stokes' or Green's theorem.

Diagonal entries
We first prove Theorem . about the diagonal entries of $f(H)$. The spectral measure corresponding to the $(1, 1)$-matrix element, $\rho_N$ defined as is concentrated in $[-2.5, 2.5]$ with overwhelming probability. We can without loss of generality assume that $f$ is compactly supported in $[-3, 3]$ since smoothly cutting off $f$ outside the spectrum does not change the result. Applying Proposition . to $\mu = \rho_N$ with $K = 2.5$, $L = 3$, we find that (using

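To analyse $G(z)_{11}$ we recall the Schur complement formula
$$G(z)_{11} = \frac{1}{h_{11} - z - \langle h, \hat G(z) h \rangle},$$
where $h = (h_{21}, \dots, h_{N1})$ is the first column of $H$ without its diagonal entry and $\hat G(z)$ denotes the resolvent of the minor obtained from $H$ by removing the first row and column. To study the asymptotic behavior of $G(z)_{11}$ we rely on the local semicircle law in the averaged form (see [ ] or [ , Theorem . ]) applied to the resolvent of the minor and its entry-wise form which both hold true for all $|\eta| = |\Im z| > \eta_0 \gg N^{-1}$. Here $m$ denotes the Stieltjes transform of the semicircular distribution $\mu_{sc}$, $m(z) := \int (\lambda - z)^{-1}\, \mu_{sc}(d\lambda)$.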
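The Schur complement formula for $G(z)_{11}$ is an exact algebraic identity and can be verified on a sample. The sketch below assumes a real symmetric Gaussian matrix and an arbitrary spectral parameter; `schur_check` is a hypothetical helper that returns $G(z)_{11}$ computed directly and via the resolvent of the minor obtained by removing the first row and column.

```python
import numpy as np

def schur_check(N=50, z=0.3 + 0.5j, seed=0):
    """Return G(z)_11 computed directly and via the Schur complement formula."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    H = (A + A.T) / np.sqrt(2 * N)                 # real symmetric sample matrix (assumption)

    G = np.linalg.inv(H - z * np.eye(N))           # full resolvent (H - z)^{-1}
    h = H[1:, 0]                                   # first column without the diagonal entry
    G_minor = np.linalg.inv(H[1:, 1:] - z * np.eye(N - 1))  # resolvent of the minor

    # G_11 = 1 / (h_11 - z - <h, G_minor h>); h is real here, so no conjugation is needed
    rhs = 1.0 / (H[0, 0] - z - h @ G_minor @ h)
    return G[0, 0], rhs
```

For complex Hermitian matrices the quadratic form would involve the complex conjugate of $h$; the real symmetric case is used here only to keep the example short.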
Since by ( . ), In order to separate the leading order contribution from the fluctuation, we set where mN (z) = 1 N Tr G(z) and observe that and by expanding both terms around [−z − m(z)] −1 = m(z), Thus ΦN describes the leading order behavior, which is very close to a deterministic quantity, and the leading fluctuation is solely described by ΦN − ΦN . We then can write The reason for the normalization will become apparent later since in this way ∆ (N) f is an object of order 1.
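The identity $[-z - m(z)]^{-1} = m(z)$ used in the expansion is the self-consistent equation $m^2 + zm + 1 = 0$ of the semicircle law. As a minimal sketch, the code below evaluates $m(z)$ from the explicit root with the correct branch and compares it with a midpoint quadrature of $\int (\lambda - z)^{-1} \mu_{sc}(d\lambda)$; the names `m_sc` and `m_sc_quadrature` and the number of quadrature nodes are illustrative assumptions.

```python
import cmath
import math

def m_sc(z):
    """Stieltjes transform of the semicircle law for Im z != 0:
    the root of m^2 + z m + 1 = 0 whose imaginary part has the sign of Im z."""
    s = cmath.sqrt(z * z - 4)
    m = (-z + s) / 2
    if m.imag * z.imag < 0:      # wrong branch: the other root is 1/m
        m = (-z - s) / 2
    return m

def m_sc_quadrature(z, n=20000):
    """Midpoint rule for the integral of (lam - z)^{-1} against (1/(2 pi)) sqrt(4 - lam^2) on [-2, 2]."""
    step = 4.0 / n
    total = 0j
    for k in range(n):
        lam = -2.0 + (k + 0.5) * step
        total += math.sqrt(4.0 - lam * lam) / (lam - z)
    return total * step / (2 * math.pi)

if __name__ == "__main__":
    z = 0.3 + 0.2j
    m = m_sc(z)
    print(abs(1 / (-z - m) - m))          # residual of the self-consistent equation
    print(abs(m - m_sc_quadrature(z)))    # agreement with the integral definition
```

The branch is fixed by the requirement that $\Im m(z)$ and $\Im z$ have the same sign, which holds for the Stieltjes transform of any probability measure.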
For the leading order term we use ( . ) and Proposition . to compute For the fluctuation we use ( . ) to compute where the last step followed from Lemma . and We now concentrate on the computation of E ∆ We state the main estimate of E X(z)X(z ′ ) as a lemma.

Lemma . . Under the assumptions of Theorem . it holds that
We remark that the $|x - x'|^2$ term in $\Phi$ could be replaced by $|x - x'|$, but we will not need this stronger bound here.

Proof of Lemma . . From ( ) in [ ] we know that
The last term we directly estimate as Furthermore, in Lemma of [ ] self-consistent equations for the first two terms on the right-hand side of ( . ) were derived. We recall that Using the straightforward inequality $|m(z)| \le 1 - c|\eta|$, which holds for some small $c > 0$ and $z$ in the compact region $[-10, 10] \times [-i, i]$, we find Since $|m|$ decays outside the spectrum $[-2, 2]$ we have that $|m(z)| \le 1 - c'(|x| - 2)_+$ for $|z| \le 10$, and therefore Moreover, in the remaining regime where both $|\eta|, |\eta'| \ll 1$ and $|x|, |x'| \le 2$, it holds that where the $\pm$ depends on the signs of $\eta, \eta'$ and we allow for the constant $c''$ to change in the last inequality. This estimate follows from the explicit formula for $m(z)$. Putting these inequalities together, we therefore find a constant $C > 0$ such that in the compact region Now ( . ) follows from combining ( . ), ( . ) and ( . ).
Using Lemma . we then compute where dη = dη dη ′ and df (x) = df (x) df (x ′ ). To estimate the error term we have to compute and readily check that By using Lemma . and organizing the contributions from the boundary terms at η0 and −η0, we find that the leading order of E( ∆ where z0 = x + iη0 and and for any fixed k ∈ N we can conclude that ( . ) becomes The first term of ( . ) was already computed on page of [ ]. The computation of the second term is very similar to the first one and the remaining terms are routine calculations. We arrive at in the general case and We note that V where ∆ f is a centered Gaussian of variance For higher moments we recall the following Wick type factorization Lemma from [ ].
Lemma . . For $k \ge 2$ and $z_1, \dots, z_k \in \mathbb{C}$ with $z_l = x_l \pm i\eta_l$ and $\eta_l > 0$ we have that where $[k] := \{1, \dots, k\}$, $\eta = \eta_1 \cdots \eta_k$, $P_2(L)$ are the partitions of a set $L$ into subsets of size 2 and

The error term in ( . ) is slightly stronger than that in [ ] since the $\Phi_{a,b}$ includes a $|x_a - x_b|^2$. This strengthening follows along the lines of the original proof by using the more precise analysis of the self-consistent equation outlined in Lemma . . We check that integrating the error term from ( . ) over $(I^M_{\eta_0})^k$, with $\eta_0$ being chosen as above according to the regularity of $f$, again gives asymptotically $N^{-1/2}$ in the case of bounded $f'$ and $N^{-1/6}$ in the general case. By integrating the Wick type product and using ( . ) we therefore arrive at We note that the error terms are implicitly $k$-dependent. By counting the number of pair partitions we find that, to the leading order in $N$, the implicit coefficients scale like $C^k (k/2)!$ with a constant depending on $f$.
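The counting of pair partitions behind the $C^k (k/2)!$ scaling can be made explicit: for even $k$ the set $[k]$ has $k!/(2^{k/2}(k/2)!)$ partitions into pairs, which is at most $2^{k/2}(k/2)!$. The following sketch enumerates them by brute force and compares with the closed form; the function names are illustrative and not taken from the cited works.

```python
from math import factorial

def pair_partitions(elements):
    """Yield all partitions of `elements` into subsets of size 2 (none if the size is odd)."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    # Pair the first element with each possible partner, then recurse on the remainder.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pair_partitions(remaining):
            yield [(first, partner)] + tail

def count_pair_partitions(k):
    """Closed form k! / (2^(k/2) (k/2)!) for even k, and 0 for odd k."""
    if k % 2:
        return 0
    return factorial(k) // (2 ** (k // 2) * factorial(k // 2))

if __name__ == "__main__":
    for k in range(2, 11, 2):
        enumerated = sum(1 for _ in pair_partitions(range(1, k + 1)))
        print(k, enumerated, count_pair_partitions(k))
```

For odd $k$ there are no pair partitions, consistent with a Wick type factorization in which odd moments have no leading order contribution.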
Recalling ( . ) and the definition of T (N) f from ( . ), we conclude that the overall fluctuations have moments Let φN (t) denote the characteristic function of T where gij . .= hij − δij z − hi, G(z)hj .
For the first term we set and by a computation analogous to ( . ) using ( . ) and an expansion of the form Similarly, from ( . ) we find that O≺ N −1/6 else. .