Eigenvector Statistics of Sparse Random Matrices

We prove that the bulk eigenvectors of sparse random matrices, i.e. the adjacency matrices of Erd\H{o}s-R\'enyi graphs or random regular graphs, are asymptotically jointly normal, provided the average degree increases with the size of the graphs. Our methodology follows [6]: we analyze the eigenvector flow under Dyson Brownian motion, combined with an isotropic local law for the Green's function. As an auxiliary result, we prove that for the eigenvector flow of Dyson Brownian motion with general initial data, the eigenvectors are asymptotically jointly normal in the direction $q$ after time $\eta_*\ll t\ll r$, provided that, in a window of size $r$, the initial density of states is bounded below and above down to the scale $\eta_*$, and the initial eigenvectors are delocalized in the direction $q$ down to the scale $\eta_*$.


Introduction
In this paper, we consider two models of sparse random matrices H with sparsity p = p(N): the normalized adjacency matrices of Erdős-Rényi graphs and of p-regular graphs. These models have been investigated for over half a century. To mention some of the literature, we refer the reader to the books [7,8] for a general discussion of spectral graph theory, the survey article [22] for the connection between eigenvalues and expansion properties of graphs, and the articles [9,10,26-29,31-33] for applications of eigenvalues and eigenvectors in various algorithms, such as combinatorial optimization, spectral partitioning and clustering.
We study the spectral properties of sparse random graphs from the random matrix theory point of view, i.e. the local eigenvalue statistics and the eigenvector statistics. It is expected that: i) the gap distribution of the bulk eigenvalues N(λ_{i+1} − λ_i) is universal, with density approximately given by the Wigner surmise; ii) the distribution of the second largest eigenvalue is given by the Tracy-Widom distribution (that of the largest eigenvalue of the GOE); iii) the eigenvectors are asymptotically normal. For Wigner-type random matrices, it is proved in [5,13-20,35] for the bulk and in [21,30,34] for the edge that the eigenvalue statistics are universal; it is proved in [6,24,36] that the eigenvectors are asymptotically normal. Sparser models are harder to analyze. The bulk universality for both Erdős-Rényi graphs and regular graphs in the regime p ≫ 1 was proved in [11,12,23]. The edge universality was proved only for Erdős-Rényi graphs in the regime p ≫ N^{1/3} in [11,12,25]. Less was known about the distribution of eigenvectors. To our knowledge, only recently, in [1], Backhausz and Szegedy proved that the components of almost eigenvectors of p-regular graphs with fixed p converge to the normal distribution in the weak topology. However, the proof depends heavily on the special structure of regular graphs and is difficult to generalize to other models.
Let H be the normalized adjacency matrix of G(N, p/N) or G_{N,p} in the sparse regime, i.e. p = p(N) ≪ N. We denote its eigenvalues by λ_1 ≤ λ_2 ≤ ··· ≤ λ_N and the corresponding normalized eigenvectors by u_1, u_2, ···, u_N. The main goal of this paper is to prove that the bulk eigenvectors of H in the regime p ≫ 1 are asymptotically jointly normal. Compared with [1], our results give the variance of the limit distribution explicitly, the asymptotic normality holds in any direction, and the argument does not depend on the special symmetry of the models.

Theorem 1.1. Fix arbitrarily small constants δ, κ > 0. Let H be the normalized adjacency matrix of sparse Erdős-Rényi graphs G(N, p/N) with sparsity N^δ ≤ p ≤ N/2, or the normalized adjacency matrix of p-regular graphs G_{N,p} with sparsity N^δ ≤ p ≤ N^{2/3−δ}. Fix a positive integer n > 0 and a polynomial P of n variables. Then for any unit vector q ∈ R^N (we assume q ⊥ e for p-regular graphs) and deterministic indices i_1, i_2, ···, i_n ∈ [[κN, (1 − κ)N]], there exists a constant d > 0 depending on δ such that
|E[P(N⟨q, u_{i_1}⟩², ···, N⟨q, u_{i_n}⟩²)] − E[P(N_1², ···, N_n²)]| ≤ N^{−d},   (1.1)
where N_1, ···, N_n are independent standard normal random variables.

The proof consists of two steps. The first step is the local law for sparse random matrices, which was developed in [23]; our case follows directly. The main content of this paper is the second step: we study the eigenvector flow of Dyson Brownian motion with general initial data. For any N × N real deterministic matrix H, we define the following random matrix process, the Dyson Brownian motion:
H_t = H_0 + W_t,   (1.2)
where W_t = (w_{ij}(t))_{1 ≤ i,j ≤ N} is symmetric with (w_{ij}(t))_{1 ≤ i ≤ j ≤ N} a family of independent Brownian motions of variance (1 + δ_{ij})t/N. We denote H_t = (h_{ij}(t))_{1 ≤ i,j ≤ N}, so that H_0 = H is our original matrix. We denote the eigenvalues of H_t by λ(t): λ_1(t) ≤ λ_2(t) ≤ ··· ≤ λ_N(t) and the corresponding eigenvectors by u_1(t), u_2(t), ···, u_N(t), where we write the j-th entry of u_i(t) as u_{ij}(t).
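As a quick numerical illustration of the statement of Theorem 1.1 (a sketch under illustrative choices of N, p, q and the normalization, not part of the proof), one can sample an Erdős-Rényi adjacency matrix, center and normalize it, and inspect the bulk overlaps N⟨q, u_i⟩², which should behave like squares of standard normal variables (mean close to 1, variance close to 2):

```python
import numpy as np

# Sketch (not the proof): sample G(N, p/N), center and normalize the
# adjacency matrix, and check that bulk overlaps N*<e_1, u_i>^2 behave
# like squares of standard normals (mean ~1, variance ~2).
# N, p and the normalization below are illustrative choices.
rng = np.random.default_rng(0)
N, p = 500, 20.0
U = np.triu((rng.random((N, N)) < p / N).astype(float), 1)
A = U + U.T                                   # adjacency matrix of G(N, p/N)
H = (A - p / N) / np.sqrt(p * (1 - p / N))    # centered, normalized
_, v = np.linalg.eigh(H)
overlaps = N * v[0, :] ** 2                   # N*<e_1, u_i>^2 for each i
bulk = overlaps[N // 4: 3 * N // 4]           # keep bulk indices only
bulk_mean, bulk_var = bulk.mean(), bulk.var()
```

With these parameters the bulk statistics are close to those of a chi-squared variable with one degree of freedom, in line with the asymptotic normality of the eigenvector entries.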
Under some local regularity conditions (see Assumptions 1.3 and 1.4) on the initial matrix H_0, we first prove the isotropic local law for the Green's function of H_t. With this as input, combined with the rigidity estimates of eigenvalues from [5], we analyze the eigenvector moment flow of Dyson Brownian motion following the approach developed in [6]. We prove that the eigenvectors of H_t corresponding to "bulk" eigenvalues are asymptotically normal after a short time. The main new features of our results are: 1) we only assume estimates on the Green's function locally, near the energy E of interest; our results can thus be viewed as a local version of [6, Theorem 7.1] with general initial data. 2) Our results can be applied to sparse matrices with global correlations, such as d-regular graphs.

Preliminary notations
A fundamental quantity is the Stieltjes transform of the empirical eigenvalue distribution of H_t. We denote the resolvent of H_t by G(t; z) := (H_t − z)^{−1}, and the Stieltjes transform by m_t(z) := N^{−1} Tr G(t; z), for z ∈ C^+, the upper half complex plane. Often, we will write z as the sum of its real and imaginary parts, z = E + iη, where E = Re[z], η = Im[z].
We denote by ρ_fc,t the free convolution of the empirical eigenvalue distribution of H_0, i.e. ρ_0 = N^{−1} Σ_i δ_{λ_i(0)}, with the semicircle law of variance t, and by m_fc,t the Stieltjes transform of ρ_fc,t. The density ρ_fc,t is analytic on its support for any t > 0, and the function m_fc,t solves the self-consistent equation
m_fc,t(z) = m_0(z + t m_fc,t(z)) = (1/N) Σ_{i=1}^N g_i(t, z),  where g_i(t, z) := (λ_i(0) − z − t m_fc,t(z))^{−1}.   (1.3)
We refer to [4] for a detailed study of the free convolution with the semicircle law. For any t ≥ 0, we denote the classical eigenvalues of ρ_fc,t by γ_i(t), which are given by (1.4). Throughout the paper we use the following notion of overwhelming probability.
Definition 1.2. We say that a family of events F(u) indexed by some parameter(s) u holds with overwhelming probability if, for any large D > 0 and N ≥ N(D, u) large enough, P[F(u)] ≥ 1 − N^{−D}, uniformly in u.
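The self-consistent equation for m_fc,t can be solved numerically by fixed-point iteration. The following sketch, with an illustrative choice of the initial spectrum λ_i(0), time t and spectral parameter z (all toy values, not taken from the paper), verifies the defining relation m_fc,t(z) = m_0(z + t m_fc,t(z)):

```python
import numpy as np

# Sketch: solve m(z) = m_0(z + t*m(z)) by damped fixed-point iteration,
# where m_0 is the Stieltjes transform of the empirical measure of lam0.
# The spectrum lam0, time t and spectral parameter z are illustrative.
lam0 = np.linspace(-1.0, 1.0, 200)   # eigenvalues of H_0 (toy choice)
t = 0.3
z = 0.2 + 0.2j                       # must have Im z > 0

def m0(w):
    # Stieltjes transform of the empirical eigenvalue distribution
    return np.mean(1.0 / (lam0 - w))

m = m0(z)                            # initial guess in the upper half-plane
for _ in range(10000):
    m = 0.5 * m + 0.5 * m0(z + t * m)  # damping keeps the iteration stable

residual = abs(m - m0(z + t * m))    # should vanish at the fixed point
```

The iteration stays in the upper half-plane because w ↦ m_0(w) maps C^+ to C^+, which is also why the fixed point is the physically relevant branch of the free convolution.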
We use C to denote a large universal constant and c a small universal constant, which may depend on other universal constants (e.g., the constant c in the control parameter ψ defined in (1.6) and the constants in Assumptions 1.3 and 1.4) and which may differ from line to line. We write X ≲ Y or X = O(Y) if there exists some universal constant C such that |X| ≤ CY. We write X ≲_k Y or X = O_k(Y) if there exists some constant C_k, which depends only on k (and possibly other universal constants), such that |X| ≤ C_k Y. We write X ≪ Y if there exists some small constant c such that N^c|X| ≤ Y. Now we can state the assumptions on the initial matrix H_0. In Sections 2 and 3, we fix an arbitrarily small number c > 0 and define the control parameter
ψ := N^c.   (1.6)
We fix an energy level E_0, a radius 1/N ≪ r ≤ 1, and a mesoscopic scale 1/N ≪ η_* ≪ r, where r and η_* may depend on N. For example, the reader may keep η_* = ψ/N and r = N^{−1/2} in mind. We will study the eigenvectors corresponding to the "bulk" eigenvalues, by which we mean the eigenvalues in the interval [E_0 − r, E_0 + r]. We show that after a short time, the projections of those "bulk" eigenvectors on a unit vector q are asymptotically normal.
The first assumption is the same as in [5]; it imposes regularity of the density of states of H_0 around E_0.
Assumption 1.3. We assume that there exists some large constant a > 0 such that:
1. The eigenvalues of H_0 are bounded: |λ_i(0)| ≤ N^a for all 1 ≤ i ≤ N.
2. The Stieltjes transform of H_0 is bounded below and above uniformly for any z ∈ {E + iη : E ∈ [E_0 − r, E_0 + r], η_* ≤ η ≤ 1}.
Besides this information on the eigenvalues of the initial matrix H_0, we also need the following regularity assumption on its eigenvectors.
Assumption 1.4. We assume that for some unit vector q, there exists some small constant b > 0 such that the eigenvectors of H_0 are delocalized in the direction q down to the scale η_*.

Statement of Results
Let E_0 and r be as in Assumption 1.3. For any 0 ≤ κ < 1, we denote
I_r^κ(E_0) := [E_0 − (1 − κ)r, E_0 + (1 − κ)r],
and the spectral domain
D_κ := {E + iη : E ∈ I_r^κ(E_0), ψ⁴/N ≤ η ≤ 1}.   (1.9)
Theorem 1.5. We assume that the initial matrix H_0 satisfies Assumptions 1.3 and 1.4. Fix κ > 0, a positive integer n > 0 and a polynomial P of n variables. Then for any η_* ≪ t ≪ r and unit vector q ∈ R^N, there exists a constant d > 0 depending on a, b, r, t such that
sup_I |E[P((N⟨q, u_i(t)⟩²)_{i∈I})] − E[P((N_i²)_{i∈I})]| ≤ N^{−d},   (1.10)
provided N is large enough, where the supremum is over all index sets I of n bulk indices, i.e. with λ_i(t) ∈ I_r^κ(E_0) for i ∈ I, and the N_i are independent standard normal random variables.
As a corollary, we have the following local quantum unique ergodicity statements for "bulk" eigenvectors.
Corollary 1.6. We assume that the initial matrix H_0 satisfies Assumption 1.3. We further assume that there exists a small constant b such that the eigenvectors of H_0 are delocalized down to the scale η_*. Then the following quantum unique ergodicity holds. Fix κ > 0. For any η_* ≪ t ≪ r and ε > 0, there exists a constant d > 0 depending on a, b, r, t such that the quantum unique ergodicity estimate holds, provided N is large enough, uniformly over test vectors a = (a_1, a_2, ···, a_N) such that Σ_i a_i = 0 and max_i |a_i| ≤ 1, with norm ‖a‖_1 = Σ_i |a_i|.
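The flavor of this statement can be checked numerically on the simplest toy model, a GOE matrix in place of the sparse ensemble (an illustrative sketch; the matrix model, sizes and the tolerance are assumptions, not the corollary itself): for a bulk eigenvector u_k and a bounded test vector a with Σ_i a_i = 0, the observable Σ_i a_i u_k(i)² should concentrate around 0 at scale roughly √(‖a‖_1)/N.

```python
import numpy as np

# Sketch: quantum-unique-ergodicity-type concentration for GOE bulk
# eigenvectors (toy stand-in for the sparse model). For a test vector a
# with sum(a) = 0 and |a_i| <= 1, the observable sum_i a_i * u_k(i)^2
# should be small, of order sqrt(N)/N.
rng = np.random.default_rng(1)
N = 400
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)            # GOE, spectrum roughly [-2, 2]
_, v = np.linalg.eigh(H)
a = np.tile([1.0, -1.0], N // 2)          # sum(a) = 0, max|a_i| = 1
obs = np.array([a @ (v[:, k] ** 2) for k in range(N // 4, 3 * N // 4)])
max_obs = np.max(np.abs(obs))             # worst case over bulk indices
```

Here the natural scale is √(2N)/N ≈ 0.07, so the observed maximum over the bulk stays far below the trivial bound ‖a‖_1/N · N = 1.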

Local Law
In this section, we prove the following isotropic local law for the resolvent of H_t. We write H_0 = U_0 Λ_0 U_0^*, where Λ_0 = diag{λ_1(0), ···, λ_N(0)} and U_0 is the orthogonal matrix of its eigenvectors. Theorem 2.1 states that G(t, z) is well approximated by U_0 diag{g_1(t, z), g_2(t, z), ···, g_N(t, z)} U_0^*, where the g_i are defined in (1.3). It implies that the Green's function becomes regular after adding a small Gaussian component.
Theorem 2.1. Under Assumption 1.3, fix κ > 0. Then for any η_* ≪ t ≪ r and unit vector q = (q_1, q_2, ···, q_N)^* ∈ R^N, uniformly for any z ∈ D_κ (as in (1.9)), the following holds with overwhelming probability, provided N is large enough:
⟨q, G(t, z) q⟩ = Σ_{i=1}^N ⟨u_i(0), q⟩² g_i(t, z) + O(ψ³/√(Nη)),   (2.1)
where the u_i(0) are the eigenvectors of H_0 and the g_i are defined in (1.3).
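The approximation in Theorem 2.1 can be sanity-checked numerically for a diagonal Λ_0 (so that u_i(0) = e_i); the sizes N, t, z and the choice of Λ_0 below are illustrative, and m_fc,t is computed by the fixed-point iteration for (1.3):

```python
import numpy as np

# Sketch: check <q, G(t,z) q> ~ sum_i g_i(t,z) <u_i(0), q>^2 for
# H_t = Lambda_0 + sqrt(t) W with diagonal Lambda_0 (toy parameters).
rng = np.random.default_rng(2)
N, t = 1000, 0.5
lam0 = np.linspace(-1.0, 1.0, N)          # spectrum of H_0 (toy choice)
z = 0.1 + 0.2j

# m_fc,t(z) by damped fixed-point iteration of the equation (1.3)
m = np.mean(1.0 / (lam0 - z))
for _ in range(5000):
    m = 0.5 * m + 0.5 * np.mean(1.0 / (lam0 - z - t * m))
g = 1.0 / (lam0 - z - t * m)              # g_i(t, z)

A = rng.standard_normal((N, N))
W = (A + A.T) / np.sqrt(2 * N)            # GOE
Ht = np.diag(lam0) + np.sqrt(t) * W
q = np.ones(N) / np.sqrt(N)               # a delocalized unit direction
G = np.linalg.inv(Ht - z * np.eye(N))
err = abs(q @ G @ q - np.sum(g * q ** 2))
```

The error is of order 1/√(Nη) ≈ 0.07 for these parameters, far below the size of the quadratic form itself.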

Rigidity of Eigenvalues
In [5], the eigenvalues of H_t were studied in detail under Assumption 1.3. In this section we recall some estimates on the locations of the eigenvalues from [5]. For the free convolved density ρ_fc,t, we have deterministic estimates on its Stieltjes transform (2.2) and on the classical eigenvalue locations (2.3) (as in (1.3) and (1.4)), where C is a constant depending on the constant a in Assumption 1.3, the g_i(t, z) are as in (1.3), and the estimate on the classical eigenvalue locations holds uniformly for any index i such that γ_i(t) ∈ I_r^κ(E_0). The estimate (2.2) is the same as [5, Lemma 7.2, (7.7)]. For (2.3), we denote Ẽ + iη̃ := z + t m_fc,t(z) and divide the sum into dyadic regions U_n according to the distance |λ_i(0) − Ẽ|. For the eigenvalues which do not belong to ∪_n U_n, we have |λ_i(0) − Ẽ| ≥ 1. Since η̃ ≥ t ≫ η_*, the contribution of each dyadic region can be bounded using Assumption 1.3, and (2.3) follows. The following result on eigenvalue rigidity estimates for H_t is from [5, Theorem 3.3].
Under Assumption 1.3, fix κ > 0. Then for any η_* ≪ t ≪ r and N large enough, with overwhelming probability the following hold: |m_t(z) − m_fc,t(z)| ≤ ψ(Nη)^{−1} uniformly for z ∈ D_κ; and, for the eigenvalues, |λ_i(t) − γ_i(t)| ≤ ψN^{−1} uniformly for any index i such that λ_i(t) ∈ I_r^κ(E_0).
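The rigidity statement can be visualized in the simplest case H_t = W (a GOE matrix), where the γ_i are the classical locations of the semicircle law; a sketch with illustrative sizes (the GOE stand-in, N and the tolerance are assumptions for illustration only):

```python
import numpy as np

# Sketch: bulk rigidity |lambda_i - gamma_i| = O(psi/N) for the GOE,
# with gamma_i the semicircle classical locations (illustrative N).
rng = np.random.default_rng(3)
N = 1000
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)
lam = np.linalg.eigvalsh(H)

# classical locations: integral of the semicircle density up to gamma_i
x = np.linspace(-2, 2, 20001)
rho = np.sqrt(4 - x ** 2) / (2 * np.pi)
cdf = np.concatenate([[0.0],
                      np.cumsum((rho[1:] + rho[:-1]) / 2 * np.diff(x))])
cdf /= cdf[-1]                       # remove discretization error
gamma = np.interp((np.arange(1, N + 1) - 0.5) / N, cdf, x)

bulk = slice(N // 4, 3 * N // 4)     # stay away from the edges
max_dev = np.max(np.abs(lam[bulk] - gamma[bulk]))
```

For N = 1000 the bulk deviations are of order log N / N ≈ 0.01, while the typical eigenvalue spacing alone would only give 1/N accuracy for an individual eigenvalue up to O(1) fluctuations of the empirical measure.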
Since the law of the Gaussian orthogonal ensemble is invariant under orthogonal conjugation, writing W for a standard Gaussian orthogonal ensemble matrix, i.e., W = (w_ij)_{1 ≤ i,j ≤ N} symmetric with (w_ij)_{1 ≤ i ≤ j ≤ N} a family of independent centered Gaussian random variables of variance (1 + δ_ij)/N, we have the following equality in law: H_t has the same law as U_0(Λ_0 + √t W)U_0^*. Therefore, Theorem 2.1 can be reduced to the case H_t = Λ_0 + √t W. The entry-wise local law for the matrix ensemble Λ_0 + √t W (the so-called deformed Gaussian orthogonal ensemble) was studied in [5]. In the following we recall some estimates on the entry-wise local law from [5, Theorem 3.3]. To state them we need to introduce some notation. For any index set T ⊂ [[1, N]], we denote by H_t^{(T)} the minor of H_t obtained by removing the columns and rows indexed by T, and its resolvent by G^{(T)}(t, z) := (H_t^{(T)} − z)^{−1}.
For simplicity of notation, if the context is clear, we may simply write g_i(t, z) as g_i (as in (2.8)). Roughly speaking, the following theorem states that the resolvent G(t, z) is close to the diagonal matrix diag{g_1, g_2, ···, g_N}.

Theorem 2.4 ([5, Theorem 3.3]). With overwhelming probability, uniformly for any z ∈ D_κ, the diagonal resolvent entries satisfy (2.9), and the off-diagonal resolvent entries satisfy |G_ij^{(T)}| ≤ ψ(Nη)^{−1/2}, where T is any index set of size |T| ≤ log N.
Proof of Theorem 2.1. From the discussion above, we can assume that H_0 = Λ_0 is diagonal and take H_t = Λ_0 + √t W, where W is a standard Gaussian orthogonal ensemble matrix. The quadratic form in (2.1) can be written as a sum of diagonal terms and off-diagonal terms:
⟨q, G(t, z) q⟩ = Σ_i q_i² G_ii + Σ_{i≠j} q_i q_j G_ij,
where q = (q_1, q_2, ···, q_N). The proof consists of two parts: the first part is straightforward, where we prove that the leading order term is the sum of diagonal terms; the second part is more involved, where we show that the sum of off-diagonal terms is negligible by the moment method.
For the diagonal terms, from (2.9) in Theorem 2.4 and (2.32) in Proposition 2.8, with overwhelming probability we have
Σ_i q_i² G_ii = Σ_i q_i² g_i + O(ψ²/√(Nη)).
For the second part, we prove that for any integer k > 0, uniformly for z ∈ D_κ, we have
E[|Z|^{2k}] ≤ C_k (ψ²/√(Nη))^{2k},   (2.12)
where Z := Σ_{i≠j} q_i q_j G_ij and the implicit constant depends only on k. It will then follow from the Markov inequality that |Z| ≤ ψ³/√(Nη) holds with overwhelming probability. By Assumption 1.3, we have a trivial lower bound for Im[m_fc,t(z)]. We expand E[|Z|^{2k}] and introduce the shorthand notation X_{b_{2i−1}b_{2i}} := G_{b_{2i−1}b_{2i}} for 1 ≤ i ≤ k, and X_{b_{2i−1}b_{2i}} := Ḡ_{b_{2i−1}b_{2i}} for k + 1 ≤ i ≤ 2k, where b = (b_1, b_2, ···, b_{4k}) and the sum Σ_b is over all b such that b_{2i−1} ≠ b_{2i} for 1 ≤ i ≤ 2k. To obtain efficient control on E[X_{b_1b_2} X_{b_3b_4} ··· X_{b_{4k−1}b_{4k}}], we need to understand the correlations between the off-diagonal resolvent entries G_ij, i ≠ j. Heuristically, G_ij depends mainly on the matrix entry h_ij, depends weakly on the matrix entries in the same row and column, and the dependence on the rest of the matrix H is negligible. Therefore the correlations of G_ij and G_mn are negligible if {i, j} ∩ {m, n} = ∅. In the rest of this section, we make this heuristic argument rigorous.
We denote the index set T = {b_1, b_2, ···, b_{4k}}. Recall that H_t^{(T)} denotes the minor of H_t with the rows and columns indexed by T removed, and that m_t^{(T)}(z) = Tr G^{(T)}/N is its Stieltjes transform. The Schur complement formula gives the following resolvent identity:
(G_{TT})^{−1} = D(z) − E(z),
where D(z) and E(z) are two |T| × |T| matrices which depend on the index set T. With overwhelming probability, uniformly for any z ∈ D_κ, the error term E(z) is much smaller than D(z) in the sense of the matrix norm. In fact, for E^{(1)}, by (2.6), and noticing the deterministic estimate |m_t − m_t^{(T)}| ≤ |T|/Nη from the interlacing of eigenvalues, with overwhelming probability t|m_t − m_t^{(T)}| is negligible. For E^{(2)}, with overwhelming probability, its entries are uniformly bounded by ψ(t/N)^{1/2}. For E^{(3)}, with overwhelming probability, we have an estimate in which the first inequality follows from the large deviation estimate [20, Appendix B] and the second inequality follows from (2.6). Since E(z) is a |T| × |T| matrix with |T| ≤ 4k, and with overwhelming probability its entries are uniformly bounded, so is its norm: ‖E(z)‖ ≪ η + t. Since |g_i|^{−1} = |λ_i(0) − z − t m_fc,t(z)| ≳ η + t, this implies ‖(D − E)^{−1}‖ ≲ (η + t)^{−1}. As a result, there exists a constant C_k depending only on k such that (2.16) holds uniformly for any z ∈ D_κ, with overwhelming probability. We define the event A on which (2.16) holds. Since it holds with overwhelming probability, for sufficiently large N we can assume that P[A^c] is negligible. By Taylor expansion, on the event A, we have
(D − E)^{−1} = Σ_{ℓ=0}^{f−1} D^{−1}(ED^{−1})^ℓ + (D − E)^{−1}(ED^{−1})^f,
where f is a large number which we will choose later. In the rest of the proof, we denote X^{(ℓ)}_{b_{2i−1}b_{2i}} := [D^{−1}(ED^{−1})^ℓ]_{b_{2i−1}b_{2i}} for 1 ≤ i ≤ k, and its complex conjugate for k + 1 ≤ i ≤ 2k. We remark that G^{(ℓ)} and X^{(ℓ)} implicitly depend on the index set T. With this notation, we have
X_{b_{2i−1}b_{2i}} = Σ_{ℓ=1}^{f−1} X^{(ℓ)}_{b_{2i−1}b_{2i}} + [(D − E)^{−1}(ED^{−1})^f]_{b_{2i−1}b_{2i}},
where we used the fact that b_{2i−1} ≠ b_{2i} and that D^{−1} = diag{g_i}_{i∈T} is a diagonal matrix; therefore, when ℓ = 0, the term X^{(0)}_{b_{2i−1}b_{2i}} vanishes. On the event A, ‖(D − E)^{−1}‖ ≲_k 1/η and ‖ED^{−1}‖ ≲_k ψ/(Nη)^{1/2}; together these bound the remainder term. In the following we show that, once we take f sufficiently large, these remainder terms are negligible and do not contribute to (2.12).
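The Schur complement formula underlying the resolvent identity above, (G_{TT})^{−1} = H_{TT} − z − H_{T,T^c}(H_{T^cT^c} − z)^{−1} H_{T^c,T}, can be verified directly on a small random matrix; the following sketch (sizes and indices are arbitrary) checks it numerically:

```python
import numpy as np

# Sketch: numerical check of the Schur complement resolvent identity
#   (G_TT)^{-1} = H_TT - z - H_{T,Tc} (H_{TcTc} - z)^{-1} H_{Tc,T}.
rng = np.random.default_rng(4)
n = 8
A = rng.standard_normal((n, n))
H = (A + A.T) / 2                    # a small symmetric matrix
z = 0.3 + 0.1j
T = [0, 1]                           # index set T; Tc is its complement
Tc = [i for i in range(n) if i not in T]

G = np.linalg.inv(H - z * np.eye(n))
G_T = np.linalg.inv(H[np.ix_(Tc, Tc)] - z * np.eye(len(Tc)))  # minor resolvent
lhs = np.linalg.inv(G[np.ix_(T, T)])
rhs = (H[np.ix_(T, T)] - z * np.eye(len(T))
       - H[np.ix_(T, Tc)] @ G_T @ H[np.ix_(Tc, T)])
ok = np.allclose(lhs, rhs)
```

This is exactly the algebraic identity behind the decomposition (G_{TT})^{−1} = D(z) − E(z): the diagonal part of the right-hand side concentrates around D, and the fluctuating part is collected in E.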
Since the X_{b_{2i−1}b_{2i}} are all uniformly bounded by 1/η and the sum Σ_b contains at most N^{4k} terms, the contribution of the complement of the event A is negligible compared with the right-hand side of (2.12). We separate the leading term of the product of the X_{b_{2i−1}b_{2i}} as in (2.20). If we take f = ⌈4k(a + 1)/c⌉, then on the event A the second term on the right-hand side of (2.20) is bounded, where in the last inequality we used ψ = N^c (as in (1.6)) and η ≥ ψ⁴/N, since z ∈ D_κ (as in (1.9)). Combined with (2.19), this leads to (2.21). By the Cauchy-Schwarz inequality, we have (2.22). In the following we bound the first term on the right-hand side of (2.22); the second term can be treated in exactly the same way. By our definition of X^{(ℓ)}, the expansion is a sum in which a represents arrays a^i_j ∈ T = {b_1, b_2, ···, b_{4k}}, with indices 1 ≤ i ≤ 2k and 1 ≤ j ≤ ℓ_i + 1; the sum is over all possible arrays a containing b, denoted by b ⊂ a, in the sense that a^i_1 = b_{2i−1} and a^i_{ℓ_i+1} = b_{2i}. Since by definition the g_i are all deterministic, we can separate the deterministic part and the random part of (2.23). For the control of the expectation of the product of the Ẽ_ij, we have the following proposition, whose proof we postpone to the next section.
where χ is an indicator function such that χ = 1 if every number in the array (b_1, b_2, ···, b_{2ℓ}) occurs an even number of times, and otherwise χ = 0.
where we used the fact that Σ_i ℓ_i ≤ 2k(f − 1) ≤ 8k²(a + 1)/c, so the implicit constant depends only on k.
Given b, the sum Σ_{a : b ⊂ a} is over all possible arrays a such that a^i_1 = b_{2i−1} and a^i_{ℓ_i+1} = b_{2i}. For any array a with b ⊂ a and χ(b) = 1 (as in Proposition 2.5), we denote the frequency representation of the array (b_1, b_2, ···, b_{4k}) by {(γ_i, d_i)}, where d_1, ···, d_n are all even and n = |T|. Notice that Σ_i d_i = 4k counts the total number of entries. We also denote the frequency representation of ((a^i_j)_{1 ≤ j ≤ ℓ_i+1})_{1 ≤ i ≤ 2k} by {(γ_i, d_i + r_i)}, where Σ_i (d_i + r_i) counts the total number of entries. We summarize here the relations between d_i, r_i and ℓ_i, which will be used later. Notice that the frequencies d_i and d_i + r_i are uniquely determined by the induced partition P; in fact, the d_i + r_i are the sizes of the blocks of P. Moreover, since b is uniquely determined by a, first adding up the terms corresponding to a such that b ⊂ a and then summing over b is equivalent to first summing over the arrays a corresponding to the same partition P, which we denote by a ∼ P, and then summing over partitions with each block of size at least two. In the resulting bound, the first inequality uses (2.32) in Proposition 2.8 and d_i ≥ 2, and the last inequality uses ψ log N/√(Nη) ≤ 1 together with the fact that the total number of partitions is bounded by (Σ_i ℓ_i + 2k)!, which is a constant depending only on k.
Following the same argument, one can check the analogous bound for the second term. This finishes the proof of the isotropic local law, Theorem 2.1.
The following is an easy corollary of Theorem 2.1: uniformly for any z ∈ D κ , with overwhelming probability, provided N is large enough.
We take the event A_1 of trajectories (λ(t))_{0 ≤ t ≤ r} such that the rigidity estimates hold. When we condition on any trajectory λ ∈ A_1, with overwhelming probability, the isotropic local law holds. As a consequence of Theorems 2.3 and 2.1, and noticing that we can take the parameter c (as in (1.6)) arbitrarily small, the event A_1 holds with overwhelming probability.

Auxiliary results
Proposition 2.8. For any k ≥ 2 and m ≥ 0, we have the bound (2.32); and for any m ≥ 0, the analogous bound holds uniformly for any z ∈ D_κ, where the g_i are as in (2.8).
The minor resolvent G^{(T)} depends only on the matrix entries (w_ij)_{i∉T, j∉T}, so it is independent of the rows and columns of W indexed by T. (2.25) can be decomposed into the following three estimates, which hold with overwhelming probability, where E_T is the expectation with respect to the rows and columns of W indexed by T.
For (2.36), since E^{(1)} is diagonal, by (2.6) in Theorem 2.3, with overwhelming probability t|m_t − m_t^{(T)}| is bounded as required. For (2.37), it is a product of normal random variables, whose expectation does not vanish only if each variable occurs an even number of times. Thus (2.37) follows, and the implicit constant, coming from the moments of normal variables, can be bounded by (2ℓ − 1)!!.
In the following we prove (2.38). The entries of E^{(3)} are given by a quadratic form in the rows of W, so the left-hand side of (2.38) is bounded by a sum over monomials of resolvent entries. For each monomial of resolvent entries G_{β_{2ℓ−1}β_{2ℓ}}, we associate a labeled graph G by the following procedure: we denote the frequency representation of the array (β_1, β_2, ···, β_{2ℓ}) by {(γ_i, d_i)}; the vertices of G are the γ_i, and each factor G_{β_{2j−1}β_{2j}} contributes the edge {β_{2j−1}, β_{2j}}. We denote by s the number of self-loops in G. For any vertex γ_i ∈ G, its degree is given by d_i, where a self-loop adds two to the degree. It is easy to see that (2.38) follows by combining the following two estimates: (2.39), where the implicit constant, coming from the moments of normal variables, can be bounded by (2ℓ − 1)!!; and (2.40), with overwhelming probability, uniformly for any z ∈ D_κ, where ρ(G) is an indicator function which equals one if each vertex of G is incident to two different edges, and zero otherwise. For any graph G with ρ(G) = 1, we count the total number of incident edge-vertex pairs: each self-loop contributes 1 and each non-self-loop edge contributes 2, so the total number is s + 2(ℓ − s); since each vertex of G is incident to at least two different edges, the total number is at least 2v. Therefore, we have the following relation between v, s and ℓ: v ≤ ℓ − s/2. (2.41) For the first bound (2.39), we denote the set B = {(b_j, β_j)}_{1 ≤ j ≤ 2ℓ}. Then the product in (2.39) can be rewritten in terms of the multiplicities e_1 and e_2. Since for (b, β) ∈ B the w_{bβ} are independent normal random variables, (2.39) does not vanish only if e_1(b, β) is even and e_1(b, β) + e_2(b, β) ≥ 2 for any (b, β) ∈ B, which implies ρ(G)χ(b_1, b_2, ···, b_{2ℓ}) = 1. Therefore (2.39) follows.
For the second bound (2.40), by Theorem 2.4, with overwhelming probability the diagonal and off-diagonal resolvent entries are controlled. In terms of the graph G, the first bound corresponds to self-loops and the second bound corresponds to non-self-loop edges. In the graph G, there are s self-loops and ℓ − s non-self-loop edges, so the product of resolvent entries can be bounded accordingly, with overwhelming probability. Notice that ρ(G) = 1 implies d_i ≥ 2. The index set (β_1, β_2, ···, β_{2ℓ}) induces a partition P of the set {1, 2, ···, 2ℓ} such that i and j are in the same block if and only if β_i = β_j. If two index sets induce the same partition, they correspond to isomorphic graphs (when we forget the labeling). Therefore, for (2.40), we can first sum over the index sets corresponding to the same partition and then sum over the partitions, where the second inequality follows from (2.34); in the third inequality we used Σ_i d_i = 2ℓ; in the second to last inequality we used the bound v ≤ ℓ − s/2 from (2.41); and in the last inequality we bounded the total number of partitions by (2ℓ)!.

Short Time Relaxation
The Dyson Brownian motion (1.2) induces two dynamics, on the eigenvalues and on the eigenvectors. A set of m pairs {(i_1, j_1), ..., (i_m, j_m)} with distinct i_k's and positive j_k's can be encoded as a particle configuration η = (η_1, η_2, ···, η_N) on [[1, N]] such that η_{i_k} = j_k for 1 ≤ k ≤ m and η_p = 0 if p ∉ {i_1, i_2, ···, i_m}. The total number of particles is N(η) := Σ_ℓ η_ℓ = Σ_k j_k. We denote the particle locations in non-decreasing order by x_1(η) ≤ x_2(η) ≤ ··· ≤ x_{N(η)}(η). If the context is clear, we drop the dependence on η. We also say that the support of η is {i_1, i_2, ···, i_m}. It is easy to see that this is a bijection between the test functions Q_t^{j_1,...,j_m;i_1,...,i_m} and particle configurations. We define η^{ij} to be the configuration obtained by moving one particle from i to j. For any pair of n-particle configurations η: 1 ≤ x_1 ≤ x_2 ≤ ··· ≤ x_n ≤ N and ξ: 1 ≤ y_1 ≤ y_2 ≤ ··· ≤ y_n ≤ N, we define the distance
d(η, ξ) := Σ_{k=1}^n |x_k − y_k|.
We condition on the trajectory of the eigenvalues and define f_t(η) = f^{H_0}_{λ,t}(η) as the conditional expectation of the test function associated with the configuration η = {(i_1, j_1), ..., (i_m, j_m)}. Here λ denotes the whole path of eigenvalues for 0 ≤ t ≤ 1. The dependence on the initial matrix H_0 will often be omitted, so that we write f_t = f^{H_0}_{λ,t}. We call f_t the eigenvector moment flow; it is governed by the following generator B(t) [6, Theorem 3.1], and f_t satisfies the equation ∂_t f_t = B(t) f_t. An important property of the eigenvector moment flow is its reversibility with respect to a simple explicit equilibrium measure π, and for any function f on the configuration space, the Dirichlet form is given accordingly. We are interested in the eigenvectors corresponding to eigenvalues in the interval [E_0 − r, E_0 + r], and we only have local information on the initial matrix H_0. However, the operator B(t) has long-range interactions. We therefore fix a short-range parameter ℓ and split B(t) into a short-range part and a long-range part: B(t) = S(t) + L(t). Notice that S and L are also reversible with respect to the measure π (as in (3.8)).
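The encoding of particle configurations and the distance d(η, ξ) can be made concrete in a few lines. A sketch (0-indexed sites; the helper names are hypothetical, introduced only for illustration):

```python
# Sketch: encode m pairs {(i_k, j_k)} as an occupation vector eta on
# [[0, N-1]] (0-indexed here), list particle positions with multiplicity,
# and compute the distance d(eta, xi) = sum_k |x_k - y_k|.
def config_to_eta(pairs, N):
    eta = [0] * N
    for i, j in pairs:               # place j_k particles at site i_k
        eta[i] = j
    return eta

def positions(eta):
    # non-decreasing particle locations x_1 <= ... <= x_{N(eta)}
    return sorted(i for i, j in enumerate(eta) for _ in range(j))

def dist(eta, xi):
    x, y = positions(eta), positions(xi)
    assert len(x) == len(y)          # both must carry n particles
    return sum(abs(a - b) for a, b in zip(x, y))

eta = config_to_eta([(2, 1), (5, 2)], 6)   # one particle at 2, two at 5
xi = config_to_eta([(3, 3)], 6)            # three particles at 3
d = dist(eta, xi)                          # |2-3| + |5-3| + |5-3| = 5
```

The sorted-position representation is what makes the distance well defined on configurations rather than on labeled particles.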
We denote by U_B(s, t) (resp. U_S(s, t), U_L(s, t)) the semigroup associated with B (resp. S, L) from time s to t, i.e. ∂_t U_B(s, t) = B(t) U_B(s, t) with U_B(s, s) = Id.
For any η_* ≪ t ≪ r, in the rest of this section we fix a time t_0 and the range parameter ℓ, such that η_* ≪ t_0 ≤ t ≤ t_0 + ℓ/N, to be chosen later. We will show that the effect of the long-range operator L(t) is negligible in the sense of the L^∞ norm, i.e. f_t(η) ≈ U_S(t_0, t)f_{t_0}(η); that the short-range operator S(t) satisfies a certain finite speed of propagation estimate; and that (3.9) converges to equilibrium exponentially fast with rate N. As a consequence, f_t(η) ≈ 1 and Theorem 1.5 follows.

Finite Speed of Propagation
In this section, we fix some small parameter 0 < κ < 1 and define the efficient distance d̃ on n-particle configurations, where η: 1 ≤ x_1 ≤ x_2 ≤ ··· ≤ x_n ≤ N and ξ: 1 ≤ y_1 ≤ y_2 ≤ ··· ≤ y_n ≤ N, and the γ_i(t_0) are the classical eigenvalue locations at time t_0 (as in (1.4)).
Proof. Thanks to the Markov property of Dyson Brownian motion, the conditional law of (λ(t))_{t ≥ t_0} given λ(t_0) = λ is the same as that of Dyson Brownian motion starting at λ. In the proof, we will neglect the conditioning in (3.11) and simply think of it as a Dyson eigenvalue flow starting at λ. We denote ν = N/ℓ and r_s(η, ξ) = U_S(t_0, s)δ_η(ξ). We define a family of cut-off functions g_w, parametrized by w ∈ R, by demanding that inf_x g_w(x) = 0, and define g'_w by considering three cases. It is easy to see that for any fixed x, as a function of w, g'_w(x) is non-increasing. We take χ a smooth, nonnegative function, compactly supported on [−1, 1], with ∫χ(x)dx = 1. We also define the smoothed version of g_w, ϕ_i(x) = ∫ g_{γ_i(t_0)}(x − y)νχ(νy)dy. Then ϕ_i is smooth. We define the stopping time τ, which is the first time s ≥ t_0 such that either of the following fails: i) |m_s(z) − m_fc,s(z)| ≤ ψ(Nη)^{−1} uniformly for z ∈ D_κ; ii) |λ_i(s) − γ_i(s)| ≤ ψN^{−1} uniformly for indices i such that γ_i(s) ∈ I_r^κ(E_0). By our assumption that λ(t_0) is a good configuration, we have τ ≥ t with overwhelming probability. Recall the inverse Stieltjes transform, ρ_fc,s(E) = lim_{η→0} Im[m_fc,s(E + iη)]/π. By Proposition 2.2, the densities ρ_fc,s and N^{−1}Σ_i δ_{λ_i(s)} are bounded below and above on I_r^κ(E_0), down to the scale η ≥ ψ⁴/N. Thus there exists some universal constant C such that for any t_0 ≤ s ≤ t and any interval I centered in I_r^κ(E_0) with |I| ≥ ψ⁴/N, the density estimate (3.13) holds. For any configuration ξ with n particles we define X_s, where π is the reversible measure of the eigenvector moment flow (as in (3.8)).
In the following we prove (3.14). We decompose X_s as X_s = M_s + A_s, where M_s is a continuous local martingale with M_{t_0} = 0 and A_s is a continuous adapted process of finite variation. We denote A*_t := sup_{t_0 ≤ s ≤ t} A_s and M*_t := sup_{t_0 ≤ s ≤ t} |M_s|; then X*_t ≤ M*_t + A*_t. We bound M*_t by the Burkholder-Davis-Gundy inequality. For A*_t, since A_s is a process of finite variation, we directly upper bound ∂_s A_s. By Itô's formula we obtain the decomposition (3.18)-(3.21); the martingale part comes from (3.21). Therefore, combining with (3.17), we obtain (3.22). First, for (3.19), we need the following estimate, for |k − j| ≤ ℓ: (3.24). We assume that j < k; then there exist 1 ≤ p < q ≤ n such that y_{p−1} ≤ j < y_p (we set y_0 = 0) and y_{q−1} < k = y_q (recall y_q = y_q(ξ)). By (3.13), if λ_{y_α} ≤ E_0 − (1 − κ)r, then λ_{y_{α−1}∨j} ≤ E_0 − (1 − κ)r + Cℓ/N, and both ϕ_{x_α}(λ_{y_{α−1}∨j}) and ϕ_{x_α}(λ_{y_α}) vanish; in particular, ϕ_{x_α}(λ_{y_{α−1}∨j}) − ϕ_{x_α}(λ_{y_α}) = 0. Similarly, if λ_{y_α} ≥ E_0 + (1 − κ)r, then λ_{y_{α−1}∨j} ≥ E_0 + (1 − κ)r − Cℓ/N, and ϕ_{x_α}(λ_{y_{α−1}∨j}) − ϕ_{x_α}(λ_{y_α}) = 0. Therefore, using (3.13) again, we obtain (3.24). Combining with (3.24), it follows that (3.25) holds. For (3.20), we have the bound (3.26). For (3.21), the finite variation part is given by (3.27)-(3.29). By our choice of the cutoff function, |ϕ''_{x_α}| ≤ Cν. For (3.28), we either have λ_{y_α} ∉ I_r^{2κ}(E_0), in which case ϕ'_{x_α}(λ_{y_α}) = 0; or λ_{y_α} ∈ I_r^{2κ}(E_0), in which case, by a dyadic decomposition argument similar to (2.5), we obtain (3.30). Therefore we always have (3.28) ≤ ν log N · X_s d(s ∧ τ). (3.31) Finally, to bound (3.29), we symmetrize its summands as in (3.32), where in the last inequality we replaced ϕ'_{x_α}(λ_k) by ϕ'_{x_α}(λ_i). In all the following bounds, we consider i and k as fixed indices. We also introduce subsets A_{p,q} of configurations with n particles, for any 0 ≤ q ≤ p ≤ n. We denote by ξ̄ = (ξ̄_1, ξ̄_2, ···, ξ̄_N) the configuration exchanging all particles between sites i and k, i.e. ξ̄_i = ξ_k, ξ̄_k = ξ_i and ξ̄_j = ξ_j if j ≠ i, k.
We denote the locations of the particles of the configuration ξ̄ by 1 ≤ ȳ_1 ≤ ȳ_2 ≤ ··· ≤ ȳ_n ≤ N. Using π(ξ) = π(ξ̄), we can rewrite the sum over ξ in (3.32). For i < k, both index sets {α : y_α = k} ∪ {α : ȳ_α = k} and {α : y_α = i} ∪ {α : ȳ_α = i} have cardinality p, and the j-th largest number in the first set is larger than its counterpart in the second set. By (3.12), for any a ≤ b, we have ϕ'_a(x) ≥ ϕ'_b(x). This implies (3.33). Equations (3.33) and (3.34), together with λ_i < λ_k, give (3.35), where we used, in the second inequality, ‖ϕ'_{x_α}‖_∞ ≤ 1.
Note that transforming ξ into ξ̄ can be achieved by transferring particles from i to k (or from k to i) one at a time, at most n times. More precisely, if ξ ∈ A_{p,q} with q ≤ p − q, we can define ξ^{(j+1)} = (ξ^{(j)})^{ki}, for 0 ≤ j ≤ p − 2q. Then ξ^{(0)} = ξ and ξ^{(p−2q)} = ξ̄, where in the last inequality we used π(ξ) ≲_n π(ξ^{(j)}). Therefore we can bound (3.35), using the AM-GM inequality. Finally, we obtain the bound (3.40). We take expectations on both sides of (3.40), and again by Gronwall's inequality, (3.14) follows. This finishes the proof of (3.14).
We can take the event A_2 of trajectories (λ(s))_{0 ≤ s ≤ t} such that, conditioning on the trajectory (λ(s))_{0 ≤ s ≤ t}, the short-range operator U_S satisfies
sup_{t_0 ≤ s ≤ t} U_S(t_0, s)δ_η(ξ) ≤ e^{−2cψ}   (3.42)
for any pair of n-particle configurations η and ξ (notice that the total number of n-particle configurations is bounded by N^n) such that d̃(η, ξ) ≥ ψℓ/2. Since λ(t_0) is a good eigenvalue configuration with overwhelming probability, combined with Lemma 3.2, we know that A_2 holds with overwhelming probability.
Thanks to the semigroup property of U_S, for any (λ(s))_{0 ≤ s ≤ t} ∈ A_2 we claim that, for N large enough, the following holds: conditioning on the trajectory (λ(s))_{0 ≤ s ≤ t}, the short-range operator U_S satisfies (3.43) for any pair of n-particle configurations η and ξ such that d̃(η, ξ) ≥ ψℓ. We prove the statement by contradiction. Assume there is a pair η_0 and ξ_0 with d̃(η_0, ξ_0) ≥ ψℓ and times t_0 ≤ s' ≤ s ≤ t such that (3.43) fails. We take the function h = Σ_{η : d̃(η, η_0) ≤ ψℓ/2} δ_η on the space of n-particle configurations. By the triangle inequality, for any η such that d̃(η, η_0) ≤ ψℓ/2, we have d̃(η, ξ_0) ≥ ψℓ/2. Therefore, by (3.42), for sufficiently large N, (3.44) holds. By the same argument as for (3.44), we obtain (3.45). Noticing that U_S(t_0, s') preserves the constant function, we get a contradiction with (3.44). Therefore, we have the following corollary of Lemma 3.2: for any trajectory (λ(s))_{0 ≤ s ≤ t} ∈ A_2 as defined in (3.42), conditioning on (λ(s))_{0 ≤ s ≤ t}, the short-range operator U_S satisfies the estimate (3.46) uniformly for any function h on the space of n-particle configurations and any particle configuration ξ which is away from the support of h, in the sense that d̃(η, ξ) ≥ ψℓ for any η in the support of h.

Short time relaxation
Lemma 3.4. Under Assumption 1.3, for any η_* ≪ t ≪ r, we fix a time t_0 and the range parameter ℓ such that η_* ≪ t_0 ≤ t ≤ t_0 + ℓ/N ≪ r. The Dyson Brownian motion W_s (as in (1.2)) for 0 ≤ s ≤ t induces a measure on the space of eigenvalue and eigenvector trajectories (λ(s), u(s)) for 0 ≤ s ≤ t. The following event A of trajectories holds with overwhelming probability:
1. The eigenvalue rigidity estimates hold: sup_{t_0 ≤ s ≤ t} |m_s(z) − m_fc,s(z)| ≤ ψ(Nη)^{−1} uniformly for z ∈ D_κ; and sup_{t_0 ≤ s ≤ t} |λ_i(s) − γ_i(s)| ≤ ψN^{−1} uniformly for indices i such that γ_i(s) ∈ I_r^κ(E_0).
2. When we condition on the trajectory λ ∈ A, with overwhelming probability, the isotropic local law holds.
3. Finite speed of propagation holds: the estimate (3.46) holds uniformly for any function h on the space of n-particle configurations and any particle configuration ξ which is away from the support of h, in the sense that d̃(η, ξ) ≥ ψℓ for any η in the support of h.
Let the indices b_1, b_2, b_1 − d_1 and b_2 + d_2 be such that, among all the classical eigenvalue locations at time t_0, γ_{b_1}(t_0), γ_{b_2}(t_0), γ_{b_1−d_1}(t_0) and γ_{b_2+d_2}(t_0) are closest to E_0 − (1 − 7κ/4)r, E_0 + (1 − 7κ/4)r, E_0 − (1 − 3κ/2)r and E_0 + (1 − 3κ/2)r, respectively. We further define d = min{d_1, d_2}. We collect here some facts which will be used throughout the rest of this section:
1. m_s(z) is the Stieltjes transform of the empirical eigenvalue distribution N^{−1}Σ_{i=1}^N δ_{λ_i(s)}, and ⟨q, G(s, z)q⟩ can be viewed as the Stieltjes transform of the weighted spectral measure Σ_{i=1}^N ⟨u_i(s), q⟩² δ_{λ_i(s)}. The imaginary part of a Stieltjes transform contains full information on the spectrum. (2.2) and Lemma 3.4 imply the following statements on the averaged density of eigenvalues and eigenvectors of H_s: there exists some universal constant C such that for any t_0 ≤ s ≤ t and interval I centered in I_r^κ(E_0) with |I| ≥ ψ⁴/N, we have (3.47); in particular, C^{−1}rN ≤ d ≤ CrN; and with overwhelming probability, (3.48) holds.
Moreover, the eigenvector $u_i(s)$ is delocalized in the direction $q$ with high probability:
$$N\langle q, u_i(s)\rangle^2 \le \psi^4 \operatorname{Im}\big[\langle q, G(s, \lambda_i(s) + i\psi^4/N) q\rangle\big] \le \psi^4. \qquad (3.49)$$
We define the flattening operator $\mathrm{Flat}_a(f)$ and the averaging operator $\mathrm{Av}(f)$ on the space of functions of configurations with $n$ particles, as in (3.50). We will only use the following elementary property, where the distance $\tilde d$ is defined in (3.4).
For a general number of particles $n$, consider now the following modification of the eigenvector moment flow (3.6). We keep only the short-range dynamics (depending on the short-range parameter $\ell$) and modify the initial condition to be flat whenever there is a particle outside the interval we are interested in, i.e. $[E_0 - r, E_0 + r]$:
$$\partial_t g_t = \mathscr{S}(t) g_t, \qquad g_{t_0}(\eta) = (\mathrm{Av} f_{t_0})(\eta). \qquad (3.53)$$
For $n = 1$, we write these functions as $f_t(k)$ and $g_t(k)$ when $\eta$ is the configuration with one particle at $k$. We remind the reader that $f_t(\eta)$ can be defined either by (3.5) or as the solution of the equation (3.6).
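As a reminder of the standard form of the eigenvector moment flow of [6] (the paper's display (3.6) is not reproduced in this excerpt, so the following is a sketch of the usual generator), the flow acts on functions of particle configurations $\eta$ via nearest-eigenvalue jump rates, and the short-range generator $\mathscr{S}(t)$ keeps only the terms with $0 < |i - j| \le \ell$:

```latex
\bigl(\mathscr{B}(t) f\bigr)(\eta)
= \sum_{i \neq j} c_{ij}(t)\, 2\eta_i \,(1 + 2\eta_j)\,\bigl(f(\eta^{ij}) - f(\eta)\bigr),
\qquad
c_{ij}(t) = \frac{1}{N\,\bigl(\lambda_i(t) - \lambda_j(t)\bigr)^2},
% where eta_i is the number of particles at site i, and eta^{ij} is the
% configuration obtained from eta by moving one particle from site i to site j.
```

The short-range cutoff is what makes the finite speed of propagation estimates of Lemma 3.4 available for the modified flow.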
Before we prove our main results, we still need the following lemma giving an $L^\infty$ control on the difference of the full operator $U^B$ and the short-range operator $U^S$:

Lemma 3.5. For any eigenvalue trajectory $\lambda \in \mathcal{A}$ as defined in Lemma 3.4, we define the eigenvector moment flow as in Theorem 3.1. We have the following $L^\infty$ control on the difference of the full operator $U^B$ and the short-range operator $U^S$: (3.54), where $\xi$ is any $n$-particle configuration supported in the bulk index window.

Proof. By Duhamel's principle, we have the corresponding integral representation. For any $\eta$ corresponding to a configuration $\{(j_1, i_1), \dots, (j_m, i_m)\}$ with all sites satisfying $\lambda_{i_p}(s) \in I^r_\kappa(E_0)$, by (3.49), with overwhelming probability we have $N\langle q, u_{i_p}(s)\rangle^2 \le \psi^4$ uniformly for any $1 \le p \le m$, which leads to the a priori bound (3.55) on the eigenvector moment flow. We remark that $k \in [[1, N]]$ can be any index. Since (3.49) is local and only holds for eigenvectors corresponding to eigenvalues in the interval $I^r_\kappa(E_0)$, in general we do not have control on $N\langle q, u_k(s)\rangle^2$. However, it still follows from (3.55) that the relevant configurations satisfy $\lambda_{i_p}(s) \in I^r_\kappa(E_0)$. A dyadic decomposition similar to (2.5), combined with (3.47) and (3.48), gives the first bound; similarly, we obtain the second, and the claimed estimate follows. Notice that $\tilde d(\mathrm{supp}(\mathscr{L}(s) f_s - \mathrm{Flat}_{d+3\psi\ell}(\mathscr{L}(s) f_s)), \xi) \ge \psi\ell$. Therefore, by the finite speed of propagation (3.46) in Lemma 3.4 for $U^S$, we conclude, where in the last inequality we used that $U^S$ is a contraction in $L^\infty$. (3.54) follows, since we gain a factor $t - t_0$ from the integration over time.
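The Duhamel expansion invoked at the start of the proof can be written as follows (a sketch in the notation of the text, with $\mathscr{L}(s)$ collecting the long-range terms discarded by the cutoff):

```latex
U^{B}(t_0, t) - U^{S}(t_0, t)
= \int_{t_0}^{t} U^{B}(s, t)\,\mathscr{L}(s)\,U^{S}(t_0, s)\,\mathrm{d}s,
\qquad
\mathscr{L}(s) := \mathscr{B}(s) - \mathscr{S}(s),
% obtained by differentiating s -> U^B(s,t) U^S(t_0,s) and integrating over [t_0, t].
```

A uniform bound on $\mathscr{L}(s) f_s$ then yields the factor $t - t_0$ mentioned at the end of the proof.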
By Lemma 3.4, the event $\mathcal{A}$ holds with overwhelming probability. Theorem 1.5 follows easily from the following theorem.
Theorem 3.6. Fix any $\eta_* \ll t \ll r$. For any eigenvalue trajectory $(\lambda(s))_{0\le s\le t} \in \mathcal{A}$ defined in Lemma 3.4, let $f$ be a solution of the $\tilde n$-particle eigenvector moment flow (3.6) with initial matrix $H_0$ and eigenvalue trajectory $(\lambda(s))_{0\le s\le t}$. Then for $N$ large enough we have (3.57), where the constant $d > 0$ depends on $a$, $b$, $r$, $t$.
Proof. The proof is an induction on the number $\tilde n$ of particles. For any $\eta$ such that $\mathcal{N}(\eta) = n$ and $\eta$ is supported in the bulk window, we have (3.58), where we bounded the first term by Lemma 3.5, and the second term by the finite speed of propagation (3.46), since $f_{t_0} - \mathrm{Av} f_{t_0}$ vanishes for any admissible $\xi$. In the following we prove that $\sup_\eta |g_t(\eta) - 1| \le N^{-1}$ by a maximum principle argument. For a given $t_0 \le s \le t$, let $\tilde\eta$, corresponding to the particle configuration $\{(j_1, k_1), \dots, (j_m, k_m)\}$, be such that (3.59) holds, where we define $z_{j_p} = \lambda_{j_p} + i\eta$, and $\psi^4/N \le \eta \le \ell/N$ will be chosen later. For the first term in (3.59), we used (3.47). For the second term in (3.59), we claim that (3.60) holds for any fixed $j_p$. We can bound the left-hand side of (3.60) by (3.61) + (3.62) + (3.63). The term (3.61) will be controlled by the finite speed of propagation; (3.62) will be controlled by Lemma 3.5; and (3.63) by the isotropic local semicircle law for $\mathcal{N}(\eta) = 1$, and by induction for $\mathcal{N}(\eta) \ge 2$.
To bound (3.61), we write out the corresponding sum. For any $a \in [[1, d]]$, there are three cases to distinguish, the third being that neither of the first two occurs.
By the finite speed of propagation (3.46) in Lemma 3.4, the total mass of $U^S(t_0, s)(\cdot)$ in this case is negligible. There are at most $2 n \psi \ell$ such $a$, where $n = \mathcal{N}(\eta)$ is the total number of particles. Moreover, since $U^S$ is a contraction in $L^\infty$, we obtain the corresponding bound. In the remaining case, by (3.49), the estimate holds uniformly. Combining the three cases together, it follows that (3.64) holds, and therefore $(3.61) \le n \psi^{4n+1} \ell / d$.

To bound the term (3.62): since $\tilde\eta$ is supported in the bulk window, the relevant estimate holds for any $k$ such that $|k - j_p| \le \ell$. Therefore, by Lemma 3.5 we obtain the desired bound. Then by (3.49), uniformly for any $j$ in the support of $\tilde\eta^{j_p k}$, we have $N\langle q, u_j(s)\rangle^2 \le \psi^4$ with overwhelming probability. Therefore, $f_s(\tilde\eta^{j_p k}) \le \psi^{4n}$ for any $1 \le p \le m$. The first part of (3.63) is then bounded, where we used that $|a_{\tilde\eta^{j_p k}} - a_{\tilde\eta}| \le \tilde d(\tilde\eta, \tilde\eta^{j_p k})/d \le \ell/d$.
From the proof of Lemma 3.5, and by our choice $\eta \le \ell/N$, bounding (3.63) can be reduced to bounding the following expression. Moreover, by the definition (3.5), the first sum in this expression is (3.66). Thanks to (3.45), the sum over $k \notin \{j_1, \dots, j_m\}$ is controlled with overwhelming probability. As a result, (3.66) is bounded accordingly. Combining (3.67) and (3.65), we obtain the estimate for (3.63). Then (3.60) follows from combining the error estimates of (3.61), (3.62) and (3.63).
With the estimate (3.60), we can now prove (3.57) by induction. We choose the parameters as follows. We can take $c$ (as in the control parameter $\psi$ (1.6)) and $d$ small enough such that $d + 4\tilde n c \le b$, $4d + (16\tilde n + 2)c - 1 \le \log_N r$ and $2d + (8\tilde n + 1)c - 1 \le \log_N (t/2)$; then, for any $0 \le n \le \tilde n$, the corresponding bounds hold. Thus (3.60) can be simplified as follows: for any $1 \le n \le \tilde n$ and $t_0 \le s \le t$, the simplified estimate holds. If we plug this back into (3.59), we have either $g_s(\tilde\eta) - 1 \le N^{-1}$ or the alternative differential bound. We prove the following by induction on $n$: let $t_k = t_0 + k \psi \eta / \tilde n$ for $k = 0, 1, 2, \dots, \tilde n$; then for any time $t_n \le s \le t$, (3.70) holds. The estimate (3.70) holds trivially for $n = 0$. Assuming it holds for $n - 1$, we prove it for $n$. By induction, for any $t_{n-1} \le s \le t$ we have the induction hypothesis bound. Therefore, for any $t_n \le s \le t$, Gronwall's inequality leads to the desired estimate for $N$ large enough. Combining with (3.58), we obtain the upper bound. Similarly, by a minimum principle argument, one can show the matching lower bound, and (3.70) follows for any $0 \le n \le \tilde n$.
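The version of Gronwall's inequality used in the induction step above is the standard differential form, recalled here as a sketch:

```latex
% If phi is absolutely continuous on [t_{n-1}, t] and satisfies
\partial_s \varphi(s) \;\le\; \alpha(s) + \beta(s)\,\varphi(s), \qquad t_{n-1} \le s \le t,
% then
\varphi(s) \;\le\; \varphi(t_{n-1})\, e^{\int_{t_{n-1}}^{s} \beta(\tau)\,\mathrm{d}\tau}
\;+\; \int_{t_{n-1}}^{s} \alpha(\tau)\, e^{\int_{\tau}^{s} \beta(u)\,\mathrm{d}u}\,\mathrm{d}\tau .
```

Applied on each subinterval $[t_{n-1}, t_n]$ of length $\psi\eta/\tilde n$, this converts the pointwise differential bound at the maximizing configuration into the claimed improvement of the maximum.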
Proof of Corollary 1.6. By taking $q$ supported on the $i$-th and $j$-th coordinates in Theorem 1.5, we know that for any $k$ such that $\lambda_k(t) \in I^r_{2\kappa}(E_0)$, $u_{ki}^2(t)$ and $u_{kj}^2(t)$ are asymptotically jointly normal. A second moment calculation yields the required bound. By Theorem 1.5, the first term on the right-hand side is bounded by $C N^{-d}$, and the second term is bounded by $C/a_1$, where $C$ is a universal constant. The Markov inequality then allows us to conclude the proof of (1.12).

Proof of Theorem 1.1
For the proof of Theorem 1.1, we follow the three-step strategy as in [2,23], where it was proved, for sparse Erdős-Rényi graphs in the regime $N^\delta \le p \le N/2$ in [23] and for $p$-regular graphs in the regime $N^\delta \le p \le N^{2/3-\delta}$ in [2], that in the bulk of the spectrum the local eigenvalue correlation functions and the distribution of the gaps between consecutive eigenvalues coincide with those of the Gaussian orthogonal ensemble. We prove Theorem 1.1 for Erdős-Rényi graphs; the proof for $p$-regular graphs is similar, and we only remark on the differences.
Before the proof of Theorem 1.1, we recall some definitions and notations from [23].
Definition 4.1. Let $A$ be an $N \times N$ deterministic real symmetric matrix. We denote the eigenvalues of $A$ by $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$, with corresponding eigenvectors $u_1, u_2, \cdots, u_N$. For any small parameter $c > 0$ and unit vector $q \in \mathbb{R}^N$, we call the matrix $A$ $(c, q)$-general if there exists a universal constant $C$ such that the defining bounds hold. We recall the quantity $Q_i$ on the space of symmetric $N \times N$ real matrices. For any $N \times N$ matrix $A$, if $\lambda_i$ is a simple eigenvalue of $A$, then $Q_i(A)$ is given by (4.1). This quantity plays an important role in [34,35], where it was observed that $Q_i(A)$ captures quantitatively the derivatives of the eigenvalue $\lambda_i$ of $A$.
where $\partial_{ab}$ is the derivative with respect to the $(a, b)$-th entry of $A$.
Proof. The first two estimates (4.3) and (4.4) are proved in [23, Proposition 4.6]. The proof of (4.5) is analogous. We denote by $G = (A - z)^{-1}$ the resolvent of $A$, and by $V$ the matrix whose entries are zero everywhere except at the $(a, b)$ and $(b, a)$ positions, where they equal one. For the derivative of eigenvectors (4.5), we use the following contour integral formula, where the contour encloses only $\lambda_i$. Then (4.5) follows from estimates analogous to those in the proof of [23, Proposition 4.6], for example thanks to the delocalization of the eigenvectors of $A$ in the directions $e_a$, $e_b$ and $q$.
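One standard form of such a contour formula (a sketch; the paper's exact display is not reproduced in this excerpt) expresses the spectral projector and its derivative via the resolvent identity $\partial_{ab} G(z) = -G(z)\,V\,G(z)$:

```latex
u_i u_i^{*} \;=\; -\frac{1}{2\pi i} \oint_{\Gamma_i} G(z)\,\mathrm{d}z,
\qquad
\partial_{ab}\bigl(u_i u_i^{*}\bigr) \;=\; \frac{1}{2\pi i} \oint_{\Gamma_i} G(z)\, V\, G(z)\,\mathrm{d}z,
% where Gamma_i is a small counterclockwise contour enclosing only lambda_i.
```

Pairing the second identity with $q$ reduces the derivative of $\langle q, u_i\rangle^2$ to resolvent entries of the form $\langle q, G(z) e_a\rangle \langle e_b, G(z) q\rangle$, which are controlled by the delocalization bounds mentioned above.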
Recall that $H$ is the normalized adjacency matrix of an Erdős-Rényi graph, as given in Section 1. We define the following matrix stochastic differential equation, which is an Ornstein-Uhlenbeck version of the Dyson Brownian motion. The dynamics of the matrix entries are given by the stochastic differential equations, where $W_t = (w_{ij}(t))_{1\le i\le j\le N}$ is symmetric with $(w_{ij}(t))_{1\le i\le j\le N}$ a family of independent Brownian motions of variance $(1 + \delta_{ij})t$. We denote $H_t = (h_{ij}(t))_{1\le i,j\le N}$, so that $H_0 = H$ is our original matrix. More explicitly, for the entries of $H_t$, we have the explicit solution formula. Clearly, for any $t \ge 0$ and $i < j$, we have $\mathbb{E}[h_{ij}(t)] = f$ and $\mathbb{E}[(h_{ij}(t) - f)^2] = 1/N$. More importantly, the law of $h_{ij}(t)$ is Gaussian divisible, i.e. it contains a copy of a Gaussian random variable of variance $O(t N^{-1})$. Therefore $H_t$ can be written as a sum, where $G$ is a standard Gaussian orthogonal ensemble, i.e. $G = (g_{ij})_{1\le i\le j\le N}$ is symmetric with $(g_{ij})_{1\le i\le j\le N}$ a family of independent Gaussian random variables of variance $(1 + \delta_{ij})/N$, independent of $\tilde H_t$. Proposition 4.3. For $N^\delta \le p \le N/2$, we fix $0 < b \le \delta/3$. Then for $0 \le s \ll 1$, any unit vector $q \in \mathbb{R}^N$, and $N$ large enough, the following hold.
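The Ornstein-Uhlenbeck dynamics and the resulting Gaussian-divisible decomposition can be sketched entrywise as follows; the precise centering of the drift is an inference from the stated mean and variance, not a quotation of the paper's display:

```latex
% Entrywise OU dynamics (drift centering inferred from the stated moments), i < j:
\mathrm{d}h_{ij}(t) \;=\; \frac{\mathrm{d}w_{ij}(t)}{\sqrt{N}} \;-\; \frac{1}{2}\,\bigl(h_{ij}(t) - f\bigr)\,\mathrm{d}t,
% whose solution satisfies, in distribution,
h_{ij}(t) \;\stackrel{d}{=}\; f + e^{-t/2}\bigl(h_{ij}(0) - f\bigr) + \sqrt{1 - e^{-t}}\; g_{ij},
\qquad g_{ij} \sim \mathcal{N}(0, 1/N),
% so that
\mathbb{E}[h_{ij}(t)] = f, \qquad
\mathbb{E}\bigl[(h_{ij}(t) - f)^2\bigr] = \frac{e^{-t}}{N} + \frac{1 - e^{-t}}{N} = \frac{1}{N},
% with a Gaussian component of variance (1 - e^{-t})/N = O(t N^{-1}) for small t.
```

This makes the Gaussian divisibility quantitative: the mean $f$ and variance $1/N$ are preserved exactly, while a Gaussian component of variance $O(tN^{-1})$ is injected.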
1. For any $c > 0$, with overwhelming probability, $H_s$ is $(c, q)$-general in the sense of Definition 4.1.
Thanks to Proposition 4.3, with overwhelming probability, $\tilde H$ satisfies Assumptions 1.3 and 1.4. In (4.8), if we condition on such good initial data $\tilde H$, the eigenvectors of $H_t$ are asymptotically normal with overwhelming probability with respect to the randomness of $G$ (as in (4.8)). If we then take the expectation with respect to $\tilde H$, the following proposition follows.
where $u_i(t)$ is the eigenvector of $H_t$ corresponding to the $i$-th eigenvalue, and the $\mathcal{N}_i$ are independent standard normal random variables.
Proof of Theorem 1.1. For simplicity of notation, we only state the proof for the case $n = 1$, i.e. we fix the time $t = N^{4b-1}$ and prove the claim for any bulk index $i$. Let $m$ be the degree of $P$; then $|P(x)| \le C x^m$ for $x \ge 1$. By (4.3), $H_s$ is $(c, q)$-general; in particular, with overwhelming probability $N\langle q, u_i(s)\rangle^2 \le C N^c$ for $s = 0, t$. Therefore $\mathbb{E}[P^2(N\langle q, u_i(s)\rangle^2)] \le C N^{2mc}$, and we conclude, where $\mathcal{N}$ is a standard normal random variable.
The proof for the case of $p$-regular graphs is analogous. Let $H$ be the normalized adjacency matrix of a $p$-regular graph as in Section 1. The isotropic local law for $H$ was proved in [3]. The adjacency matrix $A$ of a $p$-regular graph is subject to the hard constraint that its rows and columns sum to $p$ (i.e. it has the eigenvector $e = (1, 1, \cdots, 1)^*/\sqrt{N}$). Therefore, instead of using the usual Dyson Brownian motion (4.8) as in the Erdős-Rényi graph case, we use the constrained Dyson Brownian motion (as introduced in [3, Definition 2.2]), which is the Dyson Brownian motion constrained to the subspace of symmetric matrices whose row and column sums vanish. Let $H_t$ be the constrained Dyson Brownian motion at time $t$ with initial data $H_0 = H$. We denote its eigenvalues $\lambda_1(t) \le \lambda_2(t) \le \cdots \le \lambda_{N-1}(t) \le \lambda_N(t) = p/\sqrt{p-1}$, with corresponding eigenvectors $u_1(t), u_2(t), \cdots, u_{N-1}(t), u_N(t) = e$. Up to a change of basis, the constrained Dyson Brownian motion is equivalent to the usual $(N-1)$-dimensional Dyson Brownian motion normalized by $N$ rather than by $N-1$. More concretely, let $P$ be an isomorphism from $e^\perp$ to $\mathbb{R}^{N-1}$. Once we identify $e^\perp$ with $\mathbb{R}^{N-1}$ using $P$, the constrained Dyson Brownian motion is the same as the usual $(N-1)$-dimensional Dyson Brownian motion, where $G = (g_{ij})_{1\le i\le j\le N-1}$ is symmetric with $(g_{ij})_{1\le i\le j\le N-1}$ a family of independent Gaussian random variables of variance $(1 + \delta_{ij})/N$. Since $u_i(t) \perp e$, the vectors $P u_i(t)$ for $i \in [[1, N-1]]$ are eigenvectors of $P H_t P^*$. Thus $P u_i(t)$ for $i \in [[1, N-1]]$ have the same distribution as the eigenvectors of $e^{-t/2} P H_0 P^* + \sqrt{1 - e^{-t}}\, G$. Thanks to Theorem 1.5, for $t \gg 1/N$, the bulk eigenvectors of $e^{-t/2} P H_0 P^* + \sqrt{1 - e^{-t}}\, G$ are asymptotically normal in the direction $P q$, which is a unit vector in $\mathbb{R}^{N-1}$ since $q \perp e$. Noticing that $\langle P q, P u_i(t)\rangle = \langle q, u_i(t)\rangle$, we conclude that the bulk eigenvectors of $H_t$ are asymptotically normal in the direction $q$.
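The specific choice of $P$ is not essential; any matrix with the following properties works (a characterization supplied here for concreteness, not necessarily the paper's own display):

```latex
% P is an (N-1) x N matrix whose rows form an orthonormal basis of e^perp:
P P^{*} = I_{N-1},
\qquad
P^{*} P = I_{N} - e\, e^{*},
% so that for any q, u \perp e,
\langle P q, P u \rangle = \langle q, (I_N - e e^{*})\, u \rangle = \langle q, u \rangle,
% and P H_t P^* carries the spectrum of H_t restricted to e^perp.
```

These identities are exactly what is used at the end of the paragraph: $P$ preserves inner products on $e^\perp$, so normality of $\langle Pq, Pu_i(t)\rangle$ transfers to $\langle q, u_i(t)\rangle$.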
The same argument as in the proof of the Erdős-Rényi case, combined with the continuity estimate [2, Proposition 3.1], implies that the law of $\{\langle q, u_i(0)\rangle\}_{i = i_1, i_2, \cdots, i_n}$ is asymptotically the same as that of $\{\langle q, u_i(t)\rangle\}_{i = i_1, i_2, \cdots, i_n}$. Thus the claim of Theorem 1.1 follows.