Mesoscopic fluctuations for unitary invariant ensembles

Considering a determinantal point process on the real line, we establish a connection between the sine-kernel asymptotics for the correlation kernel and the CLT for mesoscopic linear statistics. This implies universality of mesoscopic fluctuations for a large class of unitary invariant Hermitian ensembles. In particular, this shows that the support of the equilibrium measure need not be connected in order to see Gaussian fluctuations at mesoscopic scales. Our proof is based on the cumulant computations introduced in [48] for the CUE and the sine process and on the asymptotic formulae derived by Deift et al. in [12]. For varying weights $e^{-N \operatorname{Tr} V(H)}$, in the one-cut regime, we also provide estimates for the variance of linear statistics $\operatorname{Tr} f(H)$ which are valid for a rather general function $f$. In particular, this implies that the characteristic polynomials of such Hermitian random matrices converge in a suitable regime to a regularized fractional Brownian motion with logarithmic correlations defined in [19]. For the GUE kernel, we also discuss how to obtain the necessary sine-kernel asymptotics at mesoscopic scales by elementary means.


Introduction and results
A point process on $\mathbb{R}$ is called determinantal if its correlation functions (with respect to the Lebesgue measure) exist and are given by
$$\rho_n(x_1,\dots,x_n) = \det\big[K(x_i,x_j)\big]_{i,j=1}^{n}, \tag{1.1}$$
where $K : \mathbb{R}\times\mathbb{R} \to \mathbb{R}$ is called the correlation kernel. These processes arise in random matrix theory to describe the eigenvalues of the so-called unitary (invariant) ensembles; see theorem 1.1 below and section 3 for more details. There are many other interesting examples, such as random tilings or the positions of non-colliding stochastic diffusions, that we will not discuss here. We refer to [49,26,22,1,40] for various introductions to the general theory and further examples. A fundamental feature of determinantal processes is that all the information about the process is encoded in the correlation kernel. For instance, for unitary invariant ensembles, universality of the local correlations in the bulk of the spectrum follows from the convergence of the rescaled correlation kernel to the sine-kernel, [11,44,47]. In this work, we show that, at mesoscopic scales, the sine-kernel asymptotics still hold and that this leads to the following central limit theorem.
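Theorem 1.1. Let $V : \mathbb{R} \to \mathbb{R}$ be real-analytic such that
$$\lim_{|x|\to\infty} \frac{V(x)}{\log(1+x^2)} = +\infty, \tag{1.2}$$
and consider the probability measure $dP^V_N = Z_{V,N}^{-1} e^{-N \operatorname{Tr} V(H)} dH$ on the space of $N \times N$ Hermitian matrices equipped with the Lebesgue measure $dH$. If $(\lambda_1,\dots,\lambda_N)$ denote the eigenvalues of a random matrix $H$ distributed according to $P^V_N$, then for any $x_0 \in J_V$, any $0 < \alpha < 1$, and for any $f \in C^1(\mathbb{R})$ with compact support, we have as $N \to \infty$,
$$\sum_{k=1}^{N} f\big(N^\alpha(\lambda_k - x_0)\big) - \mathbb{E}\Big[\sum_{k=1}^{N} f\big(N^\alpha(\lambda_k - x_0)\big)\Big] \Rightarrow \mathcal{N}\big(0, \|f\|^2_{H^{1/2}}\big). \tag{1.3}$$
Proof. Section 3.3.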
The condition (1.2) guarantees that $Z_{V,N} < \infty$, so that the measure $P^V_N$ is well-defined. It also implies that, for large $N$, the eigenvalue process is supported in a fixed compact set $J_V$ with high probability; see formula (1.5) below. Hence, the potential $V$ is confining and the condition $x_0 \in J_V$ means that we zoom in around a point $x_0$ which lies in the bulk of the spectrum.
In theorem 1.1, the parameter $\alpha \in [0,1]$ is called the scale. When $\alpha = 0$, the l.h.s. of (1.3) depends on the whole spectrum of $H$ and this regime is called global or macroscopic. On the other hand, since the eigenvalue density is of order $N$ in the bulk, when $\alpha = 1$ the distance between consecutive rescaled eigenvalues remains of order 1 as the size $N$ of the matrix tends to infinity, and this regime is called local or microscopic. Any intermediate scale, $0 < \alpha < 1$, is called mesoscopic. In this regime, the limit (1.3) is independent of the potential $V$, the scale $\alpha$ and the point $x_0$. Hence, this establishes universality of mesoscopic fluctuations for a large class of Hermitian random matrix ensembles.
The variance in formula (1.3) is given by
$$\|f\|^2_{H^{1/2}} := \int_{\mathbb{R}} |u|\,|\hat f(u)|^2\,du, \tag{1.4}$$
where $\hat f(u) = \int_{\mathbb{R}} f(x) e^{-i2\pi x u}\,dx$ denotes the Fourier transform of $f$. Modulo constant functions, the norm (1.4) defines a complete subspace of $L^2(\mathbb{R})$ denoted by $H^{1/2}(\mathbb{R})$. Most of the work on unitary invariant ensembles has focused on the asymptotics of local or global statistics and we briefly review the main results; further references can be found in the textbooks [11,44]. Under the assumptions of theorem 1.1, there exists a probability density $\varrho_V$ with compact support $J_V$ on $\mathbb{R}$ such that, for any $f \in C \cap L^\infty(\mathbb{R})$, the normalized linear statistic $N^{-1}\sum_k f(\lambda_k)$ converges to $\int f(x)\varrho_V(x)\,dx$ as $N \to \infty$; this is formula (1.5). It means that, for large $N$, the eigenvalues of a random matrix sampled according to $P^V_N$ are distributed according to the equilibrium density $\varrho_V$. Moreover, it is known that the fluctuations around this equilibrium configuration remain bounded as $N \to \infty$. The precise behavior of linear statistics depends on the support of $\varrho_V$. In the simplest case, there exist $x_0 \in \mathbb{R}$ and $\ell > 0$ so that
$$J_V = (x_0 - \ell, x_0 + \ell); \tag{1.6}$$
the potential $V$ is then said to satisfy the one-cut condition and we have a CLT, formula (1.7), whose limiting variance $\Sigma(f)^2$ is given by formula (1.8). Theorem (1.7) was first proved in [25] when $V$ is a polynomial of even degree using a variational method. We refer to [2] for further developments and to [9,33] for alternative proofs which are valid for more general determinantal processes. It is known that (1.6) holds when the potential $V$ is strictly convex on $\mathbb{R}$ and, if $\tilde V(x) = V(x_0 + \ell x)$, by considering the ensemble $P^{\tilde V}_N$ instead of $P^V_N$, we can always assume that $x_0 = 0$ and $\ell = 1$. The one-cut condition is crucial to observe a Gaussian process in the limit. If $J_V = \operatorname{supp}(\varrho_V)$ is not connected, then for a generic test function $f$, the behavior of the linear statistic $\sum_k f(\lambda_k)$ is quasi-periodic in $N$ and, even though this sequence of random variables is tight, it has no limit as $N \to \infty$, see [42]. This complicated behavior is explained by the fact that the number of eigenvalues in the different components of $J_V$ fluctuates.
Nevertheless, along any subsequence where it converges, the asymptotic distribution of $\sum_k f(\lambda_k)$ can still be described and it is not Gaussian in general, [3]. On the other hand, at the local scale, the behavior of the eigenvalue process is independent of the equilibrium density and it is described by the sine process in the bulk. Theorem 1.1 shows that mesoscopic fluctuations are universal as well. Actually, this result was first derived heuristically by Pastur in [42], also based on the semiclassical asymptotic formulae derived in [12] for the orthogonal polynomials with respect to the measure $e^{-NV(x)}dx$ on $\mathbb{R}$.
Mesoscopic spectral statistics were first considered in [6,7] for Hermitian and symmetric Wigner matrices. In particular, the authors proved a result analogous to (1.3) for the GUE and GOE using the resolvent as a test function. One of the pioneering works on the subject, which inspired this article, is Soshnikov's CLT for eigenvalue statistics of Haar distributed random matrices from the compact groups, [48]. In the case of the unitary group, this probability measure is known as the Circular Unitary Ensemble (CUE) and it is determinantal with the correlation kernel
$$K_{U_N}(x,y) = \frac{\sin\big(N(x-y)/2\big)}{2\pi \sin\big((x-y)/2\big)}, \qquad x, y \in \mathbb{R}/2\pi\mathbb{Z}. \tag{1.9}$$
For this point process, Soshnikov obtained the counterpart of (1.3), which can be seen as a continuous analogue of the strong Szegő theorem. A special case of theorem 1.1 was recently established in [19,37] for the Gaussian Unitary Ensemble (GUE). The authors of [19] proved that a suitable regularization of the characteristic polynomial of a GUE matrix converges weakly to a certain fractional Brownian motion which is logarithmically correlated, see section 4. From Theorem 2.4 of [19], one can infer the CLT for mesoscopic linear statistics of any Schwartz-class test function. In [37], analogous results are proved for Hermitian Wigner matrices with sub-Gaussian entries, extending the results of [7]. For a class of determinantal processes known as orthogonal polynomial ensembles, an alternative approach to universality, which is discussed in section 3.1, appeared in [9,10]. In particular, the authors obtained the counterpart of (1.3) for another family of unitary invariant ensembles, see theorem 3.9 and remark 3.3. All these results have the following interpretation. If we view the rescaled eigenvalue process as a random measure $\Xi^{x_0,\alpha}_N$, formula (1.10), then, once centered, $\Xi^{x_0,\alpha}_N$ converges in distribution to a Gaussian process $G$ whose covariance structure is given by formula (1.11). The random process $G$ is called the $H^{1/2}$-Gaussian field, see [24] chap. 1.
Its special feature is that it is scale invariant. If f η (x) = f (ηx), then G(f η ) ∼ G(f ) for any η > 0, as can be seen from formula (1.11). Heuristically, this motivates why it is expected to describe mesoscopic fluctuations of point processes with strong repulsion such as eigenvalues of random matrices, see the discussion in [50]. In some respect, these ensembles behave like the sine process and this is the idea behind the proof of theorem 1.2 below. The mesoscopic correlations can also be guessed from formulae (1.7 -1.8). Namely, if x 0 = 0 and ℓ = 1, by a change of variables and, if f decays sufficiently fast, we obtain It is natural to investigate whether (1.3) holds under the optimal condition f ∈ H 1/2 ∩ L 1 (R).
To the author's knowledge, the question remains open even for the Gaussian Unitary Ensemble (GUE). To some extent, this issue is addressed in section 3.2. In particular, if the potential $V$ satisfies the one-cut condition (1.6), we derive an upper bound for the variance of the random variable $\Xi^{x_0,\alpha}_N f$ which is valid e.g. for any function $f \in H^{1/2}(\mathbb{R})$ with compact support, cf. proposition 3.3. This easily allows us to generalize theorem 1.1, cf. theorem 3.6. As a corollary, we establish in section 4 that the result of [19] is also valid for the characteristic polynomial of a random matrix from an arbitrary unitary invariant ensemble $P^V_N$ in the one-cut regime.
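The scale invariance of the $H^{1/2}$ seminorm mentioned above can be checked numerically. The following is a minimal sketch, assuming the seminorm is $\int |u|\,|\hat f(u)|^2\,du$ with the Fourier convention $\hat f(u) = \int f(x) e^{-i2\pi x u}dx$ stated in the text; the function names, the grid parameters and the Gaussian test function are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian(x):
    """Illustrative test function f(x) = exp(-x^2)."""
    return np.exp(-x**2)

def h_half_seminorm_sq(f, eta=1.0, half_width=80.0, n=2**17):
    """Approximate int |u| |fhat(u)|^2 du for the dilated function x -> f(eta * x).

    fhat(u) = int f(x) exp(-2 pi i x u) dx is approximated by a Riemann sum
    evaluated with the FFT; only |fhat| is used, so the phase coming from the
    left end point of the grid is irrelevant.
    """
    x = np.linspace(-half_width, half_width, n, endpoint=False)
    dx = x[1] - x[0]
    u = np.fft.fftfreq(n, d=dx)          # frequencies in cycles per unit length
    fhat = dx * np.fft.fft(f(eta * x))   # Riemann-sum Fourier transform
    du = 1.0 / (n * dx)                  # spacing of the frequency grid
    return np.sum(np.abs(u) * np.abs(fhat) ** 2) * du

# The values should agree across dilations eta, up to discretization error.
values = [h_half_seminorm_sq(gaussian, eta) for eta in (0.5, 1.0, 2.0, 4.0)]
```

For the Gaussian above, a direct computation with this convention gives $\hat f(u) = \sqrt{\pi}\,e^{-\pi^2 u^2}$ and hence a seminorm of $1/(2\pi)$, which the numerical values should reproduce for every dilation.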
The proof of theorem 1.1 is based on the so-called Plancherel-Rotach asymptotics for the orthogonal polynomials (OPs) with respect to the weight $e^{-NV(x)}$ on $\mathbb{R}$ derived in [12] and on the following general result. For any locally integrable function $\rho : \mathbb{R} \to \mathbb{R}_+$, we let
$$J_\rho := \{t \in \mathbb{R} : \rho(t) > 0 \text{ and } \rho(t) \text{ is continuous}\} \tag{1.12}$$
and, for all $x \in \mathbb{R}$, we define
$$F_\rho(x) := \int_0^x \rho(t)\,dt. \tag{1.13}$$
We also denote by $C^k_0(\mathbb{R})$ the space of compactly supported real-valued functions with $k$ continuous derivatives on $\mathbb{R}$.
Theorem 1.2. Consider a determinantal process on R with a correlation kernel K N which is locally trace-class. For any x 0 ∈ R, α ∈ [0, 1] and f ∈ C 0 (R), we consider the linear statistic where the sum is over the point configuration {λ k } of the process. Assume that there exists a function ρ : R → R + , x 0 ∈ J ρ , α ∈ (0, 1] and β > 0 such that for any L > 0, (1.14) uniformly for all ξ, ζ ∈ [−L, L]. Then, if α < 1, for any f ∈ C 1 0 (R), we have as N → ∞, On the other hand, if α = 1, for any f ∈ C 0 (R), we have as N → ∞, Proof. Section 2.
For any $\nu > 0$, $\Xi^{\sin}_\nu$ denotes the sine process with density $\nu$, i.e. the determinantal process on $\mathbb{R}$ with the correlation kernel
$$K^{\sin}_\nu(x,y) = \frac{\sin\big(\pi\nu(x-y)\big)}{\pi(x-y)}. \tag{1.17}$$
At the local scale, the limit (1.16) implies the convergence of the process $\Xi^{x_0,1}_N$ to the sine process. This behavior is known to be universal for Hermitian ensembles. In the context of theorem 1.1, it was proved in [43,12,34,35]. The assumption that the kernel $K_N$ is locally trace-class is standard: it means that, for any function $f \in L^\infty(\mathbb{R})$ with compact support, the corresponding integral operator is trace-class on $L^2(\mathbb{R})$. For instance, this implies that the linear statistic $\sum_k f(\lambda_k)$ has a finite Laplace transform and that its cumulants are well-defined, see formula (2.4) below. Note that we do not assume that the kernel $K_N$ is reproducing. In particular, the configuration $\{\lambda_k\}$ may have a random number of points, or infinitely many. Hence, theorem 1.2 can be applied beyond the context of unitary ensembles.
It is obvious that the CUE kernel (1.9) has an asymptotic expansion of the form (1.14) with $\rho = 1/2\pi$, so that Soshnikov's CLT is a special case of theorem 1.2. Our main observation is that, if the correlation kernel satisfies (1.14), then we can still apply Soshnikov's method to prove a central limit theorem, see lemma 2.2. In particular, the fact that the limiting process is Gaussian follows from the Main combinatorial Lemma of [48], theorem 2.5. For determinantal processes within the sine process universality class, it is plausible that the asymptotics (1.14) holds at scales $\alpha$ which are sufficiently close to 1, so that theorem 1.2 explains the appearance of the $H^{1/2}$-Gaussian field $G$ in this context. This also emphasizes the connection with the Main combinatorial Lemma. However, the general mechanism behind universality of mesoscopic fluctuations is still far from being understood. In particular, it would be interesting to understand further the connection between random matrix theory and logarithmically correlated Gaussian fields as discussed in [19,37]. Within other symmetry classes and for Dyson's β-ensembles, mesoscopic correlations are also conjectured to be described by the $H^{1/2}$-Gaussian field. For instance, this has been rigorously established for the Gaussian β-ensembles in [4], for random matrices from the special orthogonal and symplectic groups in [48] and, in number theory, for mesoscopic linear statistics of the zeros of the Riemann zeta function [5,46]. There are also examples of determinantal processes where the precise asymptotics of the correlation kernel is not known, but the CLT (1.15) has been proved by other means, e.g. for non-colliding Brownian motions in [14].
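The claim that the CUE kernel (1.9) is of sine-kernel type with $\rho = 1/2\pi$ can be illustrated numerically. The sketch below is only a sanity check under an assumed reading of the rescaling: it compares $N^{-\alpha} K_{U_N}(x_0 + \xi N^{-\alpha}, x_0 + \zeta N^{-\alpha})$ with the sine kernel of density $\nu_N = N^{1-\alpha}/(2\pi)$ evaluated at $(\xi, \zeta)$. The function names and the numerical parameters are hypothetical choices.

```python
import numpy as np

def cue_kernel(n, x, y):
    """CUE kernel sin(n(x-y)/2) / (2 pi sin((x-y)/2)), assuming |x - y| < 2 pi."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    s = np.sin(d / 2.0)
    on_diag = np.abs(s) < 1e-14
    # The singularity at x = y is removable, with value n / (2 pi).
    ratio = np.sin(n * d / 2.0) / np.where(on_diag, 1.0, s)
    return np.where(on_diag, float(n), ratio) / (2.0 * np.pi)

def sine_kernel(nu, x, y):
    """Sine kernel with density nu: sin(pi nu (x-y)) / (pi (x-y))."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return nu * np.sinc(nu * d)  # np.sinc(t) = sin(pi t) / (pi t)

# Mesoscopic window of width ~ N^(-alpha) around an arbitrary point x0.
N, alpha, x0 = 10_000, 0.5, 1.0
xi = np.linspace(-2.0, 2.0, 41)
rescaled_cue = N ** (-alpha) * cue_kernel(N, x0 + xi * N ** (-alpha), x0)
reference = sine_kernel(N ** (1.0 - alpha) / (2.0 * np.pi), xi, 0.0)
max_error = np.max(np.abs(rescaled_cue - reference))
```

Since the only approximation involved is $\sin(t) \approx t$ for $t = N^{-\alpha}(\xi - \zeta)/2$, the discrepancy is expected to be of order $N^{-2\alpha}$ on a fixed window.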
In this article, we focus on applications to random matrices, but theorem 1.2 should be useful to investigate mesoscopic fluctuations for other instances of determinantal processes. Based on the Riemann-Hilbert formulation of [17,13], it is possible to derive very precise asymptotics for the orthogonal polynomials and the Christoffel-Darboux kernels for a large class of measures on $\mathbb{R}$. These results, combined with theorem 1.2, allow us to prove universality of the mesoscopic correlations for the classical random matrix ensembles. For the GUE, it is possible to derive the asymptotics (1.14) using only the Plancherel-Rotach asymptotics for the Hermite polynomials, [45], and this leads to a rather elementary proof of theorem 1.1.
The rest of the paper is organized as follows. In section 2, we review the cumulant method introduced in [48] to study linear statistics of determinantal processes and we prove theorem 1.2. The proof relies on ideas developed in [27]. In section 3.1, we give a brief introduction to the theory of unitary ensembles focusing on the orthogonal polynomial method. In section 3.2, we provide some estimates for the variance of linear statistics of orthogonal polynomial ensembles, both in the global and mesoscopic regimes. In section 3.3, we review the results of [12] on the asymptotics of the Christoffel-Darboux kernels for varying exponential weights. This provides the necessary asymptotics to prove theorem 1.1. In section 3.4, we discuss another family of unitary ensembles known as the modified Jacobi ensembles. The asymptotics of their correlation kernels is derived using the results of [31,32] and we deduce a CLT in this case as well, theorem 3.9. In section 4, we apply the mesoscopic CLT to generalize the result of [19] to a large class of unitary invariant ensembles in the one-cut regime. In section 5, we present an elementary approach to obtain the sine-kernel asymptotics at mesoscopic scales for the GUE kernel; see theorem 5.5. In appendix A, we generalize an estimate obtained in section 3.2 for the variance of global linear statistics; further motivations and applications are given in [33].

Proof of theorem 1.2
We consider a family of determinantal processes on R with correlation kernels K N which depend on a parameter N > 0. We want to study the law of the random variable as N → ∞, where {λ k } is a configuration of the determinantal process with kernel K N and f ∈ C 1 0 (R) is a test function. We will assume that supp(f ) ⊂ [−L, L].
For any real-valued random variable Z with a well-defined Laplace transform, its cumulants C n [Z] are defined by the generating function: Using formula (1.1), one can compute moments and cumulants of linear statistics of determinantal processes. In particular, it was proved in [48] that, if the correlation kernel is locally trace-class and f ∈ C 0 (R), then for any n ∈ N, where In the following, we suppose that K N satisfies (1.14) for a given function ρ : R → R + and we let J = J ρ and F = F ρ according to (1.12), respectively (1.13). We also assume that J is non-empty, fix a point x 0 ∈ J and, for any ξ, ζ ∈ R, we let Then, by (2.1), (2.4) and a change of variables, we get It was observed in [27] that, if the correlation kernel K N satisfy the uniform asymptotics (1.14), then we can relate its cumulants to those of the sine process as N → ∞. In particular, lemma 2.1 which is the main ingredient to prove proposition 2.2 below is a straightforward adaptation of lemma 2.6 in [27].
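Since the argument rests on identifying a Gaussian limit through its cumulants, a small numerical illustration may help. The sketch below assumes the standard normalization $\log \mathbb{E}[e^{tZ}] = \sum_{n \ge 1} C_n[Z]\, t^n/n!$ for the cumulant generating function and uses the classical moment-to-cumulant recursion; the function name is a hypothetical helper and is not part of the cumulant formula of [48].

```python
from math import comb

def cumulants_from_moments(moments):
    """Convert raw moments into cumulants.

    `moments[k]` is E[Z^k] for k = 0, ..., n, with moments[0] = 1.
    Uses the recursion
        m_j = sum_{i=1}^{j} C(j-1, i-1) * kappa_i * m_{j-i},
    solved for kappa_j. Returns [kappa_1, ..., kappa_n].
    """
    n = len(moments) - 1
    kappa = [0.0] * (n + 1)
    for j in range(1, n + 1):
        correction = sum(
            comb(j - 1, i - 1) * kappa[i] * moments[j - i] for i in range(1, j)
        )
        kappa[j] = moments[j] - correction
    return kappa[1:]

# Centered Gaussian with variance s2: raw moments 1, 0, s2, 0, 3 s2^2, 0, 15 s2^3.
s2 = 2.0
gaussian_cumulants = cumulants_from_moments(
    [1.0, 0.0, s2, 0.0, 3 * s2**2, 0.0, 15 * s2**3]
)

# Poisson with mean 2: raw moments 1, 2, 6, 22, 94.
poisson_cumulants = cumulants_from_moments([1.0, 2.0, 6.0, 22.0, 94.0])
```

For the Gaussian, all cumulants of order $n \ge 3$ vanish, which is the property used to identify the limit in the CLT; for the Poisson distribution, every cumulant equals the mean, which gives a simple non-Gaussian contrast.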
Lemma 2.1. We consider two families of kernels (S N ) N >0 and (S N ) N >0 on R. If there exist β > 0, L > 0, and a function Γ N : R → R + such that when N is sufficiently large: Then, for all ǫ > 0, ℓ ∈ N, and for any functions f N,1 , . . . , f N,ℓ with support in [−L, L] such that sup f N,k ∞ : k = 1, . . . , ℓ} ≤ C ℓ , we have We let The asymptotics (1.14) implies that the kernelsK N and S N satisfy condition (1) of lemma 2.1. We claim that, if C 0 > 0 is sufficiently large, conditions (2) and (3) hold as well, so that we obtain for any m 1 , . . . , m ℓ ∈ N, By (2.7), it is straightforward to check that for any C 0 > 0, so that condition (3) holds. To check condition (2), note that by definition of J, (1.12), for any 0 < ǫ 0 < 1/2, there exists δ 0 > 0 so that the density ρ is continuous on [x 0 − δ 0 , x 0 + δ 0 ] and for all |x − x 0 | < δ 0 , If N α > L/δ 0 and C 0 ≥ ρ(x 0 )(1 + ǫ 0 ), this implies that for all ξ, ζ ∈ [−L, L], Thus, if we use the trivial bound | sin x| ≤ |x| ∨ 1, according to (2.7), we conclude that The map F is continuous non-decreasing, so it has a generalized inverse In the sequel, we will assume that δ 0 is sufficiently small, so that (2.9) holds and the map G . (2.11) Recall that the sine process Ξ sin ν is the determinantal process on R with a correlation kernel K sin ν given by (1.17). Proposition 2.2. Let f ∈ C 0 (R) and α ∈ (0, 1]. We have for any n ≥ 1, 12) and the functions F = F ρ and G is given by (2.10).
Proof of proposition 2.2. We fix m 1 , . . . , m ℓ ∈ N and we suppose that N α > L(1 ∨ 2ρ(x 0 ))/δ 0 . We can make the change of variables in the formula If we let 14) and f N be given by (2.12), according to remark 2.1, this leads to For any 0 < α ≤ 1, a Taylor expansion in (2.14) yields for all y, z ∈ [−L 0 , L 0 ], This implies that uniformly for all y, z ∈ [−L 0 , L 0 ], where ν N = N 1−α ρ(x 0 ). Thus, the kernelsS N and K sin νN satisfy condition (1) of lemma 2.1 with β = α. Moreover, if Γ N is given by (2.7) with C 0 = ρ(x 0 ), the kernel K sin νN also satisfies conditions (2) and (3). Therefore, since the functions If we combine formulae (2.8), (2.15) and (2.16), we have proved that for any m 1 , . . . , m ℓ ∈ N, Since, by formula (2.6), the cumulants of the random variable Ξ x0,α N f are linear combination of such traces, we conclude by formula (2.17) that for any n ≥ 1, In the physics literature, the change of variables (2.13) is known as unfolding the spectrum since in the context of random matrices, it corresponds to rescaling the eigenvalue process so that it has a constant density ν N in a mesoscopic range around the point x 0 ∈ J V . Notice that in formula (1.14), if ρ(x 0 ) = 0, a Taylor expansion of the function F ρ shows that we recover the standard sine-kernel asymptotics in the regime α > 1/2, Hence, at sufficiently small scales, the fact that the eigenvalues are not uniformly distributed is not relevant and, if the integrated density of states F ρ is smooth in J V , we can deduce proposition 2.2 directly from lemma 2.1 without the change of variables (2.13).
First, we use proposition 2.2 to derive the local correlations. In the regime α = 1, for any n ≥ 1, lim By (2.11), a Taylor expansion of the map G yields for all |x| < L 0 , By remark 2.1, the function f N has support in [−L 0 , L 0 ] and by continuity of f , the limit (2.20) implies that lim Hence, by the dominated convergence theorem, we get lim

By (2.19), it proves that lim
f for any f ∈ C 0 (R) and the limit theorem (1.16) follows from the fact that the sine process is characterized by its cumulants.
We now turn to the proof of (1.15) in the mesoscopic regime, 0 < α < 1. The argument is different because, in formula (2.18), the density of the sine-process ν N → ∞ as N → ∞. A relevant result in this regime is a CLT due to Soshnikov for the sine process.
The proof is based on Fourier analysis and a combinatorial argument given in the article [48].
Although the original proof is given for Schwartz functions, using a density argument, it is not difficult to extend Soshnikov's CLT to all test functions in the Sobolev space H 1/2 (R). In order to deduce theorem 1.2 from proposition 2.2, we see that it suffices to extend the proof of theorem 2.3 to deal with test functions f N of the form (2.12). To proceed we need further notations and to recall two key lemmas from [48].
For any tuple m ∈ N ℓ , we define 48]). There exists a constant C n > 0 which only depends on n ≥ 2 such that for any ν > 0 and any function f ∈ L 1 (R), If g ∈ C 1 0 (R), we define We will also need the following result.
Lemma 2.6. If f ∈ C 1 0 (R) and the function f N is given by (2.12), then Then, by the triangle inequality, First note that, by (2.20) and the continuity of f ′ , Second, by remark 2.1, we have for all sufficiently large N , Since ǫ 0 can be taken arbitrarily small, we deduce that In the end, by the dominated convergence theorem and the estimates (2.23 - 2.25), we conclude that Observe that, if g ∈ C 1 0 (R), according to (1.4) and (2.22), By (2.20) and the dominated convergence theorem, we get lim For now, let us also assume that, with $\nu_N = N^{1-\alpha}\rho(x_0)$ as in the proof of proposition 2.2 and A n ν defined in lemma 2.4, we have This implies that for any n ≥ 2, Since f is real-valued and Υ 2 (u, −u) = |u|/2, by lemma 2.5, we get By proposition 2.2 and (2.26), we conclude that for any f ∈ C 1 0 (R), A special case of the limit (2.27) was computed in [27, proposition 4.13]. The proof relies on lemma 2.6 and it is straightforward to generalize the argument of [27] to obtain (2.27). Hence, by (2.3), the CLT (1.15) holds for any 0 < α < 1, x 0 ∈ J, and f ∈ C 1 0 (R).

General context
The most well-known probability measure on the space of $N \times N$ Hermitian matrices is the Gaussian Unitary Ensemble (3.1), defined with respect to the Lebesgue measure $dH$. In this section, we will consider some generalizations of the GUE of the form (3.2), where the function $\omega : \mathbb{R} \to [0,+\infty)$ is upper-semicontinuous and satisfies the condition (3.3) for all $k \ge 0$. This condition implies that the partition function $Z_{\omega,N} < \infty$. For scaling reasons, the weight $\omega$ may also depend on the dimension $N$, even though we will not indicate it in order to keep our notation as simple as possible. The matrix $\log \omega(H)$ is defined by functional calculus and the trace guarantees that the measure (3.2) is invariant under the transformation $H \mapsto UHU^*$ for any $U \in U(N)$; hence the name unitary invariant ensembles. In particular, if we use the spectral decomposition of $H$, under $P^\omega_N$, the eigenvectors are independent of the spectrum $\Lambda$, and $\Lambda = (\lambda_1,\dots,\lambda_N)$ has a joint density $F^\omega_N$ on $\mathbb{R}^N$ which is given by formula (3.4). In order to analyze the probability measure $P^\omega_N$, a method introduced by Gaudin and Mehta in [41] consists in rewriting the density $F^\omega_N$ using the orthogonal polynomials with respect to the measure $\omega(x)dx$ on $\mathbb{R}$. The condition (3.3) guarantees that these polynomials exist and we define the corresponding orthonormal functions for any $k \ge 0$ by (3.5). Then, it follows from formula (3.4) that the eigenvalue density is given by formulae (3.6 - 3.7). These formulae imply that the point process $\Lambda$ is determinantal with correlation kernel $K^\omega_N$ in the sense of (1.1). These facts are well-established and we refer to e.g. [11,28] for an introduction to the subject. By theorem 1.2, this reduces the question of universality of mesoscopic fluctuations for the ensembles (3.2) to obtaining precise asymptotics for the OPs with respect to the measure $\omega(x)dx$.
Beyond the context of random matrix theory, one may consider the determinantal process (3.6) associated with a general measure. These processes are known as orthogonal polynomial ensembles and a significant amount of research has focused on proving the sine-process universality at the local scale, see [39] and references therein. At mesoscopic scales, another universality result recently appeared in [10], where the authors already obtained theorem 3.9 below. Instead of working with the correlation kernel of the process, they reformulate the cumulant problem in terms of the so-called Jacobi matrix of the measure $\omega(x)dx$, and this reduces the question of universality to the asymptotics of the recurrence coefficients which define the OPs. The drawback of their method is that, for technical reasons, it fails when the reference measure depends on the dimension $N$, as for the GUE or the exponential weights considered in section 3.3. On the other hand, this method requires only the asymptotics of the recurrence coefficients and it applies to discrete or singular measures, for which the OP asymptotics is difficult to derive.
Under general conditions and provided that the weight $\omega$ is suitably normalized as $N \to \infty$, see [8,21], it is known that there is a law of large numbers: the empirical measure of the eigenvalues converges to a probability measure $\mu_\omega$. Hence, $\mu_\omega$ is the equilibrium measure for the ensemble $P^\omega_N$ and it has compact support. In the following, we will suppose that it is absolutely continuous: $d\mu_\omega = \varrho_\omega(x)dx$. The equilibrium density $\varrho_\omega$ plays a fundamental role in the non-linear steepest descent method introduced in [13] and it appears in the asymptotics of the Christoffel-Darboux kernel. Based on the results of [12,32,31], in sections 3.3 and 3.4, we will derive the mesoscopic asymptotics for the correlation kernels of the ensembles $P^V_N$ and of the so-called modified Jacobi ensembles, respectively. We do not intend to review the Riemann-Hilbert literature, but we point out that the Deift-Zhou steepest descent method has been developed by several authors and has yielded local universality for an extensive pool of OP ensembles. It should be possible to apply theorem 1.2 to prove mesoscopic universality in these cases as well, e.g. for the modified Laguerre ensembles and Wishart matrices using the results of [52].
In section 3.2, assuming that the OPs satisfy classical asymptotic formulae, see (3.10) below, we derive estimates for the variance of linear statistics. This allows us to extend the CLT (1.15) to a larger class of test functions, see theorems 3.6 and 3.9 below. Lemma 3.2 is also of interest for global linear statistics (α = 0). In particular, in the appendix A, we extend the scope of the CLT (1.7) to rather general test functions. Finally, for the GUE, it is possible to get the mesoscopic asymptotics of the correlation kernel without using the Riemann-Hilbert techniques; a complete proof is given in section 5.
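To make the orthogonal polynomial construction concrete, here is a minimal numerical sketch for the simplest case. It assumes the fixed (not $N$-dependent) Gaussian weight $\omega(x) = e^{-x^2}$, for which the orthonormal functions are the Hermite functions, and builds the kernel as $K_N(x,y) = \sum_{k<N} \phi_k(x)\phi_k(y)$. The helper names and the grid are illustrative assumptions; the paper itself works with more general and varying weights.

```python
import numpy as np

def hermite_functions(n, x):
    """Rows phi_0, ..., phi_{n-1}: orthonormal Hermite functions on a 1-D grid x.

    phi_k(x) = h_k(x) exp(-x^2 / 2), where h_k are the orthonormal polynomials
    for the weight exp(-x^2), computed with the three-term recurrence.
    """
    x = np.asarray(x, dtype=float)
    phi = np.empty((n, x.size))
    phi[0] = np.pi ** (-0.25) * np.exp(-x**2 / 2.0)
    if n > 1:
        phi[1] = np.sqrt(2.0) * x * phi[0]
    for k in range(1, n - 1):
        phi[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * phi[k]
                      - np.sqrt(k / (k + 1)) * phi[k - 1])
    return phi

def cd_kernel(n, x):
    """Matrix K[i, j] = sum_{k<n} phi_k(x_i) phi_k(x_j) on the grid x."""
    phi = hermite_functions(n, x)
    return phi.T @ phi

# Check the two properties used in the text: the kernel has trace N and is
# reproducing (it is the kernel of an orthogonal projection on L^2(R)).
N = 6
x = np.linspace(-10.0, 10.0, 1001)
dx = x[1] - x[0]
K = cd_kernel(N, x)
trace_K = np.trace(K) * dx                          # approximates int K(x, x) dx
projection_defect = np.max(np.abs(K @ K * dx - K))  # int K(x,z) K(z,y) dz - K(x,y)
```

The trace equals the number of points $N$, and the reproducing property is what turns the joint eigenvalue density into a determinantal process with kernel $K_N$.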

Variance estimates
For the GUE, estimates for the variance of mesoscopic linear statistics have been derived in [19] for Schwartz-class test functions. Using a similar formalism, we will derive estimates for the variance of linear statistics which are valid both in the global and mesoscopic regimes for arbitrary weight ω such that the support of the equilibrium density ̺ ω is connected. Our method relies only on the bulk asymptotics of the OPs, therefore it cannot yield optimal upper-bounds. However, our results apply to test functions with rather mild smoothness and slow decay, such as the functions g t which arise when considering the logarithm of a regularized characteristic polynomial, cf. (4.2) below.
is an orthonormal family in L 2 (R). By (3.7), the Christoffel-Darboux kernel for the weight ω(x) on R is given by We suppose that the OPs have the following asymptotics for all |x| < 1, (3.10) The function ψ ∈ C(−1, 1) and the notation o ǫ means that the error term converges to 0 uniformly for all |x| < 1 − ǫ, i.e. it only depends on the parameter ǫ > 0. Moreover, according to (1.13), F = F ̺ω where ̺ ω denotes the equilibrium density for the ensemble P ω N and we assume that J ̺ω = (−1, 1). For a fixed weight ω, we can deduce from formula (3.10) that When the weight ω depends on the dimension N , we will suppose that formula (3.11) still hold for some functionψ ∈ C(−1, 1). For instance, when ω(x) = e −N V (x) and V (x) is analytic on R, the asymptotics of the OPs have been investigated in [12] by solving the appropriate Riemann-Hilbert problem and it is rather straightforward to check that, if J V = (−1, 1), then the asymptotics (3.10 -3.11) hold, ψ(x) = −ψ(x) = arcsin(x)/2 regardless of the potential V (x), and furthermore For the GUE, these results follows directly from the Plancherel-Rotach asymptotics, see section 5.
We consider the determinantal process $\Xi_N$ with correlation kernel (3.9) and, for any continuous function $f$, we denote $\Xi_N f := \sum_k f(\lambda_k)$, where the sum is over the configuration $\{\lambda_k\}_{k=1}^N$. It is well-known that, since $K^\omega_N$ defines a projection on $L^2(\mathbb{R})$, cf. e.g. [27, Lem. 3.1], we have
$$\operatorname{Var}\big[\Xi_N f\big] = \frac12 \iint_{\mathbb{R}^2} |f(x) - f(y)|^2\, |K^\omega_N(x,y)|^2\, dx\,dy. \tag{3.13}$$
We have seen in the introduction that, for the GUE or a general ensemble $P^V_N$ satisfying the condition (1.6), the CLT (1.7) implies that, for any sufficiently smooth function $f$, the variance $\operatorname{Var}[\Xi_N f]$ has a finite limit as $N \to \infty$. The question we address in this section is whether there exists a constant $C > 0$, which may depend only on the weight $\omega$, such that a bound of the form (3.14) holds for more general test functions. Since the point process $\Xi_N$ is essentially supported in the bulk $J$, we expect that, apart from some reasonable growth assumption, the behavior of the function $f(x)$ outside $J$ should be irrelevant to estimate $\operatorname{Var}[\Xi_N f]$. However, because of the effect of the spectral edge, it is technical to prove (3.14) in general. Instead, we will show that, if the OPs satisfy the conditions (3.10 - 3.12), then such a bound holds for any function $f \in H^{1/2}(\mathbb{R})$ which satisfies the hypotheses (H.1) and (H.2) for some $0 < \delta < 1$ and $L > 0$. We begin by proving a simple lemma on the asymptotic behavior of the $L^2$-mass of the function $\Phi_N$.
According to formula (3.13), in order to get an upper-bound for the variance of linear statistics, we need to estimate the quantity |K ω N (x, y)| 2 for all x, y ∈ R. By (3.9), we have Moreover, using the asymptotics (3.10 - 3.12), we get for all |x|, |y| < 1 − ǫ, On the other hand, if f satisfies the hypothesis (H.1) and 0 < ǫ < δ, by formula (3.17), By symmetry Note that we used that the asymptotics of lemma 3.1 holds for the function Φ N −1 as well; this follows from formula (3.11). Since $\gamma_{N-1}/\gamma_N \to 1/2$, this upper-bound and (3.20) imply that for any 0 < ǫ < δ, The claim follows from formula (3.13) by combining the estimates (3.19) and (3.21).
The first consequence of lemma 3.2 is that, for any function f ∈ H 1/2 (R) which satisfies (H.1), there exists a constant C > 0 such that for any 0 < ǫ < 1, Since the l.h.s. is independent of ǫ and lim In the remainder of this section, we discuss the implication of lemma 3.2 for mesoscopic linear statistics.
for any |x 0 | < 1 and for any 0 < α < 1, we have Remark 3.1. The main difficulty in estimating the variance of linear statistics is to control the contribution from the edges of the spectrum, an issue that we avoided by using lemma 3.1 and the condition (H.2). Based on the results of [12], the same method should apply to the ensemble P V N in the multi-cut case as well, though the asymptotics of the OPs is more complicated and the argument becomes rather technical. It is straightforward to check that (H.2) holds in both cases: ii) f is bounded and has compact support. In particular, the estimate (3.22) applies to the resolvent $x \mapsto (x - z)^{-1}$ for any $z \in \mathbb{C}$ such that $\Im z \neq 0$ and to any function in $H^{1/2} \cap L^\infty(\mathbb{R})$ with compact support. From the point of view of mesoscopic linear statistics, this encompasses the most relevant class of test functions.
Proof of proposition 3.3. The assumption (H.2) implies that there exists C > 0 so that, if |x| ≥ C, then for all |y| ≤ |x|, This inequality shows that, if |x 0 − 1| ∧ |x 0 + 1| = 2δ, then for all N > (C/δ) 1/α and for all |x| Hence, since the r.h.s. of (3.23) is symmetric in x and y, for sufficiently large N , the functions g N satisfy the condition (H.1) and by lemma 3.2, By a change of variables, On the other hand, using the estimate (3.23), for all sufficiently large N and for any 0 < ε < δ, we have and then, according to formula (3.25), we obtain Finally, if we combine the estimates (3.24) and (3.26), there exists a constant C > 0 so that for any 0 < ε < δ, lim N →∞ Since this holds for any 0 < ε < δ and $\lim_{\varepsilon \searrow 0} \Theta(\varepsilon) = 0$, this implies formula (3.22).

Varying exponential weights
In this section, we consider unitary ensembles with varying weight of the form ω(x) = e −N V (x) where the potential V : R → R is real-analytic. As in theorem 1.1, we denote this probability measure by P V N . The condition (1.2) guarantees that (3.3) holds and the equilibrium density ̺ V exists, see (1.5). In the following, according to (1.12), respectively (1.13), we denote Moreover, by (3.6-3.7), the spectrum Λ of a random matrix sampled according to P V N is a determinantal process with correlation kernel (3.28) In the physics literature, F V is known as the integrated density of states. The set J V corresponds to the bulk of the spectrum Λ; it is composed of finitely many bounded open intervals and the equilibrium density ̺ V is smooth on J V ; see [12] for further references. One of the fundamental results of [12] is the following local asymptotics for the correlation kernel of the eigenvalue process.

(3.29)
where the error is uniform for x 0 in compact subsets of J V and for ξ, ζ in compact sets of R.
In fact, the non-linear steepest descent analysis of [12] is valid at any scale and their results imply the following sine-kernel asymptotics at mesoscopic scales.
(3.30) where the error is uniform for x 0 in compact subsets of J V and for ξ, ζ in compact sets of R. Proposition 3.5 is not formulated in [12] because the authors were interested in universality of the local correlations and not in mesoscopic statistics. However, the proof of proposition 3.5 is a straightforward adaptation of that of lemma 3.4 and we will just review the main steps for completeness. First, note that 0 is an arbitrary reference point in the definition of F V . In particular, one shall interpret the r.h.s. of (3.30) according to (1.5), namely for any x < y, Moreover, since the density ̺ V is smooth on J V , we have and the asymptotics (3.29) is a special case of (3.30) when α = 1.
Proof of proposition 3.5. We will use the Riemann-Hilbert formulation of [12] and the formulae referenced as {#} come from there. We let I = (b, a) be the component of J V which contains x 0 and for all x ∈ I, see formula {6.7} (note that in [12], the equilibrium density is denoted by Ψ instead of ̺ V , cf. {1.6}). By {2.2}, we can write the correlation kernel where the 2 × 2 matrix Y is the solution of an appropriate Riemann-Hilbert problem. Transforming the problem, cf. {6.8-6.9}, the authors proved that for any x ∈ I, Hence, since det M (z) = 1 for all z ∈ C, by formula (3.33), we obtain Therefore, if we take x = x 0 + ξ/N α and y = x 0 + ζ/N α with ξ, ζ ∈ [−L, L] in (3.35) and rescale by N α , we obtain formula (3.30).
The correlation kernel K V N has rank N and proposition 3.5 shows that it satisfies (1.14) for any α ∈ (0, 1] with ρ = ̺ V . Hence theorem 1.1 is a direct consequence of theorem 1.2. Furthermore, in the one-cut case, the asymptotics (3.10-3.12) hold and we can use proposition 3.3 to extend the validity of theorem 1.1 to all test functions f ∈ H 1/2 (R) which satisfy the condition (H.2). In particular, theorem 3.6 applies to the GUE and, more generally, whenever the potential V (x) is strictly convex on R.
Theorem 3.6. Let V : R → R be a real-analytic function which satisfies the assumptions (1.2) and (1.6). For the eigenvalue process of the ensemble P V N , the CLT (1.3) holds for any x 0 ∈ J V , any 0 < α < 1, and for all f ∈ H 1/2 ∩ L ∞ (R) with compact support.
Proof. If X and Y are two random variables with mean zero, by Chebychev's inequality, for any ξ ∈ R, (3.36) According to (1.10), we let Ξ x0,α be the characteristic function of the centered linear statistics Ξ x0,α N f . Recall that G denotes the Gaussian field indexed by the Hilbert space H 1/2 (R) and we let By the triangle inequality, for any functions f, g ∈ H 1/2 (R) and for any ξ ∈ R, Using the estimate (3.36) twice, since both processes Ξ x0,α N and G are linear, this implies that If f satisfies the condition (H.2), since g ∈ C 1 0 (R), by the triangle inequality, the function f − g also satisfies (H.2) and by proposition 3.3, On the other hand, by formula (1.11), $\operatorname{Var}[G(f - g)] = \|f - g\|^2_{H^{1/2}}$ and we obtain [36,Thm 7.14], the r.h.s. of (3.37) is arbitrarily small by choosing g ∈ C 1 0 (R) appropriately and we conclude that Ξ x0,α N (f ) ⇒ G(f ) as N → ∞.

Remark 3.2.
When V is strictly convex, the asymptotics (3.30) has also been derived in [29] with an error which is uniform for all potentials in a neighborhood of V (see the proof of theorem 1.7 therein). Their method is likewise inspired by the results of [12] and it applies to the slightly modified family of random matrix ensembles where V is analytic and strictly convex on an interval J ⊂ R. Hence, the results of [29] imply that theorem 3.6 holds for the ensembles dP V,J N as well.

Modified Jacobi Ensembles
In this section, we look at another instance of unitary ensembles given by the weight where γ + , γ − > −1 and h(x) is a function which is real-analytic and strictly positive on the interval (−1 − ε, 1 + ε) for some ε > 0. In this case, the probability measure (3.2) can be written as where ‖H‖ denotes the operator norm of H. We also assume that ω is a probability density. The measure P ω N induces a determinantal process on the eigenvalues of H with correlation kernel (3.7). In particular, if the function h is constant, the OPs with respect to ω are the classical Jacobi polynomials and their asymptotics are well-known, cf. [51] Theorem 8.21.8 and also Theorem 12.1.4. In general, the probability measure P ω N is called the modified Jacobi unitary ensemble and the goal of this section is to derive the following asymptotics.
Proposition 3.7. For any ε > 0 and α ∈ (0, 1], the correlation kernel of the modified Jacobi ensembles P ω N with weight (3.38) satisfies (3.40) uniformly for all |x 0 | < 1 − ε and all ξ, ζ in compact subsets of R, where The probability measure ̺(x)dx on R is called the arcsin measure since its distribution function is given by Proposition 3.7 implies that ̺ is the equilibrium density for the eigenvalue process of the modified Jacobi ensembles. In contrast to the varying weights e −N V (x) analyzed in section 3.3, the global eigenvalue distribution is independent of the parameters of the model and it also turns out that the asymptotics of the OPs are simpler. Formula (3.40) can be deduced from the Riemann-Hilbert analysis of [32] by adapting the proof of proposition 3.5. However, we will give a slightly different proof based on formula (3.10) and the fact that the integrated density of states for the modified Jacobi ensembles is the arcsin distribution. First, it is interesting to look at an example where we can derive proposition 3.7 using only elementary trigonometry. When γ + = γ − = 1/2 and h = 1/π, we denote the weight by $\omega_0(x) = \sqrt{1 - x^2}/\pi$, and the OPs which appear in the correlation kernel (3.7) are the Chebychev polynomials of the second kind. With the convention (3.3), they satisfy for all k ≥ 0 and x ∈ [−1, 1], In particular, the correlation kernel of the Chebychev process is given explicitly by (3.44) We will need the following lemma.
Lemma 3.8. Let Ψ N be a function which depends on a parameter N > 0. We define for all |x|, |y| < 1, (3.45) For any ǫ > 0, we have for all |x|, |y| < 1 − ǫ, where the error term is uniform and independent of N .
The connection with the Chebychev process is that, by (3.44), we have K ω0 N = K Ψ 0 N with the phase Ψ 0 N (x) = (N + 1) arccos x − π/2 . In particular, by (3.42), for any x, y ∈ [−1, 1], and, according to lemma 3.8, we obtain uniformly for all |x|, |y| < 1 − ε. Based on this approach, theorem 3.9 below was first proved in [10] for C 1 test functions with compact support, cf. theorem 1.1 therein. In the Chebychev case, instead of using the asymptotics (3.48), the authors used that the Laplace transform of the random variables Ξ x0,α N f is given by a Toeplitz determinant and computed its limit using the Strong Szegő theorem. Theorem 3.9. If (λ 1 , . . . , λ N ) denote the eigenvalues of a random matrix distributed according to P ω N , (3.39), then for any x 0 ∈ (−1, 1), any 0 < α < 1, and for all f ∈ H 1/2 ∩ L ∞ (R) with compact support, we have as N → ∞, In the remainder of this section, we will give a proof of proposition 3.7 which is inspired by the Chebychev case and lemma 3.8. The main observation is that, by theorem 3.10 below, the OPs with respect to the weight (3.38) behave very much like the Chebychev polynomials when N is large. By theorem 1.2, proposition 3.7 implies the CLT for test functions in C 1 0 (R). Moreover, since the asymptotic formulae (3.10-3.12) hold for the modified Jacobi ensembles, proposition 3.3 allows us to extend the CLT to any function f ∈ H 1/2 (R) which satisfies the condition (H.2). The argument is identical to the proof of theorem 3.6. The asymptotics of the OPs for the modified Jacobi ensembles have been derived using the Riemann-Hilbert method in [31]. In particular, we will need the following results. Theorem 3.10 (Thm. 1.6, Thm. 1.12, [31]).
For any γ + , γ − > −1 and any function h(x) which is real-analytic and strictly positive on (−1 − ǫ, 1 + ǫ) for some ǫ > 0, there exists D ∞ > 0 and ψ ∈ C 1 (−1, 1) such that the OPs with respect to ω(x)dx satisfy uniformly for all x in compact subsets of (−1, 1), and In a follow-up paper, [32], the sine-kernel asymptotics was also derived at the local scale. For any ǫ, L > 0, uniformly for all x 0 ∈ [−1 + ǫ, 1 − ǫ] and ξ, ζ ∈ [−L, L]. Based on the results of theorem 3.10, we obtain a first version of formula (3.40) which is valid as long as |ξ − ζ| ≥ N −1+ǫ for any ǫ > 0. Then, using local universality, we can make this asymptotics uniform for all ξ, ζ in any compact subsets of R. Hence, by combining lemmas 3.11 and 3.12 below, this completes the proof of proposition 3.7.
A fundamental observation due to K. Johansson is that, in the regime |ξ − ζ| ≤ N −1+α , the difference N −α (ξ − ζ) is microscopic and we can deduce the asymptotics of the kernel using only local universality considerations.
Remark 3.4. To prove proposition 3.7 for all α ∈ (0, 1], it is important to use the uniform asymptotics of theorem 3.10 and (3.49) with the optimal error of order 1/N . There are other methods than the Riemann-Hilbert steepest descent to compute the asymptotics of OPs for a non-varying measure and prove local universality, e.g. the methods developed by Levin and Lubinsky, [38,39]. However, these methods usually provide weaker asymptotics which yield the sine-kernel only at small scales; see also remark 5.1 below.
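The Chebychev example of this section lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions, not the proof given above: it writes the Chebychev kernel through the orthonormal functions φ_k(cos θ) = (2/π)^{1/2} sin((k+1)θ)/(sin θ)^{1/2}, assumes the arcsin distribution function in the form F(x) = 1 − arccos(x)/π, and compares the rescaled kernel with sin[πN(F(x) − F(y))]/(π(x − y)), which is our reading of the right-hand side of (3.40). All function names are hypothetical.

```python
import numpy as np


def chebyshev_kernel(n, x, y):
    """K_N(x, y) = sum_{k<N} phi_k(x) phi_k(y) for a weight proportional to sqrt(1 - x^2)."""
    theta, phi = np.arccos(x), np.arccos(y)
    k = np.arange(1, n + 1)
    total = np.sum(np.sin(k * theta) * np.sin(k * phi))
    return (2.0 / np.pi) * total / np.sqrt(np.sin(theta) * np.sin(phi))


def arcsine_cdf(x):
    """Distribution function of the arcsin measure on (-1, 1)."""
    return 1.0 - np.arccos(x) / np.pi


def sine_kernel_prediction(n, x, y):
    """Assumed mesoscopic approximation sin(pi N (F(x) - F(y))) / (pi (x - y))."""
    return np.sin(np.pi * n * (arcsine_cdf(x) - arcsine_cdf(y))) / (np.pi * (x - y))


def rescaled_error(n, alpha, x0, xi, zeta):
    """N^{-alpha} |K_N - prediction| at x = x0 + xi N^{-alpha}, y = x0 + zeta N^{-alpha}."""
    scale = n ** (-alpha)
    x, y = x0 + xi * scale, x0 + zeta * scale
    return scale * abs(chebyshev_kernel(n, x, y) - sine_kernel_prediction(n, x, y))


if __name__ == "__main__":
    for alpha in (0.3, 0.5, 0.8):
        print(alpha, rescaled_error(2000, alpha, 0.3, 0.7, -0.4))
```

In this explicit case the unrescaled difference stays bounded in the bulk, so the rescaled error decays like N^{−α}, in line with the discussion of the error terms above.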

Regularized characteristic polynomial and log-correlated Gaussian processes
The goal of this section is to elaborate on the connection between logarithmically correlated Gaussian processes (1/f-noise) and random matrix theory. It was established in [23] and [19] that the logarithm of the modulus of the characteristic polynomial of a CUE, respectively GUE, random matrix converges weakly to a Gaussian generalized function (random tempered distribution) whose correlation kernel has a logarithmic singularity at 0. Based on the so-called freezing transition scenario, this motivates some recent conjectures for the distributions of the extreme values of these polynomials, as well as for the extreme value of the Riemann Zeta function on some interval of the critical line, see [18,20] and references therein. This also suggests that the characteristic polynomials of random matrices give rise to regularizations of the so-called Gaussian Multiplicative Chaos measures introduced by Kahane, which play an important role in some recent physical theories, such as conformal field theory, disordered systems, Liouville quantum gravity, etc., [15,53]. In the following, we consider a random Hermitian matrix H distributed according to the unitary invariant measure P ω N , (3.2). We will not look directly at the characteristic polynomial of the matrix H but at the following regularization at mesoscopic scales. Let 0 < α < 1, x 0 ∈ R, η > 0, z t = t + iη, and define (4.1) This object was introduced in [19] and it was proved that if H is a GUE matrix, then the random process t → W N (t) − E [W N (t)] converges weakly in L 2 [a, b] (a, b ∈ R) to a logarithmically correlated Gaussian process B 0 defined below. ii) B 0 has stationary increments.
We refer to [19] for some background and references on fractional Brownian motion. Let us just point out that the process B 0 has the following representation, for any t ∈ R, where Z is a complex Brownian motion with unit variance. Inspired by Riemann-Hilbert asymptotics obtained by Krasovsky in [30], the authors of [19] computed the limits of the Laplace transform of the random variable W N (t) for any t ∈ R and showed that the finite-dimensional distributions of W N − E [W N ] converge to those of B 0 . In the following, we generalize this result to other unitary invariant ensembles using the central limit theorem 3.6. We suppose that the weight ω satisfies (3.3) and the one-cut condition, J ̺ω = (−1, 1), so that the estimates of section 3.2 hold. Although this condition should not be relevant, we have not derived the necessary variance estimates in the multi-cut regime. First, observe that the random variable W N (t) is a linear statistic, where the function $g_t(x) = \Re \log \frac{x - z_t}{x - z_0}$ is defined using the principal branch of the logarithm and z t = t + iη. It is easily seen that, even though $g_t \notin L^1(\mathbb{R})$, its Fourier transform is well defined in L 2 (R) and, by lemma 4.2 below, it is given by Lemma 4.2. For any η > 0 and x, t ∈ R, we have Proof. This identity is classical and it can be proved by observing that, for any t > 0, and, by Fubini's theorem, We conclude by observing that, by definition, for any t > 0, The proof in the case t < 0 is almost identical.
At the end of this section, we check that the test functions g t satisfy the assumptions of proposition 3.3 so that for any |x 0 | < 1 and 0 < α < 1, and we can apply the CLT (1.3), cf. the proof of theorem 3.6. Namely, for any t 1 < · · · < t k and ξ 1 , . . . , ξ k ∈ R, letting $f = \sum_{j=1}^{k} \xi_j g_{t_j}$, we obtain where Moreover, by formula (4.3), and, according to lemma 4.2 with x = 0, we obtain for any t, s ∈ R Since we have established that Ξ x0,α N g t = W N (t), according to definition 4.1, formulae (4.6-4.7) imply that for any k ∈ N, Note that the fact that the Gaussian process B 0 has stationary increments follows immediately from the covariance structure (4.7) and the continuity of its sample paths follows from Kolmogorov's theorem. Following [19,Thm. 2.3], the convergence (4.8) of the finite-dimensional distributions and the estimate (4.5) allow us to conclude that the random process W N converges in distribution to B 0 in an appropriate function space. Proof. Without loss of generality we suppose that t > 0. Formula (4.4) implies that where for any (x, y) ∈ R 2 , Note that, in all four cases, the length of the contour |C x,y | = 2 min{t, |x − y|}, and there exist C t > 0 and a continuous function h : This implies that By Fubini's theorem, we conclude that Moreover, by construction, for all |x| ≥ C t , we have so that the hypothesis (H.2) holds.

The Gaussian Unitary Ensemble
The GUE (3.1) was introduced by E. Wigner as a model to describe scattering resonances of heavy nuclei and it is certainly the Hermitian matrix model which has received the most attention. In particular, in addition to being unitary invariant, the entries of a GUE matrix are independent Gaussian random variables. The GUE falls in the general class discussed in section 3.3 with weight $\omega(x) = e^{-2N x^2}$. Hence, theorem 1.1 implies that its eigenvalue process converges at mesoscopic scales to the H 1/2 -Gaussian field G. In fact, another proof, valid for Gaussian β-ensembles, appeared previously in [4]. The goal of this section is to derive the GUE kernel asymptotics from the classical integral formulae for the Hermite polynomials rather than by solving a Riemann-Hilbert problem. We proceed as in section 3.4. First, in section 5.2, we produce the global asymptotics of the correlation kernel. Then, in section 5.3, we make this asymptotics uniform by looking at the microscopic regime.

Plancherel-Rotach asymptotics
The first observation is that the GUE weight satisfies $\omega(x) = \omega_G(\sqrt{2N}\, x)$ where $\omega_G(x) = e^{-x^2}$ does not depend on the dimension N . Moreover, the OPs with respect to the Gaussian weight ω G are the classical Hermite polynomials, for all k ≥ 0, then, according to formula (3.7), the correlation kernel of the GUE eigenvalue process is given by The functions φ k are usually called the Hermite (wave) functions and they form an orthonormal basis of L 2 (R). Moreover, they have the following asymptotics.
Proposition 5.1. Let, for all |x| < 1 and N > 0, There exist two sequences of functions $\Lambda_N$ and $\tilde{\Lambda}_N$ which are smooth on (−1, 1) such that for any ε > 0 and for all |x| ≤ 1 − ε, Moreover, there exists a universal constant C > 0 such that for all |x| < 1 and N > 0, Proof. Thanks to Rodrigues' formula, (5.1), the Hermite functions have the following integral representation The saddle point analysis for this integral was performed in a seminal paper by Plancherel and Rotach, [45, formula 7]. If η N is given by (5.8), they obtained for any k ∈ N and |x| < 1, It is remarkable that they managed to obtain the full asymptotic expansion of the Hermite functions. In fact, to obtain the mesoscopic asymptotics of the GUE kernel, it suffices to take k = 2 in (5.11); the coefficients in the expansion are then C 0,0 = 1, C 1,0 = 0, C 1,1 = 3/16 and C 1,2 = 5/48 according to [45]. In this case, we deduce from formula (5.11) that uniformly for all x in compact subsets of (−1, 1), (5.12) and the function Λ N is smooth on (−1, 1) and satisfies (5.14) Moreover, by (5.10), we see that for any |x| < 1 − ε, This identity is remarkable because if Ψ N is defined according to (A.2) and we substitute (5.15) in formula (5.14), we obtain Moreover, for any |x| ≤ 1 − ε, and this implies that

The global asymptotics
Proposition 5.1 encompasses most of the technical work to prove formula (1.14) for the GUE kernel. In this section, we will derive the global asymptotics of the GUE kernel based on the method developed in section 3.4.

The local asymptotics and uniformity
The asymptotics of lemma 5.2 is not uniform and to complete the proof of formula (5.38) below, we need to remove the condition |x − y| ≫ 1/N 2 . To do so, we will use a method introduced by Levin and Lubinsky to prove local universality, see [35,39]. It consists in first computing the asymptotics of the Christoffel-Darboux kernel along the diagonal, then extending the result off-diagonal using some a priori estimates on the derivative of the OPs. For the GUE kernel, we can use that the Hermite function solves a second order ODE to obtain this estimate, see formula (5.33) and lemma 5.4 below.
Proof. The Hermite polynomials are an Appell sequence and, by formula (5.2), this implies that for all k ≥ 0, If we use this equation and formula (5.31) below, we obtain The same argument as the proof of proposition 5.1 shows that for any |x| ≤ 1 − ǫ/ √ 2, By formulae (5.3) and (5.28), this implies that for any |x| ≤ 1 − ǫ/ √ 2, By (5.8), and using the trigonometric identity this yields for all |x| ≤ 1 − ǫ/ √ 2, Formula (5.27) follows from a trivial change of variables.
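The Appell property used at the start of this proof can be verified directly. The sketch below is only an illustration and assumes the physicists' convention for the Hermite polynomials (orthogonal with respect to e^{−x²}), which may differ from the normalization (5.2) by constants: in that convention H_k′ = 2k H_{k−1}, and for the orthonormal Hermite functions φ_k this gives φ_k′(x) = (2k)^{1/2} φ_{k−1}(x) − x φ_k(x). The function names are hypothetical.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermder, hermval


def hermite_derivative_coeffs(k):
    """Coefficients of H_k' in the Hermite basis; the Appell property gives 2k e_{k-1}."""
    e_k = np.zeros(k + 1)
    e_k[k] = 1.0
    return hermder(e_k)


def hermite_function(k, x):
    """Orthonormal Hermite function phi_k(x) = H_k(x) exp(-x^2/2) / sqrt(2^k k! sqrt(pi))."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return hermval(x, coeffs) * np.exp(-0.5 * x * x) / sqrt(2.0**k * factorial(k) * sqrt(pi))


def derivative_identity_gap(k, x, h=1e-5):
    """|phi_k'(x) - (sqrt(2k) phi_{k-1}(x) - x phi_k(x))|, derivative by central differences."""
    numeric = (hermite_function(k, x + h) - hermite_function(k, x - h)) / (2.0 * h)
    exact = sqrt(2.0 * k) * hermite_function(k - 1, x) - x * hermite_function(k, x)
    return abs(numeric - exact)


if __name__ == "__main__":
    print(hermite_derivative_coeffs(3))  # coefficients of H_3' in the Hermite basis
    print(derivative_identity_gap(5, 0.7))
```

An identity of this type is what turns pointwise bounds on the Hermite functions into bounds on their derivatives, which is the kind of a priori estimate required by the Levin-Lubinsky method.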
Remark 5.1. Using the same argument, it is possible to get the estimate (5.29) for a general ensemble P ω N provided that its correlation kernel satisfies (5.35) and K ω N (x, x)/N = ̺ ω (x) + O(N −1 ). For instance, if the weight ω does not depend on the dimension N and is compactly supported, the estimate (5.35) follows from the Markov-Bernstein inequality, see [39]. For the modified Jacobi ensembles, in the regime α > 1/2, this can be used to give another proof of proposition 3.7 without using the local asymptotics (3.49), though it only gives an error term of order N 1/2−α .
By combining lemmas 5.2 and 5.4, we obtain the full asymptotics for the GUE kernel. Notice that the error term of order N −2 is crucial to complete the proof for all mesoscopic scales α ∈ (0, 1]. This can be achieved because the asymptotics of proposition 5.1 include an extra term compared to the classical expansion used in section 3.3.
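The resulting asymptotics can be illustrated numerically. The sketch below is a sanity check under assumptions that we state explicitly, not a substitute for the proof: it takes the GUE kernel to be K_N(x, y) = (2N)^{1/2} Σ_{k<N} φ_k((2N)^{1/2} x) φ_k((2N)^{1/2} y) with φ_k the orthonormal Hermite functions, the equilibrium density to be the semicircle law (2/π)(1 − x²)^{1/2} on (−1, 1), and the mesoscopic approximation to be sin[πN(F(x) − F(y))]/(π(x − y)) with F the semicircle distribution function. The function names are hypothetical.

```python
import numpy as np


def hermite_functions(n, t):
    """Values phi_0(t), ..., phi_{n-1}(t) of the orthonormal Hermite functions (recurrence)."""
    phi = np.empty(n)
    phi[0] = np.pi ** (-0.25) * np.exp(-0.5 * t * t)
    if n > 1:
        phi[1] = np.sqrt(2.0) * t * phi[0]
    for k in range(1, n - 1):
        phi[k + 1] = np.sqrt(2.0 / (k + 1)) * t * phi[k] - np.sqrt(k / (k + 1)) * phi[k - 1]
    return phi


def gue_kernel(n, x, y):
    """Assumed GUE kernel for the weight exp(-2 N x^2), whose bulk is (-1, 1)."""
    s = np.sqrt(2.0 * n)
    return s * np.dot(hermite_functions(n, s * x), hermite_functions(n, s * y))


def semicircle_density(x):
    return 2.0 / np.pi * np.sqrt(1.0 - x * x)


def semicircle_cdf(x):
    return 0.5 + (x * np.sqrt(1.0 - x * x) + np.arcsin(x)) / np.pi


def rescaled_error(n, alpha, x0, xi, zeta):
    """N^{-alpha} |K_N - sin(pi N (F(x) - F(y))) / (pi (x - y))| at mesoscopic distance."""
    scale = n ** (-alpha)
    x, y = x0 + xi * scale, x0 + zeta * scale
    prediction = np.sin(np.pi * n * (semicircle_cdf(x) - semicircle_cdf(y))) / (np.pi * (x - y))
    return scale * abs(gue_kernel(n, x, y) - prediction)


if __name__ == "__main__":
    n = 400
    print(gue_kernel(n, 0.2, 0.2) / n, semicircle_density(0.2))
    print(rescaled_error(n, 0.5, 0.2, 0.6, -0.5))
```

The forward three-term recurrence is used instead of the explicit formula for H_k in order to avoid overflow for large k; for very large N or for points far outside the bulk, a logarithmic scaling of the recurrence would be needed.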
By (A.2), the first term is finite. Since f satisfies the condition (H.1), the second term is bounded by 4A 2 L. By definition of the set B, the third term satisfies and we conclude that f 2 The aim of this appendix is to derive an estimate for the variance of global linear statistics valid for continuously differentiable test functions.
Proposition A.2. Let V : R → R be a real-analytic function which satisfies (A.1) and such that J V = (−1, 1). We denote $\Xi_N h = \sum_k h(\lambda_k)$ where the sum is over the eigenvalues of a random matrix distributed according to P V N . Let h ∈ C 1 (R) and suppose that there exists Q, n > 0 so that |h ′ (x)| ≤ Q|x| n for all |x| ≥ 1, then The proof is based on the result of lemma 3.2 and the exponential decay of the Christoffel-Darboux kernel outside of the bulk; see lemma A.3 below. We suppose that h ∈ C 1 (R) in order to simplify the proof; however, this condition is not necessary. In fact, by a simple modification of our method, it suffices to suppose that h ∈ H 1/2 and there exists Q > 0 and n > 0 so that for all |x| > 1 − δ, Proof. By [12, formula 1.58], for any ε > 0, we have where for all x ∈ R, This function appears in the determination of the equilibrium density ̺ V . In fact, ̺ V (x)dx is the unique minimizer of a weighted energy functional and it is uniquely determined by the following Euler-Lagrange variational conditions: Using [12, formula 1.59] instead, we can show that the estimate (A.5) holds for the function Φ N −1 as well. By formula (3.17), this implies that for all |x| ≥ B, Hence, since $\|\Phi_N\|_{L^2} = \|\Phi_{N-1}\|_{L^2} = 1$, we obtain for all |x| ≥ B, Proof of proposition A.2. Let A > B and $\chi \in C^1(\mathbb{R}_+ \to [0, 1])$ such that −χ ′ ∈ [0, 1] and We also let Ã = A + 1. We decompose h = f + g where f = χh ∈ C 1 0 (R) and g = (1 − χ)h ∈ C 1 (R). According to formula (3.13), we have $\operatorname{Var}[\Xi_N h] \le 2\big(\operatorname{Var}[\Xi_N f] + \operatorname{Var}[\Xi_N g]\big)$. Next, we will show that the function f satisfies the condition (H.1). By definition, we have Hence, if |x| < Ã, using the properties of the cutoff function χ, for all |y| ≤ |x|, On the other hand, if |x| ≥ Ã, for all |y| ≤ |x|, This completes the proof since Σ(f ) = Σ(h) because h(x) = f (x) for all |x| ≤ 1.
Proposition A.2 is used in [33] to give a new proof of theorem A.4 below. In fact, the results of [33] are valid for more general orthogonal polynomial ensembles. Theorem A.4 is an extension of the CLT (1.7) and its proof is inspired from that of theorem 3.6.
Theorem A.4. Let V : R → R be a real-analytic function which satisfies the condition (A.1) and such that J V = (−1, 1). If (λ 1 , . . . , λ N ) denote the eigenvalues of a random matrix distributed according to P V N , then for any f ∈ C 1 (R) such that there exists Q, n > 0 so that |f ′ (x)| ≤ Q|x| n for all |x| ≥ 1, we have Combining (A.15) and (A.16), we conclude that $S(h) \sim \mathcal{N}(0, \Sigma(h))$ and the CLT follows since this holds for any subsequence π.