Empirical Spacings of Unfolded Eigenvalues

We study random points on the real line generated by the eigenvalues of unitary invariant random matrix ensembles or by more general repulsive particle systems. As the number of points tends to infinity, we prove convergence of the empirical distribution of nearest-neighbor spacings. We extend existing results for the spacing distribution in two ways. On the one hand, we believe the empirical distribution to be of more practical relevance than the expected distribution considered so far. On the other hand, we use the unfolding, a non-linear rescaling, which transforms the ensemble such that the density of particles is asymptotically constant. This allows us to consider all empirical spacings, whereas previous results were restricted to a tiny fraction of the particles. Moreover, we prove bounds on the rates of convergence. The main ingredient of the proof, a strong bulk universality result for correlation functions in the unfolded setting including optimal rates, should be of independent interest.


Introduction
The universal behaviour of eigenvalue statistics of random matrices has attracted much interest over the last decades. Although random matrices had already been studied by Wishart [Wis28] in the 1920s, the development of the theory was promoted in the 1950s by Wigner [Wig58], who used eigenvalues of random matrices to model spectra of complex nuclei. Montgomery's surprising discovery [Mon73] that zeros of the Riemann zeta function behave statistically like eigenvalues of random matrices led to a further increase of interest. In recent years, the belief has emerged that limit laws obtained in random matrix theory are also ubiquitous in large systems of strongly correlated particles on the real line. One instance of this belief is the Bohigas-Giannoni-Schmit conjecture [BGS84], stating that level spacings of quantum systems with classically chaotic motion should be given by random matrix laws. The ubiquity of certain limit laws has been established within Random Matrix Theory (RMT) as the universality phenomenon: for large matrix sizes, many eigenvalue statistics exhibit the same limit distributions for essentially different matrix models, provided these models share the same symmetry (e.g. real symmetry, complex Hermiticity, etc.). These universal limits usually arise for gap probabilities, spacing statistics or correlations of close eigenvalues in the bulk or at the edge of the spectrum if the mean spacing between consecutive eigenvalues is one. This paper is motivated by the following problem: assume that we consider a complicated (real-world) system. Based on a data set of real values obtained as a particular realization of that system, we want to study the appearance of universal RMT laws. The central question is how to detect such universal behavior.
From a practical point of view, the easiest and most common statistic to consider is the empirical nearest-neighbor spacing distribution: for an ordered N-tuple x_1 ≤ · · · ≤ x_N and an interval I_N, we denote by σ(I_N, x) the counting measure of the nearest-neighbor spacings of those points that lie in I_N, and by σ̃(I_N, x) its normalization to a probability measure. (The second author has been partially supported by the SFB 701.) Here, σ̃(I_N, x) represents a histogram of the spacings from the data x and as such has high practical relevance. Indeed, histograms of spacings obtained in numerous (real-world) systems have been compared to limit distributions from RMT, ranging from level spacings in nuclear physics (see Mehta's classical book [Meh04] and the references therein) to bus waiting times in certain Mexican cities [KS03].
Despite its relevance, results on empirical spacings in otherwise well-understood random matrix ensembles are surprisingly sparse. Most results in the literature are available for the expected spacing distribution instead of empirical spacings. Briefly speaking, the expected spacing distribution is obtained by averaging over all realizations of σ(I N , x) orσ(I N , x). A more formal definition and discussion is given below. This preference of the expected spacing distribution is partly due to its direct relation to correlation functions (marginals of fixed dimension). For important classes of random matrices, these have particularly nice forms. This led to the convention of establishing local universality in terms of correlation functions. However, to deduce strong universality results of empirical spacing statistics, quite strong forms of convergence of the correlation functions are needed, e.g. uniformity in intervals growing with N . These requirements are often not met by standard formulations of universality results.
In this paper, we prove the convergence of the empirical spacing distribution of unfolded eigenvalues, or of more general particles on the real line, to a universal distribution, the Gaudin distribution. The unfolding basically consists of applying the limiting spectral distribution function to the eigenvalues/particles. This non-linear rescaling transforms the limiting spectral measure into the uniform measure on [0, 1] and allows for considering spacings of eigenvalues/particles from intervals I_N of macroscopic length, even the whole spectrum. Our main theorem, Theorem 2.2, states the uniform convergence of the distribution function of σ̃(I_N, x̃) towards the Gaudin distribution function G in mean, i.e.

E_N sup_{s≥0} | ∫_0^s dσ̃(I_N, x̃) − G(s) | → 0 as N → ∞.  (1.2)

We also obtain rates of convergence in terms of the length of the interval I_N.
Let us define the two models of this paper. In the first model, we consider the eigenvalues of Hermitian matrices from unitary invariant ensembles: given functions V, f : J → R on J = [L_−, L_+] ∩ R, a finite or infinite interval (−∞ ≤ L_− < L_+ ≤ ∞; for precise assumptions see (GA)_1), we define a density on R^N by

P_{N,V,f}(x) ∝ ∏_{1≤i<j≤N} (x_i − x_j)² exp( − ∑_{j=1}^N [ N V(x_j) + f(x_j) ] ),  x ∈ J^N.  (1.3)

If f = 0, we write P_{N,V} instead of P_{N,V,0}. We will slightly abuse notation by using the same symbol for the measure and its density. P_{N,V,f} is the joint density of the eigenvalues of a random Hermitian matrix whose distribution has a density proportional to exp(−Tr(N V(M) + f(M))) with respect to the "Lebesgue measure" dM = ∏_{j=1}^N dM_jj ∏_{1≤j<k≤N} dM^R_jk dM^I_jk on the space of N × N Hermitian matrices M with the property that all eigenvalues of M lie in J. Most prominent in this class, and arguably in all of RMT, is the Gaussian unitary ensemble (GUE), which is obtained by choosing V(t) = t², f = 0 and J = R.
As a second model, we will consider ensembles recently studied by Götze and the second author in [GV14], which generalize (1.3) by allowing for different interactions between particles. Given Q, h : R → R smooth (see (GA)_2 for our assumptions), we define

P^h_{N,Q}(x) ∝ ∏_{1≤i<j≤N} (x_i − x_j)² exp( − ∑_{1≤i<j≤N} h(x_i − x_j) − N ∑_{j=1}^N Q(x_j) ).  (1.4)

We will call P^h_{N,Q} a repulsive particle system. As h is smooth, the densities P^h_{N,Q} and P_{N,V,f} vanish to the same order if two particles approach each other. It is conjectured (cf. [GV14]) that the universal local limit laws should depend only on the strength of the repulsion, but be independent of the interaction at non-zero distance and of the external field Q.
Returning to spacings, let us review known results in RMT. The spacing distribution has been understood best for a particular ensemble of random unitary matrices, the circular unitary ensemble (CUE). It was introduced by Dyson [Dys62] as a Haar distributed unitary matrix, which may be seen as having the uniform distribution on the group of N × N unitary matrices. The eigenvalues lie on the unit circle and have a joint density proportional to

∏_{j<l} |e^{iθ_j} − e^{iθ_l}|²,  (1.5)

where θ_1, …, θ_N ∈ [0, 2π). Note the similarity between the interactions in (1.5) and (1.3). It has been shown by Katz and Sarnak [KS99] that, with x denoting the rescaled eigenangles Nθ_j and I_N = [0, 2πN), for some limiting function G we have for any ε > 0

E sup_{s≥0} | ∫_0^s dσ̃(I_N, x) − G(s) | = O(|I_N|^{−1/6+ε}).  (1.6)

Soshnikov [Sos98] proved that (1.6) holds with a rate of O(|I_N|^{−1/2+ε}) for any ε > 0, where I_N ⊂ [0, 2πN) is such that |I_N| → ∞ as N → ∞, in expectation as in (1.6) and also almost surely. Furthermore, he proved that the fluctuations around the limit G(s) are Gaussian.
In (1.6), G is the distribution function of the so-called Gaudin distribution and S is the sine kernel

S(t) := sin(πt)/(πt).  (1.7)

This probability measure had already been studied in the physics literature (see e.g. [Meh04] for references). In particular, the density G′(s) is given as the second derivative of the Fredholm determinant of the integral operator with the sine kernel on L²((0, s)). Although this density does not seem to have a nice closed-form expression, for many practical purposes it is sufficient to consider the so-called Wigner surmise instead, i.e.

p_2(s) := (32/π²) s² e^{−(4/π)s²}

provides a good approximation to G′(s).
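Since G′ is characterized through the Fredholm determinant of the sine-kernel operator, it can be evaluated numerically. The following sketch is our own illustration (not part of the paper): a standard Nyström discretization with Gauss-Legendre quadrature, compared against the Wigner surmise p_2.

```python
import numpy as np

def gap_probability(s, n=40):
    """E(s) = det(I - K_s), the probability that the sine process has no
    point in (0, s); Nystrom discretization with Gauss-Legendre quadrature."""
    x, w = np.polynomial.legendre.leggauss(n)
    x = 0.5 * s * (x + 1.0)                  # quadrature nodes mapped to (0, s)
    w = 0.5 * s * w
    K = np.sinc(np.subtract.outer(x, x))     # sin(pi t)/(pi t) in numpy's convention
    sqw = np.sqrt(w)
    return np.linalg.det(np.eye(n) - sqw[:, None] * K * sqw[None, :])

def gaudin_density(s, ds=1e-3):
    """G'(s) as the second derivative of the Fredholm determinant."""
    return (gap_probability(s + ds) - 2.0 * gap_probability(s)
            + gap_probability(s - ds)) / ds**2

def wigner_surmise(s):
    return 32.0 / np.pi**2 * s**2 * np.exp(-4.0 * s**2 / np.pi)

print(gaudin_density(1.0), wigner_surmise(1.0))   # close to each other
```

For moderate s the two densities are close, illustrating in what sense the surmise "provides a good approximation" to G′.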
Before we can further review results on the spacing distribution for invariant ensembles, we first need to introduce the notion of the equilibrium measure in order to rescale the particles. It is well known that under very mild assumptions on V and J, there is a measure μ_V on R with compact support that is the weak limit of the expected empirical spectral distribution of P_{N,V}, i.e. for all continuous and bounded g : R → R

lim_{N→∞} E_{N,V} [ (1/N) ∑_{j=1}^N g(x_j) ] = ∫ g dμ_V.  (1.8)

Here E_{N,V} denotes the expectation w.r.t. P_{N,V}. The measure μ_V is the unique probability measure among all Borel measures on J which minimizes the functional

I_V(μ) := ∫∫ log |x − y|^{−1} dμ(x) dμ(y) + ∫ V dμ.

Throughout our paper we will assume that V is convex and real-analytic on a neighborhood of the support of μ_V (see (GA)_1). Then μ_V has a density, which, abusing notation, we will also denote by μ_V; it is strictly positive in the interior of the support of μ_V.

For invariant ensembles the results on the spacing distribution are by far weaker than for circular ensembles. For instance, until the recent [Sch15] by the first author, only the absolutely continuous intensity measure Eσ(I_N, x) had been considered for these ensembles. We recall that if x is random, then σ(I_N, x) is a random measure and its intensity measure is defined by (Eσ(I_N, x))(B) := E[σ(I_N, x)(B)] for any measurable set B, where E denotes expectation w.r.t. the probability measure underlying the random variable x. We will call this measure the expected spacing measure. To our knowledge, the first rigorous result on the spacing distribution for unitary invariant matrix ensembles is due to Deift et al. [DKM+99]. Let a ∈ supp(μ_V)° and t_N > 0 be such that t_N → ∞ and t_N/N → 0 as N → ∞. Setting I_N = [a − t_N, a + t_N], it has been shown in [DKM+99], following the method of Katz and Sarnak, that for real-analytic V and f ≡ 0 in (1.3) we have

lim_{N→∞} E_{N,V} ∫_0^s dσ̃(I_N, N μ_V(a) x) = G(s) for all s ≥ 0.  (1.10)

We observe that (1.10) clearly shows universality, as the r.h.s. of (1.10) does not depend on V.
Let us also remark that (1.10) expresses the weak convergence of the distribution function of the asymptotically normalized expected spacing distribution to the Gaudin distribution. Furthermore, due to the continuity of G, (1.10) actually holds uniformly in s.
An analogous result for certain Hermitian Wigner matrices was proved by Johansson in [Joh01]. Universality was proved for large classes of Wigner matrices and invariant ensembles in terms of the expected spacing measures by Bourgade, Erdős, Schlein, Yau, Yin et al. (see [BEY14] and the references therein). In particular, they show vague convergence of the asymptotically normalized expected spacing measure. Moreover, the limiting distribution has been proved to be universal for large classes of real-symmetric, Hermitian or quaternionic self-dual random matrices as well as for general β-ensembles, which are variants of (1.3) in which a parameter β > 0 replaces the exponent 2. The universal limit depends on the symmetry type or on β, respectively.
Recently, there has been quite some interest in the distribution of a single spacing. This interest was initiated by Tao, who proved in [Tao13] that the Gaudin distribution is also the limit in law of a single spacing in the bulk of the spectrum, say of x_{⌊N/2⌋+1} − x_{⌊N/2⌋}. This shows that for expected spacings, an average over an increasing number of spacings as in (1.10) is not necessary to obtain the limiting distribution. Tao proved this for Hermitian Wigner matrices, and the result was later extended in [EY12] and [BFG15] to all symmetry classes and β-ensembles. Moreover, the results in [EY12] and [BFG15] allow one to consider statistics of the form (for ordered x_i)

(1/|I|) ∑_{i∈I} E g( N μ_{V,i} (x_{i+1} − x_i) ),  (1.11)

where μ_{V,i} := μ_V(q_i) and q_i is the i/N-quantile of the distribution μ_V. The test function g is assumed to be smooth and of compact support, thus determining vague convergence. The index set I is assumed to contain only bulk eigenvalues in the case of [EY12] and is arbitrary in [BFG15]. Both works show that in the large N limit, (1.11) becomes independent of V, thereby showing universality. However, to deduce convergence to the Gaudin distribution via [Tao13], the set I has to fulfill εN ≤ min I ≤ max I ≤ (1 − ε)N for some ε > 0. It is instructive to compare the statements (1.11) and (1.2). In (1.11), as well as in (1.10), the spacing statistics are first averaged over all realizations by taking the expectation, thus making them non-empirical. After that, this average is compared to the limiting quantity. In (1.2), the empirical statistic is first compared to the limit; the error of that comparison is then shown to vanish in L¹. It is further important to note that for statements like (1.2), the number of eigenvalues in the statistic necessarily has to increase with N. Indeed, the smooth function G can only be approximated well by the step function s ↦ ∫_0^s dσ̃(I_N, x) if the number of steps goes to infinity.
The convergence of (1.11) for a finite set I is only possible since taking the expectation means averaging over infinitely many spacings (from all different realizations).
Let us turn to empirical spacings. In [Sch15] the first author of this work shows, under certain assumptions on the Christoffel-Darboux kernel (see (2.18) for a definition of the kernel), that for unitary invariant ensembles

lim_{N→∞} E_{N,V} sup_{s≥0} | ∫_0^s dσ̃(I_N, N μ_V(a) x) − G(s) | = 0,  (1.12)

with I_N, a as in (1.10). This result is quite analogous to (1.6) for the CUE and was also shown in [Sch15] for eigenvalues of real-symmetric and quaternionic self-dual ensembles. However, this result has the drawback that the considered spacing statistics are very inefficient in an empirical sense, in that only a tiny fraction of the eigenvalues is used for the statistic. Indeed, the expected number of rescaled eigenvalues N μ_V(a) x_j in I_N = [a − t_N, a + t_N] is about 2t_N, and thus by the condition t_N/N → 0 the fraction of used eigenvalues even goes to 0 as N gets large. This unsatisfying situation is due to the scaling of the eigenvalues

N μ_V(a) x_j, j = 1, …, N,  (1.13)

which we will call localized around the point a. To obtain a universal limit, the eigenvalues have to be rescaled such that their mean spacing is (asymptotically) one. This is achieved by the rescaling (1.13) in a small vicinity of the point a in the bulk of the spectrum, where the mean spacing is (μ_V(a)N)^{−1} + o(N^{−1}). But as the equilibrium density μ_V(a) is generically not constant in a (in contrast to the CUE), a linear rescaling such as (1.13) does not allow one to consider spacings of eigenvalues which are spread over an interval of macroscopic size. To overcome this problem, we will in this article unfold the eigenvalues with the distribution function of μ_V. This natural non-linear rescaling produces an ensemble with constant equilibrium density.
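As an illustration of the unfolding (ours, not from the paper), consider the GUE with V(t) = t², whose equilibrium measure is the semicircle law with density μ_V(t) = π^{−1}√(2 − t²) on [−√2, √2]. Applying N·F_V to sampled eigenvalues produces points with mean spacing approximately 1 across the whole bulk:

```python
import numpy as np

rng = np.random.default_rng(0)

def gue_eigenvalues(N):
    """Eigenvalues of a GUE matrix with density proportional to exp(-N tr M^2)."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    M = (A + A.conj().T) / (2 * np.sqrt(2 * N))   # entries scaled to this potential
    return np.linalg.eigvalsh(M)

def semicircle_cdf(t):
    """F_V for V(t) = t^2: distribution function of (1/pi) sqrt(2 - t^2)."""
    t = np.clip(t, -np.sqrt(2), np.sqrt(2))
    return 0.5 + t * np.sqrt(2 - t**2) / (2 * np.pi) + np.arcsin(t / np.sqrt(2)) / np.pi

N = 400
x = np.sort(gue_eigenvalues(N))
x_unfolded = N * semicircle_cdf(x)              # unfolded eigenvalues on [0, N]
spacings = np.diff(x_unfolded)
print(np.mean(spacings[N // 4 : 3 * N // 4]))   # bulk mean spacing, close to 1
```

A histogram of these spacings (over an interval of macroscopic length) is exactly the statistic σ̃(I_N, x̃) studied in this paper.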

Statement of Results
Before we give a precise formulation of our results, we specify our assumptions for the ensembles (1.3) and (1.4). Except for the support J, the class of ensembles P^h_{N,Q} formally subsumes the unitary invariant ensembles P_{N,V}. However, our assumptions depend on the specific type of ensemble, and therefore we will consider two different sets of assumptions. For P_{N,V,f} these assumptions, labeled (GA)_1, consist of regularity and convexity conditions (1)-(3) on V and f (in particular, V is real-analytic and convex), together with (4): L_−, L_+ do not belong to the support of the equilibrium measure μ_V.
If f = 0, then (2) can be replaced by the weaker condition (2') V ′ is strictly increasing.
For the ensembles P^h_{N,Q}, we make the following assumptions, labeled (GA)_2. Conditions (1) and (2) concern the smoothness and convexity of Q, with α_Q denoting a lower bound on Q″; condition (3) reads:
(3) h is a Schwartz function with exponentially fast decaying Fourier transform.
Under (GA)_2, it has been shown in [GV14] that for given h and for each Q with α_Q large enough (depending on h), there exists a probability measure μ^h_Q such that (1.8) holds with μ^h_Q replacing μ_V and E^h_{N,Q} replacing E_{N,V}. Furthermore, μ^h_Q also has the generic form (4.47) with V := Q + h * μ^h_Q, where * denotes convolution.
Remark 2.1 (On the assumptions). One could, without substantial changes, also introduce the external field f to the model P^h_{N,Q}. For the sake of notational convenience this is not done here. The model could also be studied on an interval J which is not the whole line. This would require a condition analogous to (GA)_1 (4) for P^h_{N,Q}. For repulsive particle systems the equations (4.42), which determine the endpoints of the equilibrium measure, depend on the equilibrium measure itself, which makes checking an analog of (GA)_1 (4) more complicated than for invariant ensembles. Nevertheless, we will show in our analysis that a truncation to a large enough J only produces an asymptotically negligible error.
To define the scaling of the particles, for P_{N,V,f} let F_V denote the distribution function of the equilibrium measure μ_V and consider the unfolded eigenvalues

x̃_j := N F_V(x_j), j = 1, …, N.

If x is a random configuration sampled from P_{N,V,f}, then x̃ is a point process on [0, N] with asymptotically constant density and mean spacing 1, at least in the bulk. If x is distributed according to the repulsive particle system P^h_{N,Q}, let F^h_Q denote the distribution function of μ^h_Q and define x̃_j := N F^h_Q(x_j), j = 1, …, N. Now, our main theorem on the spacing distribution reads as follows.

Theorem 2.2. (1) Let P_N be either P_{N,V,f} satisfying (GA)_1 or P^h_{N,Q} satisfying (GA)_2 with α_Q large enough (this depends on h only) and h negative-definite. Then for any ε > 0 we have (2.14). (2) If P_N = P^h_{N,Q} with h not necessarily negative-definite, then for α_Q large enough, (2.14) holds with the O-term replaced by o(1). In (1) and (2), σ̃(I_N, x̃) can be replaced by |I_N|^{−1} σ(I_N, x̃) without altering the result.
Remark 2.3. (1) In the language of mathematical statistics, (2.14) implies that σ̃(I_N, x̃) is an asymptotically consistent estimator for the Gaudin distribution, considered in the Kolmogorov metric. Note that similar results on the expected spacing distribution like (1.10) only show asymptotic unbiasedness. For applications, it would be favorable to have the unfolding with F := F_V or F := F^h_Q replaced by an unfolding based solely on the empirical values. We remark that using the empirical distribution function as a naive estimator for F would result in a non-random x̃. However, we expect that a smoothed empirical distribution function should yield the desired result. In fact, this is a typical procedure in statistical analysis. It will be considered in a future work.
(2) To our knowledge, Theorem 2.2 is the first result on rates of convergence to the Gaudin distribution for invariant ensembles. For the simpler CUE, a rate of O(|I_N|^{−1/6+ε}) for any ε > 0 was derived in [KS99], and a rate of O(|I_N|^{−1/2+ε}) was shown in [Sos98]. Numerical experiments (cf. [KS13]) suggest that the optimal rate for the GUE is |I_N|^{−1/2}, possibly with some logarithmic factor. The dependence of the rate on |I_N| reflects the fact that necessarily a growing number of empirical spacings has to be considered in order to obtain convergence.
(3) For negative-definite h, an exact representation of P^h_{N,Q} in terms of determinantal ensembles will be derived in Section 5, which allows us to transfer rates of convergence from the unitary invariant ensembles to the repulsive particle systems. For general h, only convergence can be shown; see Remark 5.2 for more details. On the other hand, if h is positive-definite, then it suffices to have α_Q > sup_{t∈R} −h″(t).
Abandoning any rate of convergence, we can also deduce the result of Theorem 2.2 for I_N = [0, N].
Corollary 2.4. With P_N as in Theorem 2.2 (1) or (2), we have

lim_{N→∞} E_N sup_{s≥0} | ∫_0^s dσ̃([0, N], x̃) − G(s) | = 0.  (2.15)

Remark 2.5. Note that in (2.15), edge spacings are included. Although correlations between eigenvalues at the edge are not given by the sine kernel, the number of edge spacings is relatively small and thus does not change the limit distribution.
The next corollary shows a much better rate of convergence for the expected spacing distribution. We believe that this rate is almost optimal.
Corollary 2.6. Let P_N, I_N be as in Theorem 2.2 (1). Then for any ε > 0

sup_{s≥0} | E_N ∫_0^s dσ̃(I_N, x̃) − G(s) | = O(|I_N|^{−1+ε}).

Remark 2.7. Recall from the discussion around (1.11) that a similar result should also hold for intervals with |I_N| of order 1, probably with rate O(N^{−1+ε}). As Corollary 2.6 is merely a byproduct of an analysis which necessarily deals with growing intervals, we will not pursue this here.
A major ingredient to the proof of Theorem 2.2 is a strong form of bulk universality for correlation functions, which should be of independent interest. To state it, let us recall the notion of correlation functions.
For a probability measure P_N(x)dx on R^N, invariant under permutations of its arguments, the k-th correlation function is the map

R^k_N(t_1, …, t_k) := (N!/(N − k)!) ∫_{R^{N−k}} P_N(t_1, …, t_k, x_{k+1}, …, x_N) dx_{k+1} … dx_N.  (2.16)

Note that in previous works of the second author, the correlation functions are defined as the k-dimensional marginal densities of P_N and therefore differ from (2.16) by the factor N!/(N − k)!. The definition in (2.16) is more convenient for dealing with sums and will be used throughout this article. A crucial fact for many computations and universality proofs is that unitary invariant ensembles are determinantal, i.e.

R^k_N(t_1, …, t_k) = det( K_N(t_i, t_j) )_{1≤i,j≤k},  (2.17)

such that the analysis boils down to studying the so-called Christoffel-Darboux kernel

K_N(t, s) := ∑_{j=0}^{N−1} p_{j,N}(t) p_{j,N}(s) e^{−(N/2)(V(t)+V(s)) − (1/2)(f(t)+f(s))},  (2.18)

where p_{j,N}, j = 0, 1, …, are the orthonormal polynomials with positive leading coefficients w.r.t. the weight e^{−N V(t) − f(t)} on J. To formulate a certain uniformity in the field f, let for a domain D ⊂ C, (X_D, ‖·‖_D) denote the Banach space of functions f : D → C which are analytic, bounded and real-valued on D ∩ R. Here ‖·‖_D denotes the sup-norm on D.
Theorem 2.8. (1) The Christoffel-Darboux kernel satisfies the sine kernel asymptotics on I_N, where the error term is uniform for t, s ∈ I_N.
(2) Assume in addition to (1) that J is compact and let 0 < η < 1. Then there is a complex domain D ⊃ J such that (1) holds uniformly for all f ∈ X_D with ‖f‖_D ≤ N^η.
(3) Let R^k_N be the k-th correlation function of P_N with P_N as in Theorem 2.2 (1). Then, abbreviating t̃_j := F_V^{−1}(t_j/N) in the unitary invariant case and t̃_j := (F^h_Q)^{−1}(t_j/N) for the repulsive particle systems, as well as writing μ for μ_V and μ^h_Q, respectively, we have for any ε > 0 the corresponding sine kernel asymptotics for R^k_N(t̃_1, …, t̃_k), with the O-term being uniform for t_1, …, t_k ∈ I_N. If P_N = P_{N,V,f}, then the statement is valid for ε = 0. For P_N as in Theorem 2.2 (2), the statement is valid with the O-term replaced by o(1).
Remark 2.9. The convergence of the Christoffel-Darboux kernel of some determinantal ensemble to the sine kernel is a very typical result in RMT and the content of numerous papers in the field. We will mention only a few seminal papers here and refer to [Kui11] for an overview. A first universality proof was given by Pastur and Shcherbina [PS97] (see also [PS08]) for sufficiently smooth V. Deift et al. [DKM+99] showed universality for real-analytic V using Riemann-Hilbert techniques, and more recently Levin and Lubinsky [LL08] established it under very mild assumptions on V using complex analysis.
In the existing literature, the kernel is usually considered in the localized scaling (N μ(a))^{−1} K_N( a + t/(N μ(a)), a + s/(N μ(a)) ), where a is in the bulk of the spectrum and μ is the limiting spectral density. Then t and s are typically assumed to lie in some fixed compact set, i.e. their distance is bounded in N. In the recent [KSSV14], convergence is shown under the assumption |t − s| = o(N) with the rate O((1 + |t| + |s|)/N), which is optimal in the localized scaling. To our knowledge, Theorem 2.8 is the first version of bulk universality for unitary invariant ensembles which does not require all eigenvalues to lie in a vicinity of some point a. Moreover, the rate in part (1) of the theorem is optimal, as can be seen for instance in (4.49).
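For the GUE weight e^{−Nt²} the Christoffel-Darboux kernel can be built from weighted Hermite functions, and the localized sine-kernel asymptotics can be checked numerically. The following sketch is our own illustration (not part of the paper's argument); it assumes V(t) = t², f = 0, and uses the stable three-term recurrence:

```python
import numpy as np

def weighted_hermite(x, nmax):
    """phi_j(x) = H_j(x) exp(-x^2/2) / sqrt(sqrt(pi) 2^j j!), j = 0..nmax-1,
    evaluated via the numerically stable three-term recurrence."""
    x = np.atleast_1d(x).astype(float)
    phi = np.empty((nmax, x.size))
    phi[0] = np.pi ** -0.25 * np.exp(-x**2 / 2)
    if nmax > 1:
        phi[1] = np.sqrt(2.0) * x * phi[0]
    for j in range(1, nmax - 1):
        phi[j + 1] = x * np.sqrt(2.0 / (j + 1)) * phi[j] - np.sqrt(j / (j + 1.0)) * phi[j - 1]
    return phi

def cd_kernel(N, t, s):
    """Christoffel-Darboux kernel K_N(t, s) for the GUE weight exp(-N t^2)."""
    pt = weighted_hermite(np.sqrt(N) * np.atleast_1d(t), N)
    ps = weighted_hermite(np.sqrt(N) * np.atleast_1d(s), N)
    return np.sqrt(N) * pt.T @ ps

N = 200
rho = np.sqrt(2) / np.pi                      # equilibrium density mu_V(0) for V(t) = t^2
print(cd_kernel(N, 0.0, 0.0)[0, 0] / N)       # one-point density at 0, ~ sqrt(2)/pi
t, s = 0.7, 0.1
print(cd_kernel(N, t / (N * rho), s / (N * rho))[0, 0] / (N * rho), np.sinc(t - s))
```

The first print gives the one-point density at 0 divided by N, close to μ_V(0) = √2/π; the second compares the kernel in the localized scaling with S(t − s).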
Theorem 2.2 will be deduced from the following more general result.
Theorem 2.10. Let, for each N, I_N ⊂ [0, ∞) be an interval and let P_N(x)dx be a probability measure on R^N, invariant under permutations of the coordinates. Let R^k_N denote the k-th correlation function of P_N(x)dx, defined in (2.16). Further, let C_0 > 0 denote a constant such that the following two conditions are satisfied: (1) for all N ∈ N the correlation functions obey a growth bound in terms of C_0; (2) the correlation functions converge on I_N to S_k, where, for S as in (1.7), S_k is given by

S_k(t_1, …, t_k) := det( S(t_i − t_j) )_{1≤i,j≤k}.

Then for any ε > 0 the asserted estimate holds. The same result holds with σ̃(I_N, Nx) replaced by |I_N|^{−1} σ(I_N, Nx).
The preceding theorem can also be used to show universality of the spacing distribution in the localized scaling.
Corollary 2.11. Let P N be as in Theorem 2.2 (1).
Let a be such that μ(a) > 0 and let I_N ⊂ J be a symmetric interval with center a whose length grows suitably with N. If P_N is as in Theorem 2.2 (2), the last statement is valid with the O-term replaced by o(1). In either case, the statements remain valid if σ̃(I_N, ·) is replaced by |I_N|^{−1} σ(I_N, ·).

Remark 2.12. Note that for |I_N| ≫ √N, Theorem 2.2 provides a better rate of convergence for the unfolded particles than Corollary 2.11 for the localized particles.
We will finish this section with some concluding remarks. The repulsive particle systems P^h_{N,Q} first appeared, in a more general setting with many-body interactions, in [BdMPS95], where results on the equilibrium measure were announced under a convexity condition on the additional interaction. Further results on global asymptotics in such models can be found in [BGK13] and [CGZ14]. Local bulk universality was proved for P^h_{N,Q} in [GV14] and for the β variants in [Ven13]. Edge universality and fine asymptotics of the largest particle have been considered for P^h_{N,Q} in the recent [KV15].
The paper is organized as follows: After a brief outline, the proof of Theorem 2.10 will be given in Section 3. Here we follow the method developed by Katz and Sarnak in [KS99], streamlining and optimizing in order to obtain better rates of convergence. Theorem 2.8 (1) and (2) will be proved in Section 4. The proof of Theorem 2.8 (3) for the repulsive particle systems relies on a non-trivial reduction to the unitary invariant case. An outline of the method from [GV14] and the proof of Theorem 2.8 (3) are contained in Section 5. The proofs of Theorem 2.2 and the corollaries are given in Section 6.

Investigation of the Spacing Distribution - Proof of Theorem 2.10
Theorem 2.10 will be proved first with the asymptotic and non-random normalization |I_N|; the case of the exact but random normalization will be discussed at the end of the proof. Moreover, let us note that in this case the statement of Theorem 2.10 is trivial if |I_N| is bounded in N. Hence, we will from now on assume that |I_N| → ∞ as N → ∞. Let us furthermore make some notational remarks. By C we denote absolute positive constants that may change from line to line. Finally, note that we often suppress certain N-dependencies.
A major disadvantage of σ(I_N, x) is its dependence on the ordering x_1 ≤ · · · ≤ x_N, which prevents an efficient use of correlation functions. This problem can be circumvented by using the measures γ_k(I_N, x), which are symmetric and fulfil the relations (3.20). The proof of Theorem 2.10 consists of three steps. The first step establishes the pointwise convergence and bounds the difference of ∫_0^s dσ̃(I_N, Nx) and G(s) in terms of the variances of the γ_k's. In the second step, these variances are estimated. From this we can deduce a bound on (3.22). The difference between (3.22) and the quantity to be estimated in Theorem 2.10 is that we need to take the supremum over all s before taking the expectation. This issue is addressed by considering (3.22) at a (growing) number of nodes s_i rather than at a single s. The respective results are provided in Section 3.3.
Before we turn to the proof of Theorem 2.10, we note an important estimate for power sums. We will frequently encounter sums of the form ∑_{k=2}^{L} a_k z^k with L and z growing in N and with different sequences (a_k). To provide a unified and efficient treatment of these sums, the following simple lemma will prove useful.
For an entire function f, recall that f is said to be of finite order if the inequality max_{|z|≤r} |f(z)| < e^{r^κ} holds for all r large enough and some κ < ∞. The greatest lower bound of all such κ is called the order of f. If f has the power series expansion ∑_{k=0}^∞ a_k z^k, then the order ρ can be found via

ρ = limsup_{k→∞} (k log k) / log(1/|a_k|).  (3.23)

If f has finite order ρ, then it is said to be of finite type if max_{|z|≤r} |f(z)| < e^{ζ r^ρ} holds for r large enough and some finite ζ. The greatest lower bound ν of all such ζ is called the type of f and can be determined via

ν = (1/(eρ)) limsup_{k→∞} k |a_k|^{ρ/k}.  (3.24)

Lemma 3.1. Let f(z) = ∑_{k=0}^∞ |a_k| z^k be an entire function of order at most 2 and finite type.
Proof. Let ε > 0 and let ν denote the type of f. For any K > 1, we have the trivial estimate. Choosing K = K(ε) large enough, the lemma is proved. Here we used that K^{L_N} is of subpolynomial growth in M_N, due to our assumption on L_N.
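The coefficient formulas for order and type, (3.23) and (3.24), can be tried out numerically; for instance f(z) = e^{z²} = ∑_m z^{2m}/m! has order 2 and type 1. The following is our own side illustration (the order formula converges only logarithmically in k), using Stirling via lgamma:

```python
import math

# f(z) = exp(z^2) = sum_m z^(2m)/m!, so |a_k| = 1/(k/2)! for even k, 0 otherwise.
def log_a(k):
    """log |a_k| for even k."""
    return -math.lgamma(k // 2 + 1)

k = 10**6                                              # a large even index
order_est = k * math.log(k) / (-log_a(k))              # (3.23): approaches 2, slowly in k
rho = 2.0
type_est = k * math.exp((rho / k) * log_a(k)) / (math.e * rho)  # (3.24): already close to 1
print(order_est, type_est)
```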
3.1. We turn to the proof of (3.21). For s > 0 and t_1, …, t_k, we denote by χ_{s,I_N} the corresponding cut-off function. To select certain entries of a vector x = (x_1, …, x_N) ∈ R^N we use the notation x_T for T ⊂ {1, …, N}. With this notation we can rewrite the measures γ_k and we obtain (3.26). The following lemma establishes the convergence of the terms E_N ∫_0^s dγ_k(I_N, Nx) and further provides a useful estimate for the proof of Theorem 2.10. The proof is essentially contained in [Sch15] and we revisit the arguments in order to adjust them to the current setting.
Lemma 3.2. Let the conditions of Theorem 2.10 be satisfied.
(1) For k ≥ 2 we have where the O term is uniform for s ∈ R and k ∈ N.
Proof. In order to prove (1), we consider (3.26) and use the uniform convergence of the correlation functions from condition (2) of Theorem 2.10. We further use the obvious estimate together with the translational invariance of S_k and a change of variables. Observe that in the latter integral, z_1 is integrated over I_N except for an interval of length at most s. The estimate in (1) is then obvious. Observe that statement (1) together with (3.20) already implies (3.21).
To show (2), we first introduce the notation E(s, k) for the limits appearing in (1). The idea is to use (3.20) and to bound G(s) and ∫_0^s dσ(I_N, Nx) from above and from below by alternating sums over E(s, k) and ∫_0^s dγ_k(I_N, Nx), respectively. Then we obtain the corresponding bounds for L ∈ N. Introducing the notation s̄ := max(1, s), we conclude from (3.28) and, using (1), arrive at the desired estimate. Now, under our assumptions on the growth of s and L, the first error term is of the claimed size for any ε > 0 by Lemma 3.1, applied with f(z) = e^{C_0 z}. To deal with the remaining sum, first observe that the corresponding series converges absolutely and defines an entire function. Using (3.23) and Stirling's formula, we readily compute its order as ρ = 2, and using (3.24) and again Stirling's formula, its type as ν = C_0² e/2. Thus Lemma 3.1 finishes the proof.
3.2. The variance of the γ_k's.

Proof. In order to calculate the variance of ∫_0^s dγ_k(I_N, Nx), we consider the terms with l < 2k in (3.30) later and observe that by (3.26), with t′ := (t_1, …, t_k) and t″ := (t_{k+1}, …, t_{2k}), we have (3.31). To calculate the difference of (3.31) and (3.32) we use (3.28), and we now claim that

S_{2k}(t′, t″) ≤ S_k(t′) S_k(t″).  (3.33)

If two components of (t′, t″) are equal, then S_{2k}(t′, t″) = 0 and the claim is trivially true, as S_k ≥ 0. If all components are distinct, then (S(t_n − t_m))_{1≤n,m≤j} is a positive-definite matrix (its leading principal minors are exactly S_1, S_2, …, S_{j−1} > 0). Now, (3.33) follows from Fischer's inequality. With (3.33) we can estimate further. We continue with the integrals in the double sum. For l ∈ {k, …, 2k − 1} and sets T, M ⊂ {1, …, N} with |T| = |M| = k and |T ∪ M| = l (i.e. T and M have a non-empty intersection), we have, using the symmetry of R^l_N, the corresponding identity; an easy calculation then completes the estimate. Summarizing, we have shown the variance bound of Lemma 3.3.

From Lemma 3.3, we can already derive an estimate on the expected deviation of the spacing distribution from its limit at a given point s.
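Fischer's inequality, det A ≤ det(A_TT) det(A_MM) for a positive-definite matrix A split into complementary principal blocks, is exactly the mechanism behind the bound S_{2k}(t′, t″) ≤ S_k(t′) S_k(t″). A small numerical illustration (ours, not from the paper) with the matrix (S(t_n − t_m))_{n,m}:

```python
import numpy as np

rng = np.random.default_rng(1)

def sine_kernel_matrix(t):
    """(S(t_n - t_m))_{n,m} with S(t) = sin(pi t)/(pi t), S(0) = 1."""
    return np.sinc(np.subtract.outer(t, t))

k = 3
t = rng.uniform(0, 5, size=2 * k)          # 2k distinct points
A = sine_kernel_matrix(t)                   # positive definite for distinct points
S2k = np.linalg.det(A)                      # S_2k(t', t'')
Sk1 = np.linalg.det(A[:k, :k])              # S_k(t'),  block for t'
Sk2 = np.linalg.det(A[k:, k:])              # S_k(t''), block for t''
print(S2k <= Sk1 * Sk2)                     # Fischer's inequality: True
```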
Corollary 3.4. Let the conditions of Lemma 3.2 be satisfied. Then we have, for any ε > 0 and s ≥ 0, a bound on the expected deviation in (3.22).

Proof. By statement (2) in Lemma 3.2, it suffices to bound the variance contribution. As in the proof of Lemma 3.2, we can apply Lemma 3.1 with the two functions arising there, which are both of order 2 and of finite type, as can be checked easily using (3.23) and (3.24). It is known (cf. [KS99, Proposition 3.1.9]) that the Gaudin distribution has sub-Gaussian tails, that is, 1 − G(s) ≤ A e^{−Bs²} for some A, B > 0. This implies the required decay of δ_N.

Proof of Theorem 2.10. We first establish the inequality (3.37). It has been given in [KS99], so we only sketch its simple proof, which rests on the monotonicity of the distribution functions involved for fixed nodes. Next, we show (3.38). Using the notation S(I_N, Nx) := #{i : N x_i ∈ I_N}, we can write (3.39). Hence, we need to calculate the first and second moments of S(I_N, Nx). By an easy computation (using the symmetry of R^N_N), we obtain (3.40). In a similar fashion, using ∫_{I_N²} S²(x, y) dx dy = |I_N|² + O(|I_N|), we obtain an estimate which, together with (3.39) and (3.40), shows the claim; Jensen's inequality then proves (3.38). We further use a crude bound and, choosing M as the smallest natural number larger than A_N^{1/4}, we get with (3.37) and Corollary 3.4 the asserted rate, which proves Theorem 2.10 for |I_N|^{−1} σ(I_N, Nx). To deduce the result for σ̃(I_N, Nx), let A denote, for 0 < ι < 1, a suitable set of configurations. We will assume that N is so large that x ∈ A implies the comparability of the two normalizations. It is straightforward to check that x ∈ A implies |I_N|/

Proof of Theorem 2.8 (1) and (2)
We need to introduce some of the notation of [KSSV14]. Let us define the Mhaskar-Rakhmanov-Saff numbers a V and b V via the relations It is known that for convex, smooth V , a V and b V are uniquely determined by (4.42) and that these are the endpoints of the support of the equilibrium measure µ V . Moreover, it is important for us to see them as functions of V . The linear rescaling that maps [−1, 1] onto [a V , b V ] is denoted by (4.43) Its inverse is (4.44) Note that ρ V is the equilibrium measure of V rescaled such that its support is [−1, 1]. The actual equilibrium measure µ V is related to ρ V via Proof of Theorem 2.8 (1) and (2). We will mostly deal with the more involved case (2). Let us abbreviate V, f for V − f /N . We will also write ∥ · ∥ instead of ∥ · ∥ D .
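The rescalings (4.43) and (4.44) are simply the affine map between [−1, 1] and [a V , b V ] and its inverse. A minimal sketch (the endpoints −2, 2 below are only an example, reminiscent of the semicircle support; the function names are ours):

```python
def rescale(a_V, b_V):
    """Affine map lam: [-1, 1] -> [a_V, b_V], as in (4.43), and its inverse (4.44)."""
    lam = lambda s: 0.5 * (b_V - a_V) * s + 0.5 * (a_V + b_V)
    lam_inv = lambda t: (2.0 * t - a_V - b_V) / (b_V - a_V)
    return lam, lam_inv

lam, lam_inv = rescale(-2.0, 2.0)   # endpoints chosen only as an example
assert lam(-1.0) == -2.0 and lam(1.0) == 2.0
assert abs(lam_inv(lam(0.3)) - 0.3) < 1e-12
```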
where g(r, s) is some function which is not important here. Formula (4.49) holds for all r, s ∈ (−1 + δ, 1 − δ) with arbitrary δ > 0, where the O term is uniform in r, s for fixed δ and uniform for V − f /N ∈ X D for some D. In order to use (4.49) for the proof of Theorem 2.8, we first have to show that for N large enough and some δ > 0, for all f with ∥f ∥ ≤ N η (4.50) for some η > 0 small. By [KSSV14, Lemma 2.4], the maps V → a V , V → b V defined by (4.42) are Fréchet differentiable with (uniformly) bounded derivatives on a neighborhood of V . This lemma was proved in [KSSV14] only for V satisfying (GA) 2 but the proof goes through also for V with (GA) 1 . Hence and thus uniformly for t ∈ [−1, 1]. For given 0 < η < 1, assertions (4.51) and (4.52) hold uniformly for all f with ∥f ∥ ≤ N η , which proves (4.50). We have thus shown that formula (4.49) can be applied. The second summand on the rhs of (4.49) is uniformly bounded for r, s ∈ (−1 + δ, 1 − δ) and hence negligible when multiplied by 1/N . Now, we claim that uniformly on (−1 + δ, 1 − δ) â(r)/â(s) + â(s)/â(r) = 2 + O(|r − s| 2 ).
Setting z := â(r)/â(s), this claim is equivalent to the relation (z − 1) 2 /z = O(|r − s| 2 ). It is now a straightforward application of Taylor's formula to show z − 1 = O(|r − s|). Writing t̃ := F −1 V (t/N ) and using (4.55) and the smoothness of ρ V (see (4.47)), it is straightforward to establish the relation where we used in the last step that µ V is bounded away from 0 on [a V + δ, b V − δ]. Hence (4.54) reduces to where we used the simple inequality |sin(t + s) − sin(t)| ≤ |s|. If 1/|t − s| = O(1/N ), i.e. |t − s|/N ≥ c for some constant c > 0, then F −1 V (t/N ) − F −1 V (s/N ) is bounded away from 0 and hence we see that the first term on the right-hand side of (4.56) is O(1/N ), which proves the theorem in this case (as the sine kernel of t − s is then also O(1/N )).
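The equivalence in the first step uses the algebraic identity z + 1/z − 2 = (z − 1)²/z, so that z − 1 = O(|r − s|) immediately yields the claim for z bounded away from 0. A one-line numerical check of the identity:

```python
def defect(z):
    """z + 1/z - 2, which equals (z - 1)^2 / z identically for z != 0."""
    return z + 1.0 / z - 2.0

for z in (0.5, 0.9, 1.0, 1.1, 2.7):
    assert abs(defect(z) - (z - 1.0) ** 2 / z) < 1e-12
```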
If |t − s| = o(N ), applying Taylor's expansion to F −1 V (s/N ) at t/N leads for some ν between t/N and s/N to Hence, the second summand in (4.57) is O(∥f ∥/N ) and we arrive at Note that in (4.58) the O-term will be of order N −1 for f fixed in case (1) and N η−1 in case (2). By the simple equality The last equality is due to the boundedness of A(t, ν, N ) for t ∈ I N and our assumption (t − s)/N → 0.

Repulsive Particle Systems - Proof of Theorem 2.8 (3)
Note that part (3) of Theorem 2.8 for unitary invariant ensembles follows immediately from part (1) and the determinantal relations (2.17). To prove (3) for repulsive particle systems, we need to introduce some of the methods developed in [GV14] to tackle these ensembles. We remark that in comparison to [GV14], there are several new (technical) elements, in particular the truncation to ∥f ∥ D ≤ N κ and the necessity to work with complex-valued processes. Furthermore, aiming at rates of convergence requires a separate investigation of the cases of negative-definite h and of arbitrary h.
The first step is to decompose the additional interaction term ∑ i<j h(x i − x j ) into a linear term and a bivariate term of lower order. This will be done by the Hoeffding decomposition w.r.t. a (so far arbitrary) probability measure µ on R. Setting h µ (t) := ∫ h(t − s)dµ(s) and for another measure ν on R h µν := ∫∫ h(t − s)dµ(t)dν(s), we may write is of the same shape as N ∑ N j=1 Q(x j ), giving rise to the external field V µ := Q + h µ . Our aim is to choose µ such that P h N,Q and the unitary invariant ensemble P N,Vµ have the same equilibrium measure. To achieve this, the statistic should be concentrated under P N,Vµ . As U µ is a global statistic, we should have where ν now is the equilibrium measure to V µ . As we will not divide by N 2 and thus will be at the scale of fluctuations, the rhs of (5.60) should be 0. This leads us to the condition ν = µ, or in other terms, µ should be the equilibrium measure to V µ . This recursive problem was solved by a fixed point argument in [GV14, Lemma 3.1], showing existence of a measure µ with the desired property. The uniqueness followed later by proving that any fixed point is the limiting measure for P h N,Q . From now on let µ denote the fixed point, set V := V µ and U := U µ . The identity with Z N,V,U := Z h N,Q e C N allows to carry many properties from P N,V over to P h N,Q . Concentration of U under P N,V was proved in [GV14, Proposition 4.7] by showing that the ratio Z N,V /Z N,V,U is bounded in N and bounded away from 0 provided that α Q is large enough. More precisely, for any λ > 0, there is a constant α(λ) < ∞ such that for some 0 < C 1 < C 2 < ∞ and all α Q ≥ α(λ) we have for all N . A main ingredient to this is the following concentration of measure result for linear statistics (cf. [GV14, Corollary 4.4]), which will be used later on.
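The Hoeffding decomposition splits the pair interaction into the linear part h µ , a constant, and a bivariate kernel that is degenerate, i.e. integrates to 0 against µ in each variable. A sketch with a discrete measure µ (the interaction h(t) = e^{−t²} and all names are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
h = lambda t: np.exp(-t**2)               # pair interaction (example choice)

# a discrete probability measure mu, supported on a grid
s = np.linspace(-2.0, 2.0, 50)
w = rng.random(50); w /= w.sum()

h_mu = lambda t: float(np.sum(h(t - s) * w))        # first-order (linear) term
h_mumu = float(sum(w[i] * w[j] * h(s[i] - s[j])
                   for i in range(50) for j in range(50)))

def degenerate(t1, t2):
    """Second-order Hoeffding term; integrates to 0 against mu in each variable."""
    return h(t1 - t2) - h_mu(t1) - h_mu(t2) + h_mumu

# degeneracy: the second-order kernel has mean 0 in each variable under mu
for t1 in (-1.3, 0.0, 0.7):
    assert abs(sum(degenerate(t1, yi) * wi for yi, wi in zip(s, w))) < 1e-10

# the full interaction sum = linear part + constant + degenerate part (N = 5)
x = rng.normal(size=5)
pair = sum(h(x[i] - x[j]) for i in range(5) for j in range(i + 1, 5))
split = (4 * sum(h_mu(xi) for xi in x) - 10 * h_mumu
         + sum(degenerate(x[i], x[j]) for i in range(5) for j in range(i + 1, 5)))
assert abs(pair - split) < 1e-10
```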
Proposition 5.1. Let Q be a real analytic external field with Q ′′ ≥ c > 0. Then for any Lipschitz function f with third derivative bounded on an open interval D containing supp µ Q , we have for arbitrary ε > 0 Here C is uniform in f , Lip f denotes the Lipschitz constant of f on D and ∥ · ∥ ∞ is the sup norm on D.
The key to the local statistics is a linearization method, which transforms the bivariate statistic U into random linear statistics. We give an outline for negative-definite h, that means ĥ ≤ 0, where ĥ(t) := (2π) −1/2 ∫ R e −its h(s)ds denotes the Fourier transform of h. For such a function, −h may be seen as the covariance function of a centered stationary Gaussian process (f (t)) t∈R , i.e. a stochastic process on R whose finite-dimensional distributions are all multivariate Gaussian and such that Cov(f (t), f (s)) = −h(t − s). Then a quick computation verifies that where the expectation is w.r.t. the probability space underlying the Gaussian process. S. Jansen has pointed out to the second author that the linearization (5.62) is known in mathematical physics as the Sine-Gordon transformation. Furthermore, f enters the external field only at scale 1/N , resulting in a perturbation of lower order, which does not influence the equilibrium measure. The limiting bulk correlations are not altered by the function f either, as can be seen from Theorem 2.8. It should be noted that the scaling of the correlation functions is independent of f . To summarize, the ensemble P h N,Q is an average over determinantal ensembles P N,V −f /N with a small random perturbation of the external field. We will show that universality of P h N,Q can be deduced from universality of the invariant ensembles P N,V −f /N . Note that the averaging over f results in a weaker rate of convergence as uniformity in f has to be shown (cf. Theorem 2.8 (1) and (2)).
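The linearization (5.62) can be verified without Monte Carlo via the Gaussian moment generating function E exp(∑ j f (x j )) = exp(½ 1ᵀΣ1) with Σ ij = −h(x i − x j ). The sketch below checks one common normalization of the identity, exp(−∑ i<j h(x i − x j )) = e^{N h(0)/2} E exp(∑ j f (x j )); the example interaction and points are our choices and the constant may differ from the exact form of (5.62):

```python
import numpy as np

h = lambda t: -np.exp(-t**2 / 2)         # negative-definite: Fourier transform <= 0
x = np.array([-1.0, 0.2, 0.5, 1.7])      # particle positions (example)
N = len(x)

# covariance matrix of (f(x_1), ..., f(x_N)) for the Gaussian process f
Sigma = -h(x[:, None] - x[None, :])
# Gaussian moment generating function: E exp(sum_j f(x_j)) = exp(0.5 * 1' Sigma 1)
mgf = np.exp(0.5 * Sigma.sum())

pair_sum = sum(h(x[i] - x[j]) for i in range(N) for j in range(i + 1, N))
lhs = np.exp(-pair_sum)                  # exp(-sum_{i<j} h(x_i - x_j))
rhs = np.exp(N * h(0.0) / 2.0) * mgf     # linearized (averaged) form
assert abs(lhs - rhs) < 1e-12 * lhs
```

Note that Σ must be positive semi-definite for f to exist, which is exactly the negative-definiteness of h.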

Alternative representation of correlation functions and truncation.
Let us define the generalized invariant ensemble Labeling the eigenvalues of the ensemble P N N −k,V by x k+1 , . . . , x N and denoting we arrive at the representation By analogous steps, we represent the k-th correlation function R h,k N,Q of P h N,Q as where we abbreviated U (t 1 , . . . , t k , x k+1 , . . . , x N ) by U (t, x). By [GV14, Lemma 28] we can assume that x k+1 , . . . , x N ∈ [−L, L] for L large enough. To be precise, the lemma shows that for each k we have L, C > 0 such that for all N and for all t 1 , . . . , t k . For later use, let us also state one more inequality from that lemma, valid for some constants C, c 1 , c 2 > 0. It will allow us to restrict the whole ensemble (instead of correlation functions) to some compact [−L, L].

Linearization and proof of Theorem 2.8 (3) for negative-definite h.
Let us give more details on the linearization method for negative-definite h. For such h, −h can indeed be seen as the covariance function of a centered stationary Gaussian process on R such that (5.62) and (5.63) hold. Since the sample paths of that process will become a part of the external field, we have to show analyticity. This can be done by invoking an explicit representation. Recall that ĥ(t) denotes the Fourier transform of h and that we have −ĥ ≥ 0.
For the representation of f , let (B 1 t ) t , (B 2 t ) t denote two independent Brownian motions and define Here it is convenient to understand the stochastic integral as a Wiener integral, which exists for g sufficiently smooth and of a certain decay at ±∞ (note that by the law of the iterated logarithm, |B t | is almost surely bounded by √(2t log log t) for large t).
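The Wiener-integral representation produces the covariance −h by Fourier inversion of the spectral density −ĥ ≥ 0. This inversion can be checked numerically for the explicit negative-definite example h(t) = −e^{−t²/2} (symmetric transform convention; the quadrature grid is an arbitrary choice of ours):

```python
import numpy as np

# h(t) = -exp(-t^2/2) has Fourier transform hhat(u) = -exp(-u^2/2) <= 0
# (symmetric convention), so -hhat is a nonnegative spectral density.
h = lambda t: -np.exp(-t**2 / 2)
neg_hhat = lambda v: np.exp(-v**2 / 2)

u = np.linspace(-12.0, 12.0, 20001)
du = u[1] - u[0]

def cov(t, s):
    """Covariance from the spectral representation; should equal -h(t - s)."""
    return np.sum(np.cos((t - s) * u) * neg_hhat(u)) * du / np.sqrt(2 * np.pi)

for t, s in [(0.0, 0.0), (1.0, 0.3), (-0.7, 1.2)]:
    assert abs(cov(t, s) - (-h(t - s))) < 1e-8
```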
Using that Gaussianity of f is equivalent to Gaussianity of all linear combinations of the random variables {f (t) : t ∈ R}, it is not hard to check that (f (t)) t∈R defined in (5.70) forms a Gaussian process on R. Furthermore, it has mean 0 and covariance function −h.
By the assumption on the exponential decay of ĥ, the rhs of (5.70) can be extended to a strip {x + iy : x ∈ R, |y| < c} for some c > 0, which implies analyticity of f in that strip a.s. Let D := (−L − δ, L + δ) × (−c/2, c/2) (5.71) with δ > 0. Then it also follows from (5.70) that the extended process (f (w)) w∈D is a complex-valued centered Gaussian process with covariance function E(f (w 1 )f (w 2 )) = −h(w 1 − w 2 ). Recall the abbreviation 1≤i,j≤k and set (5.72) To prove Theorem 2.8 (3), in view of (5.68) we have to show for any ε > 0 in the prescribed uniformity. Here we used that by (5.61), E N,V exp U (x) is bounded and bounded away from 0, which carries over to the truncated setting. Note also the slight abuse of notation by not indicating the local scaling of the t j 's in O L .
Proof of Theorem 2.8 (3), negative-definite h. Let h be negative-definite. Furthermore, let f̃ denote a centered Gaussian process with covariance function −h and define f := f̃ − ∫ f̃ dµ. Then we have where we used the identity which can be obtained analogously to (5.67). Thus the first summand of (5.73) equals To apply Theorem 2.8 (2), we will replace the integration over all f by an integration over f with ∥f ∥ D ≤ N η , where D is the complex domain defined in (5.71). More precisely, we will show that for some c > 0. This will be done by applying Hölder's inequality to separate the expectations of Let us reconsider the complex-valued process (f (t)) t∈D , which is the extension of the Gaussian process f̃ on [−L, L] with covariance function −h. It follows from (5.70) that real and imaginary parts of f̃ on D are (real-valued) centered Gaussian processes. Their covariance functions are readily computed as giving the variances Now Borell's inequality (see e.g. [AT07, Theorem 2.1.1]) states that the supremum ∥X∥ K of a continuous centered Gaussian process X t over a compact K has sub-Gaussian tails, more precisely it is dominated by a Gaussian random variable with a certain expectation and variance σ 2 := sup t∈K EX 2 t . Since the sum of sub-Gaussian variables is sub-Gaussian as well, we see with sup w∈D |f + (w)| ≤ sup w∈D |Re f + (w)| + sup w∈D |Im f + (w)| that sup w∈D |f + (w)| is also sub-Gaussian. The same reasoning leads to the conclusion that sup w∈D |f (w)| is sub-Gaussian, giving for some c > 0. Next, we provide a bound for E N,V ;L exp ∑ N j=1 f (x j ) . Proposition 5.1 also holds for the ensemble truncated to [−L, L] with an exponentially small error, which we will neglect. However, it is crucial that in the truncated case the Lipschitz constant is taken over [−L, L] instead of the whole real line.
Thus we get for some C. The derivative f ′ is again a centered stationary Gaussian process with covariance function −h ′′ and thus, by similar arguments as above, sub-Gaussianity of Lip f and analogously also of ∥f (3) ∥ [−L,L] follows. Hence for some λ > 1 (close to 1, coming from Hölder's inequality) there is a constant C such that for all N if α Q (and hence α V ) is large enough.
It is important to note that, as the processes are stationary on R, the variances of Lip f , ∥f ∥ [−L,L] and ∥f (3) ∥ [−L,L] and therefore the required α Q are independent of the truncation threshold L and of k. Now we will estimate C(t)R k N,V,f ;L (t̃ 1 , . . . , t̃ k ) − S k (t). If t i = t j for some i ≠ j, then this quantity is 0, hence we will only consider t with distinct elements. For such t, (2.17) and P N,V,f ;L (x) > 0 for any x ∈ [−L, L] N with distinct components imply that (K N,V,f ;L (t i , t j )) 1≤i,j≤k =: A is a positive-definite matrix and can hence be written as A = B 2 for some matrix B. Using Hadamard's inequality then gives Let us employ the representation as inverse Christoffel function (see e.g. [Tot00]) where the infimum is taken over all polynomials P N −1 of degree at most N − 1 with P N −1 (t) = 1. This representation immediately implies the bound To bound R 1 N,V ;L , we can now use the uniform convergence stated in Theorem 2.8 (2) together with the boundedness of the sine kernel, giving for any ε > 0, given that α Q is large enough.
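The Christoffel-function representation can be tested numerically: the constrained minimum of ∫ |P|² dµ over polynomials with P(t) = 1 equals 1/K(t, t), where K(t, t) is the sum of squared orthonormal polynomials at t. A sketch for a discrete measure (all parameters and names are illustrative choices of ours):

```python
import numpy as np

rng = np.random.default_rng(2)

# a discrete measure mu: points x_m in [-1, 1] with positive weights w_m
M, n_poly = 200, 6
x = np.sort(rng.uniform(-1.0, 1.0, M))
w = rng.random(M); w /= w.sum()

V = np.vander(x, n_poly, increasing=True)        # monomials of degree < n_poly
G = V.T @ (w[:, None] * V)                       # moment (Gram) matrix of mu

t = 0.37
a = t ** np.arange(n_poly)                       # monomial values at t

# Route 1: minimise int |P|^2 dmu subject to P(t) = 1 (Lagrange multiplier)
c = np.linalg.solve(G, a)
c /= a @ c                                       # minimising coefficient vector
lam_t = c @ G @ c                                # inverse Christoffel function value

# Route 2: 1 / K(t, t) with K(t, t) = sum of squared orthonormal polynomials at t
Q, R = np.linalg.qr(np.sqrt(w)[:, None] * V)
p_t = np.linalg.solve(R.T, a)                    # (p_0(t), ..., p_{n_poly-1}(t))
K_tt = float(np.sum(p_t ** 2))

assert abs(lam_t * K_tt - 1.0) < 1e-6
```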

Proof of Theorem 2.8 (3) for general h.
A general h may be decomposed into positive-definite functions as follows. Let ± denote positive and non-positive parts and write ĥ = (ĥ) + − (ĥ) − . Then h = h + − h − , where ĥ ± := (ĥ) ± and furthermore, h ± are positive-definite. To use h ± for our linearization method, we need these functions to be real-analytic. The real-analyticity is somewhat surprisingly equivalent to the exponential decay of ĥ at infinity. On the one hand, exponential decay of ĥ allows to extend h ± to the complex plane via Fourier inversion, thereby showing real-analyticity. On the other hand, [LS52, Theorem 2] tells us that any real-analytic positive-definite function has a Fourier transform of exponential decay.
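The positive-definiteness of h ± is Bochner's theorem: a function with nonnegative Fourier transform is positive-definite. A numerical sketch with a sign-changing ĥ of our own choosing (quadrature and grid parameters are arbitrary):

```python
import numpy as np

u = np.linspace(-10.0, 10.0, 4001)
du = u[1] - u[0]
hhat = (1 - u**2) * np.exp(-u**2 / 2)          # a sign-changing Fourier transform
hhat_p = np.maximum(hhat, 0.0)                  # (hhat)_+
hhat_m = np.maximum(-hhat, 0.0)                 # (hhat)_-

def inv_ft(spec, t):
    """Inverse transform of a real, even spectral function (symmetric convention)."""
    return np.sum(np.cos(t * u) * spec) * du / np.sqrt(2 * np.pi)

t = np.linspace(-2.0, 2.0, 15)
Hp = np.array([[inv_ft(hhat_p, a - b) for b in t] for a in t])
Hm = np.array([[inv_ft(hhat_m, a - b) for b in t] for a in t])

# Bochner: a nonnegative Fourier transform yields a positive-definite function
assert np.linalg.eigvalsh(Hp).min() > -1e-8
assert np.linalg.eigvalsh(Hm).min() > -1e-8
# h = h_+ - h_- recovers the original function
assert abs((Hp - Hm)[0, 3] - inv_ft(hhat, t[0] - t[3])) < 1e-9
```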
Instead of −h, we will use −h z := zh + + h − , z > 0, as a covariance function and use complex analysis to show the desired statement. To this end define for z ∈ C Again, we have to show for z = −1. The proof for negative-definite h implies (5.84) for z > 0. This is basically enough by Vitali's theorem, which we state for the convenience of the reader (cf. [Tit39, 5.21]): Let g n (z) be a sequence of analytic functions on a domain U ⊂ C with |g n (z)| ≤ M for all n and all z ∈ U . Assume that lim n→∞ g n (z) exists for a set of z having a limit point in U . Then lim n→∞ g n (z) exists for all z in the interior of U and the limit is an analytic function in z.
The set containing a limit point will be chosen as a small interval (0, δ) for some δ > 0. We remark in passing that δ can be arbitrarily small and as a consequence h + has no influence on the necessary size of the convexity constant α Q .
To transfer the required uniformity in the t j from the case z > 0 to z = −1 is a technical issue as taking absolute values and suprema would destroy the analyticity in z, which is necessary for the application of Vitali's theorem. Therefore, we will use the following characterization of uniform convergence in terms of sequences: A sequence of continuous real-valued functions (g n ) n , defined on R l , converges uniformly on the sequence of compact sets (B n ) n , B n ⊂ R l towards a continuous function g if and only if for all sequences (n m ) m ⊂ N with lim m→∞ n m = ∞ and all sequences (t m ) m with t m ∈ B nm we have lim m→∞ g nm (t m ) − g(t m ) = 0.
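This characterization can be seen at work on a toy example: g n (t) = t/n on B n = [0, n] converges to 0 pointwise, but the diagonal sequence t m = n m witnesses the failure of uniform convergence on the growing sets (illustration only, not from the paper):

```python
def g(n, t):
    """g_n(t) = t / n on B_n = [0, n]: pointwise limit 0, but not uniformly."""
    return t / n

# pointwise convergence at every fixed t
assert abs(g(10**6, 5.0)) < 1e-5
# the diagonal sequence t_m = n_m in B_{n_m} detects the failure of uniformity
assert all(g(n, float(n)) == 1.0 for n in (10, 100, 1000))
```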
We will take B N := I N . Let (N m ) m ⊂ N be a sequence going to infinity and (t (m) ) m be a sequence with t (m) ∈ I Nm . Let us define W m : C → C as (5.85) Note that we suppressed some of the m-dependencies. Clearly, (W m ) m is a sequence of analytic functions.
Remark 5.2. Vitali's theorem allows to deduce convergence in a region of the complex plane from convergence in another region. As rates of convergence might well differ between regions, it is clear that we cannot transfer the rate O(N −1+ε ), valid for z > 0, to z = −1 with the same technique. This seems to be rather a technical issue; in fact, we believe that the correct rate should also be (at least) O(N −1+ε ).
Proof of Theorem 2.8 (3), general h. We will often drop the dependence on m in the following. For positive z, we can apply the linearization procedure as described above, as then h z is a positive-definite function. Thus we have W m (z) = o(1) for any z ∈ (0, δ) for some δ, provided α Q is large enough, where δ is chosen so small that the lower bound on α Q does not depend on δ. Hence, to apply Vitali's theorem, we need to show boundedness, uniform in m and z, on a domain containing −1 and (0, δ). This domain may in fact be chosen as the half-plane {z ∈ C : Re z < δ}, which can be seen as follows. Bounding W m termwise, (5.82) shows that Im z only gives a phase which vanishes by taking absolute values, hence we can concentrate on real z. For z < 0, (5.82) is non-positive, since it is minus 1/2 times the variance of a Gaussian random variable (cf. (5.63)), implying that in this case the influence of (5.82) in bounding W m will vanish as well, due to taking absolute values. It is thus sufficient to consider z > 0. In this case we get as above that W m (z) equals where f z is a centered Gaussian process with covariance function −h z . Now, the bounds (5.73) and (5.80) can be used again to show uniform boundedness in m. Checking that these bounds are uniform in z ∈ (0, δ) for small δ > 0 is straightforward and left to the reader. The theorem is proved.

Proofs of remaining statements
Proof of Theorem 2.2. Let an interval I N ⊂ [0, N ] be given such that lim inf N →∞ dist(I N , {0, N })/N > 0. We will apply Theorem 2.10 with P N being the distribution of the unfolded ensembles P N,V,f and P h N,Q , respectively. We start with the former case. Let F [−1] V be a function with the following properties: By properties (1) and (2), P N is a probability measure on R N . By property (3), we have F ) −1 for all t such that N t ∈ I N for N large enough. Then the correlation function R k N is (for N large enough) on I k N given by It remains to check conditions (1) and (2) in Theorem 2.10. For condition (1), we may use inequality (5.80), giving which is valid also for ensembles with non-compact support. Here we use again the abbreviation t̃ j := F −1 V (t j /N ). Positivity of µ V on (a V , b V ) yields condition (1) for the unitary invariant ensembles.
For condition (2), another application of Hadamard's inequality may be used, which we cite from [AGZ10, Lemma 4.3.2]. For kernels K 1 , K 2 , defined on some locally compact A × A, and all t ∈ A k , the following inequality holds: (6.87) Here ∥K∥ := sup t,s∈A |K(t, s)| denotes the sup-norm of a kernel K on A × A. Choosing K 1 (t, s) := (N µ V (F −1 V (t/N ))) −1 K N,V,f (F −1 V (t/N ), F −1 V (s/N )) and K 2 as the sine kernel, we find with A := I N that by Theorem 2.8 (1) ∥K 1 − K 2 ∥ = O(1/N ).
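A bound of the type (6.87) can be checked numerically; the version below uses the constant k^{1+k/2}, obtained by replacing one row at a time and applying Hadamard's inequality (the precise constant of [AGZ10, Lemma 4.3.2] may differ; the helper name is ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def det_perturbation_ok(k, eps):
    """|det K1 - det K2| <= k^(1+k/2) * ||K1 - K2||_inf * max(||K1||, ||K2||)^(k-1)."""
    K1 = rng.uniform(-1.0, 1.0, (k, k))
    K2 = K1 + eps * rng.uniform(-1.0, 1.0, (k, k))
    delta = np.abs(K1 - K2).max()
    m = max(np.abs(K1).max(), np.abs(K2).max())
    lhs = abs(np.linalg.det(K1) - np.linalg.det(K2))
    rhs = k ** (1 + k / 2) * delta * m ** (k - 1)
    return lhs <= rhs

assert all(det_perturbation_ok(k, 0.01) for k in (2, 3, 5, 8))
```

This is the mechanism by which closeness of kernels in sup-norm transfers to closeness of the k-point correlation determinants.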
Moreover, the boundedness of the sine kernel and the uniform convergence of K 1 imply that max(∥K 1 ∥, ∥K 2 ∥) ≤ C for some C > 1. This proves condition (2) of Theorem 2.10 for P N,V,f .
For the repulsive particle systems P h N,Q , we will first truncate the ensemble to a compact [−L, L] N and apply Theorem 2.10 to the truncated ensemble. This will be convenient as the truncation threshold L in (5.68) depends on the order of the correlation function k and in the subsequent linearization several constants depend on L, which would complicate checking conditions (1) and (2) of Theorem 2.10 significantly.
For any L > 0 we have the crude bound where E h N,Q;L denotes expectation w.r.t. the truncated ensemble P h N,Q;L . Note that both ensembles have the same limiting measure µ h Q and therefore, there is no change in the unfolding of the particles. The case of σ̃ instead of |I N | −1 σ is completely analogous.
The same procedure as above can now be applied to fit P h N,Q;L into the framework of Theorem 2.10. Condition (1) then follows using the linearization procedure introduced in Section 5. Indeed, an inequality of the form R h,k N,Q (t̃ 1 , . . . , t̃ k ) ≤ C for some C, uniformly in t 1 , . . . , t k , is equivalent to uniformly in t 1 , . . . , t k for small positive z. By linearization, the l.h.s. of (6.88) is equal to an expression to which now (5.80) may be applied for almost all f . The sub-Gaussianity of ∥f ∥ ∞ gives the desired bound. For condition (2), we can use the same arguments and (6.87).
Proof of Corollary 2.4. Let I N be a sequence of proper compact sub-intervals of [0, N ] with I N /N → [0, 1]. We will use the simple inequality The last term is for two sequences of positive numbers ε N , ε ′ N , both converging to 0. Let us exemplarily deal with For R 1 N,V,f , [KSSV14, Theorem 1.5] provides the following asymptotics. For a certain c > 0 and all t ∈ Ĵ with t > 1 + c −1 N −2/3 , we have where the last equality is due to t > 1. Hence η V (t) ≥ c ′ t and we conclude that for some constant. With the linearization technique, this bound can be transferred to the ensemble P h N,Q . We note in passing that due to the mixing over f , for P h N,Q the bound for the first correlation function is somewhat worse than (6.91), see [KV15] for details.
For the edge regime, i.e. b V − ε N < t < b V + c ′′ N −2/3 , the following asymptotics are given in [KSSV14, Theorem 1.5], valid for 1 − c < t < 1 + cN −2/5 (with the same c as above), where γ V,f := 2 −1/3 G V,f (1) 2/3 ; the O-term is uniform in f as above and as before, we can neglect the f -dependence. Here K Ai is the Airy kernel. On the diagonal, the Airy kernel is given as K Ai (t, t) = Ai ′ (t) 2 − t Ai(t) 2 . We are therefore left to estimate Let us first consider the case t < b V , i.e. λ −1 V (t) < 1. Setting ζ(t) := (2/3) t 3/2 , [AS64, 10.4.60, 10.4.62] provide the following asymptotics, valid for t > 0, This estimate can by the now familiar procedure be extended to the ensemble P h N,Q . Altogether, combining (6.90), (6.93) and (6.96), we have Returning to (6.89), we wish to apply Theorem 2.10 for an interval I N which exhausts [0, N ] for N → ∞. It suffices to show uniform convergence of the correlation functions towards the sine kernel, as conditions (1) and (2) of Theorem 2.10 will then follow exactly as in (5.80) and (6.87). Hence our task is to extend the proof of Theorem 2.8 to regions close to the edges. Formula (4.49) was valid for r, s ∈ (−1 + δ, 1 − δ) for some arbitrary but fixed δ > 0. As now I N covers some of the left and right edge regions, we also have to consider correlations between particles from different regions. Here the general [KSSV14, Theorem 1.3] is useful, which gives K N,V,f (t, s) = (k 1 (t)k 2 (s) − k 2 (t)k 1 (s))/(t − s) + O(N −1 ), where k 1 , k 2 are bounded functions and the O term is uniform for t, s from bounded subsets of J. This means that K N,V,f (t, s) is bounded if t − s is bounded away from 0, which includes the case of one particle at the left and the other at the right edge. Dividing by N , we see that Theorem 2.8 (1) and (2) hold in this case. The correlations between bulk particles and edge particles are more subtle, as these regions are adjacent and a transition from sine kernel to Airy kernel statistics occurs.
Without loss of generality we will consider this transition at the right edge only. To state the analogue of (4.49) at the edge, we need some more notation. Let us remark that we restrict ourselves to the case t ≤ b V , the void region not being of interest here thanks to (6.93) and (6.96). Recalling Note that this formula covers a part of the bulk region. By (6.94) and (6.95), K Ai (f N (r), f N (s))(f N (r) − f N (s)) = Ai(f N (r)) Ai ′ (f N (s)) − Ai ′ (f N (r)) Ai(f N (s)) (6.99) = (1/π) [cos(ζ(f N (r)) − π/4) cos(ζ(f N (s)) + π/4) − cos(ζ(f N (s)) − π/4) cos(ζ(f N (r)) + π/4)] + O(ζ(|f N (r)|) −1 + ζ(|f N (s)|) −1 ) = (1/π) sin(ζ(f N (r)) − ζ(f N (s))) + O(ζ(|f N (r)|) −1 + ζ(|f N (s)|) −1 ). Hence we have for r, s < 1, r ≠ s, We have thus recovered the leading term of (4.49). However, the O-term involving ζ(|f N (r)|) −1 will only be small for r and s not too close to each other. We will therefore first consider the case |r − s| ≥ N −2+p for some small p > 0. From (6.94), (6.97), (6.98) and the boundedness of the derivative of d V it follows that for 1 − δ < r, s < 1 − ε ′ N with ε ′ N > 0 converging to 0 slowly enough, we have (b V,f − a V,f )/(2N ) K N,V,f (λ V,f (r), λ V,f (s)) = sin(N π ∫ s r ρ V,f (u)du)/(πN (r − s)) + O(N −1−ι ) for some ι > 0, uniformly for r, s with |r − s| ≥ N −2+p . Now we can proceed as in the proof of Theorem 2.8 (1) and (2). We get with properly chosen ε N > 0 converging to 0, that uniformly in t, s ∈ [(1 − δ)N, (1 − ε N )N ] (6.101) Here we already replaced F V,f and µ V,f by their counterparts F V and µ V , which has been justified in the proof of Theorem 2.8. For |r − s| < N −2+p , we may use Taylor's expansion in s at the point r in (6.99) together with Ai ′′ (t) = t Ai(t) to obtain for certain ν, λ between s and r. Using K Ai (t, t) = Ai ′ (t) 2 − t Ai(t) 2 , we see that (6.102) equals f ′ N (r)K Ai (f N (r), f N (r))(r − s).
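The passage from the middle to the last line of (6.99) is the product-to-sum identity cos(A − π/4) cos(B + π/4) − cos(B − π/4) cos(A + π/4) = sin(A − B), which can be checked directly (helper name ours):

```python
import numpy as np

def airy_phase_identity(A, B):
    """cos(A - pi/4)cos(B + pi/4) - cos(B - pi/4)cos(A + pi/4) = sin(A - B)."""
    lhs = (np.cos(A - np.pi / 4) * np.cos(B + np.pi / 4)
           - np.cos(B - np.pi / 4) * np.cos(A + np.pi / 4))
    return lhs, np.sin(A - B)

for A, B in [(0.3, 1.1), (2.0, -0.5), (5.5, 5.4)]:
    lhs, rhs = airy_phase_identity(A, B)
    assert abs(lhs - rhs) < 1e-12
```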
With these asymptotics and (6.97) we see that uniformly for 1 − δ ≤ r < 1 − ε ′ N with ε ′ N → 0 slowly enough, which establishes (6.101) also close to the diagonal. As written above, this suffices to apply Theorem 2.10 to finish the proof. The transfer to correlation functions is made with (2.17) and to P h N,Q with the linearization method. This proves Corollary 2.4.
The case of repulsive particles is analogous. By the two-fold application of Hadamard's inequality (cf. (5.79) and (6.87)) and the linearization method, it suffices to show lim N →∞ (N µ V (a)) −1 K N,V,f (a + t/(N µ V (a)), a + s/(N µ V (a))) = sin(π(t − s))/(π(t − s)). This is precisely the statement of [KSSV14, Theorem 1.8], which can also easily be deduced from (4.49). Here we used again that the f -dependence in the scaling can be neglected.
Proof of Corollary 2.6. A careful reading of the proof of Lemma 3.2 shows that we have for any s, L and any ε > 0 for any ε ′ > 0. This proves the corollary.