CLT for Circular beta-Ensembles at High Temperature

We consider the macroscopic large N limit of the Circular beta-Ensemble at high temperature, and of its weighted version, in the regime where the inverse temperature scales as beta/N for some parameter beta>0. In the large N limit, the equilibrium measure of this particle system is described as the unique minimizer of a functional which interpolates between the relative entropy (beta=0) and the weighted logarithmic energy (beta=\infty). More precisely, we provide sub-Gaussian concentration estimates in the W_1 metric for the deviations of the empirical measure from this equilibrium measure. The purpose of this work is to show that the fluctuations of the empirical measure around the equilibrium measure converge towards a Gaussian field whose covariance structure interpolates between the Lebesgue L^2 (beta=0) and the Sobolev H^{1/2} (beta=\infty) norms. We furthermore obtain a rate of convergence for the fluctuations in the W_2 metric. Our proof uses the normal approximation result of Lambert, Ledoux and Webb [2017], the Coulomb transport inequality of Chafai, Hardy, Maida [2018] and a spectral analysis of the operator associated with the limiting covariance structure.


Introduction and statement of the results
Let T := [−π, π] ≅ R/2πZ be the one-dimensional torus, which we equip with the metric (x, y) → |e^{ix} − e^{iy}| = |2 sin((x − y)/2)|. Given an inverse temperature parameter β > 0, the Circular beta-ensemble is a celebrated particle system from random matrix theory, consisting of N particles x_1, …, x_N on T with distribution

\frac{1}{Z_N} \prod_{j<k} |e^{ix_j} - e^{ix_k}|^{\beta} \prod_{j=1}^{N} \frac{dx_j}{2\pi},

where Z_N > 0 is a normalization constant. This corresponds to the eigenvalue distribution of a Haar distributed random unitary matrix when β = 2. The macroscopic behavior of this particle system as N → ∞ is well known: the empirical measure

\mu_N := \frac{1}{N} \sum_{j=1}^{N} \delta_{x_j}    (1)

converges almost surely (a.s.) weakly towards the uniform measure dx/2π on T. The fluctuations of the particle system around the uniform measure can be described as well: for any smooth enough test function ψ : T → R satisfying ∫_T ψ dx/2π = 0, Johansson [1988] proved¹ the central limit theorem (CLT)

\sum_{j=1}^{N} \psi(x_j) \xrightarrow{\text{law}} \mathcal{N}\big(0, \tfrac{2}{\beta} \|\psi\|_{H^{1/2}}^2\big) \quad \text{as } N \to \infty,    (2)

where the Sobolev semi-norm ‖·‖_{H^{1/2}} is defined by

\|\psi\|_{H^{1/2}}^2 := \sum_{k \in \mathbb{Z}} |k| \, |\hat\psi_k|^2.    (3)

Here and in what follows, ψ̂_k := ∫_T ψ(x) e^{−ikx} dx/2π are the usual Fourier coefficients. The aim of this work is to provide similar statements at high temperature, namely when β goes to zero as N → ∞. Notice first that if we take β = 0, which corresponds to the infinite temperature setting, then the x_i's are independent random variables uniformly distributed on T. Thus the law of large numbers yields the a.s. weak convergence µ_N → dx/2π as N → ∞ and the classical CLT states that, for any L² function ψ : T → R satisfying ψ̂_0 = 0,

\frac{1}{\sqrt{N}} \sum_{j=1}^{N} \psi(x_j) \xrightarrow{\text{law}} \mathcal{N}\big(0, \|\psi\|_{L^2}^2\big) \quad \text{as } N \to \infty,    (4)

where the L² norm reads

\|\psi\|_{L^2}^2 := \int_{T} \psi(x)^2 \, \frac{dx}{2\pi} = \sum_{k \in \mathbb{Z}} |\hat\psi_k|^2.    (5)

Notice the difference of normalization between (2) and (4).

¹ More precisely, the CLT in [Johansson, 1988] is stated for β = 2, in which case it is equivalent to the strong Szegö theorem for Toeplitz determinants, see for example [Simon, 2005, Chapter 6] or [Deift et al., 2013] for comprehensive expositions of this celebrated result. However, it is straightforward to check that the method of [Johansson, 1988] still applies for any fixed β > 0 provided that the test function ψ is C^{1+α} for some α > 0. See also [Lambert, 2019, Theorem 1.2] for a generalization to the mesoscopic scale. Let us also stress that, although one may believe this CLT holds true as soon as ‖ψ‖_{H^{1/2}} < ∞, a counterexample has been provided in [Lambert, 2019] when β = 4.
As we shall see, there is a critical temperature regime in which the variance structure of the fluctuations interpolates between the Lebesgue L² and Sobolev H^{1/2} (semi-)norms, and this happens when β is of order 1/N. Thus, from now on we consider the particle system where we rescale the inverse temperature parameter as β → 2β/N, the factor 2 being cosmetic. We also consider the case where the particle system is confined by an external potential V and will show that the limiting variance depends on V in a non-trivial way. In contrast, in the usual fixed temperature setting, the variance is expected to depend only on the support of the equilibrium measure.
The study of random matrix ensembles at high temperature (i.e. with an interaction strength of order 1/N) was initiated by Allez, Bouchaud, and Guionnet [2012], who described explicitly the crossover for the density of states from the Wigner semicircle law to the Gaussian law. There are also several results about eigenvalue fluctuations in this regime [Benaych-Georges and Péché, 2015, Trinh, 2017, Nakano and Trinh, 2018, Pakzad, 2018, Nakano and Trinh, 2019], whose study is motivated by the transition from random matrix to Poisson statistics, which is considered to be instrumental in describing the Anderson localization phenomenon. In particular, Trinh [2017] and Nakano and Trinh [2018] obtained a CLT for the linear statistics of the Gaussian beta-ensembles in this temperature regime, relying on the Dumitriu and Edelman [2002] tridiagonal matrix representation for this particle system, although the limiting variance is not explicit. The asymptotic behavior of the largest eigenvalue of the Gaussian beta-ensembles at high temperature has been recently investigated in [Pakzad, 2019a]. Moreover, in [Spohn, 2019], the asymptotic behavior of the generalized free energy of the Toda chain has also been related to certain statistics of the Dumitriu-Edelman model in the high temperature regime. There are also a few results available in higher dimension for Coulomb gases [Rougerie and Serfaty, 2016, Akemann and Byun, 2019] in this regime. Here we choose to focus instead on beta-ensembles on T; the compactness of T yields several technical simplifications in the proofs and a simple formula for the limiting variance. However, let us mention that one could adapt our approach to tackle the setting of the Gaussian beta-ensembles, and the beta-ensembles on R with a general potential as well, and provide an explicit formula for the limiting variance similar to the one that we will derive below.
Let us also mention that fluctuations similar to the ones we obtain here were previously derived by Guionnet and Bodineau [1999] for a two-component 2D plasma model. We now present the particle system we investigate and our main results.
The particle system of interest. For any β > 0 and any continuous potential V : T → R, we consider N random interacting particles on T with joint probability distribution where Z N > 0 is a normalization constant (which depends on the parameters β > 0 and V ). In the following we set and, without loss of generality (by adding a constant to V if necessary), we assume that µ V 0 is a probability measure on T. If we introduce the discrete logarithmic energy of a configuration then (6) takes the form which is the Gibbs measure associated with the energy interaction H at inverse temperature 2β/N with reference measure (µ V 0 ) ⊗N . This particle system has a physical interpretation: we can observe that H (x) = i<j g(x i − x j ) where g can be written as the restriction g(x) = G(x, 0) of the Green function G of the two-dimensional torus T × T, that is ∆G = −2π(δ 0 − 1) on T × T in the distributional sense, see e.g. [Borodin and Serfaty, 2013]. Thus P N describes a gas of N unit charges, interacting according to the laws of electrostatic on the two-dimensional torus but constrained to stay on T T × {0} ⊂ T × T, in presence of an external potential V , at temperature N/(2β). As we shall see below, in this temperature regime, one of the main reasons to study the statistical properties of such a Coulomb gas for large N is that there is a subtle competition between the energy and entropy of the gas which results in non-trivial global fluctuations. This fact is somewhat surprising knowing that for any β ≥ 0, the local fluctuations of the Coulomb gas (6) are described by a Poisson point process with intensity µ V β -this follows from adapting the argument from Nakano and Trinh [2019] from R to T.
Macroscopic behavior. First, we discuss the large N limit of the empirical measure µ N , see (1), when the x i 's are distributed according to P N . If µ lies in the space M 1 (T) of probability measures on T, define its logarithmic energy by Moreover, given any µ, ν ∈ M 1 (T), the relative entropy of µ with respect to ν is given by when µ is absolutely continuous with respect to ν; set K(µ|ν) := +∞ otherwise. The functional of interest here is F Note that when F V β (µ) is finite, then µ is absolutely continuous and, if µ(dx) = µ(x)dx, then we can alternately write In particular, when µ has a density and log µ dµ < ∞, we see that is the celebrated weighted logarithmic energy from potential theory [Saff and Totik, 1997]. The next result can be extracted from the literature.
has compact level sets {F_β ≤ α}, α ∈ R, and is strictly convex. In particular it has a unique minimizer µ^V_β on M_1(T). (b) The sequence (µ_N) satisfies a large deviation principle in M_1(T) equipped with its weak topology at speed βN with rate function When β = 0, this is Sanov's theorem for i.i.d. random variables combined with elementary properties of the relative entropy, see e.g. [Dembo and Zeitouni, 2010]. Moreover, the unique minimizer of F^V_0 is given by (7) and hence the notation is consistent. In the case where β > 0, statement (a) is classical (see e.g. the proof of Proposition 2.1 below) and (b) can be found in [Berman, 2018, García-Zelada, 2018]. In fact, statement (a) of the theorem is also true under weaker regularity assumptions on V and also when β = ∞. Moreover, if one returns to the fixed temperature setting by taking the particle system (6) after the scaling β → Nβ and V → NV, then statement (b) holds true at the same speed with rate function F^V_∞ − F_∞(µ^V_∞), see [Hiai and Petz, 2000, Anderson, Guionnet, and.
We will derive several properties for µ V β in Section 2 but let us already mention that, due to the rotational invariance, the equilibrium measure µ 0 β for V = 0 is the uniform probability measure dx 2π on T for every β ∈ [0, ∞]. For a general potential V , we shall see that µ V β has a bounded density that is larger than a positive constant and is essentially as smooth as V is.
Macroscopic fluctuations. Our main result is a central limit theorem (CLT) for the random signed measure

\nu_N := \sqrt{N} \, (\mu_N - \mu^V_\beta)

tested against sufficiently smooth functions, with an explicit upper bound on the rate of convergence in the Wasserstein W_2 metric; the latter is defined for random variables X, Y taking values in R^d by

W_2(X, Y) := \inf \big( \mathbb{E} \| Z_1 - Z_2 \|_{\mathbb{R}^d}^2 \big)^{1/2},

where the infimum is taken over all random variables Z = (Z_1, Z_2) with Z_1 = X and Z_2 = Y in law. To state the result, let us also write µ^V_β for the density of the equilibrium measure, so that dµ^V_β(x) = µ^V_β(x)dx, and introduce the operator L defined by which acts formally on the space L²(T) of real-valued square integrable functions on T equipped with the scalar product Here H stands for the Hilbert transform defined on L²(T) by where p.v. is the Cauchy principal value, that is the limit as ε → 0 of this integral restricted to the integration domain |e^{ix} − e^{it}| > ε. Note that when β = 0 the operator L corresponds to the Sturm-Liouville operator Lφ = −φ'' + V'φ'. As we shall see from Proposition 4.3 below, for any β > 0 the operator L is well-defined and positive on the Sobolev-type space which is a Hilbert space once equipped with the inner product and, moreover, its inverse L^{−1} is trace-class on H.
The central result of this work is that ν N converges, in the sense of finite dimensional distributions, to a Gaussian process on H with covariance operator L −1 .

Of course the theorem still holds for a general ψ ∈ C^{2γ+1}(T) after replacing ψ by ψ − ∫ψ dµ^V_β in the left-hand side of (19) and in the limiting variance (20). When V = 0, we can obtain an explicit formula for the limiting variance.
This identity follows from the fact that, by rotation invariance, the operator L is easy to diagonalize; see the identity (73) below. Indeed, in this setting we have −Lφ = φ'' + βH(φ') and the eigenfunctions are given by the Fourier basis φ_j(x) = e^{ijx}, since Lφ_j = (j² + β|j|)φ_j for every j ∈ Z.
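The diagonalization for V = 0 can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes that the Hilbert transform acts on Fourier modes by H(e^{ikx}) = i sgn(k) e^{ikx}, which is the convention forced by the eigenvalue relation Lφ_j = (j² + β|j|)φ_j, and that the inner product on H is ⟨f, g⟩ = ∫ f'g' dx/2π. Under these assumptions the limiting variance Σ_j κ_j^{−1}⟨ψ, φ_j⟩² reduces to Σ_{k≠0} |k|/(|k| + β) |ψ̂_k|². The function names are hypothetical.

```python
import cmath
import math


def fourier_coefficients(samples):
    """Return {k: c_k}, c_k = (1/n) sum_m f(x_m) e^{-i k x_m}, on the grid x_m = 2*pi*m/n."""
    n = len(samples)
    return {
        k: sum(f * cmath.exp(-2j * math.pi * k * m / n) for m, f in enumerate(samples)) / n
        for k in range(-(n // 2) + 1, n // 2)  # the Nyquist mode is dropped
    }


def sgn(k):
    return (k > 0) - (k < 0)


def apply_L(samples, beta):
    """Apply L = -(d/dx)^2 - beta * H(d/dx), the V = 0 operator, mode by mode."""
    n = len(samples)
    coeffs = fourier_coefficients(samples)
    out = []
    for m in range(n):
        x = 2 * math.pi * m / n
        total = 0j
        for k, c in coeffs.items():
            derivative = 1j * k    # Fourier symbol of d/dx
            hilbert = 1j * sgn(k)  # ASSUMED Fourier symbol of H
            symbol = -(derivative ** 2) - beta * hilbert * derivative
            total += symbol * c * cmath.exp(1j * k * x)
        out.append(total.real)
    return out


def limiting_variance_v0(samples, beta):
    """Sum over k != 0 of |k| / (|k| + beta) * |c_k|^2 (derived under the assumptions above)."""
    coeffs = fourier_coefficients(samples)
    return sum(abs(k) / (abs(k) + beta) * abs(c) ** 2 for k, c in coeffs.items() if k != 0)
```

Applied to samples of cos(jx), `apply_L` should return (j² + βj) cos(jx) up to rounding. For ψ = cos, `limiting_variance_v0` gives 1/(2(1 + β)), which equals the L² value 1/2 at β = 0 and behaves like 1/(2β) for large β, in line with the interpolation between the L² and H^{1/2} norms.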
Remark 1.1. Let us observe that the rate of convergence in Theorem 1.2 does not depend on the smoothness of V , but it improves with the regularity of the test function. Moreover, if ψ ∈ C ∞ (T), we have We expect this rate to be optimal, maybe up to the factor √ log N .
The proof of Theorem 1.2 is deferred to Section 4 and relies on a normal approximation technique introduced in [Lambert, Ledoux, and Webb, 2017], which is inspired from Stein's method; see Theorem 4.5 below. In [Lambert et al., 2017] this method has been used to investigate the rate of convergence of the fluctuations for beta-Ensembles on R at fixed temperature. There is a substantial technical difference in the analysis which arises in the high temperature regime due to the fact that the operator L has an extra Sturm-Liouville component. In particular, the spectral properties of L are quite different and this yields changes in the rate of convergence as well as in the limiting variance.
Stein's method has also been used previously in the context of random matrix theory to investigate the rate of convergence for linear statistics of random matrices from the classical compact groups [Fulman, 2012, Döbler and Stolz, 2011 and for the Circular beta-Ensemble at fixed temperature [Webb, 2016]. There are also results from Chatterjee [2009] on linear statistics of Wigner matrices which are valid under strong assumptions on the law of the entries and from Johnson [2015] on the eigenvalues of random regular graphs. For a comprehensive introduction to Stein's method which includes several applications, we refer to the survey [Ross, 2011].
On the road to establish the CLT, we prove the following concentration inequality which may be of independent interest: let W 1 (µ, ν) be the Wasserstein-Kantorovich distance of order 1 between µ, ν ∈ M 1 (T), defined by where Π(µ, ν) is the set of probability measures on T × T with respective marginals µ and ν; the second identity is known as the Kantorovich-Rubinstein dual representation for W 1 , where the supremum is taken over Lipschitz functions T → R with Lipschitz constant at most one.
Theorem 1.4 (Concentration). Let β > 0 and assume V : T → R has a weak derivative V' in L²(T). Then, there exists C = C(µ^V_β) > 0 such that, for every N ≥ 10 and r > 0, We have an explicit expression for the constant C in terms of µ^V_β in (43). In particular, when V = 0, this upper bound holds with C = 2 log 2 + 3/2 + 16 + π^{−1} ≈ 19.2, which does not depend on β.
In particular, this yields together with Borel-Cantelli lemma that W 1 (µ N , µ V β ) → 0 a.s. for fixed β > 0 and, when V = 0, that W 1 (µ N , dx 2π ) → 0 a.s. when β may depend on N as long as β N −1 . For lower order temperature scales this should still be true but one needs to prove it differently; note also there is an interesting change of behavior for the partition function of the Gaussian-beta-ensemble around β ∼ N −1 pointed out in [Pakzad, 2018, Lemma 1.3].
The proof of the theorem follows the same strategy as that of [Chafaï, Hardy, and Maïda, 2018] and relies on their Coulomb transport inequality. Differences arise, however, due to the presence of the relative entropy in F^V_β. In particular, one needs to study the regularity of the potential of the equilibrium measure.
Organisation of the paper. In Section 2, we obtain preliminary results on the equilibrium measure µ^V_β and its logarithmic potential. Section 3 is devoted to the proof of Theorem 1.4. In Section 4, we provide the core of the proof of Theorem 1.2. In Section 5, we obtain concentration estimates for error terms by means of Theorem 1.4. In Section 6, we investigate the spectral properties of the operator L; in particular we show that L^{−1} is trace-class. In Section 7, we study the regularity of the eigenfunctions of the operator L so as to complete the proof of the main theorem. Finally, in Section 8, we investigate the behavior of the variance σ^V_β as β → 0 (Poisson regime) as well as β → ∞ (random matrix regime).
Notations, basic properties and conventions. From now, β > 0 is fixed. In the following, if η is a measure on T, we will denote by η(x) its density with respect to the Lebesgue measure dx when it exists. If S ⊂ T is a Borel set, we denote by |S| its Lebesgue measure.
Recall that T is equipped with the metric (x, y) → |e ix − e iy | and denote for any k ∈ N := {0, 1, 2, . . .} and 0 < α ≤ 1 by C k,α (T) the space of k-times differentiable functions on T whose k-th derivative is α-Hölder continuous, or Lipschitz continuous when α = 1. When 0 < α < 1 we also write C α instead of C 0,α , since there is not ambiguity, and put Note that, for any 0 < α < 1, we have ψ C α (T) ≤ 2 ψ Lip . We sometimes use as well the chordal metric instead of the reference metric since they are equivalent: 2 for any x, y ∈ R. Moreover, since Rademacher's theorem states that the Lipschitz constant for the metric d T reads f L ∞ , we have and H m (T) be the Sobolev subspace of L 2 (T) of functions having their m-th first distributional derivatives in L 2 (T). We will also use at several instances the continuous embedding H m+1 (T) ⊂ C m,1/2 (T) for m ∈ N, sometimes known as the Sobolev-Hölder embedding theorem.
Finally, we use the letter C for a positive constant which may vary from line to line, and which may depend only on β > 0 and on the potential V unless stated otherwise.

interesting discussions, and Severin Schraven for pointing out the reference [Brown et al., 2013]. A. H. is supported by ANR JCJC grant BoB (ANR-16-CE23-0003) and Labex CEMPI (ANR-11-LABX-0007-01). G.L. is supported by the grant SNSF Ambizione S-71114-05-01.

Properties of the equilibrium measure
In this section we study the minimizer µ V β of F β , see (12), and collect useful properties for later. Given µ ∈ M 1 (T), its logarithmic potential U µ : T → [0, +∞] is defined by Proposition 2.1. If V : T → R is a measurable and bounded function, then for any β ≥ 0, In particular, there exists 0 Part (c) of the proposition is usually referred as the Euler-Lagrange equation.
Remark 2.1. If V = 0, then µ V β is the uniform measure dx 2π because of the rotational invariance. One can also check it satisfies (24) since, for any x ∈ T, Thus, the Euler-Lagrange constant reads C 0 β = 2β log 2 − log(2π). Remark 2.2. Part (a) of the proposition follows from well known results. Although part (b) and (c) seem to be part of the folklore, we were not able to locate (b) and (c) proven in full details in the literature; the little subtlety is to take care of the sets where the density of µ V β may a priori vanish or be arbitrary close to zero due to the term log µ V β . Proof of Proposition 2.1. It is known that both mappings µ → E(µ) and µ → K(µ|µ V 0 ) have compact level sets on M 1 (T) and are strictly convex there, see [Saff andTotik, 1997, Dembo and, from which (a) directly follows. Moreover, since

This yields in turn
when ε → 0 for some C ∈ R and, since ε(C + log ε) + O(ε 2 ) is negative for every ε > 0 small enough, this contradicts the fact that µ V β is the unique minimizer. Thus |A 0 | = 0. We next prove a weak form of (c). Let φ : T → R be a measurable and bounded function satisfying φ dµ V β = 0. Then, for any real |ε| ≤ φ −1 ∞ , we have (1 + εφ)µ V β ∈ M 1 (T) and for any such φ's. If η ∈ M 1 (T) has a bounded density ψ with respect to µ V β , then by taking φ := ψ − 1 in the previous identity we obtain Now, if one assumes (26) we reach a contradiction. Since the same holds after replacing > by < we obtain We are now equipped to prove (b) and (c). Using that U µ V β ≥ 0 on T, we obtain from (27) e for some c > 0, and thus the same holds true (Lebesgue)-a.e. In particular, since V is bounded by assumption, there exists C > 0 such that µ V β (x) ≤ C 2π for a.e. x ∈ T. This yields in turn with (25) and thus µ V β (A κ ) = 0 for every κ > 0 small enough. Since we have already shown that |A 0 | = 0, this means that |A κ | = 0 for every κ > 0 small enough, and the first claim of (b) is proven. Since the function x → log | sin( x 2 )| −1 is non-negative and integrable on T, the second claims follows as well.
Finally, this yields that the equation (27) holds a.e. and thus (c) is proven.
Proof. One can assume µ has a density which satisfies log µ dµ < ∞ since the identity is otherwise trivial. Similarly, one can assume E(µ) < ∞ so that E(µ − µ V β ) makes sense (and is non-negative), see [Saff and Totik, 1997, Lemma 1.8]. By integrating (24) against µ this yields In particular, we obtain by taking µ = µ V β and subtracting the resulting identity to (28), We also describe the behavior as β → 0 and β → ∞ of the equilibrium measure.
Lemma 2.3. If V : T → R is measurable and bounded, then we have the weak convergences µ^V_β → µ^V_0 as β → 0 and µ^V_β → dx/2π as β → ∞. If we further assume V is lower semicontinuous and that µ^V_∞ has a density which satisfies ∫ log µ^V_∞ dµ^V_∞ < ∞, then we have the weak convergence Note that assuming that V is lower semicontinuous and does not take the value +∞ ensures that F^V_∞ is lower semicontinuous and has a unique minimizer µ^V_∞ on M_1(T), see [Saff and Totik, 1997].
Since µ → K(µ|µ V 0 ) has for unique minimizer µ V 0 and is lower semicontinuous on M 1 (T), which is weakly compact, this implies the weak convergence µ Since E is lower semicontinuous on M 1 (T), this similarly yields the weak convergence µ V β → dx 2π as β → ∞.
Next, we study the regularity of the equilibrium measure and its potential. Recall the Hilbert transform H acting on the Hilbert space L 2 (T) is defined in (16). We can also define Hµ for µ ∈ M 1 (T) as soon as it has a density µ(x). Note that H acts in a simple fashion on the Fourier basis: H(1) = 0 and, if k ∈ N \ {0}, By taking the complex conjugate, this implies that for every k ∈ Z, where we set sgn(0) := 0. This yields that H : L 2 (T) → L 2 0 (T) is a well-defined bounded operator with adjoint H * = −H. Moreover, when restricted to L 2 0 (T), this turns H into an isometry which satisfies H −1 = −H. We will also use that this implies that for any f ∈ H 1 (T), Hf also belong to the Sobolev space H 1 (T) and that (Hf ) = H(f ). In the sequel, we will use these properties of the Hilbert transform at several instances.
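The properties of the Hilbert transform listed above can be verified on trigonometric polynomials. The sketch below is illustrative only and assumes the multiplier convention H(e^{ikx}) = i sgn(k) e^{ikx} with sgn(0) := 0; the displayed formula is not reproduced here, and this convention is inferred from the relation Lφ_j = (j² + β|j|)φ_j stated in the introduction. Functions are represented by dictionaries of Fourier coefficients, and all names are hypothetical.

```python
def sgn(k):
    return (k > 0) - (k < 0)


def real_trig_poly(positive_coeffs, mean=0.0):
    """Coefficients {k: c_k} of a real function: c_{-k} is the conjugate of c_k."""
    coeffs = {0: complex(mean)}
    for k, c in positive_coeffs.items():
        coeffs[k] = complex(c)
        coeffs[-k] = complex(c).conjugate()
    return coeffs


def hilbert(coeffs):
    """Hilbert transform under the ASSUMED convention H(e^{ikx}) = i sgn(k) e^{ikx}."""
    return {k: 1j * sgn(k) * c for k, c in coeffs.items()}


def derivative(coeffs):
    """Differentiation acts on the k-th mode as multiplication by i k."""
    return {k: 1j * k * c for k, c in coeffs.items()}


def inner(f, g):
    """L^2(dx / 2 pi) inner product of two real functions, computed via Parseval."""
    keys = set(f) | set(g)
    return sum(f.get(k, 0) * complex(g.get(k, 0)).conjugate() for k in keys).real


def close(f, g, tol=1e-12):
    keys = set(f) | set(g)
    return all(abs(f.get(k, 0) - g.get(k, 0)) < tol for k in keys)
```

With f and g built by `real_trig_poly`, one can check that `inner(hilbert(f), g)` equals `-inner(f, hilbert(g))` (H* = −H), that `hilbert(hilbert(f))` equals −f when f has mean zero (H^{−1} = −H on L²_0), that `inner(hilbert(f), hilbert(f))` equals `inner(f, f)` for such f (isometry), and that `derivative(hilbert(f))` coincides with `hilbert(derivative(f))`.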
Proof. For any ϑ ∈ C¹(T), by using the definition of the Cauchy principal value and integrating by parts, we obtain, for every x ∈ T, Next, using Fubini's theorem and that H is a bounded operator on L²(T) satisfying H* = −H, we obtain This shows that U^{µ^V_β} has a distributional derivative given by πHµ^V_β. Moreover, since the density µ^V_β(x) belongs to L²(T) by Proposition 2.1 (b), so does Hµ^V_β and thus (U^{µ^V_β})' ∈ L²(T).
The case m ≥ 2 follows inductively by differentiating (32) and using the same reasoning.
3 Proof of Theorem 1.4

We now turn to the proof of Theorem 1.4. The proof follows the same strategy as the one in [Chafaï, Hardy, and Maïda, 2018] and is based on combining a Coulomb transport inequality with an energy estimate after an appropriate regularization of the empirical measure. The regularization we use here is rather similar to that of [Maïda and Maurel-Segala, 2014], and the technical input in this respect is the following lemma.
Lemma 3.1. Given any configuration of distinct points x 1 , . . . , x N ∈ T, there exists a configuration y 1 , . . . , y N ∈ T satisfying: Proof. Given any ordered configuration x 1 < . . . < x N in T, there exists at least one index j such that x j+1 − x j ≥ 2π/N . Thus, by permutation and translation, one can assume without loss of generality that Consider the increasing bijection x ∈ T →x := tan(x/2) ∈ R ∪ {±∞} which satisfies We setỹ 1 :=x 1 andỹ j+1 :=ỹ j + max(x j+1 −x j , N −2 ) and then let y 1 < . . . < y N ∈ T be the configuration obtained by taking the image of theỹ j 's by the inverse bijection. Since by Next, by assumption on the x j 's we have max j |x j | ≤ | tan( π 2 − 1 N )| ≤ N, which yields max j |ỹ j | ≤ 2N, and we thus obtain, for every j = k, Using that, for any 0 < c < 1, then we can write for some new normalization constant Z N > 0.
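The construction used in the proof of Lemma 3.1 can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: it assumes the configuration has already been rotated so that all points lie strictly inside (−π, π), and it only carries out the step that maps x to x̃ = tan(x/2), enforces a minimal gap of N^{−2} between consecutive points in these coordinates, and maps back. The function name is hypothetical.

```python
import math


def regularize(xs):
    """Spread out a configuration so that consecutive points are at least N^-2 apart
    in the coordinate x -> tan(x / 2).

    ASSUMPTION: every x lies strictly inside (-pi, pi), so that tan(x / 2) is finite.
    """
    n = len(xs)
    xs = sorted(xs)
    x_tilde = [math.tan(x / 2) for x in xs]

    # y~_1 := x~_1 and y~_{j+1} := y~_j + max(x~_{j+1} - x~_j, N^-2)
    y_tilde = [x_tilde[0]]
    for j in range(1, n):
        gap = max(x_tilde[j] - x_tilde[j - 1], n ** -2)
        y_tilde.append(y_tilde[-1] + gap)

    # Map back to the torus with the inverse bijection y = 2 * arctan(y~).
    return [2 * math.atan(y) for y in y_tilde]
```

Because each gap can only increase, the order of the points is preserved, the first point is unchanged, and every y_j lies to the right of the corresponding x_j.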
Step 4: The Coulomb transport inequality and conclusion. Lemma 3.1 yields, Since both µ N and µ V β have finite logarithmic energy, it follows from [Chafaï et al., 2018, Theorem 1.1] and the discussion below that, for every ε > 0, . Moreover, using that N 2 , we obtain for any r > 0 from (41), where the constant is given by 43) and the proof of the theorem is complete.

4 Main steps for the proof of Theorem 1.2

In this section, we explain the main strategy to prove Theorem 1.2. It is based on the multidimensional Gaussian approximation result from [Lambert, Ledoux, and Webb, 2017] combined with the previous concentration inequality and a study of the spectral properties of the operator L. Consider the differential operator given by which satisfies the integration by parts identity ∫ f (−Lg) dP_N = ∫ ∇f · ∇g dP_N for any smooth functions f, g : T^N → R. We first show that ν_N(φ), seen as a mapping T^N → R, is an approximate eigenfunction for L as long as φ is a (strong) eigenfunction of the differential operator L defined in (15). More precisely, we have the approximate commutation relation:

Proof. If we set Φ(x) := Σ_{j=1}^N φ(x_j) then we have Next, it is convenient to introduce the operator Ξ defined by which is a weighted version of the Hilbert transform H defined in (16). Indeed, we can write and this yields together with (46) and (45), (48) By (15), observe that the variational equation (31) yields where we used that, by (47), Moreover, we obtain by using that H* = −H, By integrating (31) against φ dx, this yields together with an integration by parts: By combining (48)-(51), we have finally shown that and the result follows by linearity of L since Φ = √N ν_N(φ) + N ∫φ dµ^V_β.

It turns out that the random variables ζ_N(φ) are of smaller order of magnitude than the fluctuations provided φ is smooth enough. More precisely, we have the following estimates.
Lemma 4.2. There exists a constant C = C(β, V ) > 0 such that, for any N ≥ 10, for any φ ∈ C 3,1 (T), we have Moreover, for any Lipschitz function g : The proof of this lemma is based on Theorem 1.4 and is postponed to Section 5.
Another important input is the existence of an eigenbasis of H for the operator L that behaves like an eigenbasis of a Sturm-Liouville operator. Note that by (17), H is a separable Hilbert space and it follows from Proposition 2.1(b) that the associated norm satisfies a fact we will use at several instances below.
Proposition 4.3. Assume that V ∈ C m,1 (T) for some m ≥ 1. Then there exists a family (φ j ) ∞ j=1 of functions φ j : T → R such that: is an orthonormal basis of the Hilbert space H.
(d) φ j ∈ C m (T) and there exist constants C k such that for every k ∈ {0, . . . , m}, The proof of Proposition 4.3 is postponed to the sections 6 and 7.
Proposition 4.4. Assume that the external potential V ∈ C 3,1 (T). There exists a constant C = C(β, V ) > 0 such that, if we set then we have for every N ≥ 10 and d ≥ 1, Here N (0, I d ) stands for a real standard Gaussian random vector in R d . This proposition is a consequence of the previous concentration estimates together with the following general normal approximation given by [Lambert et al., 2017, Proposition 2.1]; for F = (F 1 , . . . , F d ) ∈ C 2 (T N , R d ) we set LF := (LF 1 , . . . , LF d ) and denote by · R d the Euclidean norm of R d . and see both F and Γ as random variables defined on the probability space (T N , B(T) ⊗N , P N ). Given any d × d diagonal matrix K with positive diagonal entries, we have Proof of Proposition 4.4. By Proposition 4.3 (a) and Lemma 4.1, we have for every j ≥ 1, As a consequence, taking K := diag(κ 1 , . . . , κ d ), we obtain and, by Lemma 4.2 and Proposition 4.3 (c)-(d), this yields Next, since for any i, j ∈ N, and using that the φ j 's are orthonormal, we obtain Proposition 4.3 (d) moreover yield, for any i, j ∈ N, and it thus follows from Lemma 4.2 that The proposition follows by combining estimates (54) and (55) together with Theorem 4.5.
We are finally in position to prove Theorem 1.2 by decomposing a general test function into the eigenbasis (φ j ) ∞ j=1 and by using Proposition 4.4.
Proof of Theorem 1.2. Assume that V ∈ C 3,1 (T) and let ψ ∈ C 2γ+1 (T) for some integer γ ≥ 2. We can assume without loss of generality that ψ dµ V β = 0. Thus ψ ∈ H and we have by Proposition 4.3 (b), Moreover, since ψ lies in the domain of L γ and using that L is symmetric, we have In particular, by Proposition 4.3 (c), the series (56) converges uniformly on T. Next, given any d ∈ N, let us consider the truncation of ψ, Proposition 4.3 (c)-(d) and the upper bound (57) yield for γ > 1, Thus, by definition of the W 2 metric and Lemma 4.2, this yields Next, if we set η m := m j=1 κ −1 j ψ, φ j H 2 for m ∈ N ∪ {∞}, then we obtain from (57) and Last, Proposition 4.4 and Proposition 4.3 (c) yields Finally, by combining the estimates (58)-(60) and taking for d the integer part of N 1/4 γ+1 , we obtain where C ψ > 0 depends on η ∞ , L γ ψ H , β and V only. It remains to check that η ∞ equals to the variance σ V β (ψ) 2 given in (20); this is proven in Proposition 6.3 below. The proof of the theorem is therefore complete.

Concentration estimates: Proof of Lemma 4.2
If we use the Kantorovich-Rubinstein dual representation of W 1 and take r := R log N/N in Theorem 1.4, then under the same assumptions and using the same notation as in that theorem we obtain the following estimate: there exists C = C(µ V β , β) > 0 and κ = κ(β) > 0 such that, for every R ≥ 6 and N ≥ 10, We also need the next estimate.
Lemma 5.1. There exists κ = κ(β) > 0 and a constant R 0 > 0 such that, for any function ψ ∈ C 2,1 (T), one has for every R ≥ R 0 and N ≥ 10, Proof. The strategy is to prove that the random function has Lipschitz constant controlled by ψ Lip √ log N with high probability and then to use (61). Since ψ ∈ C 2,1 (T), we verify that for any x ∈ R, We now provide an upper bound on the Lipschitz constant of the integrand of Ψ N which is uniform in x. Indeed, we have d dy .
(63) Let us recall that we introduced d_T(x, y) in (22). Two Taylor-Lagrange expansions yield, for any x, y ∈ T, for some u, v ∈ T, so that Together with (63), this implies that there exists a constant C > 0 such that Since the mean value theorem yields that we deduce from (61) that there exist constants κ ≥ κ and R_0 > 0 such that for all R ≥ R_0 and N ≥ 10, Therefore, by (61) again, we obtain which completes the proof of the lemma.
Proof of Lemma 4.2. Using (61) and that, for any real random variable X and α > 0, we obtain for any N ≥ 10 and any Lipschitz function g : and the second statement of the lemma is obtained. Next, according to (45) and since µ N is a probability measure, we have Using the inequality φ L ∞ ≤ 2π φ Lip obtained as in (64), we deduce from Lemma 5.1 that for all R ≥ R 0 and N ≥ 10, Thus, combined with (65) this yields and the proof of the lemma is complete.

6 Spectral theory: Proof of Proposition 4.3 (a)-(c)
In this section, we always assume V ∈ C^{1,1}(T). In particular it follows from Proposition 2.5 that (log µ^V_β)' is Lipschitz continuous. Recalling (15), we write where we introduced the operators on L²(T), Note that A is a Sturm-Liouville operator in the sense that it reads with p := log µ^V_β and q := 0; we refer to [Marchenko, 2011, Brown et al., 2013] for general references on Sturm-Liouville equations.
We first check that L is a positive operator on H, as a consequence of the next lemma.
Lemma 6.1. The operators A and W are both positive on H.
Proof. We have for any function φ ∈ H, where we set ϕ := φ µ V β . Moreover, if one decomposes ϕ in the Fourier basis, then we have and the lemma is proven.
The spectral properties of the Sturm-Liouville operator A (with periodic boundary conditions) are well known, see for instance [Brown et al., 2013, Chapter 2 and 3], from which one can obtain the basic properties: Lemma 6.2. There exists a orthonormal basis (ϕ j ) ∞ j=1 of H consisting of (weak) eigenfunctions of A associated with positive eigenvalues. Moreover, if A ϕ j = λ j ϕ j with 0 < λ 1 ≤ λ 2 ≤ · · · , then there exists α > 0 such that, as j → ∞, Proof. Since for any smooth function φ : T → R we have we see that A is a positive Sturm-Liouville operator on L 2 (µ V β ) whose domain is where we used Proposition 2.1 (b) for this equality. It then follows from the general properties of the Sturm-Liouville operators that there exists an orthonormal basis of L 2 (µ V β ) consisting of eigenfunctions ( ϕ j ) +∞ j=0 ⊂ H 1 (T) of A associated to non-negative increasing eigenvalues (λ j ) ∞ j=0 . Moreover, by Weyl's law (see e.g. [Brown et al., 2013, Theorem 3.3.2] in our setting), there exists α > 0 such that λ j ∼ αj 2 as j → ∞.
The smallest eigenvalue $\lambda_0 = 0$ comes with the eigenfunction $\tilde\phi_0 = 1$, which is orthogonal to $\mathsf{H}$ in $L^2(\mathbb{T})$, see (17). Since the $\tilde\phi_j$'s are orthonormal in $L^2(\mu^V_\beta)$, we have for any $j \ge 1$, and thus $(\tilde\phi_j)_{j=1}^{\infty} \subset \mathsf{H}$. Moreover, since we have for any $i, j \in \mathbb{N}$, it follows that $\lambda_1 > 0$ (since otherwise $\tilde\phi_1$ would be a non-zero constant function and this would contradict (72)). Finally, if we set $\phi_j := \tilde\phi_j / \sqrt{\lambda_j}$, then the family $(\phi_j)_{j=1}^{\infty}$ is an orthonormal basis of $\mathsf{H}$ that satisfies the requirements of the lemma.
Proposition 6.3. Proposition 4.3 (a)-(c) hold true. More precisely, there exists an orthonormal basis $(\varphi_j)_{j=1}^{\infty}$ of $\mathsf{H}$ such that $\mathcal{L}\varphi_j = \kappa_j \varphi_j$ with $0 < \kappa_1 \le \kappa_2 \le \cdots$ and we have $\kappa_j \sim \alpha j^2$ as $j \to \infty$ for the same $\alpha > 0$ as in Lemma 6.2. In particular, $\mathcal{L}^{-1}$ is a well-defined trace-class operator on $\mathsf{H}$ and, for any $\psi \in \mathsf{H}$, we have
Proof. We use here basic results from operator theory, see e.g. [Kato, 1995]. Lemma 6.2 yields that $\mathcal{A}$ is a positive self-adjoint operator on $\mathsf{H}$ and that $\mathcal{A}^{-1}$ is trace-class. Since $\mathcal{W}$ is non-negative and self-adjoint on $\mathsf{H}$, it follows that $\mathcal{L}^{-1} = (\mathcal{A} + 2\pi\beta\mathcal{W})^{-1}$ is a positive self-adjoint compact operator on $\mathsf{H}$. The spectral theorem for self-adjoint compact operators then yields the existence of an orthonormal family $(\varphi_j)_{j=1}^{\infty}$ in $\mathsf{H}$ and a non-decreasing sequence of positive numbers $(\kappa_j)_{j=1}^{\infty}$ such that $\mathcal{L}^{-1} = \sum_j \kappa_j^{-1}\, \varphi_j \otimes \varphi_j$. In particular, $\mathcal{L}\varphi_j = \kappa_j \varphi_j$ weakly for every $j \ge 1$. Moreover, since $\mathcal{L}^{-1}$ is positive, the family $(\varphi_j)$ is necessarily a complete orthonormal family in $\mathsf{H}$: parts (a) and (b) are thus proven.
Writing $\langle \cdot, \cdot \rangle$ instead of $\langle \cdot, \cdot \rangle_{\mathsf{H}}$ for simplicity, the min-max theorem (see e.g. [Reed and Simon, 1978, Theorem XIII.2]) yields, for any $j \ge 1$, where the maximum is taken over all subspaces $S_{j-1} \subset \mathsf{H}$ of dimension $j - 1$. By taking $S_j := \mathrm{span}(\phi_1, \ldots, \phi_j)$ where $(\phi_j)_{j=1}^{\infty}$ is as in Lemma 6.2, this provides where we also used that $\mathcal{W} \ge 0$ in the last inequality. Similarly, we use the reversed form of the min-max principle to obtain that, using also (68), for any $j \ge 1$, Next, by using (70), the Cauchy-Schwarz inequality, that the Hilbert transform $\mathcal{H}$ is an isometry of $L^2_0(\mathbb{T})$ such that $\mathcal{H}(\psi)' = \mathcal{H}(\psi')$ for every $\psi \in H^1(\mathbb{T})$, the second equality in (69) and Proposition 2.1 (b), we obtain for any $\ell \ge 1$, For the last step, we used that by definition, $\|\phi_\ell\|_{\mathsf{H}} = 1$ and $\|\phi_\ell\|_{L^2(\mu^V_\beta)} = \lambda_\ell^{-1/2}$ (see the end of the proof of Lemma 6.2). Together with (75), this yields Finally, combined with (74) and Lemma 6.2, the proof of the proposition is complete.
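For orientation, the lower bound coming from the min-max principle, and the way it gives the trace-class property, can be sketched abstractly. The display below is a reconstruction under stated assumptions, not the formulas of the proof above: it assumes only that $\mathcal{L} = \mathcal{A} + 2\pi\beta\mathcal{W}$ with $\mathcal{W} \ge 0$ on $\mathsf{H}$, that $(\phi_j)$ is an $\mathsf{H}$-orthonormal eigenbasis of $\mathcal{A}$ with non-decreasing eigenvalues $\lambda_j \sim \alpha j^2$, and that the min-max principle is applied to the associated quadratic forms.

```latex
% Sketch (assumptions: L = A + 2*pi*beta*W, W >= 0, A phi_j = lambda_j phi_j).
% Choosing S_{j-1} = span(phi_1, ..., phi_{j-1}) in the max-min formula:
\begin{align*}
\kappa_j
  &= \max_{S_{j-1}} \;
     \min_{\substack{\varphi \perp S_{j-1} \\ \|\varphi\|_{\mathsf{H}} = 1}}
     \langle \mathcal{L}\varphi, \varphi \rangle \\
  &\ge \min_{\substack{\varphi \perp \mathrm{span}(\phi_1,\dots,\phi_{j-1}) \\ \|\varphi\|_{\mathsf{H}} = 1}}
     \Big( \langle \mathcal{A}\varphi, \varphi \rangle
          + 2\pi\beta \langle \mathcal{W}\varphi, \varphi \rangle \Big) \\
  &\ge \min_{\substack{\varphi \perp \mathrm{span}(\phi_1,\dots,\phi_{j-1}) \\ \|\varphi\|_{\mathsf{H}} = 1}}
     \langle \mathcal{A}\varphi, \varphi \rangle
   \;=\; \lambda_j .
\end{align*}
% Consequence: summability of the inverse eigenvalues.
\begin{equation*}
\operatorname{Tr}\big(\mathcal{L}^{-1}\big)
  = \sum_{j \ge 1} \kappa_j^{-1}
  \le \sum_{j \ge 1} \lambda_j^{-1}
  < \infty ,
\qquad \text{since } \lambda_j \sim \alpha j^2 .
\end{equation*}
% Model case (assumption: constant density, so that A reduces to -d^2/dx^2 on T):
% the non-zero eigenvalues are k^2, each with multiplicity two.
\begin{equation*}
\lambda_j = \lceil j/2 \rceil^{2}, \qquad
\alpha = \tfrac14, \qquad
\sum_{j \ge 1} \lambda_j^{-1} = 2 \sum_{k \ge 1} k^{-2} = \frac{\pi^2}{3} .
\end{equation*}
```

The last display is only a model computation: it assumes that $\mathcal{A}$ reduces to $-\,d^2/dx^2$ when the equilibrium density is constant, which is suggested by the Sturm-Liouville form with $p = \log \mu^V_\beta$ but has not been checked here against (15).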
7 Regularity: Proof of Proposition 4.3 (d)
We start with the following lemma.
Proof. First, since $\int \varphi \, d\mu^V_\beta = 0$ and $\varphi$ is continuous, there exists $\xi \in \mathbb{T}$ such that $\varphi(\xi) = 0$. Thus, by the Cauchy-Schwarz inequality, By Proposition 2.1 (b), this yields in turn Since $\mu^V_\beta \in C^{1,1}(\mathbb{T})$ according to Proposition 2.5 and using that the Hilbert transform $\mathcal{H}$ preserves the $L^2(\mathbb{T})$ norm, we see that the functions $\mathcal{H}(\mu^V_\beta \varphi')$ and $(\log \mu^V_\beta)' \varphi'$ are in $L^2(\mathbb{T})$. Together with the definition (15) of $\mathcal{L}$, this implies that belongs to $L^2(\mathbb{T})$. Recalling (69), an integration by parts shows that Moreover, by (69), using the Cauchy-Schwarz inequality and (77), we have Put together, by (68), this yields which completes the proof.
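The first step of this proof is the usual one-dimensional Sobolev-type bound. As a hedged sketch, assume that $\varphi$ is absolutely continuous with $\varphi' \in L^2(\mathbb{T})$, that norms are taken with respect to the unnormalized Lebesgue measure $dx$ (the constants change if $\frac{dx}{2\pi}$ is used instead), and that the density of $\mu^V_\beta$ with respect to $dx$ is bounded below by some $\delta > 0$; these normalizations are assumptions of the sketch, not statements from the text.

```latex
% Sketch: sup-norm bound from a zero of phi and Cauchy-Schwarz.
% The integral runs along the shorter arc from xi to x, of length at most pi.
\begin{align*}
|\varphi(x)|
  = |\varphi(x) - \varphi(\xi)|
  = \Big| \int_{\xi}^{x} \varphi'(t)\, dt \Big|
  &\le |x - \xi|^{1/2} \Big( \int_{\mathbb{T}} |\varphi'(t)|^2 \, dt \Big)^{1/2} \\
  &\le \sqrt{\pi}\, \|\varphi'\|_{L^2(dx)} .
\end{align*}
% If the density of mu is at least delta > 0, the L^2(dx) norm is controlled
% by the weighted norm:
\begin{equation*}
\int_{\mathbb{T}} |\varphi'|^2 \, dx
  \le \delta^{-1} \int_{\mathbb{T}} |\varphi'|^2 \, d\mu^V_\beta ,
\qquad \text{hence} \qquad
\|\varphi\|_{L^\infty}
  \le \sqrt{\pi/\delta}\, \Big( \int_{\mathbb{T}} |\varphi'|^2 \, d\mu^V_\beta \Big)^{1/2} .
\end{equation*}
```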
We finally turn to the proof of the last statement of Proposition 4.3 and thus complete the proof of Theorem 1.2.
Thus, since $\mu^V_\beta \in C^{1,1}(\mathbb{T})$ and $\varphi_j \in L^2(\mathbb{T})$, according to (78), we have for every $j \ge 1$, for some $C = C(\beta, V) > 0$; note that we used again that $\|\varphi_j\|_{L^2} \le \delta^{-1/2} \|\varphi_j\|_{\mathsf{H}}$. By using this estimate in (78) together with (77) and using the proposition for $k = 0$, we obtain This proves the proposition when $k = 1$. Note that, in particular, $\varphi_j \in L^\infty(\mathbb{T})$. Assume now that $m \ge 2$ so as to treat the case where $k = 2$. Observe that, since $\varphi_j \in L^2(\mathbb{T})$, the right-hand side of equation (78) has a weak derivative in $L^2$ and we obtain, for any $j \ge 1$, Together with (79) and the upper bounds used to prove it, this yields $\|\varphi_j\|_{L^2} \le C \kappa_j$ and in particular $\varphi_j \in L^2(\mathbb{T})$. As in (80), this implies in turn that By using this estimate combined with the proposition for $k = 0$ and $k = 1$, we obtain from (81) that $\|\varphi_j\|_{L^\infty} \le C \kappa_j^{3/2}$ and the proof of the proposition is complete when $k = 2$. The case $k \ge 3$ is proven inductively by the same method, after $k - 1$ differentiations of formula (78).

Continuity of the variance in the parameter β ∈ [0, +∞]
In this final section, we study the limits of $\sigma^V_\beta(\psi)^2$ as $\beta \to 0$ and $\beta \to \infty$. We provide sufficient conditions on $V$ so that the variance interpolates between the $L^2$ and the $H^{1/2}$ (semi-)norms, as is the case when $V = 0$, see Lemma 1.3.

Convention:
In this section, we denote the Hilbert space $\mathsf{H}$ and the operators $\mathcal{L}$, $\mathcal{A}$, and $\mathcal{W}$ defined in the previous sections by $\mathsf{H}_\beta$, $\mathcal{L}_\beta$, $\mathcal{A}_\beta$, and $\mathcal{W}_\beta$ respectively, to emphasize the dependence on the parameter $\beta \ge 0$.
First, let us record the following smoothing property of the operators $\mathcal{L}_\beta^{-1}$.
Proof. Recall that $\int_{\mathbb{T}} U \, \frac{dx}{2\pi} = \log 2$. Since $K(\mu^V_\beta | \mu^V_0) \ge 0$ and $\frac{dx}{2\pi}$ minimizes $\mathcal{E}$, we have for $\beta > 0$, From (24), which holds for all $x \in \mathbb{T}$ if $V \in H^1(\mathbb{T})$ according to Proposition 2.5, the density $\mu^V_\beta$ satisfies for all $x \in \mathbb{T}$, Next, notice that the mapping $g$ defined in (34) belongs to $L^p$ for any $p > 0$. Using Young's convolution inequality we obtain, for every $p > 1$, and the lemma follows.
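For reference, the form of Young's convolution inequality used in arguments of this kind can be stated as follows. This is the standard inequality on the torus, written here with the normalized measure $\frac{dx}{2\pi}$ for both the convolution and the norms; the functions $f$ and $g$ are generic placeholders, and how exactly the inequality is applied to the mapping $g$ of (34) is not reconstructed here.

```latex
% Young's convolution inequality on T, with (f * g)(x) = \int_T f(x - y) g(y) dy/(2 pi)
% and all L^p norms taken with respect to dx/(2 pi).
\begin{equation*}
\| f * g \|_{L^r} \le \| f \|_{L^p} \, \| g \|_{L^q},
\qquad
1 + \frac{1}{r} = \frac{1}{p} + \frac{1}{q},
\qquad 1 \le p, q, r \le \infty .
\end{equation*}
% Special case q = p', r = infinity... not needed here; the case p = 1 reads:
\begin{equation*}
\| f * g \|_{L^q} \le \| f \|_{L^1} \, \| g \|_{L^q},
\qquad 1 \le q \le \infty ,
\end{equation*}
% so convolving an L^q function with a probability density (L^1 norm equal to one
% in this normalization) does not increase its L^q norm.
```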