Local universality for real roots of random trigonometric polynomials

Consider a random trigonometric polynomial $X_n: \mathbb R \to \mathbb R$ of the form $$ X_n(t) = \sum_{k=1}^n \left( \xi_k \sin (kt) + \eta_k \cos (kt)\right), $$ where $(\xi_1,\eta_1),(\xi_2,\eta_2),\ldots$ are independent identically distributed bivariate real random vectors with zero mean and unit covariance matrix. Let $(s_n)_{n\in\mathbb N}$ be any sequence of real numbers. We prove that as $n\to\infty$, the number of real zeros of $X_n$ in the interval $[s_n+a/n, s_n+ b/n]$ converges in distribution to the number of zeros in the interval $[a,b]$ of a stationary, zero-mean Gaussian process with correlation function $(\sin t)/t$. We also establish similar local universality results for the centered random vectors $(\xi_k,\eta_k)$ having an arbitrary covariance matrix or belonging to the domain of attraction of a two-dimensional $\alpha$-stable law.
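Before turning to the results, the statement can be illustrated numerically. The sketch below (helper names are ours; a crude sign-change count on a grid, not the method of proof) simulates $X_n$ with Rademacher coefficients and counts real zeros in the local window $[s_n+a/n,\,s_n+b/n]$:

```python
import numpy as np

def X_n(t, xi, eta):
    """Evaluate X_n(t) = sum_{k=1}^n (xi_k sin(kt) + eta_k cos(kt)), vectorized in t."""
    k = np.arange(1, len(xi) + 1)
    kt = np.outer(np.atleast_1d(t), k)      # shape (len(t), n)
    return np.sin(kt) @ xi + np.cos(kt) @ eta

def count_sign_changes(values):
    """Crude count of real zeros on a grid: number of strict sign changes."""
    s = np.sign(values)
    return int(np.sum(s[:-1] * s[1:] < 0))

n = 500
rng = np.random.default_rng(0)
xi = rng.choice([-1.0, 1.0], size=n)        # Rademacher coefficients,
eta = rng.choice([-1.0, 1.0], size=n)       # the case previously left open

s_n, a, b = 1.0, 0.0, 10.0                  # window [s_n + a/n, s_n + b/n]
grid = np.linspace(s_n + a / n, s_n + b / n, 2001)
zeros_in_window = count_sign_changes(X_n(grid, xi, eta))
print("real zeros of X_n in the local window:", zeros_in_window)
```

The count fluctuates with the seed; its distribution approaches that of the number of zeros of the sinc-correlated Gaussian process in $[a,b]$.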

The limit distribution does not depend on the distribution of $\xi_1$, a phenomenon referred to as local universality. Azaïs et al. [3] proved their conjecture assuming that $\xi_1$ has an infinitely smooth density satisfying certain integrability conditions. However, as they remarked, even the case of the Rademacher distribution $\mathbb P[\xi_1 = \pm 1] = 1/2$ remained open. Our aim is to prove the conjecture of [3] in full generality (Theorem 2.1 below). The method of proof proposed in the present paper is very different from the one used in [3]. Let us briefly sketch our approach assuming that $(\xi_1,\eta_1), (\xi_2,\eta_2),\ldots$ are i.i.d. random vectors such that $\xi_1$ and $\eta_1$ are centered, uncorrelated random variables with unit variance. We start by proving a functional limit theorem (Theorem 3.1 below) stating that the rescaled processes $t\mapsto n^{-1/2}X_n(s_n+t/n)$ converge to $Z$ weakly on a suitable space of analytic functions. Then, we use the continuous mapping theorem to deduce the convergence of the real zeros. The basic fact underlying this part of the proof is Hurwitz's theorem, which states that the complex zeros of an analytic function do not change "too much" under a slight perturbation of the function. Essentially, Hurwitz's theorem tells us that the functional mapping an analytic function to the point process of its complex zeros is continuous.
Since we are interested in real zeros, we have to ensure that real zeros remain real after a small perturbation. If we restrict ourselves to analytic functions which are real on $\mathbb R$, then non-real zeros come in complex conjugate pairs, and a simple real zero cannot become complex under a small perturbation of the function. These considerations, see Lemmas 4.1 and 4.2, justify the use of the continuous mapping theorem.
Our method is quite general and allows us to establish the corresponding local universality result in the case when $(\xi_1,\eta_1)$ has a non-zero correlation (Theorem 2.3), or even does not have finite second moments but is in the domain of attraction of some two-dimensional stable law (Theorem 2.5).
Closing the introduction, we mention that the scope of our approach is not restricted to trigonometric polynomials. The same method can be applied to various ensembles of random algebraic polynomials. A similar method was used in [20], [15] for complex zeros of random Taylor series near the circle of convergence, in [19] for Dirichlet series with random coefficients and some other sums of analytic functions with random coefficients, and in [12,13] for complex zeros of the partition function of the (Generalized) Random Energy Model. In contrast to these works, we investigate real zeros. Let us also mention that the asymptotics of $\mathbb E N_n[a,b]$ (that is, the expected number of real zeros in an interval whose length does not go to $0$) was studied in the recent works [1] and [8], where more references on random trigonometric polynomials can be found.
The structure of the paper is as follows. The main results are stated in Section 2. Functional limit theorems for $X_n$ and their proofs are given in Section 3. In Section 4 the proofs of the main theorems are presented. Some auxiliary technical lemmas are collected in the Appendix.
As usual, $\xrightarrow{d}$ denotes convergence in distribution of random variables and vectors. The notation $\xrightarrow{w}$ is used to denote weak convergence of random elements with values in a metric space, while $\xrightarrow{v}$ denotes vague convergence of locally finite measures.

2.1. Coefficients with finite second moments. For a real analytic function $f$ which does not vanish identically, denote by $N_f[a,b]$ the number of zeros of $f$ in the interval $[a,b]$. It will become clear from our proofs that the results hold independently of whether the zeros are counted with multiplicities or not. Theorem 2.1, which is our first main result, proves the conjecture of [3], weakens the original assumptions of [3] on the distribution of the coefficients, and allows for an arbitrary sequence $(s_n)_{n\in\mathbb N}$ as the location of the scaling window.
Theorem 2.1. Let $(\xi_1,\eta_1),(\xi_2,\eta_2),\ldots$ be i.i.d. random vectors with zero mean and unit covariance matrix, that is,
$$\mathbb E\xi_1 = \mathbb E\eta_1 = 0, \qquad \mathbb E\xi_1^2 = \mathbb E\eta_1^2 = 1, \qquad \mathbb E[\xi_1\eta_1] = 0.$$
Let $(s_n)_{n\in\mathbb N}$ be any sequence of real numbers and $[a,b]\subset\mathbb R$ a finite interval. Then,
$$N_{X_n}\left[s_n+\frac an,\, s_n+\frac bn\right] \xrightarrow{d} N_Z[a,b], \qquad n\to\infty,$$
where $(Z(t))_{t\in\mathbb R}$ is the stationary Gaussian process defined in Section 1.
We can also prove the weak convergence of point processes of zeros. Given a locally compact metric space $\mathbb X$, denote by $M_p(\mathbb X)$ the space of locally finite point measures on $\mathbb X$ endowed with the vague topology. A random element with values in $M_p(\mathbb X)$ is called a point process on $\mathbb X$. We refer to [17] for information on point processes and their weak convergence. For a real analytic function $f$ which does not vanish identically, denote by $\mathrm{Zeros}_{\mathbb R}(f)$ the locally finite point measure on $\mathbb R$ counting the real zeros of $f$ with multiplicities. The next theorem is stronger than Theorem 2.1: it states that the point processes of rescaled real zeros of $X_n$ converge to $\mathrm{Zeros}_{\mathbb R}(Z)$ weakly on $M_p(\mathbb R)$.
In the next theorem we consider i.i.d. random vectors with an arbitrary covariance matrix. In particular, this theorem covers random trigonometric polynomials of the form $\sum_{k=1}^n \xi_k\sin(kt)$ and $\sum_{k=1}^n \eta_k\cos(kt)$ involving sin or cos terms only. Theorem 2.3. Let $(\xi_1,\eta_1),(\xi_2,\eta_2),\ldots$ be i.i.d. random vectors with
$$\mathbb E\xi_1 = \mathbb E\eta_1 = 0, \qquad \mathbb E\xi_1^2 = \mathbb E\eta_1^2 = 1, \qquad \mathbb E[\xi_1\eta_1] = \rho\in[-1,1].$$
Then, for every fixed $s\in\mathbb R$,
$$\mathrm{Zeros}_{\mathbb R}\left(X_n\left(s+\frac{\cdot}{n}\right)\right) \xrightarrow{w} \mathrm{Zeros}_{\mathbb R}(G) \quad\text{on } M_p(\mathbb R),$$
where $(G(t))_{t\in\mathbb R}$ is the centered Gaussian process with covariance function
$$\mathbb E[G(t_1)G(t_2)] = \begin{cases} \operatorname{sinc}(t_1-t_2), & \text{if } s\notin\pi\mathbb Z,\\[2pt] \operatorname{sinc}(t_1-t_2) + \rho\,\dfrac{1-\cos(t_1+t_2)}{t_1+t_2}, & \text{if } s\in\pi\mathbb Z, \end{cases} \qquad (3)$$
with the convention that $x\mapsto(1-\cos x)/x$ equals $0$ at $x=0$.
Remark 2.4. In Theorem 2.3 we can replace the fixed $s$ by a general sequence $(s_n)_{n\in\mathbb N}$ as in Theorem 2.1, but then we have to replace the condition $s\notin\pi\mathbb Z$ by $\lim_{n\to\infty} n\,\mathrm{dist}(s_n,\pi\mathbb Z) = +\infty$ and $s\in\pi\mathbb Z$ by $\lim_{n\to\infty} n\,\mathrm{dist}(s_n,\pi\mathbb Z) = 0$. Here, we used the notation $\mathrm{dist}(s_n,\pi\mathbb Z) = \min\{|s_n-\pi k| : k\in\mathbb Z\}$.

2.2. Coefficients from a stable domain of attraction. Let $(\xi_1,\eta_1),(\xi_2,\eta_2),\ldots$ be i.i.d. random vectors from the strict domain of attraction of a two-dimensional $\alpha$-stable distribution, $0<\alpha<2$. This means that there exist numbers $b_n>0$ such that
$$\frac{1}{b_n}\sum_{k=1}^n (\xi_k,\eta_k) \xrightarrow{d} S_{\alpha,\nu}, \qquad n\to\infty, \qquad (4)$$
where $S_{\alpha,\nu}$ is a non-degenerate two-dimensional $\alpha$-stable random vector with Lévy measure $\nu$ and shift parameter $0$. The adjective "strict" is used to highlight that convergence (4) holds without centering; in particular, it is assumed that $\mathbb E\xi_1 = \mathbb E\eta_1 = 0$ if $\alpha>1$. We refer to [18] for details on multivariate stable distributions and stable processes. Note that $\nu$ is a locally finite measure on $\mathbb R^2\setminus\{0\}$ which has the homogeneity property $\nu(\lambda B) = \lambda^{-\alpha}\nu(B)$ for all $\lambda>0$ and all Borel sets $B\subset\mathbb R^2\setminus\{0\}$. In what follows we identify $\mathbb R^2$ and $\mathbb C$ via the canonical isomorphism and consider $\mathbb R^2$-valued processes as $\mathbb C$-valued and vice versa.
Theorem 2.5. Assume that (4) holds and let $s\in\mathbb R$ be fixed. Then
$$\mathrm{Zeros}_{\mathbb R}\left(X_n\left(s+\frac{\cdot}{n}\right)\right) \xrightarrow{w} \mathrm{Zeros}_{\mathbb R}(Z_\nu), \qquad n\to\infty,$$
on $M_p(\mathbb R)$, where $(Z_\nu(t))_{t\in\mathbb R}$ is a stochastic process given by
$$Z_\nu(t) = \operatorname{Im}\int_0^1 e^{itu}\,dL(u) \qquad (5)$$
for $t\in\mathbb R$, and $(L(u))_{u\in[0,1]}$ is a $\mathbb C$-valued $\alpha$-stable Lévy process with zero drift, no Gaussian component, and the Lévy measure $\tilde\nu$ defined by
$$\tilde\nu(B) = \begin{cases} \dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}\nu(e^{i\theta}B)\,d\theta, & \text{if } s\notin\pi\mathbb Q,\\[6pt] \dfrac1q\displaystyle\sum_{k=0}^{q-1}\nu(e^{2\pi i k/q}B), & \text{if } s = 2\pi p/q, \text{ with } p\in\mathbb Z,\ q\in\mathbb N \text{ coprime}, \end{cases} \qquad (6)$$
for all Borel sets $B\subset\mathbb C\setminus\{0\}$, with $\nu$ being the Lévy measure of $S_{\alpha,\nu}$ in (4).
Remark 2.6. The integral in (5) (which need not exist in the Lebesgue–Stieltjes sense because $L$ has finite variation a.s. only in the case $\alpha\in(0,1)$) is defined via integration by parts:
$$\int_0^1 e^{itu}\,dL(u) := e^{it}L(1) - it\int_0^1 e^{itu}L(u)\,du. \qquad (7)$$
See, e.g., [10,18] for the properties of such stochastic integrals.
Remark 2.7. An interesting feature of Theorem 2.5 is that the behavior of the zeros near $s$ depends on whether $\bar s := s/(2\pi)$ is rational or not. To see why such arithmetic effects show up, assume for a moment that $\xi_k$ and $\eta_k$ are independent and symmetric $\alpha$-stable. Then, $X_n(s)$ is also symmetric $\alpha$-stable with scaling parameter $\sigma_n$, where
$$\sigma_n^\alpha = \sum_{k=1}^n \left(|\sin(ks)|^\alpha + |\cos(ks)|^\alpha\right),$$
and the asymptotic behavior of $\sigma_n$ differs in the two cases: if $\bar s$ is irrational, the sequence $(ks \bmod 2\pi)_{k\in\mathbb N}$ equidistributes on $[0,2\pi)$, whereas it visits only the $q$ points $2\pi j/q$, $j=0,\ldots,q-1$, if $s = 2\pi p/q$. Note that for $\alpha=2$ (which corresponds to the finite variance case studied in Section 2.1), there is no difference between the rational and irrational cases because $\sin^2 t + \cos^2 t = 1$.
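The arithmetic dichotomy of Remark 2.7 is easy to see numerically. Assuming, for illustration, independent symmetric $\alpha$-stable coefficients, the scaling parameter involves averages of $|\sin(ks)|^\alpha$, whose limit differs in the rational and irrational cases (illustrative sketch with our own variable names, $\alpha=1$):

```python
import numpy as np

alpha = 1.0          # stability index; alpha = 1 keeps the limiting constant explicit
n = 200000
k = np.arange(1, n + 1)

# irrational sbar = s/(2*pi): (ks mod 2*pi) equidistributes (Weyl), so the
# average of |sin(ks)|^alpha tends to the mean of |sin|^alpha over one period
s_irr = 2 * np.pi * np.sqrt(2.0)
avg_irr = np.mean(np.abs(np.sin(k * s_irr)) ** alpha)

# rational case s = 2*pi*p/q: (ks mod 2*pi) only visits the q points 2*pi*j/q
s_rat = 2 * np.pi / 3
avg_rat = np.mean(np.abs(np.sin(k * s_rat)) ** alpha)

print(avg_irr, 2 / np.pi)       # limit for alpha = 1: mean of |sin| = 2/pi
print(avg_rat, np.sqrt(3) / 3)  # (|sin(2pi/3)| + |sin(4pi/3)| + 0)/3
```

For $\alpha=2$ both averages equal $1/2$, which is the numerical face of the remark's last sentence.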
Remark 2.8. In Theorem 2.5 it is possible to replace the fixed $s$ by a sequence $(s_n)_{n\in\mathbb N}$ assuming that $s := \lim_{n\to\infty} s_n$ exists and either $s\notin\pi\mathbb Q$ (the first case in (6)), or $s\in\pi\mathbb Q$ and $|s_n - s| = o(1/n)$ as $n\to\infty$ (the second case in (6)).

Convergence of random trigonometric polynomials as random analytic functions
3.1. Spaces of analytic functions and analytic continuations of the processes Z, G and Z_ν. Let $H$ be the space of functions which are analytic on the entire complex plane. We endow $H$ with the topology of uniform convergence on compact sets. This topology is generated by the complete separable metric
$$d(f,g) = \sum_{r=1}^\infty 2^{-r}\,\frac{\|f-g\|_{\bar D_r}}{1+\|f-g\|_{\bar D_r}},$$
where $\bar D_r = \{|z|\le r\}$ is the closed disk of radius $r>0$ around the origin, and $\|f\|_K = \sup_{z\in K}|f(z)|$ is the sup-norm of $f$ on a compact set $K\subset\mathbb C$; see [6, pp. 151–152]. A random analytic function is a random element taking values in the space $H$ endowed with the Borel $\sigma$-algebra. We refer to [9] and [19] for more information on random analytic functions.
Let $H_{\mathbb R}$ be the closed subspace of $H$ consisting of all functions $f\in H$ which take real values on $\mathbb R$. Note that for every $f\in H_{\mathbb R}$ we have $f(\bar z) = \overline{f(z)}$ for all $z\in\mathbb C$. Indeed, the functions $f(z)$ and $\overline{f(\bar z)}$ are analytic and coincide on $\mathbb R$; hence, they must coincide everywhere on $\mathbb C$ by the uniqueness theorem for analytic functions. The space $H_{\mathbb R}$ is endowed with the induced topology and metric.
Following the approach outlined in the introduction, we shall show that convergence (2) and its counterparts in the case of correlated $(\xi_1,\eta_1)$ and in the stable case hold weakly on the space $H_{\mathbb R}$. But first of all, we have to construct analytic continuations of the limit processes $Z$, $G$ and $Z_\nu$ appearing in Theorems 2.1, 2.3 and 2.5, respectively.
3.1.1. The process Z. The stationary Gaussian process $(Z(t))_{t\in\mathbb R}$ appearing in Theorem 2.1 can be extended analytically to the complex plane using the representation
$$Z(t) = \sum_{k\in\mathbb Z} \operatorname{sinc}(t-\pi k)\,N_k, \qquad t\in\mathbb C, \qquad (8)$$
where $(N_k)_{k\in\mathbb Z}$ are i.i.d. real standard Gaussian random variables. The series in (8) converges uniformly on compact subsets of $\mathbb C$ because so does the series $\sum_{k\in\mathbb Z}|\operatorname{sinc}(t-\pi k)|^2$; see [9, Lemma 2.2.3]. It follows that $(Z(t))_{t\in\mathbb C}$ is an analytic function on $\mathbb C$ with probability 1.
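The covariance computation below rests on the reproducing property of the sinc kernel, $\sum_{k\in\mathbb Z}\operatorname{sinc}(t-\pi k)\operatorname{sinc}(s-\pi k) = \operatorname{sinc}(t-s)$, which can be checked numerically for real arguments (truncated sum; note the convention $\operatorname{sinc}(x) = \sin(x)/x$):

```python
import numpy as np

def sinc(x):
    # sin(x)/x with value 1 at x = 0; np.sinc is normalized, sin(pi y)/(pi y),
    # so sin(x)/x = np.sinc(x / pi)
    return np.sinc(np.asarray(x) / np.pi)

t, s = 0.7, 0.3
k = np.arange(-200000, 200001)
lhs = np.sum(sinc(t - np.pi * k) * sinc(s - np.pi * k))
rhs = sinc(t - s)
print(lhs, rhs)   # both close to sin(0.4)/0.4
```

The truncation error is of order $1/K$ for the cutoff $K$, since the summands decay like $k^{-2}$.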
The $\mathbb R^2$-valued process $((\operatorname{Re}Z(t), \operatorname{Im}Z(t)))_{t\in\mathbb C}$ is jointly real Gaussian in the sense that for all $t_1,\ldots,t_d\in\mathbb C$, the $2d$-dimensional random vector $(\operatorname{Re}Z(t_1), \operatorname{Im}Z(t_1), \ldots, \operatorname{Re}Z(t_d), \operatorname{Im}Z(t_d))$ is real Gaussian. Clearly, $\mathbb E Z(t) = 0$ for all $t\in\mathbb C$. The covariance structure of $(Z(t))_{t\in\mathbb C}$ is given by
$$\mathbb E[Z(t)Z(s)] = \operatorname{sinc}(t-s) \quad\text{and}\quad \mathbb E[Z(t)\overline{Z(s)}] = \operatorname{sinc}(t-\bar s), \qquad t,s\in\mathbb C. \qquad (9)$$
For instance, in the case when $t,s\notin\pi\mathbb Z$, we have
$$\mathbb E[Z(t)Z(s)] = \sum_{k\in\mathbb Z}\operatorname{sinc}(t-\pi k)\operatorname{sinc}(s-\pi k) = \operatorname{sinc}(t-s),$$
where we used the partial fraction expansion of the cotangent. In the case when $t = \pi j$ for some $j\in\mathbb Z$, we have $Z(\pi j) = N_j$ because $\operatorname{sinc}(t-\pi k) = 1$ for $k = j$ and $0$ for $k\ne j$. The proof of the second relation in (9) is similar. Representation (8) appeared, for example, in [2]. Note that the analytically continued process $(Z(t))_{t\in\mathbb C}$ is stationary with respect to shifts along the real axis, but it is not stationary with respect to shifts along the imaginary axis.
3.1.2. The process G. In the case $s\notin\pi\mathbb Z$ we can simply take $G(t) = Z(t)$, $t\in\mathbb C$, where the process $(Z(t))_{t\in\mathbb C}$ is the same as in Section 3.1.1. In the case $s\in\pi\mathbb Z$ take a centered $\mathbb C$-valued Brownian motion $(W(u))_{u\in[0,1]}$ with covariance structure
$$\mathbb E[W(u)\overline{W(v)}] = 2\min(u,v), \qquad \mathbb E[W(u)W(v)] = 2i\rho\min(u,v),$$
and put
$$U(t) = \int_0^1 e^{itu}\,dW(u), \qquad t\in\mathbb C,$$
where the integral is defined via the formal integration by parts, as in (7). Clearly, this defines $U$ as a random analytic function on $\mathbb C$. Now put¹
$$G(t) = \frac{U(t) - \overline{U(\bar t)}}{2i}, \qquad t\in\mathbb C.$$
Integrating by parts and using the identities
$$\int_0^1 \cos(tu)\,du = \frac{\sin t}{t}, \qquad \int_0^1 \sin(tu)\,du = \frac{1-\cos t}{t},$$
it is easy to check that the covariance function of $(G(t))_{t\in\mathbb R}$ is given by the second line in (3).
3.1.3. The process Z_ν. Let $(L(u))_{u\in[0,1]}$ be the $\mathbb C$-valued $\alpha$-stable Lévy process defined in Theorem 2.5. As in the construction of $G$ above, put
$$U_\nu(t) = \int_0^1 e^{itu}\,dL(u), \qquad t\in\mathbb C, \qquad (11)$$
where the integral is understood as in (7). Obviously, $U_\nu$ is a random analytic function on $\mathbb C$, and we can take
$$Z_\nu(t) = \frac{U_\nu(t) - \overline{U_\nu(\bar t)}}{2i}, \qquad t\in\mathbb C.$$
3.2. Functional limit theorems for random trigonometric polynomials. Now we are ready to prove convergence (2) and its counterparts corresponding to Theorems 2.3 and 2.5.
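The two elementary integrals $\int_0^1\cos(tu)\,du$ and $\int_0^1\sin(tu)\,du$, which produce the functions $\sin t/t$ and $(1-\cos t)/t$ entering the covariance of $G$, can be confirmed by a throwaway midpoint-rule check:

```python
import numpy as np

t = 1.7
m = 100000
u = (np.arange(m) + 0.5) / m          # midpoint rule on [0, 1]
int_cos = np.mean(np.cos(t * u))
int_sin = np.mean(np.sin(t * u))
print(int_cos, np.sin(t) / t)         # int_0^1 cos(tu) du = sin(t)/t
print(int_sin, (1 - np.cos(t)) / t)   # int_0^1 sin(tu) du = (1-cos(t))/t
```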
Theorem 3.1. Let $(\xi_1,\eta_1),(\xi_2,\eta_2),\ldots$ be i.i.d. random vectors with zero mean and unit covariance matrix. Fix any sequence of real numbers $(s_n)_{n\in\mathbb N}$ and consider a random process $(Y_n(t))_{t\in\mathbb C}$ defined by
$$Y_n(t) = \frac{1}{\sqrt n}\,X_n\left(s_n + \frac tn\right), \qquad t\in\mathbb C.$$
Then, $Y_n\to Z$ weakly on $H_{\mathbb R}$ as $n\to\infty$. Proof. The proof consists of two steps.
¹ Here the following observation is used: if $f$ is an analytic function, then so is $g(z) := (f(z) - \overline{f(\bar z)})/(2i)$. Moreover, $g\in H_{\mathbb R}$ and for $t\in\mathbb R$ we have $g(t) = \operatorname{Im} f(t)$. In particular, $G(t) = \operatorname{Im} U(t)$ for $t\in\mathbb R$ but, generally speaking, this relation fails for $t\in\mathbb C\setminus\mathbb R$.
Convergence of finite-dimensional distributions. This step amounts to verifying the Lindeberg condition for the triangular array of summands of $Y_n$. With $C = 2\cosh^2(\operatorname{Im} t)$, the Lindeberg sum is bounded by
$$C\,\mathbb E\left[(\xi_1^2+\eta_1^2)\,\mathbb 1_{\{C(\xi_1^2+\eta_1^2)>\varepsilon^2 n\}}\right],$$
which converges to $0$ as $n\to\infty$ because $\mathbb E[\xi_1^2+\eta_1^2]<\infty$. Tightness. In order to prove that the sequence $(Y_n)_{n\in\mathbb N}$ is tight on $H$, it suffices to show that for every $R>0$,
$$\sup_{n\in\mathbb N}\ \sup_{|t|\le R}\ \mathbb E|Y_n(t)|^2 < \infty; \qquad (13)$$
see [12, Lemma 4.2] or the remark after Lemma 2.6 in [19]. For all $|t|\le R$ and $n\in\mathbb N$ we have
$$\mathbb E|Y_n(t)|^2 = \frac1n\sum_{k=1}^n\left(\left|\sin\left(k\left(s_n+\frac tn\right)\right)\right|^2 + \left|\cos\left(k\left(s_n+\frac tn\right)\right)\right|^2\right) \le 2\cosh^2(R)$$
because $-R\le \frac kn\operatorname{Im} t\le R$ for all $k=1,\ldots,n$. It follows that $Y_n$ converges to $Z$ weakly on $H$, as $n\to\infty$. Since $H_{\mathbb R}$ is a closed subset of $H$ and all processes under consideration have their sample paths in $H_{\mathbb R}$, the convergence holds weakly on $H_{\mathbb R}$, as well.
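For coefficients with unit covariance matrix and real arguments, the covariance of the rescaled polynomial reduces to a Riemann sum converging to $\operatorname{sinc}(t_1-t_2)$, the covariance of $Z$; a quick numerical check (our own helper name, $u = t_1-t_2$):

```python
import numpy as np

def cov_Yn(n, u):
    # E[Y_n(t1) Y_n(t2)] at real arguments, unit covariance coefficients:
    # the cross terms cancel and (1/n) * sum_{k=1}^n cos(k u / n) remains,
    # a Riemann sum for int_0^1 cos(u v) dv = sin(u)/u
    k = np.arange(1, n + 1)
    return np.mean(np.cos(k * u / n))

u = 2.5
approx = cov_Yn(10**6, u)
limit = np.sin(u) / u
print(approx, limit)
```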
The next theorem provides convergence of random trigonometric polynomials under the assumptions of Theorem 2.3.
Theorem 3.2. Let $s\in\mathbb R$ be fixed and define a random process $(Y_n(t))_{t\in\mathbb C}$ by
$$Y_n(t) = \frac{1}{\sqrt n}\,X_n\left(s + \frac tn\right), \qquad t\in\mathbb C.$$
Under the assumptions of Theorem 2.3, $Y_n\to G$ weakly on $H_{\mathbb R}$ as $n\to\infty$. Proof. We use the same idea as in the proof of Theorem 3.1.
We shall need the standard trigonometric identities
$$\sum_{k=1}^n \cos(k\theta) = \frac{\sin\frac{n\theta}{2}\,\cos\frac{(n+1)\theta}{2}}{\sin\frac{\theta}{2}}, \qquad \sum_{k=1}^n \sin(k\theta) = \frac{\sin\frac{n\theta}{2}\,\sin\frac{(n+1)\theta}{2}}{\sin\frac{\theta}{2}}, \qquad (14)$$
where the case $\theta\in2\pi\mathbb Z$ is understood by continuity. As in the proof of Theorem 3.1, the subsequent argument is divided into two steps.
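The closed forms for $\sum_{k=1}^n\cos(k\theta)$ and $\sum_{k=1}^n\sin(k\theta)$ (the standard Lagrange identities, as we read (14)) can be sanity-checked against direct summation:

```python
import numpy as np

def cos_sum_closed(n, theta):
    # sum_{k=1}^n cos(k θ) = sin(nθ/2) cos((n+1)θ/2) / sin(θ/2)
    return np.sin(n * theta / 2) * np.cos((n + 1) * theta / 2) / np.sin(theta / 2)

def sin_sum_closed(n, theta):
    # sum_{k=1}^n sin(k θ) = sin(nθ/2) sin((n+1)θ/2) / sin(θ/2)
    return np.sin(n * theta / 2) * np.sin((n + 1) * theta / 2) / np.sin(theta / 2)

n, theta = 37, 0.9
k = np.arange(1, n + 1)
print(np.sum(np.cos(k * theta)), cos_sum_closed(n, theta))
print(np.sum(np.sin(k * theta)), sin_sum_closed(n, theta))
```

Both identities follow from summing the geometric series $\sum_{k=1}^n e^{ik\theta}$ and taking real and imaginary parts.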
Convergence of finite-dimensional distributions. First we prove that the covariances of $Y_n$ converge to those of $G$. For all $t_1,t_2\in\mathbb C$, we have
$$\mathbb E[Y_n(t_1)Y_n(t_2)] = \frac1n\sum_{k=1}^n \sin\left(k\left(s+\frac{t_1}{n}\right)\right)\sin\left(k\left(s+\frac{t_2}{n}\right)\right) + \frac1n\sum_{k=1}^n \cos\left(k\left(s+\frac{t_1}{n}\right)\right)\cos\left(k\left(s+\frac{t_2}{n}\right)\right) + \frac{\rho}{n}\sum_{k=1}^n \sin\left(k\left(2s+\frac{t_1+t_2}{n}\right)\right). \qquad (15)$$
Denote the three terms on the right-hand side by $S_1(n)$, $S_2(n)$, $S_3(n)$. Using the formula $2\sin x\sin y = \cos(x-y) - \cos(x+y)$ and then the first identity in (14), we obtain
$$S_1(n) = \frac{1}{2n}\sum_{k=1}^n \cos\left(k\,\frac{t_1-t_2}{n}\right) - \frac{1}{2n}\sum_{k=1}^n \cos\left(k\left(2s+\frac{t_1+t_2}{n}\right)\right).$$
Sending $n\to\infty$ and considering the cases $\sin s\ne0$ and $\sin s=0$ separately, we infer
$$\lim_{n\to\infty} S_1(n) = \frac12\operatorname{sinc}(t_1-t_2) - \frac12\operatorname{sinc}(t_1+t_2)\,\mathbb 1_{\{\sin s=0\}}.$$
Similarly, using the formula $2\cos x\cos y = \cos(x-y)+\cos(x+y)$ for the second sum we arrive at
$$\lim_{n\to\infty} S_2(n) = \frac12\operatorname{sinc}(t_1-t_2) + \frac12\operatorname{sinc}(t_1+t_2)\,\mathbb 1_{\{\sin s=0\}}.$$
Finally, in view of the second formula in (14),
$$\lim_{n\to\infty} S_3(n) = \rho\,\frac{1-\cos(t_1+t_2)}{t_1+t_2}\,\mathbb 1_{\{\sin s=0\}}.$$
Taking everything together and recalling the definition of the process $G$, see (3), we obtain
$$\lim_{n\to\infty}\mathbb E[Y_n(t_1)Y_n(t_2)] = \mathbb E[G(t_1)G(t_2)]. \qquad (16)$$
A similar computation (with $t_2$ replaced by $\bar t_2$) yields
$$\lim_{n\to\infty}\mathbb E[Y_n(t_1)\overline{Y_n(t_2)}] = \mathbb E[G(t_1)\overline{G(t_2)}]. \qquad (17)$$
Fix $t_1,\ldots,t_d\in\mathbb C$. In view of the convergence of the covariances established in (16) and (17), to prove that $(Y_n(t_1),\ldots,Y_n(t_d))$ converges in distribution to $(G(t_1),\ldots,G(t_d))$ it is enough to verify the Lindeberg condition for every fixed $t\in\mathbb C$ and $\varepsilon>0$. This can be done exactly in the same way as in the proof of Theorem 3.1.
Tightness. It is sufficient to check condition (13). Starting with the equality $\mathbb E|Y_n(t)|^2 = \mathbb E[Y_n(t)\overline{Y_n(t)}]$ and applying (15) with $t_1 = t = \bar t_2$, we arrive at a bound which, together with the inequalities $|\sin z|\le\cosh(\operatorname{Im}z)$ and $|\cos z|\le\cosh(\operatorname{Im}z)$, implies condition (13). Combining the pieces, we see that $Y_n\to G$ weakly on $H$ and hence also on $H_{\mathbb R}$.
In the case of attraction to a stable law we have the following functional limit theorem. Theorem 3.3. Let $s\in\mathbb R$ be fixed and assume that (4) holds. Then, the processes $t\mapsto b_n^{-1}X_n(s+t/n)$ converge to $Z_\nu$ weakly on $H_{\mathbb R}$ as $n\to\infty$. Since its proof is more involved than in the previous cases, it is given in the separate Section 3.3.

3.3. Proof of Theorem 3.3.
We start with a well-known observation, see [16], that (4) implies that the distribution of $(\xi_1,\eta_1)$ varies regularly in $\mathbb R^2$ with the limit measure $\nu$, which, in turn, is equivalent to the vague convergence
$$n\,\mathbb P\left[\frac{(\xi_1,\eta_1)}{b_n}\in\cdot\,\right] \xrightarrow{v} \nu \quad\text{on } \mathbb R^2\setminus\{0\}. \qquad (18)$$
Lemma 3.4. Define the sequence $(\lambda_n)_{n\in\mathbb N}$ of measures on $[0,\infty)\times(\mathbb C\setminus\{0\})$ by
$$\lambda_n := \sum_{k=1}^n \delta_{k/n}\otimes\mathbb P\left[\frac{e^{iks}(\xi_1+i\eta_1)}{b_n}\in\cdot\,\right].$$
Then, $\lambda_n\xrightarrow{v}\mathrm{LEB}\otimes\tilde\nu$, as $n\to\infty$, where $\tilde\nu$ is the Lévy measure defined in (6).
Proof. Case $s\notin\pi\mathbb Q$. We have to check that the integrals of continuous, compactly supported test functions against $\lambda_n$ converge to the corresponding integrals against $\mathrm{LEB}\otimes\tilde\nu$; by (18), this reduces to the Weyl equidistribution of the sequence $(ks \bmod 2\pi)_{k\in\mathbb N}$, see (32) in Lemma 5.1. Case $s = 2\pi p/q$ follows analogously from (33) in Lemma 5.1.
The rest of the proof mimics the proof of Proposition 3.1 in [16]. The only place which has to be checked is relation (3.3) of the cited paper, which in our situation states that
$$\lim_{n\to\infty}\ \max_{1\le k\le n}\ \mathbb P\left[\frac{e^{iks}(\xi_1+i\eta_1)}{b_n}\in A\right] = 0,$$
where $A$ is a compact subset of $\mathbb C\setminus\{0\}$. But this is obvious, since, by (18), this maximum is of order $O(1/n)$.
Lemma 3.5. Define the sequence of processes
$$L_n(u) := \frac{1}{b_n}\sum_{k=1}^{\lfloor nu\rfloor} e^{iks}(\xi_k+i\eta_k), \qquad u\in[0,1]. \qquad (20)$$
Then,
$$L_n \xrightarrow{w} L \quad\text{on } D([0,1],\mathbb C), \qquad (21)$$
where the Lévy process $L$ is the same as in Theorem 2.5.
Proof. If $s\in2\pi\mathbb Z$, then (21) is just a functional limit theorem for i.i.d. vectors corresponding to (4). Let us assume that $s\notin2\pi\mathbb Z$, which means that the vectors are independent but not identically distributed. We shall use a criterion for functional convergence given in Theorem 3.1 of [21]. In view of Lemma 3.4 we need to check condition (22) for every $\delta>0$. It follows from the definition of $\tilde\nu$ that it is invariant under the transformations $z\mapsto ze^{2\pi i\theta}$, where $\theta\in\mathbb R$ (if $s\notin\pi\mathbb Q$) and $\theta\in q^{-1}\mathbb Z$ (if $s = 2\pi p/q$). Since we assume $s\notin2\pi\mathbb Z$, this transformation group contains at least one non-trivial rotation, which implies (23). The next step is to show (24). Note that $\mathbb E[\Delta_{n,k}] = 0$. Since $(|\sum_{k=1}^m\Delta_{n,k}|)_{m\in\mathbb N}$ is a non-negative submartingale, we can apply Doob's inequality; here, $(\xi,\eta)$ denotes a distributional copy of $(\xi_1,\eta_1)$. Assumption (18) implies that $x\mapsto\mathbb P[\sqrt{\xi^2+\eta^2}>x]$ is regularly varying. Hence, by Karamata's theorem in the form given by formula (5.22) on p. 579 in [7], the resulting expression tends to zero as $\varepsilon\to0$, since $\tilde\nu$ is a Lévy measure, whence (24).
Lemma 3.6. Fix $s\in\mathbb R$ and define a sequence of processes
$$Y_n(t) := \frac{1}{b_n}\sum_{k=1}^n (\xi_k+i\eta_k)\,e^{ik(s+t/n)}, \qquad t\in\mathbb C. \qquad (25)$$
Under the assumptions of Theorem 2.5 we have
$$Y_n \xrightarrow{w} U_\nu \quad\text{on } H, \qquad (26)$$
where the process $U_\nu$ is defined in (11).
Proof. Define a mapping $F: D([0,1],\mathbb C)\to H$ by
$$(F(\ell))(t) := e^{it}\ell(1) - it\int_0^1 e^{itu}\ell(u)\,du, \qquad t\in\mathbb C;$$
cf. the integration-by-parts formula (7). In view of the representation $Y_n = F(L_n)$, with $L_n$ as in (20), convergence (26) follows from the continuous mapping theorem. Now we are in a position to prove Theorem 3.3.

Proof of Theorem 3.3.
Recalling the definition of $X_n$, we can write
$$\frac{1}{b_n}\,X_n\left(s+\frac tn\right) = \frac{Y_n(t) - \overline{Y_n(\bar t)}}{2i},$$
with $Y_n$ as in (25). It follows from Lemma 3.6 and the continuous mapping theorem that $b_n^{-1}X_n(s+\cdot/n)$ converges to $Z_\nu$ weakly on $H_{\mathbb R}$, which proves Theorem 3.3.
Lemma 4.1. Fix $a<b$ and denote by $A\subset H_{\mathbb R}$ the set of functions $f$, not vanishing identically, such that $f(a)\ne0$, $f(b)\ne0$, and all real zeros of $f$ in $(a,b)$ are simple. Then the functional $N(f) := N_f[a,b]$ is continuous on $A$.
Proof. Consider any sequence $(f_n)_{n\in\mathbb N}\subset H_{\mathbb R}$ which converges to $f\in A$ locally uniformly. We need to show that for sufficiently large $n$ we have $f_n\in A$ and $N(f_n) = N(f)$. Let $R>0$ be so large that $[a,b]$ is contained in the open disk $D_R = \{|z|<R\}$. Let $z_1,\ldots,z_d$ be the collection of all zeros of $f$ in $D_R$ with corresponding multiplicities $m_1,\ldots,m_d$. Assume without loss of generality that $f$ has no zeros on the boundary of $D_R$ (just increase $R$, otherwise). Let $\varepsilon>0$ be so small that the open $\varepsilon$-disks $z_1+D_\varepsilon,\ldots,z_d+D_\varepsilon$ do not intersect each other, the boundary of $D_R$, and the real axis (except when the corresponding zero is itself real). By Hurwitz's theorem [6, p. 152], for all sufficiently large $n$, the function $f_n$ has exactly $m_k$ zeros (counted with multiplicities) in the disk $z_k+D_\varepsilon$, for all $k=1,\ldots,d$, and there are no other zeros of $f_n$ in $D_R$. If $z_k\in(a,b)$, then $m_k=1$ (in view of $f\in A$) and the corresponding zero of $f_n$ in the disk $z_k+D_\varepsilon$ is also real, because otherwise $f_n$ would have two different complex conjugate zeros there (recall that $f_n(\bar z) = \overline{f_n(z)}$), which is a contradiction. It follows that all real zeros of $f_n$ in $(a,b)$ are simple and their number is $N(f)$. Clearly, $f_n(a)\ne0$ and $f_n(b)\ne0$ for sufficiently large $n$. Hence, $f_n\in A$ and $N(f_n) = N(f)$ for large $n$.
Recall that $\mathrm{Zeros}_{\mathbb R}(f)$ is the locally finite measure on $\mathbb R$ counting the real zeros of $f\in H_{\mathbb R}\setminus\{0\}$ with multiplicities.
Lemma 4.2. Denote by $A(\mathbb R)\subset H_{\mathbb R}$ the set of functions, not vanishing identically, all of whose real zeros are simple. Then the mapping $f\mapsto\mathrm{Zeros}_{\mathbb R}(f)$ is continuous at every $f\in A(\mathbb R)$, where $M_p(\mathbb R)$ carries the vague topology.
Proof. Let $(f_n)_{n\in\mathbb N}\subset H_{\mathbb R}$ be a sequence which converges to $f\in A(\mathbb R)$ locally uniformly. Fix $R>0$. Let $z_1,\ldots,z_l$ be the real zeros of $f$ in $[-R,R]$ and assume there are no zeros at $-R$ and $R$. Fix $\varepsilon>0$. Arguing as in the proof of Lemma 4.1, we can show that for sufficiently large $n$, the function $f_n$ has exactly one real zero in each of the disks $z_1+D_\varepsilon,\ldots,z_l+D_\varepsilon$ and there are no further real zeros of $f_n$ in $[-R,R]$. But this means that $\mathrm{Zeros}_{\mathbb R}(f_n)$ converges to $\mathrm{Zeros}_{\mathbb R}(f)$ vaguely.
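The dichotomy exploited in Lemmas 4.1 and 4.2 (a simple real zero survives a small real perturbation, while a multiple real zero may split into a complex conjugate pair) can be illustrated on polynomials:

```python
import numpy as np

eps = 0.01

# a simple real zero survives a small real perturbation: x -> x + eps
simple = np.roots([1.0, eps])
# a double real zero may leave the real axis: x^2 -> x^2 + eps,
# whose roots form the complex conjugate pair +/- i*sqrt(eps)
double = np.roots([1.0, 0.0, eps])

print("perturbed simple zero:", simple)   # still real, near 0
print("perturbed double zero:", double)   # no longer real
```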
Analogously, Theorem 2.1 is a consequence of
$$\mathbb P[Z(a) = 0] = \mathbb P[Z(b) = 0] = 0 \quad\text{and}\quad \mathbb P[\text{all real zeros of } Z \text{ are simple}] = 1 \qquad (28)$$
for every $a<b$, together with the corresponding statement (29) for $G$. In order to verify these statements, we need the following result due to E. V. Bulinskaya [5]. It provides general conditions which ensure that a stochastic process (which need not be Gaussian) does not have multiple zeros, with probability 1.
Lemma 4.3 (Bulinskaya). Let $(X(t))_{t\in[a,b]}$ be a stochastic process with continuously differentiable sample paths such that, for every $t\in[a,b]$, the random variable $X(t)$ has a density bounded uniformly in $t$. Then, with probability 1, there is no $t\in[a,b]$ with $X(t) = X'(t) = 0$.
The parts of (28) and (29) regarding the Gaussian processes $Z$ and $G$ (in the case $s\notin\pi\mathbb Z$) follow immediately from Lemma 4.3 (see also [22] or [23] for further work on the absence of multiple zeros of Gaussian processes). Indeed, the variances of both $Z$ and $G$ are non-zero constants, which implies that there are uniform upper bounds on the densities of $Z(t)$ and $G(t)$. Let us consider $G(t)$ in the case $s\in\pi\mathbb Z$.
Lemma 4.5. With probability 1, there is no $t\in\mathbb R$ such that $Z_\nu(t) = Z'_\nu(t) = 0$. Proof. Recall that $Z_\nu$ is a random analytic function. We intend to show that the random variables $Z_\nu(t)$ have densities which are bounded uniformly in $t$ with $\varepsilon<|t|<\varepsilon^{-1}$, for every fixed $\varepsilon>0$. By Lemma 4.3 this implies that the process $Z_\nu$ almost surely does not have multiple zeros in any interval bounded away from zero. Fix $\varepsilon>0$. It is enough to show that
$$\int_{-\infty}^{+\infty}\left|\mathbb E e^{iaZ_\nu(t)}\right|da \le C, \qquad (30)$$
where $C$ does not depend on $t$ with $\varepsilon<|t|<\varepsilon^{-1}$. This means that the characteristic function of the random variable $Z_\nu(t)$ has bounded $L^1$-norm which, by Fourier inversion, implies that this random variable has a Lebesgue density, say $p_t$, bounded by $C/(2\pi)$. We prove (30). Recall that, by (5),
$$\log\mathbb E e^{iaZ_\nu(t)} = \int_0^1 \psi(a\sin(tu),\, a\cos(tu))\,du, \qquad (31)$$
where $\psi(x,y) = \log\mathbb E e^{i(x\operatorname{Re}L(1)+y\operatorname{Im}L(1))}$, $x,y\in\mathbb R$. The random vector $(\operatorname{Re}L(1),\operatorname{Im}L(1))$ is $\alpha$-stable. Denote its spectral measure by $\Gamma$ (which is a finite measure on the unit circle $\mathbb T = \{z\in\mathbb C : |z|=1\}$ that can be easily expressed in terms of $\tilde\nu$). We have, see Theorem 2.3.1 in [18],
$$\operatorname{Re}\psi(a\sin(tu),\, a\cos(tu)) = -|a|^\alpha \int_{[0,2\pi)} |\sin(tu+\varphi)|^\alpha\,\Gamma(d\varphi),$$
where the points of $\mathbb T$ are written as $e^{i\varphi}$. Putting this into (31) we obtain
$$\operatorname{Re}\log\mathbb E e^{iaZ_\nu(t)} = -|a|^\alpha \int_{[0,2\pi)} \left(\frac1t\int_\varphi^{\varphi+t}|\sin v|^\alpha\,dv\right)\Gamma(d\varphi).$$
The function $(\varphi,t)\mapsto t^{-1}\int_\varphi^{\varphi+t}|\sin v|^\alpha\,dv$ is continuous and strictly positive on the compact set $[0,2\pi]\times[\varepsilon,\varepsilon^{-1}]$, hence attains its minimal value, say $\delta>0$. Therefore, $\operatorname{Re}\log\mathbb E e^{iaZ_\nu(t)} \le -\delta\,\Gamma([0,2\pi))\,|a|^\alpha$, $a\in\mathbb R$, yielding (30) and proving that there are no multiple zeros in $\mathbb R\setminus\{0\}$.
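The strict positivity of $(\varphi,t)\mapsto t^{-1}\int_\varphi^{\varphi+t}|\sin v|^\alpha\,dv$ on a compact set, which drives the bound above, can be probed numerically (illustrative grid with our own names; when $t$ is a full period the average equals $\frac{1}{2\pi}\int_0^{2\pi}|\sin v|^\alpha\,dv$ for any $\varphi$, which is $2/\pi$ for $\alpha=1$):

```python
import numpy as np

alpha = 1.0   # any alpha in (0, 2); alpha = 1 gives the explicit constant 2/pi

def avg_abs_sin(phi, t, m=20000):
    # t^{-1} * integral_{phi}^{phi+t} |sin v|^alpha dv via the midpoint rule
    v = phi + (np.arange(m) + 0.5) * (t / m)
    return np.mean(np.abs(np.sin(v)) ** alpha)

eps = 0.5
phis = np.linspace(0.0, 2 * np.pi, 41)
ts = np.linspace(eps, 1 / eps, 41)
delta = min(avg_abs_sin(p, t) for p in phis for t in ts)
print("minimum over the grid:", delta)           # strictly positive
print(avg_abs_sin(0.3, 2 * np.pi), 2 / np.pi)    # full-period average
```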
To check (32), we first note that $\mu'_n$ converges to $\mu'$ weakly, since the distribution functions of $\mu'_n$ converge pointwise to the distribution function of $\mu'$.
By the Skorokhod representation theorem there exist a probability space $(\Omega,\mathcal A,\mathbb P)$ and random vectors $X_n$ and $X$ on this space such that $X_n$ has distribution $\mu'_n$, $X$ has distribution $\mu'$, and $X_n\to X$, as $n\to\infty$, almost surely. With this notation we can recast (32) as relation (36). Recalling the uniform boundedness of $(f_z)$ and invoking the dominated convergence theorem, we arrive at (36). Relation (33) follows analogously from the observation that
$$\mu''_n := \frac{1}{na}\sum_{k\ge1} \delta_{(k/n,\,(2\pi pk/q)\bmod 2\pi)} \to \mu'' := a^{-1}\,\mathrm{LEB}\times U_q, \quad\text{weakly},$$
where $U_q$ is the uniform distribution on the set $\{0, 2\pi/q, 4\pi/q,\ldots, 2\pi(q-1)/q\}$. The proof of Lemma 5.1 is complete.