A characterization of limiting functions arising in Mod-* convergence

Abstract. In this note, we characterize the limiting functions in mod-Gaussian convergence; our approach also sheds new light on the nature of mod-Gaussian convergence itself. Our results in fact apply more generally to mod-* convergence, where * stands for any family of probability distributions whose Fourier transforms do not vanish. We moreover provide new examples, including two new examples of (restricted) mod-Cauchy convergence from arithmetic, related to Dedekind sums and the linking number of modular geodesics.


1 Introduction
In [4] a new type of convergence, which can be viewed as a refinement of the central limit theorem, was proposed, following the idea that, given a sequence of random variables, one looks for the convergence of a renormalized sequence of characteristic functions rather than the convergence of a renormalized sequence of the random variables themselves. More precisely, the following definitions were introduced:

Definition 1.1 ([4]). Let (Ω, F, P) be a probability space and let (X_n)_{n≥0} be a sequence of random variables defined on this probability space.

1. We say that (X_n)_{n≥0} converges in the mod-Gaussian sense with parameters (m_n, σ_n²) and limiting function Φ(λ) if the following convergence holds locally uniformly for λ ∈ R:

   lim_{n→∞} exp(σ_n² λ²/2 − i m_n λ) E[e^{iλX_n}] = Φ(λ).   (1.1)

2. We say that the sequence (X_n)_{n≥0} converges in the mod-Poisson sense with parameter γ_n and limiting function Φ if the following convergence holds locally uniformly for λ ∈ R:

   lim_{n→∞} exp(−γ_n(e^{iλ} − 1)) E[e^{iλX_n}] = Φ(λ)   (1.2)

(we have normalized by the characteristic function of a Poisson random variable with mean γ_n).
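To make these definitions concrete, here is a small numerical sanity check (ours, not part of the original text): for the toy sequence X_n = P_n + B, where P_n is Poisson with mean γ_n and B is an independent Bernoulli(1/2) variable, the renormalization of (1.2) yields exactly the limiting function Φ(λ) = (1 + e^{iλ})/2, for every n. The function names below are ours.

```python
import cmath
import math

def poisson_char(gamma, lam, cutoff=400):
    """E[exp(i*lam*X)] for X ~ Poisson(gamma), computed from the pmf
    (log-space weights to avoid overflow in gamma**k / k!)."""
    return sum(
        math.exp(-gamma + k * math.log(gamma) - math.lgamma(k + 1))
        * cmath.exp(1j * lam * k)
        for k in range(cutoff)
    )

def mod_poisson_ratio(gamma, lam):
    """Renormalization (1.2) for X = Poisson(gamma) + Bernoulli(1/2)
    (independent): E[exp(i*lam*X)] * exp(-gamma*(exp(i*lam) - 1))."""
    phi = poisson_char(gamma, lam) * (1 + cmath.exp(1j * lam)) / 2
    return phi * cmath.exp(-gamma * (cmath.exp(1j * lam) - 1))
```

Here the ratio is independent of γ, so the "limit" is attained exactly; in genuine examples it is only approached as n → ∞.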
In fact, as pointed out in [4], one can more generally study the convergence of the characteristic functions after renormalization by any family of characteristic functions which do not vanish: with this more general situation in mind, we talk about mod-* convergence. In a series of works [4, 6, 5, 1, 2] the authors establish that mod-* convergence occurs in many situations in number theory, random matrix theory, probability theory, random permutations and combinatorics, and prove that, under some extra assumptions, mod-* convergence may imply results such as local limit theorems, distributional approximations or precise large deviations. It should be noted that mod-* convergence usually implies convergence in law of the random variables X_n, possibly after rescaling; this corresponds to the most interesting cases studied, where m_n = 0 and σ_n → ∞ in (1.1), or γ_n → ∞ in (1.2). Moreover, it is shown in [4] and [5] that the limiting function sheds some new light on the connections between number-theoretic objects and their naive probabilistic models. Roughly speaking, naive probabilistic models are based on the false assumption that primes behave independently of each other, and yet they predict central limit theorems, such as Selberg's central limit theorem for the Riemann zeta function or the Erdős-Kac central limit theorem for the number of distinct prime divisors of integers. However, at the level of mod-Gaussian or mod-Poisson convergence, they fail to predict the correct behavior, and a correction factor appears in the limiting function to account for the lack of independence. Hence the limiting function seems to carry some information about the dependence among prime numbers. It thus seems natural to ask what the possible limiting functions can be in the framework of mod-* convergence; this question was left open in [4].
In this paper, we propose a characterization of the limiting functions. Let S_0 be the set of functions which can be obtained as the characteristic function of a real random variable divided by the characteristic function of a Gaussian random variable. It is clear that S_0 is contained in the set S of continuous functions φ from R to C such that φ(0) = 1 and φ(−λ) is the complex conjugate of φ(λ) for all λ ∈ R. The converse is not true: indeed, if a function φ in S tends to infinity faster than λ ↦ e^{σ²λ²/2} when |λ| goes to infinity, for all σ > 0, then φ ∉ S_0 (any element of S_0 is bounded by e^{σ²λ²/2} for some σ > 0, since characteristic functions are bounded by one). However, the following result holds:

Theorem 1.2. The set S_0 is dense in S for the topology of uniform convergence on compact sets.
The next section is devoted to a complete and short proof of this result, and to another possible proof based on the study of mod-Gaussian convergence for sums of i.i.d. random variables. We also propose a larger framework in which mod-* convergence only holds on a finite interval. We moreover provide two new examples of mod-Cauchy convergence from arithmetic, related to Dedekind sums and the linking number of modular geodesics, thus strengthening the relevance of this framework in number theory as well.
2 Proofs of Theorem 1.2

2.1 Analytic proof
Let P be a polynomial with real coefficients such that P(0) = 1. For all σ > 0, let f_σ : R → R be the density of a centered Gaussian variable with variance σ²,

   f_σ(x) = (1/(σ√(2π))) e^{−x²/(2σ²)},

and let g_{P,σ} : R → R be

   g_{P,σ} = P(D)(f_σ),

where D denotes the operator of differentiation of functions (e.g., for P(x) = x² + 1, P(D)(f_σ) = f_σ'' + f_σ). We first establish a lemma:

Lemma 2.1. For any real polynomial P with constant term 1, there exists σ_0 > 0 such that for all σ ≥ σ_0, the function g_{P,σ} is nonnegative.
Proof. Without loss of generality, we can assume that deg(P) ≥ 1, i.e. P ≠ 1 (for P = 1 the result is trivial). Now f_σ(x) = f_1(x/σ)/σ, and then, by taking the k-th derivative,

   f_σ^{(k)}(x) = σ^{−(k+1)} f_1^{(k)}(x/σ)

for all σ > 0, x ∈ R, k ≥ 0. From the expression of the derivatives of f_1 in terms of Hermite polynomials, one deduces that there exists a constant C_P > 1, depending only on the polynomial P, such that the bound (2.1) holds for all σ > 1, x ∈ R (recall that P − 1 has no constant term). Let us first suppose that σ > 1 and |x| ≤ σ^{3/2}. In this case, |x|/σ² ≤ 1/√σ, and then, from (2.1), one deduces that P(D)(f_σ)(x) ≥ 0, and a fortiori g_{P,σ}(x) ≥ 0, for σ ≥ 9C_P². Let us now suppose that |x| > σ^{3/2} and σ > 3, the third inequality coming from the Taylor expansion of the exponential function, and the last inequality coming from the fact that σ > 3, and then |x| > σ^{3/2} > 3^{3/2} > 4, which implies that e^{|x|/σ²} ≤ e^{x²/(4σ²)}. One deduces that (2.2) holds for all σ large enough, depending only on P.
Once the positivity of g_{P,σ} is proven (for σ large enough, depending only on P), let us compute its Fourier transform: one checks that for all λ ∈ R,

   ∫_R e^{iλx} g_{P,σ}(x) dx = P(−iλ) e^{−σ²λ²/2},

since each differentiation under P(D) multiplies the Fourier transform of f_σ by −iλ. In particular, the value of the Fourier transform at λ = 0 is equal to one, which implies that g_{P,σ} is in fact a probability density. Hence, the following function is in S_0:

   λ ↦ P(−iλ).

Since this holds for every real polynomial P with constant term 1, one deduces that the adherence of S_0, for the topology of uniform convergence on compact sets, contains all such functions, and then all the functions in S, by the Stone-Weierstrass theorem.
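As a numerical sanity check (ours, using the text's own example P(X) = X² + 1), one can verify both the nonnegativity of g_{P,σ} and the value of its Fourier transform by direct quadrature; with the convention ĝ(λ) = ∫ e^{iλx} g(x) dx, one expects ĝ_{P,σ}(λ) = (1 − λ²) e^{−σ²λ²/2} here.

```python
import math

def g(x, sigma):
    """g_{P,sigma}(x) = f_sigma''(x) + f_sigma(x) for P(X) = X^2 + 1,
    with f_sigma the centered Gaussian density of variance sigma^2;
    here f'' = (x^2/sigma^4 - 1/sigma^2) f."""
    f = math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return f * (1 + x * x / sigma ** 4 - 1 / sigma ** 2)

def fourier_g(lam, sigma, half_width=60.0, n=200_000):
    """int exp(i*lam*x) g(x) dx by the trapezoid rule; g is even, so the
    integral is real and equals int cos(lam*x) g(x) dx."""
    h = 2 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.cos(lam * x) * g(x, sigma)
    return total * h
```

For σ = 2 one finds fourier_g(0, 2) ≈ 1 (so g is a probability density) and fourier_g(λ, 2) ≈ (1 − λ²) e^{−2λ²}, as predicted.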

2.2 Probabilistic proof: mod-Gaussian convergence for sums of i.i.d. random variables
It is natural to ask whether there exists a general mod-Gaussian convergence result for sums of i.i.d. random variables, in the same way as there exists a central limit theorem. The answer is positive, and it provides in fact an alternative proof of Theorem 1.2. The result also brings out the interesting fact that mod-Gaussian convergence is closely related to cumulants. More precisely, we have the following result:

Proposition 2.2. Let k ≥ 2 be an integer, and let (X_n)_{n≥1} be a sequence of i.i.d. variables in L^r for some r > k + 1, such that the first k moments of X_1 are the same as the corresponding moments of a standard Gaussian variable. Then, the sequence of variables

   Y_n = (X_1 + · · · + X_n)/n^{1/(k+1)}

converges in the mod-Gaussian sense, with the sequence of means and variances

   m_n = 0,   σ_n² = n^{(k−1)/(k+1)},

and limiting function

   Φ(λ) = exp(c_{k+1} (iλ)^{k+1}/(k+1)!),

where c_{k+1} denotes the (k+1)-th cumulant of X_1.
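The following numerical illustration of Proposition 2.2 is ours, not from the text: take k = 2 and X = E − 1 with E exponential of mean 1, so that the first two moments match the standard Gaussian ones and the third cumulant is c_3 = 2. The renormalized characteristic function of (X_1 + · · · + X_n)/n^{1/3}, with σ_n² = n^{1/3}, should then approach exp(c_3 (iλ)³/3!) = exp(−iλ³/3).

```python
import cmath

def renormalized_char(n, lam):
    """E[exp(i*lam*S_n/n^(1/3))] * exp(sigma_n^2*lam^2/2) for S_n a sum of
    n i.i.d. copies of X = E - 1, E ~ Exp(1) (mean 0, variance 1, c_3 = 2),
    with sigma_n^2 = n^(1/3); computed exactly from the characteristic
    function of X, log E[exp(i*u*X)] = -i*u - log(1 - i*u)."""
    u = lam / n ** (1.0 / 3.0)
    log_phi = -1j * u - cmath.log(1 - 1j * u)
    return cmath.exp(n * log_phi + n ** (1.0 / 3.0) * lam ** 2 / 2)
```

The error in the exponent is of order λ⁴/n^{1/3}, so the convergence, while genuine, is slow.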

Remark 2.3.
Intuitively, this mod-Gaussian convergence suggests approximating the distribution of the renormalized partial sums of (X_n)_{n≥1} by the convolution of a Gaussian density and a function H_{k,c_{k+1}} whose Fourier transform is λ ↦ e^{(iλ)^{k+1} c_{k+1}/(k+1)!}. The function H_{k,c_{k+1}} is not a probability density, since it takes some negative values; it appears in a paper by Diaconis and Saloff-Coste [7] on convolutions of measures on Z.
Proof. One can assume r ∈ (k + 1, k + 2); the L^r assumption then gives a Taylor expansion of order k + 1, with controlled remainder, for the characteristic function near zero. Hence, if φ denotes the characteristic function of X_1, and (µ_j)_{0≤j≤k+1} its first moments, one has, when λ goes to zero,

   φ(λ) = Σ_{j=0}^{k+1} µ_j (iλ)^j/j! + o(|λ|^{k+1}).

Now, (µ_j)_{0≤j≤k} and µ_{k+1} − c_{k+1} are also the first moments of a standard Gaussian variable; hence, one deduces that, for fixed λ, (2.3) holds. Now, the left-hand side of (2.3) is precisely the renormalized characteristic function of (X_1 + · · · + X_n)/n^{1/(k+1)} appearing in the definition of mod-Gaussian convergence, which completes the proof.

Remark 2.5. One may in fact go further in the cumulant approach to mod-Gaussian convergence, for an arbitrary sequence of random variables (X_n). In this case, the cumulants depend on n, and one needs to control the growth of the cumulants of order k higher than 2 as functions of (k, n).
This approach is useful in some combinatorial frameworks, and under some analytic assumptions one can deduce precise large deviation estimates (with a good control on the error terms) from mod-* convergence. This is the topic of a forthcoming work.

3 Further examples and remarks
All the limiting functions obtained from mod-* convergence are functions in the space S; in other words, they can always be obtained as mod-Gaussian limits. The functions of S can also be viewed as the Fourier transforms of a special kind of distributions. Indeed, let E be the space of functions from R to R generated by the functions x ↦ cos(µx) for µ ≥ 0 and x ↦ sin(µx) for µ > 0. These functions form a basis of E. Indeed, suppose that for some p ≥ 0, q ≥ 0, real numbers µ_1 > µ_2 > · · · > µ_p ≥ 0, µ'_1 > · · · > µ'_q > 0 and nonzero real coefficients α_1, . . . , α_p, α'_1, . . . , α'_q, the function

   g(x) = α_1 cos(µ_1 x) + · · · + α_p cos(µ_p x) + α'_1 sin(µ'_1 x) + · · · + α'_q sin(µ'_q x)

vanishes for all x ∈ R. Then it vanishes for all x ∈ C, since it is an entire function. If p ≥ 1, then for y real and tending to infinity, ℜ(g(iy)) = (α_1/2) e^{µ_1 y} + o(e^{µ_1 y}) when µ_1 > 0 (and ℜ(g(iy)) = α_1 + o(1) when µ_1 = 0), and if q ≥ 1, ℑ(g(iy)) = (α'_1/2) e^{µ'_1 y} + o(e^{µ'_1 y}); in either case g(iy) does not vanish for y large, which contradicts the fact that g is identically zero. One can then define the distributions with space of test functions E as the linear forms on this space. The following result clearly holds:

Lemma 3.1. A distribution D, defined as a linear form on E, is characterized by its values ψ_D(µ) at the functions x ↦ cos(µx) (µ ≥ 0), and ψ'_D(µ) at the functions x ↦ sin(µx) (µ > 0). Moreover, if the distributions are canonically extended to complex test functions, then for λ ∈ R, the image of x ↦ e^{iλx} by D is given by

   φ_D(λ) = ψ_D(|λ|) + i sgn(λ) ψ'_D(|λ|).

The function φ_D can be viewed as the Fourier transform of D. The following also holds: the map D ↦ (ψ_D, ψ'_D) is a bijection between the space of distributions on E and F_1 × F_2, where F_1 is the space of functions from R_+ to R and F_2 is the space of functions from R*_+ to R; and if G denotes the space of functions φ from R to C such that φ(−λ) is the complex conjugate of φ(λ), then the space of distributions on E is in bijection with G, via the Fourier transform D ↦ φ_D.
Since S is included in G, the functions in S can be viewed, via inverse Fourier transform, as distributions with test space E. Note that these distributions can be very singular, since we have a priori no control on the behavior of their Fourier transforms at infinity: in general, they cannot be identified with tempered distributions in the usual sense. If D_1 and D_2 are two distributions with test space E, one can define their convolution D_1 * D_2 as the distribution whose Fourier transform is the product φ_{D_1} φ_{D_2}. If φ_{D_2} vanishes nowhere, then the deconvolution of D_1 by D_2 is the unique distribution D such that D * D_2 = D_1: one has φ_D = φ_{D_1}/φ_{D_2}. Now, mod-Gaussian convergence can be interpreted as follows: if the sequence of laws (L_n)_{n≥1} converges in the mod-Gaussian sense, with the sequences of parameters (m_n)_{n≥1} and (σ_n²)_{n≥1}, to a function φ ∈ S, then the deconvolution of L_n by the Gaussian distribution N(m_n, σ_n²) converges, in the sense of distributions with test space E, to the inverse Fourier transform of φ when n goes to infinity.
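This deconvolution reading can be checked on a toy example (ours, not from the text): if Z is Bernoulli(1/2) and G an independent centered Gaussian variable of variance σ², then dividing the empirical characteristic function of Z + G by exp(−σ²λ²/2) approximately recovers the characteristic function (1 + e^{iλ})/2 of Z alone.

```python
import cmath
import random

def empirical_deconvolved_char(lam, sigma, n_samples=200_000, seed=1):
    """Monte Carlo estimate of E[exp(i*lam*(Z + G))] for Z ~ Bernoulli(1/2)
    and an independent G ~ N(0, sigma^2), deconvolved by the Gaussian,
    i.e. divided by exp(-sigma^2*lam^2/2)."""
    rng = random.Random(seed)
    acc = 0 + 0j
    for _ in range(n_samples):
        x = rng.randint(0, 1) + rng.gauss(0.0, sigma)
        acc += cmath.exp(1j * lam * x)
    return (acc / n_samples) * cmath.exp(sigma ** 2 * lam ** 2 / 2)
```

Note that the division by the Gaussian characteristic function amplifies the Monte Carlo noise by exp(σ²λ²/2), which is a quantitative echo of the singular nature of these deconvolved distributions.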
On the other hand, it is possible to enlarge the space of possible limiting functions of mod-* convergence by considering a weaker notion of convergence. For a > 0, we say that (X_n) converges in the a-mod-* sense if the renormalized characteristic functions converge, locally uniformly, on the interval (−a, a) only, and we let S_a be the set of continuous functions φ from (−a, a) to C such that φ(0) = 1 and φ(−λ) is the complex conjugate of φ(λ) for all λ ∈ (−a, a). The following analogue of Theorem 1.2 then holds: every a-mod-* limit lies in S_a and, moreover, all the functions in S_a can be obtained as a-mod-Gaussian limits.
Proof. It is clear that all the a-mod-* limits are in S_a. Conversely, by Theorem 1.2, all the restrictions to (−a, a) of functions in S can be obtained as a-mod-Gaussian limits, and the set of functions obtained in this way is dense in S_a for the uniform convergence on compact subsets of (−a, a). For a concrete example of a-mod-Gaussian convergence, we refer to Example 4 from [6], which is taken from random matrix theory and is essentially due to Wieand [11]. Let T_N ∈ U(N) be a random unitary matrix, Haar distributed; all its eigenvalues then lie on the unit circle. We consider the integer-valued random variable counting the number of eigenvalues lying in some fixed arc of the unit circle. More precisely, let γ ∈ (0, 1/2) and let I = {e^{2iπθ} : |θ| ≤ γ}.
Then define X_N to be the number of eigenvalues of T_N in I. Using asymptotics of Toeplitz determinants with discontinuous symbols, one can show that, as N → ∞ and for all |t| < π, a mod-Gaussian convergence holds, with a limiting function expressed in terms of the Barnes double Gamma function G. The restriction on t is necessary, since the characteristic function of X_N is 2π-periodic. We would now like to report on two interesting examples of a-mod-Cauchy convergence related to arithmetic, which are in fact reinterpretations of results of Vardi [10] and of Sarnak [9].
First, recall that a Cauchy variable with parameter γ > 0 is one with law given by the density γ/(π(γ² + x²)) dx, whose characteristic function is t ↦ e^{−γ|t|}. The most natural definition of mod-Cauchy convergence would then be that (X_N) converges in the mod-Cauchy sense with parameters (γ_N) and limiting function Φ if we have

   lim_{N→+∞} exp(γ_N |t|) E[e^{itX_N}] = Φ(t)

and the limit is locally uniform in t (so Φ is continuous and Φ(0) = 1). Let us say that we have a-mod-Cauchy convergence if the limit above exists, locally uniformly, for |t| < a, for some a > 0. This restricted convergence is sufficient to ensure the following:

Fact 3.6. If (X_N) converges in the a-mod-Cauchy sense with parameters (γ_N) tending to infinity, for some a > 0, then X_N/γ_N converges in law to a standard Cauchy variable.

We now detail our two examples of a-mod-Cauchy convergence from number theory. Note that, although they seem to involve very different objects, they are in fact closely related through the way they are proved, using spectral theory for certain differential operators involving complex multiplier systems on the modular surface SL(2, Z)\H.
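Here is a toy illustration (ours) of these definitions: take X_N = C_N + Y, with C_N Cauchy of parameter γ_N = N and Y an independent standard Laplace variable, whose characteristic function is 1/(1 + t²). The renormalized quantity is then exactly Φ(t) = 1/(1 + t²) for every N, and X_N/γ_N converges in law to a standard Cauchy variable, as Fact 3.6 predicts.

```python
import math

def char_X(N, t):
    """E[exp(i*t*X_N)] for X_N = C_N + Y: C_N Cauchy with parameter N and
    Y an independent standard Laplace variable (both factors are real)."""
    return math.exp(-N * abs(t)) / (1 + t ** 2)

def renormalized(N, t):
    """The mod-Cauchy renormalization exp(gamma_N*|t|) * E[exp(i*t*X_N)];
    here it equals 1/(1 + t^2) exactly, for every N."""
    return math.exp(N * abs(t)) * char_X(N, t)

def char_rescaled(N, t):
    """Characteristic function of X_N / gamma_N; it tends to exp(-|t|),
    the standard Cauchy characteristic function, as N grows."""
    return char_X(N, t / N)
```

In this toy case the limiting function is attained exactly; in the arithmetic examples below, the convergence only holds on a restricted interval of t.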
The latter is obtained as a consequence (using the Fact above) of a restricted mod-Cauchy convergence.
Proof. This follows from [10, Prop. 2], after minor notational adjustments. In particular: Vardi uses 2πr instead of t; the case t = 0 is omitted in Vardi's statement, but it is trivial; and only the case 0 < r < 1 is mentioned, but there is a symmetry r ↔ −r (see [10, p. 7]) that extends the result to −1 < r ≤ 0.
We see from the error term that the formula gives, in fact, only restricted convergence, with a well-defined limit for |t| < 4π/3. It is not clear on theoretical grounds whether this restriction is optimal (note also the pole of the first factor of Φ(t) at t = ±4π), but the numerical experiments summarized in Figures 1 to 4, which illustrate the behavior of the renormalized averages for N ≤ 5000 and t ∈ {π/2, π, 2π, 4π}, tend to indicate that there is no limit when t is large (note in particular the y-scale of the last picture).
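For readers who wish to reproduce such experiments, here is a minimal exact implementation (ours) of the classical Dedekind sum s(h, k) = Σ_{j=1}^{k−1} ((j/k)) ((hj/k)), where ((x)) is the sawtooth function; the precise normalization entering Vardi's result may differ from this classical one.

```python
from fractions import Fraction

def sawtooth(x):
    """((x)): equal to 0 at integers, and to x - floor(x) - 1/2 otherwise
    (x given as a Fraction)."""
    if x == int(x):
        return Fraction(0)
    return x - (x.numerator // x.denominator) - Fraction(1, 2)

def dedekind_sum(h, k):
    """Classical Dedekind sum s(h, k), computed exactly in Q."""
    return sum(
        sawtooth(Fraction(j, k)) * sawtooth(Fraction(h * j, k))
        for j in range(1, k)
    )
```

A convenient correctness check is Dedekind's reciprocity law: for coprime h, k > 0, s(h, k) + s(k, h) = −1/4 + (h/k + k/h + 1/(hk))/12.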
Concerning the limiting function, recall that the measure (3/π) dx dy/y² is a probability measure on the modular surface, so that Φ(t) (surprisingly?) involves the inverse of a Laplace transform of the distribution function of log(y|η(z)|⁴).
where C runs over the set Π of prime closed geodesics in SL(2, Z)\H and lk(k_C) is the linking number of a knot associated with C and the trefoil knot, the relation coming from an identification of the homogeneous space SL(2, Z)\SL(2, R) with the complement in S³ of the trefoil knot. This is also accessible more concretely through the classical identification of Π with the set of primitive (i.e., not of the form g^n, n ≥ 2) hyperbolic (i.e., with |Tr(g)| > 2) conjugacy classes in SL(2, Z). In this identification C ↔ g, one has lk(k_C) = ψ(g), where ψ : PSL(2, Z) → Z is a fairly classical map (called the Rademacher map), which is not a homomorphism but a "quasi-homomorphism" (namely, the map (g, h) ↦ ψ(gh) − ψ(g) − ψ(h) is bounded on PSL(2, Z)²). In turn, this function ψ is related to the multiplier system of the η function. Now, for x > 0, let Π_x = {g ∈ Π | N(g) ≤ x}, where the "norm" N(g) is defined, and related to the length ℓ(g) of the closed geodesic, by

   N(g) = ((Tr(g) + √(Tr(g)² − 4))/2)²,   ℓ(g) = log N(g).
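The relation between the norm, the trace and the geodesic length can be sketched numerically (our snippet): for a hyperbolic class of trace t > 2, the larger eigenvalue is λ = (t + √(t² − 4))/2, N(g) = λ², and ℓ(g) = log N(g) coincides with the hyperbolic length 2 arccosh(t/2).

```python
import math

def norm_and_length(trace):
    """N(g) and l(g) = log N(g) for a hyperbolic conjugacy class of trace
    t > 2: N(g) = ((t + sqrt(t^2 - 4))/2)^2."""
    lam = (trace + math.sqrt(trace ** 2 - 4)) / 2   # larger eigenvalue
    N = lam ** 2
    return N, math.log(N)
```

For instance, trace 3 gives λ = (3 + √5)/2 (the square of the golden ratio) and ℓ(g) = 2 arccosh(3/2).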
Let P_x denote the probability measure in which each g ∈ Π_x has weight proportional to ℓ(g); the normalizing factor ensuring that it is a probability measure is

   Σ_{g ∈ Π_x} log N(g) ∼ x

as x → +∞, by Selberg's Prime Geodesic Theorem (this can be made much more precise; see e.g. [3]). Let E_x denote the corresponding expectation operator.
Again, if one looks at the proof, one sees that this is deduced from the following:

Theorem 3.10 (Sarnak). Let lk_x denote the random variable g ↦ lk(g) = ψ(g) on Π_x. Then for |t| ≤ π/12, we have

   E_x(e^{it·lk_x}) = exp(−|t| γ_x) Φ_1(t) + O(x^{−3/4}),

where γ_x = (3/π) log x and Φ_1(t) = 1.

Proof. Again, up to notational changes, this is given by [9, (16)], since the quantity v_r(γ) there is given by v_r(g) = e^{iπrψ(g)/6} for g ∈ Π and r ∈ R. So the r in loc. cit. is related to our variable by r = 6t/π, which recovers our formulation.

This is again an example of restricted mod-Cauchy convergence, and again we do not know how far the restriction on t is necessary. One may of course perform a summation by parts to remove the weight log N(g) = ℓ(g) from these results, if desired.