1 Introduction

1.1 Background

The statistics of gaps between energy levels in the semiclassical limit is a central problem in the theory of spectral statistics [6, 15]. The Berry–Tabor conjecture [3] asserts that (typical) integrable systems have Poisson spacing statistics, and the Bohigas–Giannoni–Schmit conjecture [7] asserts that (generic) chaotic systems should have spacing statistics given by some ensemble of random matrix theory; in particular small gaps are unlikely.

More precisely, with \(\{ \lambda _{1} \le \lambda _{2} \le \ldots \}\) denoting the energy levels, suitably unfolded using the main term in Weyl’s law so that \(|\{ i : \lambda _{i} < E \}| \sim E\) for E large, define consecutive gaps, or spacings, \(s_{i} := \lambda _{i+1}-\lambda _{i}\). The level spacing distribution P(s), if it exists, is defined by

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{ |\{i \le N : s_{i} < x \}| }{N} = \int _{0}^{x} P(s) \, {\mathrm{d}}s \end{aligned}$$

for all \(x \ge 0\). For Poisson spacing statistics, \(P(s) = e^{-s}\), whereas (time reversible) chaotic systems should have Gaussian Orthogonal Ensemble (GOE) spacings, where \(P(s) \approx \pi s/2\) for s small, and \(P(s) \approx (\pi s/2) \exp ( - \pi s^{2}/4)\) for s large; in particular, there is linear vanishing at \(s=0\) (“level repulsion”).
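To make the contrast at small gaps explicit, one can integrate the two model densities. The following is only an illustration: it uses the Wigner-type expression \((\pi s/2) \exp ( - \pi s^{2}/4)\) quoted above as a stand-in for the GOE density for all s, which is an assumption made for this computation.

$$\begin{aligned} \int _{0}^{x} e^{-s} \, {\mathrm{d}}s = 1 - e^{-x} = x + O(x^{2}), \qquad \int _{0}^{x} \frac{\pi s}{2} e^{-\pi s^{2}/4} \, {\mathrm{d}}s = 1 - e^{-\pi x^{2}/4} = \frac{\pi x^{2}}{4} + O(x^{4}). \end{aligned}$$

Thus the proportion of gaps below x vanishes linearly in x for the Poisson model and quadratically for the GOE-type model.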

1.1.1 Systems with Intermediate Statistics

There are also “pseudo-integrable” systems that are neither integrable nor chaotic. Their spectral statistics do not fall into the models described above and are believed to exhibit “intermediate statistics”, e.g., there is level repulsion as for random matrix theory systems, whereas P(s) has exponential tail decay similar to Poisson statistics (cf. [5]).

The point scatterer, or the Laplacian perturbed by a delta potential, for rectangular domains (i.e., in dimension \(d=2\), with Dirichlet boundary conditions) was introduced by Šeba [18] as a model for investigating the transition between integrability and chaos in quantum systems. For this model, Šeba found evidence for level repulsion of GOE type for small gaps as well as “wave chaos”, in particular Gaussian value distribution of eigenfunctions.

Shigehara [19] later pointed out that level repulsion in dimension two only occurs if the “strength” of the coupling of the perturbation grows logarithmically with the eigenvalue \(\lambda \). (The perturbation is formally defined using von Neumann’s theory of self-adjoint extensions; in this setting, there is a one-parameter family of extensions, but any fixed parameter choice turns out to result in a “weak coupling limit” with no level repulsion, cf. [21, Section 3].) On the other hand, for dimension \(d=3\), Cheon and Shigehara [20] found GOE-type level repulsion for fixed (nontrivial) self-adjoint extensions for rectangular boxes, again with Dirichlet boundary conditions and the scatterer placed at the center of the box; placing the scatterer elsewhere appeared to weaken the repulsion.

On the assumption that the unperturbed spectrum has Poisson statistics (which, after desymmetrization, is expected to hold for rectangular billiards having Diophantine aspect ratio), GOE-type level repulsion and tails of Poisson type have been shown to hold by Bogomolny, Gerland and Schmit [4, 5]. In particular, for Šeba billiards with periodic boundary conditions, \(P(s) \sim (\pi \sqrt{3}/2) s\) (for s small), whereas \(P(s) \sim (1/8 \pi ^3) s \log ^{4} s\) in case of Dirichlet boundary condition and generic scatterer position.

1.1.2 Toral Point Scatterers

With \({\mathbb T}\) denoting a torus of dimension \(d=2\) or \(d=3\), a point scatterer on \({\mathbb T}\) is formally given by the Hamiltonian

$$\begin{aligned} H = H_{x_{0},\alpha } = - \Delta + \alpha \delta _{x_{0}} \end{aligned}$$

where \(\Delta \) is the Laplace operator acting on \(L^{2}({\mathbb T})\), \(x_{0} \in {\mathbb T}\) is the location of the point scatterer, and \(\alpha \ne 0 \) can be viewed as the “strength” of the perturbation; in the physics literature, \(\alpha \) is known as the coupling constant. For simpler notation we shall, without loss of generality, from here on assume that \(x_{0} =0\). A mathematically rigorous definition of H can be made via von Neumann’s theory of self-adjoint extensions. Below we will briefly summarize the most important properties and refer the reader to [1, 16, 21] for detailed discussions.

The addition of a \(\delta \)-potential is a rank-1 perturbation of the Laplacian, and the spectrum of H consists of two kinds of eigenvalues: “old” and “new” eigenvalues. Namely, an eigenvalue \(\lambda \) of the unperturbed Laplacian is also an eigenvalue of H, but with multiplicity reduced by one; the corresponding H-eigenspace is the linear subspace of Laplace eigenfunctions vanishing at \(x_{0}\). The set of new eigenvalues all have multiplicity one and strictly interlace with the set of unperturbed eigenvalues.

For generic tori, multiplicities of the unperturbed spectrum are easily seen to be bounded, but for arithmetic tori such as \({\mathbb T}={\mathbb R}^{2}/{\mathbb Z}^2\) or \({\mathbb T}={\mathbb R}^{3}/{\mathbb Z}^3\), multiplicities are unbounded and the set of new eigenvalues consists of a zero density subset of the full spectrum of H. This implies a singular level spacing distribution, namely \(P(s) = \delta (s)\). To avoid this degeneracy, it is natural to focus on the spectral statistics of the sequence formed by the new eigenvalues.

In dimensions two and three, Rudnick and Ueberschär [16] used trace formula techniques to investigate the spacing distribution for toral point scatterers. For \(d=2\), for a fixed self-adjoint extension (resulting in a weak coupling limit) they showed that the spectral statistics of the new eigenvalues is the same as for the unperturbed spectrum after removing multiplicities. We remark that in the Diophantine aspect ratio case it is believed that the unperturbed spectrum, after removing some systematic (and bounded) multiplicities, is of Poisson type; for partial results in this direction cf. [8, 17]. Further, for the square torus \({\mathbb R}^2/{\mathbb Z}^2\), Poisson statistics is known to hold [9] for the unperturbed spectrum, again after removing (unbounded) multiplicities, and assuming certain analogs of the Hardy–Littlewood prime k-tuple conjecture for sums of two integer squares.

For \(d=3\), a fixed self-adjoint extension results in a strong coupling limit, and here Rudnick–Ueberschär gave evidence for level repulsion: The mean displacement between old and new eigenvalues was shown to equal half the mean spacing between the old eigenvalues. However, the method does not rule out the level spacing distribution having (say) positive mass at \(s=0\).

1.2 Results

The purpose of this paper is to show that there is strong level repulsion between the set of new eigenvalues for point scatterers on arithmetic tori in dimension three. To state our main result, we need to describe some basic properties of the model.

1.2.1 Toral Point Scatterers for Arithmetic Tori

Let \({\mathbb T}:= {\mathbb R}^3/(2\pi {\mathbb Z}^3)\) denote the standard flat torus in dimension three. The spectrum of the unperturbed Laplacian on \({\mathbb T}\) is arithmetic in nature, and given by \(\{m \in {\mathbb Z}: r_{3}(m) > 0 \}\), where

$$\begin{aligned} r_{3}(m) := | \{ v \in {\mathbb Z}^3 : |v|^{2} = m \}| \end{aligned}$$

denotes the number of ways \(m \in {\mathbb Z}\) can be written as a sum of three integer squares, and the multiplicity of each eigenvalue m is given by \(r_{3}(m)\). Associated with each old/unperturbed eigenvalue m, there is a new eigenvalue \(\lambda _{m}\) (of multiplicity one) of H, and the set of new eigenvalues \(\{ \lambda _{m} : m \in {\mathbb N}, r_{3}(m)>0 \}\) interlace between the old eigenvalues. More precisely, the corresponding new eigenfunction \(\psi _{\lambda _{m}}(x)\) is given by the Green’s function

$$\begin{aligned} \sum _{v \in {\mathbb Z}^3} \frac{e^{ i v \cdot x}}{|v|^{2}-\lambda _{m}} , \quad x \in {\mathbb T}, \end{aligned}$$

(in \(L^{2}\) sense), with the new eigenvalue \(\lambda _{m}\) being a solution to the spectral equation

$$\begin{aligned} G(\lambda ) = \frac{1}{\nu }; \quad G(\lambda ) := \sum _{n} r_{3}(n) \left( \frac{1}{n - \lambda } - \frac{n}{n^{2}+1} \right) , \end{aligned}$$
(1)

where \(\nu \ne 0\) parametrizes the self-adjoint extension; in the physics literature, it is known as the “formal strength” of the perturbation (cf. [20, Eq. (4)]).

It is convenient to use the following labeling of the new spectrum: given \(m \ge 1\) such that \(r_{3}(m)>0\), let \(\lambda _{m}\) denote the largest solution to \(G(\lambda ) = 1/\nu \) such that \(\lambda < m\). (In particular, note that \(\lambda _{m}\) does not denote the m-th new eigenvalue since \(r_{3}(m)=0\) for a positive proportion of integers.) For m such that \(r_{3}(m)>0\), let \(m_{+}\) denote the smallest integer \(n > m\) such that \(r_{3}(n)>0\), and define the consecutive spacing

$$\begin{aligned} s_{m} := \lambda _{m_{+}} - \lambda _{m}. \end{aligned}$$

Note that \(\{ s_{m} \}_{m : r_{3}(m)>0}\) has mean 6/5 (rather than one), but as we are concerned mainly with the frequency of very small spacings this shall not concern us.
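The labeling of the new eigenvalues and the spacings \(s_{m}\) can be explored numerically. The sketch below is not the numerics behind the paper's figures: it truncates the sum defining \(G(\lambda )\) at a hypothetical cutoff `CUTOFF`, fixes a hypothetical value `INV_NU` for \(1/\nu \), and finds the root in each interval between consecutive old eigenvalues by bisection. Truncation shifts \(G\) by a slowly varying amount, so the computed roots only approximate the true \(\lambda _{m}\); the interlacing itself holds for the truncated sum as well, since each term is increasing in \(\lambda \).

```python
from math import isqrt


def r3_table(limit):
    """Return [r_3(0), ..., r_3(limit)] by adding squares one coordinate at a time."""
    squares = [x * x for x in range(-isqrt(limit), isqrt(limit) + 1)]
    r2 = [0] * (limit + 1)
    for a in squares:
        for b in squares:
            if a + b <= limit:
                r2[a + b] += 1
    r3 = [0] * (limit + 1)
    for n, count in enumerate(r2):
        if count:
            for a in squares:
                if n + a <= limit:
                    r3[n + a] += count
    return r3


CUTOFF = 2000   # hypothetical truncation point for the sum over n
M_MAX = 30      # only old eigenvalues up to this height are considered
INV_NU = 1.0    # hypothetical choice of 1/nu

r3 = r3_table(CUTOFF)
terms = [(n, r) for n, r in enumerate(r3) if r > 0]


def G(lam):
    """Truncated version of G(lambda) from the spectral equation (1)."""
    return sum(r * (1.0 / (n - lam) - n / (n * n + 1.0)) for n, r in terms)


def root_between(left, right, iters=50):
    """Bisection for G = 1/nu on (left, right), where G increases from -inf to +inf."""
    lo, hi = float(left), float(right)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(mid) < INV_NU:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


old = [n for n in range(M_MAX + 1) if r3[n] > 0]           # old eigenvalues, starting at 0
new = [root_between(a, b) for a, b in zip(old, old[1:])]    # lambda_m for m = old[1], old[2], ...
spacings = [b - a for a, b in zip(new, new[1:])]            # s_m = lambda_{m_+} - lambda_m

for m, s in zip(old[1:], spacings):
    print(m, round(s, 4))
```

Bisection is used here because it needs no derivative information and cannot leave the interval between two poles; a faster root finder would work equally well.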

1.2.2 Statement of the Main Result

We show that the cumulant of the nearest-neighbor distribution essentially has fourth order vanishing near the origin, and hence considerably stronger repulsion than the quadratic order vanishing of the cumulant in the GOE-model.

Theorem 1

Given \(\epsilon ,\gamma \in (0,1/2)\), there exists \(X=X(\epsilon ,\gamma )>0\) such that for all \(x \ge X\),

$$\begin{aligned} \frac{|\{ m \le x : r_{3}(m)>0, s_{m} < \epsilon \} |}{ |\{ m \le x : r_{3}(m) >0 \}|} = O_{\gamma }(\epsilon ^{4-\gamma }). \end{aligned}$$

1.3 Discussion

As Figs. 1 and 2 indicate, spectral gap statistics for 3d arithmetic point scatterers is clearly non-generic since there is no mass at all in the tail. The reason is the “old” spectrum being very rigid—all positive integers n except the ones ruled out by simple congruence conditions (i.e., n’s of the form \(n = 4^{k} \cdot m\) for \(m \equiv 7 \mod 8\)) satisfy \(r_{3}(n)>0\); hence, the gaps between new eigenvalues are easily seen to be bounded above by 4.
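The rigidity of the old spectrum is easy to check by machine. The sketch below assumes only Legendre's criterion for sums of three squares; the search bound `LIMIT` is an arbitrary choice. It reports the largest gap between consecutive old eigenvalues, the largest value of \(m_{+} - m_{-}\) (with \(m_{-}\) a name introduced here for the old eigenvalue preceding m), and the density of the old spectrum. Since \(\lambda _{m}\) lies between \(m_{-}\) and m, and \(\lambda _{m_{+}}\) lies between m and \(m_{+}\), the second quantity bounds \(s_{m}\) from above.

```python
def is_sum_of_three_squares(n: int) -> bool:
    """Legendre's criterion: n >= 1 is a sum of three squares unless n = 4^a (8k + 7)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 != 7


LIMIT = 100_000  # arbitrary search bound
old = [n for n in range(1, LIMIT + 1) if is_sum_of_three_squares(n)]

# Largest gap between consecutive old eigenvalues.
max_old_gap = max(b - a for a, b in zip(old, old[1:]))
# Largest distance m_+ - m_- covered by two adjacent gaps.
max_span = max(c - a for a, c in zip(old, old[2:]))
# Proportion of integers up to LIMIT that are old eigenvalues.
density = len(old) / LIMIT

print(max_old_gap, max_span, round(density, 4))
```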

Fig. 1

Histogram illustration of the distribution of \(s_{m}\), for \(m \le 10{,}000\) (and \(r_{3}(m)>0\))

The main driving force of the occurrence of small gaps, say between two new eigenvalues \(\lambda _{m}\) and \(\lambda _{m_{+}}\), is arithmetic in nature and mainly due to fluctuations in \(r_{3}(m)\). To see this, we note the following simple consequence of the mean value theorem.

Lemma 2

Assume that \(A,B >0\). Let \(f(\delta )\) be a smooth function on \([-1/2,1/2]\) such that \(0 < f'(\delta ) \le B\) for \(|\delta | \le 1/2\). If the equation

$$\begin{aligned} f(\delta ) = A/\delta \end{aligned}$$

has two roots \(\delta _{1} \in [-1/2,0)\) and \(\delta _{2} \in (0,1/2]\), then

$$\begin{aligned} |\delta _{2}-\delta _{1} | \gg \sqrt{A/B}. \end{aligned}$$
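One way to see this, written out here as a sketch, combines the mean value theorem with the elementary inequality \(1/a + 1/b \ge 4/(a+b)\) for \(a,b>0\). Since \(\delta _{1}< 0 < \delta _{2}\),

$$\begin{aligned} B (\delta _{2} - \delta _{1}) \ge f(\delta _{2}) - f(\delta _{1}) = \frac{A}{\delta _{2}} + \frac{A}{|\delta _{1}|} \ge \frac{4A}{\delta _{2} + |\delta _{1}|} = \frac{4A}{\delta _{2} - \delta _{1}}, \end{aligned}$$

so that \((\delta _{2}-\delta _{1})^{2} \ge 4A/B\). Under the stated hypotheses this gives the lemma with implied constant 2.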

Now, \(\lambda _{m}\) and \(\lambda _{m_{+}}\) are consecutive roots, near \(\lambda = m\), of the spectral equation (cf. (1))

$$\begin{aligned} \sum _{n} r_{3}(n) \left( \frac{1}{n-\lambda } - \frac{n}{n^{2}+1} \right) = 1/\nu \end{aligned}$$

Writing \(\lambda = m + \delta \), we can apply Lemma 2 with \(A = r_{3}(m)\) and

$$\begin{aligned} B = \max _{|\delta | \le 1/2} \sum _{n \ne m} \frac{r_{3}(n) }{(n-m-\delta )^{2}} \ll \sum _{k \in {\mathbb Z}, k \ne 0} \frac{r_{3}(m+k)}{k^{2}} \end{aligned}$$

and deduce that \(s_{m} \gg \sqrt{r_{3}(m)/B}\). One then (roughly) proceeds by noting that B is very rarely \(\gg \sqrt{m}\) (cf. Lemma 7), and that \(r_{3}(m)\) is very rarely \(\ll \sqrt{m}/2^{l}\) (cf. Propositions 3 and 4), with l denoting the largest integer such that \(4^{l} | m\).

In particular, the small gap repulsion is not due to the lack of time reversal symmetry, but rather due to \(r_{3}(m)/\sqrt{m}\) being small extremely rarely unless \(4^{l} | m\) for some high exponent l. In fact, as indicated by the plots below (as well as by the proof of Theorem 1), small gap occurrences are mainly due to integers m that are divisible by large powers of 4.

Fig. 2

Histogram illustration of the distribution of \(s_{m}\) (for m such that \(r_{3}(m)>0\)), along the progressions \(\{m = 4^{10}\cdot k : k \le 10{,}000\}\) (left) and \(\{m = 4^{20}\cdot k : k \le 10{,}000\}\) (right)

Likely the true order of vanishing of the cumulant is higher: a heuristic argument, assuming that \(G_m(0)\) and \(G_m'(0)\) are (weakly) independent and that \(G_m(0)\) has a smooth and nonvanishing probability density function, suggests sixth order vanishing of the cumulant.

Determining the spacing statistics for the set of new eigenvalues associated with toral point scatterers on the two-dimensional torus \({\mathbb R}^2/{\mathbb Z}^2\) would also be interesting. However, there are several obstacles that must be overcome: At the rigorous level, we know little about the spacings between eigenvalues of the unperturbed Laplacian, and fluctuations of \(r_2(n)\) are quite wild, e.g., \(r_2(n)=0\) for n in a full density subset of the positive integers, and the normal order of \(r_2(n)\) is much smaller than the average order when conditioning on \(r_2(n)>0\). Even obtaining convincing numerics is challenging—exploring ranges where the mean spacing is of size 50 involves factoring many integers having hundreds of digits. (Finding \(r_2(n)\), or even determining if \(r_2(n)>0\), seems very hard to do without being able to factor n for a reasonably numerous set of “difficult” n’s.)

2 Number Theoretic Background

The purpose of this section is to give bounds on how often \(r_{3}(n)\) is “unusually” small or large. We begin by recalling various useful results for \(r_{3}(n)\). A classical result of Legendre (cf. [11, Chapter 3.1]) asserts that \(r_{3}(n) \ne 0\) if and only if n is not of the form \(4^{a} (8k+7)\) for \(a,k \in {\mathbb Z}_{\ge 0}\), and we also have

$$\begin{aligned} r_{3}(4^{a}n) = r_{3}(n). \end{aligned}$$
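Both facts can be confirmed for small n by brute force. The sketch below is a sanity check rather than a proof: it tabulates \(r_{3}(n)\) up to an arbitrary bound `LIMIT` by adding squares one coordinate at a time, then compares against Legendre's criterion and the invariance under multiplication by 4.

```python
from math import isqrt


def r3_table(limit):
    """Return [r_3(0), ..., r_3(limit)], counting (x, y, z) in Z^3 with x^2 + y^2 + z^2 = n."""
    squares = [x * x for x in range(-isqrt(limit), isqrt(limit) + 1)]
    r2 = [0] * (limit + 1)
    for a in squares:
        for b in squares:
            if a + b <= limit:
                r2[a + b] += 1
    r3 = [0] * (limit + 1)
    for n, count in enumerate(r2):
        if count:
            for a in squares:
                if n + a <= limit:
                    r3[n + a] += count
    return r3


def excluded_by_legendre(n):
    """True if n >= 1 has the form 4^a (8k + 7)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


LIMIT = 2000  # arbitrary bound for the check
r3 = r3_table(LIMIT)

legendre_ok = all((r3[n] == 0) == excluded_by_legendre(n) for n in range(1, LIMIT + 1))
scaling_ok = all(r3[4 * n] == r3[n] for n in range(1, LIMIT // 4 + 1))
print(legendre_ok, scaling_ok)
```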

We also remark that Heath-Brown’s estimate (cf. [13])

$$\begin{aligned} \sum _{n \le R^{2}} r_{3}(n) = \sum _{v \in {\mathbb Z}^3 : |v|^{2} \le R^{2}} 1 = \frac{4\pi }{3} R^{3} + O(R^{21/16}) \end{aligned}$$

was very helpful for the numerics behind Figs. 1 and 2.
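The size of the main term is easy to observe directly. The sketch below is an independent illustration, not the author's code: it counts lattice points in a ball of integer radius R by summing, over all (x, y), the number of admissible z, and compares the count with \(4\pi R^{3}/3\). The radii are arbitrary choices.

```python
from math import isqrt, pi


def lattice_points_in_ball(R: int) -> int:
    """Number of v in Z^3 with |v|^2 <= R^2."""
    R2 = R * R
    total = 0
    for x in range(-R, R + 1):
        for y in range(-R, R + 1):
            rem = R2 - x * x - y * y
            if rem >= 0:
                # z ranges over -floor(sqrt(rem)), ..., floor(sqrt(rem))
                total += 2 * isqrt(rem) + 1
    return total


for R in (10, 20, 50):
    count = lattice_points_in_ball(R)
    main_term = 4 * pi * R**3 / 3
    print(R, count, round(count - main_term, 1))
```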

2.1 Sums of Three Squares and Values of L-Functions

With \(R_{3}(n)\) denoting the number of primitive representations of n, i.e., the number of ways to write \(n = x^{2}+y^{2}+z^{2}\) for x, y, z coprime, we have the following basic identity:

$$\begin{aligned} r_{3}(n) = \sum _{d^2|n} R_{3}(n/d^{2}). \end{aligned}$$
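The identity groups representations by the greatest common divisor d of the coordinates. As a small consistency check (a brute-force sketch with an arbitrary bound `LIMIT`), one can tabulate both counts and compare the two sides for every \(n \ge 1\) in range.

```python
from math import gcd, isqrt


def representation_counts(limit):
    """Return (r3, R3): all and primitive representation counts for 0 <= n <= limit."""
    B = isqrt(limit)
    r3 = [0] * (limit + 1)
    R3 = [0] * (limit + 1)
    for x in range(-B, B + 1):
        for y in range(-B, B + 1):
            for z in range(-B, B + 1):
                n = x * x + y * y + z * z
                if n <= limit:
                    r3[n] += 1
                    if gcd(gcd(x, y), z) == 1:
                        R3[n] += 1
    return r3, R3


LIMIT = 300  # arbitrary bound for the check
r3, R3 = representation_counts(LIMIT)


def sum_over_square_divisors(n):
    """Right-hand side: sum of R_3(n / d^2) over d >= 1 with d^2 dividing n."""
    return sum(R3[n // (d * d)] for d in range(1, isqrt(n) + 1) if n % (d * d) == 0)


identity_ok = all(r3[n] == sum_over_square_divisors(n) for n in range(1, LIMIT + 1))
print(identity_ok, r3[9], R3[9])
```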

The reason for introducing \(R_{3}(n)\) is Gauss’ marvelous identity (cf. [11, Ch. 4.8])

$$\begin{aligned} R_{3}(n) = \pi ^{-1} \mu _{n} \sqrt{n} L(1,\chi _{-4n}) \end{aligned}$$

where \(\mu _{n} = 0\) for \(n \equiv 0,4,7 \mod 8\), \(\mu _{n} = 16\) for \(n \equiv 3 \mod 8\), and \(\mu _{n} = 24\) for \(n \equiv 1,2,5,6 \mod 8\); here the Dirichlet series

$$\begin{aligned} L(s,\chi _{-4n}) := \sum _{m=1}^{\infty } \chi _{-4n}(m)/m^{s} \end{aligned}$$

converges for \({\text {Re}}(s)>0\), where \(\chi _{-4n}\) is a quadratic character on \(({\mathbb Z}/4n{\mathbb Z})^{\times }\) defined via the Kronecker symbol

$$\begin{aligned} \chi _{-4n}(m) := \left( \frac{-4n}{m} \right) \end{aligned}$$

(roughly, it can be viewed as an extension of the Jacobi symbol to even moduli, for details cf. [14, Ch. 3.5].)
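Gauss' identity can be tested numerically for small n. The sketch below makes two simplifications that are not taken from the text: for odd \(m \ge 1\) it evaluates \(\chi _{-4n}(m)\) as the Jacobi symbol \(\left( \frac{-n}{m} \right) \) (the factor 4 is a square, and the symbol vanishes for even m), and it approximates \(L(1,\chi _{-4n})\) by a partial sum with a hypothetical number of terms `terms`. The character has period 4n, so its values are tabulated once.

```python
from math import gcd, isqrt, pi, sqrt


def jacobi(a, m):
    """Jacobi symbol (a / m) for odd m >= 1."""
    a %= m
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if m % 8 in (3, 5):
                result = -result
        a, m = m, a
        if a % 4 == 3 and m % 4 == 3:
            result = -result
        a %= m
    return result if m == 1 else 0


def L1(n, terms=200_000):
    """Partial sum approximating L(1, chi_{-4n})."""
    q = 4 * n
    values = [0 if m % 2 == 0 else jacobi(-n, m) for m in range(q)]  # one full period
    return sum(values[m % q] / m for m in range(1, terms + 1))


def gauss_count(n):
    """Right-hand side of Gauss' identity, pi^{-1} mu_n sqrt(n) L(1, chi_{-4n})."""
    residue = n % 8
    mu = 0 if residue in (0, 4, 7) else 16 if residue == 3 else 24
    return 0.0 if mu == 0 else mu * sqrt(n) * L1(n) / pi


def primitive_count(n):
    """R_3(n) by brute force."""
    B = isqrt(n)
    box = range(-B, B + 1)
    return sum(1 for x in box for y in box for z in box
               if x * x + y * y + z * z == n and gcd(gcd(x, y), z) == 1)


for n in (1, 2, 3, 5, 6, 9, 10, 11, 13, 14):
    print(n, primitive_count(n), round(gauss_count(n), 3))
```

The partial sums converge slowly, so the agreement is only approximate; rounding to the nearest integer recovers the exact count in this small range.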

Now, \(-\,4n\) is not always a fundamental discriminant, but if \(R_{3}(n) >0\) and we write

$$\begin{aligned} n = c^{2} d \end{aligned}$$

where d is square free and \(4 \not \mid c^{2}d\), then \(L(1,\chi _{-4n})\) and \(L(1,\chi _{-4d})\) have the same Euler product factors, except possibly at primes p dividing \(2 c^{2}\) (to see this, use that \(\left( \frac{-4c^{2}d}{p} \right) = \left( \frac{-4d}{p} \right) \cdot \left( \frac{c}{p} \right) ^{2}\) by multiplicativity of the Kronecker symbol, together with \(\left( \frac{ c}{p} \right) = \pm 1\) for \(p \not \mid c\).)

Moreover, \(-4d\) is a fundamental discriminant if \(n \equiv 1,2,5,6 \mod 8\). If \(n \equiv 3 \mod 8\), then \(-4d\) is not a fundamental discriminant but \(-d\) is, and the Euler factors of \(L(1,\chi _{-4d})\) and \(L(1,\chi _{-d})\) are the same at all odd primes. (Recall that \(n \not \equiv 0,4,7 \mod 8\) since we assume that \(R_{3}(n) >0\).)

Thus, if \(4 \not \mid n_{0}\) and we write \(n_{0} = c^{2} d\) with d square free, we note the following useful lower bound in terms of L-functions attached to primitive characters (associated with fundamental discriminants)

$$\begin{aligned} L(1,\chi _{-4n_{0}}) \gg {\left\{ \begin{array}{ll} (\phi (c)/c) L(1, \chi _{-4d}) &{} \text { if }n_{0} \equiv 1,2,5,6 \mod 8, \\ (\phi (c)/c) L(1, \chi _{-d}) &{} \text { if }n_{0} \equiv 3 \mod 8, \end{array}\right. } \end{aligned}$$
(2)

where \(\phi \) denotes Euler’s totient function.

We next note a crucial estimate stating that \(L(1,\chi _{-4n})\) is small very rarely. With

$$\begin{aligned} FD := \{ d \in {\mathbb Z}: d <0 \text { and } d \text { is a fundamental discriminant} \}, \end{aligned}$$

the following proposition is an easy consequence of [10, Proposition 1].

Proposition 3

There exists \(\kappa >0\) such that for \(T \ge 1\), we have

$$\begin{aligned} \left| \left\{ d \le x : -d \in FD, L(1,\chi _{-d}) < \frac{\pi ^{2}}{6 e^{\gamma } T} \right\} \right| \ll x \exp (- \kappa \cdot e^{T}/T) \end{aligned}$$

as \(x \rightarrow \infty \), where \(\gamma = 0.577 \cdots \) denotes Euler’s constant.

For imprimitive quadratic characters, we will use the following weaker bound. (The loss can be controlled if we have a bound on the largest odd square divisor of n.)

Proposition 4

The number of integers \(n \le x\) of the form \(n = 4^{l} n_{0} = 4^{l} c^{2} d\), where d is square free, \(c \le C\), \(4 \not \mid c^{2}d\), and

$$\begin{aligned} L(1,\chi _{-4n_{0}}) \le 1/T \end{aligned}$$

is

$$\begin{aligned} \ll \frac{x}{4^{l}} \exp ( - (T/\log \log C)^{4}) \end{aligned}$$

for all integer \(l \ge 0\).

Proof

Using (2), together with the bound \(c/\phi (c) \ll \log \log c \le \log \log C\) (cf. [12, Theorem 328]), we find that for c, l fixed, we have (note that either \(-d\) or \(-4d\) is a fundamental discriminant)

$$\begin{aligned} |\{ n \le x : n = 4^{l} n_{0} = 4^{l} c^{2} d, L(1,\chi _{-4n_0}) \le 1/T \}|\\ \ll \left| \left\{ d \le 4x/(c^{2}4^{l}) : -d \in FD, L(1,\chi _{-d}) \ll \frac{\log \log C}{T} \right\} \right| \end{aligned}$$

which, by Proposition 3 (with plenty of room to spare), is

$$\begin{aligned} \ll x/(c^{2}4^{l}) \exp ( -(T/\log \log C)^{4}) \end{aligned}$$

as \(x \rightarrow \infty \). Summing over \(c \le C\) the proof is concluded. \(\square \)

2.2 Estimates on Moments of \(r_{3}(n)\)

We recall the following bound by Barban [2].

Theorem 5

For \(k \in {\mathbb N}\) we have

$$\begin{aligned} \sum _{1 \le d \le x} L(1,\chi _{-d})^{k} \ll _{k} x. \end{aligned}$$

We can now easily deduce bounds on the (normalized) moments of \(r_{3}(n)\).

Proposition 6

For \(k \in {\mathbb N}\) we have

$$\begin{aligned} \sum _{n \le x} (r_{3}(n)/\sqrt{n})^{k} \ll _{k} x \end{aligned}$$

Proof

Writing \(n = 4^{l} n_{0}\) so that \(4 \not \mid n_{0}\) (later we will write \(4^{l} || n\) to denote that \(4^{l}\) is the largest power of 4 that divides n), we have

$$\begin{aligned} r_{3}(n) = r_{3}(n_{0}) \end{aligned}$$

and we further recall the identities (cf. Sect. 2.1)

$$\begin{aligned} r_{3}(n) = \sum _{d^2|n} R_{3}(n/d^{2}) \end{aligned}$$

and

$$\begin{aligned} R_{3}(n) = \pi ^{-1} \mu _{n} \sqrt{n} L(1,\chi _{-4n}), \end{aligned}$$

where \(\mu _{n} = 0\) if \(n \equiv 0,4,7 \mod 8\); otherwise \(16 \le \mu _n \le 24\). Thus,

$$\begin{aligned} \sum _{n\le x} (r_{3}(n)/\sqrt{n})^{k}&= \sum _{n\le x} \left( \sum _{\begin{array}{c} d^2|n, d \equiv 1 \mod 2\\ {}4^{l}|| n \end{array}} \frac{R_{3}(n/(4^{l}d^{2}))}{\sqrt{n}} \right) ^{k}\\{}&\ll \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ {} d \text { odd} \end{array}} \sum _{\begin{array}{c} d_1,\ldots , d_{k} \\ {}[d_{1}, \ldots , d_{k}] = d \end{array}} \sum _{\begin{array}{c} n \le x : d^{2} |n \\ {} 4^{l} || n \end{array}} \prod _{i=1}^{k} \frac{R_{3}(n/(4^{l}d_{i}^{2}))}{\sqrt{n}} \\&\ll {} \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ {} d \text { odd} \end{array}} \sum _{\begin{array}{c} d_1,\ldots , d_{k} \\ {}[d_{1}, \ldots , d_{k}] = d \end{array}} \sum _{\begin{array}{c} n_{0} \le x/ d^{2}4^{l} \end{array}} \prod _{i=1}^{k} \frac{R_{3}(n_{0}d^{2}/d_{i}^{2})}{\sqrt{d^{2}4^{l}n_{0}}} \\&\ll {} \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ {} d \text { odd} \end{array}} \sum _{\begin{array}{c} d_1,\ldots , d_{k} \\ {} [d_{1}, \ldots , d_{k}] = d \end{array}} \sum _{\begin{array}{c} n_{0} \le x/ d^{2}4^{l} \end{array}} \prod _{i=1}^{k} \frac{L(1, \chi _{-4n_{0}d^{2}/d_{i}^{2}})}{d_{i}} \end{aligned}$$

which, on noting that \(L(1, \chi _{-4n_{0}d^{2}/d_{i}^{2}}) \ll L(1, \chi _{-4n_{0}}) d/\phi (d) \), is

$$\begin{aligned} \ll \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ d \text { odd} \end{array}} \sum _{\begin{array}{c} d_1,\ldots , d_{k} |d \end{array}} \frac{(d/\phi (d))^{k}}{d_{1}\cdots d_{k}} \sum _{\begin{array}{c} n_{0} \le x/ d^{2}4^{l} \end{array}} L(1, \chi _{-4n_{0}})^{k}. \end{aligned}$$
(3)

By Theorem 5, the inner sum over \(n_{0}\) is \(\ll _{k} x/(d^{2}4^{l})\), and hence (3) is

$$\begin{aligned} \ll \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ d \text { odd} \end{array}} \sum _{\begin{array}{c} d_1,\ldots , d_{k} |d \end{array}} \frac{(d/\phi (d))^{k}}{d_{1}\cdots d_{k}} \frac{x}{d^{2}4^{l}} \ll \sum _{l : 4^{l} \le x} \sum _{\begin{array}{c} d^{2} \le x/4^{l} \\ d \text { odd} \end{array}} (d/\phi (d))^{2k} \frac{x}{d^{2}4^{l}} \ll x \end{aligned}$$

using that \(\sum _{d_i|d} 1/d_{i} \ll d/\phi (d) \ll d^{o(1)}\). \(\square \)
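The boundedness of the normalized moments can be observed in a small range. The following sketch is only indicative, since the ranges `x` are tiny and arbitrary: it tabulates \(r_{3}(n)\) by brute force and prints the averages \(x^{-1} \sum _{n \le x} (r_{3}(n)/\sqrt{n})^{k}\) for \(k = 1,2,3\). For \(k=1\), the lattice point count for the ball suggests an average close to \(2\pi \).

```python
from math import isqrt, pi, sqrt


def r3_table(limit):
    """Return [r_3(0), ..., r_3(limit)] by adding squares one coordinate at a time."""
    squares = [x * x for x in range(-isqrt(limit), isqrt(limit) + 1)]
    r2 = [0] * (limit + 1)
    for a in squares:
        for b in squares:
            if a + b <= limit:
                r2[a + b] += 1
    r3 = [0] * (limit + 1)
    for n, count in enumerate(r2):
        if count:
            for a in squares:
                if n + a <= limit:
                    r3[n + a] += count
    return r3


r3 = r3_table(2000)


def normalized_moment(x, k):
    """Average of (r_3(n) / sqrt(n))^k over 1 <= n <= x."""
    return sum((r3[n] / sqrt(n)) ** k for n in range(1, x + 1)) / x


for x in (500, 1000, 2000):
    print(x, [round(normalized_moment(x, k), 2) for k in (1, 2, 3)])
```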

3 Proof of Theorem 1

Let \({{\mathcal {N}}}_3:= \{ n \in {\mathbb N}: r_{3}(n)>0 \}\) denote the unperturbed spectrum. Given \(n \in {{\mathcal {N}}}_3\), let \(n_{+}\) denote its nearest right neighbor in \({{\mathcal {N}}}_3\), and recall our definition

$$\begin{aligned} s_{n} := \lambda _{n_{+}} - \lambda _{n} \end{aligned}$$

of the nearest-neighbor spacing between the two new eigenvalues \(\lambda _{n}, \lambda _{n_{+}}\).

The spectral equation (cf. (1)) is then given by

$$\begin{aligned} \sum _{n} r_{3}(n) \left( \frac{1}{n-\lambda } - \frac{n}{n^{2}+1} \right) = 1/\nu \end{aligned}$$

where \(\nu \) is constant. We remark that the method of proof gives the same result when \(1/\nu \) is allowed to change, sufficiently smoothly, with \(\lambda \) (e.g., provided \(| \frac{\mathrm{d}}{\mathrm{d} \lambda }(1/\nu (\lambda ))| \ll \lambda ^{1/2-\epsilon }\) for some \(\epsilon >0\); this ensures a unique root in each interval \((m,m_{+})\) since the derivative of the left-hand side of (4) is strictly positive, as well as suitably bounded from above for \(\delta \) small.)

Given \(m \in {{\mathcal {N}}}_3\), set \(\lambda = m+\delta \), and define

$$\begin{aligned} G_{m}(\delta ) := \sum _{n \in {\mathbb N}: n \ne m} r_{3}(n) \left( \frac{1}{n-m-\delta } - \frac{n}{n^{2}+1} \right) ; \end{aligned}$$

we can then rewrite the spectral equation as:

$$\begin{aligned} G_{m}(\delta ) -1/\nu = \frac{r_{3}(m) }{\delta } \end{aligned}$$
(4)

For \(|\delta | \le 1/2\), we find (note that all terms are positive)

$$\begin{aligned} 0 < G_{m}'(\delta ) \ll \sum _{n \in {\mathbb N}: n \ne m} \frac{r_{3}(n) }{(n-m)^{2}} = \sum _{h \in {\mathbb Z}, h \ne 0} \frac{r_{3}(m+h)}{h^{2}} \end{aligned}$$

To apply Lemma 2, we will need to bound \(G_{m}'\) from above.

Lemma 7

Given \(k \in {\mathbb N}\), we have

$$\begin{aligned} \sum _{n \le x} \left( \sum _{h \in {\mathbb Z}, h \ne 0} \frac{r_{3}(n+h)}{h^{2} \sqrt{n}} \right) ^{k} \ll _{k} x \end{aligned}$$

and consequently

$$\begin{aligned} \left| \left\{ n \le x : \sum _{h \in {\mathbb Z}, h \ne 0} \frac{r_{3}(n+h)}{ h^{2} \sqrt{n}} > T \right\} \right| \ll _{k} x/T^{k} \end{aligned}$$

Proof

We begin by deducing a bound on \(r_{3}(n)\). Letting \(r_{2}(n)\) denote the number of representations \(n = x^{2}+y^{2}\) with \(x,y \in {\mathbb Z}\), we have \(r_{3}(n) \le 2 \sum _{0 \le k \le \sqrt{n}} r_{2}(n-k^{2})\). Using that \(r_{2}(n) \ll _{\epsilon } n^{\epsilon }\) for all \(\epsilon >0\) (cf. [12, Theorem 338]) we find that \(r_{3}(n) \ll _{\epsilon } n^{1/2+\epsilon }\) for all \(\epsilon >0\).

Since \(r_{3}(n+h) \ll _{\epsilon } h^{1/2+\epsilon }\) for \(h > n\), and \(r_{3}(n+h) \ll _{\epsilon } n^{1/2+\epsilon }\) for \(|h| \le n\), we easily see that

$$\begin{aligned} \sum _{h \in {\mathbb Z}, |h| \ge n^{1/2}} \frac{r_{3}(n+h)}{h^{2} \sqrt{n}} \ll _{\epsilon } n^{-1/2+\epsilon } \end{aligned}$$

Thus, using the inequality \((A+B)^{k} \ll _{k} A^{k} + B^{k}\), it suffices to show that \( \sum _{n \le x} \left( \sum _{h \in {\mathbb Z}, 0 < |h| \le \sqrt{n}} \frac{r_{3}(n+h)}{h^{2} \sqrt{n}} \right) ^{k} \ll _{k} x. \)

Expanding out the k-th power expression and using the inequality \(\frac{1}{h^{1/2}n^{1/2}} \ll \frac{1}{\sqrt{n+h}}\), it is enough to show that for \(0 <|h_{1}|,|h_{2}|, \ldots , |h_{k}| \le x^{1/2}\) we have

$$\begin{aligned} \sum _{n \le x } \frac{r_{3}(n+h_{1}) \cdots r_{3}(n+h_{k})}{\prod _{i=1}^{k}(n+h_{i})^{1/2}} \ll _{k} x \end{aligned}$$

and this follows from Hölder’s inequality and the bound on the sum \(\sum _{n \le x} (r_{3}(n)/\sqrt{n})^{k}\) given by Proposition 6. \(\square \)
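The second assertion of Lemma 7 follows from the first by Markov's inequality. Spelled out, with \(F(n)\) a shorthand introduced here for the inner sum,

$$\begin{aligned} F(n) := \sum _{h \in {\mathbb Z}, h \ne 0} \frac{r_{3}(n+h)}{h^{2} \sqrt{n}}, \qquad T^{k} \cdot |\{ n \le x : F(n) > T \}| \le \sum _{n \le x} F(n)^{k} \ll _{k} x, \end{aligned}$$

and dividing by \(T^{k}\) gives the stated bound.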

In particular, for \(k \in {\mathbb N}\), we have

$$\begin{aligned} |\{ m \le x : G_{m}'(\delta )/\sqrt{m} > T \hbox { for}\ |\delta | \le 1/2 \}| \ll _{k} x/T^{k} \end{aligned}$$
(5)

Now, if \(s_{m} = \lambda _{m_{+}}-\lambda _{m}\) denotes the distance between two consecutive new eigenvalues near \(m \in {{\mathcal {N}}}_3\), we have one of the following: either one or both new eigenvalues lie outside \([m-1/2, m+1/2]\), in which case \(s_{m} \ge 1/2\), or both eigenvalues lie in \([m-1/2, m+1/2]\), in which case Lemma 2 gives that

$$\begin{aligned} s_{m} \gg \sqrt{r_{3}(m)/G_{m}'(0)} = \sqrt{A(m)/B(m)}, \end{aligned}$$
(6)

where we define

$$\begin{aligned} A = A(m) := r_{3}(m)/\sqrt{m}, \quad B = B(m) := G_{m}'(0)/\sqrt{m} \end{aligned}$$

Proposition 8

For any \(\gamma >0\) we have, as \(x \rightarrow \infty \),

$$\begin{aligned} |\{ n \le x : 0< r_{3}(n)/G'_{n}(0) \le \epsilon ^{2} \}| \ll _{\gamma } \epsilon ^{4-\gamma } \cdot x \end{aligned}$$

Proof

Given n such that \(r_{3}(n) > 0\), write \(n=4^{l} n_{0} =4^{l} c^{2} d\), where \(4 \not \mid n_{0}\) and d is squarefree. Recalling that \(G_{n}'(0) = \sqrt{n} B(n)\) and

$$\begin{aligned} r_{3}(n) \ge R_{3}(n_{0}) \gg \frac{\sqrt{n}}{2^{l}} L(1, \chi _{-4n_{0}}) \end{aligned}$$

it is enough to estimate the number of \(n = 4^{l} n_{0} \le x\), for which

$$\begin{aligned} \epsilon ^{2} \ge r_{3}(n)/G_{n}'(0) \gg \frac{L(1,\chi _{-4n_0})}{2^{l} B(4^{l}n_{0})} \end{aligned}$$
(7)

We start by noting that integers n that have a large odd square factor are quite rare. More precisely, the number of \(n \le x\) such that \(n = 4^{l} c^{2} d\) for \(c \ge \epsilon ^{-6}\) is \(\ll x/\epsilon ^{-6} = \epsilon ^{6} x\). We may thus assume that \(c\le \epsilon ^{-6}\) in any such decomposition.
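The count of integers with a large square factor can be justified by a union bound followed by comparison with an integral. With \(C \ge 1\) a parameter (here \(C = \epsilon ^{-6}\)), one has for instance

$$\begin{aligned} |\{ n \le x : c^{2} \mid n \text { for some } c \ge C \}| \le \sum _{c \ge C} \frac{x}{c^{2}} \le x \left( \frac{1}{C^{2}} + \int _{C}^{\infty } \frac{{\mathrm{d}}t}{t^{2}} \right) \le \frac{2x}{C}. \end{aligned}$$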

Now, given \(n, \epsilon \), define real numbers \(a=a(n),b=b(n)\) and \(m = m(\epsilon )\) such that

$$\begin{aligned} L(1,\chi _{-4n_{0}}) = 2^{-a}, \quad B(n) = 2^{b}, \quad \epsilon = 2^{-m}. \end{aligned}$$

Moreover, (7) is equivalent to \( 2^{-2m} \gg \frac{2^{-a}}{2^{l} 2^{b}} \), i.e.,

$$\begin{aligned} a+b+l \ge 2m + O(1). \end{aligned}$$

First case: \(a > \sqrt{m}\). Noting that \(c \le 1/\epsilon ^{6} = 2^{6m}\) implies that \(\log \log c \le \log 6m\), Proposition 4 gives that the number of \(n \le x\) such that \( L(1,\chi _{-4n_{0}}) = 2^{-a} \le 2^{-\sqrt{m}}\) is

$$\begin{aligned} \ll x \exp (-(2^{\sqrt{m}}/( \log 6m))^{4}) \ll x \exp ( -1000m) = o(\epsilon ^{10} x), \end{aligned}$$

and thus “small” values of \(L(1,\chi _{-4n_{0}})\) are very rare.

Second case: \( a \le \sqrt{m}\). We next treat the case of \(L(1,\chi _{-4n_{0}}) \) not being “small”. Given a large positive integer \(k \asymp 8/\gamma \), we first consider l such that \(l < 2m(1-\gamma ) - \sqrt{m}\). Here we have some control of the size of powers of 4 dividing n, and this will force B(n) to be quite large. Namely, the upper bound on l implies that

$$\begin{aligned} b \ge 2m + O(1) - a - l \ge 2m +O(1) - \sqrt{m} - 2m(1-\gamma ) + \sqrt{m} \ge 2m \gamma +O(1) \end{aligned}$$

for all sufficiently large m. Using the bound in (5) (recall that \(k \asymp 8/\gamma \) and that \(B(n) = G_{n}'(0)/\sqrt{n}\)), such numbers are quite rare in the sense that the number of \(n \le x\) such that \(B(n) \ge 2^{\gamma m}\) is

$$\begin{aligned} \ll _{\gamma } x/(2^{\gamma m} )^{8/\gamma } \ll _{\gamma } x/(2^{8m}) = O_{\gamma }( \epsilon ^{8} x) \end{aligned}$$

The remaining case—giving the main contribution to the set of small gaps—are integers n being divisible by a large power of 4, and these are easily bounded. Namely, the number of \(n \le x\) such that \(4^{l}|n\) for some \(l \ge 2m(1-\gamma ) - \sqrt{m}\) is

$$\begin{aligned} \ll x/2^{2 (2m(1-\gamma ) - \sqrt{m})} \ll x \epsilon ^{4(1-\gamma ) +o(1)} \end{aligned}$$

as \(m \rightarrow \infty \) (or equivalently, that \(\epsilon = 2^{-m}\rightarrow 0\)), thereby concluding the proof.

\(\square \)
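The exponent in the final bound of the proof can be checked by direct arithmetic. With \(\epsilon = 2^{-m}\) as in the proof (the intermediate steps are written out here for convenience),

$$\begin{aligned} 2^{-2 (2m(1-\gamma ) - \sqrt{m})} = 2^{-4m(1-\gamma )} \cdot 2^{2\sqrt{m}} = \epsilon ^{4(1-\gamma )} \cdot \epsilon ^{-2/\sqrt{m}} = \epsilon ^{4(1-\gamma ) - 2/\sqrt{m}}, \end{aligned}$$

and \(2/\sqrt{m} \rightarrow 0\) as \(m \rightarrow \infty \). The other cases contribute \(O(\epsilon ^{6} x)\), \(o(\epsilon ^{10} x)\) and \(O_{\gamma }(\epsilon ^{8} x)\), all of smaller order. Since \(\gamma >0\) is arbitrary, the exponent \(4(1-\gamma ) + o(1)\) can presumably be brought to the form \(4-\gamma \) stated in Proposition 8 by renaming \(\gamma \).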