Abstract
We show that every sufficiently large integer is a sum of a prime and two almost prime squares, and also a sum of a smooth number and two almost prime squares. The number of such representations is of the expected order of magnitude. We likewise treat representations of shifted primes \(p-1\) as sums of two almost prime squares. The methods involve a combination of analytic, automorphic and algebraic arguments to handle representations by restricted binary quadratic forms with a high degree of uniformity.
1 Introduction
1.1 Statement of results.
In his first paper, Hooley [Ho1] gave a proof, conditional upon the Generalized Riemann Hypothesis, of a conjecture made by Hardy and Littlewood some 30 years before [HL]: the number of representations of an integer n in the form
\[ n = p + x_1^2 + x_2^2 \tag{1.1} \]
in primes p and non-zero integers \(x_1, x_2\) as \(n \rightarrow \infty \) is asymptotic to
In particular every sufficiently large number is representable in this form. The Generalized Riemann Hypothesis (GRH) was removed a few years later by Linnik [Li] who used his dispersion method to obtain the first unconditional proof of the Hardy-Littlewood problem. The proof could be greatly streamlined and simplified once the Bombieri-Vinogradov theorem became available, cf. e.g. [Ho2]. The best error term \(O_A(n (\log n)^{-A})\) in the asymptotic formula was recently established by Assing, Blomer and Li [ABL] by invoking in addition the theory of automorphic forms.
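As a small numerical illustration of the counting problem just described, the following sketch counts by brute force the representations of n as a prime plus a sum of two squares of non-zero integers. It is only meant for orientation on small n; the function names are ours and play no role in the arguments of the paper.

```python
from math import isqrt


def primes_up_to(n):
    """Return the list of primes <= n using a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for d in range(2, isqrt(n) + 1):
        if is_prime[d]:
            for k in range(d * d, n + 1, d):
                is_prime[k] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


def two_square_count_nonzero(m):
    """Number of pairs (x1, x2) of non-zero integers with x1^2 + x2^2 = m."""
    count = 0
    for x1 in range(1, isqrt(m) + 1):
        rest = m - x1 * x1
        x2 = isqrt(rest)
        if x2 >= 1 and x2 * x2 == rest:
            count += 4  # the four sign choices (+-x1, +-x2)
    return count


def count_representations(n):
    """Number of (p, x1, x2) with p prime, x1, x2 non-zero, n = p + x1^2 + x2^2."""
    # p <= n - 2 because x1^2 + x2^2 >= 2 for non-zero x1, x2
    return sum(two_square_count_nonzero(n - p) for p in primes_up_to(n - 2))
```

For instance, `count_representations(7)` counts the eight solutions with \(p = 2\) and the four solutions with \(p = 5\).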
In his last paper, Hooley [Ho3] returned to this topic and considered a refined version. He showed that every sufficiently large number is representable as a sum of a prime and two squares of square-free numbers and gave a lower bound for the number of such representations of the expected order of magnitude. At first sight this appears to be easy: square-free numbers can be detected by a zero-dimensional sieve, so only a little bit of inclusion–exclusion is required. On second thought, a major issue emerges: to sieve out even a fixed number of squares of primes, one needs an asymptotic formula for (1.1) with \(d_1 \mid x_1\), \(d_2 \mid x_2\). However, the quadratic form \(d_1^2 x_1^2 + d_2^2 x_2^2\) fails to have class number one, in fact very soon it even fails to have one class per genus, and hence there is no convolution formula for the number of ways it represents a given integer. Without such a formula, all approaches to the Hardy-Littlewood problem and its variations become very problematic. Hooley finds an ingenious (but rather ad hoc) way to circumvent this issue altogether for the particular problem at hand.
In this paper, we continue on this avenue. We introduce a general sieve process into the problem with the goal of showing that every sufficiently large integer is a sum of a prime and two squares of almost primes. The density of such representations requires us to have a good error term (a saving of at least \((\log x)^{3+\epsilon }\)) in solutions to (1.1), which ultimately requires the study of primes in arithmetic progressions to large moduli that go beyond the range that GRH could handle individually. More importantly, in stark contrast to Hooley’s result, the application of a sieve to make \(x_1, x_2\) almost prime requires the coefficients \(d_1\) and \(d_2\) to be as large as some fixed power of n. In addition to the fact that the quadratic forms \(d_1^2x_1^2 + d_2^2x_2^2\) fail to have class number one, this large range of uniformity for individual \(d_1,d_2\) seems to require inevitably a count of solutions of (1.1) with a power-saving error term, which is only obtainable under some form of GRH.
We will give more details on the combination of algebraic, automorphic and analytic methods involved in a moment, but emphasize at this point that we are nevertheless able to solve this problem unconditionally.
Theorem 1.1
There exists a constant \(C > 0\) such that every sufficiently large integer \(n\equiv 1, 3\bmod 6\) can be represented in the form (1.1), where \(p\) is a prime and \(x_1, x_2\) are integers all of whose prime factors are greater than \(n^{1/C}\). The number of such representations is \(\gg {\mathfrak {S}}(n) n (\log n)^{-3}\) with \({\mathfrak {S}}(n)\) as in (1.2). In particular, every sufficiently large integer \(n\equiv 1, 3 \mod 6\) can be written as the sum of a prime and two squares of integers with no more than C prime factors.
Theorem 1.1 gives the expected order of magnitude given the restrictions on \(x_1,x_2\). We have made no effort to compute and optimize the value of C, although an upper bound of size \(C \le 10^6\), say, can easily be obtained by careful book-keeping. Note that the congruence condition on n is necessary to guarantee solutions to (1.1) with \((x_1x_2, 6)=1\) and \(p\) a prime greater than 3. At the cost of slightly more work it is possible to remove the condition modulo 6 if \(x_1,x_2\) are permitted to have the small primes \(2\) and \(3\) as factors. Going further, our method can be generalized to deal with general fixed quadratic forms \(F(x_1, x_2)\) of fundamental discriminant, instead of \(x_1^2 + x_2^2\), cf. the remark after Lemma 2.2.
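The necessity of the congruence condition modulo 6 can be checked by enumerating residues. The sketch below (with a hypothetical helper name of our choosing) lists the residues of \(p + x_1^2 + x_2^2\) modulo 6 when \(p\), \(x_1\), \(x_2\) are all coprime to 6, which is the situation for a prime \(p > 3\) and \((x_1x_2, 6) = 1\).

```python
from math import gcd


def admissible_residues(modulus=6):
    """Residues of p + x1^2 + x2^2 mod `modulus` with p, x1, x2 coprime to it."""
    units = [r for r in range(modulus) if gcd(r, modulus) == 1]
    return sorted(
        {(p + x1 * x1 + x2 * x2) % modulus
         for p in units for x1 in units for x2 in units}
    )
```

Since every unit modulo 6 squares to 1, the sum is congruent to \(p + 2\), and `admissible_residues()` returns exactly the classes 1 and 3.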
The technique used in the proof of Theorem 1.1 also gives the following result.
Theorem 1.2
There exists a constant \(C>0\) such that the number of solutions to the equation
\[ p - 1 = x_1^2 + x_2^2 \tag{1.3} \]
in primes \(p \le x\) and \(x_1, x_2\) all of whose prime factors are greater than \(p^{1/C}\) is \(\gg x (\log x)^{-3}\). In particular, there are infinitely many primes shifted by one that can be written as a sum of two squares of almost primes.
This should be compared with a beautiful recent result of Friedlander-Iwaniec [FI2] who considered (1.3) for a prime \(x_1\) and an almost prime \(x_2\), but without the additive shift. The multiplicativity in Gaußian primes plays a crucial role in their proof. Adding a shift parameter ruins the multiplicative structure and so it is no surprise that our proof takes a different route, see Sect. 1.2. Note that Theorem 1.2 is slightly easier than Theorem 1.1 as the shift parameter is fixed (although of course one can prove a similar result with a more general shift f uniformly for \(|f|\ll x\)).
From a multiplicative point of view, complementary to primes are smooth numbers (“entiers friables” in French) all of whose prime factors are small. The smallness is measured in terms of a parameter y, and following the usual notation we let \(\Psi (x, y) = \#S(x, y)\) where S(x, y) denotes the set of positive integers \(m \le x\) such that \(p\mid m\) implies \(p \le y\) for all primes p. Smooth numbers can appear quite frequently (e.g. \(\Psi (x,y)\gg x\) if \(\log x \asymp \log y\)), yet additive problems with smooth numbers are often surprisingly difficult, see e.g. [Ba, BBD, KMS]. The present case is no exception. Here we consider the equation
\[ n = m + x_1^2 + x_2^2 \tag{1.4} \]
where \(x_1, x_2\) are almost primes and m is y-smooth with \(y \ge {\mathcal {L}}(n):= (\log n)^D\) for some sufficiently large constant D. We remark that the smoothness parameter \(y\) can be much smaller than those admissible in [Ba, BBD, KMS], which makes the problem quite delicate due to the sparseness of smooth numbers when y becomes so small. We have that
In particular, we have
To capture smooth numbers with such small density, we need to define the quantity
where \(\alpha =\alpha (n,y)\) is the unique positive solution to the equation
Theorem 1.3
There exist constants \(C, D> 0\) such that for any function g with \((\log n)^D\le g(n)\le n\), every sufficiently large integer n can be represented in the form (1.4) with \(m \in S(n, g(n))\) and integers \(x_1, x_2\) all of whose prime factors are greater than \(n^{1/C}\). The number of such representations is \(\gg {\mathcal {F}}(n, g(n))\Psi (n,g(n))(\log n)^{-2}\).
In fact, in the representations we construct, the smooth number m will in addition be a square-free number times a power of 2. For orientation we remark that
for \(y \ge {\mathcal {L}}(n)\). The arithmetic factor could vary quite a lot when y is small as
whilst for \(\log y\ge (\log _2n)^2\) we have \({\mathcal {F}}(n, y)\asymp 1\).
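For orientation on the counting function \(\Psi (x, y)\) used throughout, here is a brute-force sketch that evaluates it for small arguments directly from the definition of \(S(x, y)\). The helper names are ours, and the method is far too slow for the ranges relevant to the theorems; it is meant only to make the definition concrete.

```python
def largest_prime_factor(n):
    """Largest prime factor P^+(n) of n >= 1, with the convention P^+(1) = 1."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:  # the remaining cofactor is a prime exceeding all earlier factors
        largest = n
    return largest


def smooth_count(x, y):
    """Psi(x, y): number of positive integers m <= x all of whose primes are <= y."""
    return sum(1 for m in range(1, x + 1) if largest_prime_factor(m) <= y)
```

For example, the 3-smooth integers up to 10 are 1, 2, 3, 4, 6, 8, 9, so `smooth_count(10, 3)` equals 7.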
We mention one possible path for further work, namely Lagrange’s four square theorem with restricted variables. That is, one seeks to solve \(N=x_1^2+x_2^2+x_3^2+x_4^2\) with \(x_i\) restricted to thin subsets of the integers such as primes, almost primes, or smooth numbers. See for instance Brüdern-Fouvry [BF], Heath-Brown-Tolev [HBT] and Blomer-Brüdern-Dietmann [BBD]. It is plausible that the techniques in this paper can be helpful in studying related problems.
1.2 Methods and ideas.
While the statements of Theorems 1.1 and 1.3 are purely arithmetic, the proofs combine a variety of different methods from different fields. We need the algebraic structure theory of orders in quadratic number fields, the full machinery of automorphic forms including some higher rank L-functions, and of course the toolbox of analytic number theory.
Starting with the latter, the obvious approach to almost primes is the application of a sieve. Thus, for the proof of both theorems we need to analyze quite accurately the quantity
for square-free integers \(d_1, d_2\) where \(\ell \) runs either over primes or over smooth numbers. As mentioned above, in general there is no closed convolution formula for \(r_{d_1, d_2}\), since there is typically more than one class per genus. For fixed \(d_1, d_2\) one can lift the necessary ingredients to analyze (1.7) from [Iw1] or [Vi], but for almost primes we need strong uniformity where \(d_1, d_2\) can be as large as some small power of n.
A robust substitute that may also be useful in other situations is as follows. Let F be a primitive binary quadratic form of discriminant \(D = D_0f^2 < 0\) (with \(D_0\) fundamental) and let \({\mathcal {O}}_D \subseteq {\mathbb {Q}}(\sqrt{D})\) be the (not necessarily maximal) order of discriminant D. The set of equivalence classes of primitive forms of discriminant D is in bijection with the proper (i.e. invertible) ideal classes \({\mathcal {C}}_D\) of \({\mathcal {O}}_D\) (see [Co, Section 7]), and we denote by \(C_F \in {\mathcal {C}}_D\) the class corresponding to F. Then for \((n, f) = 1\) we have the exact formula
where w is the number of units in \({\mathcal {O}}_D\). In our application, the underlying quadratic field is fixed: for all \(d_1, d_2\) in (1.7) we obtain the Gaußian number field \({\mathbb {Q}}(i)\), but the discriminant of the order increases with \(d_1, d_2\), in particular the order is highly non-maximal. This will cause technical challenges and involved computations of the corresponding main terms.
The key point is now to split the outer sum into two parts: class group characters \(\chi \) of order at most two are genus characters and can be computed explicitly by some kind of convolution identity. For a character of order three or more, the inner sum is the n-th Hecke eigenvalue of a cusp form of weight 1 and level |D|. In this way, the theory of automorphic forms enters, and (1.8) can be seen as a version of Siegel’s mass formula.
The upshot is that, at least morally, we need to understand a sum as in (1.7) in two cases: when the arithmetic function in question is a Fourier coefficient of some cusp form and when the arithmetic function is a Dirichlet convolution of two characters, in other words the Fourier coefficient of an Eisenstein series. In particular, this is yet another example where the originally purely arithmetic problem of representing n in the form (1.1), (1.3) or (1.4) is intimately linked to the theory of automorphic forms. Indeed, here we need to invoke the full power of the spectral theory of automorphic forms.
As an aside we remark that there is a different way to build a bridge to automorphic forms. Instead of applying the algebraic identity (1.8), one could model \(r_{d_1, d_2}\) by a suitably truncated expression coming from a formal application of the circle method. More rigorously, one could apply a \(\delta \)-symbol method à la Heath-Brown [HB] and Duke-Friedlander-Iwaniec [DFI], followed by Poisson summation. This leads to sums of Kloosterman sums (for congruence subgroups of levels roughly \(d_1^2d_2^2\)) which after an application of the Kuznetsov formula return a similar expression in terms of cuspidal and non-cuspidal Fourier coefficients. This can be seen as a double Kloosterman refinement of the circle method. An application of (1.8), however, seems more direct.
To understand (1.7) when \(r_{d_1, d_2}\) is replaced by a cuspidal Fourier coefficient, we refine methods and results established in [ABL, Theorem 1.3] to obtain an unconditional power saving for each individual pair \((d_1,d_2)\) with the required degree of uniformity. However, the Eisenstein contribution requires much more (or at least rather different) work. As mentioned above, if we were going to handle each pair \((d_1,d_2)\) separately for the Eisenstein contribution, we would ultimately require a zero-free strip for the relevant Dirichlet L-functions. To avoid this use of GRH, we instead incorporate the sum over \(d_1,d_2\) from the sieve weights into the analysis of primes (or smooth numbers) in long arithmetic progressions (see Proposition 5.1 below). The corresponding generalizations and extensions of [ABL, Theorem 2.1] are given in Propositions 6.1 and 7.1 below. This is of independent interest and should find applications elsewhere.
The proof of Theorem 1.3 features additional difficulties. The arithmetic of smooth numbers is rather subtle, and it turns out that the leading term in a suitable asymptotic formula of (1.7) for \(\ell \in S(n, g(n))\) is not multiplicative in \(d_1, d_2\). Multiplicativity does not fail by a lot, but we obtain secondary terms involving derivatives of Euler products. This makes the application of a sieve perhaps not impossible, but at least seriously problematic. There are some ways to handle non-multiplicative main terms without developing the theory in full generality (see [FI1, p. 37]). But the fact that we have two variables to sieve makes the monotonicity principle, in particular [FI1, Corollary 5.4/5.5], not applicable. We therefore take a different route and incorporate the sieve weights directly into the analysis. A prototype of this idea can be found in the proof of [FT3, Theorem 4], but such a strategy in the context of additive problems seems to be new. There are a number of interesting but more technical details that we will discuss as they arise.
Roadmap for the rest of the paper. In Sect. 2, we give the explicit formula for \(r_{d_1,d_2}\) (see Lemma 2.2), which consists of a cuspidal part and an Eisenstein part. To deal with the cuspidal part, in Sect. 3 we prove general bounds for linear forms in Hecke eigenvalues over prime numbers (Theorem 3.3) and smooth numbers (Theorem 3.4) assuming a bilinear estimate (Proposition 3.1) and a \(\tau _3\)-type estimate (Proposition 3.2) with Hecke eigenvalues, whose proofs are given in Sect. 4. We then give general averaged bilinear estimates in Sect. 5 which will be used to handle the error terms when applying sieve methods. The proofs of Theorems 1.1/1.2 and 1.3 are given in Sects. 6 and 7 respectively. The most involved part is the analysis of the main term in the sieving process, which covers Sects. 6.2–6.3 in the prime number case and Sects. 7.4–7.7 in the smooth number case.
2 Algebraic considerations
We recall the classical theory on binary quadratic forms. For \(f \in {\mathbb {N}}\) let \({\mathcal {O}}_{f} = {\mathbb {Z}} + f{\mathbb {Z}}[i] \subseteq {\mathbb {Z}}[i]\) be the unique order of discriminant \(-4f^2\) (the notation is slightly different than in Sect. 1.2). We have \({\mathcal {O}}_f^{*} = \{\pm 1\}\) unless \(f = 1\) in which case \({\mathcal {O}}_f^{*} = \{\pm 1, \pm i\}\). We write \(w_f = \#{\mathcal {O}}_f^{*}\). Its class group \({\mathcal {C}}_f\) is the set of proper (= invertible) fractional \({\mathcal {O}}_f\)-ideals \(I_f\) modulo principal ideals \(P_f\). This is a finite group of cardinality [Co, Theorem 7.24]
We have a sequence of isomorphisms. Let \(\tilde{{\mathcal {C}}}_f\) be the set of primitive positive binary quadratic forms with discriminant \(-4f^2\) modulo the action of \(\textrm{SL}_2({\mathbb {Z}})\). Let \(I^*_f\) be the set of fractional \({\mathcal {O}}_f\)-ideals coprime to f (these are automatically proper [Co, Lemma 7.18(ii)]) and \(P^*_f\) the subgroup of its principal ideals. Let \(I^{*}\) be the set of fractional \({\mathcal {O}}_1 = {\mathbb {Z}}[i]\)-ideals coprime to f and \(P^{*}\) the set of principal \({\mathbb {Z}}[i]\)-ideals with a generator \(\alpha \in {\mathcal {O}}_f\) coprime to f. Then we have ([Co, Theorem 7.7(ii), Exercise 7.17(a), Propositions 7.19, 7.20, 7.22] with the convention that \(\Im (\beta /\alpha )>0\))
The last isomorphism is norm-preserving ([Co, Proposition 7.20]). For a positive integer m and a primitive binary form F with corresponding ideal class \(C_F\), the first isomorphism induces a bijection ([Co, proof of Theorem 7.7(iii)] or [BS, Theorem 5 in Chapter 2, Section 7.6]) between
If \((m, f) = 1\), then the last isomorphism in (2.2) shows
where \(r(m) = {{\textbf {1}}} *\chi _{-4}\) is the usual sums-of-two-squares function.
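The convolution \({{\textbf {1}}} *\chi _{-4}\) can be compared numerically with a direct count of lattice points. The sketch below assumes the classical identity (due to Jacobi) that the number of integer solutions of \(x^2 + y^2 = m\) equals \(4 \sum _{d \mid m} \chi _{-4}(d)\), the factor 4 being the number of units of \({\mathbb {Z}}[i]\); the function names are ours.

```python
from math import isqrt


def chi_minus4(d):
    """The non-trivial Dirichlet character modulo 4."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def convolution_one_chi(m):
    """(1 * chi_{-4})(m) = sum of chi_{-4}(d) over the positive divisors d of m."""
    return sum(chi_minus4(d) for d in range(1, m + 1) if m % d == 0)


def brute_force_two_squares(m):
    """Number of (x, y) in Z^2 with x^2 + y^2 = m, counted directly."""
    bound = isqrt(m)
    return sum(
        1
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if x * x + y * y == m
    )
```

For instance, for \(m = 25\) the divisors 1, 5, 25 all have \(\chi _{-4} = 1\), giving \(4 \cdot 3 = 12\) lattice points.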
The elements of order 2 in \({\mathcal {C}}_f\) are well-understood by genus theory. In particular, for square-free f the real characters \(\chi \) of \(\widehat{{\mathcal {C}}}_f\) are parametrized by odd divisors \(D \mid f\), and for an ideal \({\mathfrak {a}} \in I_f^{*}\) (i.e. coprime to f) and a real character associated with the divisor D we have
We write
for the number of genera.
From now on we focus on the form \(F_{a,b}=a^2x_1^2+b^2x_2^2\). We write \(C_{a, b} \in {\mathcal {C}}_{ab}\) to denote the class corresponding to the form \(F_{a,b}\). We will give an explicit formula for \(r_{d_1, d_2}\) defined in (1.7). We first notice that we can compute the real characters of \(\tilde{{\mathcal {C}}}_{ab}\) explicitly for the class \(C_{a,b}\).
Lemma 2.1
Let \(a, b \in {\mathbb {N}}\) be coprime and square-free. Let \(F_{a, b}\) be as above and let \(\chi \) be a real character of \(\tilde{{\mathcal {C}}}_{ab}\). Then \(\chi (C_{a, b}) = 1\).
Proof
The form \(F_{a, b}\) is equivalent to the form \((a^2 + b^2)x_1^2 + 2b^2x_1x_2 + b^2 x_2^2\) (substitute \(x_2 \mapsto x_1 + x_2\)), which under the isomorphism (2.2) corresponds to an ideal of norm \(a^2 +b^2\) coprime to ab, so if \(\chi \) belongs to the divisor \(D\mid ab\), then by (2.5) we have
Remark
Alternatively one can explicitly compute the class group structure from the exact sequence [Co, (7.25), (7.27), Exercise 7.30]
For instance, if f is even (and squarefree), one can conclude by a computation based on the Chinese remainder theorem and the fact that the multiplicative group of a finite field is cyclic that \(\text {rk}_2({\mathcal {C}}_f) = \text {rk}_4({\mathcal {C}}_f)\), so that the 2-part of \({\mathcal {C}}_f\) is a direct product of copies of \({\mathbb {Z}}/2^{k_j}{\mathbb {Z}}\) with \(k_j \ge 2\) and hence all real characters are trivial on elements of order 2 (which are squares) and so a fortiori on diagonal forms.
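The size of the class groups \({\mathcal {C}}_f\) discussed in this section can be checked numerically for small f by enumerating reduced primitive forms of discriminant \(-4f^2\). The sketch below compares this count with the closed form \(\frac{f}{2} \prod _{p \mid f} (1 - \chi _{-4}(p)/p)\) for \(f > 1\) (and 1 for \(f = 1\)), which is our specialisation of the general class number formula for orders to \({\mathbb {Q}}(i)\) and should be read as an assumption of this sketch. All function names are ours.

```python
from fractions import Fraction
from math import gcd


def reduced_forms(D):
    """Reduced primitive positive definite forms (a, b, c) of discriminant D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:  # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


def chi_minus4(p):
    """The non-trivial Dirichlet character modulo 4."""
    if p % 2 == 0:
        return 0
    return 1 if p % 4 == 1 else -1


def prime_divisors(f):
    """Distinct prime divisors of f by trial division."""
    primes, d = [], 2
    while d * d <= f:
        if f % d == 0:
            primes.append(d)
            while f % d == 0:
                f //= d
        d += 1
    if f > 1:
        primes.append(f)
    return primes


def predicted_class_number(f):
    """Assumed closed form for #C_f (order of conductor f in Q(i))."""
    unit_index = 1 if f == 1 else 2  # index of O_f^* in Z[i]^*
    h = Fraction(f, unit_index)
    for p in prime_divisors(f):
        h *= 1 - Fraction(chi_minus4(p), p)
    return h
```

For \(f = 6\), for example, the enumeration finds the four forms \((1, 0, 36)\), \((4, 0, 9)\), \((5, \pm 4, 8)\), in agreement with the predicted value.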
From Lemma 2.1 we can give the explicit formula for \(r_{d_1,d_2}\) in (1.7). We recall the notation (2.1) and (2.6).
Lemma 2.2
Let \(d_1, d_2\) be two square-free numbers, and write \(\delta =(d_1,d_2), d_1' = d_1/\delta \), and \(d_2' = d_2/\delta \). Let \({\mathcal {L}} \subseteq {\mathbb {N}}\) be a finite set and define
Then \({\mathcal {S}} = {\mathcal {S}}_{\le 2}+ {\mathcal {S}}_{\ge 3}\) where we have
with
and
with a Hecke eigenvalue \(\lambda _{\chi }\) of some holomorphic cuspidal newform of weight 1, character \(\chi _{-4}\) and level dividing \(4d_1^2d_2^2\).
Proof
Since \(\delta , d_1', d_2'\) are pairwise coprime and square-free, we have
Note that \((\delta _1', \delta _2') = 1\), so the form \((\delta _1')^2 x_1^2+ (\delta _2')^2 x_2^2\) is primitive and \(f = \delta _1'\delta _2'\) is square-free. Moreover, the argument of \(r_{ \delta '_1, \delta _2'}\) is (nonzero and) coprime to \(\delta _1'\delta _2'\). We have by (2.3) and orthogonality of characters
If \(\chi \) is real and belongs to the odd divisor \(D \mid \delta _1'\delta _2'\), then by (2.5) and (2.4) for \((m, \delta _1'\delta _2') = 1\) we have
Combining this with Lemma 2.1, we obtain
which yields (2.7) after summing over \(2\nmid D\mid \delta _1'\delta _2'\).
If \(\text {ord}(\chi ) \ge 3\), it remains to show that \(\sigma (\chi , m) \) is a normalized Hecke eigenvalue of a weight 1 newform of level dividing \(4d_1^2d_2^2\) and character \(\chi _{-4}\). This is a special case of automorphic induction for imaginary quadratic fields [AC], but most of it can be seen elementarily. We observe first that by splitting into ideal classes the function
is a linear combination of theta series corresponding to forms \(F \in \tilde{{\mathcal {C}}}_{\delta _1'\delta _2'}\) and hence by [Iw4, Theorem 10.9] a modular form of weight 1, character \(\chi _{-4}\) and level \(4(\delta _1'\delta _2')^2\). Using (2.2), one can check directly from the definition of the Hecke operators \(T_m\) that
for \((m, \delta _1'\delta _2') = 1\). Hence \(f_{\chi }\) belongs to the space generated by some newform of level dividing \(4(\delta _1'\delta _2')^2\), and for \((m, \delta _1'\delta _2') = 1\) the numbers \(\sigma (\chi , m)\) coincide with the Hecke eigenvalues.
It remains to show that \(f_{\chi }\) is a cusp form for \(\text {ord}(\chi ) \ge 3\). This is again a classical fact (also contained in [AC]) that can be proved in several ways: by the parametrization of Eisenstein series, a modular form is non-cuspidal if and only if the associated Dirichlet series factorizes into two Dirichlet L-functions, and this happens if and only if its Rankin–Selberg square L-function has a double pole at \(s=1\). The Rankin–Selberg square of a class group character \(\chi \) is, up to finitely many Euler factors, \(\zeta (s) L(s, \chi ^2)\) which has a double pole at \(s=1\) if and only if \(\text {ord}(\chi ) \le 2\).
This completes the proof of the lemma.
Remark
The formula (2.7) depends on Lemma 2.1. For general quadratic forms not covered by Lemma 2.1 the summation over D can be carried out in the same way. This yields an expression similar to (2.7) with a set \({\mathcal {G}}_{\delta _1'\delta _2'}\) which may be different from the one defined in (2.8) but which has the same cardinality, namely \(\phi (\delta _1'\delta _2')/G_{\delta _1'\delta _2'}\). This is a consequence of the fact that numbers coprime to the order can only be represented in one genus. More precisely, the Chinese Remainder Theorem induces a bijection \({\mathcal {G}}_{\delta _1'\delta _2'} \cong \prod _{p \mid \delta _1'\delta _2'} {\mathcal {G}}_p\) where \({\mathcal {G}}_p \subseteq ({\mathbb {Z}}/p{\mathbb {Z}})^{*}\) has cardinality \(\phi (p)/2\) for odd p and cardinality 1 for \(p=2\). This is all the information needed in the subsequent arguments. In this way, our main results generalize to other quadratic forms than sums of two squares, although some extra care may be necessary at the last step of (2.10).
If we consider solutions to \(n=p+x_1^2+x_2^2\) where \(x_1,x_2\) have no small prime factors other than a small square-free factor of r, then we will encounter binary quadratic forms of the shape \(x^2+(r^2y)^2\), which can be studied with our method while paying special attention to the real characters of \({\mathcal {C}}_f\) with non square-free f and the step in (2.10).
3 Additive problems in cusp forms
In this section we deal with the innermost sum of (2.9) when the set \({\mathcal {L}}\) consists of numbers \(n-\ell \) where \(\ell \) is either prime or smooth. We generally consider a holomorphic or Maaß cuspidal Hecke newform \(\phi \) of level \(4\mid N\), character \(\chi _{-4}\) and Hecke eigenvalues \(\lambda (n)\) (which by self-adjointness of Hecke operators are real) whose archimedean parameter (weight or spectral parameter) we denote by \(\mu \). We start with two auxiliary results.
Proposition 3.1
Let \( M, X, Z \ge 1\), \(\Delta \in {\mathbb {N}},\) and \(f, \sigma \in {\mathbb {Z}} {\setminus } \{0\}\). Let \(\alpha _m, \beta _n\) be sequences supported on [M, 2M], [X, 2X] respectively. Suppose that \(|\alpha _m|\ll m^\varepsilon \) for any \(\varepsilon >0\) and \(\alpha _m\beta _n\) vanishes unless \(\sigma mn+f\) is in some dyadic interval \([\Xi , 2\Xi ]\) with \(|\sigma |MX/Z \ll \Xi \ll |\sigma |MX + |f|\). Then
for some absolute constant \(A > 0\) and any \(\varepsilon >0\), where \(P =\Delta |\sigma |(1 + |\mu |)NZ (1 + |f|/MX)\).
We remark that the trivial bound is \(\Vert \beta \Vert (MX^{1/2})^{1+\varepsilon }\), so Proposition 3.1 is non-trivial as soon as \(M \ll X^{1/2 - \delta }\) for some \(\delta > 0\) as in [Pi, Theorem 1.2] when P is fixed.
Proposition 3.2
Let \(\Delta , q,d\in {\mathbb {N}},r, f\in {\mathbb {Z}} \setminus \{0\}\), \(Z, X_1, X_2, X_3 \ge 1/2\). Suppose \((\Delta , q)=1\) and that at least one of r, f is positive. Let G be a function with support on and \(G^{(\nu _1, \nu _2, \nu _3)} \ll _{{{\varvec{\nu }}}} Z^{\nu _1+\nu _2+\nu _3}X_1^{-\nu _1} X_2^{-\nu _2} X_3^{-\nu _3}\) for all \({\varvec{\nu }}\in {\mathbb {N}}_0^3\). Write \(X = |r|X_1X_2X_3\) and suppose that G(k, l, m) vanishes unless \(rklm + f\) is in some dyadic interval \([\Xi , 2\Xi ]\) with \(X/Z \ll \Xi \ll X+|f|\). Then
for some absolute constant A and any \(\varepsilon > 0\), where \(P = qr\Delta (1 + |\mu |)NZ(1 + |f|/X)\).
The trivial bound is \((X/|r|\Delta q)^{1+\varepsilon }\), and so Proposition 3.2 is non-trivial uniformly for P up to some small power of X. Better exponents could be obtained with more work, but these bounds suffice for our purpose. In principle this is a shifted convolution problem of \(\tau _3\) against \(\lambda \), cf. e.g. [Mu, To].
We postpone the proofs of Propositions 3.1 and 3.2 to the next section and state the two main results of this section.
Theorem 3.3
There exist absolute constants \(A, \eta > 0\) with the following property. Let \(\sigma , f\in {\mathbb {Z}}\setminus \{0\}\), \(X, Z \ge 1 \), \(\Delta , q, d\in {\mathbb {N}}\) with \((\sigma q, \Delta )=1\). Let G be a smooth function supported on [X, 2X] with \(G^{(\nu )}\ll _\nu (X/Z)^{-\nu }\) for \(\nu \in {\mathbb {N}}_0\). Suppose that G(n) vanishes unless \(\sigma n+f \) is in some dyadic interval \([\Xi , 2\Xi ]\) with \(|\sigma |X/Z\ll \Xi \ll |\sigma | X+|f|\). Then
where \(P = q|\sigma |\Delta (1 + |\mu |)NZ(1+|f|/|\sigma |X) \).
Proof
Based on Propositions 3.1 and 3.2, this could be proved similarly as in [ABL, Prop. 9.1] which in turn is the argument of [Pi, Section 9]. The proof in [Pi] uses Vaughan’s identity to decompose \(\Lambda \) iteratively three times. This is slightly cumbersome, but has the advantage that we only need \(\alpha _m\) to be supported on square-free integers in the bilinear estimate in Proposition 3.1, which could yield better error terms numerically. Here we give a shorter independent proof at the cost of slightly weaker error terms. Instead of using Vaughan’s identity as in [Pi], we apply Heath-Brown’s identity in the form
for \(1 \le n \le X\). Next we apply a smooth partition of unity to the variables \(m_i, n_i\). To be precise, let \(0<{\mathcal {Z}}\le 1\) be a parameter to be chosen later and let \(\psi (x)\) be a smooth function supported on \([-1-{\mathcal {Z}}, 1+{\mathcal {Z}}]\) that equals 1 on \([-1,1]\) and satisfies \(\psi ^{(j)}(t)\ll _j {\mathcal {Z}}^{-j}\). Let \({\mathcal {D}}=\{(1+{\mathcal {Z}})^m, m\ge 0\}\). Then we have the smooth partition of unity
for any natural number n, where \(\psi _N(n)=\psi (n/N)-\psi ((1+{\mathcal {Z}})n/N)\). Here we can choose \({\mathcal {Z}}=1\) (later parts of the paper use the same argument with different choices of \({\mathcal {Z}}\)) so that
where
Therefore it is enough to show that there exist some absolute constants \(A,\eta >0\) such that
We distinguish two cases. If there exists some i such that \(X^{\eta _0}\le M_i \,{(\text {or } N_i)}\le X^{1/4+\eta _0}\) for some \(\eta _0>0\), then we can split \(m_i\) or \(n_i\) into residue classes modulo q, use Mellin inversion to separate variables \(m_i,n_i\) (see e.g. [Pi, Section 7]) and then apply Proposition 3.1 to obtain
Otherwise, we must have
and
in which case we can apply Proposition 3.2 to obtain
Combining the bounds in these two cases and choosing \(\eta _0\) suitably, we complete the proof of (3.3).
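The smooth partition of unity used in the proof above rests on a telescoping sum over \(N \in {\mathcal {D}}\). As a minimal numerical sketch, we pick one concrete admissible \(\psi \) (built from the standard \(e^{-1/t}\) bump, which is our choice and not prescribed by the argument) and check that \(\sum _{N \in {\mathcal {D}}} \psi _N(n) = 1\) for natural numbers n.

```python
import math


def smooth_step(t):
    """A smooth function equal to 0 for t <= 0 and to 1 for t >= 1."""
    def h(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return h(t) / (h(t) + h(1.0 - t))


def psi(x, Z):
    """Smooth, equal to 1 on [-1, 1], supported on [-1 - Z, 1 + Z]."""
    return smooth_step((1.0 + Z - abs(x)) / Z)


def psi_N(n, N, Z):
    """psi_N(n) = psi(n / N) - psi((1 + Z) n / N)."""
    return psi(n / N, Z) - psi((1.0 + Z) * n / N, Z)


def partition_sum(n, Z):
    """Sum of psi_N(n) over N = (1 + Z)^m, m >= 0 (terms vanish for N >= (1 + Z) n)."""
    total, N = 0.0, 1.0
    while N <= (1.0 + Z) ** 2 * n:
        total += psi_N(n, N, Z)
        N *= 1.0 + Z
    return total
```

The sum telescopes to \(\psi (n/N_{\max }) - \psi ((1+{\mathcal {Z}})n)\), and the second term vanishes for \(n \ge 1\), so `partition_sum(n, Z)` should equal 1 up to floating-point rounding.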
Our second main result treats the case of smooth numbers. As usual, \(P^+(n)\) denotes the largest prime factor of n with the convention \(P^+(1) = 1\).
Theorem 3.4
There exist absolute constants \(A, \eta >0\) with the following property. Let \(\sigma , f\in {\mathbb {Z}}\setminus \{0\}\), \(X, Z, y \ge 1\), \( \Delta , q,d\in {\mathbb {N}}\) with \((\sigma q, \Delta )=1\). Let G be a smooth function supported on [X, 2X] with \(G^{(\nu )}\ll _\nu (X/Z)^{-\nu }\). Suppose that G(n) vanishes unless \(\sigma n+f \) is in some dyadic interval \([\Xi , 2\Xi ]\) with \(|\sigma | X/Z\ll \Xi \ll |\sigma | X+|f|\). Then
where \(P = q |\sigma |\Delta (1+|\mu |)NZ(1 + |f|/|\sigma | X)\).
Proof
The set-up follows the lines of [FT1, Lemme 2.1] on averages of smooth numbers in arithmetic progressions to large moduli, which we also generalize (see Proposition 7.1). Throughout the proof, we use A to denote some absolute positive constant, not necessarily the same at each occurrence. For \(t \ge 1\) let \(g_t(n)\) be the characteristic function of the numbers n with \(P^+(n) \le t\). The starting point is Buchstab’s identity that we iterate. For \(y \ge 1\) we have
Thus
say, where
For \(S_0\), we apply [ABL, Corollary 7.6] to obtain
For \({\overline{S}}\), we localize \(p_i\) in intervals \((P_j, P_{j+1}]\) where \(P_j=y(1+{\mathcal {Z}})^j\) with some \({\mathcal {Z}}\le 1\) to be chosen later and \(0\le j\le J= 1+\lfloor \frac{\log (X/y)}{\log (1+{\mathcal {Z}})}\rfloor \). Then
To justify this, we see that \({\overline{S}}\) and the sum in (3.6) differ only by integers n that lie in \([X(1+{\mathcal {Z}})^{-4},X(1+{\mathcal {Z}})^4]\) or have at least two prime factors \(p_i,p_j\) with \(p_i<p_j<(1+{\mathcal {Z}})p_i\). The contribution from these integers can be bounded by
Since \(y<P_{i_1}\ll X^{1/4}\), we can apply Proposition 3.1 with \(\textbf{m}=p_1\) and \(\textbf{n}=|\sigma |p_2\cdots p_4 m\) to the main term in (3.6) by splitting \(\textbf{m},\textbf{n}\) into residue classes modulo q to obtain
upon choosing \({\mathcal {Z}} = \max (y^{-1/21}, X^{-1/144})\).
For \(S_k\), \(1 \le k \le 3\), we first replace the characteristic function of the primes with the von Mangoldt function \(\Lambda \). As above, we see that the contribution of higher prime powers is at most
which can be absorbed in the existing bounds. Now similar to the proof of Theorem 3.3, we use Heath-Brown’s identity (3.1) to decompose the prime variables. Again we split all variables into short intervals using the smooth partition (3.2) for some parameter \({\mathcal {Z}}\) to be chosen later. We first bound
where \(1\le r, s \le 12\), \( M\prod _{i=1}^r M_i \prod _{j=1}^s N_j\asymp X\) and \(M_i\ll X^{1/4}\). Again we distinguish two cases.
If there exists a subset \({\mathcal {I}} \subset \{M,M_i,N_j\}\) such that \(K:=\prod _{I\in {\mathcal {I}}}I\) satisfies \(X^{\eta _0}< K \le X^{1/4+\eta _0}\), then we can apply Proposition 3.1 to obtain
If no such \({\mathcal {I}}\) exists, then all the \(M_i\)’s and possibly some of the M, \(N_j\)’s can be combined into \(R\ll X^{\eta _0}\) and the number of M, \(N_j\)’s of size \(\gg X^{1/4+\eta _0}\) is at most three. So it is enough to bound
which by Proposition 3.2 is
for some absolute constant \(A > 0\). Then by the same argument as above we see that
and hence by choosing \({\mathcal {Z}}=X^{\eta _1}\) for some suitable \(\eta _1>0\) we obtain
for some \( \eta _2 > 0\). Combining (3.5), (3.7) and (3.8), we have shown that there exist some absolute constants \(A, \eta >0\) such that
On the other hand, when y is very small we can do better using the flexible factorization of smooth numbers to create some bilinear structure so that Proposition 3.1 can be applied directly. Recall [FT3, Lemme 3.1]: for any \(M \ge 1\), every \(n\ge M\) with \(P^+(n)\le y\) has a unique representation in the form
We can separate l and m in the condition \(P^+(l)\le P^-(m)\) using [Dr1, eq. (3.37)] with an acceptable error. Thus by the same argument as before, we can localize m into short intervals and apply Proposition 3.1 to obtain
by choosing \(M=X^{7/33}y^{-7/11}, {\mathcal {Z}}=X^{1/33}y^{-1/11}\). Applying (3.9) when \(y\ge X^{\eta }\) and (3.11) when \(y\le X^{\eta }\) for some suitable \(\eta >0\) we obtain the theorem.
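The factorization of smooth numbers recalled from [FT3, Lemme 3.1] can be illustrated with a small Python sketch. It assumes the representation is the greedy one: m collects the largest prime factors of n until the product first reaches M, and l is the cofactor, so that \(P^+(l)\le P^-(m)\) holds by construction. This normalisation and the function names `prime_factors_desc` and `split_smooth` are assumptions made for illustration and are not taken from the source.

```python
def prime_factors_desc(n):
    """Return the prime factors of n with multiplicity, largest first."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return sorted(factors, reverse=True)


def split_smooth(n, M):
    """Greedy split n = l * m (assumed normalisation, hypothetical helper):
    multiply the largest primes of n into m until m first reaches M."""
    if n < M:
        raise ValueError("need n >= M")
    m = 1
    for p in prime_factors_desc(n):
        if m >= M:
            break
        m *= p
    return n // m, m


# Example: 360 = 2^3 * 3^2 * 5 with M = 10.
l, m = split_smooth(360, 10)
print(l, m)
```

In this example the largest prime factor of l and the smallest prime factor of m are both 3, so the condition \(P^+(l)\le P^-(m)\) is met with equality.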
Remark
Arguing similarly as in the proofs of Theorems 3.3 and 3.4, we can replace \(\Lambda (n)\) or \(\mathbb {1}_{P^+(n)\le y}\) by \(\tau _k(n)\) (for fixed k) and obtain a power saving bound uniformly in all parameters. Moreover, the proof works also for a wide class of arithmetic functions which possess a similar combinatorial decomposition (cf. e.g. [DT, FT4] as well as [MT, section 5.1]), such as generalized divisor functions \(\tau _z(n)\) for any complex z and the indicator function of norm forms of abelian extensions of \({\mathbb {Q}}\).
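For orientation, the divisor function \(\tau _k(n)\) mentioned in the remark counts the ordered factorizations of n into k positive integers. The following Python sketch computes it by repeated Dirichlet convolution with the constant function 1; the name `tau_k` is a hypothetical helper, and the code is only a brute-force illustration for small n.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tau_k(n, k):
    """Number of ordered factorisations n = n_1 * ... * n_k in positive integers (k >= 1)."""
    if k == 1:
        return 1
    # tau_k = 1 * tau_{k-1}: sum tau_{k-1}(n/d) over the divisors d of n.
    return sum(tau_k(n // d, k - 1) for d in range(1, n + 1) if n % d == 0)


print(tau_k(12, 2), tau_k(4, 3))
```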
4 Proofs of Propositions 3.1 and 3.2
We still owe the proofs of Propositions 3.1 and 3.2. Recall that P is the product of the “unimportant” parameters which has slightly different meanings in Proposition 3.1 and Proposition 3.2. For notational simplicity we write
for any \(\varepsilon > 0\) and some \(A > 0\), not necessarily the same at each occurrence.
4.1 Proof of Proposition 3.1.
This is an extension of [ABL, Proposition 8.1] where we relax the condition \(\alpha _m\) supported on square-free numbers at the cost of slightly weaker bounds. Let \({\mathscr {B}}(\alpha , \beta )\) denote the m, n-sum we want to bound. We use the notation (4.1) with \(P=\Delta |\sigma |(1+|\mu |)N Z(1+|f|/MX)\). Note that we can assume without loss of generality that \(\sigma =\{\pm 1\}\) as otherwise we can set \({\tilde{\beta }}_n=\beta _n\mathbb {1}_{\sigma \mid n}\). When \(\alpha _m\) is supported on square-free numbers, then a straightforward extension of [ABL, Proposition 8.1] (replacing the F in the definition of set of moduli \({\mathcal {Q}}\) by \([N_1, D^2]\Delta _1\), with the notation \(\Delta =\Delta _1\Delta _2\) and \(\Delta _2=(\Delta , (m_3m_4)^\infty )\) to accommodate the new parameter \(\Delta \)) shows
To deduce the result for general sequences, we write
We choose a parameter \(L > 1\). By Hölder’s inequality, we see that the contribution from \(l\ge L\) gives
using the well-known bound for the fourth moment of Hecke eigenvalues
(The Dirichlet series \(\sum _n \lambda _\phi (n)^4 n^{-s}\) equals \(L(s, \text {sym}^2 \phi \,{\times }\, \text {sym}^2 \phi ) L(s, \text {sym}^2\phi )^2 \zeta (s)\) up to an Euler product that is absolutely convergent in \(\Re s > 1/2 + 2\theta \) where \(\theta \le 7/64\) is an admissible exponent for the Ramanujan-Petersson conjecture. Hence if \(C_\phi \) denotes the conductor of \(\phi \), then we conclude from standard bounds for L-functions at \(s=1\) [XLi] the second bound in the previous display.)
On the other hand, for \(l\le L\) we combine n and \(l^2\) to one variable and apply (4.2) with \({\tilde{\beta }}^{(l)}_n=\beta _{n/l^2}\mathbb {1}_{l^2\mid n}\) to obtain
With \(L = M^{2/7}\), we obtain
Here we can drop the middle term, because if it dominates the last term, then \(M > X^{1/2}\) in which case the claim is trivially true. This completes the proof.
4.2 Proof of Proposition 3.2.
This result has a precursor in [ABL, Theorem 2.5], but it turns out that here the extra congruence conditions modulo q and \(\Delta \) are somewhat subtle and not completely straightforward to implement due to well-known difficulties in the Voronoi summation formula for general moduli. We therefore modify the proof, the biggest difference being that we replace the delta-symbol method ([ABL, Lemma 7.3]) of Duke-Friedlander-Iwaniec with Jutila’s circle method (see Footnote 1).
Let us denote by \({\mathscr {S}}\) the k, l, m-sum that we want to bound and recall that P is the product of the “unimportant” parameters (keeping in mind that the definition of P differs slightly from the previous proof). Assume without loss of generality \(X_1 \ge X_2 \ge X_3\).
By Voronoi summation (cf. [ABL, Corollary 7.7]) it is easy to see that
We can also write
where the redundant function g(n) has support on \(\frac{1}{2}\Xi /\Delta \le n\le 3\Xi /\Delta \) and satisfies \(g(n)=1\) for \(\Xi /\Delta \le n\le 2\Xi /\Delta \) and \(g^{(\nu )}(n)\ll (\Xi /\Delta )^{-\nu }\) for all \(\nu \in {\mathbb {N}}_0\). We now employ Jutila’s circle method as in the corollary to [Ju, Lemma 1], i.e. we insert the constant function into the \(\alpha \)-integral and approximate it by a step function of small rational intervals. This device will simplify the argument especially with respect to the extra congruence condition on q. Let \(C\gg X^{1/2}\) be a large parameter and \({\mathcal {C}}\subset [C, 2C]\) be the set of moduli which are coprime to \(\Delta Nq\). Define
and let \(\delta \) be a real number such that \(C^{-2}\ll \delta \ll C^{-1}.\) Then we have (bounding the error term with Cauchy-Schwarz and Rankin–Selberg)
where
We now apply Voronoi summation [KMV, Theorem A.4] to the n-sum and note that by construction \((\Delta N, c) = 1\) (this flexibility is the main advantage of the present set-up). We obtain
for \((a, c) = 1\), some constant \(\xi \in S^1\) depending only on the cusp form \(\phi \) and functions \({\mathcal {G}}^{\pm }\) which by [ABL, Lemma 7.5] satisfy uniformly in \(|\alpha |\le \delta \) the bound
where \(\tau = 0\) if \(\phi \) is holomorphic and \(\tau = \Im t_{\phi }\) if \(\phi \) is Maaß with spectral parameter \(t_{\phi }\). We choose
With this choice we can truncate the n-sum at \(\preccurlyeq C^2/X\) and obtain
The factor \(e(\alpha rklm)\) can be incorporated into G at essentially no cost, since \(\alpha rklm \ll \delta X = 1\). When \(X_1\ll C\), we can bound the k, l, m-sum using [ABL, Lemma 7.2] to obtain
where \(c_{\square }\) is the squarefull part of c. We can now evaluate the sums over n (cf. [ABL, (7.1)]) and c (with Rankin’s trick, cf. two displays after [ABL, (8.3)]) getting
(recall that \(X_1 \ge X_2 \ge X_3\)). At this point we make the admissible choice \(C = X^{1/2 + 1/54}\) getting
Then the claim follows from (4.5) when \(X^{1/3} \le X_1 \le X^{2/5}\) and from (4.4) when \(X_1 \ge X^{2/5}\).
5 Averaged bilinear estimates
The main result of this section is an averaged bilinear estimate which will be used later to handle the error terms when applying sieve methods. Define
Note that \({\mathfrak {u}}_R(n; q, a, 1, *)\) equals \({\mathfrak {u}}_R(n{\bar{a}}, q)\) as defined in [ABL, Section 4]. Here we only use \(\psi \,(mod \,d)\) with \({\text {cond}}(\psi )\le R\) so that GRH can be avoided when estimating contributions from the second term in (5.1). However, we could not treat the contribution from each d individually as in [ABL, Proposition 4.1] and thus we make use of the sum over d as the following result indicates.
Proposition 5.1
Let \(M, N, Q, R\ge 1\), \(a_1 \in {\mathbb {Z}}\backslash \{0\}\), write \(x=MN\). Let \(c\in {\mathbb {N}}\), \(c_0 \in {\mathbb {Z}}\) with \((c_0, c) = 1\). Let \(\alpha _m\) and \(\beta _n\) be two sequences supported in \(m\in (M, 2\,M]\) and \(n\in (N, 2N]\) such that for some \(A \ge 1\) we have \(\alpha _m\le \tau (m)^A, \beta _n\le \tau (n)^A\), and suppose \(\lambda _d\ll \tau (d)^A\).
Let \(\eta > 0\) be any sufficiently small number. Then there exist \(\delta = \delta (\eta ) > 0\) and \(D = D(\eta , A)\) with the following property. If
then
We remark that the condition \((n,a_2)=1\) is important in this proposition for a few reasons: it is used in [Dr2, around eq. (5.11)] to reduce to the case of n being square-free, as well as in the modulo 1 reduction of the exponential phase in [Dr2, (5.24)]. More importantly, it is required in the application of the Kuznetsov formula when estimating sums of Kloosterman sums (e.g. [Dr2, the condition \((q,s)=1\) in Proposition 4.12]).
Proof
Following the proof of [ABL, Proposition 4.1], it is enough to prove for any smooth function \(\gamma :{\mathbb {R}}_+\rightarrow [0,1]\) with
that under the assumptions in Proposition 5.1 as well as the additional condition that \(\beta _n\) is supported on square-free integers we have
for \(D\ll x^\delta \). The proof of (5.2) is similar to that in [ABL, Proposition 4.2] with some modifications. Here we have a sum over d which will only influence the application of the large sieve inequalities, as we obtain power saving error terms in all other parts of the argument for each d. Note that we also have an additional condition \((q,d)=1\) compared to that in [ABL, Proposition 4.2], which can be incorporated following the proof. The sup over \(w\,(mod \,d)\) and \(a_3\mid d\) causes no issue as the main expressions will be independent of \(w, a_3\).
To start with, we also assume \(\beta _n\) is supported on \(n\equiv b_0\,(mod \,c)\) as in the second display of the proof of Proposition 4.2 of [ABL], which makes certain coprime conditions easier to verify when applying [ABL, Theorem 2.1] (and costs a factor c in the final bound).
We now apply the Cauchy-Schwarz inequality in d, m with the bounds for \(\lambda _d\) and \(\alpha _m\). After majorizing the summation over m with the help of a suitable smooth weight \(\alpha (m)\) we get
where (if \((n,qd)\ne 1\) the above sum is empty)
and
It is then enough to show (recall \(MN = x\))
We can evaluate \({\mathcal {S}}_3\) as in [Dr2, Section 5.3.1] so that the m-sum becomes
where \(W=[q_1,q_2]d\), and thus for \(D, R\le x^\delta \) and uniformly for \(|a_2|\le x^\delta \) we have
where
Note that \(X_3\) is independent of \(w,a_3\), and the error term is acceptable for (5.3) (in fact much better than required).
Similarly we can evaluate \({\mathcal {S}}_2\) as in [Dr2, Section 5.3.2] so that for \(D, R\le x^\delta \) and uniformly for \(|a_2|\le x^\delta \) we have
We next evaluate \({\mathcal {S}}_1(w, a_3)\). After Poisson summation in m, we get the expected main term as \({\hat{\alpha }}(0)X_1\), where
which is also independent of \(w, a_3\). The rest of the evaluation of \({\mathcal {S}}_1\) is similar to that in [ABL, Proposition 4.2] (with \(a_2\) replaced by \(a_2a_3\)). The condition \((q_i,a_2d)=1\) is equivalent to \((q_i, a_2a_3)=1\) and \((q_i, {\tilde{d}})=1\) where \({\tilde{d}}=d/(d, (a_2a_3c)^\infty )\). Thus the additional condition \((q_i,{\tilde{d}})=1\) can be incorporated by following the proof of [ABL, Proposition 4.2] using Möbius inversion. To be precise, we replace the condition \(\delta _i\mid a_1\) by the condition that \(\delta _i\mid a_1{\tilde{d}}\) so that the condition \((\delta _2, n_0a_2a_3c)=1\) is still satisfied when applying [ABL, Theorem 2.3] since \((n_i,d)=1\). In conclusion, we have for \(\delta =\delta (\eta ),\kappa =\kappa (\eta )\) small enough, \(D, R\le x^\delta \) and uniformly for \(|a_2|\le x^\delta \) that
It remains to show that
(Note that \(X_1, X_3\) are independent of \(w, a_3\).) The proof follows along the lines of [Dr2, Section 5.6]. We have
Before we can estimate this with the large-sieve, we need to uncouple the variables. To do so, we write \(\text {cond}(\chi )=q\ell \) and detect the conditions \((n_i,q_id)=1\) by Möbius inversion. We thus estimate the above
By Cauchy-Schwarz and the symmetry between \(n_1\) and \(n_2\), we arrive at
By the large sieve inequality for Dirichlet characters ([Dr2, Lemma 3.3]) and partial summation, the contribution from \(q>R^{1/3}\) can be bounded by
Similarly the contribution from \(\ell \ge R^{2/3}, q\le R^{1/3}\) can be bounded by
Therefore, we have
if \(D\ll N/R^{1/3}\). Since \(N\gg x^\eta \), we conclude that for \(\delta \) small enough (in terms of \(\eta \)) equation (5.3) holds and so (5.2) follows. This completes the proof of Proposition 5.1.
6 Proofs of Theorems 1.1 and 1.2
6.1 Primes in arithmetic progressions.
Before we start with the analysis of the main term, we give a result on averaged equidistribution of primes in long arithmetic progressions, which will be used to handle the error terms in the sieving process.
Proposition 6.1
There exists some absolute constant \(\varpi >0\) such that the following holds. Let \(x\ge 2\), \(c_0,c,d, C\in {\mathbb {N}}, (c_0, c)=1, a_1,a_2\in {\mathbb {Z}}{\setminus } \{0\}\) such that for
we have
Proof
This is the analogue of [ABL, Theorem 2.1] and can be proved in the exact same way as at the end of [ABL, Section 4]. We only need to replace [ABL, Proposition 4.1] with Proposition 5.1 with the choice \(R=(\log x)^B\) for some large B depending on C, A. The contribution from \(\tau _3\)-type sums is negligible by choosing \(\varpi \) small enough since [BFI, Lemma 2] saves a small power of x. Using \(\sum _{d\le x^\kappa } \lambda _d/\phi (d)\ll (\log x)^{O(1)}\) and the Siegel-Walfisz Theorem we also have that the contribution from \(\chi \,(mod \,qd)\) with \({\text {cond}}(\chi )\le R \) can be bounded by \(x R (\log x)^{O(1)}\exp (-b\sqrt{\log x})\), which is negligible by the choice of R.
For future reference we remark that we can replace the von Mangoldt function by the characteristic function on primes.
6.2 Preparing for the sieve.
In this section, we use \(\kappa \) to denote a sufficiently small (depending on \(\varpi \) in Proposition 6.1 and \(A, \eta \) as in Theorem 3.3) positive constant.
We use \(\varepsilon \) to denote an arbitrarily small positive constant.
Let \({{\textbf {d}}} = (d_1, d_2)\) denote a pair of two square-free numbers with \(d_1, d_2 \le n^{\kappa }\). In preparation for a sieve we define
(where of course p denotes a prime). As usual, we denote by \(c_p(n)\) the Ramanujan sum. The key input for the sieve is the following:
Proposition 6.2
With the above notation, there exists some absolute constant \(\kappa >0\) such that uniformly for \(d_1, d_2\le n^{\kappa }\) and \(\lambda _{{\textbf {d}}}\ll (\tau (d_1d_2))^C\) we have
where the singular series \({\mathfrak {S}}_{{{\textbf {d}}}} (n)\) is given by
Remark
One can check that this matches the main term coming from a formal application of the circle method. In particular, we see that \({\mathfrak {S}}_{{{\textbf {d}}}} (n) = 0\) if \((d_1, d_2, n) > 1\) as it should.
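As a reminder of the notation \(c_p(n)\) used above, the following Python sketch evaluates a Ramanujan sum directly from its definition as an exponential sum over reduced residues. The function name `ramanujan_sum` is a hypothetical helper, and floating-point evaluation followed by rounding is an implementation choice for small moduli, not part of the source. For a prime modulus p the value is \(p-1\) if \(p\mid n\) and \(-1\) otherwise.

```python
import cmath
from math import gcd


def ramanujan_sum(q, n):
    """c_q(n): sum of exp(2*pi*i*a*n/q) over 1 <= a <= q with gcd(a, q) = 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * a * n / q)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )
    # The sum is a real integer; rounding removes floating-point noise.
    return round(total.real)


print(ramanujan_sum(5, 10), ramanujan_sum(5, 3))
```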
Proof
Let \(\delta =(d_1, d_2), d_1'=d_1/\delta , d_2'=d_2/\delta \) as in Lemma 2.2. By Lemma 2.2 and Theorem 3.3 (with \(\kappa \) sufficiently small depending on the constants \(A, \eta \)) we have
where \({\mathcal {G}}_{\delta _1'\delta _2'} \subseteq ({\mathbb {Z}}/\delta _1'\delta _2'{\mathbb {Z}})^{*}\) is as in (2.8) and \(r = {{\textbf {1}}} *\chi _{-4}\) as before. From (2.1) we obtain
in all cases, including \(\delta _1'\delta _2' = 1\). Note that we have
We apply Dirichlet’s hyperbola method to the r-function so that, for some parameter \(Y > 0\), the innermost p-sum in (6.3) can be written as
say, where
We choose \(Y=n^{1-3\kappa }\) so that the error term is acceptable after summing over \(d_1,d_2\le n^\kappa \). We will evaluate \(S_1\) and \(S_2\) on average over \(d_1,d_2\) by Proposition 6.1. This requires some preparation. It turns out that the contribution from \(S_1\) will not appear in the main term \({\text {Li}}(n){\mathfrak {S}}_{\textbf {d}}(n)\), but as there are some complications from the powers of two, we start with a detailed treatment for \(S_1\).
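The splitting of the r-function can be checked numerically on small inputs. The Python sketch below is not the decomposition into \(S_1\) and \(S_2\) itself, which carries additional congruence conditions; it only verifies the underlying identity that the divisor sum \(r(m)=\sum _{ab=m}\chi _{-4}(a)\) splits into the part with \(a\le Y\) and the part with \(b< m/Y\). The helper names `chi4`, `r` and `r_split` are hypothetical.

```python
def chi4(a):
    """The non-trivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[a % 4]


def r(m):
    """r = 1 * chi_{-4}: sum of chi_{-4}(a) over the divisors a of m."""
    return sum(chi4(a) for a in range(1, m + 1) if m % a == 0)


def r_split(m, Y):
    """Split the sum over ab = m into a <= Y and a > Y (that is, b < m / Y)."""
    small_a = sum(chi4(a) for a in range(1, m + 1) if m % a == 0 and a <= Y)
    small_b = sum(chi4(m // b) for b in range(1, m + 1) if m % b == 0 and b * Y < m)
    return small_a, small_b


# The two pieces always recombine to r(m), whatever the cut-off Y > 0.
print(r(25), r_split(25, 4))
```

Since \(4r(m)\) counts the representations of m as a sum of two integer squares, the value \(r(25)=3\) corresponds to the twelve lattice points on the circle of radius 5.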
We use the notation
We see that \(S_1\) equals
Here we have used the fact that if \(2\mid \delta _1'\delta _2'\) then we must have \(a_2r_2=1\) and \((w,2)=1\). The contribution when one of \(a_2,a_d\) is at least \(n^{4\kappa }\) can be bounded by
which is negligible after summing over \(d_1,d_2\le n^\kappa \). So from now on we restrict to \( a_2,a_d\le n^{4\kappa }\) (note that \(r_2, d', d\) are automatically at most \(n^{2\kappa }\)). We see that the a, p-sum equals
where the error term comes from the artificially added condition \((a, n) = 1\). For the same reason we can restrict to \((n, d) =(n-w,d')=1\). After combining the last three congruence conditions modulo \(4a_2r_2^2d'a_dd^2\) to a single condition and noting \(d_1d_2\mid 4a_2r_2^2d'a_dd^2\) and \(\lambda _{\textbf {d}}\ll \tau (d_1d_2)^C\), we are in a position to apply Proposition 6.1 (since \(\kappa \) is sufficiently small in terms of \(\varpi \)) together with the prime number theorem to the two innermost sums and recast the previous display as
where the error terms \(r_{\textbf {d}}\) satisfy
Plugging back, we obtain
But now the v-sum vanishes: with \(h = a_2r_2^2 \mid 2^{\infty }\) we have
Since \(\chi _{-4}\) is primitive, by [IK, (3.9)] the inner sum vanishes, unless \(4 \mid \ell /(\ell , h)\) (which can only happen if \(h=1\)), but then the Möbius function vanishes. We conclude the contribution from \(S_1\) in (6.1) is acceptable.
We now turn to \(S_2\) where the calculation is similar, and we obtain
As before we restrict to \(b_d\le n^{4\kappa }\) at the cost of an error of \(O(x^{1-\kappa })\) (after summing over \(d_1,d_2\le n^\kappa \)). The contribution from \((\delta \delta _1\delta _2,n)>1\) is also negligible, so let us from now on assume \((\delta \delta _1\delta _2,n)=1\). We cannot apply Proposition 6.1 to evaluate the b, p-sum on average over \(d_1,d_2\) directly due to the dependence of the length of the intervals. To remedy this, we split the p-sum into intervals of the form \(((1 - \Delta )P, P]\) with \(\Delta = (\log n)^{-K}\) for some sufficiently large \(K=K(C)\). The condition \(p\le n-bb_d\sqrt{n}\delta ^2\delta _1^2\delta _2^2\) interferes with at most one such interval, whose contribution we estimate by the Brun-Titchmarsh inequality at the cost of an admissible error \(O(n \Delta (\log x)^{O(1)})\) which is negligible by choosing K large enough in terms of C. For all other intervals we can split the b-sum into intervals of the shape \(((1-\Delta )P,P]\). The condition \(b\le (n-Y)/\sqrt{n}(\delta \delta _1\delta _2)^2b_d\) interferes with one such interval, whose contribution we can bound by \(O(nR(\log x)^{O(1)}\Delta )\) using the trivial bound \({\mathfrak {u}}_R(n, q,a,d,w)\ll \frac{R\tau (q)\tau (d)}{\phi (q)\phi (d)}\). Here R is chosen to be some large power of \(\log x\) in the proof of Proposition 6.1 and so the error term from separating the b-variable is negligible when K is sufficiently large. For the rest of the intervals, we can apply Proposition 6.1. Assembling the main term, we recast \(S_2\) as
where the error terms \(r'_{\textbf {d}}\) satisfy
As a last step, we compute the b-sum in the main term in \(S_2\) using [ABL, Lemma 5.2] and complete the sum over \(b_d\), and recast the previous display up to an admissible error as
where
This finishes the analysis of \(S_2\).
We now return to (6.3) and summarize that there exists some absolute constant \(\kappa >0\) such that
where the singular series \({\mathfrak {S}}_{{{\textbf {d}}}}(n)\) is given by
The sum over \(w\,(mod \,\delta _1'\delta _2')\) can be computed using (2.8) as
It is now a straightforward exercise with Euler products, noting that \(\chi _{p^{*}}(n) = (n/p)\) for odd p and \(L(1, \chi _{-4}) = \pi /4\), to obtain the expression (6.2). This completes the proof.
6.3 Completion of the proof of Theorem 1.1.
With Proposition 6.2 available, Theorem 1.1 now follows easily from an application of a sieve. Let \({\mathcal {A}}=(a_k)\) be a finite sequence of non-negative real numbers. We introduce the notation
With these notations, we recall the following lemma from sieve theory (see e.g. [Iw2, Theorem 1]).
Lemma 6.3
Let \(\gamma , L>0\) be some fixed constants. Let \(\Omega (c)\) be a multiplicative function satisfying \(0\le \Omega (p)<p\) and
for all \(2\le w\le w'\). Then we have
where \(s= \log {\mathcal {D}}/\log z\), \(Q(s)< \exp (-s\log s+s\log \log 3s+O(s))\), and \(f_\gamma (s)\) is some continuous function such that \(0< f_\gamma (s)<1\) and \(f_\gamma (s)=1+O(e^{-s})\) as \(s\rightarrow \infty \).
We are going to sieve the sequence \({\mathcal {A}}={\mathcal {A}}(n)=(a_k)\) where \(a_k=\#\{k=x_1x_2: p+x_1^2+x_2^2=n\}\). Then from Proposition 6.2, there exists some \(\kappa >0\) such that for \(c\le n^{\kappa }\) with \(\mu ^2(c)=1\) we have
where
and
for any \(K>0\). We next examine the multiplicative function \(\omega _\nu (\bullet ;n)\). For odd p we compute explicitly
and
and for \(p=2\) we have
Let \(\Omega (\bullet ;n)\) be the multiplicative function defined by
so that we have from (6.7) and (6.8) that for any \({\mathcal {D}}\le n^{\kappa }\)
where \(A={\mathfrak {S}}(n){\text {Li}}(n)\). Note that we have
Thus there exists some absolute constant L such that (6.6) holds for \(\gamma =2\). We choose \({\mathcal {D}}=n^{\kappa _0}\) for some absolute small constant \(\kappa _0\) (e.g. \(\kappa _0=\kappa /3\)) and choose \(C=C(\kappa _0)\) sufficiently large such that \(f_2(C\kappa _0)>0\) holds. Since \(\Omega (2;n)=2\) if and only if \(n\equiv 0 \,(mod \,2)\) and \(\Omega (3;n)=3\) if and only if \(n\equiv 2\,(mod \,3)\), it follows that \(0\le \Omega (p)<p\) if \(n\equiv 1,3 \,(mod \,6)\). Then, on taking \(s=C\kappa _0,\) it follows from Lemma 6.3 that for \(n\equiv 1,3\,(mod \,6)\) we have
This completes the proof of Theorem 1.1.
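To make the sifted sequence concrete, the following Python sketch tabulates \(a_k=\#\{k=x_1x_2: p+x_1^2+x_2^2=n\}\) by brute force for a small n. Restricting to positive \(x_1, x_2\) is an assumption of this sketch, and the names `is_prime` and `sifting_sequence` are hypothetical; the code only illustrates the definition and plays no role in the argument.

```python
from collections import Counter


def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def sifting_sequence(n):
    """Return a Counter k -> a_k, where a_k counts pairs (x1, x2) of positive
    integers (assumed sign convention) with x1 * x2 = k and n - x1^2 - x2^2 prime."""
    a = Counter()
    x1 = 1
    while x1 * x1 < n:
        x2 = 1
        while x1 * x1 + x2 * x2 < n:
            if is_prime(n - x1 * x1 - x2 * x2):
                a[x1 * x2] += 1
            x2 += 1
        x1 += 1
    return a


print(dict(sifting_sequence(15)))
```

For \(n=15\), which lies in the residue class \(3 \,(mod \,6)\), the non-zero terms are \(a_1=1\), \(a_3=2\), \(a_4=1\) and \(a_6=2\), coming for instance from \(15=13+1^2+1^2\) and \(15=2+2^2+3^2\).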
6.4 Proof of Theorem 1.2.
Denote
Then we have
and thus it is enough to prove
for some sufficiently large C. We can follow the proof of Theorem 1.1 to obtain (6.9). To be more precise, let \(B_c=B_c(x)=\sum _{k\equiv 0 \,(mod \,c)}b_k\). Then for \(\mu ^2(c)=1\) we have
where
After an application of Lemma 2.1 and Theorem 3.3, we see that uniformly for \(d_1,d_2\le x^\kappa \) with \(\kappa \) sufficiently small we have
Then we can follow the proof of Proposition 6.2 to evaluate the main term and obtain that there exists some absolute constant \(\kappa >0\) such that uniformly for \(d_1, d_2\le n^\kappa \) and \(\lambda _{\textbf {d}}\ll \tau (d_1d_2)^{C'}\) we have
where \({\mathfrak {S}}_{\textbf {d}}(n)\) is as in (6.2). With (6.10), we can follow the proof and notation in Subsection 6.3 with n replaced by 1 in the singular series calculations to see that
for any \(A>0\). Note that we have \(\Omega (p;1)<p\) for all p and \({\mathfrak {S}}(1)>0\). Lemma 6.3 together with (6.4) then yields (6.9) for C sufficiently large. For the second claim in Theorem 1.2, we write
Since \( R_C(n)\le r(n-1)\ll n^\epsilon , \) we see that
which shows that there are infinitely many primes that can be written as one plus the sum of two squares of integers having no more than C/2 prime factors.
7 Proof of Theorem 1.3
The rest of the paper is devoted to the proof of Theorem 1.3. The basic idea is similar to the proof of Theorem 1.1, but it is technically and structurally substantially more involved. Again we start with the distribution of smooth numbers in arithmetic progressions to large moduli.
7.1 Smooth numbers in arithmetic progressions.
Denote
We also use the usual notation
For an integer n we set \(n_y=\prod _{\begin{array}{c} p^\nu \Vert n\\ p\le y \end{array}}p^{\nu }\) and \(n_o=\frac{n}{(n,2^\infty )}\). We have the following analogue of Proposition 6.1 for smooth numbers.
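The two arithmetic normalisations just introduced are simple to compute. The following Python sketch evaluates the y-smooth part \(n_y\) and the odd part \(n_o\) by trial division; the function names `smooth_part` and `odd_part` are hypothetical, and restricting to positive n is an assumption of this sketch.

```python
def smooth_part(n, y):
    """n_y: the product of the prime powers p^nu exactly dividing n with p <= y."""
    if n < 1:
        raise ValueError("this sketch assumes n >= 1")
    result = 1
    p = 2
    while p <= y and n > 1:
        # Composite p never divide here: their prime factors were removed earlier.
        while n % p == 0:
            result *= p
            n //= p
        p += 1
    return result


def odd_part(n):
    """n_o = n / (n, 2^infinity): remove every factor of 2 from n."""
    if n < 1:
        raise ValueError("this sketch assumes n >= 1")
    while n % 2 == 0:
        n //= 2
    return n


# Example: 360 = 2^3 * 3^2 * 5.
print(smooth_part(360, 3), odd_part(360))
```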
Proposition 7.1
Suppose \(|\lambda _d|\ll \tau (d)^C\) for some fixed \(C>0\). Then there exist some constants \(\varpi , \delta , D >0\) such that the following holds. Let \(x\ge 2\), \(c_0,c,d\in {\mathbb {N}}, (c_0, c)=1, a_1,\ell \in {\mathbb {Z}}{\setminus } \{0\}\) such that for
we have
for any \(A>0\).
Remark
One of the important features of Proposition 7.1 is that we make the dependence on \(\ell \) explicit. This is necessary for our application since \(\Psi _\ell (x,y)\asymp \prod _{p\mid \ell _y}(1-p^{-\alpha })\Psi (x,y)\) with \(\alpha = \alpha (x, y)\) as in (1.6), which could be much smaller than \(\Psi (x,y)(\log x)^{-A}\) when y is small. Furthermore, the condition \(a_2\mid \ell \) ensures that \((n,a_2)=1\) in an application of Proposition 5.1.
As a preparation for the proof we recall the following result that is a variation of the result in [Ha, Section 3.3] or [Dr1, Lemme 5].
Lemma 7.2
Suppose \(|\lambda _q|\ll \tau (q)^C\) for some fixed \(C>0\) and \(\ell \in {\mathbb {Z}}{\setminus }\{0\}\). There exist some constants \(D, \eta , \delta >0\) such that the following is true. If \((\log x)^D\le y\le x\), \(q\le Q\le x\) and \(\omega (\ell _y)\ll \log x\), we have
for any \(A>0\).
Proof
When \(\lambda _q=1\), \(\ell =1\), this can be found in [Ha, Section 3.3]. For general coefficients \(\lambda _q\) we can follow the same argument using that
We now deal with general values for \(\ell \). We recall [lBT, Theorem 2.4, Lemme 3.1, eq (2.8)]: we have uniformly in \({\mathcal {L}}(x) \le y \le x\), \(1\le d \le x/y\), \(P^+(\ell )\le y\) and \(\omega (\ell )\le \sqrt{y}\) the estimate
where \(\alpha \) is as in (1.6) satisfying
and \(\xi (t)\) is defined implicitly by
It is easy to see that \(\xi (u) = \log (u \log u) + O(1)\). When \(\exp ((\log x)^{2/5})< y\le x\), we have
which implies that \(\Psi _\ell (x,y)\asymp \Psi (x,y)\). Thus, assuming without loss of generality that \(p\mid \ell \Rightarrow p\le y\), and after using Möbius inversion to detect the condition \((m, \ell )=1\), it is enough to prove that
The contribution from \(d\ge x^{c_0}\) can be bounded trivially by \(x^{1-c_0+\eta +\epsilon }\). The contribution from \(d\le x^{c_0}\) can be bounded individually using the result for \(\ell =1\) (and adjusting the constant \(\eta \)) so that, together with (7.1), the left-hand side of (7.3) can be bounded by
where we used the assumption on \(\ell \) and the range of y in the last step.
It remains to incorporate the condition \((m, \ell ) = 1\) for \((\log x)^D \le y \le \exp ((\log x)^{2/5})\). To this end, we first generalize [Ha, Proposition 1, Theorem 3] with \(\Psi (x,y)\) replaced by \(\Psi _\ell (x,y)\). This can be done by following the proof of [Ha, Proposition 1, Theorem 3] and replacing \(L(s, \chi ;y)=\prod _{p\le y}(1-\chi (p)p^{-s})^{-1}\) by
Using \(\omega (\ell _y)\ll \log x\) we see that the contribution from \(\sum _{(n, \ell )>1}\Lambda (n)\chi (n)n^{-\sigma }\) can be bounded by
which is \(o(\log x)\) when \(\sigma \in [\alpha -1/300, \alpha ]\) using (7.2). Thus we still have the bound
which is enough to prove the analogue of [Ha, Theorem 3] with \(\Psi (x,y)\) replaced by \(\Psi _\ell (x,y)\) when the summand is restricted to \((m, \ell )=1\). We can also replace [Ha, Smooth Numbers Result 1, 3] by [lBT, Theorem 2.1, 2.4]. Then we can follow the proof in [Ha, Section 3.3] in the case \(\ell =1\) and replace \(\Psi (x,y)\) by \(\Psi _\ell (x,y)\) so that condition \((m,\ell )=1\) is preserved. On the way we need the estimate
which was proved in [Ha, pp. 16-17] when \(\ell =1\). For general \(\ell \), we simply note that the contribution from \(p\mid \ell \) can be trivially bounded by
so that (7.4) still holds since \(u\gg (\log _2x)^2\) by our current assumption on y.
Proof of Proposition 7.1:
The proof follows by combining the proofs of [FT1, Lemme 2.1] and [ABL, Theorem 2.1]. We use \(\varepsilon \) as an arbitrary positive constant, not necessarily the same at each occurrence. Let \(R=x^\eta \) for some sufficiently small \(\eta \) to be determined later. We use the notation \({\mathfrak {u}}_R(n;q, a,d, w)\) as defined in (5.1) to write
where
For \({\mathcal {S}}_1\), we use Lemma 7.2 and \((q,d)=1\) to conclude that for \(\eta \) small enough,
for some large constant \(A'\) depending on \(y, \delta \). Here we used (7.1), \(y\ge (\log x)^D\) and \(\alpha \ge 1-1/D+o(1)\) in the last step. Using ideas as in the proof of Theorem 3.4, we now complete the argument by showing that there exist some \(\varpi , \delta _0>0\) such that
To prove (7.5), we consider separately the cases \(y\ge x^\eta \) and \(y< x^\eta \) for some sufficiently small \(\eta >0\).
When \(y\ge x^\eta \), we use Buchstab’s identity (3.4) to write
where
Consider \({\overline{S}}\) first. Note that we can assume \(y< p_1\ll x^{1/4}\) since otherwise the sum is empty. The condition \(\mu ^2(m_op_1\cdots p_4)=1\) is equivalent to \(\mu ^2(m_o)=(m, p_1\cdots p_4)=1\). We use Möbius inversion to detect \((m, p_1\cdots p_4)=1\). Since we have that the primes \(p_i>y\ge x^\eta \), the contribution from \(k\mid p_1\cdots p_4\) with \(k<x^{\eta /2}\) comes only from the term \(k=1\). Similar to the proof of Theorem 3.4, we localize m and \(p_i\) into intervals \(m\in (M, M(1+{\mathcal {Z}})]\) and \(p_i\in (P_j, P_{j+1}]\) with \(P_j=y(1+{\mathcal {Z}})^j\) and with \({\mathcal {Z}}=R^{-\eta _0}\) for some \(\eta _0>0\) so that
where
If \(M \prod _{i=1}^4P_{j_i}\le x^{1-\eta /2}\), we can bound \(\tilde{{\mathcal {S}}}\ll x^{1-\eta /2+\varepsilon }\) trivially. Otherwise, we can apply the bilinear estimate Proposition 5.1 with \({\varvec{m}}=p_1\) to \(\tilde{{\mathcal {S}}}\) since \(x^{\eta }\le y\le P_{j_1}\ll x^{1/4}\ll (M\prod _{i=1}^4P_{j_i})^{1/4+\eta }\). We obtain
as long as
for some constant \(\varpi _0>0\). This suffices for the bound in (7.5) by choosing a suitable \(\eta _0\).
For \(S_k\), \(1\le k\le 3\), we can use Möbius inversion to detect \((m, p_1\cdots p_j)=1\) as before, replace the characteristic function of the primes by the von Mangoldt function, apply Heath-Brown’s identity (3.1) to each prime variable \(p_i\) and then localize the new variables so that we have
where
with
As before, we can assume \(M\prod _{i=1}^rM_i \prod _{j=1}^s N_j\ge x^{1-\eta /2}\) since otherwise we can bound \(\dot{{\mathcal {S}}}\ll x^{1-\eta /2+\varepsilon }\) trivially. Now we argue similarly as in the proof of Theorem 3.4. If there is a subset \({\mathcal {I}} \subseteq \{M, M_i, N_j\}\) such that \(K:=\prod _{I\in {\mathcal {I}}}I\) satisfies \(x^{\eta }\le K \le x^{1/4+\eta }\), then we can apply Proposition 5.1 to obtain
as long as (7.6) is satisfied. Otherwise, we can use Möbius inversion to detect \(\mu ^2(m_o)=1\) if necessary, and it is enough to show there exists some constant \(\varpi >0\) such that
for \(r\le x^{\eta }, x^{1-\eta /2}\ll rMKL\le x\) and \(\alpha _r\ll r^\varepsilon \). After an application of Möbius inversion to detect the condition \((mkl,\ell )=1\), we are left with
for \(r\le x^{\eta }, x^{1-\eta /2}\ll r\lambda MKL\le x\) and \(\alpha _r\ll r^\varepsilon \). The contribution from \(\lambda >x^{\eta '}\) can be bounded trivially by \(x^{1-\eta '/2}\). The contribution for any fixed \(\lambda < x^{\eta '}\) can be rewritten and then estimated in the same way as in [ABL, eq. (4.5)], where ultimately Deligne’s estimates for exponential sums over algebraic varieties over finite fields are used. By choosing \({\mathcal {Z}}, \eta '\) suitably, we see that there exist some \(\varpi , \delta _0>0\) such that \(S_k\ll cx(\log x)^{O(1)}R^{-\delta _0}\) for \(1\le k\le 3\).
We can also treat \(S_0\) by (7.7), which completes the proof of (7.5) when \(y\ge x^\eta \).
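The Möbius inversion step used repeatedly above to detect coprimality conditions such as \((m, p_1\cdots p_4)=1\) rests on the identity \(\sum _{k\mid (m,q)}\mu (k)=\mathbb {1}_{(m,q)=1}\). The following Python sketch checks this identity on small inputs; the function names `mobius` and `coprime_indicator` are hypothetical, and the code is only a numerical illustration.

```python
from math import gcd


def mobius(k):
    """Möbius function mu(k), computed by trial division."""
    result = 1
    p = 2
    while p * p <= k:
        if k % p == 0:
            k //= p
            if k % p == 0:
                return 0  # a square divides the original k
            result = -result
        p += 1
    if k > 1:
        result = -result  # one remaining prime factor
    return result


def coprime_indicator(m, q):
    """Sum of mu(k) over k dividing gcd(m, q): 1 if gcd(m, q) = 1, else 0."""
    g = gcd(m, q)
    return sum(mobius(k) for k in range(1, g + 1) if g % k == 0)


print(coprime_indicator(35, 12), coprime_indicator(35, 10))
```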
When \(y\le x^\eta \), we can apply [FT3, Lemme 3.1] in the form of (3.10) so that it is enough to show that there exist absolute constants \(\varpi , \delta _0>0\) such that
Since \(P^+(l)\le P^-(m)\) we see that
After separating variables l and m as in [Dr1, eq. (3.37)] and localizing l, m into short intervals \(l\asymp L\) and \(m\asymp M\), we can apply Proposition 5.1 to the first sum above to obtain (7.8), as long as \(x^{\eta _1}\le M\) and \(My\le (LM)^{1/4+\eta _1}\), which can be satisfied by choosing \(M=x^{\eta _2}\) for some suitable \(\eta _2>0\) when \(LM\ge x^{1-\eta _3}\). If \(LM\le x^{1-\eta _3}\), we can obtain (7.8) trivially. For the second sum, we move the summation over \(p\le y\) outside so that the contribution from \(p\le x^{\eta _4}\) can be estimated using Proposition 5.1 similarly as above and the contribution from \(y\ge x^{\eta _4}\) can be bounded trivially. This completes the proof of (7.5) when \(y\le x^\eta \).
7.2 Implementing sieve weights.
The most straightforward approach to Theorem 1.3 would be to use Proposition 7.1 to evaluate
for square-free numbers \(d_1, d_2\) and \(y = g(n)\), and then apply a sieve. Unfortunately the main term in the asymptotic formula is not a multiplicative function of \(d_1d_2\). As mentioned in the introduction, this makes it problematic to implement a sieve, so instead we will work with the sieve weights directly. We recall [Iw3, Lemma 3].
Lemma 7.3
Let \(\gamma , L>0\) be some fixed constants. Then there exists a sequence \(\lambda _c^-\) supported on square-free integers less than \({\mathcal {D}}\) such that
and that for all multiplicative functions w satisfying \(0\le w(p)<1\) and
for all \(2\le W\le W'\) we have
where \(f_\gamma (s)\) is some continuous function such that \(0< f_\gamma (s)<1\) and \(f_\gamma (s)=1+O(e^{-s})\) as \(s\rightarrow \infty \).
The idea is to estimate the sieve weights \(\lambda _c^-\) with multiplicative factors w(c) first to get a lower bound using Lemma 7.3, and then sum over the smooth numbers m. In order to avoid major technical difficulties, we need to prepare the set-up very carefully.
- Many estimates are sensitive to the prime factorization of m, and it simplifies our life to restrict m to be square-free. On the other hand, if \(n \equiv 2\) (mod 4), we will need \(4\mid m\) in order to represent n by m and two odd integer squares. Thus we only restrict m to have an odd square-free part. This restriction will not change the order of the number of solutions, but the fact that square-free numbers are not equidistributed among all residue classes will lead to different constants for different n.
- It turns out that a factor \(\chi _{-4}(h)\) with \(h\mid (n,m)\) appears in the main term. It is convenient to force \(\chi _{-4}(h)>0\) (so that a lower bound suffices), and thus we also restrict m to be coprime to all primes \(p \equiv 3\) (mod 4) dividing n. Since \(n-m\) is a sum of two squares, this is a harmless maneuver and affects only square factors, but it makes an important difference for the computation if we implement this. Note that we could simplify the computations by restricting m to be coprime to n, but this could lose some order of magnitude for certain cases of n, e.g. when n is the product of the first few primes that are \(1 \,(mod \,4)\).
- Finally, the sieve weights behave a bit erratically for small primes, so we sieve them out directly by Möbius inversion.
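Two of the congruence facts behind these restrictions can be checked by brute force. The sketch below is illustrative only; its function names and search ranges are our own choices and do not appear in the paper. It verifies that \(n\equiv 2\) (mod 4) together with odd \(x_1, x_2\) forces \(4\mid m\) for \(m=n-x_1^2-x_2^2\), and that a prime \(p\equiv 3\) (mod 4) dividing \(x_1^2+x_2^2\) must divide both \(x_1\) and \(x_2\). The second fact gives \(p^2\mid n-m\) whenever such a prime divides n and m, which is why excluding these primes from m only affects square factors.

```python
def check_four_divides_m(n_max: int = 400) -> bool:
    """For n = 2 (mod 4) and odd x1, x2, check that m = n - x1^2 - x2^2 is divisible by 4."""
    for n in range(2, n_max, 4):
        for x1 in range(1, 40, 2):
            for x2 in range(1, 40, 2):
                m = n - x1 * x1 - x2 * x2  # may be negative; divisibility is unaffected
                if m % 4 != 0:
                    return False
    return True

def check_inert_prime_divides_both(p: int, bound: int = 60) -> bool:
    """For a prime p = 3 (mod 4): p | x1^2 + x2^2 forces p | x1 and p | x2."""
    assert p % 4 == 3
    for x1 in range(-bound, bound + 1):
        for x2 in range(-bound, bound + 1):
            if (x1 * x1 + x2 * x2) % p == 0 and (x1 % p != 0 or x2 % p != 0):
                return False
    return True
```

The first check reflects that an odd square is \(1\) modulo 8, so a sum of two odd squares is \(2\) modulo 8.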
With these general remarks in mind, we fix the following notation. We write
Recall the notation (6.5). Let
Then for c square-free, we have
Let \({\mathcal {Q}}\) be some fixed absolute constant (we will later choose \({\mathcal {Q}} = 30\)). Let \(\lambda _c^-\) be some lower bound sieve weights as in Lemma 7.3 of dimension \(\gamma \), level \({\mathcal {D}}=n^{\kappa _0}\) for some sufficiently small \(\kappa _0\) and \(z=n^{1/C}\) for some sufficiently large C so that \(f_\gamma (C\kappa _0)>0\). Then we have the following lower bound for \( S({\mathcal {A}},z)\):
Our next goal is to evaluate the innermost sum using Lemma 2.2.
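To illustrate what lower bound sieve weights do, here is a minimal sketch using a different and much cruder construction than the one in Lemma 7.3: Brun's pure sieve, where \(\lambda _d^-=\mu (d)\) for square-free d with at most r prime factors (r odd) and \(\lambda _d^-=0\) otherwise. By the Bonferroni inequalities, \(\sum _{d\mid n}\lambda _d^-\) never exceeds the indicator that n has no prime factor in the sieving range. The set of sieving primes and all names below are assumptions made for the example.

```python
from itertools import combinations

SIEVE_PRIMES = (2, 3, 5, 7, 11, 13)  # illustrative sieving range, not from the paper

def truncated_weight_sum(n: int, max_factors: int) -> int:
    """Sum of mu(d) over square-free d | n built from SIEVE_PRIMES with at most max_factors primes."""
    dividing = [p for p in SIEVE_PRIMES if n % p == 0]
    total = 0
    for k in range(0, min(max_factors, len(dividing)) + 1):
        # every d with exactly k prime factors contributes mu(d) = (-1)^k
        total += (-1) ** k * sum(1 for _ in combinations(dividing, k))
    return total

def survives(n: int) -> int:
    """Indicator that n has no prime factor in SIEVE_PRIMES."""
    return int(all(n % p != 0 for p in SIEVE_PRIMES))
```

Truncating at an odd number of prime factors gives a lower bound and truncating at an even number gives an upper bound; the weights of Lemma 7.3 achieve the same one-sided inequality with a far more economical support.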
7.3 The cuspidal contribution.
We first treat the contribution to the m-sum in (7.9) that corresponds to (2.9). To this end we need to show that there exists some absolute constant \(\eta >0\) (which eventually will depend on the constants in Theorem 3.4) such that
uniformly for \(\Delta , q\le n^\eta \) with \((\Delta w, q) = 1\) (specifically, in the notation of (2.9) we have \(\Delta = (\delta \delta _1\delta _2)^2\), \(q = \delta _1'\delta _2'\)) and Hecke eigenvalues of a cusp form \(\phi \) whose conductor is less than \(n^\eta \). By Möbius inversion, the previous display equals
The contribution from \(d\ge n^{4\eta }\) can be bounded by
using the bound (4.3) for the fourth moment of Hecke eigenvalues and recalling that presently the conductor \(C_\phi \le n^{\eta }\).
For \(d\le n^{4\eta }\), we detect the condition \((m, {\mathfrak {N}})=1\) using Möbius inversion to write
The contribution from \(\tau \ge n^{4\eta }\) can be bounded by
Both error terms are acceptable. Thus, for \(d, \tau \le n^{4\eta }\) and automatically \((d\tau , q) = 1\) and \(d_0:= (d^2 \tau , \Delta ) \mid n\), we are left with bounding
and the desired power saving as in (7.10) follows (after obvious smoothing) from Theorem 3.4.
7.4 Computing the main term I: summing over arithmetic progressions.
We now return to (7.9) and insert the contribution corresponding to (2.7). In view of the bounds in the previous subsection, we have, for \({\mathcal {D}}=n^{\kappa _0}\) with \(\kappa _0\) sufficiently small (depending on the constant \(\varpi \) in Proposition 7.1 and on \(\eta \) as in (7.10)),
where
As in the proof of Proposition 6.2, we apply Dirichlet’s hyperbola method to the r-function. Let
for some \(\eta _0>0\) to be chosen later. In particular, in terms of orders of magnitude, n and x can (and will) be used interchangeably. Note that the contribution from \(m\ge x\) or \(m\le n-x\) can be bounded by \(Z^{-1}n^{1+\varepsilon }\ll n^{1-\eta _0/2}\). Thus we can write
where
Again we expect that \(S_1\) gives a negligible contribution, as in the proof of Proposition 6.2, due to a particular behaviour at the prime 2 which makes the main term disappear, but this is not so easy to see in the present situation. We will carry out the computation for both \(S_1\) and \(S_2\) and combine them to a joint main term for which we obtain the desired lower bound. As before, the computation of \(S_1\) is slightly harder, so we focus on this part in detail, while the computation of \(S_2\) is similar, but technically slightly easier. Our first step is to establish the asymptotic evaluations (7.17) and (7.21) below.
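The starting point for the hyperbola method is the convolution formula \(r(N)=4\sum _{ab=N}\chi _{-4}(a)\), in which the divisors can be split at a threshold and the large ones parametrised by the complementary divisor. The sketch below checks this on small N. It is only meant to illustrate the principle: the actual decomposition into \(S_1\) and \(S_2\) involves further parameters and smoothing that are not reproduced here, and the function names and the threshold argument z are assumptions made for the example.

```python
from math import isqrt

def chi_minus_4(d: int) -> int:
    """The non-trivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[d % 4]

def r_bruteforce(n: int) -> int:
    """Number of (x1, x2) in Z^2 with x1^2 + x2^2 = n, for n >= 1."""
    count = 0
    b = isqrt(n)
    for x1 in range(-b, b + 1):
        rest = n - x1 * x1
        x2 = isqrt(rest)
        if x2 * x2 == rest:
            count += 1 if x2 == 0 else 2  # x2 and -x2 are distinct unless x2 = 0
    return count

def r_divisor_sum(n: int) -> int:
    """r(n) = 4 * sum of chi_{-4}(d) over d | n."""
    return 4 * sum(chi_minus_4(d) for d in range(1, n + 1) if n % d == 0)

def r_split(n: int, z: int) -> int:
    """The same divisor sum, split into divisors a <= z and divisors a > z.

    A divisor a > z is written as a = n / b, where b | n and b * z < n.
    """
    small = sum(chi_minus_4(a) for a in range(1, n + 1) if n % a == 0 and a <= z)
    large = sum(chi_minus_4(n // b) for b in range(1, n + 1) if n % b == 0 and b * z < n)
    return 4 * (small + large)
```

After the split, both summation variables are short compared with N, which is what makes the resulting sums over arithmetic progressions accessible.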
We would like to choose \(\eta _0\) sufficiently small so that we can evaluate the sums by Proposition 7.1 on average of \(d_1,d_2\), but this requires some preparation. We consider \(S_1\) first. As before we write
so that the a, m-sum in \(S_1\) becomes
Here we have used the fact that if \(2\mid \delta _1'\delta _2'\), then we must have \(r_2a_2=1\) and \((w, 2)=1\) so that the condition \(n-m\equiv w\,(mod \,2)\) is incorporated in the v-sum. As before, we can restrict \(a_2, a_d\le n^{\eta _0}\) at the cost of an error of size \(O(n^{1+\varepsilon -\eta _0/2})\). The a, m-sum in (7.12) can be written as
By the Chinese remainder theorem, the congruence conditions modulo \(a_dd^2, 4a_2r_2^2, d'\) can be written as a single congruence condition \(\frac{m}{g_1}\equiv c_1\,(mod \,F_1)\) with \((c_1, F_1)=1\), where
In order for the m-sum to be nonempty, we must have
as well as
in which case we can re-write the a, m-sum in (7.13) as
Note that from the construction of the sieve weights we have \(d_1, d_2 \le {\mathcal {Q}}{\mathcal {D}}\ll n^{\kappa _0}\) and thus we automatically have \(g_1 \ll n^{2\eta _0 + 4\kappa _0}\), a small power of n. Also note that \(g_1 \mid (2d_1d_2)^{\infty }\), so that automatically \((g_1, h) = 1\).
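The merging of several congruence conditions into one, and the notation \(g\mid N^\infty \), can be made concrete as follows. This is a generic sketch under the assumption that the moduli are pairwise coprime; it does not reproduce the specific moduli or the definitions of \(g_1, F_1\) in (7.14), and the function names are hypothetical.

```python
from math import gcd

def crt(congruences):
    """Combine x = c_i (mod F_i), for pairwise coprime F_i > 1, into one class c (mod F)."""
    c, modulus = 0, 1
    for c_i, f_i in congruences:
        if gcd(modulus, f_i) != 1:
            raise ValueError("moduli must be pairwise coprime")
        # choose t with c + modulus * t = c_i (mod f_i)
        t = ((c_i - c) * pow(modulus, -1, f_i)) % f_i
        c += modulus * t
        modulus *= f_i
    return c % modulus, modulus

def divides_infinite_power(g: int, n: int) -> bool:
    """True if g | n^infinity, i.e. every prime factor of g divides n (g, n >= 1)."""
    while True:
        common = gcd(g, n)
        if common == 1:
            return g == 1
        while g % common == 0:
            g //= common  # strip the part of g supported on primes dividing n
```

For instance, `crt([(2, 3), (3, 5), (2, 7)])` returns `(23, 105)`. The second helper reflects the observation used above: if \(g\mid N^\infty \) and \((h, N)=1\), then \((g,h)=1\).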
We can again restrict \(h\le n^{\eta _0}\) with a total error of size \(O(n^{1+\varepsilon -\eta _0/2})\). Finally we detect the conditions \(\mu ^2(m_o) = 1\) by Möbius inversion. We summarize that the a, m-sum in (7.12) can, up to an admissible error of size \(O(n^{1+\varepsilon -\eta _0/2})\), be replaced with
where \(a_2, a_d \ll n^{\eta _0}, g_1\ll n^{2\eta _0+4\kappa _0}\) are sufficiently small powers of n. We evaluate this as
where
We want to show that \({\tilde{r}}_1\) is small on average using Proposition 7.1 with \({\textbf{q}}=a, {\textbf{a}}_1=\frac{n}{h}, {\textbf {d}}=F_1\). It is not enough to apply Proposition 7.1 with \({\textbf{a}}_2=g_1\) and sum over \(g_1\) directly due to the sparseness of smooth numbers when y is small and thus we need to be more precise about \(g_1\). Note that \(a_2r_2^2a_dd^2d'\mid g_1F_1\) and so \((h,d_1d_2)=1\), and \((a, 2d_1d_2)=1\) is equivalent to \((a, g_1F_1)=1\). The coefficients of h can be bounded by 1. The coefficients of \(g_1, F_1\) are bounded by \(\tau (dd')^C\ll \tau (g_1F_1)^C\). From (7.14), we see that if \(p\mid g_1, p\not \mid F_1\), then we must have either \(p\mid (n-w,d')\) or \(p\mid 2n\) and \(p^2\mid g_1\). If \(p^2\mid g_1\), we also have \(p\mid 2n\). Therefore we can write \(g_1=g_0g_1'g_3\) where \(g_1'=(n-w, d')\) is square-free, \(g_0\mid (2n)^\infty \) is square-full and \(g_3\mid F_1\). We can then change the order of summation to move \(h, g_0,g_1'\) outside the summation over u, c or equivalently \(d_1,d_2\). After dividing the m-sum into short intervals to separate the variable m from \(g_1\) similarly as in the proof of Proposition 6.2, we can apply Proposition 7.1 for the \(c,a_2,a_d,a,m\)-sum with \({\textbf{d}}=F_1, {\textbf{a}}_2=g_0g_1', {\textbf{a}}_3=g_3\) for fixed \(h, g_0g_1'\) (with \(h\le n^{\eta _0}\) and \(g_0g_1'\ll n^{2\eta _0+4\kappa _0}\) for \(\eta _0, \kappa _0\) sufficiently small) so that the total contribution of the error \({\tilde{r}}_{1}={\tilde{r}}_1(g_3)\) in (7.11) can be bounded by
Note that
and thus for \(y\ge {\mathcal {L}}(x) = (\log x)^{D}\) for some sufficiently large D, Proposition 7.1 gives
We recall (7.1), (7.2) and note that for \(y\ge {\mathcal {L}}(x)\) we have \(\alpha \ge 1-\frac{1}{D}+o(1)\). Thus when \(y\ge {\mathcal {L}}(x)\) we have
by choosing A large enough in terms of C. Since \(\prod _{p\mid n}(1+\frac{1}{p})\ll \log _2 n\), we have
for any \(A>0\) with \({\mathcal {F}}(n, y)\) defined as in (1.5).
For the main term, substituting the main term in (7.16) back into (7.12) and evaluating the u-sum, we obtain (writing the condition (7.15) on \(g_1\) in terms of the remaining variables and noting that \(\mu ^2(d')=1, a_d\mid d^\infty \)) that we can replace \(S_1\) by \({\tilde{S}}_1\) in (7.11) up to a negligible error \({\mathcal {E}}_1\) where
with
Next we change the summation order of m and a, evaluate the a-sum by [ABL, Lemma 5.2] and then complete the \(a_2, a_d,h\)-sums with an admissible error to finally obtain (with e.g. \(\eta _0=10\kappa _0\)), uniformly for \(d_1,d_2\ll n^{\kappa _0}\) (with \(\kappa _0\) sufficiently small), that
with \({\mathfrak {c}}(f)\) as in (6.4).
The same strategy can be applied to \(S_2\). We give a sketch of the manipulations highlighting the differences from those for \(S_1\). To begin with, we do not need to separate \(r_2\) from \(\delta \delta _1\delta _2\), so that the b, m-sum in \(S_2\) becomes
We can restrict \(b_d\le n^{\eta _0}\) with an error of size \(O(n^{1+\varepsilon -\eta _0/2})\). After splitting m into residue classes modulo \(d_1d_2\), using the Chinese remainder theorem to rewrite the congruence condition modulo \(b_d(\delta \delta _1\delta _2)^2\delta _1'\delta _2'\) as \(m\equiv c_2\,(mod \,F_2)\) with
and then extracting (b, n), we can write the b, m-sum in (7.18) as
We can again restrict \(h\le n^{\eta _0}\) with a total error of size \(O(n^{1+\varepsilon -\eta _0/2})\). Additionally, we need to separate b, m in the summation condition as well as their dependency on \(g_2,F_2\) in order to apply Proposition 7.1. As usual, this can be achieved by splitting m into intervals of the form \([M, (1+\Delta )M]\) with \(\Delta =R^{-\eta _0}\) so that at most one interval interferes with the condition \(mg_2h\le n-bb_d\sqrt{n}(\delta \delta _1\delta _2)^2\), in which case the contribution can be bounded by \(O(n^{1+\varepsilon }\Delta )\) trivially. Here \(R=\exp (\sqrt{\log x})\). Similarly we can split the b-sum into intervals of the form \([M, (1+\Delta )M]\) so that the condition \(b\le \frac{x}{b_d(\delta \delta _1\delta _2)^2\sqrt{n}h}\) only interferes with one interval in which case the contribution is also negligible. After Möbius inversion to detect \(\mu ^2(m_o)=1\), we can finally apply Proposition 7.1 so that the b, m-sum in the last display can be evaluated as
Here the total contribution from the error terms \({\tilde{r}}'_2\) in (7.11) can be bounded using Proposition 7.1 in the same way as for \({\mathcal {E}}_1\) so that
for any \(A>0\). After changing the order of summation, evaluating the b-sum in (7.20) by [ABL, Lemma 5.2] (note that the upper limit for \(b_d\) is always \(\gg n^{1/2-\eta _0}\) unless \(|n-mg_2h|\le n^{1-\eta _0}\) whose contribution can be bounded by \(O(n^{1-\eta _0/2})\)) and completing the \(b_d,h\)-sums, we obtain, uniformly for \(d_1,d_2\le n^{\kappa _0}\), that we can replace \(S_2\) by \({\tilde{S}}_2\) in (7.11) up to a negligible error \({\mathcal {E}}_2\) where
and
7.5 Computing the main term II: Euler products.
We substitute (7.17) and (7.21) back into (7.11). By choosing \({\mathcal {D}}=n^{\kappa _0}\) sufficiently small, it follows from (7.9) together with (7.10) that
where
with \(F_1, g_1\) as in (7.14), and
with \(F_2, g_2\) as in (7.19). Now we rename \(mg_i\) as m, change the order of summation and evaluate \(S_1^-, S_2^-\) in roughly the form
for some multiplicative function \(w_{m,n,h}(c)\). In particular, we postpone the summation over m to the last moment. Precisely, we write
where \({\mathfrak {c}}( f)\) is as in (6.4), \(F_1\) is as in (7.14), and
To see this, we recall the conditions (7.15) and observe that after replacing \(mg_1\) with m, the new conditions \(P^+(m)\le y\), \(\mu ^2(m_o)=1\), \((m, {\mathfrak {N}})=1\) take care of the old conditions \(P^+((n,a_dd^2))\le y\), \(P^+((n-w, d'))\le y\), \(\mu ^2((n,a_dd^2))=(n, a_dd^2, {\mathfrak {N}})=(n-w, d', {\mathfrak {N}})=1\). Moreover, the old condition \((m, F_1)=1, mg_1\le x/h\) is equivalent to the new condition \((m, 4a_2r_2^2a_dd^2d')=g_1, m\le x/h\) after replacing \(mg_1\) with m. Note also that \({\mathcal {G}}_{\delta _1'\delta _2'}\simeq {\mathcal {G}}_{d'}\). Similarly, we write
where \(F_2\) is as in (7.19) and
Our next aim is to write the second line in (7.23) as an Euler product which will be given in (7.29) below. The v-sum modulo 4 in (7.23) contributes
In particular, we see that for each \(r_2\), only one \(a_2\) with \((a_2, \delta _1'\delta _2')=1\) gives a non-zero contribution. The w-sum together with \(G_{d'}\) can be evaluated as (treating one prime \(p \mid d'\) at a time)
Finally, the sum over \(a_d\mid d^\infty \) gives
where
Here we use crucially that (the odd part of) m is square-free. For more general m, the formula would look much more complicated, in particular the contribution could be much larger which would create considerable technical problems in applying Lemma 7.3.
We summarize: the second line in (7.23) can be written as an Euler product
where the Euler factor is composed of (7.25), (7.26) and (7.27).
We compute \({\mathfrak {f}}_1(2,\nu ;m,n)\) explicitly using (7.25). If \(2\not \mid d_1d_2\), then \(r_2=1\) and only \(a_2=2^{\nu _2(n)}\) contributes. If \(2\parallel d_1d_2\), then we have two cases: when \(2\parallel \delta _1'\delta _2'\), we have \(r_2=1\) and only \(a_2=1\) contributes (since we require \((a_2, \delta _1'\delta _2')=1\)); and when \(2\parallel \delta _1\delta _2\), we have \(r_2=2\) and only \(a_2=2^{\nu _2(n)-2}\) contributes. If \(2\mid (d_1,d_2)\), then \(r_2=2\) and only \(a_2=2^{\nu _2(n)-2}\) contributes. Precisely, we have
We can also compute \({\mathfrak {f}}_1(p,\nu ;m,n)\) for \(2\not \mid p\) using (7.26) and (7.27) as
For \(S_2^-\), we have analogous results with \({\mathfrak {f}}_1\) replaced by \({\mathfrak {f}}_2\) where
(noting that if \(2\mid \delta _1'\delta _2'\) then the w-sum is zero unless \(2| mn, 2\not \mid (m,n)\); and if \(2\mid \delta \delta _1\delta _2\) then the 2-part in the \(b_d\)-sum becomes \(\frac{1}{\phi (4/(n,4))}\mathbb {1}_{(m,4)=(n,4)}\)), and for \(2\not \mid p\) we have
7.6 Computation of the main term III: the sieve weights.
We are now prepared to further evaluate \(S_1^{-}\) and \(S_2^{-}\). We will bound \(S_1^{-}+S_2^{-}\) from below by (7.6) using properties of the lower bound sieve weights in Lemma 7.3.
To begin with, we substitute (7.29) into (7.23). We still keep h and m fixed and consider the \(u, c, d_1, d_2\)-sum in (7.23) which now equals
We recast this as (noting \({\mathfrak {c}}(2)=1\))
(of course each of the products over primes has only one factor, since \(\nu \) is unique) where
We observe that the sum over \(d_1,d_2\) gives
where
Using the formulas (7.32) and (7.33) instead of (7.30) and (7.31), the same computation evaluates the corresponding \(u, c, d_1, d_2\)-sums of \(S_2^-\) in (7.24) as
where \(\tilde{{\mathfrak {f}}}_2(2;m, n; u)\) is defined in the same way as (7.35), replacing the index 1 with the index 2. Combining the above calculations, we have shown
We are now finally in a position to apply Lemma 7.3 to bound from below the sum over the sieve weights. We make the following important observations. From (7.34), (7.31), (7.28), (6.4) we have
for \(p\not \mid 2h\). So we can choose \({\mathcal {Q}} = 30\) and use a sieve of dimension \(\gamma = 5\). Secondly, since \(h\mid n\), \((h, 2{\mathfrak {N}})=1\), we have \( \chi _{-4}(h)=1\), so that a lower bound for the sum of sieve weights with multiplicative coefficients suffices. This is an important device in the argument. Now applying Lemma 7.3 and (7.22), we obtain
where \(f_5\) has the same meaning as in Lemma 7.3, \({\mathcal {D}}=n^{\kappa _0}\) for some small \(\kappa _0>0\) and C a large constant such that \(f_5(C\kappa _0)>0\) and
where \({\mathfrak {F}}_{{\mathcal {Q}}}(m,n,h)=\prod _{\begin{array}{c} p\mid {\mathcal {Q}}\\ p\not \mid h \end{array}}{\mathfrak {F}}_p(m,n,h)\) with
(the right hand side is independent of h) and
Recalling (7.36), we define multiplicative functions \({\mathfrak {g}}_i\) supported on square-free integers with \({\mathfrak {g}}_i(p)=1\) except for \( p\mid P_{{\mathcal {Q}}}(z)\) where
so that
We rename mh by m, h by hg where \(h\mid (m,n), h\not \mid {\mathcal {Q}}\), and \(2\not \mid g\mid (m,n, {\mathcal {Q}})\). In this way we can re-write \({\mathcal {S}}(n)\) as
We first see that the sum over g can be written as (note that \({\mathfrak {F}}_p(\bullet ,\bullet ,p)=1\))
With \({\mathcal {Q}}=30\), \(\mu ^2(r_o)=1\), this becomes
Using (7.38), (7.30), (7.32) and (7.39), (7.36), we compute explicitly
(note that if \(3\mid n\) then \((r, 3) = 1\) since \((r, {\mathfrak {N}}) = 1\)) and (with \({\mathfrak {c}}(5)=\frac{16}{21}\))
In particular, we can always find some residue class \(r_0\,(mod \, 120)\) such that \({\mathfrak {F}}_{{\mathcal {Q}}}(r,n)>0\) for \(r\equiv r_0\,(mod \,120)\).
Noting that \((m, {\mathcal {Q}})= \mu ^2(m)=1\) and \({\mathfrak {g}}_i(p)=1\) if \(p\mid {\mathcal {Q}}\), we next evaluate the inner sum over h as
where \({\mathfrak {G}}(m;n)\) is a multiplicative function of m with \({\mathfrak {G}}(p^\nu ;n)=0\) for \(\nu \ge 2\) and \({\mathfrak {G}}(p;n)=0\) for \(p\mid {\mathcal {Q}}{\mathfrak {N}}\). We summarize
7.7 Computation of the main term IV: summing over smooth square-free numbers.
Our last goal is a lower bound for \({\mathcal {S}}(n)\) based on the formula (7.44).
In order to carry out the sum over m in (7.44), it is useful to define \({\mathfrak {h}}(\bullet ; n)={\mathfrak {G}}(\bullet ; n)*\mu \). Recall that the definition of \({\mathfrak {G}}(m;n)\) is implicit in (7.43), so that \({\mathfrak {h}}(p^\nu ;n) = 0\) if \(\nu > 2\), and for \(\nu \le 2\) we have
and
In particular, from (7.40) and (6.4) we have
where we can take 5 as implied constants. Using (7.1) and \(\alpha \ge 1-\frac{1}{D}+o(1)\) with D sufficiently large, the contribution from \(r\ge K=n^{1/100}\) in (7.44) can be bounded by
To evaluate the contribution from \(r\le K\), we consider two cases depending on the size of y.
7.7.1 Case I: \(y\ge \exp ((\log _2x)^2)\).
We write
We can truncate the t-sum using (7.1) as in (7.46) so that the contribution from \(t\ge K\) in (7.47) can be bounded by \(K^{-1/3}n^\varepsilon \Psi (x,y)\). For \(t,r \le K\) we evaluate the sum by [FT2, Theorem 2 with \(q=1\)], getting
where \(R(x) = [x]/x\). Strictly speaking, the application of [FT2, Theorem 2] requires \(y \le x/K^2 \le x/(tr)\), which we can assume without loss of generality since \(\Psi (x, y) \asymp \Psi (x, x)\) for \(y > x^{1/2}\), say.
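The range \(y\ge x^{1/2}\) is particularly simple because an integer \(n\le x\) can have at most one prime factor exceeding y, so that \(\Psi (x,y)=\lfloor x\rfloor -\sum _{y<p\le x}\lfloor x/p\rfloor \) exactly. The following sketch compares this identity with a direct count; it is a numerical illustration with hypothetical function names, not part of the argument.

```python
def largest_prime_factors(limit: int) -> list:
    """lpf[n] = largest prime factor of n for 2 <= n <= limit, with lpf[0] = lpf[1] = 1."""
    lpf = [1] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 1:  # p is prime: no smaller prime has marked it
            for multiple in range(p, limit + 1, p):
                lpf[multiple] = p  # larger primes overwrite smaller ones later
    return lpf

def psi_bruteforce(x: int, y: int, lpf: list) -> int:
    """Psi(x, y): number of 1 <= n <= x whose largest prime factor is at most y."""
    return sum(1 for n in range(1, x + 1) if lpf[n] <= y)

def psi_large_y(x: int, y: int, lpf: list) -> int:
    """Exact formula for sqrt(x) <= y <= x: remove the multiples of each prime in (y, x]."""
    primes_above_y = [p for p in range(y + 1, x + 1) if lpf[p] == p]
    return x - sum(x // p for p in primes_above_y)
```

For \(x=10^4\) and \(y=100\) the two functions agree, and the count is a positive proportion of x, consistent with \(\Psi (x,y)\asymp \Psi (x,x)\) in this range.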
To estimate the error term, we use (7.1) again, and obtain the bound
for \(y \ge \exp ((\log _2x)^2)\).
We now focus on the main term in (7.48) which we substitute back as the m-sum in (7.44). By partial summation, the main term equals
where
It is clear that \(W(U) = 0\) for \(U < 1\) and \(W(U) = c + O(1/U)\) for some constant c (depending on n and z). In preparation for an application of [FT1, Lemme 4.1] we define
We compute M(s) as
where
Recalling (7.40) and (6.4) we have
We can extend r, t to infinity getting
where uniformly in \(|s| \le 1/2\) we have (once again by Rankin’s trick)
and by Cauchy’s formula this bound remains true for all fixed derivatives with respect to s. Note that by (7.45) we have
Combining (7.41), (7.42), (7.45), (7.51) and (7.53), we obtain for all \(j\ge 0\) that
for \(y \ge \exp ((\log _2x)^2)\) and \(\log z\asymp \log n\), and we also see that \({\mathfrak {S}}(n, 0) \gg 1\).
We are now prepared to apply [FT1, Lemme 4.1] to evaluate (7.50), which we complement with [FT2, Lemme 4.1] for bounds of \(\rho ^{(j)}(u)\). Using again \( y\ge {\mathcal {L}}(n) \) and \(\log z\asymp \log n\) and recalling the error terms (7.49), (7.46) and (7.52), we conclude
We plug this lower bound back into (7.37), noting that \({\mathcal {F}}(n,y)\asymp 1\) when \(y\ge \exp ((\log _2x)^2)\), and choose \(C=C(\kappa _0)\) a sufficiently large constant so that \(f_5(C\kappa _0) > 0\) getting
provided that \(y \ge \exp ((\log _2x)^2)\) and \({\mathcal {D}}=n^{\kappa _0}\) for some sufficiently small \(\kappa _0>0\).
This concludes the proof of Theorem 1.3 when \( y\ge \exp ((\log _2x)^2)\).
Remark
We close this computation by a short comment on the “singular series” \({\mathfrak {S}}(n, 0)\) encountered in the final lower bound, which might look a bit unexpected. This is due to our restrictions on m, in particular the fact that we impose a square-free condition on m. As square-free numbers are not equidistributed among all residue classes, the singular series \({\mathfrak {S}}(n,0)\) will genuinely depend on the factorization of n in our set-up. For example, we can see from (7.41) that there are fewer solutions to \(n=m+x^2+y^2\) with \((xy,3)=1\), \(\mu ^2(m_o)=1\) when \(n\equiv 2\,(mod \,3)\) than in the other cases, whilst the number of solutions to \(n=m+x^2+y^2\) with \((xy,3)=1\) and no extra conditions on m would be approximately equal for all residue classes of \(n\,(mod \,3)\).
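The example in the remark can be made concrete. If \((xy,3)=1\), then \(x^2+y^2\equiv 2\,(mod \,3)\), so \(m\equiv n-2\,(mod \,3)\), and \(n\equiv 2\,(mod \,3)\) forces \(3\mid m\). Integers with square-free odd part are rarer in the class \(0\,(mod \,3)\) than in the other two classes, by a factor of about 2/3. The sketch below counts them; the function names and the cut-off are choices made for the illustration.

```python
def is_squarefree(m: int) -> bool:
    """True if no square d^2 with d >= 2 divides m."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True

def odd_part(m: int) -> int:
    """Largest odd divisor of m."""
    while m % 2 == 0:
        m //= 2
    return m

def counts_mod_3(limit: int) -> list:
    """Count 1 <= m <= limit with square-free odd part, by residue class modulo 3."""
    counts = [0, 0, 0]
    for m in range(1, limit + 1):
        if is_squarefree(odd_part(m)):
            counts[m % 3] += 1
    return counts
```

Calling `counts_mod_3(20000)` shows the class of multiples of 3 to be visibly smaller than the other two, which is the source of the dependence of \({\mathfrak {S}}(n,0)\) on \(n\,(mod \,3)\) described in the remark.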
7.7.2 Case II: \((\log x)^D\le y\le \exp ((\log _2x)^2)\).
When y is small, the factor \(\prod _{p\mid n}(1+\frac{O(1)}{p^\alpha })\) in (7.49) can be much bigger than a large power of \(\log x\) and so we cannot bound \({\mathfrak {h}}(t;n)\) by absolute values. Instead, we apply Perron’s formula to evaluate the m-sum in (7.44) directly as follows. Write
where \(G(s;n)=\sum _{m=1}^\infty {\mathfrak {G}}(m;n)m^{-s}\) and as before \(\alpha \) is defined in (1.6). (Recall that for D large enough we have \(9/10 \le \alpha \le 1\).) Denote
Using (7.53) we can write
for some function \({\tilde{G}}(s;n)\) which is holomorphic and uniformly bounded in \(\Re (s)\ge 2/3\), say. We use the saddle point method to evaluate (7.54). The main term comes from a small neighbourhood of the point \(s=\alpha \). Precisely, we truncate the integral in (7.54) at height \(1/\log y\) to write
say. To handle the error term E(x, y) we need some estimate of G(s; n) when \(s=\alpha +i\tau \) for \(|\tau | \ge 1/\log y\). This can be done following the proof of the bound for \(\zeta _m(s,y)/\zeta _m(\alpha ;y)\) in [lBT, (4.42)]. In fact, the bound for \(\zeta _m(s,y)/\zeta _m(\alpha ;y)\) follows from the case of \(m=1\) together with bounds for the contribution from primes satisfying \(p\mid m\). In our case, we can bound the contribution from \(p\mid n_y\) in the same way (since \(\omega (n_y)\ll \sqrt{y}\) for \(y\ge {\mathcal {L}}(x)\)) and thus we can obtain for \(s=\alpha +i\tau \) that
where \(Y_\varepsilon =\exp \big ((\log y)^{3/2-\varepsilon }\big )\) and
Note that we have (see e.g. [lBT, eq. (3.9)])
for \(k \ge 1\). Similarly to the proof of [lBT, eq. (4.49)], we can show that for \(1\le z\le Y_\varepsilon \) we have
for some small \(c>0\). Set
for some \(c'>0\). With (7.57) and (7.60), we can follow [HT, Lemma 10] to bound the contribution from \(1/\log y\le |\tau |\le T\) and \(|\tau |\ge T\) in (7.54) respectively with \(T:=1/R^2,\) so that (7.56) holds with
which can be bounded using (7.61) and (7.59) by
It remains to evaluate the integral in (7.56). Let \(T_0=(u^{1/3}\log y)^{-1}\). Using (7.57), the contribution from \(T_0\le |\tau |\le 1/\log y\) can be bounded (see also [HT, p. 281])
The contribution from \(|\tau |\le T_0\) in the integral (7.56) gives the main term. We can show that
To prove (7.63), we compare it with the estimation for \(\Psi (x,y)\). Using [HT, last equation on p. 280 in the proof of Lemma 11] which states that uniformly for \(x\ge y\ge 2\) we have
we can write
Thus it remains to show that the second term in (7.65) is of smaller order. Set
Then we have uniformly for \(1/2<\sigma <1\) and \(\Re (s)=\sigma \) that
We have the Taylor expansion around \(\tau =0\) for \(|\tau |\le T_0\ll (\sigma _3)^{-1/3}\) (see e.g. [lBT, eq. (4.54)])
where \(\sigma _i\) is as in (7.58), and for \(|\tau |\le T_0\) we have
Using this together with the formulas
we see that
Using (7.59), (7.66) and (7.67) this can be bounded by (for \(\alpha \ge 9/10\))
This completes the proof of (7.63).
Combining (7.56), (7.62), (7.63) and (7.64), we see that
Recalling (7.51), (7.55) and \(\Psi (x,y)\asymp \frac{x^\alpha \zeta (\alpha ,y)}{\alpha \sqrt{2\pi \sigma _2}}\) (see e.g. [HT, Theorem 1]), we conclude that when \(y\ge {\mathcal {L}}(x)\) the formula (7.44) becomes
We plug this lower bound back into (7.37) and choose \(C=C(\kappa _0)\) a sufficiently large constant so that \(f_5(C\kappa _0) > 0\) getting
provided that \(y \ge {\mathcal {L}}(x)\) and \({\mathcal {D}}=n^{\kappa _0}\) for some sufficiently small \(\kappa _0>0\).
This concludes the proof of Theorem 1.3 when \({\mathcal {L}}(x)\le y\le \exp ((\log _2x)^2)\).
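The saddle point approximation \(\Psi (x,y)\approx x^\alpha \zeta (\alpha ,y)/(\alpha \sqrt{2\pi \sigma _2})\) quoted from [HT, Theorem 1] can be explored numerically. The sketch below assumes the standard definitions from [HT], which we take to agree with (1.6) and (7.58): \(\alpha \) is the positive solution of \(\sum _{p\le y}\log p/(p^\alpha -1)=\log x\), \(\zeta (s,y)=\prod _{p\le y}(1-p^{-s})^{-1}\), and \(\sigma _2=\sum _{p\le y}p^\alpha (\log p)^2/(p^\alpha -1)^2\). All function names and the bisection bracket are choices made for the example.

```python
from math import log, sqrt, pi

def primes_up_to(y: int) -> list:
    """Primes p <= y by the sieve of Eratosthenes (y >= 2)."""
    sieve = [True] * (y + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(y ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, y + 1, p):
                sieve[q] = False
    return [p for p in range(2, y + 1) if sieve[p]]

def saddle_point(x: float, primes: list) -> float:
    """Solve sum_{p <= y} log p / (p^alpha - 1) = log x for alpha by bisection."""
    target = log(x)
    f = lambda a: sum(log(p) / (p ** a - 1) for p in primes)
    lo, hi = 1e-6, 5.0  # f is decreasing; this bracket is an assumption suited to small examples
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def psi_saddle(x: float, y: int) -> float:
    """Saddle point approximation x^alpha zeta(alpha, y) / (alpha sqrt(2 pi sigma_2))."""
    primes = primes_up_to(y)
    alpha = saddle_point(x, primes)
    zeta = 1.0
    for p in primes:
        zeta /= 1 - p ** (-alpha)
    sigma_2 = sum(p ** alpha * log(p) ** 2 / (p ** alpha - 1) ** 2 for p in primes)
    return x ** alpha * zeta / (alpha * sqrt(2 * pi * sigma_2))

def psi_bruteforce(x: int, y: int) -> int:
    """Count y-smooth n <= x by dividing out all primes up to y."""
    primes = primes_up_to(y)
    count = 0
    for n in range(1, x + 1):
        for p in primes:
            while n % p == 0:
                n //= p
        count += (n == 1)
    return count
```

Comparing `psi_saddle(x, y)` with `psi_bruteforce(x, y)` for small x and y gives a feel for the quality of the approximation; the ranges of y relevant in Case II are of course far beyond what such a brute-force count can reach.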
Notes
As an aside we remark that we could also follow Munshi’s strategy [Mu], but note that his Lemma 6 needs to be corrected by a factor \(q_1\) when \(q_1 = {\tilde{q}}_1\).
References
J. Arthur, L. Clozel, Simple Algebras, Base Change, and the Advanced Theory of the Trace Formula, Annals of Math. Studies 120, Princeton University Press 1990
E. Assing, V. Blomer, J. Li, Uniform Titchmarsh divisor problems, Adv. Math. 393 (2021), 108076.
A. Balog, On additive representation of integers, Acta Math. Hungar. 54 (1989), 297-301.
V. Blomer, J. Brüdern, R. Dietmann, Sums of smooth squares, Compos. Math. 145 (2009), 1401-1441.
E. Bombieri, J. B. Friedlander and H. Iwaniec, Primes in arithmetic progressions to large moduli. II, Math. Ann. 277 (1987) 361-393.
Z. I. Borevich, I. R. Shafarevich, Number Theory, Academic Press 1966
J. Brüdern, É. Fouvry, Lagrange’s four squares theorem with almost prime variables, J. Reine Angew. Math. 454 (1994), 59–96.
R. de la Bretèche and G. Tenenbaum, Propriétés statistiques des entiers friables, Ramanujan J. 9 (2005), 139–202.
D. A. Cox, Primes of the form \(x^2+ny^2\), John Wiley, 1989.
S. Drappeau, Théorèmes de type Fouvry-Iwaniec pour les entiers friables, Compositio Math. 151 (2015), 828–862.
S. Drappeau, Sums of Kloosterman sums in arithmetic progressions, and the error term in the dispersion method, Proc. Lond. Math. Soc. 114 (2017), 684-732.
S. Drappeau, B. Topacogullari, Combinatorial identities and Titchmarsh’s divisor problem for multiplicative functions, Algebra Number Theory 13 (2019), 2383–2425.
W. Duke, J. Friedlander, H. Iwaniec, Bounds for automorphic L-functions, Invent. Math. 112 (1993), 1–8.
É. Fouvry, G. Tenenbaum, Diviseurs de Titchmarsh des entiers sans grand facteur premier, in: Analytic number theory (Tokyo, 1988), Lecture Notes in Mathematics 1434 (Springer, Berlin, 1990), 86–102.
É. Fouvry, G. Tenenbaum, Entiers sans grand facteur premier en progressions arithmétiques, Proc. London Math. Soc. (3) 63 (1991), 449–494.
É. Fouvry, G. Tenenbaum, Répartition statistique des entiers sans grand facteur premier dans les progressions arithmétiques, Proc. London Math. Soc. 72 (1996), 481-514.
É. Fouvry, G. Tenenbaum, Multiplicative functions in large arithmetic progressions and applications, Trans. Amer. Math. Soc. 375 (2022), no. 1, 245-299.
J. B. Friedlander, H. Iwaniec, Opera de cribro, Coll. Publ. 57 (2010), AMS, Providence, RI.
J. B. Friedlander and H. Iwaniec, Coordinate distribution of Gaussian primes, J. Eur. Math. Soc. 24 (2022), 737–772.
A. Harper, Bombieri-Vinogradov and Barban-Davenport-Halberstam type theorems for smooth numbers, arXiv:1208.5992
G. H. Hardy, J. E. Littlewood, Some problems of partitio numerorum; III: on the expression of a number as a sum of primes, Acta Math. 44 (1922), 1-70.
D. R. Heath-Brown, A new form of the circle method, and its application to quadratic forms, J. Reine Angew. Math. 481 (1996), 149-206.
D. R. Heath-Brown, D. I. Tolev, Lagrange’s four squares theorem with one prime and three almost-prime variables, J. Reine Angew. Math. 558 (2003), 159–224.
A. Hildebrand, G. Tenenbaum, On integers free of large prime factors. Trans. Amer. Math. Soc. 296 (1986), 265-290.
C. Hooley, On the representation of a number as the sum of two squares and a prime, Acta Math. 97 (1957), 189-210.
C. Hooley, Applications of sieve methods to the theory of numbers, Cambridge Tracts in Mathematics 70, Cambridge University Press 1976
C. Hooley, On the representation of a number as the sum of a prime and two squares of square-free numbers, Acta Arith. 182 (2018), 201-229.
H. Iwaniec, Primes of the type \(\phi (x,y)+A\) where \(\phi \) is a quadratic form, Acta Arith. 21 (1972), 203-234.
H. Iwaniec, Rosser’s sieve, Acta Arith. 36 (1980), 171-202.
H. Iwaniec, A new form of the error term in the linear sieve, Acta Arith. 37 (1980), 307-320.
H. Iwaniec, Topics in classical automorphic forms, Grad. Stud. Math. 17 (1997), AMS.
H. Iwaniec, E. Kowalski, Analytic Number Theory, AMS Colloquium Publications 53, Providence 2004
M. Jutila, A variant of the circle method, in: Sieve methods, exponential sums, and their application in number theory (Cardiff 1995), 245-254.
H. Ki, H. Maier, A. Sankaranarayanan, Additive problems with smooth integers, Acta Arith. 175 (2016), 301-319.
E. Kowalski, P. Michel, J. VanderKam, Rankin-Selberg \(L\)-functions in the level aspect, Duke Math. J. 114 (2002), 123-191.
X. Li, Upper bounds on \(L\)-functions at the edge of the critical strip, IMRN 2010, 727-755.
Ju. V. Linnik, An asymptotic formula in an additive problem of Hardy-Littlewood, Izv. Akad. Nauk SSSR Ser. Mat. 24 (1960), 629-706.
K. Matomäki, J. Teräväinen, On the Möbius function in all short intervals, J. Eur. Math. Soc. 25 (2023), 1207–1225.
R. Munshi, Shifted convolution of divisor function \(d_3\) and Ramanujan \(\tau \) function, Ramanujan Math. Soc. Lect. Notes Ser. 20 (2013), 251-260.
N. Pitt, On an analogue of Titchmarsh’s divisor problem for holomorphic cusp forms, J. Amer. Math. Soc. 26 (2013), 735-776.
B. Topacogullari, The shifted convolution of divisor functions, Q. J. Math. 67 (2016), 331-363.
A. I. Vinogradov, General Hardy-Littlewood equation, Mat. Zametki 1 (1967), 189-197.
Acknowledgements
We would like to thank the referee for a careful reading of the manuscript.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
The first and third authors were supported in part by DFG Grants BL 915/2-2 and BL 915/5-1 and Germany’s Excellence Strategy Grant EXC-2047/1-390685813. The second author received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement no. 851318. The third and fourth authors acknowledge support of the Max Planck Institute for Mathematics. The fourth author was supported by DFG project number 255083470 and by a Leverhulme Early Career Fellowship, and received support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant ID 648329).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Blomer, V., Grimmelt, L., Li, J. et al. Additive problems with almost prime squares. Geom. Funct. Anal. 33, 1173–1242 (2023). https://doi.org/10.1007/s00039-023-00635-w
DOI: https://doi.org/10.1007/s00039-023-00635-w