Diophantine equations in semiprimes

A semiprime is a natural number which is the product of two (not necessarily distinct) prime numbers. Let $F(x_1, \ldots, x_n)$ be a degree $d$ homogeneous form with integer coefficients. We provide sufficient conditions, similar to those of the seminal work of B. J. Birch, under which the equation $F(x_1, \ldots, x_n) = 0$ has infinitely many integer solutions with semiprime coordinates. Previously it was known, by a result of Á. Magyar and T. Titichetrakun, that under the same hypotheses there exist infinitely many integer solutions to the equation with coordinates that have at most $384 n^{3/2} d (d+1)$ prime factors.


Introduction
Solving Diophantine equations in primes or almost primes is a fundamental problem in number theory. For example, the celebrated work of B. Green and T. Tao [5] on arithmetic progressions in primes can be phrased as the statement that, given any $n \in \mathbb{N}$, the system of linear equations $x_{i+2} - x_{i+1} = x_{i+1} - x_i$ $(1 \le i \le n)$ has a solution $(p_1, \ldots, p_{n+2})$ such that each $p_i$ is prime and $p_1 < p_2 < \cdots < p_{n+2}$. A major extension of this result to more general systems of linear equations was established by B. Green, T. Tao, and T. Ziegler (see [6], [7], [8]); we refer the reader to [6, Theorem 1.8] for the precise statement. Another important achievement in this area is the well-known theorem of Chen [3] related to the twin prime conjecture. The theorem asserts that the equation $x_1 - x_2 = 2$ has infinitely many solutions $(\ell_1, p_2)$, where $\ell_1$ has at most two prime factors and $p_2$ is prime.
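As a small numerical illustration of the statement of Chen's theorem, the following sketch lists pairs $(\ell_1, p_2)$ with $\ell_1 - p_2 = 2$, $p_2$ prime, and $\ell_1$ having at most two prime factors counted with multiplicity. The function names are hypothetical and trial division is used only for simplicity; the code checks finitely many cases and says nothing about infinitude.

```python
def big_omega(n: int) -> int:
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        count += 1
    return count


def is_prime(n: int) -> bool:
    return n >= 2 and big_omega(n) == 1


def is_semiprime(n: int) -> bool:
    """Exactly two prime factors, not necessarily distinct."""
    return n >= 2 and big_omega(n) == 2


def chen_pairs(limit: int) -> list[tuple[int, int]]:
    """Pairs (l, p) with l - p = 2, p prime, p <= limit, and Omega(l) <= 2."""
    return [(p + 2, p) for p in range(2, limit + 1)
            if is_prime(p) and big_omega(p + 2) <= 2]
```

For instance, `chen_pairs(50)` contains `(4, 2)` and `(49, 47)` but omits `(45, 43)`, since $45 = 3 \cdot 3 \cdot 5$ has three prime factors.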
The main focus of this paper is on equations involving higher degree polynomials. Let $d > 1$. Let $F(\mathbf{x})$ be a degree $d$ homogeneous form in $\mathbb{Z}[x_1, \ldots, x_n]$. We are interested in integer solutions $\mathbf{x}$ to the equation
$$F(x_1, \ldots, x_n) = 0 \qquad (1)$$
for which all coordinates have small numbers of prime factors. For this to be possible one has to impose appropriate conditions. Let $\mathbb{Z}_p^{\times}$ be the units of the $p$-adic integers. We consider the following conditions.
Local conditions $(\star)$. The equation (1) has a non-singular real solution in $(0, 1)^n$, and also has a non-singular solution in $(\mathbb{Z}_p^{\times})^n$ for every prime $p$.
Let $V_F^*$ be an affine variety in $\mathbb{A}^n_{\mathbb{C}}$ defined by
$$V_F^* = \left\{ \mathbf{z} \in \mathbb{C}^n : \frac{\partial F}{\partial x_j}(\mathbf{z}) = 0 \ (1 \le j \le n) \right\}. \qquad (2)$$
By Euler's formula it follows that $V_F^*$ is the singular locus of $V(F) = \{\mathbf{z} \in \mathbb{C}^n : F(\mathbf{z}) = 0\}$, but we shall consider it as a subvariety of $\mathbb{A}^n_{\mathbb{C}}$ and let $\operatorname{codim} V_F^* = n - \dim V_F^*$. For solving general non-linear polynomial equations in primes, the following important result was established by B. Cook and Á. Magyar [4]. Here the theorem requires $\operatorname{codim} V_F^*$ to be very large. In fact, the required bound on $\operatorname{codim} V_F^*$ "already exhibit(s) tower type behavior in d" [4]. We also refer the reader to [17] for the case of quadratic forms. It is expected that a lower bound exponential in $d$ is sufficient in Theorem 1.1 [4], because this is the case for integer solutions, as seen in the work of B. J. Birch [1]. As the requirement on $\operatorname{codim} V_F^*$ in Theorem 1.1 is significantly larger than what is expected, it is natural to ask whether one can achieve a result analogous to Theorem 1.1 for almost primes, which are positive integers with a small number of prime factors (counting multiplicity), with smaller $\operatorname{codim} V_F^*$. In this direction, there is a result by Á. Magyar and T. Titichetrakun [12] provided $\operatorname{codim} V_F^* > 2^d (d - 1)$, which is also the required bound in [1].
Theorem 1.2. [12, Theorem 1.1] Let $F(\mathbf{x}) \in \mathbb{Z}[x_1, \ldots, x_n]$ be a degree $d$ homogeneous form. Suppose that $F$ satisfies the local conditions $(\star)$ and $\operatorname{codim} V_F^* > 2^d (d - 1)$.
Then the equation (1) has infinitely many solutions $(\ell_1, \ldots, \ell_n)$ such that $\ell_i$ has at most $384 n^{3/2} d(d + 1)$ prime factors for each $1 \le i \le n$.
This result was established by combining sieve methods with the Hardy-Littlewood circle method. In order to keep the amount of notation to a minimum we have presented simplified statements of Theorems 1.1 and 1.2 (without quantitative estimates, and only for one homogeneous form instead of systems of homogeneous forms of equal degree); we refer the reader to the respective papers for the precise statements. We also refer the reader to [10, Section 5.2] and [15, Section 17] for overviews of the progress on a related problem, the Goldbach-Waring problem with almost primes. In a related but different direction, an important method known as the affine linear sieve was introduced and developed by J. Bourgain, A. Gamburd, and P. Sarnak in [2], and the existence of almost prime solutions to certain quadratic equations was established in [11]. We refer the reader to [2] and [11], and also to a short discussion of this work in [4, Section 1], for more detailed information on this topic.
The main result of this paper improves the bound on the number of prime factors in Theorem 1.2 at a modest cost in the requirement on $\operatorname{codim} V_F^*$. In fact we establish a result analogous to Theorem 1.1 for semiprimes, which are natural numbers with precisely two (not necessarily distinct) prime factors, with an exponential lower bound for $\operatorname{codim} V_F^*$.
Theorem 1.3. Let $F(\mathbf{x}) \in \mathbb{Z}[x_1, \ldots, x_n]$ be a degree $d$ homogeneous form. Suppose that $F$ satisfies the local conditions $(\star)$ and $\operatorname{codim} V_F^* > 4^d \cdot 8(2d - 1)$. Then the equation (1) has infinitely many solutions $(\ell_1, \ldots, \ell_n)$ such that $\ell_i$ has precisely two (not necessarily distinct) prime factors for each $1 \le i \le n$.
We note that a more general result, Theorem 5.2, is proved in this paper, where we obtain quantitative estimates on the number of semiprime solutions of a specific shape, from which Theorem 1.3 follows immediately. We present this theorem in Section 5. The proof is based on several key observations. The first observation is that solving the equation (1) in semiprimes is equivalent to solving the equation
$$F(x_1 y_1, \ldots, x_n y_n) = 0 \qquad (3)$$
in primes. This observation appears not to be particularly helpful at first, because the only known result for solving general polynomial equations in primes is Theorem 1.1. However, we observe that $F(x_1 y_1, \ldots, x_n y_n)$ is now a bihomogeneous form (defined in Section 2), and we can in fact exploit this structure to estimate the number of prime solutions to (3) efficiently. We employ the work of D. Schindler [13] on bihomogeneous forms to achieve this. Therefore, we do not rely on the sophisticated method of B. Cook and Á. Magyar [4], which would drive up the requirement for $\operatorname{codim} V_F^*$. In particular, our method avoids the use of sieve theory, unlike the work of [12]. Another observation is that the dimensions of the variants (defined in (7)) of the singular locus of $\{(\mathbf{x}, \mathbf{y}) \in \mathbb{C}^{2n} : F(x_1 y_1, \ldots, x_n y_n) = 0\}$ are well-controlled by $\dim V_F^*$ (Theorem 5.1), and this plays a crucial role in the proof of Theorem 5.2.
We remark that Theorem 1.2 was improved recently by D. Schindler and E. Sofos in [14]. As a special case of their main result [14, Theorem 1.1], D. Schindler and E. Sofos established [14, Corollary 1.2], which holds when $F$ is non-singular, $d \ge 5$, and $n > 2^{d-1}(d^2 - 1)$, and from which one can obtain a quantitative estimate on the number of solutions to the equation (1) whose coordinates have at most $O(d \log n/(\log \log n))$ prime factors. Their approach is based on combining sieve methods and the Hardy-Littlewood circle method. Note that we have stated this result of D. Schindler and E. Sofos and Theorem 1.2 in terms of the number of prime factors, but in fact the results were obtained in terms of the smallest prime divisors. Thus they obtained results for a different problem, from which the mentioned statements follow immediately.
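The first observation above can be tried out on a toy example. The sketch below takes a hypothetical form $F(x) = x_1 x_2 - x_3 x_4$ (chosen only for illustration and not claimed to satisfy the hypotheses of Theorem 1.3), forms $F(p_1 q_1, \ldots, p_n q_n)$, and converts each prime solution $(p; q)$ of the substituted equation into a solution of $F = 0$ with semiprime coordinates $p_i q_i$. It is a brute-force search over a small list of primes, not the circle method used in the paper.

```python
from itertools import product


def F(x):
    """Hypothetical example form of degree d = 2 in n = 4 variables."""
    x1, x2, x3, x4 = x
    return x1 * x2 - x3 * x4


def G(p, q):
    """The substituted form G(p; q) = F(p_1 q_1, ..., p_n q_n)."""
    return F(tuple(pi * qi for pi, qi in zip(p, q)))


def semiprime_solutions(primes, n=4):
    """Solutions of F = 0 of the shape (p_1 q_1, ..., p_n q_n), p_i and q_i prime."""
    solutions = set()
    for p in product(primes, repeat=n):
        for q in product(primes, repeat=n):
            if G(p, q) == 0:
                solutions.add(tuple(pi * qi for pi, qi in zip(p, q)))
    return solutions
```

With `primes = [2, 3, 5]`, the prime solution $p = (2, 3, 2, 2)$, $q = (2, 3, 3, 3)$ of $G = 0$ yields the semiprime solution $(4, 9, 6, 6)$ of $F = 0$.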
The organization of the rest of the paper is as follows. We devote Sections 2, 3, and 4 to establishing Theorem 2.1, which is of interest on its own, regarding the number of prime solutions to systems of bihomogeneous equations. This is achieved by the Hardy-Littlewood circle method. We cover preliminaries in Section 2, and obtain the minor arcs estimate in Section 3 and the major arcs estimate in Section 4. In Section 5, we establish the main results of this paper by using estimates obtained in the previous sections.

Preliminaries
We set some notation to be used throughout Sections 2, 3, and 4. Let $\mathbf{x} = (x_1, \ldots, x_{n_1})$ and $\mathbf{y} = (y_1, \ldots, y_{n_2})$. We consider the following degree $(d_1 + d_2)$ polynomials with integer coefficients
$$g_1(\mathbf{x}; \mathbf{y}), \ldots, g_R(\mathbf{x}; \mathbf{y}), \qquad (4)$$
which will be referred to as $\mathbf{g}$. We denote the homogeneous degree $(d_1 + d_2)$ portions of these polynomials by $G_1(\mathbf{x}; \mathbf{y}), \ldots, G_R(\mathbf{x}; \mathbf{y})$ respectively, which will be referred to as $\mathbf{G}$. We further assume that each $G_r(\mathbf{x}; \mathbf{y})$ is bihomogeneous of bidegree $(d_1, d_2)$, in other words
$$G_r(s x_1, \ldots, s x_{n_1}; t y_1, \ldots, t y_{n_2}) = s^{d_1} t^{d_2} G_r(\mathbf{x}; \mathbf{y}).$$
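The bihomogeneity condition can be checked mechanically. The following is a minimal sketch, assuming a form is stored as a dictionary from exponent vectors to integer coefficients (a representation chosen here for illustration only); the example form $G$ is hypothetical and has bidegree $(2, 1)$.

```python
def evaluate(form, x, y):
    """Evaluate a form given as {(x_exponents, y_exponents): coefficient}."""
    total = 0
    for (ex, ey), coeff in form.items():
        term = coeff
        for xi, e in zip(x, ex):
            term *= xi ** e
        for yi, e in zip(y, ey):
            term *= yi ** e
        total += term
    return total


def bidegree(form):
    """Return (d1, d2) if every monomial has the same bidegree, else None."""
    degrees = {(sum(ex), sum(ey)) for (ex, ey) in form}
    return degrees.pop() if len(degrees) == 1 else None


# Hypothetical example: G(x; y) = x1^2*y1 + x1*x2*y2 - 3*x2^2*y1.
G = {
    ((2, 0), (1, 0)): 1,
    ((1, 1), (0, 1)): 1,
    ((0, 2), (1, 0)): -3,
}
```

Because the arithmetic is exact over the integers, the identity $G(sx; ty) = s^{d_1} t^{d_2} G(x; y)$ can be compared with `==` at sample points once `bidegree(G)` has returned $(d_1, d_2)$.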
We define them in a similar manner for other systems of bihomogeneous forms as well.
We devote Sections 2, 3, and 4 to proving the following theorem.
Theorem 2.1. Let g be as in (4), P = P d 1 1 P d 2 2 , and 1 ≤ b = log P 1 log P 2 . Suppose Then there exists c > 0 such that the following holds Furthermore, σ g > 0 provided the system of equations (6) has a non-singular solution in (Z × p ) n 1 +n 2 for each prime p and the system G r (x; y) = 0 (1 ≤ r ≤ R) has a non-singular real solution in (0, 1) n 1 +n 2 . We establish Theorem 2.1 by an application of the Hardy-Littlewood circle method. Let P = P d 1 1 P d 2 2 . We define the major arcs M(ϑ ) to be the set of points α = (α 1 , . . . , α R ) ∈ [0, 1) R satisfying the following: there exist 1 ≤ q ≤ P R(d 1 +d 2 −1)ϑ and a 1 , . . . , a R ∈ Z with gcd(q, a 1 , . . . , a R ) = 1 and 2|qα r − a r | ≤ P −d 1 1 We define the minor arcs to be the complement By the orthogonality relation, we have For a suitable choice of ϑ , we prove estimates for the integral over the minor arcs in Section 3 and over the major arcs in Section 4. In this section, we collect results to set up the proof for these estimates. We make frequent use of the following basic lemma on the dimensions of affine varieties.

Lemma 2.2. Let X be an irreducible affine variety in
where the $Y_i$'s and $Z_j$'s are the irreducible components of $Y$ and $Z$ respectively.
Proof. The first part of the statement is precisely [9, Exercise I.1.8]. For the second part we recall [9, Proposition I.7.1]: if $V$ and $W$ are irreducible affine varieties in $\mathbb{A}^n_{\mathbb{C}}$ and $V \cap W \neq \emptyset$, then $\dim(V \cap W) \ge \dim V + \dim W - n$. The second part of the statement follows immediately from this result, and we leave the details to the reader.
Then we have Proof. We consider the case s = 1 and t = 0 as the general case follows by repeating the argument for this case. It is clear from the definition that Jac F,1 is obtained by removing the first column from Jac G,1 | x 1 =0 . Let W be the affine variety in A n 1 −1+n 2 C defined by the entries of the first column of Jac G,1 | x 1 =0 . In particular, W is defined by R homogeneous polynomials, and hence codim W ≤ R. Let λ 1 (x, y), . . . , λ K 1 (x, y) denote the determinants of matrices formed by R columns of Jac G,1 . Then we see that V * G,1 is defined by these polynomials. Take a point corresponds to R columns of Jac G,1 which contains the first column.
Then since every entry of the first column of Jac G,1 is 0 at (0, x 0 , y 0 ), we have λ k (0, x 0 , y 0 ) = 0. On the other hand, suppose λ k (x, y) corresponds to a collection of R columns which does not contain the first column. In this case λ k (0, x 2 , . . . , x n 1 , y) is the determinant of one of the matrices formed by taking R columns of Jac F,1 , and hence ( x 0 , y 0 ) is a zero of this polynomial. Thus we have λ k (0, x 0 , y 0 ) = 0 in this case as well. Therefore, we have shown that . Next we consider the case i = 2. In this case Jac F,2 is obtained by setting Therefore, it follows that dimV * F,2 ≤ dimV * G,2 , and consequently we have codim V * F,2 ≥ codim V * G,2 − 1. Our result is then immediate.
By applying Cauchy-Schwarz inequality we obtain We then apply Cauchy-Schwarz inequality once more and obtain where In order to simplify our notation we denote u = (x, x ) and v = (y, y ), and write the sum on the right hand side of (11) as It is clear from the definition of the polynomial d r (u; v) given in (12) that it is a degree (d 1 + d 2 ) polynomial (in u and v) whose homogeneous degree Let M 1 be the matrix obtained by removing n 1 columns corresponding to x (that is (n 1 + 1)-th column to (2n 1 )-th column) from Jac D,1 . It is clear from (13) Consequently, by Lemma 2.2 we obtain dimV * D,1 − n 2 ≤ n 1 + dimV * G,1 , which is equivalent to By reversing the roles of x and x with that of y and y , we also obtain codim V * D,2 ≥ codim V * G,2 . Therefore, it follows from (8) that Let δ 0 > 0 be a sufficiently small constant. We now define the following constant In particular, we have We make use of the following generalization of [13,Lemma 4.3] which gives us an exponential sum estimate on the minor arcs. We remark that owing to a minor oversight in [13, pp. 498], the presence of δ 0 in the statement is necessary. Since the lemma can be obtained by following the argument of [13,Lemma 4.3] in our setting, we omit the details. We shall refer to B ⊆ R m as a box, if B is of the form B = I 1 × · · · × I m , where each I j is a closed or open or half open/closed interval (1 ≤ j ≤ m).  f 1 (u; v), . . . , f R (u; v) be degree (d 1 + d 2 ) polynomials with rational coefficients and let their degree (d 1 + d 2 ) homogeneous portions be F 1 (u; v), . . . , F R (u; v) respectively. For each 1 ≤ r ≤ R, suppose F r (u; v) is a bihomogeneous form of bidegree (d 1 , d 2 ) with integer coefficients. Let δ 0 > 0 be a sufficiently small constant. Let P = P d 1 1 Consider the exponential sum Then we have either Here the implicit constant is independent of ϑ , and it is also independent of the coefficients of (f r (u; v) − F r (u; v)) for each 1 ≤ r ≤ R.
We remark that the hypotheses in the statement of Lemma 2.4 are sufficient and the additional assumption [13, lines 1-2, pp.488] is in fact unnecessary; this can be verified by going through the proof of [13,Lemma 4.3] and observing that the expression in [13, line 22, pp.496] is a multilinear form with integer coefficients due to the factor d 1 !d 2 ! as long as F 1 , . . . , F R have integer coefficients. We note the fact that the implicit constant is independent of the lower degree terms of f r (u; v) becomes crucial when we apply this lemma in Section 4. We have the following exponential sum estimate as a corollary which we also use in Section 4.
Here the implicit constant is independent of $\vartheta$. We define which we know to be positive because of (16). Let us fix $\vartheta_0$ satisfying for some $\varepsilon_0 > 0$ sufficiently small, which is possible because of (16). Let us set which can be verified to satisfy $0 < \zeta < 1$. Throughout Sections 3 and 4 we let $C$ be a sufficiently large positive constant which does not depend on $P$. Let us define $\vartheta_{i+1} = \zeta \vartheta_i$ $(0 \le i \le M - 1)$, where $M$ is the smallest positive integer such that $P^{\vartheta_M} \le (\log P)^C$. From the definition of $M$ it follows that $(\log P)^{C\zeta} < P^{\vartheta_M} = P^{\zeta^M \vartheta_0}$, for otherwise we would have $P^{\vartheta_{M-1}} = P^{\vartheta_M/\zeta} \le (\log P)^C$, contradicting the minimality of $M$.
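The geometric sequence $\vartheta_{i+1} = \zeta \vartheta_i$ and the stopping index $M$ can be illustrated numerically. In the sketch below the values of $P$, $\vartheta_0$, $\zeta$ and $C$ are arbitrary placeholders (the paper fixes them through conditions not reproduced here), and logarithms are compared instead of the powers themselves to avoid overflow.

```python
import math


def theta_sequence(P, theta0, zeta, C):
    """Return [theta_0, ..., theta_M], M >= 1 minimal with P**theta_M <= (log P)**C."""
    log_P = math.log(P)
    if not (0 < zeta < 1) or log_P <= 1:
        raise ValueError("need 0 < zeta < 1 and log P > 1")
    threshold = C * math.log(log_P)  # logarithm of (log P)**C
    thetas = [theta0]
    while True:
        thetas.append(zeta * thetas[-1])
        # Compare theta_i * log P with C * log log P.
        if thetas[-1] * log_P <= threshold:
            return thetas
```

For the placeholder values `P = 1e30`, `theta0 = 0.9`, `zeta = 0.9`, `C = 10` the sequence stops at $M = 4$, and the two-sided bound $(\log P)^{C\zeta} < P^{\vartheta_M} \le (\log P)^C$ can be confirmed on the logarithmic scale.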
As the material in this section is fairly standard, we keep the details to a minimum and also refer the reader to [4, Sections 6 and 7] or [16, Section 7] where similar work has been carried out. Let us define $C_0$ by $P^{\vartheta_M} = (\log P)^{C_0}$. It is clear that $C_0$ depends on $P$; however, by the definition of $\vartheta_M$ we have $C\zeta < C_0 \le C$. By the definition of $\mathfrak{M}(\vartheta_M)$ we can write It can be verified that the arcs $\mathfrak{M}_{\mathbf{a},q}(C_0)$ are disjoint for $P$ sufficiently large. We define
$$\psi_h(t) = \sum_{\substack{0 \le v \le t \\ v \equiv h \ (\mathrm{mod}\ q)}} \Lambda^*(v).$$
Let $\phi$ be Euler's totient function. For a positive integer $q$, let $U_q$ be the group of units in $\mathbb{Z}/q\mathbb{Z}$. Let $B_0 = [0, 1]^{n_1+n_2}$ and We denote $P_1^{d_1} P_2^{d_2} \boldsymbol{\beta} = (P_1^{d_1} P_2^{d_2} \beta_1, \ldots, P_1^{d_1} P_2^{d_2} \beta_R)$. With this notation we have the following lemma.
We omit the proof of Lemma 4.1 because it can be established by following the argument of [4, Lemma 6] in our setting and the changes required are minimal.
Let us define It then follows by [13, Lemma 5.6] that under our assumptions on $\mathbf{G}$, namely (8), we have which is called the singular integral, exists, and that We note that $\mu(\infty)$ is the same as what is defined in [13, (5.3)], and we have
$$\mu(\infty) > 0 \qquad (23)$$
provided the system of equations $G_r(\mathbf{x}; \mathbf{y}) = 0$ $(1 \le r \le R)$ has a non-singular real solution in $(0, 1)^{n_1+n_2}$. Let us define the following sums:
$$A(q) = \sum_{\substack{0 \le \mathbf{a} < q \\ \gcd(q, \mathbf{a}) = 1}} \frac{1}{\phi(q)^{n_1+n_2}} S_{\mathbf{a},q}, \qquad S(P) = \sum_{q \le (\log P)^{C_0 R(d_1+d_2-1)}} A(q). \qquad (25)$$
Then by combining Lemma 4.1, (22), and the definition of the major arcs, we obtain the following.
where the summation in the $O$-term is over $1 \le q \le (\log P)^{C_0 R(d_1+d_2-1)}$.
We still have to deal with the term S(P), and this is done in the following section.

Singular Series
We now bound S a,q when q is a prime power. In order to simplify the exposition let us define .
, we can verify that (27)
Lemma 4.3. Let $p$ be a prime and let $q = p^t$, $t \in \mathbb{N}$. Let $0 \le \mathbf{a} < q$ with $\gcd(q, \mathbf{a}) = 1$. Let $\varepsilon > 0$ be sufficiently small. Then we have the following bounds where the implicit constants are independent of $p$ and $t$.
Proof. We consider the two cases t ≤ 2(d 1 + d 2 ) and t > 2(d 1 + d 2 ) separately. We begin with the case t ≤ 2(d 1 + d 2 ). In this case we apply the inclusion-exclusion principle (see [4, (7.3)]) and express S a,q as ∑ I 1 ⊆{1,2,...,n 1 } I 2 ⊆{1,2,...,n 2 } where H I 1 ,I 2 (k; v) is the characteristic function of the set {k ∈ (Z/qZ) n 1 +n 2 : k i, j = pv i, j ( j ∈ I i , i = 1, 2)}. Here we are using the notations k = (k 1 , We now bound the summand in the expression (28) by further considering two cases, . In the first case |I 1 | + |I 2 | > B 2 d 1 +d 2 −2 (R+1) , we use the following trivial estimate On the other hand, suppose . Let us label s = (s 1 , . . . , s n 1 −|I 1 | ) and w = (w 1 , . . . , w n 2 −|I 2 | ) to be the remaining variables of x and y after setting x j = 0 for each j ∈ I 1 and y j = 0 for each j ∈ I 2 respectively. For each 1 ≤ r ≤ R, let f r (s; w) be the polynomial obtained by substituting x j = pv 1, j ( j ∈ I 1 ) and y j = pv 2, j ( j ∈ I 2 ) to the polynomial g r (x; y). Thus f r (s; w) is a polynomial in s and w whose coefficients may depend on p and v. With these notations we have ∑ k∈(Z/qZ) n 1 +n 2 We can also deduce easily that the homogeneous degree (d 1 + d 2 ) portion of the polynomial f r (s; w), which we denote F r (s; w), is obtained by substituting x j = 0 ( j ∈ I 1 ) and y j = 0 ( j ∈ I 2 ) to G r (x; y). In particular, it is independent of p and v. It then follows from Lemma 2.3 that Let ε > 0 be sufficiently small. Thus by Corollary 2.5 we obtain Consequently, we have from (30) that in this case as well. By applying the estimates (29) and (31) in (28), we obtain the desired estimate for the case t ≤ 2(d 1 + d 2 ).
We now consider the case t > 2(d 1 + d 2 ). By the definition of S a,q we have g r (k 1 + pb 1 ; k 2 + pb 2 ) · a r /q .
For each fixed k ∈ U n 1 +n 2 p , we have Clearly every monomial of ϖ r;p,k (b) has degree in b i strictly less than d i for one of i = 1 or 2, and its coefficients are integers which may depend on p and k. We let We can then express the inner sum on the right hand side of (32) as We have that each c r has coefficients in Q, and its degree (d 1 + d 2 ) homogeneous portion G r has coefficients in Z. We apply Lemma 2.4 with B 1 = [0, 1) n 1 , B 2 = [0, 1) n 2 , α r = a r /p t−d 1 −d 2 (1 ≤ r ≤ R), P 1 = P 2 = p t−1 , and P = p (t−1)(d 1 +d 2 ) . Let θ = 1 2(d 1 +d 2 )(d 1 +d 2 −1)(R+1) < 1 d 1 +d 2 . Suppose there exist a 1 , . . . , a R and 1 ≤ q ≤ P R(d 1 +d 2 −1)θ such that gcd( q, a 1 , . . . , a R ) = 1 and .
By a similar argument as in [9, Chapter VIII, §2, Lemma 8.1], one can show that A(q) is a multiplicative function of q. We omit the proof of the following lemma as it is a basic exercise involving the Chinese remainder theorem and manipulating summations. Recall we defined the term S(P) in (25). For each prime p, we define which converges absolutely under our assumptions on g. Furthermore, the following limit exists which is called the singular series. We prove these statements in the following Lemma 4.5.
Therefore, the limit in (36) exists, and the product in (36) converges. We leave the details that these two quantities are equal to the reader.
Proof. For any $t \in \mathbb{N}$, we know that $\phi(p^t) = p^t(1 - 1/p) \ge \frac{1}{2} p^t$. Therefore, by considering the two cases as in the statement of Lemma 4.3 we obtain for some $\delta_1 > 0$, where the last inequality follows from (27). We note that the implicit constants in $\ll$ are independent of $p$ here.
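The elementary bound $\phi(p^t) = p^t(1 - 1/p) \ge \frac{1}{2} p^t$ used at the start of the proof can be checked directly for small prime powers. The sketch below computes $\phi$ by counting residues coprime to $q$, which is only practical for small $q$; the helper names are illustrative.

```python
from math import gcd


def phi(q: int) -> int:
    """Euler's totient by direct count (adequate for small q)."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def check_totient_bound(primes, max_t):
    """Verify phi(p^t) = p^(t-1) * (p - 1) and 2 * phi(p^t) >= p^t."""
    for p in primes:
        for t in range(1, max_t + 1):
            q = p ** t
            assert phi(q) == p ** (t - 1) * (p - 1)
            assert 2 * phi(q) >= q  # equality holds exactly when p = 2
    return True
```

The inequality is sharp at $p = 2$, where $\phi(2^t) = 2^{t-1}$.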
Let $q = p_1^{t_1} \cdots p_v^{t_v}$ be the prime factorization of $q \in \mathbb{N}$. Without loss of generality, suppose we have . Note we can assume the implicit constant in Lemma 4.3 is $1$ for $p$ sufficiently large with the cost of $p^{-\varepsilon}$. By a similar calculation as above and the multiplicativity of $A(\cdot)$, it follows that for some $\delta_2 > 0$, where we obtained the last inequality from (27). We note that the implicit constant in $\ll$ is independent of $q$ here. Therefore, we obtain Using the bound (37), we obtain that the first term in the $O$-term of (26) is bounded by Let $\nu_t(p)$ denote the number of solutions $(\mathbf{x}, \mathbf{y}) \in (U_{p^t})^{n_1+n_2}$ to the congruence relations $g_r(\mathbf{x}; \mathbf{y}) \equiv 0 \pmod{p^t}$ $(1 \le r \le R)$. It is then a basic exercise (see [16, pp. 58]) Therefore, under our assumptions on $\mathbf{g}$ we obtain We can then deduce by an application of Hensel's lemma that $\mu(p) > 0$, if the system (6) has a non-singular solution in $(\mathbb{Z}_p^{\times})^{n_1+n_2}$. From this it follows in combination with (36) and Lemma 4.5 that if the system (6) has a non-singular solution in $(\mathbb{Z}_p^{\times})^{n_1+n_2}$ for every prime $p$, then By combining (38) and Lemmas 4.2 and 4.5, we obtain the following.
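The quantity $\nu_t(p)$ can be computed by exhaustive search for small $p^t$. The following sketch counts solutions in units modulo $p^t$ for a hypothetical single polynomial $g(x; y) = x_1 y_1 - x_2 y_2$ with $n_1 = n_2 = 2$; the example and the function names are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product
from math import gcd


def units(q):
    """Representatives of the unit group U_q of Z/qZ."""
    return [k for k in range(1, q + 1) if gcd(k, q) == 1]


def nu(p, t, polys, n1, n2):
    """Count (x, y) in (U_{p^t})^(n1+n2) with g_r(x; y) = 0 mod p^t for all r."""
    q = p ** t
    U = units(q)
    count = 0
    for point in product(U, repeat=n1 + n2):
        x, y = point[:n1], point[n1:]
        if all(g_r(x, y) % q == 0 for g_r in polys):
            count += 1
    return count


def g(x, y):
    """Hypothetical example with R = 1 and bidegree (1, 1)."""
    return x[0] * y[0] - x[1] * y[1]
```

For this example every choice of units $x_1, x_2, y_1$ determines a unique unit $y_2 \equiv x_1 y_1 x_2^{-1}$, so $\nu_t(p) = \phi(p^t)^3$; in particular `nu(3, 1, [g], 2, 2)` is 8 and `nu(3, 2, [g], 2, 2)` is 216.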
Proposition 4.6. Given any $c > 0$, under our assumptions on $\mathbf{g}$ the following holds where $P^{\vartheta_M} = (\log P)^{C_0}$.

Proof of Theorem 1.3
We begin this section by proving the following theorem.
Proof. Let X be an irreducible component of V * G,1 such that dim X = dimV * G,1 . By relabeling the variables if necessary, let us suppose we have for some 0 ≤ m ≤ n.
Let $P = (\mathbf{x}_0, z_1, \ldots, z_m, \mathbf{0}) \in X$ with $(z_1, \ldots, z_m) \in (\mathbb{C} \setminus \{0\})^m$. Let us consider where the $W_{1,j}$'s are the irreducible components of $X \cap V(y_1 - z_1)$. Recall that if $Z$ is an irreducible affine variety and $H$ is a hypersurface, then exactly one of the following holds: $Z \cap H = Z$, $Z \cap H = \emptyset$, or every irreducible component of $Z \cap H$ has dimension $\dim Z - 1$. Therefore, it follows that $\dim W_{1,j} \ge \dim X - 1$ for each $1 \le j \le \ell_1$.
Next, without loss of generality suppose $P \in W_{1,1}$. Let us consider where the $W_{2,j}$'s are the irreducible components of $W_{1,1} \cap V(y_2 - z_2)$. By the same argument as above, we obtain By continuing in this manner, we obtain the result.
Let us fix (z 1 , . . . , z m ) ∈ (C\{0}) m as in Claim 1. Let z m+1 = · · · = z n = 0. Then we have We also have For each 1 ≤ k ≤ n, let us define Then it follows from (41) that Claim 2: We have Proof of Claim 2. First we have This is because the dimension of is either dim T k+1 or dim T k+1 + 1. Furthermore, intersecting this set with V (x k+1 ), which is T k , either reduces the dimension by 1 or the dimension stays the same. Therefore, we have dim Then it is a basic exercise to show that the largest possible value of max 1≤k≤n L k for any such set of integers is k 0 , where Since we can choose L k = dim T k (1 ≤ k ≤ n), the result follows.
Therefore, by combining (40), (42) and (43), we obtain By symmetry we obtain the same bound for codim V * G,2 as well.
Let $d > 1$. Throughout this section we let $f(\mathbf{x})$ be a degree $d$ polynomial in $\mathbb{Z}[x_1, \ldots, x_n]$, and denote its degree $d$ homogeneous portion by $F(\mathbf{x})$. We now solve the equation
$$f(\mathbf{x}) = 0. \qquad (44)$$
It is clear that $N_2(f; N; N_1, N_2)$ is the number of semiprime solutions $(p_1 q_1, \ldots, p_n q_n) \in [0, N]^n$ to the equation (44), where $p_j \ge q_j$, $p_j \in [0, N_1] \cap \wp$, and $q_j \in [0, N_2] \cap \wp$, counted with weight $\prod_{1 \le j \le n} (\log p_j)(\log q_j)$. We also consider the following modification of the local conditions $(\star)$ given in Section 1.
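The verbal description of the weighted count above can be turned into a brute-force computation for very small parameters. The sketch below follows that description literally (primes $p_j \le N_1$, $q_j \le N_2$ with $p_j \ge q_j$ and $p_j q_j \le N$, each solution weighted by $\prod_j (\log p_j)(\log q_j)$); it is an illustrative assumption about the shape of the count rather than the paper's formal definition, and the function names are hypothetical.

```python
import math
from itertools import product


def primes_up_to(m):
    """Primes k with 2 <= k <= m, by trial division."""
    return [k for k in range(2, int(m) + 1)
            if all(k % j for j in range(2, math.isqrt(k) + 1))]


def weighted_semiprime_count(f, n, N, N1, N2):
    """Sum of prod (log p_j)(log q_j) over solutions (p_1 q_1, ..., p_n q_n) of f = 0."""
    pairs = [(p, q) for p in primes_up_to(N1) for q in primes_up_to(N2)
             if p >= q and p * q <= N]
    total = 0.0
    for choice in product(pairs, repeat=n):
        x = [p * q for p, q in choice]
        if f(x) == 0:
            total += math.prod(math.log(p) * math.log(q) for p, q in choice)
    return total
```

For example, with the hypothetical form $f(x) = x_1 x_2 - x_3 x_4$ and $N = 4$, $N_1 = N_2 = 2$, the only admissible point is $(4, 4, 4, 4)$, and the count equals $(\log 2)^8$.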
Local conditions $(\star')$. The equation $F(\mathbf{x}) = 0$ has a non-singular real solution in $(0, 1)^n$, and the equation (44) has a non-singular solution in $(\mathbb{Z}_p^{\times})^n$ for every prime $p$.
It is clear that these conditions are identical to the local conditions $(\star)$ when the polynomial in consideration is homogeneous. We prove the following theorem. Then we have $N_2(f; N; N^{1-\delta}, N^{\delta}) \gg N^{n-d}$.