On the Termination of the General XL Algorithm and Ordinary Multinomials

The XL algorithm is an algorithm for solving overdetermined systems of multivariate polynomial equations. It was initially introduced for quadratic equations; however, the algorithm works for polynomials of any degree, and we focus on the performance of XL for degree $\geq 3$, where the optimal termination value of $D$ is still unknown. We prove that the XL algorithm terminates at a certain value of $D$ when the number of equations exceeds the number of variables by 1 or 2, and give strong evidence that this value is best possible. Our analysis involves some commutative algebra and a proof that ordinary multinomials are strongly unimodal; this result may be of independent interest.


Introduction
Algorithms and methods for solving systems of polynomial equations in several variables have many applications, and the XL algorithm is one such method. Most papers so far that analyse the XL algorithm have considered quadratic equations only. In this article we will present some results on the success parameter of the XL algorithm for arbitrary degree polynomials.
In Section 2 we give background on the XL algorithm, and in Section 3 we prove a new theoretical result in commutative algebra (following Diem [Die04]) which will allow us to estimate the optimal termination value.
In Section 4 we will prove that the ordinary multinomials $\binom{N}{k}_s$ are strongly unimodal, as well as finding the smallest $k$ such that the inequality $\binom{N}{k}_s \leq k$ holds. These results are of independent interest and this section may be read independently of the rest of the paper. We will prove another inequality involving ordinary multinomials in Section 6. Other papers have outlined proofs of unimodality of ordinary multinomials before; however, our method is different and proves strong unimodality.
Section 5 contains the main results of the paper: our theoretical and computational results on the XL algorithm in the case of one more equation than unknowns. Section 6 considers the case of two more equations than unknowns.

Background on the XL algorithm
The XL (eXtended Linearization) algorithm, introduced in [CKPS00], is an algorithm to solve (overdetermined) systems of multivariate polynomial equations. Consider a system of multivariate polynomial equations
$$f_1(x_1, \ldots, x_n) = 0, \quad \ldots, \quad f_{n+c}(x_1, \ldots, x_n) = 0$$
over a (finite) field $K$, where $c \geq 1$. Fix $D \in \mathbb{N}$, where $D > \deg(f_i)$ for all $i$. We call $D$ the maximal degree. We consider the system of all products
$$\prod_{l=1}^{k} x_{j_l} \cdot f_i(x_1, \ldots, x_n),$$
where $k \leq D - \deg(f_i)$, for $i = 1, \ldots, n+c$. Note that [CKPS00] assumes $\deg(f_i) = 2$ for all $i$, an assumption we do not make. The idea of the XL algorithm is to linearize this new system in the hope of finding a univariate equation, which then allows us to solve the initial system.
Definition 2.1: (The XL Algorithm) Input: polynomials $f_1, \ldots, f_{n+c}$ in variables $x_1, \ldots, x_n$ (an overdetermined system with 0-dimensional solution space) and a positive integer $D \geq 1 + \max_i \deg(f_i)$.
1. Multiply: Generate all the products $\prod_{l=1}^{k} x_{j_l} \cdot f_i$ with $k \leq D - \deg(f_i)$.
2. Linearize: Consider each monomial in the $x_i$ of degree $\leq D$ as a new variable and perform Gaussian elimination on the equations obtained in step 1. The ordering on the monomials must be such that all the terms containing one (fixed) variable (say $x_1$) are eliminated last.
See [CY04] for some variations and discussions of the algorithm, and a comparison with Gröbner basis algorithms. In particular, it was shown by Moh [Moh00] (see also [CY04]) that the algorithm terminates for some $D$ provided the solution set is 0-dimensional.
One might set the starting value of $D$ at $D = 1 + \max_i \deg(f_i)$. If the 'Solve' step fails for this $D$, or any $D$, we increment the input value of $D$ and run the algorithm again. In practice, the first $D$ at which the algorithm succeeds and terminates is usually larger than $D = 1 + \max_i \deg(f_i)$.
Considering only those $D$ for which the XL algorithm succeeds, the algorithm increases in running time with $D$. Therefore, when looking for maximum efficiency we would like to know the smallest $D$ such that the XL algorithm succeeds. Let us denote this value by $D^*$, the optimal input value of $D$. If we know $D^*$, we would use this as the starting input value for $D$. If we don't know the exact value of $D^*$ then a lower bound for $D^*$ is useful because we can use the lower bound as the starting value in the XL algorithm. Using ordinary multinomials, we will derive some lower bounds in this article.

A Hilbert series associated to the XL algorithm
We will now follow C. Diem [Die04] to set up the theoretical background for determining the optimal choice of the maximal degree $D$.
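Let $V_D$ be the $K$-vector space generated by the products produced in the first step of the XL algorithm, i.e.
$$V_D := \Big\langle \prod_{l=1}^{k} x_{j_l} \cdot f_i \;:\; k \leq D - \deg(f_i),\ i = 1, \ldots, n+c \Big\rangle_K,$$
and let $\chi(D) := \dim_K K[x_1, \ldots, x_n]_{\leq D} - \dim_K V_D$, where $K[x_1, \ldots, x_n]_{\leq D}$ is the space of polynomials of total degree at most $D$.
Theorem 3.1: If $\chi(D) \leq D$, then $V_D$ contains a nonzero univariate polynomial in $x_1$.
Proof: The span of $1, x_1, \ldots, x_1^D$ has dimension $D + 1 > \chi(D)$, so it meets $V_D$, a subspace of codimension $\chi(D)$, nontrivially. So step 2 of the XL algorithm produces a univariate equation in $x_1$ in this case.
Clearly then, for reasons of efficiency of the XL algorithm, we would like to know the smallest $D$ such that $\chi(D) \leq D$. We will now try to estimate $\chi(D)$ in order to find the smallest $D$ such that $\chi(D) \leq D$.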
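The Multiply and Linearize steps, and the role played by $\chi(D)$, can be illustrated on a toy example. The following is a minimal sketch, not the authors' implementation. It assumes a small prime field GF(7), it assumes that $\chi(D)$ is the number of monomials of degree at most $D$ minus the rank of the linearized system, and it uses a hypothetical quadratic system in two unknowns with common root $(x_1, x_2) = (2, 3)$. The names `xl_step`, `monomials`, `shift` and `degree` are illustrative only.

```python
from itertools import product

P = 7  # small prime field GF(7); an assumption made for this demo only

def monomials(n, max_deg):
    """All exponent tuples in n variables of total degree <= max_deg."""
    return [e for e in product(range(max_deg + 1), repeat=n) if sum(e) <= max_deg]

def degree(poly):
    """Total degree of a polynomial stored as {exponent tuple: coefficient}."""
    return max(sum(e) for e in poly)

def shift(poly, mono):
    """Multiply a polynomial by a monomial (given as an exponent tuple)."""
    return {tuple(a + b for a, b in zip(e, mono)): c for e, c in poly.items()}

def is_x1_only(e):
    """True for monomials that are pure powers of x_1 (including 1)."""
    return all(v == 0 for v in e[1:])

def xl_step(polys, n, D, p=P):
    """Run Multiply and Linearize once; return (chi, univariate rows)."""
    # Column order: mixed monomials first, powers of x_1 last (highest first),
    # so that the x_1-only terms are eliminated last.
    cols = sorted(monomials(n, D), key=lambda e: (is_x1_only(e), -e[0]))
    index = {e: j for j, e in enumerate(cols)}
    # Multiply: every product (monomial of degree <= D - deg f) * f.
    rows = []
    for f in polys:
        for m in monomials(n, D - degree(f)):
            row = [0] * len(cols)
            for e, c in shift(f, m).items():
                row[index[e]] = c % p
            rows.append(row)
    # Linearize: Gauss-Jordan elimination modulo p.
    rank, pivots = 0, []
    for j in range(len(cols)):
        piv = next((r for r in range(rank, len(rows)) if rows[r][j]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][j], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][j]:
                factor = rows[r][j]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        pivots.append(j)
        rank += 1
    chi = len(cols) - rank
    # A row whose pivot lies in the x_1-only block is univariate in x_1;
    # it is returned as {power of x_1: coefficient}.
    univariate = [
        {cols[k][0]: v for k, v in enumerate(rows[r]) if v}
        for r, j in enumerate(pivots) if is_x1_only(cols[j])
    ]
    return chi, univariate

# Hypothetical system over GF(7) with common root (2, 3):
f1 = {(2, 0): 1, (0, 1): 1}             # x1^2 + x2
f2 = {(1, 1): 1, (0, 0): 1}             # x1*x2 + 1
f3 = {(0, 2): 1, (1, 0): 1, (0, 0): 3}  # x2^2 + x1 + 3
chi, uni = xl_step([f1, f2, f3], n=2, D=3)
```

For this particular system the nine products are linearly independent, so `chi` is 1, which is at most $D = 3$, and the last univariate row is $x_1 + 5 \equiv x_1 - 2 \pmod 7$, recovering the root $x_1 = 2$. In a full XL run one would increase `D` and call `xl_step` again whenever the list of univariate rows comes back empty.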
Let $K[x_0, x_1, \ldots, x_n]_D$ be the $K$-vector space of all homogeneous polynomials of total degree $D$. Let $F_i \in K[x_0, x_1, \ldots, x_n]$ denote the homogenization of $f_i$. Then $V_D \cong I_D$ via the degree $D$ homogenization map, where $I_D$ is the $D$th homogeneous component of the homogeneous ideal $I := (F_1, \ldots, F_{n+c}) \trianglelefteq K[x_0, \ldots, x_n]$. It follows that
$$\chi(D) = \dim_K K[x_0, \ldots, x_n]_D - \dim_K I_D,$$
that is, $\chi(D)$ is the coefficient of $T^D$ in the Hilbert series of $K[x_0, \ldots, x_n]/I$.
[...]
Proposition 3.5: Let $K$ be a field (any characteristic), let $F_1, \ldots, F_m \in R := K[x_0, \ldots, x_n]$ be forms of degree $d_1, \ldots, d_m$ (not necessarily generic). Let $I := (F_1, \ldots, F_m) \trianglelefteq R$. Let $H_g$ be the generic Hilbert series of type $(n+1; m; d_1, \ldots, d_m)$. Then $H_{R/I} \geq H_g$ coefficient-wise.
Proof: See [Die04].
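[...] is injective, and we have a short exact sequence [...]
Proposition 3.7: Let $G_1, \ldots, G_m \in R$ be a generic system of forms of degrees $d_1, \ldots, d_m$ with $m \leq n+1$. Then
$$H_{R/(G_1, \ldots, G_m)} = \frac{\prod_{i=1}^{m} (1 - T^{d_i})}{(1-T)^{n+1}}.$$
Proof: By the previous proposition, $H_{R/(G_1, \ldots, G_m)} = (1 - T^{d_m}) \, H_{R/(G_1, \ldots, G_{m-1})}$, and $H_R = 1/(1-T)^{n+1}$.
All preceding results of this section can be found in [Die04]. We include the statements because they are needed for the next result, which is new.
Theorem 3.8: [...]
Proof: [...] [Die04]. The general case is similar. We will prove that the generic Hilbert series of type $(n+1; n+c; d_1, \ldots, d_{n+c})$ [...]. The result then follows by Proposition 3.5. Let $G_1, \ldots, G_{n+c} \in R := K[x_0, \ldots, x_n]$ (with char $K = 0$) be a generic system of forms of degrees $d_1, \ldots, d_{n+c}$. Let $R' := R/(G_1, \ldots, G_{n+1})$ and let [...] by Proposition 3.6. Thus we have [...] and hence [...].
Corollary 3.9: Let $f_1, \ldots, f_{n+1} \in K[x_1, \ldots, x_n]$ with $\deg(f_i) = d$ for all $i$. Then $\chi(D) \geq \binom{n+1}{D}_{d-1}$. Furthermore, if all the forms are generic, then equality holds.
Proof: Take $c = 1$ in Theorem 3.8, and recall that the coefficient of $T^D$ in $(1 + T + \cdots + T^{d-1})^{n+1}$ is the ordinary multinomial $\binom{n+1}{D}_{d-1}$. The second statement follows from Proposition 3.7.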
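To make the Hilbert series in this section concrete, here is a small sketch that expands a series of the form $\prod_i (1 - T^{d_i}) / (1-T)^{n+1}$ as a power series. This is the standard Hilbert series of a complete intersection, and we assume it is the form taken by the series in Proposition 3.7. The function name `complete_intersection_hilbert` is hypothetical.

```python
def complete_intersection_hilbert(n, degrees, max_deg):
    """Coefficients of T^0..T^max_deg in prod(1 - T^d) / (1 - T)^(n+1)."""
    # Start from the constant power series 1.
    series = [1] + [0] * max_deg
    # Dividing by (1 - T) replaces the coefficients by their running sums.
    for _ in range(n + 1):
        for k in range(1, max_deg + 1):
            series[k] += series[k - 1]
    # Multiplying by (1 - T^d) subtracts a copy of the series shifted by d.
    # Iterating k downwards ensures the unshifted (old) values are used.
    for d in degrees:
        for k in range(max_deg, d - 1, -1):
            series[k] -= series[k - d]
    return series

# n = 2 unknowns, n + 1 = 3 forms of degree d = 3:
print(complete_intersection_hilbert(2, [3, 3, 3], 8))
# prints [1, 3, 6, 7, 6, 3, 1, 0, 0]
```

With $n + 1$ forms of the same degree $d$, the series collapses to $(1 + T + \cdots + T^{d-1})^{n+1}$, so the printed coefficients are the ordinary multinomials $\binom{3}{D}_2$ for $D = 0, \ldots, 6$, followed by zeros.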
In Section 5 we use this Corollary to find the smallest D such that χ(D) ≤ D (and XL succeeds by Theorem 3.1) in the c = 1 case.
Remark 3.10: We remark that there are three possibly different $D$'s under discussion:
1. the smallest $D$ such that the XL algorithm terminates (call it $D^*$),
2. the smallest $D$ such that $\chi(D) \leq D$ (call it $D_\chi$),
3. the smallest $D$ such that $\binom{n+1}{D}_{d-1} \leq D$ (call it $D_m$).
In the case $c = 1$, Corollary 3.9 implies that $D_m \leq D_\chi$, and $D_m = D_\chi$ when the equations are generic.
In Section 5 we investigate the relationship between these when $c = 1$. At the end of the section we will conjecture that $D^* = D_\chi = D_m$ when $c = 1$ and the equations are generic, and provide evidence.

Ordinary Multinomials
This section is independent of the rest of the paper. We will prove here that ordinary multinomials are strongly unimodal, and some inequalities.
Definition 4.1: A sequence $s_0, s_1, \ldots, s_N$ of integers is said to be unimodal if there is an integer $t$ with $0 \leq t \leq N$ such that
$$s_0 \leq s_1 \leq \cdots \leq s_t \geq s_{t+1} \geq \cdots \geq s_N.$$
A unimodal sequence is said to be strongly unimodal if all the inequalities are strict, except that $s_t = s_{t+1}$ may hold. For example, the sequence of binomial coefficients $\binom{N}{k}$ ($k = 0, 1, \ldots, N$) is strongly unimodal.
Remark 4.2: We remark that there are different definitions of strongly unimodal in the literature. One definition is the same as ours except it does not allow $s_t = s_{t+1}$, in which case the binomial coefficients are strongly unimodal only for $N$ even.
[...]
Theorem 4.7: The ordinary multinomials are strongly unimodal. For $N \geq 2$ we have [...]
Proof: It follows from Lemma 4.6 that [...]. We proceed by induction on $N$.
Case 1: where $s(N-1)$ is even. In this case $\lceil s(N-1)/2 \rceil$ [...] by induction assumption and since $m < s/2$.
Case 2: where $s(N-1)$ is odd. In this case $\lceil s(N-1)/2 \rceil$ [...].
[...] If $sN$ is even, then we have a peak at $sN/2$.
Remark 4.9: One can also prove (weak) unimodality of $\binom{N}{k}_s$ using Theorem 4.7 of [DJD88], which says that the convolution of two symmetric discrete unimodal distributions is again unimodal, along with $\binom{N}{k}_s =$ [...]
Here is a theorem we will need later in Section 5.
Theorem 4.12: For $2 \leq s < \frac{N+1}{2} + \frac{2}{N}$, the smallest $k$ such that $\binom{N}{k}_s \leq k$ is $k = sN - 1$.
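The statements of this section are easy to check numerically for small parameters. The sketch below assumes the standard definition of the ordinary multinomial $\binom{N}{k}_s$ as the coefficient of $x^k$ in $(1 + x + \cdots + x^s)^N$, which is the definition consistent with the way these numbers are used in the rest of the paper. All function names are illustrative.

```python
def ordinary_multinomials(N, s):
    """Return [binom(N, k)_s for k = 0..s*N], the coefficients of (1 + x + ... + x^s)^N."""
    coeffs = [1]
    for _ in range(N):
        nxt = [0] * (len(coeffs) + s)
        # Multiply the current polynomial by (1 + x + ... + x^s).
        for i, c in enumerate(coeffs):
            for j in range(s + 1):
                nxt[i + j] += c
        coeffs = nxt
    return coeffs

def is_strongly_unimodal(seq):
    """Strictly increasing, at most one tie at the peak, then strictly decreasing."""
    i = 0
    while i + 1 < len(seq) and seq[i] < seq[i + 1]:
        i += 1
    if i + 1 < len(seq) and seq[i] == seq[i + 1]:
        i += 1  # the single allowed equality s_t = s_{t+1}
    while i + 1 < len(seq) and seq[i] > seq[i + 1]:
        i += 1
    return i == len(seq) - 1

def smallest_k(N, s):
    """Smallest k with binom(N, k)_s <= k (exists because the last coefficient is 1)."""
    row = ordinary_multinomials(N, s)
    return next(k for k, value in enumerate(row) if value <= k)

print(ordinary_multinomials(4, 2))  # [1, 4, 10, 16, 19, 16, 10, 4, 1]
print(smallest_k(4, 2))             # 7, which equals s*N - 1
```

For $N = 4$ and $s = 2$ the hypothesis of Theorem 4.12 reads $2 \leq s < 3$, and the computed value $7 = sN - 1$ agrees with the theorem. The case $N = 1$, $s = 2$ gives the sequence $1, 1, 1$, which is not strongly unimodal, in line with the restriction to $N \geq 2$.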
We will also use the following result, which will be proved in Section 6. Theorem 4.14:

One more equation than unknowns (c = 1) in XL
We are now ready to find our lower bound on the smallest $D$ such that $\chi(D) \leq D$ for the case $c = 1$, i.e., when the number of equations exceeds the number of unknowns by 1. By Theorem 3.1 the XL algorithm will succeed for this $D$, so this may be a good choice for the initial value of $D$. We will see that this is actually an excellent choice, and very often the optimal choice. According to Corollary 3.9, $\chi(D) \geq \binom{n+1}{D}_{d-1}$ when we have $n$ unknowns, $n+1$ equations, and all equations have the same degree $d$. According to Theorem 3.1, $\chi(D) \leq D$ is a sufficient condition for the XL algorithm to succeed. Combining these we get
$$\binom{n+1}{D}_{d-1} \leq \chi(D) \leq D$$
as sufficient for success, and so we consider the inequality $\binom{n+1}{D}_{d-1} \leq D$.
Corollary 5.1: If $f_1, \ldots, f_{n+1} \in K[x_1, \ldots, x_n]$ with $\deg(f_i) = d$ for all $i$ and $3 \leq d \leq \frac{n+2}{2} + \frac{2}{n+1}$, then the smallest $D$ such that $\chi(D) \leq D$ is at least $(d-1)(n+1) - 1$.
Proof: Take $s = d - 1$ and $N = n + 1$ in Theorem 4.12.
Remark 5.2: This is why we proved Theorem 4.12 earlier. We note that the proof of Theorem 4.12 uses strong unimodality of ordinary multinomials, and this is why we included that result in our paper.
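The lower bound of Corollary 5.1 can be compared against a direct search. The following sketch assumes, as elsewhere in the paper, that the relevant ordinary multinomial is the coefficient of $T^D$ in $(1 + T + \cdots + T^{d-1})^{n+1}$. It searches for the smallest $D$ with $\binom{n+1}{D}_{d-1} \leq D$ and compares it with $(d-1)(n+1) - 1$. The helper names are hypothetical.

```python
def multinomial_coefficients(N, s):
    """Coefficients of (1 + x + ... + x^s)^N, i.e. binom(N, k)_s for k = 0..s*N."""
    coeffs = [1]
    for _ in range(N):
        # Each new coefficient is a sliding-window sum of s + 1 old ones.
        coeffs = [sum(coeffs[max(0, k - s): k + 1]) for k in range(len(coeffs) + s)]
    return coeffs

def smallest_multinomial_D(n, d):
    """Smallest D with binom(n+1, D)_{d-1} <= D, found by direct search."""
    row = multinomial_coefficients(n + 1, d - 1)
    return next(D for D, value in enumerate(row) if value <= D)

def corollary_5_1_applies(n, d):
    """Hypothesis 3 <= d <= (n+2)/2 + 2/(n+1), with denominators cleared."""
    return d >= 3 and 2 * d * (n + 1) <= (n + 2) * (n + 1) + 4

def corollary_5_1_bound(n, d):
    """The lower bound (d - 1)(n + 1) - 1 from Corollary 5.1."""
    return (d - 1) * (n + 1) - 1

# Compare search and closed form wherever the hypothesis holds.
for n in range(3, 9):
    for d in range(3, 8):
        if corollary_5_1_applies(n, d):
            assert smallest_multinomial_D(n, d) == corollary_5_1_bound(n, d)
```

For example, with $n = 3$ unknowns and cubic equations ($d = 3$) both the search and the closed form give $D = 7$, so $D = 7$ would be the suggested starting value for XL in that case.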
Remark 5.7: In the notation of Remark 3.10 we are conjecturing that $D^* = D_\chi = D_m$.
Remark 5.8: Our conjecture is consistent with Proposition 6 in [Die04] where it is shown that $D_\chi \geq n/(1 + \sqrt{c-1})$ for quadratic equations. When $c = 1$ this becomes $D_\chi \geq n$, and our conjecture with $d = 2$ becomes $D^* = n$.
Remark 5.9: When the equations do not all have the same degree, we can take $d$ to be the maximum of the degrees and the above arguments will give a good starting input value for $D$, but not necessarily the optimal value.

Two more equations than unknowns (c = 2) in XL
When $c > 1$ ($c$ being the difference between the number of equations and the number of unknowns), then Theorem 3.8 tells us that [...] nonnegative. Now let $k = s + t$ where $1 \leq t \leq s$, i.e. $s < k \leq 2s$. Thus [...]. Now we are left with considering the inequality $(s - t + 1)(s + 2t + 1) +$ [...]. Since $t \leq s$, we can write $s = t + a$, where $a \geq 0$, and substitute:
$$(t+a)^2 - 3t^2 + 2(t+a)t + (t+a) - 3t + 2 = a^2 + a + 4at - 2t + 2.$$
The right hand side is $\leq 0$ when $a = 0$, and $> 0$ when $a \geq 1$. Thus [...] (see proof of Theorem 4.7). We will proceed by induction on $N$, and assume that [...]
Case 1: $2 \mid sN$. Assume $2t \leq s - 1$.
For $N \geq 4$, $2t \leq s - 1$ implies
$$2t(sN - s + 1) \leq s^2 N - s^2 - sN + 2s - 1 < s^2 N - s^2 - s$$
[...]
Case 2: $2 \nmid sN$. Assume $2t \leq s$. [...]
We conjecture that this lower bound is tight. Based on the $c = 1$ and $c = 2$ cases, it appears as though the optimal starting value for $D$ in the XL algorithm will be approximately $(d-1)(n+c)/c$.
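The two elementary computations used in this section, namely the substitution $s = t + a$ and the chain of inequalities for $N \geq 4$, can be sanity-checked by brute force over a range of small integers. This is only a numerical check under the stated ranges, not a replacement for the proof, and the function names are hypothetical.

```python
def substituted_expression(s, t):
    """s^2 - 3t^2 + 2st + s - 3t + 2, the expression before setting s = t + a."""
    return s * s - 3 * t * t + 2 * s * t + s - 3 * t + 2

def check_section6_inequalities(limit=30):
    """Return True if both computations hold for all parameters below `limit`."""
    # Substitution s = t + a with t >= 1 and a >= 0.
    for t in range(1, limit):
        for a in range(0, limit):
            value = substituted_expression(t + a, t)
            if value != a * a + a + 4 * a * t - 2 * t + 2:
                return False
            # The value is <= 0 exactly when a = 0.
            if (value <= 0) != (a == 0):
                return False
    # Chain of inequalities for N >= 4 under the assumption 2t <= s - 1.
    for N in range(4, limit):
        for s in range(1, limit):
            for t in range(1, (s - 1) // 2 + 1):
                middle = s * s * N - s * s - s * N + 2 * s - 1
                upper = s * s * N - s * s - s
                if not (2 * t * (s * N - s + 1) <= middle < upper):
                    return False
    return True

print(check_section6_inequalities())  # True
```

The first inequality in the chain follows from $2t \leq s - 1$ by multiplying both sides by $sN - s + 1$, and the second reduces to $3s - 1 < sN$, which is why it holds comfortably for $N \geq 4$.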