On the degree of varieties of sum of squares

We study the problem of how many different sums of squares decompositions a general polynomial $f$ with SOS-rank $k$ admits. We show that there is a link between the variety $\mathrm{SOS}_k(f)$ of all SOS-decompositions of $f$ and the orthogonal group $\mathrm{O}(k)$. We exploit this connection to obtain the dimension of $\mathrm{SOS}_k(f)$ and show that its degree is bounded from below by the degree of $\mathrm{O}(k)$. In particular, for $k=2$ we show that $\mathrm{SOS}_2(f)$ is isomorphic to $\mathrm{O}(2)$ and hence the degree bound becomes an equality. Moreover, we compute the dimension of the space of polynomials of SOS-rank $k$ and obtain the degree in the special case $k=2$.


Introduction
Motivation. Let $V$ be a complex vector space of dimension $n + 1$ with basis $\{x_0, \dots, x_n\}$ and let $d \geq 0$ be an integer. Let $f \in \mathbb{C}[x_0, \dots, x_n]$ be a homogeneous polynomial of degree $2d$, that is, $f \in \mathrm{Sym}^{2d} V$. A motivating case, when $f$ is real, is the problem of computing the global infimum of $f$, $f^* = \inf_{z \in \mathbb{R}^n} f(z)$. Polynomial optimisation problems appear frequently in practice in many different fields, including areas of engineering and social science such as computer vision [PPCVG19, AST13], control theory [HG05, HvK03] and optimal design [DCGH+19]. However, already for $\deg f \geq 4$ this is an NP-hard problem [MK87]. As such, many methods have been developed to approximate $f^*$. A popular method is to relax the optimisation problem using sums of squares: clearly, being a sum of squares implies non-negativity. It is well known that these notions are equivalent for homogeneous polynomials in two variables. However, by Motzkin's counterexample, this is not true in general [Mot67].
In [Las00], using the duality between moments and sums of squares, Lasserre constructed a hierarchy of semi-definite programs whose solutions converge to the true infimum $f^*$. However, in general, the decompositions obtained from semi-definite programming are only approximate certificates of non-negativity. In recent years there has been increased interest in computing exact certificates [PP08, MSED18]. Hence, one wants to understand the algebraic structure of SOS decompositions and the related semi-definite programs.
Prior works. Following the classical works of Sylvester [Syl51], the study of so-called Waring decompositions, that is, decompositions of homogeneous polynomials as sums of powers of linear forms, is an active area of research. In [FOS12] it was proved that any general $f \in \mathrm{Sym}^{2d} V$ is a sum of at most $2^n$ squares. For fixed $n$, this bound is sharp for all sufficiently large $d$. The authors of [LORS19] investigate the minimal number of squares in a decomposition of a generic polynomial in two variables. Then, in [FLOS18], the authors give a conjecture on the generic SOS-rank of polynomials, see Definition 1.2, in terms of the number of variables and the degree. In contrast, in this paper we study generic polynomials of a given SOS-rank.
In this paper, one aim is to analyse the degree of SOS decompositions directly from an algebraic geometry point of view. Another aim is to better understand the structure of the SOS decompositions of a given polynomial.
Main results. We consider SOS decompositions of polynomials of degree $2d$.
In this paper, we define and study two varieties related to exact SOS decompositions. The first is defined by all polynomials of rank less than or equal to $k$, with a general point $f \in \mathrm{SOS}_k$ being a polynomial of rank $k$.
Definition 1.2. Let $\mathrm{SOS}_k$ be the subvariety of $\mathrm{Sym}^{2d} V$ obtained as the Zariski closure of the set of all polynomials of SOS-rank $k$.
The generic SOS-rank is the smallest number $k$ such that $\mathrm{SOS}_k$ covers the ambient space.
Instead of analysing all polynomials of a given rank, one can also seek to understand all the different decompositions of a general polynomial $f$.
Definition 1.3. Let $f \in \mathrm{SOS}_k$ be a generic polynomial. We define the variety of all the SOS decompositions of $f$ as
$$\mathrm{SOS}_k(f) = \Big\{ (f_1, \dots, f_k) \in \prod_{i=1}^{k} \mathrm{Sym}^d V \;\Big|\; \sum_{i=1}^{k} f_i^2 = f \Big\}.$$
While we investigate the variety $\mathrm{SOS}_k(f)$ for all ranks $k$, we give a complete description in the case $k = 2$.
be a generic polynomial that is the sum of two squares. Then, $\mathrm{SOS}_2(f)$ has two irreducible components isomorphic to $\mathrm{SO}(2)$.
Since $\mathrm{SO}(k)$ acts on any decomposition using $k$ squares, we have the inequality
In Corollary 3.6 we prove a statement which implies the following result.
Theorem 1.5. Let $f \in \mathrm{SOS}_k$ be generic with $k \leq n$. Then,
By analysing the general polynomial in $\mathrm{SOS}_2$, we prove a formula for the degree of this variety.
Theorem 1.6. Let $N = \dim \mathrm{Sym}^d V = \binom{n+d}{d}$. The degrees of the varieties of squares and of sums of two squares in $\mathbb{P}(\mathrm{Sym}^{2d} V)$ are given by
Moreover, the dominant map $\pi$ :
Structure of the paper. In Section 2, we begin by recalling some definitions from sums of squares decompositions, algebraic geometry and commutative algebra. Then, in Section 3 we investigate the variety of all possible decompositions of a given polynomial as a sum of $k$ squares. We describe the action of the orthogonal group $\mathrm{O}(k)$ on this variety and conjecture that there is an isomorphism between these two objects. We provide experimental and theoretical support for this conjecture and conclude by showing that it holds for $k = 2$. Finally, in Section 4 we use the results of Section 3 to prove a formula for the degree of the variety of all SOS decompositions of two squares, in addition to an upper bound on this degree for $k \geq 3$.

Preliminaries
Let $V$ be a complex vector space of dimension $n + 1$. We denote the $n$-dimensional projective space associated to $V$ by $\mathbb{P}V$.
Definition 2.1. We define the $d$-Veronese embedding as the map
$$\nu_d : \mathbb{P}V \to \mathbb{P}\,\mathrm{Sym}^d V, \qquad [v] \mapsto [v^d].$$
Notice that the map $\nu_d$ is closed [SR13]. Therefore, we define the $d$-Veronese variety in $\mathbb{P}\,\mathrm{Sym}^d V$ as the image of $\mathbb{P}V$ under the Veronese embedding $\nu_d$.
In other words, f is the sum of r decomposable polynomials.
Observe that the Veronese variety $\nu_d(\mathbb{P}V) \subset \mathbb{P}\,\mathrm{Sym}^d V$ consists exactly of the rank one polynomials.
Definition 2.3. Let $X$ be a subvariety of $\mathbb{P}V$. The $k$-th secant variety of $X$, denoted $\Sigma_k(X)$, is defined as the Zariski closure of the union of all linear subspaces spanned by $k$ points of $X$. That is,
$$\Sigma_k(X) = \overline{\bigcup_{x_1, \dots, x_k \in X} \langle x_1, \dots, x_k \rangle}.$$
If $X = \nu_d(\mathbb{P}V) \subset \mathbb{P}\,\mathrm{Sym}^d V$, then the generic elements of the $k$-th secant variety of the Veronese variety are exactly the polynomials of rank $k$, as long as the inclusion $\Sigma_k(\nu_d(\mathbb{P}V)) \subset \mathbb{P}\,\mathrm{Sym}^d V$ is strict.
Let $U$ denote $\mathrm{Sym}^d V$. We can decompose $\mathrm{Sym}^2 U$ as follows:
$$\mathrm{Sym}^2 U = \mathrm{Sym}^{2d} V \oplus C,$$
where $C$ is obtained by plethysm, see [Wey03] for more details. The space $C$ corresponds to the quadrics on $U$ that vanish on $\nu_d(\mathbb{P}V)$. Moreover, $\mathrm{Sym}^{2d} V$ is the degree two piece of the coordinate ring of $\nu_d(\mathbb{P}V)$.
Let $\{x_0, \dots, x_n\}$ be a basis of $V$. Consider a basis
Switching to the coordinates given by $V$ we have
This means that rank one quadrics in $\mathrm{Sym}^2 U$ correspond to squares in $\mathrm{Sym}^{2d} V$. Furthermore, applying the same argument to a rank $k$ quadric $f \in \mathrm{Sym}^2 U$, we see that $f$ corresponds to a sum of $k$ squares in $\mathrm{Sym}^{2d} V$.
Notice that if $(f_1, \dots, f_k) \in \mathrm{SOS}_k(f)$, as defined in Definition 1.3, then for any permutation $\sigma \in S_k$, where $S_k$ is the symmetric group on $k$ letters, we have that
$$(f_{\sigma(1)}, \dots, f_{\sigma(k)}) \in \mathrm{SOS}_k(f).$$
One could desire to remove such "overlapping" points by taking the quotient by $S_k$. However, there is another important group, containing such permutations, that acts on $\mathrm{SOS}_k(f)$.
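Let $\mathrm{O}(k)$ be the group of $k \times k$ orthogonal matrices. Fix a point $(f_1, \dots, f_k) \in \mathrm{SOS}_k(f)$ and fix an ordering of the basis $\{w_1, \dots, w_N\}$ of $\mathrm{Sym}^d V$. Define $A$ to be the $k \times N$ matrix whose $i$-th row consists of the coefficients of the polynomial $f_i$. Then, with $w = (w_1, \dots, w_N)$,
$$f = \sum_{i=1}^{k} f_i^2 = \|A w^t\|^2 = w A^t A w^t.$$
Let $O \in \mathrm{O}(k)$; then the action of $O$ on the left of $A$ preserves the polynomial $f$. That is,
$$\|O A w^t\|^2 = w A^t O^t O A w^t = w A^t A w^t = f.$$
Such an action leads to a different decomposition $(f'_1, \dots, f'_k)$ of $f$, where $f'_i$ is given by the $i$-th row of the matrix $OA$.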
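The invariance of $f$ under the left action of $\mathrm{O}(k)$ can be checked numerically. The following is a minimal sketch over the reals, assuming NumPy is available; the sizes `k` and `N`, the random coefficient matrix and the variable names are illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
k, N = 3, 6  # hypothetical sizes: k squares, N = dim Sym^d V

# Rows of A: coefficient vectors of f_1, ..., f_k in the basis w_1, ..., w_N.
A = rng.standard_normal((k, N))

# The Q factor of a random square matrix is an orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((k, k)))

# Rows of Q A: coefficient vectors of the new decomposition f'_1, ..., f'_k.
A_new = Q @ A

# Both decompositions share the Gram matrix A^t A, hence represent the same f.
same_gram = np.allclose(A.T @ A, A_new.T @ A_new)
print(same_gram)
```

The two coefficient matrices differ, but the quadratic form $w A^t A w^t$ that they define is the same.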
Definition 2.4. Let $f \in \mathrm{Sym}^d V$, let $\{x_0, \dots, x_n\}$ be a basis of $V$ and let $\partial_0, \dots, \partial_n$ be the dual basis of $V^\vee$. For each $m < d$, we define the linear map
$$\mathrm{Sym}^m V^\vee \to \mathrm{Sym}^{d-m} V, \qquad D \mapsto D(f).$$
The matrix corresponding to this linear map is called the catalecticant matrix of $f$.
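We give some cohomological definitions that are going to be used later on. Let $S = \bigoplus_q \mathrm{Sym}^q(V)$ be the symmetric algebra of $V$.
Definition 2.5. Let $R$ be a ring and $F$ a free module of rank $r$ over $R$. Given an $R$-linear map $k : F \to R$, the complex
$$0 \to \bigwedge^r F \xrightarrow{\phi_r} \bigwedge^{r-1} F \to \cdots \to \bigwedge^1 F \xrightarrow{\phi_1} R \to 0$$
is called the Koszul complex associated to $k$. The maps $\phi_l$ are defined as
$$\phi_l(e_1 \wedge \cdots \wedge e_l) = \sum_{i=1}^{l} (-1)^{i+1} k(e_i)\, e_1 \wedge \cdots \wedge \widehat{e_i} \wedge \cdots \wedge e_l,$$
where the notation $\widehat{e_i}$ means that this element is omitted from the product.
Definition 2.6. Let $M$ be a finitely generated graded $S$-module and let $F_0, \dots, F_m$ be the free $S$-modules that give a minimal free resolution of $M$. That is, there is an exact sequence
$$0 \to F_m \to F_{m-1} \to \cdots \to F_1 \to F_0 \to M \to 0$$
and the matrices of the maps $\varphi_i : F_{i+1} \to F_i$ have no non-zero constant entry, see [Eis95]. The Betti number $\beta_{i,j}$ is the number of generators of degree $j$ needed to describe $F_i$. That is,
$$F_i = \bigoplus_j S(-j)^{\beta_{i,j}},$$
where $S(-j)$ denotes $S$ with its grading shifted by $j$.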
Definition 2.7. Let $M$, $N$ be two graded $S$-modules and let $F_\bullet$ be a free resolution of $N$. Consider the complex $F_\bullet \otimes M$. The Tor groups are defined by
$$\mathrm{Tor}^S_i(M, N) = H_i(F_\bullet \otimes M).$$
The next result shows the relation between the Tor groups of $M$ and the Betti numbers of a free resolution of $M$.
Proposition 2.8. [Gre89, Section 1] Let $\mathfrak{m} \subset S$ be the maximal ideal $\mathfrak{m} = \bigoplus_{q \geq 1} \mathrm{Sym}^q(V)$ and let $k = S/\mathfrak{m}$ be the residue field. Then, $\mathrm{Tor}^S_p(M, k)_q$ has rank equal to $\beta_{p,q}$.
This connection between the Betti numbers and the Tor groups is important because it relates the Betti numbers to cohomology. This allows us to use semi-continuity on the Betti numbers, as explained in the next theorem.
Theorem 2.9. [Har77, Theorem 12.8] Let $f : X \to Y$ be a projective morphism of noetherian schemes. Let $\mathcal{F}$ be a coherent sheaf on $X$, flat over $Y$; in other words, $\mathcal{F}$ is a finitely presented $\mathcal{O}_X$-module and the functor $- \otimes_{\mathcal{O}_{Y,f(x)}} \mathcal{F}_x$ is exact for every $x \in X$. Then for each $i \geq 0$, the function
$$h^i(y, \mathcal{F}) = \dim_{k(y)} H^i(X_y, \mathcal{F}_y)$$
is upper semi-continuous on $Y$.

The degree of the variety of all SOS decompositions
Let $f \in \mathbb{C}[x_0, \dots, x_n]$ be a sum of squares of degree $2d$. We consider the variety in the ambient space $\prod_{i=1}^{k} \mathrm{Sym}^d(V)$ of all possible SOS decompositions of the given polynomial $f$.
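We conjecture the degree of this variety, when $n \geq k$, to be the degree of the orthogonal group $\mathrm{O}(k)$. In [BBB+17] the authors give the degree of $\mathrm{SO}(k)$, and thus of $\mathrm{O}(k)$, in terms of the determinant of the following binomial matrix:
$$\deg \mathrm{SO}(k) = 2^{k-1} \det \left( \binom{2k - 2i - 2j}{k - 2i} \right)_{1 \leq i, j \leq \lfloor k/2 \rfloor}, \qquad \deg \mathrm{O}(k) = 2 \deg \mathrm{SO}(k). \tag{1}$$
For the case $d = 1$, the argument simplifies and so we give the following lemma.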
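The degree of the orthogonal group can be evaluated directly for small $k$. The sketch below assumes that the formula of [BBB+17] reads $\deg \mathrm{SO}(k) = 2^{k-1} \det\big(\binom{2k-2i-2j}{k-2i}\big)_{1 \leq i,j \leq \lfloor k/2 \rfloor}$ and that $\deg \mathrm{O}(k) = 2 \deg \mathrm{SO}(k)$ since $\mathrm{O}(k)$ has two isomorphic components; the helper names `det_int`, `deg_so` and `deg_o` are hypothetical.

```python
from math import comb


def det_int(M):
    """Exact determinant of a small integer matrix by cofactor expansion."""
    if len(M) == 0:
        return 1  # determinant of the empty matrix
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det_int(minor)
    return total


def deg_so(k):
    """Degree of SO(k), under the assumed binomial determinant formula."""
    m = k // 2
    M = [[comb(2 * k - 2 * i - 2 * j, k - 2 * i) for j in range(1, m + 1)]
         for i in range(1, m + 1)]
    return 2 ** (k - 1) * det_int(M)


def deg_o(k):
    """Degree of O(k): two components, each isomorphic to SO(k)."""
    return 2 * deg_so(k)


for k in range(2, 7):
    print(k, deg_so(k), deg_o(k))
```

For $k = 2, \dots, 6$ this evaluates $\deg \mathrm{SO}(k)$ to 2, 8, 40, 384 and 4768. Exact integer arithmetic is used because the values grow quickly with $k$.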
Lemma 3.1. Let $f \in \mathrm{Sym}^2 V$ be a quadric of SOS-rank $k \leq n$. Then, in the affine setting, the degree of $\mathrm{SOS}_k(f)$ is equal to the degree of $\mathrm{O}(k)$.
Proof. With $f = \sum_{i=1}^{k} f_i^2$, the assumption $n \geq k$ implies that we can encode $f$ in a $k \times (n + 1)$ matrix $A$ whose rows give the coefficients of the linear forms $f_i$. Then, with $x = (x_0, \dots, x_n)$, we have that $\|A x^t\|^2 = f$. Thus, for any orthogonal matrix $O \in \mathrm{O}(k)$ we have that
$$\|O A x^t\|^2 = x A^t O^t O A x^t = \|A x^t\|^2 = f.$$
Hence, $\mathrm{O}(k)$ acts on the variety $\mathrm{SOS}_k(f)$. Additionally, there are at least two isomorphic irreducible components, corresponding to $\det O = \pm 1$.
We now show that, up to a change of coordinates and multiplication by an orthogonal matrix, this SOS expression is unique. Let $A$ and $B$ be $k \times (n + 1)$ matrices encoding SOS decompositions of $f$. Then, up to a change of coordinates, we can ensure that the first $k$ columns are linearly independent and so QR decompositions can be found. Thus, $A = Q_1 R_1$ and $B = Q_2 R_2$, with $Q_1, Q_2 \in \mathrm{O}(k)$ and $R_1, R_2$ $k \times (n + 1)$ upper triangular matrices. Then, $R_1$ and $R_2$ also encode SOS decompositions of $f$. By the equation $\|R_1 x^t\|^2 = f$ we can identify exactly the entries of $R_1$, up to multiplication by $\pm 1$ in the rows, or in other words, up to multiplication by an orthogonal matrix. The same holds for $R_2$ and so the decompositions encoded by $A$ and $B$ must be in the same orbit of the action of $\mathrm{O}(k)$ on $\mathrm{SOS}_k(f)$. Therefore, there is only one orbit and so the degree of $\mathrm{SOS}_k(f)$ is equal to the degree of $\mathrm{O}(k)$.
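The QR step of this argument can be illustrated numerically over the reals. The following is a minimal sketch, assuming NumPy; the sizes and the random matrices are illustrative. Two coefficient matrices of the same quadric are factored, and their triangular factors are compared up to the signs of the rows.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 4  # hypothetical sizes with k <= n

A = rng.standard_normal((k, n + 1))               # one decomposition of f
O, _ = np.linalg.qr(rng.standard_normal((k, k)))  # a random orthogonal matrix
B = O @ A                                         # another decomposition of f

# Reduced QR: A = Q1 R1 and B = Q2 R2 with R1, R2 upper triangular, k x (n+1).
_, R1 = np.linalg.qr(A)
_, R2 = np.linalg.qr(B)

# R1 and R2 encode the same quadric ...
same_quadric = np.allclose(R1.T @ R1, R2.T @ R2)

# ... and agree up to multiplying rows by +1 or -1.
signs = np.sign(np.diag(R1) * np.diag(R2))
rows_agree_up_to_sign = np.allclose(R2, signs[:, None] * R1)
print(same_quadric, rows_agree_up_to_sign)
```

This relies on the first $k$ columns of $A$ being linearly independent, which holds for generic entries.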
The argument above also works partially for the case $d \geq 2$. Once a basis is chosen for $\mathrm{Sym}^d V$, we can construct the matrix $A$ in the same way, with $k$ rows but $\binom{n+d}{d}$ columns. Then, the group $\mathrm{O}(k)$ acts on the left to give new decompositions. However, the QR decomposition no longer implies uniqueness of the orbit. This is because there exist relations between the monomials indexing the columns of $A$. In other words, the Gram matrix associated to $f$ is not only symmetric but also has a moment structure. Thus, it is no longer easy to see that the non-linear equations given by $\|A x^t\|^2 = f$ have a unique solution up to the action of $\mathrm{O}(k)$.
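The effect of the relations between monomials can be seen in a small case. The sketch below, assuming NumPy, takes $n = 1$ and $d = 2$ with the monomial vector $(x_0^2, x_0 x_1, x_1^2)$; the matrices `W0` and `B` are illustrative choices. Here `B` represents the quadric $2 w_1 w_3 - 2 w_2^2$, which vanishes on the Veronese variety, so `W0` and `W0 + t B` are Gram matrices of the same polynomial.

```python
import numpy as np


def moment_vector(x0, x1):
    """Monomials of degree 2 in two variables, in a fixed order."""
    return np.array([x0**2, x0 * x1, x1**2])


def value(W, x0, x1):
    """Value of the polynomial with Gram matrix W at the point (x0, x1)."""
    m = moment_vector(x0, x1)
    return m @ W @ m


W0 = np.eye(3)                       # Gram matrix of x0^4 + x0^2 x1^2 + x1^4
B = np.array([[0.0, 0.0, 1.0],
              [0.0, -2.0, 0.0],
              [1.0, 0.0, 0.0]])      # m B m^t = 2 x0^2 x1^2 - 2 (x0 x1)^2 = 0

W_half = W0 + 0.5 * B                # another Gram matrix of the same polynomial

rng = np.random.default_rng(2)
points = rng.standard_normal((5, 2))
agree = all(np.isclose(value(W0, *p), value(W_half, *p)) for p in points)
ranks = (int(np.linalg.matrix_rank(W0)), int(np.linalg.matrix_rank(W_half)))
print(agree, ranks)
```

The two Gram matrices have different ranks, so translating by an element of $C$ changes the number of squares in the corresponding decomposition. This is the phenomenon that the QR argument for $d = 1$ cannot see.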
Experimentally, for $k \leq 6$, we observe a stabilisation of the degree of the variety $\mathrm{SOS}_k(f)$ as the degree of $f$ increases. The following table derives from [BBB+17, Table 1]. Since the degree of $\mathrm{SOS}_7(f)$ is at least $223{,}232$ for a generic $f \in \mathrm{SOS}_7$, $k \leq 6$ is currently the limit of our experimental methods.
Table 1. Degree of $\mathrm{SOS}_k(f)$ for $n \geq k$ (columns: $k$, Symbolic, Formula ($\mathrm{O}(k)$), Formula ($\mathrm{SO}(k)$)). See formula (1) for the degree of $\mathrm{O}(k)$.
The next example shows that the condition n ≥ k is sharp.
Example 3.2. The general plane quartic can be expressed as $g_1^2 + g_2^2 + g_3^2$ in 63 ways, where $g_i \in \mathrm{Sym}^2 \mathbb{C}^3$. A proof of this result is presented in [Dol12, Theorem 6.2.3]. The idea is to consider the quartic form as the determinant of a $2 \times 2$ matrix whose entries are quadratic forms.
The next lemma gives an indication of the connection between $k$-SOS decompositions and the orthogonal group $\mathrm{O}(k)$. Indeed, fixing a matrix $A_0 \in M_{k \times N}$ is equivalent to fixing a sum of squares decomposition of rank $k$ of $f = x^t A_0^T A_0 x$.
Lemma 3.3. Let $N \geq k \geq 1$ be integers and let $A, A_0 \in M_{k \times N}$ be matrices, with $A_0$ of maximal rank, and consider the entries of $A$ as variables $x_{ij}$. Then the variety $Y$ defined by the equation $A^T A = A_0^T A_0$ is isomorphic to $\mathrm{O}(k)$.
Proof. Up to the action of the group $\mathrm{GL}(N)$ of invertible $N \times N$ matrices on the right and of $\mathrm{O}(k)$ on the left of $A_0$, we may suppose without loss of generality that $A_0 = (I_k \mid 0)$, with $I_k$ the $k \times k$ identity matrix and $0$ the zero matrix of size $k \times (N - k)$. Let $A = (X_0 \mid X_1)$, again with $X_0$ a $k \times k$ matrix and $X_1$ a $k \times (N - k)$ matrix.
In those coordinates, the variety is determined by
$$X_0^T X_0 = I_k, \qquad X_0^T X_1 = 0, \qquad X_1^T X_1 = 0.$$
Since $X_0$ is invertible, the second equation gives $X_1 = 0$, so that $Y \cong \mathrm{O}(k)$.
where $A \in M_{k \times N}$ has the coefficients of $f_i$ as its $i$-th row. This gives a natural isomorphism (2)
Denote the Gram matrix $W_A = A^t A$ and note that $\mathrm{rk}\, W_A = k$ when the above decomposition is minimal. The previous lemma implies that $\mathrm{O}(k) \cong \{B \in M_{k \times N} \mid W_B = W_A\} \subset \mathrm{SOS}_k(f)$. This, together with the isomorphism (2), implies that $\mathrm{SOS}_k(f)$ can be described by as many copies of $\mathrm{O}(k)$ as the number of distinct symmetric matrices $W_A$ of rank $k$ such that $x^t W_A x = f$. Let $f \in \mathrm{Sym}^{2d} V$ and $N = \binom{n+d}{d}$. Notice that the following diagram commutes.
The fiber is $\pi^{-1}(f) = W_0 + C$, where $W_0$ is the rank $N$ catalecticant matrix of $f$ such that $x W_0 x^t = f$ and $C$ is the variety
$$C = \{ B \in \mathrm{Sym}^2 U \mid x B x^t = 0 \}.$$
This means that the problem can be reformulated in terms of the intersection $\phi(\mathbb{C}^k \otimes \mathbb{C}^N) \cap (C + W_0)$: when this intersection is just a single point, as is the case for $k \leq 6$ shown in Table 1, there exists only one $C_0 \in C$ such that $W_0 + C_0$ has rank $k$. This is equivalent to saying that $\mathrm{SOS}_k(f)$ consists of a single copy of $\mathrm{O}(k)$. Thus, we arrive at the following conjecture.
Of course, if this intersection consists of more than a single point, the number of points is exactly the number of copies of $\mathrm{O}(k)$ whose union is isomorphic to $\mathrm{SOS}_k(f)$.
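The statement that all matrices with a fixed Gram matrix form a single copy of $\mathrm{O}(k)$ can be illustrated over the reals. The following is a minimal sketch, assuming NumPy; the sizes and names are illustrative. Given two $k \times N$ matrices of maximal rank with the same Gram matrix, the orthogonal matrix relating them is recovered with a pseudo-inverse.

```python
import numpy as np

rng = np.random.default_rng(3)
k, N = 3, 7  # hypothetical sizes with N >= k

A0 = rng.standard_normal((k, N))                       # maximal rank for generic entries
O_true, _ = np.linalg.qr(rng.standard_normal((k, k)))  # a random orthogonal matrix
A = O_true @ A0                                        # satisfies A^T A = A0^T A0

same_gram = np.allclose(A.T @ A, A0.T @ A0)

# A0 has full row rank, so A0 @ pinv(A0) is the k x k identity
# and A @ pinv(A0) recovers the matrix relating A and A0.
O_rec = A @ np.linalg.pinv(A0)
is_orthogonal = np.allclose(O_rec.T @ O_rec, np.eye(k))
recovers = np.allclose(O_rec, O_true)
print(same_gram, is_orthogonal, recovers)
```

This only probes one Gram matrix. Whether other Gram matrices of rank $k$ represent the same $f$ is exactly the question about the intersection with $C + W_0$ discussed above.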
Consider a tuple $(f_1, \dots, f_k) \in \mathrm{SOS}_k(f)$; we denote the tangent space of $\mathrm{SOS}_k(f)$ at this point by $T\mathrm{SOS}_k(f)_{(f_1, \dots, f_k)}$. Recall that if we consider an orthogonal matrix $O \in \mathrm{O}(k)$ and $A_f \in M_{k \times N}$, then the rows of $O A_f$ are polynomials giving a $k$-SOS decomposition of $f$. We are interested in understanding the local behaviour of this variety. More specifically, we want to show that the tangent space $T\mathrm{SOS}_k(f)_{(f_1, \dots, f_k)}$ has dimension equal to the dimension of $\mathrm{O}(k)$. This means that locally, the variety $\mathrm{SOS}_k(f)$ coincides with the orbit of $\mathrm{O}(k)$. In order to do that, we can show that the only syzygies of a vector $(f_1, \dots, f_k) \in \mathrm{SOS}_k(f)$ are given by the Koszul syzygies. In the next paragraphs, we further explain the concept of Koszul syzygies and how they are related to the tangent space of $\mathrm{SOS}_k(f)$.
Let $A_f$ be the matrix whose rows are the coefficients of $f_1, \dots, f_k$. Observe that the map $\varphi : M_{k \times N} \to \mathrm{Sym}^{2d} V$, $A \mapsto x A^t A x^t - f$, has $\mathrm{SOS}_k(f)$ as the fiber at zero. Therefore, the tangent space $T\mathrm{SOS}_k(f)_{(f_1, \dots, f_k)}$ is the kernel of the derivative of $\varphi$ at the point $(f_1, \dots, f_k)$. This is equivalent to saying that
$$x \,(A_f^t V + V^t A_f)\, x^t = 0, \tag{4}$$
where $V \in M_{k \times N}$. Notice that equation (4) is trivially satisfied when $A_f^t V$ is a skew-symmetric matrix. A syzygy of this form is a Koszul syzygy of $(f_1, \dots, f_k)$. If the Koszul syzygies are the only syzygies of the point $(f_1, \dots, f_k)$, then they span the tangent space at this point. In that case, the tangent space has dimension equal to the dimension of $\mathrm{O}(k)$.
A more geometric and intuitive explanation is obtained by working with the usual coordinates instead of matrices. We may see $\mathrm{SOS}_k(f)$ as the zero locus of the map
$$\phi : \prod_{i=1}^{k} \mathrm{Sym}^d V \to \mathrm{Sym}^{2d} V, \qquad (g_1, \dots, g_k) \mapsto \sum_{i=1}^{k} g_i^2 - f.$$
The tangent space $T\mathrm{SOS}_k(f)_{(f_1, \dots, f_k)}$ is computed once again as the kernel of the derivative of this expression at $(f_1, \dots, f_k)$. This means that the tangent space consists of the tuples $(g_1, \dots, g_k)$ satisfying
$$2 \sum_{i=1}^{k} f_i g_i = 0.$$
A tuple $(g_1, \dots, g_k)$ for which this expression vanishes because, for a pair $i \neq j$, we have $g_i = f_j$ and $g_j = -f_i$ with all other entries zero, is a Koszul syzygy of the vector $(f_1, \dots, f_k)$. Observe that this corresponds exactly to the matrix $A_f^t V$ being skew-symmetric, where $V$ is the matrix that has $g_i$ as its $i$-th row.
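The dimension of this tangent space can be computed numerically in small cases. The sketch below, assuming NumPy, builds the matrix of the linear map $(g_1, \dots, g_k) \mapsto \sum_i f_i g_i$ for random real $f_i$ and returns the dimension of its kernel; the function names and the use of random coefficients as a stand-in for a general point are assumptions made for illustration.

```python
import itertools

import numpy as np


def monomials(num_vars, degree):
    """Exponent tuples of all monomials of the given degree, in a fixed order."""
    return [e for e in itertools.product(range(degree + 1), repeat=num_vars)
            if sum(e) == degree]


def tangent_dimension(n, d, k, seed=0):
    """Kernel dimension of (g_1..g_k) -> sum_i f_i g_i for random f_i in Sym^d."""
    rng = np.random.default_rng(seed)
    mons_d = monomials(n + 1, d)
    mons_2d = monomials(n + 1, 2 * d)
    row_of = {e: r for r, e in enumerate(mons_2d)}
    N = len(mons_d)
    F = rng.standard_normal((k, N))  # coefficients of random f_1, ..., f_k
    M = np.zeros((len(mons_2d), k * N))
    for i in range(k):
        for a, ea in enumerate(mons_d):      # monomial of g_i
            for b, eb in enumerate(mons_d):  # monomial of f_i
                prod = tuple(x + y for x, y in zip(ea, eb))
                M[row_of[prod], i * N + a] += F[i, b]
    return k * N - int(np.linalg.matrix_rank(M))


for n, d, k in [(2, 2, 2), (3, 2, 3)]:
    print((n, d, k), tangent_dimension(n, d, k), k * (k - 1) // 2)
```

In both cases the kernel dimension is compared with $\binom{k}{2} = \dim \mathrm{O}(k)$, the number of Koszul syzygies. The computation uses floating-point rank, so it is an experiment rather than a proof.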
Proof. Let $A = (I \mid 0)$ be a matrix as in equation (2) giving an SOS decomposition of $f =$
In this basis, since each monomial coefficient is equal to zero we obtain $2(v_{ij} + v_{ji}) = 0$ as desired.
The importance of this result is that it guarantees that at the point $(x_0^d, \dots, x_k^d)$ the tangent space has dimension equal to the number of Koszul syzygies, since they span the null space of $\phi'(x_0^d, \dots, x_k^d)$. Moreover, this dimension is exactly equal to the dimension of the tangent space of $\mathrm{O}(k)$. This implies that locally at the point $(x_0^d, \dots, x_k^d)$, the variety $\mathrm{SOS}_k(f)$ is equal to the subvariety $\mathrm{O}(k) \subset \mathrm{SOS}_k(f)$. We wish to extend this result to every point $(f_1, \dots, f_k) \in \mathrm{SOS}_k(f)$. This can be done by means of semi-continuity. Indeed, consider the kernel $K$ of the map
defined by the vector $(f_1, \dots, f_k)$, where $\mathcal{O}_{\mathbb{P}V}$ is the sheaf defining $\mathbb{P}V$ as a scheme $(\mathbb{P}V, \mathcal{O}_{\mathbb{P}V})$.
The minimal resolution of the kernel, when there are only Koszul syzygies, starts with
By Proposition 2.8, the Betti numbers $\beta_{p,p+q}$ of the minimal resolution of $K$ correspond to the rank of $\mathrm{Tor}^S_p(K, k)_{p+q}$, the component of degree $p + q$ of $\mathrm{Tor}^S_p(K, k)$. Since Proposition 2.8 relates the Betti numbers to dimensions of cohomology groups, Theorem 2.9 shows that the Betti numbers are upper semi-continuous under local deformations of $K$. Moreover, since any other point $(f_1, \dots, f_k)$ has at least the Koszul syzygies, semi-continuity implies that a general point has only them.
Corollary 3.5. Suppose that $k \leq n$ and $f \in \mathrm{SOS}_k$ is general. Let $(f_1, \dots, f_k)$ be a vector in $(\mathrm{Sym}^d V)^{\times k}$ giving a decomposition of $f$ as a sum of $k$ squares. Then the only syzygies of $(f_1, \dots, f_k)$ are the Koszul ones.
Corollary 3.6. Suppose that $k \leq n$ and $f \in \mathrm{SOS}_k$ is general. We have an isomorphism $\mathrm{SOS}_k(f) \cong \mathrm{O}(k)^p$, for some $p \in \mathbb{Z}_+$. Note that this does not depend on the degree of $f$.
We notice that the conclusion of diagram (3) can be interpreted in a different manner. Instead of considering $W_0$ a maximal rank matrix, one may consider a fixed matrix $A_0$ defining $f$, and let
Notice that this interpretation means that adding $C_0 \neq 0$ is equivalent to changing the $\mathrm{O}(k)$ component of $\mathrm{SOS}_k(f)$. Thus, if there exists no matrix $C_0 \in C$ other than $0$ such that $\mathrm{rank}(A_0^T A_0 + C_0) = k$, then there exists only one component.
In the next pages we explore this equivalent problem and compare the dimensions of the variety of symmetric matrices of rank $k$ and of $C$. Although we do not obtain a proof that the only translation by an element of $C$ preserving the rank is $0$, a comparison of dimensions gives a clear indication that we should not expect other solutions.
Let $S^N_k$ be the variety of symmetric matrices of size $N = \binom{n+d}{d}$ of rank at most $k$. Then, for some fixed $W \in S^N_k$, consider the variety
Note that this is indeed a variety as it is defined by the minors of the matrix $B + W$ and moreover, for all $M \in (S + W)^N_k$, we have that $M - W \in S^N_k$. Hence, we can consider this variety a translation of $S^N_k$ by the matrix $W$.
and note that the following statement holds:
Firstly, note that since $W$ is symmetric of rank $k$, there exists a decomposition of the form $W = A^T A$ where $A \in M_{k \times N}$. Then, since every symmetric matrix of size $N$ gives a polynomial, through a moment vector $x$, we obtain a decomposition of $x^T W x$ as a sum of $k$ squares as $x^T A^T A x$. Then, as is discussed above, we would obtain equality for Corollary 3.6.
From the translation argument above, we obtain the following equivalences,
The equations defining $C$ are not general. Each equation specifies that a particular coefficient in the expansion of $x^t B x$ be zero. Hence, since no coefficient of a general $f$ is zero, a generic $W$ is not contained in the hyperplanes defined by any of the $\binom{n+2d}{2d}$ equations defining $C$.
Let $N = \binom{n+d}{d}$ and let $S$ be the polynomial ring $\mathbb{C}[x_{ij} \mid 1 \leq i, j \leq N]$. We set $x_{ij} = x_{ji}$ and consider $X = (x_{ij})_{1 \leq i, j \leq N}$ to be an $N \times N$ symmetric matrix of variables. For $1 \leq k \leq N - 1$, we denote by $I_k$ the ideal generated by the $(k + 1)$-minors of $X$. It is known that $S/I_k$ is a Cohen-Macaulay normal domain with dimension
$$\dim S/I_k = kN - \binom{k}{2}.$$
Then, recall that
$$\operatorname{codim} C = \dim \mathrm{Sym}^{2d} V = \binom{n+2d}{2d}.$$
The following lemma, through a dimension count, gives further support for Conjecture 3.1.
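Proof. Firstly, note that $\dim S/I_k$ is maximal when $k = N = \binom{n+d}{d}$ and that the dimension decreases monotonically as $k$ decreases. Since we restrict to $k \leq n$, it suffices to show that $\dim S/I_n < \operatorname{codim} C$. Now, suppose that $d = 1$. Then, $N = n + 1$ and
$$\operatorname{codim} C - \dim S/I_n = \binom{n+2}{2} - n(n+1) + \binom{n}{2} = 1 > 0.$$
For $d \geq 2$ we have $\dim S/I_n \leq nN = n \binom{n+d}{d}$. Hence, it suffices to prove that for all $d \geq 2$,
$$\binom{n+2d}{2d} > n \binom{n+d}{d}.$$
We proceed by induction on $d$. In the base case $d = 2$ we have
$$\binom{n+4}{4} - n \binom{n+2}{2} = \frac{(n+2)(n+1)}{24} \, (n^2 - 5n + 12).$$
It is easy to see that the polynomial $n^2 - 5n + 12$ is positive for all $n$ and so the base case holds. Now, assume for some fixed $d \geq 2$ that $\binom{n+2d}{2d} > n \binom{n+d}{d}$ and consider
$$\binom{n+2d+2}{2d+2} = \binom{n+2d}{2d} \frac{(n+2d+2)(n+2d+1)}{(2d+2)(2d+1)} > n \binom{n+d}{d} \frac{(n+2d+2)(n+2d+1)}{(2d+2)(2d+1)} \geq n \binom{n+d}{d} \frac{n+d+1}{d+1} = n \binom{n+d+1}{d+1},$$
where the last inequality holds since $(n+2d+2)(n+2d+1) - 2(2d+1)(n+d+1) = n^2 + n > 0$. Thus, by induction, $\operatorname{codim} C - \dim S/I_k > 0$.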
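The dimension count can be checked numerically over a range of parameters. The sketch below assumes the standard formula $kN - \binom{k}{2}$ for the dimension of the variety of symmetric $N \times N$ matrices of rank at most $k$, and $\operatorname{codim} C = \binom{n+2d}{2d}$; the function names and the tested ranges are illustrative.

```python
from math import comb


def dim_rank_le_k(N, k):
    """Dimension of the affine variety of symmetric N x N matrices of rank <= k."""
    return k * N - k * (k - 1) // 2


def codim_C(n, d):
    """Codimension of C in Sym^2 U, which equals dim Sym^{2d} V."""
    return comb(n + 2 * d, 2 * d)


def gap(n, d, k):
    """codim C minus the dimension of the rank <= k locus, for N = binom(n+d, d)."""
    N = comb(n + d, d)
    return codim_C(n, d) - dim_rank_le_k(N, k)


all_positive = all(gap(n, d, k) > 0
                   for n in range(1, 9)
                   for d in range(1, 7)
                   for k in range(1, n + 1))
print(all_positive)
```

For $d = 1$ and $k = n$ the gap equals 1, so the inequality is tight in that case; for larger $d$ the gap grows quickly.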
We finish this section by proving that Conjecture 3.1 holds for k = 2.
Proof. Consider the projection
Let $\mathrm{Ab}_2(\nu_2(\mathrm{Sym}^d V)) = \{(\alpha, \beta, g) \mid \alpha^2 + \beta^2 = g\}$ be the abstract Veronese variety that under the projection is mapped to $\Sigma_2(\nu_2(\mathrm{Sym}^d V))$. Notice that the fiber of this projection over a point $g$ is $\mathrm{O}(2)$ by Lemma 4.2. We may consider a similar projection
We may define $X = \{(\alpha, \beta, f) \mid \alpha^2 + \beta^2 = f\}$ in the same fashion as before. Under this projection, $X$ is mapped to $\mathrm{SOS}_2$ and the fiber over a point $f$ is $\mathrm{SOS}_2(f) = \mathrm{O}(2)$ by Lemma 1.4. Notice that the map $\mathrm{Sym}^2(\mathrm{Sym}^d V) \to \mathrm{Sym}^{2d} V$ that corresponds to the change of coordinates $w_1 = x_0^d, \dots, w_N = x_n^d$ is injective when restricted to $\Sigma_2(\nu_2(\mathrm{Sym}^d V))$, and so is the induced linear map from $\mathrm{Ab}_2$ to $X$.
Joining those maps into a diagram we obtain:
From the previous remarks, $\phi$ is a one-to-one map and the fibers of $\psi$ and $\xi$ are both equal to $\mathrm{O}(2)$. Since the diagram commutes, we also obtain that $\zeta$ is a one-to-one map.
We notice that in the case $n = 2$ and $d = 2$, Theorem 4.1 is sharp in the sense that for the 3-secant variety of $\nu_2(\mathbb{P}U)$ the intersection with $C$ is non-empty. Indeed, one can find by computation that the intersections of $\Sigma_1(\nu_2(\mathbb{P}U))$ and $\Sigma_2(\nu_2(\mathbb{P}U))$ with $C$ are empty. Thus, $\deg(\mathrm{SOS}_1) = 32$ and $\deg(\mathrm{SOS}_2) = 126$, as expected. However, for $\mathrm{SOS}_3$ the intersection has codimension 3 in $\mathbb{P}^5 = \mathbb{P}U$. When the intersection is non-empty, the degree of $\Sigma_k(\nu_2(\mathbb{P}U))$ is still an upper bound for the degree of $\mathrm{SOS}_k$.
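The values 32 and 126 can be compared with the degrees of the secant varieties $\Sigma_k(\nu_2(\mathbb{P}U))$, which are the loci of symmetric $N \times N$ matrices of rank at most $k$. The sketch below evaluates the classical formula of Harris and Tu for the degree of these symmetric determinantal varieties; this formula is recalled from the literature as an assumption and is not quoted from the text above, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def deg_sym_rank_le(N, r):
    """Degree of the projective variety of symmetric N x N matrices of rank <= r.

    Assumed formula: product over a = 0, ..., N - r - 1 of
    binom(N + a, N - r - a) / binom(2a + 1, a).
    """
    result = Fraction(1)
    for a in range(N - r):
        result *= Fraction(comb(N + a, N - r - a), comb(2 * a + 1, a))
    if result.denominator != 1:
        raise ValueError("expected an integer degree")
    return int(result)


N = comb(2 + 2, 2)  # n = 2, d = 2, so U = Sym^2 V has dimension 6
print(deg_sym_rank_le(N, 1), deg_sym_rank_le(N, 2))
```

For $N = 6$ this gives 32 for rank one and 126 for rank at most two, matching the degrees of $\mathrm{SOS}_1$ and $\mathrm{SOS}_2$ stated above. Exact rational arithmetic is used because the individual factors of the product are not integers.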