Linear Algebra and its Applications

On the exactness of sum-of-squares approximations for the cone of 5 × 5 copositive matrices

We investigate the hierarchy of conic inner approximations $\mathcal{K}^{(r)}_n$ ($r \in \mathbb{N}$) for the copositive cone $\mathrm{COP}_n$, introduced by Parrilo (2000) [22]. It is known that $\mathrm{COP}_4 = \mathcal{K}^{(0)}_4$ and that, while the union of the cones $\mathcal{K}^{(r)}_n$ covers the interior of $\mathrm{COP}_n$, it does not cover the full cone $\mathrm{COP}_n$ if $n \ge 6$. Here we investigate the remaining case $n = 5$, where all extreme rays have been fully characterized by Hildebrand (2012) [12]. We show that the Horn matrix $H$ and its positive diagonal scalings play an exceptional role among the extreme rays of $\mathrm{COP}_5$. We show that equality $\mathrm{COP}_5 = \bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ holds if and only if every positive diagonal scaling of $H$ belongs to $\mathcal{K}^{(r)}_5$ for some $r \in \mathbb{N}$. As a main ingredient for the proof, we introduce new Lasserre-type conic inner approximations for $\mathrm{COP}_n$, based on sums of squares of polynomials. We show their links to the cones $\mathcal{K}^{(r)}_n$, and we use an optimization approach that allows us to exploit finite convergence results on the Lasserre hierarchy to show membership in the new cones.


Introduction
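The main object of study in this paper is the cone of copositive matrices $\mathrm{COP}_n$, defined as
\[ \mathrm{COP}_n = \{M \in \mathcal{S}^n : x^T M x \ge 0 \text{ for all } x \in \mathbb{R}^n_+\} \tag{1.1} \]
or, equivalently, as
\[ \mathrm{COP}_n = \{M \in \mathcal{S}^n : (x^{\circ 2})^T M x^{\circ 2} \ge 0 \text{ for all } x \in \mathbb{R}^n\}, \tag{1.2} \]
after setting $x^{\circ 2} = (x_1^2, \ldots, x_n^2)$ and $\mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x \ge 0\}$. Optimizing over the copositive cone is hard in general since this captures a wealth of hard combinatorial optimization problems such as finding maximum stable sets in graphs and minimum graph coloring, and, more generally, mixed-integer binary optimization problems (see, e.g., [2], [10], [3], [8]). Determining whether a matrix $M$ is copositive is a co-NP-complete problem (see [20]). These hardness results motivate investigating hierarchies of cones that offer tractable approximations for the copositive cone. Such conic approximations arise naturally by replacing the condition $x^T M x \ge 0$ on $\mathbb{R}^n_+$ in (1.1), or the condition $(x^{\circ 2})^T M x^{\circ 2} \ge 0$ on $\mathbb{R}^n$ in (1.2), by a sufficient condition for nonnegativity. Parrilo [22] introduced the cones $\mathcal{K}^{(r)}_n$, whose definition relies on requiring that the polynomial $(\sum_{i=1}^n x_i^2)^r (x^{\circ 2})^T M x^{\circ 2}$ is a sum of squares of polynomials:
\[ \mathcal{K}^{(r)}_n = \Big\{M \in \mathcal{S}^n : \Big(\sum_{i=1}^n x_i^2\Big)^r (x^{\circ 2})^T M x^{\circ 2} \in \Sigma\Big\}. \tag{1.3} \]
Clearly, $\mathcal{K}^{(r)}_n \subseteq \mathcal{K}^{(r+1)}_n \subseteq \mathrm{COP}_n$, and thus
\[ \bigcup_{r \ge 0} \mathcal{K}^{(r)}_n \subseteq \mathrm{COP}_n. \tag{1.4} \]
Moreover, the cones $\mathcal{K}^{(r)}_n$ cover the interior of the copositive cone:
\[ \mathrm{int}(\mathrm{COP}_n) \subseteq \bigcup_{r \ge 0} \mathcal{K}^{(r)}_n. \tag{1.5} \]
Indeed, a matrix $M$ lies in the interior of $\mathrm{COP}_n$ precisely when $x^T M x > 0$ on $\mathbb{R}^n_+ \setminus \{0\}$, or, equivalently, when $(x^{\circ 2})^T M x^{\circ 2} > 0$ on $\mathbb{R}^n \setminus \{0\}$. Using a result of Pólya [24] (or Reznick [26]), this strict positivity condition implies existence of an integer $r \in \mathbb{N}$ for which $(\sum_{i=1}^n x_i^2)^r (x^{\circ 2})^T M x^{\circ 2}$ is a sum of squares, i.e., $M \in \mathcal{K}^{(r)}_n$.

As is well-known, the cone $\mathcal{K}^{(0)}_n$ consists of the matrices that can be written as the sum of a positive semidefinite matrix and a nonnegative matrix; thus it is the dual cone of the doubly-nonnegative cone (see [22]). We have equality $\mathrm{COP}_n = \mathcal{K}^{(0)}_n$ if and only if $n \le 4$ [4]. For $n \ge 6$, copositive matrices that do not belong to any cone $\mathcal{K}^{(r)}_n$ have been constructed in [17], so the inclusion (1.4) is strict for any $n \ge 6$. However, the question of deciding whether the inclusion (1.4) is strict for $n = 5$, which is the main topic of this paper, remains open.

Question 1.1 ([17]). Does equality $\mathrm{COP}_5 = \bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ hold?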
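The Pólya-type argument mentioned above can be tried by hand on small matrices. The following sketch is only an illustration and is not part of the paper's proofs: it multiplies $x^T M x$ repeatedly by $x_1 + \cdots + x_n$ and reports the first exponent at which all coefficients are nonnegative. Nonnegative coefficients are a sufficient condition only: after substituting $x_i \mapsto x_i^2$ the polynomial becomes a nonnegative combination of squared monomials, hence a sum of squares, so the matrix lies in the corresponding Parrilo cone; a matrix may well lie in that cone without passing this coefficient test. The function names and the parameter `max_r` are hypothetical.

```python
def quadratic_form_coeffs(M):
    """Coefficients of x^T M x as a dict {exponent tuple: coefficient}."""
    n = len(M)
    poly = {}
    for i in range(n):
        for j in range(n):
            expo = [0] * n
            expo[i] += 1
            expo[j] += 1
            key = tuple(expo)
            poly[key] = poly.get(key, 0) + M[i][j]
    return poly


def times_sum_of_variables(poly, n):
    """Multiply a polynomial (dict form) by x_1 + ... + x_n."""
    out = {}
    for expo, coeff in poly.items():
        for i in range(n):
            shifted = list(expo)
            shifted[i] += 1
            key = tuple(shifted)
            out[key] = out.get(key, 0) + coeff
    return out


def polya_exponent(M, max_r):
    """Smallest r <= max_r such that (sum_i x_i)^r x^T M x has only
    nonnegative coefficients, or None if no such r is found.
    Use integer (or exact rational) entries to avoid rounding issues."""
    n = len(M)
    poly = quadratic_form_coeffs(M)
    for r in range(max_r + 1):
        if all(c >= 0 for c in poly.values()):
            return r
        poly = times_sum_of_variables(poly, n)
    return None


if __name__ == "__main__":
    # x^T M x = 2 x1^2 - 2 x1 x2 + 2 x2^2 is strictly positive on the simplex.
    print(polya_exponent([[2, -1], [-1, 2]], max_r=5))
```

A return value of `None` does not say anything about copositivity; it only says that this particular coefficient certificate was not found up to `max_r`.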
It is known that the inclusion $\mathcal{K}^{(0)}_5 \subseteq \mathrm{COP}_5$ is strict. For instance, the following matrix, known as the Horn matrix,
\[ H = \begin{pmatrix} 1 & 1 & -1 & -1 & 1 \\ 1 & 1 & 1 & -1 & -1 \\ -1 & 1 & 1 & 1 & -1 \\ -1 & -1 & 1 & 1 & 1 \\ 1 & -1 & -1 & 1 & 1 \end{pmatrix} \]
is copositive, but it does not belong to the cone $\mathcal{K}^{(0)}_5$. On the other hand, $H$ belongs to the cone $\mathcal{K}^{(1)}_5$ (see [22]). Clearly, any positive diagonal scaling of a copositive matrix remains a copositive matrix, i.e., $M \in \mathrm{COP}_n$ implies $DMD \in \mathrm{COP}_n$ for any diagonal matrix $D$ with $D_{ii} > 0$ for $i \in [n]$. However, this operation does not preserve the cone $\mathcal{K}^{(r)}_n$ for $r \ge 1$ (see [5]). For instance, $H \in \mathcal{K}^{(1)}_5$, but not every positive diagonal scaling of $H$ belongs to $\mathcal{K}^{(1)}_5$; in fact the positive diagonal scalings of $H$ that still belong to $\mathcal{K}^{(1)}_5$ are characterized in [17]. It is an open question whether every positive diagonal scaling of $H$ belongs to some cone $\mathcal{K}^{(r)}_5$.

Question 1.2. Is it true that $DHD \in \bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ for all positive diagonal matrices $D$?
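As we will show in this paper, a positive answer to Question 1.2 would imply a positive answer to Question 1.1. The following is the main contribution of this paper.

Theorem 1.3. Equality $\mathrm{COP}_5 = \bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ holds if and only if $DHD \in \bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ for all positive diagonal matrices $D$.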
We group a few observations regarding the proof of this result. In view of relation (1.5), in order to show that any 5 × 5 copositive matrix lies in some cone K (r) 5 , we can restrict our attention to copositive matrices that lie on the boundary ∂COP 5 of the copositive cone. Moreover, it suffices to consider matrices that lie on an extreme ray of COP 5 . A crucial ingredient for the proof of Theorem 1.3 is the fact that all the extreme rays of the cone COP 5 are known. They have been characterized by Hildebrand [12], who proved that (up to simultaneous row/column permutation) they fall into three categories: either they are generated by a matrix in K (0) 5 , or they are generated by a positive diagonal scaling of the Horn matrix, or they are generated by a positive diagonal scaling of a class of special matrices T (ψ) (see Theorem 2.2 below for details). Hence, in order to show Theorem 1.3, we need to show that all positive diagonal scalings of the matrices T (ψ) lie in some cone K (r) 5 . This forms the main technical part of the paper, which, as we will explain later in Section 4, relies on following an optimization approach.
We now place the above results into the broader context of the relevant literature.

On the impact of diagonal scaling
Diagonal scaling plays a crucial role in our main result (Theorem 1.3) and more generally in the analysis of the copositive cone. As already noted above, positive diagonal scaling preserves the copositive cone $\mathrm{COP}_n$. It also preserves the cone $\mathcal{K}^{(0)}_n$, but it does not preserve the cone $\mathcal{K}^{(r)}_n$ when $r \ge 1$ and $n \ge 5$. Indeed, Dickinson et al. [5] show that, for every matrix $M \in \mathrm{COP}_n \setminus \mathcal{K}^{(0)}_n$ and every $r \in \mathbb{N}$, there exists a positive diagonal scaling of $M$ that does not belong to $\mathcal{K}^{(r)}_n$. On the other hand, it is shown in [5] that every $5 \times 5$ copositive matrix with an all-ones diagonal belongs to $\mathcal{K}^{(1)}_5$. Hence, a method for checking whether a $5 \times 5$ matrix belongs to $\mathrm{COP}_5$ is to scale it to obtain a matrix with binary diagonal entries and then to check whether this new matrix belongs to $\mathcal{K}^{(1)}_5$. In contrast, it is shown in [17] that, for every $n \ge 7$, there exist matrices in $\mathrm{COP}_n \setminus \bigcup_{r \ge 0} \mathcal{K}^{(r)}_n$. Selecting for $M$ the Horn matrix $H$ gives a $7 \times 7$ copositive matrix that does not belong to any cone $\mathcal{K}^{(r)}_7$. Note that the answer to this question is negative in general if we allow $M$ to have a zero diagonal entry. Indeed, it is shown in [17] that, if $M \in \mathrm{COP}_n \setminus \mathcal{K}^{(0)}_n$, then the matrix $M \oplus 0$ is copositive but does not belong to any cone $\mathcal{K}^{(r)}_{n+1}$. So selecting $M = H$ gives the $6 \times 6$ matrix $H \oplus 0$, which is copositive but does not belong to any cone $\mathcal{K}^{(r)}_6$.
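The scaling step described above (bringing a matrix to binary diagonal entries) can be sketched as follows. This is a minimal illustration under the assumption that the input is a real symmetric matrix given as a NumPy array; the function name `scale_to_binary_diagonal` is hypothetical. Positive diagonal entries are scaled to 1, zero diagonal entries are left untouched, and a negative diagonal entry is rejected since it already rules out copositivity. The membership test in the cone of level 1 itself requires a semidefinite programming solver and is not attempted here.

```python
import numpy as np


def scale_to_binary_diagonal(M, tol=1e-12):
    """Return (D M D, D) with D positive diagonal, such that every positive
    diagonal entry of M becomes 1 and zero diagonal entries stay 0."""
    M = np.asarray(M, dtype=float)
    d = np.diag(M)
    if np.any(d < -tol):
        # e_i^T M e_i < 0, so M cannot be copositive.
        raise ValueError("negative diagonal entry: matrix is not copositive")
    scale = np.ones_like(d)
    positive = d > tol
    scale[positive] = 1.0 / np.sqrt(d[positive])
    D = np.diag(scale)
    return D @ M @ D, D


if __name__ == "__main__":
    M = np.array([[4.0, 2.0], [2.0, 9.0]])
    scaled, D = scale_to_binary_diagonal(M)
    x = np.array([0.3, 0.7])
    # The quadratic forms agree after the change of variables y = D x.
    print(np.isclose(x @ scaled @ x, (D @ x) @ M @ (D @ x)))
```

The identity $x^T (DMD) x = (Dx)^T M (Dx)$ checked at the end is the reason why positive diagonal scaling preserves copositivity.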

Copositive matrices from graphs
The cones $\mathcal{K}^{(r)}_n$ are used by de Klerk and Pasechnik [10] for defining a hierarchy of upper bounds $\vartheta^{(r)}(G)$ for the stability number $\alpha(G)$ of a graph $G = (V = [n], E)$. Recall $\alpha(G)$ denotes the maximum cardinality of a set $S \subseteq V$ that is stable (aka independent), i.e., contains no edge of $G$. The parameter $\vartheta^{(r)}(G)$ is defined as the smallest scalar $\lambda$ for which the matrix $\lambda(I + A_G) - J$ belongs to the cone $\mathcal{K}^{(r)}_n$. Here, $A_G$ is the adjacency matrix of $G$, $I$ is the identity matrix and $J$ is the all-ones matrix. As an application of a result of Motzkin-Straus [19], it follows that the following graph matrix
\[ M_G := \alpha(G)(I + A_G) - J \]
is copositive. Hence, the parameters $\vartheta^{(r)}(G)$ provide a hierarchy of upper bounds for $\alpha(G)$, converging to it asymptotically [10]. The authors conjecture in [10] that the convergence is, in fact, finite and takes place after $\alpha(G)$ steps, i.e., they conjecture that $\vartheta^{(r)}(G) = \alpha(G)$ if $r \ge \alpha(G) - 1$ or, equivalently, that the matrix $M_G$ belongs to the cone $\mathcal{K}^{(\alpha(G)-1)}_n$.

Conjecture 1.5 ([10]). For any graph $G$, we have $M_G \in \mathcal{K}^{(\alpha(G)-1)}_n$.

In fact, it is not even known whether $M_G$ belongs to some cone $\mathcal{K}^{(r)}_n$.

Conjecture 1.6. For any graph $G$, we have $M_G \in \bigcup_{r \ge 0} \mathcal{K}^{(r)}_n$.

The class of graph matrices $M_G$ has been recently further investigated in [7], where they are used, in particular, to construct large classes of matrices generating extreme rays of $\mathrm{COP}_n$.
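For small graphs the graph matrix can be built explicitly. The sketch below is an illustration only: it computes the stability number by brute force (exponential time, so suitable for tiny graphs only), forms the graph matrix as the stability number times $(I + A_G)$ minus $J$, and lets one check that the normalized indicator vector of a maximum stable set is a zero of the associated quadratic form. The helper names `cycle_adjacency`, `stability_number` and `graph_matrix` are hypothetical.

```python
import itertools
import numpy as np


def cycle_adjacency(n):
    """Adjacency matrix of the cycle on vertices 0, ..., n-1."""
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        j = (i + 1) % n
        A[i, j] = A[j, i] = 1
    return A


def stability_number(A):
    """Brute-force stability number of the graph with adjacency matrix A."""
    n = A.shape[0]
    for k in range(n, 0, -1):
        for S in itertools.combinations(range(n), k):
            if all(A[i, j] == 0 for i, j in itertools.combinations(S, 2)):
                return k
    return 0


def graph_matrix(A):
    """The matrix alpha(G) * (I + A_G) - J."""
    n = A.shape[0]
    alpha = stability_number(A)
    return alpha * (np.eye(n) + A) - np.ones((n, n))


if __name__ == "__main__":
    M = graph_matrix(cycle_adjacency(5))
    # Normalized indicator vector of the stable set {0, 2} of the 5-cycle.
    x = np.array([0.5, 0.0, 0.5, 0.0, 0.0])
    print(M)
    print(x @ M @ x)
```

For the 5-cycle with this vertex labelling, the resulting matrix has entries 1 on the diagonal and on adjacent pairs and entries -1 on non-adjacent pairs.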
A possible strategy for attacking the above conjectures is to use hierarchies of subcones of the cones K (r) n , that have additional structural properties and thus may be easier to handle. The starting point to define such subcones is to use the formulation (3.2), which characterizes membership M ∈ K for all graphs G with α(G) ≤ 6. Laurent and Gvozdenović [11] strengthen this and show M G ∈ Q (α(G)−1) n for all graphs with α(G) ≤ 8, which settles Conjecture 1.5 for the class of graphs with α(G) ≤ 8. Very roughly, the cones Q (r) n allow an inductive proof, by combining with combinatorial properties of the graph matrices M G ; however this argument breaks down when α(G) ≥ 9. So a new strategy seems needed to attack Conjecture 1.5 for general graphs.
In this paper, we introduce another class of cones LAS (r) n (see (2.3)), which we use to establish our main result (Theorem 1.3). Roughly, these cones are based on following an optimization approach, where one minimizes the quadratic form x T Mx over the standard simplex, and then considers the associated Lasserre-type sum-of-squares hierarchy, which allows applying known results about their finite convergence.
We have (implicitly) used the cones $\mathrm{LAS}^{(r)}_{\Delta_n}$ already in our previous work [16] to prove a partial result on Conjecture 1.6. Call an edge $e$ of $G$ critical if its deletion increases the stability number, i.e., if $\alpha(G \setminus e) = \alpha(G) + 1$. Now we say that $G$ is critical if all its edges are critical, and that $G$ is acritical if none of its edges is critical. In [16] it is shown that, if $G$ is acritical, then $M_G$ belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_n} \subseteq \mathcal{K}^{(r)}_n$. Observe that the Horn matrix coincides with the graph matrix $M_{C_5}$ of the 5-cycle. As $C_5$ is a critical graph, the above mentioned result of [16] does not apply to it. In fact, we will show in Lemma 3.10 that $H = M_{C_5} \notin \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_5}$. So we cannot use the cones $\mathrm{LAS}^{(r)}_{\Delta_n}$ to show membership in the cones $\mathcal{K}^{(r)}_n$ of the positive diagonal scalings of $H$. Recall that $H \in \mathcal{K}^{(1)}_5$. However, as positive diagonal scaling does not preserve the cone $\mathcal{K}^{(r)}_n$ in general, we will need another strategy to show that every positive diagonal scaling of $H$ belongs to some $\mathcal{K}^{(r)}_5$.
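The notions of critical and acritical graphs are easy to test by brute force on small examples. The following sketch is an illustration under the definitions just given; the helper names are hypothetical and the stability number is computed by exhaustive search, so it is only meant for tiny graphs.

```python
import itertools
import numpy as np


def cycle_adjacency(n):
    """Adjacency matrix of the cycle on vertices 0, ..., n-1."""
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        j = (i + 1) % n
        A[i, j] = A[j, i] = 1
    return A


def stability_number(A):
    """Brute-force stability number of the graph with adjacency matrix A."""
    n = A.shape[0]
    for k in range(n, 0, -1):
        for S in itertools.combinations(range(n), k):
            if all(A[i, j] == 0 for i, j in itertools.combinations(S, 2)):
                return k
    return 0


def edges(A):
    n = A.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j]]


def critical_edges(A):
    """Edges whose deletion raises the stability number by one."""
    alpha = stability_number(A)
    result = []
    for i, j in edges(A):
        B = A.copy()
        B[i, j] = B[j, i] = 0
        if stability_number(B) == alpha + 1:
            result.append((i, j))
    return result


def is_critical(A):
    return len(critical_edges(A)) == len(edges(A))


def is_acritical(A):
    return len(critical_edges(A)) == 0


if __name__ == "__main__":
    print(is_critical(cycle_adjacency(5)), is_acritical(cycle_adjacency(4)))
```

Deleting any edge of the 5-cycle leaves a path on five vertices, whose stability number is 3, which is why every edge of the 5-cycle is critical; deleting an edge of the 4-cycle leaves a path on four vertices with stability number 2, so the 4-cycle is acritical.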

Organization of the paper
The paper is organized as follows. In Section 2 we give an overview of the main tools and results: we introduce other sum-of-squares hierarchies of inner approximations for COP n (including the cones Q (r) n and LAS (r) n ), we present the known description of the extreme rays of COP 5 (that include the matrices T (ψ) (ψ ∈ Ψ) from (2.7)), and we sketch the main arguments used to show the main result (Theorem 1.3). As we explain there, Theorem 1.3 reduces to showing that every positive diagonal scaling of the matrices T (ψ) (ψ ∈ Ψ) belongs to some of the cones LAS (r) n (Theorem 2.3). In Section 3 we show the relationships between the various hierarchies of inner approximations for COP n . Section 4 is devoted to the proof of Theorem 2.3. We conclude with some final remarks in Section 5.
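For a vector $x \in \mathbb{R}^n$, $\|x\| = \sqrt{\sum_{i=1}^n x_i^2}$ denotes its Euclidean norm and $\|x\|_1 = \sum_{i=1}^n |x_i|$ denotes its $\ell_1$-norm. For a vector $x \in \mathbb{R}^n$, $\mathrm{Supp}(x) = \{i \in [n] : x_i \ne 0\}$ denotes its support. The vectors $e_1, \ldots, e_n$ denote the standard basis vectors in $\mathbb{R}^n$, and $e = e_1 + \ldots + e_n$ is the all-ones vector.
Given polynomials $p_1, \ldots, p_k$, we let $(p_1, \ldots, p_k) = \{\sum_{i=1}^k q_i p_i : q_i \in \mathbb{R}[x]\}$ denote the ideal generated by the $p_i$'s. Throughout we denote by $I_\Delta$ the ideal generated by the polynomial $\sum_{i=1}^n x_i - 1$. For a set $S \subseteq [n]$ and variables $x = (x_1, \ldots, x_n)$, we use the notation $x^S = \prod_{i \in S} x_i$ and, for a sequence $\beta \in \mathbb{N}^n$, we set $x^\beta = x_1^{\beta_1} \cdots x_n^{\beta_n}$.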

Overview of results and methods
In this section we give a broad overview of the strategy that we will follow to show our main result. As indicated above, we need to show that every positive diagonal scaling of the special matrices T (ψ) (introduced below in relation (2.7)) lies in some cone K (r) 5 . In fact we will show a sharper result and show membership in another, more restricted, conic hierarchy. This alternative conic hierarchy arises naturally by considering other sufficient positivity conditions for the polynomials x T Mx and (x •2 ) T Mx •2 . We begin with introducing these alternative conic approximations.

Alternative conic approximations for COP n
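The definitions of the cone $\mathrm{COP}_n$ in (1.1) and in (1.2) rely on requiring, respectively, nonnegativity of the polynomial $x^T M x$ on $\mathbb{R}^n_+$, and nonnegativity of the polynomial $(x^{\circ 2})^T M x^{\circ 2}$ on $\mathbb{R}^n$. Since these two polynomials are homogeneous, this is equivalent to requiring, respectively, nonnegativity of $x^T M x$ on the standard simplex $\Delta_n = \{x \in \mathbb{R}^n_+ : \sum_{i=1}^n x_i = 1\}$, and nonnegativity of $(x^{\circ 2})^T M x^{\circ 2}$ on the unit sphere $\mathbb{S}^{n-1} = \{x \in \mathbb{R}^n : \sum_{i=1}^n x_i^2 = 1\}$. In other words we can reformulate the copositive cone as
\[ \mathrm{COP}_n = \{M \in \mathcal{S}^n : x^T M x \ge 0 \text{ for all } x \in \Delta_n\} \tag{2.1} \]
\[ \phantom{\mathrm{COP}_n} = \{M \in \mathcal{S}^n : (x^{\circ 2})^T M x^{\circ 2} \ge 0 \text{ for all } x \in \mathbb{S}^{n-1}\}. \tag{2.2} \]
Now we relax the nonnegativity condition and ask instead for a sufficient condition for nonnegativity on the simplex $\Delta_n$ or on the unit sphere $\mathbb{S}^{n-1}$, in terms of sum-of-squares representations that involve the constraints defining the simplex or the sphere. This follows the commonly used approach in polynomial optimization, based on Lasserre-type relaxations (see [13] and overviews in, e.g., [14], [15]), which justifies our notation below.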
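The reformulation over the simplex suggests a cheap numerical sanity check. The sketch below samples points of the standard simplex and returns the smallest value of the quadratic form found; the function name `sample_min_on_simplex` and its parameters are hypothetical. It is important to keep in mind what this can and cannot do: a negative sampled value refutes copositivity, but a nonnegative minimum over the samples certifies nothing, which is precisely why sum-of-squares certificates are of interest.

```python
import numpy as np


def sample_min_on_simplex(M, num_samples=20000, seed=0):
    """Smallest value of x^T M x over random points x of the standard
    simplex, together with a point attaining it among the samples."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    rng = np.random.default_rng(seed)
    # Dirichlet(1, ..., 1) is the uniform distribution on the simplex.
    X = rng.dirichlet(np.ones(n), size=num_samples)
    values = np.einsum("ki,ij,kj->k", X, M, X)
    best = int(np.argmin(values))
    return float(values[best]), X[best]


if __name__ == "__main__":
    # Not copositive: x = (1/2, 1/2) gives the value -1/2.
    value, point = sample_min_on_simplex([[1.0, -2.0], [-2.0, 1.0]])
    print(value < 0, point)
```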
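For any integer $r \in \mathbb{N}$, based on definition (2.1) for $\mathrm{COP}_n$, we define the following cones
\[ \mathrm{LAS}^{(r)}_{\Delta_n} = \Big\{M \in \mathcal{S}^n : x^T M x = \sigma_0 + \sum_{i=1}^n x_i \sigma_i + q \text{ for some } \sigma_0 \in \Sigma_r,\ \sigma_i \in \Sigma_{r-1},\ q \in I_\Delta\Big\}, \tag{2.3} \]
\[ \mathrm{LAS}^{(r)}_{\Delta_n,P} = \Big\{M \in \mathcal{S}^n : x^T M x = \sum_{S \subseteq [n],\, |S| \le r} x^S \sigma_S + q \text{ for some } \sigma_S \in \Sigma_{r-|S|},\ q \in I_\Delta\Big\}. \tag{2.4} \]
The index 'P' used in the notation $\mathrm{LAS}^{(r)}_{\Delta_n,P}$ refers to the fact that the decomposition uses the preordering, which consists of all conic combinations of products of the constraints defining the simplex with sum-of-squares polynomials as multipliers. Clearly, for any integer $r \ge 0$, we have
\[ \mathrm{LAS}^{(r)}_{\Delta_n} \subseteq \mathrm{LAS}^{(r)}_{\Delta_n,P} \subseteq \mathrm{COP}_n. \tag{2.5} \]
Moreover, the cones $\mathrm{LAS}^{(r)}_{\Delta_n}$ cover the interior of the copositive cone.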
In a similar manner, based on definition (2.2) for $\mathrm{COP}_n$, we define the following cone
\[ \mathrm{LAS}^{(r)}_{\mathbb{S}^{n-1}} = \Big\{M \in \mathcal{S}^n : (x^{\circ 2})^T M x^{\circ 2} = \sigma + u\Big(\sum_{i=1}^n x_i^2 - 1\Big) \text{ for some } \sigma \in \Sigma_r,\ u \in \mathbb{R}[x]\Big\}. \]
As we will show later (see Theorem 3.1), we have the following relationships between the various approximation cones for $\mathrm{COP}_n$ that were introduced above:
\[ \mathrm{LAS}^{(r)}_{\Delta_n} \subseteq \mathcal{K}^{(r-2)}_n = \mathrm{LAS}^{(r)}_{\Delta_n,P} = \mathrm{LAS}^{(2r)}_{\mathbb{S}^{n-1}} \quad \text{for any } n \ge 1 \text{ and } r \ge 2. \tag{2.6} \]
A main motivation for introducing the cones $\mathrm{LAS}^{(r)}_{\Delta_n}$ lies in the fact that they allow us to capture certain copositive matrices on the boundary of $\mathrm{COP}_n$, namely those matrices $M$ that arise as positive diagonal scaling of a class of matrices generating extreme rays of $\mathrm{COP}_5$ (see Theorem 2.2 and Theorem 2.3 below). Of course, as a direct application of (2.6), these matrices also lie in some Parrilo cone $\mathcal{K}^{(r)}_n$. The point, however, is that we do not know how to show this directly, without showing that they belong to some cone $\mathrm{LAS}^{(r)}_{\Delta_n}$, which is a stronger result. The alternative cones $\mathrm{LAS}^{(r)}_{\mathbb{S}^{n-1}}$ are introduced for the sake of completeness, since they arise naturally in view of definition (2.2) for $\mathrm{COP}_n$; however, we will not use them for the proofs of our results in the paper.

Extreme rays of COP 5
For answering the question of whether the two cones $\mathrm{COP}_5$ and $\bigcup_{r \ge 0} \mathcal{K}^{(r)}_5$ coincide it suffices to look at the matrices that generate an extreme ray of $\mathrm{COP}_5$. This indeed follows directly from the fact that any $M \in \mathrm{COP}_5$ can be decomposed as a finite sum of matrices generating an extreme ray. For convenience we say that a copositive matrix is extreme if it generates an extreme ray of $\mathrm{COP}_n$.
A positive diagonal scaling of a matrix M is a matrix of the form DM D where D ∈ D n ++ . Notice that if M is an extreme matrix of COP n then every positive diagonal scaling of M is also an extreme matrix. Moreover, if M ∈ COP n is an extreme matrix then the same holds for every row/column permutation of M , i.e., for any matrix of the form P T MP , where P is a permutation matrix. As observed above, positive diagonal scaling does not preserve in general membership in K (r) n (r ≥ 1), however taking a row/column permutation clearly does preserve membership in any K (r) n (r ≥ 0). Hildebrand [12] characterized the set of extreme matrices of COP 5 . For this, he defined the following matrices where ψ ∈ R 5 , and proved the following theorem.
where $P$ is a permutation matrix, $D \in \mathcal{D}^5_{++}$ and the quintuple $\psi$ is an element of the set

In summary, the extreme matrices $M$ of $\mathrm{COP}_5$ can be divided into three categories: (i) $M \in \mathcal{K}^{(0)}_5$, (ii) $M$ is (up to row/column permutation) a positive diagonal scaling of the Horn matrix, (iii) $M$ is (up to row/column permutation) a positive diagonal scaling of a matrix $T(\psi)$ for some $\psi \in \Psi$.
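Our main result in this paper is to show that every matrix from the third category of extreme matrices of $\mathrm{COP}_5$ belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_5}$ and thus, in view of (2.6), to some cone $\mathcal{K}^{(r)}_5$.

Theorem 2.3. For all $\psi \in \Psi$ and $D \in \mathcal{D}^5_{++}$, we have $D\,T(\psi)\,D \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_5}$.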
In view of Theorem 2.2, Theorem 1.3 directly follows from Theorem 2.3. As a direct consequence, in order to answer Question 1.1, it suffices to look at the extreme matrices from the second category (i.e., at the positive diagonal scalings of the Horn matrix H).
On the other hand, as we will show later in Lemma 3.10, the Horn matrix H does not belong to any of the cones LAS (r) Δ 5 . Hence, in order to show that any diagonal scaling of the Horn matrix belongs to some cone K (r) 5 and thus give an affirmative answer to Question 1.1, it will not be sufficient to consider the cones LAS (r) Δ 5 . A different, new strategy will be needed.

Sketch of the proof
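We are left with the task of proving Theorem 2.3. For this we follow an optimization approach and consider the following standard quadratic program, for a given copositive matrix $M \in \mathrm{COP}_n$:
\[ p^*_M = \min\{x^T M x : x \in \Delta_n\}. \tag{$\mathrm{SQP}_M$} \]
Since $M$ is copositive we have $p^*_M \ge 0$. If $p^*_M > 0$, then $M$ lies in the interior of $\mathrm{COP}_n$ and thus it belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_n}$. Hence we may restrict our attention to the case when $p^*_M = 0$, i.e., when $M \in \partial\mathrm{COP}_n$. We now consider the Lasserre sum-of-squares hierarchy for problem ($\mathrm{SQP}_M$), where, for any integer $r \ge 1$, we set
\[ p^{(r)}_M = \sup\Big\{\lambda \in \mathbb{R} : x^T M x - \lambda = \sigma_0 + \sum_{i=1}^n x_i \sigma_i + q \text{ for some } \sigma_0 \in \Sigma_r,\ \sigma_i \in \Sigma_{r-1},\ q \in I_\Delta\Big\}. \tag{2.9} \]
Then the bounds $p^{(r)}_M$ converge asymptotically to $p^*_M$ as $r \to \infty$, and showing that $M$ belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_n}$ amounts to showing that this convergence is finite. For this we will use a general theorem of Nie [21] that ensures finite convergence of the Lasserre hierarchy (2.9) when the classical optimality conditions hold at every global minimizer (see Theorem 4.2 below). In our case the global minimizers of problem ($\mathrm{SQP}_M$) are given by the zeros of the quadratic form $x^T M x$ in $\Delta_n$, whose structure is well-understood for the matrices $M = T(\psi)$. See Section 4 for details.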

Relationships between sum-of-squares conic approximations for COP n
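In this section we show the relationships from (2.6) between the cones $\mathcal{K}^{(r)}_n$ and the Lasserre-type cones. Here we show the following result, which establishes the links announced in relation (2.6) between the various cones defined in previous sections.

Theorem 3.1. For any $n \ge 1$ and $r \ge 2$, we have $\mathrm{LAS}^{(r)}_{\Delta_n} \subseteq \mathcal{K}^{(r-2)}_n = \mathrm{LAS}^{(r)}_{\Delta_n,P} = \mathrm{LAS}^{(2r)}_{\mathbb{S}^{n-1}}$.

This result is implicitly shown in [16] (see Corollary 3.9), where we compared different bounds for standard quadratic programs obtained via sums of squares of polynomials. We begin with observing that in the definition (2.4) of the cone $\mathrm{LAS}^{(r)}_{\Delta_n,P}$ we may assume that the summation only involves sets $S \subseteq [n]$ with $|S| \equiv r \pmod 2$.
Proof. To see this consider a term $x^S \sigma_S$, where $|S| \le r$, $|S| \not\equiv r \pmod 2$ and $\sigma_S \in \Sigma_{r-|S|}$. Then $|S| \le r-1$, $\deg(\sigma_S) \le r-|S|-1$ and thus, modulo the ideal $I_\Delta$, we can replace $x^S \sigma_S$ by
\[ x^S \sigma_S \sum_{i=1}^n x_i = \sum_{i \in S} x^{S \setminus \{i\}} (x_i^2 \sigma_S) + \sum_{i \notin S} x^{S \cup \{i\}} \sigma_S, \]
where every set now has cardinality congruent to $r$ modulo 2 and every multiplier is a sum of squares satisfying the required degree bound.

Next we recall an alternative definition of the cone $\mathcal{K}^{(r)}_n$, following from a result in [27]: for a homogeneous polynomial $f$ of degree $d$, the polynomial $f(x^{\circ 2})$ is a sum of squares if and only if
\[ f = \sum_{S \subseteq [n],\ |S| \le d,\ |S| \equiv d \ (\mathrm{mod}\ 2)} \sigma_S x^S \quad \text{for some } \sigma_S \in \Sigma_{d-|S|}. \tag{3.1} \]
In particular, for any $r \ge 0$, we have
\[ \mathcal{K}^{(r)}_n = \Big\{M \in \mathcal{S}^n : \Big(\sum_{i=1}^n x_i\Big)^r x^T M x = \sum_{S \subseteq [n],\ |S| \le r+2,\ |S| \equiv r \ (\mathrm{mod}\ 2)} \sigma_S x^S \text{ for some } \sigma_S \in \Sigma_{r+2-|S|}\Big\}. \tag{3.2} \]
Note the similarity between the description of $\mathrm{LAS}^{(r)}_{\Delta_n,P}$ in (2.4) and the description of $\mathcal{K}^{(r-2)}_n$ in (3.2): in the former we have a representation of $x^T M x$ modulo the ideal $I_\Delta$, while in the latter we have a representation of $(\sum_i x_i)^{r-2} x^T M x$. The next lemma (whose main idea was already used, e.g., in [9]) gives a simple trick, useful to navigate between these two types of representations.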
and assume f is homogeneous. The following assertions hold.
Proof. The assertion (i) follows by expanding ( We now show (ii). The claim that g is a homogeneous polynomial of degree d + r is easy to check. Assume now f − g ∈ I Δ . By evaluating , and the result follows after multiplying both sides by ( We will also use the following simple fact.
. Then σ is a homogeneous polynomial of degree k. Moreover, is a homogeneous polynomial with degree deg(σ). It suffices now to observe that ( Using these two lemmas we can now relate the two cones LAS . As r − |S| ≡ deg( σ S )(mod 2) we have σ S ∈ Σ r−|S| by Lemma 3.5(i). In view of relation (3.2), this shows that M ∈ K (r−2) n . Conversely, assume M ∈ K (r−2) n . Then, in view of (3.2), we have a decomposition of the form ( n i=1 ) r−2 x T Mx = |S|≤r,|S|≡r(mod 2) σ S x S , where σ S ∈ Σ r−|S| . By applying Lemma 3.4(i), we obtain x T Mx = |S|≤r,|S|≡r(mod 2) σ S x S +q, where q ∈ I Δ . Combining with Lemma 3.2 this shows M ∈ LAS (r) Δ n ,P .
To complete the proof of Theorem 3.1 we now establish the relation to the cone LAS (r) S n−1 , which follows from a result in [9].
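Proposition 3.7 ([9]). Let $f$ be a homogeneous polynomial of degree $2d$ and $r \in \mathbb{N}$. Then, $(\sum_{i=1}^n x_i^2)^r f \in \Sigma$ if and only if $f = \sigma + u(\sum_{i=1}^n x_i^2 - 1)$ for some $\sigma \in \Sigma_{2r+2d}$ and $u \in \mathbb{R}[x]$. In particular, for any $r \ge 2$ we have $\mathrm{LAS}^{(2r)}_{\mathbb{S}^{n-1}} = \mathcal{K}^{(r-2)}_n$.

We conclude this section with a reformulation for the cone $\mathrm{LAS}^{(r)}_{\Delta_n}$. If $r \ge 3$ is odd, then we have
\[ \mathrm{LAS}^{(r)}_{\Delta_n} = \Big\{M \in \mathcal{S}^n : \Big(\sum_{i=1}^n x_i\Big)^{r-2} x^T M x = \sum_{i=1}^n x_i \sigma_i \text{ for some homogeneous } \sigma_i \in \Sigma_{r-1}\Big\}. \tag{3.4} \]
If $r$ is even and $r \ge 4$, then we have $\mathrm{LAS}^{(r)}_{\Delta_n} = \mathrm{LAS}^{(r-1)}_{\Delta_n}$.

Proof. The proof is similar to that of Lemma 3.2, except we now have a summation that involves only sets $S \subseteq [n]$ with $|S| \le 1$. We spell out the details for clarity. Consider first the case when $r$ is odd. Assume $M \in \mathrm{LAS}^{(r)}_{\Delta_n}$, so that $x^T M x$ admits a decomposition as in (2.3). Combining Lemma 3.4(ii) and Lemma 3.5 we obtain a decomposition as in (3.4). Conversely, starting from a decomposition as in (3.4) we get a decomposition as in (2.3) by applying Lemma 3.4(i).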
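Consider now the case $r \ge 4$ even. Assume $M \in \mathrm{LAS}^{(r)}_{\Delta_n}$. Starting from a decomposition as in (2.3) and using as above Lemma 3.4(ii) and Lemma 3.5, we obtain a decomposition
\[ \Big(\sum_{j=1}^n x_j\Big)^{r-2} x^T M x = \sigma_0 + \Big(\sum_{j=1}^n x_j\Big) \sum_{i=1}^n x_i \sigma_i, \]
where $\sigma_0 \in \Sigma_r$ and $\sigma_i \in \Sigma_{r-1}$ are homogeneous. From this it follows that the polynomial $\sum_{j=1}^n x_j$ divides $\sigma_0$, which implies that its square divides $\sigma_0$, since $\sigma_0$ is a sum of squares. Then we can divide out by $\sum_{j=1}^n x_j$ and obtain an expression as in (3.4) (replacing $r$ by $r-1$), which certifies membership of $M$ in $\mathrm{LAS}^{(r-1)}_{\Delta_n}$.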

Link to the cones Q (r) n
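The definition (3.2) of the cone $\mathcal{K}^{(r)}_n$ involves only square-free monomials, of the form $x^S = \prod_{i \in S} x_i$. As observed in [27,23], one can allow arbitrary monomials and, after using again the argument of Lemma 3.4, we get the following alternative definitions
\[ \mathcal{K}^{(r)}_n = \Big\{M \in \mathcal{S}^n : \Big(\sum_{i=1}^n x_i\Big)^r x^T M x = \sum_{\beta \in \mathbb{N}^n,\ |\beta| \le r+2,\ |\beta| \equiv r \ (\mathrm{mod}\ 2)} \sigma_\beta x^\beta \text{ for } \sigma_\beta \in \Sigma_{r+2-|\beta|}\Big\} \tag{3.5} \]
\[ \phantom{\mathcal{K}^{(r)}_n} = \Big\{M \in \mathcal{S}^n : x^T M x = \sum_{\beta \in \mathbb{N}^n,\ |\beta| \le r+2,\ |\beta| \equiv r \ (\mathrm{mod}\ 2)} \sigma_\beta x^\beta + q \text{ for } \sigma_\beta \in \Sigma_{r+2-|\beta|} \text{ and } q \in I_\Delta\Big\}. \tag{3.6} \]
Based on relation (3.5), the authors of [23] proposed the cones $\mathcal{Q}^{(r)}_n$, that are defined as the variation (3.7) of (3.5) obtained by just considering the terms associated to the monomials $x^\beta$ with highest degree $r$ or $r+2$. In other words,
\[ \mathcal{Q}^{(r)}_n = \Big\{M \in \mathcal{S}^n : \Big(\sum_{i=1}^n x_i\Big)^r x^T M x = \sum_{\beta \in \mathbb{N}^n,\ |\beta| = r, r+2} \sigma_\beta x^\beta \text{ for } \sigma_\beta \in \Sigma_{r+2-|\beta|}\Big\} \tag{3.7} \]
\[ \phantom{\mathcal{Q}^{(r)}_n} = \Big\{M \in \mathcal{S}^n : x^T M x = \sum_{\beta \in \mathbb{N}^n,\ |\beta| = r, r+2} \sigma_\beta x^\beta + q \text{ for } \sigma_\beta \in \Sigma_{r+2-|\beta|} \text{ and } q \in I_\Delta\Big\}, \tag{3.8} \]
where the equivalence of (3.7) and (3.8) follows again using Lemma 3.4. Clearly, we have inclusion $\mathcal{Q}^{(r)}_n \subseteq \mathcal{K}^{(r)}_n$.

In view of (3.6), membership in $\mathcal{K}^{(r-2)}_n$ requires a decomposition using terms of the form $x^\beta \sigma_\beta$ for all $\beta$ such that $|\beta| \le r$ and $|\beta| \equiv r \pmod 2$. In view of (2.3), for membership in $\mathrm{LAS}^{(r)}_{\Delta_n}$, we consider only the terms $x^\beta \sigma_\beta$ with lowest degree $|\beta| = 0, 1$. On the other hand, in view of (3.8), for membership in $\mathcal{Q}^{(r-2)}_n$, we consider only the terms with highest degree $|\beta| = r, r-2$. Hence, it is interesting to note that the two cones $\mathrm{LAS}^{(r)}_{\Delta_n}$ and $\mathcal{Q}^{(r-2)}_n$ use the "two opposite ends" of the spectrum of possible degrees for the terms $x^\beta \sigma_\beta$.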
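We conclude this section by observing that, while the Horn matrix $H$ belongs to $\mathcal{K}^{(1)}_5$, it in fact does not belong to any of the cones $\mathrm{LAS}^{(r)}_{\Delta_5}$. The proof exploits the fact that the quadratic form $x^T H x$ has infinitely many zeros in the simplex $\Delta_5$.

Lemma 3.10. The Horn matrix $H$ does not belong to $\mathrm{LAS}^{(r)}_{\Delta_5}$ for any $r \in \mathbb{N}$.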
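Proof. Assume by contradiction that $H \in \mathrm{LAS}^{(r)}_{\Delta_5}$ for some $r \in \mathbb{N}$, i.e.,
\[ x^T H x = \sigma_0 + \sum_{i=1}^5 x_i \sigma_i + q\Big(\sum_{i=1}^5 x_i - 1\Big) \]
for some $\sigma_0 \in \Sigma_r$, $\sigma_i \in \Sigma_{r-1}$ and $q \in \mathbb{R}[x]$. For a fixed scalar $t \in (0, 1)$, consider the vector $u_t = (\tfrac{1}{2}, 0, \tfrac{t}{2}, \tfrac{1-t}{2}, 0) \in \Delta_5$, which can be verified to define a zero of $x^T H x$, i.e., $u_t^T H u_t = 0$. By evaluating the quadratic form $x^T H x$ at the point $x + u_t$ we obtain
\[ x^T H x + 2 x^T H u_t + u_t^T H u_t = \sigma_0(x + u_t) + \sum_{i=1}^5 (x_i + (u_t)_i)\, \sigma_i(x + u_t) + q(x + u_t) \Big(\sum_{i=1}^5 x_i\Big). \]
As $u_t^T H u_t = 0$ and $x^T H u_t = t x_2 + (1-t) x_5$ we obtain
\[ x^T H x + 2t x_2 + 2(1-t) x_5 = \sigma_0(x + u_t) + \sum_{i=1}^5 (x_i + (u_t)_i)\, \sigma_i(x + u_t) + q(x + u_t) \Big(\sum_{i=1}^5 x_i\Big). \tag{3.9} \]
We now compare some coefficients of the monomials (in $x$) in both sides of (3.9) in order to reach a contradiction. As there is no constant term in the left hand side, the constant term in the right hand side is equal to 0. This gives $\sigma_0(u_t) + \sigma_1(u_t)/2 + t\,\sigma_3(u_t)/2 + (1-t)\,\sigma_4(u_t)/2 = 0$ and thus, as each term is nonnegative, $\sigma_i(u_t) = 0$ for $i = 0, 1, 3, 4$. As $\sigma_i(x + u_t)$ is a sum-of-squares polynomial in $x$ this in turn implies that there is no linear term in $x$ in each of the polynomials $\sigma_i(x + u_t)$ for $i = 0, 1, 3, 4$. Next, combining this with the fact that the coefficient of $x_1$ in the left hand side is equal to 0, one obtains that the polynomial $q(x + u_t)$ has no constant term (i.e., $q(u_t) = 0$). Now we compare the coefficients of $x_2$ in both sides. In the left hand side it is equal to $2t$, while in the right hand side it is equal to $\sigma_2(u_t)$. Hence we have $2t = \sigma_2(u_t)$ for all $t \in (0, 1)$. We now reach a contradiction since $\sigma_2(u_t)$ is a sum-of-squares polynomial in $t$, while the polynomial $2t$ takes negative values.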
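The two facts about $u_t$ used in the proof, namely $u_t^T H u_t = 0$ and $x^T H u_t = t x_2 + (1-t) x_5$, can be checked numerically. The sketch below assumes that the Horn matrix is indexed as the graph matrix of the 5-cycle, with entries 1 on the diagonal and on cyclically adjacent pairs and entries -1 elsewhere, which is the labelling under which the stated identity holds; the helper names are hypothetical.

```python
import numpy as np

# Horn matrix, written as 2 (I + A) - J for the 5-cycle 1-2-3-4-5-1
# (an assumption on the labelling, see the lead-in above).
H = np.array([
    [ 1,  1, -1, -1,  1],
    [ 1,  1,  1, -1, -1],
    [-1,  1,  1,  1, -1],
    [-1, -1,  1,  1,  1],
    [ 1, -1, -1,  1,  1],
], dtype=float)


def u(t):
    """The point u_t = (1/2, 0, t/2, (1 - t)/2, 0) of the simplex."""
    return np.array([0.5, 0.0, t / 2.0, (1.0 - t) / 2.0, 0.0])


def check_zero(t):
    """Return (u_t^T H u_t, H u_t) for a given t in (0, 1)."""
    ut = u(t)
    return float(ut @ H @ ut), H @ ut


if __name__ == "__main__":
    for t in np.linspace(0.1, 0.9, 5):
        value, gradient_half = check_zero(t)
        # gradient_half should be (0, t, 0, 0, 1 - t).
        print(round(value, 12), np.round(gradient_half, 12))
```

Since $H u_t = (0, t, 0, 0, 1-t)$, the linear form $x^T H u_t$ only involves $x_2$ and $x_5$, which is what drives the coefficient comparison in the proof.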
We now show that the cones LAS Assume now n = 3, we show that the matrix does not belong to any of the cones LAS (r) Δ 3 . The proof follows a similar argument as the one used for Lemma 3.10, using the fact that u t = (t, 0, 1 − t) defines a zero of M for any t ∈ (0, 1), i.e., u T t Mu t = 0.

Proof of Theorem 2.3
We now proceed to prove Theorem 2.3. As mentioned in Section 2.3, we will follow an optimization approach, which allows us to apply a result of Nie [21] as a key ingredient for our proof. We proceed in three steps. First, we recall the sum-of-squares Lasserre hierarchy for a general polynomial optimization problem and the result of Nie [21], that shows finite convergence of this hierarchy under the classical optimality conditions. Second, applying this result to a class of standard quadratic programs, we obtain a set of sufficient conditions for a matrix M ∈ ∂COP n , that permit to claim that every positive diagonal scaling of M belongs to some cone LAS (r) Δ n . Finally, we show that these sufficient conditions hold for the matrices T (ψ) (ψ ∈ Ψ), which concludes the proof of Theorem 2.3.

Optimality conditions and finite convergence of Lasserre hierarchy
In this section we recall a useful general result of Nie [21] that gives sufficient conditions for having finite convergence of the Lasserre hierarchy for a general polynomial optimization problem.
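Given $n$-variate polynomials $f$, $g_j$ for $j \in [m]$, and $h_i$ for $i \in [k]$, consider the general polynomial optimization problem
\[ f_{\min} = \inf_{x \in K} f(x), \tag{Poly-Opt} \]
where $K$ is the semialgebraic set defined by
\[ K = \{x \in \mathbb{R}^n : g_j(x) \ge 0 \text{ for } j \in [m],\ h_i(x) = 0 \text{ for } i \in [k]\}. \]
We say that the Archimedean condition holds if there exists $N \in \mathbb{R}$ such that
\[ N - \sum_{i=1}^n x_i^2 = \sigma_0 + \sum_{j=1}^m \sigma_j g_j + \sum_{i=1}^k u_i h_i \quad \text{for some } \sigma_0, \sigma_j \in \Sigma \text{ and } u_i \in \mathbb{R}[x]. \tag{4.1} \]
For any integer $r \in \mathbb{N}$ consider the corresponding Lasserre sum-of-squares hierarchy
\[ f^{(r)} = \sup\Big\{\lambda \in \mathbb{R} : f - \lambda = \sigma_0 + \sum_{j=1}^m \sigma_j g_j + \sum_{i=1}^k u_i h_i \text{ for some } \sigma_0 \in \Sigma_r,\ \sigma_j \in \Sigma_{r - \deg(g_j)},\ u_i \in \mathbb{R}[x]\Big\}. \]
By the following result of Putinar [25], under the Archimedean condition, asymptotic convergence is guaranteed, i.e., $f^{(r)} \to f_{\min}$ as $r \to \infty$.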

Theorem 4.1 ([25]). Assume $K$ satisfies the Archimedean condition (4.1). If a polynomial $p$ is strictly positive on $K$, then $p$ can be written as $p = \sigma_0 + \sum_{j=1}^m \sigma_j g_j + \sum_{i=1}^k u_i h_i$ for some $\sigma_0, \sigma_j \in \Sigma$ and $u_i \in \mathbb{R}[x]$.

The Lasserre hierarchy is said to have finite convergence if $f^{(r)} = f_{\min}$ for some $r \in \mathbb{N}$. In general, finite convergence is not always achieved. However, Nie showed in [21] a very useful result that allows one to show finite convergence of the Lasserre hierarchy under some extra conditions apart from the Archimedean condition. These conditions rely on the classical optimality conditions, which we now recall (see, e.g., the textbook [1]).
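Let $u$ be a local minimizer of problem (Poly-Opt) and let $J(u) = \{j \in [m] : g_j(u) = 0\}$ be the set of inequality constraints that are active at $u$. Then the constraint qualification condition (abbreviated as CQC) holds at $u$ if the set $\{\nabla g_j(u) : j \in J(u)\} \cup \{\nabla h_i(u) : i \in [k]\}$ is linearly independent. If CQC holds at $u$ then there exist scalars $\lambda_1, \ldots, \lambda_k, \mu_1, \ldots, \mu_m \in \mathbb{R}$ satisfying
\[ \nabla f(u) = \sum_{i=1}^k \lambda_i \nabla h_i(u) + \sum_{j \in J(u)} \mu_j \nabla g_j(u), \qquad \mu_j \ge 0 \text{ for } j \in J(u), \qquad \mu_j = 0 \text{ for } j \in [m] \setminus J(u). \]
If, in addition, $\mu_j > 0$ holds for all $j \in J(u)$, then one says that the strict complementarity condition (abbreviated as SCC) holds. Let $L(x)$ be the Lagrangian function, defined by
\[ L(x) = f(x) - \sum_{i=1}^k \lambda_i h_i(x) - \sum_{j \in J(u)} \mu_j g_j(x). \]
Another necessary condition for $u$ to be a local minimizer is the following inequality
\[ v^T \nabla^2 L(u)\, v \ge 0 \quad \text{for all } v \in G(u)^\perp, \tag{SONC} \]
where $G(u)$ is defined by
\[ G(u) = \{\nabla g_j(u) : j \in J(u)\} \cup \{\nabla h_i(u) : i \in [k]\} \]
and $G(u)^\perp$ denotes the set of vectors orthogonal to all elements of $G(u)$. If it happens that the inequality (SONC) is strict, i.e., if
\[ v^T \nabla^2 L(u)\, v > 0 \quad \text{for all } v \in G(u)^\perp \setminus \{0\}, \tag{SOSC} \]
then one says that the second order sufficiency condition (SOSC) holds at $u$.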
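For the problem of minimizing $x^T M x$ over the standard simplex these conditions become very explicit, and a small numerical check may help fix ideas. The sketch below is an illustration under the following derivation, which is ours and should be read as an assumption: with the constraints $x_j \ge 0$ and $\sum_i x_i = 1$, stationarity reads $2Mu = \lambda e + \mu$ with $\mu_j u_j = 0$, so that $\lambda = 2u^T M u$ and $\mu = 2Mu - \lambda e$. The function name `simplex_multipliers` is hypothetical, and the Horn matrix is taken with the labelling of the 5-cycle graph matrix.

```python
import numpy as np


def simplex_multipliers(M, u, tol=1e-9):
    """Multipliers for min x^T M x over the simplex at a feasible point u.

    Returns (lam, mu, first_order_ok, scc): lam is the multiplier of
    sum_i x_i = 1, mu[j] the multiplier of x_j >= 0, first_order_ok says
    whether the first order conditions hold, scc whether in addition
    mu[j] > 0 for every active constraint (u[j] = 0)."""
    M = np.asarray(M, dtype=float)
    u = np.asarray(u, dtype=float)
    grad = 2.0 * M @ u
    lam = float(u @ grad)          # equals 2 u^T M u
    mu = grad - lam                # candidate multipliers of x_j >= 0
    active = u <= tol
    first_order_ok = bool(
        np.all(np.abs(mu[~active]) <= tol) and np.all(mu[active] >= -tol)
    )
    scc = bool(first_order_ok and np.all(mu[active] > tol))
    return lam, mu, first_order_ok, scc


if __name__ == "__main__":
    H = np.array([
        [ 1,  1, -1, -1,  1],
        [ 1,  1,  1, -1, -1],
        [-1,  1,  1,  1, -1],
        [-1, -1,  1,  1,  1],
        [ 1, -1, -1,  1,  1],
    ], dtype=float)
    # A zero of x^T H x supported on the stable set {1, 3} of the 5-cycle.
    print(simplex_multipliers(H, [0.5, 0.0, 0.5, 0.0, 0.0]))
```

At the chosen zero of the Horn matrix two of the multipliers attached to active constraints vanish, so strict complementarity fails there, in line with the fact that the Horn form has infinitely many zeros in the simplex.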
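We can now state the following result by Nie [21].

Theorem 4.2 ([21]). Assume that the Archimedean condition (4.1) holds. If the optimality conditions (CQC), (SCC) and (SOSC) hold at every global minimizer of problem (Poly-Opt), then the Lasserre hierarchy has finite convergence, i.e., $f^{(r)} = f_{\min}$ for some $r \in \mathbb{N}$.

In the next section we will apply Theorem 4.2 to a class of standard quadratic programs in order to show finite convergence of the corresponding Lasserre hierarchy. One important observation, already made in [21], is that this strategy can only work when the number of global minimizers is finite.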

Optimality conditions for standard quadratic programs
Consider a matrix $M \in \partial\mathrm{COP}_n$. The objective of this section is to give sufficient conditions on $M$ that allow us to conclude that $DMD \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$ for all $D \in \mathcal{D}^n_{++}$. This will be very useful since, in the next section, we will show that the matrices $T(\psi)$ ($\psi \in \Psi$) satisfy these sufficient conditions and thus we will be able to conclude the proof of Theorem 2.3. Our strategy is to apply the result from Theorem 4.2 to the setting of standard quadratic programs. Let us recall the following problem, already introduced in Section 2.3:
\[ p^*_M = \min\{x^T M x : x \in \Delta_n\}, \tag{$\mathrm{SQP}_M$} \]
and the corresponding Lasserre hierarchy introduced in relation (2.9). Note the optimal value of ($\mathrm{SQP}_M$) is zero as $M \in \partial\mathrm{COP}_n$. Now we will apply Theorem 4.2 to problem ($\mathrm{SQP}_M$). The set $K = \Delta_n$ indeed satisfies the Archimedean condition (this is well-known and easy to check; see, e.g., [16]). By [18, Theorem 3.1], the feasible region of the Lasserre hierarchy (2.9) associated to problem ($\mathrm{SQP}_M$) is a closed set. Hence, the 'sup' in program (2.9) can be changed to a 'max'. As a consequence, for a matrix $M \in \partial\mathrm{COP}_n$, having finite convergence of the Lasserre hierarchy (2.9) associated to problem ($\mathrm{SQP}_M$) is equivalent to having $M \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$. So we obtain the following corollary.

Corollary 4.3. Let $M \in \partial\mathrm{COP}_n$. If the optimality conditions (CQC), (SCC) and (SOSC) hold at every minimizer of problem ($\mathrm{SQP}_M$), then $M \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$.

As mentioned earlier, our objective is to give sufficient conditions on $M$ that allow us to claim $DMD \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$ for all $D \in \mathcal{D}^n_{++}$. For this we will apply Corollary 4.3, combined with the following result, which will be a key ingredient in our argument. In what follows we will prove Theorem 4.4. Given $M \in \partial\mathrm{COP}_n$ and $D \in \mathcal{D}^n_{++}$, let us consider the standard quadratic program associated to $DMD$:
\[ \min\{x^T DMD\, x : x \in \Delta_n\}. \tag{$\mathrm{SQP}_{DMD}$} \]
Observe that the optimal value of program ($\mathrm{SQP}_{DMD}$) is zero, since $DMD \in \partial\mathrm{COP}_n$. Let us recall a result from [4] about the support of optimal solutions for problem ($\mathrm{SQP}_M$), which we will use for the analysis of the conditions (SCC) and (SOSC). We give the short proof for clarity.
As observed, e.g., in [21], if the sufficient optimality conditions (CQC), (SCC), (SOSC) hold at every global minimizer, then the number of minimizers must be finite. We now show a useful fact: if a standard quadratic program has finitely many minimizers, then (SOSC) holds at all of them. x has infinitely many zeros on Δ |S| . Hence, x T Mx has infinitely many zeros on Δ n , contradicting the assumption.
Let u be a minimizer of problem (SQP M ) with support S and consider as above its restriction ũ ∈ R |S| . Observe that the second order sufficiency condition (SOSC) for problem (SQP M ) at u reads

Proof of Theorem 2.3
Now we can prove the result of Theorem 2.3; that is, we show that $D\,T(\psi)\,D \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_5}$ for all $D \in \mathcal{D}^5_{++}$ and $\psi \in \Psi$. We show this result as a direct application of Theorem 4.8. It thus remains to check that the two assumptions in Theorem 4.8 hold. First, by combining two results from [12], the description of the (finitely many) minimizers of problem ($\mathrm{SQP}_M$) for $M = T(\psi)$ ($\psi \in \Psi$) can be obtained.

Concluding remarks
In this paper we investigate whether the cones $\mathcal{K}^{(r)}_n$ provide a complete approximation hierarchy for the copositive cone $\mathrm{COP}_n$, i.e., whether their union covers the full cone $\mathrm{COP}_n$. As mentioned earlier, the answer is positive for $n \le 4$ (then $\mathcal{K}^{(0)}_n = \mathrm{COP}_n$ [4]) and negative for $n \ge 6$ [17]. As our main result we show that the answer is positive for $n = 5$ if and only if every positive diagonal scaling of the Horn matrix belongs to some cone $\mathcal{K}^{(r)}_5$. Our proof technique relies on considering an alternative approximation hierarchy of $\mathrm{COP}_n$, provided by the Lasserre-type cones $\mathrm{LAS}^{(r)}_{\Delta_n}$. Namely, we show that all the extreme matrices of $\mathrm{COP}_5$ that do not belong to $\mathcal{K}^{(0)}_5$ and are not a positive diagonal scaling of the Horn matrix do indeed belong to $\bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_5}$.

As we have seen earlier, for a matrix $M \in \mathrm{COP}_n$, the number of zeros of the form $x^T M x$ in the simplex $\Delta_n$ plays an important role for checking membership of $M$ in the cones $\mathrm{LAS}^{(r)}_{\Delta_n}$. If $M$ is strictly copositive (i.e., $x^T M x$ has no zeros in $\Delta_n$), then $M \in \bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$. If $x^T M x$ has finitely many zeros in $\Delta_n$, then, as was shown in Section 4.1, one possible strategy for showing membership in $\bigcup_{r \ge 0} \mathrm{LAS}^{(r)}_{\Delta_n}$ is following an optimization approach and checking the classical optimality conditions at every zero in $\Delta_n$ (i.e., every minimizer of $x^T M x$ over $\Delta_n$). This has been our strategy for showing that every matrix $T(\psi)$ (for $\psi \in \Psi$) belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_5}$. Finally, if $x^T M x$ has infinitely many zeros in $\Delta_n$, then the classical optimality conditions cannot hold and thus the strategy from Section 4.1 does not work. One example that illustrates how the number of zeros causes issues is the Horn matrix $H$. While $H$ belongs to $\mathcal{K}^{(1)}_5$, it does not belong to $\mathrm{LAS}^{(r)}_{\Delta_5}$ for any $r \in \mathbb{N}$. To show this, we have exploited the structure of the (infinitely many) zeros of the form $x^T H x$ in $\Delta_5$.
Hence, another strategy will be needed for settling the question whether every positive diagonal scaling of H belongs to some cone K (r) 5 .
In [16] we proved (rephrased in the language of the present paper) that, if $G$ is an acritical graph, then its graph matrix $M_G$ belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_n}$. Our strategy there was also based on applying the optimization approach and showing that the optimality conditions hold at all the zeros of $x^T M_G x$ in $\Delta_n$. The assumption that $G$ is acritical indeed ensures that the number of zeros in $\Delta_n$ is finite (the zeros correspond then to the stable sets of maximum cardinality $\alpha(G)$). Therefore, as a direct application of Theorem 4.4, we obtain that every positive diagonal scaling of $M_G$ belongs to some cone $\mathrm{LAS}^{(r)}_{\Delta_n}$. Dealing with general graphs (with critical edges) will likely require another strategy.

Declaration of competing interest
None declared.