Conic optimization: a survey with special focus on copositive optimization and binary quadratic problems

A conic optimization problem is a problem involving a constraint that the optimization variable be in some closed convex cone. Prominent examples are second order cone programs (SOCP), semidefinite problems (SDP), and copositive problems. We survey recent progress made in this area. In particular, we highlight the connections between nonconvex quadratic problems, binary quadratic problems, and copositive optimization. We review how tight bounds can be obtained by relaxing the copositivity constraint to semidefiniteness, and we discuss the effect that different modelling techniques have on the quality of the bounds.


Introduction
A conic optimization problem is a problem involving a constraint that the optimization variable be in some closed convex cone. The field of conic optimization is a broad one, as any convex optimization problem can be cast as a conic problem, see [82]. In this paper, we will focus on more specific conic problems which appear naturally when solving quadratic or combinatorial optimization problems. In particular, we will highlight developments in second order cone programming (SOCP), semidefinite programming (SDP), and copositive optimization.

The general linear conic problem and its dual
Consider a proper cone K, i.e., a closed, convex and full dimensional cone which is also pointed, meaning that K does not contain a straight line, or equivalently, that K ∩ (−K) = {0}. Then a linear conic optimization problem over K is a problem of the form

p* = inf ⟨C, X⟩  such that  ⟨A_i, X⟩ = b_i (i = 1, ..., m),  X ∈ K,   (P)

where C, X, A_i are matrices (or vectors) of suitable dimension, and b_i ∈ R for all i = 1, ..., m. In the case of matrices, ⟨·, ·⟩ denotes the Frobenius inner product ⟨A, B⟩ := trace(A^T B); in the case of vectors, it denotes the Euclidean inner product. Problem (P) therefore aims to minimize a linear function over the intersection of a proper cone and an affine subspace. As in linear programming, a primal problem of the form (P) always comes with a dual problem which involves the dual cone: given an arbitrary cone K ⊆ R^{m×n}, the dual cone K* is defined as K* := {X ∈ R^{m×n} | ⟨X, K⟩ ≥ 0 for all K ∈ K}.
As usual, the Lagrangian function L : K × R^m → R is defined as

L(X, y) := ⟨C, X⟩ + Σ_{i=1}^m y_i (b_i − ⟨A_i, X⟩) = ⟨b, y⟩ + ⟨C − Σ_{i=1}^m y_i A_i, X⟩.

This gives the dual problem

d* = sup_{y ∈ R^m} inf_{X ∈ K} L(X, y).

For the inner minimization problem to be finite, we require that ⟨C − Σ_{i=1}^m y_i A_i, X⟩ ≥ 0 for all X ∈ K; in other words, we require C − Σ_{i=1}^m y_i A_i ∈ K*. Therefore, we arrive at the dual problem

d* = max ⟨b, y⟩  such that  C − Σ_{i=1}^m y_i A_i = Z,  Z ∈ K*,  y ∈ R^m.   (D)

It is easy to see that the duality gap ⟨C, X⟩ − ⟨b, y⟩ equals the inner product of the primal and dual variables: ⟨C, X⟩ − ⟨b, y⟩ = ⟨Z, X⟩. Since ⟨Z, X⟩ ≥ 0 for any pair of primal/dual feasible points X ∈ K, Z ∈ K*, we immediately get weak duality.
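The identity ⟨C, X⟩ − ⟨b, y⟩ = ⟨Z, X⟩ is purely algebraic once X satisfies the primal equations. The following sketch checks it numerically on synthetic data (all matrices are random and illustrative; no dual feasibility of Z is claimed, only the gap identity itself):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2

# Symmetric data matrices A_i and a psd primal point X in K = S^n_+
A = [rng.standard_normal((n, n)) for _ in range(m)]
A = [(M + M.T) / 2 for M in A]
X = rng.standard_normal((n, n)); X = X @ X.T
b = np.array([np.trace(Ai.T @ X) for Ai in A])   # make X primal feasible by construction

C = rng.standard_normal((n, n)); C = (C + C.T) / 2
y = rng.standard_normal(m)
Z = C - sum(yi * Ai for yi, Ai in zip(y, A))     # dual slack Z = C - sum_i y_i A_i

# Duality gap <C, X> - <b, y> equals <Z, X>
gap = np.trace(C.T @ X) - b @ y
assert np.isclose(gap, np.trace(Z.T @ X))
```

When additionally X ∈ K and Z ∈ K*, the right-hand side is nonnegative, which is exactly the weak duality statement above.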
Clearly, if the duality gap is zero for a pair of primal/dual feasible points X ∈ K and (y, Z) ∈ R m × K * , then X is optimal for (P) and (y, Z) is optimal for (D). The converse is, however, not true in general: a positive duality gap may exist, or the optimal value of (P) or (D) may not be attained. Examples for this phenomenon in second order cone programming can be found in [3] or in [13,Section 2.4.1]. For the SDP case, examples can be found in [55], and a thorough analysis of this behavior can be found in [89].
In order to get strong duality, we need constraint qualifications: A point X is called strictly feasible for (P) if X is feasible for (P) and X ∈ int K. A pair (y, Z) is called strictly feasible for (D) if (y, Z) is feasible for (D) and Z ∈ int K * . If such points exist, then we say that the problem fulfills the primal (resp. dual) Slater condition.
Note that strict feasibility can always be enforced by considering the so called skew-symmetric embedding of the original problem, see [32]. Assuming strict feasibility gives us strong duality:

Theorem 1.2 (Strong Duality Theorem). Assume that problem (D) has a strictly feasible solution (y, Z). Then the primal and dual optimal values are equal, p* = d*, and if p* < +∞, then p* is attained, i.e., there exists a primal feasible solution X* with p* = ⟨C, X*⟩.
Conversely, assume that problem (P) has a strictly feasible solution X. Then the primal and dual optimal values are equal, p* = d*, and if d* > −∞, then d* is attained, i.e., there exists a dual feasible solution (y*, Z*) with d* = ⟨b, y*⟩.
A proof of this theorem, along with a thorough discussion of conic duality, can be found for example in [13]. It has been shown in [43] that the Slater condition (and hence strong duality) is a generic property of conic problems, which, loosely speaking, means that Slater's condition is fulfilled (and hence strong duality holds) for almost all feasible conic problems.
Note that the existence of strictly feasible points is important not only for theoretical purposes to ensure strong duality, but also because many optimization algorithms require this property. In the absence of strictly feasible points, a solver may not terminate or may produce a "solution" with no useful meaning. This is a feature that distinguishes general conic optimization from linear programming. Consequently, very careful modelling is needed, since often the existence of strictly feasible points can be guaranteed if the problem is modelled in a proper way. We will return to this point in Section 5.
We stress that a constraint qualification is unnecessary if the cone K is polyhedral, as in linear programming where K = R^n_+. The reason why a positive duality gap may occur in general conic programming lies in the geometry of the problem: it happens if the feasible set is contained in a face of the cone. Two approaches have been developed to tackle conic problems that fail to fulfill a constraint qualification: (i) Facial reduction attempts to identify the so called minimal cone F_min for problem (P), such that problem (P) with K replaced by F_min is strictly feasible and has the same optimal solution as (P). This facial reduction technique goes back to Borwein and Wolkowicz [22,23]. (ii) Other approaches (e.g. [96]) work on the dual side and construct an extended dual which achieves strong duality without assuming a constraint qualification. A good exposition of these two approaches can be found in [88].

LP, SOCP, and SDP
Depending on which cone K is considered, conic optimization includes various classes of problems: If K = R^n_+, then (P) is a linear problem, a well studied class which appears in numerous applications. LPs are used to model not only straightforward linear constraints, but also constraints involving ℓ_1- or ℓ_∞-norms or absolute values.
If the cone K in (P) is the second order cone, then (P) is called a second order cone problem (SOCP). The second order cone in R^n (sometimes also called Lorentz cone or ice cream cone) is defined as

L_n := {(x_0, x̄) ∈ R × R^{n−1} | x_0 ≥ ‖x̄‖_2}.

It appears in optimization problems involving Euclidean norms: for example, the constraint ‖Ax + b‖_2 ≤ c^T x + d can be written as (c^T x + d, Ax + b) ∈ L_{n+1}. This is often used in robust optimization when an ellipsoidal uncertainty set is used [12]. Other applications of SOCP can be found in [3,73]. Certain risk measures in stochastic optimization may also lead to optimization problems over the so called p-order cone L_n^p := {(x_0, x̄) ∈ R × R^{n−1} | x_0 ≥ ‖x̄‖_p}, see [102]. A third prominent setting is semidefinite programming (SDP), where K is taken to be the cone of symmetric positive semidefinite matrices S_n^+ := {X ∈ R^{n×n} | X = X^T, X ⪰ 0}. SDPs are used to model problems with linear matrix inequalities. They appear in eigenvalue optimization and control theory, see [101,55]. Arguably the two most important areas of application for SDP are robust and combinatorial optimization. For an in-depth discussion of SDP in robust optimization, we refer to the book [12] and the recent survey paper [106]. The role of SDP in relaxations of combinatorial problems will be covered in more detail below.
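The translation of a Euclidean norm constraint into second order cone membership can be sketched as follows (the data A, b, c, d and the point x are illustrative, not from the text):

```python
import numpy as np

def in_lorentz_cone(v, tol=1e-12):
    """Check membership (v_0, v_bar) in the Lorentz cone, i.e. v_0 >= ||v_bar||_2."""
    return v[0] >= np.linalg.norm(v[1:]) - tol

# The constraint ||A x + b||_2 <= c^T x + d, stacked into a single vector
# whose membership in a second order cone expresses the constraint:
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
c = np.array([3.0, 0.0])
d = 5.0
x = np.array([1.0, 1.0])

v = np.concatenate(([c @ x + d], A @ x + b))
assert in_lorentz_cone(v) == (np.linalg.norm(A @ x + b) <= c @ x + d)
```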
The cones R^n_+, L_n, and S_n^+ are self-dual, whereas the dual of L_n^p is L_n^q with q such that 1/p + 1/q = 1. We mention that R^n_+, L_n, and S_n^+ are instances of so called symmetric cones that can be studied in the unifying framework of Euclidean Jordan algebras, see [47] and references therein.
These three problem classes have been studied for decades because of their countless applications and because they can be solved efficiently: it has been shown in the vast literature on interior point methods pioneered by [83] (see also [97]) that these algorithms are able to solve LPs, SOCPs, and SDPs in polynomial time. A different class of algorithms that solves SDPs is conic bundle methods, see [57,56]. Numerous software implementations are available, and we refer the reader to Hans Mittelmann's website [79] for an up-to-date list of the various packages.

Variants of SDP and SOCP
So far, we discussed linear conic optimization problems. However, the enormous modelling power of semidefinite and second order cone programming only unfolds if we allow for nonlinearities or integer variables: Mixed integer conic optimization problems are linear conic problems with the additional constraint that some of the variables are integer valued, i.e., X_ij ∈ Z for all (i, j) in some index set J. Sometimes binary constraints X_ij ∈ {0, 1} for (i, j) ∈ J are used instead.
Nonlinear conic problems are nonlinear problems that involve a cone constraint, mostly a semidefiniteness constraint (K = S_n^+) or an SOCP constraint (K = L_n). Naturally, mixed integer nonlinear conic problems have been studied likewise.
It would be beyond the scope of this paper to discuss the developments in mixed integer nonlinear conic optimization here. We mention just a few applications: Nonlinear SOCPs appear for example in facility location [28]. Mixed integer SOCPs appear in engineering (e.g. turbine balancing problems), in service system design [51], in finance (e.g. cardinality-constrained portfolio optimization), or in combinatorial problems like the Euclidean Steiner Tree Problem, see [49] and references therein. Solution approaches for these problems include semismooth Newton methods [28], outer approximation algorithms [41], cutting plane algorithms [9,40,61], and Branch-and-Bound algorithms [49].
Mixed integer SDPs have applications in truss topology optimization [49], in certain clustering problems [4], or in sparse principal component analysis [71]. References to numerous fields of application of nonlinear SDPs in engineering, (robust) control, finance and others can be found in [5] and [105].
Solution algorithms for these problems include augmented Lagrangian methods, sequential SDP methods, and primal-dual interior point methods, see e.g., [5,62,105]. For pointers to software implementations, we refer again to Hans Mittelmann's website [79].

Conic reformulations of quadratic problems
Conic optimization problems play a particularly fruitful role in the theory of quadratic and binary quadratic optimization problems. This is accomplished by a technique called lifting, which was pioneered by [99] and [75]. The main idea can be seen as follows: consider a quadratic expression x^T Qx with a symmetric matrix Q ∈ R^{n×n} and x ∈ R^n. If we introduce a new variable X ∈ R^{n×n} to represent the rank-1 matrix xx^T, then we get

x^T Qx = ⟨Q, xx^T⟩ = ⟨Q, X⟩.   (1)

By this technique, quadratic terms in x ∈ R^n become linear terms in X ∈ R^{n×n}. Since many optimization problems considered in the sequel contain nonnegativity constraints x ≥ 0, this leads to the definition of a convex matrix cone that turns out very useful for modelling purposes: The cone of completely positive matrices is defined as

CP_n := conv{xx^T | x ∈ R^n_+},

and its dual cone, the cone of copositive matrices, is defined as

COP_n := {A ∈ S_n | x^T Ax ≥ 0 for all x ∈ R^n_+}.

For ease of notation, we will omit the index n in notations like COP_n or CP_n unless it is necessary to stress the dimension. Both COP and CP have been studied for decades in the linear algebra literature, see [15] and references therein. They have numerous interesting properties but are still not fully understood, cf. [14]. Note that the two cones are given in different form: CP is given by its extreme rays, which are precisely the rank-1 matrices xx^T with x ≥ 0, whereas COP is given as the solution set of (infinitely many) inequalities. This fact plays a role when considering approximations of these cones, see Section 3. A characterization of the extremal rays of COP_n has only been given for n ≤ 6, cf. [1]. Likewise, only limited knowledge is available about the facial structure of CP and COP, cf. [34]. The earliest uses of these cones in optimization were a paper by Preisig [94], who studied a particular fractional quadratic problem, and one by Quist, de Klerk, Roos and Terlaky [95], who were the first to introduce a conic optimization perspective while deriving relaxations for general quadratic optimization problems. Bomze et al.
[20] introduced the term "copositive optimization" and showed for the first time the equivalence of a nonconvex quadratic optimization problem and a linear problem over CP resp. COP: They considered standard quadratic optimization problems, i.e., nonconvex quadratic problems over the standard simplex ∆ := {x ∈ R^n | x ≥ 0, e^T x = 1}, where e ∈ R^n denotes the all-ones vector. Given a symmetric matrix Q ∈ R^{n×n}, a standard quadratic problem is of the form

min x^T Qx  such that  x ∈ ∆.   (StQP)

In spite of its simple structure, (StQP) is an NP-hard problem if Q has a negative eigenvalue, see [86]. Alternatively, NP-hardness of (StQP) can be seen from the fact that the maximum clique problem can be formulated as an (StQP): consider a graph with n vertices, denote its adjacency matrix by A and its clique number by ω, and define J := ee^T. It was shown by Motzkin and Straus [80] that

min {x^T (J − A) x | x ∈ ∆} = 1/ω,

or equivalently max {x^T A x | x ∈ ∆} = 1 − 1/ω. The max clique problem is a particularly difficult NP-hard problem, and even computing an approximation of any reasonable quality is NP-hard [54]. We will see below how copositive optimization can be used to tackle this and other NP-hard problems. By squaring the constraint e^T x = 1 in (StQP) and applying the lifting transformation outlined in (1), it is easy to see that the following problem is a relaxation of (StQP):

min {⟨Q, X⟩ | ⟨J, X⟩ = 1, X ∈ CP},   (3)

as is its dual problem

max {λ | Q − λJ ∈ COP, λ ∈ R}.   (4)

It is easy to verify that Slater's condition and hence strong duality holds for (3)-(4). Since the objective function of (3) is linear and the feasible set is convex, the optimal value is attained at an extreme point of the feasible set; these extreme points can be shown to be the matrices of the form xx^T with x ∈ ∆, cf. [20]. This implies that (3) and (4) are not merely relaxations but exact reformulations of (StQP) in the sense that all three problems have the same optimal value. The optimal solutions of (StQP) and (3) fulfill the following relation: if x* is optimal for (StQP), then X* := x*(x*)^T is optimal for (3).
Conversely, if X* is optimal for (3), then it can be decomposed as X* = Σ_i λ_i x_i x_i^T with λ_i > 0, Σ_i λ_i = 1 and x_i ∈ ∆, and each x_i is an optimal solution of (StQP). These reformulations (3)-(4) are interesting because they show that the NP-hard problem (StQP) can be reformulated equivalently as a linear problem over the convex cones CP or COP. In these formulations, all local minima vanish, and the complexity of the problem is entirely moved into the cone constraint. This indicates that CP and COP must be intractable. Indeed, it was shown in [36] that checking membership in CP is NP-hard. Checking membership in COP is co-NP-complete, as was shown in [81]. Whether or not checking membership in CP is also in NP is still one of the many open problems related to these cones, cf. [14].
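The lifting identity (1) and the Motzkin-Straus theorem can be illustrated numerically. In the sketch below, the graph (a triangle with one pendant vertex) is our own small example, not from the text; the uniform weighting of a maximum clique attains the Motzkin-Straus value:

```python
import numpy as np

# Lifting: x^T Q x equals the Frobenius inner product <Q, X> with X = x x^T.
rng = np.random.default_rng(1)
Q = rng.standard_normal((4, 4)); Q = (Q + Q.T) / 2
z = rng.standard_normal(4)
assert np.isclose(z @ Q @ z, np.trace(Q.T @ np.outer(z, z)))

# Motzkin-Straus: max_{x in Delta} x^T A x = 1 - 1/omega.
# Example graph: triangle {0, 1, 2} plus a pendant vertex 3 attached to 2,
# so the clique number is omega = 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
omega = 3
x = np.array([1/3, 1/3, 1/3, 0.0])   # uniform weights on the maximum clique
assert np.isclose(x @ A @ x, 1 - 1/omega)
```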
Returning to the maximum clique problem for a graph with adjacency matrix A, it follows from [20] that the clique number ω equals the optimal value of the following copositive problem:

ω = min {λ | λ(J − A) − J ∈ COP}.

Many other graph parameters have a representation as a copositive or completely positive problem. We refer to [42] for references. Copositive optimization experienced a breakthrough with Burer's 2009 paper [25]. He showed that every quadratic problem with linear and binary constraints can be rewritten as such a problem. More precisely, he showed that a quadratic binary problem of the form

min x^T Qx + 2c^T x  such that  a_i^T x = b_i (i = 1, ..., m),  x ≥ 0,  x_j ∈ {0, 1} (j ∈ B),   (5)

with Q ∈ S_n, c, a_i ∈ R^n (i = 1, ..., m), b ∈ R^m, and B ⊆ {1, ..., n}, can equivalently be reformulated as the following completely positive problem:

min ⟨Q, X⟩ + 2c^T x  such that  a_i^T x = b_i and ⟨a_i a_i^T, X⟩ = b_i^2 (i = 1, ..., m),  x_j = X_jj (j ∈ B),  [1 x^T; x X] ∈ CP_{n+1},

provided that (5) satisfies the so-called key condition, i.e., a_i^T x = b_i for all i and x ≥ 0 imply x_j ≤ 1 for all j ∈ B. As noted by Burer, this condition can be enforced without loss of generality. Doing so may, however, have consequences when relaxing the cone constraint [21,60,18].
Similar techniques can be used to derive copositive or completely positive formulations for problems involving quadratic constraints, or for problems in which the constraint x ∈ R^n_+ in (5) is replaced by another closed convex cone. This leads to reformulations involving so called generalized copositive and completely positive cones: given a closed convex cone K ⊆ R^n, one can define

CP_K := conv{xx^T | x ∈ K},

and its dual cone of generalized copositive matrices,

COP_K := {A ∈ S_n | x^T Ax ≥ 0 for all x ∈ K}.

These cones were introduced in [95] and studied also in [44]. As shown by Burer [26], the problem where the nonnegativity constraints in (5) are replaced by the constraint x ∈ K is (under mild conditions) equivalent to a linear conic program over the cone of matrices which are completely positive over R_+ × K, i.e., over the cone conv{yy^T | y ∈ R_+ × K}. Eichfelder and Povh [45,35] generalize this even further to the case where K is an arbitrary set, and they also give a formulation for problems involving one quadratic constraint. When considering quadratic constraints, reformulating the problem as a conic problem becomes more involved. Burer [25] already considered certain special cases, namely binary constraints (which can be viewed as quadratic equations x_i^2 = x_i) and complementarity constraints. For more general quadratically constrained quadratic problems, similar reformulations have been obtained: consider a quadratically constrained quadratic problem of the form

min x^T Q_0 x + 2c_0^T x  such that  x^T Q_i x + 2c_i^T x = b_i (i = 1, ..., m),   (QCQP)

with Q_i ∈ S_n, c_i ∈ R^n, and b ∈ R^m. Burer and Dong [27] show two different ways of formulating a (QCQP) as such a generalized completely positive problem over CP_K: one where K is a direct product of R^n_+ and second-order cones, and another where K is the direct product of R^n_+ and semidefinite cones (viewed as vectors by stacking the columns on top of each other). Bai et al. [10] and Arima et al. [8] derive similar formulations under milder assumptions.
We remark that a different generalization of CP is the cone of completely positive semidefinite matrices, i.e., the cone consisting of all n × n matrices that admit a Gram representation by positive semidefinite matrices. This cone appears when studying quantum analogues of graph parameters like the stability or chromatic numbers. We refer to [70] for an in-depth discussion.
It goes without saying that generalized completely positive and copositive cones are even harder to work with than CP or COP. The appeal of the formulations discussed above lies in the fact that, by this technique, difficult NP-hard problems can be reformulated as linear problems over a convex cone. Hence these reformulations are convex problems which do not possess local minima, and the hardness of the problem is completely captured by the cone constraint. Therefore, any progress made in understanding the cones can be used to help solve a whole range of different problems. As a first step, the approximation schemes for COP and CP discussed in Section 3 can be extended to COP_K and CP_K. However, more research on algorithmic approaches for problems over generalized copositive and completely positive cones is needed to make these approaches work numerically for larger problems.

Extensions: Polynomial optimization and infinite dimensional conic problems
It is easy to see that any polynomial optimization problem can be rewritten as a quadratic problem by introducing extra variables and constraints. For example, by defining an extra variable and constraint y = x_j x_k, the cubic term x_i x_j x_k becomes the quadratic term x_i y. So the reformulations discussed above can in principle be applied to polynomial problems as well. A different line of research has worked with the cone of completely positive tensors. This concept was originally introduced by Dong [39] and constitutes another natural extension of CP. Peña et al. [91] consider optimization problems involving n-variate polynomials and show that under certain assumptions these can be reformulated as linear problems over the cone of completely positive tensors of order d and dimension n + 1. They also show that in the case of a compact feasible set, order d = 4 is sufficient. The approach has been extended in [63,104]. We also refer to [7] for more discussion on the connection between conic and polynomial optimization. We mentioned above that the maximum clique problem (and, equivalently, the stable set problem) on a graph with vertex set V = {1, ..., n} can be formulated as an (StQP) and consequently as a copositive or completely positive problem (3)-(4). This can be generalized to infinite graphs, i.e., to the setting where the vertex set is not finite but a compact metric space V equipped with a probability measure ω. The problem of determining the stability number of an infinite graph appears e.g. in the kissing number problem [37] and other packing problems, see [33]. Dobre et al. [37] generalized the concept of copositive matrices to this infinite dimensional setting by defining copositive kernels: a kernel is a continuous symmetric function K : V × V → R, and it is called copositive if ∫_V ∫_V K(u, v) f(u) f(v) dω(u) dω(v) ≥ 0 for all nonnegative continuous functions f on V. It can be shown that the cone of copositive kernels is independent of the choice of ω.
This cone as well as its dual (the cone of completely positive measures) can be used to derive exact copositive and completely positive reformulations of the stability number problem for infinite graphs, see [37]. Since these cones are intractable, approximations have been proposed in [65,64]. This in turn has been used to derive good bounds for the underlying problems.

Approximation hierarchies for COP and CP
Since the cones COP and CP are computationally intractable, approximations have to be used in order to solve an optimization problem over one of these cones. As outlined in (1), the cone CP was motivated by introducing a symmetric matrix X to represent the rank-1 matrix xx^T. So a first straightforward relaxation is to replace the constraint X = xx^T by X ⪰ xx^T (meaning that X − xx^T ∈ S_n^+), which by the Schur complement lemma is equivalent to the condition that the block matrix [1 x^T; x X] is positive semidefinite. This relaxation goes back to Shor [99] and corresponds to the simple fact that S_n^+ ⊆ COP_n for any n. It is interesting to note that the SDP relaxation of a quadratic problem corresponds to the Lagrangian dual of that problem, whereas considering partial Lagrangian duals (i.e., dualizing the problem only with respect to a subset of the constraints) leads to various copositive relaxations, cf. [17]. The Shor relaxation can be improved by adding more constraints to the SDP, or by using reformulation-linearization techniques, yielding stronger SDP relaxations. This has been discussed in detail in [11].
Shor's approximation can be strengthened by using better approximations of COP: Denote by N_n the set of symmetric entrywise nonnegative n × n matrices. Then it is obvious from the definition that N_n ⊆ COP_n. We therefore get that N_n + S_n^+ ⊆ COP_n, and by duality CP_n ⊆ N_n ∩ S_n^+. Interestingly, both inclusions are equalities for n ≤ 4 and are strict for n ≥ 5, cf. [78]. Matrices in N_n ∩ S_n^+ are sometimes called doubly nonnegative (both the entries and the eigenvalues are nonnegative). This cone has been frequently used to obtain bounds for certain combinatorial problems, with the most prominent case being the Lovász-Schrijver bound ϑ⁺(G) (sometimes called ϑ′(G)) on the clique number of a graph G, see [98].
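Checking membership in the doubly nonnegative cone N_n ∩ S_n^+ is straightforward, since both defining conditions are easy to test. A minimal sketch (the helper name and example matrices are ours):

```python
import numpy as np

def is_doubly_nonnegative(A, tol=1e-10):
    """A in N_n ∩ S_n^+: entrywise nonnegative and positive semidefinite."""
    A = np.asarray(A, dtype=float)
    entrywise = np.all(A >= -tol)
    psd = np.all(np.linalg.eigvalsh((A + A.T) / 2) >= -tol)
    return bool(entrywise and psd)

# A completely positive matrix B B^T with B >= 0 is always doubly nonnegative,
# reflecting the inclusion CP_n ⊆ N_n ∩ S_n^+.
B = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]])
assert is_doubly_nonnegative(B @ B.T)
# A psd matrix with a negative entry is not doubly nonnegative.
assert not is_doubly_nonnegative(np.array([[1.0, -0.5], [-0.5, 1.0]]))
```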
In order to get better approximations of COP and CP, a number of techniques have been developed which often lead to so called approximation hierarchies, i.e., monotone sequences of inner or outer approximations of COP or CP which are, in some sense, exact in the limit. The approximating cones are constructed in such a way that optimizing over them amounts to solving an LP, an SOCP, or an SDP, all of which can be done in polynomial time. Several of these hierarchies have been proposed, and we discuss the most important ones next. Note that these hierarchies were originally designed to approximate either COP or CP. However, it should be clear that any inner (resp. outer) approximation hierarchy of one cone yields, by duality, an outer (resp. inner) approximation hierarchy for the dual cone.

Inner approximation hierarchies for COP
Parrilo [87] was the first to propose a hierarchy approximating COP from the interior. The basic idea is to reformulate the copositivity condition as a nonnegativity condition for certain polynomials, and then to use the sufficient condition that a polynomial is nonnegative if it can be represented as a sum of squares (sos) of other polynomials. Suppose we are given a matrix A ∈ S_n and we would like to determine whether or not A ∈ COP_n. To this end, consider the polynomial

P_A(x) := Σ_{i=1}^n Σ_{j=1}^n a_ij x_i^2 x_j^2,   (7)

and observe that A ∈ COP_n if and only if P_A(x) ≥ 0 for all x ∈ R^n. A sufficient condition for this is that P_A(x) is sos. Parrilo showed that the set of matrices A for which P_A(x) is sos equals S_n^+ + N_n. Moreover, he was able to refine this by using a result by Pólya [92] and considering higher order polynomials. For any r ∈ N, define the cone

K^r := {A ∈ S_n | P_A(x) · (Σ_{i=1}^n x_i^2)^r has an sos decomposition}.

Parrilo showed that

K^0 ⊆ K^1 ⊆ · · · ⊆ COP_n  and  int(COP_n) ⊆ ∪_{r∈N} K^r,

so the cones K^r approximate COP from the interior. The sos condition can be written as a system of linear matrix inequalities (LMIs), and therefore optimizing over K^r amounts to solving an SDP. However, it should be noted that for increasing values of r, the size of these SDPs increases rapidly, resulting in problems that are beyond the range of current SDP solvers even for moderate values of r.
Ahmadi and Majumdar [2] developed a more general theory for nonnegativity of polynomials which, when applied in our context, boils down to relaxing the sos condition by requiring that P_A(x), resp. P_A(x) · (Σ_{i=1}^n x_i^2)^r, has a decomposition as a sum of squares of binomials. This is a more restrictive sufficient condition for nonnegativity of P_A(x), but the advantage is that it can be verified by solving an SOCP. They also consider scaled versions and obtain hierarchies that they call rDSOS_n and rSDSOS_n.
An alternative sufficient condition for nonnegativity of a polynomial is that all of its coefficients are nonnegative. Exploiting this idea, de Klerk and Pasechnik [31], cf. also Bomze and de Klerk [19], define the cones

C^r := {A ∈ S_n | P_A(x) · (Σ_{i=1}^n x_i^2)^r has nonnegative coefficients}.
They showed that

C^r ⊆ C^{r+1} ⊆ COP_n for all r ∈ N,  and  int(COP_n) ⊆ ∪_{r∈N} C^r.

Each of the cones C^r is polyhedral, so optimizing over one of them amounts to solving an LP.
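The coefficient test defining C^r can be carried out with exact polynomial arithmetic in the squared variables z_i = x_i^2. The helper `in_Cr` below and the 2 × 2 example matrix are our own sketch, not from [31]:

```python
from collections import defaultdict

def in_Cr(A, r):
    """Check A in C^r: P_A(x) * (sum_i x_i^2)^r has only nonnegative
    coefficients, where P_A(x) = sum_{i,j} a_ij x_i^2 x_j^2.
    Polynomials in z_i = x_i^2 are stored as dicts: exponent tuple -> coefficient."""
    n = len(A)
    p = defaultdict(float)
    for i in range(n):
        for j in range(n):
            e = [0] * n
            e[i] += 1; e[j] += 1
            p[tuple(e)] += A[i][j]        # coefficient of z_i * z_j
    for _ in range(r):                     # multiply r times by (z_1 + ... + z_n)
        q = defaultdict(float)
        for e, c in p.items():
            for k in range(n):
                e2 = list(e); e2[k] += 1
                q[tuple(e2)] += c
        p = q
    return all(c >= 0 for c in p.values())

# A = [[1, -0.4], [-0.4, 1]] is psd, hence copositive. It has a negative
# entry, so it is not in C^0 = N_2, but it enters the hierarchy at r = 1.
A = [[1.0, -0.4], [-0.4, 1.0]]
assert not in_Cr(A, 0)
assert in_Cr(A, 1)
```

For this A, multiplying P_A(x) = x_1^4 − 0.8 x_1^2 x_2^2 + x_2^4 by (x_1^2 + x_2^2) yields the coefficients 1, 0.2, 0.2, 1, which are all nonnegative.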
Peña et al. [90] refined the above approaches and derived a hierarchy of cones Q r which in a sense sits between C r and K r , i.e., it fulfills C r ⊆ Q r ⊆ K r for all r ∈ N. These cones can be described by LMIs as well, so optimizing over Q r is again an SDP.
Each of the above hierarchies provides a uniform inner approximation of COP. However, this may not be desirable when considering a specific optimization problem over COP. Rather, one would like to obtain a good approximation of COP in the vicinity of the optimal solutions, whereas in other parts of the feasible set a coarse approximation is sufficient. This idea gave rise to the approach by Bundfuss and Dür [24]: It is easy to see that the definition of COP is equivalent to

COP = {A ∈ S | x^T Ax ≥ 0 for all x ∈ ∆}.

Now [24] consider partitions P = {S^1, ..., S^m} of ∆ into subsimplices and give conditions ensuring nonnegativity of x^T Ax over each S^i. Let S = conv{v_1, ..., v_n} ⊆ ∆ be such a simplex. Then every x ∈ S can be written as a convex combination x = Σ_{i=1}^n λ_i v_i with Σ_{i=1}^n λ_i = 1 and λ_i ≥ 0 for all i. Nonnegativity of x^T Ax over S then means that

Σ_{i=1}^n Σ_{j=1}^n λ_i λ_j v_i^T A v_j ≥ 0 for all admissible λ.   (8)

Since λ_i ≥ 0 by construction, a sufficient condition for (8) is that v_i^T A v_j ≥ 0 for all i, j. Note that this constitutes a system of linear inequalities in the entries of A. Therefore,

{A ∈ S | v^T A w ≥ 0 for all vertices v, w of the same simplex in P}

is a polyhedral inner approximation of COP. It is shown in [24] how the partition P can be refined in order to obtain a sequence of inner approximations that can either be tailored to yield a uniform approximation of COP or an adaptive approximation with good quality in the vicinity of the optimal solution of the underlying copositive optimization problem.
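The vertex condition from [24] is easy to evaluate: for a simplex with vertex matrix V (rows are vertices), the products v_i^T A v_j are the entries of V A V^T. A minimal sketch (the helper name and the examples are ours):

```python
import numpy as np

def certifies_copositive_on_simplex(A, vertices, tol=1e-12):
    """Sufficient condition: if v_i^T A v_j >= 0 for all vertex pairs of a
    simplex S = conv{v_1, ..., v_n}, then x^T A x >= 0 on S."""
    V = np.asarray(vertices, dtype=float)   # rows are the vertices
    return bool(np.all(V @ A @ V.T >= -tol))

# On the standard simplex itself (unit vectors as vertices), the condition
# reduces to entrywise nonnegativity of A, i.e. membership in N_n:
A = np.array([[1.0, 0.5], [0.5, 2.0]])
assert certifies_copositive_on_simplex(A, np.eye(2))

# The condition is only sufficient: [[1, -1], [-1, 1]] is psd and hence
# copositive, but it fails the vertex test without refining the partition.
B = np.array([[1.0, -1.0], [-1.0, 1.0]])
assert not certifies_copositive_on_simplex(B, np.eye(2))
```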

Outer approximation hierarchies for COP
Since we can write COP = {A ∈ S | x^T Ax ≥ 0 for all x ∈ ∆}, outer approximations of COP can be obtained by picking suitable subsets I ⊆ ∆ and considering {A ∈ S | x^T Ax ≥ 0 for all x ∈ I}. One option, studied by Yıldırım [107], is to consider regular grids δ(r) of rational points on the unit simplex, whose mesh becomes finer as r grows (see [107] for the precise definition). Then the set

O^r := {A ∈ S | x^T Ax ≥ 0 for all x ∈ δ(r)}

is clearly a polyhedral outer approximation of COP for any r ∈ N, and one can show that O^0 ⊃ O^1 ⊃ ... ⊃ COP and COP = ∩_{r∈N} O^r. This approximation scheme gives again uniform approximations of COP and allows for an exact assessment of the quality of the approximation. Alternatively, one can use an approach developed by Lasserre in a series of papers which makes use of the vast body of theory on positive (resp. nonnegative) polynomials and polynomial optimization. Denote by P^+_{n,d} the cone of n-variate polynomials of total degree ≤ d which are nonnegative on R^n (note that such a polynomial necessarily has even degree). Then A ∈ S_n is copositive if and only if the polynomial P_A from (7) fulfills P_A ∈ P^+_{n,4}. The Riesz-Haviland Theorem tells us that the dual of P^+_{n,d} is the so called moment cone. Exploiting this, one can obtain another hierarchy of cones approximating COP resp. CP which is a special case of the Lasserre hierarchy applied to the setting of copositivity and complete positivity. We refer to the book by Lasserre [66] and the survey by Laurent [69], which both give excellent introductions to the general moment approach for polynomial optimization. The paper by Lasserre [67] explicitly describes how to construct hierarchies of outer approximations of COP and inner approximations of CP by using this moment approach. It should be noted that these hierarchies are based on conditions that can be expressed as LMIs, and hence optimizing over these hierarchies amounts to solving SDPs.
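A grid-based outer approximation test can be sketched as follows. The mesh used here (all simplex points with coordinates that are multiples of 1/d) is one concrete choice of discretization for illustration; Yıldırım's δ(r) is defined in [107]:

```python
from fractions import Fraction

def simplex_grid(n, d):
    """All points of the unit simplex in R^n whose coordinates are
    integer multiples of 1/d (exact rational arithmetic)."""
    pts = []
    def rec(prefix, remaining, k):
        if k == 1:
            pts.append(prefix + [Fraction(remaining, d)])
            return
        for i in range(remaining + 1):
            rec(prefix + [Fraction(i, d)], remaining - i, k - 1)
    rec([], d, n)
    return pts

def in_outer_approx(A, d):
    """Polyhedral outer approximation of COP: x^T A x >= 0 on the grid."""
    n = len(A)
    return all(
        sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n)) >= 0
        for x in simplex_grid(n, d)
    )

# [[1, -1], [-1, 1]] is copositive (psd), so it passes every grid test;
# [[0, -1], [-1, 0]] is not copositive and is cut off as soon as an
# interior grid point such as (1/2, 1/2) appears.
assert in_outer_approx([[1, -1], [-1, 1]], 3)
assert not in_outer_approx([[0, -1], [-1, 0]], 2)
```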
A third option is the adaptive approximation approach by Bundfuss and Dür [24]: analogously to the inner approximation described above, a simplicial partition of ∆ gives rise to a polyhedral outer approximation of COP via nonnegativity conditions at the partition points. This yields a hierarchy of polyhedral approximations that can again be tailored to either yield a uniform outer approximation of COP or a finer approximation in the vicinity of the set of optimal solutions but only a coarse approximation in the remaining parts.

Inner approximation hierarchies for CP
Recall that CP_n = conv{xx^T | x ∈ R^n_+} = cone conv{xx^T | x ∈ ∆}. Therefore, inner approximations of CP can be constructed analogously to outer approximations of COP, namely by choosing suitable subsets I ⊆ ∆ and considering C(I) := cone conv{xx^T | x ∈ I}. A thorough treatment investigating properties of the approximation in dependence on the set I is given in [108].
If the set I stems from a finite discretization of ∆, then C(I) is polyhedral. A different approach was developed by Gouveia et al. [52] based on similar work in [2]. They consider the cone SDD_n^+ := conv{xx^T | x ∈ R^n_+, |supp(x)| ≤ 2} ⊆ CP_n, where the support of a vector x is defined as supp(x) := {i | x_i ≠ 0}. It can be shown (see [2,52] and references therein) that A ∈ SDD_n^+ if and only if A is scaled diagonally dominant, i.e., if there exists a diagonal matrix D with positive diagonal entries such that DAD is diagonally dominant. From the definition we get that A ∈ SDD_n^+ if and only if A can be written as A = Σ_{i<j} M^{ij}, where the M^{ij} are symmetric nonnegative and positive semidefinite matrices whose entries are zero everywhere except at the positions ii, ij, ji, jj. Observe that positive semidefiniteness of symmetric 2 × 2 matrices can be characterized by second order conditions. Indeed, we have

[a b; b c] ⪰ 0  ⟺  a + c ≥ 0 and (a + c)^2 ≥ (a − c)^2 + 4b^2,

and the latter condition is equivalent to the second order cone constraint (a + c, 2b, a − c)^T ∈ L_3. Therefore, optimizing over SDD_n^+ amounts to solving an SOCP. In [52], this approach is further refined by considering scaled variants of SDD_n^+ which can be tailored to obtain either uniform or problem-dependent approximation schemes.
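The equivalence between 2 × 2 positive semidefiniteness and the second order cone constraint can be verified directly (the helper name and the test matrices are our own small sketch):

```python
import numpy as np

def psd2x2_via_soc(a, b, c, tol=1e-12):
    """[[a, b], [b, c]] psd  <=>  (a + c, 2b, a - c) in L_3,
    i.e. a + c >= ||(2b, a - c)||_2."""
    return a + c >= np.hypot(2 * b, a - c) - tol

# Compare the SOC test with an eigenvalue-based psd check on a few matrices:
for a, b, c in [(2.0, 1.0, 1.0), (1.0, 2.0, 1.0), (0.0, 0.0, 0.0)]:
    M = np.array([[a, b], [b, c]])
    assert psd2x2_via_soc(a, b, c) == bool(np.all(np.linalg.eigvalsh(M) >= -1e-9))
```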

Examples of binary quadratic problems
In this section, we discuss a few combinatorial problems which can be formulated as binary quadratic optimization problems. These problems are typically NP-hard, so it will be useful to consider reformulations and relaxations which are tractable. It will turn out that relaxations based on conic optimization are particularly useful. We will mostly focus on relaxations in the cone of positive semidefinite matrices. Throughout this section, assume we are given an undirected graph G = (V, E) with V = {1, . . . , n} and adjacency matrix A ∈ S n .

Unconstrained binary quadratic optimization and MaxCut
An unconstrained binary quadratic optimization problem takes as input a symmetric n×n matrix Q and asks to find max{x^T Q x : x ∈ {0, 1}^n}. (10) Since x_i = x_i^2, a possible linear term in the objective function could be integrated into the main diagonal of Q, so it is not necessary to explicitly include a linear term in this model.
The MaxCut Problem is defined by an edge-weighted graph, given through its adjacency matrix A. Hence A is symmetric, but we do not impose any further restrictions on the entries of A. In particular, a_ij < 0 is possible. If [i, j] is not an edge of the graph, we set a_ij = 0. The Laplacian matrix L associated to A is defined by l_ij := −a_ij for i ≠ j, and l_ii := Σ_k a_ik.
Note that Le = 0, and if A ≥ 0, then L ⪰ 0. It is a simple exercise to verify that for y ∈ {−1, 1}^n the value of the cut defined by S := {i | y_i = 1} is given by (1/4) y^T L y. Using Le = 0 and setting x := (1/2)(y + e) ∈ {0, 1}^n, we get that (1/4) y^T L y = x^T L x. This shows that the MaxCut Problem max{(1/4) y^T L y : y ∈ {−1, 1}^n} and the binary quadratic optimization problem (10) (with Q = L) are indeed equivalent optimization problems. Lasserre [68] showed an even more general result: considering the linearly constrained binary quadratic problem min{x^T Q x : Ax = b, x ∈ {0, 1}^n}, (11) Lasserre showed that this can be reformulated as a MaxCut problem on a graph with n + 1 nodes that can be explicitly constructed from the data of the problem. So in a sense MaxCut is a canonical model for linear and quadratic binary problems. Note that (11) is a special instance of the problem (5) studied by Burer. It can therefore be formulated as a copositive problem, and then the approximation hierarchies from Section 3 can be used. Historically, linear and then semidefinite relaxations were studied before copositivity came into play. In a celebrated paper, Goemans and Williamson [50] took the following approach: since y^T L y = ⟨L, yy^T⟩, they introduced the matrix Y taking the role of yy^T. Then Y ⪰ 0 and diag(Y) = e must hold, and this yields the semidefinite relaxation max{(1/4)⟨L, Y⟩ : diag(Y) = e, Y ⪰ 0}. (12) For graphs with nonnegative edge weights, they were able to show that the optimal value of (12) is at most 13.83% higher than the optimal value of MaxCut.
In other words, the SDP relaxation has a performance guarantee of ≈ 87%.
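To make the rounding step of the Goemans–Williamson approach concrete, here is a sketch (Python/NumPy; the instance and helper names are ours). We skip the SDP solve and round the trivial unit factor V = I against random hyperplanes, which already illustrates the mechanism:

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian: L = Diag(Ae) - A."""
    return np.diag(A.sum(axis=1)) - A

def cut_value(L, y):
    """Weight of the cut induced by y in {-1,1}^n, via (1/4) y^T L y."""
    return 0.25 * y @ L @ y

def gw_round(V, rng):
    """Hyperplane rounding: V has unit rows v_i (e.g. a factor of an
    SDP solution Y = V V^T); signs against a random hyperplane give y."""
    y = np.sign(V @ rng.normal(size=V.shape[1]))
    y[y == 0] = 1.0
    return y

# toy instance: 5-cycle with unit weights (max cut = 4)
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = laplacian(A)

# round the trivial factor V = I and keep the best of a few hyperplanes
rng = np.random.default_rng(1)
best = max(cut_value(L, gw_round(np.eye(n), rng)) for _ in range(100))
```

In a real application, V would come from factorizing an (approximately) optimal solution of the SDP relaxation (12) rather than being the identity.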

Partition and Clustering problems
We briefly discuss various extensions of unconstrained binary quadratic optimization problems which lead to additional linear constraints on the binary variables.
k-cluster problems. The simplest extension of problem (10) consists in asking that exactly k of the variables in (10) are set to 1. Let A ≥ 0 be a symmetric n×n matrix. We may think of a_ij as a measure for the interaction between i and j. The problem max{(1/2) x^T A x : e^T x = k, x ∈ {0, 1}^n} asks for a subset of k vertices having maximum total pairwise interaction. Such a set may be viewed as a "cluster" in the sense that it collects a set of k items with maximum mutual interaction. This type of problem has found increased interest from applications in data mining, see for instance [48]. Various SDP relaxations have been discussed in [77]. A different application of this type of problem is related to the stable set problem. Let A be the adjacency matrix of an unweighted graph G and let k be given. Consider the minimization problem z(k) := min{(1/2) x^T A x : e^T x = k, x ∈ {0, 1}^n}.
If the optimal value fulfills z(k) > 0, then we have a proof that G has no stable set of size k, so that the stability number α(G) fulfills α(G) ≤ k − 1. This idea will be further exploited in Section 4.3.
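On tiny graphs, z(k) can be evaluated by brute force, which already illustrates how positivity certifies a bound on the stability number. A sketch (Python; helper names are ours) certifying α(C_5) ≤ 2 via z(3) > 0:

```python
from itertools import combinations
import numpy as np

def z(A, k):
    """z(k) = min (1/2) x^T A x over binary x with e^T x = k,
    by enumerating all k-subsets (small n only)."""
    n = A.shape[0]
    best = np.inf
    for S in combinations(range(n), k):
        x = np.zeros(n)
        x[list(S)] = 1.0
        best = min(best, 0.5 * x @ A @ x)  # counts edges inside S
    return best

# 5-cycle: alpha(C5) = 2, so z(2) = 0 but z(3) > 0
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
```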

Max-k-Cut
The MaxCut problem may also be seen as a very simple graph partition problem, as it asks to separate the vertices of the graph into two parts so as to maximize the weight of the edges joining the two partition blocks. It is natural to generalize this and consider vertex partitions into (at most) k partition blocks for some fixed k ≥ 2.
We represent k-partitions of V = {1, . . . , n} by 0-1 matrices X of order n×k satisfying Xe = e. This condition simply states that (Xe)_i = 1 for all i, meaning that vertex i is in exactly one partition block, namely block j in case x_ij = 1. The sum of the elements of column j of X equals the number of vertices (possibly zero) in partition block j. It is a simple exercise to verify that the total weight of edges joining vertices in distinct partition blocks is given by (1/2) trace(LXX^T). Therefore, the Max-k-Cut problem may be formulated as max{(1/2) trace(LXX^T) : Xe = e, X ∈ {0, 1}^{n×k}}.
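The identity (1/2) trace(LXX^T) for the weight of edges between distinct blocks can be verified numerically; a sketch on a random weighted instance (Python/NumPy, helper names ours):

```python
import numpy as np

def cross_weight(A, labels):
    """Total weight of edges joining vertices in distinct blocks,
    counted directly from the adjacency matrix."""
    n = A.shape[0]
    return sum(A[i, j] for i in range(n) for j in range(i + 1, n)
               if labels[i] != labels[j])

def cross_weight_via_L(A, labels, k):
    """Same quantity via (1/2) trace(L X X^T) with the 0-1
    partition matrix X (row i has a one in column labels[i])."""
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    X = np.zeros((n, k))
    X[np.arange(n), labels] = 1.0
    return 0.5 * np.trace(L @ X @ X.T)

rng = np.random.default_rng(2)
n, k = 6, 3
A = rng.random((n, n)); A = np.triu(A, 1); A = A + A.T  # random weights
labels = rng.integers(0, k, size=n)                     # random partition
```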
The k-partition problem is obtained by further constraining the partitions to have exactly k partition blocks, and by requiring that partition block j contains exactly m_j ∈ N vertices, where Σ_j m_j = n. We collect the cardinalities m_j in the vector m ∈ N^k, so that feasible partitions are represented by matrices X ∈ {0, 1}^{n×k} satisfying Xe = e and X^T e = m.
The case where all the m_j are equal is of special interest in certain telecommunication problems, and we refer to [72] for a discussion of these applications and relaxations based on semidefinite optimization.

Stable sets and graph coloring
We briefly look at formulations for the stability and the chromatic number of a graph in connection with quadratic binary optimization.
A subset S of vertices of a graph G is called stable (or independent) if the subgraph of G induced by S has no edges. The stability number α(G) denotes the cardinality of a largest stable set in G. Determining α(G) is an extremely difficult combinatorial optimization problem, cf. [54]. The following binary quadratic optimization problem determines α(G): α(G) = max{e^T x : x_i x_j = 0 for all [i, j] ∈ E(G), x ∈ {0, 1}^n}. The optimal value z(k) of the following optimization problem can be used to check whether G contains a stable set of size k; as usual, A denotes the adjacency matrix of G: z(k) := min{(1/2) x^T A x : e^T x = k, x ∈ {0, 1}^n}. (13)
If z(k) = 0, then G contains a stable set of size k, given by S = {i | x_i = 1} for an optimal x. On the other hand, z(k) > 0 shows that G has no stable set of size k and therefore α(G) < k.
Let us now turn to vertex colorings of G. A k-coloring of V (G) can be seen as a vertex partition of V (G) into k stable sets (the color classes). The chromatic number χ(G) denotes the smallest number k such that G is k-colorable. A formulation to compute χ(G) as the solution of a copositive problem was given in [53].
Expressing χ(G) as a binary optimization problem is usually done as follows. Let S = {s_1, s_2, . . .} be the collection of characteristic vectors of stable sets in G. Then χ(G) = min{Σ_i λ_i : Σ_i λ_i s_i ≥ e, λ_i ∈ {0, 1} for all i}.
This is a linear optimization problem in binary variables λ_i. Unfortunately, there may be an exponential number of them. Weakening the condition λ_i ∈ {0, 1} to 0 ≤ λ_i ≤ 1 for all i leads us to the fractional chromatic number χ_f(G): χ_f(G) = min{Σ_i λ_i : Σ_i λ_i s_i ≥ e, 0 ≤ λ_i ≤ 1 for all i}. Computing the optimal value χ_f(G) of this linear program is again known to be NP-hard, see for instance [76].
Let us now consider testing whether G contains a k-coloring for some given k. We introduce the n×k binary matrix X and require Σ_{r=1}^k x_ir = 1 for all 1 ≤ i ≤ n, which we write in a slight abuse of notation as Xe = e. This condition asks that each row of X contains exactly one entry equal to one, so that the columns of X provide a vertex partition of V(G) into (at most) k partition blocks. Since each row of X has exactly one nonzero entry, we see that Σ_{r=1}^k x_ir x_jr = 1 for some i ≠ j is only possible if i and j both belong to the same partition block r for some r ∈ {1, . . . , k}. As a consequence, (1/2)⟨X, AX⟩ counts the edges joining vertices within the same partition block, which leads to the problem z(k) := min{(1/2)⟨X, AX⟩ : Xe = e, X ∈ {0, 1}^{n×k}}. (14) In a slight abuse of notation, we use z(k) again for the optimal value of the problem with exactly k columns in X. If z(k) = 0, then the optimal X provides a partitioning of V(G) into (at most) k stable sets and therefore χ(G) ≤ k. On the other hand, z(k) > 0 implies that no k-partition exists where all partition blocks are stable sets, and therefore χ(G) > k. We will come back to SDP relaxations for both of these problems and investigate connections to the ϑ number in Section 5.2.
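On tiny graphs, the value z(k) of this coloring formulation can be computed by enumeration; the sketch below (Python/NumPy, helper names ours) certifies χ(C_5) > 2 and χ(C_5) ≤ 3:

```python
from itertools import product
import numpy as np

def z_color(A, k):
    """z(k) = min (1/2) <X, AX> over 0-1 matrices X with Xe = e,
    by brute force over all k^n vertex labelings (tiny n only).
    The value counts monochromatic edges of the labeling."""
    n = A.shape[0]
    best = np.inf
    for labels in product(range(k), repeat=n):
        X = np.zeros((n, k))
        X[np.arange(n), labels] = 1.0
        best = min(best, 0.5 * np.sum(X * (A @ X)))
    return best

# 5-cycle: chi(C5) = 3, so z(2) > 0 while z(3) = 0
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
```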

Quadratic set cover
The (linear) set cover problem is defined as follows. We are given a set C := {v_1, . . . , v_n} of n elements and a collection S of m subsets of C whose union equals C. Each subset S_i in S has cost q_i. The task is to select subsets in S such that their union is C and such that the cost of the selected subsets is minimized. This problem is one of Karp's 21 NP-complete problems.
To state this problem formally, define an n×m binary matrix A with a_ij = 1 if v_i ∈ S_j and a_ij = 0 otherwise. Row i of A indicates which subsets S_j contain v_i; column j of A is the incidence vector of subset S_j. With this, the linear set cover problem reads: min Σ_j q_j x_j such that Σ_j a_ij x_j ≥ 1 for all i = 1, . . . , n, and x ∈ {0, 1}^m.
Let us denote the largest row sum of A by f and the largest column sum of A by g.
The following approximation results go back to the 1980s; here z* denotes the optimal value of the set cover instance. Hochbaum [58] introduced a primal-dual LP-rounding heuristic which gives (in polynomial time) a feasible solution to set cover with value z ≤ f z*. Chvátal [29] proposed a greedy rounding heuristic which yields (in polynomial time) a feasible solution with value z ≤ (1 + log(g)) z*.
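A sketch of the greedy principle behind Chvátal's bound (Python; the instance and helper names are made up for illustration):

```python
def greedy_set_cover(universe, subsets, costs):
    """Chvatal-style greedy heuristic: repeatedly pick the subset
    minimizing cost per newly covered element."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        j = min((j for j in range(len(subsets)) if subsets[j] & uncovered),
                key=lambda j: costs[j] / len(subsets[j] & uncovered))
        chosen.append(j)
        uncovered -= subsets[j]
    return chosen

# small hypothetical instance with unit costs
universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}, {5}]
costs = [1.0, 1.0, 1.0, 1.0, 1.0]
sol = greedy_set_cover(universe, subsets, costs)  # a feasible cover
```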
The quadratic set cover problem differs from the linear one only in the objective function, which now may also contain quadratic terms: min Σ_{i,j} q_ij x_i x_j such that Σ_j a_ij x_j ≥ 1 for all i = 1, . . . , n, and x ∈ {0, 1}^m. It is clear that if q_ij = 0 for all i ≠ j, then we recover the linear set cover problem (as x_j = x_j^2). A good summary of complexity issues related to quadratic set cover is given by Escoffier and Hammer [46]. They relate quadratic set cover to deciding whether a graph has chromatic number 3: let G be a (nonbipartite) graph on n vertices and construct a quadratic set cover instance as follows. The ground set is V = {1, . . . , n} and we have 3n subsets S_ir with S_ir := {i} for r = 1, 2, 3, so each set consists of only one element, and we have three copies of each set. The n covering conditions ask that x_i1 + x_i2 + x_i3 ≥ 1 for all i. We may think of these constraints as asking that each vertex should receive color 1 or 2 or 3. Thus we do not allow that all three of these variables are zero, but more than one of them may be set to one. The quadratic costs are chosen to penalize adjacent vertices that share a color (all other costs are zero), so that the instance has optimal value 0 if and only if G is 3-colorable; this is the content of the theorem shown by Escoffier and Hammer [46].
As a consequence, it is NP-hard to decide whether a quadratic set cover problem has optimal value 0 or greater than 0. The covering problem in this construction is quite simple. Each set consists of only one element (S ir = {i}), and each cover constraint involves only three elements from the ground set. This problem would therefore be trivial to solve with a linear objective function.
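A brute-force sketch of this reduction as we read it (Python; the cost convention and all names are ours): the constructed instance has value 0 exactly for 3-colorable graphs.

```python
from itertools import product

def qsc_value(edges, n):
    """Optimal value of the quadratic set cover instance built from a
    graph: binary choices x[i][r] (vertex i gets color r in {0,1,2}),
    covering constraint sum_r x[i][r] >= 1, and quadratic cost 1 for
    every edge (i, j) and color r with x[i][r] = x[j][r] = 1.
    Brute force over all assignments (tiny n only)."""
    best = None
    for bits in product([0, 1], repeat=3 * n):
        x = [bits[3 * i:3 * i + 3] for i in range(n)]
        if any(sum(xi) == 0 for xi in x):
            continue  # covering constraint violated
        cost = sum(x[i][r] * x[j][r]
                   for (i, j) in edges for r in range(3))
        best = cost if best is None else min(best, cost)
    return best

triangle = [(0, 1), (1, 2), (0, 2)]                       # 3-colorable
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]  # chi(K4) = 4
```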

Quadratic assignment problem
The Quadratic Assignment Problem (QAP) asks to minimize a quadratic objective function over the set of permutation matrices. It contains many prominent NP-hard problems as special cases, see for instance [85]. We define it through three data matrices A, B and C of order n×n and assume that A and B are symmetric. The set of n×n permutation matrices is denoted by Π_n, or Π for short. The QAP then reads: min{⟨AXB + C, X⟩ : X ∈ Π}.
Note that Π is contained in the affine space E := {X ∈ R^{n×n} | Xe = X^T e = e} of all matrices having row and column sums equal to 1. Let the columns of the n×(n−1) matrix V form a basis of e^⊥, the orthogonal complement of the vector e ∈ R^n. It is well known that any X ∈ E may be written as X = (1/n) J + V M V^T, where J := ee^T and M is an arbitrary matrix of order n−1. Note in particular that V M V^T lies in the linear space of matrices having row and column sums equal to 0. Povh and Rendl [93] proposed a copositive formulation of QAP and semidefinite relaxations based upon it. To this end, it is useful to rewrite the objective function of QAP in terms of x := vec(X), where vec(X) is the vector obtained from the matrix X by stacking the columns of X on top of each other. Using the Kronecker product B ⊗ A of the matrices B and A, it is not difficult to see that ⟨AXB, X⟩ = x^T (B ⊗ A) x. Setting c := vec(C), we derive ⟨AXB + C, X⟩ = x^T (B ⊗ A) x + c^T x. We are now interested in the set P := conv{xx^T : x = vec(X), X ∈ Π}.
We have just seen that any X ∈ E satisfies x = vec(X) = (1/n) vec(J) + (V ⊗ V) vec(M). Setting W := [(1/n) vec(J), V ⊗ V] and z := (1, vec(M)^T)^T, we therefore have x = Wz and xx^T = W zz^T W^T. The definition of z implies (zz^T)_{1,1} = 1. The SDP relaxation of QAP is now obtained by allowing any semidefinite matrix R with R_{1,1} = 1 in place of zz^T. For ease of notation we introduce Y := W R W^T and get the following semidefinite relaxation of QAP: min{⟨B ⊗ A + Diag(c), Y⟩ : Y = W R W^T, R ⪰ 0, R_{1,1} = 1}. Since Y takes the role of xx^T, we may think of the n²×n² matrix Y as being partitioned into n×n blocks Y^{i,j} such that Y^{i,j} corresponds to the matrix X_{·,i} X_{·,j}^T. Since each column of X contains exactly one nonzero entry, the submatrix Y^{i,i} is 0 outside its main diagonal, i.e., (Y^{i,i})_{k,l} = 0 for all k ≠ l. Similarly, since each row of X contains exactly one nonzero entry, we conclude that diag(Y^{i,j}) = 0 for i ≠ j. We refer to [93] for further details.
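The vec/Kronecker identity underlying this reformulation is easy to verify numerically; a sketch on random symmetric data (Python/NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.normal(size=(n, n)); A = A + A.T   # symmetric, as assumed for QAP
B = rng.normal(size=(n, n)); B = B + B.T
C = rng.normal(size=(n, n))
X = rng.normal(size=(n, n))                # need not be a permutation here

x = X.flatten(order="F")                   # vec(X): stack the columns
c = C.flatten(order="F")

lhs = np.trace((A @ X @ B + C).T @ X)      # <AXB + C, X>
rhs = x @ np.kron(B, A) @ x + c @ x        # x^T (B kron A) x + c^T x
```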

Modelling linear equalities and inequalities in SDP relaxations
In this section we take a closer look at modelling issues related to combinatorial optimization problems. We stress that formulations which are equivalent in the binary setting may give different results when we move to conic relaxations. For instance, suppose we have binary variables x_i ∈ {0, 1} and we would like to express the constraint that, for a given pair i, j, at most one of the associated variables x_i and x_j is allowed to be equal to 1. This could be done either by imposing the linear inequality x_i + x_j ≤ 1 or by requiring the quadratic equation x_i x_j = 0 to hold. In the binary setting, both conditions are equivalent, but once we move to relaxations, they may yield different results. Moreover, various ways of constructing SDP relaxations have been proposed in the literature which may also lead to bounds of varying quality. We refer to [6] for a recent discussion of various SDP models related to the stable set problem.
Finally, the idea of SDP hierarchies has recently found increased scientific interest. Applied to combinatorial optimization problems, these hierarchies typically have the property that the relaxations get tighter as one moves up in the hierarchy, yielding the integer optimum at a sufficiently high level. Unfortunately, it is computationally challenging to tackle even the first few levels. If the initial problem has n binary variables, the SDP at the first level of the hierarchy is formulated in matrices of order n + 1, but already the second level uses matrices of order n², which is prohibitive once n is much larger than 100.

Lifting linear constraints
Here we focus on the first question and investigate how linear constraints may be lifted into the SDP relaxation. For given a ∈ R^n and a_0 > 0, we consider the SDP relaxation of max{x^T Q x : a^T x = a_0, x ∈ {0, 1}^n}. This is a binary quadratic optimization problem with a single linear equality constraint a^T x = a_0. How should this equation be included in the SDP relaxation?
The semidefiniteness constraint X − xx^T ⪰ 0 together with diag(X) = x immediately shows that a^T X a ≥ (a^T x)^2 = a_0^2. Since we should have equality, it seems plausible to optimize over the set F_1 := {(X, x) | X − xx^T ⪰ 0, diag(X) = x, a^T x = a_0, a^T X a = a_0^2}. Unfortunately, this construction makes feasible matrices singular, as we show in the next lemma.
Lemma 5.1. Let (X, x) ∈ F_1. Then the bordered matrix Y := [X x; x^T 1] is singular and Xa = a_0 x.
Proof. We first note that (a^T, −a_0) Y (a^T, −a_0)^T = a^T X a − 2 a_0 a^T x + a_0^2 = a_0^2 − 2 a_0^2 + a_0^2 = 0. Since Y ⪰ 0 (which is equivalent to X − xx^T ⪰ 0 by the Schur complement), this implies Y (a^T, −a_0)^T = 0, so Y is singular, and the first block row of this equation gives Xa = a_0 x.
As a consequence, we see that if we work with the set F_1 in an SDP relaxation, then the Slater condition is necessarily violated, and we already saw that this is disadvantageous both from a theoretical perspective (strong duality may not hold) and from a practical perspective (solvers may not be able to handle the problem). As an alternative, we propose the set F_2 := {(X, x) | X ⪰ 0, diag(X) = x, a^T x = a_0, Xa = a_0 x}.
We next show that the two sets are actually equal.
Lemma 5.2. F_1 = F_2.
Proof. First take (X, x) ∈ F_1. Then X ⪰ 0, diag(X) = x and a^T x = a_0, and Lemma 5.1 shows that Xa = a_0 x, so (X, x) ∈ F_2. Conversely, let (X, x) ∈ F_2. We immediately get a^T x = a_0 and a^T X a = a_0 a^T x = a_0^2. It remains to show that Y := [X x; x^T 1] ⪰ 0. Let v be in the null-space of X. Then x^T v = (1/a_0) a^T X v = 0, hence Y (v^T, 0)^T = 0. This shows that the null-space of X (extended with an additional component equal to 0) is contained in the null-space of Y. We also have Y (a^T, −a_0)^T = 0, so that X and Y have the same rank. Since X ⪰ 0, all nonzero eigenvalues of Y are positive by the interlacing property between the eigenvalues of X and Y, and therefore Y ⪰ 0.
Remark 5.3. The matrix Y above is sometimes called a flat extension of X. It is a well known fact that flat extensions of semidefinite matrices are also semidefinite.
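A quick numerical illustration of Lemma 5.1 (Python/NumPy; we take the rank-one lifting X = xx^T of a binary x feasible for an instance of our own choosing):

```python
import numpy as np

a = np.array([1.0, 1.0, 1.0, 1.0])
a0 = 2.0
x = np.array([1.0, 1.0, 0.0, 0.0])   # binary, a^T x = a0
X = np.outer(x, x)                    # rank-one lifting, diag(X) = x

# the bordered matrix Y = [X x; x^T 1]
Y = np.block([[X, x[:, None]], [x[None, :], np.ones((1, 1))]])

# Y annihilates (a, -a0), hence is singular, and Xa = a0 * x
v = np.concatenate([a, [-a0]])
```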
Next we investigate the situation where a linear term a^T x is required to be contained in some interval, say |a^T x| ≤ a_0, with a_0 > 0. The constraints X − xx^T ⪰ 0 and diag(X) = x imply a^T X a ≥ (a^T x)^2.
We conclude that the constraint a_0^2 ≥ a^T X a is at least as strong as the original inequality |a^T x| ≤ a_0. Finally, we observe that the situation is different for one-sided linear inequalities of the form a^T x ≥ a_0 with a_0 > 0. Arguing as before, we see that a^T x ≥ a_0 implies a^T X a ≥ (a^T x)^2 ≥ a_0^2, hence the original inequality a^T x ≥ a_0 is at least as strong as a^T X a ≥ a_0^2.
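The inequality a^T X a ≥ (a^T x)^2 driving both observations can be seen numerically for any X ⪰ xx^T; a sketch on random data (Python/NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
a = rng.normal(size=n)
x = rng.random(size=n)

# any X with X - x x^T >= 0, e.g. X = x x^T + P P^T
P = rng.normal(size=(n, n))
X = np.outer(x, x) + P @ P.T

# the lifted quadratic form dominates the squared linear term:
# a^T X a = (a^T x)^2 + ||P^T a||^2 >= (a^T x)^2
assert a @ X @ a >= (a @ x) ** 2
```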

Stable set and coloring relaxations
We recall the formulation of the stability number α(G) for a graph G: α(G) = max{e^T x : x_i x_j = 0 for all [i, j] ∈ E(G), x ∈ {0, 1}^n}. One of its first SDP relaxations was introduced in a seminal paper by Lovász [74]. It may be derived from nonzero binary vectors x by introducing X := (1/(x^T x)) xx^T. Note that e^T x = e^T X e holds for any such x. Let ϑ(G) := max{⟨J, X⟩ : trace(X) = 1, x_ij = 0 for all [i, j] ∈ E(G), X ⪰ 0}. As any characteristic vector x of a stable set leads to a feasible matrix X for this problem, it is clear that α(G) ≤ ϑ(G). Lovász and Schrijver [75] showed that ϑ(G) can also be obtained as the optimal value of the following SDP: ϑ(G) = max{e^T x : diag(X) = x, x_ij = 0 for all [i, j] ∈ E(G), [1 x^T; x X] ⪰ 0}. We now return to the parameter z(k) from (13) and derive upper bounds on α(G) based on it. We denote by A the adjacency matrix of our graph G. For given k ∈ N we have z(k) := min{(1/2) x^T A x : e^T x = k, x ∈ {0, 1}^n}. If z(k) > 0, then clearly G has no stable set of size k and therefore α(G) < k. Computing z(k) is an NP-complete problem, see for instance [16], so we consider the following tractable relaxation, denoted by P(t) for some t > 1 and obtained via the lifting F_2 of Section 5.1 (with a = e and a_0 = t): (P(t)) min{(1/2)⟨A, Y⟩ : diag(Y) = y, e^T y = t, Ye = ty, Y ⪰ 0, Y ≥ 0}. Note that this is a doubly nonnegative relaxation, since the constraints Y ⪰ 0, Y ≥ 0 mean that Y should be in the cone of doubly nonnegative matrices. We denote the optimal value of P(t) by val(P(t)). Note that val(P(t)) > 0 implies α(G) ≤ ⌊t⌋, so we are interested in finding the smallest t such that val(P(t)) > 0. It turns out that the answer to this question is closely related to Schrijver's refinement ϑ^+ = ϑ^+(G) (sometimes denoted by ϑ'(G)) of the original theta function ϑ(G). Before we establish this connection, we recall the following alternative formulation of ϑ^+: (S_1) ϑ^+(G) = max{trace(Y) : diag(Y) = y, y_ij = 0 for all [i, j] ∈ E(G), [Y y; y^T 1] ⪰ 0, Y ≥ 0}, where trace(Y) = e^T y. The following property of optimal solutions to (S_1) will be used later on.
Lemma 5.4. Let (Y, y) be an optimal solution of (S_1). Then Ye = ϑ^+ y.
We are now ready to show the following result.
Theorem 5.5. Let A be the adjacency matrix of a graph G. Then val(P(t)) > 0 if and only if t > ϑ^+(G).
Proof. We first consider the problem P(t) for the value t = ϑ^+ := ϑ^+(G). Take an optimal solution (Y, y) of problem (S_1), so trace(Y) = e^T y = ϑ^+. Lemma 5.4 shows that Ye = ϑ^+ y. We conclude that (Y, y) is feasible for problem P(ϑ^+). Since y_ij = 0 on E(G), the objective value of (Y, y) is 0, and we conclude that val(P(ϑ^+)) = 0.
Next, suppose that val(P(t)) = 0 and consider t' with 1 < t' < t. Let (Y, y) be optimal for P(t). It is a simple exercise to verify that a suitably scaled version of (Y, y) is feasible for P(t') with objective value 0, hence val(P(t')) = 0; combined with val(P(ϑ^+)) = 0 this gives val(P(t)) = 0 for all 1 < t ≤ ϑ^+. Finally, the definition of problem (S_1) shows that ϑ^+ is the largest possible value of the trace of a matrix Y which satisfies Y ⪰ 0, Y ≥ 0 and y_ij = 0 for all [i, j] ∈ E(G) together with the remaining constraints, and therefore val(P(t)) > 0 for any t > ϑ^+.
A similar approach can also be used to get lower bounds for the chromatic number χ(G). We recall the binary quadratic problem from (14): z(k) := min{(1/2)⟨X, AX⟩ : Xe = e, X ∈ {0, 1}^{n×k}}.
If z(k) > 0 for some given k, then χ(G) > k. As before, we need a tractable relaxation for this problem. It is obtained by first extending X with an additional row of all ones: we introduce X̂ := [X; e^T] and observe that X̂X̂^T = [XX^T e; e^T k], using Xe = e and e^T e = k.
The main diagonal of the matrix XX^T clearly equals the all-ones vector, because each row of the 0-1 matrix X has exactly one entry equal to 1. A relaxation is obtained by allowing arbitrary matrices Y instead of XX^T. We get (P(t)) min{(1/2)⟨A, Y⟩ : [Y e; e^T t] ⪰ 0, diag(Y) = e, Y ≥ 0}.
From val(P(t)) > 0 we may conclude that χ(G) ≥ ⌈t⌉. Thus we would like to find the largest value t such that val(P(t)) > 0. It turns out that the strengthening of the ϑ number towards the chromatic number proposed by Szegedy [100] provides the answer to this question. The parameter ϑ^−(G), which may also be defined via the complement graph of G, was introduced by Szegedy as a lower bound on the chromatic number of G: ϑ^−(G) := min{t : [Y e; e^T t] ⪰ 0, diag(Y) = e, Y ≥ 0, y_ij = 0 for all [i, j] ∈ E(G)}.
Theorem 5.6. For a given graph G we have val(P(t)) > 0 if and only if t < ϑ^−(G).
Proof. We first consider problem P(t) for t = ϑ^−(G). Let Y be an optimal solution of the SDP defining ϑ^−(G). Then Y is also feasible for P(ϑ^−(G)) with objective value 0, so val(P(ϑ^−(G))) = 0. The solution Y remains feasible for any t' > t = ϑ^−(G), so that val(P(t')) = 0 also in this case. Finally, the definition of ϑ^−(G) shows that for any t < ϑ^−(G), the system [Y e; e^T t] ⪰ 0, diag(Y) = e, Y ≥ 0, y_ij = 0 for all [i, j] ∈ E(G) is infeasible. Therefore, any Y satisfying [Y e; e^T t] ⪰ 0, diag(Y) = e, Y ≥ 0 must have an entry y_ij > 0 for some [i, j] ∈ E(G), and hence val(P(t)) > 0.
We set n := |V(G)| and m := |E(G)|. The previous two theorems can be used to get bounds for α(G) and χ(G). In contrast to the computation of ϑ^+(G), which requires the solution of an SDP with more than m equality constraints, the SDP relaxation P(t) contains n + 1 equality constraints, independent of m. As a drawback, one has to guess the proper value of t, which might require solving several SDPs for different values of t.

Conclusions
We have seen that conic optimization is an extremely versatile tool with an abundance of applications. Depending on the cone in question, different complexities may occur: while linear programming, second order cone programming and semidefinite programming are solvable in polynomial time (to fixed accuracy), optimizing over the cones of copositive or completely positive matrices is NP-hard. The cones COP and CP are highly useful modelling tools for nonconvex quadratic and combinatorial optimization. In many cases, it is possible to reformulate such problems equivalently as linear problems over COP or CP. Relaxing the cone constraint to a semidefiniteness or doubly-nonnegativity constraint yields very good bounds which are often provably tighter than LP-based bounds. When using approximation hierarchies, one can often show that at some finite level of the hierarchy the relaxation gives the exact solution of the underlying combinatorial problem.
In contrast to linear programming, the existence of strictly feasible solutions plays a crucial role in conic optimization. In the absence of strictly feasible points, strong duality may fail and algorithms may be unable to solve the problem. This point should therefore be carefully considered when modelling the problem in question as a conic optimization problem.
At the moment, the main bottleneck for using SDP relaxations or conic optimization in a broader range of applications is the lack of algorithms that can solve large scale problems in reasonable time. Semidefinite relaxations of a problem in R^n involve matrices of order at least n × n, so the number of variables is roughly squared. If one works with approximation hierarchies, the SDPs get larger at each level of the hierarchy. Very quickly, these SDPs are out of reach for current computational algorithms.
A possible remedy when the underlying combinatorial problem is highly structured is to exploit the symmetry by using the theory of matrix C * -algebras. Roughly speaking, the idea is to pre-process the SDP by applying a suitable unitary transformation in such a way that the resulting matrices in the SDP exhibit block diagonal structure. This structure can then be exploited by interior point methods. We refer to de Klerk [30] for a survey on this approach, and to [38] for a discussion on how it can be used in approximation hierarchies.
Unfortunately, however, not many SDPs exhibit symmetries, so there is a need for faster algorithms. First order methods may turn out to fill this gap: for example, the ADMM method has proved successful when applied to SDP relaxations of binary quadratic problems [103], the quadratic assignment problem [84], and the quadratic shortest path problem [59].
As we have outlined, the past decades have seen enormous progress in understanding conic problems and using them for modelling purposes. The next decades should be particularly devoted to the numerical side.