A global QP-free algorithm for mathematical programs with complementarity constraints

In this paper, a primal–dual interior point QP-free algorithm for mathematical programs with complementarity constraints is presented. First, based on the Fischer–Burmeister function and smoothing techniques, the investigated problem is approximated by a smooth nonlinear constrained optimization problem. Second, combining an effective penalty function technique with a working set, a QP-free algorithm is proposed to solve the smooth constrained optimization problem. At each iteration, only two reduced linear equations with the same coefficient matrix are solved to obtain the search direction. Under some mild conditions, the proposed algorithm possesses global convergence. Finally, some numerical results are reported.


Introduction
In this paper, we discuss the following mathematical program with complementarity constraints (MPCC for short): where f : R n+m → R, g = (g 1 , . . . , g m g ) T : R n+m → R m g , and F = (F 1 , . . . , F m ) T : R n+m → R m are continuously differentiable functions, and "F(x, y) ⊥ y" means that the vectors F(x, y) and y are perpendicular to each other.
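As a concrete illustration of this problem class, the following tiny instance (with n = m = 1 and one inequality constraint) is our own toy example, not one from the paper; it only demonstrates what MPCC feasibility means:

```python
# Toy MPCC instance (hypothetical, for illustration only):
#   min  f(x, y) = (x - 1)^2 + y^2
#   s.t. g(x, y) = -x <= 0,
#        0 <= y  ⊥  F(x, y) = y - x >= 0
# "y ⊥ F" means y >= 0, F >= 0, and y * F = 0 (complementarity).

def f(x, y):
    return (x - 1.0)**2 + y**2

def F(x, y):
    return y - x

def is_feasible(x, y, tol=1e-10):
    """Check MPCC feasibility: g <= 0, y >= 0, F >= 0, and y ⊥ F."""
    return (-x <= tol) and (y >= -tol) and (F(x, y) >= -tol) \
        and abs(y * F(x, y)) <= tol
```

For example, (x, y) = (1, 1) is feasible (there F = 0 and y > 0), while (1, 0) is not, since F(1, 0) = -1 < 0.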
MPCC (1) has a broad range of real-world applications, such as engineering design, traffic and transportation, game theory, and so on. A detailed overview of MPCC applications can be found in [1] and the monographs [2–4].
It is well known that the QP-free method is one of the efficient methods for nonlinear programming (see [14, 15, 19–25]). The nice properties of the QP-free method are as follows: (a) The search directions are determined only by solving systems of linear equations rather than QP subproblems; as a consequence, the computational cost is decreased greatly. (b) Compared with the SQP method, the QP-free method can handle problems of larger size. It is worth mentioning that the primal–dual interior point QP-free algorithm in [21] offers an improvement: the strict restriction that the Lagrangian Hessian estimate be uniformly positive definite is relaxed. It is also noticed that the active-set identification technique is incorporated into the QP-free algorithms in [22, 23]; consequently, the computational cost is further decreased, and the numerical performance in [22] is encouraging.
Although there are many QP-free algorithms for nonlinear programming, to the best of our knowledge, there are few MPCC-tailored QP-free algorithms for the MPCC (1). Motivated by the ideas of the algorithms in [21, 24–27] and combining them with smoothing techniques, we propose a primal–dual interior point QP-free algorithm for the MPCC (1). The proposed algorithm possesses the following nice properties: (a) The active-set technique is incorporated into the algorithm; as a consequence, the systems of linear equations (SLEs) become smaller and the computational cost is decreased. (b) At each iteration, the search direction is obtained by solving two SLEs with the same coefficient matrix, which further decreases the computational cost. (c) The uniform positive definiteness assumption on the Lagrangian Hessian estimate H k is relaxed. (d) The algorithm possesses global convergence without assuming that the stationary points are isolated.

Preliminaries and reformulation
In this section, for completeness, we first restate some definitions and results about the MPCC (1), and then deduce an approximation problem of the MPCC (1) via the Fischer–Burmeister function and smoothing techniques. Denote by X 0 the feasible set of the MPCC (1), and let I C = {1, 2, . . . , m} and I g = {1, 2, . . . , m g } be the index sets of the complementarity constraints and the inequality constraints, respectively.

Definition 1 ([2]) Given (x * , y * ) ∈ X 0 , if the following condition holds, then we say that the lower-level strict complementarity (LLSC) is satisfied at (x * , y * ).
It is well known that the Fischer–Burmeister function is a complementarity function, which is defined by Obviously, ψ satisfies the following basic property: Let v = F(x, y); according to the property (3), the MPCC (1) is equivalently transformed into the following nonsmooth nonlinear optimization problem: where Obviously, the function ψ is not differentiable at the point (0, 0). Borrowing ideas from [28], we define the smoothing functions as follows: For any ε > 0, the function ψ ε (y i , v i ) is differentiable, and it follows that So the smooth problem (6) will be used as an approximation of the MPCC (1). Obviously, if I C (y * , v * , ε) = ∅, then the problem (6) is equivalent to the problem (4). Under some mild conditions, I C (y * , v * , ε) = ∅ is guaranteed, where (x * , y * , v * ) is an accumulation point of the iterative sequence {(x k , y k , v k )} (see the proof of Theorem 1 in the sequel).
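For concreteness, here is a minimal numerical sketch of the Fischer–Burmeister function and one common smoothing of it. The exact smoothed form ψ ε from [28] is not reproduced in the extracted text, so the variant below (adding ε² under the square root) is only an assumption:

```python
import numpy as np

def fb(a, b):
    """Fischer-Burmeister complementarity function:
    fb(a, b) = 0  if and only if  a >= 0, b >= 0, and a*b = 0."""
    return np.sqrt(a**2 + b**2) - a - b

def fb_smooth(a, b, eps):
    """One common smoothing of the FB function (an assumption here,
    not necessarily the exact form of psi_eps used in the paper).
    For eps > 0 it is differentiable everywhere, including (0, 0)."""
    return np.sqrt(a**2 + b**2 + eps**2) - a - b
```

Note that fb is not differentiable at (0, 0) because of the square root, which is exactly why the smoothed version is needed; fb_smooth(0, 0, eps) = eps, so the perturbation vanishes as eps → 0.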
Proof Since (z, λ r ) is a KKT pair of the problem (7), the vector pair (z, λ r ) satisfies the following relations: Noting that we get Let λ F,I C = λ r F,I C -re I C , λ Ψ ε ,I C = λ r Ψ ε ,I C -re I C , and λ g,I g = λ r g,I g ; then we obtain In view of v -F(x, y) = 0, Ψ ε (y, v) = 0, and g(x, y) ≤ 0, we have λ g,i ≥ 0 and λ g,i g i (x, y) = 0, i ∈ I g .

Description of the algorithm
For the sake of theoretical analysis, we make a basic assumption throughout this paper.

Assumption 1
(1) For any (x, y) ∈ R n+m , the matrix ∇ y F(x, y) is a P 0 matrix, i.e., all the principal minors of ∇ y F(x, y) are nonnegative.
In order to construct the coefficient matrix of the linear equations conveniently, denote and denote by A J the gradient matrix of Ω J , that is, Define the matrix Similar to the proofs in [30, 31], we can prove that the following proposition is true.

Proposition 3 (1) Suppose that the matrix U is nonsingular. Then the matrix A I(x,y) has full column rank if and only if the matrix
has full column rank, where (U -1 ) m is the mth-order leading principal submatrix of U -1 , i.e., the submatrix consisting of the first m rows and m columns of U -1 . (2) Suppose that Assumption 1 holds; then the matrix U is nonsingular.
has full column rank.
Remark 1 According to Proposition 3, if Assumption 1 is true, then Assumption 2 is equivalent to the LICQ of the problem (7).
Based on Proposition 2, we know that if one can construct an efficient algorithm for the problem (7) and adjust the penalty parameter r to force the iterates to asymptotically satisfy v -F(x, y) = 0 and Ψ ε (y, v) = 0, then a solution to the problem (6) can be obtained.
For the current iterate z k ∈ X 0 , we define the corresponding multiplier vector where s 0 > 0 and ( λ k-1 , r k-1 ) is computed in the previous, (k-1)th, iteration. We then construct the working set I k as follows: When (z k , λ k ) is sufficiently close to a KKT pair (z * , λ * ) of the problem (6), and the second-order sufficient conditions and the MFCQ hold at (z * , λ * ), the set I k coincides with the precise identification set I g 0 (z * ). Therefore, only the inequality constraints associated with the working set I k need to be considered.
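The precise working-set rule is elided in the extracted text, so the sketch below only illustrates the general idea of this kind of active-set identification: constraints whose residual is within a tolerance driven by a KKT-error-like measure are kept. The residual measure rho and the exponent delta here are purely hypothetical stand-ins, not the paper's formulas:

```python
import numpy as np

def working_set(g_vals, lam_g, delta=0.1):
    """Hypothetical active-set identification sketch.

    g_vals : values g_i(x, y) of the inequality constraints (g_i <= 0)
    lam_g  : current multiplier estimates for those constraints
    Keeps the indices i with g_i close enough to zero, where
    "close enough" shrinks as a crude KKT residual goes to zero.
    """
    g = np.asarray(g_vals, dtype=float)
    lam = np.asarray(lam_g, dtype=float)
    # crude complementarity/KKT residual: min(-g_i, lam_i) vanishes at a KKT point
    rho = np.linalg.norm(np.minimum(-g, lam))
    tol = rho**delta if rho > 0 else 0.0
    return [i for i, gi in enumerate(g) if gi >= -tol]
```

Near a KKT point the tolerance tends to zero, so only the truly active constraints (g_i = 0) survive, which is the identification property stated above.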
For the current iterate z k ∈ R n+2m , the first SLE of our algorithm is constructed as follows: where the coefficient matrix M I k is defined by Here H k is an approximation of the Lagrangian Hessian, and the matrix S I k := diag(s k I k ), where the vector s k I k is defined in Step 3(4) below. It is not difficult to prove that the following result is true.

Lemma 1 Suppose that Assumptions 1-2 hold, and the symmetric matrix H k satisfies the following relation:
then the coefficient matrix M I k defined by (13) is invertible, where the matrix order A ≻ B means that A -B is positive definite.
We now describe our algorithm as follows.
Step 2. (Yield a matrix H k ) Generate a symmetric matrix H k such that it is an approximation of the Lagrangian Hessian of the problem (7) and satisfies the condition (14).
Step 3. (Generate a search direction)
(1) Solve the first SLE, SLE1 (12), and let its solution be ( dz k , λ k I k ).
(2) If the following conditions hold:
(4) Solve the second SLE as follows: where the perturbation vector η k is determined by the following convex combination: with s k I k := (s k F,I C , s k Ψ ε ,I C , s k g,I k ) ∈ R 2m+|I k | . Let its solution be (dz k , λ k I k ).
Step 4. (Perform line search) Compute the step size t k , which is the first number t of the sequence {1, β, β 2 , . . .} satisfying
Step 5. (Update) Let z k+1 = z k + t k dz k , and compute
Step 6. Let k := k + 1, and go back to Step 1.
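Step 4 is a standard backtracking (Armijo-type) search over {1, β, β², ...}. Since the exact acceptance inequalities are elided in the extracted text, the sketch below shows only the generic sufficient-decrease loop; the objective f stands in for the penalty function f r , and the constants alpha and beta are illustrative placeholders, not the paper's parameter choices:

```python
def armijo(f, z, dz, grad_dot_d, beta=0.5, alpha=0.1, max_iter=50):
    """Backtracking line search: return the first t in {1, beta, beta^2, ...}
    with sufficient decrease  f(z + t*dz) <= f(z) + alpha * t * <grad f, dz>.
    grad_dot_d is the (negative) directional derivative at z along dz.
    The extra feasibility tests of Step 4 are omitted in this sketch."""
    fz = f(z)
    t = 1.0
    for _ in range(max_iter):
        trial = [zi + t * di for zi, di in zip(z, dz)]
        if f(trial) <= fz + alpha * t * grad_dot_d:
            return t
        t *= beta
    return t
```

For example, with f(z) = z[0]², z = [1.0], dz = [-2.0], and directional derivative -4.0, the full step t = 1 overshoots the decrease condition and the loop accepts t = 0.5.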
Remark 2 The correction technique for the penalty parameter r k in Step 3(2) comes from [33]. When conditions (i)-(ii) in Step 3(2) are satisfied, the current iteration point z k is close to a KKT point of the problem (7). However, conditions (iii)-(iv) in Step 3(2) indicate that z k is far away from the feasible region of the problem (6), so the penalty parameter needs to be increased.
For convenience, denote

Lemma 2 For the directions dz k and dz k found in Step 3(1) and Step 3(4), the following conclusions hold: Furthermore, when the iteration enters Step 3(3)(4), we have dz k ≠ 0 and ζ k < 0, so dz k is a feasible descent direction of the problem (7) at z k ; hence, Algorithm A is well defined.
By ∇ z f r k (x k , y k ) T dz k ≤ 0, we know that dz k is a feasible descent direction of the problem (7) at the point z k . Together with v k -F(x k , y k ) < 0, Ψ ε (y k , v k ) < 0, g(x k , y k ) < 0, and ζ k < 0, Step 4 of Algorithm A can be completed in finitely many computations; hence Algorithm A is well defined.

Global convergence
In this paper, we assume that Algorithm A generates an infinite iteration sequence {z k }. For the sake of convenience, let z * be an accumulation point of the iteration sequence {z k }. First, we show that the penalty parameter r k can be augmented only in a finite number of steps. Then we prove that z * is a KKT point of the problem (7). Finally, we show that z * is a KKT point of the MPCC (1). More precisely, we show that Algorithm A is globally convergent. For this purpose, the following assumptions are necessary.

Assumption 3
Suppose that the sequences {H k } and {z k } are bounded, and that each accumulation point of {z k } satisfies the LLSC (2). Assume that there exists a positive constant σ such that

Assumption 4
For any z with z ∉ X, no scalars λ F,i ≥ 0, i ∈ I F 0 (z); λ Ψ ε ,i ≥ 0, i ∈ I Ψ ε 0 (z); and λ g,i ≥ 0, i ∈ I g 0 (z), exist such that

Lemma 3 Suppose that Assumptions 1-4 hold, then the penalty parameter r k can be augmented only in a finite number of steps.
Proof By contradiction, suppose that r k is increased infinitely many times, i.e., there exists an infinite index set K such that r k+1 > r k for any k ∈ K . The increase of r k must satisfy the following conditions for any k ∈ K : and S k I g = diag(s k I g ), where s k I g = (s k F,I C , s k Ψ ε ,I C , s k g,I g ). According to the SLE, we have Since {r k } tends to infinity and λ k = λ k-1 -r k-1 ẽ for k ≥ 1, the sequence { λ k ∞ } tends to infinity as well. Hence, for k (∈ K ) large enough, it follows that Since the sequence {z k } k∈K is bounded by Assumption 3, there exist an infinite index set K ⊆ K , z * ∈ R n+2m , and (λ * F,I C , λ * Ψ ε ,I C , λ * g,I g ) ≠ 0 such that, for all k ∈ K , In view of the boundedness of {z k } and the continuity of the functions, it follows that are bounded. Further, {S k I g } is bounded by construction. Multiplying both sides of (41) by β -1 k , we have From (10) and (42), together with the condition (ii), it follows that And from (41), we obtain furthermore, multiplying both sides of this equation by β -1 k , combining with the condition (i), and taking the limit as k (∈ K ) → ∞, we obtain g i (x * , y * ) λ * g,i = 0, i ∈ I g , so λ * g,i = 0 for all i ∈ I g \ I g 0 (z * ).
In view of the condition (i) and the continuity of the functions, multiplying (40) by β -1 k and taking the limit as k (∈ K ) → ∞, we obtain (45). Since λ * F,I C , λ * Ψ ε ,I C , and λ * g,I g are not all zero, combining with (45) and Assumption 2, it follows that i.e., z * ∉ X and a > 0.
Based on (43), dividing both sides of (45) by a, we obtain Adding both sides of the above equality together, in view of (44) and z * ∉ X, this contradicts Assumption 4, so the penalty parameter r k is updated at most a finite number of times.

Lemma 4
Suppose that Assumptions 1-4 hold. Then there exists an integer k 0 such that

Proof By Step 3(2) and
Step 5 of Algorithm A, the sequence {ε k } is monotonically decreasing and bounded from below, so it is convergent. Let where K = {0, 1, 2, . . .}. We now prove that K 1 and K 2 are finite sets. By contradiction, if K 1 is an infinite set, then lim k→∞ ε k = 0, and there exist i ∈ I C and an infinite subset K 1 ⊆ K 1 such that Together with the boundedness of {z k+1 } K 1 , we may suppose that z k+1 K 1 − → z * . From (47), we have (y * i ) 2 + (v * i ) 2 = 0, which contradicts Assumption 3. So K 1 is a finite set. Similarly, we can show that K 2 is finite. Based on the finiteness of K 1 and K 2 , from the update rule (27) for ε k and Step 3(2), we conclude that Lemma 4 holds.
Based on Lemmas 3-4, in the following analysis, we suppose, without loss of generality, that r k ≡ r, ε k ≡ ε.

Lemma 5 Suppose that Assumptions 1-4 hold. Then
(1) The coefficient matrix sequence {M I k } is uniformly invertible; that is, there exists a positive constant L such that (M I k ) -1 ≤ L for all k ≥ 0. Denote where the matrix S * I = diag(s * I ) with s * I = (s * F,I C , s * Ψ ε ,I C , s * g,I ).
Under the above assumptions, it can easily be shown that M * I is nonsingular. So we obtain
(2) In view of SLE1 (12), Lemma 3, and r k ≡ r, we see that {( dz k , λ k )} is bounded.

Lemma 6
Suppose that Assumptions 1-4 hold, z * is an accumulation point of the infinite sequence {z k } generated by Algorithm A, and {z k } K converges to z * . If {ζ k } K → 0, then z * is a KKT point of the problem (7), and both { λ k } K and {λ k } K converge to the unique multiplier vector λ * corresponding to z * .
Therefore, in view of z * ∈ X and (50), we see that (z * , λ * ) is a KKT pair of the problem (7). Moreover, the above analysis indicates that the sequence { λ k } K possesses a unique accumulation point λ * , so lim k∈K λ k = λ * . By (15)-(17) and (22), we have so it follows from the above SLE and SLE1 (12) that which together with Lemma 5(1) gives

Based on the lemmas above, we now present the global convergence of Algorithm A.

Theorem 1 Suppose that Assumptions 1-4 hold. Then every accumulation point z * of the sequence {z k } generated by Algorithm A is a KKT point of the MPCC (1).

Proof A. We first show that z * is a KKT point of the problem (6) (called Conclusion A). Let K be an infinite index set such that Denote where the matrix S * I = diag(s * I ) with s * I = (s * F,I C , s * Ψ ε ,I C , s * g,I ). By contradiction, suppose that z * is not a KKT point of the problem (6). From Lemma 5, without loss of generality, suppose that λ k = λ k-1 -r ẽ → λ as k (∈ K ) → ∞. Therefore, it follows that Based on the boundedness of {dz k } K and the continuity of the functions, for all k ∈ K large enough and t > 0 sufficiently small, one gets In view of i ∈ I F 0 (z * ), we have v * i -F i (u * ) = 0. Similarly, ψ ε (w * i ) = 0 for i ∈ I Ψ ε 0 (z * ), and g i (u * ) = 0 for i ∈ I g 0 (z * ). From (22) and SLE2 (21), we have that is, By Taylor expansion and dz k ≥ 0, for t > 0 sufficiently small, we get Furthermore, considering ζ k ≤ ζ̄ /2 < 0, by (19), (20), and the boundedness of {b k } from Lemma 5, we see that there exists a constant ā > 0 such that a k ≥ ā > 0 for k ∈ K . For k ∈ K large enough and t > 0 small enough, we have Similarly, for k ∈ K large enough and t > 0 small enough, we have (ii) Consider the inequality (23). By (29) and Taylor expansion, for k ∈ K large enough and t > 0 small enough, it follows that Summarizing the above discussion, we conclude that the inequalities (23)-(26) hold for k ∈ K large enough and t > 0 small enough; that is, t̄ = inf{t k : k ∈ K } > 0.
In what follows, based on t k ≥ t̄ > 0 (k ∈ K ), we deduce a contradiction. In view of the monotone decreasing property of {f r (z k )} and lim k∈K f r (z k ) = f r (z * ), we have lim k→∞ f r (z k ) = f r (z * ). Moreover, from (23) and (29), for k ∈ K large enough, we have Taking the limit as k (∈ K ) → ∞ in the inequality above, we obtain a contradiction.
From the above analysis, we can conclude that z * is a KKT point of the problem (6).
From Lemma 4 and the updating rule for ε k in Step 5, we see that which implies that I C (y * , v * , ε) = ∅ (called Conclusion B).
Based on Conclusions A and B, and combining with Proposition 1, we can conclude that z * is a KKT point of the MPCC (1).

Numerical experiments
In this section, a preliminary implementation of Algorithm A is tested on an Intel(R) Core(TM) i5-7200U CPU (2.50 GHz, up to 2.71 GHz) with 4 GB of RAM. The algorithm is implemented in Matlab (Version R2014a).
In our tests, referring to [21], we initialize H 0 = E n+2m , and for k = 1, 2, . . . , the approximate Hessian matrix H k is determined by the following equation: where E n+2m is the (n + 2m)th-order identity matrix, and h k is chosen as follows:
h k = 0, if λ min > s max ; h k = -λ min + s min , if |λ min | ≤ s max ; h k = 2|λ min |, otherwise,
where s min > 0 and s max > 0 are sufficiently small and sufficiently large constants, respectively, and λ min is the smallest eigenvalue of the following matrix: In the numerical experiments, the parameters are chosen as follows: Algorithm A stops if one of the following termination criteria is satisfied: (1) ρ(z k , λ k ) ≤ 10 -5 and I C (y k , v k , ε k ) = ∅.
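The eigenvalue-shift rule above can be sketched numerically as follows. Note that the value of h k in the first branch (λ min > s max ) is elided in the extracted text and is assumed here to be 0, i.e., no shift is applied when the matrix is already sufficiently positive definite:

```python
import numpy as np

def regularize(B, s_min=1e-6, s_max=1e6):
    """Shift a symmetric matrix B to B + h*I following the
    three-branch rule described in the text (first branch
    assumed to be h = 0):
      h = 0,                 if lam_min > s_max
      h = -lam_min + s_min,  if |lam_min| <= s_max
      h = 2*|lam_min|,       otherwise (lam_min < -s_max)
    """
    lam_min = np.linalg.eigvalsh(B)[0]  # smallest eigenvalue (ascending order)
    if lam_min > s_max:
        h = 0.0
    elif abs(lam_min) <= s_max:
        h = -lam_min + s_min
    else:
        h = 2.0 * abs(lam_min)
    return B + h * np.eye(B.shape[0])
```

For instance, with B = diag(-1, 2), s_min = 0.5, and s_max = 10, the middle branch gives h = 1.5, so the shifted matrix has smallest eigenvalue 0.5 > 0.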
(2) dz k ≤ 10 -5 , max{λ k g,i : i ∈ I g } < 10 -5 , and I C (y k , v k , ε k ) = ∅.
The following test problems are selected from [34]; Algorithm A finds their solutions within a small number of iterations. We compared Algorithm A with four solvers, namely filtermpec, snopt, loqo, and knitro, as reported in [35]. The computational results are given in Table 1. The notation in Table 1 is as follows: Problem: the problem in [34]; dim: the dimension of the variables (x, y); itr: the number of iterations; f opt : the optimal value given in [34]; f * : the optimal value obtained by Algorithm A.

Conclusions
In this paper, based on the Fischer–Burmeister function and working-set techniques, a primal–dual interior point QP-free algorithm for mathematical programs with complementarity constraints has been presented. At each iteration, only two reduced linear equations with the same coefficient matrix are solved to yield the search direction. The use of the working set decreases the computational cost. Moreover, the uniform positive definiteness assumption on the Lagrangian Hessian estimate H k is relaxed. Under some mild conditions, the proposed algorithm is globally convergent. The preliminary numerical results show that the proposed algorithm is effective.