Global Convergence of a New Nonmonotone Filter Method for Equality Constrained Optimization

A new nonmonotone filter trust region method is introduced for solving optimization problems with equality constraints. The method directly uses the dominated area of the filter as an acceptability criterion for trial points and allows the dominated area to decrease nonmonotonically. Compared with classical filter-type methods, our method has more flexible acceptance criteria and can avoid the Maratos effect to a certain degree. Under reasonable assumptions, we prove that the given algorithm is globally convergent to a first-order stationary point for all possible choices of the starting point. Numerical tests are presented to show the effectiveness of the proposed algorithm.

There are many trust region methods for equality constrained nonlinear programming, for example, Byrd et al. [1], Dennis Jr. et al. [2], and Powell and Yuan [3], but in these works a penalty or augmented Lagrangian function is always used to test the acceptability of the iterates. However, there are several difficulties associated with the use of a penalty function, in particular the choice of the penalty parameter. Hence, in 2002, Fletcher and Leyffer [4] proposed a class of filter methods, which do not require any penalty parameter and have promising numerical results. Consequently, the filter technique has been employed in many approaches, for instance, SLP methods [5], SQP methods [6-8], interior point approaches [9], bundle techniques [10], and so on.
The filter technique, in fact, exhibits a certain degree of nonmonotonicity. The nonmonotone technique was proposed by Grippo et al. in 1986 [11] and has been combined with many other methods. M. Ulbrich and S. Ulbrich [12] proposed a class of penalty-function-free nonmonotone trust region methods for nonlinear equality constrained optimization without the filter technique. Su and Pu [13] introduced a nonmonotone trust region method which applies the nonmonotone technique to the traditional filter criteria. Su and Yu [14] presented a nonmonotone method without a penalty function or a filter. Gould and Toint [15] directly used the dominated area of the filter as an acceptability criterion for trial points and obtained global convergence properties. We refer the reader to [16-18] for further work on this issue.
Motivated by the ideas and methods above, we propose a modified nonmonotone filter trust region method for solving the equality constrained problem. Similar to the Byrd-Omojokun class of algorithms, each step is decomposed into the sum of two distinct components, a quasi-normal step and a tangential step. The main contribution of our paper is to apply the nonmonotone idea to the dominated area of the filter so that a new and more flexible acceptance criterion is obtained, different from those of Gould and Toint [15] and Su and Pu [13].

The Fraction of Cauchy Decrease and the Composite SQP Step
Consider the following unconstrained minimization problem:
$$\min_{x \in \mathbb{R}^n} f(x),$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function.
A trust region algorithm for solving the above problem is an iterative procedure that computes a trial step $d$ as an approximate solution to the following subproblem:
$$\min_{d \in \mathbb{R}^n} \; q(d) = \nabla f(x)^T d + \tfrac{1}{2} d^T B d \quad \text{s.t.} \quad \|d\| \le \Delta,$$
where $B$ is the Hessian matrix $\nabla^2 f(x)$ or an approximation to it and $\Delta > 0$ is a given trust region radius.
To ensure global convergence, the step is only required to satisfy a fraction of Cauchy decrease condition. This means that $d$ must predict, via the quadratic model $q(d)$, at least a fraction of the decrease given by the Cauchy step on $q(d)$; that is, there exists a constant $\sigma > 0$, fixed across all iterations, such that
$$q(0) - q(d) \ge \sigma \bigl(q(0) - q(d^{c})\bigr),$$
where $d^{c}$ is the steepest descent step for $q(d)$ inside the trust region.
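To make this condition concrete, the following minimal Python sketch computes the Cauchy step of the quadratic model and checks the fraction of Cauchy decrease; the function names and the sample value of the fraction $\sigma$ are illustrative assumptions, not part of the algorithm in this paper.

```python
import numpy as np

def cauchy_step(g, B, delta):
    # Minimizer of q(d) = g^T d + 0.5 d^T B d along -g inside ||d|| <= delta.
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    # If the model has nonpositive curvature along -g, go to the boundary.
    tau = 1.0 if gBg <= 0 else min(1.0, gnorm**3 / (delta * gBg))
    return -tau * (delta / gnorm) * g

def satisfies_cauchy_decrease(d, g, B, delta, sigma=0.5):
    # Fraction of Cauchy decrease: q(0) - q(d) >= sigma * (q(0) - q(d_c)).
    q = lambda s: g @ s + 0.5 * s @ (B @ s)
    return -q(d) >= sigma * (-q(cauchy_step(g, B, delta)))
```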

Lemma 1. If the trial step $d$ satisfies a fraction of Cauchy decrease condition, then
$$q(0) - q(d) \ge \frac{\sigma}{2}\, \|\nabla f(x)\| \min\left\{\Delta,\ \frac{\|\nabla f(x)\|}{\|B\|}\right\}.$$
Proof (see Powell [19] for the proof). Now we turn to the composite SQP step. Given an approximate estimate $x_k$ of the solution at the $k$th iteration, following Dennis Jr. et al. [2] and M. Ulbrich and S. Ulbrich [12], we obtain the trial step $d_k = n_k + t_k$, with $\|d_k\| \le \Delta_k$, by computing a quasi-normal step $n_k$ and a tangential step $t_k$, where $\Delta_k$ is the trust region radius and $A_k = \nabla c(x_k) \in \mathbb{R}^{n \times m}$, $m > 0$. The purpose of the quasi-normal step $n_k$ is to improve feasibility. To improve optimality, we seek $t_k$ in the tangential space of the linearized constraints in such a way that it provides sufficient decrease for a quadratic model of the objective function; let $q_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d$, where $g_k = \nabla f(x_k)$ and $B_k$ is a symmetric approximation of $\nabla^2 f(x_k)$. The tangential subproblem is solved to obtain $t_k$, and the current trial step is $d_k = n_k + t_k$. Let $t_k = Z_k \bar{t}_k$, where $\bar{t}_k \in \mathbb{R}^{n-m}$ and $Z_k \in \mathbb{R}^{n \times (n-m)}$ denotes a matrix whose columns form a basis of the null space of $A_k^T$. We refer to [2] for a more detailed discussion of this issue.
In the usual way of imposing a trust region in step-decomposition methods, the quasi-normal step $n_k$ and the tangential step $t_k$ are required to satisfy $\|n_k\| \le \zeta \Delta_k$ and $\|n_k + t_k\| \le \Delta_k$, where $0 < \zeta < 1$. Here, to simplify the proof, we only impose the trust region bounds $\|n_k\| \le \Delta_k$ and $\|\bar{t}_k\| \le \Delta_k$, which is natural. Note that $Z_k^T \nabla q_k(n_k)$ is the reduced gradient of $q_k$ in terms of the representation $t_k = Z_k \bar{t}_k$ of the tangential step. Define $\hat{g}_k \stackrel{\mathrm{def}}{=} Z_k^T \nabla f(x_k)$. Then the first order necessary optimality conditions (Karush-Kuhn-Tucker or KKT conditions) at a local solution $x^* \in \mathbb{R}^n$ of the problem can be written as $c(x^*) = 0$ and $Z(x^*)^T \nabla f(x^*) = 0$.
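The following Python sketch illustrates one way the composite step could be realized under the trust region bounds above; the least-squares quasi-normal step, the QR-based null-space basis $Z_k$, and the Cauchy-type tangential step are illustrative choices, not the exact subproblem solvers of this paper.

```python
import numpy as np

def composite_step(c, A, g, B, delta, zeta=0.8):
    """Byrd-Omojokun-style composite step d = n + Z @ tbar (illustrative sketch).
    c: constraint values c(x_k); A = grad c(x_k), shape (n, m);
    g: objective gradient; B: symmetric Hessian approximation."""
    n, m = A.shape
    # Quasi-normal step: reduce ||c + A^T n||, truncated to ||n|| <= zeta * delta.
    n_step, *_ = np.linalg.lstsq(A.T, -c, rcond=None)
    nn = np.linalg.norm(n_step)
    if nn > zeta * delta:
        n_step *= zeta * delta / nn
    # Orthonormal basis Z of the null space of A^T via a full QR factorization.
    Q, _ = np.linalg.qr(A, mode='complete')
    Z = Q[:, m:]
    # Tangential step t = Z @ tbar: Cauchy-type step for the reduced model.
    g_red = Z.T @ (g + B @ n_step)        # reduced gradient of the model at n_step
    B_red = Z.T @ B @ Z
    gnorm = np.linalg.norm(g_red)
    if gnorm == 0.0:
        return n_step
    gBg = g_red @ B_red @ g_red
    tau = 1.0 if gBg <= 0 else min(1.0, gnorm**3 / (delta * gBg))
    tbar = -tau * (delta / gnorm) * g_red   # ||tbar|| <= delta
    return n_step + Z @ tbar
```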

A New Nonmonotone Filter Technique
In the filter method, originally proposed by Fletcher and Leyffer [4], the acceptability of iterates is determined by comparing the values of the constraint violation and the objective function with those of previous iterates collected in a filter. Define the violation function $h(x)$ by $h(x) = \|c(x)\|_2^2$; it is easy to see that $h(x) = 0$ if and only if $x$ is a feasible point, so a trial point should reduce either the value of the constraint violation $h$ or that of the objective function $f$.
In the process of the algorithm, we need to decide whether the trial point $x_k + d_k$ is better than $x_k$ as an approximate solution of the problem. If we decide that this is the case, we say that iteration $k$ is successful and choose $x_k + d_k$ as the next iterate. Let us denote by $\mathcal{S}$ the set of all successful iterations, that is,
$$\mathcal{S} = \{k \mid x_{k+1} = x_k + d_k\}.$$
In the traditional filter method, a point $x$ is called acceptable to the filter if and only if
$$h(x) \le \beta h_j \quad \text{or} \quad f(x) \le f_j - \gamma h_j, \quad \forall (h_j, f_j) \in \mathcal{F}, \qquad (12)$$
where $0 < \gamma < \beta < 1$ and $\mathcal{F}$ denotes the filter set. Define $\mathcal{D}(\mathcal{F}_k)$ to be the region of the $(h, f)$-plane dominated by the filter, that is, the set of pairs that violate (12). A trial point $x_k + d_k$ is accepted if and only if $(h_{k+d}, f_{k+d}) \notin \mathcal{D}(\mathcal{F}_k)$. Now, similarly to the idea of Gould and Toint [15], we give a new modified nonmonotone filter technique. For any $(h, f)$-pair, we define an area that represents its contribution to the area of $\mathcal{D}(\mathcal{F})$; we hope this contribution is positive, that is, that the area of $\mathcal{D}(\mathcal{F})$ is increasing. For convenience, we partition the right half-plane $[0, +\infty] \times [-\infty, +\infty]$ into four different regions (see Figure 1). Define $\overline{\mathcal{D}}(\mathcal{F}_k)$ to be the complement of $\mathcal{D}(\mathcal{F}_k)$. These four parts are (1) the dominated part of the filter; (2) the undominated part of the lower left corner of the half-plane; (3) the undominated upper left corner; and (4) the undominated lower right corner. Consider the trial point $x_k + d_k$: if the filter is empty, then its contribution to the area of the filter is defined in terms of a positive constant; if the filter is not empty, then the contribution of $x_k + d_k$ to the area of the filter is defined by four different formulae, one for each of the four regions.
If $(h_{k+d}, f_{k+d})$ lies in the dominated part of the filter, its contribution is given by the corresponding formula; each of the other three regions has its own formula. Next, we consider the updating of the filter: if $(h_k, f_k) \notin \mathcal{D}(\mathcal{F}_k)$, the filter is updated in one way, and if $(h_k, f_k) \in \mathcal{D}(\mathcal{F}_k)$, in another. We now return to the question of deciding whether a trial point $x_k + d_k$ is acceptable to the filter or not. We insist that this is a necessary condition for iteration $k$ to be successful in the sense that $x_{k+1} = x_k + d_k$. If we consider an iterate $x_k$, there must exist a predecessor iteration $p(k)$ such that $x_{p(k)} + d_{p(k)} = x_{p(k)+1} = x_k$. In the monotone situation, a trial point $x_k + d_k$ would be accepted whenever it results in a sufficient increase in the dominated area of the filter, that is, whenever condition (23) holds. In the nonmonotone situation, we instead require condition (24), where $a_j$ denotes the area contribution of iterate $j$ as defined above, $U = \{k \mid \text{the filter is updated for } (h_k, f_k)\}$, $r(k) \le k$ is some reference iteration for $U$, $r(k) \in U$, $U \subseteq \mathcal{S}$, and the weights satisfy $\lambda_k(j) \in [0, 1]$ with $\sum_{j = r(k)+1,\, j \in U} \lambda_k(j) = 1$. Compared with condition (2.21) of [15], our condition (24) is more flexible when the contribution is negative.
According to condition (24), it is possible to accept $x_k + d_k$ even though it may be dominated. Thus $x_k + d_k$ is accepted if either (23) or (24) holds.
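For reference, the classical monotone acceptance test (12) can be written compactly as below; the sample values of $\beta$ and $\gamma$ are illustrative, and this sketch deliberately does not implement the nonmonotone area-based conditions (23) and (24).

```python
def acceptable_to_filter(h_trial, f_trial, filter_set, beta=0.99, gamma=0.01):
    # Classical test (12): acceptable iff, for every (h_j, f_j) in the filter,
    # h(x) <= beta * h_j  or  f(x) <= f_j - gamma * h_j, with 0 < gamma < beta < 1.
    return all(h_trial <= beta * h_j or f_trial <= f_j - gamma * h_j
               for (h_j, f_j) in filter_set)
```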

The New Nonmonotone Filter Trust Region Algorithm
Our algorithm is based on the usual trust region technique. Define the predicted reduction of the model $q_k$ to be
$$\mathrm{pred}(d_k) = q_k(0) - q_k(d_k)$$
and the actual reduction
$$\mathrm{ared}(d_k) = f(x_k) - f(x_k + d_k).$$
Moreover, let $\rho_k = \mathrm{ared}(d_k)/\mathrm{pred}(d_k)$. If there exists a nonzero constant $\eta_1$ such that $\rho_k \ge \eta_1$ and conditions (23) and (24) hold, the trial point $x_k + d_k$ is called acceptable. The next iterate $x_{k+1}$ is then obtained and, for its feasibility, we consider condition (27). A formal description of the algorithm is given as follows.
Step 5. If $x_k + d_k$ is not acceptable to the filter, go to Step 8.
Step 7. Apply restoration Algorithm B to obtain a restoration step $n_k^r$; the trial point is then $x_k + n_k^r$.
Step 8. Update the trust region radius by Algorithm C, let $k = k + 1$, and go to Step 3. In restoration Algorithm B we aim to reduce the value of $h(x)$, that is, to obtain $c(x) = 0$ by a Newton-type method.
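Algorithm B is only described in outline here; the Gauss-Newton-type iteration below is a minimal sketch of one Newton-type way to drive $h(x) = \|c(x)\|_2^2$ toward zero. All names and tolerances are assumptions made for illustration.

```python
import numpy as np

def restoration(c, jac, x, tol=1e-10, max_iter=50):
    # Reduce h(x) = ||c(x)||_2^2 by Gauss-Newton steps on the constraints.
    for _ in range(max_iter):
        cx = c(x)
        if cx @ cx <= tol:          # h(x) small enough: restoration finished
            break
        J = jac(x)                  # Jacobian of c at x, shape (m, n)
        step, *_ = np.linalg.lstsq(J, -cx, rcond=None)
        x = x + step
    return x
```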
(3) If $x_k + d_k$ is acceptable to the filter and satisfies (27).

From the description above and the idea of the algorithm, we can see that our algorithm is more flexible. In the traditional filter method, every successful iterate must improve on its predecessor to some degree; our algorithm relaxes this demand by using the nonmonotone technique and also avoids the Maratos effect to a certain degree. Moreover, Algorithm C allows a relatively wide choice of the trust region radius.
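Putting the pieces together, the success test for a trial point can be sketched as follows; `filter_conditions` stands in for the nonmonotone tests (23)/(24), and the threshold $\eta_1$ is an illustrative value.

```python
import numpy as np

def is_successful(f, q, x, d, filter_conditions, eta1=0.1):
    # rho_k = ared / pred, with ared = f(x_k) - f(x_k + d_k)
    # and pred = q_k(0) - q_k(d_k) for the quadratic model q_k.
    ared = f(x) - f(x + d)
    pred = q(np.zeros_like(d)) - q(d)
    rho = ared / pred
    # Iteration k is successful when the model predicts the objective well
    # enough and the trial point passes the (nonmonotone) filter tests.
    return rho >= eta1 and filter_conditions(x + d)
```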

The Convergence Properties
In this section, to present a proof of the global convergence of the algorithm, we always assume that the following conditions hold.
By the assumptions, we can suppose that there exist constants bounding the relevant quantities uniformly over all iterates.

By (A1) and (A2), it holds that $f(x_k) \ge f_{\min}$ and $h(x_k) \le h_{\max}$ for all $k$, where $f_{\min}, h_{\max} > 0$; hence, in the $(h, f)$-plane, the $(h, f)$-pairs of all iterates lie in a bounded region. Once a trial point is accepted as a new iterate, it must provide some improvement, and we formalize this by saying that iterate $x_k = x_{p(k)+1}$ improves on iterate $x_{p(k)}$. That is, the trial point $x_k$ is accepted at iteration $p(k)$, and this happens in one of two situations: either by criterion (23) or by criterion (24). Now consider any iterate $x_k$: it improved on $x_{p(k)}$, which was itself accepted because it improved on $x_{p(p(k))}$, and so on, back to $x_0$. Hence we may construct, for each $k$, a chain of successful iterations indexed by $\mathcal{C}_k = \{k_1, k_2, \ldots\}$, where $k_1$ is the smallest index in the chain.
Lemma 6. Suppose that the Assumptions hold. If Algorithm A does not terminate finitely and the filter contains infinitely many iterates, then $\lim_{k \to \infty} h_k = 0$.
Proof. Suppose, by contradiction, that there exist a constant $\varepsilon > 0$ and an infinite subsequence $\{k_i\} \subseteq \mathcal{S}$ such that $h_{k_i} \ge \varepsilon$ for all $i$. Because infinitely many iterates enter the filter, we have $|\mathcal{S}| = \infty$.
Then, by (31), $\mathrm{area}(\mathcal{D}(\mathcal{F}_k))$ is bounded above for every $k$; that is, there exists an upper bound $\mathrm{area}_{\max}$. Hence the number of such iterations must be finite, which contradicts the infiniteness of $\{k_i\}$. The proof is complete.

Lemma 7. Suppose that the Assumptions hold and Algorithm A terminates finitely; then $h_k = 0$.
Proof. From Algorithm A and the definition of the filter, the conclusion follows.
Lemma 8. For any trial point $x_{k+1} \ne x_k$, there must be one accepted by the filter.

Lemma 9. Suppose that the Assumptions hold; then there exists $\kappa_5 > 0$, independent of the iterates, such that the stated bound holds.

Proof. The bound follows from (32) and the assumptions. The proof is complete.
Lemma 10. Suppose that the Assumptions hold and $\|\hat{g}_k\| \ge \varepsilon$. If the stated condition holds, where $h_{\min} \stackrel{\mathrm{def}}{=} \min_{j \in U} h_j$ is the smallest value of the violation function in the filter, then $\Delta_k \le \kappa_0 \Delta_{k_1}$.

By the above analysis we know that $k \ge k_1 + 1$, that is, $k - 1 \ge k_1$. From the Algorithm and (60), and then from (60) and (61), inequality (53) can be obtained. Applying Lemma 11 with $k - 1$ in place of $k$, and using Lemma 12 together with (60), (61), and the algorithm, it can be seen that (53) is true for $k - 1 \ge k_1$; combining this with (55), we deduce that $x_{k-1} + d_{k-1}$ can be accepted by the filter. From the above and (55), we know that $\Delta_k \ge \Delta_{k-1}$. Hence the index $k$ is not the first one after $k_1$ which satisfies (60), which is a contradiction. So, for any $k > k_1$, it holds that $\Delta_k \ge \kappa_0 \kappa_3$. By the Algorithm, and by (68), (69), and (71), we obtain a limit which, based on the assumptions, Lemma 3, and the Theorem, contradicts (72). The conclusion follows.
Theorem 15. Suppose the assumptions hold and apply the algorithm to the problem; then
$$\liminf_{k \to \infty} \|\hat{g}_k\| = 0,$$
where $\hat{g}_k = Z_k^T g_k$, $g_k = \nabla f(x_k)$, and $Z(x)$ denotes a matrix whose columns form a basis of the null space of $\nabla c(x)^T$.
Proof. If the algorithm terminates finitely, the conclusion obviously holds. Otherwise, by Lemmas 6 and 14, the conclusion also follows. The proof is complete.
The numerical results for the test problems are listed in Table 1. In Table 1, the problems are numbered in the same way as in Schittkowski [20] and Hock and Schittkowski [21]. For example, "S216" is problem (216) in Schittkowski [20] and "HS6" is problem (6) in Hock and Schittkowski [21]. NF and NG denote the numbers of function and gradient evaluations, and "L's" is the solution reported in [22]. The numerical results show that our algorithm is more effective than L's for most test examples. Moreover, the higher the level of nonmonotonicity, the better the numerical results. The results show that the new algorithm is robust, effective, and more flexible in the acceptance of the trial iterate.
and $h_{\min}^{\mathcal{P}} \stackrel{\mathrm{def}}{=} \min_{j \in \mathcal{P}} h_j$, $f_{\min}^{\mathcal{P}} \stackrel{\mathrm{def}}{=} \min_{j \in \mathcal{P}} f_j$.

Figure 2 illustrates the corresponding areas in the filter. Horizontally dashed surfaces indicate a positive contribution and vertically dashed ones a negative contribution. Note that $a(x, \mathcal{F})$ is a continuous function of $(h(x), f(x))$.

Theorem 16. Suppose the assumptions hold and $\{x_k\}$ is the infinite sequence obtained by the algorithm. Then there must exist a subsequence $\{x_{k_i}\}$ such that
$$\lim_{i \to \infty} x_{k_i} = x^* \qquad (75)$$
and $x^*$ satisfies the first order KKT conditions of the problem.
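For completeness, the first order KKT conditions referred to in Theorem 16 take the following standard form for the equality constrained problem, stated here both with a multiplier vector and in the reduced, null-space form used throughout the paper.

```latex
% First order KKT conditions at x^* for  min f(x)  s.t.  c(x) = 0:
% there exists a multiplier vector \lambda^* such that
\nabla f(x^*) + \nabla c(x^*)\,\lambda^* = 0, \qquad c(x^*) = 0;
% equivalently, with Z(x^*) spanning the null space of \nabla c(x^*)^T,
Z(x^*)^T \nabla f(x^*) = 0, \qquad c(x^*) = 0.
```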
whether (27) is true or not, where $\eta_3$ and $\psi$ are positive constants; if it is not true, then turn to the feasibility restoration phase and compute the restoration step $n_k^r$. Step 3. If the corresponding quantity is at most $\varepsilon_2$, set $\Delta_k := \Delta_k/2$, let $k = k + 1$, and go to Step 2.