A method with inertial extrapolation step for convex constrained monotone equations

In recent times, various algorithms have been augmented with an inertial extrapolation step to speed up the convergence of the sequences they generate. As far as we know, very few results exist on inertial derivative-free projection methods for solving convex constrained monotone nonlinear equations. In this article, we study the convergence of a derivative-free iterative algorithm (Liu and Feng in Numer. Algorithms 82(1):245–262, 2019) equipped with an inertial extrapolation step for solving large-scale convex constrained monotone nonlinear equations. The proposed method generates a sufficient descent direction at each iteration. Under some mild assumptions, the global convergence of the sequence generated by the proposed method is established. Furthermore, experimental results are presented to support the theoretical analysis of the proposed method.


Introduction
Our main aim in this paper is to find approximate solutions of systems of monotone nonlinear equations with convex constraints; precisely, the problem

find x ∈ C such that h(x) = 0, (1)

where h : R^n → R^n is assumed to be a monotone and Lipschitz continuous operator, while C is a nonempty, closed, and convex subset of R^n.
The monotone operator was first introduced by Minty [2]; the concept has aided several studies, such as the abstract study of electrical networks [2]. Interest in the study of systems of monotone nonlinear equations with convex constraints (1) stems mainly from their applications in various fields, for instance, power flow equations [3], economic equilibrium problems [4], chemical equilibrium [5], and compressive sensing [6]. These applications have attracted the attention of many researchers, and numerous iterative methods have been proposed to approximate solutions of (1) (see the references therein).
Among the early methods introduced and studied in the literature are Newton's method, the quasi-Newton method, the Gauss-Newton method, the Levenberg-Marquardt method, and their modifications (see, e.g., [36-39] and the references therein). These methods enjoy fast local convergence but are not efficient for solving large-scale nonlinear monotone equations because they require the computation of the Jacobian matrix, or an approximation of it, at every iteration, which is well known to demand a large amount of storage. To overcome this problem, various alternatives and modifications of the early methods have been proposed, among them conjugate gradient methods, spectral conjugate gradient methods, and spectral gradient methods. Extensions of the conjugate gradient method and its variants to large-scale nonlinear equations have been obtained by several authors. For instance, motivated by the stability and efficiency of the Dai-Yuan (DY) conjugate gradient method [40] for solving unconstrained optimization problems, Liu and Feng [1] proposed a derivative-free projection method based on the structure of the DY conjugate gradient method [40]. This method inherits the stability of the DY method and greatly improves its computing performance.
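The DY direction update mentioned above can be sketched in a few lines. The following is a minimal illustration on an unconstrained quadratic with an exact line search (possible only because the objective is quadratic); the matrix A and vector b are made-up test data, and this is not the derivative-free projection method of [1] itself:

```python
import numpy as np

# Sketch of the Dai-Yuan conjugate gradient direction for an unconstrained
# quadratic f(x) = 0.5 x^T A x - b^T x, with an exact line search for
# simplicity. The data A, b below are illustrative, not from the paper.

def dy_cg(A, b, x0, iters=10):
    x = x0.copy()
    g = A @ x - b                          # gradient of the quadratic
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < 1e-12:
            break
        alpha = -(g @ d) / (d @ (A @ d))   # exact line search
        x = x + alpha * d
        g_new = A @ x - b
        y = g_new - g
        beta = (g_new @ g_new) / (d @ y)   # Dai-Yuan beta formula
        d = -g_new + beta * d
        g = g_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = dy_cg(A, b, np.zeros(2))
```

With an exact line search on a quadratic, the iteration terminates at the minimizer in at most n steps, which is why only a handful of iterations are needed here.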
In practical applications, it is always desirable to have iterative algorithms with a high rate of convergence [41-46]. An increasingly important acceleration technique is the class of inertial extrapolation algorithms [47, 48], which use an iterative procedure in which each new term is obtained from the preceding two terms. This idea was first introduced by Polyak [49] and was inspired by an implicit discretization of a second-order-in-time dissipative dynamical system, the so-called 'Heavy Ball with Friction':

x''(t) + γ x'(t) + ∇f(x(t)) = 0, (2)

where γ > 0 and f : R^n → R is differentiable. System (2) is discretized so that, having the terms x_{k-1} and x_k, the next term x_{k+1} can be determined from

(x_{k+1} - 2x_k + x_{k-1})/j² + γ(x_k - x_{k-1})/j + ∇f(x_k) = 0, (3)

where j is the step size. Equation (3) yields the following iterative algorithm:

x_{k+1} = x_k + β(x_k - x_{k-1}) - α∇f(x_k), (4)

where β = 1 - γj, α = j², and β(x_k - x_{k-1}) is called the inertial extrapolation term, which is intended to speed up the convergence of the sequence generated by equation (4). Algorithms with an inertial extrapolation term have been tested on several problems (for example, imaging/data-analysis problems and the motion of a body in a potential field), and the tests showed that the inertial steps remarkably increase the convergence speed of these algorithms (see [47, 48, 50] and the references therein); this property is therefore very important. As far as we know, there are not many results on inertial derivative-free projection algorithms for solving (1).
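The heavy-ball iteration (4) can be sketched directly. In the snippet below, the quadratic f(x) = 0.5‖Ax − b‖² and the choices γ = 1 and j = 0.1 are our own illustrative assumptions, not values from the paper:

```python
import numpy as np

# Minimal sketch of the heavy-ball iteration (4):
#   x_{k+1} = x_k + beta*(x_k - x_{k-1}) - alpha*grad_f(x_k),
# with beta = 1 - gamma*j and alpha = j**2 for a step size j.

def heavy_ball(grad_f, x0, gamma=1.0, j=0.1, iters=500):
    beta, alpha = 1.0 - gamma * j, j ** 2
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_next = x + beta * (x - x_prev) - alpha * grad_f(x)
        x_prev, x = x, x_next
    return x

# Illustrative quadratic f(x) = 0.5*||A x - b||^2, not from the paper.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
grad_f = lambda x: A.T @ (A @ x - b)   # gradient of 0.5*||Ax - b||^2
x_star = heavy_ball(grad_f, np.zeros(2))
```

With these parameters the momentum term β(x_k − x_{k-1}) carries the iterate through the flat directions of the quadratic, which is exactly the acceleration effect the inertial extrapolation term is meant to provide.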
Our concern now is the following: Based on the derivative-free iterative algorithm of Liu and Feng [1], can we construct an inertial derivative-free method for solving the system of monotone nonlinear equations with convex constraints?
In this paper, we give a positive answer to the aforementioned question. Motivated and inspired by the algorithm in [1], we introduce an inertial derivative-free algorithm for solving (1). Our proposed method is a combination of inertial extrapolation step and the derivative-free iterative method for nonlinear monotone equations with convex constraints [1]. We obtain the global convergence result under mild assumptions. Using a set of test problems, we illustrate the numerical behaviors of the algorithm in [1] and compare it with the algorithm presented in this paper. The results indicate that the proposed algorithm with the inertial step is superior in terms of the number of iterations and function evaluations.
The rest of the paper is organized as follows. The next section contains some preliminaries. The proposed inertial algorithm is presented in Sect. 3, and its convergence analysis is given in the fourth section. The last section is devoted to the presentation of examples and numerical results.

Preliminaries
We recall some known definitions and results which will be used in the sequel. First, let us denote by SOL(h, C) the solution set of (1).

Definition 2.1
Let C be a nonempty closed convex subset of R^n. A mapping h : R^n → R^n is said to be:
(i) monotone on C if (h(x) - h(y))^T (x - y) ≥ 0 for all x, y ∈ C;
(ii) L-Lipschitz continuous on C if there exists L > 0 such that ||h(x) - h(y)|| ≤ L ||x - y|| for all x, y ∈ C.

Definition 2.2 Let C ⊂ R^n be a closed and convex set. For a vector x ∈ R^n, the orthogonal projection of x onto C, denoted by P_C(x), is defined by

P_C(x) = argmin{ ||x - y|| : y ∈ C }.

The following lemma gives some well-known properties of the projection operator.
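The definitions above can be probed numerically. In the sketch below, the box C = [l, u]^n (for which P_C has the closed form of a componentwise clip) and the monotone operator h(x) = x + sin(x) are our own illustrative choices, not examples from the paper:

```python
import numpy as np

# Numerical probe of Definitions 2.1-2.2 for a box constraint C = [l, u]^n,
# whose projection has the closed form P_C(x) = min(max(x, l), u).

def project_box(x, l=-1.0, u=1.0):
    return np.clip(x, l, u)

# h(x) = x + sin(x) is monotone: each component has derivative
# 1 + cos(x_i) >= 0, so (h(x) - h(y))^T (x - y) >= 0.
h = lambda x: x + np.sin(x)

rng = np.random.default_rng(0)
for _ in range(100):
    x, y = rng.normal(size=3), rng.normal(size=3)
    assert (h(x) - h(y)) @ (x - y) >= -1e-12      # monotonicity
    # nonexpansiveness of the projection operator:
    assert (np.linalg.norm(project_box(x) - project_box(y))
            <= np.linalg.norm(x - y) + 1e-12)
```

This operator is also 2-Lipschitz, so it satisfies both requirements of Definition 2.1.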

Lemma 2.3
Let C ⊂ R^n be a nonempty closed and convex set. Then the following statements hold:
(i) ||P_C(x) - P_C(y)|| ≤ ||x - y|| for all x, y ∈ R^n;
(ii) (x - P_C(x))^T (y - P_C(x)) ≤ 0 for all x ∈ R^n and y ∈ C;
(iii) ||P_C(x) - y||² ≤ ||x - y||² - ||P_C(x) - x||² for all x ∈ R^n and y ∈ C.

Lemma 2.4 ([51]) Let R^n be a Euclidean space. Then, for all x, y ∈ R^n,

||x + y||² ≤ ||x||² + 2 y^T (x + y).

Lemma 2.5 Let {x_k} and {z_k} be nonnegative real sequences satisfying x_{k+1} ≤ x_k + z_k. If Σ_{k=1}^∞ z_k < ∞, then lim_{k→∞} x_k exists.

Proposed method
Based on the Liu and Feng [1] derivative-free iterative method for monotone nonlinear equations with convex constraints, we now present an inertial extrapolation algorithm for solving the system of nonlinear monotone equations (1). The corresponding algorithm, which we refer to as the inertial projected Dai-Yuan (IPDY) algorithm, tracks the solution by starting from an initial point x_0 and thereafter performing iterations of the form

z_k = w_k + α_k d_k,

where w_k is an inertial extrapolation point, α_k is a positive step size obtained by a line search procedure, and d_k is a search direction chosen so that a sufficient descent condition is fulfilled. Next, we give a precise statement of our method.
(S.3) Compute z_k = w_k + α_k d_k, where α_k = a r^i, with i being the smallest nonnegative integer such that the line search condition (10) holds.
Remark 3.1 For all k ≥ 0, it can be observed from equation (7) that

Throughout this paper, we make use of the following assumptions.
(A3) h is Lipschitz continuous on C.

Convergence result
In this section, convergence analysis of our algorithm is presented. We start by proving some lemmas followed by the proof of the main theorem.
Proof For k = 1, multiply both sides of (8) by h(w_0)^T; for k > 1, multiply both sides of (8) by h(w_k)^T. In either case the sufficient descent condition (12) follows.

Remark 4.2 From the definitions of y_{k-1} and t_{k-1}, together with (12), it follows that d_{k-1}^T y_{k-1} is always positive when the solution of (1) has not yet been attained, which means that the parameters ζ_k and β_k are well defined.

Lemma 4.3 The line search condition (10) is well defined. That is, for all k ≥ 1, there exists a nonnegative integer i satisfying (10).
Proof The proof of Lemma 4.3 can be obtained in the same way as in [1], with the difference that the sequence {x_k} is replaced by the inertial extrapolation sequence {w_k}.

Lemma 4.4 Suppose that h is a monotone and Lipschitz continuous mapping and that {w_k} and {z_k} are the sequences generated by Algorithm 1. Then inequality (13) holds.

Proof From the line search (10), if α_k ≠ a, then α_k r^{-1} does not satisfy the line search condition. This fact, in combination with the Lipschitz continuity assumption (A3) and the sufficient descent condition (12), yields the desired inequality (13).
Lemma 4.5 Suppose that the conditions of Assumption 1 hold. Then the sequence {x_k} is bounded and

Proof By the monotonicity of the mapping h, we have

By Lemma 2.3(iii), (16), and (17), it holds that, for any x* ∈ SOL(h, C),

From inequality (18), we can deduce that

From Remark 3.1, noting that Σ_{k=1}^∞ θ_k ||x_k - x_{k-1}|| < ∞, by Lemma 2.5 we deduce that the sequence {||x_k - x*||} is convergent and hence bounded by a positive number, say M_0. Therefore, for all k, we have ||x_k - x*|| ≤ M_0; thus we can infer that ||x_k - x_{k-1}|| ≤ ||x_k - x*|| + ||x* - x_{k-1}|| ≤ 2M_0. Using the aforementioned facts, we have

Combining (21) with (18), we have

Summing (23) over k = 1, 2, 3, . . . , we obtain

Remark 4.6 By the definition of {z_k} and (24), we have

Theorem 4.7 Suppose that the conditions of Assumption 1 hold. If {x_k} is the sequence generated by (11) in Algorithm 1, then

Furthermore, {x_k} converges to a solution of (1).
Proof We first prove that

Suppose that equality (27) does not hold. Then there exists a constant ε > 0 such that

This fact, in combination with the sufficient descent condition (12), implies that

This shows that

On the other hand, by the Lipschitz continuity assumption (A3) and (20), we have

By the Cauchy-Schwarz inequality, Remark 4.2, and (28), it follows from (8)-(9) that, for all k > 1,

Then we get from (13) that

which contradicts (29). Thus, (27) holds. Now, since we know that

by the continuity of h, we have that

From the continuity of h, the boundedness of {x_k}, and (32), it follows that the sequence {x_k} generated by Algorithm 1 has an accumulation point x* such that h(x*) = 0. On the other hand, the sequence {||x_k - x*||} is convergent by Lemma 2.5, which means that the whole sequence {x_k} converges to the solution x* of system (1).

Numerical experiments
In this section, an efficiency comparison between the proposed method, called IPDY, and the method proposed by Liu and Feng in [1], called PDY, is presented. Recall that IPDY is a modification of PDY obtained by introducing the inertial term. The metrics considered for the comparison are the number of iterations (NI) and the number of function evaluations (NF); the method with the smaller NI and NF is considered the better method. The following settings were used for the experimental comparison:
• Dimensions: 1000; 5000; 10,000; 50,000; 100,000.
• Parameters: For IPDY, we select θ = 0.8, a = 1, r = 0.7, σ = 0.01, c_0 = 1. For PDY, all parameters are selected as in [1].
• Termination criterion: ||h(w_k)|| ≤ 10^{-6}.
• Implementation software: Both methods are coded in MATLAB R2019b and run on a PC with an Intel Core i3 processor (2.30 GHz) and 8 GB of RAM.
The two methods were compared on the following test problems, where h = (h_1, h_2, . . . , h_n)^T.
The numerical results are given in Tables 2-11 in the Appendix. From these tables, it can be observed that the IPDY method requires fewer NI and NF than PDY on most of the problems, a consequence of the inertial effect possessed by IPDY. For all initial points used, the IPDY method was able to solve the test problems; however, for Problem 3 with randomly selected initial points, the IPDY method failed for dimensions 5000 and 10,000. Overall, to visualize the performance of IPDY versus PDY, we employ the well-known performance profiles of Dolan and Moré [59], defined as

ρ_q(τ) = (1/|T_P|) |{ p ∈ T_P : r_{p,q} ≤ τ }|,  r_{p,q} = t_{p,q} / min{ t_{p,q'} : q' ∈ Q },

where T_P is the test set, |T_P| is the number of problems in T_P, Q is the set of solvers, and t_{p,q} is the NI (or the NF) of solver q ∈ Q on problem p ∈ T_P. Figures 1 and 2 were obtained using these performance profiles. From Figs. 1 and 2, the IPDY method attains the smallest NI and NF on over 80% of the problems, as can be read off the y-axis of the plots. In conclusion, the purpose of introducing the inertial effect was achieved, as the IPDY method recorded the lowest numbers of iterations and function evaluations.
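The Dolan-Moré profile above amounts to a one-line computation per solver. The 5×2 cost matrix below is made-up illustrative data, not the NI/NF values of Tables 2-11:

```python
import numpy as np

# Sketch of the Dolan-More performance profile: for each solver q,
# rho_q(tau) is the fraction of problems on which q's cost t[p, q]
# (e.g. NI or NF) is within a factor tau of the best solver's cost.

def performance_profile(t, tau):
    # t: (num_problems, num_solvers) array of positive costs
    ratios = t / t.min(axis=1, keepdims=True)    # r_{p,q}
    return (ratios <= tau).mean(axis=0)          # rho_q(tau) per solver

# Made-up costs for 5 problems and 2 solvers (not from the paper).
t = np.array([[12.0, 15.0],
              [30.0, 28.0],
              [ 7.0, 14.0],
              [22.0, 22.0],
              [ 9.0, 40.0]])
rho = performance_profile(t, tau=1.0)   # fraction of "wins"; ties count
```

Evaluating ρ_q over a grid of τ values and plotting the resulting curves reproduces figures of the kind shown in Figs. 1 and 2.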

Conclusion
The paper has proposed an inertial derivative-free algorithm, called IPDY, for solving systems of monotone nonlinear equations with convex constraints in Euclidean space. Under some suitable conditions imposed on the parameters, we established the global convergence of the algorithm. In all our comparisons, the numerical results shown in Tables 2-11 and Figs. 1 and 2 demonstrate that our method converges faster and is more efficient than the PDY algorithm. In the future, we plan to study different variants of derivative-free methods with the inertial extrapolation step and apply them in various directions, such as image deblurring and signal processing problems.