New conjugacy condition with pair-conjugate gradient methods for unconstrained optimization

Abbas Y. Al-Bayati and Huda I. Ahmed

Conjugate gradient methods are widely used for unconstrained optimization, especially when the dimension is large. In this paper we propose a new kind of nonlinear conjugate gradient method based on the study of Dai and Liao (2001); the new idea is to combine the pair conjugate gradient method with that study (a new conjugacy condition), which accounts for an inexact line search scheme but reduces to the old condition if the line search is exact. A convergence analysis for this new method is provided. Our numerical results show that the new method is very efficient on the given ten test functions compared with other methods.

1. Introduction

We are concerned with the following unconstrained minimization problem:

    minimize f(x)    ...(1)

where f : R^n → R is smooth and its gradient g(x) = ∇f(x) exists. There are several kinds of numerical methods for solving (1), including, for example, the steepest descent method, the Newton method and quasi-Newton methods. Among them the conjugate gradient method is one choice for solving large-scale problems, because it does not need to store any matrices. Conjugate gradient methods are iterative methods of the form

    x_k = x_{k-1} + α_{k-1} d_{k-1}    ...(2)

    d_k = -g_k                    for k = 1,
    d_k = -g_k + β_k d_{k-1}      for k ≥ 2,    ...(3)

where g_k denotes ∇f(x_k) and β_k is a scalar. If f(x) is a strictly convex quadratic function

    f(x) = (1/2) x^T G x + b^T x + c    ...(4)

where G ∈ R^{n×n} is a symmetric positive definite matrix, and α_k is given by

    α_k = ||g_k||^2 / (d_k^T G d_k)    ...(5)

then the method (2)-(3) is called the linear conjugate gradient method, where ||.|| denotes the Euclidean norm.
The linear conjugate gradient method was originally proposed by Hestenes and Stiefel (1952) for solving the linear system of equations

    G x = b    ...(6)

Within the framework of linear conjugate gradient methods, the conjugacy condition for the search directions is defined by

    d_i^T G d_j = 0    for i ≠ j,    ...(7)

and this condition guarantees the finite termination of the linear conjugate gradient method. On the other hand, the method (2)-(3) is called a nonlinear conjugate gradient method for the general unconstrained optimization problem (general nonlinear function). The nonlinear conjugate gradient method was first proposed by Fletcher and Reeves (Fletcher and Reeves, 1964). Within the framework of nonlinear conjugate gradient methods, the conjugacy condition for the search directions is replaced by

    d_k^T y_{k-1} = 0    ...(8)

where

    y_{k-1} = g_k - g_{k-1}    ...(9)

because the relations

    d_k^T y_{k-1} = d_k^T (g_k - g_{k-1}) = d_k^T G (x_k - x_{k-1}) = α_{k-1} d_k^T G d_{k-1}

hold for the strictly convex quadratic objective function.
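As an illustration of the iteration (2)-(3) with the exact step (5), the following sketch minimizes f(x) = 0.5 x^T G x - b^T x, whose gradient is g(x) = G x - b, so the minimizer solves (6). The function name and tolerance are our own choices, not part of the paper:

```python
import numpy as np

def linear_cg(G, b, x0, tol=1e-12):
    """Linear conjugate gradient method (2)-(3) with the exact step (5),
    minimizing f(x) = 0.5 x^T G x - b^T x, i.e. solving G x = b as in (6)."""
    x = x0.copy()
    g = G @ x - b                  # gradient at the current point
    d = -g                         # d_1 = -g_1, as in (3)
    for _ in range(len(b)):        # finite termination in at most n steps
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (d @ (G @ d))   # step length (5)
        x = x + alpha * d
        g_new = G @ x - b
        beta = (g_new @ g_new) / (g @ g)  # the standard linear-CG beta
        d = -g_new + beta * d             # update (3) for k >= 2
        g = g_new
    return x
```

The loop runs at most n times, reflecting the finite-termination property guaranteed by the conjugacy condition (7).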


Multiplying (3) by y_{k-1} and using (8), we can deduce a formula for the scalar β_k:

    β_k^{HS} = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})    ...(10)

This is the so-called HS formula, which was given by Hestenes and Stiefel (1952). Other well-known formulae for β_k are those of Fletcher-Reeves (FR) (Fletcher, 1964) and Polak-Ribiere (PR) (Polak, 1969; Polyak, 1969); they are given by

    β_k^{FR} = ||g_k||^2 / ||g_{k-1}||^2,
    β_k^{PR} = g_k^T y_{k-1} / ||g_{k-1}||^2.

To establish the convergence of the methods mentioned above, it is usually required that the step α_k satisfy the following strong Wolfe conditions:

    f(x_k + α_k d_k) - f(x_k) ≤ δ α_k g_k^T d_k,
    |g(x_k + α_k d_k)^T d_k| ≤ -σ g_k^T d_k,

where 0 < δ < σ < 1. On the other hand, many numerical methods (e.g. the steepest descent method and quasi-Newton methods) for unconstrained optimization are proved to be convergent under the Wolfe conditions:

    f(x_k + α_k d_k) - f(x_k) ≤ δ α_k g_k^T d_k,
    g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k.

Thus it is an important issue to study the global convergence of conjugate gradient methods under the Wolfe conditions instead of the strong Wolfe conditions.
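The three β formulae and the Wolfe conditions translate directly into code. The sketch below uses our own helper names; the constants δ = 1e-4 and σ = 0.9 are conventional defaults, not values prescribed by the paper:

```python
import numpy as np

def beta_hs(g_new, g_old, d_old):
    """HS formula (10): beta_k = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})."""
    y = g_new - g_old
    return (g_new @ y) / (d_old @ y)

def beta_fr(g_new, g_old):
    """FR formula: beta_k = ||g_k||^2 / ||g_{k-1}||^2."""
    return (g_new @ g_new) / (g_old @ g_old)

def beta_pr(g_new, g_old):
    """PR formula: beta_k = g_k^T y_{k-1} / ||g_{k-1}||^2."""
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Standard Wolfe conditions: sufficient decrease plus curvature."""
    decrease = f(x + alpha * d) <= f(x) + delta * alpha * (grad(x) @ d)
    curvature = grad(x + alpha * d) @ d >= sigma * (grad(x) @ d)
    return decrease and curvature
```

On a strictly convex quadratic with exact line searches the three formulae coincide, since g_k is then orthogonal to both g_{k-1} and d_{k-1}.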

2. The Dai and Liao method
As stated in Section 1, the conjugacy condition for nonlinear conjugate gradient methods may be represented in the form

    d_k^T y_{k-1} = 0.    ...(17)

An extension of the conjugacy condition was studied by Perry and also by Shanno (Perry, 1978; Shanno, 1978). However, both the conjugacy conditions (7) and (17) depend on exact line searches. In practical computation, one normally carries out inexact line searches instead of exact line searches. When the line search is not exact, the conjugacy conditions (7) and (17) may have some disadvantages. For this reason the extension of the conjugacy condition was studied by Perry (1978); he tried to accelerate the conjugate gradient method by incorporating second-order information into it. Specifically, he used the quasi-Newton condition. Dai and Liao (2001) proposed the conjugacy condition

    d_k^T y_{k-1} = -t g_k^T s_{k-1},    t ≥ 0,

where s_{k-1} = x_k - x_{k-1}. This gives the Dai and Liao formula

    β_k = g_k^T (y_{k-1} - t s_{k-1}) / (d_{k-1}^T y_{k-1}),    ...(23)

and we note that the case t = 1 reduces to the Perry formula. Equation (23) can be written as

    β_k = β_k^{HS} - t g_k^T s_{k-1} / (d_{k-1}^T y_{k-1}),

from which we see that formula (23) with t > 0 really defines a class of nonlinear conjugate gradient methods. We call the method defined by (2)-(3) with β_k from (23) the DL method. The aim of Dai and Liao is to find the value of t that gives the best nonlinear conjugate gradient method. For any t > 0, denote d_k and d̄_k to be the search directions given by method (23) and the HS method, respectively, namely:

    d_k = -g_k + β_k d_{k-1},    d̄_k = -g_k + β_k^{HS} d_{k-1}.
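The Dai-Liao formula (23) is easy to state in code; the helper name below is our own:

```python
import numpy as np

def beta_dl(g_new, g_old, d_old, s_old, t):
    """Dai-Liao formula (23):
    beta_k = g_k^T (y_{k-1} - t s_{k-1}) / (d_{k-1}^T y_{k-1}),
    where s_{k-1} = x_k - x_{k-1} and y_{k-1} = g_k - g_{k-1}."""
    y = g_new - g_old
    return (g_new @ (y - t * s_old)) / (d_old @ y)
```

With t = 0 this is exactly the HS formula (10), and with t = 1 it is Perry's formula; the extra term vanishes under an exact line search, since then g_k^T s_{k-1} = 0.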

Lemma
Suppose that f is the quadratic function given in (4); then we have that: The proof of this Lemma is given in (Dai and Liao, 2001).
From the Lemma, Dai and Liao obtained the best value of t, which is defined by:

3. New nonlinear conjugate gradient method using pair directions
In this section we find the new value of t by using the pair of directions U and V; before that we give some definitions.

Definition
Vectors u_1, ..., u_n are said to be left conjugate direction vectors of G if u_i^T G u_j = 0 for i < j, and right conjugate direction vectors of G if u_i^T G u_j = 0 for i > j.

Remark
If G is symmetric and nonsingular, then we observe that the left conjugate direction vectors of G are also right conjugate direction vectors of G. In this case, we call these vectors conjugate direction vectors of G.

Definition
Let G, U and V be nonsingular n×n matrices. Then (U, V) is a G-conjugate pair if U^T G V is lower triangular (Wyk, 1977).
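This definition can be checked numerically. The sketch below is our own helper, taking the matrix product in the definition to be U^T G V:

```python
import numpy as np

def is_g_conjugate_pair(U, V, G, tol=1e-10):
    """Return True if (U, V) is a G-conjugate pair, i.e. if
    W = U^T G V is lower triangular (all entries strictly above
    the diagonal vanish)."""
    W = U.T @ G @ V
    return bool(np.allclose(np.triu(W, k=1), 0.0, atol=tol))
```

For example, with V = I and U chosen so that U^T G = L for a nonsingular lower triangular L (that is, U = G^{-T} L^T), the pair (U, V) is G-conjugate.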

New derivation for finding the value of t for the pair conjugate gradient method.
Suppose that f is given in (4). Then we have that (from the definition of the semi-conjugate directions) (46) becomes:

is a scalar. In practice we take

4. The algorithm of the new pair conjugate gradient method
We list below the outline of the new method. For an initial point x_0:
Step (1): Set k = 1.
Step (2): Set the new point as in (2), where α_k is a scalar chosen in such a way that the line search conditions are satisfied.
Step (4): If k ≥ 2, go to Step (5); else go to Step (8).

Generalized conjugate directions
We will now formulate the analogous generalized conjugate direction method for the minimization of the function f(x). Suppose that U and V form a conjugate pair. Set x_0 arbitrarily, and for i = 0, 1, ..., compute: We first prove that this algorithm will find the minimum of a quadratic function in n steps; then we show that if f is quadratic then the scalars in (35) are the same as the

Theorem
If the iteration (50) is applied to the quadratic function (4), where (U, V) form a G-conjugate pair, then the minimum is found in at most n iterations; moreover, x_n lies in the subspace generated by the search directions. Suppose the stated relation holds for some i and j = 0, 1, ..., i-1; then it holds for j = 0, 1, ..., i, due to the induction hypothesis and the conjugacy, for all j < i; for the case j = i, see (Wyk, 1977).
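The finite-termination property can be demonstrated numerically. The sketch below is our own reconstruction of a conjugate-pair iteration (the exact form of (50) is not shown in the text): it searches along the columns v_i of V with step lengths defined through the columns u_i of U, and the lower-triangular structure of U^T G V drives the residual to zero in n steps:

```python
import numpy as np

def conjugate_pair_minimize(G, b, U, V, x0):
    """Minimize f(x) = 0.5 x^T G x - b^T x along the columns of V.
    At step i the step length is chosen so that u_i^T r = 0 for the
    new residual r = b - G x; if (U, V) is a G-conjugate pair, the
    entries u_j^T G v_i vanish for j < i, so once u_j^T r is zeroed
    it stays zero, and r = 0 after n steps."""
    x = x0.copy()
    n = U.shape[1]
    for i in range(n):
        r = b - G @ x                    # negative gradient
        u, v = U[:, i], V[:, i]
        alpha = (u @ r) / (u @ (G @ v))  # zeroes u_i^T r
        x = x + alpha * v
    return x
```

Since U is nonsingular, U^T r_n = 0 forces r_n = 0, which is exactly the theorem's n-step termination claim.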

Numerical results
We tested the HS method (10), the Perry method, and the DL method (25) (Dai and Liao, 2001).
We have tested ten functions with different dimensions n = 100, 1000 and 10000. The numerical results are given in the form of NOF and NOI, where NOF denotes the number of function evaluations and NOI denotes the number of iterations. The stopping condition is a tolerance on the gradient norm ||g_k||. Comparing the new pair method (49) with the HS method, the Perry method and the DL method, we can say that the new method is more efficient than all of them, especially for the Powell function, the Wood function, the Helical function, the Powell3 function, the Edeger function and the Resip function among the ten test functions, as seen from Tables (7.1), (7.2) and (7.3).