Steepest descent method implementation on unconstrained optimization problems using a C++ program

Steepest descent is known as the simplest gradient method. Recently, much research has been done to obtain an appropriate step size that progressively reduces the objective function value. In this paper, the properties of the steepest descent method are reviewed from the literature, together with the advantages and disadvantages of each step size procedure. The development of the steepest descent method through its step size procedures is discussed. To test the performance of each step size, we run a steepest descent procedure implemented as a C++ program. We apply it to unconstrained optimization test problems with two variables and then compare the numerical results of each step size procedure. Based on the numerical experiments, we summarize the general computational features and weaknesses of each procedure on each class of problem.


Introduction
Consider the following nonlinear unconstrained minimization problem: find $x^* \in \mathbb{R}^n$ such that $f(x^*) = \min_{x \in \mathbb{R}^n} f(x)$, where $f : \mathbb{R}^n \to \mathbb{R}$. The steepest descent method is the simplest gradient method for solving unconstrained optimization problems; it was devised by Cauchy in 1847. This gradient method searches along the negative gradient direction and ensures a reduction of the objective function as long as the current iterate is not a stationary point ([1]). Much research has been devoted to finding better and more appropriate step size procedures, which significantly affect how well the objective function is minimized. In this paper we review the properties of the steepest descent method and its development, and we compare the numerical results of steepest descent step sizes from the literature on solving global optimization problems. We run a steepest descent routine written in C++ to test the performance of each step size procedure and then compare the numerical results of the program executions. Based on the numerical experiments, the features as well as the weaknesses of each step size procedure on each problem are summarized.

Steepest Descent Method
The steepest descent method is a simple minimizing gradient method for solving nonlinear problems, since it is based on the linear approximation of the Taylor series,
$$f(x + \delta) \approx f(x) + \nabla f(x)^T \delta.$$
The term $\nabla f(x)^T \delta$ is the directional derivative of $f$ at $x$. The step $\delta$ is a descent direction if the directional derivative is negative, which guarantees that the function $f$ can be reduced along this direction. Furthermore, the step $\delta$ needs to be chosen to make the linear directional derivative as negative as possible, which gives the maximum reduction of $f$, without consuming too much time making the choice.
Assume that a function $f(x)$ is continuous in the neighborhood of a point $x$, that $d = -g$ (with $g = \nabla f(x)$) is the steepest descent direction at $x$, and that the change $\delta$ in $x$ is given by $\delta = \alpha d$. The small positive constant $\alpha$ is called the step size, and its selection affects how much the value of $f(x)$ is reduced. By solving the one-dimensional problem $\min_{\alpha > 0} f(x + \alpha d)$, the maximum reduction in $f(x)$ can be obtained. The steepest descent iteration is performed by the formula
$$x_{k+1} = x_k + \alpha_k d_k = x_k - \alpha_k g_k. \qquad (1)$$
Starting with an initial point $x_0$, a direction $d_0 = -g_0$, and a step size $\alpha_0$ that minimizes $f(x_0 + \alpha d_0)$, the next point $x_1$ can be determined. Iteration (1) is repeated until convergence is achieved or the value $\|\alpha_k d_k\|$ is sufficiently small. The algorithm of steepest descent is presented as follows.
ALGORITHM STEEPEST DESCENT
given: an initial point $x_0$ and a tolerance $\varepsilon > 0$; set $k = 0$.
repeat:
  compute $g_k = \nabla f(x_k)$ and set $d_k = -g_k$;
  choose a step size $\alpha_k$ (by one of the procedures discussed below);
  set $x_{k+1} = x_k + \alpha_k d_k$ and $k = k + 1$;
until convergence or a stopping criterion is satisfied.

Steepest Descent Step Size
The step size in the steepest descent method plays an important role in minimizing the objective function. The step size at each iteration, $\alpha_k$, must be chosen effectively to give a significant reduction in the function value, while the time spent choosing it should also be kept small.
The earliest step size procedure for steepest descent is the step size obtained by exact line search. However, using the steepest descent direction with an exact line search step size leads to zigzag behavior, which makes convergence very slow. Akaike ([2]) analyzed this search procedure and showed that convergence is linear and badly affected by ill-conditioning, by proving that the two asymptotic directions lie in a two-dimensional subspace spanned by two eigenvectors of the Hessian matrix. Greenstadt ([3]) studied the efficiency of the steepest descent method and gave a bound on the ratio between the reductions obtained by the Cauchy step and by Newton's method, and the rate of convergence of steepest descent was proved by Forsythe ([4], [13]) and others. The steepest descent method under these conditions is always convergent in theory, and it will not terminate unless a stationary point is found. However, step sizes computed by inexact line search procedures also exhibit the same bad zigzag behavior as the exact line search procedure.
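One common inexact line search rule in this literature is the sufficient decrease (Armijo) condition. The following C++ sketch illustrates backtracking until that condition holds; the constants `c` and `shrink` are typical textbook values, not ones taken from the paper.

```cpp
#include <cmath>
#include <functional>

// Backtracking line search satisfying the Armijo (sufficient decrease) condition:
//   f(x + alpha*d) <= f(x) + c * alpha * (g^T d),  with d = -g here.
// The objective f is passed in so the sketch stays generic.
double armijoStep(const std::function<double(double, double)>& f,
                  double x, double y,          // current point
                  double gx, double gy,        // gradient at (x, y)
                  double alpha0 = 1.0, double c = 1e-4, double shrink = 0.5) {
    double fx = f(x, y);
    double slope = -(gx * gx + gy * gy);   // g^T d with d = -g, always <= 0
    double alpha = alpha0;
    while (f(x - alpha * gx, y - alpha * gy) > fx + c * alpha * slope) {
        alpha *= shrink;                   // halve the trial step until decrease holds
    }
    return alpha;
}
```

Because only a sufficient decrease is required, each step is cheap compared with an exact one-dimensional minimization, though the resulting iterates can still zigzag as described above.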
Considering the weaknesses of the existing step sizes, Barzilai and Borwein ([10]) proposed a step size along the negative gradient direction derived from a two-point approximation to the secant equation of quasi-Newton methods, the so-called BB method. For the two-dimensional case, Barzilai and Borwein established an R-superlinear convergence result for the method, and their analysis indicated that the convergence rate is faster. For the general $n$-dimensional strictly convex quadratic function, Raydan ([14]) proved that the two-point step size gradient method is globally convergent, and the convergence rate is R-linear (Dai and Liao ([15])). For the non-quadratic case, Raydan ([14]) incorporated a globalization scheme into the two-point step size gradient method using the traditional nonmonotone line search technique of Grippo et al. ([16]). The resulting algorithm is competitive with, and sometimes preferable to, several well-known conjugate gradient algorithms for large-scale unconstrained optimization. Owing to its simplicity and numerical efficiency, the two-point step size gradient method has received much study, and it has been successfully applied to obtain local minimizers of large-scale real problems ([17]). The algorithm of Raydan ([14]) was further generalized by Birgin et al. ([18]) for the minimization of differentiable functions on closed convex sets, yielding an efficient projected gradient method. Efficient projected algorithms based on BB-like methods have also been designed (Serafini et al. ([19]) and Dai and Fletcher ([20])) for the special quadratic programs arising from training support vector machines, which have a single linear constraint in addition to box constraints. The BB method has also received much attention for finding sparse approximate solutions to large underdetermined linear systems of equations in signal/image processing and statistics, for example in Wright et al. ([21]).
However, Fletcher ([22]) showed that for some non-quadratic problems this method may be very slow.
There are many existing step size modifications; some are easy to apply, while others require complicated algorithms. The following are some step size procedures that are simple to apply to the steepest descent method. The first is the step size of Cauchy (1847), computed by exact line search (the C step size); the inexact procedures instead use conditions to terminate the line search.
These two-point step sizes come from a non-line-search procedure named after its inventors, the Barzilai and Borwein formulas:
$$\alpha_k^{BB1} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}, \qquad \alpha_k^{BB2} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}},$$
where $s_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = g_k - g_{k-1}$. These formulas reduce the values of the objective function and gradient and improve the convergence from linear to R-superlinear.
This step size is the so-called elimination line search (EL step size), which estimates the step size without computing the Hessian.

Numerical Experiment
In this section, several step size procedures for the steepest descent method, shown in Table 1, are tested computationally. The program is written in Visual C++ 6.0 and applied to 14 test functions of global optimization given in the following.
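The paper's own list of 14 test functions is not reproduced in this excerpt; as an illustration of the kind of two-variable global optimization functions typically used in such experiments, two classic examples in C++ (these specific functions are an assumption, not confirmed members of the paper's suite):

```cpp
#include <cmath>

// Rosenbrock function: global minimum f = 0 at (1, 1); a narrow curved valley
// that makes gradient methods zigzag.
double rosenbrock(double x, double y) {
    return 100.0 * (y - x * x) * (y - x * x) + (1.0 - x) * (1.0 - x);
}

// Himmelblau function: four global minima with f = 0, e.g. at (3, 2);
// useful for observing convergence to different minimizers from different starts.
double himmelblau(double x, double y) {
    double a = x * x + y - 11.0;
    double b = x + y * y - 7.0;
    return a * a + b * b;
}
```

Functions with several minimizers, like the second one, are what make the "converged to another local minimizer" outcome in the results below possible.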

Table 3. Summary of Numerical Results (Step Size; ns = number of successes; nl = number of runs converging to another local minimizer; nf = number of failures)

From the numerical results obtained, the A method and the BB1 method are the most successful of the methods for the given two-dimensional global optimization cases. The A method never failed in solving the test problems, but 4 times it converged to another local minimizer. The C method obtained the minimizer successfully the least often, with one failure. The B, BB2, and EL methods reached the same level of success in obtaining the minimizer: the B method without any failure, while BB2 and EL each had one failure. From this numerical experiment we can conclude that A is better than the others, based on the number of minimizers obtained for the given test examples and initial conditions.

Conclusion
The advantages and disadvantages of the steepest descent method are as follows. The method is sensitive to the initial point; it has the descent property; and it is a logical starting procedure for all gradient-based methods. Since successive search directions are orthogonal under exact line search, $x_k$ approaches the minimizer rather slowly, in fact in a zigzag way. It is not economical to carry out line searches thoroughly; all that is necessary is to obtain a reduction in the function value at successive iterates. According to the numerical experiments on the given test functions, we conclude that, for the given two-dimensional unconstrained global optimization problems, the A and BB1 methods are the better methods among those tested, and the C method is good enough, for the given test examples and initial conditions.