Stability analysis of Jacobian-free Newton’s iterative method

It is well known that scalar iterative methods with derivatives are considerably more stable than their derivative-free partners, understanding the term stability as a measure of the wideness of the set of converging initial estimations. In the multivariate case, multidimensional dynamical analysis allows us to tackle this task, and it is carried out here on different Jacobian-free variants of Newton's method, whose estimations of the Jacobian matrix have increasing order. The respective basins of attraction and the number of fixed and critical points give us valuable information in this sense.


Introduction
Let F(x) = 0, F : D ⊆ R^n → R^n, be a system of nonlinear equations. Usually, this kind of problem cannot be solved analytically, and the approach to a solution is made by means of iterative techniques. The best-known iterative algorithm is Newton's method, with second order of convergence and iterative expression

x^(k+1) = x^(k) - [F'(x^(k))]^(-1) F(x^(k)), k = 0, 1, . . ., (1)

from an estimation x^(0) ∈ D. This iterative scheme needs the evaluation of the nonlinear function and its associated Jacobian matrix at each iteration. However, sometimes the size of the system or the specific properties of the problem do not allow the evaluation of the Jacobian matrix F'(x), or even its calculation, at each iteration (for example, if F is an error function); in these cases, some approximations of the Jacobian matrix can be used. The most usual one is a divided difference matrix, that is, a linear operator [x^(k), y^(k); F] satisfying the condition (see [1,2])

[x^(k), y^(k); F] (x^(k) - y^(k)) = F(x^(k)) - F(y^(k)). (2)

In this case, when F'(x^(k)) in Newton's method is replaced by [x^(k), y^(k); F], where y^(k) = x^(k) + F(x^(k)), we obtain the so-called Steffensen's method [1], also with order of convergence two. To compute in practice the elements of the divided difference operator, the following first-order divided difference operator,

[x^(k), y^(k); F]^1_{i,j} = (F_i(y_1, ..., y_{j-1}, y_j, x_{j+1}, ..., x_n) - F_i(y_1, ..., y_{j-1}, x_j, x_{j+1}, ..., x_n)) / (y_j - x_j), (3)

or the symmetric second-order one,

[x^(k), y^(k); F]^2_{i,j} = (F_i(y_1, ..., y_{j-1}, y_j, x_{j+1}, ..., x_n) - F_i(y_1, ..., y_{j-1}, x_j, x_{j+1}, ..., x_n) + F_i(x_1, ..., x_{j-1}, y_j, y_{j+1}, ..., y_n) - F_i(x_1, ..., x_{j-1}, x_j, y_{j+1}, ..., y_n)) / (2(y_j - x_j)), (4)

are proposed in [3]. Let us remark that operator (4) is symmetric and it can be used to evaluate the divided difference even when the problem is nonsymmetric. However, the number of evaluations of the scalar functions in the computation of (4) is higher than in (3).
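In practice, a Jacobian-free Newton step solves a linear system whose matrix is a divided difference operator. The following sketch, a minimal illustration rather than a reference implementation, builds the first-order operator (3) column by column and uses it in a Steffensen-type iteration with y^(k) = x^(k) + F(x^(k)):

```python
import numpy as np

def forward_divided_difference(F, x, y):
    """First-order divided difference matrix [x, y; F] of (3):
    column j mixes the first j coordinates of y with the last
    n - j coordinates of x, divided by y_j - x_j."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        upper = np.concatenate((y[:j + 1], x[j + 1:]))  # (y_1..y_j, x_{j+1}..x_n)
        lower = np.concatenate((y[:j], x[j:]))          # (y_1..y_{j-1}, x_j..x_n)
        J[:, j] = (F(upper) - F(lower)) / (y[j] - x[j])
    return J

def steffensen(F, x0, tol=1e-10, max_iter=100):
    """Steffensen's method: Newton's step with F'(x) replaced by
    [x, y; F], where y = x + F(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        A = forward_divided_difference(F, x, x + Fx)
        x = x - np.linalg.solve(A, Fx)
    return x
```

For an uncoupled system such as F(x) = (x_1^2 - 1, x_2^2 - 1), the matrix produced by (3) is diagonal with entries (y_j^2 - x_j^2)/(y_j - x_j) = x_j + y_j, a first-order approximation of the exact derivative 2x_j.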
Moreover, when the divided difference (3) is used as an approximation of the Jacobian matrices appearing in an iterative method, the iterative procedure usually does not preserve its order of convergence.
The authors in [4] proposed to replace y = x + F(x) in (2) by y = x + Γ(x), where Γ(x) = (f_1(x)^m, f_2(x)^m, ..., f_n(x)^m)^T, with m ∈ N. Then, the divided difference becomes an approximation of F'(x) of order m. It was also shown that, by choosing a suitable value of m, the order of convergence of any iterative method can be preserved with a reduced computational cost. So, the Jacobian-free variants of Newton's scheme that we analyze hold the second order of convergence of the original method.
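As an illustration of this Γ-based replacement, one iteration of the resulting scheme can be sketched as follows (a self-contained sketch, assuming the forward operator (3) is evaluated at the pair (x, x + Γ(x)); m = 1 recovers Steffensen's method):

```python
import numpy as np

def divided_difference(F, x, y):
    """First-order divided difference [x, y; F] of (3)."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        upper = np.concatenate((y[:j + 1], x[j + 1:]))
        lower = np.concatenate((y[:j], x[j:]))
        J[:, j] = (F(upper) - F(lower)) / (y[j] - x[j])
    return J

def fmn(F, x0, m, tol=1e-8, max_iter=50):
    """Newton's method with F'(x) replaced by [x, x + Gamma(x); F],
    where Gamma_i(x) = f_i(x)^m.  The tolerance check guards the
    division by y_j - x_j = f_j(x)^m, which vanishes at a root."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        y = x + Fx**m            # y = x + Gamma(x)
        x = x - np.linalg.solve(divided_difference(F, x, y), Fx)
    return x
```

Near a root, Γ(x) is of size ||F(x)||^m, so larger m gives a closer local approximation of the Jacobian; this is what allows the order of the original method to be preserved.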
Our aim is to see if, further on the order of convergence of the method, the use of different divided differences to replace the Jacobian matrix in the iterative expression of Newton's method, can affect the dependence on initial estimations of the modified scheme to converge.
By using a Taylor expansion of the divided difference operator (5), the authors in [4] proved the following results.
In particular, the order of the central divided difference (6) is 2m as an approximation of F'(x). Based on these results, a new technique was presented in [4] to transform iterative schemes for solving nonlinear systems into Jacobian-free ones, preserving the order of convergence in all cases. The key fact of this new approach is the mth power of the coordinate functions of F(x), which needs different values depending on the order of convergence of the first step of the iterative method. This general procedure was checked, both theoretically and numerically, showing the preservation of the order of convergence and very precise results when the appropriate values of m were employed.
The authors in [5] also showed that the order of the approximation of F'(x) might be improved (in terms of efficiency) by means of Richardson extrapolation, as can be seen in the following result.
The operator (7), which is obtained by Richardson extrapolation of (6), is an approximation of order 4m of F'(x).
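The effect of the extrapolation is easy to check in the scalar case. The sketch below is a generic illustration with a plain step h (not the operator (7) itself): it combines two central differences so that the O(h^2) error term cancels, raising the order from two to four.

```python
import numpy as np

def central(f, x, h):
    # second-order central difference approximation of f'(x):
    # error = (h^2/6) f'''(x) + O(h^4)
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson(f, x, h):
    # Richardson extrapolation of the central difference:
    # the combination (4 D(h/2) - D(h)) / 3 cancels the O(h^2)
    # term, leaving an O(h^4) error
    return (4.0 * central(f, x, h / 2.0) - central(f, x, h)) / 3.0
```

Halving h divides the central-difference error by about 4 and the extrapolated error by about 16, which is the numerical signature of orders two and four, respectively.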
Although the design and convergence analysis of iterative methods for solving nonlinear problems has been a fruitful area of research in recent decades, only recently has the study of their stability become usual (see, for example, [6-10]). So, when a method is presented, not only its order of convergence and efficiency are important, but also its dependence on the initial estimations used to converge. This is known as the stability analysis of the iterative scheme.
The study of the stability of an iterative procedure has mostly been carried out by using techniques from complex discrete dynamics, which are very useful in the scalar case. Nevertheless, this frequently does not provide enough information when systems of nonlinear equations must be solved. This is the reason why the authors in [11] applied, for the first time, real multidimensional discrete dynamics in order to analyze the performance of vectorial iterative methods on polynomial systems. In this way, it was possible to draw conclusions about their stability properties: their dependence on the initial estimation used, and the simplicity or complexity of the sets of convergent initial guesses (known as the Fatou set) and their boundaries (the Julia set). These procedures have been employed in recent years to analyze new and existing vectorial iterative schemes; see for instance [5,12-14]. In this paper, we apply these techniques to the vectorial rational functions obtained by applying different Jacobian-free variants of Newton's method to several low-degree polynomial systems. These vectorial rational functions are also called multidimensional fixed point functions in the literature.
The polynomial systems used in this study are defined by the nonlinear functions q(x), r(x) and p(x). By using uncoupled systems, such as q(x) = 0 and r(x) = 0, and a coupled one, p(x) = 0, we can generalize the performance of the proposed methods to other nonlinear systems. Moreover, let us remark that these results can be obtained by using similar systems of size n > 2, but we analyze the case n = 2 in order to use two-dimensional plots to visualize the analytical findings.
In the next section, the dynamical behavior of the fixed point functions of Jacobian-free versions of Newton's method applied on q(x), r(x) and p(x) is studied when forward, central and Richardson extrapolation-type divided differences are used. To this end, some dynamical concepts must be introduced.

Definition 1. (See [11]) Let G : R^n → R^n be a vectorial function. The orbit of the point x^(0) ∈ R^n is defined as the set of successive images of x^(0) by the vectorial function, {x^(0), G(x^(0)), ..., G^m(x^(0)), ...}.

The dynamical behavior of the orbit of a point of R^n can be classified depending on its asymptotic behavior. In this way, a point x* ∈ R^n is a fixed point of G if G(x*) = x*; a fixed point is attracting if all the eigenvalues λ_j of G'(x*) satisfy |λ_j| < 1, repelling if all of them satisfy |λ_j| > 1, and superattracting if all of them are null. The following results are well known in discrete dynamics, and in this paper we use them to study the stability of nonlinear operators.

Also, a fixed point is called hyperbolic if |λ_j| ≠ 1 for all eigenvalues λ_j of G'(x*). If there exist an eigenvalue λ_j such that |λ_j| < 1 and an eigenvalue λ_i such that |λ_i| > 1, the hyperbolic point is called a saddle point.
Let us note that the entries of G'(x*) are the partial derivatives of each coordinate function of the vectorial rational operator that defines the iterative scheme. When the calculation of the spectrum of G'(x*) is difficult, the following result, which is consistent with the previous theorem, can be used.

Proposition 1. (See [11]) Let x* be a fixed point of G. If |∂G_i(x*)/∂x_j| > 1/n for all i, j ∈ {1, ..., n}, then x* is unstable and lies in the Julia set.
In this paper, we only use Theorem 2 to investigate the stability of the fixed points. Let us consider an iterative method for finding the roots of a nonlinear system F(x) = 0. This generates a multidimensional fixed point operator G(x). A fixed point x* of G(x) is called a strange fixed point if it is not a root of the nonlinear function F(x). The basin of attraction of x* (which may be a root of F(x) or a strange fixed point) is the set of pre-images of any order,

A(x*) = {x^(0) ∈ R^n : G^m(x^(0)) → x*, m → ∞}.

The critical points, that is, the points x where det(G'(x)) = 0, play an important role in this study, since a classical result of Julia and Fatou establishes that, in the connected component of any basin of attraction including an attracting fixed point, there is always at least one critical point. Obviously, a superattracting fixed point is also a critical point. A critical point that is not a root of function F(x) is called a free critical point.
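The fact that a superattracting fixed point is also a critical point can be checked numerically on a concrete fixed point operator. The sketch below assumes, purely for illustration, Newton's operator G(x) = (x + 1/x)/2 applied componentwise, which corresponds to the uncoupled quadratic model q_j(x) = x_j^2 - 1 with roots (±1, ±1); it estimates G'(x) by finite differences:

```python
import numpy as np

def numeric_jacobian(G, x, h=1e-6):
    """Central finite-difference estimate of G'(x), column by column."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (G(x + e) - G(x - e)) / (2.0 * h)
    return J

# Newton's fixed point operator for the assumed uncoupled quadratic
# model q_j(x) = x_j^2 - 1 (componentwise Babylonian iteration)
G = lambda x: 0.5 * (x + 1.0 / x)
```

At the root (1, 1) every entry of G'(x*) vanishes, so the point is superattracting and, in particular, det(G'(x*)) = 0: the root is also a critical point.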
The motivation of this work is to analyze the stability of the Jacobian-free variants of Newton's method for the simplest nonlinear equations. Certainly, it is known that, in general, divided differences are less stable than Jacobian matrices, but we study how increasing the precision of the estimation of the Jacobian matrix affects the stability of the methods and the wideness of the basins of attraction of the roots.

Jacobian-Free Variants of Newton's Method
In this section, we study the dynamical properties of Jacobian-free Newton's method when different divided differences are used. For this purpose, we analyze the dynamical concepts on the polynomial systems q(x) = 0, r(x) = 0 and p(x) = 0. The dynamical concepts on two-dimensional systems can be extended to the n-dimensional case (see [11] for how the dynamics of a multidimensional iterative method can be analyzed), so, in order to visualize the analytical results graphically, we investigate the two-dimensional case. From now on, the modified Newton's scheme that results from replacing the Jacobian matrix in Newton's method by the forward divided difference (5) is denoted by FMN_m, for m = 1, 2, . . .. In a similar way, when the central divided difference (6) is used to replace the Jacobian matrix in Newton's procedure, the resulting modified schemes are denoted by CMN_m, for m = 1, 2, . . .. Also, the modified Newton's method obtained by using divided difference (7) is denoted by RMN_m, for m = 1, 2, . . ..
Let us remark that Newton's method has quadratic convergence and, by using the mentioned approximations of the Jacobian matrix, this order is preserved, even in the case m = 1 (where the scheme is known as Steffensen's method).
We use the proposed families FMN_m, CMN_m and RMN_m on the polynomial systems q(x) = 0, r(x) = 0 and p(x) = 0. In the following sections, the coordinate functions of the different classes of iterative methods, together with their fixed and critical points, are summarized.
provided that k and l are not equal to 1 and −1 simultaneously.

Remark 1.
Except for m = 1, which is a case with 12 free critical points, for m > 1 there exist 32 free critical points of the fixed point operator associated with FMN_m, provided that k and l are not equal to 1 and −1 simultaneously. Figure 1 shows the dynamical behavior of the fixed point function λ_{1,m}(x) for m = 1, 2, . . . , 6. These figures have been obtained by the routines described in [16]. To draw them, a mesh of 400 × 400 points has been used, 200 was the maximum number of iterations involved, and 10^(-3) the tolerance used as stopping criterion. In this paper, we have used a white star to show the roots of the nonlinear polynomial system and a white square for the free critical points. Figure 1 shows that, the greater m is, the more the wideness of the basins of attraction decreases, in spite of having a better approximation of the Jacobian matrix. The color assigned to each basin of attraction corresponds to one of the roots of the nonlinear system. The black area shows no convergence within the maximum number of iterations, or divergence.
To see the behavior of the vectorial function in the black area of the dynamical planes, we visualize the orbit of the rational function corresponding to the starting point (−3, −3) after 200 iterations. This orbit appears as a yellow circle for each iterate and yellow lines between each pair of consecutive iterates. In Figure 1a, corresponding to m = 1, its value at the 200th iteration is (−213.0141, −213.0141). Figure 1b, which corresponds to m = 2, shows a lower rate of divergence (or convergence to infinity), its value at the 200th iteration being (−8.6818, −8.6818). This effect is stronger as m increases but, for m ≥ 3, it is observed that the orbit no longer diverges (see the top of Figure 1c).
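A minimal version of the routine behind these dynamical planes can be sketched as follows (an illustrative reimplementation with the parameters quoted above as defaults, not the actual code of [16]): each mesh point is iterated until its orbit falls within the tolerance of a root, and points whose orbits never do are left marked as non-convergent, forming the black area.

```python
import numpy as np

def basin_data(G, roots, xmin, xmax, npts=400, max_iter=200, tol=1e-3):
    """Classify each point of an npts x npts mesh over [xmin, xmax]^2 by
    the root of the system its orbit under G approaches (index into
    `roots`), or -1 when no root is reached within max_iter iterations."""
    xs = np.linspace(xmin, xmax, npts)
    basin = -np.ones((npts, npts), dtype=int)
    for i in range(npts):
        for j in range(npts):
            x = np.array([xs[i], xs[j]], dtype=float)
            for _ in range(max_iter):
                x = G(x)
                d = [np.linalg.norm(x - r) for r in roots]
                k = int(np.argmin(d))
                if d[k] < tol:
                    basin[i, j] = k
                    break
            # basin[i, j] stays -1 on divergence or slow convergence
    return xs, basin
```

Coloring `basin` by root index (and black for -1) reproduces the kind of pictures shown in the figures; the usage below assumes the Newton operator of the uncoupled quadratic model q_j(x) = x_j^2 - 1.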
Since q(x) is a second-degree polynomial system, the approximations of the Jacobian matrix of order two or higher are exact, and the fixed point functions of the CMN_m and RMN_m methods coincide with that of Newton's method. In this case, (−1, −1), (−1, 1), (1, −1) and (1, 1) are superattracting fixed points and there are no strange fixed points or free critical points. Figure 2 shows the resulting dynamical plane, which coincides with that of Newton's method.
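This exactness is easy to verify numerically. The sketch below assumes that the central operator underlying CMN_m takes the form [x − Γ(x), x + Γ(x); F] with Γ_i(x) = f_i(x)^m (our reading of (6)), and checks that, for the uncoupled quadratic model q_j(x) = x_j^2 - 1, it reproduces the exact Jacobian diag(2x_j):

```python
import numpy as np

def central_divided_difference(F, x, m):
    """[u, v; F] with u = x - Gamma(x), v = x + Gamma(x),
    Gamma_i(x) = f_i(x)^m, built column by column with the
    first-order formula (3) applied to the pair (u, v)."""
    g = F(x)**m
    u, v = x - g, x + g
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        upper = np.concatenate((v[:j + 1], u[j + 1:]))
        lower = np.concatenate((v[:j], u[j:]))
        J[:, j] = (F(upper) - F(lower)) / (v[j] - u[j])
    return J
```

The leading error term of a centered difference involves third derivatives, which vanish for a second-degree F; hence the operator equals F'(x) exactly and the CMN_m iterate coincides with Newton's.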

Sixth-Degree Polynomial System r(x) = 0
The corresponding results about the FMN_m, CMN_m and RMN_m classes applied on the sixth-degree polynomial system r(x) = 0 are summarized in the following propositions. They collect, in each case, the iteration function and the fixed and critical points, provided that k and l are not equal to −1 and 1 simultaneously. Figure 3 shows the dynamical planes of the fixed point function h_{1,m}(x) for m = 1, 2, . . . , 6. Inside the black areas, there are regions where the orbits tend toward the poles of the rational function (those points that make the denominator of the elements of the rational function null). In fact, the orbits of points in these areas get very close to the roots, so the vectorial rational function at these points becomes numerically singular. Figure 3a,b show two points in these black areas.
where k and l are not simultaneously equal to 1 and −1.
Central divided differences show very stable behavior. All the free critical points are inside the basins of attraction of the roots. This means that no performance other than convergence to the roots (or divergence) is possible. In fact, the basins of attraction of the roots are in this case much greater than those of the forward divided difference. Black areas near the boundaries of the basins of attraction are regions of slow convergence or divergence. Figure 6a,b show the slow convergence of the point (−1.5, 1.5) towards (−1, 1); let us observe that, as for m = 1 the point (−1.5, 1.5) is closer to the boundary, its speed is higher than that for m = 2. The behavior of the function h_{2,m}(x) in the black area among the basins of attraction and near the x and y axes is similar for m = 1, 2, . . .. In fact, by choosing points in this area, the iterative function is slowly divergent (see Figure 6c,d). Let us remark that, although this scheme is quite stable, the basins of attraction of the roots are much smaller than those of Newton's method on r(x), where there are no black regions.
Finally, the following result gives us information about the stability on the sixth-degree system when higher-order estimations of the Jacobian matrix are made.
for j = 1, 2 and m = 1, 2, . . ., provided that −4x_j^5 + x_j r_j(x)^(4m) ≠ 0 and k and l are not simultaneously equal to 1 and −1. In this case, we obtain approximations of order 4, 8, 12, . . . for m = 1, 2, 3, . . ., respectively; nevertheless, their basins of attraction are smaller than those of the central divided difference (and, of course, of Newton's method). Figure 7 shows the dynamical planes for m = 1, 2, 3, 4. As in the previous cases, in the black area the convergence is very slow (or the method diverges).

Second-Degree Coupled Polynomial System p(x) = 0

The results about the FMN_m, CMN_m and RMN_m classes applied on the second-degree polynomial system p(x) = 0 are stated in the following propositions. They collect, in each case, the iteration function and the fixed and critical points. The calculation of the free critical points of k_{1,m}(x) for m > 1 is very complicated; hence, it has been provided only for m = 1. Figure 8 shows the dynamical planes of the fixed point function k_{1,m}(x) for m = 1, 2, 3, 4. Results similar to those for the polynomial systems q(x) = 0 and r(x) = 0 have been obtained in this case: by increasing m, the wideness of the basins of attraction decreases. To study the behavior of the vectorial function in the black area of the dynamical planes, we visualize the diverging orbits of some starting points. The following proposition shows that, by using CMN_m, for m = 1, 2, . . ., and RMN_m, for m = 1, 2, . . ., on the polynomial system p(x) = 0, we obtain, as was the case for the system q(x) = 0, the same fixed point function as the classical Newton's method.

Proposition 8. The coordinates of the fixed point operator k_{2,m}(x) (which is denoted by k_2(x)) associated with CMN_m, for m = 1, 2, . . ., and RMN_m, for m = 1, 2, . . ., on the polynomial system p(x) are the same components of the fixed point function of Newton's method on p(x). Moreover, the fixed points of k_2(x) are the roots (−1, 0), (0, −1), ((1 − √5)/2, (1 − √5)/2) and ((1 + √5)/2, (1 + √5)/2), which are superattracting.
There are no strange fixed points or free critical points in this case. Figure 9 shows the dynamical plane of the fixed point function k_2(x), corresponding to the methods CMN_m and RMN_m for any m = 1, 2, . . ., as well as to their original partner, Newton's scheme.

Conclusions
In this paper, several new Jacobian-free variants of Newton's method have been introduced by using forward, central and Richardson divided differences based on an element-by-element power of the nonlinear function F(x), Γ(x) = (f_1(x)^m, f_2(x)^m, . . . , f_n(x)^m)^T. As far as we know, these Jacobian-free variants of Newton's method had not been analyzed until now. We conclude that better estimations of the Jacobian matrix do not always involve greater stability. In fact, the best scheme in terms of numerical efficiency and wideness of the sets of converging initial points is CMN_m. This central-difference method does not need to calculate and evaluate the Jacobian matrix, as Newton's method does, and provides similar basins of attraction. Although Richardson's method reaches good convergence results, it has a computational cost that discourages its use.