Article

A New Adaptive Accelerated Levenberg–Marquardt Method for Solving Nonlinear Equations and Its Applications in Supply Chain Problems

1 School of Mathematics and Statistics, Beihua University, Jilin 132013, China
2 School of Information Engineering, Hainan Vocational University of Science and Technology, Hainan 571126, China
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(3), 588; https://doi.org/10.3390/sym15030588
Submission received: 10 February 2023 / Revised: 21 February 2023 / Accepted: 22 February 2023 / Published: 24 February 2023

Abstract: In this paper, a new adaptive Levenberg–Marquardt method is proposed for solving nonlinear equations, including those arising in supply chain optimization problems. We present a new adaptive update rule, a piecewise function of the ratio between the actual and predicted reductions of the objective function, which avoids long runs of unsuccessful iterations and prevents the iterates from jumping around in local areas. The global convergence and quadratic convergence of the proposed method are proved by using the trust region technique and the local error bound condition, respectively. In addition, we test the proposed algorithm on symmetric and nonsymmetric nonlinear equations. Numerical results show that the proposed method has good numerical performance and development prospects. Furthermore, we apply the algorithm to solve fresh agricultural products supply chain optimization problems.

1. Introduction

With the development of science and technology, nonlinear equations arise in more and more fields, such as chemistry, mechanics, economics and product management [1,2,3,4]. For example, decentralized decision models in supply chain management and gas pressure–volume models in physics can be converted into the following nonlinear equations
$$F(x) = 0,$$
where $F(x): \mathbb{R}^n \to \mathbb{R}^m$ is a continuously differentiable function. In particular, symmetric nonlinear equations, whose Jacobian matrix is symmetric, also have a wide range of applications, such as the gradient mapping of unconstrained optimization problems and the Karush–Kuhn–Tucker (KKT) systems of equality constrained optimization problems [5,6].
The steepest descent method, the Newton method, quasi-Newton methods and the Gauss–Newton (GN) method are commonly used iterative methods for solving (1) [7,8,9,10]. The GN method is one of the most famous: when the Jacobian matrix is Lipschitz continuous and nonsingular at a solution of (1), the GN method converges quadratically. However, when the Jacobian matrix is singular or nearly singular, the GN method may not be well defined. To overcome this difficulty, the Levenberg–Marquardt (LM) method [11,12] for solving (1) was proposed. At the $k$-th iteration, the trial step is
$$d_k = -\left(J_k^T J_k + \lambda_k I\right)^{-1} J_k^T F_k,$$
where $F_k = F(x_k)$, $J_k = J(x_k)$ is the Jacobian matrix of $F(x)$ at $x_k$ (which may be a symmetric or non-symmetric matrix), $I$ is the identity matrix and the LM parameter $\lambda_k > 0$.
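As a concrete illustration, the trial step can be computed in a few lines of MATLAB. This is a minimal sketch with our own variable names (the paper does not prescribe an implementation), exploiting the fact that the shifted normal matrix is symmetric positive definite for $\lambda_k > 0$:

```matlab
% One Levenberg-Marquardt trial step for F(x) = 0 (illustrative sketch).
% Fk: residual F(x_k); Jk: Jacobian J(x_k); lambda: LM parameter > 0.
function d = lm_step(Fk, Jk, lambda)
    n = size(Jk, 2);
    % (Jk'*Jk + lambda*I) is symmetric positive definite for lambda > 0,
    % so the trial step d is uniquely determined.
    d = -(Jk' * Jk + lambda * eye(n)) \ (Jk' * Fk);
end
```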
The LM method ensures that the trial step is uniquely determined, and it also converges quadratically if $J_k$ is Lipschitz continuous and nonsingular at the solution and $\lambda_k$ is selected appropriately. In this sense, the update of the LM parameter has a great impact on the performance and efficiency of the algorithm, and many effective LM parameters have been proposed. Yamashita and Fukushima [13] chose the LM parameter as $\lambda_k = \|F_k\|^2$, and proved that the LM method converges quadratically under the local error bound condition when $J_k$ is Lipschitz continuous near the solution. However, when $\{x_k\}$ is far away from the solution set, $\lambda_k$ may be very large, which makes $d_k$ very small and reduces the efficiency of the algorithm; when $\{x_k\}$ is sufficiently close to the solution set, $\lambda_k$ may be smaller than the machine epsilon and lose its role.
Based on these observations, Fan and Yuan [14] generalized the LM parameter in [13], and showed numerically that choosing $\lambda_k = \|F_k\|$ performs better than choosing $\lambda_k = \|F_k\|^2$. Fan [15] first introduced the regularization factor $\mu_k$ into the LM method and chose $\lambda_k = \mu_k \|F_k\|$, with numerical results showing that this choice of $\lambda_k$ provides the best performance. However, when $\{x_k\}$ is far away from the solution set, neither of these LM parameters provides good results. Therefore, to avoid this situation, Fan and Pan [16] chose the LM parameter as $\lambda_k = \mu_k \rho(x_k)$, in which $\mu_k$ is updated by a trust region technique. They defined $\rho(x_k)$ as a positive function from $\mathbb{R}^n$ to $\mathbb{R}^{+}$, i.e.,
$$\rho(x_k) = \begin{cases} \tilde{\rho}(x_k), & \text{if } \tilde{\rho}(x_k) \le 1,\\ 1, & \text{otherwise}, \end{cases}$$
where $\tilde{\rho}(x_k) = O(\|F_k\|^{\delta})$. This update strategy yields larger LM trial steps, so that the iterative sequence can converge quickly to the solution set when $\{x_k\}$ is far away from it. Amini et al. [17] chose the LM parameter as
$$\lambda_k = \frac{\mu_k \|F_k\|}{1 + \|F_k\|}.$$
It is clear that when $\{x_k\}$ is far away from the solution set and $\|F_k\|$ is very large, $\frac{\|F_k\|}{1 + \|F_k\|}$ is close to 1, so $\lambda_k$ is close to $\mu_k$. This choice of $\lambda_k$ improves the efficiency of the algorithm over the previous LM parameters.
In addition to these different choices of the LM parameter, the introduction of adaptive techniques also has a great impact on the LM method. As is well known, the ratio $r_k$ between the actual and predicted reductions of the objective function reflects how well the approximate quadratic model agrees with the merit function. To make more use of this ratio information, Fan and Yuan [18] proposed an adaptive LM method by selecting $\lambda_{k+1} = \mu_{k+1}\|F_{k+1}\|^{\delta}$ with $\delta \in (0, 2]$ and $\mu_{k+1} = \mu_k q(r_k)$, where $q(r_k)$ is a continuous non-negative function of $r_k$. The introduction of $q(r_k)$ avoids the jump in $\mu_{k+1}/\mu_k$ when $r_k$ crosses a threshold, and better numerical results can be obtained.
In fact, similar adaptive techniques have been proposed for trust region algorithms. If $r_k$ is sufficiently greater than 1, the iteration is too successful, and reducing $\mu_k$ to a very small value may cause the algorithm to perform a large number of consecutive unsuccessful iterations afterwards. On the other hand, if $r_k$ is negative or close to zero, $d_k$ is a far-from-satisfactory trial step and $\mu_k$ must be increased greatly; the successive iteration points will then be close to each other and the algorithm will converge slowly. Therefore, Hei [19] proposed an R-function as an adaptive strategy to update the trust region radius $\Delta_k$, i.e., $\Delta_{k+1} = R(r_k)\Delta_k$. Furthermore, Walmag and Delhez [20] proposed a $\Lambda$-function to update the trust region radius, i.e., $\Delta_{k+1} = \Lambda(r_k)\Delta_k$, where $\Lambda$ is a non-negative and bounded function of $r_k$. On this basis, Lu et al. [21] argued that the consistency between the model and the objective function is not good enough at too-successful iterations, and proposed an L-function to update the trust region radius. They showed that the L-function retains some favorable features of the R-function and the $\Lambda$-function, and that the resulting method is more efficient at too-successful iterations. In this paper, we borrow the form of the L-function and provide a new adaptive strategy to update the LM parameter. Our innovations mainly include the following:
◊ A new adaptive accelerated LM method is proposed, which improves the consistency between the model and the objective function at too-successful iterations by using the information in the ratio of the actual reduction to the predicted reduction;
◊ The new algorithm copes with the situation in which the iterative sequence is far away from the optimal solution set, avoids long runs of unsuccessful iterations and avoids jumping around in local areas, thus improving the efficiency and stability of the algorithm;
◊ The new adaptive accelerated LM method is globally convergent, and quadratically convergent under the local error bound condition.
The rest of this paper is organized as follows. In Section 2, we describe in detail the new adaptive accelerated LM method, which makes full use of the ratio information, and we prove that it is globally convergent under appropriate conditions and quadratically convergent under the local error bound condition. Numerical results indicating that the new algorithm is efficient are given in Section 3. The conclusion is given in the last section.

2. Methodology

2.1. The Adaptive Accelerated Levenberg–Marquardt Method

In this section, our main aim is to discuss how to update the LM parameter so as to obtain a new adaptive accelerated LM method. It is easy to see from (2) that $d_k$ is the solution of the optimization problem
$$\min_{d \in \mathbb{R}^n} \|F_k + J_k d\|^2 + \lambda_k \|d\|^2 \triangleq \psi_k(d).$$
If
$$\Delta_k = \left\|\left(J_k^T J_k + \lambda_k I\right)^{-1} J_k^T F_k\right\|,$$
then $d_k$ is also the solution of the subproblem
$$\min_{d \in \mathbb{R}^n} \|F_k + J_k d\|^2 \triangleq \varphi_k(d), \quad \text{s.t. } \|d\| \le \Delta_k.$$
Therefore, the LM method can be regarded as a trust region method, which implicitly modifies the trust region radius Δ k . The difference between the general trust region method and the LM method is that the LM method does not directly update the trust region radius, but updates the regularization factor μ k .
We define the actual reduction and predicted reduction of the merit function $\|F(x)\|^2$ at the $k$-th iteration as
$$Ared_k = \|F_k\|^2 - \|F(x_k + d_k)\|^2$$
and
$$Pred_k = \varphi_k(0) - \varphi_k(d_k).$$
The ratio between the actual and predicted reductions of the objective function is defined by
$$r_k = \frac{Ared_k}{Pred_k}.$$
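In code, both reductions and the ratio follow directly from these definitions; a small sketch (the variable names are ours):

```matlab
% Actual and predicted reductions and their ratio r_k (illustrative).
% F: residual function handle; xk: current iterate; Fk = F(xk);
% Jk: Jacobian at xk; d: LM trial step.
Ared = norm(Fk)^2 - norm(F(xk + d))^2;     % actual reduction
Pred = norm(Fk)^2 - norm(Fk + Jk * d)^2;   % predicted reduction
rk   = Ared / Pred;                        % ratio used for acceptance/update
```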
This ratio determines whether the trial step $d_k$ is accepted. Here, we choose the LM parameter as
$$\lambda_{k+1} = \frac{\mu_{k+1} \|F_{k+1}\|}{1 + \|F_{k+1}\|}.$$
The usual empirical rules [22,23,24,25] for updating $\mu_{k+1}$ can be summarized as follows:
$$\mu_{k+1} = \begin{cases} 4\mu_k, & \text{if } r_k < p_1,\\ \mu_k, & \text{if } p_1 \le r_k \le p_2,\\ \max\left\{\frac{\mu_k}{4},\, m\right\}, & \text{if } r_k > p_2, \end{cases}$$
where $m > 0$ and $0 < p_1 < p_2 < 1$ are constants.
Iterations with $r_k$ greater than $p_2$ are very successful iterations. In this case, it is usually assumed that the model function approximates the objective function accurately, and $\mu_k$ should be reduced. However, at too-successful iterations, i.e., when $r_k$ is sufficiently greater than 1, the consistency between the model and the objective function is not good enough. Thus, we use an adaptive strategy to update the factor, $\mu_{k+1} = K(r_k)\mu_k$, where $K(r_k)$ is a function of $r_k$.
We construct $K(r_k)$ as follows:
$$K(r_k) = \begin{cases} \beta_1 + (\beta_2 - \beta_1)\exp\left(-\left(\frac{r_k - p_1}{p_1}\right)^2\right), & \text{if } r_k \le p_1,\\[4pt] \beta_2, & \text{if } p_1 < r_k < p_2,\\[4pt] \left[\dfrac{1 - \beta_3\exp(-p_2)}{1 - \exp(-p_2)} - \dfrac{(1 - \beta_3)\exp(-p_2)}{1 - \exp(-p_2)}\exp(-(r_k - p_2))\right]^{\frac{1}{2}}, & \text{if } r_k \ge p_2, \end{cases}$$
where $0 < \beta_2 < 1 < \beta_1 \le \beta_3$ and $0 < p_1 < p_2 < 1$ are constants. Here, $K(r_k)$ satisfies the following properties:
(1) $\lim_{r_k \to -\infty} K(r_k) = \beta_1$;
(2) $\lim_{r_k \to p_1} K(r_k) = \beta_2$;
(3) $\lim_{r_k \to p_2^{+}} K(r_k) = 1$;
(4) $\lim_{r_k \to +\infty} K(r_k) = \left[\dfrac{1 - \beta_3\exp(-p_2)}{1 - \exp(-p_2)}\right]^{\frac{1}{2}}$.
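The following MATLAB sketch implements the K-function as reconstructed above. Since the piecewise formula was recovered from a poorly rendered source, the third branch in particular should be checked against the published version; the constants must satisfy $0 < \beta_2 < 1 < \beta_1 \le \beta_3$:

```matlab
% K-function for the adaptive update mu_{k+1} = max(m, K(rk)*mu_k).
% As reconstructed here; verify against the published formula.
function K = Kfun(rk, p1, p2, b1, b2, b3)
    if rk <= p1
        K = b1 + (b2 - b1) * exp(-((rk - p1) / p1)^2);
    elseif rk < p2
        K = b2;
    else
        A = (1 - b3 * exp(-p2)) / (1 - exp(-p2));
        B = ((1 - b3) * exp(-p2)) / (1 - exp(-p2));
        K = sqrt(A - B * exp(-(rk - p2)));  % too-successful branch
    end
end
```

With $p_1 = 1/4$, $p_2 = 3/4$, $\beta_1 = 5/4$, $\beta_2 = 1/3$ and $\beta_3 = 6/5$ (the values used in Section 3), this function satisfies properties (1)–(4) above.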
If we obtain a satisfactory trial step $d_k$ and ratio $r_k$, then we accept the trial step and reduce $\mu_k$; otherwise, we reject the trial step and increase $\mu_k$. At too-successful iterations, the actual reduction of the objective function obtained at iteration $k$ is markedly greater than the predicted reduction. Although such an iteration allows the algorithm to progress towards the optimum, the model function approximates the objective function poorly. Therefore, to avoid reducing $\mu_k$ too quickly, we use the K-function to update $\mu_k$.
According to the properties of the K-function, the rate of reduction of $\mu_k$ is fastest when $r_k$ is close to 1, i.e., when the model function provides an accurate local approximation of the objective function. The new idea we propose is to allow $\mu_k$ to be updated at a variable rate according to $r_k$, which improves the efficiency and stability of the algorithm.
Based on the above analysis, we state a description of the new adaptive accelerated LM method (Algorithm 1) as follows.
In Algorithm 1, m is a given lower bound of the parameter μ k . It is introduced to prevent the step from being too large when the sequence is near the solution.
Algorithm 1 NAALM.
Step 0. Given $x_0 \in \mathbb{R}^n$, $\mu_0 > m > 0$, $0 \le p_0 < p_1 < p_2 < 1$, $0 < \beta_2 < 1 < \beta_1 \le \beta_3$, $\varepsilon > 0$. Let $k := 0$.
Step 1. Compute $F_k$ and $J_k$. If $\|J_k^T F_k\| \le \varepsilon$, stop. Otherwise, compute $\lambda_k$ by (9).
Step 2. Solve the linear system
$$\left(J_k^T J_k + \lambda_k I\right)d = -J_k^T F_k$$
to determine $d_k$.
Step 3. Compute $Pred_k$, $Ared_k$ and $r_k$ by (6)–(8), respectively.
Step 4. Set
$$x_{k+1} = \begin{cases} x_k + d_k, & \text{if } r_k \ge p_0,\\ x_k, & \text{if } r_k < p_0. \end{cases}$$
Step 5. Choose
$$\mu_{k+1} = \max\{m,\, K(r_k)\mu_k\},$$
where $K(r_k)$ is given by (11). Set $k := k + 1$ and go to Step 1.
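Put together, one NAALM iteration is only a few lines. The sketch below is our reading of Algorithm 1 (not the authors' code), using Kfun from the previous sketch and assuming the tolerances eps_tol and maxit, the constants p0, p1, p2, b1, b2, b3, m, and the starting values x0, mu0 have been set:

```matlab
% NAALM main loop (illustrative sketch of Algorithm 1).
% F, J: function handles for the residual and its Jacobian.
x = x0; mu = mu0; k = 0;
Fk = F(x); Jk = J(x);
while norm(Jk' * Fk) > eps_tol && k < maxit
    lambda = mu * norm(Fk) / (1 + norm(Fk));          % LM parameter
    d = -(Jk' * Jk + lambda * eye(numel(x))) \ (Jk' * Fk);
    Ared = norm(Fk)^2 - norm(F(x + d))^2;
    Pred = norm(Fk)^2 - norm(Fk + Jk * d)^2;
    rk = Ared / Pred;
    if rk >= p0                                       % accept the trial step
        x = x + d; Fk = F(x); Jk = J(x);
    end
    mu = max(m, Kfun(rk, p1, p2, b1, b2, b3) * mu);   % adaptive update
    k = k + 1;
end
```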

2.2. The Global Convergence

In this section, to obtain the global convergence of the NAALM algorithm, we make the following assumption.
Assumption 1.
$F(x)$ is continuously differentiable, and $F(x)$ and the Jacobian matrix $J(x)$ are Lipschitz continuous, i.e., there exist positive constants $L_1$ and $L_2$ such that
$$\|J(y) - J(x)\| \le L_1\|y - x\|, \quad \forall x, y \in \mathbb{R}^n,$$
and
$$\|F(y) - F(x)\| \le L_2\|y - x\|, \quad \forall x, y \in \mathbb{R}^n.$$
Lemma 1.
Let $d_k$ be computed by (12); then the inequality
$$Pred_k \ge \|J_k^T F_k\| \min\left\{\|d_k\|,\ \frac{\|J_k^T F_k\|}{\|J_k^T J_k\|}\right\}$$
holds for all $k \ge 0$.
Proof. 
From (7), for $\alpha \in [0, 1]$, we have
$$Pred_k = \|F_k\|^2 - \|F_k + J_k d_k\|^2 \ge \|F_k\|^2 - \left\|F_k - J_k \frac{\alpha \|d_k\| J_k^T F_k}{\|J_k^T F_k\|}\right\|^2 \ge 2\alpha\|d_k\|\|J_k^T F_k\| - \alpha^2\|d_k\|^2\|J_k^T J_k\|,$$
then
$$Pred_k \ge \max_{0 \le \alpha \le 1}\left\{2\alpha\|d_k\|\|J_k^T F_k\| - \alpha^2\|d_k\|^2\|J_k^T J_k\|\right\} \ge \|J_k^T F_k\| \min\left\{\|d_k\|,\ \frac{\|J_k^T F_k\|}{\|J_k^T J_k\|}\right\}.$$
The proof is complete. □
Theorem 1.
Under the conditions of Assumption 1, the sequence $\{x_k\}$ generated by the NAALM algorithm satisfies
$$\lim_{k \to \infty} \|J_k^T F_k\| = 0.$$
Proof. 
If the theorem is not true, then there exist a positive constant $\tau$ and infinitely many $k$ such that
$$\|J_k^T F_k\| \ge \tau.$$
Let $T_1$, $T_2$ be the index sets
$$T_1 = \left\{k \,\middle|\, \|J_k^T F_k\| \ge \tau\right\}$$
and
$$T_2 = \left\{k \,\middle|\, \|J_k^T F_k\| \ge \frac{\tau}{2} \text{ and } x_{k+1} \ne x_k\right\}.$$
Then $T_1$ is an infinite set. In the following, we derive a contradiction in both cases, whether $T_2$ is finite or infinite.
Case (I): $T_2$ is finite.
It follows from the definition of $T_2$ that the set
$$T_3 = \left\{k \,\middle|\, \|J_k^T F_k\| \ge \tau \text{ and } x_{k+1} \ne x_k\right\}$$
is also finite. Let $\tilde{k}$ be the largest index of $T_3$. Then $x_{k+1} = x_k$ holds for all $k \in \{k > \tilde{k} \,|\, k \in T_1\}$. Define the index set
$$T_4 = \left\{k > \tilde{k} \,\middle|\, \|J_k^T F_k\| \ge \tau \text{ and } x_{k+1} = x_k\right\}.$$
Suppose $k \in T_4$. It is easy to see that $\|J_{k+1}^T F_{k+1}\| \ge \tau$. Moreover, we have $x_{k+2} = x_{k+1}$; otherwise, if $x_{k+2} \ne x_{k+1}$, then $k + 1 \in T_3$, which contradicts the fact that $\tilde{k}$ is the largest index of $T_3$. Hence $k + 1 \in T_4$. By induction, we know that $\|J_k^T F_k\| \ge \tau$ and $x_{k+1} = x_k$ hold for all $k > \tilde{k}$.
It now follows from Step 4 of the NAALM algorithm that $r_k < p_0$ for all $k > \tilde{k}$, which implies
$$\mu_k \to +\infty \quad \text{and} \quad \lambda_k \to +\infty,$$
due to (12)–(14) and $x_{k+1} = x_k$ for all $k > \tilde{k}$. Hence, we have
$$\lim_{k \to \infty} \|d_k\| = 0.$$
Furthermore, it follows from (21), (23) and Lemma 1 that
$$|r_k - 1| = \left|\frac{Ared_k}{Pred_k} - 1\right| = \frac{\left|\|F_k + J_k d_k\|^2 - \|F(x_k + d_k)\|^2\right|}{Pred_k} \le \frac{\|F_k + J_k d_k\|\, O(\|d_k\|^2) + O(\|d_k\|^4)}{\|J_k^T F_k\| \min\left\{\|d_k\|,\ \frac{\|J_k^T F_k\|}{\|J_k^T J_k\|}\right\}} \le \frac{O(\|d_k\|^2)}{\tau\|d_k\|} \to 0,$$
that is, $r_k \to 1$. In view of the updating rule of $\mu_k$, we know that there exists a positive constant $\tilde{m} > m$ such that $\mu_k < \tilde{m}$ holds for all sufficiently large $k$, which contradicts (22). Hence, the supposition (21) cannot be true while $T_2$ is finite.
Case (II): $T_2$ is infinite.
It follows from Lemma 1 that
$$\|F_1\|^2 \ge \sum_{k \in T_2}\left(\|F_k\|^2 - \|F_{k+1}\|^2\right) \ge \sum_{k \in T_2} p_0\, Pred_k \ge \sum_{k \in T_2} p_0 \|J_k^T F_k\| \min\left\{\|d_k\|,\ \frac{\|J_k^T F_k\|}{\|J_k^T J_k\|}\right\} \ge \sum_{k \in T_2} p_0\, \frac{\tau}{2} \min\left\{\|d_k\|,\ \frac{\tau}{2\|J_k^T J_k\|}\right\},$$
which gives
$$\sum_{k \in T_2} \|d_k\| < +\infty.$$
The above inequality, together with the Lipschitz conditions (15) and (16), implies that
$$\sum_{k \in T_2} \left\|J_k^T F_k - J_{k+1}^T F_{k+1}\right\| < +\infty.$$
Relation (27) and the fact that (21) holds for infinitely many $k$ indicate that there exists a $\hat{k}$ with $\|J_{\hat{k}}^T F_{\hat{k}}\| \ge \tau$ such that
$$\sum_{k \in T_2,\, k \ge \hat{k}} \left\|J_k^T F_k - J_{k+1}^T F_{k+1}\right\| < \frac{\tau}{2}.$$
By induction, we obtain that $\|J_k^T F_k\| \ge \frac{\tau}{2}$ for all $k \ge \hat{k}$. This result and (26) mean that
$$\lim_{k \to \infty} \|d_k\| = 0.$$
It follows from (12) and (13) that $\mu_k \to +\infty$. By the same analysis as (24), we know that $r_k \to 1$. Hence, there exists a positive constant $\bar{m} > m$ such that $\mu_k < \bar{m}$ holds for all large $k$, which introduces a contradiction. Therefore, the supposition (21) cannot be true when $T_2$ is infinite. The proof is complete. □

2.3. Local Convergence

In this section, we study the local convergence properties of the NAALM algorithm by using the singular value decomposition (SVD). We assume that the sequence $\{x_k\}$ generated by the NAALM algorithm converges to the nonempty solution set $X^*$ and lies in some neighborhood of $x^* \in X^*$. Firstly, we present some assumptions required by the local convergence theory.
Definition 1.
Let $N \subset \mathbb{R}^n$ be such that $N \cap X^* \ne \emptyset$. We say that $\|F(x)\|$ provides a local error bound on $N$ for (1) if there exists a positive constant $c > 0$ such that
$$c\,\mathrm{dist}(x, X^*) \le \|F(x)\|, \quad \forall x \in N,$$
where $\mathrm{dist}(x, X^*)$ is the distance from $x$ to $X^*$.
Assumption 2.
(i) $F(x)$ is continuously differentiable, and $J(x)$ is Lipschitz continuous on $N(x^*, b_1)$ with $b_1 < 1$, i.e., there exists a positive constant $L_1$ such that
$$\|J(y) - J(x)\| \le L_1\|y - x\|, \quad \forall x, y \in N(x^*, b_1) = \{x \,|\, \|x - x^*\| \le b_1\}.$$
(ii) $\|F(x)\|$ provides a local error bound on some neighborhood of $x^* \in X^*$, i.e., there exists a positive constant $c_1 > 0$ such that
$$\|F(x)\| \ge c_1\,\mathrm{dist}(x, X^*), \quad \forall x \in N(x^*, b_1).$$
By the Lipschitz continuity of the Jacobian matrix assumed in (30), we have
$$\|F(y) - F(x) - J(x)(y - x)\| = \left\|\int_0^1 J(x + t(y - x))(y - x)\,dt - J(x)(y - x)\right\| \le \|y - x\|\int_0^1 \left\|J(x + t(y - x)) - J(x)\right\|dt \le L_1\|y - x\|^2,$$
and
$$\|F(y) - F(x)\| \le L_2\|y - x\|, \quad \forall x, y \in N(x^*, b_1),$$
where $L_2$ is a positive constant.
In the following, we use $\bar{x}_k$ to denote a vector in $X^*$ that satisfies
$$\|\bar{x}_k - x_k\| = \mathrm{dist}(x_k, X^*).$$
To obtain the local convergence rate of $\{x_k\}$, we present some lemmas.
Lemma 2.
Under the conditions of Assumption 2, for all sufficiently large $k$, there exists a constant $c_2 > 0$ such that
$$\|d_k\| \le c_2\|\bar{x}_k - x_k\|.$$
Proof. 
According to (34), we have
$$\|\bar{x}_k - x^*\| \le \|\bar{x}_k - x_k\| + \|x_k - x^*\| \le 2\|x_k - x^*\| \le b_1,$$
which means that $\bar{x}_k \in N(x^*, b_1)$. Following from (13),
$$\lambda_k = \frac{\mu_k\|F_k\|}{1 + \|F_k\|} = \mu_k\left(1 - \frac{1}{1 + \|F_k\|}\right) \ge m\left(1 - \frac{1}{1 + c_1\|\bar{x}_k - x_k\|}\right) = \frac{m c_1\|\bar{x}_k - x_k\|}{1 + c_1\|\bar{x}_k - x_k\|},$$
and we have from (32) that
$$\|F_k + J_k(\bar{x}_k - x_k)\|^2 = \|F(\bar{x}_k) - F_k - J_k(\bar{x}_k - x_k)\|^2 \le L_1^2\|\bar{x}_k - x_k\|^4.$$
As $d_k$ is a minimizer of $\psi_k(d)$, we have
$$\|d_k\|^2 \le \frac{1}{\lambda_k}\psi_k(d_k) \le \frac{1}{\lambda_k}\psi_k(\bar{x}_k - x_k) = \frac{1}{\lambda_k}\|F_k + J_k(\bar{x}_k - x_k)\|^2 + \|\bar{x}_k - x_k\|^2 \le \frac{1 + c_1\|\bar{x}_k - x_k\|}{m c_1\|\bar{x}_k - x_k\|}\, L_1^2\|\bar{x}_k - x_k\|^4 + \|\bar{x}_k - x_k\|^2 = O\left(\|\bar{x}_k - x_k\|^2\right),$$
so there exists a constant $c_2 > 0$ such that $\|d_k\| \le c_2\|\bar{x}_k - x_k\|$. The proof is completed. □
Lemma 3.
Under the conditions of Assumption 2, for all sufficiently large $k$, there exists a positive constant $M > m$ such that
$$\mu_k \le M.$$
Proof. 
First, we show that for sufficiently large $k$, the following inequality holds:
$$Pred_k = \|F_k\|^2 - \|F_k + J_k d_k\|^2 \ge \min\left\{\frac{c_1}{2c_2},\ \frac{c_1}{2}\right\}\|F_k\|\|d_k\|.$$
We consider two cases. In one case, if $\|\bar{x}_k - x_k\| \le \|d_k\|$, then the definition of $d_k$ and Assumption 2 imply that
$$\|F_k\| - \|F_k + J_k d_k\| \ge \|F_k\| - \|F_k + J_k(\bar{x}_k - x_k)\| \ge c_1\|\bar{x}_k - x_k\| - L_1\|\bar{x}_k - x_k\|^2 \ge \frac{c_1}{2c_2}\|d_k\|.$$
In the other case, if $\|\bar{x}_k - x_k\| > \|d_k\|$, then we have
$$\|F_k\| - \|F_k + J_k d_k\| \ge \|F_k\| - \left\|F_k + \frac{\|d_k\|}{\|\bar{x}_k - x_k\|}J_k(\bar{x}_k - x_k)\right\| \ge \frac{\|d_k\|}{\|\bar{x}_k - x_k\|}\left(\|F_k\| - \|F_k + J_k(\bar{x}_k - x_k)\|\right) \ge \frac{\|d_k\|}{\|\bar{x}_k - x_k\|}\left(c_1\|\bar{x}_k - x_k\| - L_1\|\bar{x}_k - x_k\|^2\right) \ge \frac{c_1}{2}\|d_k\|.$$
Inequalities (41) and (42), together with Lemma 2, show that
$$Pred_k = \left(\|F_k\| + \|F_k + J_k d_k\|\right)\left(\|F_k\| - \|F_k + J_k d_k\|\right) \ge \|F_k\|\left(\|F_k\| - \|F_k + J_k d_k\|\right) \ge \min\left\{\frac{c_1}{2c_2},\ \frac{c_1}{2}\right\}\|F_k\|\|d_k\|,$$
which gives (40). Hence, it follows from (40), Assumption 2 and Lemma 2 that
$$|r_k - 1| = \left|\frac{Ared_k}{Pred_k} - 1\right| \le \frac{\|F_k + J_k d_k\|\,O(\|d_k\|^2) + O(\|d_k\|^4)}{Pred_k} \le \frac{\|F_k\|\,O(\|d_k\|^2) + O(\|d_k\|^4)}{O(\|F_k\|\|d_k\|)} = O(\|d_k\|) \to 0.$$
Therefore, $r_k \to 1$; thus, there exists a constant $M > m$ such that $\mu_k \le M$ for all large $k$. The proof is completed. □
Without loss of generality, we assume $\mathrm{rank}(J(\bar{x})) = r$ for all $\bar{x} \in N(x^*, b_1) \cap X^*$. Suppose the SVD of $J(\bar{x})$ is
$$J(\bar{x}) = [\bar{U}_1, \bar{U}_2]\begin{pmatrix}\bar{\Sigma}_1 & 0\\ 0 & 0\end{pmatrix}\begin{pmatrix}\bar{V}_1^T\\ \bar{V}_2^T\end{pmatrix} = \bar{U}_1\bar{\Sigma}_1\bar{V}_1^T,$$
where $\bar{\Sigma}_1 = \mathrm{diag}(\bar{\sigma}_1, \bar{\sigma}_2, \ldots, \bar{\sigma}_r)$ with $\bar{\sigma}_1 \ge \bar{\sigma}_2 \ge \cdots \ge \bar{\sigma}_r > 0$, and $\bar{U} = [\bar{U}_1, \bar{U}_2]$, $\bar{V} = [\bar{V}_1, \bar{V}_2]$ are orthogonal matrices. Correspondingly, we consider the SVD of $J(x_k)$:
$$J(x_k) = [U_1, U_2, U_3]\begin{pmatrix}\Sigma_1 & 0 & 0\\ 0 & \Sigma_2 & 0\\ 0 & 0 & 0\end{pmatrix}\begin{pmatrix}V_1^T\\ V_2^T\\ V_3^T\end{pmatrix} = U_1\Sigma_1V_1^T + U_2\Sigma_2V_2^T,$$
where $U = [U_1, U_2, U_3]$, $V = [V_1, V_2, V_3]$ are orthogonal matrices, $\Sigma_1 = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$ with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$, and $\Sigma_2 = \mathrm{diag}(\sigma_{r+1}, \sigma_{r+2}, \ldots, \sigma_{r+q})$ with $\sigma_{r+1} \ge \sigma_{r+2} \ge \cdots \ge \sigma_{r+q} > 0$.
Lemma 4.
Under the conditions of Assumption 2, for all sufficiently large $k$, we have
(a) $\|U_1U_1^T F_k\| \le O(\|\bar{x}_k - x_k\|)$;
(b) $\|U_2U_2^T F_k\| \le O(\|\bar{x}_k - x_k\|^2)$;
(c) $\|U_3U_3^T F_k\| \le O(\|\bar{x}_k - x_k\|^2)$;
(d) $\|F_k + J_k d_k\| \le O(\|\bar{x}_k - x_k\|^2)$.
Proof. 
The result (a) follows immediately from (16). By (15) and the theory of matrix perturbation [26], we have
$$\left\|\mathrm{diag}\left(\Sigma_1 - \bar{\Sigma}_1,\ \Sigma_2,\ 0\right)\right\| \le \|J_k - J(\bar{x}_k)\| \le L_1\|\bar{x}_k - x_k\|,$$
which implies that
$$\|\Sigma_1 - \bar{\Sigma}_1\| \le L_1\|\bar{x}_k - x_k\| \quad \text{and} \quad \|\Sigma_2\| \le L_1\|\bar{x}_k - x_k\|.$$
Let $s_k = -J_k^{+}F_k$, where $J_k^{+}$ is the pseudo-inverse of $J_k$. It is easy to see that $s_k$ is the least-squares solution of $\min_s \|F_k + J_k s\|$, so we obtain from (32) that
$$\|U_3U_3^T F_k\| = \|F_k + J_k s_k\| \le \|F_k + J_k(\bar{x}_k - x_k)\| \le O(\|\bar{x}_k - x_k\|^2).$$
Let $\bar{J}_k = U_1\Sigma_1V_1^T$ and $\bar{s}_k = -\bar{J}_k^{+}F_k$. Since $\bar{s}_k$ is the least-squares solution of $\min_s \|F_k + \bar{J}_k s\|$, it follows from (32) that
$$\left\|\left(U_2U_2^T + U_3U_3^T\right)F_k\right\| = \|F_k + \bar{J}_k\bar{s}_k\| \le \|F_k + \bar{J}_k(\bar{x}_k - x_k)\| \le \|F_k + J_k(\bar{x}_k - x_k)\| + \|(\bar{J}_k - J_k)(\bar{x}_k - x_k)\| \le L_1\|\bar{x}_k - x_k\|^2 + \|U_2\Sigma_2V_2^T(\bar{x}_k - x_k)\| \le L_1\|\bar{x}_k - x_k\|^2 + L_1\|\bar{x}_k - x_k\|\,\|\bar{x}_k - x_k\| = O(\|\bar{x}_k - x_k\|^2).$$
Due to the orthogonality of $U_2$ and $U_3$, we obtain the result (b).
Using (12) and (45), we obtain
$$d_k = -V_1\left(\Sigma_1^2 + \lambda_k I\right)^{-1}\Sigma_1U_1^T F_k - V_2\left(\Sigma_2^2 + \lambda_k I\right)^{-1}\Sigma_2U_2^T F_k,$$
and
$$F_k + J_k d_k = F_k - U_1\Sigma_1\left(\Sigma_1^2 + \lambda_k I\right)^{-1}\Sigma_1U_1^T F_k - U_2\Sigma_2\left(\Sigma_2^2 + \lambda_k I\right)^{-1}\Sigma_2U_2^T F_k = \lambda_kU_1\left(\Sigma_1^2 + \lambda_k I\right)^{-1}U_1^T F_k + \lambda_kU_2\left(\Sigma_2^2 + \lambda_k I\right)^{-1}U_2^T F_k + U_3U_3^T F_k.$$
Following from (13) and (33), the LM parameter satisfies
$$\lambda_k = \frac{\mu_k\|F_k\|}{1 + \|F_k\|} \le \mu_k\|F_k\| \le ML_2\|\bar{x}_k - x_k\|.$$
Since $\{x_k\}$ converges to the solution set $X^*$, we may assume that $L_1\|\bar{x}_k - x_k\| \le \frac{\bar{\sigma}_r}{2}$ holds for all sufficiently large $k$. Then it follows from (46) that
$$\|\Sigma_1^{-1}\| \le \frac{1}{\bar{\sigma}_r - L_1\|\bar{x}_k - x_k\|} \le \frac{2}{\bar{\sigma}_r}.$$
It then follows from Lemma 3 and the results (a)–(c) that
$$\|F_k + J_k d_k\| \le \lambda_k\|\Sigma_1^{-1}\|^2\|U_1^T F_k\| + \|U_2^T F_k\| + \|U_3U_3^T F_k\| \le \frac{4L_2M\|\bar{x}_k - x_k\|^2}{\bar{\sigma}_r^2} + O(\|\bar{x}_k - x_k\|^2) + O(\|\bar{x}_k - x_k\|^2) = O(\|\bar{x}_k - x_k\|^2).$$
The proof is completed. □
We can state the quadratic convergence result of the NAALM algorithm.
Theorem 2.
Let the sequence $\{x_k\}$ be generated by the NAALM algorithm. Under Assumption 2, $\{x_k\}$ converges quadratically to a solution of the nonlinear Equation (1).
Proof. 
It follows from Assumption 2, Lemma 2 and (47) that
$$c_1\|\bar{x}_{k+1} - x_{k+1}\| \le \|F(x_{k+1})\| = \|F(x_k + d_k)\| \le \|F_k + J_k d_k\| + O(\|d_k\|^2) = O(\|\bar{x}_k - x_k\|^2).$$
On the other hand, it is clear that
$$\|\bar{x}_k - x_k\| = \mathrm{dist}(x_k, X^*) \le \|\bar{x}_{k+1} - x_k\| \le \|\bar{x}_{k+1} - x_{k+1}\| + \|d_k\|.$$
It follows from Lemma 2 that, for all sufficiently large $k$, we have
$$\|\bar{x}_k - x_k\| \le 2\|d_k\| \le O(\|\bar{x}_k - x_k\|).$$
Therefore, $\|d_k\| = O(\|\bar{x}_k - x_k\|)$ and $\|\bar{x}_k - x_k\| = O(\|d_k\|)$. This, along with (48), indicates that
$$\|d_{k+1}\| \le O(\|d_k\|^2),$$
which implies that $\{x_k\}$ converges quadratically to a solution in $X^*$. The proof is completed. □

3. Numerical Results

In this section, we report the numerical performance of the NAALM algorithm. All codes were written in MATLAB R2016b on a PC with a 1.19 GHz processor and 8.00 GB RAM, running the Windows 11 operating system. We expand on the following two aspects. On the one hand, the effectiveness of the NAALM algorithm is illustrated by comparing it with another algorithm on some test problems. On the other hand, we show that the NAALM algorithm has good development prospects by applying it to a fresh agricultural products supply chain problem.

3.1. Some Singular Nonlinear Equations Problems

The test problems are constructed by modifying the nonsingular problems given by Moré et al. [27] in the following form [28]:
$$\hat{F}(x) = F(x) - J(x^*)A\left(A^TA\right)^{-1}A^T(x - x^*),$$
where $F(x)$ is the standard test function, $A \in \mathbb{R}^{n \times k}$ has full column rank with $0 \le k \le n$, and $x^*$ is a solution of the equation $F(x) = 0$. According to the definition of $\hat{F}(x)$, we obtain
$$\hat{J}(x^*) = J(x^*)\left(I - A\left(A^TA\right)^{-1}A^T\right),$$
where $\hat{J}(x^*)$ is the Jacobian matrix of $\hat{F}(x)$ at $x^*$, which has rank $n - k$, and $\hat{F}(x^*) = 0$. In our test problems, some of the $\hat{J}(x^*)$ are symmetric matrices and some are non-symmetric. Note that some roots of $\hat{F}(x)$ may not be roots of $F(x)$. Similar to [28], we construct two sets of singular problems, in which $\hat{J}(x^*)$ has rank $n - 1$ or $n - 2$, by choosing
$$A = [1, 1, \ldots, 1]^T \in \mathbb{R}^{n \times 1}$$
and
$$A = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1\\ 1 & -1 & 1 & -1 & \cdots & \pm 1 \end{bmatrix}^T \in \mathbb{R}^{n \times 2},$$
respectively.
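In code, the modification is mechanical once $F$, $J$, $A$ and $x^*$ are available; a sketch for the rank $n - 1$ case (our variable names):

```matlab
% Singular test problem with rank(J_hat(x*)) = n - 1 (illustrative sketch).
% F, J: handles for the original problem; xstar: a root of F; n: dimension.
A = ones(n, 1);                            % A in R^{n x 1}, full column rank
P = A * ((A' * A) \ A');                   % projector A (A'A)^{-1} A'
Fhat = @(x) F(x) - J(xstar) * P * (x - xstar);
Jhat = @(x) J(x) - J(xstar) * P;           % so J_hat(x*) = J(x*) (I - P)
```

The rank $n - 2$ case is obtained by replacing A with the $n \times 2$ matrix above.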
We test the NAALM algorithm on some singular nonlinear equations and compare it with the self-adaptive Levenberg–Marquardt algorithm (SLM) proposed in [18]. The main difference between the two algorithms lies in the updating rule of $\mu_k$.
We set $p_0 = 10^{-4}$, $p_1 = \frac{1}{4}$, $p_2 = \frac{3}{4}$, $\beta_1 = \frac{5}{4}$, $\beta_2 = \frac{1}{3}$, $\beta_3 = \frac{6}{5}$, $m = 10^{-8}$ and $\mu_0 = 10^{-2}$ for all the tests. All test methods are terminated when $\|J_k^T F_k\| \le 10^{-5}$. An algorithm is considered to fail when the number of iterations exceeds 500. Considering the global convergence of the algorithms, we run each test problem from five starting points, $-10x_0$, $-x_0$, $x_0$, $10x_0$ and $100x_0$, where $x_0$ is given by [28]. For problems with variable dimension $n$, we take $n = 500$ and $n = 1000$, respectively.
The performance profiles of the two algorithms, in terms of the number of iterations (NI), function evaluations (NF), gradient evaluations (NG) and CPU time (CPU), are analyzed using the profiles of Dolan and Moré [29]. Let $Y$ and $W$ be the sets of methods and test problems, and let $n_y$, $n_w$ be the numbers of methods and test problems, respectively. For each $y \in Y$ and $w \in W$, let $a_{w,y} > 0$ be the NI, NF, NG or CPU required to solve problem $w$ by method $y$. The performance profile $\psi_y : \mathbb{R} \to [0, 1]$ is obtained by
$$\psi_y(\tau) = \frac{1}{n_w}\,\mathrm{size}\left\{w \in W : \log_2 r_{w,y} \le \tau\right\},$$
where $\tau > 0$, $\mathrm{size}\{\cdot\}$ is the number of elements in a set, and $r_{w,y}$ is the performance ratio defined as
$$r_{w,y} = \frac{a_{w,y}}{\min\{a_{w,y} : y \in Y\}}.$$
Generally, the method whose performance profile plot stays on top represents the best method.
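For completeness, a short sketch of how such profiles can be computed from a cost matrix (rows: problems, columns: methods), following the Dolan–Moré definition; the variable names are ours, and failures should be recorded as Inf:

```matlab
% Performance profiles (Dolan-More). T(w, y): NI, NF, NG or CPU cost of
% method y on problem w; Inf marks a failure.
ratios = T ./ min(T, [], 2);               % performance ratios r_{w,y}
tau = linspace(0, 5, 200);                 % grid on the log2 scale
psi = zeros(numel(tau), size(T, 2));
for y = 1:size(T, 2)
    for t = 1:numel(tau)
        psi(t, y) = mean(log2(ratios(:, y)) <= tau(t));  % fraction solved
    end
end
plot(tau, psi); legend('NAALM', 'SLM');
```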
As can be seen from Figure 1, the NAALM algorithm outperforms the SLM algorithm in terms of the number of iterations; in particular, for $\tau > 2$ the NAALM curve becomes flat, which indicates that the NAALM algorithm solves the problems with fewer iterations. In terms of function evaluations, as shown in Figure 2, the NAALM curve has already reached a stable state for $\tau > 1.75$, while the SLM curve only reaches a stable state, coinciding with that of NAALM, for $\tau > 2.75$. Figure 3 shows the performance profiles of the two algorithms for gradient evaluations: the NAALM algorithm successfully solves up to 98% of the test problems, while SLM reaches only 94%, which shows that the NAALM algorithm reduces the number of Jacobian evaluations and thus saves computation. Figure 4 shows the CPU time performance of the two algorithms: the curves are similar for $\tau < 4.5$, and both become stable and coincide for $\tau > 4.5$. Therefore, Figure 1, Figure 2, Figure 3 and Figure 4 show that the accelerated LM algorithm proposed in this paper not only converges to the solution quickly, but also reduces the computational cost associated with the Jacobian matrix.

3.2. Supply Chain Optimization Problems

The security and stability of the supply chain have a great impact on promoting high-quality and sustainable development of the economy. Therefore, supply chain models have been applied in many fields, such as low-carbon supply chains, green manufacturing supply chains and food trade supply chains. In recent years, with the improvement of living standards, the quality of fresh agricultural products has attracted widespread attention from consumers. In order to meet consumers' demand for high-quality and low-priced fresh agricultural products, we use the NAALM algorithm to study how the supplier and the retailer should make decisions to maximize both their own profits and the total profit of the fresh agricultural products supply chain under the decentralized policy.
In this supply chain, as the leader of a Stackelberg game, the fresh agricultural products supplier provides the same variety of ordinary fresh agricultural products (ofp) and green fresh agricultural products (gfp) to the retailer, who acts as the follower and sells them to consumers. The supplier needs to choose the optimal wholesale price strategy for the two products, and the retailer needs to choose the optimal retail price strategy and determine the order quantities of the two products according to market demand.
Without considering the impact of emergencies, the market demand for fresh agricultural products is relatively stable and depends only on price and freshness. Because the same varieties of ofp and gfp are substitutes, they compete in the demand market. Based on the demand function theory of substitutable price competition, the demand functions of the two fresh agricultural products are assumed to be
$$q_i = a - b\,\frac{p_i}{\theta} + r\,\frac{p_j}{\theta}, \quad i = 1, 2, \quad j = 3 - i,$$
where $q_1$, $q_2$ represent the market demand for gfp and ofp, respectively, $a$ represents the total potential market capacity of fresh agricultural products, $p_1$, $p_2$ represent the retail prices of gfp and ofp, respectively, $b$ is the price sensitivity coefficient, $r$ is the competitive substitution coefficient of the two products, with $b > r > 0$, and $\theta$ ($0 < \theta \le 1$) is the freshness of the produce when it arrives at the retailer's store.
Under the decentralized policy, we regard the supplier and the retailer as independent entities, each aiming to maximize its own profit. The profit function of the fresh agricultural products retailer is
$$\max_{p_1, p_2}\ \pi_R = (p_1 - w_1)\left(a - b\,\frac{p_1}{\theta} + r\,\frac{p_2}{\theta}\right) + (p_2 - w_2)\left(a - b\,\frac{p_2}{\theta} + r\,\frac{p_1}{\theta}\right),$$
where $w_1$, $w_2$ represent the wholesale prices of gfp and ofp, respectively, and the profit function of the fresh agricultural products supplier is
$$\max_{w_1, w_2}\ \pi_s = \left(w_1 - \frac{c_1}{1 - \beta}\right)\left(a - b\,\frac{p_1}{\theta} + r\,\frac{p_2}{\theta}\right) + \left(w_2 - \frac{c_2}{1 - \beta}\right)\left(a - b\,\frac{p_2}{\theta} + r\,\frac{p_1}{\theta}\right),$$
where $\beta$ ($0 < \beta < 1$) represents the quantity loss of the fresh produce by the time it reaches the retailer's store, and $c_1$, $c_2$ represent the unit production costs of gfp and ofp, respectively. Obviously, $p_1 > p_2 > 0$ and $c_1 > c_2 > 0$. We record the total profit of the fresh agricultural products supply chain as
$$\pi_T = \pi_s + \pi_R.$$
With reference to the parameter settings in the relevant literature [30], we set $a = 50$, $b = 2$, $c_1 = 4$, $c_2 = 2$, $r = 1.5$, $\beta = 0.2$ and $\theta = 0.85$. These values satisfy the theoretical conditions in [30] and guarantee that the optimal values are practically meaningful. We transform the unconstrained optimization problem (51) into a nonlinear equation problem, and then choose different initial points and use the NAALM algorithm to solve it.
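As a quick plausibility check of Table 1, the profit identity $\pi_T = \pi_R + \pi_s$ can be evaluated directly at the reported optimum. The sketch below uses our reconstruction of the demand function $q_i = a - b p_i/\theta + r p_j/\theta$ and reads the first row of Table 1 as wholesale prices $\approx 44$–$45$ and retail prices $\approx 64$–$65$:

```matlab
% Evaluate the reconstructed profit functions at Table 1, row 1 (sketch).
a = 50; b = 2; c = [4 2]; r = 1.5; beta = 0.2; theta = 0.85;
w = [45.0266 43.7205];                           % wholesale prices w1, w2
p = [65.065 64.329];                             % retail prices p1, p2
q = a - b * p / theta + r * fliplr(p) / theta;   % demands q1, q2
piR = sum((p - w) .* q);                         % retailer profit
piS = sum((w - c / (1 - beta)) .* q);            % supplier profit
piT = piR + piS                                  % approx. 1.4587e3
```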
As can be seen from Table 1, with the given parameters, the NAALM algorithm can be used to solve the optimization problem and thus obtain the optimal pricing strategy with maximum profit in the supplier-led supply chain under the decentralized policy. In addition, the global convergence and robustness of the NAALM algorithm are confirmed by the results obtained from different initial values and the corresponding numbers of iterations.

4. Conclusions

We constructed a new function that makes full use of the ratio information to update the LM parameter adaptively. Based on this new LM parameter, we presented an adaptive accelerated Levenberg–Marquardt method for solving nonlinear equations. We proved the global convergence of the proposed algorithm, and quadratic convergence was also obtained under the local error bound condition. Numerical experiments demonstrated that our method has good numerical performance. In addition, the application of the NAALM algorithm to a supply chain problem showed that the new algorithm has good application prospects. We further note that the proposed NAALM algorithm can be used in other settings, such as symmetric systems of nonlinear equations. The method's convergence analysis under the Hölderian local error bound condition will be considered in our future work.

Author Contributions

Conceptualization, R.L., M.C. and G.Z.; methodology, R.L., M.C. and G.Z.; software, R.L., M.C. and G.Z.; validation, R.L., M.C. and G.Z.; formal analysis, R.L., M.C. and G.Z.; investigation, R.L., M.C. and G.Z.; resources, R.L., M.C. and G.Z.; data curation, R.L., M.C. and G.Z.; writing—original draft preparation, R.L., M.C. and G.Z.; writing—review and editing, R.L., M.C. and G.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the key project of natural science foundation joint fund of Jilin Province (YDZJ202101ZYTS167, YDZJ202201ZYTS303); the project of education department of Jilin Province (JJKH20210030KJ, JJKH20230054KJ); the graduate innovation project of Beihua University (2022033, 2021002); the youth science and technology innovation team cultivation program of Beihua University.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the anonymous referees and editor for reading this paper carefully and providing valuable suggestions and comments, which greatly improved the final version.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Musa, Y.B.; Waziri, M.Y.; Noor, M.A. An efficient method for solving systems of nonlinear equations. J. Math. Anal. 2022, 13, 1–10.
2. Ribeiro, S.; Lopes, L.G. Overview and computational analysis of PSO variants for solving systems of nonlinear equations. Commun. Intell. Syst. 2022, 461, 1093–1105.
3. Ji, J.Y.; Wong, M.L. Decomposition-based multiobjective optimization for nonlinear equation systems with many and infinitely many roots. Inf. Sci. 2022, 610, 605–623.
4. Artacho, F.J.A.; Fleming, R.; Vuong, P.T. Accelerating the DC algorithm for smooth functions. Math. Program. 2018, 169B, 95–118.
5. Sabi'u, J.; Muangchoo, K.; Shah, A.; Abubakar, A.B.; Aremu, K.O. An inexact optimal hybrid conjugate gradient method for solving symmetric nonlinear equations. Symmetry 2021, 13, 1829.
6. Sabi'u, J.; Muangchoo, K.; Shah, A.; Abubakar, A.B.; Jolaoso, L.O. A modified PRP-CG type derivative-free algorithm with optimal choices for solving large-scale nonlinear symmetric equations. Symmetry 2021, 13, 234.
7. Niri, T.D.; Heydari, M.; Hosseini, M.M. Correction of trust region method with a new modified Newton method. Int. J. Comput. Math. 2022, 97, 1–15.
8. Bellavia, S.; Morini, B.; Rebegoldi, S. On the convergence properties of a stochastic trust-region method with inexact restoration. Axioms 2023, 12, 38.
9. Zheng, L.; Chen, L.; Ma, Y.F. A variant of the Levenberg–Marquardt method with adaptive parameters for systems of nonlinear equations. AIMS Math. 2021, 7, 1241–1256.
10. Yudin, N.E. Adaptive Gauss–Newton method for solving systems of nonlinear equations. Dokl. Math. 2021, 104, 293–296.
11. Levenberg, K. A method for the solution of certain nonlinear problems in least squares. Quart. Appl. Math. 1944, 2, 164–166.
12. Marquardt, D.W. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 1963, 11, 431–441.
13. Yamashita, N.; Fukushima, M. On the rate of convergence of the Levenberg–Marquardt method. Computing 2001, 15, 239–249.
14. Fan, J.Y.; Yuan, Y.X. On the Convergence of a New Levenberg–Marquardt Method; Report No. 005, AMSS; Chinese Academy of Sciences: Beijing, China, 2001.
15. Fan, J.Y. A modified Levenberg–Marquardt algorithm for singular system of nonlinear equations. J. Comput. Math. 2003, 21, 625–636.
16. Fan, J.Y.; Pan, J.Y. A note on the Levenberg–Marquardt parameter. Appl. Math. Comput. 2009, 207, 351–359.
17. Amini, K.; Rostami, F.; Caristi, G. An efficient Levenberg–Marquardt method with a new LM parameter for systems of nonlinear equations. Optimization 2018, 67, 637–650.
18. Fan, J.Y.; Yuan, Y.X. Convergence properties of a self-adaptive Levenberg–Marquardt algorithm under local error bound condition. Comput. Optim. Appl. 2006, 34, 47–62.
19. Hei, L. A self-adaptive trust region algorithm. J. Comput. Math. 2003, 21, 229–236.
20. Walmag, J.M.B.; Delhez, E.J.M. A note on trust-region radius update. SIAM J. Optim. 2005, 16, 548–562.
21. Lu, Y.L.; Li, W.Y.; Cao, M.Y.; Yang, Y.T. A novel self-adaptive trust region algorithm for unconstrained optimization. J. Appl. Math. 2014, 2014, 1–8.
22. Amini, K.; Rostami, F. A modified two steps Levenberg–Marquardt method for nonlinear equations. J. Comput. Appl. Math. 2015, 288, 341–350.
23. He, X.R.; Tang, J.Y. A smooth Levenberg–Marquardt method without nonsingularity condition for wLCP. AIMS Math. 2022, 7, 8914–8932.
24. Fan, J.Y. Accelerating the modified Levenberg–Marquardt method for nonlinear equations. Math. Comput. 2014, 83, 1173–1187.
25. Chen, L. A high-order modified Levenberg–Marquardt method for systems of nonlinear equations with fourth-order convergence. Appl. Math. Comput. 2016, 285, 79–93.
26. Stewart, G.W.; Sun, J.G. Matrix Perturbation Theory; Academic Press: San Diego, CA, USA, 1990.
27. Moré, J.J. Recent developments in algorithms and software for trust region methods. Math. Program. 1983, 85, 258–287.
28. Schnabel, R.B.; Frank, P.D. Tensor methods for nonlinear equations. SIAM J. Numer. Anal. 1984, 21, 815–843.
29. Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213.
30. Wen, H. Research on Profit Maximization Strategy of Fresh Agricultural Products Supply Chain under Different Dominated Subjects. Ph.D. Thesis, Huazhong Agricultural University, Wuhan, China, 2020.
Figure 1. Performance profiles for the iterations.
Figure 2. Performance profiles for the function evaluations.
Figure 3. Performance profiles for the gradient evaluations.
Figure 4. Performance profiles for the CPU time.
Table 1. The optimal solution corresponding to different initial points by NAALM.

Initial Point | $w_1$ | $w_2$ | $p_1$ | $p_2$ | $q_1$ | $q_2$ | $\pi_T$
(1;1;1;1) | 45.0266 | 43.7205 | 65.065 | 64.329 | 10.5818 | 13.2479 | 1.4587 × 10³
(10;10;10;10) | 44.9648 | 43.7460 | 64.991 | 64.358 | 10.5163 | 13.2607 | 1.4587 × 10³
(30;30;30;30) | 44.9920 | 43.7496 | 64.956 | 64.376 | 10.5825 | 13.3037 | 1.4587 × 10³
(50;50;50;50) | 45.0158 | 43.7525 | 65.062 | 64.372 | 10.5125 | 13.3036 | 1.4587 × 10³
(100;100;100;100) | 44.9751 | 43.7553 | 65.955 | 64.269 | 10.5222 | 13.2679 | 1.4587 × 10³