Efficient Parameters Estimation Method for the Separable Nonlinear Least Squares Problem

In this work, we combine the special structure of the separable nonlinear least squares problem with a variable projection algorithm based on singular value decomposition to separate the linear and nonlinear parameters. Then, we propose finding the nonlinear parameters using the Levenberg-Marquardt (LM) algorithm and solving for the linear parameters either directly by the least squares method or by an iteration method that corrects the characteristic values based on the L-curve, depending on whether the coefficient matrix of nonlinear functions is ill posed. To demonstrate the feasibility of the proposed method, we compared its performance on three examples with that of the LM method without parameter separation. The results show that (1) the parameter separation method reduces the number of iterations and improves computational efficiency by reducing the parameter dimension and (2) when the coefficient matrix of the linear parameters is well posed, solving for the linear parameters by the least squares method provides the highest fitting accuracy, whereas when the coefficient matrix is ill posed, the method of correcting characteristic values based on the L-curve provides the most accurate solution to the fitting problem.


Introduction
The separable nonlinear least squares problem is a special case of nonlinear least squares in which the model can be expressed as a linear combination of nonlinear functions. More generally, one can consider two sets of unknown parameters, where one set depends on the other and can be explicitly eliminated. This problem was first posed by Golub and Pereyra in 1973 [1] in connection with parameter estimation for a particle half-life formula in atomic physics. In practice, models of this type are very common. For example, in the machine learning community, neural networks, their numerous variants [2,3], and some neuro-fuzzy systems [4,5] take the form of a linear ensemble of nonlinear basis functions. In the field of signal processing, Prony's method [6,7], which models a signal as a sum of complex exponentials, is frequently used to analyze the frequency components of a signal. In applied work, waveform decomposition is one of the key steps in processing airborne full-waveform light detection and ranging (LiDAR) data. The full waveform can be decomposed into a linear combination of multiple Gaussian functions. Through waveform decomposition, discrete point cloud and waveform parameter information can be obtained [8]. Furthermore, this approach has many applications in areas such as mechanical systems [9], telecommunications [10], robotics [11], and environmental sciences [12]. Numerically, these applications can be viewed as nonlinear data-fitting problems. Such problems are often quite challenging; fortunately, by exploiting the special structure of separable nonlinear models, efficient algorithms can be obtained.
Based on the special structure of the separable nonlinear least squares problem, Golub and Pereyra [1] proposed the variable projection (VP) algorithm, which eliminates the linear parameters to obtain a reduced functional involving only the nonlinear parameters, and used the Levenberg-Marquardt (LM) algorithm to minimize it. Reducing the dimension of the parameter space improves the chance of obtaining a global optimal solution [13]. Consequently, a number of improvements and applications of VP [14] have built on this basis. Kaufman [15] proposed a modified VP algorithm based on trapezoidal decomposition, which reduced computational complexity and improved computational efficiency. Subsequently, Ruano et al. [16] proposed an improved VP algorithm based on QR decomposition for the case of a sparse nonlinear function matrix, which also effectively improved efficiency. Further, Gan and Li [17] proposed a VP algorithm based on Gram-Schmidt matrix decomposition for the case in which the number of observations is much larger than the number of linear parameters, which reduces the amount of calculation required.
In this study, to handle an ill-posed nonlinear function matrix, singular value decomposition (SVD) [18] is adopted to simplify the VP algorithm and improve the stability of the calculation. After the linear parameters are eliminated by the improved VP algorithm, the separable least squares problem is transformed into an optimization problem containing only the nonlinear parameters [19]. As regards estimation methods for nonlinear parameters, Liu et al. [20] combined parameter estimation with sequential quadratic programming (SQP), developing a gradient-based optimization algorithm to determine the optimal time delays and system parameters in a novel dynamic optimization problem for nonlinear multistage systems with time delays. However, this approach was restricted to parameter identification problems. The common iterative solvers for nonlinear least squares [21] are the gradient descent method [22], the Gauss-Newton method [23], and the Levenberg-Marquardt (LM) method [24,25]. For example, in [26], the nonlinear least squares problem of the distributionally robust parameter identification model for time-delay systems is transformed into a single-level optimization problem and a gradient-based optimization method is developed to solve the transformed problem. The method involves only first-order moment information and is simple to calculate. To obtain estimates that are robust against measurement noise, Liu et al. [27] proposed a robust estimation formulation in which the cost function is the variance of the error function and an additional constraint indicates an allowable sacrifice from the optimal expectation value of the classical estimation problem. On this basis, a gradient-based optimization algorithm was developed to numerically solve the classical and robust parameter estimation problems. This is more efficient than existing methods for problems in which the optimization parameters outnumber the constraints.
The gradient descent method involves simple calculations, but its convergence speed is generally slower than that of the Gauss-Newton method.
Torres et al. [28] used sequentially semiseparable matrices to calculate the Jacobian matrix [29] and Hessian matrix [30], employing the Gauss-Newton method to optimize the output error of the global system. The effectiveness of the algorithm was verified by numerical examples. Bellavia et al. [31] improved the approximation function by controlling the accuracy level when the accuracy was too low for optimization and then proposed an LM method based on a dynamic precision relationship between the evaluation function and the gradient for solving large-scale nonlinear least squares problems. They proved the global and local convergence and the complexity of this method. The Gauss-Newton method has the advantages of fast convergence and high precision. However, it requires the Jacobian matrix to be of full rank throughout the iteration, and if the problem is highly nonlinear or the residual is large, the method may fail to converge. The LM algorithm overcomes this shortcoming by adjusting a damping parameter, following the idea of a trust region, to effectively control the direction of iterative descent. Once the nonlinear parameters are determined, the least squares (LS) method [32] is used to estimate the linear parameters.
For a general nonlinear multistage system with time delays and system parameters, Liu et al. [33] proposed a new parameter estimation formulation in which the cost function is the variance of the error function and the constraint indicates an allowable sacrifice from the optimal expectation value of the classical parameter estimation problem. This approach can solve parameter estimation problems with multiple stages and multiple time delays and, compared with classical parameter estimation, it is able to withstand uncertainty in the distribution of the measurement data. Nevertheless, this method has the limitation of relying on the statistical distribution of the noisy measurement output.
To enhance the estimation accuracy, Ding et al. [34,35] presented a filtering-based gradient iterative algorithm and a filtering-based least squares iterative algorithm, which resulted in improved convergence speed. However, when the estimated nonlinear parameters lead to an ill-conditioned coefficient matrix in the linear least squares problem, the linear parameter estimates obtained by the LS method are unstable and sometimes differ significantly from the true values. To address this, regularization methods are often used for ill-posed problems in linear parameter estimation [36], such as the Tikhonov regularization method [37], the truncated singular value method [38], and iteration by correcting characteristic values [39,40].
In addition, Chen et al. [41] proposed a weighted generalized cross-validation method to determine Tikhonov regularization parameters for the regularization of separable nonlinear least squares ill-posed problems based on the VP algorithm and verified its effectiveness experimentally. To address the arbitrariness of parameter selection in iteration by correcting characteristic values during the linear least squares solution, Zhai et al. [42] constructed the L-curve [43,44] relating the norm of the parameter solution to the residual norm. They selected the point of maximum curvature to determine the regularization parameter and verified the correctness of the method through numerical experiments. The existing literature presents several effective solutions for the parameter estimation problem, but only a few studies have been conducted on the structural transformation of separable nonlinear models. In this study, we consider the special structure of the separable nonlinear least squares problem and separate the two types of parameters using a VP algorithm based on SVD. We then use the LM algorithm to estimate the nonlinear parameters and estimate the linear parameters using either the LS method or iteration by correcting characteristic values based on the L-curve. By comparing the parameter estimation results for the Gaussian function fitting model and the fractional fitting model with those of the LM method without parameter separation, the validity of the VP algorithm based on SVD is evaluated and the accuracy of the different linear parameter estimation methods is analyzed.
The remainder of this paper is organized as follows. In Section 2, based on an improved VP algorithm derived from SVD, the methods of nonlinear parameter estimation and linear parameter estimation are explained and the algorithm for solving separable nonlinear least squares problems is provided. In Section 3, experiments on the Gaussian function fitting model and the fractional fitting model are used to compare the proposed method with the traditional LM algorithm without parameter separation. Finally, we present our conclusions in Section 4.

Variable Projection Algorithm Based on SVD.
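Consider a set of observations $(t_i, y_i)$ $(i = 1, 2, \ldots, m)$. The parameter estimation problem is to find the optimal parameters $a = (a_1, a_2, \ldots, a_p)^{T}$ and $b = (b_1, b_2, \ldots, b_q)^{T}$ that minimize

$$f(a, b) = \sum_{i=1}^{m} \Big( y_i - \sum_{j=1}^{p} a_j \varphi_j(b; t_i) \Big)^{2}, \tag{1}$$

where $\varphi_j(b; t_i)$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, p)$ is a nonlinear function and $p$ and $q$ are the numbers of linear and nonlinear parameters to be estimated, respectively. The above formula can be written in matrix form as

$$f(a, b) = \| y - \Phi(b)\, a \|_2^{2}, \tag{2}$$

where the $j$th column of $\Phi(b)$ corresponds to the nonlinear function $\varphi_j(b; t)$ associated with the parameter $b$ and $\|\cdot\|_2$ denotes the Euclidean norm. Let

$$y = (y_1, y_2, \ldots, y_m)^{T}, \qquad [\Phi(b)]_{ij} = \varphi_j(b; t_i). \tag{3}$$

For the given nonlinear parameters $b$, the linear parameters $a$ can be estimated by solving the linear least squares problem $\min_a \| y - \Phi(b)\, a \|_2^{2}$, whose solution is

$$a = \Phi^{+}(b)\, y, \tag{4}$$

where $\Phi^{+}(b)$ is the Moore-Penrose pseudoinverse of $\Phi(b)$. Substituting (4) into (2) gives a functional of the nonlinear parameters only:

$$f(b) = \| y - \Phi(b)\Phi^{+}(b)\, y \|_2^{2} = \| (I - P_{\Phi(b)})\, y \|_2^{2} = \| P^{\perp}_{\Phi(b)}\, y \|_2^{2}, \tag{5}$$

where $P_{\Phi(b)} = \Phi(b)\Phi^{+}(b)$ is the orthogonal projector on the linear space spanned by the columns of $\Phi(b)$ and $P^{\perp}_{\Phi(b)} = I - P_{\Phi(b)}$ is the projector on the orthogonal complement of the column space.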
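To simplify the calculation, the matrix $\Phi(b)$, which is composed of nonlinear functions, is decomposed by SVD:

$$\Phi(b) = U S V^{T}, \tag{6}$$

where $U$ is an $m \times m$ orthogonal matrix, $S$ is an $m \times p$ diagonal matrix, and $V$ is a $p \times p$ orthogonal matrix. We obtain $\Phi^{+}(b)$ as

$$\Phi^{+}(b) = V S^{+} U^{T}. \tag{7}$$

Then,

$$P_{\Phi(b)} = \Phi(b)\Phi^{+}(b) = U S S^{+} U^{T}. \tag{8}$$

Suppose that the rank of the matrix $\Phi(b)$ is $r \le p$, so that only the first $r$ elements on the diagonal of $S$ are nonzero. $U$ can be partitioned into an $m \times r$ matrix $U_1$ and an $m \times (m - r)$ matrix $U_2$, and $V$ can be partitioned into a $p \times r$ matrix $V_1$ and a $p \times (p - r)$ matrix $V_2$. Then,

$$P_{\Phi(b)} = U_1 U_1^{T}, \qquad P^{\perp}_{\Phi(b)} = I - U_1 U_1^{T} = U_2 U_2^{T}. \tag{9}$$

The vector of $m$ residual functions is simplified to the following equation by VP based on SVD:

$$F(b) = P^{\perp}_{\Phi(b)}\, y = U_2 U_2^{T} y. \tag{10}$$

Then, the objective function of the separable nonlinear least squares problem is simplified to

$$f(b) = \| U_2 U_2^{T} y \|_2^{2} = \| U_2^{T} y \|_2^{2}. \tag{11}$$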
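As an illustration of the SVD-based projection, the following minimal Python sketch evaluates the linear parameters and the reduced objective for a fixed b. The function name `vp_svd`, the rank tolerance `rtol`, and the two-exponential toy model are assumptions made for this example; they are not taken from the paper, whose experiments were run in MATLAB.

```python
import numpy as np

def vp_svd(Phi, y, rtol=1e-12):
    """Return (a_hat, z, f_b) for fixed nonlinear parameters.

    a_hat = Phi^+ y, z = U2^T y, and f_b = ||U2^T y||^2 is the reduced objective.
    """
    U, s, Vt = np.linalg.svd(Phi, full_matrices=True)
    r = int(np.sum(s > rtol * s[0]))          # numerical rank of Phi
    U1, U2 = U[:, :r], U[:, r:]               # column space and its complement
    V1 = Vt[:r, :].T
    a_hat = V1 @ ((U1.T @ y) / s[:r])         # pseudoinverse solution for a
    z = U2.T @ y                              # coordinates of the projected residual
    return a_hat, z, float(z @ z)

# Hypothetical toy model: y = a1*exp(-b1*t) + a2*exp(-b2*t), noise free
t = np.linspace(0.0, 4.0, 50)
b = np.array([0.5, 2.0])
Phi = np.exp(-np.outer(t, b))
a_true = np.array([1.0, 3.0])
y = Phi @ a_true

a_hat, z, f_b = vp_svd(Phi, y)
```

Working with the coordinates `U2.T @ y` rather than the full vector `U2 @ U2.T @ y` leaves the norm unchanged, which is why the reduced objective can be evaluated without forming the projector explicitly.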

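Theorem 1. Let $f(a, b)$ be the objective function of the separable problem and $f(b)$ the reduced objective function obtained by variable projection. We assume that in the open set $\Omega \subset \mathbb{R}^{q}$, $\Phi(b)$ has constant rank $r \le \min(m, p)$.
(a) If $\hat{b}$ is a critical point (or a global minimizer in $\Omega$) of $f(b)$ and $\hat{a} = \Phi^{+}(\hat{b})\, y$, then $(\hat{a}, \hat{b})$ is a critical point of $f(a, b)$ (or a global minimizer for $b \in \Omega$) and $f(\hat{a}, \hat{b}) = f(\hat{b})$.
(b) If $(\hat{a}, \hat{b})$ is a global minimizer of $f(a, b)$ for $b \in \Omega$, then $\hat{b}$ is a global minimizer of $f(b)$ in $\Omega$ and $f(\hat{b}) = f(\hat{a}, \hat{b})$. Furthermore, if there is a unique $\hat{a}$ among the minimizing pairs of $f(a, b)$, then $\hat{a}$ must satisfy (4).
Proof. We consider the case of global minimizers, following [1]. For any $b \in \Omega$, the observation space decomposes as $\mathbb{R}^{m} = \mathcal{R}(\Phi(b)) \oplus \mathcal{R}(\Phi(b))^{\perp}$, where $\oplus$ signifies a direct sum. Hence, for every $a$,

$$f(a, b) = \| P_{\Phi(b)}\, y - \Phi(b)\, a \|_2^{2} + \| P^{\perp}_{\Phi(b)}\, y \|_2^{2} \ge f(b),$$

with equality for $a = \Phi^{+}(b)\, y$. Assume that $\hat{b}$ is a global minimizer of $f(b)$ in $\Omega$ and $\hat{a}$ is defined by (4); then, $f(\hat{a}, \hat{b}) = f(\hat{b})$. Suppose that there exists $(a^{*}, b^{*})$ with $b^{*} \in \Omega$ such that $f(a^{*}, b^{*}) < f(\hat{a}, \hat{b})$. Because $f(b^{*}) \le f(a^{*}, b^{*})$ in $\Omega$, we obtain $f(b^{*}) < f(\hat{b})$, which contradicts the assumption that $\hat{b}$ is a global minimizer, and part (a) of the theorem is proved. Conversely, suppose that $(\hat{a}, \hat{b})$ is a global minimizer of $f(a, b)$ for $b \in \Omega$, and let $a^{*} = \Phi^{+}(\hat{b})\, y$. Then $f(a^{*}, \hat{b}) = f(\hat{b}) \le f(\hat{a}, \hat{b})$, and because $(\hat{a}, \hat{b})$ is a global minimizer, we must have equality. If there is a unique $\hat{a}$ among the minimizers of $f(a, b)$, then $a^{*} \equiv \hat{a}$. We still have to show that $\hat{b}$ is a global minimizer of $f(b)$. If there were $\bar{b} \in \Omega$ with $f(\bar{b}) < f(\hat{b})$, then $\bar{a} = \Phi^{+}(\bar{b})\, y$ would give $f(\bar{a}, \bar{b}) = f(\bar{b}) < f(\hat{b}) = f(\hat{a}, \hat{b})$, a contradiction. This completes the proof.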

Nonlinear Parameter Estimation Using the LM Algorithm.
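For the reduced least squares problem containing only the nonlinear parameters, the LM algorithm is adopted, and the nonlinear parameters $b$ are updated as

$$b_{k+1} = b_k + \alpha_k d_k, \tag{14}$$

where $b_k$ is the current nonlinear parameter vector and $\alpha_k$ is a step length that ensures a decrease in the objective function (11). It is calculated by an inexact line search: for fixed constants $\beta \in (0, 1)$ and $\sigma \in (0, 1/2)$, we let $m_k$ be the smallest nonnegative integer $m$ satisfying

$$f(b_k + \beta^{m} d_k) \le f(b_k) + \sigma \beta^{m} g_k^{T} d_k, \tag{15}$$

and set $\alpha_k = \beta^{m_k}$, where $g_k = \nabla f(b_k)$ and $d_k$ is the search direction, which is determined by the following equation:

$$\big( J_k^{T} J_k + \mu_k I \big)\, d_k = - J_k^{T} F(b_k), \tag{16}$$

where $J_k \in \mathbb{R}^{m \times q}$ is the Jacobian matrix of $F(b)$ at $b_k$ and $\mu_k > 0$ is the damping parameter. Kaufman [15] formulated an explicit analytic expression for the Jacobian, which ensures the efficiency and reliability of VP algorithms. Combined with the objective function simplified by SVD, the $j$th column of the Jacobian matrix $J_k$ of the residual vector $F$ is expressed as

$$J_k e_j = - P^{\perp}_{\Phi(b_k)} \frac{\partial \Phi(b_k)}{\partial b_j} \Phi^{+}(b_k)\, y = - U_2 U_2^{T} \frac{\partial \Phi(b_k)}{\partial b_j} V S^{+} U^{T} y, \qquad j = 1, 2, \ldots, q. \tag{17}$$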
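The damped direction and the backtracking step length can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `lm_step`, the constants `beta = 0.5` and `sigma = 1e-4`, the fixed damping value, and the one-parameter exponential test problem are all hypothetical, and the objective is taken to be the squared norm of the residual vector.

```python
import numpy as np

def lm_step(F, J, b, mu, beta=0.5, sigma=1e-4, max_backtracks=30):
    """One damped step: solve for the direction, then backtrack on the step length."""
    Fk, Jk = F(b), J(b)
    g = 2.0 * Jk.T @ Fk                                   # gradient of ||F(b)||^2
    d = np.linalg.solve(Jk.T @ Jk + mu * np.eye(b.size), -Jk.T @ Fk)
    f0 = Fk @ Fk
    alpha = 1.0
    for _ in range(max_backtracks):
        Fn = F(b + alpha * d)
        if Fn @ Fn <= f0 + sigma * alpha * (g @ d):       # sufficient decrease
            break
        alpha *= beta                                     # shrink the step
    return b + alpha * d, d, alpha

# Hypothetical one-parameter problem: fit exp(-b*t) to data generated with b = 1.5
t = np.linspace(0.0, 1.0, 20)
y = np.exp(-1.5 * t)
F = lambda b: np.exp(-b[0] * t) - y
J = lambda b: (-t * np.exp(-b[0] * t))[:, None]

b_hat = np.array([0.5])
f_start = F(b_hat) @ F(b_hat)
b_hat, d, alpha = lm_step(F, J, b_hat, mu=1e-2)
f_first = F(b_hat) @ F(b_hat)
for _ in range(30):
    b_hat, d, alpha = lm_step(F, J, b_hat, mu=1e-2)
```

In the separable setting, `F` and `J` would return the projected residual and its Jacobian; here a plain residual is used so that the step logic can be read in isolation.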

Here, $\mu_k$ in (16) is the damping parameter. It is adjusted with a strategy similar to that used for the trust region radius [45]. A quadratic function is defined at the current iteration point $b_k$ as follows:

$$v(d) = \| F(b_k) + J_k d \|_2^{2}. \tag{18}$$

Then, the ratio of the actual reduction in the objective function to the reduction predicted by $v$ is calculated:

$$\eta_k = \frac{f(b_k) - f(b_k + d_k)}{v(0) - v(d_k)}. \tag{19}$$

When $\eta_k$ is close to one, the quadratic function $v(d)$ fits the objective function well near $b_k$, and the parameter $\mu$ should be decreased; in this case, the step approaches the Gauss-Newton step, which is more effective. When $\eta_k$ is close to zero, the quadratic function $v(d)$ fits the objective function poorly near $b_k$, and the parameter $\mu$ should be increased. When $\eta_k$ is close to neither zero nor one, $\mu_k$ is suitable and does not need to be adjusted. The critical values of $\eta$ are usually 0.25 and 0.75, and the adjustment rule for $\mu$ is as follows:

$$\mu_{k+1} = \begin{cases} c_1 \mu_k, & \eta_k > 0.75, \\ \mu_k, & 0.25 \le \eta_k \le 0.75, \\ c_2 \mu_k, & \eta_k < 0.25, \end{cases} \tag{20}$$

where $0 < c_1 < 1 < c_2$ are fixed constants.
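A short Python sketch of this adjustment is given below. The thresholds 0.25 and 0.75 come from the text; the function names and the scaling factors `shrink = 0.1` and `grow = 10.0` are assumptions chosen for illustration, since the factors used by the authors are not stated here. The quadratic model is taken to be the squared norm of the linearized residual.

```python
import numpy as np

def gain_ratio(F_old, F_new, J, d):
    """Actual reduction of ||F||^2 divided by the reduction predicted by the model."""
    f_old = F_old @ F_old
    model = F_old + J @ d                  # linearized residual at the trial point
    predicted = f_old - model @ model
    actual = f_old - F_new @ F_new
    return actual / predicted

def update_mu(mu, eta, shrink=0.1, grow=10.0, low=0.25, high=0.75):
    """Decrease mu when the model fits well, increase it when it fits poorly."""
    if eta > high:
        return mu * shrink
    if eta < low:
        return mu * grow
    return mu

# Hypothetical linear residual F(b) = A b - c, for which the model is exact
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
c = np.array([1.0, 2.0, 3.0])
b = np.zeros(2)
mu = 1.0
F_old = A @ b - c
d = np.linalg.solve(A.T @ A + mu * np.eye(2), -A.T @ F_old)
eta = gain_ratio(F_old, A @ (b + d) - c, A, d)
mu_next = update_mu(mu, eta)
```

For a linear residual the ratio equals one up to rounding, so the damping parameter is reduced; for strongly nonlinear residuals the ratio drops and the damping grows, which shortens the next step.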

Linear Parameter Estimation by Correcting Characteristic Value Based on L-Curve.
Once the nonlinear parameter estimates $b$ have been obtained as described in Section 2.2, the linear parameters $a$ can be calculated by (4). When $\Phi^{T}(b)\Phi(b)$ is a nonsingular matrix, the least squares method is used to calculate $a$ directly. However, when $\Phi^{T}(b)\Phi(b)$ is singular or the condition number of $\Phi(b)$ exceeds 100, the linear least squares solution is unstable or cannot be computed, and a regularization method is needed. In this study, an iteration method that corrects characteristic values based on the L-curve is used to solve the problem. The least squares normal equation for the linear parameter estimation is

$$\Phi^{T}(b)\Phi(b)\, a = \Phi^{T}(b)\, y. \tag{21}$$

Adding $\lambda a$ to both sides of (21), we obtain

$$\big( \Phi^{T}(b)\Phi(b) + \lambda I \big)\, a = \Phi^{T}(b)\, y + \lambda a. \tag{22}$$

Letting $N = \Phi^{T}(b)\Phi(b)$ and $W = \Phi^{T}(b)\, y$, the step of the iteration method that corrects the characteristic value is

$$a_{k+1} = (N + \lambda I)^{-1} (W + \lambda a_k), \tag{23}$$

where $a_k$ is the current linear parameter vector and $\lambda$ is a regularization parameter selected according to the L-curve method. The L-curve describes the relationship between the norm of the regularized solution $\| a(\lambda) \|_2^{2}$ and the residual norm $\| y - \Phi(b)\, a(\lambda) \|_2^{2}$ for each value of the regularization parameter. The convergence of the spectral correction iterative method based on the L-curve is proved as follows.
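Let the initial estimated value of $a$ be $a_0$, and let $N_{LP} = \lambda (N + \lambda I)^{-1}$. The iterative calculation process of spectral correction can then be written as

$$a_i = (N + \lambda I)^{-1} W + N_{LP}\, a_{i-1}, \qquad i = 1, 2, \ldots$$

When $i = m$, where $m$ is an insufficient number of iterations of the spectral correction iteration method, the relationship between the estimated value $a_m$ and the initial value $a_0$ is

$$a_m = \sum_{i=0}^{m-1} N_{LP}^{\,i}\, (N + \lambda I)^{-1} W + N_{LP}^{\,m}\, a_0.$$

When $i = j$, where $j$ is a sufficient number of iterations and $m < j$, the relationship between the estimated value $a_j$ and the initial value $a_0$ follows from (23) in the same way:

$$a_j = \sum_{i=0}^{j-1} N_{LP}^{\,i}\, (N + \lambda I)^{-1} W + N_{LP}^{\,j}\, a_0.$$

In the case of $\mathrm{Rank}(N) = t$, $N$ is a positive definite matrix of order $t$, and there is an orthogonal matrix $Q$ of order $t$ such that

$$Q^{T} N Q = D,$$

where $D = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_t)$ and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_t > 0$ are the eigenvalues of $N$. Let $\bar{D} = \mathrm{diag}(\sigma_1 + \lambda, \sigma_2 + \lambda, \ldots, \sigma_t + \lambda)$; then

$$N_{LP} = \lambda\, Q \bar{D}^{-1} Q^{T}, \qquad \| N_{LP} \|_2 = \frac{\lambda}{\sigma_t + \lambda} < 1.$$

According to the convergence of matrix sequences, in the case of $\| N_{LP} \| < 1$, $N_{LP}^{\,i} \to 0$ and the matrix power series $\sum_{i=0}^{\infty} N_{LP}^{\,i}$ is absolutely convergent, and the sum is

$$\sum_{i=0}^{\infty} N_{LP}^{\,i} = (I - N_{LP})^{-1} = (N + \lambda I)\, N^{-1},$$

so that $\lim_{j \to \infty} a_j = (N + \lambda I) N^{-1} (N + \lambda I)^{-1} W = N^{-1} W$. That is, in the case of $\mathrm{Rank}(N) = t$, the spectral correction iterative method based on the L-curve converges to the least squares solution for any initial value. Therefore, the relationship between the estimated result $a_m$ and the limit $\hat{a} = N^{-1} W$ of the iteration can be further rewritten as

$$a_m = \hat{a} + N_{LP}^{\,m} (a_0 - \hat{a}),$$

which shows that the error decays geometrically with the number of iterations. In the case of $\mathrm{Rank}(N) < t$, the initial value $a_0$ is set to 0 in the spectral correction iterative method. Further, $\lim_{i \to \infty} a_i = N^{+} W$, where $N^{+}$ denotes the Moore-Penrose pseudoinverse, that is, the iteration converges to the minimum-norm least squares solution. Therefore, this method is consistent with the least squares method and is also feasible.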
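The convergence behavior can be checked numerically with a small Python sketch. The function name `ccv_iterate`, the stopping tolerance, the fixed value `lam = 0.1` (chosen by hand here rather than by the L-curve), and both test matrices are assumptions introduced for illustration only.

```python
import numpy as np

def ccv_iterate(Phi, y, lam, a0=None, tol=1e-12, max_iter=100000):
    """Iterate a <- (N + lam*I)^{-1} (W + lam*a) with N = Phi^T Phi, W = Phi^T y."""
    N = Phi.T @ Phi
    W = Phi.T @ y
    M = N + lam * np.eye(N.shape[0])       # corrected (shifted) normal matrix
    a = np.zeros(N.shape[0]) if a0 is None else np.asarray(a0, dtype=float)
    for k in range(1, max_iter + 1):
        a_next = np.linalg.solve(M, W + lam * a)
        if np.linalg.norm(a_next - a) <= tol * max(1.0, np.linalg.norm(a_next)):
            return a_next, k
        a = a_next
    return a, max_iter

# Full-rank case: the limit should match the ordinary least squares solution
Phi_full = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_full = np.array([1.0, 2.0, 2.0, 4.0])
a_full, n_full = ccv_iterate(Phi_full, y_full, lam=0.1, a0=[5.0, -5.0])
a_ls = np.linalg.lstsq(Phi_full, y_full, rcond=None)[0]

# Rank-deficient case (duplicate columns), started from zero
Phi_def = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y_def = np.array([1.0, 2.0, 3.0])
a_def, n_def = ccv_iterate(Phi_def, y_def, lam=0.1)
a_min_norm = np.linalg.pinv(Phi_def) @ y_def
```

Because the shifted matrix is always invertible for a positive `lam`, each step is well defined even when the normal matrix itself is singular, which is the property the method relies on.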

Summarization of the Algorithm.
The estimation methods for the linear and nonlinear parameters proposed in Sections 2.1-2.3 are combined as follows. The method in which the parameters are separated by SVD-based variable projection, the nonlinear parameters are solved with the LM algorithm, and the linear parameters are solved directly with the least squares method is referred to as LM VP + LS. When the linear parameters are instead solved using the iteration method that corrects characteristic values based on the L-curve, the method is referred to as LM VP + CCVL. The traditional LM method without separation of parameters is referred to as LM unSep. The steps for solving the separable nonlinear least squares problem are summarized as follows.
A flowchart of Algorithm 1 is presented in Figure 1.
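The overall procedure of the LM VP + LS variant can be sketched end to end in Python. This is a simplified illustration rather than the authors' implementation: it assumes a sum-of-exponentials model in which each basis function depends on a single nonlinear parameter, a full-column-rank basis matrix, a Jacobian of the Kaufman type, and an accept-or-reject step with the 0.25/0.75 damping rule in place of a line search. All names, scaling factors, and starting values are hypothetical.

```python
import numpy as np

def basis(b, t):
    """Phi(b) and the column derivatives d(phi_j)/d(b_j) for phi_j = exp(-b_j t)."""
    Phi = np.exp(-np.outer(t, b))
    dPhi = -t[:, None] * Phi
    return Phi, dPhi

def project(b, t, y):
    """Eliminate the linear parameters; return a, projected residual F, Jacobian J."""
    Phi, dPhi = basis(b, t)
    U, s, Vt = np.linalg.svd(Phi, full_matrices=False)
    a = Vt.T @ ((U.T @ y) / s)                 # a = Phi^+ y (full column rank assumed)
    P_perp = lambda v: v - U @ (U.T @ v)       # projector on the orthogonal complement
    F = P_perp(y)
    J = -P_perp(dPhi * a)                      # column j: -P_perp (d phi_j / d b_j) a_j
    return a, F, J

def lm_vp_ls(b0, t, y, mu=1e-2, tol=1e-10, max_iter=200):
    b = np.asarray(b0, dtype=float)
    a, F, J = project(b, t, y)
    for k in range(max_iter):
        g = J.T @ F
        if np.linalg.norm(g) < tol:            # stationarity test
            break
        d = np.linalg.solve(J.T @ J + mu * np.eye(b.size), -g)
        a_new, F_new, J_new = project(b + d, t, y)
        pred = F @ F - np.sum((F + J @ d) ** 2)
        eta = (F @ F - F_new @ F_new) / pred if pred > 0 else 0.0
        if eta > 0.75:
            mu *= 0.1                          # model fits well: less damping
        elif eta < 0.25:
            mu *= 10.0                         # model fits poorly: more damping
        if F_new @ F_new < F @ F:              # accept only improving steps
            b, a, F, J = b + d, a_new, F_new, J_new
    return a, b, k

# Hypothetical noise-free data from a two-exponential model
t = np.linspace(0.0, 4.0, 60)
a_true, b_true = np.array([1.0, 3.0]), np.array([0.5, 2.0])
y = np.exp(-np.outer(t, b_true)) @ a_true
a_hat, b_hat, n_iter = lm_vp_ls([0.4, 2.5], t, y)
```

Only the two nonlinear parameters are iterated; the linear parameters are recomputed by least squares inside `project` at every trial point, which is the dimension reduction that the separation provides.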

Numerical Examples
In this study, we used three examples, namely the Gaussian function fitting model, the fractional fitting model, and the decomposition of full-waveform LiDAR data, to verify the method proposed in Section 2 for solving separable nonlinear least squares problems. The results were then compared with those of the LM algorithm with unseparated parameters with respect to the number of iterations, the number of function evaluations, and the fitting accuracy. The experiments were performed using MATLAB 2016b on a 2.3-GHz desktop PC running Windows 10.
Step 2: calculate the matrix Φ(b k ) of the nonlinear functions and the Jacobian matrix J k of the objective function after VP, using (17).
Step 3: calculate the iteration step length and iteration direction using (15) and (16), and then update the nonlinear parameters using (14).
Step 5: calculate the linear parameters. When Φ Τ (b)Φ(b) is a nonsingular matrix, the least squares method is used to calculate them directly; when Φ Τ (b)Φ(b) is singular or the condition number of Φ(b) exceeds 100, the solution is obtained using (23).

The parameter estimates, obtained by averaging the results of ten runs, are listed in Table 1. The fitting curve is shown in Figure 2, and the difference between the parameter estimates and the true values is shown in Figure 3. Because the estimated parameters obtained by the three methods do not differ significantly and the fitting curves basically coincide, only one fitting curve is drawn.
We can see from Table 1 that the three methods give completely consistent estimates of the nonlinear parameters (μ and θ). For the linear parameters, the results obtained by the LM unSep and LM VP + LS methods are identical, whereas the results obtained by the LM VP + CCVL method deviate slightly more from the true values. From Figure 2, we can see that the parameters estimated by the three methods fit the observations very well and that their fitting curves are basically the same. In Figure 3, the closer the difference is to zero, the closer the parameter estimate is to the true value. The figure confirms that the parameter estimation results of the LM unSep and LM VP + LS methods may be taken to be identical and that their differences from the true values are small.

To better understand the comparison of the parameter estimation results obtained using the three methods, Table 2 lists the maximum, minimum, and average values of the sum of squares of residuals between all the parameter estimation values and the true values (all_SSR), which were calculated ten times. Because the latter two methods eliminate linear parameters using the VP algorithm, the sum of squares of residuals between the nonlinear estimated parameters and the true value (nonpar_SSR) are also listed in Table 2.
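The two summary quantities can be computed as in the following Python sketch. The function names, the index convention for the nonlinear parameters, and the numbers in the usage example are hypothetical; they are not the values reported in Table 2.

```python
import numpy as np

def ssr(estimate, truth):
    """Sum of squared differences between estimated and true parameter values."""
    e = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(e @ e)

def summarize(runs, truth, nonlinear_idx):
    """Max, min, and mean of all_SSR and nonpar_SSR over repeated runs."""
    truth = np.asarray(truth, dtype=float)
    runs = [np.asarray(r, dtype=float) for r in runs]
    all_ssr = np.array([ssr(r, truth) for r in runs])
    non_ssr = np.array([ssr(r[nonlinear_idx], truth[nonlinear_idx]) for r in runs])
    stats = lambda v: (float(v.max()), float(v.min()), float(v.mean()))
    return {"all_SSR": stats(all_ssr), "nonpar_SSR": stats(non_ssr)}

# Hypothetical example: parameter vector [a1, b1, b2], two runs
truth = [1.0, 2.0, 3.0]
runs = [[1.0, 2.0, 3.0], [2.0, 2.0, 3.0]]
summary = summarize(runs, truth, nonlinear_idx=[1, 2])
```

Reporting the nonlinear-only quantity separately makes it possible to see whether a discrepancy between methods comes from the nonlinear iteration or from the linear solve that follows it.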
We compare the fitting residual sum of squares and the calculation process of the three methods in Table 3. Table 2 indicates that the all_SSR values obtained by the LM VP + LS and LM unSep methods are numerically equal. The nonpar_SSR values obtained using LM VP + CCVL are equal to those of the other two methods; however, its all_SSR values are higher, indicating that its estimates of the linear parameters are worse than those of the other two methods. Table 3 indicates that the sum of squares of residual errors between the predicted and true values obtained using LM VP + CCVL is the largest and that those of the other two methods are equal. The mean square errors of the LM unSep, LM VP + LS, and LM VP + CCVL methods are 0.0107, 0.0107, and 0.0248, respectively, which are relatively small, indicating that the results obtained by all three methods are reliable. In terms of the number of iterations and the average number of function evaluations, the methods based on VP exhibit a considerable reduction compared with the method in which the parameters are not separated. The residual change of the objective function during the iterative process is shown in Figure 4.
According to Figure 4, in the Gaussian function fitting example, the residuals of all three methods decrease overall and the final parameter estimates are very close to the true values, suggesting that the three methods converge.
In addition, owing to the structural characteristics of the model, when the nonlinear parameters are fixed in the LM VP + LS and LM VP + CCVL methods, the nonlinear function matrix has full rank and its condition number is less than five in all ten calculations; thus, there is no ill-conditioning problem.
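The choice between the direct least squares solution and the regularized iteration can be expressed as a simple condition-number test. In this Python sketch the threshold of 100 is the one stated in Section 2.3, while the function name, the rank-deficiency tolerance, and the test matrices are assumptions made for illustration.

```python
import numpy as np

def choose_linear_solver(Phi, threshold=100.0):
    """Return ("LS" or "CCVL", condition number) for the basis matrix Phi."""
    s = np.linalg.svd(Phi, compute_uv=False)           # singular values, descending
    rank_deficient = s[-1] <= np.finfo(float).eps * max(Phi.shape) * s[0]
    cond = np.inf if rank_deficient else s[0] / s[-1]
    return ("CCVL" if cond > threshold else "LS"), float(cond)

# Hypothetical matrices: orthonormal columns versus nearly collinear columns
well = np.eye(3)[:, :2]
ill = np.array([[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]])
solver_well, cond_well = choose_linear_solver(well)
solver_ill, cond_ill = choose_linear_solver(ill)
```

The condition number is computed from the singular values of the basis matrix itself, so it can reuse the SVD that the variable projection step already requires.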
Based on the experimental results, we draw two conclusions: (1) Because the matrix of nonlinear functions is neither rank deficient nor ill posed, solving directly with the LS method yields more accurate linear parameters than the regularization method. (2) The LM VP + LS method eliminates the linear parameters using the VP algorithm based on SVD, which reduces the dimension of the parameters to be estimated and improves the efficiency of the iteration process. The LM VP + LS method requires fewer iterations than the LM unSep method and gives the best performance at the same precision.

Example 2: Fractional Fitting Model.
The fractional fitting model is also commonly used for curve fitting. The model is described by the following expression:

$$y(t) = \sum_{i=1}^{6} a_i \varphi_i(b; t), \tag{35}$$

where $a = (a_1, a_2, a_3, a_4, a_5, a_6)^{T}$ is the linear parameter vector to be estimated and $b = (b_1, b_2, b_3, b_4, b_5, b_6, b_7, b_8)^{T}$ is the nonlinear parameter vector. The model in (35) is written in matrix form as follows:

$$y = \Phi(b; t)\, a,$$

where $\Phi(b; t) = [\varphi_1, \varphi_2, \ldots, \varphi_6]$ and $\varphi_i$ $(i = 1, 2, \ldots, 6)$ are the fractional basis functions. The parameters are solved using the LM unSep, LM VP + LS, and LM VP + CCVL methods. The parameter estimates obtained by averaging the ten calculations are listed in Table 4. Because the estimated parameters obtained by the three methods do not differ significantly and the fitting curves basically coincide, only one fitting curve is drawn. The fitting curve is shown in Figure 5. Table 4 and Figure 5 indicate that the three methods produce estimates of the nonlinear parameters that are close to the true values. The nonlinear parameter estimates of the LM VP + CCVL and LM VP + LS methods are the same because both methods minimize the same reduced objective function and estimate the nonlinear parameters with the LM algorithm. When the LM VP + LS method is used, the estimated linear parameters deviate significantly from the true values. Nevertheless, Figure 5 shows that the fitting curves of the three methods are similar, which means that the LM VP + LS method cannot recover the optimal parameters even though it satisfies the least squares principle in solving for the linear parameters. This is because the condition number of the nonlinear function matrix is considerably greater than 100 when the separated problem is solved after VP. The condition number is above $10^{7}$ in all ten calculations, so the problems are seriously ill posed. Because the differences in the linear parameters solved by the LM VP + LS method are much larger than those of the nonlinear parameters, these differences were scaled down by a factor of $10^{4}$ before plotting.
The differences between the parameter estimates of the three methods and the true values are shown in Figure 6. Figure 6 indicates that the difference between the parameter estimates and the true values is the smallest for the LM VP + CCVL method; that is, its result is closest to the true values.
To better understand the comparison of the parameter estimation results of the three methods, Table 5 lists the maximum, minimum, and average values of the sum of squares of residuals between all the parameter estimates and the true values (all_SSR) obtained from the ten calculations. Because the latter two methods eliminate the linear parameters using the VP algorithm, the sums of squares of residuals between the estimated nonlinear parameters and the true values (nonpar_SSR) are also listed in Table 5. Table 5 indicates that the all_SSR value of the LM VP + LS method is much larger than those of the other two methods. The main reason is that the matrix of nonlinear functions is seriously ill posed, so the direct least squares solution does not yield the optimal parameters. In contrast, the LM VP + CCVL method, which uses the iteration method that corrects characteristic values based on the L-curve, obtains estimates that are close to the true values, and its sum of squared differences between the parameter estimates and the true values is the smallest. Its overall result is also superior to that of the LM unSep method, in which the parameters are not separated. For the sum of squared residuals of the nonlinear parameters, the LM VP + CCVL and LM VP + LS methods exhibit the same results, which are lower than the values obtained by the LM unSep method.
A comparison of the three methods in terms of model fitting and the calculation process is presented in Table 6. Table 6 indicates that, in terms of the sum of squared residuals between the predicted and true values, whether the maximum, minimum, or average value is considered, the lowest values are obtained with the LM VP + CCVL method, followed by the LM unSep method; the highest values are obtained with the LM VP + LS method. In terms of the mean square error, the LM VP + CCVL method also provides the lowest values. The number of iterations and the average number of function evaluations decrease considerably with the VP algorithm; the average number of iterations was reduced from 161.4 to 20.8 and the average number of function evaluations was reduced from 2516.2 to 207.7. The residual change in the objective function during the iterative process is shown in Figure 7.
According to Figure 7, in the fractional model fitting example, the residuals of all three methods decrease overall and the final parameter estimates approach the true values, so it can be seen that the three methods converge. The experimental analysis indicates that the problem is seriously ill posed and that, among the methods for the separable nonlinear least squares problem, the LM VP + CCVL method affords the highest fitting precision. Its computational efficiency is also much better than that of the LM method without separation.

Example 3: Decomposition of Full-Waveform LiDAR Data.
The experimental data in this section come from full-waveform LiDAR data of a region collected in 2016. The full-waveform LiDAR system records the backscattered energy at different elevation points within a certain elevation range in the form of waveforms. The survey area mainly contains buildings, trees, and roads. The point cloud and waveform information are stored in a LAS file in the LAS 1.3 standard format. The waveform data sampling interval is 1 ns and the number of samples is 287. The echo waveform is regarded as the superposition of several Gaussian functions. The model form is as follows:

$$y(t_i) = \sum_{j=1}^{4} a_j\, g(t_i; \mu_j, \theta_j) + \xi_{t_i},$$

where $g(t; \mu_j, \theta_j)$ is the $j$th Gaussian function with parameters $\mu_j$ and $\theta_j$, $\xi_{t_i}$ is the Gaussian noise, $a = (a_1, a_2, a_3, a_4)^{T}$ is the linear parameter vector to be estimated, and $b = (\mu_1, \theta_1, \ldots, \mu_4, \theta_4)^{T}$ is the nonlinear parameter vector.
It can be seen from Figure 8 that, with the LM unSep method, only the first decomposed waveform fits the observations and the latter three decomposed waveforms are not well fitted. LM VP + LS and LM VP + CCVL provide accurate fitting results. As there are no true values for actual measurements, the estimated results are not compared with true values here. To better compare the parameter estimation results of the three methods, the mean square error between the fitting results and the observed values, the maximum fit difference (Diff-max), the minimum fit difference (Diff-min), the number of iterations, and the number of function evaluations of the three methods are presented in Table 7. It can be seen from Table 7 that, as in the simulation experiments, the reduced dimension of the parameter estimation brought about by parameter separation improves the chance of convergence. LM VP + LS is the best of the three methods, with the fewest iterations and function evaluations, and LM VP + CCVL is second. However, the LM unSep method, in which the parameters are not separated, does not produce the correct result.
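To make the separable structure of the waveform model concrete, the following Python sketch builds the Gaussian basis matrix for four components and recovers the amplitudes by least squares when the nonlinear parameters are known. The Gaussian parameterization exp(-(t - mu)^2 / (2 theta^2)), the function name, and all numerical values are assumptions made for illustration; only the sample count of 287 and the 1 ns spacing come from the text.

```python
import numpy as np

def gaussian_basis(t, mu, theta):
    """Columns are Gaussian components evaluated at the sample times t."""
    t = np.asarray(t, dtype=float)[:, None]
    mu = np.asarray(mu, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.exp(-((t - mu) ** 2) / (2.0 * theta ** 2))

# 287 samples at 1 ns spacing; component parameters below are hypothetical
t = np.arange(287.0)
mu = np.array([40.0, 90.0, 150.0, 220.0])
theta = np.array([5.0, 8.0, 6.0, 10.0])
Phi = gaussian_basis(t, mu, theta)

a = np.array([50.0, 30.0, 20.0, 10.0])        # amplitudes (linear parameters)
waveform = Phi @ a                            # noise-free synthetic echo
a_hat = np.linalg.lstsq(Phi, waveform, rcond=None)[0]
```

With the centers and widths fixed, the amplitudes enter linearly, so only the eight nonlinear parameters need to be iterated when the variable projection approach is applied.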
The residual changes of the objective function during the iterative process are shown in Figure 9.
Figure 9 shows that all three methods tend to converge, but LM unSep, whose parameters are not separated, does not converge to the optimal result, and its residual sum of squares remains large. The residual sums of squares of the remaining two methods tend to zero. With parameter separation, the number of iterations is reduced from 20 to 10 and the number of function evaluations is reduced from 290 to 106, so the calculation is greatly simplified.
In short, these experiments indicate that the estimation method of parameter separation shows a great improvement in terms of the number of iterations, number of function calculations, and calculation result.

Conclusion
In this study, the linear and nonlinear parameters were separated by the VP algorithm based on SVD. The separable least squares problem was thereby transformed into a least squares problem with only nonlinear parameters, which reduced the dimension of the parameters, the number of iterations, and the number of function evaluations and improved computational efficiency. In addition, to handle an ill-posed coefficient matrix of nonlinear functions when solving for the linear parameters, an iteration method that corrects characteristic values based on the L-curve was adopted. This helped to ensure the convergence of the model parameter estimation and improved prediction accuracy. The parameter estimation method used in our study is suitable for separable nonlinear least squares problems with a large number of linear parameters. However, one limitation is that rank deficiency often occurred in the process of separable nonlinear least squares parameter estimation; this will be addressed in future research.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.