On Levenberg-Marquardt-Kaczmarz iterative methods for solving systems of nonlinear ill-posed equations

In this article a modified Levenberg-Marquardt method coupled with a Kaczmarz strategy for obtaining stable solutions of nonlinear systems of ill-posed operator equations is investigated. We show that the proposed method is a convergent regularization method. Numerical tests are presented for a nonlinear inverse doping problem based on a bipolar model.


Introduction
In this paper we propose a new method for obtaining regularized approximations of systems of nonlinear ill-posed operator equations.
The inverse problem we are interested in consists in determining an unknown quantity x ∈ X from the set of data (y_0, …, y_{N−1}) ∈ Y^N, where X, Y are Hilbert spaces and N ≥ 1 (the case y_i ∈ Y_i with possibly different spaces Y_0, …, Y_{N−1} can be treated analogously). In practical situations, we do not know the data exactly. Instead, we have only approximate measured data y_i^δ ∈ Y satisfying ‖y_i^δ − y_i‖ ≤ δ_i, with δ_i > 0 (noise level). We use the notation δ := (δ_0, …, δ_{N−1}). The finite set of data above is obtained by indirect measurements of the parameter x, this process being described by the model F_i(x) = y_i, i = 0, …, N−1, where F_i : D_i ⊂ X → Y, and D_i are the corresponding domains of definition.
The loping Levenberg-Marquardt-Kaczmarz (l-LMK) method consists in incorporating the Kaczmarz strategy into the Levenberg-Marquardt method. This procedure is analogous to the ones introduced in [8], [4], [9], and [3] regarding the Landweber-Kaczmarz (LK) iteration, the Steepest-Descent-Kaczmarz (SDK) iteration, the Expectation-Maximization-Kaczmarz (EMK) iteration, and the Iteratively Regularized Gauss-Newton-Kaczmarz (IRGNK) iteration, respectively. As usual in Kaczmarz-type algorithms, a group of N subsequent steps (starting at some multiple k of N) is called a cycle.
The l-LMK iteration should be terminated when, for the first time, all x_k^δ are equal within a cycle. That is, we stop the iteration at the index k_*^δ defined in (7). Notice that k_*^δ is the smallest multiple of N such that ω_k = 0 for all k within the corresponding cycle, or equivalently (see Proposition 4.1 below) such that ‖F_i(x^δ_{k_*^δ}) − y_i^δ‖ < τ δ_i for i = 0, …, N−1. For exact data (δ = 0) we have ω_k = 1 for each k, and the l-LMK iteration reduces to the Levenberg-Marquardt-Kaczmarz (LMK) method. For noisy data, however, the l-LMK method is fundamentally different from the LMK method: the bang-bang relaxation parameter ω_k has the effect that the iterates defined in (4), (5) become stationary once all components of the residual vector F_i(x_k^δ) − y_i^δ fall below a pre-specified threshold. This characteristic renders (4), (5) a regularization method in the sense of [6] (see Section 4).
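The structure of a cycle of the l-LMK iteration, with its bang-bang relaxation parameter and the stopping rule above, can be sketched as follows. This is a toy illustration only: the operators F_i, the data, and the parameter values α and τ below are hypothetical, and the inner linear system is solved exactly rather than inexactly.

```python
import numpy as np

# Toy instance: two scalar equations F_i(x) = y_i for x in R^2.
# All operators, data and parameters here are hypothetical.
Fs = [lambda x: np.array([x[0] + x[1] ** 2]),
      lambda x: np.array([x[0] ** 2 - x[1]])]
Js = [lambda x: np.array([[1.0, 2.0 * x[1]]]),   # linearizations F_i'(x)
      lambda x: np.array([[2.0 * x[0], -1.0]])]

x_true = np.array([1.0, 2.0])
delta = [1e-4, 1e-4]                              # noise levels delta_i
y_delta = [F(x_true) + d for F, d in zip(Fs, delta)]  # noisy data y_i^delta

tau, alpha = 2.0, 1.0                             # threshold and LM parameter
x = np.array([0.5, 1.0])                          # starting point x_0

for cycle in range(200):
    updated = False
    for i in range(2):
        r = y_delta[i] - Fs[i](x)
        # bang-bang relaxation: the step is "loped" (skipped) once the
        # i-th residual falls below the threshold tau * delta_i
        omega = 1.0 if np.linalg.norm(r) >= tau * delta[i] else 0.0
        if omega:
            A = Js[i](x)
            # Levenberg-Marquardt step: (A^T A + alpha I) h = A^T r
            h = np.linalg.solve(A.T @ A + alpha * np.eye(2), A.T @ r)
            x = x + h
            updated = True
    if not updated:       # all iterates equal within the cycle: stop
        break
```

Once both residuals fall below τ δ_i, a full cycle produces no update and the iteration stops, in accordance with the stopping rule above.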
The article is outlined as follows. In Section 2 we formulate basic assumptions and derive some auxiliary estimates required for the analysis. In Section 3 we prove a convergence result for the LMK method. In Section 4 we prove a semiconvergence result for the l-LMK method. In Section 5 a numerical experiment for an inverse doping problem is presented. Section 6 is devoted to final remarks and conclusions.

Assumptions and basic results
We begin this section by introducing some assumptions that are needed for the convergence analysis presented in the next sections. These assumptions derive from the classical assumptions used in the analysis of iterative regularization methods [6,12,20].
(A1) The operators F_i and their linearizations F_i' (see (A2)) are continuous, and the corresponding domains of definition D_i have nonempty interior, i.e., there exists ρ > 0 such that B_ρ(x_0) ⊂ D_i for i = 0, …, N−1, where B_ρ(x_0) is the ball of radius ρ around x_0. Moreover, we assume the existence of C > 0 such that ‖F_i'(x)‖ ≤ C for all x ∈ B_ρ(x_0) and i = 0, …, N−1 (notice that x_0^δ = x_0 is used as starting point of the l-LMK iteration).
(A2) We assume that the local tangential cone condition [6,12] holds for some η < 1. This is a uniform assumption on the nonlinearity of the operators F_i. Note that F_i'(x) need not necessarily be the Fréchet derivative of F_i at x, but it should be a bounded linear operator that depends continuously on x; see (A1).
(A3) We assume the existence of x_* ∈ B_{ρ/2}(x_0) such that F_i(x_*) = y_i, i = 0, …, N−1, where y_i are the exact data.
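The local tangential cone condition referred to in (A2) is stated in [6,12]; in its usual form (a sketch, with B_ρ(x_0) and η as in (A1), (A2)) it reads:

```latex
\| F_i(x) - F_i(\bar{x}) - F_i'(x)(x - \bar{x}) \|
  \le \eta \, \| F_i(x) - F_i(\bar{x}) \| ,
\qquad x, \bar{x} \in B_\rho(x_0), \quad i = 0, \dots, N-1 ,
```

with 0 ≤ η < 1.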
We are now in a position to choose the positive constants α and τ in (5), (6). For the rest of this article we shall assume that α and τ are chosen according to (11), for some 0 < q < 1.
In the sequel we verify some basic facts that are helpful for the convergence analysis derived in the next two sections. The first result concerns some useful identities.
Lemma 2.1. Let x_k^δ, h_k and α be defined by (4), (5) and (11), respectively, and assume that (A1)–(A3) hold true. Then, for all k ∈ N, assertions a)–c) hold.

Proof. The proof of a) and b) is straightforward and will be omitted; c) follows by a direct computation.

Remark 2.2. According to [10], the Levenberg-Marquardt iteration should be implemented with a variable parameter α_k, which for the LMK method would mean that α_k is chosen in such a way that (13) holds for some 0 < q < 1. From (12) and the monotonicity with respect to α of the corresponding residual (see, e.g., [6]), it follows that α as chosen in (11) is larger than the α_k's defined in [10] (see [7, Theorem 3.3.1]).

It is worth noticing that Lemma 2.3 as well as Proposition 2.4 remain valid with α_k chosen as in (13).
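For reference, the a posteriori parameter choice of [10] alluded to in Remark 2.2 reads, for a single operator equation, as follows (a sketch of the standard formulation; the precise form of (13) in the Kaczmarz setting may differ):

```latex
\big\| y^{\delta} - F(x_k^{\delta}) - F'(x_k^{\delta})\, h_k(\alpha_k) \big\|
  = q \, \big\| y^{\delta} - F(x_k^{\delta}) \big\| ,
\qquad 0 < q < 1 ,
```

i.e., α_k is determined such that the linearized residual is reduced by exactly the factor q.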

Proposition 2.4 (Monotonicity). Under the assumptions of Lemma 2.3, for all k ∈ N the iterates x_k^δ belong to B_{ρ/2}(x_*) and satisfy (14). Moreover, (16) holds.

Proof. From (A3) it follows that (16) is satisfied with equality for k = 0. If ω_0 = 1, it follows from Lemma 2.3 that (14) holds for k = 0. Then we conclude from (6) and (14) an estimate whose last term on the right-hand side is, due to (11), nonpositive. Thus, (16) holds for k = 0; in particular, we have x_1^δ ∈ B_{ρ/2}(x_*). The proof now follows by an inductive argument.
In the next two sections we provide a complete convergence analysis for the l-LMK iteration, showing that it is a convergent regularization method in the sense of [6].

Convergence for exact data
Unless otherwise stated, we assume in the sequel that (A1)–(A3) hold true and that x_k^δ, h_k, α, τ and q are defined by (4), (5) and (11). Our main goal in this section is to prove convergence in the case δ_i = 0, i = 0, …, N−1. As already observed in Section 1, the l-LMK iteration reduces in this case to the LMK iteration (i.e., ω_k = 1 in (6)). For exact data y = (y_0, …, y_{N−1}), the iterates in (4) are denoted by x_k, in contrast to x_k^δ in the noisy data case.
This assertion is a direct consequence of [11, Proposition 2.1]. For a detailed proof we refer the reader to [12].
In the sequel we derive some estimates that are helpful for the proof of the convergence result. From Proposition 2.4 it follows that (14) holds for all k ∈ N. Since the data are exact, (14) can be rewritten as (17). Inserting (12) into (17) and summing over all k leads to a first estimate; neglecting the last term on the right-hand side of (17) and summing over all k leads to a second one; and neglecting the first term on the right-hand side of (17) instead leads to a third.

Theorem 3.2 (Convergence for exact data). For exact data, the iteration x_k converges to a solution of (2) as k → ∞. Moreover, if the kernel condition [5], N(F'(x^†)) ⊆ N(F'(x)) for all x ∈ B_ρ(x_0), is satisfied, where F is defined as in (3), then x_k converges to x^†.

Proof. We define e_k := x^† − x_k. From Proposition 2.4 it follows that ‖e_k‖ is monotone nonincreasing. Therefore, ‖e_k‖ converges to some ε ≥ 0. In the following we show that (e_k) is in fact a Cauchy sequence.
Thus, (e_k) is a Cauchy sequence and x_k = x^† − e_k converges to some element x^+ ∈ X. Since the residuals F_{[k]}(x_k) − y_{[k]} converge to zero, x^+ is a solution of (2).
An inductive argument shows that all iterates x_k are elements of x_0 + N(F'(x^†))^⊥. Therefore, x^+ ∈ x_0 + N(F'(x^†))^⊥. By Remark 3.1, x^† is the only solution of (2) in B_{ρ/2}(x_0) ∩ (x_0 + N(F'(x^†))^⊥), and so the second assertion follows.

Remark 3.3. In order to consider the variable choice of α according to (13), let us consider for the moment a condition which is slightly stronger than the tangential cone condition, namely the range invariance condition:

(A2') There exist bounded linear operators R_i(x, x̄) satisfying F_i'(x) = R_i(x, x̄) F_i'(x̄) and ‖R_i(x, x̄) − I‖ ≤ c ‖x − x̄‖ for all x, x̄ ∈ B_ρ(x_0).

If (A2) is substituted by (A2') in Theorem 3.2, the estimates can be improved. This suggests that if, instead of (20), the index n_0 is chosen from a more natural requirement, then the analysis might extend to the variable parameter choice (13). Note, however, that due to the factor appearing in (24) (instead of an i_0-independent factor ω_{n_0 N + i_1} α_{n_0 N + i_1}, which would be desirable in this context), this is unfortunately not the case.

Convergence for noisy data
Our main goal in this section is to prove that x^δ_{k_*^δ} converges to a solution of (2) as δ → 0, where k_*^δ is defined in (7). The first step is to verify that, for noisy data, the stopping index k_*^δ is well defined.
Proposition 4.1. The stopping index k_*^δ defined in (7) is finite.

Proof. Assume by contradiction that for every l ∈ N there exists an i(l) ∈ {0, …, N−1} such that ‖F_{i(l)}(x^δ_{lN+i(l)}) − y^δ_{i(l)}‖ ≥ τ δ_{i(l)}. From Proposition 2.4 it follows that (14) holds for k = 1, …, lN. Summing over k, and using the fact that, whenever ω_k ≠ 0, the corresponding residual is at least τ δ_{[k]}, we obtain (26). From (12), (26) and the fact that ω_{lN+i(l)} = 1 for all l ∈ N, we obtain (27). Due to (11), the right-hand side of (27) tends to +∞ as l → ∞, which gives a contradiction. Consequently, the minimum in (7) attains a finite value.

Description of the model problem
In this section we introduce a model which plays a key role in inverse doping problems related to measurements of the current flow, namely the linearized stationary bipolar case close to equilibrium. This model is obtained from the drift-diffusion equations by linearizing the voltage-current (VC) map at U ≡ 0 [14,2], where the function U = U(x) denotes the potential applied to the semiconductor device. This simplification is motivated by the fact that, due to hysteresis effects for large applied voltages, the VC map can only be defined as a single-valued function in a neighborhood of U = 0. Additionally, we assume that the electron mobility µ_n(x) = µ_n > 0 and the hole mobility µ_p(x) = µ_p > 0 are constant and that no recombination-generation rate is present [16,15].
Under these assumptions, the Gateaux derivative of the VC map Σ_C at the point U = 0 in the direction h ∈ H^{3/2}(∂Ω_D) is given by the expression (30), where the concentrations of electrons and holes (û, v̂), written in terms of the Slotboom variables, solve the linear system (31) with the homogeneous Neumann conditions ∇û · ν = ∇v̂ · ν = 0 on ∂Ω_N (31d), and the potential V^0 is the solution of the thermal equilibrium problem (32). Here Ω ⊂ R^d is a domain representing the semiconductor device; the boundary ∂Ω of Ω is divided into two nonempty disjoint parts: ∂Ω = ∂Ω_N ∪ ∂Ω_D. The Dirichlet part ∂Ω_D of the boundary models the Ohmic contacts, where the potential V as well as the concentrations û and v̂ are prescribed. The Neumann part ∂Ω_N of the boundary corresponds to insulating surfaces; thus a zero current flow and a zero electric field in the normal direction are prescribed. The Dirichlet part of the boundary splits into ∂Ω_D = Γ_0 ∪ Γ_1, where the disjoint boundary parts Γ_i, i = 0, 1, correspond to distinct contacts (differences in U(x) between different segments of ∂Ω_D correspond to the applied bias between these two contacts). The function C(x) is the doping profile and models a preconcentration of ions in the crystal, so C(x) = C^+(x) − C^−(x) holds, where C^+ and C^− are the concentrations of positive and negative ions, respectively. In those subregions of Ω in which the preconcentration of negative ions predominates (P-regions), we have C(x) < 0. Analogously, we define the N-regions, where C(x) > 0 holds. The boundaries between the P-regions and N-regions (where C changes sign) are called pn-junctions. Moreover, V_bi is a given logarithmic function [2].

Inverse doping problem
The inverse problem we are concerned with consists in determining the doping profile function C in (32) from measurements of the linearized VC map Σ'_C(0) in (30). Notice that we can split the inverse problem into two parts: the first step is to define the function γ(x) := e^{V^0(x)}, x ∈ Ω, and to solve the parameter identification problem (33), namely div(µ_n γ ∇û) = 0 in Ω, for γ, from measurements of [Σ'_C(0)](U) = ∫_{Γ_1} (µ_n γ û_ν − µ_p γ^{−1} v̂_ν) ds. The second step consists in determining the doping profile C from γ in (32). Since the evaluation of C from γ can be explicitly performed in a stable way, we shall focus on the problem of identifying the function parameter γ in (33).
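The forward problem in the first step is a variable-coefficient elliptic equation. As a rough illustration of its numerical solution, the following is a minimal finite-difference sketch for div(γ∇û) = 0 with Dirichlet boundary data on the unit square; this is not the adaptive finite element solver used later in the paper, and the function names, grid size and data below are hypothetical.

```python
import numpy as np

def solve_div_gamma_grad(gamma, g, n=20):
    """Solve div(gamma grad u) = 0 on (0,1)^2 with u = g on the boundary.

    gamma, g: callables of (x, y); n: number of cells per direction.
    Standard 5-point scheme with gamma evaluated at the cell edges,
    solved by plain Gauss-Seidel sweeps (illustration only).
    """
    h = 1.0 / n
    xs = np.linspace(0.0, 1.0, n + 1)
    # initialize with the boundary function (interior values get overwritten)
    u = np.array([[g(x, y) for x in xs] for y in xs])

    for _ in range(4000):                      # Gauss-Seidel sweeps
        for j in range(1, n):                  # interior nodes only
            for i in range(1, n):
                x, y = xs[i], xs[j]
                gw = gamma(x - h / 2, y)       # edge coefficients
                ge = gamma(x + h / 2, y)
                gs = gamma(x, y - h / 2)
                gn = gamma(x, y + h / 2)
                u[j, i] = (gw * u[j, i - 1] + ge * u[j, i + 1]
                           + gs * u[j - 1, i] + gn * u[j + 1, i]) \
                          / (gw + ge + gs + gn)
    return xs, u

# Hypothetical usage: gamma = 1 and linear boundary data g(x, y) = x,
# for which the discrete solution is u(x, y) = x exactly.
xs, u = solve_div_gamma_grad(lambda x, y: 1.0, lambda x, y: x, n=10)
```

For constant γ the scheme reduces to the usual 5-point Laplacian, which reproduces linear functions exactly; this provides a simple consistency check for the sketch.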
Summarizing, the inverse doping profile problem in the linearized stationary bipolar model (close to equilibrium) for pointwise measurements of the current density reduces to the identification of the parameter γ in (33) from measurements of the Dirichlet-to-Neumann (DN) map Λ_γ. In the formulation of the inverse problem we take into account some restrictions imposed by the practical experiments, namely: i) the voltage profile U ∈ H^{3/2}(∂Ω_D) must satisfy U|_{Γ_1} = 0 (in practice, U is chosen to be piecewise constant on the contact Γ_0 and to vanish on Γ_1); ii) the identification of γ has to be performed from a finite number N ∈ N of measurements, i.e. from the data (U_i, Λ_γ(U_i)), i = 0, …, N−1. Therefore, we can write this particular inverse doping profile problem in the abstract formulation of system (2), namely

F_i(γ) := Λ_γ(U_i), i = 0, …, N−1, (34)

where U_i are fixed voltage profiles chosen as above, X := L²(Ω) ⊃ D(F_i) := {γ ∈ L^∞(Ω) : 0 < γ_m ≤ γ(x) ≤ γ_M, a.e. in Ω}, and Y := R.
To the best of our knowledge, assumptions (A1)–(A3) are not satisfied for the Dirichlet-to-Neumann operator Λ_γ. Therefore, although the operators F_i are continuous [2], the analytical convergence results of the previous sections do not apply to system (34).

Numerical experiments
The fixed inputs U_i are chosen to be piecewise constant functions supported in Γ_0, where the points x_i are uniformly spaced in [0, 1]. The parameter γ to be identified is shown in Figure 1 (a) (notice that γ(x) ∈ [0, 10] a.e. in Ω). In Figure 1 (b), a typical voltage source U_i (applied at Γ_0) and the corresponding solution û of (33) are shown. In these two pictures, as well as in the forthcoming ones, Γ_1 is the lower left edge and Γ_0 is the top right edge (the origin corresponds to the upper right corner).
Figure 1: (a) The parameter γ to be identified. (b) A typical voltage source U_i (boundary condition) and the corresponding solution û of (33).
For comparison purposes we implemented both the l-LMK and the l-LK iterations. The initial guess for both methods is presented in Figure 2 (c). The linear system in the l-LMK method is solved inexactly by three CG steps, so the numerical effort for one step of the l-LMK is three times that for one step of the l-LK. In the computations it turned out that the performance of the l-LMK method is not very sensitive to the value of α. The "exact" data y_i, i = 0, …, 8, were obtained by solving the direct problems (33) using a finite element method with adaptive mesh refinement (approximately 8000 elements). Artificially generated (random) noise of 5% was introduced to y_i in order to generate the noisy data y_i^δ for the inverse problem. In order to avoid inverse crimes, a coarser grid (approximately 2000 elements) was used in the finite element method implementing the l-LMK and l-LK iterations.
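The inexact inner solve mentioned above (a few CG steps for the Levenberg-Marquardt linear system) can be sketched as follows; the system (AᵀA + αI)h = Aᵀr is symmetric positive definite, so plain conjugate gradients applies. The matrix A stands in for a discretized Jacobian, and all names and values below are hypothetical.

```python
import numpy as np

def lm_step_cg(A, r, alpha, n_cg=3):
    """Approximate the Levenberg-Marquardt step h, which solves
    (A^T A + alpha I) h = A^T r, by n_cg conjugate-gradient steps."""
    op = lambda v: A.T @ (A @ v) + alpha * v   # SPD operator, matrix-free
    b = A.T @ r
    h = np.zeros_like(b)
    res = b.copy()                  # residual of the normal equations
    p = res.copy()
    rho = res @ res
    for _ in range(n_cg):
        q = op(p)
        step = rho / (p @ q)
        h = h + step * p            # update iterate
        res = res - step * q        # update residual
        rho_new = res @ res
        p = res + (rho_new / rho) * p   # new search direction
        rho = rho_new
    return h

# Hypothetical small example
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))     # stand-in for a discretized Jacobian
r = rng.standard_normal(5)          # stand-in for the current residual
h = lm_step_cg(A, r, alpha=1.0, n_cg=3)
```

For a system of dimension three, three CG steps already yield (up to rounding) the exact step; in the large-scale setting of the experiment they provide only a cheap approximation, which is the point of the inexact solve.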
For both iterative methods the same stopping rule (7) was used. We assumed exact knowledge of the noise level and chose τ = 2. In Figure 2 (d) we plot, for each iteration, the number of inner steps actually performed (i.e., not loped) in each cycle. For the l-LMK iteration (solid red line) the stopping criterion is reached after 24 cycles, while the l-LK iteration (dashed blue line) is stopped after 205 cycles. In this picture one also observes that the computational effort per cycle decreases much faster for the l-LMK iteration than for the l-LK iteration.
The quality of the final result obtained with the l-LMK method can be seen in Figure 2 (a), where the iteration error of the approximation obtained after 24 cycles is depicted. In Figure 2 (b) we present the iteration error of the l-LK iteration after 205 cycles, when its stopping criterion is reached.
Since the same noisy data and the same stopping rule were used for both iterations, the quality of the final results in Figures 2 (a) and (b) is similar. However, the l-LMK iteration needed a much smaller number of cycles to reach the stopping criterion than the l-LK iteration (starting from the same initial guess). Moreover, the number of actually performed inner steps per cycle is much smaller for the l-LMK iteration. All these observations lead us to conclude that the l-LMK iteration is numerically much more efficient than the l-LK iteration.

Conclusions
In this article we propose a new iterative method for inverse problems of the form (2), namely the l-LMK iteration. In the case of exact data this method reduces to the LMK iteration.
In the l-LMK iteration we omit an update of the LMK iteration (within one cycle) if the corresponding i-th residual is below some threshold. Consequently, the l-LMK method is not stopped until all residuals are below the specified threshold. We provide a complete convergence analysis for the l-LMK iteration, proving that it is a convergent regularization method in the sense of [6]. Moreover, we provide a numerical experiment for a nonlinear inverse doping problem and observe that the l-LMK iteration generates results comparable with those of other Kaczmarz-type iterations. The specific example considered in Section 5.2 indicates that the l-LMK iteration is numerically more efficient than the l-LK iteration.