On convergence and convergence rates for Ivanov and Morozov regularization and application to some parameter identification problems in elliptic PDEs

In this paper we provide a convergence analysis of some variational methods alternative to the classical Tikhonov regularization, namely Ivanov regularization (also called method of quasi solutions) with some versions of the discrepancy principle for choosing the regularization parameter, and Morozov regularization (also called method of the residuals). After motivating nonequivalence with Tikhonov regularization by means of an example, we prove well-definedness of the Ivanov and the Morozov method, convergence in the sense of regularization, as well as convergence rates under variational source conditions. Finally, we apply these results to some linear and nonlinear parameter identification problems in elliptic boundary value problems.


Introduction
Consider inverse problems formulated as operator equations F(x) = y (1), where F : D(F) ⊆ X → Y and (X, T_X), (Y, T_Y) are topological spaces. Such problems are typically ill-posed in the sense that F is not continuously invertible. We will assume that a solution x† ∈ D(F) of (1) exists. Since the measured data y^δ that we actually have is typically contaminated with noise, whose level δ in the estimate S(y, y^δ) ≤ δ (2) we assume to be known, and due to the above-mentioned ill-posedness, the problem has to be regularized. For this purpose, we will use regularization and data misfit functionals R : X → R^+_0, S : Y × Y → R^+_0 and consider the following two variational regularization methods.
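These two constrained formulations can be made concrete on a toy discretized problem. The following minimal sketch is not from the paper: the operator, dimensions, noise model and τ are invented for illustration. It computes the Morozov-type reconstruction (minimal norm subject to the discrepancy constraint) for a linear problem with squared-norm functionals, where the minimizer lies on the classical Tikhonov path and can be found by bisection on the regularization parameter:

```python
import numpy as np

# Toy discretized integration operator (a standard mildly ill-posed example;
# all sizes, the noise model and tau are invented for illustration).
n = 20
F = np.tril(np.ones((n, n))) / n
x_true = np.sin(np.linspace(0.0, np.pi, n))
y = F @ x_true

rng = np.random.default_rng(0)
e = rng.standard_normal(n)
delta = 1e-2
y_delta = y + delta * e / np.linalg.norm(e)    # noise of norm exactly delta
tau = 1.5

def tik(alpha):
    """Tikhonov minimizer of ||F x - y_delta||^2 + alpha ||x||^2."""
    return np.linalg.solve(F.T @ F + alpha * np.eye(n), F.T @ y_delta)

# Morozov (method of residuals): min ||x|| s.t. ||F x - y_delta|| <= tau*delta.
# For squared norms the minimizer lies on the Tikhonov path, so it can be
# found by bisecting alpha until the residual meets the discrepancy level.
lo, hi = 1e-14, 1e4
for _ in range(200):
    mid = np.sqrt(lo * hi)                     # bisection in log(alpha)
    if np.linalg.norm(F @ tik(mid) - y_delta) > tau * delta:
        hi = mid                               # residual too large: smaller alpha
    else:
        lo = mid
x_morozov = tik(np.sqrt(lo * hi))
```

The discrepancy ‖F x_morozov − y_delta‖ ends up at the level τδ, and the reconstruction has no larger norm than the least-squares solution; precisely this path-following equivalence with Tikhonov regularization is what nonconvexity can destroy, as discussed below.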
As already pointed out, e.g., in [16,18], the ideal choice (6) yields convergence and convergence rates even without knowledge of the noise level and of the possibly higher regularity (in the sense of a source condition) of x † . If R(x † ) is not known, then the discrepancy principle type choices (7), (8) are reasonable alternatives, as will be shown here.
We will here show convergence and convergence rates without exploiting any equivalence to Tikhonov regularization (for specially chosen regularization parameter), as this equivalence may fail due to nonconvexity, cf., e.g., [12, Section 3.5] and the following counterexample.
This example can be lifted to an ill-posed function space setting X = L^∞(Ω) with some nonnegative normalized kernel function Φ. If Φ is the Green's function of some differential operator D (equipped with boundary conditions on ∂Ω), then the operator equation F(x) = y is equivalent to a possibly nonlinear inverse source problem for a PDE, namely to Dy = f(x). The first order necessary optimality conditions for Ivanov and Tikhonov regularization can be derived analogously to Proposition 2.2 in [2], using the fact that the indicator function δ_{B^{L^∞}_{x_0}(0)} is the Fenchel conjugate of x ↦ x_0 ‖x‖_{L^1}, as … and … Again we use constant data y^δ(t) ≡ y + δ, so that by the normalization of Φ the expression for p above can be rewritten as … Indeed, the only feasible element x satisfying the optimality conditions for Ivanov regularization is the constant function with value x_0. This can be seen as follows. First of all, for any … Consider now the case p(s) > 0; hence, by the optimality condition (11), x(s) = x_0 sign(p(s)) = x_0, a contradiction.
Ivanov regularization has been put forward and analyzed by Ivanov and coauthors [3,10,11,12] on weakly compact sets in reflexive Banach spaces for linear inverse problems. In [18], convergence of Tikhonov, Ivanov and Morozov regularization for nonlinear problems was established in Hilbert spaces. More recently, a comparison of these three methods in a general setting has been provided in [14], and rates for Ivanov regularization have been established in Hilbert scales [16]. Our results on well-definedness and convergence of Morozov regularization are largely (actually in a more general framework) already covered by [7], which also contains a particular convergence rates case. Nevertheless, we decided to provide a joint convergence analysis with Ivanov regularization, especially in the general framework of convergence and convergence rates of Theorems 2.5 and 2.8 below, which extend the results from [7] also for Morozov regularization. Parameter identification in PDEs is a class of problems where such alternative variational formulations in Banach spaces can be particularly fruitful, e.g., when exploiting knowledge about pointwise bounds on coefficients or sources for regularization purposes. We here consider some model problems of parameter identification in the elliptic PDE ∇·(a∇u) + cu = b, namely the three possible settings of identifying one of the spatially varying parameters a, b, or c from additional observations of the state u, while the other two parameters are assumed to be known. For these model problems, we will establish applicability of the abstract results on Ivanov and Morozov regularization from the first part of this paper in appropriate function space settings.
The remainder of this paper is organized as follows. In Section 2 we provide a convergence and convergence rates analysis for (3) and for (4) with (6), (7), or (8). These abstract findings are illustrated by means of the mentioned parameter identification examples in Section 3, and we make some concluding remarks in Section 4.

Convergence analysis
Our aim is to establish well-definedness and convergence of the methods (3) and (4) with the choices (6), (7), or (8). For this purpose, we specify some assumptions that are actually closely related to conditions imposed in previous papers on nonlinear inverse problems.
(ii) R and S(·, y^δ) are lower semicontinuous with respect to T_X and T_Y, respectively.
(iii) For any C > 0, the sublevel set M^R_C = {x ∈ D(F) : R(x) ≤ C} is T_X-compact. (iv) For any C > 0 and any y^δ ∈ Y, the sublevel set M^S_C = {y ∈ F(D(F)) : S(y, y^δ) ≤ C} is T_Y-compact.
Remark 1 Conditions (i)-(v) guarantee existence of an R-minimizing solution, cf. [9,Theorem 3.4], [17,Theorem 1.9]. Therefore in the following we will, without loss of generality, assume that x † is an R-minimizing solution.
Examples of regularization and data misfit functionals satisfying conditions (ii), (iii), (iv), (v) are (powers of) norms on Banach spaces with the weak topology if the space is reflexive, or with the weak* topology if the space is the dual of a separable Banach space. With such choices of (X, T_X), (Y, T_Y), condition (i) holds for any bounded linear operator F ∈ L(X, Y). For some examples of nonlinear forward operators F satisfying (i), we refer to Section 3.
The first part of condition (vi) is, e.g., satisfied if R(x) = ‖x − x_0‖^p and S(F(x_0), y^δ) > δ, since then by setting ρ_0 = 0 we have x^δ_{ρ_0} = x_0. In this norm setting R(x) = ‖x − x_0‖^p, mappings ϕ_n according to (vi)a can easily be found, e.g., as ϕ_n(x) := … The uniqueness condition (vi)b is, e.g., satisfied if both functionals R and S(F(·), y^δ) are convex on D(F) and at least one of them is strictly convex. In particular, if R is strictly convex, a sufficiently small radius ρ might even compensate for possible nonconvexity of F (analogously to a sufficiently large α in Tikhonov regularization).
In view of the fact that the range of F is typically non-closed in an ill-posed setting, the following closedness result already indicates some regularizing property of the Ivanov method.

Proposition 2.2 Let conditions (i), (iii) of Assumption 2.1 hold. Then for any ρ > 0 the image F({x ∈ D(F) : R(x) ≤ ρ}) is T_Y-closed.
Proof. For any sequence (y_n)_{n∈N} with y_n →_{T_Y} y there exists a sequence of preimages (x_n)_{n∈N} ⊆ D(F) such that R(x_n) ≤ ρ and y_n = F(x_n). Thus, by Assumption 2.1 (iii), there exists a T_X-convergent subsequence x_{n_k} →_{T_X} x̄ with R(x̄) ≤ ρ, and by Assumption 2.1 (i) we get x̄ ∈ D(F) and F(x̄) = y. ♦

We begin our analysis by first of all showing well-definedness of minimizers.

Theorem 2.3 Let y^δ ∈ Y, τ ≥ 1, δ > 0 be fixed and let (2) as well as, for two functionals R : X → R^+_0, S : Y × Y → R^+_0, … hold. Then x^δ_{Mo} and, for any ρ ≥ ρ^{II}_*, x^δ_ρ are well-defined. In particular, x^δ_ρ with ρ = ρ^I_* according to (6) or with ρ = ρ^{II}_* according to (7) is well-defined, and the relations (13) and (14) hold. Moreover, the monotonicity relation (15) holds for all ρ_1, ρ_2 ≥ ρ^{II}_* and any two minimizers x^δ_{ρ_i} of (4) with ρ = ρ_i, i ∈ {1, 2}.
The key elements of the proof are (as usual in the context of variational regularization) the direct method of the calculus of variations and minimality arguments.

step 1. To see existence of a minimizer of (3), note that a minimizing sequence (x_n)_{n∈N} ⊆ X^{Mo}_ad exists such that R(x_n) → I := inf_{x ∈ X^{Mo}_ad} R(x). The latter and x_n ∈ X^{Mo}_ad imply boundedness of the sequences (R(x_n))_{n∈N}, (S(F(x_n), y^δ))_{n∈N} and thus, by Assumption 2.1 (iii), (iv), existence of a subsequence and of elements …

step 2. Similarly, for showing existence of a minimizer of (4) with ρ = ρ^I_* = R(x†), we use the fact that obviously x† ∈ X_ad(ρ) (cf. (5)) and therefore, by (2) and nonnegativity of S, the infimum I = inf_{x ∈ X_ad(ρ)} S(F(x), y^δ) is contained in [0, δ] and thus finite. Hence the functional values S(F(x_n), y^δ) of the minimizing sequence (x_n)_{n∈N} are bounded (by δ), and boundedness of the functional values R(x_n) follows directly from x_n ∈ X_ad(ρ). The rest of the proof, using closedness of F and lower semicontinuity of the functionals R, S(·, y^δ), goes analogously to step 1.

step 3. Well-definedness of ρ^{II}_* and validity of estimates (13), (14), (15): To prove that ρ according to (7) is well-defined, we show that R_ad = {ρ ≥ 0 : a minimizer x^δ_ρ of (4) exists and S(F(x^δ_ρ), y^δ) ≤ τδ} is a right unbounded interval containing its left endpoint (which then is the sought minimal radius). First of all, R_ad contains ρ^I_* = ρ† = R(x†), as we have shown well-definedness of x^δ_{ρ†} above, and since by minimality … Moreover, for any ρ ∈ R_ad the whole interval [ρ, ∞) has to be contained in R_ad, as can easily be seen by replacing x† with x^δ_ρ in step 2. of the proof and using the fact that X_ad(ρ̃) ⊆ X_ad(ρ) for ρ̃ ≤ ρ, so that … (By the same argument, also the monotonicity (15) follows.) Thus R_ad is a union of right unbounded intervals and therefore itself a right unbounded interval.
It contains its left endpoint, since for any sequence (ρ_n)_{n∈N} ⊆ R_ad converging (without loss of generality in a monotonically decreasing manner) to some ρ, the limit ρ will be contained in R_ad: namely, for all n ∈ N we have … , and by the lower semicontinuity Assumption 2.1 (ii) the limit satisfies … Thus X_ad(ρ) ≠ ∅ and I = inf_{x ∈ X_ad(ρ)} S(F(x), y^δ) is contained in [0, τδ] and thus finite. So existence of a minimizer x^δ_ρ can be shown as in step 1., and since additionally … , (13) follows by minimality of ρ^{II}_* in R_ad and the fact that ρ^I_* is contained in R_ad, as we have shown in step 2. of the proof.

step 4. Well-definedness of ρ^{III}_* and validity of estimates (16), (17): If S(F(x^δ_{ρ_0}), y^δ) ≤ τδ, we set ρ^{III}_* = ρ_0 and are done. It remains to consider the case S(F(x^δ_{ρ_0}), y^δ) > τδ, which by (15) and contraposition implies ρ_0 < ρ^{II}_*. We will apply the Intermediate Value Theorem to the mapping ψ : [ρ_0, ρ^{II}_*] → R, ψ(ρ) = S(F(x^δ_ρ), y^δ), whose values at the endpoints satisfy ψ(ρ_0) > τδ, ψ(ρ^{II}_*) ≤ τδ, so the value τδ will be attained on this interval provided ψ is continuous. (Note that ψ is well-defined even if the optimal argument x^δ_ρ is nonunique, since the (globally) optimal function value is unique. However, in order to prove continuity of the value mapping ψ, we will need the uniqueness Assumption 2.1 (vi)b.) For arbitrary ρ ∈ [ρ_0, ρ^{II}_*] and any sequence (ρ_n)_{n∈N} converging to ρ, the sequences R(x^δ_{ρ_n}) ≤ ρ_n and S(F(x^δ_{ρ_n}), y^δ) ≤ S(F(x^δ_{ρ_0}), y^δ) (by (15)) are bounded; thus, by T_X-, T_Y-compactness of sublevel sets and closedness of F, there exists a subsequence (x^δ_{ρ_{n_k}})_{k∈N}, whose limit x̄ by lower semicontinuity of R, S(·, y^δ) satisfies … Now consider an arbitrary element x ∈ X_ad(ρ).
Using ϕ_n according to Assumption 2.1 (vi)a (where d_n = ρ_n − ρ) we can render x admissible for the minimization problem with radius ρ_n, i.e., ϕ_n(x) ∈ X_ad(ρ_n), and thus obtain from minimality of x^δ_{ρ_n} that S(F(ϕ_n(x)), y^δ) ≥ S(F(x^δ_{ρ_n}), y^δ) for all n ∈ N. Combining this with the right-hand side limit in (18) and using the fact that lim inf_{k→∞} S(F(ϕ_{n_k}(x)), y^δ) = S(F(x), y^δ), we end up with S(F(x), y^δ) ≥ S(F(x̄), y^δ). Since x ∈ X_ad(ρ) was arbitrary, the assumed uniqueness of minimizers yields x̄ = x^δ_ρ. Therefore, analogously to (18), we get for any subsequence (ρ_{n_m})_{m∈N} of (ρ_n)_{n∈N} existence of a subsequence (ρ_{n_{m_l}})_{l∈N} such that … , where we have used ϕ_{n_{m_l}}(x^δ_ρ) ∈ X_ad(ρ_{n_{m_l}}) in the last inequality. By a subsequence-subsequence argument this yields S(F(x^δ_{ρ_n}), y^δ) → S(F(x^δ_ρ), y^δ) as n → ∞. The estimate … and monotonicity (15) by contraposition yield (16). ♦

Remark 2 Boundedness of the R functional values (14) or (17), together with the compactness Assumption 2.1 (iii), also gives subsequential-type stability with respect to perturbations of the data. In case of uniqueness (Assumption 2.1 (vi)b), a subsequence-subsequence argument yields T_X-stability.
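The value function ψ(ρ) = S(F(x^δ_ρ), y^δ) from step 4 can be observed numerically. The sketch below is illustrative only: the linear toy problem and all parameters are invented, and the Ivanov subproblems are solved via the Tikhonov path, which is valid in this convex quadratic setting. It checks the monotonicity of ψ (cf. (15)) and locates a radius with ψ(ρ) ≈ τδ by bisection, mimicking the Intermediate Value Theorem argument:

```python
import numpy as np

# Toy setting (invented): linear mildly ill-posed operator and squared-norm
# functionals, so the Ivanov subproblems can be solved via the Tikhonov path.
n = 20
F = np.tril(np.ones((n, n))) / n
x_true = np.sin(np.linspace(0.0, np.pi, n))
y = F @ x_true
rng = np.random.default_rng(1)
e = rng.standard_normal(n)
delta = 1e-2
y_delta = y + delta * e / np.linalg.norm(e)
tau = 1.5

def tik(alpha):
    return np.linalg.solve(F.T @ F + alpha * np.eye(n), F.T @ y_delta)

def ivanov(rho):
    """min ||F x - y_delta|| s.t. ||x|| <= rho, via the Tikhonov path."""
    if np.linalg.norm(tik(1e-14)) <= rho:
        return tik(1e-14)                      # constraint inactive
    lo, hi = 1e-14, 1e6                        # ||tik(alpha)|| decreases in alpha
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        if np.linalg.norm(tik(mid)) > rho:
            lo = mid
        else:
            hi = mid
    return tik(np.sqrt(lo * hi))

def psi(rho):
    """Value function: rho -> discrepancy of the Ivanov minimizer."""
    return np.linalg.norm(F @ ivanov(rho) - y_delta)

vals = [psi(r) for r in (0.5, 1.0, 2.0, 3.0)]  # psi is nonincreasing

# Bisection for a radius with psi(rho) = tau*delta, as in step 4:
lo, hi = 0.0, np.linalg.norm(tik(1e-14))
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if psi(mid) > tau * delta:
        lo = mid
    else:
        hi = mid
rho_star = 0.5 * (lo + hi)
```

Here ψ decreases from ‖y^δ‖ at ρ = 0 down to (essentially) zero once the constraint becomes inactive, so the level τδ is attained at some intermediate ρ_star.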
Convergence and convergence rates can be obtained from the two general results, Theorems 2.5 and 2.8 below, which are quite straightforward to see.
For this purpose we need an additional assumption on S, which is obviously satisfied if S is defined by some power of a norm. (i) For any two sequences (y_n)_{n∈N}, (ỹ_n)_{n∈N} we have the implication: S(y_n, ỹ_n) → 0 and S(ỹ_n, y) → 0 as n → ∞ ⇒ S(y_n, y) → 0 as n → ∞.
Then we have T_X-subsequential convergence to a solution of (1) in the sense that for any zero sequence (δ_n)_{n∈N} the sequence (x^{δ_n})_{n∈N} has a T_X-convergent subsequence whose limit solves (1). If the solution x† of (1) is unique, then x^δ →_{T_X} x† as δ → 0. Moreover, if (19) holds with C = R(x†), where x† is an R-minimizing solution of (1), then the regularization terms converge: R(x^δ) → R(x†) as δ → 0 (21). Thus, in case R is a norm on a space X satisfying the Kadets-Klee property and T_X is the weak topology on that space, we altogether even have (subsequential) norm convergence.
Proof. Let (δ_n)_{n∈N} be an arbitrary sequence converging to zero. Then by (19), (20) the sequences (R(x^{δ_n}))_{n∈N}, (S(F(x^{δ_n}), y^{δ_n}))_{n∈N} are bounded, hence Assumption 2.1 (i)-(iv) yields existence of a subsequence (x^{δ_{n_k}})_{k∈N} that T_X-converges to some x̄ with … In case of uniqueness, convergence of the whole sequence follows by a subsequence-subsequence argument.
To show (21), we note that (19), which we here assume to hold with C = R(x†), implies lim sup_{δ→0} R(x^δ) ≤ R(x†). Now assume existence of a subsequence δ_n → 0 such that lim sup_{n→∞} R(x^{δ_n}) < R(x†). By T_X lower semicontinuity of R this implies R(x̄) < R(x†) for the T_X accumulation point x̄ whose existence we have shown above. But since x† is an R-minimizing solution, this contradicts the fact (also proven above) that x̄ solves (1). ♦

Corollary 2.6 Let y ∈ F(D(F)) and let (y^δ)_{δ>0} be a family of noisy data satisfying (2) such that … holds for all δ ≥ 0 (with y^0 := y), and let, for two functionals R : X → R^+_0, S : Y × Y → R^+_0, Assumptions 2.1 (i)-(v) and 2.4 hold with x† an R-minimizing solution of (1).
Then we have T_X-subsequential convergence as δ → 0 to a solution of (1) for x^δ_{Mo} and for x^δ_ρ with ρ according to (6) or (7). If additionally Assumption 2.1 (vi) holds, then the same holds true for x^δ_ρ with ρ according to (8).
To obtain convergence rates in the Bregman distance with respect to R for some ξ† in the subdifferential ∂R(x†) (which is nonempty, e.g., if R is convex …), we make use of a variational source condition … , cf. [9, Equation (14)], for some index function ϕ : R^+ → R^+ (i.e., ϕ monotonically increasing with lim_{t→0} ϕ(t) = 0) and ξ† ∈ ∂R(x†). Moreover, Assumption 2.4 (i) has to be specified as the following generalized triangle inequality.
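For orientation, variational source conditions of the kind cited from [9] are commonly written in the following form. This is a sketch of the standard literature formulation; the constant β and the set M on which the inequality is required belong to that formulation and are not necessarily identical to (23):

```latex
\exists\, \beta \in [0,1)\ \forall x \in M:\qquad
\langle \xi^\dagger,\, x^\dagger - x \rangle
\;\le\; \beta\, D_{\xi^\dagger}(x, x^\dagger)
\;+\; \varphi\bigl( S(F(x), F(x^\dagger)) \bigr),
```

where D_{ξ†}(x, x†) = R(x) − R(x†) − ⟨ξ†, x − x†⟩ is the Bregman distance. Combined with minimality of the regularized solutions and the noise bound (2), such a condition yields rates of the type D_{ξ†}(x^δ, x†) = O(ϕ(cδ)).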
As can easily be seen, a similar result can be obtained for more general error functionals E : X × X → R^+_0 under a more general variational smoothness assumption (cf., e.g., [1,4,6,8]), or under the slightly weaker condition … for some index function ϕ : R^+ → R^+ (slightly weaker, since we can restrict attention to elements satisfying R(x) ≤ R(x†) and can absorb the constant 1/β into the function ϕ), since by (24), (25) this yields E(x^δ, x†) ≤ ϕ(C_S(τ + 1)δ). A possible advantage of (28) is that the subdifferential of R does not get involved, and an appropriate choice of the functional E might also make it possible to state reasonable results in the context of convex but not strictly convex R, such as the L^1 or the L^∞ norm.

Corollary 2.9 Let y ∈ F(D(F)) and, for two functionals R : X → R^+_0, S : Y × Y → R^+_0, let Assumptions 2.1 (i)-(v) and 2.7 as well as ∂R(x†) ≠ ∅ be satisfied, where x† ∈ D(F) is a solution of (1) satisfying the variational source condition (23). Moreover, let (y^δ)_{δ>0} be a family of noisy data such that (2) holds.
Then the convergence rate (26) holds for x^δ_{Mo} and for x^δ_ρ with ρ according to (6) or (7). If additionally Assumption 2.1 (vi) holds, then the same holds true for x^δ_ρ with ρ according to (8).

Identification of a source term
We start with a linear inverse problem, namely identification of the source term b in the elliptic boundary value problem … from measurements of u in a smooth bounded domain Ω, where g ∈ H^{1/2}(∂Ω) is also given. Without loss of generality (upon subtraction of a harmonic extension of the boundary data g from b) we can assume g = 0. The forward operator … , where −Δ is the Laplace operator with homogeneous Dirichlet boundary conditions, is well-defined and bounded, by elliptic regularity even as an operator from L^q(Ω) into W^{2,p}(Ω), provided … Moreover, F is linear, hence weakly closed. Now we wish to explore possible choices of distance measures E satisfying (28) with … under appropriate assumptions on the exact solution b†. Indeed we can estimate … , where the first inequality holds by definition of the BV norm. Thus (28) is satisfied with … or with … Together with Remark 1 this implies the following.
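To make the source identification concrete, here is a minimal numerical sketch, not from the paper: a one-dimensional Poisson problem with invented discretization, noise level and bound ρ. It computes an Ivanov-regularized source by projected gradient descent, where projecting onto the L^∞-ball is simply clipping:

```python
import numpy as np

# Minimal 1-D analogue of the source problem: -u'' = b on (0,1), u(0)=u(1)=0,
# discretized by central finite differences (all numbers invented).
n = 60
h = 1.0 / (n + 1)
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2     # discrete -d^2/dx^2
K = np.linalg.inv(A)                            # forward map b -> u

xg = np.linspace(h, 1.0 - h, n)
b_true = np.clip(np.sin(2.0 * np.pi * xg) + 0.5, -1.0, 1.0)
u = K @ b_true
rng = np.random.default_rng(0)
e = rng.standard_normal(n)
delta = 1e-4
u_delta = u + delta * e / np.linalg.norm(e)

# Ivanov regularization: min ||K b - u_delta||^2 s.t. ||b||_inf <= rho,
# by projected gradient descent; projection onto the sup-norm ball = clipping.
rho = 1.5                                       # assumed a-priori bound on b
b = np.zeros(n)
step = 1.0 / np.linalg.norm(K, 2) ** 2          # step 1/L with L = ||K||_2^2
for _ in range(2000):
    grad = K.T @ (K @ b - u_delta)
    b = np.clip(b - step * grad, -rho, rho)

res_final = np.linalg.norm(K @ b - u_delta)
```

The constraint ‖b‖_∞ ≤ ρ is satisfied by construction throughout the iteration, and the data residual drops well below its initial value; the a-priori bound ρ plays the role of the R(x†)-type information discussed above.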

Identification of a potential
Consider identification of the spatially varying potential c in the elliptic boundary value problem from measurements of u in Ω.
Here Ω ⊆ R^d is a smooth bounded domain and f ∈ H^1(Ω)^*, g ∈ H^{1/2}(∂Ω) are given. We choose X = L^∞(Ω) = L^1(Ω)^* with T_X the weak* topology and … , as well as Y = L^p(Ω) with p ∈ [1, ∞] arbitrary and T_Y the weak (in case p = ∞, weak*) topology, so that Assumption 2.1 (ii), (iii), (iv) and Assumptions 2.4, 2.7 are satisfied. We now verify Assumption 2.1 (i). Since D(F) is weak* closed, we have, for any sequence (c_n)_{n∈N} ⊆ D(F), the implication c_n ⇀* c ⇒ c ∈ D(F). It remains to show that, under the additional assumption F(c_n) →_{T_Y} u, we have F(c) = u. Denoting u_n = F(c_n) and, with an extension ḡ ∈ H^1(Ω) of the boundary data g to Ω, using the weak form of (32) for c = c_n, u = u_n (that is, u_n − ḡ ∈ H^1_0(Ω) and …), we obtain … , where we have used the Cauchy-Schwarz and Young inequalities as well as the norm ‖∇v‖_{L²(Ω)} on H^1_0(Ω). Together with boundedness of c_n in L^∞(Ω) (following from weak* convergence and the uniform boundedness principle), we thus have uniform boundedness of u_n in H^1(Ω) and therefore, using compactness of the embedding H^1(Ω) → L²(Ω), existence of a subsequence (u_{n_k})_{k∈N} and an element ū ∈ H^1(Ω) such that ū − ḡ ∈ H^1_0(Ω), u_{n_k} ⇀ ū in H^1(Ω), u_{n_k} → ū in L²(Ω). Therewith we get, by u_n →_{T_Y} u, that ū = u and, using (33), u − ḡ ∈ H^1_0(Ω) and … for any k ∈ N, where all terms on the right-hand side go to zero as k → ∞: the first one by u_{n_k} ⇀ u in H^1(Ω), the second one by boundedness of c_{n_k} in L^∞(Ω) and u_{n_k} → u in L²(Ω), and the last one by c_{n_k} ⇀* c in L^∞(Ω) and uϕ ∈ L^1(Ω). Thus, taking the limit … Again we consider … and intend to find a distance measure E satisfying (28) under appropriate assumptions on the exact solution c†. For this purpose we assume that the state u† corresponding to the exact solution c† satisfies … , where boundedness away from zero can, e.g., be achieved by some maximum principle for the elliptic PDE (32) together with an assumption on nonnegativity of f and positivity of g.
We get, similarly to Subsection 3.1, … and, by elliptic regularity, … , and likewise for u†.
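A one-dimensional manufactured example (all functions invented for illustration) shows both why the assumption that the state is bounded away from zero matters for recovering c = (b − Δu)/u pointwise, and how strongly this pointwise recovery amplifies data noise through the second derivative:

```python
import numpy as np

# Manufactured 1-D analogue of the potential problem with a = 1: u'' + c u = b.
# The state u is bounded away from zero, as assumed for u† in the text.
n = 200
xg = np.linspace(0.0, 1.0, n)
h = xg[1] - xg[0]
u = 2.0 + np.sin(np.pi * xg)                     # u >= 2 > 0 on [0,1]
c_true = 1.0 + 0.5 * np.cos(2.0 * np.pi * xg)
b = -np.pi**2 * np.sin(np.pi * xg) + c_true * u  # b = u'' + c u, exactly

def recover(u_obs):
    """Pointwise recovery c = (b - u'')/u with central second differences."""
    upp = (u_obs[2:] - 2.0 * u_obs[1:-1] + u_obs[:-2]) / h**2
    return (b[1:-1] - upp) / u_obs[1:-1]

err_exact = np.max(np.abs(recover(u) - c_true[1:-1]))
rng = np.random.default_rng(0)
u_noisy = u + 1e-3 * rng.standard_normal(n)      # tiny data noise
err_noisy = np.max(np.abs(recover(u_noisy) - c_true[1:-1]))
```

With exact data the recovery error is at the discretization level, while with noise of size 10^{-3} it blows up by several orders of magnitude: the 1/h² factor in the second difference is exactly the instability that the regularization methods of Section 2 must tame.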

Identification of a diffusion coefficient
Now we consider identification of the spatially varying diffusivity a in the elliptic boundary value problem … from measurements of u in Ω. Again, Ω ⊆ R^d is a smooth bounded domain and f ∈ H^1(Ω)^*, g ∈ H^{1/2}(∂Ω) are given. Similarly to above, the domain of the forward operator will be … for some positive constants γ < γ̄, but since, as we will see below, we need a space (X, T_X) that satisfies (40) in order to obtain weak sequential closedness of F, the choice X = L^∞(Ω) will not be feasible this time. However, X = BV(Ω) fulfills the requirement (40). As a data space we again use Y = L^p(Ω). Indeed, for (d = 1 and p ∈ [1, ∞]) or (d = 2 and p ∈ [1, ∞)) or (d ≥ 3 and p ∈ [1, 2d/(d−2)]), continuity of the embeddings BV(Ω) → L^∞(Ω), H^1(Ω) → L^p(Ω) guarantees well-definedness of F : X → Y, a ↦ u. Even for larger p, one can achieve well-definedness and a uniform W^{1,p}(Ω) bound on F(a) as long as a is sufficiently close to a constant.
Proof. We set ā = (γ + γ̄)/2 and first of all prove W^{1,p} regularity of solutions to (37) with constant diffusion coefficient ā. To this end, we consider a smooth approximation v_n + ḡ_n of u, satisfying (37) with f, g replaced by f_n ∈ C^∞, g_n ∈ C^∞, f_n → f in (W^{1,p^*}(Ω))^*, g_n → g in W^{1−1/p,p}(∂Ω), and ḡ_n the smooth extension of g_n to the interior satisfying ‖ḡ_n‖_{W^{1,p}(Ω)} ≤ C_tr ‖g_n‖_{W^{1−1/p,p}(∂Ω)} according to the Trace Theorem. We use the Helmholtz decomposition, cf. [5, Section III],

∇v_n|∇v_n|^{p−2} = ∇ϕ_n + w_n, where ϕ_n ∈ L^1_loc(Ω) ∩ W^{1,p^*}(Ω), ‖∇ϕ_n‖_{L^{p^*}(Ω)} ≤ C ‖∇v_n|∇v_n|^{p−2}‖_{L^{p^*}(Ω)} = C ‖∇v_n‖^{p−1}_{L^p(Ω)}, and w_n ∈ L^{p^*}(Ω)^d, ∇·w_n = 0 in Ω, ν·w_n = 0 on ∂Ω,

and use the PDE to get the energy estimate

ā ‖∇v_n‖^p_{L^p(Ω)} = ā ∫_Ω ∇v_n · ∇v_n|∇v_n|^{p−2} dx = ā ∫_Ω ∇v_n · (∇ϕ_n + w_n) dx = ∫_Ω ā ∇v_n · ∇ϕ_n dx = ∫_Ω (f_n + ā Δḡ_n) ϕ_n dx ≤ C_PF ‖f_n + ā Δḡ_n‖_{(W^{1,p^*}(Ω))^*} ‖∇ϕ_n‖_{L^{p^*}(Ω)},

where we have used the Poincaré-Friedrichs inequality on W^{1,p^*}_0(Ω). Thus ‖u_n‖_{W^{1,p}(Ω)} = ‖v_n + ḡ_n‖_{W^{1,p}(Ω)} is uniformly bounded, hence there exists a weakly (weakly* in case p = ∞) convergent subsequence, whose limit can easily be checked to coincide with u. Thus in particular (−Δ_ā)^{−1} : (W^{1,p^*}(Ω))^* → W^{1,p}_0(Ω), mapping f to the solution of (37) with a = ā, g = 0, is bounded. Moreover, this also implies existence of an extension ḡ ∈ W^{1,p}(Ω) of the boundary data g satisfying …

We now verify weak sequential closedness of F.

Proof. We consider an arbitrary sequence (a_n)_{n∈N} ⊆ D(F) with a_n →_{T_X} a, u_n = F(a_n) →_{T_Y} u and, from (40), immediately have a ∈ D(F). So it remains to show that F(a) = u, which we do by using the weak form of (37) for a = a_n, u = u_n:

u_n − ḡ ∈ H^1_0(Ω) and ∀ϕ ∈ H^1_0(Ω): ∫_Ω a_n ∇u_n · ∇ϕ dx = ∫_Ω f ϕ dx,

where again ḡ ∈ H^1(Ω) is an extension of the boundary data g to Ω.
Testing with ϕ = u_n − ḡ implies

‖√(a_n) ∇(u_n − ḡ)‖²_{L²(Ω)} = ∫_Ω (f(u_n − ḡ) − a_n ∇ḡ · ∇(u_n − ḡ)) dx ≤ … + ‖√(a_n) ∇ḡ‖_{L²(Ω)} ‖√(a_n) ∇(u_n − ḡ)‖_{L²(Ω)},

which by pointwise boundedness of a_n from above and below implies uniform boundedness of u_n − ḡ in H^1_{0,a}(Ω), which we define as the closure of C^∞_0(Ω) with respect to the norm induced by the inner product (u, v) = ∫_Ω a ∇u · ∇v dx. Thus, by compactness of the embedding H^1_{0,a}(Ω) → L²(Ω), we get existence of a subsequence (u_{n_k})_{k∈N} and an element ū ∈ H^1(Ω) such that ū − ḡ ∈ H^1_0(Ω), u_{n_k} − ḡ ⇀ ū − ḡ in H^1_{0,a}(Ω), u_{n_k} → ū in L²(Ω) and, due to (40) as well as (a_n)_{n∈N} ⊆ D(F), also a_{n_k} → a in L^1(Ω) and a_{n_k} ⇀* a in L^∞(Ω). The latter two limits imply norm convergence of a_{n_k} to a in L²(Ω), since

‖a_{n_k} − a‖²_{L²(Ω)} = ∫_Ω (a_{n_k} − a) a_{n_k} dx − ∫_Ω (a_{n_k} − a) a dx ≤ ‖a_{n_k} − a‖_{L^1(Ω)} γ̄ + |⟨a_{n_k} − a, a⟩_{L^∞,L^1}| → 0 as k → ∞. (43)

Moreover,

∫_Ω (a ∇ū · ∇ϕ − f ϕ) dx = ∫_Ω a ∇(ū − u_{n_k}) · ∇ϕ dx + ∫_Ω (a − a_{n_k}) ∇u_{n_k} · ∇ϕ dx

for any k ∈ N. Here the first term on the right-hand side goes to zero as k → ∞ by weak H^1_{0,a}(Ω) convergence of u_{n_k} to ū, and the second one by boundedness of ‖∇u_{n_k}‖_{L²(Ω)} and (43). Due to density of C^∞_0(Ω) in H^1_0(Ω), this yields F(a) = ū = u. ♦

Concerning the variational smoothness assumption (28) with S(y_1, y_2) = ‖y_1 − y_2‖_Y = ‖y_1 − y_2‖_{L^p(Ω)}, R(a) = ‖a‖_X, we get the following auxiliary result.

Lemma 3.5 Let u† = F(a†) and let the normed spaces U, V, Z be such that U^* = V or V^* = U, {∇u† · ∇ϕ : ϕ ∈ H^1_0(Ω)} is dense in Z, and ∃C > 0 ∀ϕ ∈ H^1_0(Ω), a ∈ M_{a†}: ‖∇·(a∇ϕ)‖_V ≤ C ‖∇u† · ∇ϕ‖_Z
for M_{a†} = {a ∈ D(F) : ‖a‖_X ≤ ‖a†‖_X}. Then … For fixed u†, verification of condition (46) requires higher-order regularity results for a solution ϕ of the transport equation ∇u† · ∇ϕ = h in terms of higher-order norms of h.
On the other hand, in case of U = W^{s,p}(Ω) one can estimate ‖F(ã) − F(a†)‖_U by means of an interpolation inequality. In doing so, one has to take into account that boundedness of ã in BV(Ω) in general does not admit a regularity result better than F(ã) ∈ W^{1,p}(Ω).
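As a complement, the following manufactured one-dimensional sketch (invented functions; not the paper's d-dimensional BV setting) illustrates the structure of the diffusion identification: with u' bounded away from zero, the flux s = a u' is obtained from the right-hand side by a single integration, so only one constant of integration and a division by u' are involved:

```python
import numpy as np

# Manufactured 1-D analogue of the diffusion problem: (a u')' = b on (0,1).
# u' is bounded away from zero so that a = (flux)/u' is well-defined.
n = 400
xg = np.linspace(0.0, 1.0, n)
h = xg[1] - xg[0]
a_true = 1.0 + 0.5 * np.sin(np.pi * xg)
u = xg + 0.05 * np.sin(2.0 * np.pi * xg)
up = 1.0 + 0.1 * np.pi * np.cos(2.0 * np.pi * xg)   # u' >= 1 - 0.1*pi > 0
s = a_true * up                                      # flux s = a u'
b = np.gradient(s, h)                                # right-hand side (a u')'

# Reconstruction: integrate b to recover the flux (the single constant of
# integration is fixed here by the assumed known value s(0)), divide by u'.
s_hat = s[0] + np.concatenate(([0.0], np.cumsum(0.5 * (b[1:] + b[:-1]) * h)))
a_hat = s_hat / up
err = np.max(np.abs(a_hat - a_true))
```

Since only integration of the data enters, the recovery error stays at the discretization level here; the genuine difficulty in the BV setting above comes from the weak closedness and smoothness conditions, not from differentiating the right-hand side.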