Convergence analysis of (statistical) inverse problems under conditional stability estimates

Conditional stability estimates require additional regularization in order to obtain stable approximate solutions if the region of validity of such estimates is not completely known. In this context, we consider ill-posed nonlinear inverse problems in Hilbert scales satisfying conditional stability estimates characterized by general concave index functions. For that case, we exploit Tikhonov regularization and provide convergence and convergence rate results for regularized solutions under both deterministic and stochastic noise. We further discuss a priori and a posteriori parameter choice rules and illustrate the validity of our assumptions in different model and real-world situations.


Introduction
In this paper, we investigate the operator equation
$$F(f) = g, \qquad f \in D(F), \tag{1}$$
which acts as a model of an inverse problem with a (possibly nonlinear) forward operator F
with domain D(F), mapping between the infinite-dimensional separable real Hilbert spaces X with norm ‖·‖ and Y with norm ‖·‖_Y. Denote by f† ∈ D(F) the uniquely determined solution of (1) for the exact right-hand side g = F(f†) ∈ Y. Moreover let, as is typical for inverse problems, (1) be locally ill-posed at f†, which means that for arbitrarily small closed balls B_r(f†) there exist sequences {f_n} ⊂ B_r(f†) ∩ D(F) with F(f_n) → F(f†) in Y, but f_n ↛ f† in X (cf. [20, definition 3]). Consequently, in order to find stable approximate solutions to equation (1) based on observed noisy data g^obs of g, some kind of stabilization is required. Our focus here is on variational regularization in a Hilbert scale under conditional stability estimates.
For considering the Hilbert scale we introduce a densely defined (unbounded and closed) linear self-adjoint operator L : D(L) ⊂ X → X, which is strictly positive such that for some m > 0 we have
$$\|Lx\| \ge m\,\|x\| \quad \text{for all } x \in D(L). \tag{2}$$
The operator L satisfying (2) generates a Hilbert scale {X_ν}_{ν∈R} with X_0 := X, X_ν = D(L^ν), and with corresponding norms ‖x‖_ν := ‖L^ν x‖_X. It is well known that for a triple of indices −a < t ≤ s the interpolation inequality
$$\|f\|_t \le \|f\|_{-a}^{\frac{s-t}{a+s}}\; \|f\|_s^{\frac{a+t}{a+s}} \tag{3}$$
holds for all f ∈ X_s. In the following, we will consider a mixed data model which allows us to treat both deterministic and stochastic error contributions. Therefore recall the notion of a Hilbert space process Z on Y, which is a bounded linear mapping Z : Y → L²(Ω, A, P) with a probability space (Ω, A, P), and it is common to write ⟨Z, g⟩ := Z(g) for g ∈ Y. A Hilbert space process Z is called centered if E[⟨Z, g⟩] = 0 for all g ∈ Y, and it is called white if Cov[⟨Z, g₁⟩, ⟨Z, g₂⟩] = ⟨g₁, g₂⟩ for all g₁, g₂ ∈ Y; note that such a white noise process almost surely does not correspond to an element of Y, i.e. P[Z ∈ Y] = 0. With this notion in mind, we consider the data model
$$g^{\mathrm{obs}} = g^\dagger + \sigma Z + \delta \xi \tag{4}$$
with a centered white noise Z on Y, some (deterministic) element ξ ∈ Y with ‖ξ‖_Y ≤ 1, and parameters σ, δ > 0. Model (4) covers both deterministic and stochastic error contributions, parameterized by δ and σ respectively, see [1] for examples. Note that if σ = 0, then g^obs ∈ Y, the measurements g^obs at hand are purely deterministic and satisfy the classical bound
$$\|g^{\mathrm{obs}} - g^\dagger\|_Y \le \delta \tag{5}$$
with the noise level δ > 0. In this case we concretize the situation by assigning g^obs = g^δ. If σ > 0, then P[g^obs ∈ Y] = P[Z ∈ Y] = 0, and hence (4) has to be understood in a weak sense, that is, for each g ∈ Y we observe
$$\langle g^{\mathrm{obs}}, g\rangle = \langle g^\dagger, g\rangle + \delta\,\langle g, \xi\rangle + \sigma\,\langle Z, g\rangle,$$
where, by definition, ⟨Z, g⟩ is a random variable with distribution N(0, ‖g‖²_Y), and for two elements g₁, g₂ ∈ Y the dependency structure is encoded in Cov[⟨Z, g₁⟩, ⟨Z, g₂⟩] = E[⟨Z, g₁⟩⟨Z, g₂⟩] = ⟨g₁, g₂⟩.
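The mixed data model (4) can be made concrete by a small finite-dimensional simulation. The following sketch is purely illustrative: the discretization, grid size and noise parameters are our own assumptions and do not appear in the paper, and in finite dimensions the subtlety that g^obs ∉ Y for σ > 0 disappears.

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(g_true, sigma, delta, xi, rng):
    """Discrete analogue of the mixed data model (4):
    g_obs = g_true + sigma * Z + delta * xi,
    with Z a standard (discrete) white-noise vector."""
    Z = rng.standard_normal(g_true.shape)
    return g_true + sigma * Z + delta * xi

n = 200
g_true = np.sin(np.linspace(0.0, np.pi, n))
xi = np.ones(n) / np.sqrt(n)            # deterministic perturbation, ||xi|| = 1
delta = 0.01

# sigma = 0: purely deterministic data satisfying ||g_obs - g_true|| <= delta
g_det = observe(g_true, 0.0, delta, xi, rng)
assert np.linalg.norm(g_det - g_true) <= delta + 1e-12

# sigma > 0: an additional stochastic error contribution enters
g_mix = observe(g_true, 0.05, delta, xi, rng)
```

In this discrete picture the deterministic part obeys the classical bound (5) exactly, while the stochastic part only admits probabilistic control.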
Initially, we pose two assumptions which are valid throughout the paper. The first assumption refers to properties of F, D(F) and f†. Moreover, it fixes the indices a and s, u of the Hilbert scale under consideration.

Assumption 1.

(a) The domain D(F) of F is a convex and closed subset of X.
(b) The element f† ∈ D(F) ∩ X_u is the uniquely determined solution of (1) for the given right-hand side g.
(c) There are further indices a, s ∈ R such that a ≥ 0, 0 ≤ s < u ≤ 2s + a, and −a < s.
In the following we will need closed balls and their intersections with the domain of definition,
$$D_r(f^\dagger) := B_r(f^\dagger) \cap D(F) \quad\text{and}\quad D_\rho^\nu(\bar f\,) := \{ f \in D(F) : \|f - \bar f\|_\nu \le \rho \},$$
respectively. Now we are in a position to introduce the second assumption in form of a conditional stability estimate.

Assumption 2.
There are a concave index function ϕ and a set Q ⊆ D(F) of admissible elements with f† ∈ Q such that the conditional stability estimate
$$\|f - f^\dagger\|_{-a} \le R\,\varphi\big(\|F(f) - F(f^\dagger)\|_Y\big) \tag{7}$$
holds for all f ∈ Q, where the multiplier R > 0 may depend on a, ϕ and Q.
There are two main sources for verifying conditional stability estimates of the form (7): (A) local structural conditions on the nonlinearity of F, and (B) global inequalities for the forward operator F.
In general, the local nonlinearity conditions in (A) require Gâteaux or Fréchet derivatives of the forward operator F. Under the stated assumptions we search for approximate solutions f_α to f†, which are regularized solutions obtained as minimizers of the Tikhonov functional (9) with s-norm square penalty ‖f‖²_s and a data fidelity term S(·; g^obs). If σ = 0 in (4), i.e. if we have deterministic data g^obs = g^δ ∈ Y, we will consider the most common choice
$$S(g; g^{\mathrm{obs}}) = \tfrac{1}{2}\,\|g - g^{\mathrm{obs}}\|_Y^2.$$
If σ > 0 in (4), then one has g^obs ∉ Y with probability 1 as discussed above, and hence ‖g − g^obs‖_Y = +∞ a.s. for any g ∈ Y. However, the functional T(g; g†) = ½‖g − g†‖²_Y can still be interpreted as an ideal data fidelity term, which is unavailable (as g† is unknown). In view of (6) it seems natural to use
$$S(g; g^{\mathrm{obs}}) := \tfrac{1}{2}\,\|g\|_Y^2 - \langle g, g^{\mathrm{obs}}\rangle$$
as data fidelity term in that case, which ensures well-definedness and formally differs from ½‖g − g^obs‖²_Y only by the additive constant ½‖g^obs‖²_Y (which is, however, +∞ in the stochastic case). Hence, for stochastic noise we consider the corresponding minimization problem (10). Note that the penalty f ↦ ‖f‖²_s is a non-negative, convex and sequentially lower semi-continuous functional. By definition of the Hilbert scale, for all s ≥ 0 this functional is stabilizing in the sense that all its sublevel sets are weakly sequentially compact in X. Under assumption 1, existence and stability of approximate solutions f_α in the sense of [32, section 4.1.1] are then evident, since assumptions 3.11 and 3.22 in [32] are satisfied (in case of stochastic noise, this is a.s. the case). Moreover, note that the minimizer of the Tikhonov functional always satisfies f_α ∈ X_s, which means that there is a radius ρ > 0 such that f_α and f† both belong to D^s_ρ(0). In order to obtain convergence of the regularized solutions to f†, the interplay of the noise magnitude and the choice of the regularization parameter α > 0 must be appropriate.
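For intuition, the Tikhonov minimization can be sketched in a discretized setting with a linear forward map and s = 0, where the minimizer of ½‖Ff − g^obs‖² + (α/2)‖f‖² is obtained from the normal equations. This is a minimal stand-in, not the paper's nonlinear Hilbert-scale setting; the matrix F, the grid sizes and the noise level below are illustrative assumptions.

```python
import numpy as np

def tikhonov(F, g_obs, alpha):
    """Minimizer of 1/2 ||F f - g_obs||^2 + alpha/2 ||f||^2 via the
    normal equations (F^T F + alpha I) f = F^T g_obs."""
    n = F.shape[1]
    return np.linalg.solve(F.T @ F + alpha * np.eye(n), F.T @ g_obs)

# ill-conditioned forward map: discretized integration (a smoothing operator)
n = 100
F = np.tril(np.ones((n, n))) / n
f_true = np.sin(2.0 * np.pi * np.linspace(0.0, 1.0, n))
rng = np.random.default_rng(1)
g_obs = F @ f_true + 1e-4 * rng.standard_normal(n)

f_small = tikhonov(F, g_obs, 1e-8)   # weak regularization
f_large = tikhonov(F, g_obs, 1e-1)   # strong regularization

# classical monotonicity of Tikhonov minimizers in alpha:
# the residual grows and the penalty shrinks as alpha increases
assert np.linalg.norm(F @ f_large - g_obs) >= np.linalg.norm(F @ f_small - g_obs)
assert np.linalg.norm(f_large) <= np.linalg.norm(f_small)
```

The asserted monotonicity of residual and penalty in α is a standard property of Tikhonov minimizers and already hints at why the interplay of noise magnitude and parameter choice matters.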
To prove even convergence rates in variational regularization, smoothness conditions have to be imposed on f†. It will be shown that the conditional stability estimate (7) from assumption 2 allows us to verify error estimates and convergence rates for the constructed approximate solutions, and that the property f† ∈ X_u ∩ D(F) is sufficient to serve as such a smoothness condition if the index u matches the set Q from (7). In this context, however, we should emphasize that the stability estimate (7) is not powerful enough to yield stable approximate solutions to (1) on its own, since Q is in general not, or not completely, known. Therefore, the additional use of Tikhonov-type regularization is needed in order to force the approximate solutions into the set Q of admissible elements for (7) for sufficiently small noise.
In the context of smoothness conditions we also mention commonalities between conditional stability estimates (7) and variational source conditions, which have become a major tool for deriving convergence rates during the last decade. In case of the Hilbert scale regularization (9) and adapted to (7), variational source conditions attain the form
$$\tfrac{1}{2}\,\|f - f^\dagger\|_s^2 \le \tfrac{1}{2}\,\|f\|_s^2 - \tfrac{1}{2}\,\|f^\dagger\|_s^2 + R\,\varphi\big(\|F(f) - F(f^\dagger)\|_Y\big) \quad \text{for all } f \in \mathcal{M}, \tag{11}$$
valid for some set of admissible elements M. Variational source conditions of the form (11) with ϕ(t) = √t have been introduced in [15] and appeared recently for example in [4,9,10,13,17,23,31,32]. Similar to conditional stability estimates, variational source conditions express in an implicit way both nonlinearity conditions and solution smoothness of the underlying nonlinear inverse problem.
There is a certain connection between conditional stability estimates (7) and variational source conditions, which depends very much on the set M. Since the difference ‖f‖²_s − ‖f†‖²_s may attain positive and negative values for varying f ∈ M, there is no immediate connection. However, if M is such that the roles of f and f† in (11) can be interchanged, then each variational source condition immediately implies a conditional stability estimate, as examined in [23]. If (11) is validated based on spectral source conditions and nonlinearity estimates, this will in general not be the case. For another approach to variational source conditions with general convex penalty functionals in the Tikhonov regularization of linear problems we refer to [16,26]. More recently, there have been approaches (see e.g. [23-25,35]) to verify variational source conditions directly for specific problem instances without relying on nonlinearity assumptions, on spectral source conditions, or on both. In this case, it can happen that the set M allows one to interchange f and f† in (11), and hence also a conditional stability estimate follows, see e.g. example A.4 in the appendix.
The remainder of the paper is organized as follows: the focus of section 2 is on convergence and convergence rate assertions for deterministic inverse problems. As the main result of section 2, convergence rates for general concave index functions ϕ in the conditional stability estimate (7) are formulated and proven in theorem 1 and its corollary 1. This section closes a gap in the theory by extending the results recently published in [7] from the Hölder case to the case of general concave index functions. Section 3 is the statistical counterpart to section 2, with theorem 2 and corollary 3 as the main results concerning convergence rates. In the appendix we finally discuss a series of motivating examples.

Deterministic inverse problems
In this section we consider a deterministic noise model, that is, (4) with σ = 0. Recall that this implies g^obs ∈ Y and ‖g^obs − g‖_Y ≤ δ, and we write g^obs = g^δ and f_α = f^δ_α. Based on assumption 1 the following proposition on convergence is an immediate consequence of [32, theorem 4.3 and corollary 4.6]. In this context, we also take into account the usual properties of Hilbert scales, moreover the Kadec-Klee property of Hilbert spaces and the fact that f† is assumed to be the unique solution to (1) and sufficiently smooth.

Proposition 1. Let α = α(δ) (a priori choice) or α = α(δ, g^δ) (a posteriori choice) be choices of the regularization parameter α > 0 satisfying the limit conditions
$$\alpha \to 0 \quad \text{and} \quad \frac{\delta^2}{\alpha} \to 0 \quad \text{as } \delta \to 0. \tag{12}$$
Then we have under assumption 1, for δ_n → 0 as n → ∞ and α_n = α(δ_n) or α_n = α(δ_n, g^{δ_n}), the convergence
$$\lim_{n\to\infty} \big\|f^{\delta_n}_{\alpha_n} - f^\dagger\big\|_\nu = 0 \quad \text{for all } \nu \in [0, s]. \tag{15}$$

Based on conditional stability estimates required by assumption 2, however, we can even prove convergence rates for the regularized solutions. We remark that the set Q of admissible elements with associated radii and Hilbert scale indices and the index function ϕ in this assumption need not be known. On the other hand, as long as the choice of the regularization parameter α > 0 obeys the condition (12), we have by formula (15) from proposition 1 that, for fixed ν ∈ [0, s] and arbitrarily small radii ρ > 0, there is some δ̄ > 0 such that f^δ_α ∈ D^ν_ρ(f†) for all 0 < δ ≤ δ̄.

In the following we will employ some convex analysis. The Fenchel conjugate of a function h : R → R ∪ {+∞} is defined by
$$h^*(b) := \sup_{a \in \mathbb{R}} \big( ab - h(a) \big), \qquad b \in \mathbb{R}.$$
For an index function h (defined on [0, ∞)) the Fenchel conjugate can be defined accordingly by extending h to all of R via h(a) := +∞ for a < 0. Note that h* is always convex as a supremum over affine linear functions, and that h** = h holds for convex and lower semi-continuous h. The Fenchel-Young inequality states that
$$ab \le h(a) + h^*(b) \quad \text{for all } a, b \in \mathbb{R}, \tag{16}$$
with equality if and only if a ∈ ∂h*(b), which for convex h is in turn equivalent to b ∈ ∂h(a). For more details on convex analysis we refer to [30]. Now we are ready to formulate our first main theorem, which yields an error decomposition.
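The Fenchel conjugate and the Fenchel-Young inequality used below can be checked numerically in a simple case. The choice h(a) = a²/2, which is its own conjugate, and the grid are illustrative assumptions for this sketch only.

```python
import numpy as np

def conjugate(h, b, grid):
    """Fenchel conjugate h*(b) = sup_a (a*b - h(a)), approximated on a grid."""
    return np.max(grid * b - h(grid))

h = lambda a: 0.5 * a ** 2            # h is its own conjugate: h*(b) = b^2/2
grid = np.linspace(-10.0, 10.0, 200001)

for b in (-2.0, 0.3, 1.5):
    hstar = conjugate(h, b, grid)
    assert abs(hstar - 0.5 * b ** 2) < 1e-6
    # Fenchel-Young: a*b <= h(a) + h*(b) for all a; equality iff b = h'(a) = a
    for a in (-1.0, 0.5, 2.0, b):
        assert a * b <= h(a) + hstar + 1e-8
```

The equality case a ∈ ∂h*(b) is exactly the mechanism exploited in corollary 1 to characterize the optimal a priori parameter choice.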

Theorem 1.
Let the assumptions 1 and 2 hold and let the regularization parameter α > 0 be chosen a priori or a posteriori such that for sufficiently small noise levels 0 < δ ≤ δ̄ the regularized solutions f^δ_α belong to the set Q of admissible elements of the conditional stability estimate (7). Then we have for such δ, with a function ψ_{u,s,a} depending on the concave index function ϕ and on the indices a, s, u, the error estimate (17).

Proof. By assumption we have f^δ_α, f† ∈ Q for all 0 < δ ≤ δ̄. Hence, using (3) and (7), we can estimate ‖f^δ_α − f†‖_s by interpolation. Next we apply Young's inequality (this is just (16) for a suitable choice of h, with exponents determined by the index sum a + u), which yields the estimate (18). It follows from the minimizing property of f^δ_α in (9), together with (5), that the estimate (19) holds. Due to the triangle inequality and (a + b)² ≤ 2a² + 2b², a further estimate (20) follows after some rearranging. Combining (19) and (20), the claim follows by dividing by α. □

Remark 1. Note that the assumption in theorem 1 that for sufficiently small noise levels 0 < δ ≤ δ̄ the regularized solutions f^δ_α belong to the set Q of admissible elements of the conditional stability estimate (7) is satisfied if the choice of the regularization parameter satisfies the condition (12) and if the set Q is the intersection of a finite number of closed intersected balls D^ν_ρ(f†) with 0 ≤ ν ≤ s.

Before we conclude with convergence rates under a priori and a posteriori parameter choice rules, let us collect some facts about the approximation error ϕ_app in (17): (a) As ψ_{u,s,a}(0) = 0 we obtain ϕ_app(α) ≥ 0 for all α > 0.

Corollary 1.
Let the assumptions of theorem 1 hold true, suppose that ψ_{u,s,a} is concave, and let α = α* be chosen according to (25). Then we obtain the convergence rate (26).

Proof. Due to remark 2(c) we can simplify the error estimate (17) to (27) with C̃ = 2 max{C, C(8C)^{(u−s)/(a+s)}} and C as in theorem 1. Note that the infimum over α > 0 of the right-hand side of (27) can be computed via the Fenchel conjugate as −(−ψ_{u,s,a})**(δ²). By concavity of ψ_{u,s,a}, the last expression equals ψ_{u,s,a}(δ²). Furthermore, choosing α = α* such that the infimum is attained at α* corresponds to equality in the Fenchel-Young inequality, which is attained if and only if −1/α* ∈ ∂(−ψ_{u,s,a})(δ²). It remains to show that α* as in (25) satisfies (12). By the equality condition in the Fenchel-Young inequality (16), the two sides of (16) coincide for the corresponding arguments; as the left-hand side is non-negative, this immediately implies δ²/α* ≤ ψ_{u,s,a}(δ²) → 0 as δ → 0. For any convex function on [0, ∞), the subdifferential can be represented as an interval with endpoints given by left- and right-sided derivatives. Thus the concavity of ψ_{u,s,a} implies a lower bound for 1/α* in terms of difference quotients of ψ_{u,s,a}. As the supremum of these quotients tends to ∞ as δ tends to 0 (see [36, remark 3.31]), this also proves α* → 0 as δ → 0. □

Remark 3.
(a) The additional assumption that ψ u,s,a itself is also concave in corollary 1 seems rather mild. In case of a Hölder-type function ϕ, this follows immediately from concavity of ϕ itself, see example 1 below. Similarly, if ϕ is of logarithmic type as in example A.4, then concavity of ψ u,s,a is also evident.
(b) We will give another possible expression for an a priori parameter choice rule avoiding convex analysis in corollary 3.
Let us now turn to an a posteriori parameter choice rule. Given a set of candidate parameters α_1 = δ², α_j = α_1 r^{2j−2} with some r > 1 for j = 2, ..., m, where m is the first index such that α_m ≥ 1, we define
$$j_{\mathrm{Lep}} := \max\Big\{ j \in \{1, ..., m\} \;:\; \big\| f^\delta_{\alpha_j} - f^\delta_{\alpha_i} \big\|_s \le \frac{4\delta}{\sqrt{\alpha_i}} \;\text{ for all } i < j \Big\}, \tag{28}$$
i.e. α_Lep := α_{j_Lep} is chosen according to the Lepskiĭ-type balancing principle. This gives the following result:

Corollary 2. Let the assumptions of theorem 1 hold true and choose α = α_Lep according to (28). Suppose further that f^δ_α ∈ Q for all α in the previously described candidate set and any sufficiently small δ, with the set of admissible elements Q for (7). Then we obtain the corresponding a posteriori convergence rate.

Proof. Note that (17) together with (23) yields an error decomposition of the form
$$\big\| f^\delta_\alpha - f^\dagger \big\|_s \le \frac{4\delta}{\sqrt{\alpha}} + C\,\varphi_{\mathrm{app}}(\alpha)$$
with some constant C > 0. For our set of parameter candidates this yields the decomposition with Ψ(j) = (4δ)/√α_j = 4r^{1−j} and Φ(j) = C ϕ_app(α_j). By construction, Ψ is non-increasing, Φ is non-decreasing, and Φ(1) ≤ Ψ(1) = 4 if δ is sufficiently small. Furthermore Ψ(i) ≤ rΨ(i + 1), and hence it follows from [27, corollary 1] that the error of the Lepskiĭ choice is bounded by a multiple of min_{1≤j≤m}(Φ(j) + Ψ(j)). By some elementary convex analysis we conclude as in [36, lemma 3.42], exploiting (23), that the minimum on the right-hand side can be replaced by the infimum over all α, provided that δ is sufficiently small (see also [37]). □
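The mechanics of a Lepskiĭ-type balancing principle can be illustrated with a scalar surrogate. Only the candidate grid α_j = δ²r^{2j−2} and the noise part Ψ(j) = 4δ/√α_j follow the text; the surrogate "estimates", the bias model Φ(j) = √α_j and all constants below are our own illustrative assumptions, not the paper's.

```python
import numpy as np

delta, r = 0.01, 2.0

# candidate parameters alpha_1 = delta^2, alpha_j = alpha_1 * r^(2j-2),
# m the first index with alpha_m >= 1 (as in the text)
alphas = [delta ** 2]
while alphas[-1] < 1.0:
    alphas.append(alphas[-1] * r ** 2)
m = len(alphas)

Psi = [4.0 * delta / np.sqrt(a) for a in alphas]   # noise part, non-increasing
Phi = [np.sqrt(a) for a in alphas]                 # surrogate bias, non-decreasing

# surrogate scalar "estimates" of a quantity with true value 0,
# obeying the error decomposition |est_j| <= Phi[j] + Psi[j]
est = [Phi[j] + 0.5 * Psi[j] * (-1.0) ** (j + 1) for j in range(m)]

# Lepskii-type balancing: largest j with |est_j - est_i| <= Psi[i] for all i < j
j_lep = 0
for j in range(m):
    if all(abs(est[j] - est[i]) <= Psi[i] for i in range(j)):
        j_lep = j
    else:
        break

# oracle-type bound: the adaptive choice is within a constant of the best index
assert abs(est[j_lep]) <= 3.0 * min(Phi[j] + Psi[j] for j in range(m))
```

The point of the principle is that it never evaluates Φ, which is unknown in practice, yet it lands within a constant factor of the best candidate.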

Example 1 (Hölder type conditional stability).
Let us consider the Hölder special case ϕ(t) = t^γ of the conditional stability estimate (7) with exponents 0 < γ ≤ 1, which has recently been studied in a slightly modified form in [7]. Here we obtain an explicit expression for ψ_{u,s,a} and hence for ϕ_app(α), because it can be seen via differentiation at which argument the supremum defining ϕ_app(α) is attained. The resulting term coincides with the corresponding error term in [7, lemma 3.3]. Hence, the convergence rate from (26) attains in this example the Hölder-type form (31), which again coincides with the rate results of theorems 2.1 and 2.2 in [7]. Note that in case of a linear forward operator, these rates are known to be order optimal, as also discussed in [7]. The a priori choice (25) for the regularization parameter leads for the Hölder-type conditional stability to the power-type choice (32).

Remark 4. The case of Hölder-type conditional stability considered in example 1 allows us to discuss briefly the borderline situation u = s. Evidently, the a priori parameter choice (32) then attains the form α* = α*(δ) ∼ δ², which is well known in a general Hilbert space setting from [6] as an appropriate choice for conditional stability estimates of a form like in example A.3 below. However, in our setting f† ∈ X_s, with this parameter choice formula (31) cannot serve as a convergence rate result, because the exponent of δ is not positive. Moreover, proposition 1 does not apply, since δ²/α* → 0 as δ → 0 fails. Hence, one cannot show convergence ‖f^δ_{α*} − f†‖_s → 0 as δ → 0 at all, and if the set Q in (7) restricts the applicability of the conditional stability estimate to balls around f†, then u = s, in contrast to u > s, does not ensure that f^δ_{α*} ∈ Q. Asking for the reasons why [6] nevertheless recommends α* = α*(δ) ∼ δ² also for the borderline situation u = s of conditional stability, we see that Cheng and Yamamoto in [6] use for finding approximate solutions a minimization problem constrained to the set Q instead of (9), which requires the set Q to be known. Then one can show at least a convergence rate result in the X-norm of the form (33). For γ = 1 such a rate result (33) also holds under somewhat stronger conditions for 'oversmoothing' penalties in the case u < s with ‖f†‖_s = ∞.
In this context, we refer to [19], where for the a priori parameter choice (32), here with δ 2 /α * → ∞ as δ → 0, (33) is proven, see also [18] for the same convergence rate result by using the discrepancy principle.

Statistical inverse problems
Now we will discuss how to generalize the previous results to the stochastic data model (4) with σ > 0. To analyze (10) we have to proceed differently and pose additional assumptions:

Assumption 3. Let us assume that there is a Gelfand triple (V, Y, V′) such that the embedding ι : V → Y is Hilbert-Schmidt. Furthermore we suppose that F satisfies an interpolation inequality for all f ∈ D^s_ρ(f†) with some constant C_θ(ρ), ρ > 0 and θ ∈ (0, 1).

This assumption requires some comments. First note that ι being Hilbert-Schmidt implies that Z can be considered as an element of the dual space V′ almost surely, i.e. it holds ‖Z‖_{V′} < ∞ a.s., so that condition (34) is satisfied almost surely.

Proof. The interpolation inequality (3) for the Hilbert scale
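The role of the Hilbert-Schmidt embedding can be made concrete in a sequence-space picture: white noise has i.i.d. standard normal coefficients, so its Y-norm diverges, while a dual norm weighted by square-summable singular values stays finite almost surely. The singular value choice s_k = 1/k and the truncation level are illustrative assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)
K = 100_000
z = rng.standard_normal(K)                 # coefficients <Z, e_k> of white noise

# ||Z||_Y^2 = sum_k z_k^2 grows like K, so Z is not an element of Y
norm_Y_sq = float(np.sum(z ** 2))
assert norm_Y_sq > 0.9 * K                 # law of large numbers: ~ K

# If the embedding iota: V -> Y has square-summable singular values
# (Hilbert-Schmidt), e.g. s_k = 1/k, the weighted (dual) norm is finite a.s.:
s = 1.0 / np.arange(1, K + 1)
norm_dual_sq = float(np.sum((s * z) ** 2))  # expectation = sum 1/k^2 = pi^2/6
assert norm_dual_sq < 50.0
```

Increasing K only inflates the first sum, while the second one converges, mirroring P[Z ∈ Y] = 0 versus ‖Z‖_{V′} < ∞ a.s.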
The most common example for this setting is as follows. Let Y = L²(Ω) for some Lipschitz domain Ω ⊂ R^d, and let V = H^s(Ω) with some s > d/2. Then ι : H^s(Ω) → L²(Ω) is Hilbert-Schmidt, and one has the corresponding interpolation inequality (see (3)). It follows similarly to the deterministic case that the functional (10) admits a unique minimizer for fixed data g^obs. If Z is considered as an element of V′, then continuous dependency of f_α on Z can also be shown following the deterministic results. Convergence and convergence rates are slightly more involved, as we will see below. For the sake of presentation we restrict ourselves to a convergence rates result:

Theorem 2. Let the assumptions 1-3 be satisfied, let the data g^obs be given as in (4), and suppose (34) holds true. If there are σ₀, δ₀ > 0 such that α is chosen with f_α ∈ D^s_ρ(f†) for all 0 < σ ≤ σ₀, 0 < δ ≤ δ₀ (with ρ as in assumption 3), then we have (surely) the corresponding error estimate with some constant C > 0.
Proof. Denote again S(g; g^obs) := ½‖g‖²_Y − ⟨g, g^obs⟩ and T(g; g†) := ½‖g − g†‖²_Y. Due to the minimizing property of f_α in (10) we obtain a first estimate, which combined with (19) implies a bound for ‖f_α − f†‖_s. By definition of S(·; g^obs) and T(·; g†) we obtain a decomposition of the data fidelity terms. For the last term on the right-hand side we use ab ≤ 2a² + (1/8)b², which yields a corresponding estimate. Concerning the second term on the right-hand side, using (34) and applying (18) appropriately twice, we obtain bounds with some constants C, C′, C″ > 0, as ‖f_α‖_s is bounded. Altogether this yields the claimed estimate with some generic constant C > 0. Now we can proceed as in the deterministic case. □

Corollary 3. Let the assumptions of theorem 2 be satisfied, recall the notation
$$\Sigma(\alpha) := \sqrt{\alpha\, \varphi_{\mathrm{app}}(\alpha)},$$
and choose α according to (36). Then we obtain the corresponding a.s. convergence rate.

Proof. According to theorem 2 we have a.s. an error bound for some sufficiently large C > 0, where we also exploited ‖Z‖_{V′} < ∞ a.s. and remark 2(c). Via δ = Σ(Σ^{−1}(δ)), i.e. δ² = Σ^{−1}(δ) ϕ_app(Σ^{−1}(δ)), and analogously for the stochastic contribution, the claim follows a.s., where ≲ means up to a multiplicative constant which can change from line to line, but is independent of α, σ and δ. □

Remark 6.
(a) It is immediately clear that the convergence rate in corollary 3 can also be obtained under an a posteriori choice of α as in corollary 2. (b) In the case ϕ(t) = t^γ with some 0 < γ ≤ 1, as discussed in example 1, Σ can be computed explicitly, and hence it can be seen immediately that the a priori choices in (25) and (36), and also the obtained rates in theorem 1 and theorem 2 with σ = 0, coincide.
Proof. Similar to the proof of theorem 2, we obtain the claim from the minimizing property: as ‖Z‖_{V′} is a.s. bounded, the claim follows from C(ρ) = o(ρ). □

Acknowledgments

BH is supported by the German Research Foundation (DFG) via Grant HO 1454/12-1, and FW has also been supported by the DFG through CRC 755, subproject A07. We also thank two anonymous referees for several constructive comments which helped us to improve the presentation of the paper.

Appendix
In this appendix we discuss different approaches to derive conditional stability estimates as in assumption 2.

A.1. Variant (A): based on local structural conditions for the nonlinearity of F

We consider the pair of conditions (A.1) and (A.2), where ϕ is a concave index function and K, K̂ as well as r are positive constants. The first condition (A.1) often occurs in the regularization literature for Hilbert scale models (see e.g. [28,29,33,34]), sometimes also in the stronger version where a > 0 denotes the degree of ill-posedness locally at f† (see [8, section 10.4]). In the form with a general concave index function ϕ, the second condition (A.2) was introduced and exploited in [2]. In the special case of monomials ϕ(t) = t^κ with exponents 0 < κ ≤ 1, associated with Hölder rates, this condition plays some role in the context of the degree (κ, ζ) of nonlinearity of F at f† introduced in [21], where the inequality (A.3) for exponents 0 ≤ κ ≤ 1, 0 ≤ ζ ≤ 2 and a constant K > 0 has been considered. For the strong structural conditions of nonlinearity of interest in this example, and in particular for condition (A.2), exponents κ > 0 are required in (A.3). Evidently, by the triangle inequality, (A.2) with ϕ(t) = t^κ implies a condition of type (A.4) with a constant K̃ > 0 depending on K and r. Vice versa, we also derive by the triangle inequality a condition (A.2) with ϕ(t) = t^κ (0 < κ ≤ 1) from (A.4).
Weaker structural conditions of nonlinearity, which will be discussed in example A.2 below, are characterized by the fact that F at f† does not allow for exponents κ > 0 in (A.3); instead, exponents κ = 0 and 1 < ζ ≤ 2 are typical in case of Hölder continuity of the derivative F′(f) in a neighborhood of f†. Now we come back to the pair (A.1) and (A.2) of conditions and derive for this situation the corresponding structure of the set Q in assumption 2. Combining both inequalities we immediately find a conditional stability estimate (7) of the form (A.5) with some constant R > 0, for the subset Q = D_r(f†) of the closed intersected ball D^θ_ρ(0), where θ = 0 and ρ = ‖f†‖ + r.
Let us close this example with a specific application using X = Y = L²(0, T) and D(F) = X. We consider here the family of forward operators (A.6) with constants c₀, c₁ > 0. Such operators occur in various types of parameter identification problems, e.g. for finding time-dependent growth rate functions in ordinary differential equation models and for identifying time-dependent conductivity functions in heat equation models (see [14] for more details). The corresponding operator equation (1) is locally ill-posed everywhere on X. Moreover, the operator F is continuously Fréchet differentiable everywhere on X, and its Fréchet derivative F′(f) attains an explicit form. Furthermore, we have (A.7) for some constant K > 0 and for all f ∈ X, which indicates, without an upper bound for the radius r > 0 of D_r(f†), a degree (1, 1) of nonlinearity (see (A.3)). Applying the triangle inequality to (A.7) yields a corresponding estimate in X = L²(0, T). In the context of the related Hilbert scale {X_τ}_{τ∈R} generated by means of the simple integration operator J from (A.9), such that L = (J*J)^{−1/2} defines the Hilbert scale, and using a lower bound for the multiplier function in F′(f†), we also find a constant 0 < c_down < ∞ such that we have an estimate which is valid for all f ∈ X and can be rewritten in form of an inequality (A.1). Consequently, we find for all r > 0 a conditional stability estimate of type (7) with a = 1 and ϕ(t) = t.

Unfortunately, for the solution f† from (A.17) with f†(1) = 0 we have f† ∈ X_ν only for ν < 1/2 and hence f† ∉ X₁ (see e.g. [12, lemma 8]). Then proposition A.1 cannot be applied immediately in case of this solution f†, but remark A.1 is applicable, and for a = 1, K̂ = 1/2, Ǩ = 2, η = 1, τ < 2 and arbitrarily large r > 0 we get for the autoconvolution operator F from (A.16) and f† from (A.17) the corresponding conditional stability estimate.
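The claim that the simple integration operator J carries degree of ill-posedness a = 1 can be checked numerically: the singular values of J on L²(0, 1) (taking T = 1 for simplicity) are known to be 1/(π(k − 1/2)), i.e. they decay like 1/k. The rectangle-rule discretization and the grid size n = 400 below are illustrative assumptions.

```python
import numpy as np

# rectangle-rule discretization of (J f)(t) = int_0^t f(s) ds on L^2(0, 1)
n = 400
h = 1.0 / n
J = h * np.tril(np.ones((n, n)))

sv = np.linalg.svd(J, compute_uv=False)    # singular values, descending order

# known asymptotics: sigma_k = 1/(pi * (k - 1/2)), i.e. decay like 1/k,
# which corresponds to the degree of ill-posedness a = 1 in the example
k = np.arange(1, n + 1)
ref = 1.0 / (np.pi * (k - 0.5))
assert np.all(np.abs(sv[:20] / ref[:20] - 1.0) < 0.05)
```

Since L = (J*J)^{−1/2} has the reciprocal spectrum, this 1/k decay is exactly what makes L generate a Hilbert scale in which J smooths by one step.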

A.2. Variant (B): based on global inequalities of the forward operator F
where θ is a positive number and the multiplier R may depend on the radius ρ > 0. For f = f†, the estimate (A.19) is a special case of (7) with a = 0. The corresponding set Q collects the elements f ∈ D(F) with the property ‖f‖_θ ≤ ρ. Hence, all the consequences of (7) remain valid. Estimates of type (A.19) can be verified in large numbers for parameter identification problems in differential equations by powerful tools of PDE theory like Carleman estimates. With respect to concrete applications we refer to [5,6] and further literature mentioned therein. For a glimpse of such examples, we briefly recall in the following a parameter identification problem which was comprehensively outlined in [22] (see also [7, section 5.2]).
Example A.4 (Global conditional estimates, relation to variational source conditions). As a final example, which is also related to variational source conditions, we consider the problem of reconstructing the refractive index n = 1 − f† from far field data u_∞ in the acoustic scattering problem (A.20a)-(A.20c), where the so-called Sommerfeld radiation condition (A.20c) is assumed to hold uniformly for all directions x̂ = x/r ∈ S² = {x ∈ R³ : |x| = 1}.
In practical applications, the far field data can be measured for either one or several incident directions d ∈ S². Here we consider all directions d as available and define F(f†) := u_∞. This forward operator can be seen as a mapping from L^∞(R³) to L²(S² × S²), and its natural domain of definition D(F) is chosen such that for all f ∈ D(F) the problem (A.20a)-(A.20c) admits a unique solution.
For this problem, it has been shown in [23] (see theorem 2.4 and corollary 2.5 ibidem) that a variational source condition and a conditional stability estimate hold true. More precisely, if 3/2 < m < s ≤ 2m + 3/2 and f† ∈ D(F) ∩ H^s(R³) with the Fourier-based Sobolev space H^s(R³), then the variational source condition (11) holds true with −a = m and the function
$$\varphi(t) = A \Big( \ln\big(3 + t^{-1}\big) \Big)^{-2\mu\theta}, \qquad \mu = \min\Big\{ 1, \frac{s - m}{m + 3/2} \Big\},$$
for any 0 < θ < 1. This also implies the conditional stability estimate (7) with −a = m and ϕ as in the above formula. Note that the case −a = m corresponds to a < 0 and is hence not covered by our analysis.