Non-linear Tikhonov Regularization in Banach Spaces for Inverse Scattering from Anisotropic Penetrable Media

We consider Tikhonov and sparsity-promoting regularization in Banach spaces for inverse scattering from penetrable anisotropic media. To this end, we equip an admissible set of material parameters with the L p -topology and use Meyers’ gradient estimate for solutions of elliptic equations to analyze the dependence of scattered ﬁelds and their Fr´echet derivatives on the material parameter. This allows to show convergence of a non-linear Tikhonov regularization against a minimum-norm solution to the inverse problem, but also to set up sparsity-promoting versions of that regularization method. For both approaches, the discrepancy is deﬁned via a q -Schatten norm or an L q -norm with 1 < q < ∞ . Numerical reconstruction examples indicate the reconstruction quality of the method, as well as the qualitative dependence of the reconstructions on q .


Introduction
We consider direct and inverse scattering of time harmonic waves from a penetrable and anisotropic inhomogeneous medium with density described by a matrix-valued material To this end, we set up weak solution theory for the scattering problem in Lebesgue spaces L t with t ≥ 2 to be able to treat contrast functions in L p with p < ∞ in some admissible set.Our analytic results allow to prove convergence of a sparsity-promoting version of Tikhonov regularization in Banach spaces for a specifically designed penalty term towards, roughly speaking, a minimum-norm solution.Numerical examples of contrast reconstructions in two dimensions show feasibility of the proposed algorithm.
For incident waves u i (x, θ) = exp(ikθ we seek solutions u(•, θ) to (1) such that the scattered field u s (•, θ) = u(•, θ) − u i (•, θ) additionally satisfies Sommerfeld's radiation condition, lim r→∞ r (d−1)/2 ∂u s ∂r (rx, θ) − iku s (rx, θ) = 0 uniformly in all directions x ∈ S d−1 .(2) By construction, the scattered field in particular solves the Helmholtz equation ∆u+k 2 u = 0 outside some ball B R = {|x| < R} containing D; such solutions are called radiating in the sequel.It is well-known that radiating solutions to the Helmholtz equation have the asymptotic behavior u s (rx, θ) = γ d e ikr r (d−1)/2 u ∞ (x, θ) + O(r −1 ) as r → ∞, where γ 2 = exp(iπ/4)/ √ 8πk, γ 3 = 1/(4π), and u ∞ : S d−1 × S d−1 → C is the so-called far field pattern of the scattered field.This function is analytic in both variables and defines the far field operator F = F Q : L 2 (S d−1 ) → L 2 (S d−1 ), The inverse scattering problem we are interested in is to stably approximate the contrast function Q exa from noisy measurements of the far field pattern u ∞ , that is, from a noisy version F ε meas such that F exa − F ε meas ≤ ε for some noise-level ε.To this end, we show that different variants of Tikhonov regularization can be employed for this task and in particular suggest a sparsity-promoting variant of that technique.The latter variant hence provides a solution Q † such that F Q † = F exa in the limit as the noise level ε tends to zero, such that moreover Q † minimizes the sparsity promoting penalty term defining the Tikhonov functional.
The convergence analysis of minimization-based regularization methods requires Banach spaces with some smoothness properties that rule out L ∞ as a suitable space for contrasts.Following [11], we use Meyers' gradient estimates for weak solutions to elliptic equations to obtain that gradients of weak solutions to (1) actually belong to L t -spaces with t > 2. This in turn allows to prove various analytic properties for the solution to (1) as Lipschitz continuity or directional Gâteaux differentiability that only require the contrast Q to be measured in some L p -norm with 2 < p < ∞.While [11] applies sparsity-promoting regularization methods based on Tikhonov regularization to determine a conductivity distribution in electrical impedance tomography, the only paper that investigates corresponding techniques in inverse scattering is [13], where the simpler Helmholtz equation ∆u + k 2 n 2 u = 0 is tackled.
A specificity of our approach compared to [11,13] is that we do not only incorporate penalty terms that are linked to Hilbert spaces but also measure the discrepancy, that is, the difference between the measured far field data and the computed approximation, in a Banach-space: We consider either the full range of Schatten classes S q on the space of linear operators on L 2 (S d−1 ) for 1 < q < ∞, or defined a norm on the space of measurement operators by considering the norm of their integral kernels in L q (S d−1 × S d−1 ).For q = 2, both notions coincide.The choice of q significantly influences both reconstruction time and quality, as we demonstrate numerically.
On the very technical level, the ellipticity of the conductivity equation tackled in [11] generally makes uniform estimates for solutions to the governing differential equation with different conductivities arguably easier than for the indefinite Helmholtz equation treated in this paper.(This problem did not occur in [13] due to the much easier solution theory in L t for the setting in that reference.) To be able to directly rely on a specific bound in Meyers' seminal work [16], we set up weak solution theory for (1) in the ball B 2R and assume that the support of the contrast is strictly included in B 2R .This assumption is not crucial but avoids technicalities.We finally remark that our estimates use a generic constant C that might change its value from one occurrence to the other.This paper is organized as follows: In Section 2 we recall well-known weak solution theory for the anisotropic Helmholtz equation, that will be used in Section 3 and Section 4 to establish Lipschitz continuity and differentiability of the solution operator mapping the contrast to the scattered field.Section 5 extends these result to the forward operator mapping the contrast to its far field operator.In Section 6 we present two variants of Tikhonov regularization.Finally, Section 7 presents several numerical examples computed using a sparsity-promoting shrinked Landweber iteration or a primal-dual algorithm; the required adjoint of the derivative of the forward operator is computed in Appendix A.

The scattering problem
In this section, we recall weak solution theory in Sobolev spaces W 1,t with t ≥ 2 for the anisotropic Helmholtz equation ( 1) subject to the radiation condition (2) for the scattered field.The latter equation is understood in the distributional sense.After recalling conditions for solvability of that problem in H 1 , we provide L t -theory using Meyers' gradient estimates.As it leads to somewhat shorter expressions, we actually tackle the scattering problem for the corresponding scattered fields, which are required to be locally in H 1 and to satisfy the differential equation div(( weakly in the sense of L 2 (R d ), subject to the radiation condition (2).Before recalling the standard L 2 -based solution approach via Riesz-Fredholm theory, we introduce a set of contrasts that depends on a fixed parameter λ ∈ (0, 1), Remark 2. We consider contrasts Q on B 2R supported in B R since this straightforwardly allows to directly rely on a specific Meyers' estimate from [16], avoiding technicalities.(B 2R could be replaced by any bounded domain that strictly contains B R .) Seeking to solve for the scattered field instead of the total one, we rewrite equation (3) for all test functions ψ ∈ C ∞ 0 (R d ) in the weak sense by multiplying that equation with ψ, integrating over B 2R , and integrating by parts the divergence term.Thus, and density of smooth functions in H ). Thus, the boundary integral in ( 5) is well-defined as a duality pairing between H ±1/2 (∂B 2R ).We denote by Λ 2R : ) the exterior Dirichlet-to-Neumann operator, see [17], which maps Dirichlet boundary values φ on ∂B 2R to the normal derivative ∂v/∂ν of the unique radiating solution v to the exterior Dirichlet scattering boundary problem.More precisely, v ∈ H 1 loc (R d \B 2R ) is the unique radiating solution to ∆v+k 2 v = 0 in R d \B 2R , and can be written down explicitly in series form using Hankel functions.As u s is a radiating solution to the Helmholtz equation, Λ 2R (γ(u s )) equals ∂u s /∂ν, such that ( 5) becomes for all ψ ∈ H 1 (B 2R ), with right-hand side F (ψ) = − B R Q∇u i • ∇ ψ dx.(We omit the trace operator γ from now on if a restriction to the boundary is obvious.)To prove existence of solution of (6), we follow [9], see also [3], and rely on an additional Dirichlet-to-Neumann map Λ ∆,2R that maps Dirichlet data on ∂B 2R to Neumann data of the solution to an exterior Dirichlet boundary problem for the Laplace equation.By [3, p. 131 allow to reformulate the variational form (6) as Both Λ 2R and Λ ∆,2R are bounded from H 1/2 (∂B 2R ) into H −1/2 (∂B 2R ), such that s and s 1 are bounded sesquilinear forms.The coercivity of Λ ∆,2R implies that s is coercive, , see [3, p. 131] and the compact embedding of H 1 (B 2R ) in L 2 (B 2R ) imply that s 1 is a compact sesquilinear form.By the representation theorem of Riesz there exists a bounded operator S : for all ψ ∈ H 1 (B 2R ), the variational formulation (7) can be equivalently rewritten as S v − S 1 v = r in H 1 (B 2R ).Multiplying with the inverse S −1 yields v − K v = S −1 r with a compact operator K := S −1 S 1 .Thus, Riesz-Fredholm theory implies that uniqueness of solution to the latter equation implies existence of solution for all right-hand sides.Lemma 3. If the only solution to the homogeneous problem corresponding to (6) is the trivial solution, then that variational problem possesses a unique solution for all bounded anti-linear functionals F ∈ H 1 (B 2R ) * and there is Uniqueness of solution for the variational problem ( 6) is strongly linked to the unique continuation property for solutions to that equation.In [9], Hähner shows uniqueness of solution for contrasts Q that are C 1 -smooth and supported in domains of class C 2 ; this result can be generalized to contrasts that are piecewise differentiable on a decomposition of B 2R into finitely many Lipschitz domains.Corollary 4. If there is a decomposition of B 2R = n j=1 Ω j of B 2R into finitely many Lipschitz domains Ω j such that Q ∈ Q belongs to C 1 (Ω j ) for j = 1, . . ., n, then the variational formulation (6) possesses a unique solution v for all bounded anti-linear functionals Remark 5.It is well-known that solutions v to (6) can be uniquely extended to radiating solutions in H 1 loc (R d ) of the Helmholtz equation in H 1 loc (R d ), see [17]: For the spherical Hankel function h (1) and the spherical harmonics Y m this extension is given by ṽ Such extended functions are called radiating extensions of solutions to (6) in B 2R in the sequel, and, by abuse of notation, simply denoted by v as well.
To be able to handle derivatives of scattered fields in L p -spaces, we give a version of Meyers' well-known gradient estimate from [16].Theorem 6.For the bounded Lipschitz domain i.e. v solves the variational formulation (6) for all ψ ∈ H 1 0 (B 2R ).Then there exists a constant T λ ∈ (2, ∞) depending on λ and d such that for all t ∈ (2, T λ ) the gradient ∇v belongs to L t (B 2R ) d and satisfies where C = C(λ, d, t, R).As λ → 0 (or λ → 1) the constant T λ tends to 2 (or ∞).
Proof.In [16,Theorem 2] the original statement is shown more generally for f The number r * is defined by 1 r * = 1 r − 1 d if r < d or as any number in (1, ∞) else.Since in our case h = k 2 u for u ∈ H 1 (B 2R ), the choice r = 2 is natural.If d = 2, we can hence choose an arbitrary r * ∈ (1, ∞) such that r * ≥ t > 2. In three dimensions, the analogous condition for r * is fulfilled: The necessary condition t > 2 enforces t > 2d/(d + 2), which allows to use estimate (49) of [16,Theorem 2], and gives the stated result for v = u, h = k 2 u, and r = 2.
Before considering analytic properties of the contrast-to-solution map in the next section, we reformulate the anisotropic Helmholtz equation under investigation for a source which is the radiating fundamental solution to the Helmholtz equation.More precisely, 3 The solution operator To investigate the solution operator mapping the contrast Q and the incident field u i to the weak solution of the scattering problem ( 6), we define a sesquilinear form for For Q ∈ Q, we assume that the forward problem ( 6) is solvable for all right-hand sides and denote by L(Q, •) : Choosing f = u i as a solution to the Helmholtz equation in R d , v Q = L(Q, u i ) is hence the weak solution to the variational formulation (6), i.e., and the radiating extension of To state a perturbation result for L(Q, •), note that boundedness of the solution operator L(Q, •) implies by Riesz' representation theorem the existence of a boundedly invertible operator Lemma 7. Assume that for Q ∈ Q the forward problem (6) is uniquely solvable and let Q be a perturbation of Q, small enough such that Then for all Thus, the solution operator L(Q + Q , •) exists for all Q that satisfy (14) and is uniformly bounded by 13) and note that exists as a bounded operator on H 1 (B 2R ) and its operator norm is bounded by A −1 the claim follows from the equivalence of ( 15) and the equation (

Remark 8. (i) It is well-known that the technique of the proof of Lemma 7 allows to
show solvability for all contrasts of the form (ii) Combining the last Lemma 7 with Corollary 4, one can show that the forward problem ( 6) is solvable in the union of open L ∞ -balls around, roughly speaking, all piecewise continuously differentiable contrasts Q with radius A −1 Recall now that the constant T λ > 2 has been defined in Theorem 6 and that u i denotes a generic solution to the Helmholtz equation 14), there holds that with a constant C that only depends on Q but not on Q or on u i .
Proof.Due to Lemma 7, the assumptions on Q and Q imply that both solution operators ) and denote the radiating extensions (see ( 9)) of these functions to R d again by v Q+Q and the corresponding total fields by u Q+Q =

Boundedness of the solution operator
Choosing p and t such that 1/p + 1/t = 1/2, the generalized Hölder inequality yields The choice of p and t implies p = 2t/(t − 2) and since by assumption p > 2T λ /(T λ − 2) we have 2T λ /(T λ − 2) < 2t/(t − 2).The strict monotonicity of s → 2s/(s − 2) on (2, ∞) consequently implies that t < T λ , which allows to conclude by Meyers' estimate (10) that Due to the assumption on Q and Lemma 7, the right-hand side of the latter estimate can be bounded by where C Q is the constant from (8).Together, the last four estimates yield the claim.
Remark.The case p = ∞ is not covered by Meyers' estimate, but could be treated directly by standard L 2 -theory from Lemma 3 and Lemma 7.This also holds for all further statements.
Theorem 9 requires assumption (14) for Q merely to bound the norm of u Q+Q independently of Q .If one knows a-priori that the norm of solutions to the scattering problem is uniformly bounded, then assumption (14) can obviously be dropped.

Derivative of the solution operator
We now have a glance at the differentiability of the solution operator and therefore fix the incident field u i in this entire section.We further fix Q ∈ Q such that the solution operator L(Q, •) is bounded on H 1 (B 2R ) and denote the derivative of L with respect to We show in Theorem 12 below that v can be interpreted as a Gâteaux derivative of L(Q, u i ) in direction Q (see also Remark 13).
has the following continuity properties: is the total field whose radiating extension to R d satisfies the anisotropic Helmholtz equation div((Id d +Q)∇u Q ) + k 2 u = 0 weakly in R d (see (9)).Choosing p and t such that 1/p + 1/t = 1/2, the generalized Hölder inequality further implies that and, as in the proof of Theorem 9, Meyers' estimate (10) yields Next, we exploit Lemma 3 to estimate ) , and we conclude that (ii) For t ∈ (2, T λ ) and p > 2T λ /(T λ − 2), Meyers' estimate (10) yields as in the proof of part (i) that where, as above, the radiating extension of the total field 9)).The first term in ( 20) is bounded by ) due to (i), such that it remains to bound the second term: For arbitrary ε > 0 such that t = t + ε ∈ (t, T λ ), we set p = t t/(t − t) and compute that Since p(t − t)/t t = 1 and as ) due to the proof of (i), Theorem 12.For p > 2T λ /(T λ − 2), the solution operator L is differentiable in the sense that for every where C > 0 is independent of Q and u i .Thus, if {Q n } n∈N satisfies (14) for all n ∈ N as well as we first consider the variational formulations defining all three terms, Thus, for all ψ ∈ H 1 (B 2R ) there holds Lemma 7 and Hölder's inequality for p and t such that 1/t + 1/p = 1/2 imply that ) , which implies the claimed estimate.
Lemma 14.Under the assumptions of Theorem 12, there exists t ∈ (2, T λ ) and C > 0 independent of Q and u i such that Proof.As in Theorem 9, we exploit that the difference of u Q+Q = u i + L(Q + Q , u i ) and Consequently, Meyers estimate (10) implies that The first term of the right hand side is bounded by Theorem 9, whereas the second one can be estimated as in the proof of Lemma 11(ii), Theorem 15.Under the assumptions of Theorem 12, the map Q → L (Q, u i ) is locally Lipschitz continuous: There is C > 0 independent of Q and u i such that for all P ∈ L p (B R ) d×d there holds satisfy by (19) the variational formulations Thus, Lemma 7 and the generalized Hölder inequality with Lebesgue indices p and t such that 1/p + 1/t = 1/2 imply that and Lemma 11 shows that ) .Combining these bounds with the above estimate for w shows the claim, as .
In analogy to the potential representation (11), the radiating extension of the derivative because v solves, by definition, the variational formulation (19).

The forward operator
In this section, we define the forward operator corresponding to the inverse scattering problem we are ultimately interested in.This operator maps a contrast function to the corresponding far field operator.Before defining the forward operator, we first assume from now on that the solution operator L(Q, u i ) is well-defined and bounded on H 1 (B 2R ) for all Q ∈ Q. Due to Lemma 7, this can always be guaranteed by choosing the parameter λ ∈ (0, 1) defining Q small enough.
Assumption 16.The solution operator L(Q, u i ) exists for all Q ∈ Q as a bounded operator on H 1 (B 2R ).
Further, we recall from (11) that the radiating extension of For a direction x ∈ S d−1 , the far field pattern of v ∞ (x) hence equals As the latter integral expression is an analytic function in x, the far field v ∞ is analytic as well.Let us now introduce, for brevity, the integral operator (See [3] for the mapping properties of V .)The total field v + u i restricted to B 2R satisfies Thus, we abbreviate the (bounded If we further introduce the integral operator then there holds that ), the following lemma shows that the composition on the right is well-defined and bounded.This is basically due to the smoothing property of Z, which is a trace class operator, see [8].
Proof.(i) The kernel of the integral operator Z is smooth in x and y, such that one easily shows the claimed bound by partial integration and the Hölder inequality.
(ii) Due to the bounds shown in part (i), Z is bounded from L t (B 2R ) d into any Hilbert space H m (S d−1 ).Choosing m large enough then implies that the embedding of H m (S d−1 ) in L 2 (S d−1 ) is a trace class operator, see [7].As those operators form an ideal, the Z is also trace class operator that maps L t (B 2R ) into L 2 (S 2 ).
We are now ready to rigorously introduce the forward operator that, by definition, maps contrasts to far field operators.To this end, we consider incident fields in form of Herglotz wave functions, that are well-known entire solutions to the Helmholtz equation in R d .It is moreover well-known that g → v g | B 2R is a bounded operation from L 2 (S d−1 ) into H 1 (B 2R ), see [3].Using v g as an incident field then defines a far field operator analytic in both variables, F Q is compact and even belongs to the set S 1 of trace class operators on L 2 (S d−1 ), since its singular values s j (F Q ) are summable, i.e., F Q S 1 = j∈N |s j (F Q )| < ∞.The embedding p ⊂ q for 1 ≤ p < q ≤ ∞ of the sequence spaces p further implies that trace class operators belong to the qth Schatten class S q for all q ∈ [1, ∞), a Banach space of all compact operators on L 2 (S d−1 ) with q-summable singular values s j (F ), equipped with the norm defined by This allows to define the contrast-to-far field mapping, as an operator from Q into the qth Schatten class S q .
(i) Due to Lemma 17 with t = 2 and the continuity properties of the solution operator L, the composition (ii) As trace class operators form an ideal in the space of all bounded operators, and as F(Q)g = Z(L(Q, v g )) with a trace class operator Z, the forward operator is a trace class operator as well, and hence belongs to all spaces S q for q ≥ 1.
(iii) An alternative to the S q -norms are L q -norms for integral operators on the sphere: Since F(Q)g = S d−1 u ∞ (•, θ)g(θ) dS(θ) is represented by the far field pattern u ∞ (•, θ) of the scattered fields u s = L(Q, v g ), the L q -Norm of u ∞ defines an operator norm for F(Q) by F(Q) q := u ∞ L q (S d−1 ×S d−1 ) , 1 < q < ∞.The contrast-to-far field map Q → F(Q) as defined in (26) is then continuous from L q (S d−1 ) into L q (S d−1 ) with q = q/(q − 1), because g → v g | D is continuous from L q (S d−1 ) into C 1 (D) for all q ∈ (1, ∞).For q = 2, it is well-known that • S 2 = • 2 .The advantage of the L q -norms with respect to the implementation of inversion algorithms is that the computation of adjoints or subdifferentials is straightforward for these spaces.Since the subsequent theoretic results do not depend on the choice of the discrepancy norm, we continue to work with the Schatten norms • Sq , noting that all results holds as well for the • q -norms.
The link between the solution operator L and the non-linear forward operator F enables us to show various properties of F via those of L. To this end, note first that the far field of the radiating extension of L(Q, v g ) depends boundedly and linearly on L(Q, v g ), such that the derivative ), S q ) with respect to Q ∈ Q of F equals, by the product rule in Banach spaces, see [23], This allows to transfer the results of Theorems 9, 12, and 15 from L to F. (14), and q ≥ 1.
Proof.The basic ingredient of the proof is the smoothing property of the far field map Z defined in (24), which is a trace class operator from L 2 (B 2R ) d into L 2 (S d−1 ).Choosing the incident field u i as a Herglotz wave function v g for some g ∈ L 2 (S d−1 ), where inequality ( * ) follows from Lemma 17(ii) and the fact that the composition of the trace class operator Z with a bounded and linear operator is of trace class as well.Now we use again the technique from the proof of Theorem 9, see (17) and (18), to obtain the bound together with the estimate = C for the total wave field due to Lemma 7, with a constant such that by plugging the last estimates together we conclude that F(Q+Q The bounds in (ii) and (iii) are shown analogously, using Theorems 12 and 15 instead of Theorem 9.
As for Theorem 9, the last corollary's assumption that ( 14) holds for Q can be replaced by uniformly bounded solution operators, see Corollary 10.
Proof.Instead of Theorem 9, use Corollary 10 to obtain the bound (29) in the last proof.

Non-linear Tikhonov and sparsity regularization
The inverse problem we consider is to stably approximate a contrast Q exa from perturbed measurements of its far field operator F(Q exa ).More precisely, for noisy measurements F ε meas with noise level ε > 0 such that F(Q exa ) − F ε meas Sq ≤ ε, we seek to approximate Q by non-linear Tikhonov regularization.Thus, for some penalty term R we consider the Tikhonov functional over some appropriate admissible set of contrasts included in Q.As the functional J α,ε requires F(Q) to be well-defined, we suppose for the rest of the paper that the variational formulation (6) of the forward problem is uniquely solvable for all Q ∈ Q and all incident fields u i that solve the Helmholtz equation in R d .
Assumption 21.The variational formulation (6) is uniquely solvable for all Q ∈ Q and all incident fields u i that solve the Helmholtz equation in R d , and the norm of the solution Due to Lemma 7, the first part of the latter assumption can always be guaranteed by choosing the parameter λ ∈ (0, 1) that defines the set Q in (4) close enough to one, as ( 6) is uniquely solvable if Q is the identity matrix.The second part can be guaranteed by merely considering contrasts in Q ∩ X for some space Before presenting the Tikhonov regularization framework in detail, we first show Lipschitz continuity of the discrepancy Proof.For all contrasts Q and Q + Q in Q the reverse triangle inequality for norms implies that By Assumption 21, Corollary 20 bounds the last right-hand side uniformly in Q and Fixing p > 2T λ /(T λ −2), let us now set p * = dp/(p+d), such that 1 < p * < d.Sobolev's embedding theorem then implies that W 1,p * (B 2R ) d×d embeds compactly into L p (B 2R ) d×d ; moreover, the Sobolev inequality ) d×d are in general discontinuous (an embedding into Hölder spaces would require p * > d).We consider in the sequel the space W 1,p * 0 (B R ) d×d of functions that vanish on ∂B R and extend those by zero to all of R d , such that the intersection of W 1,p * 0 (B R ) d×d with Q is well-defined.(By abuse of notation, we do not denote this extension explicitly.) Non-linear Tikhonov regularization is classically based on the assumption that the penalty term R is coercive in the space of interest L p (B R ) d×d , such that weak convergence results can be obtained for a minimizing sequence.If R is even coercive in a space compactly embedded in L p (B R ) d×d , then one directly obtains strong convergence of the minimizing sequence.
Theorem 23 (Tikhonov regularization).If we choose the penalty term R of the Tikhonov functional J α,ε as n /α n → 0, then every sequence of minimizers of J αn,εn contains a subsequence that weakly converges to solution norm amongst all solution to the latter equation.
As the proof of Theorem 23 is well-known, see, e.g., [20] or the proof of Theorem 25 below, we directly present a sparsity-promoting alternative based on a wavelet basis of W 1,p * , see [21].Assume that for j = 0, G = {(Fa, . . ., Fa)}, and m ∈ Z d ,

We finally define wavelet coefficients of functions
Examples for suitable wavelets include the well-known Daubechies wavelets, see [4,5]; however, the following result holds as well for differently constructed Meyer wavelets, see Chapter 3.1.5 in [21], in particular Theorem 3.12.
Theorem 24 (See [21, Theorem 3.5]).For 1 ≤ p * < ∞ and the above-defined n-wavelets Ψ j,G m such that n ∈ N satisfies n > max(1, 2d/p * + d/2 − 1) there holds that the set of functions {ψ j,G m } is an unconditional basis in W 1,p * (R d ).Further, there are constants A, B > 0 such that for all Q ∈ W 1,p * (R d ) d×d there holds In the following, we use the representation of the W 1,p * -norm in (32) for contrasts Q ∈ W 1,p * 0 (B R ) d×d that are extended by zero to all of R d and, to this end, abbreviate the series in (32) by j,G,m .For all numbers 1 ≤ r ≤ p * and all sequences (a j ) j∈N in p * (N) there holds that bounds the rth power of the W 1,p * -norm of Q ∈ W 1,p * 0 (B R ) d×d from above.Recall now that F(Q exa ) and F ε meas ∈ S q model exact and noisy measurements, respectively, with noise level F(Q exa ) − F ε meas Sq ≤ ε.Given the above setting, the following result follows straightforwardly from standard non-linear regularization theory [19,20].
Theorem 25 (Sparsity regularization).For 1 ≤ r ≤ p * = dp/(p + d), the functional J α,ε with R = R r from (33) possesses a minimizer in Q ∩ W 1,p * (B R ) d×d .If ε n → 0 as n → ∞ and if one chooses α n = α n (ε n ) such that 0 < α n → 0 and 0 < ε 2 n /α n → 0, then every sequence of minimizers of J αn,εn contains a subsequence that weakly converges to an R r -minimizing solution Proof.We repeat the proof for the existence of a minimizer of J α,ε .For an arbitrary minimizing sequence is uniformly bounded as well.As W 1,p * (B R ) d×d is a reflexive Banach space, the sequence {Q (n) } n∈N contains a weakly convergent subsequence that converges in L p (B R ) d×d due to the compact embedding of W 1,p * (B R ) d×d in L p (B R ) d×d , say, to Q ∈ W 1,p * (B R ) d×d .Since Q is a convex set, the limit Q belongs to Q and Lipschitz continuity of the discrepancy term Lower semi-continuity of the penalty term R r with respect to W 1,p * (B R ) shows that Q is a minimizer of J α,ε .Consistency of the minimizers for vanishing noise level under the given choice of the regularization parameter α can be shown as in, e.g., [20].
We omit here to show well-known source conditions that imply convergence rates of the minimizers, as these are classic and can be easily transferred from either abstract results in, e.g., [20], or from [11], to our setting.All required analytic properties of the forward operator F can be derived from Corollary 19.
As it is well-known that a solution Q to the inverse problem F(Q) = F meas is only unique up to a change of variables, Theorem 25 shows that all we can hope for is to determine an R r -minimizing solution.Even if we restrict ourselves to an (scalar) isotropic contrast of the form Q = q sc Id d , it is unclear to us whether the far field operator F q sc corresponding to q sc uniquely determines the isotropic contrast q sc .

Numerical examples
After elaborating a theoretic framework that guarantees convergence of the Tikhonov iterates against a minimum-norm solution, we present a couple of numerical experiments for contrasts that are sparse in a wavelet basis, in the sense that few wavelet coefficients of the isotropic contrast function are non-zero.To this end, we minimize the Tikhonov functional in (30) for the sparsity-promoting penalty (33) in a simplified setting where merely a scalar material parameter q sc with wavelet coefficients W q sc = {(q sc ) j,G m } defined as in (31).Starting with an initial guess q sc 0 (that will always be chosen as zero), we consider a Tikhonov functional for the linearization of the forward operator at the current iterate q sc and seek a minimizer h of where R r is defined in (33).We tackle the latter minimization problem numerically by either a shrinked Landweber iteration as proposed by Daubechies, De Frise and de Mol and indicated computation times are measured on an eight-core INTEL R Core TM i7-3770 CPU@3.40GHz with 32 GB RAM.When adding artificial noise to the synthetic data, we scale a matrix containing independent and normally distributed random numbers with mean zero and standard deviation one such that the sum of the synthetic data and the random matrix has a prescribed relative error, equal to ε = 0.01, 0.05, or 0.1).Note that instead of discretizing the adjoint of the derivative of the forward operator, we rely on the adjoint of the discretization of the integral operator, to obtain exact adjunction up to the precision of the iterative solver.Figure 1 shows plots of the two contrasts q sc (1,2) we consider for inversion (for the complex-valued q sc (1) we plot real and imaginary part).In the following first set of examples we used the shrinked Landweber iteration sketched above for artificial noise levels ε equal to 0.01, 0.05, and 0.1 and regularization parameter α = ε.The wave number equals k = 140, such that the wave length is about 0.045.The iteration is stopped by the discrepancy principle if the (relative) discrepancy is less than 1.5 ε. Figure 2 shows that the shape of the cross in Figure 1(a,b) is well reconstructed and that the magnitude of the reconstruction is roughly matched, at least for small noise level.However, the small variations of the contrast inside the cross-shape are not well resolved but tend either to thicken or to thin the width of the cross.For ε = 0.01, the relative discrepancy does not reach the prescribed value of 0.015 in 500 iterations, which might be due to the numerical noise level of the synthetic data.We hence plot the 500th iterate; after the 400th iteration, the first two digits of the reconstruction do no longer change, such that this is, arguably, justified.Reconstruction times notably become tremendous for so many iteration steps.
Figure 3 shows the corresponding results in the same reconstruction setting for the realvalued double L-shape from Figure 1(c).The inversion scheme converges somewhat faster; again, for ε = 0.01 the reconstructions do not reach a relative discrepancy of 1.5 ε until the sequence of reconstructions becomes stationary at about the 200th iterate.Notably, the imaginary part of the reconstruction remains small during the iteration without imposing it to vanish by the algorithm.Again, the reconstruction times are rather high, which is a well-known disadvantage of soft-shrinking techniques.Generally speaking, the inversion problem is to our impression somewhat harder to tackle numerically by the shrinked Landweber iteration compared to the Helmholtz equation ∆u + k 2 (1 + q)u = 0 considered in [13].
To cope with the two most obvious disadvantages of the shrinked Landweber iteration, we finally consider the primal-dual algorithm by Chambolle and Pock.This allows first to consider different norms for the discrepancyterm of the Tikhonov functional (we choose discretized L p -norms for functions on S 1 × S 1 -norms as explained in Remark 18) and second yields smaller computation times.This algorithm computes the minimizer of the Tikhonov functional in (34) by explicitly considering the resolvents of the subdifferentials of the convex functionals F → F − F ε meas + F(q sc ) q Sq and h → αR r (q sc + h).More precisely, let us consider general proper, convex, and lower semicontinuous functionals E : S q → [0, ∞] and P : Q → [0, ∞], as long as the resolvents (I +σ∂E * ) −1 and (I +η∂P) −1 of the subgradients of the Fenchel conjugate E * of E and of P are explicitly computable for η, σ > 0. The primal-dual algorithm then computes the minimizer of via these resolvents.As already mentioned, E = • q q /q for 1 < q < ∞, see Remark 18.We further define P(•) as sum of R r from (33) and a convex functional 1 b that ensures that the reconstructed contrast respects a-priori known pointwise bounds: 1 b (q sc ) = 0 if −1 ≤ Re q sc (x) ≤ 3 and 0 ≤ Im q sc (x) ≤ 3 in [−0.4,0.4] 2 ; otherwise, 1 b (q sc ) = ∞.For both functionals, the subgradients can be computed using basic rules of convex analysis, see, e.g., [18], and both resolvents can be computed explicitly.
For the remaining example, we invert q sc (2) for scattering data for k = 100, such that 2π/k ≈ 0.063.The artificial noise level ε is set to 0.01, the regularization parameter α in (36) equals 0.01 and the remaining parameters η and σ to (5/4 F (q sc ) ) 1/2 .We stop the primal-dual algorithm applied to the linearized functional in (36) when the relative residuum of the linear equation is less than 0.05 (which typically yields less than 10 steps and is finished in less than a minute).The largest part of the computation time of the primal-dual algorithm is due to the computation of the entire (factorized) matrix representing the derivative of the forward operator at the current iterate q sc .(This typically takes less than 2 minutes for the examples below.)The numerical computation of the (manorm of F (q sc ) takes about one minute and executing the algorithm for one linearized problem typically less than three minutes.Figure 4 shows the effect of changing the parameter q ∈ (1, ∞) of the discrepancy term • q q /q by plotting reconstructions for q = 1.6, 2, and 3. (We simply plot the iterate with the smallest error.)Let us first note that for all reconstructions, the computation times are much smaller than for the shrinked Landweber iteration.Generally, choosing q larger/smaller results in smaller/larger iteration numbers to reach to optimal reconstruction in the entire range in between q = 1 and q = 5.On the other hand, the reconstruction quality is best for q = 2, where the accuracy roughly matches that of the shrinked Landweber iteration.(Arguably, this might be due to the Gaussian distribution of the additive noise.)Choosing q larger or smaller than 2 yields increasingly worse reconstructions; in particular, the contrasts do not reach the true values anymore.Thus, the choice of the discrepancy norm has obviously a significant influence on the inversion result.

A The adjoint of the forward operator's linearization
The adjoint operator of the linearization F is a crucial ingredient for most gradient-based schemes tackling the inverse scattering problem to stably solve the non-linear equation F(Q) = F meas for some given F meas ∈ S q .This is our main motivation to give an explicit and computable representation of this adjoint.We fix Q ∈ Q, consider F (Q) : L p (B R ) d×d → S q (a) (b) (c) Figure 4: Real part of reconstructions of q sc (2) by primal-dual algorithm for different discrepancy norms • q q /q (see Remark 18) and fixed artificial noise level ε = 0.01, plotted on [−0.4,0.4] 2 .(a) q = 2, 5 iter., 12 min.,rel.error=0.658(b) q = 3, 2 iter., 4 min., rel.error=0.738(c) q = 1.6, 41 iter., 82 min., rel.error=0.763.and aim to determine F (Q) * : S q → L p (B R ) d×d such that ) d×d for all Q ∈ L p (B 2R ) and K ∈ S q . (37) Here, p and q are the conjugate Lebesgue indices to p and q, respectively, such that 1/p + 1/p = 1 and 1/q + 1/q = 1, and (• , for an arbitrary orthonormal basis (g j ) j∈N of L 2 (S d−1 ).Consequently, (37) becomes Thus, we consider at first a single L 2 -scalar product for fixed Q ∈ Q and g ∈ L 2 (S d−1 ) and seek for A : L 2 (S d−1 ) → L p (B 2R ) d×d such that Recall from (22) that L (Q, v g )[Q ] = v ∈ H 1 (B 2R ), a function whose radiating extension satisfies Since the derivative F , see (27), involves the far field of L , we note that

Remark 1 .
If the material parameter A = Id d +Q is piecewise differentiable, then any weak solution u and its co-normal derivative ∂u/∂ν A := ν A∇u are continuous over interfaces Γ where A jumps: [u] Γ = 0 and [ν A∇u] Γ = 0, where ν denotes a unit normal to Γ and [v] Γ denotes the jump of the function v across Γ.
) Thus, λ determines the class of possible material parameters A = Id d +Q such that λ ≤ Re z A(x)z and |A(x)| ≤ λ −1 for all z ∈ C d with |z| = 1 and almost every x ∈ R d .(We always implicitly extend Q ∈ Q by zero from B 2R to all of R d .)We further endow Q with the L p (B 2R ) d×d -norm for 1 ≤ p ≤ ∞.Note first that for p < ∞, the set Q then has no interior points for the L p -topology, since for any Q ∈ Q and any ε > 0, the open is a compactly supported (mother) wavelet with scaling function ψ Fa , and that the corresponding one-dimensional wavelets are associated to a multi-resolution analysis.Define d-dimensional n-wavelets as usual by setting ψ m (x) = d r=1 ψ Fa (x r − m r ) for m ∈ Z d and x = (x 1 , . . ., x d ) ∈ R d .For {Fa, Mo} d * = {G ∈ {Fa, Mo} d : at least one component of G equals Mo} we further set

A(
•) L 2 (B R ) d×d is the usual scalar product in L 2 (B R ) d×d , (A, B) L 2 (B R ) d×d = ij B ij dx,extended to the anti-linear dual product between L p (B R ) d×d and L p (B R ) d×d .Further, (•, •) S 2 is the scalar product in the Hilbert space of Hilbert-Schmidt operators,(F, K) S 2 = j∈N s j (F)s j (K) = ∞ j=1 Fg j , K g j ) L 2 (S d−1 ) [15, into H 1/2 (∂B 2R ), see[15, Lemma 3.35], such that u s | ∂B 2R belongs to H 1/2 (∂B 2R ).Further, (Id d +Q)∇u s belongs to H(div, B 2R ) since u s solves (3) in L 2 (R d ), such that the trace theorem in H(div, B 2R ) shows that ∂u s /∂ν = ν •∇u s = ν •(Id d +Q)∇u s belongs to H −1/2 (∂B 2R 1(B 2R ) implies that the latter equation holds for all ψ ∈ H 1 (B 2R ).As the trace operator γ(u) = u| ∂B 2R has an unique continuation to a linear operator from H 1 (B