Preconditioning inverse problems for hyperbolic equations with applications to photoacoustic tomography

This paper is concerned with the preconditioning of wave equation-constrained linear inverse problems from boundary observation data, such as, for instance, photoacoustic tomography, which is considered as an application. The main result of this paper is a concept for regularization parameter robust preconditioning. That is, we propose a preconditioner for the PDE-constrained optimization problem such that the condition number of the preconditioned optimality system is uniformly bounded with respect to the regularization parameter. Using an augmented Lagrangian formulation for the discretized optimality system, we employ a discretization strategy which renders the discretized preconditioned system robust with respect to both the regularization parameter and the mesh size. Numerical experiments complement our analysis.


Introduction
In this paper, we reformulate inverse problems for linear hyperbolic equations from indirect boundary measurement data in the framework of optimal control. The optimality system of the resulting hyperbolic PDE-constrained optimization problem is a saddle point problem in an infinite-dimensional Hilbert space setting. We prove the well-posedness of the optimality system and provide a formulation that is regularization parameter robust. In particular, we propose a preconditioner for the continuous system which renders the condition number of the preconditioned optimality system uniformly bounded with respect to the regularization parameter. Using a conforming finite element discretization, the discrete preconditioned system is robust with respect to the regularization and discretization parameters. Analogous concepts have been developed for control problems based on elliptic partial differential equations. Saddle point formulations for inverse problems of the wave equation have been considered in [3,4] before. The main difference between these papers and ours is that our approach includes an additional regularization term which allows us to avoid an 'observability hypothesis' as stated in [3,4]. A further difference is that for our choice of finite element discretization spaces, the discrete inf-sup condition of the associated saddle point problem is inherited from the continuous formulation (see section 4.2) and does not need to be assumed.
As a prime (numerical) test example, we consider the inverse problem of photoacoustic tomography (PAT) [15,16]. For this inverse problem, we provide a proof of concept of regularization parameter robust preconditioning. The concept is flexible and can be applied to generalized problems of PAT, such as models taking into account attenuation or variable sound speed models. Moreover, at the current state of research, we do not make an attempt to be competitive with existing highly developed and efficient algorithms in PAT (such as fast Fourier methods based on backprojection algorithms-see for instance [6]), because our implementations are highly memory and run-time demanding since they require space-time solutions of the wave equation. For the sake of completeness, we review the basic mathematical model of PAT and the variant, which is considered here: for photoacoustic imaging, a specimen is illuminated by a short laser pulse. The absorbed electromagnetic energy creates instantaneous heating of the probe which in turn induces an acoustic pressure wave caused by rapid thermal expansion. The goal of PAT is to reconstruct the electromagnetic absorption density, which is assumed to be proportional to the induced acoustic pressure (denoted by u in the following). Mathematically, the propagating pressure wave (denoted by y ) is typically modeled with the acoustic wave equation in R 2 , R 3 : Let d = 2, 3, then the pressure wave solves (1.1) In PAT, the acoustic pressure y is measured over time on a curve Γ, which does not intersect the support of the specimen and lies entirely on one side of Γ-with this term, we have in mind, for instance, that the measurement devices surround the specimen or that the measurements lie on part of a half plane which bounds the specimen. The mathematical problem of PAT consists of reconstructing u from the knowledge of y on Γ over time. The inverse problem of PAT is therefore an inverse initial source problem for a hyperbolic partial differential equation.
In the forward modeling of PAT, we slightly deviate from the standard model (1.1), and consider the wave equation in a bounded domain Ω ⊂ R d over a finite time interval (0, T) and enforce homogeneous Dirichlet boundary conditions on ∂Ω. The deviation from the standard model in PAT is done in order to work on a bounded computational domain. If ∂Ω is relatively far away from the specimen of interest, we might expect only slight deviations from the standard PAT model. Therefore, the PAT problem considered in this paper consists of estimating the absorption density u : Ω → R from boundary measurements z d of y We assume that Γ is a closed curve completely contained in the open domain Ω (see figure 1). With respect to optimal control, variants of the standard PAT model have also been considered, for instance, in [5]. For the sake of simplicity, we shall illustrate our ideas here only for the constant sound speed coefficient case; the basic concept also applies to the variable sound speed case.
The organization of the paper is as follows. In section 2, we introduce some notation and recall well-known results regarding the wave equation which are needed in order to introduce our optimal control problem in a proper Hilbert space setting in section 3. We prove the wellposedness of the corresponding optimality system in a regularization parameter robust manner in theorem 3.6, which is the key element for the design of our robust preconditioner. In section 4, we propose and analyze a robust operator preconditioner for the continuous optimality system (see theorem 4.1). We then show the well-posedness of the discretized augmented optimality system for our particular choice of finite element discretization spaces. Given such a stable discretization, the preconditioner for the discrete optimality system is derived from the continuous one in theorem 4.4. Finally, in section 5, we conclude our work with numerical experiments on the robustness of our preconditioner and the proof of concept of applicability for PAT.

Basic notation
Throughout this paper, we will use the following notation:

Notation 1.1 (Sets). Let
denotes an open bounded domain with piecewise Lipschitz boundary ∂Ω. Let ∅ = Ω s be an open subset with Lipschitz boundary ∂Ω s , which is compactly contained in Ω. Moreover, let Γ ⊂ ∂Ω s be a measurable subset (see figure 1).

Weak solutions of the hyperbolic wave equation
In the following, we recall the concept of a weak solution of the constant coefficient wave equation with homogeneous Dirichlet boundary conditions on ∂Ω × [0, T].

Definition 2.1 (Weak solution [9]). Let
then the wave operator is defined as For a given • For every (f , y 0 , y 1 ) ∈ D there exists a unique weak solution y ∈ W of (2.4).
• The linear mapping (f , y 0 , y 1 ) → (y, y ) is bounded from D into L 2 (0, T; H 1 0 (Ω))× L 2 (0, T; L 2 (Ω)) and C([0, T]; H 1 0 (Ω)) × C([0, T]; L 2 (Ω)), respectively. Now, similar to [3,4], we collect the set of possible solutions of the wave equation with inhomogeneities from the set D: Definition 2.3. Let W be the set defined in (2.1). The set of possible solutions of the wave operator is defined as the set It is important for our further considerations, and somehow surprising, that set Y can be associated with a Hilbert space topology:

Theorem 2.4 (Hilbert space Y). The set Y is a Hilbert space when associated with the inner product
Proof.
(i) First, we note that from theorem 2.2, it follows that the mapping ( f , y 0 , , y k (0)) k is a Cauchy sequence in D, and therefore possesses a limit in D, which we denote by (f , y 0 , y 1 ). According to the first item of this proof, there exists y ∈ Y which solves W[y] = f , y (0) = y 0 , y (0) = y 1 . It then follows that Thus the Cauchy sequence is converging and thus the space Y is complete. □

The minimization functional
We consider now the problem of PAT on a bounded domain Ω, as discussed in section 1. For the sake of simplicity, we assume that the sound speed is constant one in Ω. That is, the considered problem of photoacoustics (in operator notation) reads as follows: Since the data z d can be noisy, it is common to reformulate problem 3.1 as a constrained optimization problem consisting of calculating, for some fixed regularization parameter α > 0, a minimizer of the cost functional In the following, we analyze the functional T α,z d . The first result consists of proving that the trace of y is well-defined such that the residual term y − z d Proof. From theorem 2.2, it follows that Y, as defined in (2.5), can be represented as and moreover the norms and · Y are equivalent. This, together with the trace theorem, shows that which gives the assertion. □ In order to solve the constrained minimization problem, we will reformulate it as a saddle point problem.

The saddle point problem
where the bilinear forms a α,ρ : Y × Y → R and b : Y × Λ → R and the linear form l : Y → R are defined by for all y, p ∈ Y and λ ∈ Λ.
The first order optimality conditions of the Lagrangian L ρ , can be expressed as a saddle point problem, consisting of finding (y, λ) ∈ Y × Λ satisfying To prove the existence and uniqueness of a solution of equation (3.9), we need to guarantee four Brezzi conditions (see, for instance, [2]) in an appropriate Hilbert space setting on Y (as defined in (2.5)). In particular, we investigate Y together with a parameter-dependent family of functionals: For α > 0, ρ 0 we introduce The next lemma guarantees that Y associated with the bilinear forms (·, ·) Yα,ρ , induced by y Yα,ρ , is indeed a Hilbert space.
Proof. According to theorem (2.4), the mapping is an equivalent norm to · Y . The assertion then follows from the inequality (3.4). □ Moreover, to verify the Brezzi conditions we also need the following lemma:  for all y ∈ Y whenever ρ > 0.
• To prove the 1st Brezzi condition, we estimate with the Cauchy-Schwarz inequality as follows: let y , p ∈ Y , then y Yα,ρ p Yα,ρ .

9). As a consequence, the function y is the unique minimizer of the constrained optimization problem (3.2)-(3.3).
So far we have seen that for every α > 0, ρ 0 there exists a unique pair (y, λ) ∈ Y × Λ solving the saddle point problem (3.9). It is due to the regularization term in the minimization functional (3.2) that the coercivity of a α,ρ on the kernel of b holds. In the work of Cîndea and Münch [3,4], the authors consider the minimization of the data discrepancy only, without an additional regularization term, and thus, in order to establish coercivity, they needed to assume an additional observability inequality which guarantees that the measurements yield a norm on the kernel of their state equation. We can avoid such an assumption.
Thus for every α > 0 and ρ 0 there exists a continuous operator A α,ρ : Y → Y such that A α,ρ y, p Y ×Y = a α,ρ (y, p) for all y, p ∈ Y.
For B : Y → Λ , its adjoint B : Λ → Y is given by Using this operator notation, the saddle point problem (3.9) is equivalent to The following corollary is an immediate consequence of theorem 3.6. 16) is a self-adjoint isomorphism. Furthermore, for the Hilbert space X := Y × Λ endowed with the inner product (x, w) Xα,ρ := (y, p) Yα,ρ + (λ, µ) Λ for all x = (y, λ), w = (p, µ) ∈ X, (3.17) it follows that where the constants c = c(ρ) and c = c(ρ) are positive and independent of α. Here, as usual, we identify the dual

Robust preconditioning
Since we are dealing with a space-time domain, a discretization of equation (3.16) will lead to a very large linear system of equations. For an efficient solution, we will use a preconditioned minimum residual (MINRES) method. The regularization parameter α > 0 enters in equation (3.16) via the bilinear form a α,ρ . Thus the solution of the (discretized) system depends on α. It is our goal to obtain α-independent convergence of an appropriately designed preconditioned MINRES method for the discretized system. Such a preconditioner for the discretized system is derived from an operator preconditioner for the continuous system (3.16).

Operator preconditioning
Since the operator A α,ρ defined in equation (3.16) is not self-mapping, a MINRES method cannot be applied to the system (3.16). This is remedied by complementing it with an isomorphic operator such that the composition is self-mapping. This complementation is referred to as preconditioning. For an in-depth discussion on this topic, we refer the reader to [7,11]. Then the operator P −1 α,ρ A α,ρ : X → X is a self-adjoint isomorphism with respect to the topology induced by the inner product (·, ·) Xα,ρ . Furthermore, the condition number of the preconditioned systems satisfies and is bounded uniformly in α > 0.
Proof. The operator P α,ρ : X → X is the inverse of the Riesz representation operator J X : X → X (see, for instance, [12]) when X is associated with the topology induced by · Xα,ρ . Then from corollary 3.8 it follows that the composition P −1 α,ρ A α,ρ : X → X is a linear isomorphism on X which is self-adjoint with respect to the inner product on X, which shows the first claim. For the second part, note that The assertion then follows from the definition of the condition number and (3.18). □ In the following, we use P α,ρ as a preconditioner for equation (3.16) and we consider the equivalent system 3) The preconditioned system will then be solved by a MINRES method.

Stable discretization
The well-posedness of the discretized version of equation (3.16) is characterized by the discrete Brezzi conditions. For a pair of conforming discretization spaces Y h ⊂ Y and Λ h ⊂ Λ, the bilinear forms a α,ρ and b, restricted to the subspaces Y h × Y h , Y h × Λ h respectively, satisfy the 1st and 3rd Brezzi condition from theorem 3.6 automatically for every α > 0, ρ 0. The 2nd Brezzi condition is in general only satisfied for ρ > 0. We will, therefore, only consider the discretization of the optimality system for the augmented Lagrangian L ρ (ρ > 0)-see (3.7). It remains to verify the 4th Brezzi condition.

Lemma 4.2. Let Y h ⊂ Y be a conforming discretization space and set
Then for every α, ρ > 0, Λ h ⊂ Λ and the bilinear form b satisfies the 4th Brezzi condition, where C obs = C obs (Ω, Ω s , T) is the constant from lemma 3.2.
The proof of the second assertion follows exactly the same lines as the last part of the proof of theorem 3.6 with (Y, Λ, y, λ,ŷ) replaced by ( (4.5) and l h = l| Y h , the discretized saddle point problem, which is considered in the following, is given by Since the discrete Brezzi conditions are satisfied with the same constants as for the continuous formulation, corollary 3.8 carries over to the discretized formulation in equation (4.6).

17), it follows that
where the constants c = c(ρ) and c = c(ρ) are positive and independent of α and do not depend on the choice of Y h .
Theorem 4.4. Let α, ρ > 0 and assume that Y h ⊂ Y is a conforming discretization space and let Λ h ⊂ Λ be given by (4.4).
Then the operator P −1 α,ρ,h A α,ρ,h : X h → X h is a symmetric isomorphism with respect to the topology induced by the inner product (·, ·) Xα,ρ .
Furthermore, the condition number of the preconditioned systems satisfies and is bounded uniformly in α > 0. Moreover, it does not depend on the choice of Y h .
This shows that our preconditioner P α,ρ,h is robust with respect to α and the discretization.
Proof. The theorem follows from corollary 4.3 and is proven along the lines of the proof of theorem 4.1 restricted to the subspace In the following, we use P −1 α,ρ,h as a preconditioner, and, as a consequence, the minimum residual method can be applied to the preconditioned discrete system of equations in (4.6): Remark 4.5. Note that despite the robustness with respect to α and the discretization, the condition number of the preconditioned discrete system will depend on ρ > 0 since ρ appears in the (discrete) Brezzi constants (see theorem 3.6 and lemma 4.2) and therefore in the constants from (4.7). As a result of this, the solution of the system (4.9) will depend on ρ, as our numerical results will illustrate later on.

Numerical experiments and results
Our test example is PAT, as described in problem 3.1 (see equation (1.2)) on the rectangular domain Ω = (0, 1) d over the time interval (0, T) with T = 1 4 . As observation domain Γ we use the boundary of (see figure 1) The longest distance of a point p in Ω s to Γ, that is max{d( p, Γ) : p ∈ Ω s }, is 1 4 . Thus, for the information of a point to propagate to Γ, it takes 1 4 time-units (because the sound speed equals 1). Therefore, a measurement time of T = 1 4 units guarantees uniqueness [13, theorem 2]. For the numerical solution, we consider 2nd order B-spline spaces with equidistant knot spans and maximum continuity on the interval (a, b), S 2, (a, b), where is the number of uniform refinements performed. This space has a mesh size of h = (b − a)/2 and smoothness C 1 (a,b). Moreover, the second derivative of a 2nd order spline is piecewise constant and thus the spline is an element in H 2 (a,b). Tensor product B-spline spaces are the tensor products of univariate B-spline spaces. Defining , we see that because, as already stated above, every spline is two times weakly differentiable, for all y h ∈ Y h (W[y h ], y h (0), y h (0)) ∈ L 2 (0, T; L 2 (Ω)) × H 1 0 (Ω) × L 2 (Ω). Thus Y h ⊂ Y according to the definition of Y in (2.5) and thus Y h is conforming. For more complex domains, isogeometric analysis (see [8,14]) can be used to obtain smooth conforming discretization subspaces, and multi-patch domains can be dealt with using the methods described in [1] and the references within.

Robustness of the preconditioner
In the first series of the numerical experiments for PAT, we study the robustness of our preconditioner P α,ρ,h as introduced in theorem 4.4.
Here we use the observation surface Γ = ∂Ω s (see (5.1)). Condition numbers for different α and different numbers = t = x of uniform refinements are given in tables 1 and 2 for d = 2 and d = 3, respectively. Tables 3 and 4 contain iteration numbers for solving the preconditioned system using MINRES with zero right-hand side and a random initial starting vector. The stopping criteria is the reduction of the residual error by 10 −8 . As predicted from theorem 4.4, the condition numbers and iteration numbers, respectively, are independent of the mesh size h, as well as the regularization parameter α.
The preconditioner is not robust with respect to ρ. This is reflected by the iteration numbers in table 5. The number of iterations needed to reach the threshold increases significantly as ρ decreases. Here, we present the numerical results for PAT with the preconditioning method described above using simulated data. The ground truth, the smiley, is represent in figure 2. It is constructed to be a second order spline with eight refinements, that is an element of S 2,8 ((0, 1) 2 ). We used simulated measurement data z d , which was obtained by numerically solving the wave equation with the ground truth with the Galerkin method in space and a finite difference method in time, with time step h t = T/2 10 , where T = 1 4 .  To solve the inverse problem of PAT, we discretize the augmented optimality system on the space Y h := S 2,6 (0, T) ⊗ S 2,6 ((0, 1) 2 ) ∩ H 1 0 ((0, 1) 2 ) and Λ h according to (4.4). It is obvious that since we make only six refinements, the ground truth cannot be recovered accurately.
The observation surface is again Γ = ∂Ω s and the resulting preconditioned linear system of equations is solved using the MINRES method. This is done by using sparse direct solvers for the subsystems with the matrices P Y α,ρ,h = A α,ρ,h and P Λ h . This is currently the bottle neck of the numerical procedure as the direct inversion of such matrices requires a lot of memory, which limits us, for instance, to a maximum of = 6 uniform refinements. Figure 3 shows the reconstruction for α = 1.0 and ρ = 1.0. In figure 4, we reduced α to 10 −7 while ρ stays the same. Both reconstructions do not resemble the ground truth. To obtain  the results shown in figures 5-8, we reduce ρ to 10 −2 , 10 −5 , 10 −6 and 10 −7 , respectively. We see from these figures that the value ρ significantly effects the recovered image. Smaller values of ρ provide a better recovery; however, too small values of ρ yield instabilities which can be observed in figure 8.

Discussion regarding the augmented parameter ρ
In section 4.2, we presented a stable discretization scheme based on an augmented Lagrangian stabilization and a particular choice of Λ h . The discrete preconditioner P α,ρ,h from theorem 4.4 is not robust with respect to ρ (see remark 4.5). The iteration number for the preconditioned discretized augmented system P −1 α,ρ,h A α,ρ,h increases as ρ goes to zero; however, a small ρ is needed to recover the desired image, as shown in the figures.