An adaptive smoothness regularization algorithm for optical tomography

In diffuse optical tomography (DOT), the object with unknown optical properties is illuminated with near infrared light and the absorption and diffusion coefficient distributions of a body are estimated from the scattering and transmission data. The problem is notoriously ill-posed and complementary information concerning the optical properties needs to be used to counter-effect the ill-posedness. In this article, we propose an adaptive inhomogenous anisotropic smoothness regularization scheme that corresponds to the prior information that the unknown object has a blocky structure. The algorithm updates alternatingly the current estimate and the smoothness penalty functional, and it is demonstrated with simulated data that the algorithm is capable of locating well blocky inclusions. The dynamical range of the reconstruction is improved, compared to traditional smoothness regularization schemes, and the crosstalk between the diffusion and absorption images is clearly less. The algorithm is tested also with a three-dimensional phantom data. © 2008 Optical Society of America OCIS codes: (170.3010) Image reconstruction techniques; (170.6960) Tomography; (100.3010) Image reconstruction techniques; (100.3190) Inverse problems; (100.6890) Threedimensional image processing; (100.6950) Tomographic image processing References and links 1. S. R. Arridge, “Optical Tomography in Medical Imaging,” Inverse Probl. 15, R41–R93 (1999). 2. J. Hebden, A. Gibson, R. Yusof, N. Everdell, E. Hillman, D. Delpy, S. Arridge, T. Austin, J. Meek, and J. Wyatt, “Three-dimensional optical tomography of the premature infant brain,” Phys. Med. Biol. 47, 4155–4166 (2002). 3. J. Hebden and T. Austin, “Optical tomography of the neonatal brain,” European Radiology 17, 2926–2933 (2007). 4. J. Culver, T. Durduran, D. Furya, C. Cheung, J. Greenberg, and A. Yodh, “Diffuse optical tomography of cerebral blood flow, oxygenation, and metabolism in rat during focal ischemia,” J. Cerebral Blood Flow & Metabolism 23, 911–924 (2003). 5. J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems (Springer Verlag, 2004). 6. J. Kaipio, V. Kolehmainen, M. Vauhkonen, and E. Somersalo, “Inverse problems with structural prior information,” Inverse Probl. 15, 713–729 (1999). 7. B. Pogue, T. McBride, J. Prewitt, U. sterberg, and K. Paulsen, “Spatially Variant Regularization Improves Diffuse Optical Tomography,” Appl. Opt. 38, 2950–2961 (1999). 8. A. Douiri, M. Schweiger, J. Riley, and S. Arridge, “Anisotropic diffusion regularization methods for diffuse optical tomography using edge prior information,” Meas. Sci. Technol. 18, 87–95 (2007). 9. B. Brooksby, S. Jiang, C. Kogel, M. Duyley, H. Dehgani, J. Weaver, S. Poplack, B. Pogue, and K. Paulsen, “Magnetic resonance guided near infrared tomography of the breast,” Rev. Sci. Instrum. 75, 5262–5270 (2004). 10. P. Yalavarthy, B. Pogue, H. Dehgani, C. Carpenter, S. Jiang, and K. Paulsen, “Structural information within regularization matrices improves near infrared diffuse optical tomography,” Opt. Express 15, 8043–8058 (2007). 11. D. Calvetti, F. Sgallari, and E. Somersalo, “Image inpainting with structural bootstrap priors,” Image and Vision Comput. 24, 782–793 (2006). (C) 2008 OSA 24 November 2008 / Vol. 16, No. 24 / OPTICS EXPRESS 19957 12. D. Calvetti and E. Somersalo, “Microlocal sequential regularization in imaging,” Inverse Problems and Imaging 1, 1–11 (2007). 13. D. Calvetti and E. Somersalo, “Gaussian hypermodels and recovery of blocky objects,” Inverse Probl. 23, 733– 754 (2007). 14. D. Calvetti and E. Somersalo, “Hypermodels in the Bayesian imaging framework,” Inverse Probl. 24, 034013 (20pp) (2008). 15. D. Calvetti, J. P. Kaipio, and E. Somersalo, “Aristotelian prior boundary conditions,” Int. J. Math. Comp. Sci. 1, 63–81 (2006). 16. M. Schweiger, S. Arridge, M. Hiraoka, and D. Delpy, “The finite element method for the propagation of light in scattering media: Boundary and source conditions,” Med. Phys. 22, 1779 – 1792 (1995). 17. D. Calvetti and E. Somersalo, Introduction to Bayesian Scientific Computing – Ten Lectures on Subjective Computing (Springer Verlag, 2007). 18. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D 60, 259–268 (1992). 19. C. R. Vogel and M. E. Oman, “Fast, robust total variation-based reconstruction of noisy, blurred images,” IEEE Trans. Image Process. 7, 813–824 (1998). 20. P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Anal. Mach. Intell. 12, 629–639 (1990). 21. D. Calvetti and E. Somersalo, “Recovery of shapes: hypermodels and Bayesian learning,” Proc. of the Applied Inverse Problems 2007: Theoretical and Computational Aspects. J. of Physics Conference Series (to appear) . 22. “Automatically Tuned Linear Algebra Software (ATLAS),” http://math-atlas.sourceforge.net/ (19th June 2008). 23. Y. Saad, Iterative Methods for Sparse Linear Systems (Society for Industrial Mathematics, 2003). 24. I. Nissilä, J. Hebden, D. Jennions, J. Heino, M. Schweiger, K. Kotilahti, T. Noponen, A. Gibson, S. Järvenpää, L. Lipiäinen, and T. Katila, “Comparison between a time-domain and a frequency-domain system for optical tomography.” J. Biomed Opt. 11, 064015 (2006). 25. T. Tarvainen, M. Vauhkonen, V. Kolehmainen, and J. P. Kaipio, “Finite element model for the coupled radiative transfer equation and diffusion approximation,” Int. J. Numer. Meth. Eng. 63, 383–405 (2006). 26. J. Heino, E. Somersalo, and J. Kaipio, “Statistical compensation of geometric mismodeling in optical tomography,” Opt. Express 13, 296–308 (2005). 27. S. R. Arridge, J. P. Kaipio, V. Kolehmainen, M. Schweiger, E. Somersalo, T. Tarvainen, and M. Vauhkonen, “Approximation errors and model reduction with an application in optical diffusion tomography,” Inverse Probl. 22, 175–195 (2006). 28. J. P. Kaipio and E. Somersalo, “Statistical inverse problems: discretization, model reduction and inverse crimes,” J. Comp. Appl. Math. 22, 493–504 (2006). 29. C. Paige and M. Saunders, “LSQR: An Algorithm for Sparse Linear Equations And Sparse Least Squares,” ACM Trans. Math. Software 8, 43–71 (1982).


An adaptive smoothness regularization algorithm for optical tomography 1. Introduction
The diffuse optical tomography (DOT) is an emerging medical imaging modality in which the optical properties of the tissue are estimated from the measured scattering and transmission of near infrared light [1].One of the potential uses of this modality is to estimate the local oxygen saturation level in the brain tissue of neonates [2,3], conveying indirect information of, e.g., the cerebral metabolic rate of the oxygen [4].
The optical tomography problem is very challenging from the point of view of instrumentation and data collection, as the transmission signal has a low signal to noise ratio and is corrupt by errors due to instabilities in the device.Moreover, the computational challenges are remarkable since the estimation problem is a non-linear severely ill-posed inverse problem.
The ill-posedness of the inverse problem can be compensated for by using regularization techniques, among which Tikhonov regularization [5] is the most widely popular.An alternative approach is to consider the inverse problem as one of statistical inference and to compensate the shortcomings of the data by complementary information, or prior information about the unknowns.Regularized solutions of inverse problems often have an interpretation in the Bayesian statistical framework.Conversely, while the statistical inverse problem is much richer than just one estimate, the regularization methods can be used as a guideline for designing reasonable prior densities.We make use of this interplay in this article.
A standard Tikhonov regularized solution based on a homogenous smoothness penalty pro-duces typically a blurred diffuse image, thus failing to identify the boundaries of the internal structures of the target and possible inclusions.If prior information about the internal structures is available, e.g., from anatomical imaging modalities such as X-ray tomography or MRI, it is possible to construct structural penalties that allow abrupt changes across the structure boundaries [6,7,8,9,5,10].Often, however, such complementary information is not available, and a structure based on generic anatomical information alone may lead the algorithm astray.In the present article, we propose an iterative algorithm that updates dynamically the regularization functional based on the features in the current estimate.More precisely, starting with a homogenous smoothness penalty, the proposed algorithm uses the current estimate of an optical parameter as a pilot image, interpreted as a smooth approximation of the underlying structure, and iteratively improves the approximation, eventually finding a solution that is locally smooth but may have fast variations across the boundaries of the inclusions.The idea of updating iteratively the smoothness penalty has previously been studied with structured grids and in the framework of linear inverse problems [11,12,13,14].In the present work, we extend these ideas to unstructured grids which makes them suitable for applications where the forward problem calls for finite element modelling.Moreover, the non-linearity of the forward map increases the complexity of the problem.Finally, the construction of the anisotropic smoothness penalty with undetermined boundary conditions is new, extending the idea of Aristotelian boundary conditions presented in [15].The algorithm presented here is not limited to optical tomography but is also suitable for other linear or non-linear inverse problems where unstructured grid is used to represent distributed parameters in a discrete form.

Forward model and inverse problem
The incoherent light propagation in scattering medium can be described with the radiative transfer equation.Being an integro-differential equation, its numerical approximation leads easily to computational problems of prohibitive size.When the medium is strongly scattering, the radiative transfer equation is often replaced with the diffusion approximation.
Assuming that the light source is time harmonic with modulation frequency ω > 0, the photon density φ = φ (x) in the diffusion approximation satisfies the elliptic partial differential equation where , is a bounded domain representing the target, and the diffusion coefficient is defined in terms of the absorption coefficient μ a (x) and reduced scattering coefficient The appropriate boundary condition is a Robin condition, where q is the boundary source and the impedance coefficient ζ takes into account the refractive mismatch of the body and the outside air on the boundary.We use the semi-empirical model where R is the reflection coefficient of the boundary and n is the refraction index of the medium, and n air = 1.0, see [16] for further details of the model.The modulation frequency of the source is ω = 2π f , f = 100MHz and c is speed of light in the medium, assumed to be constant.
The inverse problem of optical tomography is to estimate the optical properties of the body from the scattering and transmission of light measured on the surface of the body.In terms of the diffusion approximation model of light propagation, the problem is tantamount to estimating the absorption and the diffusion coefficient pair μ a (x), κ(x) | x ∈ Ω from the noisy observations of the outcoming photon densities corresponding to all applicable boundary sources.Given the boundary source q, the outcoming photon density on the boundary, or exitance, is defined as Hence, the inverse problem is to determine the pair μ a , κ from all available pairs q, Γ q , where the measured exitance Γ q is corrupted by noise.
In the present formulation, the exitance is a complex quantity, which can be expressed in terms of the logarithm of the amplitude and the phase shift, In practice, we may apply a finite number of linearly independent boundary source patterns, q , 1 ≤ ≤ L, and the data consists of the corresponding photon densities, denoted by Γ , recorded only at few boundary points r j , 1 ≤ j ≤ m, where m is the number of the detectors at the boundary.Using the amplitude-phase representation and ignoring the measurement noise, the data set is thus ( Assume that a triangular or tetrahedral finite element meshing of the domain , is given, with vertices x i , 1 ≤ i ≤ N, and assume that the unknown optical distributed parameters μ a and κ are discretized and represented by the collocation values (μ a (x i ), κ(x i )) at the nodal points.We denote by p ∈ R 2N the vector containing the N collocation values of both unknown parameters.Using finite elements, we may discretize the forward diffusion approximation model to obtain a discrete forward model.Let us denote by y ∈ R 2m the vector containing the boundary data (5) corresponding to the amplitude and phase of the exitance Γ at the detector locations r j , 1 ≤ j ≤ m.By stacking all the observation vectors into a single vector y, the finite element model leads to a discrete forward model

Statistical inference
In the Bayesian statistical framework, all the variables are treated as random variables, the randomness being an expression of the lack of information about the values that they take on [5,17].The inverse problem in this framework is to infer on the probability density of the unknown vector of primary interest based on the realized values of the random variable representing the observable.
To set up the statistical model, consider the forward model (6).Assuming that the observation is corrupted by additive noise that is independent of the optical properties of the body, we write y = f (p) + e, e ∼ π noise (e), the notation above indicating that the probability density of the noise is π noise .This model allows us to write the likelihood density of the measurement, where the symbol "∝" stands for "equal up to a multiplicative constant".Hence, the data conditional on the true signal is distributed around the noiseless value f (p) as the noise is distributed around the origin.
Assume that complementary information concerning the unknown p is available, and this information is expressed in the form of a probability density π prior (p).This density, the prior probability density, expresses our belief about values that we expect, where π prior (p) is large, and on the other hand, it excludes values of p that we believe to be unlikely or impossible, where π prior (p) is small or vanishes.The joint probability density of the random variables p and y can be written as Reversing the roles of the observable and unknown in the equation for the joint probability density, we may define the probability density of p conditioned on the observation.The result is Bayes' equation for the posterior probability density, Above, π(y) is the marginal density of y, obtained by integrating out the variable p from the joint probability density, and it is ignored here as it contributes only to normalizing of the posterior density.In the Bayesian framework, the posterior density is the solution of the inverse problem, since it is an expression of the information that is available about p, taking the prior belief, the noisy measurement and the model between the unknown and the data into account.
A particular instance of the posterior density is when the prior density and the likelihood are Gaussian.If where the noise and prior covariance matrices C n and C p are positive definite, then the posterior density assumes the form By writing the symmetric (e.g., Cholesky) decompositions of the inverses of the covariance matrices, C −1 n = S T S, C −1 p = L T L, the density can be expressed in the form This equation is particularly useful for establishing the connection between regularization and statistics.We observe that the maximizer of the posterior density, known as the Maximum A Posteriori (MAP) estimate of p, denoted by p MAP , satisfies which is the Tikhonov regularized solution of the inverse problem with the penalty functional and the regularization parameter equal to unity.Hence, an appropriate choice of a Gaussian prior density leads to the Tikhonov regularized solution as the MAP estimate, and conversely, a judiciously chosen penalty functional helps to design a Gaussian prior density.We use this well known interplay in both direction in the discussion to ensue.

Construction of the prior
In the Bayesian framework, the prior density comprises the prior information, or prior belief, concerning the unknown field, in the present problem the absorption and scattering coefficient.
The challenge in constructing the prior density is how to translate qualitative information concerning the unknown into a quantitative form that is expressible as a probability distribution.
The selection of a good prior model is by no means unique, and the quality can often be assessed only by its capacity of producing meaningful a posteriori estimates that respect the qualitative a priori constraints.
In the present context, we assume that the a priori information concerning the optical properties of the object is summarized in the following qualitative description: the object consists of well defined simple subdomains with slowly varying optical properties, while across the subdomain boundaries, sharp variations may occur.Such a description is often said to define a blocky object, and techniques for retrieving the parameter fields in such cases have been developed [18,19,20,13,12,14].In the present work, the goal is to extend the approach proposed in [12] to a strongly non-linear problem with a non-structured discretization.An extra computational challenge comes from the complexity of the forward model, requiring both effective numerical algorithms and use of memory.
The construction of the prior density consists of two steps.To allow the optical parameters to easily jump across the subdomain boundaries, we use inhomogenous direction sensitive smoothness priors with dynamic adaptation [11,12].To avoid boundary artifacts, we adjust the a priori variances in the smoothness prior model to be nearly constant throughout the body [15].

Structural interior prior
The first step towards constructing a Gaussian direction sensitive second order smoothness prior is the selection of a suitable penalty functional.Let u : Ω → R denote a twice differentiable function to be estimated from indirect observations.In the present application, u is either the diffusion coefficient or the absorption coefficient.Furthermore, let λ : Ω → R a given nonnegative weight function such that λ ∂ Ω = 1.We can then define an isotropic second order penalty functional with weight λ ≥ 0 in the interior of Ω of the form The role of the penalty functional in the estimation problem is to favor solutions yielding small values.To understand better for which functions u the penalty takes on small values note that, when λ is fixed, the minimizer of u → J (u; λ ) is a steady state diffusion field with the diffusion coefficient λ .In other words, the function λ controls which parts of the image u mix together via diffusion, thus conveying structural prior information to our model.Strategies for the choice of λ will be discussed later in this section.Assume that a finite element triangular or tetrahedral mesh over Ω with vertices x i is given.Let Ω i denote a Voronoi cell containing an interior vertex x i , i.e., where the second identity follows from the divergence theorem, with n denoting the exterior normal vector of the boundary ∂ Ω i .To approximate the integral in the rightmost term, we denote by x i an interior vertex and by 1 ≤ i ≤ N int , and x 1 i , x 2 i ,...,x p i its neighboring nodes.Furthermore, we let y j i = (x j i + x i )/2 be the midpoint of the edge joining neighboring vertices and ∂ Ω j i the facet of the polyhedron ∂ Ω i separating the vertices x i and Collecting in the vector p the values of the function u at the collocation points, p i = u(x i ), the above approximation leads to a matrix representation of the second order differential in the interior points, where the matrix L int , whose elements are computed using (9) depends on the weight function λ .The discrete counterpart of the interior penalty functional ( 8) is therefore We partition the vector p into the vectors p int ∈ R N int and p bdry ∈ R N bdry containing the interior node and the boundary node values, respectively.Assuming, for the time being, that the boundary values are known, we may define the following probability density of the interior nodes conditional on the boundary nodes, To obtain a proper prior for the vector p, taking into account that the boundary values are also unknown, it suffices to construct a prior density for the boundary node values.
2.1.5.Variance adjustment at the boundary Consider now the boundary collocation points The smoothness penalty functional derived above involves the values of u at the boundary points, but in order to obtain a proper prior density, additional information concerning the regularity over the boundary is required.Our approach follows the reasoning in [15].Here, for simplicity, we restrict the explanation to the case when the surface ∂ Ω is smooth.
Assuming that along the boundary, the function u is second order smooth, we define a second order boundary penalty, where Δ τ denotes the surface Laplacian.The mesh defined over Ω restricted to the boundary induces a meshing of the boundary.We discretize the surface Laplacian analogously with the interior Laplacian, that is, where C i is the boundary of Ω i ∩ ∂ Ω and ν is its normal vector tangent to ∂ Ω.If Ω ⊂ R 2 , the calculation of the rightmost integral reduces to evaluations at two points.Approximating the normal derivatives with finite differences at neighboring collocation points, as in the case of interior points, we arrive at a discrete approximation of the boundary penalty functional, where L bdry is the finite difference matrix which approximates the integral (12).We then define the Gaussian marginal smoothness density corresponding to this penalty function, where α > 0 is a scaling constant, whose role will be discussed below.
If we partition the matrix L int according to the partitioning of p into interior and boundary values, ×N bdry , we can express the prior density in the form where L is the matrix To choose the value of the parameter α, we require that when λ = 1, the variance of components p i in the interior and at the boundary are of the same order of magnitude, recalling that the variance of a component p i can be calculated by the equation where e i is the ith unit coordinate vector.The details can be found in Appendix A.

Structural prior
In this section we discuss the choice of the function λ in the definition of the smoothness penalty functional and an adaptive updating scheme for it.From Eq. ( 9) it follows that λ (y j i ) can be viewed as a coupling constant between the neighboring collocation values u(x i ) and u(x j i ).The coupling should be small if we believe that the function value is likely to change significantly between these two collocation points, while it should be large if the collocation values are believed to be close to each other.
If we have available a smooth approximation u of the unknown function u to be estimated, which we will refer to as pilot function, we can define λ to be where τ > 0, k ∈ N are shape parameters of the function λ , and D v j i denotes the directional derivative in the direction v j i , This selection of λ makes the smoothness prior direction sensitive.The rational for this choice is that when the pilot function has a strong gradient in the given direction v j i , the penalty for the unknown function u for changing rapidly while passing form x i to x j i should be small, to allow a rapid variation.The matrix obtained by modifying L using the pilot image u, denoted by L u can be formed knowing only the values of the pilot image at the interpolation points y j i .

Adaptation
We now focus our attention to the construction of the structural priors for the optical tomography problem that we are considering.Assume that κ and μ a are the pilot functions of the diffusion and absorption coefficients, respectively.Since these coefficients are estimated simultaneously from the data, we define the matrix where γ 1 , γ 2 > 0 are scaling constants.Similarly as in [11,12], we propose an adaptive algorithm for updating iteratively the penalty matrix ( 16) based on the current information about the optical parameters.
(C) 2008 OSA 2. Using the current pilot images κ and μ a , form the matrix W according to (16).

Calculate
4. If j = n iter , stop, otherwise (a) Update the pilot images κ and μ a using the current estimate of p, normalized by the maximum values, Compute the values at the interpolation points y j i , according to the equations ) Increase j by one and continue from 2.
The selection of the value of Tikhonov regularization parameter δ > 0 in (17) will be discussed in the next section where we address various issues concerning the practical implementation of the algorithm and the details of the numerical experiments.The solution of the minimization subproblem ( 17) is computed with a damped Gauss-Newton algorithm.
We point out that the updating the penalty functional is tantamount to updating the prior density based on the data, which may look dubious in the classical Bayesian paradigm.Although not rigorously analyzed in the present form, the updating of the prior may be justified by in the context of hierarchical models: see [13,14,21].

Results
In this sections we illustrate the performance of the proposed algorithm, when applied to both synthetic and real phantom data.The algorithm has been implemented in C, so as to take advantage of the shared memory parallelism and to take advantage of automatically tuned BLAS libraries [22] for linear algebraic operations.

Simulated data
We start by testing our algorithm with simulated data in two dimensions.The target is a circle of diameter 7 cm, with three circular inclusions, each of diameter 1 cm.Two of the inclusions have a scattering coefficient different from the constant background and two of them have absorbtion coefficients different from the constant background, therefore one of the inclusions has both scattering and absorption coefficients different from the background.
Two different simulations are carried out.In the first one, the optical parameters of the inclusions are constants, hence we refer to it as the "blocky inclusion model".In the second one, which we will refer to as the "smooth inclusion model", the optical parameter functions within the inclusions are Gaussian humps.The optical properties of the background and the inclusions are listed in the Table 1, and the targets are plotted in Fig. 2. When the smooth inclusion model is used, the tabulated values refer to the peak values of the Gaussian humps.The refraction index in the body for this simulation is n = 1.4 assumed known, and 17 optical fibers are uniformly spaced along the circumference of the target.The data are simulated by sequentially letting one of the fibers act as a source and the remaining ones as detectors, leading to To avoid the "inverse crime" of using the same computational forward and inverse problem, the simulated data are produced with a FEM model using 3790 triangular elements with quadratic basis functions, while the model used in the solution of the inverse problem uses 3566 triangular elements with quadratic basis functions.
The optical parameters are represented in a third triangular mesh of 2190 elements and are interpolated to the FEM meshes by piecewise linear interpolants.In our simulation we add zero mean Gaussian noise with standard deviation σ log|Γ| = 10 −3 max|log|Γ|| = 0.0228 for the the component added to the logarithm of the amplitude, and of σ ϕ = 10 −3 max|ϕ| = 0.0016 for the component added to the phase shift.Therefore the Cholesky factor of inverse of the noise covariance matrix is the diagonal matrix In the construction of the penalty functional the shape parameters in (15) to be k = 2 and τ = 50.The scaling coefficients γ 1 and γ 2 appearing in the definition (16) of W are set in the first iteration round,when the pilot images are flat and the penalty is a homogenous second order smoothness penalty, and never changed afterwards.The criterion used for the selection of the scaling is that we no pixel values of the simulated inclusions should not deviate from the background by more than two standard deviations.This adjustment can be done once, together with the boundary variance adjustment.In a realistic situation, the standard deviation needs to be estimated based on the prior belief of the dynamical range about the coefficients.
The update of the value of Tikhonov regularization parameter δ should be be adjusted at each iteration sweep, e.g., using the Morozov discrepancy principle, but this strategy would increase the computational burden significantly.In our computed examples, set the value of delta to 10 −3 once and for all.The maximum number of iteration steps was set to n iter = 20, and up to 15 Gauss-Newton iteration steps were allowed for the solution of the minimization problem.Within each Gauss-Newton step, the new search direction is computed by solving the corresponding linear system with the GMRES algorithm [23].
The reconstructions the blocky inclusion model are displayed in Fig. 3.The top left panel a) corresponds to the absorption coefficient μ a and the top right one b) to the diffusion coefficient κ.The image in the left of each panel displays the reconstructions after the first outer iteration before the smoothness penalty adaptation, while the image on the right shows the final iteration (n iter = 20).The corresponding profiles of the reconstructions along the circular curve indicated in Fig. 2, traversed in counterclockwise direction, are shown underneath.
The advantage of using the penalty adaptation in this example is quite dramatic: after the adaptation, the contrast of the inclusions is sharp and the dynamical range improved.Interest-Fig.2. Images of the absorption coefficient μ a (top) and diffusion coefficient κ (bottom) of the two dimensional simulated targets.The left hand side images correspond to the blocky inclusion model and the right hand side images to the smooth inclusion model.The parameter values are as in Table 1.
ingly, the adaptation is also capable of removing almost completely the crosstalk between the absorption and diffusion coefficient images.
Figure 4 is the counterpart of Fig. 3 for the smooth inclusion model.Although the prior belief that the target is blocky is in conflict with the model, the reconstruction is reasonable and in line with what one would expect.Notice that also in this case,the adaptation is able to reduce the crosstalk very effectively.

Reconstructions from phantom data
In this section we test the efficiency of the adaptive regularization approach with measured phantom data.The details about the instrumentation utilized for the data collection can be found in [24].
The two data sets available used in the computed experiments, two homogenous phantoms, with the same background optical properties, one without and one with two inclusions, are encoded in the vectors y h and y nh , respectively.To suppress reconstruction artifacts due to the diffusion approximation model, we consider the difference data translated around the computed response of the homogenous background.In other words, we define where f (p h ) is the 3D FEM solution computed on a grid of 21207 tetrahedral elements with quadratic basis functions, assuming that the homogeneous optical parameters are known.The absorption and scattering coefficient of the inclusions in the non-homogeneous phantom are approximately twice the values for the background.We estimated absorption coefficient at μ a,bg ≈ 0.0097 mm −1 for the scattering coefficient μ s,bg ≈ 1.04 mm −1 and at n = 1.56 for the refraction index.A schematics of the non homogeneous phantom is shown in Fig. 5.
The data was collected by positioning 16 source and 16 detector fibers around the cylindrical surface in two rings (see Fig. 5).Since the diffusion approximation is known to be a poor model for light propagation near a singular source, measurements from source-detector pairs less than 30 mm apart are omitted.Thus the actual data used in this example consist of 192 amplitude and phase measurements.
A preliminary study of the distribution of the data suggests that the noise in the logamplitude/phase representation can be well modelled with a Gaussian distribution, although the variance of the data depends on the mean.Fig. 6 shows the histograms of the log-amplitude and phase of the homogeneous phantom data from the first source-detector pair and normal probability densities with the means and variances equal to the empirical means and variances of the sample.Empirical analysis of the data reveals that the standard deviation of the noise at detectors 30 mm or more away from the source increases with the amplitude of the data, and can be fairly well estimated using a linear regression model.This justifies an additive noise model of the form y = f (p) + e, e ∼ N (0, (diag(y)) 2 ).
While at first this empirical noise model might be surprising, it becomes easier to understand after recalling that both the absolute value of the logarithm of the amplitude and the phase shift increase when the receiver is moved away from the source.Observe that we the constant of proportionality has been omitted from the noise level, with the understanding that it can be accounted for in the Tikhonov regularization parameter.The Cholesky factor of the inverse of the noise covariance C −1 n for this model is The construction of the prior proceeds as in the synthetic data case, with the optical parameters discretized in a mesh of 25924 tetrahedral elements with linear basis functions.The Cholesky factor L of the inverted prior covariance matrix C −1 p and its parameters are as in Section 3.1, except that now τ j = 150.The Tikhonov regularization parameter is held at δ = 10 −6 .
The adaptive algorithm is stopped after 8 outer iterations, with a maximum of 15 inner Gauss-Newton iteration steps allowed.
Figure 7 displays the projection of the reconstructed optical parameter distributions through the plane z = 0, see Fig. 5.The scattering coefficient image is calculated from the reconstructed diffusion and absorption coefficient images.We define the contrast of the inclusion C μa to be the ratio of the means μ a over nodal values of μ a inside and outside the inclusion.The contrast C μ s is defined in a similar manner.Here, the boundary of the inclusions is assumed to be known.The contrasts in the reconstruction with the smoothness prior are C μa = 1.12 and C μ s = 1.11, respectively, and after eight adaptive iterations, they attain the values C μa = 1.57and C μ s = 1.21.The algorithm does not remove completely the crosstalk in the absorption image: in fact, in the location of the inclusion with the same absorption coefficient as the background, the mean value of μ a is 0.0102mm −1 before adaptation and μ a = 0.0106 mm −1 after the adaptation.However, since the adaptation improves considerably the dynamical range of the reconstruction, the relative crosstalk is diminished, and it is visually less prominent.

Discussion
The article describes a new adaptive regularization algorithm for diffuse optical tomography.
In contrast to the traditional smoothness penalties that keeps the cost of variations in the reconstructed optical parameters fixed, the proposed algorithm updates iteratively the smoothness penalty, using the current estimate as a structural pilot image to decrease the penalty where the possibility of strong contrasts is suggested.The iterations can thus be thought of as a learning process, where the current iterate provides prior information for the new update.A similar approach has been previously applied to linear inverse problems with structural meshes, while in the present application, the mesh is unstructured and the forward map is strongly non-linear.The adaptive algorithm is expected to work best with data that correspond to a target with a blocky structure, the optical parameters changing abruptly across the inclusion boundaries.The process of relaxing locally the smoothness penalties is likely to diminish or remove typical overshooting phenomena creating disturbing artifacts in the reconstruction images.
The algorithm has been tested with both synthetic data and with phantom data.The synthetic data corresponds to a two-dimensional target, and both blocky and smoothly varying optical parameter models are considered.The forward model in the synthetic data simulation is the diffusion approximation.In this case the algorithm was able to get rid almost completely of the crosstalk between the absorption and diffusion coefficients, whereas traditional smoothness penalized algorithms are corrupted by artifacts.The reconstructed inclusions exhibit sharp boundaries, even if the data was generated with a smoothly varying model.This result is not surprising as it is in line with the prior belief but it should be taken into account when the user is in doubt about the blocky structure of the target.The absence of boundary artifacts in the reconstruction is the result of implementing the smoothness penalty using a statistical analysis of the prior image variances.
A second set of tests was performed using phantom data corresponds to a cylindrical target with blocky inclusions, and a three-dimensional diffusion approximation model.To suppresses the effects of the modelling errors, difference data was used.Although the optical fibre contacts may change between the two targets and therefore not cancel in the difference data, the computed reconstructions show little boundary effects.With real data crosstalk is not completely removed, but since the adaptive iterative algorithm considerably improves the contrast between the inclusions and the background without increasing the crosstalk artifacts, the latter becomes Fig. 7. Cross sections of the reconstructed absorption (top) and scattering (bottom) coefficient through the plane z=0 (see Fig. 5).The reconstructions on the left are before adaptation, corresponding to a homogenous smoothness penalty, while those on the right are after eight adaptation sweeps.visually less prominent.The remaining crosstalk artifacts indicate that the coupling between the scattering and absorption coefficients is in the higher order terms of the harmonic expansions of the radiative transfer equation and therefore not captured by the lowest order, or diffusion, approximation.Hybrid methods that use the radiative transfer equations in the regions where the diffusion approximation is not expected to be valid have been suggested [25].Altenatively, the modelling error due to the low order approximation can be taken into account in the Bayesian statistical framework, see [5,26,27,28].
The choice of a rather coarse grid for the discretization of the optical parameters, which has a regularization effect on the solution, cannot resolve well sharp edges accurately, thus causing some artifacts near the boundary of the inclusions.The modelling of the noise in the optical tomography measurements is not a trivial issue, in particular when the data is in the amplitude-phase format.Here a Gaussian approximation with a semi-empirical variance model was used, and the scaling of the noise covariance was absorbed by the regularization parameter when phantom data was used.Proper modelling of the noise is particularly important when absolute rather than difference data need to be used.A detailed analysis will be the topic of future work.

A. Variance adjustment
In this appendix, we discuss the details of the construction of the smoothness penalty and the variance adjustment at the boundary.
Consider the block partitioning ( 14) of the matrix L. Since the block L int ∈ R N int ×N int corresponds to a discrete Laplacian with homogenous Dirichlet data, it is invertible, while the matrix L bdry ∈ R N bdry ×N bdry has a one-dimensional null space spanned by the unit vector 1 = 1, 1,...,1] T ∈ R N brdy .Therefore it follows that the null space of L is also one-dimensional, spanned by the unit vector 1 ∈ R N .Consequently, given Lp, it is possible to recover the vector p only up to an undetermined additive constant which can be specified by, e.g., specifying the mean value μ over the boundary values p bdry .We therefore define the folding matrix whose range is an (N bdry − 1)-dimensional subspace in R N bdry perpendicular to the null space of L bdry , and we reparameterize p bdry as With this representation, where Without loss of generality, we may assume here that μ = 0, i.e, r = 0. Consider the Gaussian density where the matrix K α ∈ R N×(N−1) has full column rank.We are interested in the covariance of the components of the vector p with respect to this density.Consider an interior node value p j , expressed in terms of p as where v is the least squares solution of the linear system where K T α ) + is the pseudo-inverse of K T α .Here we used the identity (K T α K α ) −1 = K + α (K + α ) T and (K + α ) T = (K T α ) + .The solution is found by solving the normal equations, or, in view of the equation ( 21), implying that the solution is and therefore, Similarly, we compute the variance of a boundary node value.Denoting by n k ∈ R N bdry a unit vector with a single non-zero element, the corresponding boundary node value can be expressed as and the variance of the value is found to be var(p k ) = u 2 , where u is the least squares solution of the system or, equivalently, leading to the equation

Fig. 1 .
Fig. 1.Solid lines presents the triangular mesh.Dashed line is a voronoi cell Ω i connected to node x i .

Fig. 3 .
Fig. 3. Reconstructions of optical parameters μ a (a) and κ (b) from simulated data from the blocky inclusions model.The reconstructions on the left of each panel are those obtained after the first Gauss-Newton iteration, thus corresponding to a homogenous smoothness penalty without adaptation, while on the right are those obtained after 20 iteration sweeps.Panels (c) and (d) show the respective profiles along the circular contour indicated in Fig. 2, traversed counterclockwise.The solid line is the true target, the dotted line the solution after the first iteration and the dashed line the final reconstruction.

Fig. 4 .
Fig. 4. Reconstructions of optical parameters μ a (a) and κ (b) from simulated data from the smooth inclusions model.The reconstructions on the left of each panel are those obtained after the first Gauss-Newton iteration, thus corresponding to a homogenous smoothness penalty without adaptation, while on the right are those obtained after 20 iteration sweeps.Panels (c) and (d) show the respective profiles along the circular contour indicated in Fig. 2, traversed counterclockwise.The solid line is the true target, the dotted line the solution after the first iteration and the dashed line the final reconstruction.

Fig. 5 .
Fig. 5.The geometry of the phantom with the diameter of 69.25 mm and height of 110 mm.The 16 sources (marked with ×) and 16 detectors (marked with •) are located in two rows on the cylindrical surface.A two dimensional slice of the reconstructions are shown along the plane of intersection at z = 0 (red) .The locations of the inclusions are projected on the bottom.

Fig. 6 .
Fig. 6.Histograms of the logarithm of the amplitude (left) the phase (right) from a repeated measurement with one source/detector pair.The red line shows a Gaussian with the same mean and variance as the sample.The values of the abscissae are arbitrary, as the data are not calibrated.
2008 OSAwhere e j ∈ R N int is jth unit coordinate vector.Denoting by E the expectation with respect to the probability density(22), the variance of the component is var(p j ) = E p 2

Table 1 .
Optical parameters of the simulated object.The numbering of the inclusion is in agreement with Fig.2.17 = 272 measured amplitudes and phases.Due to the wide dynamical range, amplitudes are routinely expressed in the logarithmic scale, corresponding to the model of Section 2.1.