Nonlinear greedy sparsity-constrained algorithm for direct reconstruction of fluorescence molecular lifetime tomography

: In order to improve the spatial resolution of time-domain (TD) fluorescence molecular lifetime tomography (FMLT), an accelerated nonlinear orthogonal matching pursuit (ANOMP) algorithm is proposed. As a kind of nonlinear greedy sparsity-constrained methods, ANOMP can find an approximate solution of L 0 minimization problem. ANOMP consists of two parts, i.e., the outer iterations and the inner iterations. Each outer iteration selects multiple elements to expand the support set of the inverse lifetime based on the gradients of a mismatch error. The inner iterations obtain an intermediate estimate based on the support set estimated in the outer iterations. The stopping criterion for the outer iterations is based on the stability of the maximum reconstructed values and is robust for problems with targets at different edge-to-edge distances (EEDs). Phantom experiments with two fluorophores at different EEDs and in vivo mouse experiments demonstrate that ANOMP can provide high quantification accuracy, even if the EED is relatively small, and high resolution.


Introduction
Fluorescence molecular tomography (FMT) is increasingly utilized for gene therapy, cancer diagnosis and drug discovery [1,2]. According to the illumination source, FMT can be classified into three modes: the continuous wave (CW) mode, the frequency-domain (FD) mode, and the time-domain (TD) mode. Compared to the other modes, the TD mode can acquire the temporal point spread functions and provide information at all frequencies, so it can be used to reconstruct fluorescence molecular lifetime tomography (FMLT). Fluorescence molecular lifetime is sensitive to physiological factors of the environment, such as tissue oxygenation, pH and glucose concentration [3]. Fluorescence molecular lifetime has been adopted to observe the fluorescence response energy transfer [4], to reduce the crosstalk between different fluorescent targets [5] and to improve the resolution of molecular yield tomography [6].
As for the reconstruction of FMLT, there are mainly two categories of methods. One is based on transforms, such as Fourier transform [7] and Laplace transform [8], which utilize only part of the characteristic information and rely on the selection of transformation factors. The other is directly based on the TD strategy and can make the best of all the measurement data. We previously developed an L1 regularization projected steepest descent (L1PSD) algorithm based on the TD strategy [9]. Simulations and phantom experiments demonstrated that the L1PSD algorithm performed well in localization of fluorescent targets and quantification of fluorescence lifetime, and provided high contrast between different fluorescent targets. However, the main drawbacks of L1PSD are that its resolution is poor and its solutions are not sparse enough for small targets. In the reconstructed FMLT images, the true targets are submerged in strong background noises. For the cancerous tumors in the early stages, the targets labeled with the fluorescent probes are typically small, namely sparse [10]. So effective sparsity-constrained methods for FMLT reconstruction are of importance. In order to enhance the spatial resolution of FMLT and obtain sparse reconstruction results for small-target problems, a method that gives higher priority to the values of the true targets (i.e., effective values) is needed. Greedy methods are known to perform well in this respect [11][12][13]. They can find an approximate solution of L 0 minimization problem, and are more suitable for problems with very sparse solution [14].
For NCS, L1 regularization is often adopted to induce sparsity in the solution, based on the assumption that the cost function is convex everywhere. L1 regularization is effective for regression with nonlinear models [28][29][30]. However, L1 regularization methods have some intrinsic drawbacks, which might cause unsatisfactory sparsity of the solution. Dobson et al. applied L1 regularization methods to the generalized linear models (GLM) [37]. In order to guarantee the error bound, namely the distance between the estimated and true statistical parameters, they introduced the notion of Restricted Strong Convexity (RSC). Around the true parameter, RSC requires a lower bound on the curvature of the cost function in a restricted set of directions. With a projected gradient descent method to solve the L1 regularized cost function, Agarwal et al. [31] guaranteed the accuracy by using a slightly different notion of RSC and restricted smoothness (RSM). Around the true optimum solution, RSM bounds the curvature of the cost function in a restricted set of directions. However, curvature bounds depend on the location of the true optimum solution for some cost functions, like loss functions in GLM [36]. Moreover, for the majority of L1 regularization methods, the true parameter is assumed to be bounded. For example, Bunea et al. recognized that the error bound for logistic regression with L1 regularization relies on the true parameter [28]. The selection of the regularization coefficient, based on the results reported in [29], implicitly requires the length of the true parameter to be short enough. So the methods with L1 regularization might not be suitable for all nonlinear models.
Greedy methods give higher priority to the most effective values and can find an approximate solution of L 0 minimization problem. They can perform well for nonlinear problems. Given some properties of the objective functions, such as the restricted diagonal deviation property, orthogonal matching pursuit (OMP) can find the exact support set for nonlinear models [32,33]. Iterative hard thresholding algorithm can accurately recover sparse signals from few nonlinear observations, under conditions similar to linear compressed sensing [34]. Gradient hard thresholding pursuit algorithm performs well in sparse logistic regression and sparse precision matrix estimation [35]. Gradient support pursuit algorithm can guarantee a bounded distance between the true sparse optimum solution and the reconstructed sparse vector when the cost function has a stable restricted hessian matrix or a stable restricted linearization [36]. Therefore, greedy methods can be powerful for sparsity-constrained problems.
In order to improve the resolution of FMLT reconstruction, greedy methods are adopted in this paper. Among different greedy methods, OMP is widespread for its simplicity and high efficiency. Nonlinear OMP is recommended for its precise reconstruction given some properties of the objective functions, such as restricted diagonal deviation property [32,33]. In this paper, an algorithm based on nonlinear OMP is proposed for FMLT reconstruction. The proposed algorithm is named accelerated nonlinear orthogonal matching pursuit (ANOMP) algorithm. It is divided into two parts, i.e., the outer iterations and the inner iterations. Each outer iteration selects multiple elements, instead of only one, to expand the support set of the inverse lifetime I τ , so as to accelerate the convergence. The inner iterations obtain an intermediate estimate of I τ , based on the support set estimated in the outer iterations. The selection of elements in the outer iterations is based on the gradients of mismatch error. For inner iterations, L1PSD is adopted to guarantee the stability and acquire accurate estimation results [9]. The stopping criterion for the outer iterations is based on the stability of the maximum reconstructed inverse lifetime values and is robust for problems with targets at different edge-to-edge distances (EEDs). ANOMP is based on the TD strategy directly, so it can make the best of all the data collected. Phantom experiments under different conditions and in vivo mouse experiments are performed to demonstrate that ANOMP can provide high resolution, high accuracy of localization and quantification, and good contrast at the same time.

Forward model for light propagation
To calculate the distribution of the excitation light and the Green's function of the emission light, the telegraph equation [39] is selected as the forward model, where ( ) D r denotes the diffusion coefficient, c denotes the velocity of light, r is the spatial coordinate, t is the time, ( , ) r t Φ denotes the density of photon and ( , ) S r t is the source term. S r t r t E r t = Φ * [8], where * denotes the temporal convolution operator. The introduction of the inverse lifetime I τ is to prevent the singularity of the zero points in lifetime images [9]. For the background, I τ is assumed to be 0.
The emission signal detected can be formulated as where ( , , ) m d G r r t denotes the Green's function of the emission light, d r denotes the spatial coordinate of the detector and s r denotes the spatial coordinate of the source.

FMLT reconstruction based on ANOMP algorithm
The image domain is discretized and decomposed into G N voxels, so the lengths of the yield map af ημ , lifetime map τ , and the inverse lifetime map I τ are all G N . With the prereconstructed yield map [9], the reconstruction of FMLT can be divided into the outer iterations and the inner iterations. The outer iterations expand the support set of I τ continuously. The elements with the highest gradients of L2 norm of the fitting error are considered to be effective and are selected. The inner iterations, i.e., the estimation steps of each outer iteration, obtain an intermediate estimate of I τ , based on the estimated support set.
To avoid dealing with the singularity of the zero points, the inverse lifetime map I τ , instead of FMLT image τ , is reconstructed first [9]. With the yield map af ημ obtained, the FMLT problem can be developed as where the fluorescent measurement vector m Φ is rearranged in order of excitation sources, detectors and time points, and has a length of S D I × × , where S is the number of excitation sources, D is the number of detectors and I is the number of discrete time points. The ith element of m Φ and the nonlinear forward function F is given by Nonlinear OMP, as the algorithm of the outer iterations, is used to obtain a sparse solution of the FMLT problem (3).
OMP is an iterative greedy algorithm that finds an approximate solution of the L 0 minimization problem. Besides, for the target algorithm, L 1 regularization is adopted to obtain an intermediate estimate in the inner iterations. Similar to that of linear OMP, the optimization function of the nonlinear OMP for inverse lifetime is given by where F ∇ is the local gradient of F. The selection criterion of elements is based on the gradient of mismatch error , which is referred to as effectiveness gradient in this paper.
Because FMLT is a high-dimensional nonlinear problem, multiple elements, instead of only one, are selected in each iteration to accelerate the convergence. So the selection criterion turns to be where z is the magnitude of the effectiveness gradient. Equation (8) extracts the indices of S N components of vector z that have the largest magnitudes. The gradient of F and the corresponding elements are formulated as Then the new indices should be merged with the estimated support set, as where Γ is the support set.
With the updated support set, the inner iterations are conducted to obtain an intermediate estimate of I τ . According to the principle of traditional greedy methods [33], the new estimate should be arg where c is the complement operator. However, the reconstruction of FMLT is a severely ill-posed problem, because of the huge amount of noise, the severe redundancy of measurement data, the strong scattering of light and the high nonlinearity of the forward model (Eq. (3)) under physical conditions. For reconstruction of fluorescence yield tomography, regularization methods such as the truncated singular value algorithm are introduced in greedy algorithms to reduce the illposedness [11]. However, the direct reconstruction of FMLT is a nonlinear problem, so the truncated singular value algorithm is not suitable. In this paper, L1PSD method is employed, which can guarantee the stability of convergence and enhance the anti-noise capability of sparsity-constrained reconstruction [9]. So the optimization function for inner iterations is given by In order to reduce the computational complexity, the components of F in Eq. (3) where c k Γ denotes the ith element of the support set. Mathematically, there is no order for the support set. The elements of the support set are assumed to be arranged in ascending order in this study.
For the inner iterations, the stopping criterion is modified as The stopping criterion takes two kinds of situations into consideration. The first one is that the iterative process reaches a convergence for the inner iterations, for which The second one is that there is an oscillation of convergence, for which 0 R Δ < . The second situation usually occurs at the very beginning of the outer iterations, when the support set is too small. When either of the two situations occurs, the stopping criterion takes effect and the inner iterations are interrupted.
For outer iterations, the elements possessing the highest effectiveness gradients are considered to be the most effective and are selected. The stopping criterion is the key factor that influences the sparsity and accuracy of the reconstruction results. If the true sparsity is known, the stopping criterion can be set as the lower limit of reconstruction sparsity. However, the true sparsity is difficult to obtain in practice. The stopping criterion can also be based on the effectiveness gradient, like the upper limit of the maximum effectiveness gradient, which means that the outer iterations are interrupted if the maximum effectiveness gradient exceeds a preset upper limit. However, the effectiveness gradients of the background are strongly influenced by the EED of targets, so the stopping criterion based on the effectiveness gradients is not robust.
For non-greedy methods, the stopping criteria can be based on the difference of reconstruction results of adjacent iterations [42]. In [8], the maximum reconstructed inverse lifetime values of the targets are confirmed to be good approximations of real values of targets and have great significance for quantification. For ANOMP, the difference of reconstruction results of different outer iterations is described by the difference of the maximum reconstructed inverse lifetime values. Figure 1(a) is the maximum reconstructed inverse lifetime values of different outer iterations of Expt. 1 described below, and Figs. 1(b)-1(e) are the corresponding reconstructed inverse lifetime maps of the 1st, 4th, 8th and 13th outer iterations, respectively. At the very beginning of the outer iterations, when the reconstruction sparsity is much smaller than the real one, the maximum reconstructed values are not close to the real values and are unstable, because the support set has not covered the real distribution of the targets and the reconstruction algorithm of the inner iterations is difficult to recover the real values. So the reconstruction with a too small support set cannot provide accurate and stable results. After the support set covers the real distribution of the targets, the reconstructed inverse lifetime values of the targets will remain close to the real values, so the maximum reconstructed values can keep stable. When the maximum reconstructed values remain stable, the iterative process is thought to reach a convergence and excessive iterations will only affect the resolution and sparsity, so the iterations are interrupted.
Set v Γ to be the set of maximum reconstructed values of the outer iterations that belong to the stable part. The range of v Γ should be below a preset lower limit, i.e., If the range of v Γ violates (15), v Γ will be reset, and only the maximum reconstructed value of the current step will be included, i.e., The number of outer iterations is a compromise between accuracy and sparsity. In Fig.  1(a)  The main steps of the proposed algorithm are summarized as below.

Parameters setting
For the outer iterations, set the number of the selected elements of each outer iteration , the lower limit of stopping criterion ζ , the lower limit of the range of the maximum reconstructed values ω , the initial inverse lifetime 0 0 I τ = , the initial iteration number k = 1, the initial support set 0 Γ Ø = ; for the inner iterations, set the initial inverse lifetime , the maximum ratio between the difference of residual errors of adjacent iterations and the initial residual error ξ .

ANOMP
(1) Compute the local gradient 1 1 (2) Extract the indices of the elements that are the most effective: (3) Update the support set  (4.4) Obtain k I τ corresponding to the reconstructed innI τ ; , go to step 7; To compromise between the computational time and the robustness for convergence, the number of the selected elements of each outer iteration S N is empirically set to be 10. The lower limit of the stopping criterion ζ and the lower limit of the range of the maximum reconstructed values ω are set to be 9 1 0.01 10 s − × and 9 1 0.1 10 s − × for all the experiments, to ensure convergence of the outer iterations. For the inner iterations, the maximum ratio between the difference of residual errors of adjacent iterations and the initial residual error ξ is set to be 0.007. After being normalized by the sum of the measured fluorescent values, the empirical range of the initial regularization parameter 0 λ is between 15 5 10 − × and 14 5 10 − × . If 0 λ is beyond 14 5 10 − × , the residual errors in the inner iterations will oscillate, no matter how many elements are included in the estimated support set. If 0 λ is below 15 5 10 − × , the reconstruction problem will turn into a severely ill-posed one and the reconstruction accuracy will decrease. In this paper, 0 λ is set to be 14 1 10 − × for all the experiments.

Quantitative metrics for FMLT reconstruction
In order to evaluate the performance of ANOMP, several quantitative metrics are selected. The mean absolute error (MAE) measures the closeness between the reconstructed results and the real values. Because a small inverse lifetime value will turn into a very big lifetime value and the theoretical lifetime of the background is infinite, the MAE is suitable for the reconstructed inverse lifetime tomographic images, instead of the final lifetime images. The MAE is defined as where recon I τ and true I τ are the reconstructed inverse lifetime and the real inverse lifetime of the whole object, including the background and the targets.
The contrast-to-noise ratio (CNR) [43] is adopted to indicate if the real targets are well recovered or submerged in the noise, i.e., target detectability. The CNR is also suitable for the inverse lifetime and is defined as where recon τ and true τ are the reconstructed and real lifetime values of a fluorescent target, respectively. recon τ is defined as the valley lifetime value of the reconstructed target. In this paper, the bigger value of the REs of two fluorescent targets is selected to be an evaluation metric, called as the maximum relative error ( max RE ). In order to evaluate the resolution of the reconstructed lifetime images, the separability of two fluorescent targets is considered. If the lifetime values along the line between the centers of the two targets are all smaller than 3 times of the minimum reconstructed lifetime value of the two targets, the targets are considered to be inseparable; or otherwise the targets are separable.
All the evaluation metrics are used in the phantom experiments. In the in vivo animal experiments, the exact locations of targets are unknown, so only max RE and the separability are evaluated.

Phantom experiments
In order to test the performance of ANOMP, three phantom experiments, Expt. 1, Expt. 2 and Expt. 3, with fluorophores at different EEDs (9 mm, 5 mm and 2 mm), were conducted. A fiber-coupled, time-correlated and single-photon-counting (TCSPC) system was built, the schematic of which could be found in our previous work [9]. A femtosecond laser generator (Spectra-Physics, Newport Corporation, Irvine, CA) provided the repetitive ultrashort 775 nm-wavelength pulse (80 MHz, 100 fs pulse-width). The detector consisted of a multi-anode photomultiplier (PMT) (PML-16-C, Becker and Hickl, Berlin, Germany) and the TCSPC module. Different from the system previously developed [9]), a group of filters, including an achromatic doublet (AC254-030-B, Thorlabs, Newton, NJ), two bandpass fluorescence filters with a center wavelength of 840 nm (ff01-840/12-25, Semrock, Rochester, NY; XBPA840, Asahi Spectra, Torrance, CA) and a long-pass edge filter (BLP01-785R-12.5, Semrock, Rochester, NY), was set in front of the detector, to eliminate most of the excitation light . The measurement data were subsampled and truncated, to reduce the computational cost. The final temporal resolution was set to be 75 ps and time gate was set to be 5.175 ns. A set of measurement data consisted of 18 projections, and each projection included 11 probe points and a field of view (FOV) of 220°. So S = 18, D = 11, and I = 69. As for the phantom setup, a cylindrical phantom (30 mm in diameter) filled with 1% intralipid solution (with absorption coefficient of 0.02 cm −1 and reduced scattering coefficient of 10 cm −1 ) was used as the background. Two small cylinders (4 mm in diameter and 10 mm in height) filled with 10 μM indocyanine green/dimethyl sulphoxide (ICG/DMSO) were used as fluorescent targets. The real lifetime and inverse lifetime of the targets are 1.03 10 s − × [44]. The ICG has a peak excitation wavelength at 780 nm in DMSO and a peak emission wavelength at 830 nm, which are close to the excitation wavelength and the central wavelength of the fluorescence filters in the system [45].
In the reconstruction, only the part containing fluorophores was taken into account. The target domain was 30 mm in diameter and 10 mm in height. The geometry model was discretized into 3,145 nodes and 15,407 tetrahedral elements. The Galerkin-FEM [39,40] was adopted to calculate the distribution of the excitation light and the Green's function of the emitted light. Then the target domain was discretized into 4,182 voxels of size  Table 1 lists the quantitative metrics of the reconstruction results of the phantom experiments. The MAEs of ANOMP are all lower than 1/3 of those of L1PSD. The lower MAEs of ANOMP indicate that ANOMP has higher quantification accuracy than L1PSD. The higher CNRs of ANOMP demonstrate that the reconstructed targets can be detected better with ANOMP. For Expt. 1 and Expt. 2, L1PSD can guarantee the quantification accuracy of the reconstructed targets and has low max RE . However, for Expt. 3, when the EED is only 2 mm and the reconstructed background is too strong, L1PSD has a large max RE (27.48%). ANOMP can provide high quantification accuracy of the reconstructed targets with different EEDs ( max RE <10%). As for the separability, the two targets cannot be separated with L1PSD. By contrast, ANOMP can effectively separate the two targets.

In vivo mouse experiments
To further test the performance of ANOMP in practical applications, in vivo mouse experiments were conducted. The experiments were approved by the Ethics Committee of Tsinghua University. An 8-week BALB/c nude mouse was anesthetized by 0.2 ml 1% Nembutal during the experiments. Two small cylinders (4 mm in diameter and 10 mm in height) filled with 10 μM ICG/DMSO were used as the fluorescent targets, and were implanted into the abdomen of the mouse. The abdomen could be approximated with homogeneous model [46]. Then the mouse was put into a cylindrical container (30 mm in diameter) and its arms and legs were stuck to the inside wall of the container with waterproof rubberized fabric. 1% intralipid solution, with optical parameters close to the abdomen of mouse [47], was poured into the container and covered the abdomen of the mouse. The methods and parameters of data collection and reconstruction were the same as those used in the phantom experiments.
The reconstruction results are shown in Fig. 5. of ANOMP and L1PSD are as small as 3.65% and 4.10%, respectively. The reconstructed targets of ANOMP are separable, while those of L1PSD are inseparable.

Discussions
For FMLT reconstruction, the L1PSD algorithm based on the TD strategy has previously been demonstrated to perform well in localization of fluorescent targets and quantification of fluorescence lifetime, and provide good contrast between different fluorescence targets [9]. However, L1PSD has poor resolution and its solution is not sparse enough for small targets. Moreover, when the EED of two targets is small (Expt. 3), the strong background noise could even affect the quantification accuracy of the reconstructed targets ( max RE = 27.48%). In order to improve the resolution of FMLT, ANOMP, a nonlinear greedy sparsity-constrained algorithm, is proposed. Greedy methods can find an approximate solution of L 0 minimization problem. ANOMP consists of two parts, i.e., the outer iterations and the inner iterations. The outer iterations expand the support set continuously, based on the gradients of mismatch error. The inner iterations utilize the L1PSD algorithm to obtain an intermediate estimate of I τ , based on the estimated support set. The stopping criterion for the outer iterations is based on the stability of the maximum reconstructed values and is robust for problems with targets at different EEDs. Only after the support set covers the real distribution of the targets, the reconstructed inverse lifetime values of the targets can be accurate and remain close to the real values; and the maximum reconstructed values can keep stable.
Phantom experiments with two fluorophores at different EEDs were carried out to test the performance of ANOMP. The reconstruction results indicate that ANOMP can provide high quantification accuracy, good detectability of the fluorescent targets, and good separability of two targets. Especially when the EED is only 2 mm, the reconstructed background is too strong and thus affects the quantification accuracy for L1PSD. In comparison, ANOMP still behaves well in quantification of the fluorescent targets. In addition, in vivo mouse experiments were conducted to further evaluate the performance of ANOMP in practical applications. ANOMP still provides high quantification accuracy for the fluorescent targets and thoroughly separates the fluorophores implanted in the abdomen of the mouse. By contrast, the reconstructed background of L1PSD is too strong and makes the reconstructed targets inseparable. The results of the phantom experiments and in vivo mouse experiments demonstrate that, ANOMP is effective for the problems with sparse targets.

Conclusion
In conclusion, an accelerated nonlinear orthogonal matching pursuit algorithm, as a kind of nonlinear greedy sparsity-constrained reconstruction method, is proposed to enhance the spatial resolution and sparsity of the solution for fluorescence molecular lifetime tomography problems with sparse targets. The reconstruction results of the phantom experiments with different edge-to-edge distances and the in vivo mouse experiments demonstrate that ANOMP can provide high quantification accuracy for the fluorescent targets, even if the EED is small, and high resolution. However, the selection of regularization parameter, i.e., λ0, is still empirical. Future work will be focused on finding a formula for the determination of λ0, improving the shapes of the reconstructed targets and applying FMLT to in vivo cases.