Regularized reconstruction based on joint L 1 and total variation for sparse-view cone-beam X-ray luminescence computed tomography

: As an emerging hybrid imaging modality, cone-beam X-ray luminescence computed tomography (CB-XLCT) has been proposed based on the development of X-ray excitable nanoparticles. Owing to the high degree of absorption and scattering of light through tissues, the CB-XLCT inverse problem is inherently ill-conditioned. Appropriate priors or regularizations are needed to facilitate reconstruction and to restrict the search space to a specific solution set. Typically, the goal of CB-XLCT reconstruction is to get the distributions of nanophosphors in the imaging object. Considering that the distributions of nanophosphors inside bodies preferentially accumulate in specific areas of interest, the reconstruction of XLCT images is usually sparse with some locally smoothed high-intensity regions. Therefore, a combination of the L 1 and total variation regularization is designed to improve the imaging quality of CB-XLCT in this study. The L 1 regularization is used for enforcing the sparsity of the reconstructed images and the total variation regularization is used for maintaining the local smoothness of the reconstructed image. The implementation of this method can be divided into two parts. First, the reconstruction image was reconstructed based on the fast iterative shrinkage-thresholding (FISTA) algorithm, then the reconstruction image was minimized by the gradient descent method. Numerical simulations and phantom experiments indicate that compared with the traditional ART, ADAPTIK and FISTA methods, the proposed method demonstrates its advantage in improving spatial resolution and reducing imaging time.

( ) ( ) ( ) S X n = Γ r r r (1) where S(r) is the light emitted, n(r) is the concentration of nanophosphors, Γ is the light yield of the nanophosphors, and X(r) is the intensity of X-rays at position r, which can be given by the Lambert-Beer law [9].
In the visible and NIR spectral window, biological tissues have the characteristics of high scattering and low absorbing. Therefore, the propagation model of the emitted light in biological tissues can be established by the diffusion equation (DE) [26]: where Ω is the image domain, Φ(r) is the photon fluence, μ a (r) is the absorption coefficient. To solve the diffusion Eq. (2), the Robin boundary conditions are usually applied [27,28], as shown below: where ∂Ω is the boundary of Ω , κ is the boundary mismatch parameter and ν represents the outward unit normal vector on the boundary. With the finite element method (FEM), Eq. (2) and Eq. (3) can be discretized into a matrix equation as: Here N is the distribution vector of nanophosphors, a ij and f ij are the elements of matrix A and F, respectively. ψ i (r) and ψ j (r) are the corresponding elements of discretized geometrical meshes of the imaging domain, and X(r) is the intensity of X-rays at position r.
Since the matrix A is positive definite, Eq. (4) can be further recast into where 1 M A F − = Γ , Φ represents the distribution vector of photon fluence. For optical tomography, only intensity values of Φ on the object surface could be measured, then Eq. (7) becomes = meas Φ WN (8) where Φ meas is the vector of photon fluence acquired on the object surface, and W consists of rows of the weight matrix M corresponding to surface measurements.

Inverse problem
The goal of the XLCT reconstruction is to estimate the nanophosphor distribution N from Φ meas . In practical application of XLCT, noise of the XLCT imaging system needs to be considered, and Eq. (8) becomes: where b represents the actual fluorescence signals measured on the surface of the imaging object, ς is the noise of the system, x = N represents the unknown distribution of nanophosphors in the imaging object. The objective function of the reconstructing method based on the joint L 1 and total variation regularization can be expressed as: where 1 L λ and TV λ are the regularization parameters.

Solution of the inverse problem
The solution to Eq. (10) is divided into two parts. First, the L 1 regular solution of reconstructed image is obtained by the FISTA algorithm. Then the gradient descent (GD) strategy is applied to deal with the TV constraint in the reconstruction of XLCT. The flowchart of the proposed method is recapitulated in Fig. 1. Considering that x represents the distribution of nanophosphors, the non-negative constraint is applied before the TV minimization of x. The implementation process of these two parts is described as below, respectively. 2.3.1 FISTA for solving the L 1 regularization problem x , the optimization problem of the L 1 regularization term is changed to: 11) In this study, the FISTA algorithm is used to solve Eq. (11) and a simple flowchart is summarized in Algorithm 1.
Step k: (k>1) Compute L k (Lipschitz constant) based on backtracking line-search strategy

GD for TV constraint in the reconstructed image
The second part of the optimized problem (Eq. (10) can be expressed as the TV minimization problem, which is shown as: After getting the current L 1 regularization solution in the first part, the GD method is used to solve Eq. (12), in which the derivative of the image TV with respect to each pixel is approximately written as: where δ is a small positive number in order to avoid the denominator to zero.

Experimental design
Numerical simulations and phantom experiments were performed to evaluate the performance of the proposed method based on the custom-developed CB-XLCT system in our laboratory. For comparison, three traditional methods, algebra reconstruction technique (ART), adaptive tikhonov regularization (ADAPTIK), and fast iterative shrinkage-thresholding (FISTA) algorithm, were also implemented for solving Eq. (9) with no regularization, L 2 regularization and L 1 regularization, respectively. The relaxation factor was set as 0.1 for ART algorithm. For ADAPTIK and FISTA algorithm, the regularization parameters were empirically set to 10 −4 and 10 −5 , respectively. In this study, to make the calculating process consistent for a fair comparison between the FISTA algorithm and the proposed algorithm, the L 1 regularization parameter was set as 10 −5 in the proposed algorithm. For the TV regularization, the gradient descent (GD) method was used to solve the minimization problem, in which the step length and total descent numbers were set as 0.1 and 10, respectively. For ART, ADAPTIK, FISTA and the proposed L 1 -TV algorithm, the iterative numbers were empirically set as 3000, 20, 3000, 30, respectively.

Numerical simulations setup
Firstly, numerical simulations were carried out with a cylinder phantom. The cylinder phantom ( Fig. 2) was composed of a large cylinder tank (3.0cm in diameter and 2.3cm in height) and two small tubes (4mm in diameter and 4mm in height) filled with Y 2 O 3 :Eu 3+ (50mg/ml), which were placed inside the cylinder as the targets. The tank was filled with a mixture of water and intralipid and the optical properties were set as  [29,30]. To evaluate the reconstruction results of two targets with different distances, numerical simulations were performed with the targets positioned at distances of 3, 2 and 1mm (edge-to-edge distance between the two targets), respectively. For phantom simulations, the imaging model is discretized into 2,695 nodes and 12,285 tetrahedral elements in a 3D region of 3.0 × 3.0 × 2.3 cm 3 . To make the results comparable with phantom experiments, in the numerical simulations, the distance from the X-ray source to the rotation center of the imaging system was set as 26.3cm and the EMCCD camera was positioned perpendicularly to the X-ray source-detector axis, with a distance of 45.0cm between the EMCCD and the rotation center. The voltage and current of the cone beam X-ray source were set as 50kVp and 1mA, respectively. The simulated 24 projections were obtained every 15° during a 360° scan. After optical luminescence was obtained at different angles, white Gaussian noise was added into all projections with a zero-mean and signal-to-noise ratio (SNR) of 30DB to simulate noisy measurements.

Phantom experiments setup
In order to further validate the performance of the proposed method with real luminescence measurements, phantom experiments were performed by using a custom-developed CB-XLCT system, which was shown in Fig. 3. The system mainly included a micro-focus X-ray source (Oxford Instrument, U.K.), a rotation stage, a flat-panel X-ray detector (2923, Dexela, U.K.) for high-resolution CT imaging, and an electron-multiplying charge-coupled device (EMCCD) camera (iXon DU-897, Andor, U.K.) for optical imaging. The maximum voltage of the X-ray source was 80kVp with the maximum power of 80W. The distances from the Xray source to the rotation center of the imaging system and to the flat-panel detector were 26.3cm and 86.3cm, respectively. The EMCCD camera coupled with a Nikon 50-mm f/1.8D lens was positioned at 90° towards the X-ray source-detector axis, with a distance of 45.0cm to the rotation center. The minimum cooling temperature of EMCCD camera was −80°, which could effectively reduce the dark noise. The configuration of the physical phantom used in imaging experiments was shown in Fig. 4, which was based on observations from the simulation studies. A glass cylinder (4.0cm in diameter, 4.0cm in height) containing a mixture of water and intralipid was fixed on the rotation stage. Two small glass tubes (3mm in diameter) filled with Y 2 O 3 : Eu 3+ (50mg/ml) were symmetrically placed in the cylinder to simulate two targets. The edge-to-edge distances (EED) between the two tubes were 5.5mm and 1.8mm. During imaging experiments, the voltage and current of the X-ray source were set as 50kVp and 1mA, respectively. The phantom was rotated from 0° to 360° and the optical images were obtained every 15° by the EMCCD camera. The exposure time of the EMCCD camera was set as 1s, with the EM gain and binning set as 260 and 1 × 1. For the CT imaging, the projections were acquired with an angular increment of 1° on a 360° circular orbit, while the voltage and current of the X-ray source were set as 50kVp and 1mA, and the acquisition time for each projection was 150ms. The Feldkamp-Davis-Kress (FDK) algorithm [31,32] was used for CT reconstruction.

Quantitative evaluation
The quality of reconstructed CB-XLCT images was evaluated quantitatively by several indexes including the location error (LE), dice similarity coefficient (DICE) and contrast-tonoise ratio (CNR) [30].
LE evaluates the localization accuracy of the reconstructed target, which is defined as the Euclidean distance error between the centers of true and reconstructed targets: where L r and L t denote the centers of the reconstructed and true targets, respectively. DICE reflects the similarity of the true and reconstructed targets and can be calculated by: where ROI t and ROI r denote the regions of true and reconstructed targets, respectively, and |⋅| defines the number of voxels in a region. CNR is used for quantitative evaluation of noise and artifacts in reconstructed images, as shown below:

Numerical simulations
Firstly, the XLCT tomographic images of the targets positioned at different distances were reconstructed using 24 projections with different algorithms, as shown in Fig. 5. All the reconstruction results are normalized based on their maximum values. It can be seen for ART and ADAPTIK algorithms, the distribution of two targets can be effectively distinguished when the EED is 3 mm, but the task became difficult when the EED is 2 mm and 1 mm, with relatively large location errors, as shown in Figs. 5(d)-(i). Comparatively, the distribution of two targets can be effectively distinguished with FISTA and the L 1 -TV methods, when the EED is 2 mm. However, when the EED is 1 mm, no algorithms except for the proposed L 1 -TV algorithm can distinguish them, as shown in Figs. 5(m)-(o). To evaluate the performance of the proposed method with fewer projections, the XLCT images of the targets positioned at different distance were reconstructed using 4 projections, as shown in Fig. 6. It can be seen that based on ART and ADAPTIK algorithms, the distribution of two targets can be effectively distinguished when the EED is 3 mm, but it is difficult to be distinguished when the EED is 2 mm and 1 mm using 4 projections and compared with the reconstruction results using 24 projections, the reconstruction results using 4 projections yield worse target shape and larger location error, as shown in Figs. 6(d)-(i). Besides, there is background noise in the reconstruction results because the ill-conditioned of reconstruction is more serious with the decrease of the projection angle. Correspondingly, the distribution of two targets can be effectively distinguished when the EED is 2 mm, but it is difficult to be distinguished when the EED is 1 mm using 4 projections based on the FISTA method, as shown in Figs. 6(j)-(l). Compared with the method of ART, ADAPTIK and FISTA, the distribution of two targets can be effectively distinguished when the EED is 1 mm based on the proposed L 1 -TV algorithm although only 4 projections can be used, as shown in Figs. 6(m)-(o). Table 1 summarizes the quantitative evaluation of the reconstructions using different methods. Here LE1 and LE2 represent for the localization error of the reconstructed target 1 and target 2, respectively. For considering the prior information of the targets sparsity and local smoothness, the reconstruction results based on the proposed L 1 -TV method yield highest reconstruction quality in terms of the target shape and localization accuracy, among the four methods, which further confirm the observation in Fig. 6.  In order to compare the performance of different algorithms, the XLCT tomographic images of the targets positioned at different distance for phantom experiments were reconstructed using 24 projections, as shown in Fig. 7. It can be seen that for ART and ADAPTIK algorithms, the distribution of two targets can be effectively distinguished when the EED is 5.5 mm, but the task became difficult when the EED is 1.8 mm, with relatively large location errors, as shown in Figs. 7 (c)-(k). Compared with the method of ART and ADAPTIK, FISTA method can reduce background noise. However, it is also difficult to distinguish the distribution of the two targets when the EED is 1.8mm, as shown in Figs. 7(l)-(o).

Phantom experiments
Comparatively, when the EED is 1.8 mm, no algorithms except for the proposed L 1 -TV algorithm can distinguish them, as shown in Figs. 7(p)-(s). In order to further validate the performance of the proposed method for the fast imaging, the XLCT tomographic images of the targets positioned at different distance were reconstructed using 4 projection angles, as shown in Fig. 8. It can be seen that based on ART, it is difficult to be distinguished when the EED is 5.5 and 1.8 mm using 4 projection angles, as shown in Figs. 8(c)-(f). Based on the ADAPTIK method, the distribution of two targets can be effectively distinguished when the EED is 5.5 mm, but it is difficult to be distinguished when the EED is 1.8 mm using 4 projection angles. With sparse regularization, the distribution of two targets can be effectively distinguished when the EED is 5.5 mm, but it is difficult to be distinguished when the EED is 1.8 mm using 4 projection angles based on the FISTA method, as shown in Figs. 8(l)-(o). Figures 8(p)-(s) show that with our proposed L 1 -TV algorithm, however, even the targets with a distance of 1.8 mm could be separated successfully, demonstrating its advantage in improving spatial resolution in real imaging experiments. Table 2 summarizes the quantitative evaluation of the reconstructions using different methods with 4 projections. The results indicate that even in phantom experiments, the proposed L 1 -TV algorithm still performs better in target location, shape recovery and image contrast, when compared to the conventional reconstruction methods, which further confirm the observation in Fig. 8.
CB-XLCT imaging using 4 projections requires less data acquisition time than that using 24 projections. In addition, due to reduced measured data, the dimension of the system matrix is reduced greatly, which is 14237 *8810 for 24 projections and 14237*1469 for four projections. Therefore, Compared with XLCT imaging with 24 projections, reconstructions with 4 projections could implement fast imaging with reduced scanning time and memory footprint, as shown in Table 3.

Discussion and conclusions
In this study, a reconstruction approach based on joint L 1 and total variation regularization is proposed for the CB-XLCT inverse problem. The implementation of this method is split into the L 1 penalty and the TV penalty. The L 1 constraint is used to enhance the target sparsity and the TV constraint is used to maintain the local smoothness and preserve the edges of the targets of the reconstructed image. Numerical simulations and phantom experiments results confirm the superiority of the proposed method over the conventional ART, ADAPTIK and the FISTA methods. In numerical simulations, two targets could be effectively distinguished by the proposed method when the EED is 1 mm, while in phantom experiments, two targets could be resolved when the EED is 1.8 mm, demonstrating its advantage in improving spatial resolution. With the consideration of the prior information on target sparsity and local smoothness, the proposed method indicates its potential in improving the XLCT image quality in terms of target shape and localization accuracy, and in reducing imaging time by reconstructions with fewer scanning angles.
For the phantom experiments, the whole scanning time consist of the rotation time and the acquisition time of the EMCCD camera. In our imaging experiments, the speed of the rotation was 6 degrees per second and the exposure time of the EMCCD camera was set as 1s per view. That resulted in more time spent on the rotation process, where the acquisition time of 4 projections was quite limited. The current configuration of the imaging system makes the reduction of imaging time using the proposed method not obvious in phantom experiments. With the increase of rotation speed and the exposure time in in-vivo experiments, the benefit of the proposed method on imaging time reduction would be further recognized.
Although the system setup for numerical simulations and phantom experiments appeared similar, the results were not identical, especially when the target distance was 1.8 mm in phantom experiments. The possible reason may be that in numerical simulations, inaccuracies were mainly caused by simulation noise and numerical errors due to the ill-posedness of the system matrix. In contrast, in phantom experiments, inaccurate modeling of optical properties and measurement noise, geometric error of the imaging system, and other factors, would further deteriorate the reconstruction results.
In this study, the fast iterative shrinkage-thresholding (FISTA) algorithm is used to solve L 1 regular constraint. Besides FISTA algorithm, it can be solved based on the L 1 -Ls algorithm [33] and the separable paraboloidal surrogates algorithm [23,24]. In addition, the traditional TV model is adopted in the proposed L 1 -TV method. To further improve the performance of the proposed method, the edge-preserving total variation (EPTV) [34] and the adaptiveweighted total variation (AWTV) [35] can be used in the future studies. Nevertheless, due to the uncertainty of the CB-XLCT forward model, the reconstruction is very ill-conditioned. It is difficult to obtain high quality reconstructions in the phantom experiments, especially when targets are closer. In addition to studies on reconstruction methods, further investigations on constructing more accurate propagation model of photons, such as by using the radiation transfer equation, may provide additional benefit for CB-XLCT imaging.
In summary, a reconstruction approach based on joint L 1 and total variation regularization was proposed in this study. Compared to the traditional ART, ADAPTIK and FISTA method, the proposed method demonstrated its advantage in improving spatial resolution and reducing imaging time, which can promote the widely use of CB-XLCT in vivo.