Low-dose CT reconstruction via L1 dictionary learning regularization using iteratively reweighted least-squares

Background To reduce the radiation dose of CT (computed tomography), compressed sensing theory has become a hot topic, since it offers the possibility of high-quality recovery from sparsely sampled data. Recently, algorithms based on DL (dictionary learning) have been developed for the sparse CT reconstruction problem. However, the existing DL algorithm minimizes an objective with an L2-norm regularization term, so the reconstruction quality deteriorates as the sampling rate declines further. It is therefore essential to improve the DL method to meet the demand for further dose reduction. Methods In this paper, we replace the L2-norm regularization term with an L1-norm one. The proposed L1-DL method is expected to alleviate the over-smoothing effect of L2 minimization and preserve more image details. The algorithm solves the L1-minimization problem through a weighting strategy, iteratively solving weighted L2-minimization problems based on IRLS (iteratively reweighted least squares). Results In numerical simulations, the proposed algorithm is compared with the existing DL method (ADSIR, adaptive dictionary based statistical iterative reconstruction) and two other typical compressed sensing algorithms. The proposed algorithm is more accurate than the others, especially when the sampling rate is reduced further or the noise is increased. Conclusion The proposed L1-DL algorithm utilizes more prior information on image sparsity than ADSIR. By replacing the L2-norm regularization term of ADSIR with an L1-norm one and solving the L1-minimization problem by the IRLS strategy, L1-DL reconstructs the image more accurately.

However, high-quality CT images currently require a considerable X-ray radiation dose to the patient, which may entail a non-negligible lifetime risk of genetic or cancerous diseases [1]. This fact has become a major concern for clinical applications of CT scans. Therefore, this article focuses on reducing the radiation dose in CT while generating clinically qualified images.
In order to reduce the radiation dose, one direct way is to lower the mAs level in the CT data acquisition protocol. However, this approach results in an insufficient number of X-ray photons received by the detectors and hence increases the quantum noise level. This poses a great challenge even for advanced methods that take noise models into account; for example, PWLS (penalized weighted least-squares) based methods [2] can deal with noise-contaminated sinogram data only to some extent. As a consequence, the radiation dose cannot be reduced substantially by this approach if the reconstructed images need to be qualified for clinical diagnosis. Another way to reduce the imaging dose is to decrease the number of X-ray projections by sampling fewer angles. Yet this leads to serious streaking artifacts in images reconstructed by analytic algorithms like FBP (filtered backprojection) [3], since analytic algorithms require the number of projections to satisfy the Shannon/Nyquist sampling theorem [4].
To solve the under-sampled reconstruction problem, algebraic algorithms transform the problem into a series of linear equations, and the reconstructed image is acquired iteratively. SART (simultaneous algebraic reconstruction technique) [5] is one of the typical algebraic iterative methods. However, the images reconstructed by traditional algebraic algorithms do not satisfy clinical image quality demands. Researchers therefore try to improve the performance of iterative reconstruction algorithms by introducing prior information about the reconstructed image into the reconstruction process.
Recently, compressed sensing theory [6,7] has been applied to the CT reconstruction problem. The reconstruction problem of a compressed sensing algorithm can be written as the optimization

μ* = argmin_μ ||Aμ − ĝ||²₂ + λR(μ),

where ||Aμ − ĝ||²₂ is the data fidelity term, A is the projection matrix modeling the forward projection, μ is the image vector to be reconstructed, ĝ is the projection vector, R(μ) is the regularization term encoding the prior information, and λ is the regularization parameter adjusting the relative penalties on the regularization term and the data fidelity term. One commonly used regularization term is the TV (total variation) norm, the sum of the absolute coefficients of the DGT (discrete gradient transform) of the reconstructed image. TV-based algorithms are widely used for CT reconstruction since most CT images are piecewise constant, so the TV norm is small and reflects the image sparsity well. Algorithms like ASD-POCS (adaptive steepest descent projection onto convex sets) [8] and GPBB (gradient projection Barzilai-Borwein) [9] are typical TV-based reconstruction algorithms. Besides the TV norm, the more recent DL (dictionary learning) methods can generate the regularization term by dividing the CT image into many overlapping patches and calculating the sparse representations of the patches under the basis of an over-complete dictionary. While TV-based methods treat the image as a whole to measure its sparsity, DL methods extract the sparse prior information from each overlapping image patch and therefore utilize more sparsity information than TV. Reference [10] combines the SIR (statistical iterative reconstruction) model with a DL regularization term to deal with the low-dose reconstruction problem.
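As a concrete illustration of the TV norm just described, the following minimal NumPy sketch (the function name tv_norm and the anisotropic form are our choices, not code from the paper) sums the absolute first differences of a 2D image:

```python
import numpy as np

def tv_norm(img):
    """Anisotropic total variation: the sum of the absolute coefficients
    of the discrete gradient transform (DGT) of the image."""
    dx = np.abs(np.diff(img, axis=1)).sum()  # horizontal first differences
    dy = np.abs(np.diff(img, axis=0)).sum()  # vertical first differences
    return dx + dy
```

For a piecewise-constant image tv_norm stays small, while noise and streak artifacts raise it, which is why minimizing TV suppresses them.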
Although compressed sensing algorithms perform well on the low-dose CT reconstruction problem, some low-contrast details of the reconstructed image are lost, especially when the sampling rate decreases further. To preserve more image information and satisfy the need for further radiation reduction, TV-based algorithms developed a weighted TV regularization to help preserve image edges [11]. For DL-based algorithms, the regularization term is commonly an L2-norm error. Algorithms based on L2-norm error minimization may over-smooth the image, causing loss of detail. One way to alleviate this problem is to develop algorithms that minimize the L1-norm error. In this work, we develop a DL reconstruction algorithm with an L1-norm regularization term, where the L1-minimization problem is approximated by iteratively solving weighted L2-minimization problems, a strategy known as IRLS (iteratively reweighted least squares) [12,13]. The proposed L1-DL (L1 dictionary learning) algorithm is compared with ADSIR (adaptive dictionary based statistical iterative reconstruction), SART and GPBB to demonstrate the image quality improvement brought by the L1-norm regularization term.
The rest of the paper is organized as follows. In the ''Methods'' section, the backgrounds of ADSIR and IRLS are first reviewed; then the L1-DL algorithm and its corresponding optimization methods are described, after which the workflow of the algorithm is provided. In the ''Simulation results'' section, a series of experiments is performed to demonstrate the proposed algorithm's superiority. Finally, the ''Conclusion'' section with corresponding discussions and further analysis is provided.

Review of ADSIR
Previous work developed a DL-based approach for low-dose X-ray CT with a statistical reconstruction model [10]. The related algorithm is reviewed in detail as follows.

SIR model
Let I and N be integers and R the real space. Assuming a monochromatic source, the measured data follow the Poisson distribution

y_i ~ Poisson{ b_i e^(−g_i) } + γ_i,  i = 1, 2, ..., I,

where b = (b_1, b_2, ..., b_I)^T ∈ R^(I×1) is the entrance X-ray intensity, y = (y_1, y_2, ..., y_I)^T ∈ R^(I×1) is the exit X-ray intensity, g = (g_1, g_2, ..., g_I)^T ∈ R^(I×1) is the integral of the linear attenuation coefficient with g_i = [Aμ]_i = Σ_{j=1}^{N²} a_ij μ_j, A = {a_ij} ∈ R^(I×N²) is the system matrix, the reconstructed image μ = (μ_1, μ_2, ..., μ_{N²})^T is the linear attenuation coefficient distribution, obtained by reshaping the initial N × N pixel image into a vector μ ∈ R^(N²×1), and γ_i represents the read-out noise.
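The measurement model above can be simulated directly. The sketch below is our illustration; the ray count, the intensities, and the omission of the read-out noise γ_i are assumptions chosen for brevity. It draws Poisson exit intensities from hypothetical line integrals and log-transforms them back into a noisy sinogram:

```python
import numpy as np

rng = np.random.default_rng(0)

g = np.array([0.5, 1.0, 2.0, 4.0])   # hypothetical line integrals [A mu]_i
b = 1.0e5                            # entrance intensity b_i (photons, assumed equal per ray)

# Exit intensities follow the Poisson model; read-out noise gamma_i is omitted here.
y = rng.poisson(b * np.exp(-g)).astype(float)

# Log-transform back to a noisy estimate g_hat of the line integrals.
g_hat = np.log(b / y)
```

At high photon counts ĝ stays close to g; lowering b (i.e., the dose) makes the relative Poisson noise grow, which is exactly the regime the reconstruction algorithms must handle.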
The objective function of the SIR model is

min_μ Σ_{i=1}^{I} (w_i/2) ([Aμ]_i − ĝ_i)²,

where ĝ_i is the measured line integral and w_i is the statistical weight. In the SIR model, the statistical weight reflects the confidence of the projection measurement along each path; projection data through denser paths have lower SNR (signal-to-noise ratios). Compared to SART, which minimizes a least-squares function, SIR deals with a statistically weighted least-squares function. However, this improvement alone is insufficient for low-dose CT reconstruction, and it is essential to introduce a regularization constraint.
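The weighted least-squares data term can be written down in a few lines. In this sketch (ours; the variable names are assumptions), rays with larger statistical weight w_i dominate the objective:

```python
import numpy as np

def sir_data_term(A, mu, g_hat, w):
    """Statistically weighted least-squares term of the SIR model:
    sum_i (w_i / 2) * ([A mu]_i - g_hat_i)**2."""
    r = A @ mu - g_hat
    return 0.5 * np.sum(w * r ** 2)
```

With all w_i equal this reduces to the plain least-squares function minimized by SART; unequal weights down-weight the low-SNR rays that pass through dense paths.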

DL model
Let N_0 and K be integers. The DL regularization term is represented as

R(μ) = Σ_{s=1}^{S} ( ||E_s μ − Dα_s||²₂ + ν_s ||α_s||₀ ),

where E_s ∈ R^(N_0²×N²) is an operator extracting a patch of N_0 × N_0 pixels from the image; the image patches are overlapping, and with a sliding distance of one pixel the total number of patches is S = (N − N_0 + 1)². D ∈ R^(N_0²×K) is the training dictionary, whose column d_k ∈ R^(N_0²×1) is called an atom and has the same size as a patch. Usually the dictionary is redundant or over-complete (N_0² ≪ K). α_s ∈ R^(K×1) has few nonzero components and serves as a sparse representation of the patch under the dictionary basis D. ν_s is a regularization parameter different from λ.
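The patch-extraction operators E_s are easy to mimic in code. The sketch below (our illustration) gathers every overlapping N_0 × N_0 patch with a sliding distance of one pixel and flattens each into a column, so an N × N image yields S = (N − N_0 + 1)² columns:

```python
import numpy as np

def extract_patches(img, n0):
    """Apply all patch extractors E_s: each column is one flattened
    n0 x n0 patch, and the patches slide by one pixel."""
    n = img.shape[0]
    cols = [img[i:i + n0, j:j + n0].reshape(-1)
            for i in range(n - n0 + 1)
            for j in range(n - n0 + 1)]
    return np.stack(cols, axis=1)  # shape (n0 * n0, (n - n0 + 1)**2)
```

For the settings used later (N = 256, N_0 = 8) this gives S = 249² = 62,001 patches, which is one reason DL methods see far more sparsity information than a single whole-image transform.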

ADSIR
By introducing the DL regularization term into the SIR model, the objective function of ADSIR is

min_{μ,D,α} Σ_{i=1}^{I} (w_i/2)([Aμ]_i − ĝ_i)² + λ Σ_{s=1}^{S} ( ||E_s μ − Dα_s||²₂ + ν_s ||α_s||₀ ).  (5)

Since μ, α_s and D are all unknown, the algorithm iterates by an alternating minimization scheme, which divides the primary problem into two recursive steps: updating the dictionary model and updating the image. During each iteration, the image μ is first kept unchanged while the dictionary model is updated, and the objective function (5) becomes

min_{D,α} Σ_{s=1}^{S} ( ||E_s μ − Dα_s||²₂ + ν_s ||α_s||₀ ),  (6)

which is the dictionary learning and sparse representation problem. It can be transformed to

min_{D,α} Σ_{s=1}^{S} ||E_s μ − Dα_s||²₂  s.t.  ||α_s||₀ ≤ L_0^S for all s,  (7)

where the sparsity level L_0^S is set as a fixed number, usually from 5 to 10. The L0-norm problem (6) is thus transformed into an OMP problem, with no need to determine the value of ν_s. The dictionary D is updated by the classic K-SVD (K singular value decomposition) algorithm [14], and then the sparse representation α_s is updated by the OMP (orthogonal matching pursuit) algorithm [15] based on the current dictionary. Once the dictionary model has been updated in the current iteration, the image μ is updated with α_s and D fixed. In other words, the problem becomes

min_μ Σ_{i=1}^{I} (w_i/2)([Aμ]_i − ĝ_i)² + λ Σ_{s=1}^{S} ||E_s μ − Dα_s||²₂,  (8)

which consists of the data fidelity term and the regularization term; the regularization term is already a separable quadratic function. By replacing the data fidelity term with a separable paraboloid surrogate [16], the optimization can be solved iteratively by

μ_j^(t+1) = μ_j^(t) − [ Σ_i a_ij w_i ([Aμ^(t)]_i − ĝ_i) + 2λ Σ_s [E_s^T (E_s μ^(t) − Dα_s)]_j ] / [ Σ_i a_ij w_i Σ_{k=1}^{N²} a_ik + 2λ Σ_s [E_s^T E_s 1]_j ].
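Of the two building blocks just named, OMP is the simpler to sketch. The toy implementation below is ours, not the paper's code, and K-SVD is omitted; it greedily selects the dictionary atom most correlated with the residual and re-fits the selected coefficients by least squares until the sparsity level is reached:

```python
import numpy as np

def omp(D, x, sparsity):
    """Orthogonal matching pursuit: find alpha with at most `sparsity`
    nonzeros such that D @ alpha approximates the patch x."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Re-fit all selected coefficients jointly (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha
```

In ADSIR this sparse-coding step runs once per patch per outer iteration, with the dictionary held fixed between K-SVD updates.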

IRLS
Consider a sparse signal x of length N (sparse meaning the signal has few nonzero components, i.e., ||x||₀ ≪ N) encoded by an M × N measurement matrix Φ with M < N, so that the encoded signal is y = Φx of length M. Following [12,13], the objective function with L_p-norm minimization to solve for the sparse signal is

min_x ||x||_p^p  s.t.  Φx = y.  (9)

IRLS can be used to solve (9) by replacing the L_p-norm with a weighted L₂-norm,

min_x Σ_{i=1}^{N} w_i x_i²  s.t.  Φx = y,  (10)

where the weights are computed from the previous iteration result x^(t−1). To make the weighted L₂-norm approximate the L_p-norm, the weights are calculated by

w_i = ( (x_i^(t−1))² + ε )^(p/2 − 1),  (11)

where a small ε > 0 ensures stability. The signal is then iterated by

x^(t) = Q_t Φ^T (Φ Q_t Φ^T)^(−1) y,  Q_t = diag(1/w_i).  (12)

In the ADSIR model, the regularization term Σ_{s=1}^{S} ||E_s μ − Dα_s||²₂ is the sum of the squared L₂-norms of the differences between the image patches and their sparse representations. The L₂-norm constraint tends to distribute the energy of Σ_{s=1}^{S} ||E_s μ − Dα_s||²₂ uniformly over the patches E_s μ − Dα_s. However, most CT images are piecewise constant, so most image patches have small values of ||E_s μ − Dα_s|| (equal or close to zero), while a small portion of the patches have large values of ||E_s μ − Dα_s|| because they contain edge details and image information. In other words, the distribution of ||E_s μ − Dα_s|| over the patches is sparse. We therefore propose the L1-DL method, which drives ||E_s μ − Dα_s|| toward this sparse distribution through an L1-norm regularization term. L1-DL thus utilizes more prior information of image sparsity (the sparse distribution of ||E_s μ − Dα_s||) than ADSIR. Derived from the ADSIR algorithm, the objective function of the L1-DL method is generated by replacing the L₂-norm in the regularization term with the L₁-norm:

min_{μ,D,α} Σ_{i=1}^{I} (w_i/2)([Aμ]_i − ĝ_i)² + λ Σ_{s=1}^{S} ( ||E_s μ − Dα_s||₁ + ν_s ||α_s||₀ ).  (13)

To make this optimization problem solvable, a patch-based weighted L₂-norm similar to IRLS is introduced.
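To make the IRLS review concrete, the following sketch (our illustration, with p = 1 and a fixed ε rather than the decreasing schedules discussed in [12,13]) implements the scheme described above: each weighted-L2 subproblem under the constraint Φx = y has the closed-form solution x = QΦᵀ(ΦQΦᵀ)⁻¹y with Q = diag(1/w_i):

```python
import numpy as np

def irls(Phi, y, p=1.0, eps=1e-8, iters=50):
    """IRLS sketch for min ||x||_p^p subject to Phi x = y."""
    x = np.linalg.pinv(Phi) @ y                  # minimum-L2 starting point
    for _ in range(iters):
        w = (x ** 2 + eps) ** (p / 2.0 - 1.0)    # reweighting from x^(t-1)
        Q = np.diag(1.0 / w)
        x = Q @ Phi.T @ np.linalg.solve(Phi @ Q @ Phi.T, y)  # weighted-L2 solve
    return x
```

Starting from the dense minimum-L2 solution, the reweighting progressively concentrates the energy onto a few components, approximating the sparse L1 solution.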
Since the DL theory takes the sparse representation of the image patch as the sparse constraint, the weights are calculated from the information of each image patch:

w_s = 1 / ( (1/N_0²) Σ_{j=1}^{N_0²} |[E_s μ^(t−1) − D^(t−1) α_s^(t−1)]_j| + ε ),  (14)

where the superscript (t−1) means the corresponding terms are the previous iteration results and a small ε > 0 ensures stability. The weights are not computed from each individual component of the vector E_s μ − Dα_s. Instead, we first calculate the average absolute value of the components of E_s μ − Dα_s, and then take its reciprocal as the weight of that patch. This strategy differs from the IRLS algorithm in order to fit the nature of the DL method, which treats the image patch as the basic unit.
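A direct transcription of this per-patch weighting (our sketch; the argument `residuals` is assumed to hold one column E_s μ − Dα_s per patch from the previous iterate):

```python
import numpy as np

def patch_weights(residuals, eps=1e-6):
    """One weight per patch: the reciprocal of the mean absolute value
    of the patch residual E_s mu - D alpha_s, stabilized by eps."""
    mean_abs = np.abs(residuals).mean(axis=0)  # (1/N0^2) * sum_j |r_j| per patch
    return 1.0 / (mean_abs + eps)
```

Patches already well represented by the dictionary (residual near zero) receive large weights and are pushed further toward zero, while edge-containing patches with large residuals are penalized lightly, which is exactly the sparsity-promoting behavior sought.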
By introducing the weights, the optimization problem becomes

min_{μ,D,α} Σ_{i=1}^{I} (w_i/2)([Aμ]_i − ĝ_i)² + λ Σ_{s=1}^{S} ( w_s ||E_s μ − Dα_s||²₂ + ν_s ||α_s||₀ ).  (15)

The iteration is also operated by the alternating minimization scheme. When updating the dictionary model, the objective function (15) becomes

min_{D,α} Σ_{s=1}^{S} ( w_s ||E_s μ − Dα_s||²₂ + ν_s ||α_s||₀ ).  (16)

By the transformation E_s μ′ = √w_s E_s μ and α′_s = √w_s α_s, equation (16) can be transformed to

min_{D,α′} Σ_{s=1}^{S} ( ||E_s μ′ − Dα′_s||²₂ + ν_s ||α′_s||₀ ),  (17)

which has the same form as (6), so the dictionary model can still be updated by K-SVD and OMP. When updating the image, the objective function (15) becomes

min_μ Σ_{i=1}^{I} (w_i/2)([Aμ]_i − ĝ_i)² + λ Σ_{s=1}^{S} w_s ||E_s μ − Dα_s||²₂.  (18)

Similar to (8), the formula to update the image is

μ_j^(t+1) = μ_j^(t) − [ Σ_i a_ij w_i ([Aμ^(t)]_i − ĝ_i) + 2λ Σ_s w_s [E_s^T (E_s μ^(t) − Dα_s)]_j ] / [ Σ_i a_ij w_i Σ_{k=1}^{N²} a_ik + 2λ Σ_s w_s [E_s^T E_s 1]_j ].  (19)

The convergence of L1-DL is much more difficult to prove and is considered beyond the scope of this paper. However, the experimental results reported below suggest the convergence of the proposed algorithm.
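The √w_s substitution that reduces the weighted dictionary subproblem to its unweighted form can be verified numerically. The check below (our illustration, with arbitrary random data and a hypothetical weight value) confirms that w_s‖E_sμ − Dα_s‖²₂ equals ‖√w_s E_sμ − D(√w_s α_s)‖²₂, so K-SVD and OMP can run on the rescaled patches unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))   # dictionary (N0^2 = 64 rows, K = 256 atoms)
p = rng.standard_normal(64)          # a patch E_s mu
a = rng.standard_normal(256)         # its representation alpha_s
w_s = 0.37                           # a hypothetical patch weight

weighted = w_s * np.sum((p - D @ a) ** 2)
scaled = np.sum((np.sqrt(w_s) * p - D @ (np.sqrt(w_s) * a)) ** 2)
```

Because ‖√w_s α_s‖₀ = ‖α_s‖₀, the sparsity constraint survives the rescaling as well.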
In summary, the workflow of the developed algorithm is exhibited in Algorithm I. In addition, the ordered subsets convex (OSC) algorithm [17] is utilized to accelerate the convergence.

ALGORITHM I. Workflow of the developed algorithm
Initialize the image μ and the patch weights w_s;
While the stopping criterion is not satisfied, do
  Calculate the patch weights by (14);
  Update the dictionary D and the sparse representations α_s by K-SVD and OMP;
  Update the image μ by the weighted surrogate update formula;
Output the final image.
The exit condition consists of two criteria, governed by the thresholds ε_1 and ε_2 given in Table 1; both conditions must be met at the same time before the iteration process stops.

Simulation results
To verify the effectiveness of the proposed L1-DL algorithm on low-dose CT reconstruction, several simulation experiments are designed. All the simulations are performed in MATLAB on a dual-core PC with a 3.10 GHz Intel Core i5-2400. The proposed algorithm is compared with SART, GPBB and ADSIR. The scanning geometry is the fan-beam geometry shown in Fig. 1. The radius of the phantom is r = 20 cm, and the radius of the scanning circle (the distance from the radiation source to the central point of the phantom) is R = 40 cm. For each projection view, 512 detector elements are equi-angularly distributed with a field angle of 36.87°. The distance from the radiation source to the detector elements is 75.895 cm. During the simulation, the scanning circle covers 360° around the imaging phantom, and the size of the reconstructed image is 256 × 256 pixels. The phantoms under simulation are the Shepp-Logan phantom and a human head slice from clinic, shown in Fig. 2 with full and proper display windows; biomedical images are often observed with a proper window width to reveal more details. The number of different densities in the Shepp-Logan phantom is 6, and we set the density of water as 0.2. Some details of the proposed algorithm (such as the settings of some variables) are explained as follows. The image patches are overlapping with a sliding distance of one pixel, and each patch has N_0 × N_0 pixels. The dictionary consists of K atoms. When updating the dictionary by the K-SVD algorithm, the sparsity level is L_0^D, and the sparsity level of the sparse representation α_s is set as L_0^S. The iterative stopping criteria are ε_1 and ε_2. The values of all the corresponding variables are provided in Table 1. In the image processing field, K = 4 × N_0 × N_0 is a conventional choice to ensure the redundancy of the dictionary. The patch size also influences the quality of the image.
If the patch size is too small, it cannot effectively capture features in an image. On the other hand, a larger patch size may lead to an over-smoothed image and corresponds to a larger number of atoms in the dictionary, which increases the computational cost. For image processing, N_0 = 8 is a proper value considering both image quality and computational cost [18]. The sparsity levels L_0^D and L_0^S are determined empirically according to the complexity of the image to be reconstructed.
In addition, all the experiments are simulated by a photon-counting method. To perform the OSC algorithm, the transmission data are calculated as y_i = b_i e^(−ĝ_i), with the entrance X-ray intensity b_i set to the number of photons. L1-DL and ADSIR set the initial guess of the image as a random matrix, while GPBB and SART set all the elements of the initial matrix to 1. OSC is utilized to accelerate the convergence.

Fig. 2 The simulation phantoms. The rows refer to the Shepp-Logan phantom (a, b) and the human head slice (c, d); the first column refers to the phantoms with full display windows and the second column to the phantoms with proper window widths

Reconstruction of different sparse levels
In this simulation, the forward simulation and inverse reconstruction are all performed in 2D. During the simulation, the scanning circle covers a 360° range around the phantom. The scanning step of tomographic angles of the Shepp-Logan phantom is set to 3° (120 views) and 6° (60 views) respectively, and that of the human head slice to 2° (180 views) and 4° (90 views) respectively. We choose SART, GPBB and ADSIR as comparisons for our proposed L1-DL algorithm. The regularization parameter of GPBB is chosen by tests (from 0.1 to 2 with a step of 0.1), and we choose the best one to perform GPBB. For the DL methods, a proper selection of the regularization parameter λ is a vital problem: a larger λ weakens the effect of the data fidelity term, causing the loss of some fine details in the image, while a smaller λ weakens the effect of the regularization term as the sparse constraint, resulting in more noise and streak artifacts in the reconstructed image. In this article, the regularization parameter of ADSIR is determined by a model similar to the one created in Ref. [19]: the model first reconstructs the image with λ set to infinity, then calculates the difference between the forward projection and the scanning data, after which the proper value of λ is calculated by a fitting function based on that difference. In the proposed method (L1-DL), the weights introduced to the algorithm would influence the scale of the regularization term and hence the choice of the regularization parameter. To eliminate this effect, we multiply the weights by a constant c equal to the average over all patches of the mean absolute residual (1/N_0²) Σ_{j=1}^{N_0²} |[E_s μ^(t−1) − D^(t−1) α_s^(t−1)]_j|, so that the weights are calculated by

w_s = c / ( (1/N_0²) Σ_{j=1}^{N_0²} |[E_s μ^(t−1) − D^(t−1) α_s^(t−1)]_j| + ε ).  (20)

With this modification, we select the regularization parameter of L1-DL to be the same as that of ADSIR. The regularization parameters of the different phantom simulations are shown in Table 2. The rationality and influence of the regularization parameter selection will be discussed later.
The reconstruction results of the simulations using the four algorithms are shown in Figs. 3 and 4 (Shepp-Logan phantom) and Figs. 5 and 6 (human head slice). Negative values arising in the reconstructed image during the iterative process are set to zero. The results indicate the density of the phantoms. The images of the human head slice are displayed by transforming the density to the CT attenuation value. The CT value is calculated by

CT = 1000 × (μ − μ_water) / μ_water (HU),  (21)

where μ_water is the linear attenuation coefficient of water.
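The CT value formula above is a one-liner; in this sketch (ours) the default μ_water = 0.2 mirrors the water density set in the simulations:

```python
def to_ct_value(mu, mu_water=0.2):
    """CT value in HU from a linear attenuation coefficient:
    CT = 1000 * (mu - mu_water) / mu_water."""
    return 1000.0 * (mu - mu_water) / mu_water
```

Water maps to 0 HU and a vanishing attenuation coefficient (air) to −1000 HU, the usual anchors of the Hounsfield scale.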
Taking the original phantom as the gold standard for a numeric quantification of the results, the RMSE (root mean square error) of the reconstructed images is introduced to measure the difference between the reconstructed image and the original image in the L2-norm. This criterion is defined as

RMSE = sqrt( (1/N²) Σ_{i,j} (μ_ij − μ_ij^truth)² ),  (22)

where μ_ij^truth is the gray value of the original image. The unit of RMSE can be HU by transforming the linear attenuation coefficients to CT values according to Eq. (21). Clearly, the smaller the RMSE, the better the image quality.
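The RMSE definition above translates directly into code (our sketch; the inputs are assumed to be same-shape arrays in either attenuation or HU units):

```python
import numpy as np

def rmse(recon, truth):
    """Root mean square error over all N^2 pixels of the image."""
    return np.sqrt(np.mean((recon - truth) ** 2))
```

Passing images already converted by Eq. (21) yields the RMSE directly in HU, as reported in the tables.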
The quantitative results are shown in Table 3. In Figs. 3 and 5, all the reconstructed images of the same phantom are shown with the same proper display window. In Fig. 3, the results of SART are the worst among the four algorithms: even with 120 sampling views, the result is ruined by streak artifacts, and it deteriorates seriously when the sampling rate declines to half. The reason for the poor results of SART is that it is a simple iterative algorithm with no regularization term to utilize prior information about the reconstructed image. Compared to SART, GPBB and ADSIR perform better with the help of the regularization term utilizing the sparse constraint. Figure 3f, n contain some edge structures, which reveals the over-smoothing effect of ADSIR. When the projection views reduce to 60, the image reconstructed by ADSIR is influenced by some artifacts. L1-DL and GPBB perform well in the 120-views situation: the RMSEs of these two methods are 1.647 HU and 2.658 HU respectively, both tiny, and Fig. 3e, g reveal that the difference is hard to recognize. When the sampling views reduce, tiny edge structures emerge in Fig. 3o and the RMSE of GPBB increases to 11.04. Figure 4 displays the horizontal intensity profiles through the center of the reconstructed images compared to the original image. The severity of the effects of reducing the sampling views is ordered as SART > ADSIR > GPBB > L1-DL. In Table 3, the RMSE of L1-DL in the 60-views situation is much smaller than the RMSEs of ADSIR and SART in the 120-views situation, which certifies that the proposed algorithm can adapt to further radiation dose reduction (Table 4).
For the human head slice in Figs. 5 and 6, the results are similar to those of the Shepp-Logan phantom. However, the original image of the human head slice is more complex than the Shepp-Logan phantom and is closer to the real reconstruction process of biomedical images. Although the RMSE of the proposed algorithm is slightly larger than that of GPBB in the 180-views situation, L1-DL is better than GPBB when the number of views reduces to 90. In the 90-views situation, the image of GPBB, displayed in Fig. 5k, has some artifacts, and its spatial resolution is worse than that of L1-DL, as seen in the enlarged regions of the images. In addition, the RMSEs of the proposed algorithm are much better than those of ADSIR. Considering that the proposed algorithm is modified from the reconstruction model of ADSIR, the improvement in image quality is obvious.

Fig. 4 The horizontal intensity profiles through the center of the original and reconstructed images of the Shepp-Logan phantom. a L1-DL; b ADSIR; c GPBB; d SART

Robustness to the noise
In practical applications, the measurements of the radiation projections are usually polluted by noise, which demands that the algorithm be robust to noise. To evaluate the noise tolerance of the proposed algorithm and the other three, we add Poisson noise to the projection data. The scanning step of tomographic angles of the Shepp-Logan phantom is set to 6° (60 views); the quantitative results are listed in Table 5. Compared to the noise-free results in Figs. 3, 4, 5 and 6, the image quality of SART degrades fast as the noise level increases, while GPBB, L1-DL and ADSIR are more robust to the noise. In the Shepp-Logan experiments, L1-DL is better than GPBB and ADSIR in terms of RMSE. The differences between the reconstructed images and the original image shown in Fig. 7 indicate that the image quality is ordered as L1-DL > GPBB > ADSIR > SART. In the head slice simulations, the RMSE of SART is a bit smaller than that of ADSIR, but the image reconstructed by SART is ruined by noise and artifacts. As for L1-DL and GPBB, the RMSE of GPBB is better than that of L1-DL in the 180-views situation, and the RMSEs of the two methods are the same in the 90-views situation. The results (Fig. 9i-l) of the 90-views simulation with noise indicate that all four algorithms lose the structure in the yellow rectangle region. L1-DL is still better than ADSIR and has somewhat higher spatial resolution than GPBB.

Convergence rate
To examine the convergence rate of L1-DL compared to ADSIR, the RMSEs of the images reconstructed by the two algorithms are shown as functions of the number of iterations in Fig. 11. The projection data are simulated on the human head slice with 180 scanning views and are not polluted by noise. L1-DL stops at the 72nd iteration, while ADSIR stops at the 49th iteration. Since the iterative process of L1-DL has only one additional step, updating the weight function, it takes almost the same time as ADSIR to perform one iteration: one iteration of ADSIR and L1-DL takes 89 and 92 s respectively. Examining the first 20 iterations shows that the RMSE of ADSIR decreases only slightly faster than that of L1-DL at first. We therefore claim that L1-DL needs more iterations than ADSIR to converge because it reaches a better reconstructed result, but the convergence rates of the two algorithms are almost the same. However, it takes several tens of minutes for both ADSIR and L1-DL to reach their stopping points, so accelerating methods and high computational efficiency are needed for real-time imaging.

Regularization parameter investigation
In this part, the rationality and influence of the regularization parameter are discussed in detail. To verify that the selections of this parameter above are rational and to explore the influence of other parameter values on the results, images are reconstructed with different regularization parameters. The phantom under simulation is the human head slice in Fig. 2. The projection data are 180 views with no noise and 180 views with noise simulated with 2 million photons. For each projection model, the regularization parameters are chosen as the one calculated by the fitting function (cited as λ), the one multiplied by 0.2 (cited as 0.2λ) and the one multiplied by 5 (cited as 5λ). The results of ADSIR are shown in Figs. 11 and 12, and the RMSEs of the results are shown in Table 6. From the data in Table 6, we can see that the images reconstructed with a larger regularization parameter (5λ) have larger RMSEs, which means the reconstructed images are influenced by the smoothing effect and fewer image details are preserved. When this parameter is set smaller (0.2λ), the RMSEs may be slightly smaller than those obtained with the proper regularization parameter, as the italicized data in Table 6 show; however, the corresponding images in Figs. 11 and 12 (the second column) are ruined by noise or streak artifacts to different extents. Therefore, it is important to select a proper value of the regularization parameter for DL-based methods, and the simulations in this section prove that the selections in this article are rational.

Discussion
The simulation results indicate that the proposed L1-DL algorithm is a useful and robust method for sparse CT reconstruction. L1-DL utilizes more prior information of image sparsity than ADSIR, benefiting from the L1-norm DL regularization term. L1-DL is an improved method based on ADSIR, and the simulation results demonstrate the image quality improvement of L1-DL over ADSIR. By comparing the simulation results, another comparison algorithm, GPBB, is slightly better than L1-DL in some situations when comparing RMSEs. However, GPBB is not good at preserving edge details and structures, especially when the sampling rate reduces further. It is claimed that L1-DL has a higher spatial resolution than GPBB.

Fig. 12 The reconstruction results of the human head slice by ADSIR with different regularization parameters. a-c The images reconstructed from 180 scanning views with no noise; d-f the differences between the reconstructed images (a-c) and the original image (OI); g-i the images reconstructed from 180 scanning views simulated with 2 million photons; j-l the differences between the reconstructed images (g-i) and the original image (OI). The first to third columns refer to the results with λ, 0.2λ and 5λ respectively
In the simulation examining the convergence rate of L1-DL, L1-DL stops at the 72nd iteration while ADSIR stops at the 49th, and one iteration takes about 90 s, so the reconstructions of L1-DL and ADSIR consume about 108 and 74 min respectively. For the reconstruction processes of GPBB and SART with the same projection data, one iteration of these two algorithms takes about 1 s; since GPBB and SART stop when the iteration number reaches 1000, their reconstruction time is about 17 min. In addition, the iteration processes of GPBB and SART can be accelerated by GPU, reducing the time to less than a minute. Accelerating L1-DL, ADSIR and other DL-based methods is therefore an important factor for their practical application.

Conclusion
In this work, we propose replacing the L2-norm regularization term with an L1-norm one to improve the quality of images reconstructed by the DL-based method. The new objective function is optimized by an adaptively weighted L2-norm strategy similar to the IRLS algorithm. With this modification, the proposed L1-DL algorithm performs better than the existing DL-based method (ADSIR) and the other two comparison algorithms. Experimental results show that the proposed algorithm can satisfy the demand for further radiation reduction in CT scanning, since it needs less scanning data for high-quality recovery. In addition, the proposed algorithm retains the robustness to projection noise characteristic of DL-based algorithms. Our future work will focus on two aspects. One is accelerating the DL-based methods to make real-time imaging with low-dose radiation possible. The other is looking for strategies to utilize more prior information and further improve the reconstruction result; for example, with a proper way to distinguish structural information from noise in the image, the DL regularization term could be designed based on the distinguishing results, which is a promising way to preserve more structural information and improve the image quality.