Introduction

Computed tomography (CT) reconstructs tomographic images of an object from X-ray attenuation measurements [1]. As one of the greatest scientific and technological achievements of the late twentieth century, CT has had a revolutionary impact on clinical screening, diagnosis, image-guided surgery, image-guided radiotherapy, and other applications [2, 3]. However, the widespread use of CT has raised concerns about its potential to cause cancer or genetic abnormalities [4, 5]. Therefore, strategies have been proposed to lower the X-ray dose in CT scanning [6], such as reducing the tube current or peak voltage, decreasing the number of projections, and improving the hardware of the CT system. All of these methods reduce the X-ray dose absorbed by patients; this paper focuses on low-dose CT acquired by reducing the tube current or tube voltage. The imaging artifacts introduced in this way are not severe. However, owing to the low signal-to-noise ratio of the acquired projection data, images reconstructed by traditional methods are usually severely degraded by noise, which might affect radiologists' diagnoses [7]. To reconstruct high-quality images from low-dose computed tomography (LDCT) projection data, methods including projection-domain preprocessing, iterative reconstruction, and post-processing have been proposed.

Operations on the sinogram, known as projection preprocessing, aim to reduce noise in the raw data. The projection data can be regarded as a noisy image, and denoising filters are usually designed according to the noise characteristics or structural information of the sinogram. Compared with other approaches, projection preprocessing exploits the statistical properties of the noise as prior information. In recent years, models such as penalized likelihood [8] and sparsity-based sinogram denoising [9] have been proposed to suppress noise and avoid edge blurring in the sinogram. Liu et al. developed a well-designed composite dictionary with discriminative features in the sinogram domain [10]. Xie et al. made full use of the prior knowledge and statistical properties of the LDCT sinogram to formulate the preprocessing as a standard maximum a posteriori estimation [11]. Karimi et al. proposed the simultaneous sparse representation (SSR) method [12], which is similar to sparse representation-based dictionary learning [13, 14] and the non-local means algorithm [15, 16]. Hu et al. developed an improved Wasserstein generative adversarial network (WGAN) framework for the sinogram domain to enhance reconstructed image quality [17].

Post-processing methods are not limited by the CT vendor. Several effective filters and networks have been presented to tackle LDCT restoration, such as BM3D [18, 19], the conveying path-based convolutional encoder-decoder network (CPCE) [20], and the joint bilateral filter net (JBFnet) [21]. Post-processing methods can be applied directly to the reconstructed CT images and are easily integrated into the current CT workflow. Based on the statistical properties of CT images, Choi et al. proposed a deep learning approach to statistical image restoration for LDCT [22]. Shiri et al. applied a residual convolutional neural network to ultralow-dose CT images of COVID-19 patients [23]. Gholizadeh-Ansari et al. used dilated convolutions with different dilation rates to capture more contextual information in fewer layers for LDCT [24].

Iterative reconstruction methods can also enhance LDCT image quality by taking advantage of the measured data and prior knowledge. Gao et al. proposed a prior-based machine learning model for Bayesian reconstruction of ultralow-dose CT images [25]. He et al. developed a noise suppression-guided image filtering reconstruction (NSGIFR) algorithm [26] for LDCT reconstruction; they introduced guided image filtering (GIF) [27] and block-matching and 3D filtering (BM3D) [18, 19] to achieve outstanding performance in noise suppression and texture preservation. As a commonly used regularization method, total variation (TV) [28, 29] uses the L1-norm of the image gradient to decrease noise in LDCT image reconstruction. However, TV-based reconstructions are weak at retaining image structure and introduce blocking artifacts, which affect clinical diagnosis. Therefore, Yu et al. considered an L0-norm-based LDCT reconstruction model, which only counts the image gradients with non-zero amplitude and does not penalize the large ones [30]. Recently, relative total variation (RTV) [31] was proposed to extract main structures from complicated texture patterns. Unlike the L0-norm and L1-norm, RTV is defined in terms of the differences between the partial derivatives of textures and of structures in an image. Windowed inherent variation (WIV) and windowed total variation (WTV) are defined as the denominator and numerator of RTV, respectively. By the definition of WIV, the main structures in an image stand out, so structures and textures can be readily distinguished: in WIV, the partial derivatives of textures and noise in a local window tend to cancel each other out, whereas both textures and noise contribute to the value of WTV. As a consequence, textures and noise can be removed by penalizing RTV. In [32], Gong et al. proposed an RTV-regularized projections onto convex sets (POCS-RTV) reconstruction model; based on this model, they used the L-curve method to set the parameters of POCS-RTV, yielding adaptive POCS-RTV (POCS-ARTV). These models reduce noise by penalizing the RTV term.

In this work, we aim to take advantage of prior knowledge in the gradient domain to enhance the quality of LDCT. Although the experimental results in [32] show satisfactory performance for low-intensity CT reconstruction, we observed that POCS-RTV tends to produce fuzzy edges and details, especially for extremely low-dose CT. We attribute this phenomenon to the weighting function of RTV, which is based only on the distance between pixels and ignores the image gray-scale values. Drawing on the idea of bilateral filtering [33, 34], we propose the bilateral weighted relative total variation (BRTV), in which both spatial proximity and pixel-value similarity enter the weighting function of WIV. Two pixels that are close and similar contribute more to the partial derivative in WIV, so a smaller BRTV is obtained, which suppresses noise while preserving sharp edges and fine details. Structures, textures, and noise all contribute to the value of WTV; therefore, the weights of WTV are kept the same as in the original RTV.

Inspired by [33, 34], we first propose BRTV to simultaneously maintain edges and further reduce noise for image restoration, and then propose the BRTV-regularized projections onto convex sets (POCS-BRTV) model for LDCT reconstruction. Unlike POCS-RTV, the WIV weights in POCS-BRTV rely on both closeness in the spatial domain and similarity in the intensity range, which better characterizes the distance and difference between two pixels. In LDCT in particular, textures and edges are severely degraded by an extremely high noise level, and reconstruction models based on image sparsity in the gradient domain easily confuse textures with noise. POCS-BRTV recovers sharper edges and better distinguishes fine details from noise in LDCT reconstruction. The main contributions of this work are threefold. First, we develop an image restoration model, BRTV, based on the proximity and similarity between two pixels in a local rectangular window. Second, BRTV is used as the regularization term of an LDCT reconstruction model; penalizing the BRTV term suppresses noise and preserves main structures. Finally, we develop an iterative algorithm to solve the POCS-BRTV model.

The remainder of this paper is organized as follows. In the next section, we briefly introduce the LDCT imaging model and RTV. In the third section, the proposed algorithm is presented, after which the experimental results on simulated data are described. The conclusion of this study is given in the final section.

Methods

Low-Dose CT Imaging Model

In LDCT, the relationship between the image pixels and the projection data can be expressed as

$$g=Af+e.$$
(1)

The column vector \(g\) denotes the N-dimensional projection data degraded by noise \(e\), the column vector \(f\) is the M-dimensional object image, and \(A\) represents the projection coefficient matrix.

The task of CT imaging is to reconstruct the object image \(f\) from the projection data \(g\). In the ideal case, the CT projection data are noise-free, so the image could be obtained by inverting the projection matrix, given sufficient computer memory. In practice, LDCT cannot be reconstructed directly because of the problem dimension, the memory such an inversion would require, and the noise degradation. To approximately recover \(f\) from \(g\), a regularized least-squares objective is usually adopted to solve the problem in Eq. (1), which takes the following form:

$${f^*} = \mathop {\arg \min }\limits_{f \geqslant 0} \;\left\| {Af - g} \right\|_2^2 + \lambda R\left( f \right)\;.$$
(2)

The first term in Eq. (2) is a data fidelity term that constrains the reconstructed image to be consistent with the projection data. The second term is a penalty term, usually used for noise reduction and edge preservation. \(\lambda\) is a penalty parameter that balances these two terms.
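As an illustration, a minimal sketch of evaluating the objective in Eq. (2) is given below, assuming dense NumPy arrays and using plain anisotropic total variation as a placeholder for the regularizer \(R\); the function names are hypothetical and not from the paper's Matlab implementation.

```python
import numpy as np

def anisotropic_tv(f_img):
    """Placeholder regularizer R(f): anisotropic TV via forward differences."""
    dx = np.diff(f_img, axis=1)
    dy = np.diff(f_img, axis=0)
    return np.abs(dx).sum() + np.abs(dy).sum()

def objective(A, f, g, lam, shape):
    """Value of ||A f - g||_2^2 + lambda * R(f) from Eq. (2), with f as a vector."""
    fidelity = np.sum((A @ f - g) ** 2)
    return fidelity + lam * anisotropic_tv(f.reshape(shape))
```

In the proposed method, the placeholder regularizer is replaced by the BRTV penalty developed in the following subsections.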

Relative Total Variation

Relative total variation (RTV) [31] can be regarded as a special weighted total variation, which enforces the structural part of an image to be sparse in the gradient domain. RTV is formulated as

$$RTV\left( p \right) = \frac{{{D_x}\left( p \right)}}{{{L_x}\left( p \right) + \varepsilon }} + \frac{{{D_y}\left( p \right)}}{{{L_y}\left( p \right) + \varepsilon }},$$
(3)

where \(\varepsilon\) is a small positive number that prevents the denominators from being zero. \({D_x}\left( p \right)\) and \({D_y}\left( p \right)\) are the windowed total variations (WTV) along the \(x\) and \(y\) directions at pixel \(p\) of image \(f\), reflecting the absolute differences between neighboring pixels within a rectangular region \(R\left( p \right)\) centered at pixel \(p\). Structures, textures, and noise in region \(R\left( p \right)\) all contribute to WTV, which does not depend on the sign of each partial derivative. \({L_x}\left( p \right)\) and \({L_y}\left( p \right)\) denote the windowed inherent variations (WIV) along the \(x\) and \(y\) directions at pixel \(p\); the WIV helps to distinguish structures in image \(f\). WTV and WIV are defined as

$${D_x}\left( p \right) = \sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot \left| {{{\left( {{\partial_x}\;f} \right)}_q}} \right|}, {D_y}\left( p \right) = \sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot \left| {{{\left( {{\partial_y}\;f} \right)}_q}} \right|} ,$$
(4)
$${L_x}\left( p \right) = \left| {\sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot } {{\left( {{\partial_x}\;f} \right)}_q}} \right|, {L_y}\left( p \right) = \left| {\sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot } {{\left( {{\partial_y}\;f} \right)}_q}} \right|,$$
(5)

where \({\partial_x}\) and \({\partial_y}\) are the partial derivatives in the two directions. \({k_{p,q}}\) is a weighting function defined by the spatial affinity within the rectangular region centered at pixel \(p\), given by

$${k_{p,q}}\; \propto \;\exp \left( { - \frac{{{{\left( {{x_p} - {x_q}} \right)}^2} + {{\left( {{y_p} - {y_q}} \right)}^2}}}{{2{\sigma^2}}}} \right),$$
(6)

where \({x_p}\) and \({y_p}\) are the horizontal and vertical coordinates of pixel \(p\), and \(\sigma\) relates to the spatial scale of the window \(R\left( p \right)\).
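Because \({k_{p,q}}\) is a Gaussian of the spatial distance only, the windowed sums in Eqs. (4)-(5) can be computed by Gaussian filtering of the gradient images. The sketch below illustrates this under that assumption, using SciPy's `gaussian_filter`; the function names are ours, not from [31].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def windowed_variations(f, sigma=2.0, axis=1):
    """Return (D, L): WTV and WIV along one axis for every pixel, Eqs. (4)-(5)."""
    grad = np.zeros_like(f)
    d = np.diff(f, axis=axis)                  # forward-difference partial derivative
    if axis == 1:
        grad[:, :-1] = d
    else:
        grad[:-1, :] = d
    D = gaussian_filter(np.abs(grad), sigma)   # sum_q k_{p,q} * |(partial f)_q|
    L = np.abs(gaussian_filter(grad, sigma))   # |sum_q k_{p,q} * (partial f)_q|
    return D, L

def rtv_penalty(f, sigma=2.0, eps=1e-3):
    """Summed RTV of Eq. (3) over all pixels of a 2-D image."""
    Dx, Lx = windowed_variations(f, sigma, axis=1)
    Dy, Ly = windowed_variations(f, sigma, axis=0)
    return np.sum(Dx / (Lx + eps) + Dy / (Ly + eps))
```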

The Proposed Reconstruction Algorithm

LDCT data naturally contain severe noise, which corrupts the complicated textures in the reconstructed images. RTV tends to blur the distinction between textures and noise, so it is less than satisfactory for LDCT image denoising. In [32], Gong et al. proposed the relative total variation-regularized projections onto convex sets (POCS-RTV) model. Their experiments show satisfactory performance for low-intensity CT reconstruction, but we observed that this model tends to produce fuzzy edges and textures, especially for extremely low-dose CT. We attribute this phenomenon to the weighting function of RTV, because \({k_{p,q}}\) depends only on the distance between pixels \(p\) and \(q\) and ignores the image gray-scale values. Drawing on the idea of bilateral filtering [33, 34], we use both spatial proximity and pixel-value similarity in the weighting function of WIV. The resulting bilateral weighted relative total variation (BRTV) simultaneously maintains edges and further reduces noise. Based on the characteristics of BRTV and LDCT, we use BRTV as the regularization term of the LDCT reconstruction model and propose the POCS-BRTV model.

Bilateral Weighted Relative Total Variation

In its WIV term, BRTV accounts for both closeness in the spatial domain and similarity in the intensity range. Because WTV responds to edges, textures, and noise alike, BRTV leaves the WTV term unmodified. BRTV is defined as

$$BRTV\left( p \right) = \frac{{{D_x}\left( p \right)}}{{{{\tilde L}_x}\left( p \right) + \varepsilon }} + \frac{{{D_y}\left( p \right)}}{{{{\tilde L}_y}\left( p \right) + \varepsilon }},$$
(7)

where \({D_x}\left( p \right)\) and \({D_y}\left( p \right)\) are the same as in RTV, while \({\tilde L_x}\left( p \right)\) and \({\tilde L_y}\left( p \right)\) use a new weighting function \({h_{p,q}}\). They are defined as

$${D_x}\left( p \right) = \sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot \left| {{{\left( {{\partial_x}\;f} \right)}_q}} \right|},\; {D_y}\left( p \right) = \sum\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot \left| {{{\left( {{\partial_y}\;f} \right)}_q}} \right|} ,$$
(8)
$${\tilde L_x}\left( p \right) = \left| {\sum\limits_{q \in R\left( p \right)} {{h_{p,q}} \cdot } {{\left( {{\partial_x}\;f} \right)}_q}} \right|,\;{\tilde L_y}\left( p \right) = \left| {\sum\limits_{q \in R\,\left( p \right)} {{h_{p,q}} \cdot } {{\left( {{\partial_y}\;f} \right)}_q}} \right|,$$
(9)

where \({k_{p,q}}\) is the original weighting function of RTV, defined by the spatial closeness between pixels \(p\) and \(q\) as shown in Eq. (6). \({h_{p,q}}\), a composite similarity measure, is proportional to the product of gray-level similarity and positional proximity between pixels \(p\) and \(q\):

$${h_{p,q}}\; \propto \;\exp \left( { - \frac{{{{\left( {{x_p} - {x_q}} \right)}^2} + {{\left( {{y_p} - {y_q}} \right)}^2}}}{{2{\sigma^2}}}} \right) \cdot \exp \left( { - \frac{{{{\left( {{f_p} - {f_q}} \right)}^2}}}{{2{\sigma^2}}}} \right),$$
(10)

where \({x_p}\) and \({y_p}\) are the horizontal and vertical coordinates of pixel \(p\), and \({f_p}\) is the gray-scale value at pixel \(p\). A large \({h_{p,q}}\) indicates that pixels \(p\) and \(q\) are close and similar; such pixels contribute more to the WIV, leading to a smaller BRTV.
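Because \({h_{p,q}}\) couples spatial closeness with gray-level similarity, the bilateral WIV of Eq. (9) can no longer be computed by a plain Gaussian filter. The following deliberately simple, NumPy-only sketch evaluates it with an explicit window loop; the function name, the window radius, and the shared \(\sigma\) default are our assumptions.

```python
import numpy as np

def bilateral_wiv(f, sigma=2.0, radius=3, axis=1):
    """tilde L along one axis: |sum_q h_{p,q} * (partial f)_q| per window, Eq. (9)."""
    grad = np.zeros_like(f)
    d = np.diff(f, axis=axis)
    if axis == 1:
        grad[:, :-1] = d
    else:
        grad[:-1, :] = d

    H, W = f.shape
    L_tilde = np.zeros_like(f)
    for i in range(H):
        for j in range(W):
            i0, i1 = max(i - radius, 0), min(i + radius + 1, H)
            j0, j1 = max(j - radius, 0), min(j + radius + 1, W)
            yy, xx = np.mgrid[i0:i1, j0:j1]
            spatial = np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2 * sigma ** 2))        # position proximity
            similarity = np.exp(-((f[i0:i1, j0:j1] - f[i, j]) ** 2) / (2 * sigma ** 2))  # gray-level similarity
            L_tilde[i, j] = abs(np.sum(spatial * similarity * grad[i0:i1, j0:j1]))
    return L_tilde
```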

\({L_x}\left( p \right)\), \({\tilde L_x}\left( p \right)\), \({L_y}\left( p \right)\), and \({\tilde L_y}\left( p \right)\) of a noisy image are displayed in Fig. 1 to verify the performance of \({h_{p,q}}\). It can be observed that \({\tilde L_x}\left( p \right)\) and \({\tilde L_y}\left( p \right)\) show clearer structures with less blurry edges than \({L_x}\left( p \right)\) and \({L_y}\left( p \right)\).

Fig. 1
figure 1

The noisy phantom image and the corresponding \({L_x}\left( p \right)\), \({\tilde L_x}\left( p \right)\), \({L_y}\left( p \right)\), and \({\tilde L_y}\left( p \right)\) images. The display window is [0, 0.5]

POCS-BRTV Reconstruction Model

The proposed POCS-BRTV reconstruction model is set as

$$\mathop {\min }\limits_{f \geqslant 0} \left\| {Af - g} \right\|_2^2 + \lambda \cdot \sum\limits_{p \in f} {BRTV\left( p \right)} .$$
(11)

This model reconstructs the image \(f\) from the projection data \(g\), where \(A\) represents the projection coefficient matrix. The first term is the data fidelity term, which keeps the reconstructed image consistent with the measurements. The second term, BRTV, removes noise and preserves edges during the LDCT reconstruction process. \(\lambda\) is a weight that balances the data fidelity and regularization terms.

Since an iterative algorithm is adopted to solve this problem, we linearize \(\left\| {Af - g} \right\|_2^2\) at the point \({f^k}\). Problem (11) can then be expressed as

$$\mathop {\min }\limits_{f \geqslant 0} \left\| {A{f^k} - g} \right\|_2^2 + \left\langle {{A^T}\left( {A{f^k} - g} \right),f - {f^k}} \right\rangle + \left( {\gamma /2} \right) \cdot \left\| \;{f - {f^k}} \right\|_2^2 + \lambda \cdot \sum\limits_{p \in f} {BRTV\left( p \right)} ,$$
(12)

where \(\gamma\) is a positive parameter and \({f^k}\) is the intermediate image at the kth iteration of the reconstruction process. \(\left\langle { \cdot , \cdot } \right\rangle\) denotes the inner product of two vectors. Optimization problem (12) can be reformulated in the following concise form:

$$\mathop {\min }\limits_{f \geqslant 0} \left\| {\;f - \left( {{f^k} - \left( {1/\gamma } \right) \cdot {A^T}\left( {A{f^k} - g} \right)} \right)} \right\|_2^2 + \left( {{{2\lambda } \mathord{\left/ {\vphantom {{2\lambda } \gamma }} \right. \kern-\nulldelimiterspace} \gamma }} \right) \cdot \sum\limits_{p \in f} {BRTV\left( p \right)} .$$
(13)

The term \({f^k} - \left( {{1 \mathord{\left/ {\vphantom {1 \gamma }} \right. \kern-\nulldelimiterspace} \gamma }} \right) \cdot {A^T}\left( {A{f^k} - g} \right)\) can be approximately computed by the simultaneous algebraic reconstruction technique (SART). The combination of SART and the nonnegativity constraint, referred to as POCS, yields the intermediate image \({f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}\). Let \({\lambda_1} = {{2\lambda } \mathord{\left/ {\vphantom {{2\lambda } \gamma }} \right. \kern-\nulldelimiterspace} \gamma }\). Problem (11) can therefore be solved by a two-step iterative reconstruction algorithm, consisting of a POCS step [35] and a BRTV step:

$${f^{k + 1/2}} = POCS\left( {{f^k},\;g} \right),$$
(14)
$${f^{k + 1}} = \mathop {\arg \min }\limits_{f \geqslant 0} \left\| {\;f - {f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}} \right\|_2^2 + {\lambda_1} \cdot \sum\limits_{p \in f} {BRTV\left( p \right)} .$$
(15)
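The outer iteration of Eqs. (14)-(15) can be sketched as below. This is only a structural sketch: `sart_update` and `brtv_denoise` are hypothetical helpers standing in for the SART projection step and the BRTV denoising step described in this section, and the default parameter values echo those used in the experiments later in the paper.

```python
import numpy as np

def pocs_brtv(A, g, sart_update, brtv_denoise,
              n_iter=1000, alpha=0.15, lam1=7e-4, tol=1e-6):
    """Two-step POCS-BRTV iteration: POCS step (Eq. 14) then BRTV step (Eq. 15)."""
    f = np.zeros(A.shape[1])
    for k in range(n_iter):
        f_half = np.maximum(sart_update(A, f, g, alpha), 0.0)  # SART + nonnegativity = POCS
        f_new = brtv_denoise(f_half, lam1)                     # BRTV-regularized denoising
        # stopping criterion on the relative change between successive iterates
        if np.linalg.norm(f_new - f) <= tol * (np.linalg.norm(f) + 1e-12):
            return f_new
        f = f_new
    return f
```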

To solve Eq. (15), we first consider the x-direction measure of the penalty term, which can be written as

$$\begin{aligned} \mathop{\sum}\limits_p {\frac{{D_x}(p)}{{\tilde{L}_x}(p) + \varepsilon}} &= \mathop{\sum}\limits_p {\frac{{\mathop{\sum}\limits_{q \in R\left( p \right)} {{k_{p,q}} \cdot \left| {{{\left( {{\partial_x}\;f} \right)}_q}} \right|} }}{{\left| {\mathop{\sum}\limits_{q \in R\left( p \right)} {{h_{p,q}} \cdot {{\left( {{\partial_x}\;f} \right)}_q}} } \right| + \varepsilon }}} = \mathop{\sum}\limits_q {\mathop{\sum}\limits_{p \in R\left( q \right)} {\frac{{{k_{p,q}}}}{{\left| {\mathop{\sum}\limits_{q \in R\left( p \right)} {{h_{p,q}} \cdot {{\left( {{\partial_x}\;f} \right)}_q}} } \right| + \varepsilon }}\left| {{{\left( {{\partial_x}\;f} \right)}_q}} \right|} } \\ & \quad\quad\ \ \ \approx \mathop{\sum}\limits_q {\mathop{\sum}\limits_{p \in R\left( q \right)} {\frac{{{k_{p,q}}}}{{{{\tilde L}_x}\left( p \right) + \varepsilon }}\frac{1}{{\left| {{{\left( {{\partial_x}\;f} \right)}_q}} \right| + {\varepsilon_g}}}\left( {{\partial_x}\;f} \right)_q^2} } = \mathop{\sum}\limits_q {{{\tilde u}_{xq}}{w_{xq}}\left( {{\partial_x}f} \right)_q^2}\; \end{aligned}$$
(16)

where \({\varepsilon_g}\) is introduced for numerical stability, and \({\tilde u_{xq}}\) and \({w_{xq}}\) are defined as

$${\tilde u_{xq}} = \sum\limits_{p \in R\left( q \right)} {\frac{{{k_{p,q}}}}{{{{\tilde L}_x}\left( p \right) + \varepsilon }}}, \;{w_{xq}} = \frac{1}{{\left| {\left( {{\partial_x}f} \right){}_q} \right| + {\varepsilon_g}}}.$$
(17)
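A minimal sketch of these weights follows; it assumes that the Gaussian window sum over \(p \in R\left( q \right)\) in Eq. (17) can again be approximated by Gaussian filtering (since \({k_{p,q}}\) is symmetric in \(p\) and \(q\)), and it takes the bilateral WIV image from the earlier sketch as input. The function name is hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def brtv_weights_x(f, wiv_tilde_x, sigma=2.0, eps=1e-3, eps_g=1e-3):
    """u_x and w_x of Eq. (17) for every pixel, returned as 2-D weight images."""
    grad_x = np.zeros_like(f)
    grad_x[:, :-1] = np.diff(f, axis=1)
    u_x = gaussian_filter(1.0 / (wiv_tilde_x + eps), sigma)  # sum_p k_{p,q} / (tilde L_x(p) + eps)
    w_x = 1.0 / (np.abs(grad_x) + eps_g)
    return u_x, w_x
```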

A similar derivation applies to the y-direction measure in BRTV. Eq. (15) can then be written in matrix form:

$${\left( {{v_f} - {v_{{f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}}}} \right)^T}\left( {{v_f} - {v_{{f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}}}} \right) + {\lambda_1}\left( {v_f^TC_x^T{{\tilde U}_x}{W_x}{C_x}{v_f} + v_f^TC_y^T{{\tilde U}_y}{W_y}{C_y}{v_f}} \right)$$
(18)

where \({v_f}\) and \({v_{{f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}}}\) are the vectorized forms of \(f\) and \({f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}\), respectively. \({C_x}\) and \({C_y}\) are the forward difference operator matrices. \({\tilde U_x}\), \({\tilde U_y}\), \({W_x}\), and \({W_y}\) are diagonal matrices whose entries are \({\tilde u_{xq}}\), \({\tilde u_{yq}}\), \({w_{xq}}\), and \({w_{yq}}\), respectively. The optimization problem of Eq. (18) can be solved by iterating the following linear system:

$$\left[ {I + {\lambda_1}\left( {C_x^T\tilde U_{_x}^tW_{_x}^t{C_x} + C_y^T\tilde U_{_y}^tW_{_y}^t{C_y}} \right)} \right] \cdot v_{_f}^{t + 1} = {v_{{f^{k + {1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-\nulldelimiterspace} 2}}}}},$$
(19)

where \(I\) is the identity matrix and \(t\) indexes the inner BRTV iterations.
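One inner BRTV iteration of Eq. (19) amounts to a sparse linear solve. The sketch below assumes SciPy sparse matrices, an \(n \times n\) image, and the diagonal weights of Eq. (17) already flattened to vectors; the zero-gradient boundary handling and the use of a direct solver are our choices, not prescribed by the paper.

```python
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def forward_diff_ops(n):
    """Sparse forward-difference matrices C_x, C_y for an n x n image (row-major order)."""
    D = (sp.eye(n, k=1) - sp.eye(n)).tolil()
    D[-1, :] = 0                                   # zero gradient at the image border
    D = D.tocsr()
    I = sp.eye(n, format="csr")
    return sp.kron(I, D, format="csr"), sp.kron(D, I, format="csr")

def brtv_inner_solve(v_half, u_x, w_x, u_y, w_y, lam1, n):
    """Solve [I + lam1 (Cx' Ux Wx Cx + Cy' Uy Wy Cy)] v = v^{k+1/2}, Eq. (19)."""
    Cx, Cy = forward_diff_ops(n)
    Ux, Wx, Uy, Wy = sp.diags(u_x), sp.diags(w_x), sp.diags(u_y), sp.diags(w_y)
    M = sp.eye(n * n) + lam1 * (Cx.T @ Ux @ Wx @ Cx + Cy.T @ Uy @ Wy @ Cy)
    return spsolve(M.tocsc(), v_half)
```

Since the system matrix is symmetric positive definite, a conjugate-gradient solver could replace the direct solve for large images.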

The complete workflow of the proposed POCS-BRTV model is summarized in Table 1.

Table 1 LDCT reconstruction by the proposed POCS-BRTV model

Experiments and Analysis

In this section, experiments on digital phantoms are used to demonstrate the performance of the proposed model. All experiments ran on a 3.20 GHz Intel(R) Core(TM) i7-8700 CPU under Windows 10 64-bit, and all reconstruction algorithms were implemented in Matlab 2016b combined with Visual Studio 2015. All reconstructed images are evaluated quantitatively by root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). The size of each phantom is \(256 \times 256\) in all experiments. Projections of these phantoms are acquired with a fan-beam CT geometry, with the detector units arranged in a line. Poisson noise is added to the projection data to simulate LDCT; the level of Poisson noise is negatively correlated with the photon number \({I_0}\), so a large \({I_0}\) corresponds to a higher SNR. We set \({I_0} = 1.0 \times {10^4}\) and \(1.0 \times {10^5}\) for all digital phantom experiments. The geometric scanning parameters used in this work are listed in Table 2.

Table 2 Geometry scanning parameters of the simulated experiments
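A minimal sketch of this low-dose simulation and of the RMSE/PSNR metrics is given below, assuming the common photon-counting model in which the detected counts follow a Poisson distribution with mean \({I_0}\exp(-p)\) for a clean line integral \(p\); the exact noise model used in the paper's experiments may differ in detail, and SSIM (typically computed with an off-the-shelf implementation) is omitted.

```python
import numpy as np

def add_poisson_noise(proj, I0=1.0e4, rng=None):
    """Simulate low-dose projections: counts ~ Poisson(I0 * exp(-p)), then log-transform."""
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.poisson(I0 * np.exp(-proj)).astype(float)
    counts = np.maximum(counts, 1.0)          # avoid log(0) in photon-starved detector bins
    return -np.log(counts / I0)

def rmse(x, ref):
    return np.sqrt(np.mean((x - ref) ** 2))

def psnr(x, ref, data_range=1.0):
    return 10.0 * np.log10(data_range ** 2 / np.mean((x - ref) ** 2))
```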

In all experiments, the quality of the reconstructed images is strongly affected by the parameters, which are selected according to the evaluation indexes and the visual effect. In this work, we fixed the maximum number of reconstruction iterations at \({N_{iterMax}} = 1000\); a stopping criterion based on the error between two successive iterations allows earlier termination. The relaxation parameter \({\alpha_{SART}}\) is empirically fixed at 0.15 and 0.25 for photon numbers \({I_0}\) of \(1.0 \times {10^4}\) and \(1.0 \times {10^5}\), respectively. The regularization parameter \({\lambda_1}\) controls the smoothness of the results: a larger value produces smoother images. Increasing the BRTV iteration number \({N_{BRTV}}\) helps preserve edge sharpness; because BRTV is applied in every reconstruction iteration, setting \({N_{BRTV}}\) to 2 or 3 is sufficient. \(\sigma\) is a spatial parameter that controls the window size used to compute the windowed variations; a large \(\sigma\) corresponds to a bigger window and hence stronger smoothing. A slightly larger \(\varepsilon\) helps preserve smoothly varying structures, whereas a smaller one preserves sharper edges and finer textures. We also found that the appropriate value of \(\varepsilon\) depends on image complexity, with simpler images calling for smaller values. It is worth noting that all parameters of POCS-RTV are set the same as those of POCS-BRTV for a fair comparison.

Shepp-Logan Phantom Experiment

The first simulation experiment uses the Shepp-Logan phantom to simulate the LDCT case. In this experiment, the photon number is set to \({I_0} = 1.0 \times {10^4}\) and \(1.0 \times {10^5}\). The parameters of the POCS-BRTV model for the two cases are: (1) \({\lambda_1} = 0.0007\), \({N_{BRTV}} = 2\), \(\sigma = 0.6\), \(\varepsilon = 0.000001\); (2) \({\lambda_1} = 0.00045\), \({N_{BRTV}} = 2\), \(\sigma = 0.5\), \(\varepsilon = 0.000001\).

In Fig. 2, CT images are reconstructed from the LDCT projection data via SART, POCS-TV, POCS-RTV, and POCS-BRTV. The noise-free phantom is shown on the far left, with an ROI marked by a red rectangle and zoomed in at the corner. The images reconstructed by SART are severely degraded by noise. POCS-TV effectively reduces the noise, but its results are blurred by the staircase effect (indicated by red arrows); adjusting the parameters could mitigate this problem, but details might then be blurred during reconstruction. In the POCS-RTV images, the edges indicated by red arrows are blurry, whereas these artifacts disappear in the POCS-BRTV results, which show clear edges. Our model provides better noise suppression and edge preservation than the other three algorithms.

Fig. 2
figure 2

The tomographic results of the Shepp-Logan phantom and a zoomed-in region reconstructed by different algorithms at different noise levels. The first column is the reference image; the remaining columns are results reconstructed via SART, POCS-TV, POCS-RTV, and POCS-BRTV. Images from top to bottom are the results reconstructed with photon numbers \(1.0 \times {10^4}\) and \(1.0 \times {10^5}\). The display window is [0, 0.5]

Table 3 lists the RMSE, SSIM, and PSNR values corresponding to Fig. 2 for a quantitative comparison of the reconstruction algorithms under different noise levels. The optimal values are shown in bold. According to these indexes, POCS-BRTV clearly outperforms the other three algorithms.

Table 3 Global RMSEs, SSIMs, and PSNRs of reconstructed Shepp-Logan phantom with different photon numbers

FORBILD Head Phantom Experiment

The second experiment is performed on the projection data of the FORBILD head phantom. The parameters of our model for \({I_0} = 1.0 \times {10^4}\) and \(1.0 \times {10^5}\) are: (1) \({\lambda_1} = 0.01\), \({N_{BRTV}} = 3\), \(\sigma = 1\), \(\varepsilon = 0.0002\); (2) \({\lambda_1} = 0.004\), \({N_{BRTV}} = 3\), \(\sigma = 0.9\), \(\varepsilon = 0.00001\).

The restored images from the LDCT data are presented in Fig. 3, with the reference image on the far left and two ROIs marked by red rectangles. The remaining columns show images processed by SART, POCS-TV, POCS-RTV, and POCS-BRTV, respectively. The zoomed-in ROIs corresponding to Fig. 3 are displayed in Fig. 4.

Fig. 3
figure 3

The tomographic results of the FORBILD head phantom reconstructed by different algorithms at different noise levels. The first column is the reference image; the remaining columns are results reconstructed via SART, POCS-TV, POCS-RTV, and POCS-BRTV. Images from top to bottom are the results reconstructed with photon numbers \(1.0 \times {10^4}\) and \(1.0 \times {10^5}\). The display window is [0.5, 1.5]

Fig. 4
figure 4

ROIs of the FORBILD head phantom and the reconstructed results in Fig. 3. The images in (a) are extracted from the reference FORBILD head phantom image. Images in (b) and (c) are extracted from results reconstructed with photon numbers \(1.0 \times {10^4}\) and \(1.0 \times {10^5}\), respectively. From top to bottom, the images are ROI 1 and ROI 2. From left to right in (b) and (c), the images are reconstructed by SART, POCS-TV, POCS-RTV, and POCS-BRTV, respectively. The display window is [0.5, 1.5]

From the zoomed-in ROIs, it can be observed that all SART results are severely disrupted by noise, so that the three small dots in the images cannot be identified by the naked eye. Images reconstructed via POCS-TV show staircase effects and are not smooth enough. POCS-RTV over-smooths the results, and almost all of the three small dots are smoothed away. These problems are alleviated in the POCS-BRTV results. Nevertheless, there is room for improvement: when the noise in the projection data intensifies, two extremely small dots in ROI 1 cannot be preserved even though the POCS-BRTV model significantly suppresses the noise. Table 4 presents the quantitative evaluation indexes corresponding to the results in Fig. 3, with the optimal values in bold. The results show that the POCS-BRTV model performs best in both cases.

Table 4 Global RMSEs, SSIMs, and PSNRs of reconstructed FORBILD head phantom with different photon numbers

In Fig. 5, the horizontal profiles (177th row) of the images reconstructed at the incident X-ray intensity \({I_0} = 1.0 \times {10^4}\) are provided to verify the advantage of POCS-BRTV: it generates the image closest to the reference, especially at edges. From the ROIs in Fig. 5, we can see that POCS-BRTV preserves the sharp edges, while POCS-RTV tends to produce jitter at edges.

Fig. 5
figure 5

Horizontal profiles (177th row) of the reconstructed images shown in Fig. 3 obtained by different algorithms with an incident X-ray intensity of \(1.0 \times {10^4}\)

Discussion and Conclusion

In this paper, we proposed an LDCT reconstruction model based on prior knowledge in the gradient domain. This work extends the idea of Gong et al., who exploited the differences between the partial derivatives generated by noise and by structures and developed POCS-RTV for low-intensity CT reconstruction [32]. POCS-RTV is defined in terms of WIV and WTV in the gradient domain, and structures and noise can be adaptively distinguished by WIV. However, the results of POCS-RTV tend to show blurred edges, especially under high-level noise. This likely results from the WIV weights, which depend only on the distance between two pixels in a local rectangular window; in fact, both spatial proximity and gray-value similarity help characterize the closeness between two pixels. Therefore, we further integrated a similarity measure into the weighting function of WIV and proposed POCS-BRTV for LDCT reconstruction. POCS-BRTV considers both positional closeness and gray-level similarity in the WIV term, which helps preserve more fine details and sharper edges compared with POCS-RTV.

In summary, we propose a bilateral weighted relative total variation-regularized projections onto convex sets (POCS-BRTV) model for LDCT reconstruction, integrating the idea of bilateral filtering with POCS-RTV. Experimental results at different noise levels show that the proposed POCS-BRTV model leads to significant improvements in LDCT image quality and, in particular, better preservation of edges and details during the reconstruction procedure.