Statistical iterative reconstruction using adaptive fractional order regularization

: In order to reduce the radiation dose of the X-ray computed tomography (CT), low-dose CT has drawn much attention in both clinical and industrial fields. A fractional order model based on statistical iterative reconstruction framework was proposed in this study. To further enhance the performance of the proposed model, an adaptive order selection strategy, determining the fractional order pixel-by-pixel, was given. Experiments, including numerical and clinical cases, illustrated better results than several existing methods, especially, in structure and texture preservation.


Introduction
X-ray computed tomography has been widely used for clinical and industrial purposes. However, the radiation risk accompanying with the increasing number of scans also draws a lot of attention [1]. The famous as low as reasonably achievable (ALARA) principle is encouraged to evade excessive radiation dose in the clinical field. Since the imaging quality is approximately proportional to the radiation dose, simply reducing the radiation dose will seriously degrade the signal to noise ratio (SNR) and the clinical values of the images.
The major methods to reduce the radiation dose can be divided into two categories: the first one is to lower the x-ray tube current or shorten the exposure time (mAs) and the second is to reduce the photon number penetrating human body. However, the former will introduce quantum noise into projection data and the later will cause sparse view, limited angle, and interior CT. In this study, we only focused on the first category of methods and the proposed model is natural to extend to other topics.
Mathematically, the procedure of CT image reconstruction can be formulated as a linear inverse problem. Both categories of the methods reducing the radiation dose will make the linear system ill-conditioned and the traditional filtered backprojection (FBP) will be powerless. In the past decades, much endeavors have been made to deal with this problem. The most classical methods include the algebraic reconstruction technique (ART) [2] and expectation maximization (EM) [3]. However, when the sampling data is highly underdetermined, both ART and EM can only achieve better results than FBP and it is still very difficult to meet the standard for clinical requirements.
Except ART and EM, other methods for the contaminated projection data can be classified into two groups in terms of objects to process. The first one is to directly process the projection data. Li et al. proposed to filter the projection data with a penalty function and solve the corresponding nonlinear system with iterated conditional mode [4]. La Rivière used the separable paraboloidal surrogates to solve a new penalized likelihood function which can obtain lower computational cost and better visual effect [5]. Wang et al. modeled the first and second moments of the noise of the projection data with a weighted least squares method and developed a penalized weighted least squares (PWLS) using the space relationship between signals as data fidelity [6]. To overcome the quasi-speckle noise and mosaic effect introduced by PWLS, Tang et al. proposed to decompose the projection data into different scales and cope with them with anisotropic diffusion [7]. Based on the spatial relations and similarity between pixels, Manduca et al. proposed a bilateral filter to process the projection data [8]. Although filter-based methods are computationally efficient and can only suppress the noise to a certain degree, it is difficult to keep the structure details very well and the artifacts introduced by the filters in the projection data will degrade the reconstructed images with unpredictable side effects.
Different from the filter-based methods, the optimization object of the second group is the image pixel. Erdogan and Fessler analyzed the statistical property of the projection data, and formulated a minimization framework including a statistical data fidelity and a regularization terms [9]. Thibault et al. applied this framework into multi-slice spiral CT [10]. Xu et al. used the classical total variation (TV) as regularization term and introduced it into interior tomography and spectral CT [11,12]. Employing the fact that the attenuation coefficient for same material is identical, as prior information, Rashed and Kudo applied the SIR framework into sparse view problem [13]. This kind of methods has been proved to be efficient to suppress the noise and streak artifacts. The crucial part is what prior information is formulated as the regularization term and this issue is related to a popular topicscompressive sensing (CS).
Recently, compressive sensing has drawn a lot of attention in many fields. Different from the classical Nyquist-Shannon sampling theorem, CS proved that if the undersampled signal can be represented sparsely with a sparsifying transform, it can be accurately reconstructed after the corresponding inverse transform [14,15]. The most common sparsifying transform is the discrete gradient transform and TV is the sum of the transform coefficients. Inspired by the CS theory, several methods based on TV were proposed. Sidky et al. coupled TV into projection onto convex sets (POCS) [16], derived TV-POCS for sparse and limited view problems, and extended it to circular cone-beam scan with an adaptive step-size strategy [17]. This adaptive step size strategy was improved by Ritschl et al. and the minimal value of the raw data cost function could be achieved [18]. Although TV is widely used, it has inherent flaws. TV assumes the signal is piecewise smooth, which will make the reconstructed result blocky. Meanwhile, since TV is based on the gradient of single pixel, so it is very difficult to identify the texture and edges. To overcome this problem, with similar idea, Tian et al. and Liu et al. respectively introduced a penalty weight into TV and gave an edge preserving TV (EPTV) [19] and an adaptive-weighted total variation model [20]. Several group proposed different energy model with high order regularization terms [21][22][23]. These energy models can efficiently eliminate the blocky effects, but new artifacts such as speckle-like noise will be introduced. Zhang et al. combined TV with a high order norm to suppress the blocky effects [24].
In the recent decades, a new mathematical tool, fractional calculus, has been applied in image processing. Different from the integer order derivative, the fractional-order derivative based TV has different frequency response characteristics corresponding to different frequency bands and can be seen as a natural inner interpolation between traditional TV and high order TV. Based on this property, fractional order derivative based models have shown their ability to mitigate the blocky effect and preserve more structure and texture information [25][26][27][28][29][30][31][32][33], without introducing other drawbacks such as those using the higher-order methods. More details about application of fractional calculus in signal processing can be referred in [25,34,35]. Pu et al. derived six numerical masks to compute the fractional order derivatives and applied them in image enhancement [25]. Bai and Feng proposed a fractional order anisotropic diffusion model and observed that when the order is 1.8 or 2.2, the best results could be obtained [26]. Zhang and Wei derived fractional bounded variation (FBV) space to restore noisy images with rich texture information [27,28]. According to a texture detection and preservation strategy, Chan et al. presented a spatially adaptive fractional order diffusion model [29]. Zhang's group first introduced fractional calculus into medical imaging. They proposed two energy minimization functions for CT metal artifact reduction [30,31], and gave a fractional model for image reconstruction [32]. With these efforts, fractional calculus demonstrated powerful potential in image processing, especially, texture and structure preservation. With proper order selection, it also can mitigate the side effect caused by integer order methods, like TV and high order PDE [21][22][23]36].
In this study, we proposed an adaptive fractional order selection strategy for fractional order TV and involved it into statistical iterative reconstruction framework. The numerical algorithm was given with two derivative masks. The theoretical details are described in Section 2, and the numerical and clinical experiments were provided in Section 3. Finally, the discussion and conclusion were given in Section 4.

Statistical iterative reconstruction
In traditional CT system, while the x-ray beam scans the target object, part of the photons are absorbed. The remaining part will pass through the target object and be measured by the detectors. The attenuation distribution μ of the object can be calculated according to the Beer-Lambert law: where I is the number of photons detected by a single detector bin along a certain path l and 0 I is the corresponding number of photons in a blank scan. The measured data can be described in an approximate discrete form with the Poisson distribution: 1 Poisson exp , 1, 2,..., , 1, 2,..., , where i Y is the raw measurement along the ith x-ray path with the corresponding blank scan factor i b and read-out noise i r ,

{ }
ij A a = is the system matrix, I and J is the total number of projections and image pixels respectively. For simplicity, we consider the monochromatic source and assume that the statistical distributions along different x-ray paths are independent. Neglecting the constant terms, the Poisson log-likelihood function which is given in [9] can be written as Under the situation of low dose radiation, the reconstruction problem will become illconditioned and how to achieve a stable and reliable solution is still a challenging problem. Maximizing the a posteriori (MAP) is a powerful tool to deal with this problem. Then we have the following MAP function: With the monotonicity of logarithmic function and ignoring the constant term, the solution of Eq. (4) can be obtained by minimizing the following objective function: With ( )=-log ( ) R P μ μ and the second-order Taylor expansion for -( ) L μ , we can have is the statistical weight for the ith x-ray path and ˆi l is an estimate for linear line integral for the ith x-ray path which is defined as ˆl og( / ( ))

The proposed method
In Eq. (6), ( ) R μ can be seen as the prior information for the image. Originated from different image prior information, several methods have been proposed. TV is the most popular one which makes the assumption that the images are piecewise constant [36]. The method based on q-generalized Gaussian Markov field (q-GGMRF) can be seen as a generalized version for TV regularization [37]. All of these methods are local which means the numerical calculation is neighborhood-based. It is well known that this kind of methods cannot efficiently identify the complex structures and noises. The fractional order regularization has been proved powerful to deal with some fractal-like structures, such as blood vessels and superficial sulci and gyri of brain.
Here, we propose an adaptive fractional order TV regularization term and replace ( )

The numerical scheme
There are two key parts for the numerical scheme of the proposed adaptive fractional order TV based SIR (AFTV-SIR) model. The first part is the iterative solution for our model. The second is the discretization of fractional order gradient image α μ ∇ . For simplicity, we use the separable paraboloid surrogate method [9] to obtain the iterative solution for Eq. (7) as the following: will be given next.
Fractional calculus has been studied for hundreds of years. As its most important ingredient, fractional order derivatives can be viewed as a generalization of the integer order ones. The main definitions can be classified into two categories: the frequency domain ones and the time domain ones. Due to the requirement of symmetry by discrete Fourier transform, the image must be extended which will yield a high computational cost. Additionally, spatial adaptivity of the fractional order is one of the contributions of this study, so we only consider the time domain definitions. Thus, similar to [27][28][29][30][31][32][33], Grüwald-Letnikov fractional-order derivative was utilized to deduce the discretization version of Eq. (8).
Given a two-dimensional image denoting the generalized binomial coefficient, and K referring to the number of pixels involved in computing the fractional order partial derivatives.
For simplicity, the computations in Eq. (10) can be considered as convolutions of the image with two shift-invariant kernels discretizing the fractional order partial derivatives. The construction for the convolution kernels can refer our previous works [32]. Figure 1 shows the concrete forms of the kernels.

Selection of α
In the proposed model, the selection of the fractional order α in image processing field is an inevitable problem. Manually choosing α is a common approach in most fractional models [26,27,[30][31][32][33]. But due to the varied distribution of image features, it is difficult to reach the optimums of a globally uniform α with this non-intelligent strategy for every targets. It is well known that TV ( 1 α = ) will cause staircase effect and erase some important textures.

Overall workflow
The overall flowchart for out model is presented in Table 1.

Experimental results
In this section, numerical phantom and patient data studies were performed to validate the proposed method. To further evaluate the performance of the proposed AFTV-SIR, filtered back projection (FBP), TV-SIR [11] and EPTV [19] were compared. For FBP, Ram-Lak filter was used. The parameters in TV-SIR and EPTV were set according to the suggestions in the original references [11,19]. In this study, ite N was set to 1000. λ was set to 0.3 in numerical experiment and 0.2 in patient experiment respectively. The iterative stop threshold τ was set to 0.01. The size of kernels for fractional derivatives was set to 7. The size of the normalized symmetric Gaussian kernel G S was tested in a reasonable range from 3 to 17 and G S was manually set to 9 for best reconstruction results. According to the recommendation in [38,39], the size of the local window W S should be set to =4 1=37 W G S S + . It is reasonable that the optimal choice of G S is target-dependent. How to adaptively choose G S according to the image contents is out of the scope of this study and will be our future work. All the experiments were tested on MATLAB 2012b with a PC (Intel i7 4770k CPU and 8 GB RAM).
Four metrics were used in this paper for quantitative analyses. The first metric is the root mean square error (RMSE), which is defined as The correlation coefficient (CC) is defined as where μ and * μ are the mean values of μ and * μ .
The structural similarity index (SSIM) was used to measure the structure similarity, which has been proved to be coherent to visual perception: where μμ σ * is the covariance of μ and * μ , and 1 c and 2 c are constants that we set according to [41].

Numerical phantom study
In the numerical phantom study section, the FORBILD abdomen phantom [42] which has 256 × 256 pixels, was used to test the proposed method. Siddon's ray-driven algorithm [43] was used to simulate fan-beam geometry. The source-to-rotation center distance is 40 cm and the detector-to-rotation center is 40 cm. The image array is 20 cm × 20 cm. The detector which is 41.3 cm in length is modeled as a straight-line array of 512 detector bins. To demonstrate the performance of our method, the abdomen phantom was uniformly sampled with 300 views over 360°. After that, Poisson noise was added to each detector with photon number 5

10
i b = × . The reconstruction results are shown in Fig. 2. It can be observed that the result from FBP contained much noise which blurred the structure information heavily, especially in the low contrast regions. TV-SIR, EPTV and AFTV-SIR suppressed most noise. However, in the zoomed low contrast and complex structure regions, which are labelled by green dotted rectangles, TV-SIR suffered from obvious blurring effect. EPTV and AFTV-SIR both obtained satisfactory resolution in the high contrast regions, but AFTV-SIR had better visual effect in the low contrast regions.
To quantitatively evaluate the performance of the proposed method, we compared the performance of the four algorithms on the reconstruction of ROIs with detailed structures, which were marked by red dotted rectangles in Fig. 2(a). The quantitative results are given in Fig. 3. It can be seen that the AFTV-SIR obtained the best results: it has the lowest RMSE and highest PSNR/CC/SSIM for all of the ROIs. To further demonstrate the difference among the four reconstruction algorithms, two horizontal profiles of the reconstructed images were plotted across the line marked as target profile 1 and 2 in Fig. 2(a), which are respectively located in the low contrast and complex structure regions. The results further showed the advantage of three iterative algorithms over the FBP on noise suppression, as well as the advantage of the AFTV-SIR over the EPTV and the TV-SIR on edge/contrast preservation at the matched noise level (see Figs. 4 and 5).

Patient data study
In this study, we used CT images collected on a GE Discovery CT 750 HD Scanner. We specified a representative slice from the image volume to evaluate our methodology. The image is of 256 × 256 pixels. The tube voltage was set to be 120 kVp, and the mAs level was 100 mAs. We regarded this acquisition as the normal-dose scan, and for the consideration of radiation dose, we simulated the corresponding low-dose projection data by adding noise to the normal-dose projection data using the same method in the previous subsection. The photon number was set as 5

10
. The projection geometry was the same as that of the numerical phantom study part. To demonstrate the capability of the proposed method for low dose reconstruction, 1160 views were collected, and they were uniformly distributed over a full scan range. The reconstruction results were shown in Fig. 6, where (a) and (b) are the FBP reconstructed images from the acquired normal-dose and simulated low-dose projection data, respectively. Figure 6(c), 6(d), 6(e) are the reconstructed images from the simulated low-dose projection by the TV-SIR, the EPTV, and the AFTV-SIR algorithms. It can be observed that TV-SIR, EPTV and AFTV-SIR eliminated most of the noise. However, blocky effect can be seen in Fig. 6(c) and some tiny structures were blurred. In the zoomed parts labeled by green dotted rectangles, AFTV-SIR preserved more details than TV-SIR and EPTV and efficiently suppressed the blocky effect, indicated by the red arrows.
To quantitatively evaluate the performance of the proposed method we compared the performance of the four algorithms on the reconstruction of ROIs with detailed structures, which were marked by red dotted rectangles in Fig. 6(a). The quantitative results (Fig. 7) have the similar trend as the phantom study. The AFTV-SIR had the lowest RMSE and the highest PSNR/CC/SSIM for all of the ROIs.  Also two vertical profiles were chosen from two different regions, which represented high contrast structure (bone) and low contrast structure (vessel) respectively. The results plotted in Figs. 8 and 9 obviously showed that iteration-based methods could achieve better results on noise suppression, but TV and EPTV could also smooth structures in the low frequency   regions which is labelled by red arrows. AFTV-SIR obtained better matching performance on both structure and contrast preservation.

Discussions and conclusion
SIR framework has been proved to be a powerful tool to deal with quantum noise caused by lowering the x-ray tube current or shortening the exposure time. The selection of the regularization term is flexible and will prominently impact the performance of noise suppression. TV is used as the most common regularization term, which has shown its ability to eliminate the noise. However, due to the congenital defect of mathematical assumption, TV suffers from blocky effect, which means small details, including edges and structures are smoothed. And these details in medical images might indicate some primary nidi, which would have important clinical value. In this study, a generalized version of TV called as fractional order TV was coupled into the SIR framework to overcome the blocky effect. In addition, an adaptive strategy for order selection was given to further improve the performance of this method. In both numerical and clinical cases, our method achieved better results than TV-SIR and its variant EPTV on detail and contrast preservation. The essence of the proposed method is the fractional order TV term. It was shown that, by adjusting the fractional order, the fractional order based TV model could efficiently eliminate the blocky effect and preserve more details. We measured the structure characteristics of each pixel with a statistical variable defined as local variance and introduced it into the computation of the fractional order. When the region belonged to cartoon, the order was small. Once the regions have more details, the order rose.
The selection of the parameter λ is a vital factor to impact the performance of the regularization based method. Due to the case dependence of λ , we only chose λ manually to balance the tradeoff between regularization and fidelity terms to obtain the best imaging quality. There has been some methods to choose λ according to the noise property in image domain [38,39]. However, the noise in CT images only obey some certain distribution in projection domain, not in image domain. In future, we would study a parameter selection method to ensure an approximate optimal solution for the reconstruction.
Another issue we must mention here is the computational cost. Iterative reconstruction methods suffer from heavy computational burden for the loops of re-projection and backprojection operations between image and projection domains. The proposed AFTV-SIR has a same computational complexity as TV-SIR. The main expense is blamed on the convolution operation with the proposed kernels of fractional order derivative. From a mathematical point of view, decreasing the size of the mask would be a possible way to reduce the computational cost, but the accuracy of fractional order derivative would be sacrificed. In practice, the extra computational expense can be alleviated by several methods. One possible software approach is ordered subset method, which can be recognized as a parallelization of computational operations. Another popular way is to code based on the graphics processing unit (GPU). Many studies have shown the possibility of GPU to dramatically accelerate the iterative procedure [44,45]. With these powerful tools, iterationbased reconstruction methods will be very close to clinical applications.
In this study, our effort focused on the noise suppression for low-dose CT. The proposed regularization item had power to be involved into other energy minimization function based imaging topics. In the future, we can extend the proposed method to other topics in medical imaging, including sparse view reconstruction, interior reconstruction, limited view reconstruction, metal artifact reduction, etc.