Non-linear regularized phase retrieval for unidirectional X-ray differential phase contrast radiography

: Phase retrieval from unidirectional radiographic differential phase contrast images requires integration of noisy data. A method is presented, which aims to suppress stripe artifacts arising from direct image integration. It is purely algorithmic and therefore, compared to alternative approaches, neither additional alignment nor an increased scan time is required. We report on the theory of this method and present results using numerical as well as experimental data. The method shows signiﬁcant improvements on the phase retrieval accuracy and enhances contrast in the phase image. Due to its general applicability, the proposed method provides a valuable tool for various 2D imaging applications using differential data.


Introduction
X-ray radiography is a standard method of imaging in a variety of applications including medical diagnostics, biological in-situ studies, and materials science. The interaction of X-rays with matter can be described by the complex index of refraction n = 1 − δ + iβ , where δ characterizes the phase shift and β the attenuation properties of the material. In the diagnostic energy range of X-rays (10 − 100 keV), δ is typically three orders of magnitude larger than β . The consequently high interaction cross section makes the phase shift measurement a favorable imaging modality. It has been demonstrated that phase images can provide higher contrast than absorption-based images and contain complementary information about the sample [1,2].
In the past, a variety of phase contrast techniques have been reported. They can be divided into propagation based methods [3][4][5], interferometric methods [6,7] and crystal analyzer based methods [8][9][10][11][12]. Among these methods, phase contrast is provided either by the acquisition of the projected phase profile (φ ), its first spatial derivative (∇φ ) or its second derivative (∇ 2 φ ). Many of the reported methods require highly coherent X-rays, only available at synchrotron sources. A recently developed technique based on grating interferometry has established itself as a suitable technique for phase contrast imaging on synchrotron sources [13,14] as well as on conventional X-ray tubes [15,16]. The setup consists of an X-ray source, a grating interferometer of two gratings and an X-ray detector. This technique allows the simultaneous measurement of an absorption, a differential phase and a darkfield signal [17]. Due to the acquisition of the differential phase (∇φ ), the method is also referred to as differential phase contrast (DPC) imaging. Grating interferometry has mainly been demonstrated in unidirectional mode, where the gratings have a periodic line pattern and the phase gradient can only be measured in the perpendicular direction to these lines. In bidirectional grating interferometry, "checkerboard" and "mesh" like grating patterns are used, and the DPC acquisition can be extended to both directions [18].
Compared to conventional absorption imaging, the retrieval of the phase from the raw data is in general not trivial. Paganin et al. derived a general framework for the phase retrieval in coherent imaging systems [19]. Depending on the measured representation of the phase signal (φ , ∇ϕ, ∇ 2 ϕ), further post processing (integration filter) is necessary to retrieve the quantitative phase signal φ . Phase retrieval from the second derivative has mainly been discussed in association with propagation based methods [20,21]. A huge variety of methods exist for differential phase techniques, where the phase is available in the first derivative. A bidirectional phase gradient acquisition has been performed on a scanning transmission X-ray microscope (STXM) using an annular quadrant detector [22], and a complex filter [23] was used for the quantitative phase retrieval. Also using an STXM, Hornberger et al. [24,25] quantitatively retrieved the phase by deconvolving the contrast transfer function. Further, Thibault et al. [26] developed a method called scanning diffraction X-ray microscopy (SDXM), which is a combination of STXM and coherent diffractive imaging (CDI) and allows the reconstruction of the complex transfer function of an object by using an iterative algorithm.
For unidirectional grating interferometry, the phase is typically retrieved by a onedimensional, direct integration. An integration in the image domain is equivalent to dividing by spatial frequency in the Fourier domain, which acts as low-pass filter. Therefore, along the direction of integration, the image undergoes a smoothing operation. However, in the perpendicular direction, high frequencies are unfiltered and amplify the noise. This leads to severe stripe artifacts in the image along the direction of integration. The resulting poor quantitative phase recovery is characterized by a low contrast-to-noise ratio (CNR) in the phase image, often impeding the subsequent analysis of the radiograph.
Kottler et al. also discussed this issue and proposed a bidirectional scanning approach [27] with unidirectional gratings. Two scans are performed, one with horizontally-, the other with vertically-oriented gratings. This yields a two-dimensional phase gradient of the sample and the phase can be retrieved with a complex filter. The method has shown to be well-suited for reducing stripe artifacts. However, it requires further alignment of the gratings or the sample and likewise increases scan time and dose by a factor of two. Stripe removal by acquiring the bidirectional phase gradient has further been demonstrated with a two-dimensional grating interferometer [18]. Similarly, this approach requires a two-dimensional stepping scheme and thus quadratically increases scan-time and dose compared to the unidirectional case. For diffraction enhanced imaging (DEI) and multiple image radiography (MIR), a linear least squares approach (a) for the reduction of stripe artifacts has been reported [28].
Here, an iterative method for solving the problem of stripe artifacts from direct integration in unidirectional DPC images is proposed. The method is based on non-linear constrained optimization and is purely algorithmic. It works on the raw, noisy DPC measurements and thus does not need longer exposure times. Apart from an initial grating alignment, no further adjustments of the setup hardware or the sample are required. The stripes from the integration can be suppressed and the visibility of image details is enhanced. The performance of the method is evaluated by calculating the root mean squared error (RMSE) of the images compared to a simulated ground-truth image for phantom data and the image CNR using numerical and experimental datasets. The significant increase in CNR and the improved quantitative phase recovery makes this method a valuable tool in differential phase contrast radiography. Figure 1 shows a typical grating interferometer setup for parallel beam geometry. The first grating of the interferometer is a phase grating, acting as a beam splitter by introducing a periodic phase shift of zero or π to the wavefront. This generates an interference pattern downstream, with maximum intensity differences at fractional Talbot distances, given by d m = mg 2 1 /8λ , where m is an odd integer, g 1 the period of the phase grating and λ the wavelength [29].

Experimental setup and phase retrieval
A sample positioned in front of the phase grating introduces a phase shift in the wavefront, generating a phase projection profile φ (x, y), which is given by the line integral In this equation, λ is the wavelength and δ (x, y, z) is the refractive index decrement, associated to the real part of the complex index of refraction n = 1 − δ + iβ . The partial derivative of the phase profile with respect to x is proportional to the beam refraction angle α [29]: Beam refraction causes a shift of the interference pattern in x-direction (see Fig. 1). This fringe shift, measured in radians with respect to the fringe period g 2 , corresponds to the DPC signal and is given by Here, d is the propagation distance downstream of the phase grating and Eq. (2) was used to obtain the right part of Eq. (3). The fringe period g 2 is typically a few microns and therefore, the interference pattern cannot generally be resolved sufficiently by the X-ray detector. For the determination of ϕ, an absorption grating, having the same period as the fringes (g 2 ) and acting as an analyzer mask, is positioned at a fractional Talbot order distance. The acquisition protocol consists in a so called "phase stepping scan", moving the absorption grating in equidistant steps in x direction and acquiring an image at each position [29]. This is done in a reference (without sample) and an object phase stepping scan ( Fig. 1(b)), respectively, to account for an inhomogeneous detector illumination. From the phase stepping curves (PSC), available in each pixel, the absorption (a), the differential phase (ϕ), and a scattering signal (v) can be calculated by a Fourier analysis: where a i is the magnitude and ϕ i is the phase of the i-th Fourier component of the PSC. Signals from reference and object scan are labeled by "obj" and "ref", respectively. The scattering signal v corresponds to the reduced fringe visibility V obj /V ref .
In this paper, we focus on the recovery of the phase signal from the DPC measurement ϕ. According to Eq. (3), the retrieval of the phase image φ (x, y) requires a one-dimensional integration in x-direction, given by Due to noise in the DPC image, this integration accumulates the errors and therefore the noise variance. The typical result are strong stripe artifacts, reducing image contrast and impeding an exact phase retrieval. Figure 2 shows a noisy DPC image generated from the modified Shepp-Logan phantom [30] and the phase retrieval obtained from an integration in the horizontal direction. Prior to the integration step, the mean value was removed in every line of the DPC image to enforce a cumulative sum of zero. However, this simple correction approach could not remove strong horizontal stripe artifacts which still appear across the image.

Measurement and noise model
The method described in the next section is an algorithmic approach for the suppression of the aforementioned stripe artifacts arising from direct integration. It is based on constrained optimization and thus requires a measurement and a noise model, respectively. The DPC measurement model is a standard linear regression model, which can be written in operator notation, D x is a forward operator which models the relation between φ and ϕ in absence of noise and w is a random image modelling the noise in measured data. The noise standard deviation derived by Engel et al.
[31] (Eq. 25) is given by where V is the visibility and N is the number of photons in the measurement. Equation (6) is the discretization of Eq. (3), whereas the constant of proportionality ( g 2 λ d ) has been omitted for the sake of simplicity. D x models the first derivative and can be implemented as a finite differences transform operator, given by where i and j are now discrete coordinates (pixel coordinates) and N i is the image size in xdirection.

Regularized image integration method
A direct inversion of Eq. (6) by integrating the noisy measurement ϕ causes the typical horizontal stripe artifacts. The phase retrieval can be improved by solving a constrained optimization problem. While maintaining consistency with the measured DPC data, the solution is retrieved by minimizing a cost function. The constrained optimization problem for the above model can be written as Essentially, this optimization problem seeks for the minimal p norm of D y f for any f being consistent with the measurement data ϕ in a least squares sense. W is a weighting operator, multiplying each pixel with its inverse noise standard deviation (1/σ ) to account for the noise model. This approach is also known under the name penalized weighted least squares (PWLS) [32]. D x and D y are finite differences transforms in the x-and y-direction, respectively, f is the solution of the optimization problem and ε is a boundary for the noise. The p norm of an image is defined as For instance, the most often used 2 norm corresponds to the Euclidean norm. In this case, problem (9) is linear and an explicit inversion for f exists. Here, we focus on the use of the 1 norm, which is the sum of the magnitude of all pixel values. The 1 norm minimization of the finite differences transform is well-known in denoising applications and often referred to as total variation denoising (two-dimensional finite differences) [33]. The 1 norm minimization typically shows strong edge preserving characteristics and causes less blurring than the 2 norm. The better performance comes at the expense of a non-linear optimization problem, having no explicit inversion formula.
For the solution of (9) with p = 1, the problem is first reformulated to an unconstrained form: The data consistency constraint and the optimization function are now expressed in a single objective function F( f ). A Lagrange multiplier λ (regularization parameter) is used for weighting the two terms. The parameter ε from Eq. (9) is now implicitly chosen by adjusting λ . This parameter is strongly data-dependent and in general difficult to determine. Here, it will be determined by evaluating well-known image quality metrics. Equation (11) can further be generalized to hold multiple regularization terms. The problem formulation is then given by In principle, this allows for any number of regularization terms to be added, which is particularly useful if more a-priori knowledge about the sample is available. Minimum norm constraints with any norm number p and for any linear transform T f can be added. An example for regularized integration using two regularization terms will be given in the next section.
For the numerical solution of the unconstrained optimization problem, a non-linear Conjugate Gradient (NLCG) algorithm [34] has been used. NLCG is characterized by a fast convergence for the inversion of large-scale linear systems. Moreover, this algorithm can handle both, the linear (p = 2) and the non-linear (p = 1) case and thus provides a convenient way for a comparison. The implementation was done in Matlab (MathWorks, Version R2010a) and run on a 64-bit Linux machine (Intel Xeon 2.83GHz processor).

Numerical data
Numerical phantom data have been used for the theoretical assessment of the method. The data were generated according to Eq. (6), where φ was the known (ground-truth) image, i.e. the Shepp-Logan phantom, representing the line integral of δ (real part of the refractive index) through an object. Noise has been generated by using the previously discussed model for DPC measurements (Eq. (7)). The calculation of a noise standard deviation map required the simulation of a transmission and a visibility image. For the transmission image, we assumed a constant ratio δ /β = 10 (β is the imaginary part of the refractive index), yielding the transmission image directly from the known phase image. For the visibility image, a high scattering signal at strong edges of the object was assumed and small angle scattering was neglected. Therefore, an edge enhanced image was generated by applying a discrete two-dimensional Laplace filter to the phantom image, from which the visibility map was determined. Figure 3 shows the simulated transmission and visibility image and the combined noise standard deviation map. Using the noise standard deviation map, noisy DPC data, as shown in Fig. 2(b), was generated as input for the regularized image integration. In the following, regularized and direct integration are evaluated in terms of well known image metrics, including root mean squared error (RMSE) compared to the ground-truth image, contrast-to-noise ratio (CNR) and the noise pattern standard deviation. The ground-truth image is defined as the radiographic phase image of the sample in absence of errors (e.g., noise, material imperfections or measurement errors). RMSE and CNR are computed as a function of the regularization parameter λ . First, this allows a direct comparison of regularized and conventional direct integration and second, it provides support for the choice of an optimal λ , although this is of course dependent on the chosen metric. In order to compare the the nonlinear 1 and the linear 2 norm minimization, the simulations are performed for p ∈ {1, 2}. For all simulations, the same convergence criterion in the NLCG algorithm is applied. Figure 4(a) shows the phase image obtained by direct integration and Fig. 4(b) displays the optimal result of the regularized integration in terms of the RMSE. The RSME of two images f 1 and f 2 is defined as where N i and N j are the horizontal and vertical image sizes, respectively. Figure 4(e) displays the RMSE as a function of the regularization parameter λ and with p ∈ {1, 2}, for regularized integration (RI), and the RMSE for direct integration (DI). For all values of λ within the investigated range, a smaller RMSE could be measured for regularized integration. The minimum was achieved with the 1 norm minimization at λ RMSE = 5 · 10 −3 , yielding the optimal solution (displayed in Fig. 4(b)). Regularized image integration changes the noise pattern of the images. Apparently, direct integration introduces a horizontally oriented stripe pattern, as shown in Fig. 4(c). After regularization, the pattern is stripe-free and more homogeneous (Fig. 4(d)), although a few new artifacts, mainly at strong horizontal edges of the sample (top and bottom of phantom skull), have been introduced. Artifacts at horizontal edges appear due to the low DPC signal at these locations, which is a result of the unidirectional sensitivity (horizontal direction) of the grating interferometer. The standard deviation of the noise patterns, indicated by σ in Fig. 4(c) and (d), is smaller for the regularized integration, confirming the improved image quality. The RMSE metric requires the knowledge of a ground-truth image for the computation, which is in practice unknown. Therefore, the method has further been evaluated using the image CNR, which represents a well-known metric for the quantification of the image contrast and is independent of ground-truth data. CNR is defined as where S and σ are the mean value and the standard deviation, respectively, within a regionof-interest (ROI) of the object (obj) and the background (bg). Figure 4(f) shows the CNR as a function of λ , calculated by using the object and background regions marked with the red boxes in Fig. 4(a). The 1 norm minimization again performed better than the 2 norm and the optimal regularization parameter is λ CNR = 4 · 10 −2 , which is almost an order of magnitude larger than the optimal value in terms of RMSE. This is mainly because high regularization parameters significantly suppress noise, resulting in a higher CNR. On the other hand, a high λ can smooth the image and reduce edged sharpness. For this reason, optimizing the CNR is not always the best approach and may lead to an overestimation of λ .
The previous results showed that regularized image integration outperformed direct integra-tion of differential phase contrast data in terms of RMSE, CNR and noise variance. The vertical and horizontal line profiles through the image of Fig. 4(b), as shown in Fig. 4(g) and (h), further illustrate how well the vertical variations could be suppressed and that the regularized image is in significantly higher accordance with the ground-truth image. Moreover, the non-linear 1 norm minimization performed clearly better than the 2 norm for both metrics. For this reason, in the examples of the following sections, only the 1 norm minimization will be used for the regularized phase retrieval.

Experimental data
The method has further been tested with experimental DPC data. In order to investigate the quantitative robustness of the stripe removal method also for experimental data, a phantom made of materials with known indices of refraction was fabricated and imaged. The phantom has been acquired with the DPC setup of the TOMCAT beamline at the Swiss Light Source [35]. The X-ray source is a 2.9T superbend bending magnet. A double crystal multilayer monochromator delivers monochromatic radiation. The photon energy was set to 25 keV and the grating interferometer was designed for an inter-grating distance of the 3 rd fractional Talbot order. The source-to-phase grating distance was 25 m, the inter-grating distance was d = 120 mm and the grating periods were g 1 = 3.98 μm and g 2 = 2.0 μm. The materials of the absorption and phase grating were gold and silicon, respectively [36]. The total exposure time was 1.6 seconds, using 8 phase steps. The phantom consists of three concentric rods, which are surrounded by a circular wall. The wall is made of polymethylmetharcylate (PMMA) and the materials for the rods are polypropylen (PP), polystyrene (PS) and polyoxymethylene (POM), respectively. The space between the rods and the wall was filled with de-ionized water. The complex refractive index n = 1 − δ + iβ at 25 keV of each material was known (available at http://sergey.gmca.aps.anl.gov), allowing for a calculation of the ground-truth image (calculated analytically by assuming a known sample profile) . Figure 5(a)-(d) shows the DPC, integrated, regularized and ground-truth image, respectively, of a projection of the phantom and Fig. 5(e) shows the normalized horizontal profiles after summing up the image rows in vertical direction. In this example, an additional image constraint was applied to demonstrate the possibilities of the generalized problem formulation introduced in the previous section. The additional constraint is based on a-priori knowledge about the image boundaries, assuming a background signal of zero. This assumption holds because the DPC image is flat-field corrected, essentially meaning that in a background pixel, the phase signals of the object and the reference scan should be the same. The optimization problem, consisting now of two regularization terms, is defined as: In the second regularization term, the matrix M operates as a binary mask for the pixels in the image f at the left and right edges. For the norm parameter, p = 2 was chosen, because this constrains the image to have low energy at the edge pixels. While the first regularization parameter λ 1 was varied, the second regularization parameter was kept constant at λ 2 = 10 −2 .
Since the second regularization term only constrains parts of the image, the phase retrieval is not very sensitive to λ 2 , making an exact determination of an optimum for this parameter unnecessary. There is a high certainty that a background signal of zero is a true assumption, therefore the value should be kept high. On the other hand, in order to avoid a too strong influence of this term to the result, λ 2 should at the same time be kept small. The choice of λ 2 = 10 −2 is purely empirical and does not represent a unique optimum.  Fig. 5(e), the quantitative phase recovery could be improved over the whole investigated range of the regularization parameter, although a complete agreement to the calculated ground-truth was not achieved (Fig. 5(f)). However, it is important to notice that the result of Fig. 5(f) must be taken with care due to the following important facts: In this example, RI was compared with DI after a vertical summation of the images, not by looking at line profiles (as it was done in the simulations). This summation also smoothes out the stripe artifacts in DI and therefore, the DI line in Fig. 5(f) would be expected to highly match the ground-truth line, which is not the case. This inconsistency might result from several random effects, as for example inhomogeneities in the material, inexact numbers for the material densities and the refractive indeces or inconsistencies of the DPC measurements resulting from grating drift or grating imperfections from the fabrication process. Due to these facts, it is impossible to state that RI was indeed quantitatively more accurate than DI, simply because the ground-truth is actually unknown. On the other hand, due to the removal of the stripe artifacts, it is very likely that a line profile from RI is quantitatively more accurate than a line profile (no vertical summation) of DI. Figure 6(a) shows a DPC image of a mouse (scanned within a sample holder). The scan was taken on a grating interferometer setup using an X-ray tube source, operating at 40 kV, 25 mA. The pixel size of the detector was 50 × 50 μm 2 . The anode material of the tube is tungsten and the spotsize is approximately 0.5 mm. Due to the low spatial coherence of the beam, a source grating was used [15]. 16 phase steps were acquired and the total exposure time was 120 seconds. The full body of the mouse was scanned by stitching multiple scans together. The total size of this image is 2360 × 1200 pixels. The standard problem formulation of Eq. (11) with p = 1 was used to retrieve the regularized phase image. According to a CNR analysis, the optimal regularization parameter was λ CNR = 4 · 10 −3 . The integrated and regularized images are shown in Fig. 6(b) and 6(c), respectively. Significant improvements compared to direct integration can be observed in the images. Features that are hidden in the directly-integrated image become visible in the regularized integration (see ribs in zoomed area). Due to the large image size, the algorithm took 450 iterations with a total run time of approximately 11 minutes.

Discussion
Non-linear, 1 regularized image integration can significantly improve the quality of phase images obtained from noisy, unidirectional DPC measurements without increasing the radiation dose. Longer exposure times, additional phase steps, or bidirectional measurements can be avoided. The method outperformed direct integration in the given examples in terms of RMSE compared to ground-truth data, noise pattern and CNR. The improved RMSE promises more accurate results for a quantitative data analysis, while a higher CNR will facilitate data post processing as for instance image segmentation.
The application of a regularization technique to a problem, which is in principle well-posed, might seem a bit unusual at a first glance. However, since the integration of differential data is highly sensitive to noise and generates strong image artifacts, regularization provides a convenient way to specify data fidelity. This is done using the regularization parameter, λ , being the key parameter of the method. It determines the weighting between the data consistency and the optimization function and can be interpreted as a tuning parameter for the resulting image quality. If λ is chosen too high, overregularization occurs, generating new image artifacts. A typical overregularization effect when using the 1 norm minimization of the finite differences transform is the generation of piecewise constant areas in the image, looking similar to median filtering. If λ is below its optimum, the regularization term is too weak and the stripe artifacts will not vanish. Naturally, λ is data dependent and the optimum is not always trivial to determine. The evaluation of image metrics (CNR, RMSE) for the optimization of λ is one possible approach amongst others. Another well known method is the L-curve analysis, where the data constraint term and the minimization terms are evaluated after each iteration [37]. Different methods usually yield different optima, and therefore, an objective optimum does usually not exist. CNR and RMSE turned out to be convenient metrics for the image evaluation and the results showed, that the regularized images have been superior to the directly integrated images over a wide range of λ . Fortunately, this relaxes the requirement to exactly find an optimum value for λ . Eventually, finding the optimum solution will remain a task dependent issue.
Another important parameter is the norm parameter p. It has been demonstrated, that for the minimization of the finite differences transform, the 1 norm performed better than the 2 norm in terms of the evaluated metrics. The 1 norm has also been subject to many investigations in the field of tomographic image reconstruction, where an image is assumed to have a sparse representation in a linear transform domain. The minimization of the 1 norm in such a domain enforces sparsity and may improve image reconstruction. This concept is also consistent to our problem, since the finite difference transform in the y-direction is expected to yield a sparse image representation. The benefit of minimizing the 1 norm instead of the 2 norm comes at the expense of a non-linear problem formulation. For the finite difference transform, the 1 norm has been shown to perform clearly better, however for other transform domains, the 2 norm may be a better choice.
The possibility of introducing theoretically any number of regularization terms improves the flexibility of the method to handle many kinds of samples. In particular sample specific constraints, which should be added if a-priori knowledge about the sample is available, are expected to further improve the phase retrieval. For instance, if the sample is known to consist mainly of piecewise constant parts, a total variation minimization would be appropriate. Another example, which is less sample specific but also often applicable, would be an "isolated specimen" constraint, meaning that there is a zero background signal all around the sample. Essentially, this would be an generalization of the proposed zero-background constraint, which assumed the signal to be zero only at the left and right edges. In any case, the potential improvement always comes at the expense of an additional free regularization parameter and an increased computational complexity. Additional regularization terms are certainly an important subject for future investigations.
The iterative NLCG algorithm has proven to be a suitable method for solving the non-linear optimization problem, even though the processing time can rapidly increase for very large image sizes. Code optimization, which has not yet been performed, should lead to a significant decrease in computational time.
It is important to point out that non-linear, regularized image integration provides a general signal recovery method for unidirectional differential data. In this work, it has only been applied to data obtained from grating interferometry, although other techniques as for example crystal analyzer based methods would likewise benefit from our approach.
We strongly believe that DPC radiography is a valuable imaging method for various applications in radiology. In order to effectively exploit the additional information obtained from this technique, the proposed method provides an important tool for obtaining the quantitative phase projection. This is crucial for improving the analysis of structures in the examined materials (e.g., in medical images).