A first investigation of accuracy, precision and sensitivity of phase-based x-ray dark-field imaging

In the last two decades, x-ray phase contrast imaging (XPCI) has attracted attention as a potentially significant improvement over widespread and established x-ray imaging. The key is its capability to access a new physical quantity (the ‘phase shift’), which can be complementary to x-ray absorption. One additional advantage of XPCI is its sensitivity to micro structural details through the refraction induced dark-field (DF). While DF is extensively mentioned and used for several applications, predicting the capability of an XPCI system to retrieve DF quantitatively is not straightforward. In this article, we evaluate the impact of different design options and algorithms on DF retrieval for the edge-illumination (EI) XPCI technique. Monte Carlo simulations, supported by experimental data, are used to measure the accuracy, precision and sensitivity of DF retrieval performed with several EI systems based on conventional x-ray sources. The introduced tools are easy to implement, and general enough to assess the DF performance of systems based on alternative (i.e. non-EI) XPCI approaches.


Introduction
X-ray phase contrast imaging (XPCI) is an emerging technique which exploits the x-rays phase changes occurring inside an object. XPCI was first demonstrated using the x-ray interferometer proposed by Bonse and Hart in 1965 [1]. Considering the complex refractive index n = 1 − δ + iβ, the phase effects are described by the real part δ while conventional attenuation is described by β. Being two distinct quantities, β and δ provide different information on the sample. The ratio / δ β is greater than 1000 for low atomic number (Z < 10) elements at x-ray energies between 20 and 150 keV, and greater than 100 for higher Z materials such as copper and iron at energies above 40 keV. This translates into a stronger detected signal if the imaging system is made sensitive to phase shifts, which are invisible to conventional x-ray imaging equipment. Moreover, β decreases more rapidly than δ with increasing x-ray energy, which suggests that high contrast could be still achieved at high energy, where the x-ray dose is potentially lower. Increased investigations of XPCI coincided with the advent of '3rd generation' synchrotron radiation (SR) facilities [2,3]. SR is a partially (both spatially and temporally) coherent, intense x-ray source with characteristics that can be considered ideal for XPCI. SR research enabled the development of S Online supplementary data available from stacks.iop.org/JPhysD/49/485501/mmedia (Some figures may appear in colour only in the online journal) Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. a series of new XPCI techniques, of which what follows is a brief partial list. Analyser based imaging (ABI) uses a perfect crystal between the sample and the detector to 'analyse' the deviated x-rays [4]. Edge-illumination (EI) exploits a small beam aligned with the edge of the detector pixel, again providing sensitivity to small angular deviations of x-rays [5]. Grating interferometry (GI) is based on the Talbot effect induced by a phase grating, which is distorted by the sample and analysed through an additional absorption grating [6,7]. Phase-propagation imaging (PPI) relies on the Fresnel fringes generated through appropriate beam propagation between the sample and the detector [8][9][10], which allows retrieving the phase information under certain conditions [11]. Those techniques have different advantages and disadvantages and, although PPI is often considered the simplest choice when SR is available [12,13], the situation is different when laboratory sources are used. This is quite often the case since SR facilities are expensive to build and maintain-indeed there are only approximately 50 of them in the world. While SR is often essential to develop new technical solutions, a widespread use of XPCI will have to rely on conventional, more affordable x-ray sources. The XPCI techniques which have shown the most promising results in this sense are GI [14] and EI [15]. Both these techniques have been successfully applied in different fields. The most widely explored application so far is probably mammography, where the phase signal, exploited alongside the conventional absorption one, yields important diagnostic information [16], with a potential to keep the x-ray dose at sufficiently low levels [17]. Another attractive capability of most XPCI methods is that they enable access to the refraction induced dark-field (DF) signal. This was first introduced using ABI [18][19][20], then GI [21] and EI [22]. This quantity is often called ultra-small x-ray scattering (USAXS) or simply scattering, and it indicates what is not strictly attenuation or spatially resolved refraction. DF is effectively x-ray refraction occurring at length scales that are too small to be resolved by the imaging system and it can be seen as a refraction induced scattering [18]. It should be noted that, accordingly to Berk and Hardman-Rhyne [23] our DF signal can be consider to belong to the refraction regime (since ν 1). Examples of materials providing high DF contrast are: synthetic microspheres [24], paper [20], solutions containing microbubbles [25], and lungs [26]. This additional information can be retrieved together with absorption and refraction, and it can be applied to a variety of fields including, but not limited to: vascular x-ray imaging combined with new contrast agents [27], mammography to support micro calcification classification [28], lung disease investigation [29] and defects in composite structures [30].
GI and EI are excellent candidates for a widespread application of XPCI in medicine and beyond since they provide high quality images in three complementary contrast channels (absorption, refraction and DF) using conventional sources. This paper focuses on EI, however many of the discussed tools could be easily translated to GI or other XPCI methods. EI does not require high spatial coherence and high stability, since the masks apertures are typically 10-30 µm in size and 80-100 µm in pitch. EI is also flexible as it allows modifying several parameters to optimize the method for a specific application (e.g. aperture sizes and shapes, relative position between masks, setup length and magnification, etc). While this flexibility is beneficial in terms of e.g. trading off sensitivity and dose/exposure time, it also means that it can be non-trivial to select the optimal EI setup configuration for a given application. While the optimization process has been investigated before for the differential phase contrast signal measuring the system sensitivity [31,32], this has not been done before for DF. Also in view of the increasing interest in DF, this work sets out to investigate EI's capability to retrieve correct DF values (accuracy) with small errors (precision), and to determine the smallest amount of detectable DF signal (sensitivity). In particular, the effect of increasing the aperture size in EI and its impact on accuracy, precision and sensitivity on DF retrieval is investigated using experimental data and simulations, by means of two different retrieval methods. The metric introduced in the following can be used in real world studies ranging from material science to biomedical applications [30,33], where quantitative DF retrieval is essential.

Materials and methods
The presented work is subdivided in two parts. First, two EI setups (with different aperture sizes) are compared by evaluating measurements performed on the same DF sample. This basic dataset is then extended through a series of simulations, which enable much wider flexibility in terms of exploring the effect of aperture size on accuracy, precision and sensitivity of the retrieved DF values over a wide range.

Experimental data
The experimental data were acquired using a rotating anode Mo target x-ray tube (Rigaku MM007) at 40 kV/25 mA with a source size of 70 µm. The detector was a Hamamatsu C9732DK flat panel with a pixel size of 50 × 50 µm 2 placed at 2 m from the source. Two sets of masks were realized by electroplating 80 µm thick gold strips onto a 500 µm thick graphite substrate. Aperture periods were 79 µm for the presample mask (M1) and 98 µm for the detector mask (M2), which enable projecting M1 onto M2 with distances from the source of 1.6 m and 1.96 m, respectively. Note that in this configuration every second pixel is covered to reduce the effect of pixel cross talk [34]. The aperture sizes for the first set of masks (Setup 1) were 10 µm and 17 µm for M1 and M2, respectively. For the second setup (Setup 2) the apertures were 23 µm and 29 µm. As in previous work by other groups [20,24], the sample used to measure the DF consists of a PMMA rod (3.2 mm in diameter) placed above a step object consisting of an increasing number of paper sheets (five steps of 1, 2, 4, 8, 16 sheets) as shown in figure 1.
Frames were acquired over 11 equally spaced points on the illumination curve (IC-the curve representing the variation in intensity obtained when M1 is moved along the x with respect to M2-see figure 1 for the reference frame), over a range of 60 µm for Setup 1 and 79 µm for Setup2. The exposure conditions were of 5 s per frame with frame accumulation of 5, and 4 s per frame with frame accumulation of 10 for Setup 1 and Setup 2, respectively.

Simulations
The simulations were run through the code previously validated for Edge illumination [35], developed using the McXtrace simulation package [36] (www.mcxtrace.org/). This code was used to simulate the two experimental setups described above, with the additional capability to vary the aperture size. Twelve setups were simulated with M1 apertures ranging from 8 µm to 30 µm with a step of 2 µm, coupled with matching M2 apertures (i.e. apertures in M2 = apertures in M1 times magnification). The sample consisted of a series of paper sheets as in the experiment, but was extended to a much wider range in terms of number of layers. The scattering signal was generated as a random angular deviation with a predetermined Gaussian distribution, as done in previous work [24,37], but including also paper absorption which affects the result by reducing the statistics. The simulated values for the scattering distributions and the absorption values were taken from the experimental results. The effects of beam hardening was neglected for two reasons: (1) the goal of this preliminary study was to investigate DF retrieval capabilities, regardless of the source of the DF signal and its energy dependence, and (2) a precise measurement of DF as a function of energy ideally requires an intense monochromatic source (such as SR), which was considered to lie beyond the scope of this work. We decided instead to experimentally measure the DF in a small range of values (i.e. of paper layers) where beam hardening is negligible, and use a linear extrapolation to extend it to larger number of layers. Finally, the simulated number of points on the IC (i.e. the number of displacement positions of M1) was 11, like in the experimental case. For all setups, we simulated the same number of photons emitted from the source, hence setups with smaller apertures will have fewer photons reaching the detector, and therefore a poorer detected statistics. This corresponds to keeping the exposure time constant. This choice was driven by hypothetically targeting an application in which reducing the exposure time is more important than x-ray dose optimization, such as security or industrial inspection.
As mentioned above, the use of a simulation provides the opportunity to significantly extend the ranges of both apertures and number of paper sheets over which the DF sensitivity can be investigated. In total we explored 12 setups with a number of simulated paper sheets ranging from 10 to 400. The highest number of sheets resulted in an attenuation of about 84%.

Data processing
The retrieval of absorption, refraction and DF from the experimental data was carried out using 11 points on the IC in order to increase the reliability of the curve fitting procedure. The method is inspired by the one proposed by Oltulu et al in ABI [38]. In analogy to that case, a Gaussian curve fit was performed on a pixel-by-pixel basis over the intensity registered at 11 illumination curve points. Attenuation, refraction and DF can be extracted as the integral, centre and full width at half maximum (FWHM) of the fitted curve, respectively.
The DF values were measured as the mean of a 20 × 20 pixel region of interest selected on the various paper thicknesses. Mathematically, the fitting function for the four parameters c i can be expressed as: where x and y indicate the pixel coordinates, c 0 describes the total intensity, c 1 the peak position i.e. the centre of the curve, c 2 the width, and c 3 the offset (radiation passing through the gold substrate) of the curve; f is the measured curve and Δx is the position of M1. Note that c 0 and c 3 have the same units as f, while c 1 and c 2 have the same units as Δx (length). The fit was applied to all pixels in the image. A series of images without samples (flat) were used to correct the local variations of the masks.
The raw scattering values c 2 were then converted into angular units (radians) and FWHM to remove dependence from the setup and measuring procedure. The obtained DF (σ) values were further corrected for the sample position (d S-M2 ), magnification (M) and M1-M2 distance (d M1-M2 ) -which were slightly different in the two setups-yielding 'absolute' DF values σ c . In formulas:  While equation (1) is effective for low DF samples, it does not take into account the fact that, in presence of high scattering, the separated beamlets shaped by M1 will tend to overlap, and the signal from neighbours pixels will mix (cross-talk). So far, this has been considered the limit for DF retrieval in EI, since an analysis based on independent pixels is no longer valid [22]. To overcome this problem, the simulated data were processed with a modified formula, which includes the cross-talk effect through the introduction of two identical but shifted Gaussians coming from the first neighbours: where p is the pitch of the apertures. This simultaneously considers the effect of two adjacent pixels without requiring the introduction of more parameters to the fit; note however that this approach can only be adopted if we assume a homogenous scattering material, and we are interested in the overall 'area' contrast and not in the single pixel values.
The simulated data were processed using equation (4) since the cross-talk can be significant at large aperture sizes. In the following we will refer to the results obtained with either equation (1) (for experimental data, where cross-talk was negligible) or equation (4) (for simulated data) as 'Method 1'. The second method ('Method 2') used on simulated data is the threeimage algorithm proposed by Endrizzi et al [39]. It is based on a Gaussian approximation and uses an iterative procedure to correct for the offset value. Method 2 has the advantage that it requires fewer images; indeed, in a practical implementation, its use would reduce the overall acquisition time to 3/11. Moreover, it processes the three images globally and not on a pixel-by-pixel basis, which makes it significantly faster.
In terms of metrics to characterize the setups performance, we define accuracy as the difference between the retrieved DF value and the 'real' one set in the simulation; sensitivity as the standard error of the mean over a background area (20 × 20 pixels), which is considered indicative of the minimum amount of detectable DF as area contrast; precision as the standard error of the mean over a 20 × 20 pixels region of interest for various DF values. Figure 2 shows the linearity of the experimental C 2 σ signal versus the number of paper layers, using both Setup 1 and Setup 2. The mean value of the fit slope is: 45.9 ± 0.2 (µrad 2 /(paper layer)). Figures 3 and 4 shows 12 graphs of simulated data generated   for different aperture sizes for the C 2 σ value, which is shown to be linear versus paper thickness for both retrieval Methods 1 and 2, respectively. The black lines indicate the expected DF values i.e. the ones given as input to the simulation. In figure 5, the mean accuracy at different apertures sizes, calculated as:

Results
is shown; σ iS are the simulated DF values for different paper thicknesses and n = 10 is the total number of simulated thickness values. Sensitivity graphs are shown in figure 6, while the precision is summarized in figure 7.

Discussion
The result of the experimental evaluation summarized in figure 2 confirms that, at least to first approximation, the DF values are independent from the parameters of the EI setup. While narrower apertures provide better refraction sensitivity [40], they also lower the detected statistics, which reduces the reliability of the DF signal retrieved under high scattering and   absorption conditions (for both for Method 1 and Method 2), as visible for the highest values of paper thickness for apertures of 8-10 µm in figures 3 and 4, S1 and S2. The retrieved DF values become progressively more accurate at larger aperture sizes, which enables effectively covering the whole range of paper layers investigated (figures 3 and 4, S1 and S2). As shown in figure 5(a), Method 1 reaches an optimal accuracy for aperture sizes between 14 µm and 22 µm, with its accuracy getting worse for both smaller and larger apertures. Instead, Method 2 ( figure 5(b)) seems to underperform only for apertures below 14 µm, and provides results comparable to and possibly better than Method 1 at large aperture sizes (⩾25 µm). Therefore, Method 2 seems to benefit more from the statistical improvement due to larger apertures. Sensitivity remains reasonably flat for Method 1 (figure 6(a)) until about 22 µm, on a value of about 2 µrad 2 (which correspond to 0.05 layers of paper detectable on a 20 × 20 pixel area contrast). The value then grows for larger aperture sizes. As seen from figure 6(b), Method 2 shows reduced sensitivity values, at best around 3.5 µrad 2 when the aperture size is below 14 µm; this is mainly due to the overall lower statistic. However, this value also increases more slowly for larger aperture sizes, indicating room for improvement by e.g. increasing the statistics of every single frame. The precision graphs for both methods (figures 7(a) and (b)) show noisy results for the 'extreme' apertures values of 8, 10 and 30 µm (the latter for Method 1 only). Excluding those aperture sizes, both methods show a good precision ranging from 0.5% to 2%. Despite the lower statistics, Method 2 shows comparable performance to Method 1 in the range of thicknesses corre sponding to 150-300 layer of paper (figure 7).

Conclusions
We have investigated the capabilities of different EI setups to retrieve precise, accurate and sensitive DF values. The aperture size is a crucial parameter for EI, since it influences the refraction sensitivity, effectively allowing to trade-off refraction sensitivity and exposure time. Narrower apertures results in sharper ICs, which translates into higher refraction sensitivity [40], but at the price of reduced statistic for a fixed exposure time. An analytical model capable of predicting the effects of aperture size on DF retrieval is non-trivial, since many other aspects must be simultaneously taken into account. Important contributions come from the acquisition protocols, involving for example the number of points acquired on the IC, their relative positions, exposure time, etc. Another crucial variable is the used retrieval method. There are several methods to retrieve DF, each one of which propagates noise in a different way. In this paper we have simplified the problem by using Monte Carlo simulations, and restricting the study to two retrieval methods only. The more rigorous Method 1 is based on an analytical fit over 11 points, while Method 2 uses only three points on the IC. Three separate metrics have been introduced to characterize the performance of the various setup/ retrieval combinations. Accuracy expresses the capability to retrieve a value as close as possible to the exact one. Precision is the measurement uncertainty on this value, and sensitivity indicates the minimum amount of detectable DF. Simultaneous consideration of those three quantities allows tailored design of an EI system around a specific application, for example by enabling prioritization of detecting low DF values rather than providing linearity and quantitativeness over large DF ranges.
In this work, we consider at fixed acquisition time for all aperture sizes, which made it clear that very narrow apertures affect the retrieval of large DF values because of the reduced statistics. With Method 1, the highest accuracy is obtained for apertures between 14 µm and 22 µm, while for Method 2 the best range seems to lie between 14 µm and 30 µm. While for Method 1 the sensitivity is approximately constant from 8 µm to 22 µm, with Method 2 the sensitivity starts to increase at values larger than 12 µm, indicating a possible optimal tradeoff with accuracy around 14 µm. The precision of the two methods is comparable, and typically between 0.5% and 2%. This is a strong point for Method 2, the precision of which does not seem to be affected by the significantly lower number of points/images used for the retrieval. Although Method 1 shows better absolute results (especially for accuracy and sensitivity), a hypothetical industrial application of DF may prefer a faster scanning procedure (3 instead of 11 images) despite a sensitivity reduction by almost a factor of 2; a lot in this sense will also depend on the analysed sample. These results clearly show the importance of considering the whole DF measurement process in the setup optimization, starting from the mask design and all the way to the image retrieval, since the accuracy, precision and sensitivity can vary significantly depending on their combination. At the same time, it also shows the advantage of EI in terms of flexibility and adaptability to different imaging tasks, for example by using different pre-sample and detector masks in combination with different retrieval algorithms.