A semi-empirical model for scatter field reduction in digital mammography

E Marimón; P A Marsden; H Nait-Charif; O Díaz

doi:10.1088/1361-6560/abd231

1. Introduction

Digital x-ray mammography is the main screening method used for early breast cancer detection. Although mammography techniques have benefitted from technological advances in the last decades, there is still scope for image quality improvements and dose reductions. A mammogram requires good contrast, good resolution, low dose and large dynamic range (NHS 2016). The breast is composed of soft tissue, fat, blood vessels and it may have calcifications or tumours. Some of these tissues have very similar composition, therefore an x-ray scan must be sensitive to these small differences in order to provide enough image contrast for an accurate diagnosis.

An x-ray image is composed by a combination of primary and scatter information. The primary component is formed by particles that arrive to the imager with a certain amount of attenuation but no deviation. The scatter component, on the other hand, is made of particles that interact with the matter and deviate from their initial path, so they arrive to the imager in a range of angles. To maximise visualisation and improve image quality the scatter component needs to be reduced or, ideally, completely removed. Scattered radiation is one of the biggest challenges in digital mammography today (Wang et al 2015), as it affects the overall image quality by degrading the contrast and signal to noise ratio, affecting the diagnosis of low contrast lesions (Boone and Cooper 2000, Ahn et al 2006, Ducote and Molloi 2010). Most digital mammography systems have adopted anti-scatter grids as a physical solution to reduce scattered radiation. This method, although well established and effective, is not able to block all the scattered photons and introduces some loss of information by absorbing part of the primary radiation, which leads to an increase in the delivered dose by up to a factor of 3 (Krol et al 1996). Grids also add complexity and cost to the overall manufacturing process and they are incompatible with most of the current x-ray breast imaging techniques, such as the majority of digital breast tomosynthesis (DBT) applications (Krol et al 1996, Binst et al 2015, Wang et al 2015).

Post-acquisition software-based scatter reduction techniques have emerged in the last years for planar mammography and DBT applications, as a response to the anti-scatter grid limitations. Most of the available methods work with simplifications of direct Monte Carlo (MC) simulations, as this approach can be very time consuming and computational expensive and needs to be case-specific (Díaz et al 2012). Alternatives to full direct MC simulations found in the literature comprise pre-computed libraries of scatter-to-primary ratio maps calculated from direct MC simulations (Díaz et al 2010, Feng and Sechopoulos 2011, Feng et al 2014), GPU-based fast MC simulations with tissue-composition ratio estimation techniques (Kim et al 2015), scatter normalisation from direct MC simulations using homogeneous phantoms (Diaz et al 2019) or convolution-based scatter methods that make use of point-spread functions calculated with simplified MC simulations, obtained by approximating the x-ray beam to a narrow pencil beam (Boone and Cooper 2000, Sechopoulos et al 2007, Ducote and Molloi 2010, Díaz et al 2012, Marimón et al 2017).

In this study, the convolution-based scatter field estimation method has been chosen, as it is one of the most widespread methods and it is faster and less dependent on the system geometry than direct MC simulations (Sechopoulos et al 2007, Díaz et al 2012, Binst et al 2015). Such a method is based on the idea that the scatter in the system is spatially diffuse and it can therefore be approximated to a two-dimensional low-pass convolution filter of the primary image (Ducote and Molloi 2010). These 2D low pass filters, often referred as scatter kernels or scatter point spread function (SPSF), are pre-calculated with simplified MC simulations, where the incident beam is approximated to a narrow pencil beam or a delta function.

The distribution of the SPSFs can be taken as rotationally symmetric (Ahn et al 2006) or the kernel can be adapted to local variations in object thickness and attenuation (Wang et al 2015), i.e. asymmetric SPSFs. It has been seen that spatially invariant kernels suffer from scatter overestimation, introducing up to 50% of discrepancies around the breast edge area when compared with pure MC simulations (Díaz et al 2012). Asymmetric kernels can treat the boundary between the object and the background much better than symmetric kernels. However, this approach introduces a higher computational overhead and high dependencies with the simulated geometry, risking the introduction of image artefacts and scatter under or over estimation if the kernels are not properly computed.

This paper presents an alternative approach to conventional symmetric methodologies, introducing a semi-empirical method that is less system dependent and, therefore, without the intrinsic complications of the asymmetric kernels, while adapting to the areas of the breast that are often neglected in simpler methodologies. Between three and five symmetric kernels are combined to achieve an accurate treatment of the scatter produced in the breast and of the background contribution to the breast edge area, accounting for changes in the materials and, therefore, in the absorption coefficients. The result is a robust, case-independent and precise method with the potential to be implemented as a real time technique by using parallel/GPU techniques.

2. Methodology

The theory behind the convolution-based scatter estimation method defines the scatter signal produced in the system as the result of the convolution of the SPSF kernels with the primary image (Ducote and Molloi 2010). The input image is the raw mammogram, either acquired or simulated, formed by a combination of primary and scatter signals:

$\begin{eqnarray}&&I(x,y)=P(x,y)+S(x,y),\end{eqnarray} \tag{ 1 }$

$\begin{eqnarray}&&S(x,y)=P(x,y)\ast h,\end{eqnarray} \tag{ 2 }$

where P(x, y) is the primary image, I(x, y) is the input image containing both primary and scatter signal, S(x, y) is the scatter image and h is the low-pass filter kernel. The convolution operator is denoted as ∗.

As the primary image is unknown, the scatter signal can be calculated by approximating the primary to the input image in the convolution, i.e. defined as the linear combination of the primary and scattered deposited energies, as suggested first by Love and Kruger (1987)

$\begin{eqnarray}&&\hat{S}(x,y)=I(x,y)\ast h,\end{eqnarray} \tag{ 3 }$

where the ^∧ notation indicates those variables that have been approximated (Love and Kruger 1987).

An accurate definition of the filter is key to accurately predict the scatter image and, eventually, calculate the primary image, which is our target image. In this study, the kernels, or SPSF, are obtained with the aid of MC simulations, using a narrow pencil beam normally incident to a uniform and spatially symmetric geometry. To account for changes in the geometry, several thickness and area dependent kernels are calculated and used to convolve different regions of the input image.

The MC simulations needed in this study were performed with the aid of the Geant4 toolkit (version 10.01.p02) (Allison et al 2006, Vidal and Hoff 2008). The data analysis and visualisations were performed with MATLAB, R2013b, and ImageJ, 1.47V.

2.1. Primary image recovery

The predicted primary image is the result of subtracting the scatter image from the acquired uncorrected image, i.e. input image. This work focuses in presenting a novel methodology, introduced in the following sections, to estimate the scatter image. The proposed pipeline is shown in the schematic of figure 1 and described below. To estimate the scatter image, the input image will be segmented and convolved with a choice of kernels that will account for the changes and non-homogeneities of the processed image.

**Figure 1.** Schematic diagram showing the workflow of the scatter removal method chosen in this study. The images are depicted as dark rectangles, operations as dashed circles/ovals, processes as white rectangles and sub-processes as dashed rectangles.The acquired image is segmented and appropriate kernels are selected for the resulting areas. The kernels are used to calculate the scatter image which are used for the primary image recovery.
Download figure:
Standard image High-resolution image

The input images used for testing the proposed scatter reduction method are both simulated and acquired with a mammography x-ray system. To generate the simulated images, I(x, y), half cone-beam MC simulations are performed using a realistic mammography geometry and realistic breast phantoms, of different thicknesses and shapes. In addition to I(x, y), the primary P(x, y) and scatter S(x, y) images are also stored to analyse the accuracy of the methodology.

A schematic diagram of the two MC simulation geometries that have been used in this study, i.e. cone beam and pencil beam is shown in figure 2. More details of the geometry are given in section 3.

2.2. Semi-empirical method for scatter estimation

The chosen scatter estimation process focuses on the idea that the scatter contribution from different areas of the image can be treated separately and are additive (Boone et al 2000). The proposed method consists of estimating the scatter behaviour of the two distinctive areas of the image, background and breast, using a semi-empirical modification of the two symmetric kernel approach influenced by the theory behind the asymmetric kernels (Ahn et al 2006, Díaz et al 2012, Wang et al 2015). The scatter contribution from each region is estimated individually, accounting for absorption coefficient variations in the boundaries between both areas and, when necessary, considering changes in tissue thickness (Boone et al 2000, Seibert and Boone 2006, Sechopoulos et al 2007). All of the scattering occurring outside of the breast region in the model, e.g. the compression and support paddles, is referred in this study as background area.

For each of the defined areas, the scatter kernels are estimated independently via MC simulations. These kernels are used to convolve the clinical image, i.e. I(x, y), previously segmented using a manual intensity thresholding approach.

The use of one homogeneous thickness-dependent scatter kernel to convolve the whole image has been reported to introduce large uncertainties in the breast edge area (Díaz et al 2012, Wang et al 2015). This is partially due to the thinning of the breast and, principally, to the under-estimation of the background scatter contribution, which is mainly introduced by the compression paddle and breast support paddle. The background contribution, frequently neglected, is properly accounted for in the proposed method as described below.

2.2.1. Scatter kernels for background area

The literature shows that the contribution of background's scatter to the projected breast area can be estimated using pencil beam geometries that simulate the system without the presence of the breast (Sechopoulos et al 2007, Díaz et al 2014), as shown in figure 3(a). However, the convolution of these kernels directly applied to the entire image results in an overestimation of the scatter in the breast area, as the higher x-ray absorption coefficient of the breast is not considered. In contrast, an opposite effect can be achieved using a kernel from a pencil beam hitting the background but surrounded by the object material, as observed in figure 3(b). In this case, the scatter signal is absorbed at the rate defined by the object's absorption coefficient. This approach under-estimates the scatter if the photon is scattered far away from the breast edge, but it is accurate when scattered close to the edge of the object.

**Figure 3.** Proposed geometries to obtain the background scatter kernels. (a) represents the geometry for the full background geometry (100%, B_100B). (b) is the geometry used for the 0% background kernels, B_0B.
Download figure:
Standard image High-resolution image

The method proposed in this work presents a solution to more accurately estimate the scattered radiation generated in the background and arriving at the breast boundaries. This is done by reducing the asymmetric kernel solution to a two-background symmetric kernel problem: the threshold background area, i.e. non-breast region, is convolved with the two kernels defined above, i.e. a pencil beam hitting the background geometry and surrounded by either background (B_100B) or breast material (B_0B). The intensity values of the two resulting scatter images, see equation (3), are then interpolated to obtain the final background contribution to the breast. A semi-empirical equation is used to define this interpolation as described below in equation (9).

In mammography examinations the breast is aligned with the chest-wall (CC projections). Starting from an image displayed with the chest-wall side in the vertical direction, the intensity interpolation can be described row by row. Thus for a given row, n, the intensity profile curves of the predicted scatter images obtained with the pre-calculated kernels B_100B and B_0B can be named, respectively, f_n(x) and g_n(x), where x is the column variable from the edge of the breast (x = 0) to the chest-wall (x = N, if N is the breast width at the chosen row), see figure 4.

**Figure 4.** Schematic showing the key parameters used for the deduction of the semi-empirical equation *t_n* (x). On the right, a simulated mammogram of a breast phantom aligned with the chest wall. On the left, plot of the intensity profile of the selected row, n, from the chest wall (x = 0) to the breast edge (x = N).
Download figure:
Standard image High-resolution image

The definition of the equation is based on the following conditions:

Condition 1: the predicted scatter values obtained after the convolutions will be in between the over and under-estimation limits. The intensity profile of a given row n, t_n(x), is assumed to be a linear combination of f_n(x) and g_n(x):
$\begin{eqnarray}&&{t}_{n}(x)={\alpha }_{n}(x){f}_{n}(x)+{\beta }_{n}(x){g}_{n}(x),\end{eqnarray} \tag{ 4 }$
where α_n(x) and β_n(x) are weighting factors that need to be determined.
Condition 2: the background contribution to the scatter at the edge of the breast can be predicted by the pure background kernel, i.e. the kernel that is farthest away from the breast, B_100B:
$\begin{eqnarray}&&{t}_{n}(0)={f}_{n}(0):{\alpha }_{n}(0)=1,{\beta }_{n}(0)=0.\end{eqnarray} \tag{ 5 }$
Condition 3: the background contribution to the scatter inside the breast, towards the chest-wall, tends to be the one predicted by the background-breast absorption kernel, i.e. the kernel that is placed next to the object, B_0B:
$\begin{eqnarray}&&{t}_{n}(N)={g}_{n}(N):{\alpha }_{n}(N)=0,{\beta }_{n}(N)=1.\end{eqnarray} \tag{ 6 }$

Parameters α_n(x) and β_n(x) decrease and increase respectively with the distance to the breast edge (x). The intensity profile across the MC simulated breast, from the chest wall to the breast edge was studied and analysed for a number of simulated phantoms with different thicknesses. The profile curves of the scatter component, i.e. 'ground truth', were analysed and it was seen that it is possible to write β_n(x) as being directly proportional to x, while assuming α_n(x) to be inversely proportional to the distance led to scatter under-estimation. Therefore, to account for the experimental data and for conditions 2 and 3 the parameters can be described as:

$\begin{eqnarray}&&{\alpha }_{n}(x)=\displaystyle \frac{{K}_{\alpha 1}}{({K}_{\alpha 2}+x)},\end{eqnarray} \tag{ 7 }$

where, constant K_α1 has to be equal or very close to constant K_α2 to ensure that α_n(0) = 1.

$\begin{eqnarray}&&{\beta }_{n}(x)={K}_{\beta }x,\end{eqnarray} \tag{ 8 }$

where, 1 > K_β > 0. In particular, ${K}_{\beta }=\tfrac{1}{N}$ , to ensure that β_n(N) = 1.

The constants K_α1 and K_α2 need to be adjusted to represent the entire image while being consistent across the different mammography geometries and breast shapes, covering the entire breast thickness range. As described above, the intensity profiles of the chosen set of scatter images were used as the benchmark. A range of values for constants K_α1 and K_α2 were analysed against the profiles, the most reproducible values found that made the weighting factors independent of n were K_α1 = 80 and K_α2 = K_α1 − 1. Therefore, the equation adopted in this study to estimate the interpolated scatter signal is:

$\begin{eqnarray}&&{t}_{n}(x)=\displaystyle \frac{80}{(79+x)}{f}_{n}(x)+\displaystyle \frac{x}{N}{g}_{n}(x).\end{eqnarray} \tag{ 9 }$

2.2.2. Scatter kernels for breast area

To estimate the scatter contribution inside the breast it is important to pay special attention to the area near the breast-edge. Around this area, the breast reduces its thickness and the photons arrive at the breast surface with a higher incident angle. Large uncertainties can be introduced if the scatter estimation model is not carefully chosen, a good model should compensate for the reduction in the tissue absorption and in the scatter production, and also for the elongated photon path inside the breast.

It has been found experimentally that the effects that these variables produce, i.e. reduction in thickness and higher incident angle, balance each other if the compressed breast is thin (T ≤ 50 mm). In this case, only one kernel of constant thickness is needed to estimate the scatter produced in the breast. As shown in the results in section 4.1, the maximum scatter signal uncertainty introduced for a 50 mm compressed breast, when compared against a full MC simulation, is ≈5% . This value is reduced further as the thickness decreases.

As the breast phantom becomes thicker (T > 50 mm), the error introduced by a constant thickness kernel increases significantly, resulting in relative difference values of 20% for 70 mm breast thickness. In consequence, additional kernels and corrections need to be introduced to compensate for the loss of accuracy.

The method followed to correct for the thickness difference around the edge of the breast is based in the same principle used for the background estimation. For applying the correction, the breast needs to be separated into an inner and outer area, i.e. constant and non-constant thickness regions. This is achieved by manually thresholding the image where the thickness starts to decrease, differentiated in the image by an abrupt change in the intensity values. The scatter in each area can be split into the scatter produced in the area itself plus the one arriving from the neighbouring area.

As the outer breast area does not have a constant thickness, an additional approximation is needed to empirically calculate an equivalent thickness that represents the whole edge area. A 20% thickness reduction was found to give the best results, while being reproducible across different phantom thicknesses.

The scatter signal produced by the breast edge can be estimated from the combination of the scatter level of the inner region to the scatter arriving on the edge of the breast, obtained using a kernel with the beam hitting the maximum thickness but surrounded by the smaller thickness that the beam will encounter.

3. Geometries and experimental set-up

The performance and robustness of the scatter reduction method proposed in this paper was tested using both MC simulated images and real images acquired with an Hologic Selenia amorphous selenium (a-Se) mammography system hosted at Barts NHS trust (London). For the former case, the same clinical a-Se mammography system parameters were used for the simulations. Additionally, realistically shaped breast phantoms were simulated to calculate the images recorded in the receptor (I), as well as the primary (P) and scatter (S) images independently. For the latter case, clinical images of a 010A CIRS phantom (CIRS) were acquired with and without the use of anti-scatter grids. The performance of the proposed scatter reduction method was evaluated by comparing the post-processed grid-less images against the grid images.

The subsections below describe the geometries of both sets of experiments and the validations performed to ensure that the Geant4 simulations are in line with previously published data.

3.1. Hologic lorad selenia mammography system

The mammography system used both in the simulations and in the clinical exercise made use of an hologic lorad selenia with an a-Se flat panel detector (Hologic, Inc; Bedford, Massachusetts, USA). The system parameters are defined in table 1.

Table 1. Geometry parameters of the mammography system used in this study.

	Hologic lorad selenia
	Simulation parameters	Thick. (mm)	Material
	x-ray tube: anode		Tungsten
	x-ray tube: filter	0.06	Rh or Ag
	SID(^a)	660	Rh or Ag (70 μm)
Geometry (FOV = 340 × 340 mm)	Compression paddle	2.54	Polycarbonate
	Support paddle	2.54	Carbon fibre
	Air gap	17.46	Air
	Detector cover	Private info.	Carbon fibre
	Detector air gap	Private info.	Air
	Detector sensor	200 μm	a-Se

^aSID = Source to imager distance.

3.2. MC geometry

For the MC geometry, two simulation geometries were used: cone and pencil beam. A schematic of these two geometries can be seen in figure 2. Synthetic breast phantoms of realistic shape and dimension were used in the cone beam MC simulations. Such phantoms were inspired in the study published by Rodríguez-Ruiz et al based in mathematical breast modelling (Rodríguez-Ruiz et al 2017) where, given a thickness and a glandularity, the shape and dimension of the compressed breast is characterised and modelled, see table 2. This shape information was used to simulate the compressed breast in Geant4, as shown in phantom figure 5(a).

**Figure 5.** Illustration of the two phantoms used in this study. Figure (a) shows the Geant4 simulated breast phantom based in Rodríguez-Ruiz methodology (Rodríguez-Ruiz *et al* 2017). Figure (b) shows the 010A CIRS phantom with the testers (highlighted in yellow) used in the clinical analysis.
Download figure:
Standard image High-resolution image

Table 2. Dimensions of the simulated breast phantoms. All the phantoms were composed of 30% glandular—70% adipose tissue. The body was not included in the simulations.

T(mm)	t₄(mm)	r₁(mm)	r₄(mm)	r₆(mm)
35	18.3	64.7	71.7	65.3
50	20.0	80.0	97.0	86.0
60	25.3	94.7	112.0	104.3
70	25.0	100.0	133.0	122.0

Four different phantom thicknesses have been tested to validate and study the robustness of the proposed method. The values chosen were 35, 50, 60 and 70 mm and cover the most representative range of thicknesses around the mean value, estimated to be 52 mm (Boone et al 2000). The composition of the breast, defined as described by Hammerstein et al (1979), was chosen to be 30% glandular and 70% adipose tissue with a 2 mm thick layer of skin. Previous studies indicate that variations on the composition have little impact on the scatter to primary ratio (Boone et al 2000, Díaz et al 2012), so the glandular-adipose ratio was kept constant across all thicknesses evaluated. Moreover, the 30–70 ratio was chosen to minimise the uncertainty introduced with this approximation, as this value is a more common breast composition than the most commonly used 50-50, as described by Yaffe, Boone et al in 'The myth of the 50-50 breast' (Nelson et al 2008, Yaffe et al 2009). The breast model was aligned with the chest wall plane in the cone beam simulation case, while a cylindrical-shaped phantom centred with the beam was used in the pencil beam methodology (see figure 2). For simplification, the compression paddles were approximated to rigid and non-tilted paddles (Díaz et al 2017). Refer to table 1 for dimensions and compositions.

To gain statistics, for the pencil beam MC method, a total of 20 simulations of 10⁹ particles were run, to ensure a standard error of the mean lower than 1%. In the case cone beam MC method, 80 simulations of 10⁹ particles were run for the input images. In addition, the pixel size of the images was increased from 70 μm to 560 μm, resulting in a 501 × 501 pixels (280 × 280 mm) kernel that was used to convolve a 650 × 650 pixels (364 × 364 mm) image. The computational time of the MC simulations was 5 and 10 h, respectively (48-core simulation server). However, once the kernels are calculated and stored in a look up table, the execution time will be reduced to a simple image convolution analysis.

Two sets of validations have been performed to ensure that the Geant4 simulations were accurate. The values given in the report of the American Association of Physicists in Medicine (AAPM) task group 195, case 3-Mammography and breast tomosynthesis (Sechopoulos et al 2015) and the data published by Diaz et al (2014) and Sechopoulos et al (2007) were used as a benchmark to ensure that both the geometry and the kernels (SPSF) were simulated correctly.

The results obtained show excellent agreement with the previously published data. The validation against the AAPM report gives an average discrepancy lower than 1% while the validation of the SPSFs shape leads to discrepancies lower than 3% when compared against Sechopoulos et al (2015) and 4% against Díaz et al (2014). More details of the validation can be found in a previous publication of these authors (Marimón et al 2016, 2017).

3.3. Clinical images with the 010A CIRS phantom

The chosen 010A CIRS anthropomorphic breast phantom simulates the shape of a compressed 50 mm thick breast composed by a 5 mm thick skin, i.e. adipose tissue-equivalent layer, that surrounds a 40 mm thick block made of 30%/70% glandular/adipose equivalent tissue. The phantom contains a set of test objects that simulate calcifications, fibrous ducts, tumour masses, see table 3 and figure 5(b) for more information.

Table 3. Information of the composition and size of the CIRS testers used in this study: step wedge, circular details and calcifications testers.

Tester	Detail no.	Composition	Size (mm)
Step wedge	1–5	Glandular (%):	N/A
		0, 30, 50, 70, 100
Circular	1–7	55% glandular	1: 4.76, 2: 3.16, 3: 2.38, 4: 1.98,
details		45% adipose	5: 1.59, 6: 1.19, 7: 0.90
			1: 0.130, 2: 0.165, 3: 0.196,
Calcifications	1–12	N/A	4: 0.230, 5: 0.275, 6: 0.400,
			7: 0.230, 8: 0.196, 9: 0.165,
			10: 0.230, 11: 0.196, 12: 0.165

The hologic lorad selenia system at Barts NHS trust was used to acquire images of the CIRS phantom with and without the use of the anti-scatter grid at two different doses, selected by the system's Automatic Exposure Control (AEC). The higher dose (referred in the text as Dose 1) corresponded to the value chosen by the AEC when the grid was in place, while the lower dose (referred in the text as Dose 2) was the system's choice when the grid was removed. Dose 2 was 45% lower on average.

The geometry of the mammography system was simulated in Geant4 with and without the presence of the phantom to obtain both the phantom and the background scatter kernels, respectively. The phantom was approximated to a homogeneous object surrounded by the skin, the testers were not taken into account to allow a fair evaluation of the model when comparing the results against the grid image ('ground truth'). The pixel size of the detector was kept at its real size, i.e. 70 μm.

4. Results and discussion

4.1. Performance comparison with MC simulated images

The four breast thicknesses under examination comprise the analysis of two thin and two thick phantoms, described in section 3.2. A total of three kernels, i.e. one breast phantom kernel and two background kernels, were used to calculate the scatter of the thinner phantoms, while five were needed for the thicker cases, as explained in section 2.2.

The primary images predicted in this study (P_Predicted) have been compared with the benchmark images (P_Benchmark), i.e. full MC simulated primary images. The results shown in this section present the relative difference from this comparison, calculated following equation (10)

$\begin{eqnarray}&&\mathrm{Rel}.\mathrm{Diff}.(x,y)=100\times \displaystyle \frac{{P}_{\mathrm{Benchmark}}(x,y)-{P}_{\mathrm{Predicted}}(x,y)}{{P}_{\mathrm{Benchmark}}(x,y)}.\end{eqnarray} \tag{ 10 }$

The results obtained are shown in figures 6 and 7 and table 4. Figure 6 shows the relative difference, equation (10), for each of the breast phantom studied. Only the pixels belonging to the projected breast phantom area were analysed, since the background does not have any clinical relevance. In the plots, the positive values (green/red) correspond to the pixels where the scatter signal is overestimated, whereas negative values (blues) represent under-estimation of the method, with respect to the 'ground truth' (full cone beam simulations).

Table 4. For each breast thickness, the table shows the fractional area of the phantom which satisfies the criterion of absolute relative difference values (between the MC 'ground truth' and scatter corrected primary images) lower or equal to the specified error thresholds (2.5%, 5%, 7.5% and 10%). In addition, the last two columns show the mean relative difference and the standard deviation.

Thickness(nm)		Fractional area of breast (%) with max. relative difference (abs.):			Mean Rel. Diff (%)	STD (%)
	≤2.5%	≤5.0%	≤7.5%	≤10.0%
35	96	100	100	100	−0.3	1.2
50	77	99	100	100	0.3	2.0
60	53	86	98	100	1.3	3.1
70	56	86	98	100	−0.4	3.3

In order to facilitate the visualisation, figure 7 illustrates the distribution of the relative differences, i.e. histograms.

Table 4 depicts the area percentage of the images with absolute relative differences equal or less than various error thresholds, i.e. 2.5%, 5%, 7.5% and 10%. The mean relative difference and the standard deviation are also included, the negative values shown in the mean relative difference column indicate scatter under-estimation.

The results show a worst-case scenario of 10% relative difference between the predicted primary image and the full MC ('ground truth') for the thicker phantoms. However, the photon fluctuations of the full MC images have not been taken into consideration. Even with the use of 80 × 10⁹ particles and the increases pixel size, the fluctuations introduce from 1.5% to 5% uncertainty for breast thicknesses of 35–70 mm, respectively. When accounting for this fluctuation, the values shown in table 4 change to maximum relative differences of 5% for 35 and 50 mm breast thicknesses and 7.5% for 60 and 70 mm thicknesses. The results obtained are very promising and further improvements could be obtained by introducing extra kernels in the correction and a more complex segmentation method.

The relative differences in the plots show over and under-estimation fluctuations around the ideal case (0 value). The image non-homogeneities seen in figure 6 are a consequence of the empirical contribution of the model defined by equation (9) used in the curve fitting approach, where the scatter across the image is simplified to a function dependent on the distance. Similarly, the non-homogeneity of the relative difference distribution seen in figure 7 for the two thickest phantoms is a consequence of the additional kernels used to account for the phantom thickness.

The results obtained, with a worst-case scenario of 7.5% of uncertainty, indicate a good performance predicting the scatter radiation. To challenge the model further, its performance has been also tested with non-simulated images. Phantom images were acquired with a clinical mammography system with and without the presence of the system's anti-scatter grid to evaluate the ability of the model against the benchmark solution.

4.2. Performance comparison with 010A CIRS phantom

The evaluation of the uniform step wedge and the circular details testers was done by calculating the contrast to noise ratio (CNR), see equation (11). The circular details tester was analysed with the use of the variance ratio (equation (12)), as the CNR analysis introduced big fluctuations that lead to unreliable results

$\begin{eqnarray}&&\mathrm{CNR}=\displaystyle \frac{{\bar{x}}_{D}-{\bar{x}}_{{\rm{Bckg}}}}{{\sigma }_{{\rm{Bckg}}}},\end{eqnarray} \tag{ 11 }$

where, ${\bar{x}}_{D}$ = mean pixel value of the detail ROI, ${\bar{x}}_{{\rm{Bckg}}}$ = mean pixel value of the background ROI and σ_Bckg = standard deviation of the background ROI

$\begin{eqnarray}&&\mathrm{Variance}\ \mathrm{Ratio}=\displaystyle \frac{{\sigma }_{D}^{2}}{{\sigma }_{{\rm{Bckg}}}^{2}},\end{eqnarray} \tag{ 12 }$

where, ${\sigma }_{D}^{2}=\mathrm{variance}$ of the detail ROI and ${\sigma }_{{{\rm{Bckg}}}^{2}}=\mathrm{variance}$ of the background ROI.

To gain statistics, the measurements were repeated between three to six times, depending on the tester, using different areas of the testers. The standard deviation of these measurements was used to estimate the error and a Student's T-test was performed to study the statistical significance of the grid versus processed grid-less comparison. The null hypothesis considered was the no differentiation of the measurements and a 5% significance level was chosen.

4.2.1. Uniform step wedge tester

The results obtained are shown in table 5 and figure 8.

Table 5. The table gives the values of the CNR for the different steps of the step wedge tester, for the grid and processed grid-less images acquired at two different doses. The standard deviation of the three measurements is stated inside the brackets σ.

	CNR—Step wedge tester
	Dose 1		Dose 2
Step no.	Grid (σ)	Proc. (σ)	Grid (σ)	Proc. (σ)
1	3.66 (0.02)	3.01 (0.09)	2.69 (0.01)	2.19 (0.01)
2	0.06 (0.01)	−0.24 (0.08)	0.08 (0.01)	−0.17 (0.01)
3	−2.32 (0.01)	−2.49 (0.08)	−1.70 (0.01)	−1.72 (0.01)
4	−4.71 (0.02)	−5.10 (0.10)	−3.45 (0.01)	−3.68 (0.01)
5	−8.56 (0.03)	−8.86 (0.08)	−6.27 (0.03)	−6.50 (0.04)

Step number 1 has higher percentage of adipose tissue than the background and thus is an area with more scattering and absorption and higher signal when compared to the background reference area (positive CNR values). For this step, the grid performs better as the model does not reproduce the total amount of scattering, leading to smaller CNR values. When the proportion of glandular tissue increases, step numbers 3–5, the process is inverted (negative CNR values). For these cases the proposed model performs better than the grid. Finally, step number 2 has the same percentage of adipose tissue than the phantom background. In this case the expected CNR is zero, so the grid gives a more accurate reading. This is likely due to the difference between the predicted and real scattering produced in the neighbouring testers.

The results obtained are the average of three measurements. They are reproducible at both energies and the comparison is statistically significant with p-values below the chosen 5%.

4.2.2. Circular details tester

The results obtained are shown in table 6 and figure 9.

Table 6. The table gives the values of the CNR for the different details of the circular detail tester, for the grid and processed grid-less images acquired at two different doses. The standard deviation of the three measurements is stated inside the brackets σ.

	CNR—circular detail tester
	Dose 1		Dose 2
Detail no.	Grid (σ)	Proc. (σ)	Grid (σ)	Proc. (σ)
1	−2.64 (0.09)	−2.69 (0.03)	−1.96 (0.04)	−1.90 (0.04)
2	−1.62 (0.04)	−1.41 (0.01)	−1.16 (0.03)	−1.16 (0.03)
3	−0.77 (0.01)	−0.62 (0.01)	−0.60 (0.01)	−0.69 (0.01)
4	−0.47 (0.03)	−0.59 (0.06)	−0.36 (0.02)	−0.72 (0.02)
5	−0.29 (0.03)	−0.77 (0.02)	−0.30 (0.03)	−1.11 (0.01)
6	−0.10 (0.05)	−0.64 (0.03)	0.02 (0.04)	−1.30 (0.06)
7	0.43 (0.01)	−0.40 (0.04)	0.15 (0.03)	−1.25 (0.02)

The CNR values of the largest masses, numbers 1–3, oscillate between the grid and the processed grid-less images and two of the comparisons are not statistically significant. For the smaller and more difficult to detect masses, the processed grid-less image performs better for both doses. However, the results obtained with the two last masses (numbers 6 and 7) are unrealistically high in the processed image. These objects are placed closer to the edge of the phantom where the non-uniformity increases, reducing the precision of the CNR measurements.

The results obtained are the average of three measurements. They are reproducible at both energies and the comparison, except in two cases, is statistically significant with p-values below the chosen 5%.

4.2.3. Microcalcifications tester

The results obtained are shown in table 7 and figure 10.

Table 7. The table gives the values of the variance ratio for the different details of the calcifications tester, for the grid and processed grid-less images acquired at two different doses. The measurements have been repeated six times, for each of the calcifications contained in the details; the standard deviation is stated inside the brackets σ.

	Variance ratio—microcalcifications tester
	Dose 1		Dose 2
Detail no.	Grid (σ)	Proc. (σ)	Grid (σ)	Proc. (σ)
1	Not visible
2	2.38 (0.38)	2.34 (0.42)	1.89 (0.32)	1.78 (0.33)
3	3.40 (0.36)	3.47 (0.25)	2.05 (0.35)	1.91 (0.34)
4	5.82 (2.19)	5.53 (2.25)	3.59 (1.50)	3.57 (1.16)
5	10.51 (1.98)	10.25 (1.93)	6.04 (1.32)	5.67 (1.13)
6	18.77 (5.03)	18.23 (4.89)	10.99 (2.63)	9.94 (2.59)
7	5.66 (1.43)	5.73 (1.33)	3.61 (0.89)	3.32 (0.80)
8	3.26 (0.47)	3.28 (0.55)	2.32 (0.50)	1.97 (0.29)
9	1.84 (0.54)	1.83 (0.52)	1.54 (0.41)	1.32 (0.29)
10	4.10 (1.57)	4.25 (1.28)	2.88 (0.67)	2.69 (0.73)
11	3.02 (0.64)	3.14 (0.86)	2.28 (0.32)	2.15 (0.55)
12	2.08 (0.52)	2.00 (0.56)	1.66 (0.39)	1.36 (0.30)

At the higher dose, i.e. Dose 1, the higher variance ratio fluctuates between both images independently of the size of the microparticles. The p-values indicate that the differences are not statistically significant. The performance at lower doses lean in favour of the grid images, although both results are within the measurement error. The higher background noise of the processed images affect the variance ratio evaluation as the detection of the microcalcifications is noise limited.

The results obtained are the average of six measurements, one per each dot of the same thickness in the tester.

5. Conclusions

Scatter estimation is one of the main ongoing research areas in x-ray mammography. This study focuses on convolution-based methodologies to simulate the scatter and predict a primary, scatter-free image, enabling the acquisition of high-quality images without the use of anti-scatter grids.

The focus of this study is to find a simple and working approach that does not trade off the removal of the anti-scatter grid for the quality of the final image. Taking into consideration that the scatter signal is additive, the background and breast phantom contributions were analysed separately, accounting for the change in absorption coefficients between material boundaries. With this approach, it was found that it is possible to estimate the background scatter with only two separate kernels, while the breast needs 1–3 kernels depending on the thickness of the breast/object. This method benefits from the simplicity and speed of the symmetric kernel approach, as the kernels can be pre-computed in a database instead of having to be simulated on a case by case basis, but without compromising the accuracy of the scatter estimation at the breast edge.

Four breast thickness and shapes were tested against full MC simulations (benchmark values or 'ground truth'). The results show very good matches, especially for thin breast phantoms. The maximum uncertainty found was for the 60 and 70 mm thick breast phantoms and this did not exceed 7.5%.

In addition to the MC simulations, the method was also tested with non-simulated images acquired with an a-Se clinical mammography system. Images of the 010A CIRS phantom were acquired with and without anti-scatter grid. The performance of the method was therefore evaluated against the current benchmark by analysing the quality of the processed grid-less and grid images making use of the testers embedded in the phantom. In the analysis it was seen that the method performs similarly to the anti-scatter grid, the scores oscillate between the two solutions studied.

The results obtained with the MC simulations and the clinical study are both very positive. This preliminary study shows that it would be possible to use the presented anti-scatter method as an alternative to anti-scatter grids without compromising the quality of the final image.

Future work should focus on testing further the robustness of the method in a clinical environment making use of additional mammography phantoms, performing a one to one comparison between the processed grid-less images and the grid images. One of the limitations of the current study is that it has been tested only for CC projections, the method should also be extended to include the pectoral muscle in the mediolateral oblique projection, tested against less common breast shapes, such as mammograms of breasts that have already undergone surgery and extended to include DBT systems. Further work should also include research in the optimisation of the model to speed up the analysis by improving further the convolution step and creating kernel reference databases.

A semi-empirical model for scatter field reduction in digital mammography

Article metrics

Permissions

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction