Alpha particle microdosimetry calculations using a shallow neural network

Peter Wagstaff; Pablo Mínguez Gabiña; Ricardo Mínguez; John C Roeske

doi:10.1088/1361-6560/ac499c

1. Introduction

Targeted radionuclide therapy (TRT) is a growing topic of clinical and research interest with the continuing applications of several radionuclides for the treatment of solid tumors and metastatic disease (Lheureux et al 2017, Tafreshi et al 2019, Sgouros et al 2020, Ahenkorah et al 2021). The chosen radionuclide can be labelled or chemically bonded with a carrier biomolecule to create an effective radiopharmaceutical capable of targeting and decaying at the desired site. In this way, TRT represents a future direction of personalized medicine, optimizing radiation therapy to be targeted and localized to individual tumor cells while minimizing the risk of irradiating surrounding healthy cells. The decay processes of these radionuclides result in different types of emissions such as alpha particles, beta particles, Auger electrons, or a combination of types. Of these radiation types, alpha particles have intrinsic properties that could provide advantages when successfully implemented as targeted alpha therapy (TAT) (Roeske et al 2008, Sgouros et al 2010, Mínguez Gabiña et al 2020). In comparison to beta particle emitters, commonly considered alpha particle emitters have high decay energies (3–9 MeV) and short ranges in tissue (18–90 μm) that are comparable to a few cell diameters (Roeske et al 2008, Sgouros et al 2010). The average linear energy transfer (LET) for these energies is approximately 100–167 keV μm⁻¹. When paired with an appropriate carrier, the clinical advantages of alpha particle emitters are that they can deposit significant amounts of energy in cells that they are in direct contact with, while limiting the dose to surrounding normal tissues (Roeske et al 2008, Sgouros et al 2010).

Although TAT has shown recent clinical success, particularly for the treatment of metastatic prostate cancer, the underlying behavior and dosimetry for alpha particles are not fully defined for all settings (Kratochwil et al 2016, Ahenkorah et al 2021, Sathekge et al 2021). For the continuing study and clinical application of alpha particle emitters, it is important to understand how radiation is distributed in targeted areas and how that distribution relates to the overall treatment effect. Dosimetry in the classical sense describes a non-stochastic measure of the energy deposited per unit mass to the body, organ, cavity, or other specified target. While this general dosimetric approach has been applied to increasingly smaller targets, problems arise as the microscopic level is approached. The administered radionuclides are unlikely to follow an ideal uniform distribution at either the macroscopic or the microscopic level (Roeske et al 2008, Sgouros et al 2010). At the cellular level, the energy deposited by an alpha particle in the cell nucleus depends on the path taken, which can vary due to cell geometry and alpha particle range (Roeske and Stinchcomb 1997, Stinchcomb and Roeske 1992). Some target cell nuclei may receive multiple alpha particle hits while others may be missed altogether. This leads to a frequency distribution in the energy deposited within the subcellular target. Therefore, the stochastic nature of alpha particle emission and the resulting energy deposition events in these subcellular targets must be considered in the framework of microdosimetry (Sgouros et al 2010). The microdosimetric analog of absorbed dose is specific energy (z) defined as:

$$z = \frac{\varepsilon}{m}, \tag{1}$$

where ε is the energy deposited in the target structure with mass m. Note that z is a stochastic quantity and similar to absorbed dose (D), has units of Gy.
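As a rough numerical illustration of equation (1), the sketch below converts an energy deposit in MeV into specific energy in Gy for a spherical nucleus. The unit density (1 g cm⁻³), the 1 MeV deposit, the 5 μm radius, and the function names are assumptions chosen for illustration only; they are not values taken from this study.

```python
import math

MEV_TO_JOULE = 1.602176634e-13  # 1 MeV expressed in joules


def sphere_mass_kg(radius_um, density_g_cm3=1.0):
    """Mass of a sphere in kg; unit density is an assumption (water-like)."""
    radius_cm = radius_um * 1e-4
    volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return volume_cm3 * density_g_cm3 * 1e-3  # grams to kilograms


def specific_energy_gy(energy_mev, radius_um, density_g_cm3=1.0):
    """Equation (1): z = epsilon / m, returned in Gy (J/kg)."""
    mass_kg = sphere_mass_kg(radius_um, density_g_cm3)
    return energy_mev * MEV_TO_JOULE / mass_kg


# Hypothetical example: 1 MeV deposited in a nucleus of radius 5 um.
z_example = specific_energy_gy(energy_mev=1.0, radius_um=5.0)
print(f"z = {z_example:.3f} Gy")
```

Under these assumptions the result is about 0.3 Gy, the same order of magnitude as the single-event values reported later in the paper.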

For realistically large numbers of alpha particles incident on target cells, microdosimetry calculations can quickly become complex requiring the use of Monte Carlo (MC) simulations. MC codes can be flexible allowing custom geometries and source-target configurations (Guerra et al 2021, Schuemann et al 2019). However, the computation times depend on the number of particles or paths simulated, and the statistical accuracy of MC often relies on a very large number of simulations (Roeske et al 2008, Sgouros et al 2010). Thus, for those without the required hardware or knowledge, MC methods may not be readily accessible.

Previous studies demonstrated that full alpha-particle microdosimetric spectra may not be required for all applications. Rather, the single-event specific energy moments may suffice for analysis of cell survival and prediction of tumor control (Mínguez Gabiña et al 2020, Roeske and Stinchcomb 1996, 2000, 2006). In particular, we consider the first moment of the single event spectrum given by:

$$\left\langle z_{1}\right\rangle = \int_{0}^{\infty} z_{1}\, f_{1}\left(z_{1}\right)\, dz_{1} \tag{2}$$

and the second moment defined as:

$$\left\langle z_{1}^{2}\right\rangle = \int_{0}^{\infty} z_{1}^{2}\, f_{1}\left(z_{1}\right)\, dz_{1}, \tag{3}$$

where z₁ is the specific energy deposited by a single alpha particle and f₁(z₁) is the single-hit specific energy spectrum. These spectral moments are the main quantities of interest for this study which can also be related to absorbed dose and the standard deviation of that dose (Roeske and Stinchcomb 1999). Moreover, these quantities are essentially the frequency mean (z_F) and proportional to the dose mean (z_D) specific energy per event described by Kellerer (1970) and Roesch (1985) and have been related to cellular damage from alpha particle emitters (Bertolet et al 2021, Roeske and Stinchcomb 1999).
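Equations (2) and (3) can be evaluated numerically once a single-hit spectrum is available on a grid. The following is a minimal sketch using trapezoidal integration; the flat spectrum, the grid, and the function names are hypothetical and serve only to check the integration against a case with known moments. The relations z_F = 〈z₁〉 and z_D = 〈z₁ ²〉/〈z₁〉 used at the end are the standard microdosimetric definitions, stated here as an assumption about how the text's "frequency mean" and "dose mean" are meant.

```python
def trapezoid(xs, ys):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
    return total


def single_event_moments(z_grid, f1):
    """First and second moments of f1(z1), equations (2) and (3).

    The spectrum is renormalized first, so an unnormalized f1 is accepted
    (an added convenience, not something stated in the paper).
    """
    norm = trapezoid(z_grid, f1)
    first = trapezoid(z_grid, [z * f for z, f in zip(z_grid, f1)]) / norm
    second = trapezoid(z_grid, [z * z * f for z, f in zip(z_grid, f1)]) / norm
    return first, second


# Hypothetical flat spectrum on [0, z_max]; its exact moments are
# z_max / 2 and z_max**2 / 3, which makes the numerical result easy to check.
z_max = 0.6
n_steps = 2000
z_grid = [z_max * i / n_steps for i in range(n_steps + 1)]
f1 = [1.0 / z_max] * (n_steps + 1)

m1, m2 = single_event_moments(z_grid, f1)
z_F = m1        # frequency mean specific energy
z_D = m2 / m1   # dose mean specific energy
print(m1, m2, z_F, z_D)
```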

Stinchcomb and Roeske (1999) originally produced tables of 〈z₁〉 and 〈z₁ ²〉 for single cells in suspension. However, as new geometries, such as cells in geometric clusters, are considered, these tables have become outdated and limited in application. With the increased interest in machine learning applications, the goal of this study is to create a neural network that provides accurate 〈z₁〉 and 〈z₁ ²〉 approximations in a concise, practical manner. The neural network should generalize well to common geometries and energies for flexible use when considering different scenarios and radionuclides. This study, as a first foray into applying neural networks to microdosimetry, aims to provide insight into design and data analysis as well as to inspire future studies to advance these topics beyond calculations for simplified geometries.

2. Methods

The neural network design method is an iterative process. The initial data collection and preprocessing provide insight into the choice of the network type and architecture. A training algorithm is chosen and implemented. Then the network performance is evaluated, and errors can be investigated. At this point, if the network is satisfactory, it can be exported for use. Otherwise, the process is repeated from the start with the investigated data errors in mind. Figure 1 represents a simple flow chart of the overall network design methodology.

**Figure 1.** Flow chart for the neural network design process.

2.1. Data collection and preprocessing

Training and applying a neural network are data driven processes. Like many other artificial intelligence and machine learning techniques, the outcome and viability of the network depends on the availability of a comprehensive training data set. This study aggregates data from two sources to build and train a robust network.

The initial data set was previously published by Stinchcomb and Roeske (1999). The purpose of these tables was to provide an accessible source of microdosimetric data for use in determining the relevant quantities of average dose, standard deviation of the specific energy spectrum, and probability of a cell receiving a given number of alpha particle hits (Stinchcomb and Roeske 1999). Specifically, the first 4 tables from that study were chosen to be included in this analysis. These tables provide 〈z₁〉 and 〈z₁ ²〉 for different source-target configurations, alpha particle energies, cell nucleus sizes, and cell sizes. The table numbers, 1–4, were included as an integer encoded variable to represent the different source-target configurations. For each configuration, the target is always a central, spherical cell nucleus. The concentric alpha particle activity source is denoted as 1—within the nucleus, 2—on the cell surface, 3—in the cytoplasm, and 4—the cubic volume surrounding the cell. These source-target pairs are illustrated in figures 2(a)–(d). The alpha-particle energies included are 3.97, 5.30, 5.867, 6.05, 6.73, 7.45, 8.37, and 8.78 MeV while the nuclear and cellular radii range from 2 to 10 μm and 3 to 15 μm, respectively. The alpha particle energies are associated with decays from terbium-149 (3.97 MeV), polonium-210 (5.3 MeV), astatine-211 (5.867 MeV), bismuth-212 (6.05 MeV), polonium-211 (6.73 and 7.45 MeV), polonium-212 (8.78 MeV), and polonium-213 (8.37 MeV) (Eychenne et al 2021). The nuclear and cellular radii values were chosen to be consistent with previous studies (Goddu Howell and Rao 1994, Stinchcomb and Roeske 1999). The two main advantages of this dataset are the ranges of variables which cover the values most likely to be encountered and the dense grid-like structure which provides a strong training set for the neural network.

**Figure 2.** Schematic illustrations of the 8 source-target configuration geometries included in this study. The textured areas represent alpha particle radiation sources. The top row windows, (a)–(d), are single cells in suspension, and the bottom row windows, (e)–(h), are a two-dimensional representation of three-dimensional close-packed cells in tissues.

The limiting factor in the previously described data set is that the selected geometries only consider a single cell in suspension. Future TAT studies must consider more complex source-target geometries such as clusters of cells with a realistic alpha particle emitter uptake. To improve the functionality of the network, additional microdosimetric data for more realistic geometries (see figures 2(e)–(h) for illustration) was provided from a modified form of a MC simulation code (see appendix) used in a previously published study (Mínguez Gabiña et al 2020). In pursuit of another complete training set and to match the network's existing input/output format, 〈z₁〉 and 〈z₁ ²〉 data were calculated for a list of source-target configurations, energies, cell nucleus sizes, and cell radii.

As before, the source-target configurations were integer encoded 5–8 (figures 2(e)–(h)). The geometries of these configurations simulate layers of tissue. In each case, there is a central, spherical target nucleus. However, this central cell is surrounded by a packed grid of cells forming a plane, and multiple planes are stacked to form layers like tissue. The alpha particle activity sources are described by 5—uniform everywhere, 6—uniform everywhere except the cell nuclei, 7—in a spherical shell between the cell membrane and 1.25 times the cell nucleus radius, and 8—only in the cytoplasm. The data set uses ten alpha particle energies from 3.97 to 8.78 MeV. Many of the energies match the previous data, but the few differences are the additions of 6.4 and 7.1 MeV and the small substitutions of 5.8 for 5.867 and 8.4 for 8.37 MeV data. Cell nucleus radii range from 2 to 10 μm and cell radii range from 2.5 to 20 μm with the addition of cell-to-nucleus ratios of 1.25, 1.5, 1.75, and 2. These data will expand and improve the capabilities of the network by including the realistic tissue configurations, more energies, and additional combinations of cell and nucleus sizes. With these two datasets combined, the final training data set size consists of 1932 observations of 6 variables.

Before initializing the neural network, several data preprocessing steps were considered. In particular, extra care was taken to find any data entry mistakes in the original publication tables. Many classic statistical models and function approximators have assumptions that must be met, such as the dependent variable being normally distributed. While satisfying this condition may not be necessary for a machine learning approach based on a neural network, the target data 〈z₁〉 and 〈z₁ ²〉 are highly skewed. Thus, the common choice of a natural logarithmic transformation was implemented (Wong et al 2013). With this transformation, the target data become approximately normal, which improves neural network performance. Additionally, the natural logarithm and corresponding exponential function create a useful positive bound, which is important for obtaining physically meaningful output values. Additional pre/postprocessing steps commonly included in the network remove any duplicate observations and rescale the minima and maxima of the variables to [−1,1] to work in tandem with the appropriate transfer functions. Lastly, percent standardization of the output scales is included to account for the difference in units and ranges of 〈z₁〉 and 〈z₁ ²〉 (Gy and Gy², respectively). This standardization helps the loss function weight the measured errors for both outputs equally.
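The logarithmic transformation and the rescaling to [−1,1] described above can be sketched as follows. This is not the toolbox implementation used in the study; the function names and the sample values are hypothetical, and the sketch only shows that the transform is invertible and that back-transformed outputs are always positive.

```python
import math


def fit_minmax(values):
    """Return the minimum and maximum used to define the scaling."""
    return min(values), max(values)


def to_unit_range(value, lo, hi):
    """Map [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def from_unit_range(scaled, lo, hi):
    """Inverse of to_unit_range."""
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo


# Illustrative, skewed target values in Gy (hypothetical sample).
targets_gy = [4.36e-2, 7.48e-2, 1.12e-1, 5.07e-1, 8.68e-1]

# Step 1: natural logarithm to reduce the skew.
log_targets = [math.log(v) for v in targets_gy]

# Step 2: rescale the log-values to [-1, 1].
lo, hi = fit_minmax(log_targets)
scaled = [to_unit_range(v, lo, hi) for v in log_targets]

# Postprocessing: undo the scaling, then exponentiate.
# exp() guarantees a positive, physically meaningful output.
recovered = [math.exp(from_unit_range(s, lo, hi)) for s in scaled]
print(scaled)
print(recovered)
```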

2.2. Network design

The neural network used in this study was created with the MATLAB® Deep Learning Toolbox™ (The Mathworks Inc., Natick, MA, USA). Using an iterative design process, the network's parameters or architecture were optimized to improve the performance. Some main choices for network design include the training algorithm, size and number of hidden layers, and transfer functions which are described below.

Training of the network was performed using the Levenberg−Marquardt backpropagation algorithm. This algorithm was chosen as it is the most efficient and accurate for our data size and complexity (Hagan Mohammad 1994). Briefly, this algorithm uses a nonlinear least squares technique to determine the network's weights. This approach is efficient compared to conjugate gradient methods, particularly for smaller networks (Hagan Mohammad 1994). The training algorithm also makes use of an early stopping technique to prevent overfitting. One restriction from this choice of training algorithm is that the loss function must be a squared error formula. The mean squared error (MSE) is the associated error function for the analysis of network performance, with values close to zero indicating good performance. MSE is also suitable for a network predicting multiple outputs of different scales (Borchani et al 2015).

To choose the optimal number of nodes in the hidden layers, a grid search method was used. This process involved looping through a complete network training process while increasing the number of nodes at each iteration. The performance metrics are stored, and a minimum can be found. In this study, a network consisting of two hidden layers with 13 and 20 nodes was determined to be optimal. Increasing the number of nodes beyond this did not yield performance improvements and could contribute to overfitting. For function approximation problems, such as this one, it is common to choose a hyperbolic tangent sigmoid transfer function in the hidden layers and a linear output transfer function (Grassi and Vecchio 2010). This is the structure which allows a network to fit nonlinear data. With the network architecture finalized, the total number of tunable weights and biases was 387. An abbreviated illustration of the network architecture is shown in figure 3.
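The quoted total of 387 tunable parameters can be checked by counting weights and biases in a fully connected network. The sketch below assumes 4 inputs (source-target configuration, energy, nucleus radius, cell radius) and 2 outputs (〈z₁〉 and 〈z₁ ²〉), which matches the inputs and outputs described elsewhere in the paper; the function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected feedforward network.

    Each layer with n_in inputs and n_out nodes contributes
    n_in * n_out weights and n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed layout: 4 inputs, hidden layers of 13 and 20 nodes, 2 outputs.
layer_sizes = [4, 13, 20, 2]
total_parameters = count_parameters(layer_sizes)
print(total_parameters)  # 65 + 280 + 42 = 387
```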

**Figure 3.** Schematic of the neural network architecture from input to output. The two hidden layers are abbreviated for minimal space.

The standard data partition of 70% training, 20% validation, and 10% testing is recommended when providing all available data to the training process (Yao et al 2000). However, the testing data for this study is a separate, independent set described in the following paragraph. To make more data available for training, the data partition was set to 80% training, and 20% validation. The advantage of using the 80% partition is that it corresponds to 1546 training points versus 1352 training points if the 70% partition were used, thus increasing the overall number of data used for training and hence improving network performance (Xu and Goodacre 2018).
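The training set sizes quoted above follow from rounding the chosen fraction of the 1932 observations. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and rounding to the nearest integer is an assumption that reproduces the quoted counts.

```python
def partition_counts(n_observations, train_fraction):
    """Return (training count, remaining count) for a given fraction."""
    n_train = round(n_observations * train_fraction)
    return n_train, n_observations - n_train


n_total = 1932
train_80, validation_80 = partition_counts(n_total, 0.80)
train_70, remainder_70 = partition_counts(n_total, 0.70)

print(train_80, validation_80)  # 1546 training, 386 validation
print(train_70, remainder_70)   # 1352 training, 580 left for validation/testing
```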

The test data set consisted of energies and cell and nuclear radii that were not part of the training data set; only the source-target configuration codes remained the same. The first part of the test set focused on the single cell in suspension geometries (1–4). One hundred input values were created using random number generators bounded within the training input ranges (energy, cell and nuclear radii), and a separate MC simulation (see appendix) was run to obtain 〈z₁〉 and 〈z₁ ²〉 targets (Roeske and Hoggarth 2007). The second part of the test data set was designed to test the tissue-like geometries (5–8). The energies considered were 4.6, 5.6, 5.9, 6.2, 6.6, 6.9, 7.3, 7.9 and 8.6 MeV. The cell nucleus sizes ranged from 2.5 to 9.5 μm, and the cell sizes ranged from 3.125 to 19 μm. The combination of energies, sizes, and geometries for this part of the test set totaled 792 data points. The same MC code (Mínguez et al 2018, Mínguez Gabiña et al 2020) used for the training data was again used to calculate 〈z₁〉 and 〈z₁ ²〉 target values for this test set. The total test data set contained 892 data points, nearly half the size of the training/validation data. A summary of the training, validation and test data sets is presented in table 1.

Table 1. Summary of training, validation and testing data.

Source configuration	Training/validation data	Testing data
Single cells in suspension, activity localized in: (1) Cell nucleus; (2) Cell surface; (3) Cytoplasm; (4) Cubic volume surrounding the cell	Energies: 3.97, 5.30, 5.867, 6.05, 6.73, 7.45, 8.37, 8.78 MeV; Nuclear radii: 2–10 μm; Cell radii: 3–15 μm; Data source: tables 1–4 in Stinchcomb and Roeske (1999)	Energies: 3.97–8.78 MeV (randomly generated); Nuclear radii: 2–10 μm (randomly generated); Cell radii: 3–15 μm (randomly generated); Data source: Monte Carlo code described in Roeske and Hoggarth (2007)
Cellular clusters, activity localized in: (5) Uniform everywhere; (6) Uniform everywhere except cell nuclei; (7) In a spherical shell between cell membrane and 1.25× cell nucleus radius; (8) Cell cytoplasm	Energies: 3.97, 5.30, 5.8, 6.05, 6.4, 6.73, 7.1, 7.45, 8.4, 8.78 MeV; Nuclear radii: 2–10 μm; Cell radii: 2.5–20 μm (taking on values of 1.25×, 1.5×, 1.75×, 2× nuclear radii); Data source: Monte Carlo code described in Mínguez et al (2018), Mínguez Gabiña et al (2020)	Energies: 4.6, 5.6, 5.9, 6.2, 6.6, 6.9, 7.3, 7.9, 8.6 MeV; Nuclear radii: 2.5–9.5 μm; Cell radii: 3.125–19 μm (taking on values of 1.25×, 1.5×, 1.75×, 2× nuclear radii); Data source: Monte Carlo code described in Mínguez et al (2018), Mínguez Gabiña et al (2020)

With the complete network design, training/validation data set, and test data set, a final network can be initialized, trained, saved, and exported for use. The final network took ∼15 s to train on a standard laptop with an Intel® Core™ i5-5200U CPU @ 2.20 GHz processor. The exported network is a scripted function which simulates the network. The code runs nearly instantly at this point. A single input of source-target configuration, energy, cell nucleus size, and cell size can be used, or a larger set of inputs can be used similar to the separate test data set in this study.

Table 2. Descriptive statistics for the percent error distribution calculated from the network outputs and the true targets.

	Training/validation		Testing
	〈z₁〉	〈z₁ ²〉	〈z₁〉	〈z₁ ²〉
Mean % Error	−0.01%	−0.02%	0.03%	−0.01%
% Error SD	0.38%	0.71%	0.73%	1.4%
95% Interval	[−0.77%, 0.75%]	[−1.4%, 1.4%]	[−1.4%, 1.5%]	[−2.8%, 2.8%]

SD = standard deviation

Table 3. Average error (+/− standard deviation) versus source-target configuration.

Source/target Configuration	1	2	3	4	5	6	7	8
〈z₁〉	−0.10% +/−2.05%	−0.56% +/−1.88%	0.49% +/−1.49%	0.45% +/−1.57%	0.15% +/−0.17%	−0.04% +/−0.37%	−0.02% +/−0.64%	−0.09% +/−0.48%
〈z₁ ²〉	−0.09% +/−4.00%	−0.85% +/−3.64%	0.92% +/−2.76%	2.30% +/−2.59%	−0.02% +/−0.44%	−0.03% +/−0.65%	−0.02% +/−1.13%	−0.11% +/−0.89%

1 = Cells in suspension with source in nucleus; 2 = Cells in suspension with source in cytoplasm; 3 = Cells in suspension with source on cell surface; 4 = Cells in suspension with source outside of cell; 5 = Packed cell grid with source uniform everywhere; 6 = Packed cell grid with source uniform everywhere except the cell nuclei; 7 = Packed cell grid with source in a spherical shell between the cell membrane and 1.25 times the cell nucleus radius; 8 = Packed cell grid with source only in the cytoplasm.

Table 4. Example output from the neural network.

Source/Target configuration	Energy (MeV)	r_n (μm)	r_c (μm)	〈z₁〉 Actual (Gy)	〈z₁〉 Predicted (Gy)	%Diff	〈z₁ ²〉 Actual (Gy²)	〈z₁ ²〉 Predicted (Gy²)	%Diff
1	7.07	6.8	9.3	4.36E-02	4.43E-02	1.60%	2.72E-03	2.81E-03	3.21%
2	5.43	8.1	12.5	7.48E-02	7.31E-02	−2.28%	6.41E-03	6.14E-03	−4.25%
3	4.47	6.9	10.5	1.12E-01	1.09E-01	−2.60%	1.48E-02	1.41E-02	−4.43%
4	8.03	9.9	12.5	5.05E-02	5.09E-02	0.92%	3.25E-03	3.36E-03	3.22%
5	4.60	3.5	4.4	5.07E-01	5.08E-01	0.27%	3.41E-01	3.43E-01	0.56%
6	6.90	9.5	11.9	5.70E-02	5.75E-02	0.96%	4.15E-03	4.22E-03	1.76%
7	7.30	2.5	3.8	8.68E-01	8.63E-01	−0.67%	9.96E-01	9.78E-01	−1.77%
8	8.60	8.5	17.0	6.41E-02	6.39E-02	−0.39%	5.36E-03	5.31E-03	−1.00%

1 = Cells in suspension with source in nucleus; 2 = Cells in suspension with source in cytoplasm; 3 = Cells in suspension with source on cell surface; 4 = Cells in suspension with source outside of cell; 5 = Packed cell grid with source uniform everywhere; 6 = Packed cell grid with source uniform everywhere except the cell nuclei; 7 = Packed cell grid with source in a spherical shell between the cell membrane and 1.25 times the cell nucleus radius; 8 = Packed cell grid with source only in the cytoplasm; r_n = cell nucleus radius; r_c = cell radius.

2.3. Impact of neural network estimates on cell survival

Previously, Roeske and Stinchcomb (1999) demonstrated that cell survival can be related to microdosimetric moments using:

$$S\left(D\right) = e^{\left[-\left\langle n\right\rangle \left\{1 - T_{1}\left(z_{1}\right)\right\}\right]}, \tag{4}$$

where S(D) is the surviving fraction for cells irradiated to average dose D, and 〈n〉 is the average number of hits. T₁(z₁) is approximated by:

$$T_{1} \approx \exp\left(-\frac{\left\langle z_{1}\right\rangle}{z_{o}} - \frac{\left\langle z_{1}^{2}\right\rangle - \left\langle z_{1}\right\rangle^{2}}{2 z_{o}^{2}}\right), \tag{5}$$

where z_o is the specific energy deposited in a cell to reduce survival to 1/e (Charlton and Turner 1996, Roeske and Stinchcomb 1996). The impact of 〈z₁〉 and 〈z₁ ²〉 values on cell survival using values obtained from the neural network versus actual values was compared for all test data points.
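Equations (4) and (5) translate directly into a short calculation. The sketch below implements them exactly as written above; the function names and the choice of 〈n〉 = 2 are hypothetical. As a check, the "actual" moments for configuration 5 in table 4 (〈z₁〉 = 0.507 Gy, 〈z₁ ²〉 = 0.341 Gy²) with z_o = 0.7 Gy give a value of −ln(S)/〈n〉 that agrees with the corresponding entry in table 5 to three significant figures.

```python
import math


def t1_approx(z1_mean, z1_sq_mean, z0):
    """Equation (5), implemented as printed in the text."""
    variance = z1_sq_mean - z1_mean ** 2
    return math.exp(-z1_mean / z0 - variance / (2.0 * z0 ** 2))


def surviving_fraction(mean_hits, z1_mean, z1_sq_mean, z0):
    """Equation (4): S = exp(-<n> * (1 - T1))."""
    return math.exp(-mean_hits * (1.0 - t1_approx(z1_mean, z1_sq_mean, z0)))


def neg_log_survival_per_hit(z1_mean, z1_sq_mean, z0):
    """-ln(S) / <n>, which reduces to 1 - T1."""
    return 1.0 - t1_approx(z1_mean, z1_sq_mean, z0)


# Moments for configuration 5 in table 4 ("actual" columns), z_o = 0.7 Gy.
per_hit = neg_log_survival_per_hit(0.507, 0.341, 0.7)
print(f"-ln(S)/<n> = {per_hit:.3f}")

# Hypothetical average number of hits, chosen only for illustration.
survival = surviving_fraction(2.0, 0.507, 0.341, 0.7)
print(f"S = {survival:.3f}")
```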

3. Results

Based on the previous methods, a single neural network was created that is capable of providing 〈z₁〉 and 〈z₁ ²〉 values for both single cells in suspension and cells in a close-packed cluster geometry. The final MSE values for training and validation were 5.7 × 10⁻⁷ and 8.1 × 10⁻⁷, respectively. However, these were calculated using the logarithmic transformed outputs and targets. To accurately compare networks by the calculation of performance or errors, 〈z₁〉 and 〈z₁ ²〉 must be re-transformed via the exponential function, and any effects of this transformation should be noted. The MSE values for the exponential transformed outputs from the training and validation sets were 3.67 × 10⁻⁷ and 3.88 × 10⁻⁷, respectively, with the overall MSE for the combined training/validation data equal to 3.71 × 10⁻⁷. While it is important to consider the performance for different data partitions when comparing models or networks, the testing performance is often the most informative measurement for demonstrating goodness of the network. The MSE for our test data was 2.80 × 10⁻⁷.

Figure 4 presents regression analysis plots for the training/validation and test data sets separated for our two outputs, 〈z₁〉 and 〈z₁ ²〉. The x-axes represent the known values while the y-axes represent the network's output or predicted values. Therefore, any points that lie along the identity reference line (shown in red in figure 4) indicate good agreement. These plots are presented on a logarithmic scale to reveal the more frequent smaller values. Note that the data have been transformed to the physical quantities of interest in Gy and Gy². Additionally, a regression analysis provided correlation coefficients (R values) which were close to unity and indicate similarly good results for each subset. The target to output correlations are R > 0.999 for both the training/validation data and the separate test data.

Table 2 summarizes the error distribution in percent for both outputs in each data set. There is a slight difference in the error distribution between the two outputs. In particular, 〈z₁ ²〉 had a larger range of target values as well as a larger spread in the error distribution. Overall, both output error distributions were centered around zero and approximately normal. Minimum and maximum training errors were about ± 2% for 〈z₁〉 and ± 3% for 〈z₁ ²〉. Testing error minima and maxima were about ± 3.5% for 〈z₁〉 and ± 7% for 〈z₁ ²〉. The standard deviations were larger for 〈z₁ ²〉 in both data sets, and the standard deviations for the test set were larger than those for the training set. Since the error distribution is approximately normal, a 95% interval was calculated. The intervals, particularly the testing intervals, represent the range of errors that would be expected for a new calculation, and are summarized in the table.
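The paragraph above does not state how the 95% intervals were computed. The values in table 2 appear consistent with the mean plus or minus two standard deviations, so the sketch below uses that rule as an assumption; the function name is hypothetical.

```python
def two_sd_interval(mean_pct, sd_pct, k=2.0):
    """Approximate 95% interval for a normal distribution: mean +/- k * SD.

    k = 2 is an assumption inferred from the tabulated values,
    not a formula stated in the paper.
    """
    return mean_pct - k * sd_pct, mean_pct + k * sd_pct


# Mean and SD of the percent error for <z1>, taken from table 2.
train_low, train_high = two_sd_interval(-0.01, 0.38)
test_low, test_high = two_sd_interval(0.03, 0.73)

print(f"training: [{train_low:.2f}%, {train_high:.2f}%]")
print(f"testing:  [{test_low:.2f}%, {test_high:.2f}%]")
```

With these inputs the training interval comes out at roughly [−0.77%, 0.75%] and the testing interval at roughly [−1.4%, 1.5%], in line with the tabulated intervals.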

A further analysis of the error as a function of the source-target configuration is shown in table 3, which highlights the average error and the standard deviation of this error. In general, the average errors are close to 0, with the exception of the 〈z₁ ²〉 error, which has an average value of 2.3% for source-target configuration 4. In terms of the standard deviations, the largest occur for source-target configurations 1–4, with the 〈z₁ ²〉 standard deviations being larger than the 〈z₁〉 standard deviations. These differences may be due to the different data sources used for these source-target configurations. In particular, the network was trained using values that were obtained from an analytic approach (Stinchcomb and Roeske 1999), while the test data were obtained from an MC approach (Roeske and Hoggarth 2007). A comparison between these approaches showed that the average difference in 〈z₁〉 was −0.2% +/−1.84%, while the average difference in 〈z₁ ²〉 was 0.04% +/−3.41% (Roeske and Hoggarth 2007), which is comparable to the differences observed here. The standard deviations associated with source-target configurations 5–8 are generally < 1%. The greater consistency in values for these source-target configurations occurs because the same MC code (Mínguez et al 2018, Mínguez Gabiña et al 2020) was used to generate both the training and testing data.

Examples of the neural network output for a small portion of the test data are shown in table 4. As described previously, the neural network uses as input the source-target configuration (1–8), alpha particle energy, and the radii of the nucleus and cell. The output consists of the predicted values of 〈z₁〉 and 〈z₁ ²〉. For comparison, the actual values are shown along with the percent differences. It is important to note that the network was not trained using these particular energies or cell dimensions, and hence these data provide a flavor for the predictive capability of the network.

The advantage of the microdosimetric moments, 〈z₁〉 and 〈z₁ ²〉, is that they can be used to predict cell survival using equation (4). Since cell survival will depend on the average number of hits to the cell nucleus, 〈n〉, we can normalize these values by taking the natural logarithm of equation (4) and dividing −ln(S) by 〈n〉. Table 5 shows a comparison of these values using the actual and predicted values of 〈z₁〉 and 〈z₁ ²〉 from table 4 for z_o = 0.7 Gy, which is a typical value for alpha particle emitters (Charlton and Turner 1996). For these particular examples, the differences between actual and predicted values are within +/−2.5%. Table 6 shows an analysis across all 892 test data points for a range of z_o values. In general, the differences in cell survival using actual versus predicted values of 〈z₁〉 and 〈z₁ ²〉 are within +/−3.5%, with average errors being close to 0, and standard deviations in the percent difference <0.7%. Moreover, these differences are consistent across a range of z_o values.

Table 5. Comparison of calculated cell survival using parameters predicted by the neural network.

Source/Target configuration	Energy (MeV)	r_n (μm)	r_c (μm)	−ln(S)/〈n〉 Actual	−ln(S)/〈n〉 Predicted	% Diff
1	7.07	6.8	9.3	6.10E-02	6.22E-02	1.57%
2	5.43	8.1	12.5	1.02E-01	9.99E-02	−2.16%
3	4.47	6.9	10.5	1.50E-01	1.47E-01	−2.37%
4	8.03	9.9	12.5	7.02E-02	7.09E-02	0.95%
5	4.60	3.5	4.4	5.55E-01	5.56E-01	0.20%
6	6.90	9.5	11.9	7.90E-02	7.97E-02	0.92%
7	7.30	2.5	3.8	7.74E-01	7.70E-01	−0.47%
8	8.60	8.5	17.0	8.87E-02	8.84E-02	−0.39%

1 = Cells in suspension with source in nucleus; 2 = Cells in suspension with source in cytoplasm; 3 = Cells in suspension with source on cell surface; 4 = Cells in suspension with source outside of cell; 5 = Packed cell grid with source uniform everywhere; 6 = Packed cell grid with source uniform everywhere except the cell nuclei; 7 = Packed cell grid with source in a spherical shell between the cell membrane and 1.25 times the cell nucleus radius; 8 = Packed cell grid with source only in the cytoplasm; r_n = cell nucleus radius; r_c = cell radius.

Table 6. Overall comparison of predicted cell survival (−ln(S)/〈n〉) for test data used with the neural network for various values of z_o.

	z_o = 0.3 Gy	z_o = 0.7 Gy	z_o = 1.1 Gy	z_o = 1.5 Gy
Overall % Difference	0.01% +/−0.54% Range: −3.32% to 2.60%	−0.01% +/−0.63% Range: −3.38% to 3.04%	−0.01% +/−0.66% Range: −3.39% to 3.18%	−0.02% +/−0.68% Range: −3.40% to 3.25%

4. Discussion

In this work, we produced a shallow neural network capable of estimating 〈z₁〉 and 〈z₁ ²〉 for both single cells in suspension and cells in a close-packed geometric configuration. The network outputs showed good agreement with published data and with those produced using MC, which supports the use of the network. The network could simply be used as a replacement for or alternative to the MC code originally used to provide these data, but it could more likely be used as an investigative tool for considering new alpha particle emitting radionuclides and other cell sizes. Like many machine learning techniques, the network can continue to be updated as new data are obtained, or the network design or architecture can be adapted and retrained for new purposes as in transfer learning.

However, simulation and data-based models or networks should still be used with the advantages and limitations in mind. The data used in this study are the results of more involved MC simulations, some of which may differ in certain approximations, number of particle tracks simulated, and level of precision. For example, the separate test results presented in this study had a larger spread of errors than the training results. While more errors in testing may be expected since these inputs are new to the network, analysis of the largest testing errors reveals they are from the 100 random input part of the test set (Roeske and Hoggarth 2007). While these training and testing data sets share the same cell geometries (1–4), the MC simulation was known to differ slightly from tabulated values. Some previously published average percent errors between the MC method and tabulated values range from about −2% to +3% for 〈z₁〉 and −4% to +6% for 〈z₁ ²〉 (Roeske and Hoggarth 2007). Therefore, the larger testing errors presented in this study are assumed to include some of these previously quoted uncertainties.

One important consideration is that the network is only trained for the eight different source-target configurations presented in this study. While it can be considered an accomplishment that the network is able to generalize the many different functions for all these configurations, it is also a known limitation of the network. Any calculations beyond the training range may have higher uncertainties and require an updated or retrained network for accurate results. Future studies may want to include more complex geometries such as ellipsoids or 'flat' cells. The network in this study would need to be reinitialized and retrained on data from the new source-target configuration.

A potential limitation of the network is that it is not able to predict 〈z₁〉 and 〈z₁²〉 for source-target configurations not explicitly considered here. For example, if we wanted to generate 〈z₁〉 and 〈z₁²〉 for a combination of activity in the cell nucleus (configuration 1) and activity on the cell nucleus (configuration 2), we could not simply input a value of 1.5 for the source configuration type, as the network would not be capable of generating useful data. However, the advantage of using 〈z₁〉 and 〈z₁²〉 is that values for different source-target configurations can be obtained by taking a weighted sum of the individual components. The weighting factor is given by the average number of alpha particle hits from a particular component divided by the total number of alpha particle hits (Roeske and Thomas 1997). This approach is general, such that it applies not only to hybrid source-target configurations, but also to radionuclides in which multiple alpha-particle emissions occur.
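The weighted sum described above can be sketched as follows. This is a minimal illustration, not code from the study: the function name `combine_moments` and the numerical values in the usage example are hypothetical, and the only assumption taken from the text is that each weight equals the average number of hits from a component divided by the total average number of hits.

```python
from typing import Sequence, Tuple


def combine_moments(
    mean_hits: Sequence[float],
    z1_means: Sequence[float],
    z1_sq_means: Sequence[float],
) -> Tuple[float, float]:
    """Hit-weighted <z1> and <z1^2> for a mixture of source components.

    mean_hits[i]   : average number of alpha particle hits from component i
    z1_means[i]    : <z1> for component i (e.g. from the network or MC)
    z1_sq_means[i] : <z1^2> for component i
    """
    if not (len(mean_hits) == len(z1_means) == len(z1_sq_means)):
        raise ValueError("all inputs must have the same length")
    total_hits = sum(mean_hits)
    if total_hits <= 0:
        raise ValueError("total number of hits must be positive")

    # Weight of each component = its share of the total number of hits.
    weights = [n / total_hits for n in mean_hits]
    z1 = sum(w * m for w, m in zip(weights, z1_means))
    z1_sq = sum(w * m for w, m in zip(weights, z1_sq_means))
    return z1, z1_sq


# Hypothetical example: two components contributing 3 and 1 hits on average.
z1_mix, z1_sq_mix = combine_moments([3.0, 1.0], [0.2, 0.1], [0.05, 0.02])
```

The same function would apply to a radionuclide with several alpha emissions by treating each emission energy as one component.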

There are alternative methods for estimating 〈z₁〉 and 〈z₁²〉 which should be discussed. One approach is to simply use MC calculations. These codes have become more readily available, and have user interfaces that allow them to be used without extensive knowledge of coding (Guerra et al 2021, Schuemann et al 2019). However, they still require time to run. A trained neural network can run in under a second and allows the user to easily change input variables and obtain updated output values quickly. Another approach uses an analytical expression that requires updating based on the source-target geometry (Bertolet et al 2020). While this approach has been demonstrated using simple geometries (single cell with activity on the surface), additional validation may be needed for more complex geometries.

The ranges of input variables in this study were meant to span the values most likely encountered in the study of alpha particle radionuclides. This included the energies quoted and the sizes of cells in the given geometries. Regression networks are known for interpolating well, yet caution should be used for input values near the training boundaries or beyond. If future studies were interested in values beyond the trained range, then it would be recommended that some new data be provided to update the training or that the network be reinitialized and retrained with the new data added. This would expand the scope of the network and improve the accuracy of calculations in the new range.

Machine learning and artificial intelligence in the field of medical physics often lend themselves to imaging studies with deep convolutional neural networks (Maier et al 2019, Maitra et al 2019, Nensa et al 2019). This study serves as a first look into applying machine learning techniques to the field of alpha-particle microdosimetry. While our shallow network approximates the functions for 〈z₁〉 and 〈z₁²〉, it also provides a successful framework for future developments. Provided we have access to a robust dataset or data collection method, we could apply the same technique to more complex, realistic geometries of alpha-particle radiation. Additionally, a similar approach may be used to train and predict the single-event spectra. This would likely involve using a similar network configuration in which the input parameters would be the same but the output would consist of multiple nodes representing the spectrum. Training and validation would follow along the lines discussed here but would require the full single-event spectra. We are currently investigating such an approach and hope to report our findings in the future.

5. Conclusion

A neural network was successfully trained to approximate microdosimetric moments for the study of alpha particle emitters in both single cell and cell cluster geometries. On independent test data, the network predicted 〈z₁〉 and 〈z₁²〉 with an accuracy of ±1.4% and ±2.8%, respectively. The network can be shared for continued use or further development and training.

Appendix

The MC simulations (Roeske and Hoggarth 2007, Mínguez et al 2018, Mínguez Gabiña et al 2020) used in this study for calculating 〈z₁〉 and 〈z₁²〉 follow a similar approach, which is described as follows. In each case, a centrally located cell nucleus is placed at the origin. The nucleus is considered spherical in shape, having a radius ${r}_{{\rm{n}}}$ and is surrounded by a concentric spherical cell with radius ${r}_{{\rm{c}}}.$ Alpha particle emitters are randomly and uniformly placed for the different cases as shown in figure 2. Since the alpha particles have a limited range, only those emissions within a distance equal to the radius of the target nucleus plus the range of the alpha particle are considered. The alpha particles are assumed to travel in a straight line, which is valid for particles with energies <10 MeV (Polig 1978). For each alpha particle emitter, a random emission angle is determined. For those emissions that occur within the cell nucleus (source-target configurations 1 and 5), the point at which the alpha particle intersects the nucleus surface is calculated. The coordinates of the point of emission and of the intersection with the surface are then used to calculate the distance the alpha particle travels in the nucleus. For alpha particle sources that are emitted outside of the cell nucleus (source-target configurations 2–8), the maximum angle, ${\theta }_{{\rm{\max }}},$ that an α particle emission subtends with respect to that nucleus is obtained from:

$\begin{eqnarray}&&\cos \left({\theta }_{{\rm{\max }}}\right)=\displaystyle \frac{\sqrt{{{d}_{{\rm{t}}}}^{2}-{{r}_{{\rm{n}}}}^{2}}}{{d}_{{\rm{t}}}},\end{eqnarray} \tag{ A.1 }$

where ${d}_{{\rm{t}}}$ is the distance of the point of emission to the centre of the target nucleus, and ${r}_{{\rm{n}}}$ is the radius of the target nucleus (see figure 5 for a 2D representation). This maximum angle, ${\theta }_{{\rm{\max }}},$ is calculated in order to see whether the alpha particle is emitted in a direction in which it can hit the nucleus. If the product of the unit vectors which define the position of the alpha particle emission is < $\cos \left({{\rm{\theta }}}_{{\rm{\max }}}\right),$ then the alpha particle will hit the target nucleus. In this case, the angle with respect to the position vector is determined. Then, the distance to the radius of the sphere perpendicular to the particle direction, and in the plane defined by that direction and the position vector, is determined. Lastly, in the case that the alpha particle reaches the nucleus, the difference between the distances after leaving the sphere and before entering the sphere gives the distance that the alpha particle traverses in the target nucleus. If the range of the alpha particle is not long enough to traverse the whole nucleus, then only the distance until the particle stops inside the target nucleus is considered.
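The geometric test for an emission outside the nucleus can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the Excel or Fortran code used in the study. The function names are hypothetical, the nucleus is centred at the origin, and the sketch adopts its own sign convention: the angle is measured between the emission direction and the vector pointing from the emission point towards the nucleus centre, so a hit corresponds to a cosine larger than $\cos \left({\theta }_{{\rm{\max }}}\right).$ The convention in the simulations themselves may be defined differently.

```python
import math
from typing import Sequence


def cos_theta_max(d_t: float, r_n: float) -> float:
    """Equation (A.1): cosine of the maximum angle subtended by the nucleus."""
    return math.sqrt(d_t**2 - r_n**2) / d_t


def path_in_nucleus(
    position: Sequence[float],
    direction: Sequence[float],
    r_n: float,
    alpha_range: float,
) -> float:
    """Straight-line path length inside a nucleus centred at the origin.

    position    : emission point, assumed to lie outside the nucleus
    direction   : emission direction (normalised internally)
    alpha_range : total range of the alpha particle (same length unit)
    """
    d_t = math.sqrt(sum(c * c for c in position))
    if d_t <= r_n:
        raise ValueError("this sketch only handles emissions outside the nucleus")
    norm = math.sqrt(sum(c * c for c in direction))
    u = [c / norm for c in direction]

    # Distance along the track to the point of closest approach to the centre;
    # dividing by d_t gives the cosine of the angle to the centre direction.
    t_closest = -sum(ui * pi for ui, pi in zip(u, position))
    if t_closest / d_t <= cos_theta_max(d_t, r_n):
        return 0.0  # emitted in a direction that misses the nucleus

    # Perpendicular distance (squared) from the centre to the track.
    impact_sq = d_t**2 - t_closest**2
    half_chord = math.sqrt(max(r_n**2 - impact_sq, 0.0))
    t_in = t_closest - half_chord   # distance travelled before entering
    t_out = t_closest + half_chord  # distance travelled when leaving
    if alpha_range <= t_in:
        return 0.0  # particle stops before reaching the nucleus
    # Truncate the chord if the particle stops inside the nucleus.
    return min(alpha_range, t_out) - t_in
```

For a particle emitted 10 μm from the centre of a 5 μm nucleus and aimed directly at the centre, the function returns the full diameter when the range is long enough and a truncated path when the particle stops inside the nucleus.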

**Figure 5.** A 2D scheme of the maximum angle, ${\theta }_{{\rm{\max }}},$ that an α particle emitted in the cytoplasm (the position is given by the (x, y) coordinates) subtends with respect to the cell nucleus (${d}_{{\rm{t}}}$ is the distance from the position of the alpha particle emission to the centre of the target nucleus, and ${r}_{{\rm{n}}}$ is the radius of the nucleus). N.B. A symmetrical ${\theta }_{{\rm{\max }}}$ would be obtained by drawing the other tangent line to the cell nucleus from the (x, y) position of the emitted alpha particle.

The energy deposited by the alpha particle in the target nucleus is obtained from a curve fit of the range/energy relationship for the alpha particles, assuming a mass density of 1 g cm⁻³. Simulations of source-target configurations 1–4 used range-energy data from ICRU Report 49 (ICRU 1993). Simulations of source-target configurations 5–8 used data from the National Institute of Standards and Technology (NIST) (available at https://physics.nist.gov/PhysRefData/Star/Text/ASTAR.html). The polynomial fit for this range-energy relationship is given by (Mínguez et al 2018, Mínguez Gabiña et al 2020):

$\begin{eqnarray}&&\begin{array}{l}{E}_{res}\left({\rm{MeV}}\right)=1.34\times {10}^{-1}+1.89\times {10}^{-1}\times {R}_{res}-2.15\times {10}^{-3}\times {{R}_{res}}^{2}+1.72\times {10}^{-5}\times {{R}_{res}}^{3}\\ \,\,\,\,\,\,\,-\,5.47\times {10}^{-8}\times {{R}_{res}}^{4},\qquad {R}_{res}\gt 10\,\mu {\rm{m}},\end{array}\end{eqnarray} \tag{ A.2 }$

$\begin{eqnarray}&&\begin{array}{l}{E}_{res}\left({\rm{MeV}}\right)=3.51\times {10}^{-2}\times {R}_{res}+3.42\times {10}^{-2}\times {{R}_{res}}^{2}\\ \,\,\,\,\,\,\,-\,2.00\times {10}^{-3}\times {{R}_{res}}^{3},\qquad {R}_{res}\leqslant 10\,\mu {\rm{m}},\end{array}\end{eqnarray} \tag{ A.3 }$

where ${R}_{res}$ is the residual range of the α particle (in μm), given by the range of the particle minus the distance traversed by the α particle, and ${E}_{res}$ is the remaining energy of the alpha particle at that residual range. This fit agreed well with both sources of range-energy data and was used in both MC calculations. The value of the single-event specific energy, ${z}_{1},$ was obtained by dividing the energy deposited in the nucleus by the mass of the cell nucleus. Delta rays were not considered in these simulations as the cell nucleus sizes considered are much larger than their range (Sgouros et al 2010).
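As an illustration of how equations (A.2) and (A.3) lead to a value of ${z}_{1},$ the following sketch evaluates the fit at the residual ranges on entering and leaving the nucleus and converts the energy difference to specific energy. The function names are hypothetical, the residual range on entry is taken as an input rather than derived from the initial energy, and the upper validity limit of the fit is not stated here, so none is enforced.

```python
import math

MEV_TO_JOULE = 1.602176634e-13   # exact SI conversion factor
DENSITY_KG_PER_UM3 = 1.0e-15     # 1 g cm^-3 expressed in kg per cubic micrometre


def residual_energy_mev(r_res_um: float) -> float:
    """Remaining energy (MeV) for a residual range in micrometres, (A.2)/(A.3)."""
    if r_res_um < 0:
        raise ValueError("residual range cannot be negative")
    r = r_res_um
    if r > 10.0:
        return (1.34e-1 + 1.89e-1 * r - 2.15e-3 * r**2
                + 1.72e-5 * r**3 - 5.47e-8 * r**4)
    return 3.51e-2 * r + 3.42e-2 * r**2 - 2.00e-3 * r**3


def energy_deposited_mev(r_res_entry_um: float, path_um: float) -> float:
    """Energy lost along a path inside the nucleus.

    r_res_entry_um : residual range when the particle enters the nucleus
    path_um        : path length inside the nucleus
    """
    # A particle that stops inside the nucleus leaves with zero residual range.
    r_res_exit = max(r_res_entry_um - path_um, 0.0)
    return residual_energy_mev(r_res_entry_um) - residual_energy_mev(r_res_exit)


def specific_energy_gy(energy_mev: float, r_n_um: float) -> float:
    """Single-event specific energy z1 (Gy) for a spherical nucleus of unit density."""
    mass_kg = DENSITY_KG_PER_UM3 * (4.0 / 3.0) * math.pi * r_n_um**3
    return energy_mev * MEV_TO_JOULE / mass_kg


# Hypothetical example: 10 um path through a 5 um radius nucleus,
# entering with 50 um of residual range.
z1 = specific_energy_gy(energy_deposited_mev(50.0, 10.0), 5.0)
```

Keeping the range-energy fit separate from the geometry makes it straightforward to substitute a different fit or tabulated range-energy data.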

Simulations for source-target configurations 1–4 were performed using an Excel spreadsheet (Microsoft, Redmond, WA, USA). Each simulation used 50 000 particles, giving a standard error of <0.5% (Roeske and Hoggarth 2007), and was computed in <1 s on a standard desktop computer with an Intel® Core™ i5-9600T CPU @ 2.20 GHz. For source-target configurations 5–8, a program was developed using the Fortran programming language (Mínguez et al 2018, Mínguez Gabiña et al 2020). The number of atoms simulated was 10⁸, which resulted in ∼10 min of computational time per simulation using an Intel® Core™ i3-550 processor. The relative variation in the calculated values with regard to a simulation with 10⁷ atoms was <0.1%. A comparison between the two computational codes using a common geometry was previously performed (Mínguez et al 2018) and showed maximum differences of 1.3%.
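For context on the quoted standard errors, the moments and the relative standard error of the mean can be estimated from a list of simulated ${z}_{1}$ values as sketched below. The function name is hypothetical and the estimator shown is the textbook one, which is an assumption; the cited codes may compute their uncertainties differently. The standard error of the mean decreases as the inverse square root of the number of simulated hits.

```python
import math
from typing import Sequence, Tuple


def estimate_moments(z1_samples: Sequence[float]) -> Tuple[float, float, float]:
    """Return (<z1>, <z1^2>, relative standard error of <z1>) from MC samples."""
    n = len(z1_samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = sum(z1_samples) / n
    if mean <= 0:
        raise ValueError("mean specific energy must be positive")
    mean_sq = sum(z * z for z in z1_samples) / n

    # Unbiased sample variance, then standard error of the mean.
    variance = sum((z - mean) ** 2 for z in z1_samples) / (n - 1)
    rel_std_error = math.sqrt(variance / n) / mean
    return mean, mean_sq, rel_std_error
```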
