Moving GPU‐OpenCL‐based Monte Carlo dose calculation toward clinical use: Automatic beam commissioning and source sampling for treatment plan dose calculation

Abstract We have previously developed a GPU‐based Monte Carlo (MC) dose engine on the OpenCL platform, named goMC, with a built‐in analytical linear accelerator (linac) beam model. In this paper, we report our recent improvement on goMC to move it toward clinical use. First, we have adapted a previously developed automatic beam commissioning approach to our beam model. The commissioning was conducted through an optimization process, minimizing the discrepancies between calculated dose and measurement. We successfully commissioned six beam models built for Varian TrueBeam linac photon beams, including four beams of different energies (6 MV, 10 MV, 15 MV, and 18 MV) and two flattening‐filter‐free (FFF) beams of 6 MV and 10 MV. Second, to facilitate the use of goMC for treatment plan dose calculations, we have developed an efficient source particle sampling strategy. It uses the pre‐generated fluence maps (FMs) to bias the sampling of the control point for source particles already sampled from our beam model. It could effectively reduce the number of source particles required to reach a statistical uncertainty level in the calculated dose, as compared to the conventional FM weighting method. For a head‐and‐neck patient treated with volumetric modulated arc therapy (VMAT), a reduction factor of ~2.8 was achieved, accelerating dose calculation from 150.9 s to 51.5 s. The overall accuracy of goMC was investigated on a VMAT prostate patient case treated with 10 MV FFF beam. 3D gamma index test was conducted to evaluate the discrepancy between our calculated dose and the dose calculated in Varian Eclipse treatment planning system. The passing rate was 99.82% for 2%/2 mm criterion and 95.71% for 1%/1 mm criterion. Our studies have demonstrated the effectiveness and feasibility of our auto‐commissioning approach and new source sampling strategy for fast and accurate MC dose calculations for treatment plans.


| INTRODUCTION
Although Monte Carlo (MC) simulation is the most accurate dose calculation method in radiotherapy, 1 its long computational time has prevented clinical application. Recent research efforts have focused on developing high performance MC simulation packages on graphics processing unit (GPU) platforms. [2][3][4][5][6][7][8] While the accuracy and efficiency of fast GPU-based particle transport simulations in patients have been demonstrated, a number of barriers still need to be overcome before using GPU-based MC dose calculations in the clinic.
An accurate linear accelerator (linac) beam model is critical for the overall accuracy of MC dose calculations for external radiotherapy. 1,[9][10][11] A phase-space file obtained from the MC simulation of a linac head is the most accurate beam model, and hence is used in many CPU-based MC packages. However, regarding the computational efficiency, it is not preferable to directly use a phase-space file in GPU-based MC simulation. One reason is the overhead of loading a large number of particles and transferring them from CPU to GPU.
Although the data loading and transferring time (usually in seconds) is not significant for traditional CPU-based MC calculations, it becomes a considerable issue in GPU-based simulation when compared with the short simulation time on GPU (less than 1 minute for a typical patient case). 12 In addition, direct use of a phase-space file is not suitable for GPU's single-instruction-multiple-data programming scheme. That is because particles are stored in the phase-space files in a random order in terms of particle type (photon or electron), energy, and spatial location. When simulating a group of particles of different types and energies simultaneously on GPU platforms, the threads inside a warp will have very different execution paths, leading to the so-called thread divergence issue and impairing the overall program efficiency. 8 Fine-tuning the beam model to properly represent an actual clinical beam is an essential step. However, a phasespace file cannot be easily commissioned to represent a specific clinical beam. Conventional solution is to tune the parameters in linac head MC simulations in a trial-and-error fashion. Besides being cumbersome and time-consuming, [13][14][15][16] this commissioning process requires MC expertise, impeding its application in the clinic. An accurate beam model with an easy commissioning approach is hence required to facilitate the clinical application of MC dose calculations.
We have recently developed a GPU-OpenCL-based MC dose engine named goMC for radiotherapy, a cross-platform MC dose engine that may be executed on different computing devices, such as CPU, GPU from different vendors, and heterogeneous systems. 7 An analytical linac beam model derived from a phase-space file has been added to goMC for external photon therapy, along with a GPU-friendly scheme to sample source particles from the model. 17 In this paper, we report our effort in moving goMC toward clinical use. Regarding beam commissioning, we have previously developed an optimization-based automatic commissioning approach for a phase-space-file-based beam model. 18 We have shown that using measurement dose in water, this approach can finely tune the energy spectrum and fluence distribution of the beam model to accurately represent the targeted beam. The accuracy of the commissioned beam model was validated in water as well as in a headand-neck (HN) cancer patient case. 18 However, the method was only validated in a few 6 MV photon beams for the purpose of proof of principles, and primary and scattered photons were not separated in that beam model. Hence, it is the first objective of this paper to apply the commissioning approach to our new analytical beam model in goMC, where the degrees of freedom are increased due to separation of primary and scattered sub-sources. Moreover, we will further assess the validity of our analytical beam model and commissioning method on photon beams of different energies and flattening-filter-free (FFF) beams. In addition, some code modifications will be made to streamline the data preparation for commissioning.
The second motivation of this paper is to develop an effective sampling approach to incorporate plan parameters, e.g., beam angle, monitor unit (MU), and multi-leaf collimator (MLC) configuration at control points, into MC dose calculation of a treatment plan.
Although particle transport simulations may be performed within MLC, modeling its geometry and movements is not trivial 19,20 and adds additional computational burden. 21,22 Currently, no GPU-based MC packages can support this function. Sampling source particles for each beam angle in accordance with the fluence map (FM) at that beam is a feasible alternative. The conventional way is to let each source particle carry the FM value of the position where the particle intersects the FM plane as its weight in dose calculation.
This FM weighting approach has been widely employed in previous studies due to its simplicity. 7,12,23,24 However, because the MLC aperture at a control point is usually small, many source particles sampled from the beam model will hit the MLC leaves and carry small weights associated with MLC transmission. These particles contribute little to patient dose, indicating inefficient use of sampled particles in the computationally intensive MC simulations. This is not an issue for intensity modulated radiation therapy (IMRT) cases, as the FMs of IMRT plans are usually accumulated over the control points at a same beam angle to deal with the small MLC apertures at each individual control point. However, for volumetric modulated arc therapy (VMAT), the FM weighting approach becomes suboptimal. In this paper, we present an efficient source sampling approach to address this issue. It allows us to first sample the particle position on the FM plane based on the beam model, and then use the FM to bias the sampling of the control point to reduce the portion of those particles that would hit MLC leaves and contribute little to patient dose. Our experiments will demonstrate the accuracy and effectiveness of this method.

2.A | Automatic beam commissioning
The analytical beam model used in goMC was a field-independent beam model, characterizing the particle distribution of the linac beam just above the secondary collimator. In our beam model, the phasespace plane was partitioned into a set of phase-space-ring (PSR) sub-sources. 17 A PSR sub-source is defined as a group of particles with the following characteristics: (1) spatially distributed in a narrow ring region, (2) within a small energy range, (3) same particle type (photon or electron), and (4) similar interaction history (primary or secondary). The particle intensity and direction distribution in each sub-source were parameterized in analytical functions, by analyzing a reference phase-space file. Our previous study has demonstrated the feasibility of using this analytical beam model to accurately represent the reference phase-space file, and its efficiency benefit in GPUbased MC simulations. 17 However, manual adjustment of this analytical beam model to represent a real linac beam is impractical due to the large number of sub-sources in the beam model; automation is therefore a desirable feature. Using the measured dose to adjust the optimal angular width parameters (i.e., the parameters that characterize the particle direction distribution within each sub-source) of our beam model is challenging, since these parameters appear in exponential terms. 17 Hence, we assumed that the angular width parameters fitted from a reference phase-space file were accurate enough to model the local direction distribution of a real beam, and we only needed to finely tune the energy spectrum, the particle distribution, and the overall direction distribution to match the measured dose, by adjusting the relative intensity among the sub-sources of different energies and locations. This assumption enabled us to model the commissioning as a linear problem, due to the linearity of the dose calculations. A multiplicative correction factor was hence introduced for commissioning. The dose distribution of a commissioned beam model is then a simple summation of the sub-source doses weighted by their corresponding correction factors. These factors can be automatically adjusted by a numerical optimization algorithm that minimizes the differences between the actual measurement and the calculated dose. We have used this optimizationbased idea to commission a phase-space-let beam model previously, in which each sub-source contained both the primary and scattered photons. 18 In this study, we investigated the feasibility of this auto-commissioning approach on our analytical beam model, which was a more challenging scenario with increased degrees of freedom because of the separated primary and scattered sub-sources in the beam model.

2.A.1 | Pre-calculating the dose contribution of each sub-source
To determine these correction factors, we need to pre-calculate the dose contribution of each sub-source in water for open fields. In our previous study, 17 our MC dose calculation code was repeatedly launched, sampling a certain amount of source particles from one sub-source and transporting these particles in a water phantom to obtain the dose distribution of this sub-source. The resulting 3D dose distribution was then post-processed to extract the dose values at certain voxels (i.e., central axis depth dose and inline and cross-line profiles at several depths) to be used for commissioning later. This workflow was time consuming mainly due to two reasons: (1) repeatedly initializing the MC code and transferring input data from CPU to GPU; (2) post-processing a large data set of 3D dose distributions (e.g., a few hundred Gigabytes).
To address this issue and streamline the data preparation of commissioning, we have modified our dose calculation code to perform concurrent dose calculations for all the sub-sources in one run of MC simulation, with separate storage for each sub-source dose.
The modifications are described as follows: 2. When the MC code was launched to calculate the dose for all the sub-sources, the source particles were sampled from our analytical beam model and transported in a water phantom, as if conducting a typical dose calculation. The difference was that in this sub-source dose calculation each source particle carried a sub-source index denoting which sub-source it was sampled from. Secondary particles generated during transport inherited the same index. When a particle deposited dose to a voxel of the water phantom that belonged to the commissioning data set, this dose value would be recorded into an entry of matrix A. The column index of this entry in matrix A was specified by the subsource index carried by the particle. The row index of this entry was quickly determined using a pre-created row index look-up  | 71 PSRs and the rows corresponding to the voxels at depths larger than the electron penetration depth (i.e., after the build-up region). b 1 is a vector consisting of the measured data of a specific linac to be commissioned at those voxels. The x p vector contains the correction factor to be determined for each photon PSR. These correction factors should be non-negative, since they will be multiplied to the original intensity of each PSR to reweigh their contributions in a beam.

2.A.2 | Commissioning model
Once the correction factors of the photon PSRs were determined, we then commissioned the electron PSRs to match the buildup region. Note that the regions already commissioned in the last step were beyond the electron penetration depth, and hence not affected by this step. The electron PSRs within a same energy bin were grouped to form an "effective PSR," and were adjusted together in our commissioning via a single correction factor. The sub-source dose for an electron "effective PSR" was obtained by summing over the doses of all the electron PSRs within the corresponding energy bin. The purpose of this grouping strategy was to average out the relatively large statistical uncertainty in the dose distributions of the electron PSRs because of the very small amount of the contaminant electrons in a photon beam (~1% of the total particles). 18 The electron commissioning model was formulated as follows: Here, x e denotes a vector of the correction factor for each electron "effective PSR." Each column of the matrix A 2e corresponds to the dose for an effective PSR in the build-up region. This matrix was obtained by fetching a submatrix of A with those columns for electron PSRs and rows for voxels in the build-up region, and then summing over columns with the same energy bin. A 2pxp denotes the total dose of the previously commissioned photon PSRs in the buildup region. The b 2 term is the measured data in the same region.
Both the problems in Eqs. (1) and (2) are a least square optimization problem, and gradient descent algorithm was used to solve them.
Compared to our previous approach developed for the phasespace-let beam model, 18

2.B | FM-based biased source sampling
When performing dose calculation for a specific VMAT treatment plan, three groups of information have to be considered: particle distribution in the beam model, MU distribution among different control points, and the spatial distribution of the MLC transmission factors at each control point corresponding to its MLC aperture. This is not a trivial task, if one would like to achieve a high efficiency.
Generally speaking, each distribution can be incorporated either by altering particle weight or biasing source sampling according to the distribution. The latter is preferred because of its more effective use of source particles in MC simulations.
To achieve this, source sampling for VMAT treatment plan was conducted in two steps. The first step was to sample source particles from our field-independent analytical beam model. A GPU-friendly source sampling strategy has been designed for this step in our previous study. 17 To be efficient, the sampling was only performed within or nearby the jaw open area. Once the source particles were sampled from our beam model, they followed a given distribution in terms of particle type, spatial location, direction and energy that was characterized by the model; the weight carried by each particle equaled to one. The second step was to incorporate the specific plan information into these sampled source particles. The FM value of a position on the 2D FM plane (i.e., a plane defined on the MLC upper surface using same coordinate system of our beam model) at a control point reflects not only the MU that the beam delivered at this control point, but also the MLC leaf open-close status. It is preferable to bias source sampling based on FMs to reduce the portion of the sampled particles that would hit MLC leaves and contribute negligibly to patient dose.
In this study, we proposed a strategy to realize this biased sampling, referred to as the FM-based biased sampling method hereon, employing the inverse sampling method. 25 Specifically, let us denote the FM as FM x; y; k ð Þ, with (x; y) being a position on the FM plane and k representing a control point. Given a source particle already sampled from our beam model, its position on the 2D FM plane is already determined. For this particle, the cumulative probability density function (CPDF) of the control points can be calculated as This CPDF can be used to assign the control point index of the particle. For instance, after sampling a source particle from the beam model, we could generate a random number c 2 0; 1 ½ and search for the control point index k that satisfied CPDF k ð Þ c\CPDF k þ 1 ð Þ.
Then the source particle was rotated according to the beam configuration of this control point, and carried a weight equal to P k FM x; y; k ð Þ. However, searching for this control point index in the CPDF array is not preferred on GPUs, because of the sequential behavior and hence potential GPU thread divergence issues. This is particularly a concern for the treatment plans with a large number of control points. Hence, to quickly determine a control point index for each source particle, we pre-created a numerical inverse look-up With our method, once a source particle is sampled from the distribution of a beam model, it will have a high probability to be assigned to the control point that not only has a relatively large MU but also has the MLC leaves open at the particle's position on the FM plane. It is this effect that makes our approach more efficient, requiring fewer source particles to achieve a targeted uncertainty level.    We would like to mention that even with our improvement, the computation time for the sub-source dose calculation was still high.    These results were further quantitatively evaluated using regionspecific metrics as suggested by AAPM task group 53. 26  were calculated for the regions with relatively low-dose gradients, including the depth dose after build-up and the lateral dose profiles at inner and outer beam regions. These two metrics were calculated as: The output factors of our commissioned source model, which were defined at 100 cm SSD and d max depth, were also calculated and compared with the measurement and reference source model, as shown in Fig. 3(a). After commissioning, the maximum absolute difference was improved from 1.29 to 0.64% as compared with the measurement data. The average difference was improved from 0.61 to 0.17%.  Tables 3 and 4, respectively. For the depth dose curves, the average DTA was reduced from 0.12-0.38 cm to 0.06-0.20 cm, and the maximum DTA from 0.20-0.50 cm to 0.17-0.35 cm at the build-up region; the RMS and maximum difference were reduced from 0.42-1.96% to 0.22-1.27% and from 1.02-3.04% to 0.46-1.61%, respectively, at the region after build-up. RMS and maximum differences at the inner beam region were improved from 0.40-2.77% to 0.21-1.02% and from 1.01-4.52% to 0.61-1.95%, respectively. The calculated output factors are shown in Fig. 3(b). The maximum absolute differences of the output factors and the average difference improved from 1.55 to 0.68% and from 0.76 to 0.34%, respectively. for this beam are shown in Fig. 3(c). The maximum absolute difference of the output factors and the average difference improved from 1.70 to 0.37% and from 0.80 to 0.14%, respectively.

3.A.3 | 10 MV FFF photon beam
Although in an actual clinical setting the measured data at all field sizes were for the purpose of beam commissioning, our results show that using three field sizes (40 9 40, 10 9 10, and 2 9 2 cm 2 ) to commission our beam model already led to a sufficient accuracy for other different field sizes. That is because our analytical beam model was derived by analyzing a reference phase-space file, and hence already a good approximation of the linac beam. The commissioning was to further finely tune the beam model for each specific clinical beam. We expect that using more field sizes will improve the commissioning result, but probably incrementally.

3.B | Efficient source sampling
Since our FM-based biased source sampling method and the FM weighting method model the same physical process, they are expected to give us the same result at the limit of zero statistical uncertainty. To demonstrate this first, we calculated the dose distributions of a VMAT HN patient case with these two methods,  | 79 respectively, both achieving 0.25% average statistical uncertainty (relative to the prescription dose). The doses are shown in Fig. 8.
The average absolute difference between these two doses was 0.64% (relative to the prescription dose). We also performed 3D Gamma-index test, and achieved 99.78% passing rate for the 2%-2 mm criterion and 94.87% for the 1%-1 mm criterion.
We then investigated the overall efficiency of MC dose calculations on this VMAT case when using these two source sampling methods. Since the computation time of MC simulation was associated with the achieved statistical uncertainty, the same uncertainty level was achieved by these methods to ensure fair comparison. As listed in Table 7, to achieve 0.25% statistical uncertainty, our method required 2.16 9 10 10 source particles, compared to 6.00 9 10 10 particles needed by the FM weighting method. The reduction factor on the number of the required source particles was~2 To better understand how our method outperformed the FM weighting method in terms of efficiency, we recorded the number of the sampled "less useful" source particles (i.e., the source particles that hit the MLC leaves). Among 1.35 9 10 9 source particles sampled from the beam model, 4.01 9 10 8 particles were found to be "less useful" particles when using the FM weighting method, accounting for 29.70% of the sampled particles. In contrast, when sampling the same amount of source particles with our method, 1.46 9 10 8 "less useful" particles were found, accounting for 10.81% of the total amount. These numbers demonstrated that our FM-based biased source sampling could effectively reduce the portion of the "less useful" source particles, which hence lead to an overall efficiency gain in MC dose calculations.

3.C | Comparison with commercial treatment planning system
The doses calculated for a VMAT prostate patient case treated with 10 MV FFF beam are shown in Fig. 9. The first two rows represent the dose calculated in the Eclipse TPS and the one calculated in our goMC dose engine, respectively. The absolute differences between these two doses are shown in the third row. The dose volume histograms corresponding to these two doses are shown in Fig. 10. It actually took us two steps to reach an accurate analytical beam model to represent an actual linac beam: (1) fitting the particle distribution of a phase-space file into an analytical beam model first; (2) further fine tuning the beam model by adjusting the relative intensities of the sub-sources to match the measured data. It would be preferred to fit the analytical model directly using the measured dose.
However, the relationship between the particle distribution and the resulting dose is complicated. It is very difficult to directly model this relationship in a closed analytical form and then use it to fit the beam model. Our current two-step strategy is an alternative way to overcome these challenges. It is relatively straightforward to analyze the  We assumed beam rotational symmetry when deriving our analytical beam model from a phase-space file and further commissioning it. The reason was twofold. One was to efficiently use the limited amount of the particles stored in the file to build a model with relatively small statistical uncertainty. Another reason was to decrease the number of correction factors to be adjusted during commissioning. However, the actual spot size of the photon beam was usually found to be elliptical. 21 This elliptical photon spot, as well as different locations of X jaws and Y jaws along beam direction, should result in the non-interchangeable output factors of rectangular shapes 29  indicate that the beam asymmetric issue above the jaw does not seem to be a major concern for clinical use, although further studies are needed to test this in more challenging cases, e.g., small field dosimetry, with more stringent criteria.
Although we aimed to realize the clinical use of goMC, the autocommissioning method and the biased particle sampling strategy for treatment plan dose calculation are not specific to our own dose engine. First, our idea to solve the beam commissioning problem from an optimization perspective provides a general auto-commissioning approach. Our commissioning idea may be applicable to other linac beams, such as Cyberknife. Second, the proposed FMbased biased source sampling method is expected to be applicable to a variety of beam models. For instance, a phase-space file beam model presents the same efficiency issue for VMAT dose calculation, and may benefit from our sampling method.

This study is supported by Cancer Prevention & Research Institute
of Texas (CPRIT RP150485). The authors would like to thank Dr.
Damiana Chiavolini for editing the manuscript.

CONF LICT OF I NTEREST
The authors have no relevant conflicts of interest to disclose.