A knowledge-based intensity-modulated radiation therapy treatment planning technique for locally advanced nasopharyngeal carcinoma radiotherapy

To investigate the feasibility of a knowledge-based automated intensity-modulated radiation therapy (IMRT) planning technique for locally advanced nasopharyngeal carcinoma (NPC) radiotherapy. One hundred forty NPC patients treated with definitive radiation therapy with the step-and-shoot IMRT techniques were retrospectively selected and separated into a knowledge library (n = 115) and a test library (n = 25). For each patient in the knowledge library, the overlap volume histogram (OVH), target volume histogram (TVH) and dose objectives were extracted from the manually generated plan. 5-fold cross validation was performed to divide the patients in the knowledge library into 5 groups before validating one group by using the other 4 groups to train each neural network (NN) machine learning models. For patients in the test library, their OVH and TVH were then used by the trained models to predict a corresponding set of mean dose objectives, which were subsequently used to generate automated plans (APs) in Pinnacle planning system via an in-house developed automated scripting system. All APs were obtained after a single step of optimization. Manual plans (MPs) for the test patients were generated by an experienced medical physicist strictly following the established clinical protocols. The qualities of the APs and MPs were evaluated by an attending radiation oncologist. The dosimetric parameters for planning target volume (PTV) coverage and the organs-at-risk (OAR) sparing were also quantitatively measured and compared using Mann-Whitney U test and Bonferroni correction. APs and MPs had the same rating for more than 80% of the patients (19 out of 25) in the test group. Both AP and MP achieved PTV coverage criteria for no less than 80% of the patients. For each OAR, the number of APs achieving its criterion was similar to that in the MPs. The AP approach improved planning efficiency by greatly reducing the planning duration to about 17% of the MP (9.85 ± 1.13 min vs. 57.10 ± 6.35 min). A robust and effective knowledge-based IMRT treatment planning technique for locally advanced NPC is developed. Patient specific dose objectives can be predicted by trained NN models based on the individual’s OVH and clinical TVH goals. The automated planning scripts can use these dose objectives to efficiently generate APs with largely shortened planning time. These APs had comparable dosimetric qualities when compared to our clinic’s manual plans.


(Continued from previous page)
Conclusion: A robust and effective knowledge-based IMRT treatment planning technique for locally advanced NPC is developed. Patient specific dose objectives can be predicted by trained NN models based on the individual's OVH and clinical TVH goals. The automated planning scripts can use these dose objectives to efficiently generate APs with largely shortened planning time. These APs had comparable dosimetric qualities when compared to our clinic's manual plans.
Keywords: Knowledge-based, Intensity-modulated radiation therapy, Automated planning, Nasopharyngeal carcinoma Background Radiation therapy treatment planning for nasopharyngeal carcinoma (NPC) is often challenged by the convoluted target volume and many adjacent organs at risk (OAR) [1]. Intensity-modulated radiation therapy (IMRT) technique has been considered as a common treatment for NPC, because it delivers highly conformal doses to the targets and effectively spares the OARs, potentially improving the local control rate and reducing radiation-related toxicities [2]. However, it is time-consuming to manually generate an IMRT plan due to its intrinsic trial-and-error process. In addition, IMRT plan quality may be inconsistent due to the inhomogeneous knowledge and experience level of the planners [3]. Hence, it is of great need to develop highly efficient automated planning techniques to consistently generate high quality plans.
In our institution, we developed a knowledge-based IMRT treatment planning technique for locally advanced NPC based on a neural network (NN) machine learning model. The NN model correlated an individual patient's OVH with the corresponding plan optimization dose objectives by learning from a cohort of similar locally advanced NPC patients. A set of Perl scripts were developed to bridge the NN model predicted patient specific dose objectives to the treatment planning system for plan optimization and dose calculations.

Patient libraries
Consecutive 140 locally advanced NPC patients treated with definitive IMRT at Fujian Cancer Hospital between July 2016 and September 2018 were retrospectively selected and chronologically separated into a knowledge library (n = 115) and a test library (n = 25). Only NPC patients with bilateral cervical lymph nodes metastases were included. All patients were diagnosed and staged by pretreatment enhanced magnetic resonance imaging (MRI) according to the Chinese 2008 staging system for NPC [22,23]. Each patient was immobilised in a supine position with a thermoplastic mask and underwent contrast enhanced computed tomography (CT) (Brilliance CT Big Bore; Philips Medical Systems Inc., Cleveland, OH, USA) at a 3-mm slice spacing from the skull vertex to the level of 2 cm below the clavicles. Volume delineation was performed on the CT images in the Pinnacle3 treatment planning system (TPS) (Philips Radiation Oncology Systems, Madison, WI) after a CT-MRI fusion.
The target volumes were delineated using an institutional treatment protocol defined as following: the primary nasopharyngeal tumor (GTV_T) and definitive bilateral lymph nodes (GTV_NL and GTV_NR), as determined by clinical information, endoscopic examinations and radiography including CT and MRI. The clinical target volumes (CTVs) included high-risk regions (CTV1), low-risk regions (CTV2), and bilateral low-risk nodal regions (CTV_ NL and CTV_NR). The CTV1 included GTV plus 5-to 10-mm margin. The CTV2 was designed for potentially involved regions and encompassed the entire CTV1. Each target volume was expanded by 3 mm to generate the planning target volume (PTV) in consideration of the setup error, geometric uncertainties and patient movement. In total, for each patient, seven target volumes (GTV_T_P, CTV1_P, CTV2_P, GTV_NL_P, GTV_NR_P, CTV_NL_P and CTV_NR_P) and ten OARs (left/right parotid, brainstem, spinal cord, left/right optic lens, left/ right optic nerves, pituitary and optic chiasm) were delineated. A total dose of 69.96 Gy in 33 fractions at 2.12 Gy/ fraction to the GTV_T_P, 66 Gy at 2 Gy/fraction to the GTV_NL_P/GTV_NR_P, 61.05 Gy at 1.85 Gy/fraction to the CTV1_P, 56.1 Gy at 1.7 Gy/fraction to the CTV2_P/ CTV_NL_P/CTV_NR_P were prescribed.

Manual planning
All patient treatment plans in the knowledge and test libraries were manually generated by a single experienced physicist. All plans were optimised in Pinnacle3 9.2 for treatment delivery by an Elekta Synergy accelerator using seven equally spaced coplanar 6MV photon beams (210°, 260°, 310°, 0°, 52°, 104°, and 156°). During treatment planning, auxiliary structures were generated to be used in objectives parameters (Table 1).
Direct machine parameter optimization (DMPO) was set for all beams with 9 cm 2 minimum segment area, 9 minimum segment monitor unit (MU) and up to 60 maximum segments. The manual plans (MPs) in both libraries followed the institutional locally advanced NPC planning criteria shown in Table 2. To achieve these criteria, objectives shown in Table 3 were used as a starting point for planning. The type, volume and weight for regions of interest (ROIs) were preset and not allowed to change. Only the target dose objectives are tunable to improve plan quality. All the MPs required the planner's best effort to lower the OAR doses by only adjusting the target dose values while maintaining the PTVs' dose coverage. This iterative process shall be repeatedly executed until no further improvement can be made.

Neural network model
The patients in the knowledge library were equally and chronologically divided into 5 groups, each group with 23 patients. A 5-fold cross validation scheme was adopted to generate 5 NN machine learning models. Each model was used to validate one group by training the other 4 groups. The output dose objectives for patients in the test library were obtained by taking the mean of the 5 dose objectives generated from the 5 models.
The details of how to build our NN model were given in this paragraph. For all patients in the knowledge library, their OVH, target volume histogram (TVH) and dose objective values were extracted and normalised. The OVH essentially defines the overlapping volume fraction between an OAR and a uniformly contracted/ expanded PTV (see Fig. 1). It acts as a visualisable descriptor depicting the three-dimensional anatomical relationships between an OAR and the tumor volumes into the two-dimensional Cartesian coordinate system, which can be conveniently used as inputs to an NN model. The TVH indicates the uniformly contracted or expanded PTV. Each NPC patient in the knowledge library had 20 OVH, 5 TVH, and one set of 21 dose objectives. Both OVH and TVH had 11 values, starting from a zero or negative (contraction) distance to an ending positive distance (expansion) with a fixed step size (see Table 4). Our 3-layer NN model had consisted 275, 184 and 21 nodes in its input, hidden and output layer respectively, taking OVH and TVH values as inputs and returning dose objectives as desired outputs. The model learned by refining their node-to-node link weights between two neighboring layers to minimize the cost function defined as the mean squared error between the trained and known value on each output node.
The NN modeling was run by Spyder (a python integrated development environment) on a personal computer with an Intel (i7-2630QM) CPU with 2 GHz main frequency. The model learning rate affects how big a step we update our model weights and values to move towards the minimum output error. The rate was set to 0.02 and model iteration time set to 2500. The choice of     Fig. 1 The overlapping between the left parotid (sky blue) and: (a) the CTV-ALL contracted with a distance of 5 mm (red); (b) the initial CTV-ALL (purple); (c) the CTV-ALL expanded with a distance of 5 mm (tan). The overlap volume fraction is defined as the overlapping volume divided by the volume of the left parotid these parameters yielded satisfactory results in this feasibility study with relatively short training time. During test, the trained model simply calculated a set of patient specific dose objectives based on the OVH and TVH values.

Automated planning
Automated plans (APs) were all generated by an in-house developed Perl and HotScripts planning scripts in Pinna-cle3 9.2. It automated the entire planning process including additional structure generation, beam and optimization parameters setup, and the final inverse optimization. This script also received planning parameters of gantry angle, beam energy, beam modality, treatment isocenter placing, prescription, number of fractions, isodose lines for visualization, IMRT optimization type, maximum number of segments, minimum segment area, minimum segment MU, max iteration (100) and convolution dose iteration at 40th. Finally, the script incorporated the derived dose objectives before the APs were automatically generated with a single loop of iteration of the planning process. An overview of our proposed process was presented in Fig. 2.

Plan comparison and statistical analysis
The AP and the MP of each patient from the test library were all blindly reviewed and rated by one attending radiation oncologist in our institute by evaluating both DVH and dose distribution. Grade C indicated an inferior plan quality which is considered clinically unacceptable. A grade B plan was deemed just about acceptable and grade A suggested a superior plan where the DVH and dose distribution were more desirable. Similar quality plans could be deemed comparable and rated the same. The ratings for both the APs and MPs in the test library were compared by McNemar-Bowker tests using Statistical Package for the Social Sciences (SPSS 21.0; SPSS Inc., Chicago, IL, USA) software. The reviewer also recorded the numbers of ROIs achieving the given criteria to be compared between the APs and MPs. SPSS 21.0 was also used for statistical analysis. The dose parameters in Table 2 were included in the statistical analysis. Mann-Whitney U test was performed to compare dose parameters of the APs and MPs. D x was the received dose corresponding to x% of volume. D 5 was used to evaluate the high dose in PTV. V 30 was the percentage volume receiving 30 Gy dose. Conformity index (CI = (V PTV region receiving prescription dose /V PTV )* (V PTV region receiving prescription dose /V Prescription dose )) and homogeneity index (HI = D5/D95) were calculated for PTV evaluation. Furthermore, planning duration and MU per fraction were also analysed for both the APs and MPs. The alpha level was set at 0.05 and the Bonferroni correction was also applied to control type I error probability. Since 32 tests were carried out in this analysis, it was considered statistically significant when P < 0.0015.

Plan quality comparison
In the blind test, 11 APs were rated A, 10 rated B, and 4 rated C, while 12 MPs were rated A, 10 rated B, and 3 rated C (see Fig. 3). The APs and MPs had the same rating in 19 out of 25 patients. APs were rated better for two patients and worse for four patients. The McNemar-Bowker test result showed that there existed no difference between the rating distribution of AP and MP with a P value of 0.549.

ROI meeting criteria
The numbers of PTVs and OARs achieving the given criteria are listed in Table 5. For no less than 80% of the patients from the test library, the PTV coverage met the criteria in both the APs and MPs. Particularly, CTV_ NL_P and CTV_NR_P of all the APs and MPs achieved their given criteria. GTV_T_P remained the most challenging PTV, since the number of GTV_T_P D 95 achieved the given criteria was 20 and 22 in APs and MPs, respectively. All the left and right lens in both the APs and MPs met the dose constraint of Dmax<8Gy. Pituitary appeared the most challenging OAR to manage, as only 17 APs and 15 MPs were able to meet Dmax<66Gy. Notably, the number of the APs was close to that of the MPs in achieving each OAR criterion. The largest different OAR number achieving its criteria between the APs and the MPs was 2 in both the pituitary and optic chiasm..

Data comparison and analysis
Dose parameters of the PTVs and the OARs using Mann-Whitney U test and Bonferroni correction are also shown in Table 5. PTVs (including GTV_T_P, CTV1_P, CTV2_P, CTV_NL_P, and CTV_NR_P) in the MPs had significantly higher D 95 than those in the APs (P < 0.0015). No significant difference was observed in the D 5 , CI, and HI of PTVs between APs and MPs (P > 0.0015). Moreover, dose parameters of all OARs were comparable between APs and MPs (P > 0.0015), although all the APs showed lower mean dose parameters (except brainstem D 1cc ) compare to the MPs. The D 1cc of brainstem was 56.19 ± 6.87 cGy and 54.95 ± 7.8 cGy in the APs and the MPs, respectively (P = 0.449). The MU for the APs was comparable to that for the MPs (685.04 ± 59.63 vs. 721.36 ± 63.36, P = 0.051). It was also found that the planning duration for the APs was greatly shorten compared to that for the MPs (9.85 ± 1.13 min vs. 57.10 ± 6.35, P < 0.001). Figure 4 is the DVH for patient (#12), one of the best plans of which its AP (solid line) and MP (dashed line) were both rated grade A. It shows clinically acceptable PTV coverages for both the AP and the MP, and it also shows that the AP considerably increases dose sparing to both right optic lens and pituitary. For patient (#12), the PTV coverage in the AP was approximately equal to that in the MP; relative percentage difference at D 95 for GTV_T_P, CTV1_P, CTV2_P, GTV_NL_P, GTV_NR_P, CTV_NL_P and CTV_NR_P were − 0.4, − 1.2%, − 2.4, 0.2, 0.2, 1.9 and 1.8%, respectively. Compared to the MP, AP greatly reduced OAR dose for left parotid V 30 , right optic lens Dmax, left optic nerve Dmax, right optic nerve Dmax and pituitary Dmax with relative percentage

Discussion
We developed a feasible knowledge-based IMRT treatment planning technique for locally advanced NPC using a trained 3 layer NN model. The knowledge-based library consisted of a comparatively larger sample size of 115 locally advanced NPC patients [12,14,20], and each patient had a high-quality manual IMRT plan. 5-fold cross validation method was also applied in our study. In addition, a wide range of OVH and TVH information which would have a great effect on the resulting dose distribution were selected as the input of the NN model [24]. Patient specific dose objectives predicted by the model were subsequently used for a single-iteration automated planning, which generated high quality, clinically acceptable or superior APs for 21 out of the 25 patients under test. For the 4 patients whose APs were rated C, their MPs were rated C as well (#6, 13, and 18),   intervention when the embedded Pinnacle scripts were running. Currently, the dose objectives derived from the NN model on our personal computer had to be manually transferred to the TPS computer, so our knowledge-based automated planning technique was not fully automated in this sense. Nevertheless, the model could be transferred on the TPS to complete the automation workflow in the future. Note that although the training time for each NN model was 27 min, the time to generate a set of objective for one patient took only less than 0.1 s.
Wu B and his group [20] applied k-nearest neighbour method and made a prediction on the best DVH of each single OAR based on its OVH, which might compromise the dose distribution when every OAR reached its best DVH. However, our study took all target volumes and OARs into consideration at the same time, and employed a NN model to derive a patient-specific set of dose objectives.
Our study did not include some OARs such as oral cavity, temporal lobes, and thyroid glands because these OARs could easily achieve their dose constraint by setting dose constraint to the additional rings (R5200, R4500, R3600, and R3100). Our study has not fully addressed the dose inhomogeneity with single iteration optimization. But one study suggested that automatic generation of regions and objectives for hot and cold spots would further improve dose uniformity without manual interference [25]. The study also utilised embedded Pinnacle scripts and provided a solution on achieving better CI and HI for us.
Our study proposed a prospective automated IMRT planning technique for locally advanced NPC. Although our current study has limited the settings of machine parameters such as gantry angles and segment sizes, the same technique can be applied to more complicated IMRT delivery techniques. We anticipate that volumetric modulated arc therapy treatment planning can also take advantage of the described technique to achieve individually tailored optimal radiotherapy plans. In addition, as the volume, position and dose of targets and OARs would change during the treatment course for NPC patients [26][27][28], the introduction of adaptive radiation therapy (ART) could potentially improve the treatment outcome [29]. Our knowledge-based automated planning approach would be of great value to generate high quality ART plans for NPC patients in an efficient manner.

Conclusions
A robust and effective knowledge-based IMRT treatment planning technique for locally advanced NPC is developed by use of NN model and HotScripts planning scripts in Pinnacle3 9.2 TPS. This automated technique largely shortened planning time without compromising the plan quality.