Report on planning comparison of VMAT, IMRT and helical tomotherapy for the ESCALOX-trial pre-study

The ESCALOX trial was designed as a multicenter, randomized prospective dose escalation study for head and neck cancer. Therefore, feasibility of treatment planning via different treatment planning systems (TPS) and radiotherapy (RT) techniques is essential. We hypothesized that different TPS and RT techniques yield comparable dose distributions for the simultaneous integrated boost (SIB) volumes while respecting the constraints. CT data sets of the first six patients (all male, mean age: 61.3 years) of the pre-study (up to 77 Gy) were used for comparison of IMRT, VMAT, and helical tomotherapy (HT). Oropharynx was the primary tumor location. Normalization of the three-step SIB (77 Gy, 70 Gy, 56 Gy) was D95% = 77 Gy. Coverage (CVF), healthy tissue conformity index (HTCI), conformation number (CN), and dose homogeneity (HI) were compared for PTVs, and the conformity index (COIN) for parotids. All RT techniques achieved good coverage. For SIB77Gy, CVF was best for IMRT and VMAT; HT achieved the highest CN, followed by VMAT and IMRT. HT also reached better HTCI and HI values than both other techniques. For SIB70Gy, CVF was best with IMRT. HTCI favored HT, and consequently CN as well. HI was slightly better for HT. For SIB56Gy, CVF was comparable. Conformity favored VMAT as seen by HTCI and CN. Dmean of the ipsilateral and contralateral parotids favored HT. Different TPS for dose escalation reliably achieved high plan quality. Despite the very good results of HT planning for coverage, conformity, and homogeneity, the TPS also achieved acceptable results for IMRT and VMAT. Trial registration: ClinicalTrials.gov Identifier: NCT 01212354, EudraCT-No.: 2010-021139-15. ARO: ARO 14-01


Background
Head and neck cancer patients are treated by intensity modulated radiotherapy (IMRT) as standard of care in radiation oncology. Early in the era of IMRT application, the concept of the simultaneous integrated boost (SIB) was evaluated [1]. This boost technique creates a selective heterogeneous dose distribution in one target divided into subvolumes with the aim of better conformity [2]. Many results of planning comparisons using different treatment planning systems (TPS) and radiotherapy (RT) treatment concepts have been published for head and neck cancer as well as for other cancers. Many of these used real patient data sets for comparison of treatment modalities at the same institution [3-9]. IMRT, volumetric modulated arc therapy (VMAT), and helical tomotherapy (HT) are routinely used to treat patients with head and neck cancer.
In multicenter prospective trials with IMRT, quality assurance (QA) of RT planning is mandatory [10,11]. QA for RT of head and neck cancer does not only include the quality of RT plans, but begins earlier in the treatment process. The delineation of the target and organ at risk (OAR) requires standardization and quality control. Especially with the implementation of RT plans with steep dose gradients, the correct target volume delineation is of outstanding importance. Otherwise the clinical outcome of head and neck cancer patients is compromised [11]. Feasibility of the RT-strategy is the first step of RT-trial design. The second step is the proof of dose specification with different TPS, followed by a dummy run of all participating trial centers including a dummy run for target volume and OAR contouring.
The ESCALOX trial was planned as a multicenter trial for dose escalation in head and neck cancer to the gross tumor volume (GTV) defined by CT and MRI imaging in 2010. FMISO-PET was part of a translational research project and not implemented in target volume definition. Data sets of the first six patients of the pre-study as determined by the German Federal Office of Radiation Protection (Bundesamt für Strahlenschutz, BfS; registration number Z5-22463/2-2011-011) were used for planning comparison with different TPS. The aim of this planning study was to compare the non-uniform dose prescription and calculated planning target volume (PTV) doses of the different SIB volumes as well as the calculated doses to the OARs between two TPS: Eclipse version 13.0 (Varian Medical Systems, Palo Alto, CA, USA) for VMAT (RapidArc) and dynamic multi-field IMRT, and TomoTherapy Planning Station version 4.2.3 (Accuray, Sunnyvale, CA, USA).
The hypothesis of this investigation was the generation of comparable RT plans via different TPS and RT techniques respecting the dose prescription to the SIB volumes as a primary objective. The secondary planning goal was the concurrent minimization of dose to the organs at risk (OAR) while having an escalated dose of up to 77 Gy in the SIB.

Patient characteristics and RT concept
Patients were treated between 1/2016 and 2/2017 after inclusion in the pre-study of the ESCALOX trial at the Department of Radiation Oncology of the Technical University of Munich. All included patients were male with a smoking history of more than 10 pack-years. The mean age was 61.3 years (range 53-70 years). In all cases, the primary tumor location was the oropharynx (for details, see [12]).

Target delineation
Head and shoulder mask systems were customized before treatment planning CT for all patients (slice thickness 3 mm, intravenous contrast enhancement in all cases). CT extension comprised base of skull to mid-mediastinum. The CT images were reconstructed with 512 × 512 pixel matrices.
Before treatment planning, every patient was also staged by means of a head and neck MRI. These MRI scans were co-registered to the planning CT. Based on the panendoscopy report and all available imaging, the GTV, clinical target volume (CTV), and SIB (PTV) were delineated and crosschecked by two radiation oncologists (MD, SP). The bilateral cervical lymphatic drainage was included in the PTV in all cases. Every isotropic margin generated by the TPS was re-contoured manually with respect to anatomic structures that are usually not tumor-invaded (bones, ligaments, cartilage, muscle), as recommended by the trial protocol. The RT plans were evaluated quantitatively and qualitatively (isodose curves and color wash) by two physicians.

Treatment planning-VMAT, IMRT, HT
All plans were calculated, optimized, and approved by experienced medical physicists on the particular system (Eclipse for VMAT and IMRT, and TomoTherapy Planning Station for HT). Normalization was done to 95% of the volume with the highest prescribed dose (SIB77Gy). In the tomotherapy planning workflow, normalization is done during the optimization process. Hence, there can be small discrepancies between the "normalization dose" and the dose after final dose calculation (after which the dose will not be normalized again). The first optimization goal was to bring 95% of the PTV to the specific prescribed doses (Table 2). In addition, 50% of the PTV volumes (SIB77Gy, SIB70Gy, SIB56Gy) were to receive no more than 79.3 Gy, 72.1 Gy, and 57.7 Gy, respectively. The second planning objective was to spare the following OARs according to their constraints or to the ALARA principle: spinal cord (spinal cord + 5 mm) (mandatory), brachial plexus (mandatory), brainstem (mandatory), optic nerve (mandatory), mandible, glottis, and parotid glands.
For OARs labeled as "mandatory", the given dose was a hard constraint. In some cases, when it was not possible to fulfill both criteria (PTV dose and mandatory OAR dose; see Table 3), a lower dose in the PTV was accepted (detailed delineation and planning are described in [12]). All planning modalities used identical dose criteria for the SIBs and OARs as described in the trial protocol.
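The normalization and the D50% limits described above can be illustrated with a minimal Python sketch. It assumes that the dose of a structure is available as a list of per-voxel doses with equal voxel volumes; the function names and the toy dose values are hypothetical and are not part of Eclipse or the TomoTherapy Planning Station. The three limits are taken from the text and correspond to roughly 103% of the respective prescribed doses.

```python
import math

# D50% limits in Gy as stated in the text for the three SIB volumes.
D50_CAPS_GY = {"SIB77Gy": 79.3, "SIB70Gy": 72.1, "SIB56Gy": 57.7}


def dose_at_volume(doses_gy, volume_percent):
    """Return D_x%: the minimum dose received by the hottest x% of voxels.

    Assumption: all voxels have the same volume.
    """
    if not doses_gy:
        raise ValueError("empty dose list")
    ordered = sorted(doses_gy, reverse=True)
    # Number of voxels that make up x% of the volume (at least one voxel).
    n = max(1, math.ceil(len(ordered) * volume_percent / 100.0))
    return ordered[n - 1]


def normalize_to_d95(doses_gy, prescription_gy):
    """Rescale all doses so that D95% equals the prescribed dose."""
    scale = prescription_gy / dose_at_volume(doses_gy, 95.0)
    return [d * scale for d in doses_gy]


def meets_d50_cap(doses_gy, volume_name):
    """Check that 50% of the volume receives no more than the stated limit."""
    return dose_at_volume(doses_gy, 50.0) <= D50_CAPS_GY[volume_name]


# Toy (hypothetical) per-voxel doses for the SIB77Gy volume, in Gy.
toy_plan = [76.0 + 0.04 * i for i in range(100)]
normalized = normalize_to_d95(toy_plan, 77.0)
print(round(dose_at_volume(normalized, 95.0), 2))
print(meets_d50_cap(normalized, "SIB77Gy"))
```

A single scaling factor is used here because the text describes one normalization point (D95% of SIB77Gy); as noted above, the tomotherapy workflow normalizes during optimization, so small deviations from this idealized value can remain after the final dose calculation.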

VMAT and IMRT
Treatment planning for the VMAT and dynamic IMRT plans was performed with the Eclipse 13.0 TPS. For dose calculation, the Anisotropic Analytical Algorithm (AAA, version 10.028) was used with a dose grid size of 2.5 × 2.5 × 3.0 mm³. VMAT and IMRT planning was performed for treatment using a Varian Clinac Trilogy linear accelerator equipped with a 120 HD MLC. VMAT plans consisted of three arcs with 358° rotation each. The sliding window IMRT plans were optimized using a beam set-up of nine coplanar static beam directions with evenly distributed angular distances.
All plans were normalized to a dose of 77 Gy, covering 95% of the SIB77Gy volume as described in the trial protocol [12]. There were no limits for treatment planning time. Re-optimization for treatment planning was allowed and recommended to push the TPS to the limits.

Helical tomotherapy
Treatment planning was done with the TomoTherapy Planning Station (Accuray), version 4.2.3. The tomotherapy plans were all helical IMRT plans with field widths of 1 cm and calculation grid size set to fine.

Dose-volume histogram (DVH) analysis and statistics
All DVH analyses were done using the Eclipse TPS (HT plans were imported). DVH data were extracted to a TXT file and then read out with R (R version 3.4.1 (2017-06-30), The R Foundation for Statistical Computing) and spreadsheets. Statistical analysis was performed with IBM SPSS version 25. The DVH data of each patient were compared for every planning approach (IMRT, VMAT or HT) and between all patients by calculating the mean data for every parameter.
A comparison of the mean DVH data of all techniques (IMRT vs. HT, IMRT vs. VMAT, and HT vs. VMAT) was made by using Student's two-sided paired t-test or the Wilcoxon test, depending on the distribution of the values. Statistical significance was assumed for p ≤ 0.05.
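To make the paired comparison concrete, the following Python sketch computes an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign assignments. It is an illustration only: the analysis in this study was performed with SPSS, the function name is hypothetical, and the dose values are toy numbers, not study data. The sketch also shows a property relevant to six patients: even if all six paired differences point in the same direction, the smallest attainable exact two-sided p-value is 2/64 ≈ 0.031.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided p-value of the Wilcoxon signed-rank test.

    Pairs with zero difference are dropped; tied absolute differences
    receive average ranks. Feasible only for small samples (2^n terms).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return extreme / 2 ** n


# Toy (hypothetical) mean doses in Gy for six patients and two techniques.
technique_a = [33.0, 34.1, 32.5, 35.2, 31.9, 33.8]
technique_b = [31.0, 32.6, 31.4, 32.4, 31.6, 33.2]
print(wilcoxon_signed_rank_exact(technique_a, technique_b))
```

Statistical packages may report an asymptotic rather than an exact p-value, so results for such small samples can differ slightly from this enumeration.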

Parameters for DVH analysis
To compare different RT plans (VMAT, IMRT and HT), the following DVH-SIB (PTV) parameters were rated: SIB77Gy: Dmean, D100%, D98% (Dmin), D25%, D5% and D2% (Dmax). We defined the SIB70Gy volume as a shell without the inside-located SIB77Gy volume. For the SIB56Gy volume, the same procedure was done to create the shell without the SIB70Gy and SIB77Gy volumes. For each shell (SIB56Gy and SIB70Gy), D95% and D5% are reported (Table 4) (for raw data see Additional file 1). For assessment of plan quality, the coverage factor (CVF), healthy tissue conformity index (HTCI), conformation number (CN) and the homogeneity index (HI) were calculated and compared to each other [3,13]. These parameters were used to judge the quality of the different planning target volumes (i.e. SIB).
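The shell definition can be sketched as a set difference on voxel indices. The example below is a minimal illustration under the assumption that each contoured volume is available as a collection of voxel identifiers and that the volumes are nested (SIB77Gy inside SIB70Gy inside SIB56Gy); the function name and the one-dimensional toy indices are hypothetical, and in practice the identifiers would be three-dimensional (i, j, k) tuples.

```python
def shell(outer_voxels, *inner_volumes):
    """Return the voxels of the outer volume that lie in no inner volume."""
    result = set(outer_voxels)
    for inner in inner_volumes:
        result -= set(inner)
    return result


# Toy (hypothetical) nested volumes on a one-dimensional voxel grid.
sib56 = set(range(0, 10))   # elective volume, voxels 0..9
sib70 = set(range(2, 8))    # voxels 2..7
sib77 = set(range(4, 6))    # voxels 4..5

shell70 = shell(sib70, sib77)          # SIB70Gy without SIB77Gy
shell56 = shell(sib56, sib70, sib77)   # SIB56Gy without SIB70Gy and SIB77Gy

print(sorted(shell70))
print(sorted(shell56))
```

Evaluating D95% and D5% on the shells rather than on the full nested volumes keeps the higher boost doses from dominating the statistics of the lower dose levels.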
The formulae to calculate all these parameters are given with: 1 CVF-coverage factor [
Parameters for OARs for head and neck cancer were reported from the DVH and also compared depending on the RT technique (parallel OARs: parotid gland Dmean [16-18], larynx Dmean [19]; serial OARs: spinal cord + 5 mm Dmax, brainstem Dmax, brachial plexus D2%, mandible Dmean and D2%). In order to properly assess the OAR doses of the parotid gland, it is important to note that three patients had left-sided and two patients right-sided cancer of the oropharynx. In one case, the tumor was located near the midline with growth to the right. Accordingly, we defined the "ipsilateral" parotid gland as the gland on the side of the primary tumor and the "contralateral" gland as the one on the tumor-free side. In order to compare the parotid glands (ipsi- or contralateral to the primary tumor), the conformity index was used.
5 COIN-conformity index according to Baltas et al. [20]. V_OARref: OAR volume receiving the reference dose; V_OAR: total volume of the OAR.
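For orientation, the plan quality indices named above can be sketched in Python using definitions that are commonly found in the literature. These are assumptions, not necessarily the exact formulae used in this study: CVF as the fraction of the target covered by the reference dose, HTCI as the fraction of the reference isodose volume lying inside the target, CN as their product, COIN as CN multiplied by a sparing term for each OAR built from V_OARref and V_OAR, and HI in the ICRU Report 83 form (D2% - D98%)/D50%. Several other HI definitions exist. All function and parameter names are hypothetical.

```python
def coverage_factor(v_target_ref, v_target):
    """CVF: fraction of the target volume receiving the reference dose."""
    return v_target_ref / v_target


def healthy_tissue_conformity_index(v_target_ref, v_ref):
    """HTCI: fraction of the reference isodose volume inside the target."""
    return v_target_ref / v_ref


def conformation_number(v_target_ref, v_target, v_ref):
    """CN: product of CVF and HTCI."""
    return (coverage_factor(v_target_ref, v_target)
            * healthy_tissue_conformity_index(v_target_ref, v_ref))


def homogeneity_index(d2, d98, d50):
    """HI in the ICRU 83 form (assumed here): (D2% - D98%) / D50%."""
    return (d2 - d98) / d50


def coin(v_target_ref, v_target, v_ref, oars):
    """COIN: CN times (1 - V_OARref / V_OAR) for every OAR.

    `oars` is an iterable of (v_oar_ref, v_oar) pairs.
    """
    value = conformation_number(v_target_ref, v_target, v_ref)
    for v_oar_ref, v_oar in oars:
        value *= 1.0 - v_oar_ref / v_oar
    return value


# Toy (hypothetical) volumes in cm^3 and doses in Gy.
print(round(coverage_factor(95.0, 100.0), 3))
print(round(healthy_tissue_conformity_index(95.0, 125.0), 3))
print(round(conformation_number(95.0, 100.0, 125.0), 3))
print(round(coin(95.0, 100.0, 125.0, [(5.0, 25.0)]), 4))
print(round(homogeneity_index(80.0, 75.0, 78.0), 4))
```

With these definitions, CVF, HTCI, CN and COIN range from 0 to 1 with 1 as the ideal value, whereas an HI of 0 indicates a perfectly homogeneous dose.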
The Dmax for the spinal cord plus 5 mm was comparably safe for all RT techniques. Interestingly, HT reached the lowest Dmax (45.7 ± 8.2 Gy), but with the highest standard deviation.
For comparison of the brachial plexus, this OAR was also distinguished by ipsilateral and contralateral localization. For D2%, a small but significant difference was found only for the contralateral plexus (IMRT vs. VMAT), favoring IMRT.

Discussion
The aim of this planning study was to investigate the planning options for the first six patients enrolled in the pre-study of the ESCALOX trial. Thus, IMRT, VMAT, and HT planning algorithms and the corresponding RT plans were compared and checked for the constraints as determined by the trial protocol [12]. Different plan quality parameters were calculated from DVH for all planning target volumes (SIB77Gy, SIB70Gy, and SIB56Gy) in order to compare all RT techniques. Dose coverage, dose conformity, and homogeneity were analyzed. For SIB77Gy, SIB70Gy, and SIB56Gy, the coverage of all RT modalities was comparable. HT gained best results for SIB77Gy concerning conformity (HTCI and CN) and homogeneity defined by the HI. Concerning coverage (CVF) of HT for SIB77Gy, there was a small gap between HT and VMAT and IMRT, which seems clinically irrelevant.
The SIB70Gy volume encompassed the shell between the margin of SIB77Gy and SIB70Gy. Best coverage for SIB70Gy was achieved by IMRT. Conformity and homogeneity favored HT. The analysis of SIB56Gy, the volume for elective nodal irradiation, revealed a comparable coverage for all RT modalities. Best conformity was reached by VMAT, superior homogeneity by HT. Thus, HT is favored for conformity defined by HTCI and CN for the SIB77Gy and SIB70Gy. Best performance for homogeneity (HI) in all SIBs was achieved only by HT, followed by IMRT. Important DVH parameters describing the steepness of the dose gradient, such as D2% or D5% as well as Dmean, showed better results for HT, but no differences between IMRT and VMAT. Despite the shown advantage of HT, all RT modalities fulfilled the dose specification of the trial protocol: 50% of the PTV volumes (SIB77Gy, SIB70Gy, SIB56Gy) should receive no more than 79.3 Gy, 72.1 Gy, and 57.7 Gy, respectively.
Differding et al. compared in a RT planning study the potential of VMAT and HT in part 1 for best target coverage and in part 2 for maximal OAR sparing [5,21]. This was an FDG-PET based delineation for dose painting planning study on datasets of five patients with oropharyngeal cancer. HT and VMAT achieved comparable results for target coverage (PTV 70 Gy (SIB) and PTV 56 Gy applied in 35 fractions) and OAR sparing. Slight differences were seen for conformity of PTV 56 Gy favoring HT. The authors concluded that HT and VMAT were able to deliver similar dose distributions for FDG-PET-based dose escalation for the concept of dose painting by numbers. The good results for VMAT were at the cost of four arcs. The planning physicists extended the limits of the TPS by re-optimization of the RT plans after comparing their results to the other TPS results. For our planning scenario, both physicists did the same. In order to yield comparable results for target coverage and to minimize the OAR dose, VMAT applies the dose by using three arcs. Despite one additional arc compared to a routine head and neck RT plan, the beam on time is lower than the amount of time of application of an HT plan. This indicates that the probability of intra-fraction motion is smaller by VMAT than with HT.
Thorwarth et al. compared 9-field step and shoot IMRT with HT for inhomogeneous dose distribution by dose painting by numbers for dose escalation to hypoxic subvolumes. HT achieved a higher degree of conformity. Therefore, in this planning competition, the authors summarized both implemented techniques (IMRT and HT) as suitable for dose escalation [6]. Only small differences for target coverage were seen by Stromberger et al. for planning comparison of a SIB concept in bilateral and unilateral neck irradiation for intensity modulated proton therapy (IMPT), HT, and VMAT [3].
Concerning OARs, the comparison of DVH parameters for parotid gland, brainstem, and mandible saw HT ahead in our investigation. Despite the favored position of HT in many parameters, IMRT and VMAT ensured safe coverage of target volumes and adherence to the OAR constraints per trial protocol. The small differences seen in the head-to-head comparison will probably have no impact on clinical outcome. Other groups investigating different RT techniques and fractionation concepts for head and neck cancer likewise hypothesized no clinical impact of small dosimetric differences [5,7]. Comparing Dmean to the ipsilateral parotid gland, we achieved the lowest dose by HT (31.1 Gy), a difference of 2 Gy compared to IMRT (33.1 Gy). HT also reached the best contralateral Dmean to the parotid gland (22.6 Gy; VMAT and IMRT > 26 Gy). Jacob et al. investigated the influence of leaf and jaw parameters on plan quality for head and neck RT in nine patients by IMRT, VMAT, and HT. They concluded that all plans fulfilled OAR constraints, but HT lowered the dose to the parotid glands with the best dose homogeneity. Changes of leaf thickness and jaw width had no additional benefit for OAR sparing [22]. Inconsistent results favoring one technique over another for dose reduction to salivary glands are reported [9,23-27]. The groups of Wiezorek [27] and Holt [28] examined head and neck RT plans created by diverse IMRT techniques and various machine parameters between different institutions; the conclusion was the same as in many single center experiences: good target coverage and acceptable OAR sparing for all techniques, with slight preferences. At one point HT was superior, at another VMAT or IMRT. Both multicenter planning comparison trials delineated the PTV (respective SIB volumes) and OARs in the core unit. Thus, the uncertainty in target volume delineation was excluded and the results clearly concentrate on the physical aspect of RT planning and treatment delivery.
The aim of this study consisted in proof of principle for using different TPS for planning of real patients in preparation of a multicenter study. For the purpose of initiating a multicenter IMRT-trial, a dummy run for all participating centers with a central QA core unit would be necessary. Early on, this was recommended by Wallner et al. in 1989 [29] and renewed by McDowell in 2019 [10]. In order to reduce unnecessary dose deposition to OARs and surrounding healthy tissue, the trial protocol recommended triple re-planning [12] based on results of the work of Duma et al. [30] and others e.g. [31]. The onboard imaging units of linear accelerators increase the accuracy of patient positioning (daily CB-CT, MV-CT) and allow for adaptive planning. Consequently, the target volumes and OARs can be observed and adapted if necessary.
This analysis was done with the primary image set for initial RT planning. The re-planning scenarios as performed for each patient during the pre-study are not shown.
MRI for GTV definition is important for safety of target and OAR delineation due to better spatial resolution. Over the last years, many results were confirmed for implementation of molecular imaging and extended MRI sequences, e.g. diffusion-weighted MRI for GTV definition [32] and also better definition of OARs, for instance, swallowing structures [33]. In a dummy run, Felice et al. demonstrated the smaller GTV definition and lower inter-observer variability for MRI-based contouring compared to CT-based target volume definition [34]. Implementation of molecular imaging is a widely investigated field, not only in RT planning of head and neck cancer [35]. Some trials investigating dose escalation integrate the PET signal by employing different tracers and voxel-based RT planning [5,6,36-39]. However, the dose escalation strategy based on PET data to overcome adverse factors like hypoxia or proliferation is not clear [40]. Until now, no large randomized trial on this topic has been published. Despite the advent of biological imaging in target volume delineation for head and neck cancer treatment, aimed for instance at attacking hypoxic subvolumes by dose escalation, we dispensed with 18F-FMISO-PET at the level of the pre-study.
The presented results were generated in one center by calculating 18 different RT plans for a dose escalation study, using the two TPS on datasets of six real patients. The planning group consisted of two radiation oncologists (SP and MD) and two physicists (SK and MO) experienced in the TPS for head and neck cancer planning. Our results are comparable to those of other investigators of single or multicenter planning studies despite our homogeneous in-house team. The statistical results of small sample sizes always require careful interpretation. Detecting a significant difference requires a sufficient number of dose values or a large difference between the values of the comparators. Possible sources of bias, which contribute to different interpretations of planning comparisons by various investigators, are discussed by van Gestel et al. Drawbacks of our study are the use of only those TPS which were available at our department and the limited number of patients. However, in light of these limitations, it was possible to reach the planning objectives defined by the ESCALOX trial protocol [12].

Conclusion
The analysis of the quality parameters for RT planning revealed that the tested TPS can produce comparable plan quality for this dose escalation trial. Despite the very good results of HT planning for coverage, conformity, and homogeneity, the TPS used for VMAT and IMRT also achieved acceptable results for a double simultaneous integrated boost concept for dose escalation in head and neck cancer. The tested TPS and IMRT techniques are allowed for the ESCALOX trial.
Additional file 1. Raw data from DVH for analysis of plan comparison parameter.