Dosimetric and radiobiological comparison for quality assurance of IMRT and VMAT plans

Abstract

Introduction: The gamma analysis used for quality assurance of a complex radiotherapy plan examines the dosimetric equivalence between planned and measured dose distributions within some tolerance. This study explores whether the dosimetric difference is correlated with any radiobiological difference between delivered and planned dose.

Methods: VMAT or IMRT plans optimized for 14 cancer patients were calculated and delivered to a QA device. Measured dose was compared against planned dose using 2-D gamma analysis. Dose volume histograms (for various patient structures) obtained by interpolating measured data were compared against the planned ones using a 3-D gamma analysis. Dose volume histograms were used in the Poisson model to calculate tumor control probability for the treatment targets and in the Sigmoid dose–response model to calculate normal tissue complication probability for the organs at risk.

Results: Differences in measured and planned dosimetric data for the patient plans passing at ≥94.9% rate at 3%/3 mm criteria are not statistically significant. Average ± standard deviation tumor control probabilities based on measured and planned data are 65.8±4.0% and 67.8±4.1% for head and neck, and 71.9±2.7% and 73.3±3.1% for lung plans, respectively. The differences in tumor control probabilities obtained from measured and planned dose are statistically insignificant. However, the differences in normal tissue complication probabilities for larynx, lungs-GTV, heart, and cord are statistically significant for the patient plans meeting the ≥94.9% passing criterion at 3%/3 mm.

Conclusion: A ≥90% gamma passing criterion at 3%/3 mm cannot assure the radiobiological equivalence between planned and delivered dose. These results agree with the published literature demonstrating the inadequacy of the criterion for dosimetric QA and suggest a tighter tolerance.


1 | INTRODUCTION
Treatment plans used in radiation therapy are generally evaluated on the basis of dose distribution and dose volume parameters. Accuracy of delivery of complex treatment plans is assured through a quality assurance (QA) process using ion chambers, films, and, more commonly, diode array 1,2 and ion chamber array 3 measurements. Measured data are generally compared against planned data using two-dimensional (2-D) gamma analysis. In reality, the treatment target and organs at risk (OARs) have 3-D geometry, and a 2-D gamma analysis-based dosimetric comparison may not provide information about the criticality of a disagreement. Several studies 4-7 have shown that 2-D gamma analysis fails to detect errors in some cases. Even though detailed analysis is still under investigation, 8 dose volume histogram (DVH)-based dosimetric evaluation can provide structure-by-structure information, and a 3-D gamma analysis can be a better option for QA purposes. The validity of DVH-based evaluation of delivered intensity-modulated radiation therapy (IMRT) plans against the corresponding plans optimized with a treatment planning system (TPS) has been demonstrated using film, ion chamber, 9 and BANG3 gel dosimetry. 10 A number of studies show that a DVH-based 3-D gamma analysis provides a more reliable comparison than a point-by-point per-beam 2-D gamma analysis of IMRT plans. 8,11,12 Most of the DVH-based studies were based on measurement and/or interpolation of 2-D device-measured data into a 3-D dose distribution. 7,10,11 The DVH comparison studies have shown differences between planned and measured DVHs as well as differences in mean doses. 10 [...] are and 4-6 times more sensitive to the change in uniform dose. 14 However, none of the above-mentioned studies evaluated whether radiobiological differences existed in any of the cases.
A growing recognition of the limitation of dose volume parameters in correlating with biological response has prompted the use of radiobiological models for treatment planning, 15 but QA of all plans is still performed on the basis of dosimetric comparison alone. Very recently, a radiobiological comparison between delivered and planned IMRT treatment plans was attempted using dose measured with a 2-D QA device (MapCheck, Sun Nuclear). 16 Here, we use a cylindrical QA device, ArcCheck, for the measurement and compare the measured data against TPS-calculated (planned) data using 2-D gamma analysis in the SNC Patient™ software.
The application of ArcCheck for patient-specific dosimetric QA 12 and DVH-based plan verification using Sun Nuclear's 3DVH software have been experimentally verified elsewhere. 17

2.B | Dosimetric comparison
Dosimetric comparison between planned and measured data comprised mean and maximum dose to the treatment target and a few OARs. For H&N plans, mean dose to the GTV, esophagus, larynx, and parotids, and maximum dose to the brainstem and cord were compared, while for lung plans, mean dose to the GTV, heart, esophagus, and normal lung, and maximum dose to the cord were compared.
The measured data were compared against the planned data using 2-D gamma criteria of 3 mm distance to agreement (DTA) and 3% dose difference (DD), 1 with a minimum pass rate of 90%, in the SNC Patient software. A 10% dose threshold and global normalization were used.
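The gamma comparison described above can be illustrated with a minimal sketch. It is a simplified 1-D, brute-force version of a global gamma analysis and is not the algorithm implemented in SNC Patient. The function name `gamma_pass_rate`, its arguments, and the choice to apply the 10% threshold to the measured points relative to the maximum planned dose are assumptions made for illustration only.

```python
import math


def gamma_pass_rate(planned, measured, spacing_mm,
                    dd_percent=3.0, dta_mm=3.0, threshold_percent=10.0):
    """Percentage of measured points with gamma <= 1 (1-D, global normalization).

    planned, measured: dose profiles sampled on the same grid (hypothetical inputs).
    spacing_mm: distance between neighbouring grid points.
    """
    max_planned = max(planned)
    dd_abs = dd_percent / 100.0 * max_planned        # global dose-difference criterion
    cutoff = threshold_percent / 100.0 * max_planned  # low-dose threshold (assumed convention)

    gammas = []
    for i, d_meas in enumerate(measured):
        if d_meas < cutoff:
            continue  # points below the dose threshold are not evaluated
        best = math.inf
        # Search the whole planned profile for the smallest combined
        # distance/dose discrepancy (brute force, no interpolation).
        for j, d_plan in enumerate(planned):
            dist = (i - j) * spacing_mm
            value = math.sqrt((dist / dta_mm) ** 2 + ((d_meas - d_plan) / dd_abs) ** 2)
            best = min(best, value)
        gammas.append(best)

    if not gammas:
        raise ValueError("no points above the dose threshold")
    passed = sum(1 for g in gammas if g <= 1.0)
    return 100.0 * passed / len(gammas)
```

In this sketch a plan would meet the criterion used in this section when `gamma_pass_rate(planned, measured, spacing_mm) >= 90.0`. A clinical implementation works on 2-D or 3-D grids and usually interpolates between grid points.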
The 3DVH software requires two sets of data for comparison.
TPS-optimized patient plan, dose distribution, contoured structures and planning CT images of each patient, corresponding TPS-calculated (planned) dose on ArcCheck CT images, and measured data were imported to 3DVH software version 3.

2.C | Radiobiological comparison
DVHs based on measurement and TPS were used in biological models to calculate TCPs and NTCPs using MATLAB. A conventional fractionation scheme (1.8-2 Gy per fraction) was used for the calculations.

2.D | TCPs with Poisson model
The clinical target volume (CTV) is the tumor volume intended to be treated. In our study, the GTV included gross tumor and subclinical microscopic disease, and the CTV was labeled as GTV. Hence, it is appropriate to calculate TCP for the GTV even though dose is prescribed to the planning target volume (PTV) to incorporate setup error. In this study, TCPs for H&N and lung GTVs were calculated using the Poisson model, 18-20 which is expressed in eq. (1).
Here, D50 is the dose yielding 50% probability of tumor control and γ50 is the slope of the dose-response curve at the level of 50% TCP.
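As an illustration of how D50 and γ50 enter a DVH-based TCP calculation, the following is a minimal sketch that assumes one commonly used parameterization of the Poisson dose-response curve, P(D) = exp(-exp(e·γ50 - (D/D50)(e·γ50 - ln ln 2))), combined over differential DVH bins as the product of P(D_i) raised to the fractional volume v_i. The exact form of eq. (1) may differ from this one. The function names and the parameter values in the usage note are hypothetical and are not the values used in this study.

```python
import math


def poisson_response(dose_gy, d50_gy, gamma50):
    """Poisson dose-response for a uniformly irradiated tumor (assumed parameterization)."""
    e_gamma = math.e * gamma50
    exponent = e_gamma - (dose_gy / d50_gy) * (e_gamma - math.log(math.log(2.0)))
    return math.exp(-math.exp(exponent))


def tcp_from_dvh(dvh_bins, d50_gy, gamma50):
    """TCP from a differential DVH given as (dose in Gy, volume) pairs.

    Volumes are normalized to fractions, so absolute or relative volumes both work.
    """
    total_volume = sum(volume for _, volume in dvh_bins)
    tcp = 1.0
    for dose, volume in dvh_bins:
        # Each bin contributes its control probability weighted by its volume fraction.
        tcp *= poisson_response(dose, d50_gy, gamma50) ** (volume / total_volume)
    return tcp
```

For example, `tcp_from_dvh([(66.0, 0.2), (70.0, 0.8)], d50_gy=60.0, gamma50=2.0)` evaluates the TCP for a hypothetical two-bin DVH. By construction, a uniform dose equal to D50 gives a TCP of 0.5.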
Here, Φ is the probit function, that is, the cumulative distribution function of the standard normal distribution, Φ(x) = (1/√(2π)) ∫ exp(-t²/2) dt integrated from -∞ to x. Its argument is x = (EUD - D50)/(m·D50), where D50 is the dose yielding 50% NTCP, obtained from the dose-response curve, and EUD is the equivalent uniform dose, defined as the dose which, distributed uniformly over a structure, would produce the same effect as the dose specified by the DVH. The parameter m represents the slope of the dose-response curve. EUD is taken as the generalized equivalent uniform dose (gEUD), calculated from the series of dose-volume pairs (D_i, v_i) obtained from the DVH of a structure using the formula expressed in eq. (4).
Here, n is a parameter that determines the dose volume dependence of a given OAR.
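The NTCP calculation described above can be sketched as follows. The sketch assumes the usual Lyman-Kutcher-Burman form of the sigmoid model, in which gEUD = (Σ v_i · D_i^(1/n))^n over the fractional volumes v_i and NTCP = Φ((gEUD - D50)/(m·D50)). The function names are hypothetical, and the parameter values in the usage note are placeholders rather than the values used in this study.

```python
import math


def geud(dvh_bins, n):
    """Generalized equivalent uniform dose from (dose in Gy, volume) pairs."""
    total_volume = sum(volume for _, volume in dvh_bins)
    weighted = sum((volume / total_volume) * dose ** (1.0 / n)
                   for dose, volume in dvh_bins)
    return weighted ** n


def probit(x):
    """Standard normal cumulative distribution function, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ntcp_from_dvh(dvh_bins, d50_gy, m, n):
    """NTCP for one organ at risk from its differential DVH."""
    x = (geud(dvh_bins, n) - d50_gy) / (m * d50_gy)
    return probit(x)
```

With n = 1 the gEUD reduces to the mean dose, which suits parallel organs such as lung, while small n pushes the gEUD toward the maximum dose, which suits serial organs such as the cord. A call such as `ntcp_from_dvh(bins, d50_gy=66.5, m=0.175, n=0.05)` uses placeholder parameters only.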

2.F | Statistical analysis
The Shapiro-Wilk test was used to test the normality of the data.

A DVH comparison based on planned and measured data for one patient plan (patient 2, Figure 2) is presented in Figure 3. As evident from Tables 4 and 5, the differences between the two sets of TCPs are statistically insignificant for H&N as well as lung patient plans.

3.B.2 | NTCP comparison
The EUDs to the majority of the OARs calculated from the planned DVHs and from the measured data were close to each other, and the differences were insignificant for the majority of H&N as well as lung patient OARs. However, the differences were significant in a few cases. There was a significant difference in NTCPs for larynx in H&N patients and for lungs-GTV, cord, and heart in lung patients.

NTCPs from planned [...] and 99.5% with median pass rate of 96.6% for lung plans. However, for a 2-D gamma analysis based on the 2%/2 mm criterion, the pass rate for H&N patients ranged between 85.3% and 99.6% with a median pass rate of 98.3%. For lung patient plans, it ranged between 83.3% and 97.1% with a median pass rate of 90.2%. Only five H&N patient plans and three lung patient plans met the passing criterion of ≥90% at 2%/2 mm. The statistical test on the plans passing by ≥90% at the 2%/2 mm criterion did not show any radiobiological difference for any of the structures studied.

4 | DISCUSSION
Our study showed small dosimetric differences between 2-D and 3-D gamma analysis results, which are in line with the results obtained by Infusino et al. 17 using ArcCheck for the measurement.
However, our radiobiological comparisons do not agree with the results from Sumida et al., 16 where the TCPs based on measured data were found to be significantly smaller and the NTCPs significantly higher than the ones based on planned data. The disagreement could arise from differences in device type and geometry, in measurement and analysis techniques, and in the radiobiological models used to calculate TCPs and NTCPs. While Sumida et al. used per-beam analysis of MapCheck-measured data, we used cumulative dose analysis of ArcCheck-measured data. [...] 12 and outlined by AAPM Medical Physics Practice Guideline 5.a. 27 Our future study will focus on finding the tolerance criteria for radiobiological quality assurance.
Another point worth mentioning is that neither the IMRT nor the VMAT modality was favored in terms of dosimetric or biological pass rate. Although this finding is clear in our dataset, generalizing it is beyond the scope of this study. Investigating this topic would require a study of a different design, in which other elements of the plans, such as the level of beam modulation, would be thoroughly studied.

5 | CONCLUSIONS
Differences between 2-D and 3-D gamma analysis results for H&N and lung patients are small and statistically insignificant. The differences between TCPs obtained from the planned and measured data are also small and insignificant. However, the differences in NTCPs based on planned and measured data for a few of the structures studied are statistically significant even though the dosimetric agreements are ≥94.9% at 3%/3 mm DTA. Our study based on 14 patients suggests that ≤94.9% pass rate at 3%/3 mm DTA used for 2-D or 3-D gamma analysis cannot assure the radiobiological equivalence between a delivered and the corresponding planned dose. Hence, radiobiological analysis in addition to dosimetric comparison may have to be considered for the QA of complex radiotherapy plans.

CONFLICT OF INTEREST
The authors declare no conflict of interest.
TABLE 7 NTCPs from planned and measured data for lung patient OARs and P-values of statistical test.