Accuracy of one algorithm used to modify a planned DVH with data from actual dose delivery

Detection and accurate quantification of treatment delivery errors is important in radiation therapy. This study aims to evaluate the accuracy of DVH-based QA in quantifying delivery errors. Eighteen previously treated VMAT plans (prostate, H&N, and brain) were randomly chosen for this study. Conventional IMRT delivery QA was done with the ArcCHECK diode detector for error-free plans and plans with the following modifications: 1) induced monitor unit differences up to ±3.0%, 2) control point deletion (3, 5, and 8 control points were deleted for each arc), and 3) gantry angle shift (2° uniform shift clockwise and counterclockwise). 2D and 3D distance-to-agreement (DTA) analyses were performed for all plans with SNC Patient software and 3DVH software, respectively. Subsequently, the accuracy of the reconstructed DVH curves and DVH parameters in 3DVH software was analyzed for all selected cases using the plans in the Eclipse treatment planning system as the standard. 3D DTA analysis for error-induced plans generally gave high pass rates, whereas the 2D evaluation was more sensitive in detecting delivery errors. The average differences for DVH parameters between each pair of Eclipse recalculation and 3DVH prediction were within 2% for all three types of error-induced treatment plans. This illustrates that 3DVH accurately quantifies delivery errors in terms of actual dose delivered to the patients. 2D DTA analysis should be routinely used for clinical evaluation. Any concerns or dose discrepancies should be further analyzed through DVH-based QA for clinically relevant results and confirmation of a conventional passing-rate-based QA. PACS number(s): 87.56.Fc, 87.55.Qr, 87.55.dk, 87.55.km

of ion chambers/diodes. Similar to gamma index (GI) analysis, distance-to-agreement (DTA) analysis indicates how well the measured dose to the phantom agrees with the planned dose in the TPS. GI searches for a dose within the dose tolerance inside the distance tolerance, whereas DTA searches for the exact dose within the distance tolerance. Even though GI and DTA are good indicators of the deliverability of dynamic treatment plans, larger differences in a relatively small volume might be overshadowed in the overall passing rate, resulting in clinically unacceptable doses to target structures and OARs. (4,5) Thus, new approaches based on measurement-reconstructed dose distributions are being investigated to predict clinically relevant results. (6) DVH-based quality assurance (hereafter referred to as the "Planned Dose Perturbation method" used in 3DVH (Sun Nuclear, Melbourne, FL)) has been discussed by various groups recently. (7-12) Validations of this DVH-based QA have been performed by different researchers, showing that dose reconstructed via 3DVH is consistent with dose reconstructed through other detectors and algorithms. (6,13,14) Significant errors were discovered via DVH-based QA where the GI/DTA methods did not find the errors, indicating potential pitfalls in using recommended GI/DTA metrics and action levels. (15) Significant clinical errors were observed in DVH metrics where the GI analysis incorrectly showed high pass rates. (10) Positive results were also reported by the GI analysis method when numerous types of errors, including MLC positioning errors, wrong dynamic wedge angle, cold/hot spots of varied sizes, and collimator rotation errors, were introduced to test the sensitivity of DVH-based QA. (9,11,16) To overcome the limitations of traditional evaluation, a hybrid QA concept, involving a combination of DVH-based QA and GI analysis, has been recommended. (8,17)
In this study, three types of clinically relevant errors were introduced into deliverable treatment plans to test the sensitivity of 3DVH software. The ability of 3DVH software, first, to detect these errors and, second, to properly account for them through evaluation of the delivered DVH was studied.

A. Treatment plans and error introduction
A total of 18 clinically treated VMAT plans were randomly chosen for this study. The sample included six brain plans, six prostate plans, and six head and neck (H&N) plans. All plans were created with 6 MV photon beams utilizing the Eclipse TPS version 11.0 (Varian Medical Systems Inc., Palo Alto, CA). The analytical anisotropic algorithm (AAA) and a 0.2 cm grid size were used for plan calculation.
Three types of errors, namely monitor unit (MU) difference, control point (CP) deletion, and gantry angle shift, were investigated in this study. A typical full-arc VMAT field in Eclipse has 178 CPs, and each CP contains information for MLC shape, dose fraction, gantry speed, etc. For the MU difference error, specific MU changes (ranging from -3% to +3% in 1% steps) were applied to each arc of the error-free plan. This error was designed to test the sensitivity of the 3DVH system in detecting dose deviations due to changes in MU. For the CP deletion study, three, five, and eight control points were deleted from each arc of the error-free plans so as to simulate the potential data loss resulting from transferring plans via the network. (18) Finally, uniform gantry angle deviations were introduced to six of the plans to test the magnitude of dose fluctuation that could potentially be introduced by gantry angle variation. MUs were changed directly in the TPS for dose-difference errors. For the other two types of errors, the error-free treatment plans were exported to MATLAB (MathWorks, Inc., Natick, MA) for manipulation of the control points and gantry angles. Subsequently, the error-induced plans were imported back into the TPS for validation.
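The three error types can be illustrated with a minimal Python sketch on a toy arc. This is not the MATLAB code used in the study: the dictionary layout, the function names, the 2° control point spacing, the 400 MU value, and the starting index of the deleted block are all hypothetical and chosen only for illustration. How the meterset weights of the remaining control points were handled after deletion is not stated in the text, so the sketch does not model them.

```python
from copy import deepcopy

def make_arc(n_cp=178, mu=400.0):
    """Toy arc with n_cp control points (hypothetical layout, 2 degree spacing)."""
    cps = [{"index": i, "gantry": i * 2.0} for i in range(n_cp)]
    return {"mu": mu, "cps": cps}

def scale_mu(arc, percent):
    """Return a copy of the arc with its MU changed by the given percentage."""
    modified = deepcopy(arc)
    modified["mu"] = arc["mu"] * (1.0 + percent / 100.0)
    return modified

def delete_control_points(arc, n_delete, start):
    """Return a copy with n_delete consecutive control points removed.

    The start index is an assumption; the text does not say where
    in the arc the control points were removed.
    """
    modified = deepcopy(arc)
    del modified["cps"][start:start + n_delete]
    return modified

def shift_gantry(arc, degrees):
    """Return a copy with every gantry angle shifted uniformly (mod 360)."""
    modified = deepcopy(arc)
    for cp in modified["cps"]:
        cp["gantry"] = (cp["gantry"] + degrees) % 360.0
    return modified

error_free = make_arc()
mu_plans = [scale_mu(error_free, p) for p in (-3, -2, -1, 1, 2, 3)]
cp_plans = [delete_control_points(error_free, n, start=50) for n in (3, 5, 8)]
gantry_plans = [shift_gantry(error_free, d) for d in (+2.0, -2.0)]
```

Each helper returns a modified copy, so the error-free arc stays available as the reference for every error-induced variant.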

B. Dosimetric verification
Nine treatment plans were delivered on a Varian TrueBeam equipped with HD Millennium 120 MLC, while the rest were delivered on a Varian Trilogy coupled with Millennium 120 MLC (also from Varian). Most of the H&N cases were treated on the Trilogy linac, due to the field size limitations of the TrueBeam. ArcCHECK (Sun Nuclear) was used to collect measurements with the cavity plug inserted, as recommended by the 3DVH manual for Eclipse. Verification plans on this phantom for each patient were calculated in the Eclipse TPS from their own error-free treatment plans. An array calibration was performed at the beginning of this study, and an absolute dose calibration was performed each day prior to measurement. Four DICOM files from the TPS (radiotherapy (RT) plan, RT structure, RT dose for patient treatment, and RT dose for verification on phantom), all from error-free plans, were sent to 3DVH. Here, the original RT doses from Eclipse were considered as the standard throughout this study. Composite ArcCHECK measurements for both error-induced and error-free plans were saved and imported to 3DVH. The imported measurement file and DICOM files were used to generate a measurement-guided dose reconstruction on the corresponding patient structures in the 3DVH environment.

C. Analysis
The sensitivity investigation of 2D and 3D global DTA passing rate was confined to only two acceptance criteria (3%/3 mm and 2%/2 mm) with respect to dose-difference errors, control point deletion errors, and gantry-angle shift errors for VMAT plans. The tolerance levels for passing rate evaluation were set to 95% for 3%/3 mm and 90% for 2%/2 mm.
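The pass/fail decision described above can be written as a small Python sketch. The dictionary keys and function names are hypothetical, and treating a passing rate exactly at the tolerance level as a pass is an assumption, since the text does not say whether the boundary is inclusive.

```python
# Tolerance levels from the text: 95% for 3%/3 mm and 90% for 2%/2 mm.
TOLERANCE = {"3%/3mm": 95.0, "2%/2mm": 90.0}

def passes(criterion, pass_rate):
    """True if the pass rate meets the tolerance (inclusive bound is assumed)."""
    return pass_rate >= TOLERANCE[criterion]

def count_passing(criterion, pass_rates):
    """Number of plans whose pass rate meets the tolerance for this criterion."""
    return sum(1 for rate in pass_rates if passes(criterion, rate))

# Illustrative pass rates only, not data from the study.
example_rates = [99.1, 94.8, 96.0, 88.5]
n_pass_3mm = count_passing("3%/3mm", example_rates)
n_pass_2mm = count_passing("2%/2mm", example_rates)
```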

C.1 2D DTA analysis
Two-dimensional global DTA passing rates were obtained by comparing the measured planar ArcCHECK dose files with the Eclipse verification dose files using SNC Patient software (Sun Nuclear). The measurement uncertainty option in the SNC Patient software was not used for the analysis.

C.2 3D DTA analysis
A concise introduction to the dose reconstruction process in 3DVH employing ArcCHECK, specifically the ArcCHECK planned dose perturbation (ACPDP), is presented here. A detailed version is presented in the article by Nelms et al. (19) The ArcCHECK measurement is synchronized to each control point via recorded gantry angle information collected by a virtual inclinometer at 50 ms intervals. A set of time-resolved subbeams is established based on the ArcCHECK measurement according to their time intervals. Utilizing a predefined planned dose perturbation (PDP) model configuration (which contains a set of parameters specific to a single combination of linac model, MLC model, and beam energy), the 3D relative dose grid is calculated for each subbeam. (20) Subsequently, this 3D relative dose grid for each subbeam is morphed with the ArcCHECK cylindrical phantom via scaling factors determined by relevant entry and exit absolute doses. The summation of all the absolute subbeam dose grids, scaled by a global correction factor, gives the cumulative measurement-guided dose grid on the phantom. Finally, a correction matrix is calculated as the ratio of the reconstructed phantom dose to the planned phantom dose for every voxel. The voxel correction matrix is applied to the planned patient dose, resulting in the perturbed 3D patient dose. Further analysis is then carried out comparing this perturbed dose to the original patient dose from Eclipse. For 3D DTA analysis, all the data were normalized to maximum dose. A low-dose threshold of 10% was applied; thus, dose values below 10% of the maximum value were excluded from the analysis.
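The last two steps (voxel-wise correction and low-dose thresholding) can be sketched in Python as follows. This is only an illustration of the ratio-and-multiply idea, not the ACPDP implementation in 3DVH: it assumes the phantom and patient dose grids have already been brought into voxel-to-voxel correspondence and flattened to lists, and it assumes a ratio of 1 wherever the planned phantom dose is zero. All function names are hypothetical.

```python
def correction_matrix(reconstructed_phantom, planned_phantom):
    """Voxel-wise ratio of reconstructed to planned phantom dose.

    Voxels with zero planned dose get a ratio of 1.0 (an assumption,
    used here only to avoid division by zero).
    """
    ratios = []
    for recon, plan in zip(reconstructed_phantom, planned_phantom):
        ratios.append(recon / plan if plan > 0 else 1.0)
    return ratios

def perturb_patient_dose(planned_patient, ratios):
    """Apply the correction matrix to the planned patient dose."""
    return [dose * ratio for dose, ratio in zip(planned_patient, ratios)]

def above_threshold(dose, fraction=0.10):
    """Mask of voxels kept for analysis: dose at or above 10% of the maximum."""
    cutoff = fraction * max(dose)
    return [d >= cutoff for d in dose]

# Illustrative numbers only (Gy), not data from the study.
planned_phantom = [2.00, 1.00, 0.00]
reconstructed_phantom = [2.06, 0.97, 0.00]
planned_patient = [60.0, 30.0, 1.0]

ratios = correction_matrix(reconstructed_phantom, planned_phantom)
perturbed_patient = perturb_patient_dose(planned_patient, ratios)
mask = above_threshold(planned_patient)
```

In this toy case a 3% excess in the first phantom voxel carries over as a 3% excess in the corresponding patient voxel, which is the behavior the correction matrix is meant to capture.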

C.3 DVH-based analysis
Since the passing rate alone does not yield clinically relevant data, reconstructed DVH curves and DVH parameters in 3DVH were generated by applying the dose distribution to patient structures and compared to the corresponding parameters calculated directly from the error-free Eclipse treatment plans. The following parameters (based on treatment anatomy) were evaluated for the DVH-based study: Dmean and D95 for PTV; D50 for bladder, rectum, left parotid, and right parotid; Dmax for spinal cord and optic chiasm. Percentage differences for all these DVH parameters between 3DVH and Eclipse were determined as (3DVH - Eclipse)/Eclipse × 100%.
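As a rough illustration of these quantities, the Python sketch below computes Dmean, Dmax, and a Dx value from a list of voxel doses, and then the percentage difference defined above. It assumes equal-volume voxels and takes Dx as the minimum dose received by the hottest x% of voxels; the function names are hypothetical, and neither 3DVH nor Eclipse is claimed to compute DVH parameters this way.

```python
import math

def d_mean(doses):
    """Mean dose over all voxels of a structure."""
    return sum(doses) / len(doses)

def d_max(doses):
    """Maximum voxel dose in a structure."""
    return max(doses)

def dose_at_volume(doses, percent):
    """Dx: minimum dose received by the hottest `percent` of the volume.

    Assumes equal-volume voxels, so a volume fraction is a voxel count.
    """
    ordered = sorted(doses, reverse=True)
    n_voxels = math.ceil(percent * len(ordered) / 100)
    return ordered[max(n_voxels, 1) - 1]

def percent_difference(value_3dvh, value_eclipse):
    """(3DVH - Eclipse) / Eclipse * 100, as defined in the text."""
    return (value_3dvh - value_eclipse) / value_eclipse * 100.0

# Illustrative structure with 100 voxels receiving 1..100 Gy.
voxel_doses = [float(d) for d in range(1, 101)]
d95 = dose_at_volume(voxel_doses, 95)
d50 = dose_at_volume(voxel_doses, 50)
```

With this definition, a positive percentage difference means 3DVH predicts a higher value than the Eclipse reference.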

III. RESULTS
A. Evaluation of dose differences: DTA versus dose-volume parameter
All 18 error-free plans had a composite passing rate of more than 95% for both 2D and 3D DTA evaluation using the 3%/3 mm criterion. For the 2D analysis using the 2%/2 mm criterion, four cases yielded passing rates less than 90%, whereas all cases passed the 90% passing rate tolerance for 3D DTA analysis using the 2%/2 mm criterion.
The detailed results of DTA analysis for error-induced and error-free plans are presented in Table 1, while the number of plans passing the preset criteria (error detection rate) is shown in Table 2. For the 2D 3%/3 mm analysis, over half of the -3% and +3% MU modified plans yielded passing rates less than 95%, due to the fact that 3% is at the tolerance boundary. Also, the 2%/2 mm analysis detected most of the -3%/+3% MU changes. The 3D global passing rates were generally higher than the 2D rates, but they decreased dramatically when the magnitude of MU errors reached +3% (two and eight cases passed 2D 2%/2 mm and 3%/3 mm, respectively, while zero and two cases passed the criteria for 3D DTA 2%/2 mm and 3%/3 mm). For -3% MU changes, the number of cases that fell into the action level was about the same. Percentage differences of selected dose-volume parameters between 3DVH perturbation and Eclipse are listed in Table 3. Deviations between error-induced DVH parameters and error-free DVH parameters for PTV and chosen OARs agreed with the magnitude of the induced error in general, but slightly larger variation existed for some OARs, such as optic chiasm and spinal cord. Figure 1(a) shows an example of corresponding DVH changes for an H&N case. The bold lines (DVHs from Eclipse calculation) were aligned with the corresponding thin lines (DVHs from 3DVH estimation).
Table 1 also shows the passing rates for the control point deletion plans studied. Most of the error-induced plans failed the 2D DTA analysis for both the 2%/2 mm and the 3%/3 mm criteria. As the number of deleted control points increased, the number of plans failing the criteria also increased for both 2%/2 mm and 3%/3 mm. The 2%/2 mm criterion is more sensitive than the 3%/3 mm criterion in detecting these errors, as seen from Table 1.

B. DTA passing rate vs. dose-volume parameter in detecting deleted control points
The 3D analysis did not show significant differences in pass rates when measurements of CP-deletion-induced plans were compared with error-free plans from the Eclipse TPS. The average pass rates for the three-CPs-missing, five-CPs-missing, and eight-CPs-missing cases were above the tolerance levels (90% for 2%/2 mm and 95% for 3%/3 mm), as seen from Table 1. The number of plans passing the 3D DTA analysis under both criteria (2%/2 mm and 3%/3 mm) was also high for all CP deletion cases, as illustrated in Table 2.
The percentage differences in dose-volume parameters between the CP-error-induced plans and error-free Eclipse plans are listed in Table 4. Most of the DVH parameters did not show significant differences between the two plans even for eight-CP-deletion plans, indicating that DVH analysis for CP deletion may not detect the errors. Differences for optic chiasm showed a slight overdose compared to other structures, while the left parotid was relatively underdosed. All the other percentage differences were close to zero. Most of the 3DVH predictions agreed with the same error-induced plan recalculated in the Eclipse TPS. An illustration of these changes can be seen in Fig. 1(b).

C. Sensitivity to gantry angle: comparison of DTA passing rates and dose-volume parameters
The results for the 2° gantry angle shift in the clockwise (indicated by "+2°") and counterclockwise (indicated by "-2°") directions are shown in Table 1. A gantry angle shift of 2° roughly corresponds to a 3 mm spatial shift at the detector level of the ArcCHECK phantom. The 3D DTA analysis did not detect any errors, as indicated by the high passing rates for all plans shown in Table 2. In contrast, the 2D DTA analysis did show low pass rates for most of the cases. Table 2 also suggests that clockwise and counterclockwise shifts produce similar results in terms of the passing rates.
Table 4. Average percentage difference and standard deviation of selected dose-volume parameters for control point deletion errors with minimum and maximum value in the bracket. PTV refers to four prostate cases and four brain cases, while selected OARs for corresponding cases are: 1) rectum and bladder for prostate cases, and 2) optic chiasm and brainstem for brain cases.
Table 5 illustrates the changes in DVH parameters due to gantry-angle shift errors for both target and critical structures. Minor differences between error-free plans and gantry-angle-shifted plans were found, mostly less than 2% (except the optic chiasm in one case). This suggests that DVH-based analysis provides results similar to the 3D DTA analysis in Table 1. Recalculation in Eclipse of the same error-induced plans agreed with 3DVH prediction, as shown in Fig. 1(c).

IV. DISCUSSION
The 3D analysis yielded higher pass rates than the 2D analysis for the error-induced plans, as shown in Table 1. For example, the 2° uniform gantry angle shift represents a 3 mm shift at the level of the ArcCHECK detectors. The 2D analysis was able to detect these errors, since any spatial shift in the 2D analysis plane results in incorrect dose comparisons between the measured and predicted dose matrices. For 3D analysis, the addition of the third dimension provides more points to search within the evaluation criteria, and the configured low-dose threshold leaves a fairly large number of points in the comparison. Thus, an error within a relatively small area/volume will be neglected, resulting in higher passing rates.
As seen from Table 2, none of the eight-CP-deleted plans passed the 2D DTA 2%/2 mm criteria, whereas 75% of the eight-CP-deleted plans passed the 3D DTA 2%/2 mm criteria. Even though the beam weight of eight CPs could contribute up to 4% of the total dose per field, the detected dose might also be affected by the MLC shape, which made it hard to estimate the expected deviation from error-free plan. The cold spot formed by the consecutive deleted CPs may be detected by the 2D DTA analysis, but might be hidden within the entire volume comparison in the 3D DTA analysis. Therefore, as shown in Table 2, 2D DTA analysis demonstrated a superior error-detectability for all three error types studied.
Minor differences between reconstructed DVH parameters from 3DVH and TPS DVH parameters were observed, as seen in Tables 3, 4, and 5. These results indicate that 3DVH can accurately predict errors in treatment delivery. However, there are small differences in the slopes of the 3DVH curves and the TPS DVH curves, as evident from Fig. 1(c). These slope differences can be a consequence of the uniform built-in machine configuration in the model library (not our linac model) in 3DVH, such as machine type, photon energy, and MLC model. Also, alignment of ArcCHECK before measurements might have a slight effect on the slope. Moreover, the 3DVH algorithms might also contribute to the DVH differences. Larger discrepancies were seen in some OARs with small volume, or in volumes where the doses were reconstructed using the measurements from the end of the ArcCHECK detectors. In brain cases, as presented in Fig. 1(b), the perturbed DVH curves of optic chiasm, which had an extremely small volume, were relatively coarse due to the limited number of voxels within the structure. Unlike the 2D analysis, the 3DVH and Eclipse recalculation did not show significant differences in the DVH parameters when comparing the error-induced plans to the error-free plans for both the CP deletion and gantry-angle shift plans. This is evident from Tables 4 and 5. These results imply that some of the errors do not significantly affect the dose distribution within the patient. However, a good QA system must detect errors in treatment planning and delivery regardless of how significantly the resulting error affects the dose delivered to the patient.
Table 5. DVH parameter comparison summary for gantry angle shifts. Average percentage difference and standard deviation of selected dose-volume parameters calculated by 3DVH software and Eclipse TPS for gantry-angle shift errors with minimum and maximum value in the bracket. PTV refers to three prostate cases and three brain cases, while OARs for the corresponding cases are: 1) rectum and bladder for prostate cases, and 2) optic chiasm and brainstem for brain cases.

V. CONCLUSION
Detection of delivery errors is crucial in radiation therapy. The 2D DTA analysis is more sensitive than the 3D DTA analysis in detecting errors in plan delivery. DVHs from 3DVH were found to correspond to DVHs of error-induced plans in Eclipse. Thus, any dose discrepancy or uncertainty should be further analyzed through DVH-based QA to evaluate clinically relevant results.