A dosimetric evaluation of knowledge‐based VMAT planning with simultaneous integrated boosting for rectal cancer patients

RapidPlan, a commercial knowledge‐based optimizer, has been tested on head and neck, lung, esophageal, breast, liver, and prostate cancer patients. To appraise its performance on VMAT planning with simultaneous integrated boosting (SIB) for rectal cancer, this study configured a DVH (dose‐volume histogram) estimation model consisting of 80 best‐effort manual cases of this type. Using the model‐generated objectives, the MLC (multileaf collimator) sequences of another 70 clinically approved plans were reoptimized, while the remaining parameters, such as field geometry and photon energy, were maintained. Dosimetric outcomes were assessed by comparing the homogeneity index (HI), conformity index (CI), hot spots (volumes receiving over 107% of the prescribed dose, V107%), the mean dose and the dose to 50% of the volume of the femoral head (Dmean_FH and D50%_FH) and urinary bladder (Dmean_UB and D50%_UB), and the mean DVH plots. The paired samples t‐test or Wilcoxon signed‐rank test suggested that comparable CI values were achieved by RapidPlan (0.99 ± 0.04 for PTVboost and 1.03 ± 0.02 for PTV) and the original plans (1.00 ± 0.05 for PTVboost and 1.03 ± 0.02 for PTV) (p > 0.05). Slightly improved HI values of PTVboost and the planning target volume (PTV) were observed in the RapidPlan cases (0.05 ± 0.01 for PTVboost and 0.26 ± 0.01 for PTV) compared with the original plans (0.06 ± 0.01 for PTVboost and 0.26 ± 0.01 for PTV), p < 0.05. More cases with positive V107% were found in the original group (18 plans) than in the RapidPlan group (none). RapidPlan significantly reduced the D50%_FH (by 1.53 Gy/9.86%, from 15.52 ± 2.17 to 13.99 ± 1.16 Gy), Dmean_FH (by 1.29 Gy/7.78%, from 16.59 ± 2.07 to 15.30 ± 0.70 Gy), D50%_UB (by 4.93 Gy/17.50%, from 28.17 ± 3.07 to 23.24 ± 2.13 Gy), and Dmean_UB (by 3.94 Gy/13.43%, from 29.34 ± 2.34 to 25.40 ± 1.36 Gy). The more concentrated distribution of RapidPlan data points indicated an enhanced consistency of plan quality. PACS number(s): 87.55.de; 87.55.dk

RapidPlan (Varian Medical Systems, Palo Alto, CA) uses a pool of selected plans of consistently high quality as historical knowledge to train a DVH estimation model, which predicts achievable DVH ranges and acceptable trade-offs during semiautomatic plan optimization for a prospective patient.
Relative to conventional experience-based planning, superior or comparable results of RapidPlan have been reported in preliminary applications to head and neck, lung, esophageal, breast, hepatocellular, and prostate cancer patients. (12)(13)(14)(15)(16)(17) However, these studies agree that both model training and plan evaluation should be investigated further in larger populations and other cancer types to gain more experience and confidence before the technique is applied extensively in clinics worldwide.
This study retrospectively selected 150 preoperative simultaneous integrated boosting (SIB) VMAT plans that had been clinically approved and delivered for rectal cancer patients: 80 of them were manually refined and used to train the DVH estimation model, which was subsequently used to reoptimize the MLC sequences of the remaining 70 cases. A dosimetric comparison against the manually optimized clinical plans was conducted to evaluate the performance of RapidPlan in the semiautomated optimization of rectal VMAT plans with SIB.

A. Plan selection
In accordance with the scope and clinical goals of this research, 150 manually optimized and consecutively treated plans were retrospectively selected. The gross target volume (GTV) was defined as the primary tumor, the mesorectal space, and the involved lymph nodes. The clinical target volume (CTV) was defined as the GTV, presacral region, mesorectal/lateral lymph nodes, internal iliac lymph node chain, and pelvic wall area. (18) The CTV also covered the external iliac lymph nodes when anterior organ involvement was suspected, and covered the inguinal lymph nodes when the lower third of the vagina was invaded or major tumor extension into the internal and external anal sphincter was observed. (19) The PTVboost and planning target volume (PTV) were created by adding an isotropic margin of 5 mm to the GTV and CTV, respectively. Total doses of 50.6 Gy and 41.8 Gy in 22 fractions were prescribed simultaneously to 95% of the PTVboost and PTV, respectively.
Other planning goals included: a steep dose falloff from 50.6 Gy to 41.8 Gy within 5 mm outside the PTVboost border into the PTV (depending on the relative geometry of PTVboost and PTV); near-maximum dose D2% < 107% of 50.6 Gy (i.e., D2% < 54.14 Gy); dose to 50% of the femoral head (D50%_FH) and urinary bladder (D50%_UB) volumes < 20.0 Gy and < 30.0 Gy, respectively; and minimization of the mean dose to the femoral head (Dmean_FH) and urinary bladder (Dmean_UB). All plans were created using 10 MV photons, one full arc, a collimator angle of ±10°, and Millennium 120 MLCs on Varian Trilogy accelerators.
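The numeric planning goals listed above can be expressed as a simple check. The following is a minimal sketch, not the authors' workflow: the function name `check_planning_goals` and its dictionary keys are hypothetical, and only the thresholds (107% of 50.6 Gy, 20.0 Gy, and 30.0 Gy) are taken from the text.

```python
# Thresholds quoted in the planning goals; everything else is illustrative.
PRESCRIPTION_BOOST_GY = 50.6
HOT_SPOT_FRACTION = 1.07

D2_LIMIT_GY = HOT_SPOT_FRACTION * PRESCRIPTION_BOOST_GY  # about 54.14 Gy
D50_FH_LIMIT_GY = 20.0  # femoral head
D50_UB_LIMIT_GY = 30.0  # urinary bladder


def check_planning_goals(d2_boost_gy, d50_fh_gy, d50_ub_gy):
    """Return a dict mapping each goal to True (met) or False (violated).

    Hypothetical helper: inputs are plan-level dose metrics in Gy.
    """
    return {
        "D2%_PTVboost": d2_boost_gy < D2_LIMIT_GY,
        "D50%_FH": d50_fh_gy < D50_FH_LIMIT_GY,
        "D50%_UB": d50_ub_gy < D50_UB_LIMIT_GY,
    }


if __name__ == "__main__":
    print(round(D2_LIMIT_GY, 2))
    # 53.0 Gy is an invented D2% value; 15.52 and 28.17 Gy are the mean
    # D50% values reported for the original plans.
    print(check_planning_goals(53.0, 15.52, 28.17))
```

The goal of minimizing the mean OAR dose has no fixed threshold in the text, so it is deliberately left out of the check.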

B. Model configuration and knowledge-based treatment planning
Based on the Varian RapidPlan engine (V13.5), the anatomic structures, field geometries, dose matrices, and plan prescriptions of the 80 previously selected plans were extracted as historical knowledge to train a DVH estimation model. (20) The PTVboost volumes ranged from 54.27 to 622.68 cm³ (mean ± SD = 179.46 ± 93.60 cm³), and the PTV volumes ranged from 566.03 to 1688.26 cm³ (mean ± SD = 1209.25 ± 181.82 cm³). Potential outliers flagged by the statistical verification were examined and processed one by one, yet the diversity of OAR (organ at risk) geometries in the model was kept to accommodate the variety of new patients. (16,21) The confirmed outliers were removed, rematched, recontoured, or replanned by senior physicists to ensure that only "good knowledge" was incorporated into the model and passed on to prospective plans. (20) According to the manufacturer, the geometry-based expected dose (GED) algorithm of RapidPlan divides the OARs into four subvolumes: the out-of-field region (scattered dose only), the leaf-transmission region (MU-dependent), the in-field region (modulated the most), and the target-overlap region (comparable to the target dose). (20) The model is therefore not intended for target dose estimation but works primarily on the in-field regions; hence, the dose-volume constraints for the targets were manually embedded in the model as fixed objectives, which were applied universally to all RapidPlan-generated plans.
The remaining 70 plans were duplicated to test the performance of the RapidPlan model. The PTVboost volumes ranged from 76.93 to 342.48 cm³ (mean ± SD = 177.23 ± 75.02 cm³), and the PTV volumes ranged from 925.41 to 1941.6 cm³ (mean ± SD = 1243.24 ± 199.82 cm³). Using the objectives generated by the model, the original MLC sequences were reoptimized, while the other parameters, such as the field geometry and photon energy, were maintained. To evaluate the OAR exposure on the basis of adequate and similar target dose coverage, both the RapidPlan and original plans were normalized so that 95% of both PTVboost and PTV were covered by their corresponding prescription doses. (Because normalization can be performed on only one target in Eclipse, it was based on the more underdosed target; hence, the other target may be slightly overdosed afterwards.)
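The normalization rule described above (scale the plan on whichever target is more underdosed) can be sketched as follows. This is an illustration under stated assumptions, not the Eclipse implementation: the function `normalization_factor` is hypothetical, the D95% inputs are invented, and the sketch assumes that plan normalization rescales every dose value by one global factor.

```python
def normalization_factor(d95_by_target, prescription_by_target):
    """Pick the more underdosed target and return (target name, scale factor).

    d95_by_target: dose (Gy) currently covering 95% of each target.
    prescription_by_target: prescribed dose (Gy) for each target.
    Assumption: normalization multiplies the whole dose distribution by one
    factor, so the largest required factor satisfies both targets.
    """
    factors = {
        name: prescription_by_target[name] / d95_by_target[name]
        for name in prescription_by_target
    }
    limiting = max(factors, key=factors.get)
    return limiting, factors[limiting]


if __name__ == "__main__":
    prescriptions = {"PTVboost": 50.6, "PTV": 41.8}  # Gy, from the text
    d95 = {"PTVboost": 50.0, "PTV": 41.0}            # Gy, invented values
    target, factor = normalization_factor(d95, prescriptions)
    print(target, round(factor, 4))
    for name, dose in d95.items():
        print(name, round(dose * factor, 2))
```

In this invented case the PTV drives the normalization, so the PTVboost D95% ends slightly above 50.6 Gy, which mirrors the slight overdose of the other target noted in the text.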

C. Plan evaluation and statistical methods
The following metrics were evaluated to appraise the dosimetric difference between the knowledge-based and experience-based planning: 1) homogeneity index (HI) of PTVboost and PTV, defined as (D2% - D98%) / D50%; 2) conformity index of PTVboost (CI_PTVboost) and PTV (CI_PTV), defined as the volume enclosed by the corresponding prescription isodose surface divided by the target volume; 3) the relative volume of the hot spot exceeding 107% of the prescribed dose in PTVboost (V107%, i.e., V54.14Gy); 4) the dose to 50% of the femoral head and urinary bladder volumes (D50%_FH and D50%_UB); 5) the mean dose to the femoral head and urinary bladder (Dmean_FH and Dmean_UB); and 6) total monitor units (MU). Moreover, using in-house MATLAB code (MathWorks, Natick, MA) and the DVH data exported in tabular format, the dose-volume metrics were averaged over the 70 patients in each planning technique group to plot the mean DVHs for comparison.
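The metric definitions above can be illustrated with a short sketch that works on a cumulative DVH exported in tabular form. It is written in Python rather than the in-house MATLAB code, which is not shown in the text. All function names are hypothetical, linear interpolation between DVH points is an assumption, and the mean DVH assumes that every patient's curve is sampled on the same dose grid.

```python
def dose_at_volume(doses_gy, volumes_pct, target_pct):
    """Dx%: dose (Gy) received by at least target_pct of the structure.

    Expects a cumulative DVH: doses ascending, volumes (%) non-increasing.
    Linear interpolation between neighbouring DVH points is an assumption.
    """
    points = list(zip(doses_gy, volumes_pct))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= target_pct > v1:
            return d0 + (d1 - d0) * (v0 - target_pct) / (v0 - v1)
    if points and points[-1][1] == target_pct:
        return points[-1][0]
    raise ValueError("target volume lies outside the DVH range")


def homogeneity_index(doses_gy, volumes_pct):
    """HI = (D2% - D98%) / D50%, as defined in the text."""
    d2 = dose_at_volume(doses_gy, volumes_pct, 2.0)
    d98 = dose_at_volume(doses_gy, volumes_pct, 98.0)
    d50 = dose_at_volume(doses_gy, volumes_pct, 50.0)
    return (d2 - d98) / d50


def conformity_index(prescription_isodose_volume_cc, target_volume_cc):
    """CI = volume enclosed by the prescription isodose / target volume."""
    return prescription_isodose_volume_cc / target_volume_cc


def mean_dvh(volume_curves):
    """Average the volume at each dose bin across patients (shared grid)."""
    count = len(volume_curves)
    return [sum(column) / count for column in zip(*volume_curves)]


if __name__ == "__main__":
    doses = [0.0, 40.0, 50.0, 60.0]      # Gy, toy DVH
    volumes = [100.0, 100.0, 50.0, 0.0]  # % of structure volume
    print(round(homogeneity_index(doses, volumes), 3))
    print(mean_dvh([[100.0, 80.0, 0.0], [100.0, 60.0, 20.0]]))
```

The toy DVH is far coarser than a clinical export; it only serves to make the interpolation easy to follow by hand.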
Using SPSS (V 21.0), the paired samples t-test was used to compare the paired data when the normality test was passed; otherwise, the Wilcoxon signed-rank test was performed to analyze the differences. The significance level was set at p < 0.05 (two-tailed). All plotting was performed with SigmaPlot Version 10.0 (Systat Software, Inc., San Jose, CA). Table 1 displays the dosimetric statistics of the 80 model-training cases before (Training) and after (Replanned) replanning by the senior physicists during the model verification process. Expert replanning achieved a much larger dose reduction to the urinary bladder than to the femoral head.

III. RESULTS
Both the RapidPlan and original plans were readily or nearly acceptable before normalization. Only minor adjustment was performed for the coverage of PTVboost in 55 RapidPlan and 52 original cases, respectively. The rest of the plans were normalized for the coverage of PTV. Table 2 lists the numerical statistics of the 70 patients as planned manually (original) or semiautomatically using model-generated objectives (RapidPlan). At the number of decimal places reported, the table cannot show the slightly but significantly lower HI_PTV of RapidPlan (0.255) relative to that of the original plans (0.263). As for the hot spots, positive V107% was not observed in any RapidPlan case but appeared in 18 of the 70 original plans (25.71%): the two greatest V107% values were 17.24% and 10.73%, and the rest were no larger than 2.28%. Owing to the limits of subjective judgment, these suboptimal hot spots were deemed acceptable trade-offs at the time, although they could have been avoided. The distributions of target HI and CI are displayed in Figs. 1(a) and 1(b), and the distributions of OAR dosimetry (D50%_FH, Dmean_FH, D50%_UB, and Dmean_UB) are displayed in Figs. 2(a) and 2(b), respectively. The bottom and top of the boxes and the band in between indicate the first and third quartiles and the median, respectively. The whiskers show the variability outside the upper and lower quartiles, with their ends marking the 10th and 90th percentiles. The outliers are plotted as individual crosses. Figure 3 shows the average DVHs of the 70 testing patients as planned by the conventional method (solid lines) and the knowledge-based solution (dashed lines). Figs. 3(a), (b), (c), and (d) display the mean DVHs for PTVboost, PTV, femoral head, and urinary bladder, respectively.
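The box-plot convention described above (box at the quartiles, whiskers at the 10th and 90th percentiles, remaining points drawn individually) can be reproduced with a short sketch. The function `box_plot_summary` is hypothetical; treating every point beyond the whiskers as an outlier and using the "inclusive" percentile interpolation of Python's `statistics.quantiles` are assumptions, and SigmaPlot may interpolate percentiles differently.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Summarize one metric the way the box plots are described.

    Assumptions: percentiles use the "inclusive" interpolation method, and
    any value outside the 10th-90th percentile whiskers counts as an outlier.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    deciles = quantiles(values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    outliers = [v for v in values if v < p10 or v > p90]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "p10": p10,
        "p90": p90,
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Invented data: the integers 0..100 stand in for one dose metric.
    summary = box_plot_summary(list(range(101)))
    print(summary["q1"], summary["median"], summary["q3"])
    print(summary["p10"], summary["p90"], len(summary["outliers"]))
```

Applied to a metric such as D50%_UB in each planning group, a narrower box and shorter whiskers would correspond to the tighter distribution reported for RapidPlan.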

IV. DISCUSSION
Consistent with previously published findings, (8)(9)(10)(11) the automatic process of RapidPlan largely improved the planning efficiency; on our stand-alone workstation (two 2.00 GHz processors, 32.0 GB RAM, 64-bit Windows 7 Ultimate), a typical case could be finished in about 30 min without any interactive objective adjustment.
For both the target (Fig. 1) and OAR (Fig. 2) metrics, the reduced data dispersion in the RapidPlan group demonstrated superior quality consistency and less variation than the original manual plans, in agreement with the RapidPlan rationale of alleviating the dependence on the subjective judgment of individual planners. (1)(2)(3)(4)(5)(6)(7) Improved OAR sparing by RapidPlan can be seen from the distributions in Fig. 2, except for a few outliers where the original plans achieved lower OAR exposure. For example, the red-circled data point of D50%_FH in Fig. 2(a) was optimized more aggressively in the original plan at the cost of a higher dose to the urinary bladder, whereas the RapidPlan reoptimization struck a balance. Of the 108 OAR outliers in the two groups (crosses in Fig. 2), 42 (38.89%) were from the same patients, indicating that cases that were either "challenging" or "simple" for manual planning were also so for the RapidPlan solution.
Quantitatively (Table 2), a minor but significant improvement (by 0.01) in dose homogeneity to PTVboost was observed in the RapidPlan group (p < 0.01), and the tiny difference (by 0.01) in CI_PTVboost was insignificant (p = 0.05). No obvious disparity (< 0.001) was found in CI_PTV (p = 0.90). These similar target dose metrics (also seen from the nearly overlapping target DVH lines in Figs. 3(a) and 3(b)) provided a fair basis for the comparison of OAR exposure. This similarity is attributable to the fact that RapidPlan does not estimate optimization objectives for the targets.
The much larger HI of PTV than that of PTVboost was attributable to the inherent features of SIB plans. However, the RapidPlan-generated objectives significantly reduced the HI_PTV by 0.008 (p < 0.01), hence improving the dose gradient in this zone. A possible explanation for this change was the inclusion of the transitional area (5 mm from PTVboost into PTV) as one of the "OARs" in our RapidPlan modeling rather than as a "target," so that the model predicts its DVH constraints (upper objectives) based on patient-specific evaluations. The small magnitude of improvement might be ascribed to the large geometric variety of the transitional zone; the diverse volume, shape, and location of the targets may have complicated the model regression and estimation far more than was the case with the relatively regular OAR structures.
As shown in Figs. 3(c) and 3(d) and Table 2, RapidPlan significantly (p < 0.01) reduced the dose to the femoral head by 1.53 Gy (9.86%) and 1.29 Gy (7.78%) for D50% and Dmean, respectively. For the urinary bladder, the reductions in D50% and Dmean were 4.93 Gy (17.50%) and 3.94 Gy (13.43%), respectively (p < 0.01). RapidPlan also showed a stronger capability of controlling the hot spots (V107%). These improvements were achieved at the cost of slightly but significantly more MUs in the RapidPlan cases (an increase of 15 MU or 3.73%, p < 0.01). The apparently "unavoidable" trade-offs or "hard-to-achieve" goals of OAR sparing during manual optimization were managed well by RapidPlan automatically, suggesting its potential value for overcoming the subjective limitations that arise during the conventional trial-and-error iteration of manual planning. These observations were highly consistent with earlier prostate studies, which concern a similar clinical site. RapidPlan-associated dose reductions to the bladder (7,13) and femoral head (7,12,13) were substantial and significant. Similar improvements in OAR sparing and consistency have also been widely reported for other clinical sites. (14)(15)(16)(17) If the performance of RapidPlan depends strongly on the expertise of the original planner and the model developer, a larger model library consisting of better plans may further enhance the results of knowledge-based planning. (22) Replanning by senior physicists significantly improved the quality of the training plans during the model verification process (Table 1); that, however, is highly time-consuming and clinically impractical for all prospective patients. By means of machine learning, RapidPlan successfully applies the expertise of senior physicists to prospective patients to generate above-average plan quality without interactive human intervention.
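The absolute and relative OAR dose reductions quoted above follow directly from the group means reported in the abstract. The sketch below reproduces that arithmetic; the function `dose_reduction` and the dictionary `GROUP_MEANS_GY` are hypothetical names, and computing the percentage from the rounded group means (rather than per patient) is an assumption about how the figures were derived.

```python
def dose_reduction(original_gy, rapidplan_gy):
    """Return (absolute reduction in Gy, relative reduction in %), rounded."""
    absolute = original_gy - rapidplan_gy
    relative_pct = 100.0 * absolute / original_gy
    return round(absolute, 2), round(relative_pct, 2)


# Mean doses in Gy as (original plans, RapidPlan), taken from the abstract.
GROUP_MEANS_GY = {
    "D50%_FH": (15.52, 13.99),
    "Dmean_FH": (16.59, 15.30),
    "D50%_UB": (28.17, 23.24),
    "Dmean_UB": (29.34, 25.40),
}

if __name__ == "__main__":
    for metric, (original, rapidplan) in GROUP_MEANS_GY.items():
        print(metric, dose_reduction(original, rapidplan))
```

The results agree with the reductions stated in the text, which suggests the percentages are expressed relative to the original-plan means.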
Relative to the clinical plans, RapidPlan achieved an even larger dose reduction to the femoral head than the best-effort manual replanning did (Tables 1 and 2).
"Varian-provided models," as preconfigured by world-leading institutes, should be of high quality; however, modifications to accommodate custom contouring and planning protocols are not allowed for these models. This nonrevisability also prevents new representative cases from being added to the library to enlarge the scope of model applications. As another potential limitation of RapidPlan, some critical parameters, such as the beam energy and field geometry, are not optimized by the model, and these may vary dramatically among centers and planners. This is why custom models were more popular in previous publications. (12)(13)(14)(15)(16)(17) In this study, the historical plans represent the average quality that is currently and reasonably achievable, and the improvement may be mostly attributable to the personalized objectives estimated by RapidPlan, which pass on the expertise incorporated in the model to prospective plans.

V. CONCLUSIONS
Using a DVH estimation model trained with 80 historical VMAT SIB plans of high quality for rectal cancer patients, the knowledge-based replanning of 70 clinical plans displayed significantly superior characteristics compared to those of the conventional manual planning in terms of target dose falloff, hot spot control, and OAR sparing. Suboptimal manual plans could be improved by the RapidPlan model, hence enhancing the consistency of plan quality.