Using gamma index to flag changes in anatomy during image‐guided radiation therapy of head and neck cancer

Abstract During radiation therapy of head and neck cancer, the decision to consider replanning a treatment because of anatomical changes has significant resource implications. We developed an algorithm that compares cone‐beam computed tomography (CBCT) image pairs and provides an automatic alert as to when remedial action may be required. Retrospective CBCT data from ten head and neck cancer patients that were replanned during their treatment was used to train the algorithm on when to recommend a repeat CT simulation (re‐CT). An additional 20 patients (replanned and not replanned) were used to validate the predictive power of the algorithm. CBCT images were compared in 3D using the gamma index, combining Hounsfield Unit (HU) difference with distance‐to‐agreement (DTA), where the CBCT study acquired on the first fraction is used as the reference. We defined the match quality parameter (MQP_x) as a difference between the xth percentiles of the failed‐pixel histograms calculated from the reference gamma comparison and subsequent comparisons, where the reference gamma comparison is taken from the first two CBCT images acquired during treatment. The decision to consider re‐CT was based on three consecutive MQP values being less than or equal to a threshold value, such that re‐CT recommendations were within ±3 fractions of the actual re‐CT order date for the training cases. Receiver‐operator characteristic analysis showed that the best trade‐off in sensitivity and specificity was achieved using gamma criteria of 3 mm DTA and 30 HU difference, and the 80th percentile of the failed‐pixel histogram. A sensitivity of 82% and 100% was achieved in the training and validation cases, respectively, with a false positive rate of ~30%. We have demonstrated that gamma analysis of CBCT‐acquired anatomy can be used to flag patients for possible replanning in a manner consistent with local clinical practice guidelines.


| INTRODUCTION
Radiation therapy of head and neck cancer is complex, especially if the gross disease and possible nodal regions at risk are located in close proximity to several critical structures. Precision radiation therapy increases the probability of success and reduces the risk and severity of complications. Volumetric modulated arc therapy (VMAT) produces dose distributions with steep gradients in order to minimize dose to neighboring healthy organs. Therefore, daily image-guided radiation therapy (IGRT) is necessary to ensure accurate target localization during treatment. In addition, it is common for some patients to experience tumor regression or weight loss during treatment, which may result in anatomical changes that can affect dose delivery to the tumor and organs at risk. 1 Such changes may require the patient to have a repeat CT (re-CT) simulation with the possibility of generating a revised treatment plan based on the changes in anatomy. [2][3][4][5] At our institution, radiation therapists are responsible for documenting changes in patient anatomy as treatment progresses. If anatomical changes are judged to be substantial, the physicist and/or the clinical specialist in radiation therapy (CSRT) is called to make a recommendation to the radiation oncologist as to whether a re-CT simulation is required. This recommendation is based on assessing the potential overdosing of critical organs such as the spinal cord and/or whether there are visible changes in the gross disease that could result in suboptimal dose coverage of the high dose target volume. Before a decision to re-CT is made, the physicist and CSRT (and perhaps the physician) review several image matches offline to look at systematic trends and assess the magnitude of volume changes. This approach can be time-consuming and depends on the judgment of several observers.
Therefore, we have developed a method to compare cone-beam CT (CBCT) images mathematically and automatically offline and provide an alert to the physician when action may be required. The alert is based on decision thresholds derived from retrospective analysis of CBCT image comparisons combined with the re-CT decisions that were made on actual patient cases. The algorithm is thereby trained to flag changes in anatomy in a manner consistent with local practice. The overall goal is to improve the efficiency of the decision-making process, since many patients who are reviewed for changes in anatomy do not go on to receive a re-CT recommendation. A secondary goal is to provide a quality assurance safeguard for human judgment of anatomical changes.
The question of when to replan during head and neck cancer treatment has been studied by other investigators. Paganelli  They determined that replanning should be considered for patients who have had large decreases in thickness and circumference at the level of the mastoid tip. Our goal is to develop a method based on image comparison and generate decision criteria based on our clinical process. This approach is more generic since there are no specific dose thresholds and there is no need to outline structures on CBCT images.
We propose to use the gamma index 9 to compare CBCT image data for the purpose of monitoring changes in anatomy during treatment. There are other metrics that can be used to evaluate image registration quality or similarity. Wu and Murphy 10 developed a neural network approach to determine whether a 3D/3D bony registration was successful or unsuccessful as applied to head and neck cancer radiation therapy. They compared two metrics: mutual information and mean-squared intensity difference. Castadot et al 11 compared 12 DIR algorithms as applied to adaptive radiation therapy of head and neck cancer using the Dice similarity index 12 and the correlation coefficient. We chose the gamma index because of our familiarity with the method (through dose comparison) and because it defines pass/fail criteria for all pixels in the 3-D image, so that a pass/fail map can be generated (i.e., the gamma map).
The study design is carried out in two parts: (a) algorithm training and (b) validation of the alert system. For algorithm training, we use retrospective CBCT data from ten patients that were replanned and calculate gamma maps from CBCT comparisons. The gamma maps are based on CBCT number differences and distance-to-agreement between two imaging data sets. From this, we define a match quality parameter (MQP) that tracks the level of anatomy mismatch, such that plotting this parameter by fraction number shows a downward trend as the degree of mismatch worsens. Then, the downward trend pattern is compared to when re-CT simulation was actually ordered by the radiation oncologist to determine the alert signal decision threshold to recommend a re-CT. The optimum parameter set is chosen based on receiver-operator characteristic (ROC) analysis 13 that assesses the sensitivity and specificity of the alert software. For algorithm validation, we test the algorithm on twenty different patients: ten patients that were replanned and ten patients that were not replanned.

2.A | Adaptive planning process
As part of our institutional guideline, the CBCT match to the planning CT on the first treatment day must be reviewed by the radiation oncologist or the clinical specialist in radiation therapy (CSRT).
For the remaining treatment, the radiation therapists are responsible for monitoring anatomy changes that may occur, as mentioned before. Figure 1 The dose distribution is calculated using the original monitor units on the new CT scan and compared to the original plan. If the dose distribution is substantially hotter as judged by the physicist, or if the dose to critical organs is compromised, the radiation oncologist reviews the dose calculation and a decision is made to replan. If a replan is not warranted, then the new planning CT scan is exported to the treatment unit to use for subsequent image guidance. The purpose of a more automated CBCT comparison tool is to reduce the time and judgment involved in deciding whether the adaptive process is necessary. Figure 1(b) shows the change to the process if an alert was sent to the clinician(s) directly. In this case, the algorithm compares CBCT images quantitatively in the background and tracks changes through software, which suppresses (or skips) the review step in Fig. 1(a). A re-CT is recommended based on numerical decision criteria, reducing the time taken by staff to review images offline. It would then be a quick check by the radiation oncologist, CSRT or physicist to verify that the algorithm made a reasonable recommendation. The reduction in time by staff will be most relevant for cases where it is obvious that action is not required, i.e., cases that should not be flagged for review. It should be noted that the CBCT comparison algorithm proposed in this work does not include dose calculation. Therefore, the comparison tool is an anatomical alarm and dose impact assessment would only take place once a re-CT is ordered.

2.B | Patient data and CBCT comparison
The imaging guideline at our institution for head and neck cancer treatment is to use daily IGRT: CBCT twice per week (including day 1) and orthogonal kV radiographs on all other days. Daily CBCT is used depending on the case, e.g., proximity of high dose to critical structures such as spinal cord, brain stem, optic structures, and parotid glands. As mentioned previously, we analyzed a total of 30 patients: ten patients for algorithm training and 20 patients for algorithm validation. Ethics approval was obtained for chart review and access to image data sets. Of the 20 patients that were replanned, 13 patients were rescanned because of weight loss, four patients had setup issues, two patients had swelling and one patient had early tumor response. All patients were treated with two 360° VMAT arcs on Varian linear accelerators (21iX or TrueBeam, Varian, Palo Alto CA, USA). The CBCT images were exported from Offline Review and were imported into the gamma comparison software developed in-house. During treatment, the CBCT is rigidly registered to the planning CT using bony landmarks by the radiation therapists and couch shifts are applied with no action level. Since all CBCT scans are coregistered to the planning CT in Offline Review, no further image preprocessing steps are required for the gamma comparison. Setting the planning CT scan as the reference scan posed potential problems due to CT number discordance between cone-beam and helical CT imaging. CBCT numbers are affected by scattering conditions in the patient and are less accurate than those obtained by fan-beam helical CT. We therefore opted to use the CBCT on fraction 1 as the reference image set to which all subsequent CBCT scans are compared. If there is a replan during treatment, the CBCT on the first day of subsequent treatment is set as the new reference.
As part of treatment plan quality assurance for VMAT, a combination of dose difference and distance-to-agreement (DTA), called gamma analysis, 9 is often used to compare the planned dose distribution to the dose as delivered by the treatment machine. 14 We repurposed the gamma analysis technique to highlight changes in patient anatomy instead of dose, as imaged by CBCT. We use DTA and CT number difference criteria, where CT number contrast is expressed in Hounsfield units (HU). The mathematical formulation of gamma analysis is well-established. 9 Briefly, the gamma analysis is a quadratic combination of the distance between the pixels being compared and the difference in their CT-number value, scaled by DTA and HU-difference parameters. The main feature of the gamma comparison is that any pixels that have γ > 1 correspond to a failure 9 and the gamma map can show regions of anatomy mismatch visually. Gamma analysis identifies how well two 3-D data sets match in the context of user-supplied criteria. These parameters designate how well a pixel must match its immediate surroundings to be considered a pass, whereas any pixels not meeting these criteria are deemed a failure. Although it is a computationally intensive calculation, optimizations to the algorithm 15 and adopting graphics processing units (GPUs) can speed up the analysis by several orders of magnitude. 16 In our lab, commercially available graphics hardware (NVidia GeForce GTX 780) allows gamma computations comparing two CBCT image sets (384 × 384 × 70 voxels) to be completed in less than 5 s.
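The quadratic combination described above can be illustrated with a small brute-force sketch. This is not the in-house GPU software; it is a minimal NumPy version written under several assumptions: the function name `gamma_map`, the `(z, y, x)` voxel spacing argument, the edge-replicating padding at the image border, and the limited search radius `search_mm` are all hypothetical choices. For each reference voxel it takes the minimum, over nearby voxels of the evaluated image, of sqrt(distance²/DTA² + ΔHU²/ΔHU_tol²). The default criteria (3 mm, 30 HU) are the ones the abstract reports as giving the best trade-off.

```python
import itertools
import numpy as np

def gamma_map(ref, ev, spacing_mm, dta_mm=3.0, hu_tol=30.0, search_mm=None):
    """Brute-force 3-D gamma index of `ev` against `ref` (HU arrays, same shape).

    Values are exact wherever gamma <= search_mm / dta_mm; larger values
    are upper bounds because voxels beyond the search radius are ignored.
    """
    ref = np.asarray(ref, dtype=float)
    ev = np.asarray(ev, dtype=float)
    if search_mm is None:
        search_mm = 2.0 * dta_mm  # assumed default, not taken from the paper
    # Number of voxels to search along each axis.
    reach = [int(search_mm // s) for s in spacing_mm]
    # Assumption: replicate edge values so shifted windows stay in bounds.
    padded = np.pad(ev, [(r, r) for r in reach], mode="edge")
    gamma_sq = np.full(ref.shape, np.inf)
    for off in itertools.product(*(range(-r, r + 1) for r in reach)):
        dist_sq = sum((o * s) ** 2 for o, s in zip(off, spacing_mm))
        if dist_sq > search_mm ** 2:
            continue
        # Evaluated image shifted by `off`, aligned with the reference grid.
        window = tuple(slice(r + o, r + o + n)
                       for r, o, n in zip(reach, off, ref.shape))
        cand = dist_sq / dta_mm ** 2 + (padded[window] - ref) ** 2 / hu_tol ** 2
        np.minimum(gamma_sq, cand, out=gamma_sq)
    return np.sqrt(gamma_sq)
```

A voxel fails when the returned value exceeds 1. On a full 384 × 384 × 70 CBCT volume this CPU loop would be far slower than the GPU timing quoted above; it is meant only to make the definition concrete.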

2.C | Match quality parameter (MQP)
Instead of calculating a pass rate as is commonly done for dose quality assurance, 14 we opted to analyze the number of failed pixels shows the corresponding histograms of γ > 1 generated from each map. We calculate the xth percentile of the histograms and then take their difference, namely: where MQP_{x,i} is defined as the match quality parameter for the i th Then, the problem is to use this plot to determine a decision threshold to indicate that a re-CT may be required.
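The percentile difference described here can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the failed-pixel histogram is the distribution of gamma values among voxels with γ > 1, and that MQP is the percentile from the reference comparison minus the percentile from the later comparison, a sign convention chosen so that MQP trends downward as the mismatch grows, as the text describes. The function names and the fallback value of 1.0 for a map with no failed voxels are hypothetical.

```python
import numpy as np

def failed_percentile(gamma, x):
    """x-th percentile of gamma values among failed voxels (gamma > 1)."""
    failed = np.asarray(gamma, dtype=float)
    failed = failed[failed > 1.0]
    if failed.size == 0:
        # Assumption: with no failures, fall back to the pass/fail boundary.
        return 1.0
    return float(np.percentile(failed, x))

def mqp(gamma_ref, gamma_i, x=80.0):
    """Match quality parameter for one comparison.

    gamma_ref: gamma map of the reference comparison (first two CBCT scans).
    gamma_i:   gamma map comparing a later CBCT with the fraction-1 CBCT.
    Assumed sign: reference minus later, so larger mismatch gives lower MQP.
    """
    return failed_percentile(gamma_ref, x) - failed_percentile(gamma_i, x)
```

Under this convention a comparison that matches as well as the reference gives an MQP near zero, and increasingly negative values indicate that the failed voxels are failing by a larger margin.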

2.D | Definition of re-CT decision criteria
As a starting point, gamma maps were generated using gamma cri- at all for this patient, which is incorrect since this patient was actually replanned. Therefore, in the training phase of the algorithm, we need to find the MQP threshold value that gives the best trade-off in sensitivity and specificity for all ten patients in the training set. In order to quantify the algorithm's predictive power, we define the following:
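For reference, the standard quantities used in ROC analysis can be written as a short sketch. These are the textbook definitions of sensitivity, specificity and false positive rate from counts of true/false positives and negatives; the function names are hypothetical, and the counts in the usage note are illustrative rather than taken from the study.

```python
def sensitivity(tp, fn):
    """Fraction of cases that needed a re-CT and were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of cases that needed no re-CT and were not flagged: TN / (TN + FP)."""
    return tn / (tn + fp)

def false_positive_rate(tn, fp):
    """Fraction of no-re-CT cases that were flagged anyway: 1 - specificity."""
    return 1.0 - specificity(tn, fp)
```

As an illustration only, flagging 9 of 11 possible true positives would give a sensitivity of about 0.82, and 3 false alerts among 10 possible true negatives would give a false positive rate of 0.3.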

3.A | MQP plots
The training data set consisted of ten head and neck cancer patients that were replanned during their treatment course. One patient was replanned twice because of early tumor response. This resulted in 11 possible true positives (TP) in this data set. Two of the patients did not have sufficient CBCT scans after their replan to qualify for the three-fraction MQP decision condition so these were excluded from the analysis. This resulted in eight possible true negatives (TN).
The test patient data set consisted of ten additional patients that were replanned and ten patients that were not replanned or reviewed for possible re-CT. This resulted in ten possible TP and 16 possible TN since four of the replanned patients did not have sufficient CBCT data after their replan. Figure 4 shows the MQP_80, i.e., the MQP calculated from the 80th percentile of the failed-pixel histograms, for one of the patients in the training data set using our  Fig. 5), which would be expected, except for two patients. In both of these patients there were external contour differences but they were less than 1 cm in magnitude, which were not flagged for review by the radiation therapists because of our in-house guideline. This means that the algorithm (correctly) did not recommend a re-CT in eight of ten patients that were not replanned. Therefore, we deduce that 6 mm DTA would not be acceptable for tracking changing anatomy in general, if this algorithm is to be used clinically. We applied the remaining parameter sets from Fig. 6 to the test patients and the result is shown in Fig. 8. Clearly from  The percentile from the failed-pixel histograms for each curve is the same as that shown in Fig. 6.

3.B | ROC analysis
overdosing critical organs in a more automated way, while reducing the number of image reviews where it is obvious that a replan would not be necessary.

| CONCLUSION
We have developed a cost-effective tool to assess anatomical changes in CBCT images using the gamma comparison method. A parameter called match quality parameter (MQP) was introduced and was calculated using the histogram of pixels that fail the CBCT gamma criteria (γ > 1). The MQP plotted with fraction number showed a downward trend if the magnitude of anatomical differences increased as the treatment progressed. We proposed that recommending a re-CT requires three consecutive MQP values to be less than or equal to a numerical decision threshold value. The decision criteria were derived from comparing the timing of the algorithm to the timing of actual re-CT decisions that were based on expert judgment within our department. The parameter combination of gamma criteria, area under the histogram (percentile) and MQP threshold that gave the best trade-off in sensitivity and specificity was determined using ROC analysis.
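The three-consecutive-values rule can be sketched as follows. This is a minimal illustration, assuming the MQP values are supplied in the order the CBCT scans were acquired; the function name `first_alert`, the `run_length` parameter and the threshold used in the usage note are hypothetical and do not reproduce the trained values from the study.

```python
def first_alert(mqp_values, threshold, run_length=3):
    """Index of the scan that completes the first run of `run_length`
    consecutive MQP values <= threshold, or None if no alert is raised."""
    run = 0
    for i, value in enumerate(mqp_values):
        # Extend the run on a value at or below threshold, otherwise reset it.
        run = run + 1 if value <= threshold else 0
        if run >= run_length:
            return i
    return None
```

For example, with a hypothetical threshold of -0.4, the series `[0.0, -0.1, -0.5, -0.6, -0.2, -0.7, -0.8, -0.9]` raises its alert at index 7: the run that starts at index 2 is interrupted by -0.2, so only the last three values form a qualifying run. Requiring consecutive values is what keeps a single poor match from triggering a recommendation.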