Risk Stratification for Oropharyngeal Squamous Cell Carcinoma Using Texture Analysis on CT – A Step Beyond HPV Status

Background and Purpose: Human papillomavirus-associated oropharyngeal squamous cell carcinoma (OPSCC) is increasingly prevalent. Despite the overall more favorable outcome, the heterogeneous treatment response observed within this patient group highlights the need for additional means to prognosticate and guide clinical decision-making. Promising prediction models using radiomics from primary OPSCC have been derived. However, no models using metastatic lymphadenopathy exist to allow prognostication when the primary tumor is not seen. The aim of our study was to evaluate whether radiomics using metastatic lymphadenopathy allows for the development of a useful risk assessment model comparable to that of the primary tumor, and whether additional knowledge of the HPV status further improves its prognostic efficacy.
Materials and Methods: Eighty consecutive patients with stage III-IV OPSCC diagnosed between February 2009 and October 2015, known human papillomavirus status, and pre-treatment CT images were retrospectively identified. Manual segmentation of the primary tumor and metastatic lymphadenopathy was performed, and the extracted texture features were used to develop multivariate assessment models to prognosticate treatment response.
Results: Texture analysis of either the primary tumor or metastatic lymphadenopathy from pre-treatment enhanced CT images can be used to develop models for the stratification of treatment outcomes in OPSCC patients. AUCs range from .78 to .85 for the various OPSCC groups tested, indicating high predictive capability of the models.
Conclusions: This preliminary study can form the basis of a multi-centre trial that may help optimize treatment and improve quality of life in patients with OPSCC in the era of personalized medicine.


Introduction
Head and neck cancer is the sixth most common cancer worldwide, with oropharyngeal squamous cell carcinoma (OPSCC) arising from the base of tongue and tonsils comprising the most common histologic subtype and anatomic subsites. 1 Traditional risk factors include smoking and alcohol consumption, and affected patients are most commonly white men between 40 and 55 years of age. In recent years, human papillomavirus (HPV)-associated oropharyngeal squamous cell carcinoma related to infection with high-risk HPV genotypes, most notably HPV-16, has emerged as a distinct entity. It differs from HPV-negative disease in patient demographics, molecular profiles and, more importantly, in its markedly more favorable prognosis. This is reflected in the eighth edition of the American Joint Committee on Cancer (AJCC) staging guideline for oropharyngeal cancer from 2017, which provides distinct staging criteria based on HPV status; immunohistochemistry of p16 protein as a marker for HPV status has been incorporated into the routine workup for OPSCC. 2,3 Treatment options for both HPV-negative and HPV-positive OPSCC patient populations include primary surgery and/or radiation therapy with or without chemotherapy. 5-7 Despite the overall more favorable outcome for the HPV-positive OPSCC patient population, there is heterogeneity in the treatment response, with a substantial risk of disease recurrence (hazard ratio [HR] .26) and death (HR .24) among these patients. 8 Therefore, identification of parameters in addition to HPV status that predict treatment response in OPSCC patients would be beneficial for personalization of care and selection of candidates for potential treatment de-intensification.
Radiomics is a rapidly emerging field of imaging analysis that employs high-throughput quantitative mapping of features from standard of care medical images to develop models that assist in clinical decision making by improving diagnostic, prognostic and predictive accuracy. So far, oncologic applications include screening, staging and prognosis; in particular, prediction of survival and the development of distant metastatic disease. 9,10,12-18 Advanced analysis methods to estimate the mean intracellular water molecular lifetime and volume transfer constant in MRI have been utilized to prognosticate overall survival in patients with head and neck SCC. 19,20 Ou et al. have developed a model to stratify patients with locally advanced head and neck SCC on chemoradiotherapy treatment. 15 In a meticulous study using pre-treatment enhanced CT imaging, the M.D. Anderson Cancer Centre Head and Neck Quantitative Imaging Working Group consortium used radiomics based on the primary tumor in oropharyngeal squamous cell carcinoma to predict local recurrence at 55 months. 21 However, in some cases, the primary tumor remains occult on imaging and endoscopic examinations, and diagnosis in these cases relies on biopsy of metastatic lymph nodes.
We hypothesize that texture signatures from pre-treatment CT images of the metastatic lymph nodes may help in stratifying the treatment outcome of OPSCC patients. The aim of this study is therefore to develop a multivariable risk assessment model for OPSCC based on 3D high order texture analysis of both the primary tumor and metastatic lymphadenopathy on pre-treatment CT images, with or without regard to HPV status, and compare their prognostic efficacy. We sought to elucidate whether radiomics of metastatic lymphadenopathy allows for stratification of OPSCC patients, and whether additional knowledge of the HPV status leads to a higher prognostic efficacy of the assessment model.

Patient Selection
A retrospective chart review of 299 consecutive patients with head and neck cancer diagnosed between February 2009 and October 2015 and treated at our institution was performed. All patients underwent standard of care treatment (surgery, radiation therapy, and chemotherapy, either as single modalities or in combination). Eighty consecutive patients were selected who had biopsy-proven stage III-IV oropharyngeal squamous cell carcinoma (both p16 positive and negative), known HPV status and a pre-treatment contrast-enhanced CT scan of the neck that showed both the primary tumor and metastatic lymph nodes. Patient characteristics are summarized in Table 1. Further inclusion criteria were a minimum follow-up of 24 months and documented treatment response. Patients with distant metastatic disease at the time of diagnosis, past malignancy, or prior chemotherapy, radiotherapy or surgery to the neck region were excluded. Patients with unclear documentation of the treatment protocol or treatment response were also excluded. Favorable treatment response was defined as the absence of regional or distant metastatic disease at 24 months by clinical examination and follow-up imaging. In patients with unfavorable treatment response, regional or distant metastatic disease recurrence secondary to prior OPSCC was confirmed by biopsy. The study was approved by the Conjoint Health Research Ethics Board.

CT Imaging and Segmentation
Contrast-enhanced CT examinations of the neck were performed preoperatively. Images were acquired at .625 mm intervals from the supraorbital ridge to the sternoclavicular joints after administration of 125 mL non-ionic iodinated contrast material (Optiray 350) using a split injection technique (50 mL with a 3-minute delay and 75 mL with a 20-second delay) on one of the following CT scanners: Lightspeed VCT 64-slice or 16-slice CT scanners (GE Healthcare, Milwaukee, Wisconsin), Revolution 64-slice CT scanner (GE Healthcare) and Somatom 128-slice CT scanner (Siemens Healthcare, Forchheim, Germany). We employed a compensation method to correct for the variations of radiomic features caused by using different CT scanners and reconstruction techniques. 22 Using the reformatted soft tissue algorithm axial images with 2.5 mm slice thickness, manual segmentation of both the primary tumor and a representative ipsilateral metastatic lymph node containing cystic and solid components was performed by a senior radiology resident (YS.C.) in a slice-by-slice manner using the OsiriX graphical user interface. 23 Image slices with artifact from dental hardware involving the primary tumor or metastatic lymph node were excluded from segmentation. Contouring was validated independently by a senior neuroradiology fellow (J.N.) and a fellowship-trained neuroradiology staff member with more than 10 years' experience in head and neck image interpretation (J.T.L.). Representative images of segmentation are shown in Supplemental Figure 1.

Texture Features
A total of 326 texture features were extracted from the contoured primary tumor and metastatic lymph node using in-house developed MATLAB-based software. 24 These features can be classified into three separate sets: 1) first order statistics (34 features), 2) volume-based statistics (6 features), and 3) higher order statistical textural information (286 features).
First order statistics are calculated from all voxels in the region of interest and are common measurements such as mean and standard deviation, among others. Volume-based features quantify the size and shape of the region of interest. Texture features are a broad spectrum of higher-order statistical operations that characterize the heterogeneity of the region in 3 dimensions.
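To make the three feature families concrete, the following is a minimal sketch and not the in-house MATLAB software used in the study. The function names, the toy intensities, and the 0.5 mm in-plane voxel spacing are assumptions for illustration; only the 2.5 mm slice thickness comes from the text. The texture example uses a simple, non-symmetric co-occurrence count between horizontal neighbors on one 2D slice, whereas the study computed its texture features in 3 dimensions.

```python
import statistics

def first_order_features(voxels_hu):
    """Summarize the HU values of all voxels inside a region of interest."""
    mean = statistics.fmean(voxels_hu)
    sd = statistics.pstdev(voxels_hu)
    # Skewness: third standardized moment (0 for a symmetric histogram).
    skew = 0.0
    if sd > 0:
        skew = sum(((v - mean) / sd) ** 3 for v in voxels_hu) / len(voxels_hu)
    return {"mean": mean, "sd": sd, "min": min(voxels_hu),
            "max": max(voxels_hu), "skewness": skew}

def roi_volume_mm3(n_voxels, spacing_mm):
    """Volume-based feature: voxel count times the volume of one voxel."""
    dx, dy, dz = spacing_mm
    return n_voxels * dx * dy * dz

def glcm_angular_second_moment(slice_levels):
    """Texture feature: ASM of a horizontal-neighbor co-occurrence matrix.

    slice_levels is a 2D list of quantized gray levels. ASM is 1.0 for a
    perfectly uniform region and falls as the region becomes heterogeneous.
    """
    counts = {}
    for row in slice_levels:
        for a, b in zip(row, row[1:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())

# Hypothetical ROI: six voxel intensities in HU, 0.5 x 0.5 x 2.5 mm voxels.
roi = [40, 55, 60, 60, 65, 80]
features = first_order_features(roi)
volume = roi_volume_mm3(len(roi), (0.5, 0.5, 2.5))
```

In this toy case the mean is 60 HU and the volume is 3.75 mm³; a real pipeline would compute such quantities over every segmented voxel and over several orientations and kernel sizes.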

Radiomics Features and Machine Learning Modules
Not all 326 features were included in the machine learning models implemented later in this paper. First, the large set of statistical features was reduced to a subset of the most discriminative features. Linear discriminant analysis (LDA) was used to rank each feature individually in terms of its discriminatory importance (Table 2). This list was used to select the number of features included in each machine learning model. These features were then used to train a machine learning algorithm to classify the desired binary groups. The top 15 most distinctive features for each of the binary groups are listed in Table 3. The maximum number of texture features included was determined by maximizing cross-validated accuracy. This value was not the same for each binary group or each machine learning model. Maximums typically occurred between 5 and 20 texture features. To make one-to-one comparisons, the average threshold was determined for all models/binary groups. This threshold was 15 ± 3 texture features, where the error was determined from the standard deviation of all thresholds averaged. This standard deviation was converted into uncertainty when reporting AUCs and cross-validated accuracies.
Logistic regression (LR) is often used in small data sets because of its resistance to overfitting. Given the small sample size of our study, LR minimized overfitting and provided reliable cross-validated accuracy and AUC. In our study, LR was implemented to discriminate between 5 binary groups. For each binary group, the data sets were trained with increasing numbers of texture features, taken from the ranked list created from LDA. The optimal number of features was determined by maximizing cross-validated accuracy.
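The per-feature ranking step can be sketched as follows. The text does not give the exact LDA score used, so this sketch assumes the Fisher criterion, which is the quantity LDA maximizes when applied to a single feature: squared difference of class means divided by the summed class variances. The feature names and values are hypothetical.

```python
import statistics

def fisher_score(values, labels):
    """Between-class separation over within-class spread for one feature."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    between = (statistics.fmean(pos) - statistics.fmean(neg)) ** 2
    within = statistics.pvariance(pos) + statistics.pvariance(neg)
    if within == 0:
        # Constant within both classes: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within

def rank_features(table, labels):
    """Return feature names sorted from most to least discriminative."""
    scores = {name: fisher_score(col, labels) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0, 0]
table = {
    "glcm_entropy": [0.9, 1.0, 1.1, 2.0, 2.1, 1.9],  # classes well separated
    "mean_hu": [60, 75, 58, 62, 70, 66],             # classes overlap
}
ranking = rank_features(table, labels)
top_k = ranking[:1]  # the study kept roughly 15 of 326 features
```

A classifier would then be trained on the first k names of `ranking`, with k increased step by step and the value giving the best cross-validated accuracy retained.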
Oversampling. Due to the imbalance in size between the positive and negative binary groups, a correction was required to ensure the models were not biased towards one binary subgroup. This was achieved by oversampling the smaller group to equalize the number of cases in the positive and negative groups.
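One common way to equalize two groups is random oversampling, in which cases from the smaller group are duplicated at random until both groups have the same size. The text does not specify which oversampling scheme was used, so the sketch below is an assumption; the function name, the seed, and the placeholder feature vectors are hypothetical, while the class sizes (53 responders, 27 non-responders) are those reported in the study.

```python
import random

def oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes match in size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    target = max(len(by_class[0]), len(by_class[1]))
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # The larger class needs no extras; the smaller one is topped up.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([y] * target)
    return out_rows, out_labels

# 53 responders (1) and 27 non-responders (0), with placeholder features.
rows = [[float(i)] for i in range(80)]
labels = [1] * 53 + [0] * 27
balanced_rows, balanced_labels = oversample(rows, labels)
```

When oversampling is combined with cross-validation, duplicating cases only within each training fold avoids the left-out case also appearing in the training data.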
ROC Analysis and Leave-One-Out Cross Validation. The area under the curve (AUC) of a receiver operating characteristic (ROC) curve was used to evaluate the performance of the derived machine learning model. To minimize overfitting, a leave-one-out cross validation (LOOCV) strategy was used to train the machine learning models. Specifically, the model is constructed using all patients but one. The left-out patient is then tested with the fitted model to determine whether it can accurately classify the patient into the correct binary class. This is repeated for all patients in the data set, "leaving one out" in each iteration. The accuracy, sensitivity and specificity were then calculated from the results of all iterations. This accuracy tests the variance in the model by assessing overfitting. It was the primary statistic used when determining a texture feature dimensionality threshold.
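The LOOCV loop and the AUC calculation can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's implementation: the logistic regression is a plain gradient-descent fit with an arbitrary learning rate and epoch count, the data set is a hypothetical one-feature toy example, and the AUC is computed as the fraction of responder/non-responder pairs in which the responder receives the higher score.

```python
import math

def predict_proba(w, x):
    """Logistic model; w holds one weight per feature plus a trailing bias."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Gradient-descent logistic regression (hyperparameters are assumptions)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def loocv_scores(X, y):
    """Score each patient with a model trained on all the other patients."""
    scores = []
    for i in range(len(X)):
        w = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(predict_proba(w, X[i]))
    return scores

def auc(y, scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(y, scores, cutoff=0.5):
    """Sensitivity and specificity at a probability cutoff."""
    tp = sum(s >= cutoff and yi == 1 for s, yi in zip(scores, y))
    tn = sum(s < cutoff and yi == 0 for s, yi in zip(scores, y))
    return tp / y.count(1), tn / y.count(0)

# Hypothetical one-feature data set: 1 = responder, 0 = non-responder.
X = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 1, 1, 1]
scores = loocv_scores(X, y)
```

Because every score comes from a model that never saw the corresponding patient, `auc(y, scores)` and `sens_spec(y, scores)` estimate out-of-sample performance rather than goodness of fit.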

Patient Characteristics
Of the 80 patients with biopsy-proven OPSCC and known HPV status who fulfilled the selection criteria, 53 patients had a favorable response to chemoradiation treatment ("responders"), and 27 patients demonstrated regional or distant metastatic disease ("non-responders") at 24 months following treatment. Sixty-eight patients were HPV-positive as confirmed with p16 immunohistochemistry staining. Of these, 46 patients were responders, and 22 patients were non-responders.
The patients were divided into five binary groups, and their texture features were used by the machine learning model to further stratify each group into responders and non-responders. The first four groups were: primary OPSCC (HPV-positive and HPV-negative combined; "OP"), metastatic lymph node (HPV-positive and HPV-negative combined; "LN"), HPV-positive only primary OPSCC and HPV-positive only metastatic lymph node. The fifth group comprised primary OPSCC texture features from 68 HPV-positive and 12 HPV-negative patients and served as an internal control for the efficacy of the machine learning model utilized to determine the p16/HPV status (Figure 1).
As stated earlier, we determined the p16/HPV status based on primary OPSCC texture features of our patients as an internal control ("Primary tumor HPV status" group of Figure 1), and the results are shown in Figure 2. The AUC measures .91 ± .06, indicating a robust discrimination of the HPV status by LR. The AUC was .86 ± .03 for the primary OPSCC binary group ("Primary tumor combined" group of Figure 1) and .78 ± .03 for the metastatic lymphadenopathy group ("Metastatic LN combined" group in Figure 1), irrespective of HPV status (Figure 3).
We proceeded to analyze radiomics features of primary OPSCC and metastatic lymph nodes within the HPV-positive patient group to elucidate whether knowledge of HPV status provided added information in the stratification process (Figure 4). For the HPV-positive primary OPSCC group ("Primary tumor HPV (+)-only" in Figure 1), the AUC measured .84 ± .05, slightly lower than the .86 ± .03 for the general primary OPSCC group. However, the AUC for the HPV-positive metastatic lymph node group ("Metastatic LN HPV (+)-only" in Figure 1), measuring .80 ± .02, was slightly higher than the .78 ± .03 for the general metastatic lymph node group. The AUC, accuracy, sensitivity and specificity for all models are summarized in Figures 2, 3 and 4.

Discussion
Radiomics has emerged as a promising tool to advance clinical decision making by extracting quantitative information based on texture features from medical imaging. In the era of personalized medicine, stratification of the OPSCC patient population based on pre-treatment imaging studies harbors the prospect of decreasing treatment-associated side effects and improving the quality of life of patients.
The use of radiomics to stratify treatment response based on texture features derived from pre-treatment CT images of primary OPSCC has been elucidated meticulously in the study from M.D. Anderson Cancer Centre. 21 Bogowicz et al. showed that expanding radiomics analysis with LN radiomic features improves model performance for prediction of the more complex and clinically more relevant endpoint of locoregional control. 25 In our study, we showed that texture analysis of either the primary tumor or metastatic lymphadenopathy from pre-treatment enhanced CT images can be used to develop models for the stratification of OPSCC patient treatment outcome. Our models using texture features from primary OPSCC are in keeping with results from M.D. Anderson Cancer Centre, although the lower sensitivity and specificity in our study are likely due to our considerably smaller patient number. In addition, we showed that models using the primary tumor are consistently slightly superior to the ones using texture features from metastatic lymph nodes, irrespective of HPV status. However, combination of texture features of the metastatic lymph nodes with additional knowledge of positive HPV status is superior to texture features alone. This is of importance, as sometimes the primary tumor cannot be found either on endoscopy or pre-treatment imaging. In these instances, advanced imaging modalities such as MRI or FDG PET-CT could provide alternative sources for patient stratification, 11,20 but are not always performed and are themselves limited in their ability to detect the primary tumor. In most institutions, contrast-enhanced CT remains the most common pre-treatment imaging modality performed.
Our study was limited by the retrospective nature of the analysis, the small patient number and the absence of external validation. Multi-class training of texture features was also not possible, so analysis was limited to binary outcomes. Further validation of our models is planned with either a separate set of OPSCC patients meeting the same inclusion/exclusion criteria within our institution (internal validation) or from an external institution (external validation). For the latter option, differences in image acquisition secondary to technology- and protocoling-related differences will have to be taken into consideration.

Conclusions
In this study based on pre-treatment CT images of patients with stage III-IV OPSCC, we developed predictive models using texture features derived from the primary tumor and/or metastatic lymphadenopathy. We showed that, while primary tumor data produced superior models, models derived from metastatic lymphadenopathy were only slightly inferior. In addition, the combination of radiomics features with knowledge of HPV status in metastatic lymph nodes resulted in a superior model compared to radiomics features alone. These findings are of significance since the primary tumor may not be visible endoscopically or on medical imaging. Our findings suggest that radiomics using metastatic lymphadenopathy is a useful adjunctive approach to predicting treatment response. This is a preliminary study to show the potential of lymph node radiomic features as a prognostic factor in the treatment of OPSCC patients. These results need to be further investigated and validated in a larger, prospective multicentre cohort.

Figure 2. Comparison of HPV status binary for the primary tumor. Brackets indicate uncertainty.

Figure 3. Comparison between lymph node and primary tumor binaries. Brackets indicate uncertainty.

Figure 4. Comparison between lymph node and primary tumor binaries for HPV-positive patients. Brackets indicate uncertainty.

Table 2. Texture Features Calculated in the Present Work.

Table 3. Top 15 Selected Features for Each Binary Group.
1. Proportion of 3 × 3 × 3 kernel LBP with a value of 5 to the total distribution of LBP values
2. Standard deviation of information measure of correlation1 of GLCM at sagittal orientation
3. Proportion of 5 × 5 × 5 kernel LBP with a value of 8 to the total distribution of LBP values
4. Mean of information measure of correlation1 of GLCM at axial orientation
5. Proportion of 5 × 5 × 5 kernel LBP with a value of 10 to the total distribution of LBP values
6. Proportion of 5 × 5 × 5 kernel LBP with a value of 4 to the total distribution of LBP values
7. Mean of 5 × 5 × 5 kernel LBP
8. Mean of information measure of correlation2 of GLCM at axial orientation
9. Run length of non-uniformity of GLRL at level of 120 at coronal orientation
10. Proportion of 5 × 5 × 5 kernel LBP with a value of 11 to the total distribution of LBP values
11. Proportion of 3 × 3 × 3 kernel LBP with a value of 4 to the total distribution of LBP values
12. Mean of information measure of correlation2 of GLCM at axial orientation

9. Absolute mean of LAWS LLW
10. Run length of non-uniformity of GLRL at level of 30 at coronal orientation
11. Standard deviation of sum of entropy of GLCM at sagittal orientation
12. Absolute mean of LAWS LLR
13. Proportion of pixel numbers with 75-100 HU to the total number of ROI pixels
14. Proportion of LBP length of 2 with kernel size of 3
15. Angular second moment of GLDM at axial orientation
(continued)