Prediction of radiotherapy response with a 5‐microRNA signature‐based nomogram in head and neck squamous cell carcinoma

Abstract Radiotherapy is unlikely to benefit all patients with head and neck squamous cell carcinoma (HNSCC). Therefore, novel method is warranted to predict the radiotherapy response. Our study aimed to construct a microRNA (miRNA)‐based nomogram to predict clinical outcomes of patients with HNSCC receiving radiotherapy. We screened out 56 differential miRNAs by analyzing 44 paired tumor and adjacent normal samples miRNA expression profiles from The Cancer Genome Atlas (TCGA). A total of 307 patients with HNSCC receiving adjuvant radiotherapy were randomly divided into a training set (n = 154) and a validation set (n = 153). In the training set, we combined the differential miRNA profiles with clinical outcomes, and LASSO regression model was applied to establish a 5‐miRNA signature. The prediction accuracy of the 5‐miRNA signature was further validated. In addition, target genes of these miRNAs were predicted, and Gene Ontology (GO) analysis as well as KEGG pathway analysis was executed. A 5‐miRNA signature including miR‐99a, miR‐31, miR‐410, miR‐424, and miR‐495 was identified. With a cutoff value of 1.2201 from Youden's index, the training set was divided into high‐risk and low‐risk groups, and the 5‐year overall survival was significantly different (30% vs. 73%, HR 3.65, CI 2.46–8.16; P < 0.0001). Furthermore, our 5‐miRNA signature revealed that only low‐risk group would benefit from radiotherapy. Then, a nomogram combining 5‐miRNA signature with clinical variables to predict radiotherapy response was constructed. The analysis of 108 target genes of these miRNAs revealed some potential mechanisms in HNSCC radiotherapy response for future investigations. In conclusion, the 5‐miRNA signature‐based nomogram is useful in predicting radiotherapy response in HNSCC and might become a promising tool to optimize radiation strategies.


Introduction
In the training set, we combined the differential miRNA profiles with clinical outcomes, and LASSO regression model was applied to establish a 5-miRNA signature. The prediction accuracy of the 5-miRNA signature was further validated. In addition, target genes of these miRNAs were predicted, and Gene Ontology (GO) analysis as well as KEGG pathway analysis was executed. A 5-miRNA signature including miR-99a, miR-31, miR-410, miR-424, and miR-495 was identified. With a cutoff value of 1.2201 from Youden's index, the training set was divided into high-risk and low-risk groups, and the 5-year overall survival was significantly different (30% vs. 73%, HR 3.65, CI 2.46-8. 16; P < 0.0001). Furthermore, our 5-miRNA signature revealed that only low-risk group would benefit from radiotherapy. Then, a nomogram combining 5-miRNA signature with clinical variables to predict radiotherapy response was constructed. The analysis of 108 target genes of these miRNAs revealed some potential mechanisms in HNSCC radiotherapy response for future investigations. In conclusion, the 5-miRNA signature-based nomogram is useful in predicting radiotherapy response in HNSCC and might become a promising tool to optimize radiation strategies.
radiotherapy may pose some post-therapy burdens to these patients and affect their quality of life [3,4]. However, it remains unsolved to identify patients insensitive to radiotherapy in real-life clinical practice. Specially, novel methods that can reliably predict individual response to radiotherapy are much warranted.
MicroRNAs (miRNAs), the evolutionarily conservative single-stranded noncoding RNAs, exert inhibitory effects on protein-coding genes via targeting the 3′-untranslated region (UTR) of a target gene, followed by subsequent induction the target degradation of messenger RNAs (mRNA) or suppression of protein translation at the posttranscriptional level. Recently, several studies focused on the application of miRNAs in the evaluation of radiotherapy response, and a close link has been observed among them in HNSCC. It was reported that an upregulation of miR-494-3p was associated with improving response to radiotherapy [5], whereas miR-196a expression was associated with radio-resistance and poorer OS in HNSCC [6].
In our study, a 5-miRNA signature was developed from a large database of The Cancer Genome Atlas (TCGA) to serve as a reliable biomarker for predicting radiotherapy response in HNSCC. Likewise, the biological function of target genes of the 5-miRNA signature was investigated. In addition, a nomogram which could be put into clinical practice was established.

Data acquisition
Normalized miRNA sequencing data (level 3, reads per kilobase per million mapped reads) as well as medical records of each patient during follow-up were downloaded/acquired from TCGA database (https://cancergenome.nih.gov). "DESeq," a package of R software (version 3.3.2, R Foundation for Statistical Computing, Vienna, Austria), was used in the normalization of miRNA-seq data and to make comparisons between normal mucosa and HNSCC. As a consequence, 56 differential miRNAs with the fold change were two times or more, and significant if FDR (adjust P value) < 0.01 was identified. Clinical XML files of each patient were summarized into a .txt document using R package "XML." Patients with overall survival (OS) <1 month were excluded to avoid the impact of unrelated causes of death.

Establishment of 5-miRNA signature with LASSO regression model
The expression level of miRNA was defined as log2 reads per million of total aligned miRNA reads. Based on the overall mortality, X-tile (version 3.6.1, Yale University School of Medicine, New Haven, CT, USA), an automatic program for calculating cutoffs via Kaplan-Meier survival analysis and log-rank test [7], was used, and the cutoff of these differential miRNAs was estimated. According to the calculated cutoff, status of each miRNA was assigned as 0 (normalized miRNA expression lower than cutoff) or 1 (normalized miRNA expression higher than cutoff).
In the training set, with the R package "glmnet," we used the least absolute shrinkage and selection operator method (LASSO, the classical and modified method in Cox regression analysis of high dimensional data [8]) to pick out the most valuable predictable miRNAs and construct a signature which can stratify the patients into high-risk and low-risk groups.

Statistical analysis
Time-dependent receiver operating characteristic (ROC) curve, a graphical plot that illustrates the diagnostic value of binary classifier system [9], was performed via R package "pROC." Youden's index = max (1-(sensitivity + specificity)). Kaplan-Meier survival analysis was performed to investigate potential risk factors for overall mortality with log-rank test. A two-tailed P-value < 0.05 was considered as statistically significant for all analyses executed. Statistical analysis was carried out using SPSS version 16.0 software (SPSS Inc., Chicago, IL, USA).
A nomogram was established in combination with age, T staging, N staging, 5-miRNA expression, and survival, and Cox regression analysis was performed using SPSS software and R package "rms." The calibration plot, in which the 45-degree line was defined as the idealized model, was also performed using "rms."

Results
Differential miRNAs identification and cutoff estimation miRNA array profiles and corresponding clinical records for patients with HNSCC were obtained from the TCGA L. Chen et al. A MiRNA Nomogram to Predict HNSCC RT Response database. A total of 553 samples, consisting of 509 carcinomas and 44 normal mucosa specimens, from 509 patients with HNSCC were employed. Among them, 307 patients received external beam radiotherapy after surgical operation, and the radiation dose ranged from 11 Gy to 73 Gy. After normalization, expressions of miRNAs in normal specimens and carcinoma specimens were compared to help overcome the confounding effect of tumor purity, and 56 differential miRNAs (|fold change| ≥2 and FDR < 0.01) were identified ( Figure S1). Using X-tile, the cutoff of each miRNA from these 56 differential miR-NAs was calculated in combination with the OS with Kaplan-Meier survival analysis and log-rank test. As a result, the cutoff expression of miR-99a, miR-31, miR-410, miR-424, and miR-495 is 8.5, 5.3, 2.3, 7.5, and 2.3, respectively. The miRNA status was defined as 0 if the miRNA expression level is lower than estimated cutoff and defined as 1 if higher.

Establishment of a 5-miRNA signature for predicting clinical outcomes
Three hundred and seven patients receiving radiotherapy were randomly divided into two groups, including a training set (n = 154) and a validation set (n = 153) ( Table 1). In order to predict these patients' radiotherapy responses, a formula of combined miRNAs was established using LASSO regression model in the training set, where MiR score from 5-miRNA signature = miR-99a status × (−0.5289) + miR-31 status × 0.4797 + miR-410 status × 0.8658 + miR-424 status × 0.5513 + miR-495 status × 0.3789 (Fig. 1A,B), and these five miR-NAs linked closely to clinical outcomes of HNSCC RT patients (all P < 0.05; Fig. 1C). A cutoff of 1.2201 was calculated using Youden's index, while favorable sensitivity as well as specificity was shown (62.4% and 75%, respectively). Poorer prognoses were observed in high-risk patients whose MiR scores were higher than 1.2201, while patients with MiR scores lower than 1.2201 were with favorable prognoses ( Fig. 2A, left panel). According to time-dependent ROC analysis, the prognostic accuracy of the five miRNA signatures is 0.784 at 3 years and 0.736 at 5 years, respectively. It was revealed that the 5-year survival rate in lowrisk group reached 73%, while significantly lower survival rate (30%) was shown in high-risk group (hazard ratio [HR] 3.65, 95%CI 2.46-8.16; P < 0.0001; Fig. 2A).
Prediction value of this 5-miRNA signature was further evaluated in the validation set and the total set of all patients who received radiotherapy. The 3-year and 5-year prognostic accuracy of these miRNA signatures were 0.705 and 0.672 in the validation set, and 0.743 and 0.699 in the total set, respectively. Similarly, significantly higher 5-year survival rates were observed in low-risk groups, in comparison with high-risk ones, both in the validation set and in the total set ( Fig. 2B; Figure S2).
The prognostic values of the 5-miRNA signature in HNSCC patients with/without RT A proportion of patients with HNSCC may be radioresistant, instead of benefiting from radiotherapy, some of them even suffered from its toxicities [2]. Identification of these unfavorable patients may help refining radiation therapy strategy. We evaluated the prognostic value of the 5-miRNA signature in a total of 509 patients with HNSCC; patients in low-risk group predicted by our 5-miRNA signature generally had better 5-year OS than those in high-risk group (Fig. 3A). When further stratified patients according to their clinic-pathological risk factors, such as age, T stage, N stage, and clinical stage, the 5-miRNA signature remained as an independent and statistically significant prognostic predictor, except for Stage I-II patients, which was mainly due to the small proportion. (Figures S3 and  S4; Fig. 3B).
To further evaluate the predictive value of the 5-miRNA signature in HNSCC with radiotherapy, patients were divided into different groups based on the 5-miRNA signature, and the responses to radiotherapy were tracked. It was revealed that in different subgroups, such as patients with various T stage and N stage (which was often used as the basis to make decision on radiotherapy), the responses to radiotherapy differed in distinguished 5-miRNA signature-based groups. In high-risk group, patients did not benefit from radiotherapy (all P > 0.05; Fig. 3C, upper panels), while better clinical outcomes of radiotherapy were observed in low-risk patients (all P > 0.05; Fig. 3C, lower panels). These results suggested that high-risk patients tend to be more insensitive to radiotherapy when compared with low-risk ones.
As HPV-positive HNSCC tumors and laryngeal carcinomas are already known to show superior response to standard therapy which typically including radiotherapy and better survival, so we excluded the HPV+ HNSCC tumors (HPV positivity was defined by p16 staining) and laryngeal carcinomas (72 HPV+ and larynx tumors in the low-risk group and 33 HPV+ and larynx tumors in the high-risk group) and then repeated our analysis in the rest HNSCC RT patients (n = 202). As a consequence, the 5-miRNA signature remained as an independent prognostic factor for the specific HNSCC RT patients ( Figure  S5A). Likewise, higher survival rate was observed in the low-risk group with RT in comparison with those without RT ( Figure S5C). Therefore, we can correctly draw the conclusion the low-risk group produced by our original 5-miRNA signature did not overrepresent for HPV-positive tumor and laryngeal carcinoma ( Figure S5). Moreover, with the exclusion of HPV-positive HNSCC and laryngeal carcinomas, our 5-miRNA signature is also predictive for overall survival by RT treatment.

Establishment of a nomogram to predict the overall survival in HNSCC with RT
In order to construct a visual scale to predict the clinical outcomes of patients with HNSCC receiving radiotherapy, multivariate Cox regression analysis was performed to evaluate the relevant variables in predicting overall survival, including the MiR score, age, and TNM staging. It was revealed that except for MiR score, only N stage was correlated with the prognosis of patients with HNSCC receiving radiotherapy (Fig. 4A). Nevertheless, age and T stage were traditionally considered as risk factors for adjuvant radiotherapy in clinical practice [10,11]. As a result, AUC indicates area under curve. The AUC was assessed at 3 and 5 years, and the P value was acquired through log-rank test. We calculated P values using the log-rank test.
A MiRNA Nomogram to Predict HNSCC RT Response L. Chen et al. The 5-miRNA signature was capable of predicting the OS in different clinicopathological factors. (C) Kaplan-Meier survival in 5-miRNA signaturebased risk group according to patients with/without RT, who were stratified by different clinical characteristics. We calculated P values using the log-rank test. We found in different subgroups, only low-risk group could get benefit in receiving RT, but high-risk group had similar survival with or without RT.  we constructed a nomogram to predict the clinical outcomes of 307 HNSCC patients with radiotherapy, and variables including age, T staging, and N staging were taken into consideration (Fig. 4B).
Nomogram is a simple tool in prognosis evaluation. Firstly, a vertical line could be drawn from each factor to the "Points" line to get a score in the nomogram. Then, each factor's score was added to acquire the total points, which could be used to estimate survival probability. For example, consider a 62-year-old T2N1MX HNSCC patient with MiR score equal to 0.8 who received radiotherapy, the total points for this patient is 0 + 9 + 14 + 45 = 65. According to the nomogram, the estimated 3-year OS is approximately 70-75%, while 5-year OS is approximately 55-60%. The accuracy of the nomogram is 0.81 and 0.731 at 3 years in the training set and validation set, respectively (Fig. 4D). According to the calibration plot of both sets, the prediction accuracy of the nomogram was comparable to the actual situation (Fig. 4C).

Target genes prediction and bioinformatics analysis of the 5-miRNA signature
In order to further investigate the cellular biological functions and potential mechanisms of the 5-miRNA signature, we predicted target genes of the five miRNAs from four databases, including miRanda, mirwalk, TargetScan, and PicTar. A total of 738 overlapped genes were observed.
To narrow down the range and to ensure that these predicted genes do play roles in patients with HNSCC receiving radiation therapy, we overlapped these 738 predicted genes with 3968 HNSCC relative genes and 4378 radiotherapyrelated genes which were both achieved from GeneCards. At last, we got 118 target genes, which might partially explain the potential mechanisms of the five miRNAs in affecting the prognosis of HNSCC patients with radiotherapy.
David Bioinformatics database (https://david.ncifcrf.gov) was used to execute Gene Ontology (GO) analysis and pathway analysis. The results are listed in Table 1. GO analysis consisted of three domains, including biological  process (BP), molecular function (MF), and cellular component (CC). It was found that these genes were mainly associated with gene transcription and cell proliferation in terms of BP, and involved in MF such as protein-binding and protein tyrosine kinase activity and played a role in constructing cytoplasm and cytosol in CC (Fig. 5A-C; Table  S1). Pathway analysis indicated that these genes mainly enriched in malignancy-related pathways (Table S2), including Pathways in cancer, Cell cycle, Melanoma, Proteoglycans in cancer, microRNAs in cancer, and so forth (Fig. 5D). In addition, we assumed that genes such as FGF9, FGF2, FGFR1, FGFR3, HBEGF, IGF1, TGFB2, VEGFA, and MAP2K1 might play crucial roles in HNSCC. However, our target gene prediction findings remain as speculative, awaiting further validation.

Discussion
Although radiation therapy has been widely used in the treatment of HNSCC, and about 75% patients benefit from it, a subset of patients are still resistant to radiotherapy [2]. For the radioresistant subset of patients, radiotherapy has no benefit and may also introduce undesirable side effects and compromises their quality of life.
A number of researchers were trying to figure out methods in predicting radiotherapy effects in HNSCC. In a retrospective study of Aaron et al. [12], which focused on the relationship between HNSCC body composition and radiotherapy, pre-RT body mass index (BMI) played roles in predicting RT response. Likewise, HPV-positive tumors are more radiosensitive, when compared with HPV-negative ones [13]. Recent study [14] showed the expression of 13 genes was closely correlated with the postradiotherapy prognosis of HPV-negative HNSCC. However, these studies only illustrated the possible factors relevant to radiotherapy effects, and predictive biomarkers for clinical practice were still lacking.
In recent years, the relation between miRNAs and radiotherapy responses aroused much concern in cancer. For example, Liu et al. [15] figured out a 5-miRNA

Supporting Information
Additional supporting information may be found in the online version of this article: Figure S1. HEATMAP 0F 56 miRNA. Figure S2. Mir score by the 5-miRNA signature, timedependent ROC curves and Kaplan Meier survival in total adiotherapy sets. Figure S3. The 5-miRNA signature was independent. No significant difference between different groups in every clinical characteristic. Figure S4. The prognostic values of the 5-miRNA signature for HNSCC patients with/without RT in different clinical stage. Figure S5. The prognostic values of the 5-miRNA signature in HNSCC patients(excluding HPV+ HNSCC and laryngeal carcinomas) with/without RT. Figure S6. X-tile plots of other 5 miRNA from Liu's study. Table S1. Top 10 Gene oncology terms in 3 domains of the predicted genes. Table S2. Pathway analysis of predicted genes.