Nomograms for predicting long-term overall survival and cancer-specific survival in patients with major salivary gland cancer: a population-based study

In this study, we aimed to develop and validate nomograms for predicting long-term overall survival (OS) and cancer-specific survival (CSS) in patients with major salivary gland cancer (MSGC). The nomograms were developed using a retrospective cohort (N=4218) from the Surveillance, Epidemiology, and End Results (SEER) database and externally validated using an independent cohort (N=244). We used univariate and multivariate analyses and the cumulative incidence function to select independent prognostic factors for OS and CSS. The index of concordance (c-index) and calibration plots were used to estimate the nomograms’ predictive accuracy. The median follow-up period was 34 months (range, 1–119 months). Of the 4218 patients with MSGC, 1320 (31.3%) had died by the end of follow-up, and 883 (20.9% of the cohort) had died of MSGC. The OS nomogram, which had a c-index of 0.817, was based on nine variables: age, sex, tumor site, tumor grade, surgery performed, radiation therapy, and the TNM classifications (T, N, and M). The CSS nomogram, which had a c-index of 0.829, was based on the same nine variables plus race. External validation c-indexes were 0.829 and 0.807 for OS and CSS, respectively. Based on the SEER database, we have developed nomograms that predict five- and eight-year OS and CSS for patients with MSGC with good accuracy. These nomograms will help clinicians customize treatment and monitoring strategies for patients with MSGC.


INTRODUCTION
Major salivary gland cancer (MSGC) accounts for approximately 3-6% of all head and neck malignancies [1,2]. Globally, the overall annual incidence of MSGC is 1.195 per 100,000 [2]. MSGC encompasses a range of histologies, of which the most common is mucoepidermoid carcinoma, followed by adenoid cystic carcinomas and acinar cell neoplasms, and the most common site of involvement is the parotid gland. Because of this histological heterogeneity and diversity of biological behavior, surgery-related issues and the role of radiation therapy remain unsettled [3,4], and the 5-year overall survival (OS) rates of patients with MSGC vary widely, ranging from 32% to 74% [5,6]. In addition, because MSGC is rare, most reports of clinical prognostic factors, OS, and cancer-specific survival (CSS) come from small single-institution retrospective studies [5][6][7][8]. To our knowledge, no such studies have been performed using data from a national database. The Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute, which collects and publishes cancer incidence and survival data from population-based cancer registries, provides such a database free of charge [9].
Currently, treatment strategies and prognostic predictions for patients with MSGC are based on the American Joint Committee on Cancer (AJCC) TNM staging system (7th edition), which is recommended by the National Comprehensive Cancer Network (NCCN) guidelines [10]. However, disease staging (I to IV) based on TNM status does not take into account other factors that significantly affect the survival of patients with MSGC, such as patient characteristics (age, race, and sex), tumor variables (tumor site, perineural invasion, lymphovascular invasion, and tumor grade), and treatment modality (surgery performed and radiation therapy) [5,6,8,11,12]. Therefore, this system might be inadequate for customized therapeutic decision-making and prognosis prediction, and a new tool is required to address this issue.
Nomograms, which are reliable statistical predictive tools, can estimate individual patient survival with higher accuracy than the AJCC TNM staging system by incorporating numerous factors, including the TNM elements [13]. Nomograms have been widely developed and applied in a variety of cancers [14][15][16], and nomograms with good performance have been introduced into NCCN guidelines [17]. Moreover, several nomograms have been used to assist clinicians in making treatment and follow-up decisions in patients with head and neck cancers, including squamous cell carcinoma [18][19][20][21], adenoid cystic carcinoma [22], and nasopharyngeal cancer [23]. Two studies have reported the utility of nomograms for predicting the outcomes of patients with MSGC [5,7], but both were based on a single-population retrospective cohort and did not include patients who did not undergo surgery.
In the present study, on the basis of multi-institution and multi-population data from the SEER database, we aimed to develop the first practical MSGC nomograms that predict long-term OS and CSS. These nomograms can help clinicians design customized treatment and management strategies for patients with MSGC.

Nomograms construction
On univariate analysis, all variables other than tumor laterality were found to be statistically associated with OS. Multivariate analyses revealed that nine variables were independent prognostic factors for OS in patients with MSGC: age, sex, tumor site, tumor grade, surgery performed, radiation therapy, and the TNM classifications (Table 2). These variables were used to develop the nomogram for predicting five- and eight-year OS (Figure 1A).
The five- and eight-year cumulative incidences of death from MSGC and from other causes in the training cohort are presented by clinicopathological variable in Table 3. Gray's test and a multivariate competing risks model revealed that ten variables were independent prognostic factors for CSS in patients with MSGC: age, race, sex, tumor site, tumor grade, surgery performed, radiation therapy, and the TNM classifications. Therefore, a second nomogram predicting five- and eight-year CSS was created using these variables (Figure 1B).

Nomograms validation
In the present study, we performed both internal and external validation of the nomograms. As shown in Table 4, in the internal validation cohort (SEER cohort), the models showed good accuracy, with c-indexes of 0.817 (95% confidence interval [CI], 0.806-0.828) and 0.829 (95% CI, 0.817-0.841) for OS and CSS, respectively. External validation using the FMMU cohort showed that the c-indexes for the OS and CSS nomograms were 0.829 (95% CI, 0.783-0.869) and 0.807 (95% CI, 0.761-0.853), respectively. The internal and external calibration curves approached the 45-degree line of ideal agreement, indicating that the nomograms for OS and CSS in MSGC were generally well calibrated (Figure 2 and Figure 3).

DISCUSSION
MSGC is a rare but histologically diverse entity that comprises 23 distinct primary malignant salivary gland tumors [24]. Although exposure to ionizing radiation has been reported as a potential causative factor in MSGC development, the specific underlying etiological factors remain unclear [25]. Therefore, establishing treatment-related decisions and follow-up strategies for patients with MSGC is challenging but urgently needed. The nomogram is a statistical tool that can meet these requirements. To date, there is no well-designed nomogram for MSGC based on a national, population-based database. Using the population-based SEER database with a median follow-up of 34 months, we developed nomograms for predicting five- and eight-year OS and CSS in individual patients with MSGC by applying competing risks analysis.
To ensure the predictive accuracy of the nomograms, we used the Kaplan-Meier method and Cox's proportional hazards regression model to select factors for the development of the OS nomogram. A competing risks model was used to select factors for the development of the CSS nomogram. In addition, c-indexes and calibration plots were applied to estimate the predictive accuracy of the models by performing internal and external validation. All nomograms had c-indexes higher than 0.8, and the calibration plots showed good agreement between predicted and observed survival. Compared to the widely accepted TNM staging, our nomograms are not only easy to use, but also provide a quantified prognosis for an individual patient. For example, consider two patients with T3N0M0 cancer: case A) a 35-year-old woman diagnosed with moderately differentiated parotid gland cancer, who underwent both surgery and radiotherapy, and case B) a 60-year-old man diagnosed with undifferentiated submandibular gland cancer, who underwent radiotherapy only. First, a vertical line is drawn from every factor to the "Points" line in the nomogram. Second, all the "Points" are summed to obtain the "Total Points", and a vertical line is drawn from "Total Points" to the "OS" and "CSS" lines to obtain the corresponding survival probabilities. Using the nomograms described in the present study, the patients in cases A and B have eight-year OS probabilities of 86% and 30%, respectively, and eight-year CSS probabilities of 94% and 45%, respectively. However, according to TNM staging [26], both patients would be classified as stage III, which would imply identical outcomes.
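The two-step reading procedure described above (points per factor, then total points mapped to a survival probability) amounts to a table lookup followed by interpolation along the survival axis. The following is a minimal sketch of that procedure only. The point values, factor levels, and survival scale are hypothetical placeholders chosen for illustration; they are not taken from Figure 1 and must not be used for prediction. The function names are likewise assumptions.

```python
from bisect import bisect_right

# HYPOTHETICAL point table and survival scale, for illustration only.
# These numbers are NOT the values of the published nomograms.
POINT_TABLE = {
    "age": {"<50": 0, "50-69": 40, ">=70": 100},
    "grade": {"I": 0, "II": 20, "III": 50, "IV": 60},
    "surgery": {"yes": 0, "no": 45},
}
# (total points, survival probability) pairs, sorted by total points.
SURVIVAL_SCALE = [(0, 0.95), (100, 0.70), (200, 0.30)]


def total_points(patient, point_table):
    """Step 1: look up the points for each factor level and sum them."""
    return sum(point_table[factor][level] for factor, level in patient.items())


def read_survival(total, scale):
    """Step 2: map total points to survival by linear interpolation.

    Totals outside the scale are clamped to its end points.
    """
    xs = [points for points, _ in scale]
    if total <= xs[0]:
        return scale[0][1]
    if total >= xs[-1]:
        return scale[-1][1]
    i = bisect_right(xs, total)
    (x0, y0), (x1, y1) = scale[i - 1], scale[i]
    return y0 + (y1 - y0) * (total - x0) / (x1 - x0)


example_patient = {"age": "50-69", "grade": "IV", "surgery": "no"}
example_total = total_points(example_patient, POINT_TABLE)
example_survival = read_survival(example_total, SURVIVAL_SCALE)
```

In practice the point values and the survival axis would be read from the fitted model behind Figure 1A or 1B; the sketch only shows why two patients with the same stage can receive different predictions once other factors contribute points.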
In the present study, several clinical and pathologic characteristics were shown to be independent prognostic factors for OS and CSS in patients with MSGC, including age, sex, tumor site, tumor grade, surgery performed, radiation therapy, and the TNM classifications, which is consistent with previous reports [5,6,8,11,27,28,29,30]. Apart from these, comorbidity and postoperative complications, which were not included in the present study, have also been shown to be prognostic factors [11,30]. However, data from previous studies have revealed that a higher incidence of comorbidity and complications is significantly associated with advanced age [30,31]. Thus, this limitation may have been partly compensated for by the effect of advanced age on mortality in the models.

Interestingly, for CSS, we found that both the 5- and 8-year cause-specific death (CSD) rates of patients with MSGC who did not receive radiotherapy (15.2% and 15.7%, respectively) were lower than those of patients who received radiotherapy (22.5% and 23.9%, respectively). In contrast, radiotherapy improved the 5- and 8-year OS of patients with MSGC. This may be explained as follows. First, of the 1320 patients who died, 437 (33.1%) died of causes other than MSGC, and the 5- and 8-year mortality rates from causes other than MSGC decreased with radiotherapy from 8.7% to 8.3% and from 9.9% to 9.3%, respectively. Second, radiotherapy was primarily used in patients with higher-grade disease or as a treatment option for advanced inoperable salivary gland tumors [6,21].
The present study has several merits. First, our nomograms have excellent accuracy, with c-indexes > 0.80, which compares very favorably with those of other widely accepted nomograms in other cancers [15,18,19,20,21], whose c-indexes range from 0.60 to 0.80. Moreover, the variables used in these nomograms are easily available in clinical practice. Finally, in contrast to previous MSGC nomograms, the present nomograms were developed on the basis of information from a large population-based database and were externally validated using another independent cohort (FMMU cohort).
Despite these merits, the present study also has some limitations. First, our nomograms were constructed using retrospective data, which introduces the risk of selection bias. In addition, data on some important clinicopathological variables were incomplete, reducing the number of eligible cases. Second, although the quality of SEER database information is considered high, TNM classification information was not available until 2004, and the prognostic factor lymphovascular invasion [32] was not included until 2010. Therefore, we could not predict survival beyond eight years, and lymphovascular invasion was not included in the nomograms; we plan to address these limitations in a future study. Third, data regarding tumor recurrence, chemotherapy [33], and perineural invasion [32], which are important prognostic factors for MSGC, are not available in the SEER database. Therefore, we could not collect and analyze these factors or develop nomograms for predicting locoregional control.
In summary, based on a large population-based cohort, we have developed and externally validated two clinically useful nomograms that, for the first time, provide objective five- and eight-year OS and CSS estimates for patients with MSGC. These nomograms performed well and may aid patient counselling, clinical decision-making, and the development of follow-up strategies for the management of MSGC.

SEER cohort
We identified information on the clinical and pathologic characteristics of all patients diagnosed with major salivary gland carcinoma between 2004 and 2013 from the SEER program of the National Cancer Institute, which is a national collaboration program [9]. The flow diagram of data selection is shown in Figure 4. Briefly, the basic inclusion criteria were as follows: the primary tumor site was a major salivary gland, including the parotid, submandibular, and sublingual glands; malignant behavior; and age older than 15 years at diagnosis. To improve the accuracy and homogeneity of the SEER cohort, the final inclusion criteria were as follows: diagnostic information confirmed microscopically and not from a death certificate or autopsy only; active follow-up; and patient and tumor information (age, race, sex, marital status, tumor site, tumor laterality, tumor grade, surgery performed, radiation therapy, and TNM classifications) known and exact. A total of 4017 patients were excluded because of indefinite follow-up information or death less than 1 month after treatment. After applying the screening criteria, 4218 patients were included in the final SEER cohort.
For the analyses, age was transformed into a categorical variable on the basis of recognized cutoff values. Race classification based on the SEER Program was as follows: white, black, and other (American Indian/Alaskan Native and Asian/Pacific Islander). Tumor grade included I (well differentiated), II (moderately differentiated), III (poorly differentiated), and IV (undifferentiated). All patients' TNM classifications were staged according to the 7th edition of the AJCC Staging Manual [26].

FMMU cohort
The FMMU cohort comprised 256 patients who were histologically diagnosed with major salivary gland carcinoma at the Department of Oral and Maxillofacial Surgery, School of Stomatology, Fourth Military Medical University (FMMU) in China. All inclusion criteria were identical to those used in the SEER cohort except that all patients were Chinese and received surgery as the primary treatment. Twelve patients were excluded: eight because of indefinite follow-up information and four because they died less than 1 month after surgery. After applying the screening criteria, 244 patients were included in the final FMMU cohort.

Nomograms
The SEER cohort was used to establish the OS and CSS nomograms. OS was defined as the time from diagnosis to death or censoring (if a patient was alive at the last follow-up). The median follow-up time was estimated as the actual patient survival time. The Kaplan-Meier method and log-rank test were used to conduct the univariate prognostic analysis. Variables that were possible prognostic factors (P < 0.001) on univariate analyses were included in the multivariate Cox proportional hazards analysis to identify independent prognostic factors for OS in MSGC (P < 0.001) [34]. Next, the nine independent prognostic factors from the multivariate analyses were used to build a nomogram for five- and eight-year OS in patients with MSGC by employing a stepwise selection method in the R software.
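As an illustration of the univariate step, the Kaplan-Meier product-limit estimator can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code used in the study (which was run in SPSS and R); the function name and the toy follow-up data in the usage note are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time for each patient (for example, months)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

For a toy cohort with follow-up times `[6, 12, 12, 30, 60, 96]` and event indicators `[1, 1, 0, 1, 0, 1]`, the estimate steps down at months 6, 12, 30, and 96. Patients censored at a given time are counted as at risk for deaths occurring at that same time, which is the usual convention.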
When constructing the competing risks nomogram for MSGC, death from MSGC and death from other causes were considered two different event types. CSS was defined as the time from diagnosis to death attributed to MSGC or censoring (if a patient was alive at the last follow-up or died from other causes). The cumulative incidence function (CIF) was used to assess the probability of death, and differences were assessed using Gray's test [35]. Variables with P values below 0.001 in the CIF analysis were considered significant independent prognostic factors for CSS. Subsequently, by integrating all the significant independent factors, we developed nomograms to predict five- and eight-year CSS in patients with MSGC via the proportional sub-distribution hazards regression method proposed by Fine and Gray [36], using the "cph" and "step" commands in the R software.
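To make the role of the CIF concrete, the standard nonparametric estimator for competing risks can be sketched as follows: at each event time, the incidence of the cause of interest increases by the probability of being event-free just before that time multiplied by the cause-specific hazard. This is a minimal sketch under stated assumptions and is not the study's R code; the function name and the cause coding (0 = censored, 1 = death from MSGC, 2 = death from other causes) are assumptions made for illustration.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence function for one cause.

    times  : follow-up time for each patient
    causes : 0 = censored, 1 = death from MSGC, 2 = death from other causes
             (this coding is an assumption of the sketch)
    Returns a list of (time, cumulative incidence) at each time where a
    death from the cause of interest occurs.
    """
    at_risk = len(times)
    event_free = 1.0  # probability of no death from any cause so far
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        at_t = [c for u, c in zip(times, causes) if u == t]
        d_any = sum(1 for c in at_t if c != 0)
        d_cause = sum(1 for c in at_t if c == cause_of_interest)
        if d_cause:
            # Increment uses the event-free probability just BEFORE time t.
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        if d_any:
            event_free *= 1.0 - d_any / at_risk
        at_risk -= len(at_t)
    return curve
```

Unlike one minus a Kaplan-Meier estimate that treats deaths from other causes as censoring, the cause-specific incidences computed this way add up to the all-cause probability of death, which is why the CIF is preferred when competing events are common.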

Nomograms validation
The SEER and FMMU cohorts were used to estimate the predictive accuracy of the models by performing internal and external validation, respectively. Both internal and external validation were assessed by the c-index and calibration plots, and were performed using bootstrapping with 1000 resamples and ten-fold cross-validation, respectively. The c-index, proposed by Harrell [37], quantifies the discrimination between predicted and actual outcomes, with values ranging from 0.5 (no discrimination) to 1.0 (perfect discrimination). In addition, a marginal estimate versus model plot was used as the calibration curve, representing the agreement between nomogram-predicted and actual survival.
All statistical analyses were conducted using the SPSS software version 19.0 (SPSS Inc., Chicago, IL, USA) and the R software version 3.3.0 (R Foundation for Statistical Computing, Vienna, Austria; www.R-project.org) with the R packages rms and cmprsk. All calculated P values were two sided, and P < 0.001 was considered statistically significant.
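The c-index can be illustrated with a direct pairwise computation: among all pairs in which the patient with the shorter follow-up is known to have died, it is the fraction of pairs in which that patient also received the higher predicted risk. The following is a minimal sketch assuming right-censored data and a single risk score per patient; it is not the implementation used in the study, and the function name is hypothetical. Ties in predicted risk are counted as one half, a common convention.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       : follow-up time for each patient
    events      : 1 if the patient died, 0 if censored
    risk_scores : model output where a HIGHER score means HIGHER risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: patient i died strictly before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; c-index is undefined")
    return concordant / comparable
```

With this definition, a model that assigns every patient the same score obtains 0.5 and a model that ranks all comparable pairs correctly obtains 1.0, matching the range given above. The quadratic loop is adequate for a cohort of a few thousand patients but is written for clarity rather than speed.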

Ethics statement
Our study was approved by the Fourth Military Medical University Ethical Committee. Informed patient consent was not required for data released by the SEER database.