Clinical characteristics and prognosis of basaloid squamous cell carcinoma of the lung: a population-based analysis

Background This study analyzed the clinical features and prognosis of basaloid squamous cell carcinoma of the lung (BSC) and constructed a nomogram to predict patient prognosis. Methods Information on patients with pure BSC between 2004 and 2015 was obtained from the Surveillance, Epidemiology, and End Results database, evaluated, and compared with data on patients with lung squamous cell carcinoma (SCC), lung large cell carcinoma (LCC) and lung adenocarcinoma (LAC). We then used univariate and multivariate analyses to identify independent prognostic factors in patients with BSC and constructed a nomogram to predict their prognosis. Results A total of 425 patients diagnosed with BSC were enrolled. The mean survival time of BSC patients was longer than that of patients with SCC, LCC or LAC. Compared with SCC, BSC differed significantly in grade (P < 0.001), total stage (P < 0.001), T stage (P < 0.001), N stage (P < 0.001), M stage (P < 0.001), surgery (P < 0.001), radiotherapy (P < 0.001), and chemotherapy (P < 0.001); BSC also had significantly different clinical characteristics from LCC and LAC. Univariate and multivariate survival analyses showed that age (P < 0.001), T stage (P < 0.001), N stage (P = 0.009), M stage (P < 0.001), and surgery (P < 0.001) were independent prognostic factors of BSC. Survival after lobectomy was significantly better than after sublobar resection, with an OR of 0.389 (0.263–0.578). Based on the results of multivariate analysis, we constructed a nomogram with a C-index of 0.750 (95% confidence interval). The calibration curves based on nomogram scores indicated that the nomogram could accurately predict the prognosis of patients. Conclusions BSC had unique clinical and prognostic features. T stage, N stage, M stage, age, and surgery were independently associated with overall survival (OS). Lobectomy was a relatively ideal choice for patients with BSC. The nomogram effectively predicted OS at 1, 3, and 5 years.


INTRODUCTION
Lung cancer is the most commonly diagnosed cancer worldwide and the leading cause of cancer deaths. In 2018, it accounted for 2.1 million new cases and 1.8 million deaths worldwide. Squamous cell carcinoma is one of the major, well-studied histological subtypes of lung cancer (Allemani et al., 2018; Bray et al., 2018). However, basaloid squamous cell carcinoma of the lung (BSC), a rare subtype of lung squamous cell carcinoma, is less studied, and its clinical features and prognostic factors remain unclear.
BSC is an uncommon histological variant of lung cancer composed of cells exhibiting cytological and architectural features of both squamous cell lung carcinoma and basal cell carcinoma, with squamous cell components making up less than 50% (Brambilla et al., 2001; Wang et al., 2011; Crapanzano et al., 2011). According to previous research, BSC accounts for 3.9%-5.2% of all lung squamous carcinomas and has unique clinical characteristics such as a high rate of metastasis and death. In addition, BSC was once described as having features overlapping with those of large cell carcinoma (LCC) (Travis et al., 2015; Vignaud, 2016). To date, no study has reported how the clinical features of BSC differ from those of LCC and other non-small cell lung cancers (NSCLC). In this study, we compared the clinicopathological characteristics of related lung cancer subtypes in detail and then used univariate analyses and Cox proportional hazards regression to identify risk factors affecting overall survival (OS) in BSC. Based on the results of the survival analysis, we further developed a nomogram to better predict the prognosis of patients with BSC.

Ethics statement
We obtained permission to access the data files of the public Surveillance, Epidemiology, and End Results (SEER) database. Our research was therefore exempted from review by the Ethics Committee of Suzhou Municipal Hospital.

Data extraction
Data on all patients with primary pure basaloid squamous cell carcinoma (ICD-O-3: 8083/3) diagnosed between 2004 and 2015 were identified in the SEER database (http://seer.cancer.gov/) using the SEER*Stat software (v8.3.5, https://seer.cancer.gov/seerstat/). Exclusion criteria were: (1) pathological types other than pure basaloid squamous cell carcinoma; (2) unknown differentiation, stage, or treatment method; and (3) a history of tumors at other sites (Fig. 1). We extracted and analyzed data on patients' race, sex, age, grade, TNM stage, surgery type, radiotherapy, and chemotherapy. The total stage, T stage, N stage, and M stage of all patients were manually restaged according to the 8th edition of the American Joint Committee on Cancer (AJCC) lung cancer staging project.

Statistical analysis
Chi-square tests were used to compare multi-class variables such as race between basaloid squamous cell carcinoma and other types of lung carcinoma. Rank sum tests were used to compare two-category or ordinal variables. Age, a quantitative variable, was compared using analysis of variance. In the analysis of prognostic factors for BSC, we used Kaplan-Meier analyses and log-rank tests for univariate analyses and Cox models for multivariate analyses. These analyses were performed using SPSS (version 25) software (SPSS, Chicago, IL, USA). All tests were two-sided, and P < 0.05 was considered statistically significant. The R language (R Core Team, 2018) was used to generate and validate the nomogram; the main packages used were rms and Hmisc (Sun et al., 2017; Wang et al., 2018).
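To make the Kaplan-Meier step concrete, the product-limit estimator can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time of each patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up times in months (not SEER data)
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at observed death times.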

Comparison of clinical features between BSC and other types of NSCLC
After screening, we enrolled 425 patients with BSC of pure type and 90006, 6997 and 160638 patients with SCC, LCC and LAC, respectively. Survival curves indicated that the survival of BSC patients was significantly better than that of SCC, LCC and LAC patients (Fig. 2 and Fig. S1). As shown in [...] were stage II, 23.8% were stage III, and 19.3% were stage IV. Compared with SCC and LAC patients, BSC patients had significantly fewer well-differentiated tumors (P < 0.001), less N+ disease (P < 0.001), fewer distant metastases (P < 0.001), and lower rates of radiotherapy (P < 0.001) and chemotherapy (P < 0.001), but a higher rate of radical surgical resection (P < 0.001). By contrast, only LCC patients had more undifferentiated tumors (P < 0.001), together with much lower rates of surgery (P < 0.001), radiotherapy (P < 0.001) and chemotherapy (P < 0.001) (Table 1).

Analyses of BSC prognostic factors
We used univariate analyses to investigate possible prognostic factors in patients with BSC. As shown in Table 1, age (P < 0.001), grade (P < 0.001), total stage (P < 0.001), T stage (P < 0.001), N stage (P < 0.001), M stage (P < 0.001), surgery (P < 0.001), radiotherapy (P < 0.001) and chemotherapy (P = 0.013) were significantly associated with the prognosis of BSC. In other words, older age, poorer differentiation, and a higher total stage were associated with a worse prognosis. In addition, as shown in Fig. 3A, univariate analyses showed that patients who underwent surgery had better prognoses than patients who did not, and that their prognoses were similar to those of patients receiving other surgical treatments (P < 0.001). In particular, patients with stage I to stage IV disease who underwent lobectomy benefited more than those who underwent sublobar resection (Fig. 3B), both overall (P < 0.0001) and within stage I (P = 0.00016), stage II (P = 0.0012), and stages III and IV (both P = 0.03) (Figs. 3C, 3D and 3E). Sex (P = 0.257) and race (P = 0.077) were not significantly associated with the prognosis of BSC (Table 2). Age (P < 0.001), T stage (P < 0.001), N stage (P = 0.009), M stage (P < 0.001), and surgery (P < 0.001), all statistically significant in univariate analysis, were found to be independent factors in multivariate analyses (Table 2). Multivariate analyses revealed that the older the patient and the higher the TNM stage, the worse the prognosis. Apart from the significant effects of T3 (P = 0.003), T4 (P < 0.001) and N2 (P = 0.001), the remaining T and N stages had similar effects on patient prognosis. Compared with T1 and N0, the odds ratios (ORs) were T2: 1.240 [...] Unlike the results of univariate analysis, multivariate analysis showed that radiotherapy and chemotherapy were not independent prognostic factors.
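The univariate comparisons above rely on the log-rank test. As a rough sketch of how that statistic is formed for two groups (for example, lobectomy versus sublobar resection), the following Python function accumulates observed and expected deaths at each event time. It is not the SPSS implementation used in the study, the function name `logrank_test` and the data are hypothetical, and the P value uses the fact that a chi-square variable with one degree of freedom has upper tail `erfc(sqrt(x / 2))`.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Deaths occurring exactly at time t
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square, 1 degree of freedom
    return chi2, p_value


# Hypothetical survival times in months; 1 = death observed, 0 = censored
group_a = ([1, 2, 3, 4], [1, 1, 1, 1])
group_b = ([10, 11, 12, 13], [1, 1, 1, 1])
chi2, p = logrank_test(*group_a, *group_b)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```

The test compares, at every death time, how many deaths occurred in one group with how many would be expected if both groups shared the same hazard, so it uses the whole follow-up period rather than survival at a single time point.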

Production and inspection of the nomogram
We constructed a nomogram based on the independent predictors of patient outcome identified above (Fig. 4A). From a patient's age, T stage, N stage, M stage, and surgery, the 1- (Fig. 4B), 3- (Fig. 4C) and 5-year (Fig. 4D) survival probabilities can be read off the nomogram. The C-index of this nomogram, which measures its discrimination, was 0.750. The calibration test showed that the 3-year and 5-year survival rates predicted by the nomogram agreed well with the observed 3-year and 5-year survival rates, and the slope of the calibration curve was close to 1.
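For readers unfamiliar with the C-index, the following Python sketch shows one common definition (Harrell's concordance index) for right-censored data. It is an illustration under stated assumptions, not the rms or Hmisc code used in the study; the function name `concordance_index`, the risk scores, and the follow-up data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i died and patient j was
    followed for longer. It is concordant when the patient who died
    earlier also has the higher predicted risk. Tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months of follow-up, death indicator, nomogram-style risk
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.2, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-index of 0.750 means that in about three of four comparable patient pairs the patient with the shorter survival received the higher predicted risk.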

DISCUSSION
We conducted an in-depth analysis of BSC using patient data from the SEER database and found statistically significant differences from SCC, LCC and LAC in terms of race, grade, total stage, T stage, N stage, M stage, surgery, radiotherapy and chemotherapy. We also found that age, T stage, N stage, M stage and surgery were independent factors influencing the prognosis of patients with BSC. We then plotted a nomogram. Calibration showed that the nomogram effectively predicted the 1-, 3- and 5-year survival probabilities of patients, and the nomogram scores effectively discriminated between patients' survival outcomes. BSC is an invasive subtype of squamous cell carcinoma that can be detected in the proximal bronchi (Wang et al., 2011; Kim et al., 2003). Unlike previous studies (Brambilla et al., 1992), we found that, in a population-based cohort, the prognosis of BSC was better than that of SCC, LCC and LAC. Some previous studies reported results opposite to ours (Brambilla et al., 2014; Moro-Sibilot et al., 2008), perhaps because the numbers of cases differed and most of those studies focused on patients who underwent surgery. In addition, according to our results, BSC had a significantly lower TNM stage than other lung cancers.
In this population-based study, patients with BSC and with other types of lung cancer were similar in age. Moro-Sibilot et al. (2008) reported that BSC patients were older than non-BSC patients, whereas Wang et al. (2011) found no statistically significant difference in mean age between BSC (58.6 years) and poorly differentiated squamous cell carcinoma (60.5 years) (P = 0.363). Poorly differentiated tumors were more common in BSC, whereas N+ and M+ disease was less common than in SCC, LCC and LAC. In our study, the 5-year survival rate of BSC patients was close to 17.6%. In other reports, the 5-year survival rate for stage I and stage II BSC was less than 15%, much lower than the 5-year survival rate of 47% for resectable poorly differentiated SCC (Moro et al., 1994). However, Kim et al. (2003) reported no significant difference in median survival between BSC and SCC in patients with stage I disease without lymph node metastasis. Moreover, Moro-Sibilot et al. (2008) reported no difference in prognosis between BSC and poorly differentiated SCC regardless of operative mode. We compared the two groups using survival curves (Fig. S2), which clearly indicated that poorly differentiated BSC had a better 5-year prognosis than poorly differentiated SCC, consistent with the overall comparison in this study. Wang et al. (2011) also reported that BSC and poorly differentiated squamous cell carcinoma had very similar clinical features and no significant difference in survival rates, whereas in our results the survival of poorly differentiated BSC was superior to that of SCC with the same differentiation. More research is needed to validate these results.
Currently, surgery is the best curative treatment in stage I, stage II, and some stage III non-small cell lung cancers (Lang-Lazdunski, 2013). Thus, lobectomy is still recommended as the preferred treatment for BSC, although more patients with peripheral tumors have undergone sublobar resection (Zhang & Shen-Tu, 2015). However, in both our univariate and multivariate analyses, patients who underwent lobectomy had a better prognosis than patients receiving other therapies. Our results also suggested that at every stage, even stages III and IV, the prognosis after lobectomy was significantly better than after sublobar resection. This may be because radical lobectomy reduces the potential risk of relapse and distant metastasis of solid tumors (Wang & Zhao, 2016). In addition, survival following sublobar resection was inferior to that following lobectomy for stage I non-small cell lung cancer (Zhang et al., 2015b). Therefore, further studies with larger cohorts comparing lobectomy and sublobar resection, especially stratified by histology, should be performed.
A nomogram is an easily available and measurable statistical prediction tool that provides the prognostic probability of specific outcomes (Kent et al., 2016; Zhang et al., 2015a). So far, multiple nomograms have been constructed to predict the prognosis of different types of lung cancer (Zhang et al., 2017; Young et al., 2017; Ye et al., 2018). According to a large body of previous evidence, nomograms have even been considered more applicable than the traditional AJCC TNM staging system in diverse malignancies (Liang et al., 2015; Xie et al., 2015). Furthermore, nomograms are especially useful for individual patients for whom no definite clinical guidelines exist. In general, using a nomogram to predict a patient's long-term survival from his or her own characteristics is simple and convenient.
The latest National Comprehensive Cancer Network guidelines recommend that EGFR mutations and other gene mutations be considered as markers for lung squamous cell carcinoma, especially in non-smokers, small biopsy specimens, or mixed squamous cell carcinoma (Keedy et al., 2011; Felip et al., 2011). Although gene mutation status has not been well investigated in BSC, molecularly targeted therapy may still have great potential in the treatment of BSC.
The SEER database is a population-based tumor epidemiology database in the United States that covers about 28% of the population and has included thousands of lung cancer cases since 1973; it is therefore of great help in the study of lung cancer and other tumors (Yang et al., 2017; Yang et al., 2018). Analyzing cases from the entire SEER population effectively avoids the patient bias inherent in single-institution research. Nevertheless, the SEER database often lacks imaging data, smoking history, gene mutations, tumor markers, and details of other treatments, especially chemotherapy regimens. Therefore, the impact of these factors on the prognosis of patients with BSC was not included in our study, although they may significantly affect it.
In our study, we selected as many BSC cases meeting the requirements as possible, but the number of BSC cases was still far smaller than the number of SCC cases. Although this may be open to debate, the gap reflects the specific characteristics of BSC. The future prognosis of BSC deserves continued close attention. We acknowledge that this article is limited to epidemiological analysis and does not explore the biology of rare tumors, such as the molecular mechanisms relevant to gene therapy strategies.