Prognostic Value of BIRC5 in Lung Adenocarcinoma Lacking EGFR, KRAS, and ALK Mutations by Integrated Bioinformatics Analysis

Objective This study was aimed at investigating the prognostic significance of Baculoviral IAP repeat containing 5 (BIRC5) in lung adenocarcinoma (LAD) lacking EGFR, KRAS, and ALK mutations (triple-negative (TN) adenocarcinomas). Methods The gene expression profiles were obtained from Gene Expression Omnibus (GEO). The identification of the differentially expressed genes (DEGs) was performed by GeneSpring GX. Gene set enrichment analysis (GSEA) was used to execute gene ontology function and pathway enrichment analysis. The protein interaction network was constructed by Cytoscape. The hub genes were extracted by MCODE and cytoHubba plugin from the network. Then, using BIRC5 as a candidate, the prognostic value in LAD and TN adenocarcinomas was verified by the Kaplan-Meier plotter and The Cancer Genome Atlas (TCGA) database, respectively. Finally, the mechanism of BIRC5 was predicted by a coexpressed network and enrichment analysis. Results A total of 38 upregulated genes and 121 downregulated genes were identified. 9 hub genes were extracted. Among them, the mRNA expression of 5 genes, namely, BIRC5, MCM4, CDC20, KIAA0101, and TRIP13, were significantly upregulated among TN adenocarcinomas (all P < 0.05). Notably, only the overexpression of BIRC5 was associated with unfavorable overall survival (OS) in TN adenocarcinomas (log rank P = 0.0037). TN adenocarcinoma patients in the BIRC5 high-expression group suffered from a significantly high risk of distant metastasis (P = 0.046), advanced N stage (P = 0.033), and tumor-bearing (P = 0.031) and deceased status (P = 0.003). The mechanism of BIRC5 and coexpressed genes may be linked closely with the cell cycle. Conclusion Overexpressed in tumors, BIRC5 is associated with unfavorable overall survival in TN adenocarcinomas. BIRC5 is a potential predictor and therapeutic target in TN adenocarcinomas.


Introduction
Lung adenocarcinoma (LAD), which accounts for more than 50% of all lung cancers, is the most frequent pathological type [1]. There have been several studies focusing on gene mutations in LAD. Epidermal growth factor receptor (EGFR) mutation occurs in up to 20-50% of LAD cases in Asian countries [2], making patients respond to EGFR tyrosine kinase inhibitors (TKI), such as gefitinib; thus, they demonstrate modest survival benefits [3]. A previous report revealed that Kirsten rat sarcoma viral oncogene homolog (KRAS) gene mutations occur more frequently in patients from western countries (approximately 20-25%) than in those from Asian countries [4]. Unfortunately, KRAS inhibitors are still being evaluated in experiments. On the other hand, anaplastic lymphoma kinase (ALK) rearrangements occur in 3-5% of patients with non-small-cell lung cancer (NSCLC) [5], and ALK-positive patients can benefit from ALK inhibitors [6]. In contrast, a subset of LAD patients remain without EGFR, KRAS, and ALK mutations (triple-negative (TN) adenocarcinomas). A poor prognosis of TN adenocarcinomas is attributed to a lack of sufficient genetic information and therapeutic targets.
Gene expression profiling is a powerful method that can aid in identifying important mechanistic pathways. Highthroughput bioinformatic approaches provide strong tools to comprehensively quantify gene expression [7]. Bioinformatic analysis, including mechanistic pathway and gene ontology strategies, enable investigators to assess cell states by categorizing discrete and recurrently upregulated or downregulated genes [8]. Currently, the acquisition of big data on multiple platforms has increasingly been performed not only to understand the mechanisms of oncogenesis inherent to specific cancers, but also to provide us with drug targets and molecular diagnostic and prognostic factors, as well as biomarkers for patient risk stratification and treatment [9][10][11].
In our analysis, we identified differentially expressed genes (DEGs) between tumor and nontumor tissues in LAD patients from the Gene Expression Omnibus (GEO) database, enriched potential pathways, and ontology functions using gene set enrichment analysis (GSEA) and conducted survival analysis of hub genes, in the hope of assessing the prognostic significance and potential molecular mechanism of prognostic candidates for TN adenocarcinoma patients.

Differentially Expressed Gene (DEG) Identification.
GeneSpring GX 12.5 software (Agilent Technologies Inc., Santa Clara, CA, US) was used to screen DEGs in the three datasets. The gene expression data was normalized by the robust multiarray average (RMA) quantile normalization analysis algorithm. Quality control was performed by 3D principal component analysis (PCA) scores. t-Test unpaired analysis and an asymptotic P value computation with Benjamini-Hochberg multiple testing correction was used to detect DEGs whose fold change FC ≥ 2 with a false discovery rate (FDR) cutoff < 0 05 between LAD and adjacent nontumor tissues.

Enrichment Analysis of DEG
Candidates. An enrichment analysis was performed on pathways and gene ontology to determine the functions of the overlapping DEGs by using gene set enrichment analysis (GSEA) (version 3.0, http:// software.broadinstitute.org/gsea/), a computational method that determines whether a defined set of genes shows statistical significance [12,13].

Protein-Protein Interaction (PPI) Network Construction.
The Search Tool for the Retrieval of Interacting Genes (STRING) database (version 10.5, https://string-db.org/) was used to analyze potential interactions between overlapping genes at the protein level, and a medium confidence score > 0 4 was considered significant. Subsequently, the PPI network was constructed by Cytoscape software (version 3.6.0, https://cytoscape.org/). The Molecular Complex Detection (MCODE) plugin was used for searching the most significant module from the network [14]. The MCODE criteria for selection were as follows: MCODE scores ≥ 5, degree cutoff = 2, node score cutoff = 0 2, K − core = 2, and max depth = 100.
2.5. Hub Node Extraction and Verification. We used the cytoHubba plugin, which integrates eleven topological analysis methods and six centralities with the Maximal Clique Centrality (MCC) algorithm, to explore the top 10 candidates as hub genes in the PPI network [15]. Finally, we combined the results of MCODE and cytoHubba analysis, and candidates were extracted from the network. UCSC Xena (version 2.0, https://xena.ucsc.edu/welcome-to-ucsc-xena/), a securely analyzed and visualized functional genomic dataset in TCGA datasets, was utilized to verify the differential expression of these candidates between LAD and nontumor tissues.
2.6. Survival Analysis. We investigated the prognostic significance of hub genes among TN adenocarcinomas. Clinical data, mRNA expression, and mutation data in the lung adenocarcinoma (TCGA, Provisional) database were obtained from the cBioPortal online platform (version 1.17.1, http://www.cbioportal.org/) [16,17]. TN adenocarcinoma patients were included for further analysis to investigate the association between BIRC5 and survival by using the median cutoff of mRNA expression. The Kaplan-Meier plotter (http://kmplot.com), an online database which includes both clinical and expression data, was utilized to verify the relationship between BIRC5 and the survival time of LAD [18].

Coexpression Network Construction and Topology
Analysis. To further explore the mechanism of BIRC5 as a potential prognostic significance molecule for TN adenocarcinomas, BIRC5 and coexpression genes were analyzed by using cBioPortal. The coexpression network was visualized by Cytoscape, and degree ≥ 160 (average degree) was set as a cutoff criterion.

Statistical
Analysis. The differences of gene expression between the individual groups were analyzed using the Student t-test, Mann-Whitney U test, chi-squared test, and Ridit analysis based on the types of variables. Kaplan-Meier analysis was conducted by GraphPad Prism 7 (GraphPad Software Inc., CA, USA). Factors associated with LAD overall survival were assessed by both Cox univariate and multivariate analyses. Only covariates significantly associated with outcomes in the univariate analysis (two-sided P < 0 10) were included in the multivariate model. PASW Statistics software version 23.0 from SPSS Inc. (Chicago, IL, USA) was used. A two-tailed P < 0 05 was considered significant for all tests.

Functional Enrichment Analysis of DEGs.
To classify the biological function of DEGs, gene ontology and pathway enrichment analyses were performed using GSEA. As shown in Figure 2(a), GO analysis revealed that 159 DEGs were significantly enriched in the following biological processes (BP), including response to an oxygen containing compound, response to an endogenous stimulus, tissue development, response to an organic cyclic compound, and response to lipids. Cellular component (CC) analysis showed that extracellular space, proteinaceous extracellular matrix, extracellular matrix, membrane microdomain, and membrane region were mostly classifications. In terms of molecular functions (MF), the DEGs were mainly associated with the molecular function regulator, enzyme regulator activity, enzyme inhibitor activity, protein dimerization activity, and macromolecular complex binding. In addition, the pathways that had the most significant enrichment terms were shown by KEGG and Reactome. As shown in Figure 2(b), KEGG pathways included leukocyte transendothelial migration, tight junction, tyrosine metabolism, vascular smooth muscle contraction, and focal adhesion and Reactome pathways included hemostasis, developmental biology, phase 1 functionalization of compounds, biological oxidations, and cell surface interactions at the vascular wall.

PPI Network Construction and Hub Gene Identification.
Protein interactions among DEGs were predicted by the STRING online platform. The PPI network, which included a total of 104 nodes and 188 edges, was visualized by Cytoscape ( Figure 3(a)). Subsequently, we utilized the MCODE plugin to analyze the whole network and 4 modules were chosen (Figure 3 To further identify the hub genes in the PPI network, we used the cytoHubba plugin to extract the following top 10 genes as hub genes (Figure 3(c)): TOP2A, BIRC5, KIAA0101, CDC20, UBE2C, AURKA, TRIP13, MCM4, KIF20, and GNG11. Notably, 9 genes of Module 1 were all included in the hub genes. Finally, we confirmed 9 candidates for our further study: TOP2A, BIRC5, KIAA0101, CDC20, UBE2C, AURKA, TRIP13, MCM4, and KIF20. Interestingly, all of candidates were upregulated in LAD tissues.

Differentially Expressed Analysis of Hub Genes.
To verify the results of the GEO dataset analysis, UCSC Xena was carried out to exhibit the differential expression of 9 hub genes between LAD and nontumor tissues. As shown in Figure 4(a), the hierarchical clustering revealed similar results in the TCGA datasets and all of 9 hub genes were more upregulated in LAD than in nontumor tissues. Following which, to investigate the relationship of hub genes and TN adenocarcinomas, the mRNA expression levels of the 9 hub genes were calculated between TN adenocarcinomas and non-TN adenocarcinomas in TCGA datasets. The profiles of a total of 544 LAD patients were available for analysis. As shown briefly in Figures 4(b)-4(f), the mRNA expression levels of 5 genes, namely, BIRC5, MCM4, CDC20, KIAA0101, and TRIP13, were significantly upregulated in TN adenocarcinomas (allP < 0 05) compared with those in non-TN adenocarcinomas, while the mRNA expression levels of 4 genes, namely, TOP2A (P = 0 259), UBE2C (P = 0 0569), AURKA (P = 0 0778), and KIF20 (P = 0 1634) were not upregulated in TN adenocarcinomas. grouped by the median cutoff of the mRNA expression levels of 5 genes. Notably, only the high level of mRNA expression of BIRC5 was associated with poor overall survival (OS, log rank P = 0 0037, Figure 5(a)), but not the relapse-free survival (RFS, log rank P = 0 1126, Figure 5(b)). Meanwhile, this significant difference was not observed in MCM4, KIAA0101, CDC20, and TRIP13 among TN adenocarcinomas. We described clinicopathological features in the BIRC5 high-and low-expression groups in TN adenocarcinomas, as shown in Table 1. The BIRC5 high-expression group had more male patients (P = 0 023) and smoked packs per year (P = 0 025), and TN adenocarcinoma patients in the BIRC5 high-expression group suffered from a significantly higher risk of distant metastasis (P = 0 046) and advanced N stage (P = 0 033). Tumor-bearing (P = 0 031) and deceased status (P = 0 003) are also related with BIRC5 high expression. There were also more patients who received additional radiation therapy in the BIRC5 high-expression group than those in the BIRC5 low-expression group (26.3% vs. 14.1%, P = 0 034, Table 1).
In order to further evaluate the prognostic significance of BIRC5 in TN adenocarcinomas, the potential risk factors associated with OS of TN adenocarcinoma patients were  analyzed using Cox regression analysis. Univariate analysis revealed that high-level BIRC5, recurrence/metastasis, new tumor event after initial treatment, advanced TNM stage, and person neoplasm cancer status with tumor are potential risk factors significantly associated with OS in TN adenocarcinoma patients (all P < 0 05, Table 2). When these variables were included in the multivariate analysis using the forward LR method, high-level BIRC5, advanced TNM stage, and person neoplasm cancer status with tumor were the risk factors that significantly contributed to worse OS in TN adenocarcinoma patients (P = 0 015, P = 0 049, and P < 0 001, respectively, Table 2).

Coexpression Network Construction and Enrichment
Analysis of Coexpression Genes. BIRC5 indicated a significant prognostic correlation, and genes coexpressed with BIRC5 were predicted using the cBioPortal online platform. As shown in Figure 6(a), the coexpression network was visualized by Cytoscape software, and the network included 73 nodes and 2,613 edges. According to the topological analysis of the coexpression network, STAG2 is the most relevant gene with BIRC5.
We next imported BIRC5 and the coexpressed genes into GSEA to carry out enrichment analysis. As shown in Figure 6(b), GO terms were most significantly enriched in BP, including sister chromatid cohesion, sister chromatid segregation, nuclear chromosome segregation, chromosome    enrichment analysis in Figure 6(c) showed that oocyte meiosis, wnt signaling pathway, cell cycle, long-term depression, and progesterone-mediated oocyte maturation were enriched in the KEGG pathway. Mitotic prometaphase; mitotic M-M/G1 phases; DNA replication; cell cycle, mitotic; and cell cycle were gathered in the Reactome pathway. Briefly, enrichment results suggested that BIRC5 and coexpressed genes may be linked closely with the cell cycle.

Discussion
BIRC5, 14.7 kb long, is a mitotic spindle checkpoint gene, located near the telomeric end of chromosome 17 [19]. BIRC5 is not only an essential protein molecule for the regulation of mitosis and apoptosis, but it is also involved in pathological processes [20]. It has been shown that BIRC5 was upregulated in tumor tissues. In a variety of human cancers, such as breast cancer [21], pancreatic cancer [22], hepatocarcinoma [23], neuroblastoma [24], and esophageal carcinoma [25], high expression of BIRC5 indicated poor clinical outcomes. In addition, circulating IgG antibodies derived from BIRC5 could serve as a biomarker for malignant glial tumor [26] and early diagnosis of cervical cancer [27]. Therefore, BIRC5, as a target, might be useful for the treatment of drugs in gastric and colorectal cancers [28]. Targeting BIRC5 may be a promising strategy against esophageal tumor relapse and chemoradioresistance [29].
In this study, we identified that the high level of mRNA expression of BIRC5 not only was related to unfavorable outcomes of LAD, but it also indicated a shorter OS of TN adenocarcinomas. Coexpression and enrichment analysis revealed that BIRC5 may serve as a critically coexpressed gene for mediating the cell cycle-related signaling pathway. We assumed that the oncogenic role and prognostic value of BIRC5 in TN adenocarcinomas can be ascribed to this "cell cycle-mediated" mechanism.
We therefore summarize previous publications. First of all, BIRC5 was overexpressed in LAD compared to normal controls [30][31][32]. Our findings are consistent with existing investigations. Secondly, the potential prognostic and therapeutic value of BIRC5 in LAD was evaluated. Plasma anti-BIRC5 IgG may be a useful marker for the assessment of prognosis of NSCLC [33]. BIRC5 expression is an independent poor prognostic marker in LAD [34]. In the aspect of regulating the sensitivity of LAD cells to anticancer drugs, BIRC5 also plays a critical role [35]. In terms of EGFR or RAS mutant NSCLC treatment, BIRC5 contributes to induced cytotoxicity [36]. Briefly, BIRC5 may constitute a valuable biomarker and therapeutic target. Notably, compared with the previous investigations, our study provided new evidence that BIRC5 plays an important role in the prognosis of TN adenocarcinomas. Finally, the mechanism behind the effect of BIRC5 on tumors was described. Survivin, encoded by BIRC5, has a bifunctional molecule effect on apoptosis and cell proliferation, mainly from the two phosphorylation sites on its different domains (Thr34 and Thr117). Thr117 is a key site on the regulation of proliferation and cell cycle, and Thr34 is involved in cell apoptosis [37]. BIRC5 has been identified as a critical target involved in a variety of cancer cell signaling pathways, and the upregulation of BIRC5 may serve as a role in the following mechanisms: critically antagonizing caspase-dependent apoptosis and activating P53 and its downstream target P21, which stalls cell cycle progression as a cyclin-dependent kinase inhibitor (CDKi) [38]. Additionally, BIRC5 promotes the migration and invasion of tumor cells mediated by the TGF-β pathway [39] and the PI3K/AKT pathway [25], interacts with proteins associated with DNA damage repair [40], and regulates tumor cell proliferation mediated via the β-catenin pathway [41]. These facts explained why BIRC5 was mainly involved in the regulation of cell cycle and apoptosis; meanwhile, it is supported in our study that the overexpression of BIRC5 promoted the development of TN adenocarcinomas, which may be via regulating the cell cycle signaling pathway.
In conclusion, for TN adenocarcinomas, BIRC5 may serve as a promising prognostic predictor and therapeutic target. The upregulation of BIRC5 may enable the activation of the cell cycle signaling pathway to participate in the development of this specific type of LAD. Our findings provide a novel insight into TN adenocarcinomas. However, cancer is a result of highly complex molecular mechanisms, and more experimental studies are necessary to clarify the cell cyclerelated signaling for TN adenocarcinomas in the context of BIRC5.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Yajuan Cao and Weikang Zhu contributed equally to this work.