Pan-cancer analysis identifies BIRC5 as a prognostic biomarker

The BIRC5 gene encodes the Survivin protein, a member of the inhibitor of apoptosis family. Survivin is found in humans during fetal development, but generally not in adult cells thereafter. Previous studies have shown that Survivin is abundant in most cancer cells, making it a promising target for anti-cancer drugs and a potential prognostic tool. To assess genetic alterations and mutations in the BIRC5 gene as well as BIRC5 co-expression with other genes, genomic and transcriptomic data were downloaded via cBioPortal for approximately 9000 samples from The Cancer Genome Atlas (TCGA) representing 33 different cancer types and 11 pan-cancer organ systems, and validated using the ICGC Data Portal and COSMIC. TCGA BIRC5 RNA sequencing data from 33 different cancer types and matching normal tissue samples for 16 cancer types were downloaded from Broad GDAC Firehose and validated using breast cancer microarray data from our previous work and data sets from the GENT2 web-based tool. Survival data were analyzed with multivariable Cox proportional hazards regression analysis and validated using KM plotter for breast-, ovarian-, lung-, and gastric cancer. Although genetic alterations in BIRC5 were not common in cancer, BIRC5 expression was significantly higher in cancer tissue than in normal tissue in the 16 different cancer types. For 14/33 cancer types, higher BIRC5 expression was linked to worse overall survival (OS; 4/14 after adjusting for both age and tumor grade and 10/14 after adjusting only for age). Interestingly, higher BIRC5 expression was associated with better OS in lung squamous cell carcinoma and ovarian serous cystadenocarcinoma. Higher BIRC5 expression was also linked to a shorter progression-free interval (PFI) for 14/33 cancer types (4/14 after adjusting for both age and tumor grade and 10/14 after adjusting only for age). External validation showed that high BIRC5 expression was significantly associated with worse OS for breast-, lung-, and gastric cancer. Our findings suggest that BIRC5 overexpression is associated with the initiation and progression of several cancer types, and that BIRC5 is thereby a promising prognostic biomarker.


Introduction
Fäldt Beding et al. BMC Cancer (2022) 22:322
The BIRC5 (Baculoviral IAP Repeat Containing 5) gene, located on chromosome 17 (17q25.3), encodes the Survivin protein, a member of the inhibitor of apoptosis (IAP) family that is normally expressed in humans during fetal development and in adult proliferating cells [1]. Survivin is a small protein with different isoforms, the majority of which are related to inhibition of apoptosis and promotion of cell proliferation [2]. Research during the past 20 years has shown that Survivin is highly expressed in most cancer cells [3,4]. Although attempts have been made to develop small molecules targeting Survivin, no such treatment is currently in therapeutic use [5]. Recently, a study identified an association between BIRC5 expression and tumor-infiltrating lymphocytes (TILs) [6]. Moreover, we previously evaluated BIRC5 in breast cancer subtypes and demonstrated that high BIRC5 expression is associated with worse prognosis in breast cancer patients [7].
Several studies have found that Survivin can be implicated in chemoresistance to platinum-based [8] or taxane-based [9] chemotherapy in ovarian cancer. In contrast, a previous study comparing Survivin expression in ovarian cancer patients (n = 435) treated with platinum/cyclophosphamide (PC) (n = 244) or taxane/platinum (TP) (n = 191) found that TP-treated patients with high nuclear Survivin expression and an accumulation of TP53 in tumor cells had a decreased risk of recurrence and death [10]. Furthermore, patients with high nuclear Survivin expression and TP53 dysfunction had a higher likelihood of high platinum sensitivity. A recent in vitro study on human cell lines of neuroendocrine tumors (NETs) showed increased BIRC5 expression in irradiated cells. In addition, BIRC5 knockdown resulted in reduced cellular proliferation but not in significantly increased radiosensitivity [11]. Kleinberg et al. [12] observed an association between high nuclear Survivin in tumor samples and improved progression-free survival (PFS) in chemotherapy-naïve patients. Expression analysis of the BIRC gene family in 30 patients with triple-negative breast cancer (TNBC) [13] identified higher gene expression of the BIRC gene family (including BIRC5) in patients (< 50 years old) with TNBC. In contrast, TNBCs with lymphovascular and fat tissue invasion had lower expression of BIRC genes. Although BIRC5 had the highest average expression of the tested genes, high BIRC5 expression had no significant association with tumor size. However, there was a significant difference in BIRC5 expression when comparing patients with no nodal metastasis (N0) with patients with micrometastases up to 1-3 axillary metastases and when comparing N0 with patients with 10 or more nodal metastases. There was also an association between histopathological grade in breast tumors and BIRC5 expression [13].
Copy number gains of three BIRC genes (BIRC2, BIRC3, and BIRC5) were identified in melanoma [14], while miR-195-5p/-218-5p, and not genetic/epigenetic aberrations, were correlated with high BIRC5 levels in gastric cancer [15]. In the present study, we used publicly available -omics (genomics, transcriptomics) and survival data to examine BIRC5 genetic alterations and altered expression in 33 cancer types in relation to prognosis.

Data collection
cBioPortal for Cancer Genomics repository
The cBioPortal for Cancer Genomics repository [16][17][18] was first used to analyze multi-omics data from The Cancer Genome Atlas (TCGA Pancancer) [19] for the BIRC5 gene. Genomic and transcriptomic data from approximately 9000 samples representing 33 different cancer types and 11 pan-organ systems were analyzed (Table 1). In the genomic data available from cBioPortal, esophageal squamous carcinoma and adenocarcinoma were combined into esophageal carcinoma, and colon and rectal carcinoma were combined into colorectal carcinoma, resulting in 32 different tumor groups. First, BIRC5 gene alteration frequency was determined on the DNA level for the different cancer studies. Genetic alterations were subsequently divided into mutation, fusion, amplification, deep deletion, and multiple alterations for 8812 samples from 32 different cancer types. From the same platform, we downloaded DNA amplification data for BIRC5 in the different cancer types.
The cBioPortal repository was then used to identify genes that were co-expressed with BIRC5 for 32 tumor types corresponding to 9351 samples (Table 1). Only data for esophageal adenocarcinoma (and not esophageal squamous carcinoma) were available for this analysis, while cervical squamous cell carcinoma and endocervical adenocarcinoma were both included under cervical squamous cell carcinoma (CESC). Spearman's correlation was used to identify genes with mRNA expression that were significantly correlated with BIRC5 mRNA expression. Pathway analysis was then performed using Reactome [20,21] with BIRC5 and the top 100 co-expressed genes for every tumor type.
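The co-expression ranking described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the cBioPortal implementation: the gene names other than BIRC5, the toy expression values, and the choice to rank by the absolute value of Spearman's rho are all assumptions made for the example, and no Q-value is computed.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def top_coexpressed(target, expression, n=100):
    """Rank all other genes by |rho| against the target gene and keep the top n."""
    scores = {
        gene: spearman_rho(expression[target], values)
        for gene, values in expression.items()
        if gene != target
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Toy expression values (invented for illustration, not TCGA data).
toy = {
    "BIRC5": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_A": [2.1, 3.9, 6.2, 8.1, 9.7],  # rises with BIRC5
    "GENE_B": [9.0, 7.5, 5.1, 2.2, 1.0],  # falls as BIRC5 rises
    "GENE_C": [3.0, 1.0, 4.0, 1.5, 3.5],  # weakly related
}
ranked = top_coexpressed("BIRC5", toy, n=2)
```

Ranking by absolute correlation keeps strongly negatively correlated genes in the list, which is consistent with a top-100 list that contains both positive and negative correlations.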

Broad GDAC Firehose and UCSC Xena Browser
RNA sequencing (RNA-seq) data (UNC RNASeqV2 level 3 expression, normalized RSEM) for BIRC5 expression were downloaded from Broad GDAC Firehose [22] for 8526 TCGA tumor samples corresponding to the 33 different cancer types, as well as matching normal tissue samples (n = 627) for 16 cancer types (n = 5507; Table 1). Liu et al. recently compiled genomic and clinical data for the TCGA dataset into a standardized version called the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) [23]. We therefore downloaded survival and phenotype data for the TCGA dataset from UCSC Xena Browser [24,25] and from the National Cancer Institute Genomic Data Commons [23,26]. Although the survival data included four clinical endpoints, i.e. overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI) and progression-free interval (PFI), only OS (the time from diagnosis to death of any cause) and PFI (the time from diagnosis to a new tumor event, e.g. progression of disease, local recurrence, distant metastasis, new primary tumors, or death with the cancer without a new tumor event) were used in the present study, as these were deemed to be relatively accurate endpoints by Liu et al.
[33] were reevaluated from our previous work to identify DNA copy number alterations, exonic variants, and gene fusions. Additionally, survival analysis was performed using BIRC5 expression and OS.
4. Survival analysis: Survival analysis for BIRC5 gene expression and OS was performed using the KM plotter web-based tool [34] with RNA microarray data for breast- [35], ovarian- [36], lung- [37], and gastric cancer [38]. The following settings were selected in KM plotter: (1) BIRC5 (Affymetrix probe 202094_at), (2) 'auto select best cutoff' to stratify the patient cohort, (3) OS endpoint, and (4) only 'Jetset' best probe set [39]. No cutoffs were made with regard to tumor subtype or treatment.

Statistical analysis
Statistical analyses were performed using R/Bioconductor 3.12 (BiocManager 1.30.12) in RStudio (version 1.3.1073), where p-value < 0.05 was considered to be statistically significant. BIRC5 expression in cancer samples was compared to expression in corresponding normal tissue. Tumor samples with no corresponding normal samples were removed from the analysis (Table 1). Boxplots were then generated with R packages ggpubr version 0.4.0 [40] and rstatix version 0.6.0 [41] using Wilcoxon test adjusted with Benjamini-Hochberg correction. DNA amplification data for BIRC5 from TCGA Pancancer was matched with RNA sequencing data from Broad GDAC Firehose using Wilcoxon test to determine the effect of DNA amplification on BIRC5 gene expression. Multivariable Cox proportional hazards regression analysis was performed using R packages survival version 3.2-7 [42,43], survminer version 0.4.9 [44], and Publish version 2020.12.23 [45]. Cox regression models were calculated using RNA sequencing data for BIRC5 expression with the OS and PFI endpoints, adjusting for age and/or tumor grade (if available). Only age was avail- Forest plots were generated using the R package forestplot version 1.10 [46]. Due to missing data, LAML was excluded in the PFI analysis. For the external breast cancer dataset [33], multivariable Cox regression models adjusted using age and tumor grade were calculated using BIRC5 expression and OS.
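The Benjamini-Hochberg adjustment applied to the Wilcoxon test p-values can be illustrated as follows. This is a minimal standard-library sketch of the step-up procedure, not the rstatix code used in the analysis; the four raw p-values are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is
    # capped by the adjusted values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented raw p-values, one per hypothetical tumor-versus-normal comparison.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these toy inputs the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02, so all four comparisons remain below the 0.05 threshold.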
To identify clinicopathologic features that were associated with BIRC5 expression, BIRC5 expression was first categorized from RNA sequencing data as low BIRC5 (lower than median BIRC5 expression) or high BIRC5 (higher than median BIRC5 expression) by calculating the quantiles (0, 25, 50, 75, 100%) for BIRC5 expression; median BIRC5 expression (50%, quantile 2) was 0.4996274. Phenotype data were then retrieved for each cancer type from Xena Browser and matched with the RNA sequencing data in one file. The R package tableone (version 0.13.0) was then used to identify clinicopathologic features associated with BIRC5 expression. However, 9/33 cancer types (COAD, DLBC, GBM, LAML, OV, READ, SKCM, TGCT, UCS) could not be analyzed in tableone because they contained only samples with high BIRC5 expression.
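The median split, and the reason some cancer types drop out of the comparison, can be sketched as below. This is a hedged illustration only: it assumes a single cutoff pooled across all samples (suggested by the single reported median, but not stated explicitly), it labels values equal to the cutoff as "low" (the text does not say how ties were handled), and the cancer type names and values are invented.

```python
from collections import Counter, defaultdict
from statistics import median


def split_by_median(samples):
    """samples: list of (cancer_type, expression). Returns (cutoff, counts)."""
    cutoff = median(value for _, value in samples)
    counts = defaultdict(Counter)
    for cancer_type, value in samples:
        # Ties with the cutoff are labelled "low" here (an assumption).
        label = "high" if value > cutoff else "low"
        counts[cancer_type][label] += 1
    return cutoff, counts


def single_group_types(counts):
    """Cancer types whose samples all fall on one side of the cutoff."""
    return sorted(t for t, c in counts.items() if len(c) < 2)


# Invented values; TYPE_B lies entirely above the pooled median.
toy_samples = [
    ("TYPE_A", 0.1), ("TYPE_A", 0.2), ("TYPE_A", 0.3), ("TYPE_A", 0.9),
    ("TYPE_B", 0.8), ("TYPE_B", 0.95),
    ("TYPE_C", 0.4), ("TYPE_C", 0.85),
]
cutoff, counts = split_by_median(toy_samples)
unusable = single_group_types(counts)
```

A cancer type listed in `unusable` has no low-versus-high contrast, so a grouped summary table cannot compare clinicopathologic features between the two groups for that type.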
To validate these findings, the ICGC Data Portal and COSMIC were used to identify somatic mutations of 'high mutation impact' or 'pathogenic' in the BIRC5 gene. ICGC data showed that eight patients affected by different cancer types harbored eight different BIRC5 mutations (Additional Table 1). Four of the eight cancer projects (BRCA, PRAD, SKCM, UCEC) were derived from TCGA data and the other four were from projects in China (colorectal cancer, COCA-CN; liver cancer, LICA-CN; nasopharyngeal cancer, NACA-CN) and Spain (chronic lymphocytic leukemia, CLLE-ES).
Fig. 1 Distribution of genetic alterations in the BIRC5 gene in 32 cancer types using the interactive web-based online tool cBioPortal (cbioportal.org). A Although only 196 of the 8812 cases (2%) had a gene alteration of any kind, DNA amplification was found to be most prevalent. Figure modified from cBioPortal [16,17]. B Amplification of BIRC5 in relation to mRNA expression levels of BIRC5 (P < 2.2e-16)
For the TCGA data, high impact BIRC5 mutations were classified as missense for SKCM and UCEC and stop gain for BRCA and PRAD, which was in line with the findings in cBioPortal. BIRC5 mutations in the other datasets were classified as a frameshift mutation in COCA-CN, a start loss mutation in LICA-CN, a missense mutation in NACA-CN, and a frameshift mutation in CLLE-ES. Furthermore, genome-wide screening data (array comparative genomic hybridization and SNP genotyping) were reevaluated from our previous work on breast cancer. DNA amplification in the BIRC5 gene was found in 15/229 (6.6%) breast cancer samples. None of the samples were shown to harbor deep deletions, mutations or fusions. COSMIC data revealed 210 unique cancer samples with somatic mutations in the BIRC5 gene, of which 33 were classified as pathogenic mutations (Additional Table 2).
In total, 6 nonsense substitutions (breast, endometrium, hematopoietic and lymphoid, large intestine), 22 missense substitutions (breast, cervix, endometrium, esophagus, hematopoietic and lymphoid, kidney, large intestine, lung, prostate, skin, stomach, urinary tract), 3 synonymous substitutions (skin), and 1 unclassified mutation (head and neck) were identified. Eighteen of the 33 unique samples were derived from TCGA data.

BIRC5 is frequently co-expressed with genes involved in cell cycle and DNA replication
To identify genes recurrently co-expressed with BIRC5 in cancer, the top 100 co-expressed genes in the 32 cancer types were extracted from the Spearman correlation analysis (Q < 0.05) in cBioPortal. When the top 100 co-expressed genes for each cancer type were combined, some genes occurred more than once; AURKB (encoding Aurora kinase B [47]) and CDC20 (encoding Cell division cycle 20 [48]) were the most frequent among the combined list of 3200 genes. In total, 117/3200 genes were negatively correlated with BIRC5 and the remaining genes were positively correlated (Additional Table 3). When duplicates were removed and BIRC5 was included in the list, 629 genes remained. The Reactome Pathway Database was then used to identify signaling pathways associated with BIRC5 and the co-expressed genes. In total, 436/629 genes involving 1039 pathways were identified in Reactome, and pathways playing a pivotal role in cell cycle and DNA replication were found to be overrepresented (Fig. 3, Additional Table 3, Additional file 1).
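The bookkeeping behind these counts (32 lists of 100 genes giving 3200 entries, then deduplication with BIRC5 added) can be sketched as follows. The per-cancer lists here are shortened, invented placeholders; only the gene symbols AURKB, CDC20 and BIRC5 come from the text.

```python
from collections import Counter


def summarize_top_lists(top_lists, target="BIRC5"):
    """Count how often each gene recurs across per-cancer top lists."""
    counts = Counter(gene for genes in top_lists.values() for gene in genes)
    unique_with_target = set(counts) | {target}
    return counts, unique_with_target


# Hypothetical, shortened lists (three genes per cancer type instead of 100).
toy_lists = {
    "TYPE_A": ["AURKB", "CDC20", "GENE_X"],
    "TYPE_B": ["AURKB", "CDC20", "GENE_Y"],
    "TYPE_C": ["AURKB", "GENE_Z", "GENE_X"],
}
counts, unique_genes = summarize_top_lists(toy_lists)
total_entries = sum(counts.values())
most_frequent = counts.most_common(1)
```

The deduplicated set, rather than the combined list with repeats, is what would be passed on to a pathway overrepresentation analysis.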

Survival analysis demonstrates the prognostic relevance of BIRC5 expression in cancer
To assess the prognostic significance of BIRC5 expression in the 33 cancer types, multivariable Cox regression analysis was performed for OS or PFI after adjusting for age and/or tumor grade. In 14/33 cancer types, higher BIRC5 expression was associated with more unfavorable OS, whereas BIRC5 expression was linked to a protective effect in 2/33 cancer types (LUSC and OV; Fig. 4). External validation showed that high BIRC5 expression was significantly associated with worse overall survival for breast-, lung-, and gastric cancer (Fig. 5). BIRC5 expression was also associated with significantly more unfavorable PFI in 14/33 cancer types, with 4/14 cancer types (KIRC, LGG, LIHC, and PAAD) after adjusting for both age and tumor grade and 10/14 cancer types (ACC, KICH, KIRP, LUAD, MESO, PCPG, PRAD, SARC, THCA and UVM) after adjusting for age (Fig. 6). The highest HR was seen for UVM (HR = 4.69; 95% CI: 2.24-9.82) and PCPG (HR = 4.34; 95% CI: 2.34-8.04).
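The relation between a Cox regression coefficient and the reported hazard ratio with its confidence interval can be made concrete with a short sketch. It assumes the intervals are Wald intervals that are symmetric on the log scale, which is the usual default for Cox models but is not stated in the text; under that assumption the coefficient and its standard error can be recovered from the reported UVM interval.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se, z=Z_95):
    """HR = exp(beta); Wald limits = exp(beta -/+ z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lower, upper, z=Z_95):
    """Recover the log-hazard coefficient and its SE from a Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported PFI interval for UVM: 95% CI 2.24-9.82.
beta, se = beta_se_from_ci(2.24, 9.82)
hr, ci_low, ci_high = hazard_ratio_ci(beta, se)
```

The recovered point estimate is the geometric mean of the two limits, about 4.69, which agrees with the reported hazard ratio for UVM and suggests that the symmetric-on-log-scale assumption is reasonable here.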

Clinicopathological features and BIRC5 expression
We then examined the relationship between clinicopathological features and BIRC5 expression, stratified as high BIRC5 (higher than median BIRC5 expression) or low BIRC5 (lower than median BIRC5 expression). Intriguingly, most cancer samples were classified as BIRC5 high (Additional Table 4).

Discussion
Here, we applied a pan-cancer multiomics approach in 33 different cancer types to examine molecular mechanisms that can ultimately lead to the high BIRC5 gene expression patterns observed in cancer. We show that, although genetic alterations are uncommon in the BIRC5 gene, DNA amplification is associated with higher RNA levels of BIRC5. However, the clinical impact of genetic alterations such as DNA amplification in the BIRC5 gene is still unclear. In agreement with previous studies [3], our results also show that BIRC5 expression levels are higher in cancer tissue than normal tissue. In several different cancer types, we observe an association between higher BIRC5 expression and unfavorable OS. Taken together, our findings demonstrate the prognostic relevance of BIRC5 expression in a variety of cancer types from different organ systems.
The highest HR values for OS and BIRC5 expression were found for adrenocortical carcinoma (ACC) and pheochromocytoma and paraganglioma (PCPG), both of which are hormone-producing tumors [49]. A previous study using immunohistochemistry to evaluate Survivin levels in ACC samples revealed overexpression of Survivin in carcinomas compared to adenomas or normal glands, with worse prognosis for patients with tumors expressing higher Survivin levels (not statistically significant). Knockdown of Survivin in an ACC cell line resulted in higher apoptotic rates [50]. Another study comparing Survivin expression in healthy adrenal medulla and pheochromocytoma/paraganglioma (malignant and benign) showed no significant difference between malignant and benign tumors. However, a more recent study showed an association between increased Survivin expression and worse prognosis in pheochromocytoma [51]. For uveal melanoma, where we show an association between BIRC5 expression and worse PFI, two previous studies showed conflicting results: one did not find any difference in immunohistochemical expression and tumor activity [52], and the other indicated the possible involvement of Survivin in Cisplatin resistance using human uveal melanoma cell lines [53]. Our results show that high BIRC5 expression is associated with worse prognosis in all three analyzed types of kidney cancer. This is in line with previous results from a meta-analysis of 10 studies containing 1063 renal cancer cases, which demonstrated that high Survivin expression is associated with TNM stage and Fuhrman grade [54]. Other studies have also found a connection between more aggressive renal tumors and high Survivin expression [55][56][57]. Our study proposes Survivin/BIRC5 as a promising biomarker using RNA sequencing data. Survivin/BIRC5 could be an addition to other biological detection indicators.
A study investigating cervical cancer cell lines found that Survivin showed more intense fluorescence in cancer cells than in normal cervical cells. Although the authors found Survivin to have a clinical sensitivity of 72.5% and a specificity of 77%, the sensitivity increased to 98% when combining Survivin with HPV16E6 and 96.1% when only using HPV16E6 [58].
Unlike the tumor types discussed above, higher BIRC5 expression seems to be beneficial for OS for patients with ovarian cancer. A previous study found an association between high Survivin and response to taxane-platinum treatment [10]. However, two recent meta-analyses found that high Survivin expression in ovarian cancer is associated with poor prognosis and worse tumor stage [59,60]. Further studies on BIRC5 expression/Survivin protein levels are needed in order to determine its prognostic significance for ovarian cancer or its possible connection to chemotherapy response. Interestingly, it has been shown that the wild-type form of the tumor suppressor gene p53 could suppress Survivin expression [61], suggesting that non-functional p53 in cancer could result in higher Survivin expression. Similar conclusions were reached in a recent study showing elevated Survivin expression in mice with p53-mutated esophageal squamous cell carcinoma, which could play a role in aiding lung metastasis [62]. A clinical study examined the genome and transcriptome of 198 lung squamous cell carcinomas and found that BIRC5 amplification was prevalent in tumors with p53 mutations [63].
Several in vitro studies show how BIRC5 overexpression or silencing could affect cancer cell lines. For instance, TP53 has been linked to BIRC5 in both glioblastoma multiforme (GBM) cells and 5-fluorouracil resistant cholangiocarcinoma (CHOL) cell lines [64,65]. In vitro studies have suggested that BIRC5/Survivin could be implicated in chemotherapy resistance to Irinotecan in colon adenocarcinoma (COAD), Oxaliplatin in esophageal squamous and esophageal adenocarcinoma (ESCA), and Cisplatin in hepatocellular carcinoma (LIHC) [66][67][68]. In breast cancer cell lines, Survivin, as well as FOXM1 and XIAP, has been shown to contribute to drug resistance [69]. Silencing of Survivin in HeLa cells (cervical carcinoma cells) was shown to result in an increased sensitivity to radiation therapy [70]. Several studies on thyroid carcinoma (THCA) cell lines demonstrate the involvement of Survivin in inhibiting cell proliferation [71][72][73], and an in vivo study using human gastric adenocarcinoma cell lines in mouse xenografts showed that inhibition of Survivin expression could promote cell death [74].
In conclusion, BIRC5 is indeed overexpressed in most cancer types, which frequently correlates with patient clinical outcome. Although publicly available TCGA data are useful for explorative pan-cancer studies, these findings need to be examined further in specific tumor types at the protein level to assess the clinical utility of BIRC5/Survivin. A limitation of the current study was the lack of large datasets similar to the TCGA dataset that contained both gene expression and clinical data to validate the prognostic relevance of BIRC5 expression in cancer. In future studies, it would also be interesting to evaluate the impact of BIRC5 expression levels on chemotherapy efficacy. In oncology, there is a constant need for better predictive markers in order to choose the right course of treatment [75]. Some treatment regimens are not only associated with acute toxicity, but also long-lasting chronic complications [76,77]. Our study suggests BIRC5 as a promising prognostic biomarker for several cancer types, but these findings need to be investigated further.


Availability of data and materials
The data used in this study have already been deposited in Gene Expression Omnibus (accession GSE97293), as stated in our previous publication [33]. The databases referenced in the methods section of this article are all open access.
Fig. 6 Forest plot of the association between progression-free interval (PFI) and BIRC5 expression. * denotes significant p-values. All cancer types are adjusted for age using multivariable Cox proportional hazards regression analysis; cancer types marked with ** are adjusted for both age and tumor grade