Identification of survival-associated alternative splicing events and signatures in adrenocortical carcinoma based on TCGA SpliceSeq data

Objective: To explore the correlations among alternative splicing (AS), splicing factors (SF) and survival outcome in adrenocortical carcinoma (ACC) patients. Results: A total of 92 ACC patients were included. Univariate analysis identified 3919 AS events significantly associated with overall survival. Lasso method followed by multivariate analysis revealed that the prognostic capacity of these survival-related AS events is satisfactory. Interestingly, we found that the area under the curve (AUC) of AA, AD, AP and RI were more than 0.9, indicating that these four types of AS were of great significance. Independent prognostic analysis showed that only the risk score was the independent risk factor of ACC survival. Finally, we constructed an interesting interaction network between AS and SF. Conclusions: This is the first and most comprehensive study to explore the aberrant AS variants in ACC, which might provide novel insights into molecular mechanism of ACC. Methods: The transcriptome data, clinical information and Percent Spliced In (PSI) values of the ACC were obtained from TCGA database and TCGA SpliceSeq data portal. Lasso method and uni/multivariate Cox regression analysis were used to identify survival-related AS events and develop multi-AS-based signatures. The relationship between AS events and SFs was also investigated.


INTRODUCTION
As a rare but greatly aggressive malignancy, the prognosis of adrenocortical carcinoma (ACC) is rather poor [1,2]. Surgical excision currently remains the preferred treatment for ACC in patients with localized lesion [3]; however, the five-years overall survival is only 15-44% even if the tumor is completely removed [4]. Although there have been numerous studies exploring ACC, the specific molecular mechanism has not been fully elucidated.
As the most important post-transcriptional regulatory process, alternative splicing (AS) modifies more than 90% of human genes [5,6], contributing greatly to protein diversity and complexity [7]. Numerous physiological and pathological behaviors in the human body are related to AS, including angiogenesis, proliferation, hypoxia, etc [5,8,9]. It was reported that survival predictors of gene expression were consistently inferior to AS-based survival predictors [10]. There were emerging data revealing that abnormal AS events were involved in carcinogenesis, immunologic escape, cancer progression and metastasis [6,[10][11][12]. However, little work has been devoted to investigate the role of AS in ACC.
In this study, a total of 92 patients with ACC from the Cancer Genome Atlas (TCGA) database were enrolled to comprehensively profile genome-wide AS events, AGING which play a vital role in the development of ACC. Further, uni/multivariate Cox regression analysis and Lasso regression analysis were used to identify survival-related AS events and develop multi-AS-based signatures. We also explored the relationship between AS events and clinical feature using clinical data from TCGA database. Moreover, AS related splicing factors (SF) were identified and functional enrichment analyses were performed.

Clinical characteristics and integrated AS events
A total of 92 ACC patients from TCGA database were included in the present study. The clinicopathological characteristics of these 92 cases were summarized in Table 1. The AS events were comprehensively analyzed. Intersections among these seven types of AS events and quantitative analysis of these interactive sets were presented by the UpSet plot ( Figure 1A). These data suggested that a single gene could possess several types of mRNA splicing events and ES was the predominant type in ACC cohort whereas ME was rare.

Survival associated AS events
Univariate Cox regression analysis between AS events and OS was conducted to determine the survival associated-AS events. A total of 3919 AS events were significantly associated with OS (P<0.05). Consequently, one gene might contain two or more AS events that were remarkably related to the OS, and ES was the most common survival associated event. The UpSet plot was generated to visualize the intersecting sets of genes and splicing types vividly ( Figure 1B). The top 20 (if available) significant survival associated AS events of each type were selected and visualized in Figure 2. The distributions of AS events significantly associated with OS were displayed in Figure 2D.

Construction and evaluation of the prognostic AS signature
Lasso method followed by multivariate Cox regression analyses were performed to evaluate the prognostic capacity of these survival related AS events. The results of Lasso regression analysis including seven types of AS events and all AS events were presented in Figure 3. Using Lasso regression analysis, we selected the most highly correlated AS events. Next, the particular AS events were selected and risk scores were calculated based on multivariate Cox regression model. The detailed information of particular AS events in the eight prognostic models was shown in Table 2. Moreover, ACC patients were stratified into low-risk and high-risk groups based on the median value of the risk score as cut off. The Kaplan-Meier curves were employed to demonstrate the survival probability variation of patients in the low-risk and high-risk groups. The results showed that survival time differed notably between these two groups in each type and the whole cohort ( Figure 4). The ROC curves were generated to evaluate the predictive power of each prognostic signature. Finally, four prognostic signatures with AUC ≥ 0.9 were selected for subsequent analysis. The results revealed that risk score of AD performed the greatest prognostic power with an AUC of 0.983, followed by risk score of AA with an AUC of 0.969 and risk score of RI with an AUC of 0.928 ( Figure 5). The detailed information of corresponding splicing pattern of the candidate AS events, living status as well as survival time ranked by the distribution of risk score was displayed ( Figure 6).

Stratification survival analysis with the prognostic signatures
The following clinical parameters including gender, T stage, N stage and clinical stage were integrated into the uni/multivariable Cox regression analysis. Univariate Cox regression analysis indicated that several clinical features could predict poorer survival of ACC patients including high T stage, high N stage and high clinical stage and high risk-score. However, multivariate Cox regression showed that the risk score was the only independent risk factor of ACC survival (Figure 7). These results revealed that the risk scores with an AUC ≥ 0.9 were independent prognostic indicators after adjusted by the clinicopathological characteristics.

Functional enrichment analysis of AS events related SF genes
Functional enrichment analysis was performed for the sake of revealing the potential mechanisms of corresponding genes (known as SFs) with survivalassociated AS events. The biological process terms of these genes were mainly related to "mRNA processing", "RNA splicing", "RNA splicing via transesterification reactions". "RNA splicing via transesterification reactions with bulged adenosine as nucleophile". "Spliceosomal complex", "nuclear speck", "catalytic step 2 spliceosome" were the three most significant cellular component terms. For molecular function, "mRNA binding", "helicase activity" and "mRNA 3'-UTR binding" were three most enriched categories ( Figure 8A). KEGG analysis revealed a total of 5 remarkably enriched pathways, including "spliceosome", "RNA transport", "mRNA surveillance pathway", "RNA degradation", and "Legionellosis" ( Figure 8B).

Construction of potential splicing regulatory network
Then, a correlation network between the expression of SFs and the PSI values of survival-associated AS events was conducted. As shown in Figure 8C, there were a total of 42 down-regulated AS events (purple triangles), 50 up-regulated AS events (yellow triangles) and 67 SFs in this network. In this network, we selected the five most significant nodes as the hub AS events or hub SFs according to the degree, including one upregulated AS event (UQCR11-46528-AT), two downregulated AS events (UQCR11-46527-AT and NASP-2754-ES) and two SF (EIF3A and HNRNPR).

DISCUSSION
It has been reported that there was an extremely high incidence of splicing defects in carcinogenesis and that RNA splicing modulators might be a promising target for cancer treatment [13][14][15]. Alternative splicing allows cells to generate diverse [13,14] mRNA by modifying mRNA isoforms [16]. There have been numerous studies exploring the important role of AS in various malignant tumors, including colorectal cancer [17], renal cell carcinoma [18], breast cancer [19] and so on. Gray et al. [20]. found a novel splicing event which showed both an allelic expression imbalance and preferential splicing for one of the alleles in ACC samples. Previous studies demonstrated that the activation of spliced X-box protein 1(XBP1) -mRNA splicing might be one of the important molecular mechanisms of mitotane in the treatment of ACC [21][22][23]. Besides, Kroiss et al. [22]. found that combination of bortezomib and carfilzomib with low-dose mitotane could be used to treat ACC effectively by increasing XBP1-mRNA splicing. Freddi et al. [24]. explored the expression of splice variants of growth hormonereleasing hormone receptor (GHRH-R) in human primary adrenocortical tumors and found that splice variants might be a key pathogenic factor in adrenal carcinogenesis. However, limited studies focused on the AS events in occurrence and progression of ACC.
As far as we know, this is the first study to date lucubrating the aberrant alternative spliced variants in patients with ACC using high throughput data. In this study, we systematically explored the prognostic value    AGING of AS events in ACC patients for the first time. A total of 3919 AS events, which were significantly associated with survival outcome, were identified by univariate Cox regression analysis. Next, Lasso regression and multivariate Cox regression analysis were preformed to constructed eight the predictive significances, including seven types of AS and all AS, respectively. These ACC patients from TCGA database were then divided into low-and high-risk groups according to the risk score.

AGING
Kaplan-Meier analysis demonstrated that the difference of overall survival between the low-risk patients and high-risk patients were significant. Besides, the results showed that AUC of AA, AD, AP and RI were more than 0.9, indicating that these four types of AS were of great significance in predicting survival outcome in ACC patients. As a consequence, these four types of AS were screened to further analysis. However, as is known to us, there were few studies related to AA, AD, AD and RI in ACC. Future study into these four AS types in ACC might have noticeable research value. Given the high prevalence of splicing defects in tumor, these prognostic AS events identified in this study could be novel therapeutic targets in ACC treatment.
As is known to us, splicing factors were one of the vital regulated factors of AS events [25]. Changes of expression or sequence mutations of SFs may affect splicing events [26]. To elucidate the underlying mechanism of AS events in the survival of ACC patients, we performed GO and KEGG enrichment analysis for the SFs related significantly to AS events. According to the results, "mRNA processing", "RNA splicing" and "RNA splicing via transesterification reactions" were the three most significant biological process. "Spliceosomal complex", "nuclear speck", "catalytic step 2 spliceosome" were the three most significant cellular component terms. For molecular function, "mRNA binding", "helicase activity" and "mRNA 3'-UTR binding" were three most enriched categories. The GO results revealed that these SFs have a significant relationship with splicing. KEGG analysis demonstrated that spliceosome is the most significant pathway among these SFs. Besides, we also established an interaction network between AS events and SF. In this network, we selected the five most significant nodes as the hub AS events or hub SF according to the degree, including one upregulated AS event (UQCR11-46528-AT), two downregulated AS events (UQCR11-46527-AT and NASP-2754-ES) and two SF (EIF3A and HNRNPR). These new hub SF genes and AS events were not reported previously, which need further validation.
There were several limitations in this study. Firstly, we merely analyzed the survival-associated AS events of ACC using TCGA database due to no additional external cohort concerning splicing data. Secondly, this is a retrospective study with limited number of patients.
Further study with larger sample size is warranted.

CONCLUSIONS
In summary, this is the first and most comprehensive study to explore the aberrant alternative spliced variants in patients with ACC. These identified survival-related alternative spliced events and the interesting interaction network between AS and SF might provide novel insights into molecular mechanism of ACC. Further researches are needed to investigate the role of AS in ACC.

Data collection and preprocessing
The transcriptome data and clinical information of the ACC cohort were downloaded from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). Percent Spliced In (PSI) values for AS events of 33 various tumor types and available adjacent normal samples, have been loaded into TCGA SpliceSeq data portal (https://bioinformatics.mdanderson.org/TCGA SpliceSeq/) [13]. The AS-related data in TCGA SpliceSeq is of high quality [13]. The data, download procedure and operating environment in TCGA SpliceSeq have been widely used to explore survival-associated AS events and signatures of various malignant tumors [12,13]. We obtained AS events with PSI value of ACC from the TCGA SpliceSeq data portal. There were seven types of AS events, including retained intron (RI), exon skip (ES), mutually exclusive exons (ME), alternate promoter (AP), alternate donor site (AD), alternate terminator (AT) and alternate acceptor site (AA).

Identification of survival-related AS events
Univariate Cox regression analysis was carried out to evaluate the association between AS events and overall survival (OS). UpSet plot and volcano Plot was used to present the survival-related AS events based on the seven types of AS events. Besides, the bubble chart was used to summarize the top 20 all AS events, top 20 RI events, top 20 ES events, top 8 ME events, top 20 AP events, top 20 AD events, top 20 AT events and top 20 AA events.

Prognostic risk scores calculation and survival analysis
Lasso regression analysis was further conducted to screen highly correlated AS events and avoid overfit the models constructed subsequently. Multivariate Cox regression analysis was applied to calculate the prognostic risk scores for OS prediction based on seven types of AS events. All patients were divided into high-risk group and low-risk group according to the median risk score. Kaplan-Meier curves were used to demonstrate the survival probability variation between low-risk and highrisk subgroups. Receiver operating characteristic (ROC) curve was constructed and the area under the curve (AUC) was also calculated. Further, prognostic evaluation of the risk score was performed using risk curve. The expression heatmap, distribution of risk score and survival time of related signature were then visualized.

Independence of prognostic signature
The following clinicopathological characteristic from TCGA database including gender, T stage, N stage and clinical stage were subjected to subsequent analysis. Univariate and multivariate Cox regression analysis was performed to evaluate whether the final AS prognostic signature with an AUC ≥ 0.9 was an independent risk factor of OS.

Functional enrichment analyses
The list of splicing factor (SF) genes was collected from the SpliceAid 2 database (http://193.206.120.249/ splicing_tissue.html) [14] and was shown in Supplementary Table 1. The expression profiles of SF genes were downloaded from TCGA database. To further illustrated the underlying mechanisms of AS in ACC, we identified corresponding SF genes of AS events. Functional enrichment analyses, including Gene Ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, for these SF genes was performed. The results of GO analysis were presented by three parts including biological processes (BP), molecular functions (MF), and cellular component (CC). Both GO analysis and KEGG analysis was conducted using R x64 3.6.1 software.

Construction of splicing regulatory network
Spearman test was conducted to analyze the correlation between the expression of SF genes and PSI values of survival-associated AS events. P value < 0.001 and correlation coefficient > 0.6 was considered statistically significant. Besides, we constructed an interaction network of AS and SF. The plug-in Molecular Complex Detection (MCODE) was used to select most significant modules from the interaction network. The most significant interaction network was then visualized by Cytoscape software (version 3.4.0). Furthermore, the top five hub nodes were selected according to the connection degree from this interaction network.

CONFLICTS OF INTEREST
The authors of this manuscript declare no conflicts of interest.