Identification of Prognosis-Related Genes in Bladder Cancer Microenvironment across TCGA Database

Background Bladder cancer (BCa) is a common urothelial malignancy. The Cancer Genome Atlas (TCGA) database allows for an opportunity to analyze the relationship between gene expression and clinical outcomes in bladder cancer patients. This study is aimed at identifying prognosis-related genes in the bladder cancer microenvironment. Methods Immune scores and stromal scores were calculated by applying the ESTIMATE algorithm. We divided bladder cancer patients into high and low groups based on their immune/stromal scores. Then, differentially expressed genes (DEGs) were identified in bladder cancer patients based on the TCGA database. We evaluated the correlation between immune/stromal scores and clinical characteristics as well as prognosis. Finally, we validated identified genes associated with bladder cancer prognosis through a cohort study in the Gene Expression Omnibus (GEO) database. Results A higher stromal score was associated with female (vs. malep = 0.037), age > 65 (vs.age ≤ 65 p = 0.015), T3/4 (vs. T1/2,p < 0.001), N status(p = 0.016), and pathological high grade (vs. low gradeP < 0.001). By analyzing DEGs, there were 1125 genes commonly upregulated, and 209 genes were commonly downregulated. Protein-protein interaction networks further showed the important protein that may be involved in the biological behavior and prognosis of BCa, such as FN1, CXCL12, CD3E, LCK, and ZAP70. A total of 14 DEGs were found to be associated with overall survival of bladder cancer. After validation by a cohort of 165 BCa cases with detailed follow-up information from GSE13507, 10 immune-associated DEGs were demonstrated to be predictive of prognosis in BCa. Among them, 5 genes have not been reported previously associated with the prognosis of BCa, including BTBD16, OLFML2B, PRRX1, SPINK4, and SPON2. Conclusions Our study elucidated tight associations between stromal score and clinical characteristics as well as prognosis in BCa. Moreover, we obtained a group of genes closely related to the prognosis of BCa in the tumor microenvironment.


Introduction
Bladder cancer (BCa) is the fourth most common cancer in men and the twelfth in women and the leading cause of cancer-related mortality [1,2]. Epidemiology studies reported that men have a four-fold higher incidence of bladder cancer than women [3]. Urothelial carcinoma accounts for 90% of BCa cases, whereas the remaining 10% cases are squamous cell carcinoma (SCC) or adenocarcinomas [4]. Bladder cancer treatment may include surgery, chemotherapy, radiotherapy, immunotherapy, and targeted therapy. Despite significant improvement in cancer treatment, two-thirds of patients with UBC will have a recurrence or disease progression within 5 years, leading to poor prognosis [5].
The tumor microenvironment consists of immune cells, mesenchymal cells, endothelial cells, along with inflammatory mediators, and extracellular matrix molecules [6]. Immune cells and stromal cells are the two most important nontumor cells in the tumor microenvironment. Previous evidences indicate that the components in the tumor microenvironment have an important role in promoting the proliferation and invasion of BC [7][8][9]. High stromal tumor-infiltrating lymphocytes within the tumor immune microenvironment is indicative of an inflamed subtype with an 80% 5-year disease-specific survival, and a lack of immune infiltrates is associated with an uninflamed subtype with a survival rate of smaller than 25%. A separate immune evading phenotype with upregulated immune checkpoints is associated with poor prognosis in bladder cancer [5]. Tumor microenvironment may have protective functions in other cancers, such as head and neck squamous cell carcinoma and glioblastoma [10,11]. However, there are few reports that systematically investigate the genes involved in the tumor microenvironment and their associations with prognosis in bladder cancer.
To better understand the impact of tumor genetic composition on clinical outcomes, the Cancer Genome Atlas (TCGA), a comprehensive genome-wide gene expression collection, has been established to discover genomic abnormalities around the world [12]. Tomczak   There was a significant association between the clinical stage of bladder cancer and stromal scores (n = 412, p < 0:001). (c) According to the ranking of immune scores, bladder cancer cases were divided into low and high groups. (d) According to the ranking of stromal scores, bladder cancer cases were divided into low and high groups.
predict the infiltration of nontumor cells by analyzing specific gene expression signature [13]. In this present study, we identified a list of tumor microenvironment-related genes, which are closely related to the prognosis in bladder cancer by using the TCGA database and ESTIMATE algorithm. Moreover, we validated prognosis-related genes in a different bladder cancer cohort available from the Gene Expression Omnibus (GEO) database.     BioMed Research International scores and stromal scores were calculated by the ESTIMATE algorithm [14]. For validation, the expression data and clinical information of 165 eligible samples were identified from the GEO database (GSE13507) (https://www.ncbi.nlm.nih. gov/gds). Only histologically verified transitional cell carcinoma samples were selected in the GEO dataset (GSE13507).

Differentially Expressed Gene Analysis.
Differentially expressed genes (DEGs) were determined between 412 BCa, and 19 nontumor counterparts were performed using the package "limma" in R. Genes were considered as DEGs following the thresholds of |log 2 fold change ðFCÞ | >2 and adjusted p value < 0.05. Expression profiling of the DEGs was performed using the R package heatmap. The overlap of DEGs was determined using the R package Venn diagrams.

Functional Enrichment Analysis of DEGs.
Functional enrichment analysis of DEGs was performed by R packages "clusterProfiler", "http://org.Hs.eg.db", "enrichplot", and "ggplot2" to identify gene ontology (GO) categories, including biological processes (BP), cellular components (CC), or molecular functions (MF). The abovementioned packages were also used to perform KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis [15]. The protein-protein interaction (PPI) network was built by Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https://string-db.org); interaction score > 0:95 was set as the cut off to screen for the key PPI nodes [16].

Overall Survival
Analyses. The survival analysis was assessed by Kaplan-Meier methods. The patients with BCa were split into high-score group and low-score group according to the median immune/stromal scores. Overall survival curves were compared between the two groups using the Kaplan-Meier method. To investigate the associations of DEGs with overall survival, the Kaplan-Meier method was utilized to compare the survival rate difference between high and low groups stratified by the median gene expression levels. Log-rank tests were used to assess the statistical differences. Univariate and multivariate survival analyses were performed between overall survival, DEG expression levels, and clinical features, including age, gender, T clinical stage, N clinical stage, and M clinical stage and grade, using the Cox regression models of the R package survival. Hazard ratio (HR) and 95% confidence interval (CI) of HR were extracted from the Cox regression models. Prognosis-associated genes were classified into protective genes (0 < HR < 1) and risk genes (HR > 1) according to their HR values. p values < 0.05 were considered statistically significant.

Statistical Analysis.
The correlations between clinicopathological parameters and immune/stromal scores were analyzed using a two-side student t test. All of the statistical tests were performed in R version 3.3 (https://www.rproject.org/). p values < 0.05 were considered as statistical significance.

Results
We   (Figure 1(a)). However, stromal cell scores were significantly associated with the clinical stage (p < 0:001) (Figure 1(b)). We divided the 412 bladder cancer cases into high and low score groups. Kaplan-Meier survival curves (Figure 1(c)) showed that the low stromal scores group had a lower mortality rate than the high scores group (33.5% and 45.3%, p = 0:117). Cases with low immune scores had a lower mortality rate compared to patients with high scores group (Figure 1(d), 38.9% and 39.4%, p = 0:472), although no statistical differences were found.
To evaluate the correlations between clinicopathological parameters and immune/stromal scores, we compared and plotted the distribution of immune scores and stromal scores stratified by age, gender, T status, N status, M status, and Fuhrman grade. We found that higher stromal score was associated with female (vs. male, Figure 2(a) p = 0:037), age > 65 (vs. age ≤ 65, Figure 2

Characteristics in Gene
Heatmaps with Immune Scores and Stromal Scores in BCa. Heatmaps showed the differential  gene expression profiles based on the immune and stromal scores (Figures 3(a) and 3(b)). There were 1371 genes upregulated and 457 genes downregulated in the high immune scores group as compared to the low immune score group. A total of 1519 genes were upregulated, and 398 genes were downregulated in the high stromal score group as compared to the low stromal score group. Moreover, Venn diagrams (Figures 3(c) and 3(d)) showed that 1125 genes were commonly upregulated, and 209 genes were commonly downregulated.

Protein-Protein Interactions among Genes of Prognostic
Value. We drew PPI networks by using the STRING (Figure 4). The number of protein nodes in the network diagram was shown in Figure 4. There were 13 genes in the network diagram with more than 20 interconnected nodes, such as FN1, CXCL10, CXCL12, IL10, ITGAM, CCL5, CCR5, CD4, CCR2, CXCL11, CXCL13, CXCL9, and LCK ( Figure 5). It includes multiple genes that are closely related to immune response. Moreover, FN1, CXCL12, CD3E, LCK, and ZAP70 were key interconnected node genes in PPI networks and are also associated with overall survival in patients with bladder cancer (Figure 6).

GO and KEGG Enrichment Analysis of Genes.
Top GO terms included the regulation of leukocyte activation and T cell activation in biological processes, collagen-containing extracellular matrix and extracellular matrix in cellular components, and receptor-ligand activity and receptor regular activity in molecular functions (Figure 7). In addition, the results of KEGG pathway enrichment analysis of differentially expressed proteins are shown in Figure 8, and cytokinecytokine receptor interaction, PI3K-Akt signaling pathway, and chemokine signaling pathway are the top three enrichment pathway for differentially expressed proteins.

Correlation of Expression of Individual DEGs in Overall
Survival and Validation in the GEO Database. We generated Kaplan-Meier survival curves to evaluate the role of individual DEGs in overall survival in bladder cancer. A total of 274 DEGs were found to be associated with the prognosis of bladder cancer (p < 0:05, selected genes are shown in Figure 9). To further validate, the prognosis-related genes found in the TCGA database are significant in other bladder cancer cases. We analyzed a cohort study of 165 bladder cancer cases from the GEO database. A total of 29 genes were validated ( Figure 10) to be significantly associated with bladder cancer prognosis (Table 1). Univariate and multivariate analyses confirmed that 9 genes were risk genes and 5 protective genes in the TCGA cohort (p < 0:05 for all cases, Supplementary Table 1-2). Of 14 prognosis-associated genes, 5 risk genes (CALD1, TNC, OLFML2B, PRRX1, and SPON2) and 5 protective genes

Discussion
In our study, we found that immune scores and stromal scores were associated with BCa patients' survival based on TCGA datasets, although no statistical differences were found in K-M survival analysis. Stromal scores were associated with multiple clinicopathological parameters, including AJCC stage, age, gender, T status, N status, and Fuhrman grade of BCa. However, immune scores were only associated with Fuhrman grade, and no significant correlation was found between immune scores and AJCC stage, age, gender, T status, N status, and M status. These results indicated that stromal cells in the tumor microenvironment may play a more important role in multiple biological behaviors, affecting the occurrence, development, and prognosis of bladder cancer, compared with immune cells.
We further performed heatmaps to show the differential gene expression profiles according to the immune and stromal scores. In addition, Venn diagrams showed that 1125 DEGs were commonly upregulated, and 209 DEGs were commonly downregulated. Moreover, protein-protein interaction networks were constructed based on commonly differentially expressed genes. There are many key proteins in the PPI networks involved in immune/inflammation response, such as IL10 and CD4. Luo reported that blocking IL-10 can enhance the effect of BCG immunotherapy for bladder cancer [17]. Previous research found that CD3D/CD4 ratio is an important marker for the prognosis of bladder cancer [18].
The results of GO enrichment analysis suggested that many of the enriched differential genes are related to immune cell activity (leukocyte activation and T cell activation) and stromal cell components (collagen-containing   [19]. The PI3K-Akt signaling pathway is a classic pathway for cell proliferation, thus playing an important role in the pathogenesis of multiple cancers. Previous researches have shown many oncogenes promote proliferation via the PI3K-Akt signaling pathway in bladder cancer [20,21]. All these suggest that our analysis results are helpful to clarify the pathogenesis and mechanism of bladder cancer and provide new research ideas for the treatment of bladder cancer. In recent years, increasingly more mRNAs have the potential to be molecular biomarkers for evaluating and predicting the prognosis in various cancer types [22][23][24]. However, there are few reports on the tumor microenvironmentrelated genes to predict cancer outcomes in bladder cancer. In our study, a total of 274 genes are significantly associated with overall survival in bladder cancer. Among the 274 genes, the proteins encoded by FN1, CXCL12, CD3E, LCK, and ZAP70 genes were also key proteins in PPI. FN1 has the largest number of interconnected nodes in the PPI, suggesting that it may be involved in multiple aspects of bladder cancer development. FN1 encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and itself is a potential urine biomarker for bladder cancer detection [20]. CXC chemokine ligand 12 (CXCL12) is an important member of the CXC subfamily of chemokines. Nazari et al. suggested that elevated protein and mRNA levels of CXCL12 are found in human bladder cancer, which plays a role during the genesis of BCa and its further development [25]. Punt et al. found that high CD3E expression was correlated with improved disease-specific survival in squamous cervical cancer [26]. Previous study lymphocyte-specific protein tyrosine kinase (Lck) was one of the key molecules regulating T-cell functions, which emerged as a novel druggable target molecule for the treatment of cancers [27]. Fu et al. demonstrated that ZAP70 (a tyrosine kinase of the Syk family) may play an important role in the T-cell receptor (TCR) signaling pathway, which facilitated PCa cell migration and invasion [28]. To our delight, the function of CD3E, Lck, and ZAP70 gene have not been reported in bladder cancer, indicating that functional studies on these genes may help us to more accurately understand the prognosis-related biological behavior of bladder cancers. Furthermore, they also provide a new idea for further exploring the molecular mechanism and targeted therapy of bladder cancer.
Importantly, we validated these OS-related genes in an independent cohort of 165 bladder cancer patients from the GEO database (GSE13507). A total of 10 genes were identified to be correlated with overall survival. Among these identified genes, 5 genes (CALD1, HOXB3, HOXB6, MOGAT2, and TNC) have been reported to be associated with the development or prognosis of bladder cancer, indicating that the results in our study are consistent with previous publications in bladder cancer [29][30][31]. The remaining 5 genes have not been found to be associated with the prognosis of bladder cancer (BTBD16, OLFML2B, PRRX1, SPINK4, and SPON2). Further research that focuses on these potential prognostic-related genes may find new biomarkers of bladder cancer. Meanwhile, these genes may lead to gene-mediated molecular targeting therapy.

Conclusions
In summary, high stromal is associated with females, age above 65, clinical T stage 3/4, clinical N status, and pathological high grade. We found 1125 differentially expressed genes associated with tumor microenvironment in bladder cancer. Ten genes were closely related to prognosis, and they may have the potential to become prognostic biomarkers for bladder cancer.
scores stratified by N status. The scatter plot indicated that higher immune scores were associated with higher N status (p = 0:937). (e) The scatter plot indicated no significant associations between immune scores and M1 (vs. M0, p = 0:056). (f) The scatter plot indicated that higher immune scores were associated with high grade (vs. low grade, p = 0:001)