Identification of an immune-related long non-coding RNA signature and nomogram as prognostic target for muscle-invasive bladder cancer

To identify an immune-related prognostic signature based on long non-coding RNAs (lncRNAs) and find immunotherapeutic targets for bladder urothelial carcinoma, we downloaded RNA-sequencing data from The Cancer Genome Atlas (TCGA) dataset. Functional enrichment analysis demonstrated bladder urothelial carcinoma was related to immune-related functions. We obtained 332 immune-related genes and 262 lncRNAs targeting immune-related genes. We constructed a signature based on eight lncRNAs in training cohort. Patients were classified as high-risk and low-risk according to signature risk score. High-risk patients had poor overall survival compared with low-risk patients (P < 0.001). Multivariate Cox regression suggested the signature was an independent prognostic indicator. The findings were further validated in testing, entire TCGA and external validation cohorts. Gene set enrichment analysis indicated significant enrichment of immune-related phenotype in high-risk group. Immunohistochemistry and online analyses validated the functions of 4 key immune-related genes (LIG1, TBX1, CTSG and CXCL12) in bladder urothelial carcinoma. Nomogram proved to be a good classifier for muscle-invasive bladder cancer through combining the signature. In conclusion, our immune-related prognostic signature and nomogram provided prognostic indicators and potential immunotherapeutic targets for muscle-invasive bladder cancer.


INTRODUCTION
Bladder urothelial carcinoma (BCa) has been identified as the ninth most common malignant neoplasm all over the world [1,2]. More than 199,000 people died of it and over 549,000 cases were newly diagnosed in 2018 [1,2]. In the past twenty years, the number of BCa incident cases is growing globally and the BCa burden may ascend in the future due to aging of population and environmental pollution [3,4]. Although various Long non-coding RNAs (lncRNAs) are a series of RNAs without protein-coding capacity and their lengths are over 200 nucleotides (bp) [18]. Abundant evidence exists to suggest that lncRNAs contribute to tumor development and metastasis through activating immune system process and immune response including antigen release, antigen presentation, immune cell differentiation, immune cells migration, T cells infiltration and recognition and killing of cancer cells [10,19]. LncRNA CECR7 (cat eye syndrome chromosome region, candidate 7) activates the expression of CTLA4 (cytotoxic T-lymphocyteassociated protein 4) by targeting miR-429 in diabetic pancreatic cancer, which suggested that lncRNAs are involved in immune cell differentiation [20]. Another research revealed that, lnc-sox5 is pivotal in immune cells infiltration in colorectal cancer (CRC). Lnc-sox5 knock-down can directly increase the activity of regulatory T cells and their cytotoxicity is also dramatically enhanced in CRC [21]. In addition, bioinformatics analyses based on The Cancer Genome Atlas (TCGA) database demonstrated that lncRNAs are strongly implicated in the carcinogenesis of BCa through immune-associated pathways [22]. Therefore, aberrantly expressed lncRNAs may be potential prognostic biomarkers for BCa patients and may be served as potential therapeutic targets. As a consequence, based on the gene expression profile of high-throughput sequencing data obtained from TCGA, we carried out this present study in order to explore lncRNAs targeting immune-related genes and further construct an immune-related prognostic signature for MIBC patients.

RESULTS
The flowchart of the present study was summarized in Figure 1.

Identification of differentially expressed genes (DEGs) and immune-related DEGs in TCGA BLCA dataset
The TCGA BLCA (Bladder Urothelial Carcinoma) dataset was processed with differential expression analysis of all genes. According to the criteria mentioned above, a total of 1617 DEGs including 536 up-regulated and 1081 down-regulated genes between 411 BCa samples and 19 normal bladder samples were selected for further analyses (Figure 2A).
In addition, 332 immune-related genes were identified from Molecular Signatures Database v4.0 and were also analyzed with differential expression analysis. A total of 33 immune-related DEGs were screened out, among which 18 were up-regulated and 15 were downregulated as shown in the volcano plot and heat map ( Figure 2B, 2C).

Functional enrichment analysis
We performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses to determine the biological function and pathways involved in the 1617 DEGs in TCGA BLCA dataset. The top 10 enrichment items of biological process (BP) and KEGG pathway analyses were illustrated in Figure 3A, 3B, separately.
At the same time, we also analyzed the 33 immunerelated DEGs with GO and KEGG pathway enrichment analyses ( Figure 3C, 3D).
Next, to validate and confirm that BCa is closely immune-related, we adopted Venn diagram to obtain overlapped enrichment results between all DEGs enrichment analysis and immune-related DEGs enrichment analysis. We found that they both had enrichments in negative regulation of immune system process (GO:0002683), G protein-coupled receptor binding (GO:0001664) and transcriptional misregulation in cancer (hsa05202), which indicated that the development of BCa was associated with immune system and immune response ( Figure 3E).

Clinical characteristics of TCGA BLCA patients
A total of 389 BCa patients were involved in construction and validation of the lncRNA signature. They were randomly assigned to a training cohort (n = 195) for construction and a testing cohort (n = 194) for validation. Clinical characteristics of 389 BCa patients were displayed in Table 1.

Identification of the 8-lncRNA prognostic signature in training cohort
Pearson correlation analysis identified that 262 lncRNAs were correlated with the 18 up-regulated immune-related DEGs or 15 down-regulated immunerelated DEGs in the TCGA BLCA dataset (|R| > 0.5 and P < 0.01). And the 262 lncRNAs were lncRNAs targeting immune-related genes. Next, we used univariate Cox regression analysis method to identify prognosis-related lncRNAs from the 262 lncRNAs in the training cohort and univariate Cox regression analysis identified 21 prognosis-related lncRNAs dataset. Immune-related genes were extracted from Molecular Signatures Database v4.0. LncRNAs targeting immune-related genes were identified according to Pearson correlation. DEGs (differentially expressed genes) were calculated between BCa (bladder urothelial carcinoma) samples and normal bladder samples in TCGA BLCA dataset. The training cohort was used to identify the lncRNAs targeting immune-related genes and establish a prognostic signature based on the prognostic lncRNAs. The prognosis analysis was validated in the testing cohort, entire TCGA BLCA cohort and Tianjin validation cohort, respectively. Nomogram was constructed by including the immune-related signature and other prognosis-related clinical features in training cohort. Immunohistochemistry from THPA (The Human Protein Atlas) and online analyses from GEPIA (Gene Expression Profiling Interactive Analysis) were used to validate four key immune-related genes (CTSG, CXCL12, LIG1 and TBX1). Functional enrichment analyses were utilized to explore immune-related functions.   Evaluating the predictive power of the 8-lncRNA prognostic signature in training cohort 195 BCa patients in training cohort were divided into high-risk (n = 98) and low-risk (n = 97) groups determined by the median value of 8-lncRNA signature risk score. The risk score, survival time and survival status of each patient in training cohort were illustrated in Figure 4A, 4B. The heat map of the expression of the eight lncRNAs was displayed in Figure 4C.

AGING
The Kaplan-Meier (KM) survival curve of low-risk and high-risk groups in training cohort was shown in Figure 4D. The survival time of the low-risk group was significantly higher than that of the high-risk group (P < 0.001). In addition, time-dependent receiver operating characteristic (ROC) curves of 8-lncRNA prognostic signature and other clinical characteristics were plotted. The area under curve (AUC) of 8-lncRNA prognostic signature risk score was 0.742, which was better than other clinical characteristics in Figure 4E.
Furthermore, multivariate Cox regression analyses of age, gender, Union for International Cancer Control (UICC) stage, histological grade and 8-lncRNA prognostic signature were evaluated in Figure 4F. Risk score of the 8-lncRNA signature (HR = 1.500, 95%CI = 1.330-1.692, P < 0.001) might be considered as an independent prognostic indicator of OS (Table 3).   HR hazard ratio, CI confidence interval.

Validation of the 8-lncRNA prognostic signature in testing cohort and entire TCGA BLCA cohort
In order to further confirm the predictive power and stability of the 8-lncRNA signature in predicting the OS of BCa patients, we validated it in testing cohort (n = 194) and entire TCGA BLCA cohort (n = 389) ( Figures 5 and 6). Risk scores of the 8-lncRNA signature were also calculated with the above mentioned formula.
The KM survival curve of low-risk and high-risk groups in testing cohort was shown in Figure 5D. The survival time of the low-risk group was significantly higher than that of the high-risk group (P < 0.001). In addition, time-dependent ROC curves of 8-lncRNA prognostic signature and other clinical characteristics were plotted. Consistent with the finding in training cohort, the AUC of 8-lncRNA prognostic signature risk score was 0.724, which was better than other clinical characteristics in Figure 5E. Multivariate Cox regression analysis revealed that risk score of the 8-lncRNA signature (HR = 1.137, 95%CI = 1.090-1.186, P < 0.001) might be an independent prognostic indicator in Figure 5F and Table 3. Similar results of validation were obtained in entire TCGA BLCA cohort.

Survival and Clinical characteristics with 8-lncRNA prognostic signature in entire TCGA BLCA cohort
To further validate the prognostic value and explore the wide applicability of the immune-related signature, we performed survival analyses through stratification analysis based on entire TCGA BLCA cohort ( Figure 7A-7G). Samples in the high-risk group had poor survival compared with those in the low-risk group among stage III (P < 0.001) and stage IV (P = 0.022) patients. Similar significant results were revealed in different age groups and different gender groups.
Clinical characteristics exhibited distributed patterns corresponding to the risk score of 8-lncRNA prognostic signature on the entire TCGA BLCA cohort ( Figure 7H). We found that dead samples more likely gathered in the high-risk group, indicating that samples with high risk score might have worse survival (P < 0.005) ( Figure 7I). In addition, samples with higher UICC stages (stage III and stage IV) were more likely to have higher risk scores than those with lower stages (stage I and stage II) (P < 0.005) ( Figure  7J), which confirmed that higher scores of the 8-lncRNA prognostic signature might be significantly associated with the progression of BCa.

Principal component analysis (PCA) and Gene set enrichment analysis (GSEA)
PCA was performed to explore the different distribution patterns between low-risk and high-risk groups according to the eight lncRNAs targeting immunerelated genes and the whole gene set. Based on the eight lncRNAs, low-risk and high-risk groups were significantly distributed in two different directions. That is to say, the eight lncRNAs were used to separate BCa patients into two sections, indicating that the BCa patients in the low-risk group was quite distinguished from those in the high-risk group ( Figure 8A, 8B).
However, these two groups did not show significant distinctions when PCA was performed based on the whole gene set ( Figure 8C).
GSEA was further utilized to validate the functional annotation between high-risk and low-risk groups in entire TCGA cohort. The DEGs between the two groups were enriched in immune system process pathway (FDR [False discovery rate] = 0.022, NES [normalized enrichment score] = 2.033) and immune response pathway (FDR = 0.015, NES = 2.117) ( Figure 8D, 8E).
In summary, the immune-related prognostic signature based on the eight lncRNAs targeting immune-related AGING genes was significantly associated with the immune status of BCa, and high risk score of the signature was more likely to activate immune-related pathways in BCa patients.

Validation based on immunohistochemistry and survival analysis of immune-related DEGs
In order to further validate the immune-related prognostic signature, we compare the expression of 4 immune-related DEGs including CTSG, CXCL12, LIG1 and TBX1 between normal bladder tissues and BCa tissues from The Human Protein Atlas (THPA). Immunohistochemistry indicated that the expression levels of CTSG and CXCL12 are down-regulated in BCa tissues, while LIG1 and TBX1 are up-regulated in BCa tissues. These results confirmed that there are differences in the expression of the four immune-related DEGs between BCa tissues and normal bladder tissues ( Figure 9A).
We continued to extract data from Gene Expression Profiling Interactive Analysis (GEPIA) database and KM curves of the four immune-related DEGs revealed AGING that higher expression levels of CTSG (P = 0.011) and CXCL12 (P = 0.0069) are associated with worse OS time of BCa. However, higher expression levels of LIG1 (P = 0.021) and TBX1 (P = 0.023) contributed to a better prognosis compared with lower expression levels ( Figure 9B).

Building and validating a predictive nomogram
Nomogram was constructed by combining the details of age, UICC stage, histological grade, and the 8-lncRNA prognostic signature in training cohort ( Figure 10A). By using bootstrap method, calibration plots showed no significant deviation from the ideal for 1-year, 3-year and 5-year survival ( Figure 10B). The training cohort (N = 195) was divided into two high-nomogram-score (N = 98) and low-nomogram-score (N = 97) groups by the median value of nomogram score. The KM curve showed that BCa patients in high-nomogram-score group had worse prognosis compared with lownomogram-score group (P < 0.001) ( Figure 10C). The AUCs for 1-year, 3-year and 5-year of their timedependent ROC curves were 0.746, 0.829, and 0.825, respectively ( Figure 10F). Combining the immunerelated prognostic signature with other prognosisrelated clinical factors increased the AUC for predicting OS of BCa patients.
In addition, we also validated the nomogram in testing cohort (n = 194) and in entire TCGA BLCA cohort AGING (n = 389). KM curves and their ROC curves were displayed in Figure 10D, 10E, 10G, 10H. Table 4 showed the concordance index (c-index) of the immune-related signature and nomogram. Table 5 showed the AUC in time-dependent ROC curves. Both the c-index and AUC are indicators for evaluating the predictive value of prognostic model. The c-index of nomogram exceeded the c-index of immune-related signature in training cohort, testing cohort and entire TCGA BLCA cohort (P < 0.05), which indicated the predictive value of nomogram was better than the immune-related signature. In addition, the AUC of nomogram exceeded the AUC of immune-related signature, especially in predicting 5-year OS (P < 0.05), which indicated the nomogram had better predictive power in long-term follow-up. Therefore, the nomogram combined the immune-related prognostic signature with other prognosis-related clinical factors and increased the predictive power of OS, which might help to improve clinical management of BCa patients.

External validation of immune-related signature and nomogram in Tianjin cohort
To further confirm the 8-lncRNA signature and nomogram for MIBC, we recruited MIBC patients (n = 72) from Tianjin cohort for validation. Clinical characteristics of enrolled BC patients and controls in Tianjin validation cohort are displayed in Table 6. Quantitative real-time polymerase chain reaction (qRT-PCR) was performed to measure the expression levels of the eight lncRNAs. Risk score was calculated with the previous following formula of signature.
The KM survival curve of Tianjin cohort indicated the survival time of the low-risk group was significantly higher than that of the high-risk group in Figure 11A (P < 0.001). Consistent with the finding in previous cohorts, time-dependent ROC curves indicated that the AUC of 8-lncRNA prognostic signature risk score was 0.822 (95%CI = 0.793-0.851), which was better than other clinical characteristics in Figure 11B. Cindex was 0.723 (95%CI = 0.688-0.751). Furthermore, AGING multivariate Cox regression analysis revealed that risk score of the 8-lncRNA signature (HR = 3.073, 95%CI = 1.424-6.632, P = 0.004) might be an independent prognostic indicator in Figure 11C.
The results in Tianjin validation cohort confirmed the predictive power of immune-related signature and nomogram, which was consistent with the results of TCGA dataset.

DISCUSSION
Recently, the roles of lncRNAs in the development and prognosis of various tumors have been well investigated in many studies. Aberrantly expressed lncRNAs in cancer can be utilized as biomarkers for diagnosis, prognosis and target therapy [22]. Shen et al. performed a study based on TCGA BLCA dataset identified a panel of 20 key lncRNAs that were most implicated in MIBC prognosis after differential expression analysis and prognostic correlations [22]. Among the 20 lncRNAs, lnc-BOD1-1:7, -1:8, and -1:9, as well as lnc-GCH1-2:1, -2:2, and -2:3 were associated with both immune-related pathways and cancer-associated pathways. High expression level of lnc-CGRRF1-3:1 was mainly correlated with increased immune activity and decreased apoptosis in MIBC. In addition, Cao et al. AGING carried out another study based on TCGA BLCA dataset to explore the tumor immune microenvironment of BCa [23]. They found the abundance ratio of four immune cells including T cell CD4 memory activated, T cell CD4 memory resting, T cell CD8 and natural killer (NK) cell resting was related to BCa survival. T cell CD4 memory resting, T cell CD8, NK cell resting decreased as the increase of UICC stage and lymph node metastasis, which indicated that the four cells are vital in immune infiltration as well as BCa immunotherapy.
As immunotherapy has attracted increasing attention in the field of cancer treatment, the gene expression profile of high-throughput sequencing data have been used in order to explore valuable immune-related biomarkers, identify potential prognostic targets and analyze underlying mechanisms [24,25]. In the present study, we adopted TCGA BLCA dataset to explore lncRNAs targeting immune-related genes and construct an immune-related signature, which may serve as promising prognostic indicators for BCa and can be applied as promising immune therapeutic targets.    Previous studies have identified that the immune system is significantly associated with the tumor microenvironment, especially tumor immune escape [23]. Immune components in the tumor microenvironment have essential effects on gene expression by tumor tissues and the clinical outcomes [26,27]. In this study, a total of 332 immunerelated genes were identified from TCGA BLCA dataset. According to differential expression analysis, we found 1617 DEGs and 33 immune-related DEGs between BCa samples and normal bladder samples. Negative regulation of immune system process (GO:0002683) is the overlapped GO item based on the results of enrichment analyses of both DEGs and immune-related DEGs, which indicated that the development of BCa was associated with immune system and immune response ( Figure 3E).
Next, we established an immune-related prognostic signature based on eight lncRNAs targeting immune-AGING related genes (WNT5A-AS1, AL136084.3, MIF-AS1, AC008735.2, AL357033.4, LINC00649, AC099343.2 and USP30-AS1) in the training cohort. BCa patients were divided into low-risk and high-risk groups according to the median risk score of 8-lncRNA signature and we further found the immune-related signature was able to distinguish samples with a better prognosis or a worse prognosis. Low-risk group had better OS than high-risk group (Figures 4D, 5D and 6D).
Results of multivariate Cox regression analyses revealed that higher risk score was an independent poor prognostic indicator of OS (HR > 1) (Table 3, Figures  4F, 5F and 6F). ROC curves showed that the AUCs of the immune-related signature were higher than those of UICC stage and histological grade, indicating that the 8-lncRNA signature may be superior to other clinicopathologic features and serve as better prognostic biomarkers ( Figures 4E, 5E and 6E). For external validation of the immune-related signature for MIBC, we recruited 72 MIBC patients and validated these results in Tianjin validation cohort. The KM survival curve, time-dependent ROC curves and multivariate Cox regression analysis were consistent with the results of TCGA dataset, which confirmed the predictive power of the prognostic signature ( Figure 11).
In order to explore the wide applicability of the 8-lncRNA signature, we performed survival analyses in different subgroups stratified by UICC stage, age and gender. We observed that the 8-lncRNA signature was able to assess the risk score in various subgroups by accurately dividing the patients into high-risk group with poor survival and low-risk group with good survival ( Figure 7B-7G), except for patients with stage I and stage II BCa ( Figure 7A). One explanation for this phenomenon is that most of the patients with low stage AGING BCa had favourable prognosis with good survival and the heterogeneity of prognosis among them is small [4].
In addition, we also compared the risk scores between different subgroups and we identified that patients with advanced BCa (stage IV and stage III) had a higher risk score than those low stage BCa (stage II and stage I) ( Figure 7J). The above results indicated that the 8-lncRNA prognostic signature could help to distinguish prognosis of patients with different variables and may also predict the tumor progression.
To further validate the 8-lncRNA prognostic signature was tightly associated with immune system, GSEA was carried out between high-risk and low-risk groups. We identified that up-regulation of the immune-related signature was enriched in immune system process pathway (FDR = 0.022, Figure 8D) and immune response pathway (FDR = 0.015, Figure 8E). We also selected four immune-related DEGs including CTSG, CXCL12, LIG1 and TBX1 to explore their functions in BCa tumorigenesis and prognosis. CTSG is reported to activate neutrophil effector functions through release of formyl peptide receptor agonists in inflammatory and immune responses [28,29]. The CXCL12-CXCR4/CXCR7 chemokine axis activates immune cell migration and inhibits immune resistance in gastrointestinal malignancies [30]. Biallelic et al. found mutations in LIG1 are identified to underlie a spectrum of immune deficiencies by in vitro studies and TBX1 mutation is responsible for most of the congenital immune defect seen in the mouse models and in patients [31,32]. Immunohistochemistry from THPA and online analyses from GEPIA demonstrated that all the 4 immune-related DEGs differentially expressed between BCa and normal bladder tissues and they were associated with OS of BCa patients (P < 0.05, Figure 9).
In addition, our predictive nomogram combining our immune-related signature suggested that AUCs of 1year, 3-year and 5-year survival in the entire TCGA BLCA cohort are all greater than 0.700, indicating our nomogram and immune-related signature have wide applicability for both long-term and short-term followup patients ( Figure 10F-10H). In addition, we used cindex and AUC to compare the predictive value of the signature and nomogram. Both c-index and AUC of nomogram exceeded the immune-related signature, especially in predicting 5-year OS (Tables 4 and 5), which indicated the nomogram had better predictive power in long-term follow-up. Therefore, through combination of the 8-lncRNA signature and other prognosis-related clinical factors, the nomogram further increased the predictive power of survival.
In recent years, immunotherapy for BCa attracted more and more attention owing to the availability of targeted immunotherapies and checkpoint inhibitors [33][34][35]. Among these eight lncRNAs, WNT5A-AS1 and AL136084.3 are risk lncRNAs, while MIF-AS1, AC008735.2, AL357033.4, LINC00649, AC099343.2 and USP30-AS1 are protective lncRNAs. MIF-AS1 acted as a competing endogenous RNA by activating miR-1249-3p/HOXB8 axis in breast cancer [36]. LINC00649 might participate in intracellular receptor signaling pathways in prostate cancer patients [37]. LINC00649 was reported to be involved in vacuolar transport and histone modification functions as well as G protein-coupled receptor and Rho GTPases signaling pathways and is significantly related to the development and prognosis of CRC [38]. Although most of the identified lncRNAs have not been reported to be directly associated with immune-related function, the eight lncRNAs might participate in immune system through indirect pathways for the reason that enrichment results were immune-related. Hence, the eight lncRNAs targeting immune-related genes will have attractive applications to provide therapeutic targets for BCa, which could improve the prognosis of BCa patients.
Nowadays, many studies used high-throughput sequencing data to identify gene signatures and construct clinical predictive models. Gao et al. [39] identified a 6-lncRNA signature to predict the prognosis of BCa and their AUC for 5-year survival is 0.751, which is similar to 0.742 by our signature. However, they didn't report the AUC value for 1-year survival and thus the predictive power for short-term follow-up is uncertain. In addition, we validated its application in the external validation cohort, which further confirmed the predictive value of our signature. We also constructed a nomogram by enrolling the immunerelated signature and other prognosis-related clinical factors to broaden the applicability and increase the clinical significance. In the present study, we validated our findings in other online databases and discussed the relationship between immune system and BCa by immunohistochemistry.
To our knowledge, our study focused on lncRNAs targeting immune-related genes and further construct a signature and a nomogram for BCa. However, several limitations still existed. Firstly, most enrolled patients from TCGA BLCA dataset are MIBC and only 2 patients are NMIBC (stage I). Hence, the 8-lncRNA signature and nomogram might be more suitable for MIBC patients and further cohorts based on NMIBC patients are required to verify its application in NMIBC management in the future. Secondly, although external validation has been performed to confirm the predictive power of the signature and nomogram, the exact molecular mechanisms of eight lncRNAs have not been AGING investigated in the present study. Therefore, further in vitro and in vivo studies based on functional studies are warranted to verify these hypotheses and to make these results more convincible for clinical application in the future.
In summary, we identified an immune-related signature and nomogram based on eight lncRNAs targeting immune-related genes in MIBC. The 8-lncRNA signature and nomogram were confirmed to be independent prognostic indicators for MIBC and could act as a good classifier in different subgroups of MIBC patients. External validation was utilized to verify the predictive value and immunohistochemistry as well as GSEA validated the association between the signature and immune-related functions. These eight lncRNAs will have attractive applications to provide new diagnostic methods and treatment targets for MIBC, which could improve the prognosis of MIBC patients, if validated by further experiments.
To identify DEGs between BCa samples and normal bladder samples in TCGA BLCA dataset, we utilized limma R package [40]. The cut-off criteria of adjusted P-value (adj. P-value) was set as 0.05 and the criterion of Fold change was set as |logFC| ≥ 1. We also generated a volcano plot for these DEGs.
In order to reveal the potential functions of DEGs, the clusterProfiler R package is used to perform GO enrichment analysis and KEGG pathway enrichment analysis. In addition, adj. P-value < 0.05 was set as the cutoff criteria.

Immune-related genes and functional enrichment analysis
Immune-related genes were extracted from Molecular Signatures Database v4.0 (http://www.broadinstitute. org/gsea/msigdb/index.jsp: Immune system process M13664, Immune response M19817) [41]. Ultimately, 332 immune-related genes were identified. Then, we identified immune-related DEGs between BCa samples and normal bladder samples in TCGA BLCA dataset and generated a volcano plot as well as a heat map for these immune-related DEGs by using pheatmap R package. In addition, GO and KEGG pathway enrichment analyses were carried out with the criteria mentioned above.
Venn diagrams (http://bioinformatics.psb.ugent.be/ webtools/Venn/) were applied to identify overlapped enrichment results between all DEGs enrichment analysis and immune-related DEGs enrichment analysis. KEGG pathway and GO functional enrichments including BP, cell component (CC) and molecular function (MF) were calculated respectively.

Identification of lncRNAs targeting immune-related genes
A total of 33 immune-related DEGs (18 were upregulated and 15 were down-regulated) were identified through differential expression analysis based on the 332 immune-related genes. In order to further find lncRNAs targeting immune-related genes, Pearson correlation analysis was performed between expression levels of all lncRNAs and 18 up-regulated immunerelated DEGs or 15 down-regulated immune-related DEGs, respectively. |R| > 0.5 and P-value < 0.01 were established as cut-off criteria. Ultimately, 262 lncRNAs targeting immune-related genes were identified according to the above criteria in the TCGA BLCA dataset.

Identification of an immune-related prognostic signature based on lncRNAs targeting immunerelated genes
We used survival R package to construct the lncRNA signature. The criteria for BCa samples in identification and validation of the lncRNA signature were as follows: (1) complete lncRNA expression values and clinical characteristics (age at diagnosis, gender, UICC stage, histological grade, and survival time); and (2) samples with total survival time less than 1 month were excluded for these samples might die of nonneoplastic factors including severe infection or hemorrhage. Ultimately, a total of 389 BCa samples were included for further construction of the lncRNA signature. We randomly divided 389 BCa samples into the training cohort (n = 195) and testing cohort (n = 194). The training cohort was used to develop the lncRNA signature. The testing cohort was used to validate the lncRNA signature.
Next, the 262 selected lncRNAs were put into univariate Cox regression analysis in the training cohort in order to screen prognosis-related lncRNAs (P-value < 0.01). Univariate Cox regression analysis identified 21 prognosis-related lncRNAs. Then, we used stepwise selection with Akaike information criteria (AIC) AGING method from the 21 prognosis-related lncRNAs to select the optimal model [42]. Ultimately, we selected 8 lncRNAs with the smallest AIC value to construct the predictive model.
We performed a multivariate Cox regression analysis on the 8 prognosis-related lncRNAs to develop a prognostic lncRNA signature and calculate the coefficients [43][44][45]. The risk score of the prognostic lncRNA signature for each patient was calculated as the following formula: Risk score = ExpressionlncRNA1 × CoefficientlncRNA1 + ExpressionlncRNA2 × CoefficientlncRNA2 + … + ExpressionlncRNAn × CoefficientlncRNAn.
The risk score of the prognostic lncRNA signature was calculated according to a linear combination of the expression level of lncRNAs weighted by the regression coefficient (β). The β was calculated by log-transformed HR derived from multivariate Cox regression analysis [46,47]. Low-risk and high-risk groups were determined by the median value of risk score.

Predictive power of the lncRNA prognostic signature in training cohort and validation
We performed univariate and multivariate Cox regression analysis to evaluate the predictive power of lncRNA signature and other clinical characteristics (age, gender, UICC stage, and histological grade) in training cohort (n = 195). C-index and time-dependent ROC curve with AUC value were performed to further assess the predictive value of lncRNA signature and these clinical characteristics by survivalROC, timeROC and rms R packages. The training cohort was divided into two lowrisk and high-risk groups by the median value of risk score. Then, KM survival curve was plotted to compare the differences of OS in the two groups. P-value < 0.05 was considered as statistically significant difference.
At the same time, stability and reliability of the lncRNA signature was validated in testing cohort (n = 194) and in entire TCGA BLCA cohort (n = 389). They were also divided into two low-risk and high-risk groups by the median value of risk score. Furthermore, univariate and multivariate Cox regression analysis, time-dependent ROC curve, KM survival curve and c-index were analyzed for the validation as mentioned earlier.

Bioinformatics analysis
PCA was carried out to profile expression patterns of grouped samples by using scatterplot3d R package. GSEA (http://www.broadinstitute.org/gsea/index.jsp) was carried out to find different functional phenotypes between the two groups [48]. Gene set permutations were performed 1000 times for each analysis. FDR < 25% and nominal P-value < 0.05 were regarded as the cut-off criteria of sorting GO and KEGG pathway enrichments in GSEA.
What's more, we used GEPIA database (http:// gepia.cancer-pku.cn/) for further validating the differential expression of the 4 immune-related genes based on TCGA and GTEx datasets and calculating OS with the 4 immune-related genes on the basis of TCGA BLCA dataset [50].

Identification and validation of a predictive nomogram
Nomogram [51] was constructed by including age, UICC stage, histological grade, and the immune-related prognostic signature in training cohort. The training cohort was divided into two low-nomogram-score and high-nomogram-score groups by the median value of nomogram score. KM survival curve, time-independent ROC curve and c-index were conducted to evaluate the predictive power of nomogram. We also performed calibration plot to explore the calibration and the discrimination of the nomogram by a bootstrap method with 1000 resamples.
In the meantime, stability and reliability of the nomogram was validated in testing cohort. KM survival curve, ROC curve and c-index were analyzed for the validation in testing cohort (n = 194) and in entire TCGA BLCA cohort (n = 389) as mentioned earlier.

Quantitative real-time polymerase chain reaction and External validation in Tianjin cohort
A total of 72 patients who were pathologically and clinically diagnosed with MIBC were enrolled from Tianjin Medical University General Hospital. MIBC tissues were obtained at the time of first surgery as Tianjin cohort. The study was approved by the Ethics Committee of Tianjin Medical University General Hospital. All recruited participants signed informed consent before being enrolled in our study. Table 7. Primer sequences used to amplify target lncRNAs by quantitative real-time polymerase chain reaction (qRT-PCR). Total RNA from MIBC samples were extracted using RNeasy kit (Qiagen, Valencia, CA). The first chain of cDNA was synthesized by reverse transcription with TaqMan ® Reverse Transcription Reagents (Applied Biosystems, Grand Island, NY). GAPDH was used as internal control. The sequences of the primers were displayed in Table 7. qRT-PCR was performed using the CFX96 Touch PCR system (Bio-Rad). The relative lncRNA expression levels of WNT5A-AS1, AL136084.3, MIF-AS1, AC008735.2, AL357033.4, LINC00649, AC099343.2 and USP30-AS1 were calculated by 2 -ΔΔCt method. Risk score was calculated with the previous following formula of signature. In addition, univariate and multivariate Cox regression analysis, time-dependent ROC curve and KM survival curve were analyzed for the validation in Tianjin cohort (n = 72) as mentioned earlier.

Statistical analysis
Statistical analyses were carried out using R software (v3.5.3: http://www.r-project.org) and SPSS v23.0. The RNA-sequencing data of gene and lncRNA were log2transformed. Univariate and multivariate Cox regression analyses, KM method and log-rank test were used to compare the influence of the lncRNA prognostic signature on survival along with other clinical characteristics. Chi-square test and Student's t test were used to evaluate qualitative variables and quantitative variables, respectively. Time-dependent ROC curve and c-index was utilized to assess the prognostic value based on the lncRNA signature and nomogram.
Delong's Z-test was utilized to compare the AUC and cindex between the signature and nomogram [52]. The two-sided P-value < 0.05 was regarded as statistically significant.

AUTHOR CONTRIBUTIONS
Research design: SYX, JDH and CJY; Data extraction and statistical analysis: SYX, JDH, LZW and CGY; Drafting of the manuscript and modification: SYX, LXQ and YYJ.