Artificial intelligence‐driven consensus gene signatures for improving bladder cancer clinical outcomes identified by multi‐center integration analysis

To accurately predict the prognosis and further improve the clinical outcomes of bladder cancer (BLCA), we leveraged large‐scale data to develop and validate a robust signature consisting of small gene sets. Ten machine‐learning algorithms were enrolled and subsequently transformed into 76 combinations, which were further performed on eight independent cohorts (n = 1218). We ultimately determined a consensus artificial intelligence‐derived gene signature (AIGS) with the best performance among 76 model types. In this model, patients with high AIGS showed a higher risk of mortality, recurrence, and disease progression. AIGS is not only independent of traditional clinical traits [(e.g., American Joint Committee on Cancer (AJCC) stage)] and molecular features (e.g., TP53 mutation) but also demonstrated superior performance to these variables. Comparisons with 58 published signatures also indicated that AIGS possessed the best performance. Additionally, the combination of AIGS and AJCC stage could achieve better performance. Patients with low AIGS scores were sensitive to immunotherapy, whereas patients with high AIGS scores might benefit from seven potential therapeutics: BRD‐K45681478, 1S,3R‐RSL‐3, RITA, U‐0126, temsirolimus, MRS‐1220, and LY2784544. Additionally, some mutations (TP53 and RB1), copy number variations (7p11.2), and a methylation‐driven target were characterized by AIGS‐related multi‐omics alterations. Overall, AIGS provides an attractive platform to optimize decision‐making and surveillance protocol for individual BLCA patients.


Introduction
As one of the most common malignancies globally, over 430 000 patients were diagnosed with bladder cancer (BLCA) in 2020 [1]. Due to the rapid disease progression and undertreatment, BLCA patients performed high mortality, recurrence, and treatment failure rates [2,3]. Hence, the early identification and intervention of high-risk BLCA patients exhibit dramatic significance. As a recognized authoritative grading system, the American Joint Committee on Cancer (AJCC) stage system provides a common platform for evaluating the prognostic risk and treatment measures in clinical management. However, different clinical outcomes of the same stage patients and the lack of molecular biological characteristics indicate the limitations of the AJCC classification system [4,5], which might lead to potential over-or under-treatment. Over the past decades, immunotherapy has exhibited a great sensation due to the dramatic benefits of solid cancer treatment, including BLCA [6,7]. Immune checkpoint inhibitor (ICI) can promote the immune system to recognize and suppress basic molecular targets of tumour cells such as PD-1, CTLA-4, and PD-L1. In the US, PD-L1 targeting drugs like atezolizumab and pembrolizumab have been approved as first-line therapy in patients with platinum-ineligible PD-L1+ BLCA [8]. Apart from this, there are other molecular classification tools to stratify patients, such as CTLA-4, PD-1, and tumour mutation burden (TMB) [9]. Unfortunately, these classification systems do not perfectly predict response to ICI therapy and just a small proportion of BLCA patients can benefit from them [9]. Given the enormous cost and serious adverse effects of immunotherapy, exploring a novel biomarker for effective immunotherapy management in BLCA is also warranted.
As is known to all, BLCA is a comprehensive tumour with inter-and intra-tumour heterogeneity [10]. Ideal biomarkers should possess homogeneous expression within and between tumour tissues to behave stably accord all patients. Thus, multigene panels are possible to be an effective approach to address inter-and intra-cancer heterogeneity. With the rapid development of bioinformatics and computer technology, a large number of gene prognosis signatures have been reported [11,12]. Models integrated through multiple gene profiles, including messenger RNA, micro-RNA, and long non-coding RNA, were constructed and validated as promising biomarkers for BLCA [13][14][15]. However, considering the underutilization of data information, inappropriate machine-learning methods, and lack of rigorous validation in different cohorts and clinical trials, multigene signatures demonstrate dramatic limitations in clinical application [16,17].
To fill these gaps, we comprehensively investigated the clinical value of BLCA gene expression profiles, and a consensus artificial intelligence-derived gene signature (AIGS) from the combination of 76 machinelearning algorithms was systematically developed. The predictive value of AIGS for overall survival (OS), recurrence-free survival (RFS), progression-free survival (PFS), immunotherapy, and chemotherapy was tested in 1351 BLCA patients from 11 independent cohorts. After comparing our signature with 58 published models, traditional clinical traits, and molecular features, the clinical translation value and robust performance of AIGS were further validated. Additionally, a methylation-driven target and some AIGSrelated mutations (TP53 and RB1) and copy number variations (CNVs; 7p11.2) were obtained based on multi-omics data analysis, and seven potential therapeutic drugs for high-risk patients were also identified in our study. Overall, our work provides an attractive platform for recognizing high-risk patients and optimizing precision treatment for BLCA.

Data collection and processing
The flow chart of our research is illustrated in Fig. 1. A total of 10 independent datasets (n = 1003) were retrospectively retrieved from Gene Expression Omnibus and The Cancer Genome Atlas (TCGA) database, respectively. The IMvigor210 cohort (n = 348) was enrolled from the published research [18]. Samples were screened according to the following conditions: (a) primary cancer tissues; (b) no preoperative radiotherapy or chemotherapy received; (c) survival information was available; and (d) RNA expression data were available. See Table S1 for the detailed baseline information.
Among the 11 cohorts, eight cohorts (TCGA-BLCA, GSE13507, GSE19423, GSE31684, GSE37815, GSE48075, GSE48276, and IMvigor210) with complete OS information were employed to construct and validate our AIGS. Two cohorts (GSE31684 and GSE154261) contained RFS, and PFS information was utilized to investigate the predictive performance of AIGS for RFS and PFS, respectively. Two immunotherapy cohorts (including IMvigor210 and GSE91061) and one chemotherapy cohort (GSE52219) were performed to assess the value of AIGS in predicting immunotherapy and chemotherapy responses for  BLCA. Of note, the FPKM normalized data from TCGA were further converted into log2 (TPM + 1), and 14 843 intersection genes of these cohorts were obtained for the subsequent analysis.

Generation of AIGS
As previously reported by Liu et al. [19,20], gene expression profiles were converted into z-score across all datasets to enhance comparability between different cohorts, which in turn constructed a consensus artificial intelligence-derived signature. Cohorts with OS information including TCGA-BLCA, GSE13507, GSE19423, GSE31684, GSE37815, GSE48075, GSE48276, and IMvigor210 were utilized to develop AIGS in as following steps: 1 14 843 intersection genes of these eight OS cohorts were utilized to perform univariate Cox analysis.
Considering the small sample of some cohorts and the strict multiple testing correction that might filter out potential genes associated with OS, genes with both an unadjusted P < 0.2 and the same hazard ratio (HR) direction for more than six cohorts were considered as robust OS-related genes (RORGs). 2 To develop a consensus AIGS with high accuracy and stability performance, we integrated 10 machine-learning algorithms including random survival forest (RSF), elastic network (Enet), Lasso, Ridge, stepwise Cox, CoxBoost, partial least squares regression for Cox (plsRcox), supervised principal components (SuperPC), generalized boosted regression modelling (GBM), and survival support vector machine (survival-SVM). A few algorithms possessed the ability of feature selection, such as Lasso, stepwise Cox, CoxBoost, and RSF. Thus, we combined these algorithms to generate a consensus model. In total, 76 algorithm combinations were conducted on RORGs to fit prediction models based on 10-fold cross-validation. The initial signature discovery was performed in TCGA-BLCA. The RSF model was implemented via the randomForestSRC package. RSF had two parameters ntree and mtry, where ntree represented the number of trees in the forest and mtry was the number of randomly selected variables for splitting at each node. We used a grid search on ntree and mtry using 10-fold crossvalidation. All the pairs of (ntree, mtry) were formed, and the one with the best C-index value was identified as the optimized parameter. The Enet, Lasso, and Ridge were implemented via the glmnet package. The regularization parameter, lambda, was determined by 10-fold cross-validation, whereas the L1-L2 trade-off parameter, a, was set to 0-1 (interval = 0.1). The stepwise Cox model was implemented via survival package. A stepwise algorithm using the AIC (Akaike information criterion) was applied. The CoxBoost model was implemented via CoxBoost package, which is used to fit a Cox proportional hazards model by componentwise likelihood-based boosting. For the CoxBoost model, we used 10-fold cross-validation routine op-timCoxBoostPenalty function to first determine the optimal penalty (amount of shrinkage). Once this parameter was determined, the other tuning parameter of the algorithm, namely, the number of boosting steps to perform, was selected via the function cv.CoxBoost. The dimension of the selected multivariate Cox model was finally set by the principal routine CoxBoost. The plsRcox model was implemented via plsRcox package. The cv.plsRcox function was used to determine the number of components requested, and the plsRcox function was applied to fit a partial least squares regression generalized linear model. The SuperPC model was implemented via SUPERPC package and is a generalization of principal component analysis, which generates a linear combination of the features or variables of interest that capture the directions of largest variation in a dataset. The superpc.cv function used the 10-fold cross-validation to estimate the optimal feature threshold in SUPERPC. To avoid problems with fitting Cox models to small validation datasets, it uses the 'pre-validation' approach. The GBM was implemented via the gbm package. Using the 10-fold cross-validation, the cv.gbm function selected index for number trees with minimum cross-validation error. The gbm function was used to fit the generalized boosted regression model. The survival-SVM model was implemented via survivalsvm package. The regression approach takes censoring into account when formulating the inequality constraints of the support vector problem. 3 All these 76 algorithms were applied in the other seven OS cohorts. The C-index across all validation cohorts was calculated for each signature, and signature displayed the highest average C-indices was considered the optimal one.

Acquisition of published signatures
As illustrated in given the lack of miRNA information in our cohorts. Afterwards, univariate Cox analyses were employed, and the C-indices for each signature across all cohorts were calculated.

Gene set enrichment analysis (GSEA)
Pearson's correlation was utilized to evaluate the correlation between AIGS scores and genes. All genes were arranged in descending order of correlation coefficient, and GSEA was performed by clusterProfiler R package to recognize remarkably enriched terms associated with the GO and KEGG pathways, subsequently [21].

Gene set variation analysis (GSVA)
The Hallmark gene set was obtained from the molecular signatures database. According to the median AIGS score, patients were divided into two groups. To reduce the overlap and redundancy of pathways, gene set associated with a pathway was screened to contain unique genes, and genes related to multiple pathways were also removed [22]. The limma package was employed to recognize the remarkably altered pathways between the high-and low-AIGS groups, and the pathway with |t| > 1 was regarded as significant.

Comprehensive analyses based on immune cell infiltration and immune checkpoints
The single-sample Gene Set Enrichment Analysis was performed to assess the infiltration abundance of 24 immune cells in tumour immune microenvironment via GSVA package in the IMvigor210 cohort. Gene set of 24 immune cell and 27 immune checkpoints including the member of the TNF superfamily, B7-CD28 family, and other molecules were retrieved from the published research [23][24][25][26][27]. Afterwards, the correlations among the AIGS scores, immune infiltration, and immune checkpoints were investigated.

Immunotherapeutic response prediction
The tumour immune dysfunction and exclusion (TIDE), immunophenoscore (IPS), and subclass mapping (SubMap) algorithms were employed to predict the responses to immunotherapy from multiple perspectives [28][29][30]. TIDE is an algorithm that evaluates immune evasion by integrating the expression characteristics of T-cell exclusion and T-cell dysfunction. IPS was proved to be a superior predictor to identify determinants of immunogenicity and characterize the intratumoral immunologic landscape [31], and Sub-Map was utilized to derive the degree of commonality between high-and low-AIGS groups. Notably, the adjusted P-value was employed to assess the similarity, and a lower adjusted P-value represents a higher similarity.

The mutation landscape of BLCA
The somatic mutation (VarScan2 variant aggregation and masking) and HumanMethylation450 array were downloaded from the TCGA GDC website, and TMB was obtained by calculating the count of non-silent somatic mutation in every patient. To investigate the differences in genomic mutations between high-and low-AIGS patients, the mutation waterfall plot of the top 30 genes with the highest mutation number was visualized via the maftools and ComplexHeatmap R packages. Afterwards, univariate and multivariate logistic regression analyses were employed to evaluate and verify the relationship between 30 genes mutation status and AIGS. It is worth noting that apart from age, gender, and stage, TMB was also included in the multivariate logistic regression to ensure that AIGS was an independent factor for these mutated genes.

CNV for BLCA patients
CNV data processed by the Genomic Identification of Significant Targets in Cancer 2.0 algorithm were retrieved from FireBrowse (http://firebrowse.org/) [32]. The ComplexHeatmap package was performed to visualize the CNV waterfall chart of the top 15 amplification (AMP) and homozygous deletion (Homdel) chromosome fragments. To investigate the proportion of genomic alterations, the fraction of genomic alterations, fraction of genomes gained (FGG), and fraction of genomes lost (FGL) were also calculated. Besides, the Wilcox test, univariate and multivariate logistic analyses were employed to assess the correlation between 30 CNV fragments and AIGS. Similarly, FGG and FGL were contained in the multivariate logistic regression analysis to confirm that AIGS was an independent factor of AMP and Homdel fragments, respectively.

Identification of methylation-driven genes (MDGs)
The HumanMethylation450 array for BLCA was retrieved from TCGA. Referring to the description of Liu et al. [33,34], the global methylation level was evaluated through averaged beta values of the specific probes. The MethylMix R package was used to integrate methylation and mRNA data for correlation analysis. Genes with significantly negatively correlated with expression were defined as MDGs. The relationship between MDGs and AIGS was explored subsequently.
2.11. Estimation of drug response and potential therapeutic agents Drug sensitivity information of cancer cell lines (CCLs) was collected from PRISM and the Cancer Therapeutics Response Portal datasets (CTRP). Sensitivity data for over 481 compounds can be retrieved in CTRP, and sensitivity data for 1448 compounds in over 482 CCLs are available in PRISM. Both datasets provide area under the dose-response curve (AUC) value as a measure of drug sensitivity, with lower AUC values indicating higher drug response. Knearest neighbour (k-NN) imputation was employed to impute the missing AUC values, and before imputation, compounds with less than 20% missing values were screened out.

Statistical analysis
The relationship of two variables was obtained by Pearson's correlation. The Survival package was utilized to perform Kaplan-Meier survival analysis, and the different significance was determined by the logrank test. The ROC curves were plotted by pROC package. Differences in AIGS between the high and low groups were compared by independent samples ttest or Wilcoxon rank-sum test. All statistical P values were two-sided, and P < 0.05 was defined as statistically significant. Adjust P-value was employed using Benjamini-Hochberg (BH) multiple test correction. All data processing and plotting were finished in R 4.1.2 software.

Integrative construction of AIGS
Based on the screening criteria in Fig. 2A, a total of 30 RORGs were obtained (Fig. 2B). The expression profiles of these 30 RORGs were utilized to construct the consensus signature subsequently. According to the 10-fold cross-validation framework, 76 kinds of prediction signatures were developed in TCGA-BLCA cohort. The C-index was employed to assess the predictive ability of these 76 models. Additionally, to verify whether the models performed consistently in the different cohorts, the C-index of each model was calculated across the remaining seven validation cohorts. As Fig. 2C displayed, Ridge was confirmed to be the optimal signature with the highest average Cindex (0.709). Based on the expression and weighted regression coefficients of the 30 RORGs, a risk score for each patient was further calculated. For the annotation information of these 30 RORGs, please refer to Table S3.

Independent prognostic value of AIGS
Kaplan-Meier survival analysis exhibited that the mortality rate in the high-risk group was significantly higher than the low-risk group in the training cohort (TCGA-BLCA, n = 400, P < 0.0001) and other seven validation cohorts GSE13507 (n = 165, P < 0.0001), GSE19423 (n = 48, P < 0.0001), GSE31684 (n = 93, P = 0.00029), GSE37815 (n = 18, P = 0.00099), GSE48075 (n = 73, P < 0.0001), GSE48276 (n = 73, P = 0.012), and IMvigor210 (n = 348, P = 0.0008; Fig. 3A-H). Similarly, comparisons of RFS and PFS suggested that the relapse rate in the high-risk group was significantly higher in GSE31684 (n = 93, P = 0.00024), and the progress rate in the high-risk group was dramatically higher in GSE154261 (n = 71, P = 0.0091; Figs 3I and S1A). Multivariate Cox analysis was employed to investigate whether the prognostic performance of AIGS was independent after adjusting for clinical factors and molecular features like age, gender, stage, T, N, M, grade, intravesical therapy, systemic chemotherapy, smoking, BCG treatment, platinum therapy, neoadjuvant chemotherapy, FGFR3, p53, RAS, and RB1 mutations. As shown in Tables S4-S6, AIGS displayed statistically significant for OS, RFS, and PFS across all cohorts after adjusting these features, suggesting that it was an independent risk factor in BLCA.      (Fig. 3K). All of the above results demonstrated that AIGS possessed superior stability and extrapolation across multiple independent cohorts.

Stable performance of AIGS
Over the past few decades, some clinical traits and molecular features played a fundamental role in prognosis risk assessment and clinical decision optimization. Therefore, we compared the prediction accuracy of AIGS with these variables. As illustrated in Fig. 4A-H, AIGS exhibited dramatically higher accuracy than these features, including age, stage, gender, T, N, M, grade, intravesical therapy, systemic chemotherapy, smoking, BCG treatment, platinum therapy, neoadjuvant chemotherapy, FGFR3, p53, RAS, and RB1 mutations, revealing that our AIGS potential to be a promising tool in evaluating the prognosis risk of BLCA.

Comparisons between AIGS and 58 published signatures
With the rapid development of high-throughput sequencing and bioinformatics, a large number of prediction models based on gene expression have been reported. To compare whether our AIGS signature is more predictive than other models, a total of 58 mRNA /lncRNA predictive signatures developed by various machine-learning algorithms, including Lasso, GBM, Ridge, were systematically retrieved (Table S2). Univariate Cox analysis exhibited that AIGS not only remained significant but also displayed superior Cindices across all eight cohorts (Fig. 5A,B), indicating its robust stability and performance to predict the prognosis risk of BLCA patients. Notably, in the TCGA dataset, a few models with TCGA as the training cohort (for example, Chen, Cao, Yan, Liu, Zhu, etc.) possessed higher C-indices than AIGS. However, all these signatures performed poorly on other cohorts, suggesting that they suffer from overfitting and frustrating extrapolation. In GSE37815, AIGS ranked second, weaker than Wu, while Wu's performance was relatively poor in other datasets, the C-indexes in some cohorts were even less than 0.6, which might also be caused by the overfitting.

Nomogram based on AIGS and clinical features
Considering the promising clinical application of AIGS, a prognostic nomogram integrated two independent predictors (AIGS and clinical Stage) of mortality was developed (Fig. 6A). Meanwhile, individualized patient scores were calculated to predict the OS of 1, 3, and 5 years. The calibration plot suggested that our nomogram displayed superior performance in predicting the prognosis of BLCA patients (Fig. 6B). Likewise, the AUC values of the nomogram were 0.774/0.752/0.778 at 1/3/5 years (Fig. 6C), indicating its high accuracy and stability.

Biological function analysis of AIGS
To explore the potential functional and molecular mechanisms of AIGS, GSEA and GSVA were subsequently performed. After sorting in descending order according to the absolute value of Normalized Enrichment Score, the top 20 pathways of GO and KEGG were selected, respectively. As displayed in Fig. S1C, D, AIGS-related genes were mainly enriched in tumour development and metastasis-related pathways, including DNA replication, chromosome segregation, epidermal cell differentiation, cell cycle, mismatch repair, and ECM receptor interaction. Figure S1E,F revealed that all these pathways were positively correlated with the AIGS score. In parallel, the GSVA result demonstrated that the vast majority of pathways in the Hallmark gene set were significantly different between the high and low-risk groups, indicating that AIGS was highly correlated with tumours (Fig. S1G).

Low AIGS scores predicted better immune and chemotherapy responses
To evaluate the predictive potential of AIGS for immunotherapy and chemotherapy, two immunotherapy cohorts (IMvigor210, GSE91061) and one chemotherapy cohort (GSE52219) were retrospectively collected. As illustrated in Fig. 7A,B, a low AIGS score predicted better immune response in both IMvig-or210 (P < 0.01) and GSE91061(P < 0.01) cohorts. The same result was obtained in the chemotherapy cohort (Fig. 7C, P < 0.05), revealing that patients with a lower AIGS score respond better to the chemotherapy. Additionally, based on multiple immune response testing algorithms and features like TIDE, IPS, and SubMap, we further verified the predictive performance of AIGS in immune response. The results suggested that patients in the low AIGS group exhibited a higher immune response rate (Fig. 7D), higher IPS score (Fig. 7E), and a superior anti-PD-1 and CTLA-4 immunotherapy efficacy (Fig. 7F), indicating the powerful stability of AIGS in predicting immune response for BLCA.

Immune landscape and potential AIGS immunotherapeutic targets of BLCA
To validate the above results and investigate the underlying mechanism between AIGS and immunity, we explored the immune landscape for BLCA based on immune infiltration and immune checkpoints in the IMvigor210 dataset. As Fig. S2A exhibited, there was a significant difference in immune cell infiltration between the high and low AIGS groups. Notably, the low AIGS group exhibited higher infiltration of CD8 T cells (P < 0.05), dendritic cells (P < 0.01), NK CD56 bright cells (P < 0.01), plasmacytoid dendritic cells (P < 0.0001), and lower infiltration of macrophages (P < 0.01) and T helper (Th) 2 cells (P < 0.01; Fig. 7G). In parallel, the results of immune checkpoint correlation analysis suggested that a low AIGS score predicted higher expression of CTLA-4, CD274, TMIGD2, CD40, and SIGLEC15 and lower expression of VTCN1, CD70, and TNFRSF18 (Figs S2B and 7H), indicating that CTLA-4, CD274, TMIGD2, CD40, and SIGLEC15 were promising to be the potential targets for immunotherapy.

Somatic mutational landscape, CNVs, and methylation-driven gene with regard to AIGS
The mutational landscape of the top 30 frequently mutated genes (FMGs) was exhibited in Fig. 8A. Overall, seven FMGs exhibited significantly higher mutational frequency between the high and low AIGS groups, including TP53, TTN, RB1, FGFR3, ELF3, SPTAN1, and NEB (Fig. 8B). Univariate and multivariate logistic regression analysis demonstrated that AIGS was not only dramatically associated with TP53, RB1, FGFR3, and ELF3 mutations but also remained an independent significance after adjusting for clinical characteristics such as age, gender, stage, and TMB (Fig. 8C). The CNV status of the top 15 AMP and Homdel chromosome fragments between high and low AIGS groups was further characterized (Fig. 8D). As displayed in Fig. S2C (Fig. 8E).    Through the MethylMix R package, 55 MDGs (gene expression was significantly negatively correlated with methylation) were screened. Afterwards, the correlation between these genes and the AIGS score was further calculated via the Wilcox test. As displayed in Fig. S2D, the gene expression and methylation level of RTP4 were significantly negatively correlated in the high and low AIGS group and all patients. Besides, a higher AIGS score exhibited a higher RTP4 methylation level (Fig. 8F) and a lower expression (Fig. 8G).

Drug response evaluation and potential therapeutic agents for high-risk BLCA patients
Gene expression and drug sensitivity profiles for hundreds of CCLs are available from the CTRP and PRISM websites to develop predictive models of drug response. After removing compounds with more than 20% of samples with missing values and cell lines from hematopoietic and lymphoid tissues, 266 and 1285 compounds were obtained in the CTRP and PRISM databases, respectively. Afterwards, based on the expression profile, the pRRophetic package based on the ridge regression algorism was employed to predict the drug response, resulting in an estimated AUC value for each compound in the sample. To ensure that the obtained drug sensitivities were reliable, cisplatin, a commonly applied drug in neoadjuvant chemotherapy for BLCA was utilized to explore whether the predicted drug sensitivity was consistent with its clinical efficacy. Studies have shown that reduced expression or dysfunction of BRCA1 predicts a higher cisplatin sensitivity in BLCA [35]. Thus, patients in the TCGA cohort were divided into highand low-BRCA expression groups based on the median values of BRCA1 expression. In line with the published research, low BRCA1 expression patients exhibited a lower cisplatin AUC (P < 0.001; Fig. 9B), indicating the remarkable accuracy of estimated drug response.
To obtain potential therapeutic agents for patients with high AIGS scores, two different screening modalities were employed (Fig. 9A). Firstly, differential drug response analysis was performed on patients in the AIGS high and low score groups to select compounds with high AIGS scores and low AUC, followed by Spearman's correlation analysis to screen compounds that displayed significantly negative correlation between AUC values and AIGS (r < À0.15 for CTRP or r < À0.2 for PRISM). As displayed in Fig. 9C, seven compounds with lower estimated AUC values and higher AIGS scores were finally obtained.

Discussion
The AJCC classification is a traditional applied clinical management guidance scheme for BLCA. Given the heterogeneity of tumours and the different clinical outcomes of patients with the same stage, AJCC staging displayed significant limitations in prognostic management and risk assessment, which may lead to potential undertreatment or overtreatment [4,5]. Recently, the rise of immunotherapy has brought new insights into BLCA treatment, significantly extending the OS of patients. However, only a small proportion of patients can benefit from immunotherapy [9]. Thus, the development of stable biomarkers to distinguish the immunotherapy-sensitive patients accounts for a major challenge in current BLCA immunotherapy [36]. To bridge these gaps, we comprehensively investigated the relationship among gene transcriptome profiles, prognosis, recurrence, and immunotherapy response.
In this study, based on eight independent cohorts, 30 stable OS-related genes were screened for the development of AIGS. Under the comprehensive application of computer technology, high-throughput sequencing and bioinformatics, a large number of gene expression prediction models based on various machine-learning methods have been developed. Notably, why a particular algorithm is employed and which algorithm is optimal deserves further discussion. In fact, the choice of researchers in the algorithm is largely determined by individual preferences and biases. To avoid this problem, we combined 10 machinelearning algorithms and selected the optimal signature among 76 combined signatures. Eventually, AIGS, a Ridge-based consensus machine-learning signature with the highest mean C-index (0.709) among eight cohorts, was developed, which is also an independent prognostic factor for BLCA prognosis (OS, RFS, PFS). To enable the better clinical application of AIGS, a nomogram was further developed, and it displayed a higher C-index in the TCGA cohort compared to AIGS alone, indicating a superior predictive value for the prognosis prediction of BLCA.
Some traditional clinical traits (e.g., age, gender, AJCC stage, and smoking) and emerging molecular features (e.g., mutation status of TP53 and RB1) displayed dramatic significance in the prognostic evaluation and clinical management of BLCA, for instance, compared with male, female with BLCA exhibit a worse prognosis and higher risk of death [37]. Patients with TP53 and RB1 mutations are generally more aggressive and display worse OS [36]. Thus, we compared the superiority of AIGS with these clinical and molecular features. Across eight independent cohorts, our model not only demonstrated an independent predictive performance after adjusting features like age, gender, AJCC stage, grade, intravesical therapy, systemic chemotherapy, smoking, BCG treatment, platinum therapy, neoadjuvant chemotherapy, FGFR3, p53, RAS, and RB1 mutations but also presented remarkable superior accuracy in assessing the prognosis risk according to the C-index assessment, suggesting that it potentially to be a promising surrogate to evaluate the prognosis risk for BLCA in clinical practice.
In addition, we retrospectively collected 58 published signatures consisting of different function genes. The result showed that AIGS maintained relatively superior performance and extrapolation in each cohort compared to these models. Notably, some models performed well in their respective modelling cohorts like Chen, Cao, Yan, Liu, Zhu, et al., but all these signatures performed poor performance in other datasets due to the overfitting.
Based on the above results, AIGS exhibited superior stability in the stratification of high-and low-risk patients; therefore, rational clinical intervention for patients with different levels of AIGS is warranted. Indeed, our AIGS performed high stability in the prediction of immunotherapy response. A lower AIGS predicted higher sensitivity to immunotherapy in both cohorts IMvigor210 and GSE91061, which was further validated in multiple algorithms including TIDE, IPS analysis, and Submap. This finding exhibited far- Fig. 9. Identification of candidate agents with higher drug sensitivity in high-AIGS score patients. (A) Schematic outlining the strategy to identify agents with higher drug sensitivity in high AIGS score patients. (B) Comparison of estimated cisplatin's sensitivity between high and low BRCA1 expression groups; t-test. (C) Potential therapeutic compounds for high-AIGS BLCA patients; Wilcox test. The error bars indicate 95% confidence interval (CI). *P < 0.05, ***P < 0.001. AIGS, Artificial intelligence-derived gene signature. reaching implications for the optimization of treatment strategies in patients with BLCA. In addition, lower AIGS also suggested higher chemosensitivity. However, due to the lack of treatment information, this analysis was only performed in one cohort (GSE52219), and the relationship between AIGS and chemotherapy response remains to be confirmed by more clinical and prospective experiments. Additionally, we explored the underlying biological mechanisms of AIGS. High AIGS was mainly associated with tumour development and metastasis-related pathways (including DNA replication, chromosome segregation, epidermal cell differentiation, cell cycle, mismatch repair, ECM receptor interaction, etc.). Besides, GSVA results suggested that MTORC1 signaling and epithelial-mesenchymal transition pathways were significantly positively correlated with AIGS, which have been reported closely related to tumour cell proliferation, invasion, and metastasis [38,39]. These results may explain the poor prognosis of patients with high AIGS. Notably, studies have shown that the E2F targets pathway is associated with chemotherapy resistance in bladder cancer [40]. Combined with the results of GSVA, patients with low AIGS may be more sensitive to chemotherapy, which paralleled our result in GSE52219.
As we all know, high CD8 T-cell infiltration generally predicts a better immunotherapy response [41,42]. Immune cell infiltration results suggested that CD8 T cells exhibited a higher infiltrating abundance in patients with low AIGS, which is consistent with the above results. Over the last decade, ICIs have made a huge splash in cancer treatment by targeting immune checkpoints like PD-L1, CD40, and CTLA-4 [43]. Our findings performed that the expression of CTLA-4, CD274 (PD-L1), and CD40 was significantly increased in patients with low AIGS, which supports our conclusion that low AIGS predicts a better immune response. Besides, the expression of TMIGD2 and SIGLEC15 was significantly increased in the low AIGS patients, suggesting that they may be potential immunotherapy targets.
Based on multi-omics data, we further investigated the mutation and CNV features with regard to AIGS. The result suggested that high AIGS was independently associated with TP53, RB1, and ELF3 mutations, and low AIGS was independently related to FGFR3 mutation (after adjusting for TMB). Studies have performed those mutations of TP53, RB1, and ELF3 and revealed the occurrence and poor prognosis of BLCA [44][45][46][47][48]. Conversely, FGFR3 mutation predicted a better OS [49]. All these results support the conclusion that high AIGS scores displayed a worse prognosis. Additionally, our study suggested that high AIGS not only predicted the amplification 7p11.2 but also remained an independent significance after adjusting for clinical characteristics such as age, stage, gender, and FGG. Research by Nishiyama N et al. suggested that the amplification of 7p11.2 is tightly related to the development of aggressive non-papillary BLCA. Consistent with our conclusions, BLCA patients with AIGS may behave more aggressively and display a worse prognosis. Besides, based on the methylation profile, RTP4, a novel AIGS-related methylation driver gene, was identified, which is of great significance for further research considering its tight relationship with AIGS and the gap in this field. Last but not least, based on multiple drug databases as well as comprehensive bioinformatics algorithms, seven compounds with low estimated AUC values and high AIGS scores were finally obtained, which is important for refined treatment of high-risk BLCA patients.
Although AIGS is a promising comprehensive biomarker, some limitations should be acknowledged. Firstly, all the samples in our study were retrospective, future validation of AIGS should be conducted in prospective fresh samples. Secondly, some clinical and molecular traits on public datasets were very inadequate, which thus had concealed the potential associations between AIGS and some variables. Thirdly, the roles of most genes from AIGS in BLCA remain unknown, and further in vivo and in vitro experiments are needed to reveal their functions.

Conclusions
In conclusion, based on multiple bioinformatics and machine-learning algorithms, we developed a robust and powerful consensus artificial intelligence signature that can accurately predict the prognosis, recurrence, and immune response for BLCA. In addition, AIGS is also a promising biomarker for predicting chemotherapy response, and the identification of potential compounds demonstrates dramatic implications of precise treatment for high-risk patients. Overall, AIGS is a promising tool to optimize decision-making and surveillance protocol for individual BLCA patients.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Fig. S1. Survival and functional analysis of AIGS. Fig. S2. Immune landscape and multi-omics analysis with regard to AIGS. Table S1. Details of baseline information in 11 public datasets.