Identification of Pathway-Based Biomarkers with Crosstalk Analysis for Overall Survival Risk Prediction in Breast Cancer

Recently, many studies have investigated the role of gene-signature on the prognostic assessment of breast cancer (BC), however, the tumor heterogeneity and sequencing noise have limited the clinical usage of these models. Pathway-based approaches are more stable to the perturbation of certain gene expression. In this study, we constructed a prognostic classifier based on survival-related pathway crosstalk analysis. We estimated pathway’s deregulation scores (PDSs) for samples collected from public databases to select survival-related pathways. After pathway crosstalk analysis, we conducted K-means clustering analysis to cluster the patients into G1 and G2 subgroups. The survival outcome of the G2 subgroup was significantly worse than the G1 subgroup. Internal and external dataset exhibits high consistency with the training dataset. Significant differences were found between G2 and G1 subgroups on pathway activity, gene mutation, immune cell infiltration levels, and in particular immune cells/pathway’s activities were significantly negatively associated with BC patient’s outcomes. In conclusion, we established a novel classifier reflecting the overall survival risk of BC and successfully validated its clinical usage on multiple BC datasets, which could offer clinicians inspiration in formulating the clinical treatment plan.


INTRODUCTION
As a highly metastatic and invasive malignant tumor with high incidence, breast cancer (BC) seriously threatens women's health and quality of life (Veronesi et al., 2005;Siegel et al., 2019;Rüschoff et al., 2020). BC occupied a quarter of all malignant tumors, which has received numerous clinical attention worldwide (Ferlay et al., 2015). At present, the primary treatment options for BC are chemotherapy, surgery, and radiotherapy (Shi et al., 2019). However, BCs tend to exhibit drug resistance and high recurrence rates on account of heterogeneity, making the therapeutic effects and prognosis of the disease unsatisfactory (Natarajan et al., 2012). Screening biomarkers for BC has a significant effect on reducing mortality, early diagnosis, and the improvement of prognosis in BC. et al. (2019) identified clinically relevant immune clusters by integrating 15 BC cohorts, and discovered that patients with protumorigenic immune infiltration were associated with poor prognosis. PAM50 (Parker et al., 2009) gene signature is a well-known molecular subtyping signature for BC, which could classify the BC into five molecular intrinsic subtypes: Normal-like, Basal-like, HER2+, Luminal A, and Luminal B. These efforts have helped us gain a deeper understanding of BC. Nevertheless, studies have found that due to the tumor heterogeneity and sequencing noise, gene-based signatures were highly unstable and the identified biomarkers were dramatically affected by the selection of training datasets (Michiels et al., 2005;Domany, 2014). In recent years many researchers indicated pathways could be helpful to extract more stable and interpretable features for risk prediction. Efforts have been made to decode cancer at levels of predefined pathways available in biological databases, such as Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al., 2016), Reactome (Fabregat et al., 2018), and Gene-Set Enrichment Analysis (GSEA) (He et al., 2018). However, most existing pathways are general rather than disease-specific, and disease progression can only be partially affected by them. For pathway pairs with many common genes, we call it crosstalk. Taking the impact of overlapping genes on the pathway activity score (PAS) quantification of the two pathways into consideration can help identify disease-related features. Although it is intuitively believed that pathways will influence each other, especially when genes are shared, the existence of this phenomenon has not been studied in PAS estimation. And few studies have explored the PAS in cancer with crosstalk accommodated among well-established pathways to identify cancer-specific sub-pathways that could be used to predict the prognosis of cancer patients. Therefore, subtyping patients based on PAS and pathway crosstalk analysis is essential to promote personalized medicine.
In this study, we constructed a novel classifier reflecting the overall survival risk of BC based on survival-related pathway crosstalk analysis. We calculated the PAS for each pathway obtained from KEGG and GO resources based on the expression matrix. And then investigated the influence of crosstalk between these selected pathways on different cohorts to select the most critical 100 sub-pathways among all cohorts. we further conducted a K-means clustering analysis to cluster the patients into G1 (moderate) and G2 (aggressive) subgroups. Internal and external dataset exhibits high consistency with the training dataset.

Pathway Activity Score
The pathway activity score (PAS) for each dataset was calculated based on the method proposed by Bhandari et al. (Bhandari et al., 2019). We downloaded all pathways from the gene ontology (GO) database (http://geneontology.org/) and generated a new mRNA expression matrix that contains only genes that exist in it for each pathway. After that, for each gene, based on its expression level, we classified the tumors into two subgroups, the samples in the higher group were scored +1, while the others were scored −1. Finally, we averaged all gene scores in this pathway as the pathway activity score for each tumor sample. A higher PAS indicates a higher pathway activity in the sample, and otherwise, a lower score means lower activity in the sample.

Selection of Survival-Related Sub-Pathways
Based on PASs and survival information, we calculated the logrank p-value for each pathway by regression analysis. The pathways were then ranked based on the log-rank p-value. To minimize the false positive rate, we used the common significant pathways of these three large breast cancer cohorts, instead of using any single data set. The combined rank of each pathway was determined by the sure independence screening (SIS) method. We further selected the top n pathways for pathway crosstalk analysis. The threshold n was set to 100, which is much bigger than N/log(N), where N is the sample size of each cohort.
Different pathways often share some of the same genes, which can lead to crosstalk in the prognostic associations of different pathways. Considering the influence of overlapping genes on the PAS quantification of the two pathways can help identify cancerrelated features. We further identified the crosstalk among the 100 selected survival-related pathways to define sub-pathways related to survival. The crosstalk between two pathways with at least three genes in common could be classified into three types. The overlapped genes between pathway A and pathway B could be defined as P A ∩P B , while the unique genes that specifically exist in pathway A or pathway B were defined as P A -(P A ∩P B ) and P B -(P A ∩P B ). Based on this classifier, each pathway pair could generate three sub-pathways.
To make sure each sub-pathway contains enough genes for further analysis, we obtained sub-pathways that consist of at least three genes. The Cox-PH model was used to calculate the survival risk p-value based on the recalculated PAS for each sub-pathway. After Bonferroni correction, we identified critical survival-related pathways and sub-pathways (FDR p-value < 0.01) for each dataset, and the overlapped pathways were finally adopted for further modeling.

Model Construction and Evaluation
With the pathways generated above, we constructed a pathway activity matrix with the row names are sub-pathways and the column names are sample IDs for each dataset. We performed consensus clustering with the pathway features acquired above to classify the TCGA samples and obtained the best cluster number as 2 based on three metrics, including C-index, Brier score, and log-rank p-value to redefine the samples as G1 and G2 subgroups.
To predict these two subgroups for other datasets, we used several machine learning methods, including SVM, Adaboost, and Gaussian, to build a prediction model and obtained the best performance based on the pathway activity matrix. The robustness of the model was evaluated in the TCGA testing dataset and several external individual GEO datasets. We further built a classification model using several machine learning methods, including SVM, Adaboost, and Gaussian, based on these labels. We used the grid search to slightly turn the hyperparameters of the classifier. In the cross-validation procedure, TCGA samples were divided into training and testing datasets in a 4:1 ratio, and the training dataset was used to perform 10-fold cross-validation. To predict the GEO dataset, we used all TCGA samples to build the classification model. We then compared three metrics, including C-index, Brier score, and log-rank p-value, to evaluate the model's performance. These metrics can quantify the proportion of patient pairs in a cohort whose risk prediction is highly consistent with survival outcomes. Usually, a higher C-index indicates more precise prediction performance, and 1 means perfect prediction, while 0.5 means the prediction performance is similar to random prediction. To calculate the C-index, we built a Cox-PH model based on the clustering labels and the patient's survival information in the TCGA training dataset and predicted the survival rate in the testing dataset, which was calculated by the R "survcomp" package. Brier score reflects the mean difference between observed and predicted survival after a certain period in survival analysis, and a lower score means good performance. Log-rank p-value was calculated by the R "survival" package to show the survival difference between the two groups, and a lower score means a more significant survival difference.

Survival Analysis
The log-rank test compares the survival difference of two groups at each observed event time was performed by R "survival" package. Kaplan-Meier analysis was applied to obtain a survival-curve plot of BRCA subtypes. Multivariate Cox regression analysis determined the independent role of this newly established predictor. Besides, we adopted Fisher's exact test to compare the census gene mutation differences between G2 and G1 subgroups in the TCGA cohort. We also compared the distributions of G2 and G1 in different clinicopathological features, such as tumor stage, new tumor event, and sex, by using Fisher exact tests.
We used the "DESeq2" R package (Love et al., 2014) to real the differential expressed genes between G2 and G1 subgroup; the significant DEGs were identified as |LogFoldChange| > 1 and false discovery rate (FDR) < 0.05.

Gene Set Enrichment Analysis
GSEA analysis was used to compare the pathway activity difference between G2 and G1, in which the R "ClusterProfier" package was performed. We adopted Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways to perform GSEA analysis and the top 20 significant pathways were displayed.

Mutation Analysis
The R "maftools" package was utilized to analyze and visualize the mutation data. The mutation data were compared between one group and the other groups using the chi-square test. A p-value of less than 0.05 was considered significant.

Identification of Overall Survival Risk Subtypes in BC
We obtained seven BC datasets (TCGA, GSE1456A, GSE1456B, GSE7390, GSE16446, GSE20685, GSE20711, and GSE42568) gene expression profiles and available clinical survival information from the TCGA and GEO databases. After calculating the PAS for each pathway obtained from KEGG and GO resources and selecting the survival-related pathways, we investigated the influence of crosstalk between these selected pathways on different cohorts, and then the most critical 100 subpathways among all cohorts were identified (Supplementary Table S1). K-means clustering analysis was used to divide the TCGA patients into two subgroups, defined as group 1 (G1, moderate) and group 2 (G2, aggressive) (Supplementary Figure  S1A, and Supplementary Table S2). Notably, patients from the G2 subgroup show significantly worse clinical outcomes (overall survival and relapse-free survival) compared to the G1 subgroup (Supplementary Figures S1B,C; p 0.0053 and p 0.0031, respectively; log-rank test). We further built a classifier based on the TCGA training dataset with the label defined by k-means clustering analysis (Materials and Methods).

Evaluation of the Performance
To evaluate the robustness of OS risk prediction of the model, we tested the model performance on the TCGA testing dataset and several external GEO datasets, including GSE1456A, GSE7390, GSE16446, GSE20685, GSE20711, and GSE42568. As shown in Figure 1 and Table 1, the model was stable and exhibited excellent classification capability, indicated by C-index and log-rank p-values between G2 and G1. The TCGA test cohort generated a high C-index (0.661), low Brier score (0.179), and significant average log-rank p-value (p 0.00123) on survival difference. In different datasets, our classifier can significantly divide the samples into a good prognostic group and a poor prognostic group, which suggested that the newly developed classifier is able to universally predict the overall survival outcomes for patients with BC.
In order to compare the risk prediction capabilities of our predictor with some other clinical information, we performed univariate cox regression analysis for each clinical information Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 689715 (including age, tumor stage, and PAM50 subtyping) in the TCGA dataset as well as the external validation datasets. As shown in Figure 2, our classifier has a more general prognostic ability than other clinical information (p < 0.05 in all datasets). We further introduced several published transcriptomic-based predictors as previous study (Lee et al., 2021), including the proliferation index FIGURE 1 | Kaplan-Meier analysis to determine the survival differences between group 2 (G2) and group 1 (G1).  (Whitfield et al., 2006), interferon-γ (IFNγ) signature score (Ayers et al., 2017) as well as cytolytic activity score (Rooney et al., 2015), and performed a multivariate Cox regression analysis with age, tumor stage, and our classifier (Supplementary Figure  S2). In this analysis, the proliferation index and the IFNγ signature score were estimated as ssGSEA score (Yi et al., 2020a) of each gene signature, respectively, and the cytolytic activity score was calculated as the mean expression level of GZMA and PRF1 (Rooney et al., 2015). As shown in Supplementary Figure S2, the proliferation index and IFNy signature score show higher predictive power (hizard ratios were 2.13 and 0.27, respectively), but also have larger confidence intervals and p-values, which suggesting that they cannot be used as independent prognostic factors of BRCA. Reassuringly, our classifier had a more stable hizard ratioa near statistically significant p-value. In addition, we also test the risk prediction performance in different subgroups of age and tumor stage (Supplementary Figure S3). This result suggests FIGURE 3 | Census mutation landscape between group 2 (G2) and group 1 (G1). Only cancer census genes in the COSMIC database are shown in the plot. The significance of the difference in gene mutation frequency between the two groups is shown in the barplot on the right (fisher's exact test).
Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 689715 5 that our classifier can be used essentially for the typing of different clinical subgroups, although in the low-age group and low-level group of TCGA the p-values did not reach significance.

Association Between Different Survival Subtypes and Genomic Feature
We found that the mutation rates of PI3KCA and CDH1 were significantly higher in the G1 group than G2 group (PI3KCA: OR 0.655, 95%CI: 0.471-0.907, p 0.00951; CDH1: OR 0.389, 95%CI: 0.220-0.660, p 0.000195, Figure 3, and Supplementary Table S3, Fisher's exact test). Other significant differentially mutated genes between the two groups, including ATR, ALK, TBX3, AKAP9, TPR, KDM6A, CREBBP, AMER1, CRNKL1, TRIM24, ZNF429, AFF3, IGF2BP2, and LIFR (Supplementary Table S3, all p < 0.05). No significant tumor mutation burden (TMB) level difference was found between G1 and G2 subgroups (Supplementary Figure S4). PI3KCA and CDH1 are two frequently mutated genes in many cancers, including breast cancer, gastric cancer, colorectal carcinoma, and head and neck squamous cell carcinoma (Hansford et al., 2015;Millis et al., 2016;Zhang et al., 2017;An et al., 2018). However, the association of PIK3CA mutation and prognosis has not been clarified clearly, PI3KCA mutation can be associated with a better prognosis (Barbareschi et al., 2007;Maruyama et al., 2007;Pérez-Tenorio et al., 2007;Kalinsky et al., 2009) or a worse prognosis (Li et al., 2006;Lerma et al., 2008). In some studies, PIK3CA mutation even has no obvious relationship with the prognosis (Saal et al., 2005;Lai et al., 2008;Stemke-Hale et al., 2008;Michelucci et al., 2009;Loi et al., 2010;Boyault et al., 2012). A similar phenomenon was found for CDH1 mutation as well (Corso et al., 2018). We then performed differential expression analysis between G2 and G1 subgroups, and identified 290 upregulated and 824 downregulated genes (|log 2 fold change| > 1 and FDR p-value > 0.05) (Figures 4A,B, and Supplementary Table S4) based on the TCGA cohort. KEGG pathway analysis indicated that these upregulated genes were mostly enriched in neuroactive ligand−receptor interaction, cholinergic synapse and estrogen signaling pathways ( Figure 4C). The downregulated genes were mostly enriched in immune-related pathways, such as cytokine−cytokine receptor interaction, hematopoietic cell lineage, graft−versus−host disease, and Th17 cell differentiation ( Figure 4D). These results prompted that the G1 subgroup might be immune activated subtype, which could be associated with its better overall survival. We also tested the correlations between the two survival subtypes (G2 and G1) and clinicopathological characteristics from the TCGA cohort and found that no significant differences were revealed in age, sex, tumor stage, metastasis coded, estrogen receptor status, progesterone receptor status, and histological type subgroups, instead of PAM50 subtype (Supplementary Figure S5, Supplementary Table S5).
We then performed GSEA analysis to compare the G2 and G1 subgroups, aiming to identify critical pathways that displayed different activities between the G1 and G2 subgroups (Supplementary Figure S6, Supplementary Tables S6-S8).
Hallmark pathway enrichment analysis showed that immunerelated pathways including the inflammatory response, allograft rejection, interferon-gamma response and TNFA-signaling via NFκB were enriched in the G1 subgroup, while metabolic-related pathways such as oxidative phosphorylation signaling were activated in the G2 subgroup (Supplementary Table S6). Pathway enrichment analysis indicated that the differences between these two groups were concentrated in the KEGG pathways of "Graft vs. host disease", "primary immunodeficiency", and "allograft rejection" (Supplementary Table S7) and Reactome pathways related to co-stimulation by the CD28 family, generation of second messenger molecules, and cytokine signaling in the immune system (Supplementary Table S8). Previous studies have proved that metabolic pathway activities like oxidative phosphorylation signaling were negatively correlated with immune infiltration Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 689715 7 and contributed to a worse prognosis in TNBC (Gong et al., 2021), which is consistent with our results.

Comparison of Tumor Microenvironment Between G2 and G1
We further employed the CIBERSORT algorithm to investigate the distributions of infiltrated immune cells between the G2 and G1 subgroups (Supplementary Figure S7). The result revealed that significant differences were obtained between two groups in CD8+T cells, CD4+T memory cells (resting), CD4+T memory (activated), γδ T cells, Macrophages M0, Macrophages M1, Macrophages M2, Dendritic cells (resting), and Mast cells (resting) ( Figure 5A). Among macrophages, Macrophages M1 accounts for a higher proportion of the G1 subgroup, while the G2 subgroup consists of a higher proportion of Macrophages M2. Macrophages M2 was found to be dominant in BC and associated with poor clinical outcomes of BC (Bao et al., 2021), which could be the reason that the G2 subgroup patients have a worse overall survival.
Considering that the tumor tissue has tumor cells, stromal cells and immune cells, we measured stromal score and immune score based on specific gene expression signature to represent the level of immune infiltration and stroma infiltration of each tumor following the previous reported method-ESTIMATE (Yoshihara et al., 2013). Also, an ESTIMATE score also calculated which reflects the overall level of both immune infiltration and stromal infiltration. As shown in Figures 5B-D, G1 presented a higher stromal score, immune score and ESTIMATE score compared with G2. These results consistent with the previous definition that the G1 subgroup might be immune activated subtype, there was abundant crosstalk in the tumor microenvironment of this type of tumor, which could benefit from immunotherapy.

DISCUSSION
In the era of personalized medicine, there is an urgent need for a molecular marker-based approach to predict the prognostic outcomes of cancer patients accurately. Previous studies have reported many gene-based signatures to subtype BC (van de Vijver et al., 2002;Pu et al., 2020) (Tekpli et al., 2019) (Parker et al., 2009). Here, we constructed an overall survival risk model to classify samples into two subgroups. Internal and external datasets validation exhibits high consistency with the training dataset. Significant differences were found between the G2 and G1 subgroups including pathway activity, gene mutation, immune cell infiltration levels. In particular, immune cells/pathway's activities were significantly negatively associated with BC patient's outcomes.
In order to test whether our classifier is applicable to all ages and tumor grades, we performed prognostic association analysis for different clinical subgroups. For a data set with sufficient samples (more than 20 samples for each subgroup), our classifier can basically distinguish patients with different overall survival periods, although the high-age group and the low-stage group of TCGA have not reached statistical significance. Although the p-value of the high-age group of TCGA does not reach statistical significance (0.076), a clear trend can still be seen. However, it is challenging to explain why our classifier is unable to distinguish OS in the low-stage samples of TCGA with prognostic significance, though it performed well in the other two verification sets (Supplementary Figures S3M,O).
We found a significant mutation rate difference of PI3KCA and CDH1 gene between the G1 and G2 subgroups. It is not yet clear whether PIK3CA mutation is associated with clinical outcome, PI3KCA mutation can be associated with a better prognosis (Barbareschi et al., 2007;Maruyama et al., 2007;Pérez-Tenorio et al., 2007;Kalinsky et al., 2009) or a worse prognosis (Li et al., 2006;Lerma et al., 2008). In some studies, PIK3CA mutation even has no obvious relationship with the prognosis (Saal et al., 2005;Lai et al., 2008;Stemke-Hale et al., 2008;Michelucci et al., 2009;Loi et al., 2010;Boyault et al., 2012). A similar phenomenon was found for CDH1 mutation as well (Corso et al., 2018). Our results suggested that mutations of PI3KCA are positively associated with a favorable prognosis, but future studies are needed to investigate the potential mechanisms.
We also found that the G1 subgroup displayed significant higher level of immune infiltration, stromal infiltration level than the G2 subgroup. As reported, Th1 and cytotoxic types of memory T cells and CD8+ T cells can predict better prognosis in diverse cancers (Wei et al., 2018;Yi et al., 2020b;St. Paul and Ohashi, 2020). Several studies showed that the existence of mature antigen-presenting dendritic cells (DCs) could infiltrate colon cancer and theoretically increase immune response, which are correlated with improved survival as well (Schwaab et al., 2001). Other immune cells such as macrophages always produce plenty of factors influencing tumor cell's survival and growth, chemotaxis, cell invasion, angiogenesis, or repress T cell responses (Pagès et al., 2010). Therefore, a high rate of tumorassociated macrophages typically serves as a poor prognostic factor. It is valuable to predict the efficacy of specific therapies, especially immunotherapy. For example, Peng et al. recently developed a computational method named TIDE to accurately predict immunotherapy outcomes of melanoma (Jiang et al., 2018). The level of immune infiltration was significantly associated with the efficacy of immunotherapy (Galon and Bruni, 2019). The significant immunological differences between G1 and G2 suggest that our classifier may be predictive of immunotherapy efficacy. We will collect relevant data resources for more in-depth study in our future work.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: TCGA and Gene Expression Omnibus (GSE1456A, GSE16446, GSE20685, GSE20711, GSE42568, and GSE7390).

AUTHOR CONTRIBUTIONS
XL and GO designed this experiment. XL and LS performed the research and analyzed the data. XL and JL wrote the manuscript, and XL, LS, JL, and GO revised the manuscript. All these authors read and approved the final manuscript. Kaplan-Meier plot shows the OS difference between group 2 (G2) and group 1(G1). (C) Kaplan-Meier plot shows the RFS difference between group 2 (G2) and group 1(G1).
Supplementary Figure S2 | Multivariate Cox regression analysis of prognostic factors for overall survival in TCGA patients. Age, tumor stage, proliferation score, IFNg score, cytolytic activity as well as our classifier were taken into consideration in this analysis.
Supplementary Figure S3 | KM curves show the prognostic performance of the classifier in different age level and tumor stage level of all included cohorts. Only cohorts with available clinical information are shown in the plot.
Supplementary Figure S4 | Comparison of tumor mutation burden between the G1 and G2 subgroups.
Supplementary Figure S5 | Sankey plot shows the distribution of group 2 (G2) and group 1 (G1) in different clinicopathological and genetic subgroups.
Supplementary Figure S6 | Gene Set Enrichment Analysis (GSEA) analysis shows the differences between group 2 (G2) and group 1 (G1) at Hallmark, (Kyoto Encyclopedia of Genes and Genomes) KEGG, and Reactome pathway aspects.
Supplementary Figure S7 | CIBERSORT analysis shows the distributions of 22 tumor infiltrated immune cells in each breast cancer sample.