Baseline data of study cohort
In this study, transcriptome and clinical data for 554 prostate tissue samples were acquired from TCGA. After excluding samples with missing clinical data or less than 30 days of follow-up, we obtained transcriptome expression data for 386 PCa tissues and 52 adjacent tissues. In addition, the GSE70770 and GSE116918 data sets were downloaded from GEO. The former contained transcriptome data and follow-up information on biochemical recurrence from 199 PCa patients who underwent radical prostatectomy. The latter contained transcriptome data and follow-up information from 248 PCa patients who underwent radical radiotherapy.
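The exclusion step described above can be sketched as a simple filter. This is a minimal illustration only: the record layout and field names (`sample_id`, `follow_up_days`, `t_stage`) are hypothetical and do not correspond to actual TCGA column names, and a single clinical field stands in for the full set of clinical variables.

```python
from typing import Optional, TypedDict


class Sample(TypedDict):
    # Hypothetical record layout used only for this sketch.
    sample_id: str
    follow_up_days: Optional[float]  # None marks missing follow-up
    t_stage: Optional[str]           # None marks a missing clinical field


MIN_FOLLOW_UP_DAYS = 30  # threshold stated in the text


def keep_sample(sample: Sample) -> bool:
    """Keep samples with complete clinical data and at least 30 days of follow-up."""
    if sample["follow_up_days"] is None or sample["t_stage"] is None:
        return False
    return sample["follow_up_days"] >= MIN_FOLLOW_UP_DAYS


# Toy records, not real patient data.
samples = [
    {"sample_id": "S1", "follow_up_days": 400.0, "t_stage": "T2"},
    {"sample_id": "S2", "follow_up_days": 12.0, "t_stage": "T3"},   # follow-up too short
    {"sample_id": "S3", "follow_up_days": None, "t_stage": "T3"},   # missing follow-up
    {"sample_id": "S4", "follow_up_days": 30.0, "t_stage": "T3"},   # boundary case, kept
    {"sample_id": "S5", "follow_up_days": 200.0, "t_stage": None},  # missing clinical field
]

kept = [s for s in samples if keep_sample(s)]
```

In this toy input only `S1` and `S4` pass the filter; the sketch treats exactly 30 days of follow-up as sufficient, in line with "less than 30 days" being the exclusion criterion.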
CAFs infiltration was negatively correlated with PCa prognosis
To explore the correlation between the abundance of CAFs and the prognosis of PCa, we applied the "MCPcounter" and "EPIC" packages to evaluate the infiltration level of CAFs in PCa tissues in three data sets: TCGA, GSE70770, and GSE116918. Survival curve analysis indicated that, in both the TCGA (Fig. 1A, P < 0.001) and GSE116918 data sets (Fig. 1E, P < 0.001), the CAF_MCPcounter high-infiltration group had a worse prognosis than the CAF_MCPcounter low-infiltration group. In the TCGA (Fig. 1B, P < 0.001), GSE70770 (Fig. 1D, P = 0.001), and GSE116918 data sets (Fig. 1F, P < 0.001), the CAF_EPIC high-infiltration group also had a worse prognosis than the CAF_EPIC low-infiltration group. Taken together, these results indicate that a higher infiltration level of CAFs in PCa tissues is associated with a worse prognosis for PCa patients.
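For readers unfamiliar with survival curves, the following is a minimal sketch of a product-limit (Kaplan-Meier) estimator. It assumes that the curves compared above are Kaplan-Meier estimates, which the text does not state explicitly; the function name and the toy data are illustrative only.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) pairs at each distinct event time.

    events[i] is 1 if the event (e.g. recurrence) was observed at times[i]
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_removed
    return curve


# Toy data: four patients, one censored at time 2.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

Running the estimator separately on the high- and low-infiltration groups would give the two curves that are then compared, typically with a log-rank test.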
Single-cell analysis extracted CAFsRGs
We acquired the GSE176031 dataset from GEO. This dataset contained scRNA-seq data for PCa tissues and adjacent normal tissues from four patients. A total of 16,107 cells were acquired, with 8,069 cells from PCa tissues and 8,038 cells from cancer-adjacent tissues. We then identified cell groups by their expression patterns. A UMAP plot of the scRNA-seq data showed seven main cell types (Fig. 1G). Fibroblasts exhibited unique gene expression patterns (Fig. 1H), with 323 DEGs between the seven cell types (Table S1). Differential gene expression analysis was carried out between PCa tissues and normal tissues, and the fibroblast-related genes were defined as PCa CAFsRGs (Fig. 1I, Table S2). We used univariate Cox analysis to identify CAFsRGs associated with prognosis (Fig. 1J, P < 0.05). Seven CAFsRGs (PTGS2, FKBP10, ENG, CDH11, COL5A1, COL5A2, and SRD5A2) were further screened out through LASSO regression analysis (Fig. 1K, L).
Characteristics of the seven CAFsRGs with prognostic significance
UMAP plots showed the mRNA expression of the 7 CAFsRGs with prognostic significance in the seven main PCa cell types (Fig. 2A-G). A bubble plot visualized the expression characteristics of the 7 CAFsRGs in the scRNA-seq data (Fig. 2H). The expression of these 7 CAFsRGs differed significantly between CAFs and other cell types. Survival curves (Fig. 3A-G) showed that BRFS differed significantly between the high- and low-expression groups for PTGS2 (P = 0.028), FKBP10 (P = 0.021), ENG (P = 0.0001), CDH11 (P = 0.003), COL5A1 (P = 0.004), COL5A2 (P = 0.0001), and SRD5A2 (P = 0.0013). Violin plots indicated the differences in expression levels of the 7 CAFsRGs with prognostic significance among clinical subgroups (PR/CR and PD/SD (Fig. 3H), Age < 60 and Age ≥ 60 (Fig. 3I), T1-2 and T3-4 (Fig. 3J), N0 and N1 (Fig. 3K), Gleason ≤ 7 and Gleason ≥ 8 (Fig. 3L)). These results suggest that the CAFsRGs are related to therapeutic efficacy, tumor grade, and tumor malignancy in PCa patients.
Construction of prediction model based on CAFsRGs
Based on the CAFsRGs with prognostic significance, the risk coefficient of each gene was calculated by multivariate Cox analysis (Table S3), and CAFsRS was calculated for all patients. Based on the median CAFsRS, we categorized PCa patients into two groups: high-risk and low-risk. The survival curves showed a worse prognosis in the high-risk group than in the low-risk group in the training set (Fig. 4A, P < 0.001), test set (Fig. 4B, P = 0.036), GSE70770 (Fig. 4C, P = 0.004), and GSE116918 (Fig. 4D, P < 0.001) cohorts. The scatter diagrams of survival state indicated more patients with biochemical recurrence in the high-risk group than in the low-risk group across the four cohorts (Fig. 4E-H). The ROC analysis of the training set confirmed the predictive efficacy of CAFsRS (1-year BRFS AUC = 0.732, 3-year BRFS AUC = 0.773, 5-year BRFS AUC = 0.775, Fig. 4I). The results for the test set, GSE70770, and GSE116918 cohorts also validated the prognostic value of CAFsRS (Fig. 4J-L). These results demonstrate that CAFsRS serves as a highly reliable prognostic indicator.
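A risk score of this kind is conventionally a coefficient-weighted sum of gene expression values, followed by a split at the cohort median. The sketch below illustrates that calculation under stated assumptions: the gene names `GENE_A` and `GENE_B`, their coefficients, and the expression values are placeholders (the fitted coefficients are in Table S3 and are not reproduced here), and assigning scores strictly above the median to the high-risk group is our assumption, since the text does not say how ties at the median are handled.

```python
from statistics import median
from typing import Dict


def risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Weighted sum of expression values: sum over genes of coefficient * expression."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each patient 'high' if the score is above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Placeholder coefficients and expression values, for illustration only.
toy_coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
toy_expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 2.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 8.0, "GENE_B": 4.0},
}

toy_scores = {pid: risk_score(expr, toy_coefficients) for pid, expr in toy_expression.items()}
toy_groups = split_by_median(toy_scores)
```

With these placeholder inputs the scores are 0.0, 2.0, 1.0, and 3.0, the median is 1.5, and patients `P2` and `P4` fall into the high-risk group. A negative coefficient, as for `GENE_B` here, would correspond to a gene whose higher expression lowers the score.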
Table 1
Baseline data for four sets of patients.
Characteristic | | Training set (n = 257) | Testing set (n = 129) | GSE70770 (n = 199) | GSE116918 (n = 248) |
Age, n (%) | < 60 | 92 (35.8%) | 56 (43.4%) | 48 (24.1%) | 29 (11.7%) |
| ≥ 60 | 165 (64.2%) | 73 (56.6%) | 63 (31.7%) | 219 (88.3%) |
| unknown | 0 (0%) | 0 (0%) | 88 (44.2%) | 0 (0%) |
T, n (%) | T1 | 0 (0%) | 0 (0%) | 0 (0%) | 51 (20.6%) |
| T2 | 93 (36.2%) | 42 (32.6%) | 80 (40.2%) | 76 (30.6%) |
| T3 | 158 (61.5%) | 85 (65.9%) | 118 (59.3%) | 92 (37.1%) |
| T4 | 6 (2.3%) | 2 (1.5%) | 1 (0.5%) | 4 (1.6%) |
| Tx | 0 (0%) | 0 (0%) | 0 (0%) | 25 (10.1%) |
N, n (%) | N0 | 206 (80.2%) | 105 (81.4%) | 98 (49.3%) | - |
| N1 | 51 (19.8%) | 24 (18.6%) | 8 (4.0%) | - |
| Nx | 0 (0%) | 0 (0%) | 93 (46.7%) | - |
PSA, n (%) | < 10 | - | - | - | 50 (20.2%) |
| ≥ 10 | - | - | - | 198 (79.8%) |
Gleason, n (%) | 5 | 0 (0%) | 0 (0%) | 2 (1.1%) | 0 (0%) |
| 6 | 12 (4.7%) | 6 (4.7%) | 34 (17.1%) | 42 (16.9%) |
| 7 | 135 (52.5%) | 62 (48.1%) | 140 (70.3%) | 99 (39.9%) |
| 8 | 33 (12.8%) | 16 (12.4%) | 13 (6.5%) | 52 (21.0%) |
| 9 | 75 (29.2%) | 44 (34.1%) | 9 (4.5%) | 54 (21.8%) |
| 10 | 2 (0.8%) | 1 (0.7%) | 1 (0.5%) | 1 (0.4%) |
PSA, prostate-specific antigen.
The results of a subgroup analysis of the performance of CAFsRS
To further evaluate the performance of CAFsRS, we conducted a subgroup analysis based on the TCGA and GSE116918 cohorts. In the TCGA cohort, survival curves indicated a significant difference in BRFS between the two CAFsRS risk groups in the subgroups Age < 60 (Fig. 5A, P = 0.034), Age ≥ 60 (Fig. 5B, P < 0.001), Gleason ≥ 8 (Fig. 5D, P = 0.005), T3-4 (Fig. 5F, P < 0.001), and N0 (Fig. 5G, P < 0.001), whereas the difference did not reach significance in the subgroups Gleason ≤ 7 (Fig. 5C, P = 0.165), T1-2 (Fig. 5E, P = 0.701), and N1 (Fig. 5H, P = 0.168). In the GSE116918 cohort, BRFS differed significantly between the two risk groups in the subgroups Age ≥ 60 (Fig. 5J, P < 0.001), Gleason ≤ 7 (Fig. 5K, P = 0.002), Gleason ≥ 8 (Fig. 5L, P = 0.040), T3-4 (Fig. 5N, P = 0.001), PSA < 10 (Fig. 5O, P = 0.012), and PSA ≥ 10 (Fig. 5P, P = 0.002), but not in the subgroups Age < 60 (Fig. 5I, P = 0.270) and T1-2 (Fig. 5M, P = 0.082).
CAFsRS was significantly correlated with CAFs infiltration level
The “MCPcounter”, “TIDE”, “xCell” and “EPIC” packages were used to evaluate the level of CAFs in the PCa tissues in the training set, test set, GSE70770, and GSE116918 cohorts. Violin plots were used to compare the levels of CAF_EPIC, CAF_MCPcounter, CAF_TIDE, and CAF_xCell between the high-risk and low-risk groups of CAFsRS. In the training set, the levels of CAF_EPIC (Fig. 6A, P < 2.22e-16), CAF_MCPcounter (Fig. 6B, P < 2.22e-16), CAF_TIDE (Fig. 6C, P = 6.8e-10) and CAF_xCell (Fig. 6D, P = 0.00013) were higher in the high-CAFsRS group. In the test set, the levels of CAF_EPIC (Fig. 6E, P = 3.2e-13), CAF_MCPcounter (Fig. 6F, P = 1.6e-07), CAF_TIDE (Fig. 6G, P = 0.00031) and CAF_xCell (Fig. 6H, P = 0.01) were higher in the high-CAFsRS group. In the GSE70770 cohort, the levels of CAF_EPIC (Fig. 6I, P < 2.22e-16), CAF_MCPcounter (Fig. 6J, P = 9e-13), CAF_TIDE (Fig. 6K, P = 7.9e-07) and CAF_xCell (Fig. 6L, P = 0.04) were higher in the high-CAFsRS group. In the GSE116918 cohort, the levels of CAF_EPIC (Fig. 6M, P < 2.22e-16), CAF_MCPcounter (Fig. 6N, P = 2.9e-07), and CAF_TIDE (Fig. 6O, P = 0.00035) were higher in the high-CAFsRS group, whereas the level of CAF_xCell (Fig. 6P, P = 0.57) did not differ significantly between the two risk groups.
Enrichment analysis
The enrichment analyses revealed several significant findings. CAFsRS was related to the transforming growth factor beta receptor signaling pathway, cytokine binding, PCa, apoptosis, and the transcription regulator complex (Fig. 7A, B; Table S4). The GO and KEGG terms identified in the analysis indicate that CAFsRS is involved in oncogenic pathways, tumor mutations, and immune functions. GSEA revealed that nucleotide excision repair, respiratory electron transport, mitochondrial protein import, ribosome, and the mRNA splicing minor pathway were activated in the low-CAFsRS cohort (Fig. 7C). In contrast, the chemokine signaling pathway, TGFβ signaling pathway, ECM receptor interaction, cytokine-cytokine receptor interaction, and FOXM1 pathway were activated in the high-CAFsRS cohort (Fig. 7D). In summary, CAFsRS was involved in multiple biological functions and pathways in PCa.
Construction of nomogram
To identify independent prognostic factors for PCa, age, T stage, N stage, Gleason score, and CAFsRS from the training set were entered into univariate Cox (Fig. 8A) and multivariate Cox (Fig. 8B) analyses. The forest plot indicated that T stage and CAFsRS had an independent impact on the BRFS of PCa. Based on these two features, we used the “survival” and “rms” packages to draw a nomogram (Fig. 8C) predicting the BRFS rate at 1, 3, and 5 years. The calibration curves indicated that the nomogram accurately predicted 1-, 3-, and 5-year BRFS in the training set (Fig. 8D), test set (Fig. 8E), GSE70770 (Fig. 8F), and GSE116918 (Fig. 8G) cohorts. The ROC analysis of the training set confirmed the predictive efficacy of the nomogram (1-year BRFS AUC = 0.746, 3-year BRFS AUC = 0.803, 5-year BRFS AUC = 0.758, Fig. 8H). The results for the test set, GSE70770, and GSE116918 cohorts also validated the performance of the nomogram (Fig. 8I-K). These results showed that our nomogram was a highly reliable prognostic model. Finally, we developed an online prognostic prediction app (https://sysu-symh-cafsnomogram.streamlit.app/) to facilitate the practical application of the nomogram (Figure S1).
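A nomogram built on a Cox model is a graphical form of the relation S(t | x) = S0(t)^exp(lp), where lp is the linear predictor (the coefficient-weighted sum of the covariates) and S0(t) is the baseline survival at time t. The sketch below shows that calculation only; the covariate names, coefficients, and baseline survival value are hypothetical placeholders and are not the fitted values behind Fig. 8C or the online app.

```python
import math
from typing import Dict


def predicted_brfs(
    baseline_survival: float,
    coefficients: Dict[str, float],
    covariates: Dict[str, float],
) -> float:
    """Cox-model survival probability: S0(t) raised to exp(linear predictor)."""
    linear_predictor = sum(coef * covariates[name] for name, coef in coefficients.items())
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical values chosen so the arithmetic is easy to follow.
toy_coefficients = {"t_stage_t3_4": math.log(2.0), "cafs_rs": 0.0}
toy_baseline_3yr = 0.9  # assumed baseline 3-year BRFS at reference covariate values

reference_patient = {"t_stage_t3_4": 0.0, "cafs_rs": 1.2}
higher_risk_patient = {"t_stage_t3_4": 1.0, "cafs_rs": 1.2}

p_reference = predicted_brfs(toy_baseline_3yr, toy_coefficients, reference_patient)
p_higher_risk = predicted_brfs(toy_baseline_3yr, toy_coefficients, higher_risk_patient)
```

In this toy setting a hazard ratio of 2 for the T-stage indicator squares the baseline survival, so the predicted 3-year BRFS drops from 0.9 to about 0.81. Evaluating the same function with baseline values for 1, 3, and 5 years would give the three probabilities a nomogram reports.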