Prognostic values of the core components of the mammalian circadian clock in prostate cancer

Background Prostate cancer (PC) is one of the most common malignancies in males. Extensive and complex connections between circadian rhythm and cancer were found. Nonetheless, in PC, the potential role of the core components of the mammalian circadian clock (CCMCCs) in prognosis prediction has not been fully clarified. Methods We firstly collected 605 patients with PC from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) databases. Survival analysis was carried out for each CCMCC. Then, we investigated the prognostic ability of CCMCCs by Cox regression analysis. Independent prognostic signatures were extracted for the establishment of the circadian clock-based risk score model. We explored the predictive performance of the risk score model in the TCGA training cohort and the independent GEO dataset. Finally, the relationships between risk score and clinicopathological parameters, biological processes, and signaling pathways were evaluated. Results The expression levels of CCMCCs were widely correlated with age, tumor status, lymph node status, disease-free survival (DFS), progression-free survival (PFS), and overall survival (OS). Nine circadian clock genes, including CSNK1D, BTRC, CLOCK, CSNK1E, FBXL3, PRKAA2, DBP, NR1D2, and RORB, were identified as vital prognostic factors in PC and were used to construct the circadian clock-based risk score model. For DFS, the area under the 3-year or 5-year receiver operating characteristic curves ranged from 0.728 to 0.821, suggesting better predictive performance. When compared with T3-4N1 stage, PC patients at T2N0 stage might be benefited more from the circadian clock-based risk score model. Furthermore, a high circadian clock-based risk score indicated shorter DFS (p < 0.0001), early progression (p < 0.0001), and higher 5-year death rate (p = 0.007) in PC. The risk score was related to tumor status (p < 0.001), lymph node status (p < 0.001), and ribosome-related biogenesis and pathways. Conclusions The vital roles of circadian clock genes in clinical outcomes were fully depicted. The circadian clock-based risk score model could reflect and predict the prognosis of patients with PC.


INTRODUCTION
Prostate cancer (PC) is the most common malignant tumor in the male urinary system (Siegel et al., 2021). An increased incidence rate of PC was found around the world (Siegel et al., 2021;Chen et al., 2015;Jemal et al., 2016). The incidence and mortality rates of PC reached up to 10 per 100,000 and four per 100,000, respectively (Siegel et al., 2021;Chen et al., 2015;Jemal et al., 2016). Individual differences of PC are obvious. For PC, the therapeutic scheme of each patient mainly depends on tumor grade, doctors' clinical judgment, and conventional risk assessment. Conventional risk assessment included several clinical factors: prostate specific antigen (PSA) level, TNM staging, and Gleason score. Prostatectomy and radiotherapy are the recommended treatments for localized PC. Even the prognosis of most clinical patients with early-stage PC was satisfactory, postoperative recurrence was unavoidable (Seikkula et al., 2018). For advanced PC patients, despite initial sensitivity to androgen deprivation therapy, the majority of patients with PC finally developed resistance to castration therapy after 18 to 24 months of clinical treatment (Seikkula et al., 2018;Small et al., 2019;Saad et al., 2021). Thus, it is of significance to identify patients with high relapse risk.
The discoveries in the circadian rhythm were considered as dramatic breakthroughs in the field of medicine. In 2017, three scientists won the Nobel Prize for their work on the circadian rhythm (Callaway & Ledford, 2017;Burki, 2017). Several studies highlighted the essential role of circadian disruption in multiple biomolecular processes of cancer, including PC and other solid tumors (Sigurdardottir et al., 2012;Wendeu-Foyet & Menegaux, 2017;Viswanathan, Hankinson & Schernhammer, 2007;Stevens et al., 2014;Papagiannakopoulos et al., 2016;Kettner et al., 2016;Innominato et al., 2009;Huisman et al., 2015). The core components of the mammalian circadian clock (CCMCCs) were defined as a group of genes that could regulate human circadian rhythm through regulating RNA expression levels and biological pathways (Takahashi, 2017;Partch, Green & Takahashi, 2014). CCMCCs are composed of a total of 22 genes. These 22 CCMCCs included seven core clock genes (CLOCK, ARNTL, PER1, PER2, PER3, CRY1) and 15 other circadian clock-related genes (CRY2, BTRC, CSNK1D, CSNK1E, CUL1, DBP, FBXL21, FBXL3, NFIL3, NR1D1, NR1D2, PRKAA1, PRKAA2, RORA, RORB, SKP1). CCMCCs predominately promoted many biochemical activities to work in rule and order, thereby maintaining homeostasis. These key circadian clock genes also affected tumorigenesis, tumor growth, metastasis, and clinical outcomes of cancer patients. In tumor-bearing mice, the expression levels of five CCMCCs, including NR1D1, PER1, PER2, ARNTL, and DBP, were downregulated in hepatic metastasis from colorectal cancer when compared with healthy tissue (Huisman et al., 2015). In lung cancer, both PER2 and ARNTL were tumor suppressor genes (Papagiannakopoulos et al., 2016). In PC, overexpression of the clock gene PER1 promoted tumor cell apoptosis (Cao et al., 2009). The complex physically interaction between PER1 and the androgen receptor was also found (Cao et al., 2009). However, there are very limited researches that investigated the vital functions of key circadian clock genes in the pathogenesis and prognosis of PC. In this study, we systematically explored the association between circadian clock and prognosis in PC. Then, we proposed the circadian clock-based risk score and constructed a circadian clock-related prognostic model. The performance of the risk score model was verified in The Cancer Genome Atlas (TCGA) dataset and the independent Gene Expression Omnibus (GEO) dataset. We found correlations between the circadian clock gene signature and several biological functions, signaling pathways, and clinicopathologic features.

Dataset acquisition from the TCGA and GEO database
We identified suitable public datasets of PC patients in the TCGA (https://tcgadata.nci.nih.gov/tcga/) and GEO (https://www.ncbi.nlm.nih.gov/geo/) Databases. We eliminated datasets without intact gene expression data and prognostic information. Both RNA sequencing (RNA-seq) data and complete clinical annotation for each PC patient were downloaded online. In total, GSE70770 with 112 PC patients (Ross-Adams et al., 2015) and TCGA-PC dataset with 493 PC patients (Liu et al., 2018) were eventually gathered in this study for further analysis. Characteristics of the TCGA cohort and the GEO cohort were summarized in Table S1. All participants gave their informed consent for publication.

Construction of prognostic signature based on clock genes
A total of 22 CCMCCs were obtained from previously published reviews (Takahashi, 2017;Partch, Green & Takahashi, 2014). We applied the COX regression analysis to assess the effects of CCMCCs on clinical prognosis. We selected potential prognosis-related clock genes to construct the circadian clock-based risk score and the prognosis prediction model. The definition of the circadian clock-based risk score was as follows: The circadian clock-based risk score = λi where i represents the expression of prognosis-related clock genes, λ is the coefficient that extracted from the COX regression analysis. The final formula of the circadian clockbased risk score was as follows: the circadian clock-based risk score = (0.8*expression value of CSNK1D) + (−1.824*expression value of BTRC) + (−1.7645*expression value of CLOCK) + (0.4555*expression value of CSNK1E) − (1.239*expression value of FBXL3) + (−1.56*expression value of PRKAA2) + (1.325*expression value of DBP) + (0.433*expression value of NR1D2) + (1.049*expression value of RORB). The training subset and the internal validation subset was from the TCGA cohort, while GSE70770 was used as an external validation dataset. The receiver operating characteristic (ROC) analysis was applied to validate the performance of the proposed model. By R package termed ''survivalROC'', areas under the ROC curve (AUC) were calculated.

Functional enrichment analysis
Differentially expressed genes (DEGs) between the high and low circadian clock-based risk score groups were recognized by the ''limma'' R package. DEGs with absolute value change of expression more than 2 and p value less than 0.05 were selected for signal pathway analysis. The Gene Ontology (GO, http://www.geneontology.org/) and the Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.kegg.jp/kegg/) enrichment analyses were utilized to reveal unique biological processes and signal pathways between the high and low circadian clock-based risk score groups.

Statistical analysis
The survival analysis of each CCMCC and the circadian clock-based risk score was conducted by the Kaplan-Meier (KM) method and the log-rank test. The optimal cutoff point was evaluated by the R package ''survminer''. The differences between the two groups were compared with the t -test. All statistical analyses were two-tailed. And p value less than 0.05 was considered as statistical significance. The visualization of all statistical results was performed by the SPSS 22.0 software and the R 4.0.1 software.

Relationship between CCMCCs and prognosis
To evaluate the association between CCMCCs expression and clinical outcomes, we performed the survival analysis. In the TCGA cohort, the sample size of prostate cancer patients who owned disease-free survival (DFS), progression-free survival (PFS), and overall survival (OS) data was 333, 493, and 493, respectively (Figs. 2-6). We found that high expression levels of ARNTL (p = 0.037), CLOCK (p = 0.006), PER2 (p = 0.0051), Considering the influence of TNM stage on prognosis, we applied survival subgroup analysis to the T2N0 cohort and the T3-4N1 cohort (Tables S2-S7). Importantly, at both T2N0 stage and T3-4N1 stage, high BTRC, CLOCK, CRY1, FBXL3, PER3, and RORA expression indicated longer DFS (both p < 0.05, Tables S2-S3). It was also worth to mention that high DBP linked to shorter PFS in T2N0 prostate cancer (p = 0.013, Table S4), while the contrary result was found in T3-4N1 stage patients (p = 0.0016, Table S5). For OS, high NR1D1 expression was only significantly related to better prognosis in T2N0 stage patients (p = 0.041, Table S6), while the negative result was found in T3-4N1 stage patients (p = 0.13, Table S7).

Construction and validation of circadian clock-based risk score
To verify our hypothesis, we separated the TCGA cohort into the test set and the internal validation set in a ratio of 7:3. All enrolled patients in the cohort underwent surgical treatment, thus we mainly investigated the application of the predictive model in DFS prediction. As shown in Figs. 7A-7B, the AUC value of the predictive model in the training cohort and the validation cohort for 3-year DFS was 0.742 and 0.821, respectively. The ROC curves and AUC values (0.728 for the test cohort; 0.753 for the validation cohort) further indicated the satisfactory predictive power of the circadian clock-based signature in DFS prediction (Figs. 7C-7D). PC patients with high circadian clock-based risk score also had lower DFS than the low risk score group (p < 0.0001, Fig. 7E). In an independent cohort with 112 PC patients who prostatectomy, we divided the GSE70770 cohort into high and low risk score groups and conducted the KM survival analysis. Results showed that the high circadian clock-based risk score was correlated with early relapse (vs low circadian clock-based risk score: 13.000 months vs 21.000 months, p = 0.06; Fig. S3). Moreover, we explored the impact of TNM stage. When compared with clinical prognostic factors, such as T (AUC values range from 0.502 to 0.808) and N stage (AUC values range from 0.500 to 0.515), the predictive performance of the proposed model was superior in DFS (Fig. S4). Similar results were also detected in PFS and OS prediction (Fig. S4). The improvement of utility when combined circadian clock-based risk score model with T and N stage was limited. Then, for better clinical application, we investigated the power of circadian clock-based risk score model in T2N0 disease and T3-4N1 disease (Fig. S5). AUC values of DFS curves indicated that the risk model performed better in patients with T2N0 stage (vs T3-4N1 stage, AUC value: 0.749-0.834 vs 0.515-0.745, Fig. S5). For PFS, the predictive values of risk score model in T2N0 disease and T3-4N1 disease was almost (Fig. S5). In T2N0 stage, high risk score was significantly related to shorter DFS (p = 0.0024) and PFS (p = 0.034), while no statistical significance was found in OS (p = 0.22, Fig. S6). In T3-4N0 stage, high circadian clock-based risk score was significantly related to shorter PFS (p = 0.016), while no statistical significance was found in DFS (p = 0.12) and OS (p = 0.24, Fig. S6). We also tentatively applied the risk score model to PFS and OS prediction (Figs. S7-S8). The 3-year and 5-year AUC values of PFS curves ranged from 0.607-0.735 (Fig. S7), while the AUC values of OS curves were higher than 0.700 (Fig. S8). The high circadian clock-based risk score indicated poor PFS (p < 0.0001), and higher 5-year death rate (p = 0.007). Correlation between clinicopathological parameters and circadian clock-based risk score The relationship between clinicopathological parameters and circadian clock-based risk scores was evaluated in the TCGA cohort. There was no significant difference in the risk score according to age (p = 0.19; Fig. S9A). However, high risk score was significantly linked to high T status (p = 0.00015) and N status (p = 0.00051; Figs. S9B-S9C). When in comparison with T2N0 stage, we found that higher circadian clock-based risk score was found in the T3-4N1 stage (the TCGA cohort, p = 4e−05; the GEO cohort, p = 3.3e−06; Figs. S9D-S9E).

Functional analysis of circadian clock-based risk score
In the TCGA cohort, on the basis of the circadian clock-based risk score, 246 PC patients were assigned to the high risk score group, while 247 PC patients were assigned to the low risk score group. There were a total of 1114 DEGs between the two groups, including 4643 upregulated and 6471 downregulated DEGs (Fig. 8A). According to the GO analysis results,

DISCUSSION
Circadian clocks and circadian clock-related genes were essential to maintaining homeostasis. Disruption of the circadian system and aberrant expression of CCMCCs induced tumorigenesis and promoted the proliferation and invasion of cancer cells (Viswanathan, Hankinson & Schernhammer, 2007;Stevens et al., 2014;Papagiannakopoulos et al., 2016;Kettner et al., 2016;Innominato et al., 2009;Huisman et al., 2015). However, CCMCCs expression signature and its function in PC have rarely been investigated. In the research, the expression profiles and functions of CCMCCs were outlined. We also Notes. HR, hazard ratio; P, P value for whole; 95% CI, 95% confidence interval. Statistically significant data were marked with bold and underline. explored the close relationship between CCMCCs and the prognosis of PC patients, thus developing a circadian clock-based risk score model. The circadian clock-based risk score might participate in some biological processes and signaling pathways.
In the present study, we revealed the close relevance between the 22 enrolled CCMCCs and prognosis. Cao et al. (2009) demonstrated that one of the core clock genes, PER1, regulated the expression of androgen receptor, which might affect drug sensitivity in PC. Additionally, in high-grade colon cancer, the relative low expression of PER1 was found (Krugluger et al., 2007). In colon cancer and cholangiocarcinoma, the overexpression of PER1 inhibited tumor progression and growth (Krugluger et al., 2007;Han et al., 2016). In PC, we found a downregulation of PER1 in the T3-4 group in comparison with the T2 group, which was consistent with its expression pattern in other cancer types. In our study, we also detected overexpression of PER1 in the T2N0 stage. Moreover, the expression level of PER1 was positively associated with OS in PC. In stomach adenocarcinoma, patients with high FBXL3 expression showed poor clinical outcome (Liu et al., 2019). However, in kidney renal clear cell carcinoma, an opposite result was found. Specifically, Liu et al. found that patients with high FBXL3 expression showed a better prognosis (Liu et al., 2019). In the 493 PC cases that enrolled in our study, high FBXL3 expression was significantly correlated with longer DFS (p = 0.012) and PFS (p = 0.0066). In breast cancer, overexpression of CSNK1D was found in the N1 group (Abba et al., 2007). Similar overexpression trend of CSNK1D was also found in PC tissues with N1 status. PC patients with short DFS (p = 0.00032), PFS (p = 0.017), and OS (p = 0.016) also showed overexpression of CSNK1D. In glioma, the expression level of BTRC was correlated with clinical outcome (Zhou et al., 2021). Prognostic effects of some CCMCCs in PC remain unclear. In PC, we found higher expression level of BTRC in the T2N0 disease in comparison with the T3-4N1 disease. On the basis of T and N stage, we divided the whole cohort and carried out the survival subgroup analysis. As Tables S2-S7 shown, there were conflicting roles of some CCMCCs in the prognosis of PC patients, such as DBP and NR1D1, suggesting different expression patterns of CCMCCs in T2N0 and T3-4N1 disease. Collectively, in the study, we fully investigated the association between 22 circadian clock-related genes and clinical survival.
As one of the vital biomarkers in PC, high PSA levels also existed in some benign diseases, such as prostatitis, prostatic hyperplasia, and after prostatic massage. Thus, the clinical value of PSA in PC diagnosis and survival prediction remained controversial (Draisma et al., 2003; Manceau et al., 2021;Andriole et al., 2012;Schröder et al., 2012). A multicenter trial reported that PSA failed to affect the prognosis of PC (Andriole et al., 2012), while another experiment found that PSA effectively reduced the death rate of PC patients (Schröder et al., 2012). In our study, the results showed that PSA level at diagnosis was not statistically related to CCMCCs expression. Gleason grade was another important biomarker in PC (Sopyllo, Erickson & Mirtti, 2021;Moris et al., 2020). Recently, a meta-analysis found that Gleason grade was positively associated with recurrence after surgery (John et al., 2021).
Gleason scoring system mainly concentrated on the pathological structure of PC, while the transcriptomic and genomic features were missed. In the study, we attempted to find the correlation between Gleason score and gene expression. Nevertheless, a weak to moderate correlation was found between Gleason score and expression levels of CCMCCs. These results might demonstrate that the mechanisms of influences of PSA level, Gleason score, and CCMCCs, on prognosis were different. Not only clinical factors, but also many genes were considered as vitally prognostic factors in PC, such as PTEN and TP53 (Rebello et al., 2021;Vitkin et al., 2019;Vidotto et al., 2020;Muñoz Fontela et al., 2016). PTEN, one of the tumor suppressor genes, was commonly mutated in PC (Rebello et al., 2021;Vitkin et al., 2019;Jamaspishvili et al., 2018). Several researches found that PTEN mutation or down-expression had extensive influences on tumor microenvironment and PI3K signaling pathways, thus leading to tumor progression and poor prognosis in PC (Vidotto et al., 2020;Jamaspishvili et al., 2018;Garcia et al., 2014;Vidotto et al., 2019;Toso et al., 2014). For instance, PTEN loss promoted T regulatory cell' proliferation and infiltration, thereby promoting immune suppression and tumor metastasis (Vidotto et al., 2019). TP53 deficiency was presented in 10% to 40% of PC (Rebello et al., 2021;Muñoz Fontela et al., 2016). TP53 loss aggravated the genomic instability and activated several pathways, thus promoting tumor growth and poor outcome (Bezzi et al., 2018;Robinson et al., 2015;Hamid et al., 2019). In the enrolled cohort, we also found that the mutation rate of PTEN and TP53 was over 10%, which was consistent with previous studies (Rebello et al., 2021;Vitkin et al., 2019;Vidotto et al., 2020;Muñoz Fontela et al., 2016;Robinson et al., 2015;Hamid et al., 2019). An increasing number of studies also highlighted the close relationship between clinical outcome and PTEN, TP53, BRCA1, BRCA2, ATM, RB1, PALB2, CHEK2, MLH1, MSH2, MSH6, and PMS2 (Rebello et al., 2021;Muñoz Fontela et al., 2016;Abida et al., 2019). However, few groups investigated the correlation between CCMCCs and the above prognostic genes. In our research, we considered the mutation status and expression levels of key genes. Through statistical analysis, we found the link between CCMCCs and PTEN mutation, TP53 mutation, ATM mutation, PTEN expression, TP53 expression, ATM expression, etc. These close correlations might also explain why CCMCCs could be used for prognosis prediction.
A certain number of existing prognostic models for PC patients were proposed (Zhang et al., 2020;Xiaoli et al., 2015). These predictive models were developed by lncRNAs, miRNAs, or immune-related genes (Zhang et al., 2020;Xiaoli et al., 2015). Nevertheless, none of them involved circadian clock genes in. The mammalian circadian clock-related genes interacted with each other (Lehmann et al., 2015). CLOCK and ARNTL regulated the activity and expression of NR1D2, one of the nuclear receptors. Subsequently, NR1D2 also could inhibit the mRNA level of ARNTL and NFIL3, leading to the repression of DBP. Collectively, extensive interactions of CCMCCs existed. It is important to make accurate predictions about the prognosis of PC patients for the development of precise treatment. In the present study, through COX regression analysis, we identified 9 vital CCMCCs which could predict prognosis in PC. In unselective PC patients who all received surgery, we developed the circadian clock-based risk score model with higher accuracy in the prediction of DFS than clinical features, including T stage and N stage. Then, we also verified the performance of the risk score model in T2N0 disease and T3-4N1 disease. Importantly, in DFS prediction, the risk score model showed preferable utility in T2N0 disease, indicating that PC patients who were diagnosed at T2N0 stage might be benefited more from the circadian clock-based risk score model. The predictive model also performed well in terms of PFS and OS. The 9-CCMCCs signature also reflected particular molecular functions, cellular components, biological processes, and signaling pathways. Apart from clinical factors, such as T stage, N stage, and PSA, the risk score model might put new insights in the prognosis of PC from the aspect of circadian clock.
We acknowledged some limitations in this research. Firstly, the data of all enrolled public cohorts were obtained retrospectively. Secondly, the heterogeneity of PC samples was nonnegligible. For most PC patients, detailed therapeutic schedules, such as operative approaches and chemotherapy regimens, were unavailable and lacking. Another prospective cohort with less sample heterogeneity and metastasis-free survival data is still necessary. Moreover, in vitro and in vivo experiments are required to further confirm the significant roles of CCMCCs in PC.

CONCLUSION
In summary, our study improved the understanding of the role of the circadian clock in PC and proposed a circadian clock-based risk score model for prognostic prediction. Moreover, PC patients at T2N0 stage might be benefited more from the circadian clock-based risk score model. These results might be helpful for further investigations of the circadian clock-related molecular mechanisms and the development of therapies for cancer.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
The authors received no funding for this work.