Nomograms and scoring system for forecasting overall and cancer‐specific survival of patients with prostate cancer

Abstract
Background: Estimated life expectancy is one of the most important factors in determining treatment options for prostate cancer (PCa) patients. However, clinicians have few effective prognostic tools for assessing survival in individual patients with PCa. Methods: We screened 283,252 patients diagnosed with PCa between 2004 and 2015 from the Surveillance, Epidemiology, and End Results (SEER) database and randomly divided them into training and validation groups. We used univariate and multivariate Cox analyses to identify independent prognostic factors and established nomograms to predict 1‐, 3‐, 5‐, and 10‐year overall survival (OS) and cancer‐specific survival (CSS) for PCa patients. The predictive performance of the nomograms was tested and externally validated using the concordance index (C‐index) and receiver operating characteristic (ROC) curves. Calibration curves and decision curve analysis (DCA) were used for internal validation. We further developed a PCa prognostic scoring system based on the impact of the available variables on survival. Results: Age, race, marital status, TNM stage, surgery method, radiotherapy, chemotherapy, PSA value, and Gleason score were identified as independent prognostic factors and included in the survival nomograms. The results of training (C‐index: OS = 0.776, CSS = 0.889; AUC value: OS = 0.772–0.802, CSS = 0.892–0.936) and external validation (C‐index: OS = 0.759, CSS = 0.875) indicated that our nomograms performed well in predicting 1‐, 3‐, 5‐, and 10‐year OS and CSS. Internal validation using the calibration curves and DCA curves demonstrated the effectiveness of the prediction models. The prognostic scoring system was more effective than the AJCC staging system in predicting the survival of PCa patients, especially for OS. Conclusion: The prognostic nomograms and prognostic scoring system perform favorably in predicting OS and CSS of PCa patients. These individualized survival prediction tools may contribute to clinical decisions.


| INTRODUCTION
As a global threat to men's health, prostate cancer (PCa) has attracted wide attention. According to statistics released in 2020, there were approximately 191,930 new cases of PCa in the United States, accounting for 21.5 percent of all male malignancies. 1 With all stages combined, PCa has the highest 5-year survival rate, 98 percent, compared with the 67 percent average for all malignancies. 2 Radical prostatectomy is the preferred treatment for localized PCa, and in recent years the surgical indications have been extended to some patients with locally advanced disease. [3][4][5] Estimated life expectancy is one of the most important factors in determining treatment options for PCa patients. 6 Current consensus guidelines recommend radical prostatectomy as the first choice of treatment for patients with low- and intermediate-risk localized PCa and a life expectancy of more than 10 years, and for patients with high-risk localized PCa and a life expectancy of more than 5 years. 7 Radiotherapy combined with androgen deprivation therapy is the preferred treatment for localized PCa patients with a life expectancy of less than 5 years and for all advanced PCa patients. 7 Life expectancy therefore greatly influences the choice and intensity of treatment for PCa patients. Some scholars have proposed that the survival time of patients can be evaluated according to their age and gait speed. 8 Some clinicians have applied the Charlson Comorbidity Index to predict the survival time of PCa patients based on race, sex, comorbidity, and social life expectancy. 9 However, these prognostic systems cannot individualize survival estimates by incorporating clinicopathological factors such as age, marital status, pathological grade, tumor stage, surgery, and chemoradiotherapy.
The nomogram, a statistical tool that combines multiple independent risk factors in an intuitive graph, has been used extensively in recent years to predict survival time for patients with various cancers. [10][11][12] The purpose of this study was therefore to establish a nomogram-based prognostic tool that accurately and individually estimates overall survival (OS) and cancer-specific survival (CSS) for PCa patients, to help clinicians select treatment options and intensity.

| Data source
The clinicopathological data of all patients in this study were obtained from the Surveillance, Epidemiology, and End Results (SEER) database [SEER18 Regs Custom Data, based on the 2018 submission]. The SEER database, supported by the National Cancer Institute, is the largest registry of cancer patients in the United States. 13 Because SEER data are publicly available and de-identified, neither informed consent from patients nor ethics committee approval was required for this study.

| Patients and clinicopathologic factors
The inclusion criteria for this study were as follows: (a) patients were pathologically diagnosed with prostate adenocarcinoma between 2004 and 2015; (b) patients had no previous history of cancers other than PCa; and (c) complete clinicopathological information was available, including race, age at diagnosis, marital status, TNM stage, surgery method, radiotherapy, chemotherapy, prostate-specific antigen (PSA) value, Gleason score, OS status, CSS status, and survival data. We excluded patients with missing information on any of the above variables, as well as patients with rare Gleason scores, such as Gleason 1 + 4, 2 + 5, and 3 + 1.

| Statistical analysis
We randomly assigned PCa patients diagnosed in 2004, 2010, and 2012 to the validation cohort, and the remaining patients formed the training cohort used to establish the nomogram models. We identified independent prognostic factors as those with p values <0.05 in the multivariate Cox analyses. These prognostic factors were then used to develop nomograms to predict 1-, 3-, 5-, and 10-year OS and CSS rates in patients with PCa. Receiver operating characteristic (ROC) curves and nomograms were used to establish the survival models. The area under the curve (AUC) and C-index were computed to quantify the predictive ability of the survival models. We then validated the accuracy of the nomogram models externally. Internal validation was performed by decision curve analysis (DCA) to further evaluate the practicability of the nomogram models for clinical decision-making. We further established prognostic scores for OS and CSS of PCa patients based on the coefficient of each variable in the Cox models. The prognostic score ranged from 0 to 100 and was used to divide the prognosis of PCa patients into four grades. We used Kaplan-Meier curves and the Akaike information criterion (AIC) to compare OS and CSS in PCa patients between the prognostic grading system and the AJCC staging system.
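To make the C-index mentioned above concrete, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the software used in this study: the function name `harrell_c_index`, the toy inputs, and the simplified handling of tied follow-up times (such pairs are skipped) are assumptions made for the example.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times:       observed follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter follow-up is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the patient who died earlier had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third is censored at 15 months.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
perfect_risk = [0.9, 0.6, 0.4, 0.1]
print(harrell_c_index(toy_times, toy_events, perfect_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is why values closer to 1 are read as better discrimination.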

| Patient clinicopathological data
According to the selection criteria, 283,252 patients with PCa were included in this study and divided into a training cohort and a validation cohort (Figure 1). The 1-, 3-, 5-, and 10-year OS and CSS rates in the training cohort were similar to those in the validation cohort (OS: training cohort vs. validation cohort, 98.6%, 94.7%, 90.3%, and 76.6% vs. 98.6%, 94.7%, 90.2%, and 75.4%, respectively; CSS: training cohort vs. validation cohort, 99.4%, 98.0%, 96.9%, and 93.6% vs. 99.4%, 98.1%, 96.8%, and 93.3%, respectively). The baseline characteristics of the PCa patients are listed in Table 1. We compared the distribution of invasion factors in PCa and found that patients with a higher Gleason grade tended to have higher PSA values and T stages and were more prone to regional lymph node and distant metastases (Table S1).

| Independent predictors for patients
The results of univariate and multivariate analyses for OS and CSS in the training cohort are presented in Table 2. [...] p < 0.001) were the highest risk factors for cancer-specific death (Table 2).

| Prognostic nomogram for survival
The independent prognostic factors (race, age at diagnosis, marital status, TNM stage, surgery method, radiotherapy, chemotherapy, PSA value, and Gleason score) were combined to establish nomograms to predict 1-, 3-, 5-, and 10-year OS and CSS rates in patients with PCa (Figure 2). The OS and CSS rates of PCa patients can be estimated by adding the scores corresponding to each variable on the nomogram. We established ROC curves to further evaluate the ability of the independent prognostic factors to predict 1-, 3-, 5-, and 10-year OS and CSS. As expected, the results indicated a good ability to predict the survival of PCa patients (Figure 3). [...] validation: OS C-index = 0.759, 95% CI = 0.755-0.763; CSS C-index = 0.875, 95% CI = 0.869-0.881) and the close agreement between prediction and observation in the calibration curves indicated that the nomogram models discriminated patients' survival accurately (Figure 4 and Figure 5). Internal validation using the DCA curves demonstrated that the prognostic nomograms had good practicability for clinical decision-making (Figure 6).

| Establishment of prognostic scoring system
We included the independent prognostic factors in Cox models; the resulting HRs are presented in Table 2. We calculated prognostic scores for OS and CSS according to the coefficients in the Cox models (Table 3 and Table 4). The prognostic scoring system has a total score range of 0-100, and a higher score indicates a worse prognosis. We divided the prognosis of PCa patients into four grades based on the prognostic scoring system: 0-25 points (grade 1), 26-35 points (grade 2), 36-50 points (grade 3), and 51-100 points (grade 4). Kaplan-Meier curves were used to compare the OS and CSS of PCa patients in different grades or stages using the prognostic grading system and the AJCC staging system (Figure 7). The proportion of patients and the OS and CSS rates for each AJCC stage and prognostic grade are presented in Table 5. We used AIC to evaluate the suitability of survival models for both grading systems, and found that the use of prognostic grading

FIGURE 2 Nomograms for predicting 1-, 3-, 5-, and 10-year overall survival (A) and cancer-specific survival (B) rates in prostate cancer.

| DISCUSSION
At present, medical data mining is increasingly applied to clinical practice. 14 Clinical big data plays an important role in establishing prognostic models, assessing risk factors, and diagnosing and treating diseases, all of which benefit patients greatly. 15,16 Prediction of cancer survival is an important part of oncology. The main advantage of nomograms is individualized risk assessment based on each patient's information, which benefits clinical practice in many respects, such as disease prediction, tumor recurrence, survival assessment, and adjuvant therapy. [17][18][19][20] In this population-based study, we developed prognostic nomograms based on the clinical characteristics and treatment of PCa patients. Statistical methods such as the C-index, ROC curves, DCA, and calibration curves were used to evaluate the prediction accuracy of the nomograms. The C-index and AUC quantify the discriminative ability of a nomogram, and values closer to 1 indicate higher accuracy. 17,21 DCA is used to evaluate the practicability of a prediction model for decision-making. 22 The similarity between the predicted curve and the observed curve in the calibration curves directly demonstrates the predictive ability of a nomogram. 17 Internal and external validation demonstrated that the proposed nomograms have high accuracy and excellent discrimination in predicting 1-, 3-, 5-, and 10-year OS and CSS rates for PCa patients. We further established the prognostic scoring system and verified that it was more effective than the traditional AJCC staging system in predicting the prognosis of PCa patients, especially for OS. Effective individualized survival prediction tools may contribute to clinical practice in selecting effective treatment options and intensity for PCa patients.
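As a rough illustration of what DCA measures, the sketch below computes the net benefit of a prediction model at a single threshold probability using the standard formula, net benefit = TP/n - (FP/n) x pt/(1 - pt). It is a simplified binary-outcome version that ignores censoring, so it is not the survival-specific analysis performed in this study. The function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(predicted_risk, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_risk: predicted event probabilities (e.g., risk of death by 5 years)
    outcomes:       1 if the event occurred, 0 otherwise (no censoring assumed)
    threshold:      threshold probability pt, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    # True positives: flagged as high risk and the event occurred.
    tp = sum(1 for p, y in zip(predicted_risk, outcomes) if p >= threshold and y == 1)
    # False positives: flagged as high risk but the event did not occur.
    fp = sum(1 for p, y in zip(predicted_risk, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data (hypothetical): four patients with predicted risks and observed outcomes.
risks = [0.9, 0.8, 0.3, 0.1]
observed = [1, 0, 1, 0]
for pt in (0.2, 0.5):
    print(pt, net_benefit(risks, observed, pt))
```

A decision curve plots this quantity over a range of thresholds and compares the model with the "treat all" and "treat none" strategies; a model is considered useful at thresholds where its net benefit exceeds both.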
In this study, age at diagnosis, race, marital status, TNM stage, surgery method, radiotherapy, chemotherapy, PSA value, and Gleason score were identified as independent prognostic factors and included in the development of the prognostic nomograms. Many studies have identified race as an independent risk factor for the survival of PCa patients, and Black patients have worse survival than White patients, which is due, in part, to a relatively poor economic and medical environment. 23,24 Previous studies have demonstrated that marital status affects survival in patients with many types of cancer, such as prostate, breast, and lung cancer. 25 Married cancer patients tend to have better survival than unmarried patients (divorced/widowed/separated), and the associated benefits are more significant in males than in females. 25 A population-based study reported that, across all tumor stages of PCa, the OS and CSS of married men are higher than those of unmarried men, which may be related to patients' physical and mental health and treatment compliance. 26 In our study, the effects of race and marital status on OS and CSS of PCa patients were similar to those in previous reports.
The clinical information of PCa patients in this study, including age, PSA, TNM stage, and Gleason score, was based on the registration of patients at the time of initial diagnosis. However, the PSA value in patients with PCa is a dynamic indicator, and the SEER database does not provide PSA monitoring information for patients who underwent subsequent treatment, such as radical prostatectomy, radiotherapy, and androgen deprivation therapy. Although PSA is controversial as a routine screening test for PCa due to its diversity of effects, it is widely accepted as an indicator of treatment effectiveness and biochemical recurrence of PCa. [27][28][29] Biochemical recurrence time and the speed of PSA increase are important factors affecting the survival of PCa patients after treatment. [30][31][32] In our survival prognosis model, PSA value was not significantly associated with OS and CSS, but we could not provide results regarding the dynamic effect of PSA value on patients' survival.
We proposed a PCa prognostic scoring system using the clinicopathological information that was associated with patients' survival. Age was the highest risk factor for OS of PCa, contributing 0 to 23 points. Gleason score contributed 0 to 24 points and was the highest risk factor for CSS. The classification of the AJCC system for PCa is based on Gleason score and tumor TNM stage. In our prognostic scoring system, Gleason score and TNM stage in combination carried different weights for OS and CSS (e.g., 0-35 points for OS and 0-52 points for CSS), which is the main reason for the large difference for OS but smaller difference for CSS in the Kaplan-Meier curves when comparing our prognostic scoring system with the AJCC staging system.

TABLE 3 [...] Gleason score 9-10 (4 + 5/5 + 4/5 + 5): 12. Note: The prognostic scoring system has a total score of 100, with a higher score indicating a worse prognosis of prostate cancer patients. We divided the prognosis of prostate cancer patients into four grades: 0-25 points (grade 1), 26-35 points (grade 2), 36-50 points (grade 3) and 51-100 points (grade 4).

TABLE 4 Scores of independent prognostic factors in the prognostic scoring system for cancer-specific survival. [...] Gleason score 9-10 (4 + 5/5 + 4/5 + 5): 24. Note: The prognostic scoring system has a total score of 100, with a higher score indicating a worse prognosis of prostate cancer patients. We divided the prognosis of prostate cancer patients into four grades: 0-25 points (grade 1), 26-35 points (grade 2), 36-50 points (grade 3) and 51-100 points (grade 4).

FIGURE 7 Kaplan-Meier curves comparing overall survival (OS) and cancer-specific survival (CSS) between prostate cancer patients grouped by the AJCC staging system and by the prognostic grading system: (A) OS of AJCC staging system; (B) CSS of AJCC staging system; (C) OS of prognostic grading system; (D) CSS of prognostic grading system.
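The mapping from a total prognostic score to one of the four prognostic grades can be sketched in a few lines of Python. The cut-offs (0-25, 26-35, 36-50, 51-100) are those stated in the notes to Tables 3 and 4; the function name `prognostic_grade` is hypothetical, and treating a non-integer score by the same "less than or equal" comparisons is an assumption, since the published scores are whole numbers.

```python
def prognostic_grade(total_score):
    """Map a total prognostic score (0-100) to prognostic grade 1-4.

    Cut-offs follow the table notes: 0-25 (grade 1), 26-35 (grade 2),
    36-50 (grade 3), 51-100 (grade 4). A higher grade means a worse prognosis.
    """
    if not 0 <= total_score <= 100:
        raise ValueError("total score must be between 0 and 100")
    if total_score <= 25:
        return 1
    if total_score <= 35:
        return 2
    if total_score <= 50:
        return 3
    return 4


# Example: a patient whose per-variable points sum to 42 falls in grade 3.
print(prognostic_grade(42))  # prints 3
```

The total score itself is the sum of the points assigned to each variable in Table 3 (for OS) or Table 4 (for CSS), so the same function applies to both outcomes once the appropriate table has been used.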
The advantage of prognostic assessment tools is that they provide an intuitive initial survival expectation, based on which clinicians and patients can jointly determine treatment options. However, predictive tools are not a substitute for clinical judgment, and clinicians need to make trade-offs based on individual differences such as the severity of comorbidities and physical condition. Although this is not the first nomogram proposed to assess survival in PCa patients, previously proposed prognostic tools were mostly limited to localized PCa. 33,34 Because our prognostic tool considers comprehensive clinical factors and includes PCa patients with advanced-stage disease, it may provide new clues for possible clinical translation.
However, our study has some limitations. First, the PSA growth rate and biochemical recurrence time of PCa patients are important indicators for survival, 35 but this information is missing from the SEER database, which may reduce the accuracy of our predictions. Second, the health status and concomitant diseases of PCa patients were not included in our study, which may introduce bias into our results. Last, the data in this study were collected retrospectively, and prospective studies are needed to confirm our results.

| CONCLUSIONS
Based on a large data set of PCa patients, we developed and validated prognostic nomograms and demonstrated their favorable ability to predict 1-, 3-, 5-, and 10-year OS and CSS for PCa patients. We also proposed, for the first time, a prognostic scoring system that presents more intuitively the influence of clinicopathological factors on OS and CSS of PCa. The proposed survival models for PCa can help clinicians select individualized and effective treatment.

AUTHOR'S CONTRIBUTION
Yuanyuan Chang and Lei Cheng designed and reviewed this research. Yuan Zhou and Changming Lin completed statistical analysis and the draft manuscript. Lian Zhu and Rentao Zhang performed validation and processed the charts.