Prognostic models for predicting overall and cancer-specific survival in hepatocellular carcinoma: A competing risk analysis

This study was conducted to estimate the probability of cancer-specific survival (CSS) of hepatocellular carcinoma (HCC) and establish a competing risk nomogram for predicting the CSS of HCC using a large population-based cohort. Methods: Patients diagnosed with HCC between 2004 and 2015 were identified from the Surveillance, Epidemiology, and End Results (SEER) Program. CSS and overall survival (OS) were the endpoints of the study. A competing risk nomogram for predicting CSS was built with Fine and Gray's competing risk model, and the nomogram for predicting OS was constructed with Cox proportional hazard regression models. The predictive performance of the models was tested in terms of discrimination and calibration. In the validation set, the concordance index of the two nomogram models reached 0.810 and 0.750, respectively. Calibration curves revealed good consistency between the model predictions and observed outcomes. Furthermore, cumulative incidence and Kaplan-Meier analyses separated patients into four distinct risk groups, supporting the predictive performance of the models.


Introduction
Hepatocellular carcinoma (HCC) is the most common liver cancer and the fourth leading cause of cancer-related death worldwide, with approximately 841,000 new cases and 782,000 deaths annually [1]. The worldwide age-adjusted incidence of HCC is around 10.1 cases per 100,000 person-years and is expected to increase in the future [2]. Because of the lack of specific symptoms and unfavourable tumour biology, most patients with HCC are diagnosed at an advanced stage and have a poor prognosis [3,4].
As in other types of cancer, competing events, such as cancer-specific death and death from other causes, are frequent in HCC. These events are mutually exclusive: the occurrence of one prevents the occurrence of the other. The Kaplan-Meier method and Cox regression models used in traditional survival analysis consider only one endpoint and treat competing events as censored observations, which may overestimate the risk of the event of interest [5][6][7]. The Fine and Gray model, which is based on the sub-distribution hazard, is recommended to overcome this problem [5,6,8]. Several studies have analysed the independent predictive factors of related malignant diseases using competing risk analysis and established models with good predictive performance for cancer-specific death [9][10][11][12]. However, no studies have constructed a prognostic model for cancer-specific death from HCC using the competing risk method. Therefore, this study was conducted to evaluate the predictive factors associated with the survival of patients with HCC using the competing risk method and to develop two simple nomograms for individualized prediction of cancer-specific survival (CSS) and overall survival (OS).
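The overestimation described above can be illustrated with a small sketch. It uses made-up toy data (not SEER data) and compares the naive estimate, one minus the Kaplan-Meier estimator with deaths from other causes treated as censored, with the nonparametric cumulative incidence estimate that accounts for the competing event. The function name and the event coding (0, 1, 2) are assumptions introduced only for this illustration.

```python
def naive_and_competing_risk_estimates(times, events, horizon):
    """Estimate the probability of cancer-specific death by `horizon`.

    events: 0 = censored, 1 = cancer-specific death, 2 = death from other causes.
    Returns (naive, cif):
      naive = 1 - Kaplan-Meier with other-cause deaths treated as censored
      cif   = cumulative incidence of cause 1 accounting for the competing event
    """
    data = sorted(zip(times, events))
    n = len(data)
    at_risk = n
    km_cause1 = 1.0      # Kaplan-Meier "survival" from cause 1 only
    overall_surv = 1.0   # survival from death of any cause, just before time t
    cif = 0.0
    i = 0
    while i < n and data[i][0] <= horizon:
        t = data[i][0]
        d1 = d2 = removed = 0
        while i < n and data[i][0] == t:   # handle tied times together
            if data[i][1] == 1:
                d1 += 1
            elif data[i][1] == 2:
                d2 += 1
            removed += 1
            i += 1
        cif += overall_surv * d1 / at_risk
        km_cause1 *= 1 - d1 / at_risk
        overall_surv *= 1 - (d1 + d2) / at_risk
        at_risk -= removed
    return 1 - km_cause1, cif


# Toy data (months, event code); purely illustrative.
times = [1, 2, 3, 4, 5, 6]
events = [1, 2, 1, 2, 1, 0]
naive, cif = naive_and_competing_risk_estimates(times, events, horizon=6)
print(f"naive 1-KM: {naive:.4f}, CIF: {cif:.4f}")
```

On this toy data the naive estimate (0.6875) exceeds the cumulative incidence (0.5), because patients who died from other causes are treated as if they could still die from cancer later.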

Patients
Patients with HCC were identified based on the International Classification of Diseases for Oncology, third edition (ICD-O-3) primary site code C22.0 and histologic type codes 8170-8175. For the variables included in our study, we first excluded patients with missing data, recorded as "blank" in the database. We then excluded patients 1) with a history of other primary malignancies; 2) with invalid follow-up data; 3) aged <18 years; or 4) with undefined data, recorded as "unknown" in the database. The detailed inclusion and exclusion criteria are shown in Figure 1.
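The selection criteria above can be expressed as a simple record filter. The following is a minimal sketch: every field name is hypothetical (the actual SEER export uses different column names), and the rule for "invalid follow-up" (missing or negative survival time) is an assumption, since the text does not define it.

```python
# Hypothetical field names; the SEER export used in the study may differ.
STUDY_VARIABLES = ("race", "age", "sex", "marital_status", "tumour_diameter",
                   "ajcc6_stage", "surgery", "chemotherapy", "radiotherapy")


def is_eligible(rec):
    """Return True if a patient record passes the stated criteria."""
    # Inclusion: ICD-O-3 primary site C22.0 and histologic type 8170-8175
    if rec.get("site") != "C22.0":
        return False
    if not 8170 <= rec.get("histology", 0) <= 8175:
        return False
    # Exclusion: missing ("blank") or undefined ("unknown") study variables
    for field in STUDY_VARIABLES:
        value = rec.get(field)
        if value is None or str(value).strip().lower() in ("", "blank", "unknown"):
            return False
    # Exclusion: history of other primary malignancies
    if rec.get("other_primary_malignancy", False):
        return False
    # Exclusion: invalid follow-up (assumed here: missing or negative months)
    months = rec.get("survival_months")
    if not isinstance(months, (int, float)) or months < 0:
        return False
    # Exclusion: age < 18 years
    return rec["age"] >= 18


example = {
    "site": "C22.0", "histology": 8170, "race": "white", "age": 62,
    "sex": "male", "marital_status": "married", "tumour_diameter": 2.5,
    "ajcc6_stage": "I", "surgery": "resection", "chemotherapy": "no",
    "radiotherapy": "no", "other_primary_malignancy": False,
    "survival_months": 40,
}
print(is_eligible(example))
```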

Variable selection
Information on demographic factors (race, age, sex, and marital status), tumour-related factors (tumour diameter and AJCC staging system), therapeutic factors (surgery, chemotherapy, and radiotherapy), and follow-up was collected from the SEER database.
Marital status was recorded as single (never married), separated, divorced, widowed, unmarried or domestic partner, and married (including common law) in the SEER database. We grouped single (never married), separated, divorced, widowed, and unmarried or domestic partner into the single category.
Because the SEER database contained no information on the eighth edition of the AJCC staging system and provided seventh edition staging only for patients diagnosed after 2010, we used the sixth edition AJCC staging system to enrol as many patients as possible. Based on the Surgery Codes of the SEER program, we divided the surgical procedures into four categories: no surgery, local tumour destruction (e.g., heat radiofrequency ablation (RFA) and percutaneous ethanol injection (PEI)), resection, and transplantation.
The endpoints of the study were cancer-specific survival (CSS) and overall survival (OS). The specific cause of death was based on the "SEER cause-specific death classification" code in the SEER database. OS was calculated from the date of diagnosis to the date of death from any cause or the most recent follow-up. CSS was defined as the interval between the date of diagnosis and the date of death due to HCC or the most recent follow-up. The median follow-up time was calculated by the reverse Kaplan-Meier method.
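The marital status grouping described above amounts to a two-level recode. A minimal sketch follows; the label strings are taken from the text, but their exact spelling in a SEER export is an assumption, and the function name is hypothetical.

```python
# Labels that are collapsed into "single", as listed in the text.
SINGLE_CATEGORIES = {
    "single (never married)",
    "separated",
    "divorced",
    "widowed",
    "unmarried or domestic partner",
}


def recode_marital_status(raw):
    """Collapse the recorded marital status into 'single' or 'married'."""
    label = raw.strip().lower()
    if label in SINGLE_CATEGORIES:
        return "single"
    if label.startswith("married"):   # "married (including common law)"
        return "married"
    raise ValueError(f"unexpected marital status: {raw!r}")


print(recode_marital_status("Widowed"))
print(recode_marital_status("Married (including common law)"))
```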
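The endpoint definitions and the reverse Kaplan-Meier estimate of median follow-up can be sketched as follows. This is an illustrative implementation on toy data, not the authors' code: the event coding and function names are assumptions. In the reverse Kaplan-Meier method the roles are swapped, so that censoring (end of follow-up) is treated as the event and death as censoring.

```python
def event_code(dead, died_of_hcc):
    """0 = alive at last follow-up, 1 = death from HCC, 2 = death from other causes.

    For OS, any non-zero code is an event. For the competing risk analysis
    of CSS, code 1 is the event of interest and code 2 the competing event.
    """
    if not dead:
        return 0
    return 1 if died_of_hcc else 2


def reverse_km_median(months, dead):
    """Median follow-up: first time the reverse Kaplan-Meier curve is <= 0.5."""
    data = sorted(zip(months, dead))
    n = len(data)
    at_risk = n
    surv = 1.0
    i = 0
    while i < n:
        t = data[i][0]
        censored = deaths = 0
        while i < n and data[i][0] == t:   # handle tied times together
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if censored:                       # censoring is the "event" here
            surv *= 1 - censored / at_risk
        if surv <= 0.5:
            return t
        at_risk -= censored + deaths
    return None                            # median not reached


# Toy data, purely illustrative.
print(reverse_km_median([10, 20, 30, 40, 50], [True, False, True, False, False]))
```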

Statistical Analysis
Demographic and clinical variables were summarized by descriptive statistics. Categorical variables were expressed as numbers (percentages) and compared by the chi-square test. Cancer-specific death and death from other causes were regarded as the two competing endpoint events, and the associations between variables and the risk of cancer-specific death were evaluated by Fine and Gray's competing risk analysis [6]. The cancer-specific mortality probability of each group was depicted by the cumulative incidence function (CIF) and compared by Gray's test [6,8,13]. Variables with p < 0.05 in univariate analysis or with clinical relevance were then evaluated by multivariate analysis based on the proportional sub-distribution hazard model. The independent predictive factors in the Fine and Gray competing risk model were incorporated into the nomogram model to predict the 3-, 4-, and 5-year CSS probability.
For OS, the independent risk factors were identified by univariate and multivariate Cox proportional hazard regression analyses, and the corresponding nomogram model was constructed to predict the 3-, 4-, and 5-year OS probability.
The predictive performance of the nomogram models was analysed from two perspectives: discrimination and calibration. The discriminative ability of the models was tested by the concordance index (C-index), and calibration was tested using calibration curves [14,15]. Furthermore, CIF curves with Gray's test or Kaplan-Meier curves with the log-rank test were used to measure the performance of the models. The risk groups were defined by previously recommended cut-points for predictive models (16th, 50th, and 84th percentiles) [16], which classified patients into good, fairly good, fairly poor, and poor risk groups based on their personalized total points determined using the nomogram models.
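For reference, the quantities named above can be written out in standard notation. The symbols are introduced here for illustration and are not taken from the source: T is the time to death, D the cause (1 = cancer-specific death, 2 = death from other causes), x the covariate vector, and beta the regression coefficients.

```latex
% Cumulative incidence function (CIF) for cause k
F_k(t) = \Pr(T \le t,\ D = k)

% Sub-distribution hazard of cause 1 and the Fine and Gray proportional model
\lambda_1(t \mid x) = -\frac{d}{dt}\log\{1 - F_1(t \mid x)\}
                    = \lambda_{1,0}(t)\,\exp(\beta^{\top} x)

% Implied CIF for covariates x, given the baseline CIF F_{1,0}(t)
F_1(t \mid x) = 1 - \{1 - F_{1,0}(t)\}^{\exp(\beta^{\top} x)}
```

Under this model, a sub-distribution hazard ratio above 1 for a covariate corresponds to a higher cumulative incidence of cancer-specific death at every time point.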
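As an illustration of the discrimination measure, the sketch below computes Harrell's C-index for a single event type on toy data. It is an assumption that this matches the estimator used in the study; competing-risk variants of the C-index handle patients who experienced the competing event differently, and the text does not specify which was applied.

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance index.

    A pair (i, j) is comparable when patient i had the event and a shorter
    time than patient j. It is concordant when i also has the higher
    predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: higher risk score should mean earlier death.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(c_index(times, events, [4, 3, 2, 1]))   # perfectly ordered predictions
print(c_index(times, events, [1, 2, 3, 4]))   # perfectly reversed predictions
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of predicted risks.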

Basic characteristics of patients
According to the inclusion and exclusion criteria, 34,957 patients diagnosed with HCC between 2004 and 2015 were included for further analysis (Figure 1). The whole cohort was then randomly divided into a training set (n = 31,461) and a validation set (n = 3,496) at a ratio of 9:1. The basic clinicopathological features of the whole cohort and the corresponding training and validation sets are shown in Table 1. In the whole cohort, the largest age group was patients younger than 60 years (46.0%), and most patients were Caucasian (67.9%) and male (77.2%). In terms of therapy for HCC, 67.3% of patients did not undergo surgery for the primary nodules, 13.2% were treated by local tumour destruction, 11.8% underwent liver resection, and 7.7% underwent liver transplantation. Regarding tumour characteristics, the most common categories were a tumour diameter smaller than 3 cm (33.3%) and AJCC stage I (41.9%). There was no significant difference in clinicopathological features between the training and validation sets.
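A 9:1 random split of this kind can be sketched as below. The seed, the shuffling approach, and the rounding rule are assumptions, since the text does not describe how the randomization was performed; with 34,957 patients, rounding the training size down reproduces the reported set sizes.

```python
import random


def split_cohort(patient_ids, train_fraction=0.9, seed=0):
    """Randomly split patient identifiers into training and validation sets."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)   # rounds down
    return shuffled[:n_train], shuffled[n_train:]


train_ids, validation_ids = split_cohort(range(34957))
print(len(train_ids), len(validation_ids))
```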

Identification of risk factors and construction of nomograms
The median follow-up time was 63 months (range: 1-155). Of the 34,957 patients, 9,840 were alive at the last follow-up, 21,044 died from HCC, and 4,073 died from other causes. In the training set, the 5-year OS, cancer-specific mortality, and other-cause mortality were 24.3%, 63.9%, and 11.8%, respectively. The 3- and 5-year cumulative incidences of death and the CIF curves corresponding to each clinicopathological variable are shown in Table 2 and Figure 2.
Univariate and multivariate analyses were performed in the training set to identify independent predictive factors associated with CSS and OS (Tables 3 and 4). Multivariate analysis identified age, race, sex, surgical therapy, chemotherapy, radiotherapy, tumour diameter, and tumour stage as independent predictive factors of CSS. For OS, marital status was an additional independent risk factor.
Based on the independent predictive factors in multivariate analysis, a nomogram for predicting OS and a competing risk nomogram for predicting CSS were constructed (Figure 3).
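To show how a nomogram turns a regression model into points, the sketch below applies one common convention: the variable with the widest coefficient range spans 0-100 points and the other variables are scaled proportionally. The coefficients are made up for illustration and are NOT the values estimated in this study; the function names are hypothetical, and the text does not state which software or scaling was used.

```python
def nomogram_points(coefficients):
    """Map regression coefficients to a 0-100 point scale.

    coefficients: {variable: {level: beta}}, with beta = 0 for the
    reference level. The lowest-risk level of each variable gets 0 points.
    """
    spans = {var: max(levels.values()) - min(levels.values())
             for var, levels in coefficients.items()}
    scale = 100.0 / max(spans.values())
    return {
        var: {level: round((beta - min(levels.values())) * scale, 1)
              for level, beta in levels.items()}
        for var, levels in coefficients.items()
    }


def total_points(points, patient):
    """Sum the points for one patient's levels across all variables."""
    return sum(points[var][level] for var, level in patient.items())


# Made-up coefficients, purely illustrative.
example_coefficients = {
    "surgery": {"none": 0.0, "resection": -1.0, "transplantation": -2.0},
    "sex": {"female": 0.0, "male": 0.5},
}
points = nomogram_points(example_coefficients)
print(points)
print(total_points(points, {"surgery": "resection", "sex": "male"}))
```

In a full nomogram the total points are then mapped to a predicted probability at each time horizon through the baseline function of the fitted model.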

Predictive performance of nomogram models
The predictive performance of the nomogram models was verified via the C-index and calibration curves in the training and validation sets.
For the competing risk nomogram for CSS, the C-index of the model reached 0.805 (95%CI, 0.805-0.806) in the training set and 0.810 (95%CI, 0.807-0.813) in the validation set. The calibration plots also displayed good agreement between the nomogram predictions and the observed probability of 3- and 5-year CSS in the training and validation sets (Figure 4 and Figure S1).
For the OS nomogram, the C-index values were 0.755 (95%CI, 0.750-0.759) in the training set and 0.750 (95%CI, 0.737-0.763) in the validation set, and the calibration curves for 3- and 5-year OS were also well matched with the standard lines (Figure 4 and Figure S1).
Based on the nomograms, each patient was assigned total points for CSS and OS. The median total points for CSS and OS were 152 (range: 10-256) and 169 (range: 17-297), respectively, in the training set and 151.5 (range: 14-256) and 151.5 (range: 23-295), respectively, in the validation set. Based on previously reported cut-off points (16th, 50th, and 84th percentiles of total points in the training set), patients were divided into four risk groups. CIF and Kaplan-Meier analyses showed that the curves of the four risk groups were widely separated in the training and validation sets (both p < 0.001), further supporting the good predictive performance of the nomogram models (Figure 5).
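The grouping by percentile cut-offs can be sketched as follows. The nearest-rank percentile definition, the handling of values equal to a cut-off, and the assumption that more points indicate a poorer prognosis are choices made for this illustration; the text does not specify them.

```python
import bisect

RISK_GROUPS = ("good", "fairly good", "fairly poor", "poor")


def percentile_cutoffs(training_points, percentiles=(16, 50, 84)):
    """Nearest-rank percentiles of the total points in the training set."""
    ordered = sorted(training_points)
    n = len(ordered)
    cutoffs = []
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)   # integer ceiling of p * n / 100
        cutoffs.append(ordered[rank - 1])
    return cutoffs


def risk_group(total_points, cutoffs):
    """Assign a risk group; points equal to a cut-off fall in the lower group."""
    return RISK_GROUPS[bisect.bisect_left(cutoffs, total_points)]


# Toy training-set points 1..100, purely illustrative.
cutoffs = percentile_cutoffs(range(1, 101))
print(cutoffs)
print(risk_group(10, cutoffs), risk_group(60, cutoffs), risk_group(90, cutoffs))
```

The cut-offs are derived from the training set only and then applied unchanged to the validation set, consistent with the description above.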

Discussion
Cancer-specific death and death from other causes are mutually exclusive endpoints in oncology research. Traditional Kaplan-Meier and Cox methods treat competing events as censoring, which may lead to overestimation of cancer-specific mortality [17,18]. As a result, clinicians' assessment of patient prognosis may be biased, which can place a substantial psychological burden on patients and affect their lives. In the present study, we conducted a real-world study based on the SEER database to identify the independent predictive factors of CSS in patients diagnosed with HCC using the competing risk method and established a competing risk nomogram for individualized prediction of CSS. We also constructed a model for predicting OS. Both models achieved excellent predictive performance, which can help clinicians assess the prognosis of patients more accurately.
However, we developed nomogram models only for OS and CSS and not for other cause-specific survival (OCSS). OCSS is mainly influenced by cardiovascular disease, cerebrovascular disease, and other underlying diseases [19][20][21]; however, the SEER database lacks records of underlying diseases. Therefore, constructing a model for predicting OCSS from the existing data would be unreasonable because its predictive performance would likely be poor.
Together, the two models included nine parameters: age, race, sex, marital status, surgical therapy, chemotherapy, radiotherapy, tumour diameter, and AJCC staging. Marital status, which was included only in the OS model, was the only difference between the two models. Previous research showed that marital status is one factor affecting prognosis [22,23]; this may be because a close and cohesive family increases the likelihood of adherence to treatment, and psychological and economic support from spouses may contribute to improved survival in married patients [24][25][26]. Furthermore, several studies based on the SEER database indicated that married patients with HCC had a better prognosis [27][28][29]. However, our competing risk analysis showed that marital status was significantly associated with OS but not with CSS. Therefore, marital status appears to be associated mainly with death from other causes in patients with HCC and to have little association with cancer-specific death.
By comparing the prognostic outcomes of different surgical treatments, we found that liver transplantation remains the most effective treatment. Interestingly, patients who underwent liver transplantation had comparable 5-year cumulative incidences of cancer-specific death and death from other causes. In the whole cohort, 2,686 patients underwent liver transplantation and 784 died during follow-up, including 393 from other causes and 391 from HCC, with 5-year cumulative incidences of cancer-specific death and death from other causes of 13.4% (95%CI, 12.0-14.8%) and 10.5% (95%CI, 9.2-11.7%), respectively. Therefore, survival analysis of liver transplantation patients with HCC should consider the influence of competing events.
Although using population-based data from SEER can reduce the selection and treatment biases associated with small sample sizes or single-centre data analysis, this study had several limitations. First, this was a retrospective study. Second, although we included a large multicentre cohort, all patients were from the United States. Considering that the treatment and management of HCC may differ among countries, international multicentre studies are needed to evaluate the predictive performance of the models. Additionally, not all previously reported prognostic factors are recorded in the SEER database, such as the aetiology of HCC and other detailed treatments. For example, previous studies showed that antiviral therapy improves the prognostic outcome of patients with hepatitis B or C infection [30,31]. Including these variables may improve the predictive power of the models.

Conclusion
Overall, in this population-based study, we developed and validated nomogram models for individualized prediction of CSS and OS in patients with HCC. These simple tools can help clinicians identify high-risk groups and guide clinical decision making. For patients, the models can help answer questions raised during consultation and provide personalized prognosis assessments.

Supporting Information
Figure S1. The calibration curves for predicting the 3-year CSS and OS in the training and validation sets.