Pan-metastatic cancer analysis of prognostic factors and a prognosis-based metastatic cancer classification system

We aimed to perform a pan-metastatic cancer analysis on survival and prognostic factors and to create a prognosis-based classification system. We selected distant metastasis patients from the Surveillance, Epidemiology, and End Results (SEER) database. The associations between the characteristics of the patients at admission and overall survival were determined. A prognosis-based metastatic cancer classification was established based on the identified prognostic factors. The differences in prognosis among these categories were tested. The survival rate and prognostic factors were not consistent across cancers. Three metastatic cancer categories were generated, each with different prognoses. The prognostic differences among the categories were satisfactorily validated. Different metastatic cancer types had homogeneous and heterogeneous survival rates and prognostic factors. A prognosis-based classification system for synchronous distant metastasis cancer patients at admission was created. This classification system reflects the grade of malignancy in metastatic cancers and may guide the prediction of survival and individualized treatment. Moreover, it may have important implications for the management of synchronous metastatic cancers and aid clinicians in properly allocating medical resources to metastatic patients.


INTRODUCTION
Decades of cancer research and clinical trials have revealed genetic, epidemiological, and anatomical characteristics that have led to the development of plausible therapeutic strategies, many of which have significantly improved clinical outcomes [1]. Based on the TNM classification, physicians can conveniently predict the prognosis of cancer patients, select appropriate treatment regimens, and improve the efficiency of clinical treatment [2,3]. It is well known that distant metastasis (DM) is the main characteristic of stage IV cancer, and it accounts for 90% of cancer-related deaths in patients with clinical symptoms [4].
The prognosis of cancer patients is one of the primary factors guiding treatment. However, there has been no classification system developed to predict the prognosis of patients with DM. The anatomical system may be an excellent choice for predicting the prognosis of metastatic cancers, as they may share common pathogenic mechanisms and present similar symptoms. However, a large number of studies have suggested that different types of metastatic cancers showed both homogeneous and heterogeneous prognoses, even in the same anatomical system [5,6]. Genetics may be another approach to identify the differences in survival among metastatic cancers. However, weaknesses of this approach, including the high cost, complex detection process, and extended detection period, have resulted in the limited application of genetic techniques in the clinic. In our recently published papers, a series of factors were found to contribute to the prognosis of metastatic cancers. The identified factors provided a basis for constructing a metastatic cancer classification system [7][8][9][10][11].
Based on the previously identified prognostic factors, several systems for the evaluation of survival in patients with stage IV cancer have been widely used in different fields, such as the Diagnosis-Specific Graded Prognostic Assessment (DS-GPA) for brain metastasis [12], Tokuhashi score and Tomita score for spinal metastasis [13], and Glasgow prognostic score (GPS) for liver metastasis [14]. However, due to the limited sample size and the relatively limited cancer types, the external applicability of these tools is not satisfactory [15]. These classification tools cannot be used to distinguish the differences in survival of patients with cancers in the same or different anatomical systems.
The Surveillance, Epidemiology and End Results (SEER) database consists of 18 population-based cancer registries and has recorded DM since 2010. To date, the SEER database has recorded more than sixty cancer types and incorporated more than 10 million patients. Thus, the present study aimed to evaluate the differences among the characteristics, survival and prognostic factors in all patients with metastatic cancers, to construct a prognosis-based pan-metastatic cancer classification system, to support the implementation of different metastatic cancer management strategies and to guide physicians in the selection of individualized treatment regimens for stage IV cancer patients.

Characteristics of the included participants
A total of 291,104 metastatic cancer patients with cancer in 61 sites were included in the construction cohort in the present study. In these patients, the mean age was 67.12±13.40 years (0-113 years), 52.6% (N=153,228) were male, and 51.7% were married (N=142,757). Most of the patients were white (N=230,342, 79.3%), and 80.1% of them were insured (N=227,272). The demographic and clinical characteristics stratified by cancer site are described in Figure 1.
A total of 252,535 metastatic cancer patients were included in the validation cohort. The mean age was 66.94±13.44 years, and 52.9% were males (N=133,486). The demographic and clinical characteristics were broadly comparable between the construction and validation cohorts; however, because of the large sample sizes, some differences between the cohorts reached statistical significance (Table 1).
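The effect of sample size on significance testing can be illustrated with a minimal sketch. The example below applies Pearson's chi-square test (without continuity correction) to a 2x2 table and shows that the same one-percentage-point difference is non-significant with 1,000 patients per group but highly significant with 100,000 per group. The proportions and group sizes are illustrative assumptions and are not taken from Table 1; the function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def compare_proportions(n_per_group, p1, p2):
    """Build a 2x2 table for two equally sized groups and test it."""
    a = round(n_per_group * p1)
    c = round(n_per_group * p2)
    return chi_square_2x2(a, n_per_group - a, c, n_per_group - c)


# Illustrative (assumed) proportions: 50% versus 51% in both scenarios.
small = compare_proportions(1_000, 0.50, 0.51)
large = compare_proportions(100_000, 0.50, 0.51)
print("n=1,000 per group:   chi2=%.2f, P=%.4f" % small)
print("n=100,000 per group: chi2=%.2f, P=%.2e" % large)
```

In very large registries, statistically significant differences therefore need not be clinically meaningful, which is why the cohorts can be described as comparable despite significant test results.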
The survival rate and survival time were not consistent across cancers in different systems. DM patients with primary cancer in the respiratory system exhibited the lowest mean survival time (9.80±0.05 months) and 12-month survival rate (22.8%). DM patients with primary cancer in the lymphoma system had the highest mean survival time (47.90±2.33 months), while the female genital system had the highest 12-month survival rate (88.8%).
For different cancer types, the prognosis was not consistent. Metastatic liver cancer (mean survival time: 5.89±0.18 months; 12-month survival rate: 12.3%), gallbladder cancer (mean survival time: 6.95±0.27 months; 12-month survival rate: 14.6%) and pancreatic cancer (mean survival time: 7.00±0.08 months; 12-month survival rate: 15.1%) had the shortest survival times and lowest survival rates of all cancer sites. Metastatic testicular cancer had the highest mean survival time of 54.0±0.75 months, but metastatic carcinoma of the female genital system had the highest 12-month survival rate (88.8%).

Prognostic factors for different metastatic cancers
Multivariable Cox regression showed that advanced age, male sex, white race, poorly differentiated grade, higher T stage, higher N stage, and bone, brain, lung, and liver metastases were all positively associated with overall mortality. Married status, insured status, and surgery at the primary site were all negatively related to overall mortality. The associations between the factors mentioned above and overall survival were not consistent across anatomical systems or cancer types. All of these factors were associated with survival in metastatic lung and bronchus cancer, whereas none of them was associated with survival in metastatic cancers of other digestive organs or the penis. Even in the same system, the factors associated with metastatic cancer in different sites were not consistent (Figure 3).

Prognosis-based metastatic cancer classification
Unsupervised hierarchical clustering analysis was used to classify the 61 cancer sites into three main subgroups, namely, categories A, B, and C. The category A metastatic cancer subgroup had the worst prognosis and included intrahepatic bile duct cancer, stomach cancer, oesophageal cancer, urinary bladder cancer, other biliary cancer, lung and bronchus cancer, mesothelioma, other endocrine (including thymus) cancer, uterus cancer, ureter cancer, lip cancer, liver cancer, pancreatic cancer, gallbladder cancer, and large intestine cancer (Figure 4A). The category C metastatic cancer subgroup had the best prognosis and included metastatic NHL-extranodal cancer, testis cancer, other female genital organ cancer, appendix cancer, prostate cancer, and other digestive organ cancers. Details about the categories across different anatomical systems are provided in Supplementary Table 1. The Kaplan-Meier method showed that the mean survival times for the A, B, and C metastatic cancer  Figure 4C).
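A comparison of survival between prognostic categories with the log-rank test can be sketched as follows. This is a minimal two-group version written for illustration only; the study compared three categories, which requires the multi-group form of the test. The function name and the follow-up data are hypothetical and do not come from the SEER cohort.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events: 1 = death observed, 0 = censored."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t (overall and in group A).
        n = sum(1 for u, _, _ in pooled if u >= t)
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        # Deaths occurring exactly at time t (overall and in group A).
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # the hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    stat = (observed_a - expected_a) ** 2 / variance
    # The statistic follows a chi-square distribution with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical follow-up times in months; every patient has an observed death.
poor = [1, 2, 3, 4, 5]
good = [6, 7, 8, 9, 10]
stat, p_value = logrank_two_groups(poor, [1] * 5, good, [1] * 5)
print("log-rank chi2=%.2f, P=%.4f" % (stat, p_value))
```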

DISCUSSION
In this study, a comprehensive pan-metastatic cancer analysis was conducted to evaluate survival and to identify prognostic factors for stage IV cancer. Notably, different metastatic cancers had distinct prognoses, even in the same anatomical system.
Figure 4. [...] these three categories in the construction cohort (B) and validation cohort (C). All 61 metastatic cancer types were sub-grouped into three categories (A-C), and the Kaplan-Meier analysis suggested that there were significant differences in prognoses among these categories. Additionally, the survival differences among these categories were validated in the validation cohort.
different prediction of prognosis according to the primary cancer and the metastatic site. The present study can be the foundation for the formulation of an individualized evaluation system for stage IV cancer.
For the first time, based on a large population from the SEER database, we summarized all the prognostic factors in various systems and cancer types for stage IV cancer. The identification of prognostic factors in stage IV cancer patients is a major concern in DM screening and individualized treatment. In the present study, advanced age, male sex, white race, poorly differentiated grade, higher T stage, higher N stage, and bone, brain, lung, and liver metastases were positively associated with overall mortality. Married status, insured status, and surgery at the primary site were all negatively associated with overall mortality. Previously, some prognostic factors in certain cancers were reported [16][17][18]. A recent study, based on a single-centre population, reported that extracranial metastases and Karnofsky performance status were independent prognostic factors in colorectal cancer patients with brain metastasis [19]. Another study focused on bone metastases of hepatocellular carcinoma reported a series of prognostic factors, including Child-Pugh class A group, alpha-fetoprotein level more than 30 ng/mL, and higher T stage (>5 cm) [20]. Based on 202 lung cancer patients with bone metastasis, another study reported that age (<60 years), non-small-cell lung cancer pathology type, chemotherapy for bone metastasis, and radiation therapy for bone metastasis were independent favourable prognostic factors [21]. Thus, as indicated by our results in each system and cancer type (Figure 3), the prognostic factors are both homogeneous and heterogeneous. To precisely predict the survival of stage IV cancer patients, studies identifying specific prognostic factors in different stage IV cancers should be performed.
In addition, based on the survival analysis in the pan-metastatic cancer cohort, we initially classified all cancers with DM into three subgroups. To the best of our knowledge, the present classification is the first pan-metastatic cancer prognosis-based system for stage IV cancer. Currently, TNM staging has been widely accepted as one of the main tools for evaluating cancer patients. With medical developments and improved survival in cancer patients, the number of patients with DM has been increasing. The present study suggests that there are different survival rates in various cancers with DM, which is supported by evidence from previous studies [5,22]. Thus, among cancer patients in the M1 stage, limited guidance can be provided by the TNM stage regarding the selection of the appropriate treatment. Further classification of patients with M1 stage disease is warranted. Currently, to predict the survival of cancer patients with stage IV disease, most physicians and researchers have classified patients based on the anatomical system. However, such classification was proven to be inaccurate in the present study. We hypothesize that different histological types of cancer are heterogeneous within the same anatomical system or even within the same cancer type. Different histological types may have different prognoses [23][24][25]. In the present study, the constructed classification system was shown to reflect the grade of malignancy of metastatic cancer and may offer important survival information that can be used to guide the formulation of a survival prediction scoring system and treatment selection for stage IV cancer patients.
Synchronous metastasis is generally defined as distant metastasis diagnosed together with the primary cancer, whereas metachronous metastasis is usually defined as metastasis occurring some time after treatment. Patients with synchronous metastasis have been reported to have more adverse prognostic features, a significantly shorter time to treatment failure, and poorer survival than those with metachronous metastasis [26]. In a recent study, the timing of metastases after the initial diagnosis affected the outcome of targeted therapy in cancer [26]. However, few studies have explored the potential mechanisms underlying the differences between synchronous and metachronous metastasis. Thus, more studies and trials are needed in the future.
At the same time, with the increase in the cost of cancer therapy, issues related to medical resource allocation and medical insurance decisions have become global concerns [27,28]. The constructed classification system can help medical officials manage metastatic cancer and distribute medical resources for stage IV cancer patients. In addition, with the identified prognostic factors for all cancers, the value of treatment options for metastatic cancer can be considered when medical insurance policies are generated.
For these three different classifications, only the distribution of the association between male sex and overall survival was significantly different among categories A, B, and C (Table 2). However, we did not find any obvious rules for the other prognostic factors in different categories. This may be explained by the fact that this metastatic cancer classification system was only based on the prognosis of the cancers, not the pathogenesis.
There were some limitations of our study. First, DM was recorded only in the bone, liver, lung, and brain in the SEER database. Metastasis to other sites was not recorded, which may have biased the survival analysis. Second, the present study analysed the associations between overall survival and the characteristics of patients with synchronous metastasis at admission. The occurrence of metastasis during follow-up, namely, metachronous metastasis, was not investigated, and the results may have been affected. Thus, the results should be interpreted with caution, and more studies are needed to further validate their application. Third, because of the lack of detailed costs for the patients, the present study could not further analyse cost-effectiveness using the constructed classification based on the pan-metastatic cancer cohort. Moreover, due to the lack of a large cohort focused on DM in cancer patients, the validity of the prognosis-based classification system still needs to be further externally tested.
In summary, this nationwide, population-based study comprehensively analysed pan-metastatic cancer survival and identified prognostic factors in patients with all stage IV cancers at admission. The present study suggests that the survival of patients with synchronous distant metastasis is both homogeneous and heterogeneous. A series of prognostic factors in stage IV cancer patients were identified; advanced age, male sex, white race, poorly differentiated grade, higher T stage, higher N stage, and bone, brain, lung and liver metastases were positively associated with overall mortality. The prognostic factors in various systems and cancer types were both homogeneous and heterogeneous. Based on the different survival of stage IV cancer patients, all metastatic cancers were divided into three subgroups. This classification reflects the grade of malignancy of metastatic cancer and may offer important survival information that can be used to guide the formulation of a survival prediction system and the selection of appropriate treatments. Moreover, the constructed classification system can help medical officials manage synchronous distant metastatic cancers and properly allocate medical resources for stage IV cancer patients.

Study population
This study used a metastatic cancer case cohort derived from the National Cancer Institute SEER  Figure 5).

Ethics statement
Cancer is a reportable disease in every state of the United States, and use of the data in the SEER database does not require informed patient consent. The present study complied with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Statistical analysis
Normally distributed data, such as age, are described as the means ± standard deviations (SDs). The mean and median survival of the patients are described as the survival time with 95% confidence intervals (CIs).
Categorical data, such as sex, are presented as numbers and percentages (N, %), and the differences between groups were tested by Pearson's chi-square test or the rank-sum test. The Kaplan-Meier method was used to investigate the 1-, 3-, 6- and 12-month survival rates and the mean and median survival of patients with metastatic cancer at various sites. Univariable Cox regression was used to investigate the potential factors associated with the overall survival of the cancer patients, and the factors with P-values smaller than 0.1 were incorporated into the multivariable Cox regression model.
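The Kaplan-Meier estimation of survival rates at fixed time points can be sketched as follows. This is a minimal product-limit implementation for illustration, not the SPSS procedure used in the study; the function names and the ten follow-up records are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns [(t, S(t))] at each distinct death time.

    times: follow-up in months; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every record sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


def survival_at(curve, month):
    """Survival probability at a given month (step function, right-continuous)."""
    s = 1.0
    for t, value in curve:
        if t > month:
            break
        s = value
    return s


# Hypothetical follow-up data (months) for ten patients.
times = [1, 2, 2, 4, 6, 6, 9, 12, 14, 20]
events = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for month in (1, 3, 6, 12):
    print("%2d-month survival rate: %.1f%%" % (month, 100 * survival_at(curve, month)))
```

Censored patients contribute to the risk set until their last follow-up, which is why the estimate differs from a simple proportion of patients alive at each time point.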
Unsupervised hierarchical clustering analysis was performed using the squared Euclidean distance method based on the patients' demographic, clinical and prognostic features, including age; sex; race; marital status; insurance; differentiation grade; T stage; N stage; surgery; bone, brain, liver and lung metastases; 1-, 3-, 6-, and 12-month survival rates; and mean survival. Tree cluster analysis was performed to classify the metastatic cancer sites into categories A, B, and C. Kaplan-Meier analysis was performed to determine the prognosis of the category A, B, and C metastatic cancer subgroups, and differences were identified with the log-rank test. Moreover, metastatic cancer patients who were diagnosed between 2005 and 2009 were used for the validation of the classification system. Two-tailed P-values <0.05 were considered statistically significant. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 23.0 software package for Windows (SPSS version 20.0, IBM, Inc.).
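The clustering step can be sketched as follows. This is a minimal agglomerative procedure using the squared Euclidean distance and stopping at three clusters, which are then labelled A (worst) to C (best) by mean survival. The linkage rule is not stated in the text, so average (between-groups) linkage is an assumption here; the site names, the two features per site, and their values are hypothetical, and only two of the listed features are used for brevity.

```python
def squared_euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def average_linkage(cluster_a, cluster_b, points):
    """Mean squared Euclidean distance over all cross-cluster pairs (assumed linkage)."""
    total = sum(
        squared_euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                dist = average_linkage(clusters[x], clusters[y], points)
                if best is None or dist < best[0]:
                    best = (dist, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical sites: (12-month survival rate in %, mean survival in months).
sites = ["site_1", "site_2", "site_3", "site_4", "site_5", "site_6"]
features = [(12.0, 6.0), (15.0, 7.0), (40.0, 18.0), (45.0, 20.0), (85.0, 50.0), (88.0, 54.0)]

clusters = agglomerative_clusters(features, 3)


def mean_survival(cluster):
    return sum(features[i][1] for i in cluster) / len(cluster)


# Label clusters A (shortest mean survival) to C (longest), as in the classification.
ranked = sorted(clusters, key=mean_survival)
categories = {
    sites[i]: label for label, cluster in zip("ABC", ranked) for i in cluster
}
print(categories)
```

In practice, features measured on different scales would usually be standardized before the distances are computed; that step is omitted here to keep the sketch short.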

AUTHOR CONTRIBUTIONS
XW, WM, and CZ designed the study. GX and YX collected the data. CZ, GX, and YX analysed the data. HW, XG, MM, YB, and GW organized the manuscript. VP. B, VP. C, and KP reviewed the papers and revised the manuscript. All the authors have read and approved the final manuscript. All authors contributed to the data analysis, manuscript drafting, and manuscript revision and agree to be accountable for all aspects of the work.