Development and validation of a prognostic model to predict the prognosis of patients who underwent chemotherapy and resection of pancreatic adenocarcinoma: a large international population-based cohort study

Background
Pancreatic cancer (PaC) remains extremely lethal worldwide even after resection. PaC resection rates are low, making prognostic studies in resected PaC difficult. This large international population-based study aimed at exploring factors associated with survival in patients with resected TNM stage I–II PaC receiving chemotherapy and at developing and internationally validating a survival-predicting model.

Methods
Data of stage I–II PaC patients resected and receiving chemotherapy in 2003–2014 were obtained from the national cancer registries of Belgium, the Netherlands, Slovenia, and Norway, and the US Surveillance, Epidemiology, and End Results (SEER)-18 Program. Multivariable Cox proportional hazards models were constructed to investigate the associations of patient and tumor characteristics with overall survival, and analysis was performed in each country respectively without pooling. Prognostic factors remaining after backward selection in SEER-18 were used to build a nomogram, which was subjected to bootstrap internal validation and external validation using the European datasets.

Results
A total of 11,837 resected PaC patients were analyzed, with median survival time of 18–23 months and 3-year survival rates of 21–31%. In the main analysis, patient age, tumor T stage, N stage, and differentiation were associated with survival across most countries, with country-specific association patterns and strengths. However, tumor location was mostly not significantly associated with survival. Resection margin, hospital type, tumor size, positive and harvested lymph node number, lymph node ratio, and comorbidity number were associated with survival in certain countries where the information was available. A median survival time- and 1-, 2-, 3-, and 5-year survival probability-predictive nomogram incorporating the backward-selected variables in the main analysis was established. It fits each European national cohort similarly well. Calibration curves showed very good agreement between nomogram-prediction and actual observation. The concordance index of the nomogram (0.60) was significantly higher than that of the T and N stage-based model (0.56) for predicting survival.

Conclusions
In these large international population-based cohorts, patients with resected PaC receiving chemotherapy have distinct characteristics independently associated with survival, with country-specific patterns and strengths. A robust benchmark population-based survival-predicting model is established and internationally validated. Like previous models predicting survival in resected PaC, our nomogram performs modestly.

Electronic supplementary material
The online version of this article (10.1186/s12916-019-1304-y) contains supplementary material, which is available to authorized users.

Patients with resected PaC who undergo chemotherapy are a selected group of all PaC patients and have distinct characteristics [8]. Even within this patient group, survival is heterogeneous. A prognostic model for this specific patient population is important and desirable: it could facilitate clinical counseling by informing both patients and doctors of predicted individualized patient survival, guide plans on follow-up and surveillance, aid survival stratification in international studies, and offer the baseline survival estimates for further molecular or genetic investigations. Furthermore, for resected patients considering subsequent chemotherapy, the predicted results could potentially encourage a proportion of patients with specific characteristics to further receive the standard postsurgical care. Stage is the major prognostic factor for PaC. Notably, survival of patients with disease of the same TNM stage might vary greatly [14]. Other prognostic factors such as patient age and tumor differentiation could improve individualized survival prediction. A model incorporating all these factors can be intuitively illustrated using a nomogram [22]. Apart from two institutional nomograms predicting postsurgical survival in overall patients [23,24], we found no robust, internationally validated, population-based survival-predicting models specifically for resected PaC patients receiving chemotherapy.
To our knowledge, we herein report the first large international population-based investigation into factors associated with survival in patients with resected TNM stage I-II PaC receiving chemotherapy. We further construct a population-based survival-predicting model with international validations.

Patients
Population-based data of resected PaC patients were obtained from the national cancer registries of Belgium, the Netherlands, Slovenia, and Norway, and the US Surveillance, Epidemiology, and End Results (SEER)-18 [25] database. Data quality was previously described [3]. Institution-based data were not included due to the highly selected patients. An extensive attempt was made to contact population-based cancer registries, and the contacted registries together with reasons for exclusion are shown in Additional file 1: Table S1. The participating European national registries, located in Western, Southern, and Northern Europe, were those able to provide quality data according to a standardized, uniform data-request form, which ensured the robustness of the results. All variables were uniformly (re)coded across registries. While there were other national population-based registries, they were not always able to provide eligible treatment, TNM staging, or survival data. All patient-level data were anonymous. This real-world observational study was approved by the Ethics Committee of Medical Faculty Heidelberg.
Patients with diagnosis based on death certificate only (DCO)/autopsy or with unknown/obscure follow-up time or vital status were excluded (Additional file 1: Table S2). Only patients with microscopically confirmed diagnoses of primary invasive TNM stage I-II adenocarcinomas of the exocrine pancreas who underwent surgical resection in 2003 until 2014 were selected. The time period was selected based on data availability and the fact that the fifth and prior editions of TNM staging were incompatible with the sixth/seventh versions used during 2003-2017 [7]. Since chemotherapy is standard for patients with resected PaC [5][6][7], we only included those receiving chemotherapy. Individuals with benign/premalignant tumors, non-PaC neoplasms involving the pancreas, neuroendocrine tumors/carcinoids, cystic/mucinous/serous tumors, acinar cell tumors, stromal tumors, sarcomas, germ-cell neoplasms, lymphomas, or peri-ampullar tumors were also excluded (Additional file 1: Table S3). To minimize the effect of the potential heterogeneity in surgery quality and perioperative care, we excluded cases surviving < 3 months. Patients with stage III or IV PaC were also excluded since resection is not routinely recommended for these patients [5][6][7].
Information on demographic (sex and age), clinical (year of diagnosis/surgery and treatment), and pathologic characteristics (topology, morphology, and TNM stage) was retrieved from all participating countries. Data on resection margin (the Netherlands and Slovenia), hospital type (Belgium and the Netherlands), Eastern Cooperative Oncology Group (ECOG) performance status score (Belgium), comorbidities (Eindhoven, the Netherlands), resection type (the US and the Netherlands), tumor size (the US), and positive and harvested lymph node numbers (the US and the Netherlands) were only available in certain registries.
Resection was defined as surgical removal of primary tumor, regardless of being curative or palliative and extents of excision and lymphadenectomy. Tumor topography and morphology were based on the International Classification of Diseases for Oncology (third edition). Stage was defined following the TNM staging system (sixth/seventh edition) and was a combination of pathologic and clinical stages with priority given to pathologic staging. Lymph node ratio was calculated by dividing the number of positive lymph nodes by the number of harvested lymph nodes. Vital status was based on valid national mortality registrations and official population registers.
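The lymph node ratio defined above is a simple quotient. As a minimal illustrative sketch (the function name and the input checks are assumptions introduced here, not part of the registries' coding rules), it could be computed as follows:

```python
def lymph_node_ratio(positive_nodes, harvested_nodes):
    """Positive lymph nodes divided by harvested lymph nodes.

    The validation rules below are illustrative assumptions; the source
    only states the definition of the ratio itself.
    """
    if harvested_nodes <= 0:
        raise ValueError("at least one harvested lymph node is required")
    if not 0 <= positive_nodes <= harvested_nodes:
        raise ValueError("positive nodes must lie between 0 and the harvested count")
    return positive_nodes / harvested_nodes


# Hypothetical patient: 3 positive nodes among 15 harvested nodes.
ratio = lymph_node_ratio(3, 15)
```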

Statistical analyses
Data in each country were analyzed separately without pooling, considering the potential heterogeneity across countries and to avoid the impact of any single large cohort. Descriptive results were reported as the smallest to the largest proportions for categorical variables or medians/means for continuous variables across countries. The cancer incidence rates by sex in each country were retrieved from the Cancer Incidence in Five Continents Volume XI (CI5 XI) by the International Agency for Research on Cancer (IARC), World Health Organization (WHO) (http://ci5.iarc.fr/CI5-XI/Default.aspx), which reports the incidence of cancers diagnosed from 2008 to 2012, standardized to the World (WHO 2000-2025) Standard Population.
The Kaplan-Meier method was applied to calculate survival time and rates. Since patients surviving < 3 months were excluded in this study, the 6- and 9-month survival was calculated as the short-term outcome. The 1-, 2-, 3-, and 5-year survival was computed as the long-term outcome. To assess the independent impact of potential prognostic factors on survival, Cox proportional hazards regression was used. Variables including year of diagnosis, age, sex, tumor location, T and N stages, and differentiation were included as covariates in the main multivariable models. For complete-case analysis, patients with missing data were excluded in multivariable analyses. In the US, results for the white patients were computed for comparison with the total patients, for whom main analyses were performed. In registries with available information, resection margin, hospital type, tumor size, positive and harvested lymph node numbers, lymph node ratio, T and N stages according to the eighth edition following Kamarajah et al. [26], ECOG score, resection type, and comorbidities were incorporated one by one into the main models to examine the survival association for each of them. The proportional hazards assumption was verified for all variables by plotting the logarithm of the negative logarithm of the survival function against the logarithm of survival time [27].
Data were centrally analyzed in the German Cancer Research Center. Results were considered statistically significant at two-sided P < 0.05. Analyses were conducted using the SAS software (version 9.4, SAS Institute Inc.).
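The study's analyses were run in SAS. Purely as an illustration of the Kaplan-Meier estimate and of the log(-log S(t)) versus log(t) transformation used for the proportional hazards check, the following pure-Python sketch may help. The follow-up times and event indicators are hypothetical, and the function names are introduced here for illustration only.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times: follow-up in months; events: 1 = death observed, 0 = censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk  # product-limit step
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None  # median not reached


def log_minus_log(curve):
    """Points (log t, log(-log S(t))) for a visual proportional hazards check."""
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0.0 < s < 1.0]


# Hypothetical cohort of eight patients (months of follow-up, death indicator).
times = [4, 7, 9, 12, 12, 18, 24, 30]
events = [1, 1, 0, 1, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Plotting the `log_minus_log` points for each level of a covariate and checking that the curves are roughly parallel is the graphical check described in the text; the plotting step itself is omitted here.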

Nomogram construction and validation
The SEER-18 dataset, the largest of the included datasets, was used as the training set for nomogram construction (models based on the other cohorts did not reveal markedly better performance). Age, sex, tumor location, T and N stages, and differentiation were entered as potentially relevant prognostic factors into the initial full multivariable Cox proportional hazards regression model, and the final model was selected through a backward step-down process using the likelihood ratio test with the Akaike information criterion as a stopping rule [28]. To permit nonlinear associations, continuous variables were modeled using restricted cubic splines where appropriate [28]. Points assigned to each variable included in the nomogram to predict the median survival time and 1-, 2-, 3-, and 5-year survival probability were proportional to the effect size of that variable in the final multivariable model. To facilitate clinical use, a corresponding online prognostic tool was created with Evidencio (https://www.evidencio.com/).
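To illustrate what "points proportional to the effect size" can mean in practice, the sketch below rescales each variable's contribution to the Cox linear predictor so that the variable with the widest range spans 0 to 100 points, which is a common nomogram convention. The coefficients are hypothetical placeholders, not the fitted values of this study, and the function name is an assumption introduced here.

```python
def nomogram_points(linear_terms, max_points=100.0):
    """Convert linear-predictor contributions into nomogram points.

    linear_terms: {variable: {level: contribution to the linear predictor}}.
    The variable with the largest range of contributions receives
    max_points for its highest-risk level; all others scale proportionally.
    """
    ranges = {var: max(levels.values()) - min(levels.values())
              for var, levels in linear_terms.items()}
    widest = max(ranges.values())
    if widest <= 0:
        raise ValueError("at least one variable must have a non-zero effect")
    scale = max_points / widest
    points = {}
    for var, levels in linear_terms.items():
        baseline = min(levels.values())  # lowest-risk level gets 0 points
        points[var] = {lvl: (beta - baseline) * scale
                       for lvl, beta in levels.items()}
    return points


# Hypothetical log-hazard contributions (NOT the study's estimates).
hypothetical_terms = {
    "N stage": {"N0": 0.0, "N1": 0.4},
    "differentiation": {"well": 0.0, "moderate": 0.2, "poor": 0.5},
}
points = nomogram_points(hypothetical_terms)
```

A patient's total score is the sum of the points across variables, which is then mapped to a predicted survival probability through the baseline survival function of the fitted model.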
The nomogram was subjected to 1000 bootstrap resamples for internal validation of the training US cohort and was externally validated using the European datasets to assess the international generalizability of the model. The model performance and discrimination ability for predicting survival was numerically evaluated by computing Harrell's concordance index (C-index) [28]. Comparison of C-indexes of different models followed Hanley et al. [29]. Calibration of the nomogram for 1-, 2-, 3-, and 5-year survival was done by comparing the predicted with the observed survival. Bootstrapping was used for bias correction [28].
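For readers unfamiliar with Harrell's C-index, the following minimal sketch shows the pairwise definition: among usable pairs (the patient with the shorter follow-up had an observed death), it counts how often the model assigns the higher risk to the patient who died earlier. This is an illustrative pure-Python version with hypothetical data; the study itself used R packages, whose handling of ties may differ in detail.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index.

    times: follow-up times; events: 1 = death observed, 0 = censored;
    risk: model risk score, where higher means shorter expected survival.
    Pairs with tied risk scores count as half concordant; pairs with tied
    follow-up times are skipped (a simplifying assumption of this sketch).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: four patients, one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
c_perfect = harrell_c(times, events, [4, 3, 2, 1])
c_random = harrell_c(times, events, [1, 1, 1, 1])
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect ranking.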
In sensitivity analyses for the training US cohort, C-indexes were re-calculated after replacing continuous age with age group, N stage with positive lymph node number or lymph node ratio, and sixth/seventh edition of cancer stages with the eighth version, after adding harvested lymph node number and/or tumor size, after limiting patients to those diagnosed after 2009 or white patients, and after stratifying patients by tumor location. The survival and rms packages in R 3.4.1 (http://www.rproject.org) were used.

Table S2). The detailed counts and frequencies for discrete variables and medians and interquartile ranges for continuous variables are shown in Table 1. Age-standardized PaC incidence was higher for males than for females. Among the participating countries, incidence for males was lowest in Belgium and the Netherlands (7.4 per 100,000) and highest in Slovenia (9.3 per 100,000); for females, incidence was lowest in Belgium (5.6 per 100,000) and highest in the US, Slovenia, and Norway (6.5 per 100,000).

Survival outcomes
The median survival time ranged from 18 (Slovenia) to 23 months (the US) across countries (Fig. 1). The short- and long-term survival outcomes are shown in Table 2. The 6-month survival rate ranged from 94% (the US) to 97% (the Netherlands), and the 9-month survival rate varied from 79% (Slovenia) to 90% (Norway). Regarding longer term outcomes, the 1-year survival rate ranged from 69% (Slovenia) to 79% (the Netherlands), and the 3-year survival rate ranged from 21% (Slovenia) to 31% (the US). The 5-year survival rate was lowest in Slovenia (10%), which was about half of that in the US (19%), the Netherlands (20%), or Norway (21%).

Survival-associated factors
Results from multivariable Cox regression are shown in Table 3, and only significant results are described. Increasing age was associated with worse survival in the US (HR per year = 1.01), Belgium (HR = 1.02), and Norway (HR = 1.04). Survival was significantly worse in men only in the US (HR = 1.10) and in pancreas body compared to head tumors in Norway (HR = 2.67). Compared to T3 cancers, T1 cancers were associated with higher survival in all investigated countries (HR = 0.17-0.70), while T2 cancers were associated with better survival only in the US (HR = 0.86). Negative nodal status was associated with significantly higher survival in the US (HR = 0.65), Belgium (HR = 0.78), and the Netherlands (HR = 0.51). Better differentiation was significantly associated with higher survival in all countries except Slovenia and Norway, and the HRs for well- and intermediately versus poorly/undifferentiated tumors were 0.48-0.68 and 0.61-0.81, respectively. Association patterns and strengths were similar between white and overall US patients.

Prognostic nomogram

Construction
A nomogram incorporating prognostic factors remaining after backward selection in the US (sex, age, T and N stages, and differentiation) was established (Fig. 2a). The nomogram illustrated age and differentiation to have the largest contributions to prognosis. T and N stages showed moderate impacts on survival. Each number/category of these variables is assigned a score on the  Table 5. The layout of an online version of the nomogram is shown in Fig. 3.

An example of use
An example of how to use the nomogram is shown in Fig. 2b. A 72-year-old woman with poorly differentiated, T2N1M0 PaC who underwent resection and chemotherapy would have 28 points for her age, 0 points for her sex, 53 points for T stage, 82 points for N stage, and 97 points for differentiation, totaling 260 points. The total points correspond to the estimation of median survival time of < 20 months, a 1-year survival probability of 72%, a 2-year survival probability of 38%, a 3-year survival probability of 22%, and a 5-year survival probability of 12%, which are consistent with the results generated by the online tool (Fig. 3).
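The arithmetic of this worked example can be reproduced in a few lines. The point values and predicted probabilities below are those quoted in the text for this one patient; the dictionary layout is only an illustrative assumption and does not encode the full nomogram.

```python
# Points quoted in the text for the example patient (Fig. 2b).
example_points = {
    "age (72 years)": 28,
    "sex (female)": 0,
    "T stage (T2)": 53,
    "N stage (N1)": 82,
    "differentiation (poor)": 97,
}
total_points = sum(example_points.values())

# Predictions quoted in the text for this total score.
predicted_survival = {"1-year": 0.72, "2-year": 0.38, "3-year": 0.22, "5-year": 0.12}

print(f"Total points: {total_points}")
for horizon, prob in predicted_survival.items():
    print(f"{horizon} survival probability: {prob:.0%}")
```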

Calibration and validation
The nomogram was applied to the US and the European countries for internal and external validations, respectively. The calibration plots showed very good agreement between nomogram-predicted and actual survival in the US, Belgium, and the Netherlands (Fig. 4; plots are not shown for Slovenia or Norway, where the case number was too small to generate meaningful calibration). Generally, the calibration was best for 2- and 3-year survival. In the training US cohort, the C-index for the established nomogram was significantly higher than that for the model based on both T and N stages (0.60, 95% CI = 0.59-0.61 vs. 0.56, 95% CI = 0.56-0.57). In the validation cohorts, C-indexes were also significantly higher for the nomogram than for the T and N stage-based model (Table 6).

Sensitivity analyses
Sensitivity analyses were performed for the training US cohort (Table 6). Using positive lymph node number or lymph node ratio instead of N stage in the nomogram did not obviously change the C-index (by 0.00 and + 0.01, respectively). Replacing the sixth/seventh version of both T and N stages with the eighth version also had minimal impact on the C-index (by + 0.01). After including examined lymph node number, tumor size, or both, the C-index only changed by 0.0, + 0.01, and + 0.01, respectively. Limiting the sample to patients diagnosed after 2009 or to white patients did not change the C-index.
Within subgroups according to tumor location, C-index was slightly higher than the overall one in body/tail cancer (0.61).

Discussion
In our large population-based study, we identified various factors independently associated with survival after resection of PaC and for the first time established and internationally validated a population-based nomogram for predicting survival in resected PaC patients receiving chemotherapy, which is robust, accurate, reliable, and practical. However, like previous models [23,24,30-32], our model had a modest C-statistic. There are various reports on the prognostic factors for patients who underwent resection for PaC [9-15]. A systematic review showed that with the exception of postsurgical blood transfusion, tumor characteristics (e.g., size, lymph node status, and differentiation) were the only features significantly associated with survival after pancreatic resection [9]. Particularly, PaC size > 2 cm was an independent factor associated with poor post-surgical prognosis [10], and this category has been incorporated in both the sixth/seventh and the eighth TNM staging systems [33,34]. Notably, neural invasion was also determined to be an independent prognostic factor in PaC [11]. Through multivariable analyses, we demonstrated that older age, more advanced T and N stages, and poorer differentiation were independently associated with lower overall survival in resected PaC across most countries.

Table footnote: The main Cox proportional hazard regression models adjusted for year of diagnosis, age, sex, tumor location, T, N, and M stages, histology, and differentiation. HRs were calculated after N stage was replaced by metastatic node number (group) or lymph node ratio, or after the other investigated variables were included one by one into the main models. Statistically significant HRs are shown in italics. HR, hazard ratio; CI, confidence interval; LN, lymph node; ECOG, Eastern Cooperative Oncology Group; −, not available.
In registries with available information, resection margin, hospital type, tumor size, metastatic and harvested lymph node numbers, lymph node ratio, and comorbidity number were also associated with prognosis. These findings are mostly consistent with previous literature [9-15, 35, 36] and add insights into the association strengths for resected PaC patients receiving chemotherapy at the population level and into the comparisons between countries. Some patient (e.g., age and comorbidities) and clinical characteristics (e.g., hospital type) were further identified to be prognostically significant. While previous studies have drawn differing conclusions regarding the association between resection type and survival [35,36], our population-based investigation of chemotherapy-treated resected PaC patients did not show a significant association. Furthermore, we found mostly no significant associations between tumor location and survival. Notably, overall, the contribution of T or N stage to postoperative survival was mostly not greater than that of differentiation. Categorization of tumor size and number of metastatic lymph nodes following the eighth TNM staging system [33,34] discriminated survival well, supporting the implementation of the new system. Notably, harvested lymph node number was positively associated with survival. The relevance of harvested lymph node number for survival has remained controversial in PaC [37,38]. Possible reasons supporting the positive association include that more metastasized lymph nodes may be removed with more extensive sampling, which also results in more precise staging, guiding appropriate post-surgical treatment.
Estimating mortality risk might impact treatment planning and provide information helpful for patient stratification in study design, contributing to better equivalence between study arms [39]. Post-surgical survival for patients with PaC is remarkably heterogeneous, even with the same TNM stage [14,40,41]. To our knowledge, the nomogram we developed is the first one derived from a large population-based database with long-term follow-up for predicting overall survival in patients with resected stage I-II PaC receiving chemotherapy, with international validations in multiple European national datasets. There is a previous institutional nomogram [23] developed by Memorial Sloan-Kettering Cancer Center (MSKCC) in 2004 for predicting post-surgical survival in Western PaC patients not accounting for chemotherapy, with three external institutional validation attempts [30-32]. Based on institutional patient cohorts diagnosed many years ago [23,30-32], the score assignment of several variables might not be optimal currently using the MSKCC nomogram, which might also be limited in generalizability. The MSKCC nomogram did not employ a backward selection process and incorporated some detailed surgical (e.g., portal vein resection and splenectomy) and symptom parameters (back pain and weight loss).
Notably, portal vein resection and splenectomy might not be routine procedures during pancreatectomy, and reporting of symptoms might show great interpersonal variations. Our population-based nomogram thus represents a more updated prognostic model compared to the MSKCC nomogram (Additional file 1: Table S5). The wide geographical distribution of patients and large sample size further enhanced the international representativeness and generalizability of our nomogram. Resection margin, which reflects the radicality of surgery, has not received a universal standard definition in PaC [42,43], and its relevance for survival remains highly controversial [44,45]. While we showed a positive association of survival with negative margin in the Netherlands, the strength was not greater than that of the association with T stage, N stage, or differentiation. We did not incorporate this variable in our nomogram for better generalizability. It is encouraged to incorporate margin status into our nomogram when a standard definition comes.
Calibration plots demonstrated very good agreement between nomogram-predicted and actual survival, which assures the repeatability and reliability of our nomogram. Importantly, the model based on the US dataset also fits the multiple European national cohorts, which supports the potential for the generalization and international utilization of our nomogram, irrespective of the potential health care disparity across countries. Discrimination of the nomogram, as highlighted by the C-index, was significantly and markedly higher compared to the model based on T and N stages only. In the external validation cohorts, the discriminative potency only slightly changed. Our model performed similarly well across countries, potentially facilitating patient allocation in international studies.
In sensitivity analyses, we examined various alternative models by for instance incorporating positive lymph node number or lymph node ratio as a continuous variable in place of N stage into the nomogram, and the discrimination ability basically remained the same, supporting the robustness of our model.
Notably, the eighth edition of TNM staging system has been implemented since 2018 [33,34]. Compared to the sixth/seventh version, in the eighth version new categories of tumor size (≤ 2, 2-4, and > 4 vs. ≤ 2 and > 2 cm) and positive node number (0, 1-3, and ≥ 4 vs. 0 and ≥ 1) are incorporated into T and N staging, respectively [26,33,34]. However, after integrating these factors either as continuous or corresponding categorical variables into our nomogram, the performance did not markedly change. After transforming the SEER-18 staging data according to the eighth edition following Kamarajah et al. [26], the performance also remained very similar. Moreover, it will take considerable follow-up time for the survival associated with the new staging system to be adequately assessed. Therefore, our nomogram will still be applicable without compromised accuracy in the coming years.
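The size and node-count cut-offs quoted above can be expressed as two small lookup functions. This is a simplified sketch: the category labels T1-T3 and N0-N2 follow the eighth-edition naming, T4 (which is defined by arterial involvement rather than size) is deliberately not handled, and the function names are assumptions introduced here. It is not the conversion procedure of Kamarajah et al. [26] used in the study.

```python
def t_category_by_size(size_cm):
    """Size-based T category using the eighth-edition cut-offs (<= 2, 2-4, > 4 cm).

    T4 is not size-based and is outside the scope of this sketch.
    """
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    if size_cm <= 2:
        return "T1"
    if size_cm <= 4:
        return "T2"
    return "T3"


def n_category_by_positive_nodes(positive_nodes):
    """N category using the eighth-edition cut-offs (0, 1-3, >= 4 positive nodes)."""
    if positive_nodes < 0:
        raise ValueError("positive node count cannot be negative")
    if positive_nodes == 0:
        return "N0"
    if positive_nodes <= 3:
        return "N1"
    return "N2"


# Hypothetical tumor: 3.5 cm with 5 positive lymph nodes.
example_stage = (t_category_by_size(3.5), n_category_by_positive_nodes(5))
```

Under the sixth/seventh-edition node rule quoted in the text (0 versus ≥ 1), the same hypothetical patient would simply be node-positive, which illustrates the finer node stratification of the newer edition.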
Strengths of our study include the international population-based design, the largest number of patients with resected PaC ever investigated, the extensive potential prognostic factors studied, the uniformly and consistently defined variables especially TNM stage across countries, and the consistency and quality control in reporting through applying rigorous registry data standards. Analyses were performed separately in each respective country without pooling, which avoids the impact of the potential heterogeneity across countries.
Our work may have important clinical impacts and provides to our knowledge the first population-based model which can predict survival for patients with stage I-II PaC who underwent resection and chemotherapy. The model is robust, accurate, well-generalizable, practical, and easy-to-use. Our model may offer personalized patient survival estimates and facilitate clinical counseling for both patients and doctors. Having an idea about the estimated survival of a specific patient could influence plans on follow-up and surveillance (e.g., frequency and examination modality) and thus possibly guide resource allocation. For some proportion of the resected patients considering further treatment, the predicted survival might encourage receipt of further chemotherapy. The international validation assures that our model could be used for survival stratification in international studies.
Patients with resected PaC do not respond equally to chemotherapy, and accordingly, the calibration plots also suggest that individual survival varied greatly despite the relatively consistent comprehensive survival across countries. Our study will help to initially stratify this patient population into subgroups with discrepant survival, and might serve as a platform for developing further endeavors to understand factors associated with chemotherapy responses and survival in resected PaC, including precise, individualized, and personalized genomic and proteomic survivorship investigations.
Like any observational registry-based investigation, our study also has some limitations. Our model predicts survival at the average population level, and when applying this model in specific centers or regions with different care patterns, there could be some inconsistencies between predicted and actual survival. Nevertheless, as revealed by the calibration plots, the real-world survival was still in good accordance with the prediction for a single individual. Residual confounding is a concern. Some significant variables (e.g., tumor size) were only registered in certain databases. Differences in survival pattern across countries might be partly associated with variation in the prescription of chemotherapy and/or the underlying ethnic/racial distribution, even though association results remained similar after limiting the US cohort to white. Notably, there were some differences in patient and tumor characteristics across registries. For instance, in Slovenia, tumors were generally more advanced and poorly differentiated, and the actual survival was the lowest. Nevertheless, these variables were adjusted for in our multivariable analyses.
Population-based registries collected limited information on variables including family and patient health history and individual-level socioeconomic status. In addition, we were unable to determine the molecular or genetic subtype of PaC [16], which probably plays a role in prognosis and explains the moderate C-index of our nomogram. Accordingly, our nomogram is limited by failure to incorporate these and other recognized prognostic parameters (e.g., neurovascular invasion and type of chemotherapy). Further efforts on collection and incorporation of more relevant variables are encouraged to improve this model.
Notably, all known models predicting PaC survival perform very modestly [23,24,30-32]. Our nomogram with selection of only chemotherapy-treated resected PaC patients does not perform better compared to previous models with selection of all patients undergoing resection [23,24,30-32], which might limit the added value of the selection for the current nomogram. The lack of detailed information on chemotherapy which has not been routinely collected in most registries is another limitation of this population-based registry-based study. Collection of such information is strongly encouraged in future registration practice. During the study period, the type of chemotherapy was mainly gemcitabine monotherapy, followed by 5-fluorouracil-based therapy. The ESPAC-3 [46] and RTOG 97-04 randomized trials [47] demonstrated similar efficacy and effectiveness regarding survival between gemcitabine and 5-fluorouracil in the adjuvant setting. The landscape of systemic treatment (e.g., agent and formula) and treatment sequence for PaC are rapidly changing, which might limit the possible use of this nomogram.
Despite the moderate C-index, the agreement between predicted and actual survival was very good. All variables included in our nomogram are easily available in clinics, compared to the not routinely measured and costly molecular markers. It is herein the first time that the contributions of these risk factors are quantified and integrated into a single model for survival prediction in resected and chemotherapy-treated PaC with international validations.

Conclusions
This large international population-based investigation revealed independent factors associated and not associated with survival in patients with resected stage I-II PaC receiving chemotherapy, with country-specific association patterns and strengths. We further established and internationally validated a novel, robust, and reliable survival-predicting model, which may provide the basis for more precise individualized survival estimation and which could be useful for clinical counseling. Our nomogram in line with all known models predicting survival in resected PaC performs modestly.

Additional file
Additional file 1: Table S1. Selection of contacted national population-based cancer registries in Europe. Table S2. General information on participating population-based registries.