Predicting early extrahepatic recurrence after local treatment of colorectal liver metastases

Abstract

Background: Patients who develop early extrahepatic recurrence (EHR) may not benefit from local treatment of colorectal liver metastases (CRLMs). This study aimed to develop a prediction model for early EHR after local treatment of CRLMs using a national data set.

Methods: A Cox regression prediction model for EHR was developed and validated internally using data on patients who had local treatment for CRLMs with curative intent. Performance assessment included calibration, discrimination, net benefit, and generalizability by internal–external cross-validation. The prognostic relevance of early EHR (within 6 months) was evaluated by landmark analysis.

Results: During a median follow-up of 35 months, 557 of the 1077 patients had EHR and 249 died. Median overall survival was 19.5 (95 per cent c.i. 15.6 to 23.0) months in patients with early EHR after CRLM treatment, compared with not reached (45.3 months to not reached) in patients without an early EHR. The EHR prediction model included side and stage of the primary tumour, RAS/BRAF V600E mutational status, and number and size of CRLMs. The range of 6-month EHR predictions was 5.9–56.0 (i.q.r. 12.9–22.0) per cent. The model demonstrated good calibration and discrimination. The C-index through 6 and 12 months was 0.663 (95 per cent c.i. 0.624 to 0.702) and 0.661 (0.632 to 0.689) respectively. The observed 6-month EHR risk was 6.5 per cent for patients in the lowest quartile of predicted risk compared with 32.0 per cent in the highest quartile.

Conclusion: Early EHR after local treatment of CRLMs can be predicted.


Introduction
Colorectal cancer liver metastases (CRLMs) are the major cause of colorectal cancer-related death 1 . Local treatment of CRLMs without extrahepatic metastatic involvement, such as liver resection, offers the only chance of cure or long-term survival [2][3][4][5] . Improved surgical and ablative techniques, optimization of systemic induction treatment, and more lenient eligibility criteria have increased the number of patients undergoing CRLM resection 4,6 . Relapse after local CRLM treatment occurs in up to 75 per cent of patients, often with unresectable recurrences and decreased survival 5,7,8 . Numerous prediction models for (recurrence-free) survival after local treatment of CRLMs exist [9][10][11][12][13][14][15] , but these are not widely used to guide decision-making owing to their inability to identify patients with a sufficiently short survival to render local treatment unjustified. Aspects that might contribute to this include suboptimal incorporation of prognostic factors and the use of (recurrence-free) survival as an endpoint.
Recurrence-free survival (RFS) does not discriminate between intrahepatic and extrahepatic recurrences. Patients with liver-limited recurrences may be eligible for repeat local treatment, resulting in long-term survival 8,[16][17][18] . In contrast, a minority of patients with extrahepatic recurrence (EHR) undergo repeated local treatment [18][19][20] . An early recurrence, usually defined as recurrence developing within 6 months 21,22 , and EHR are independently associated with poor overall survival (OS) in patients receiving local treatment for CRLMs [21][22][23] . Therefore, local treatment of CRLMs may not be justified in patients who develop early EHR. Being able to predict early EHR may spare patients invasive treatment, and avoid delay in starting systemic treatment that may effectively treat the systemic disease present. In randomized trials 24,25 , in patients receiving local CRLM treatment, no median OS benefit was seen with perioperative systemic therapy. Early EHR estimates potentially could stratify patients who may and who may not benefit from additional perioperative systemic therapy.
Although early EHR after local treatment of CRLMs is of major clinical importance, no prediction models for early EHR exist. Furthermore, novel prognostic factors, such as primary tumour location and RAS/BRAF V600E tumour mutational status, may aid in better identifying patients at high risk of early EHR. Patients with right-sided primary tumours have a worse prognosis after local treatment of CRLMs, more recurrences at multiple sites, and less repeated local treatment than patients with left-sided primary lesions 26,27 . The presence of RAS and BRAF V600E mutations is associated with a higher recurrence rate of up to 94 per cent, with EHR not amenable to local therapy, and shorter EHR-free survival (EHRFS) 8,28-30 . This study aimed to develop and internally validate a prediction model that incorporates primary tumour location and RAS/BRAF V600E mutational status for early EHR following local treatment of CRLMs using a population-based cohort.

Patient cohort
All patients diagnosed with colorectal cancer between 1 May 2015 and 31 December 2016, who underwent local treatment (resection and/or ablation) with curative intent for CRLMs, were identified in the Netherlands Cancer Registry (NCR) 31 . Patients with extrahepatic metastases before resection, R2 liver resections, appendiceal carcinoma, concomitant local liver treatment other than resection or ablation, and without any follow-up information were excluded. The scientific committee of the Netherlands Comprehensive Cancer Organisation (IKNL) approved the research protocol, and the requirement for written informed consent was waived for this study. The study was performed in accordance with the Declaration of Helsinki and reported according to the TRIPOD guidelines 32 .

Candidate predictor variables
Data were extracted from the NCR including: age, sex, AJCC tumour (T) category, node (N) category of the primary tumour, location of the primary tumour (right or left colon, or rectum), disease-free interval (DFI) between detection of the primary tumour and metastases, size and number of CRLMs, carcinoembryonic antigen (CEA) level before local treatment of CRLMs, type of local treatment, resection margin status (R0 versus R1), and perioperative systemic treatment administered. Major resection was defined as resection of at least four liver segments 33 , synchronous disease as DFI of 6 months or less 34 , and perioperative systemic therapy as treatment administered 100 days or less before and/or after local CRLM treatment and initiated before progression of disease. The intent (neoadjuvant or induction) of systemic treatment was not registered. However, as Dutch colorectal cancer guidelines 35 recommend not administering perioperative systemic therapy to patients with resectable CRLMs, it was assumed that preoperative systemic treatment was given as induction treatment to achieve CRLM resectability. Further assumptions regarding systemic treatment are described in Table S1.
Information on RAS/BRAF V600E mutational status was retrieved from the NCR and the national automated pathological archive (PALGA 36 ), determined in daily practice for the primary tumour or metastases at any time during the disease course. Missing KRAS, NRAS, and BRAF V600E mutation status was complemented by an additional Sequenom Massarray® (San Diego, CA, USA) mutation analysis of tumour tissue from 250 patients. These 250 additional samples were selected in such a way as to maximize mutation status information for patient subgroups otherwise under-represented, increasing the likelihood of successful multiple imputation 37 .

Patient outcomes
Follow-up data for recurrences were collected from medical records until 1 May 2020 and survival was obtained by linkage with the municipal population registry on 31 January 2021. OS was defined as the interval from the date of first local treatment for CRLMs to the date of death or last follow-up. RFS and EHRFS were calculated as the interval from the date of first local treatment of CRLMs to the date of an RFS or EHRFS event, defined as first recurrence of disease (RFS) or first EHR (EHRFS), or death, whichever occurred first; patients without an event were censored at the last date of RFS or EHRFS follow-up respectively. If follow-up for recurrences was shorter than follow-up for survival, all survival follow-up beyond the last follow-up for recurrences was discarded for assessment of RFS, EHRFS or OS. In all patients, a minimum of 1-year RFS and 2-year OS follow-up was ensured. All assumptions regarding OS, RFS, and EHRFS are recorded in Table S1.
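The EHRFS definition can be made concrete with a minimal sketch. The function name, its arguments, and the conversion of days to months (30.4375 days per month) are illustrative assumptions and are not taken from the study's analysis code; the detailed assumptions used in the study are those recorded in Table S1.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumed conversion factor, not specified in the text


def ehrfs_endpoint(
    treatment: date,
    first_ehr: Optional[date],
    death: Optional[date],
    last_recurrence_follow_up: date,
) -> Tuple[float, int]:
    """Return (time in months, event indicator) for EHR-free survival.

    An event is the first EHR or death, whichever occurs first. Anything
    observed after the last recurrence follow-up date is discarded, and
    the patient is then censored at that date.
    """
    observed = [
        d for d in (first_ehr, death)
        if d is not None and d <= last_recurrence_follow_up
    ]
    if observed:
        end, event = min(observed), 1
    else:
        end, event = last_recurrence_follow_up, 0
    return (end - treatment).days / DAYS_PER_MONTH, event
```

RFS would follow the same pattern with the date of first recurrence at any site in place of the date of first EHR.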

Statistical analysis
Standard descriptive statistics were used to describe the study population, including median (i.q.r.) for continuous data, and frequency and percentages for categorical variables. Follow-up and patient outcomes were described using (reverse) Kaplan-Meier approaches.

Prediction model development and performance assessment
Early EHR (within 6 months 21,22 ) was defined as the clinically relevant primary endpoint, owing to the poor prognosis in patients with early EHR and lower chance of repeat local treatment, in contrast to patients with liver-only recurrences.
The prognostic impact of this primary endpoint was assessed using landmark analysis at 6 months after CRLM treatment. Based on published recommendations 38 , there were sufficient data to model 17 coefficients. Nine candidate predictors were selected for model development by assessment of a multidisciplinary team based on literature describing previous prediction models and novel prognostic factors [9][10][11][12]26,39 . The predictors, including four continuous variables that were modelled non-linearly, were: neoadjuvant systemic treatment, primary tumour location, T category, N category, RAS/BRAF V600E mutational status, number of liver metastases, size of largest liver metastasis, preoperative CEA level, and DFI. Multiple imputation with multivariate imputation by chained equations 40 was used to account for missing data.
A prediction model for EHRFS after local treatment of CRLMs was developed using Cox regression, with a time horizon of 12 months to improve the effective sample size, but with a primary evaluation of the model's performance for EHR within 6 months. The prediction model was developed in the whole cohort, using Akaike information criterion (AIC)-based backward selection in each imputed data set, leading to a primary model including only predictors selected in at least 50 per cent of imputed data sets, which was then refitted in each imputed data set to obtain a pooled model using Rubin's rules (EHR model). Adjuvant systemic therapy was included in all models using an offset for expected therapeutic efficacy based on the pooled adjuvant systemic treatment effect from published RCTs 24,25 .
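Two steps of this procedure, retaining predictors selected in at least 50 per cent of imputed data sets and pooling refitted coefficients with Rubin's rules, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the example numbers are hypothetical, and the Cox model fitting and AIC-based selection themselves are not shown.

```python
import math
from collections import Counter
from statistics import mean, variance


def majority_predictors(selections, min_fraction=0.5):
    """Keep predictors selected in at least min_fraction of imputed data sets."""
    m = len(selections)
    counts = Counter(p for chosen in selections for p in set(chosen))
    return sorted(p for p, c in counts.items() if c / m >= min_fraction)


def pool_rubin(estimates, std_errors):
    """Pool one coefficient (for example a log HR) across m imputed data sets.

    Rubin's rules: the pooled estimate is the mean of the estimates, and the
    total variance is W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B is the between-imputation (sample) variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log HRs and standard errors from three imputed data sets
log_hr, se = pool_rubin([0.40, 0.50, 0.60], [0.10, 0.10, 0.10])
hazard_ratio = math.exp(log_hr)
```

Pooling is done on the log HR scale because Rubin's rules assume an approximately normal sampling distribution; the pooled HR is obtained by exponentiating afterwards.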
Model performance at 6 and 12 months was assessed using calibration plots, discrimination (C-index), time-dependent receiver operator characteristic (ROC) curves, decision curve analysis, and Nagelkerke's R². Each measure was determined for each imputed data set separately and pooled using Rubin's rules, incorporating appropriate data-transformation steps.
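To illustrate what a C-index evaluated through a fixed time horizon measures, the sketch below computes a truncated Harrell's C-index by direct pairwise comparison. It is a simplified illustration, not the estimator implementation used in the study: pairs with tied event times are ignored, no pooling across imputed data sets is performed, and the function name is hypothetical.

```python
def harrell_c_index(times, events, risks, horizon):
    """Harrell's C-index restricted to events occurring up to `horizon`.

    A pair (i, j) is comparable when patient i has an observed event at or
    before the horizon and patient j is still event-free after that time.
    The pair is concordant when the model assigned i the higher risk.
    """
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i] or times[i] > horizon:
            continue  # i must have an observed event within the horizon
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# Hypothetical data: months to event or censoring, event flag, predicted risk
c6 = harrell_c_index([2, 4, 6, 8, 10], [1, 1, 0, 1, 0],
                     [0.9, 0.4, 0.5, 0.3, 0.1], horizon=6)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by predicted risk.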
Decision curve analysis was used to assess the net benefit associated with CRLM treatment decisions based on a given threshold value for 6- or 12-month EHRFS probability 41 . To visualize the model's potential relevance, Kaplan-Meier curves were plotted for EHRFS, RFS and OS, with patients categorized based on quartiles of predicted EHR risk.
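The idea behind net benefit can be illustrated with the standard formula for a binary outcome, in which false-positives are weighted by the odds of the threshold probability. The sketch below is a simplification under stated assumptions: it treats EHR within the time horizon as a fully observed binary outcome (no censoring) and does not reproduce the separate selected and non-selected components used in this study. Function names and data are hypothetical.

```python
def net_benefit(outcomes, predicted_risks, threshold):
    """Net benefit of classifying patients with risk >= threshold as high risk.

    NB = TP/n - FP/n * threshold / (1 - threshold), where outcomes are coded
    1 (EHR within the horizon) or 0 (no EHR within the horizon).
    """
    n = len(outcomes)
    flagged = [(y, p) for y, p in zip(outcomes, predicted_risks) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_flag_all(outcomes, threshold):
    """Reference strategy: every patient is classified as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes and predicted 6-month EHR risks for ten patients
y = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
p = [0.6, 0.4, 0.3, 0.1, 0.1, 0.2, 0.5, 0.1, 0.05, 0.15]
model_nb = net_benefit(y, p, threshold=0.25)
flag_all_nb = net_benefit_flag_all(y, threshold=0.25)
```

A decision curve plots these quantities over a range of thresholds; the model adds value at thresholds where its net benefit exceeds that of both reference strategies (classifying all patients or no patients as high risk, the latter having a net benefit of zero).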
Internal validation by 500-fold bootstrap resampling was used, repeating all model-development steps, in each bootstrap sample, to obtain an overoptimism-corrected model (using uniform shrinkage) and C-index. Internal-external cross-validation was applied, including all modelling steps, to evaluate the generalizability of the model based on three geographical regions.
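The bootstrap optimism correction and the application of a uniform shrinkage factor can be sketched as below. This assumes the commonly used scheme in which optimism is the mean difference between each bootstrap model's performance in its own bootstrap sample and in the original data; how the shrinkage factor itself was estimated is not shown. All names and numbers are hypothetical.

```python
import math
from statistics import mean


def optimism_corrected(apparent, in_bootstrap, in_original):
    """Subtract the mean bootstrap optimism from the apparent performance.

    in_bootstrap[b]: performance of bootstrap model b in its own sample.
    in_original[b]:  performance of the same model in the original data.
    """
    optimism = mean(b - o for b, o in zip(in_bootstrap, in_original))
    return apparent - optimism


def shrink_hazard_ratio(hazard_ratio, shrinkage):
    """Apply uniform shrinkage on the log HR scale."""
    return math.exp(shrinkage * math.log(hazard_ratio))


# Hypothetical C-index values from three bootstrap repetitions
corrected_c = optimism_corrected(0.70, [0.72, 0.71, 0.73], [0.69, 0.69, 0.70])
shrunken_hr = shrink_hazard_ratio(2.0, 0.86)
```

Uniform shrinkage multiplies every regression coefficient by the same factor below 1, pulling HRs towards 1 so that predictions in new patients are less extreme.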
An exploratory analysis was conducted to test whether the prognostic value of RAS mutation for EHRFS depended on the administration of preoperative systemic treatment, as reported by others 28 .

Patient cohort
All 1105 patients who underwent local treatment (resection and/or ablation) for CRLMs were selected from the NCR for analysis.
No follow-up data were available for 11 of the 1105 patients (less than 1.0 per cent). The primary endpoint (early EHR) was available for 1077 patients.
Patient characteristics are summarized in Table 1

Fig. 2 Calibration plots for predicted versus observed 6- and 12-month extrahepatic recurrence
Predicted versus observed a 6-month and b 12-month extrahepatic recurrence (EHR, which includes EHR or death as events for EHR-free survival) probabilities. The histogram shows the distribution of predicted EHR probabilities. The integrated calibration index was 0.015 (6-month EHR) and 0.028 (12-month EHR). The median absolute difference was 0.017 (6 months) and 0.030 (12 months), with a maximum absolute difference of 0.03 (6 months) and 0.06 (12 months).

Fig. 3 Decision curve analysis plots
Plots indicate the net benefit obtained for a given threshold value for a-c 6-month and d-f 12-month extrahepatic recurrence (EHR) probability, which includes EHR or death as an EHR-free survival (EHRFS) event. The net benefit was compared across three situations: non-informed decision-making (selecting all patients or no patients (dashed and dotted lines respectively)) and for informed decision-making by selecting patients for local treatment of colorectal liver metastases (CRLMs) according to the clinical risk score's (CRS) predicted EHRFS probability (blue continuous line). For comparison, the horizontal black continuous line represents an omniscient model (all-knowing model). a,d Net benefit of local treatment for CRLMs (selected patients) is determined using the true-positives (patients with predicted EHRFS probability (p_EHRFS) above the threshold value and not having had an EHR) versus false-positives (p_EHRFS above threshold and the patient did have an EHR) for a range of threshold values (0-1), with the benefit of false-positives weighted relative to the threshold value. For consistency, the net benefit is shown for a range of thresholds for EHR (EHR probability = 1 - EHRFS probability). b,e Net benefit of no local treatment for CRLMs (non-selected patients) is determined using the true-negatives (patients with p_EHRFS below threshold and having an EHR) versus false-negatives (patients with p_EHRFS below threshold and not having had an EHR) for a range of threshold values (0-1), with the benefit of false-negatives weighted relative to the threshold value. c,f Overall net benefit is the sum of the net benefit for the selected and non-selected patients.

months. Notably, 45 patients (23.2 per cent) with early EHR had undergone major liver surgery (hemihepatectomy), of which 23 (11.9 per cent of all patients) had two-stage resection, whereas only 15 patients (7.7 per cent) had local ablation therapy only.
The first EHR was a multisite EHR in 127 patients (26.6 per cent). The site of first EHR was most frequently the lungs and lymph nodes in 213 (44.6 per cent) and 55 (11.5 per cent) respectively, whereas the brain was affected in 6 patients (1.3 per cent). The site of first EHR correlated significantly with postrecurrence survival (P < 0.001); the shortest postrecurrence survival was in patients with brain metastases and longest in patients with EHR in the lymph nodes, within the abdomen or lungs (Fig. S2).

Prognostic relevance of 6-month extrahepatic recurrence
Some 982 patients who survived until the landmark time (6 months after local treatment of CRLMs) were included in the landmark analysis to compare survival outcomes according to type of recurrence. Of those alive at 6 months after local treatment, 726 (73.9 per cent) had no recurrence, 123 (12.5 per cent) developed liver-only recurrence, and 133 (13.5 per cent) had EHR (including 100 patients with extrahepatic and intrahepatic recurrence). Median OS from the landmark time was 19.5 (95 per cent c.i. 15.6 to 23.0) months for patients with 6-month EHR after CRLM treatment (Fig. S3), 30.7 (29.0 to not reached) months for those with liver-only recurrence, and not reached (45.3 months to not reached) for patients without a recurrence.
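The logic of the landmark analysis can be sketched as follows: only patients still in follow-up at the landmark are retained, they are grouped by recurrence status at the landmark, and survival time is restarted from that point. This is a minimal illustration under assumptions; the field names (`os_months`, `died`, `ehr_months`, `liver_months`) are hypothetical, and recurrences after the landmark deliberately do not change the group.

```python
def landmark_groups(patients, landmark=6.0):
    """Group patients by recurrence status at the landmark time (months).

    Returns, per group, a list of (survival months from landmark, died flag)
    that could be passed to a Kaplan-Meier estimator.
    """
    groups = {"no recurrence": [], "liver-only": [], "EHR": []}
    for p in patients:
        if p["os_months"] < landmark:
            continue  # died or lost to follow-up before the landmark
        ehr = p.get("ehr_months")
        liver = p.get("liver_months")
        if ehr is not None and ehr <= landmark:
            label = "EHR"  # includes combined extra- and intrahepatic recurrence
        elif liver is not None and liver <= landmark:
            label = "liver-only"
        else:
            label = "no recurrence"
        groups[label].append((p["os_months"] - landmark, p["died"]))
    return groups


# Hypothetical patients
example = landmark_groups([
    {"os_months": 20, "died": 1, "ehr_months": 3, "liver_months": None},
    {"os_months": 4, "died": 1, "ehr_months": 2, "liver_months": None},
    {"os_months": 30, "died": 0, "ehr_months": None, "liver_months": 5},
    {"os_months": 40, "died": 0, "ehr_months": 9, "liver_months": None},
])
```

Fixing group membership at the landmark avoids immortal time bias, which would arise if patients were classified by recurrences that occur later during follow-up.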

Prognostic value of tumour mutational status and sidedness of primary tumour
The prognostic value of tumour mutational status and sidedness of the primary tumour was first explored using univariable analysis for OS, EHRFS, and RFS ( Fig. S4 and Table 2

Extrahepatic recurrence prediction model
Following AIC-informed backward selection, the model included six of nine candidate predictor variables: sidedness of the primary tumour, T category, N category, RAS/BRAF V600E mutational status, and number and size of liver metastases; preoperative systemic treatment, preoperative CEA level, and DFI were not informative enough. Model HR values are shown in Table 2 (non-linear HR plots for continuous variables can be found in Fig. S5). In an exploratory analysis including an interaction term between RAS mutational status and preoperative systemic treatment, the model fit did not significantly improve (P = 0.194, Wald's D1 test).

Performance and validation of model
EHRFS, RFS, and OS differed according to quartiles of predicted EHR risk (Fig. 1). Six-month EHR rates in the low-, intermediate-, high-, and very high-risk patient groups were 6.5 (95 per cent c.i. 3.9 to 9.9), 15.0 (11.0 to 19.6), 20.3 (15.7 to 25.4), and 32.0 (26.4 to 37.7) per cent respectively. Likewise, the model showed good discrimination for RFS and OS.
The performance of the prediction model was further assessed by calibration and discrimination. The estimated and observed risks for EHR or death were well calibrated (Fig. 2). The observed to expected ratio was 1.015 (95 per cent c.i. 0.911 to 1.120). For discrimination, Harrell's C-index through 6 and 12 months was 0.663 (95 per cent c.i. 0.624 to 0.702) and 0.661 (0.632 to 0.689) respectively, and similar for Uno's C-index. The 6- and 12-month areas under the time-dependent ROC curves were 0.668 (95 per cent c.i. 0.626 to 0.709) and 0.671 (0.636 to 0.707) respectively (Fig. S6). The shrinkage factor obtained through internal validation was 0.86; shrunken HRs are shown in Table 2. The shrunken model yielded overoptimism-corrected 6-month risks for EHR or death of between 5.9 and 56.0 per cent (i.q.r. 12.9-22.0 per cent). The optimism-adjusted Harrell's C-index through 6 and 12 months was 0.643 (0.605 to 0.682) and 0.641 (0.612 to 0.669). Full model specifications are shown in Appendix S1.
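For readers wishing to apply a Cox model of this kind, an individual risk is generally obtained by combining a baseline survival probability at the time horizon with the patient's (shrunken) linear predictor. The sketch below shows that general calculation only. The baseline survival and coefficients are invented placeholders, not the values in Appendix S1, and the sketch ignores details such as centring of covariates, non-linear terms, and re-estimation of the baseline after shrinkage.

```python
import math


def predicted_risk(baseline_survival, coefficients, covariates, shrinkage=1.0):
    """Risk of EHR or death by the time horizon under a Cox model.

    risk = 1 - S0(t) ** exp(shrinkage * linear predictor), where S0(t) is the
    baseline EHR-free survival at the horizon for the reference patient.
    """
    linear_predictor = sum(coefficients[k] * covariates[k] for k in coefficients)
    return 1.0 - baseline_survival ** math.exp(shrinkage * linear_predictor)


# Placeholder values for illustration only (not the published model)
coefs = {"ras_mutant": 0.5, "right_sided": 0.3}
reference_risk = predicted_risk(0.85, coefs, {"ras_mutant": 0, "right_sided": 0}, 0.86)
higher_risk = predicted_risk(0.85, coefs, {"ras_mutant": 1, "right_sided": 1}, 0.86)
```

With all covariates at their reference values the predicted risk equals one minus the baseline survival; each additional risk factor raises the exponent and therefore the predicted risk.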
The model was further validated for generalizability by internal-external cross-validation using three geographical regions, which indicated that models developed on the other regions showed adequate performance in each excluded geographical region (Fig. S7).

Decision curve analysis for net benefit when using model-guided CRLM treatment decisions
The potential net benefit of the model for clinical decision-making regarding local treatment of CRLMs was examined through decision curve analysis. EHR model-guided treatment of CRLMs (compared with non-informed decision-making by treating all or no patients) resulted in net benefit for patients for 6-month EHR risk thresholds of 0-40 per cent and 12-month EHR risk thresholds of 0-60 per cent (Fig. 3).

Discussion
In this study, a prediction model was developed for early EHR in a nationwide, population-based cohort of patients who had local treatment of CRLMs. The model incorporated tumour RAS/BRAF V600E mutational status and sidedness of primary tumour alongside traditional prognostic factors. Early EHR after local CRLM treatment is of major clinical importance and can be predicted from routine clinical information. The EHR prediction model developed here discriminates between patients based on EHR rates, reflected in differing EHRFS, RFS, and OS. The EHR prediction model's expected generalizability is good.
Prediction models are increasingly being used, and can facilitate shared risk-informed decision-making for interventions, manage patient expectations, or select patients for inclusion in trials. However, clinical application of prediction models for local CRLM treatment is hampered by lack of generalizability, loss of predictive performance by simplification of models, and low clinical utility 37 . Published models were developed to predict RFS and OS. With increasing possibilities for repeated resections of CRLM recurrences with favourable survival outcomes 16,17 , RFS and OS prediction models become less relevant. The present study confirmed that about half of patients have a liver-limited first recurrence and experience long-term survival. Although RFS and OS are meaningful outcomes for managing expectations, EHRFS as an outcome may guide clinical decisions for patients with CRLMs.
Local CRLM treatment should ideally be avoided in patients who experience early EHR (18.0 per cent of patients). These patients evidently have systemic disease, a poor prognosis, and are often not eligible for repeated local treatment [18][19][20][21][22][23] . The poor OS demonstrated in patients with early EHR (19.5 months in landmark analysis) is comparable to the expected OS of patients with metastatic colorectal cancer undergoing palliative systemic treatment 42 . Patients at high risk of early EHR are unnecessarily exposed to potential perioperative risks and may be harmed by delaying palliative systemic treatment, especially as a large proportion of the high-risk patients underwent major liver surgery as they had more extensive disease. The EHR prediction model can be used to confirm that local treatment should be pursued in low-risk patients. However, the model cannot currently identify patients with a predicted risk high enough to justify avoiding local CRLM treatment. The EHR prediction model may aid clinical decision-making by identifying patients at moderate-to-high risk of early EHR who may benefit from perioperative systemic treatment. A treatment strategy for these patients may be to initiate systemic treatment and, upon sustained response, carry out local treatment of CRLMs. Once externally validated, the EHR model will lend itself well to studies examining the optimal treatment by stratifying patients who are at moderate-to-high risk of early EHR.
The strength of the study is a nationwide cohort of patients encompassing 39 academic, teaching, and regional hospitals. The cohort had minimal loss to follow-up (below 1.0 per cent). Furthermore, the EHR prediction model included RAS and BRAF V600E mutational status, important prognostic factors. Only three previous prediction models included RAS and BRAF mutation status [13][14][15] , potentially owing to the low prevalence of BRAF mutations in patients with local treatment of CRLMs (approximately 2 per cent) 13 . In contrast to previous studies 28, 43 , there was no interaction between neoadjuvant treatment status and RAS mutational status here.
Limitations include a selected population based on primary tumour diagnosis in 2015 and 2016, with subsequent local treatment of CRLMs until January 2019 (no DFI beyond 4 years). The prediction model could not robustly specify site of recurrence, which may be relevant especially for patients with lung-only recurrences who can experience long-term survival after local treatment 44,45 . It was not possible to validate the prediction model externally beyond internal-external cross-validation. The full EHR prediction model specifications have been provided to facilitate external validation in other patient cohorts.
The performance of the model could be improved further by including additional promising features that may better identify high-risk patients 15 . Examples include distinct histopathological growth patterns, the Immunoscore (based on T cell infiltration), a six-gene panel, and liquid biopsies (detecting circulating tumour DNA) [46][47][48][49] . Incorporating these features into an updated prediction model for local CRLM treatment may help identify patients at sufficiently high risk for early EHR to optimize the treatment strategy for such patients.

Funding
This research was supported by a grant from the Sacha Swarttouw-Hijmans Fund.