External validation and updating of prediction models for estimating the 1-year risk of low health-related quality of life in colorectal cancer survivors

Objectives: Timely identification of colorectal cancer (CRC) survivors at risk of experiencing low health-related quality of life (HRQoL) in the near future is important for enabling appropriately tailored preventive actions.


Introduction
Globally, the population of colorectal cancer (CRC) survivors is growing [1–3]. This growing population poses an increasing burden on health-care systems, as many survivors continue to experience health problems and report decreased levels of health-related quality of life (HRQoL) in the years after diagnosis and treatment [4–6], with declines in HRQoL mostly occurring in the first 6 months after diagnosis and treatment [7]. To provide appropriate and tailored care to CRC survivors and prevent declining HRQoL, it is important to identify individual CRC survivors who have an increased risk of experiencing low HRQoL in the future.
Accurate and timely identification of high-risk individuals is challenging in oncology practice. Risk prediction models can help health-care providers identify CRC survivors who are likely to experience low HRQoL in the future and who are therefore eligible for interventions aimed at promoting their HRQoL. Prediction models have been developed to predict survival in CRC patients [8–10], which can provide input for the decision-making process regarding cancer treatment. Once cancer treatments are completed, however, these models are no longer helpful for predicting, for instance, the impact of late and long-term adverse treatment effects on the HRQoL of CRC survivors. Prediction models specifically aimed at HRQoL estimation are needed for that purpose. Although previous studies have investigated associations of clinical, personal, lifestyle, and psychosocial factors with HRQoL in CRC survivors [11–13], these factors were not combined into prediction models to be used in oncology practice for predicting the risk of HRQoL declines after treatment.
Therefore, we have recently developed and internally validated prediction models to estimate the 1-year risk of low HRQoL in seven relevant domains (i.e., global quality of life; cognitive, emotional, physical, role, and social functioning; and fatigue) [14], using data from a large prospective cohort of long-term CRC survivors [15]. For model development, we used biopsychosocial HRQoL predictors that were selected based on evidence from previous association studies, summarized in an extensive systematic review [16]. Excellent predictive performance was demonstrated during model development and internal validation in CRC survivors who were on average 5 years postdiagnosis [14], thereby covering the more long-term consequences of the disease and treatment on HRQoL. However, it is necessary to externally validate the models in CRC survivors situated closer to the moment of diagnosis and treatment, since this is a more clinically relevant time frame for applying the models in oncology practice to identify, in a timely manner, individual CRC survivors at risk of experiencing low HRQoL earlier in the survivorship trajectory. If the predictive performance of the models is generalizable to short-term CRC survivors, this provides opportunities for tailoring interventions, such as lifestyle and psychosocial interventions [17–24], aimed at safeguarding the HRQoL of high-risk individuals in the early post-treatment period. Moreover, the models can be used for selecting CRC survivors for studies evaluating the effectiveness of newly developed HRQoL interventions that could possibly be targeted at the modifiable predictors included in the models.
A common problem in the field of risk prediction modeling is that many models are developed but only a few are externally validated, which is a crucial step before potential implementation into practice. Therefore, the purpose of the current study was twofold. First, we aimed to externally validate the previously developed prediction models for estimating the 1-year risk of low HRQoL in seven domains, using pooled data from two prospective cohorts of CRC survivors within the first year postdiagnosis. Second, we aimed to update the models to obtain a single set of HRQoL predictors to be used for all domains, producing more parsimonious models that are easier to implement in oncology practice, while still being useful for adequate risk prediction and identification of high-risk individual CRC survivors.

Study populations
For the current analyses, we pooled datasets of two prospective cohort studies (Figure 1): the "Energy for life after ColoRectal cancer" (EnCoRe) study and the "COlorectal cancer: Longitudinal, Observational study on Nutritional and lifestyle factors that may influence colorectal tumor recurrence, survival, and quality of life" (COLON) study.

The EnCoRe study
We used data of N = 276 CRC survivors from the EnCoRe study, which is described in more detail elsewhere [25]. In short, EnCoRe is an ongoing multicenter prospective cohort study for which adult stage I–III CRC patients are being enrolled at diagnosis and followed up at 6 weeks, and at 6, 12, 24, and 60 months after treatment. Patients were recruited in three hospitals in the South-east of the Netherlands, and were excluded when diagnosed with stage IV CRC or in case of comorbidities obstructing successful participation (e.g., cognitive disorders such as Alzheimer disease). The study has been approved by the Medical Ethics Committee of the University Hospital Maastricht and Maastricht University, The Netherlands, and informed consent was obtained from all participants. Data from the 6-week (baseline) and 12-month post-treatment measurements (follow-up) collected between April 2012 and November 2016 were available for pooling (Figure 1).

What is new?

Key findings
External validation of previously developed prediction models for estimating the

What this adds to what was known?
The externally validated and updated prediction models could assist clinicians to early identify individual colorectal cancer survivors who are at increased risk of experiencing low health-related quality of life in the near future, providing opportunities for timely preventive actions.

What is the implication and what should change now?
As a next step toward possible implementation of the risk prediction models in oncology practice, the clinical impact of the models on improving or safeguarding health-related quality of life of colorectal cancer survivors needs to be evaluated.

The COLON study
We also used data of N = 1,320 CRC survivors from the COLON study, which is described in more detail elsewhere [26]. In short, COLON is an ongoing multicenter prospective cohort study among adult stage I–IV CRC patients recruited in eleven hospitals in the Netherlands. CRC patients were included at diagnosis and followed up at 6 months and at 2 years and 5 years after diagnosis. Patients were excluded when having a history of CRC or (partial) bowel resection, chronic inflammatory bowel disease, hereditary CRC syndromes, dementia, or another mental condition obstructing participation. The study has been approved by the Committee on Research involving Human Subjects, region Arnhem-Nijmegen, The Netherlands, and informed consent was obtained from all participants. Data from the 6-month (baseline) and 2-year postdiagnosis measurements (follow-up) collected between August 2010 and September 2017 were available for pooling (Figure 1).

Data collection

Health-related quality of life
As reported previously [14], the prediction models aim to estimate at study baseline what the risk is of having low HRQoL in seven domains at follow-up approximately 1 year later. In both the EnCoRe and COLON study, HRQoL was measured at study baseline and follow-up with the European Organization for Research and Treatment of Cancer Quality of life Questionnaire–Core 30 (EORTC QLQ-C30, Version 3.0) [27]. Seven subscales were used to assess the following HRQoL domains: global QoL; cognitive, emotional, physical, role, and social functioning; and fatigue. For each subscale a sum score was calculated (0–100 points), with higher scores on global QoL and functioning scales representing better HRQoL, and higher scores on fatigue representing worse fatigue [27]. Since the previously developed prediction models estimate the risk of having low HRQoL at follow-up, scores of the separate HRQoL subscales were dichotomized into low vs. normal/high scores based on previously published medium-to-large minimal important deteriorations (MID) in the EORTC QLQ-C30 subscales [28], as also described in the development study [14]. For each domain, the low HRQoL group had a score ≥1 MID below the group mean score at baseline; the rest was included in the normal/high HRQoL group. Thus, individuals in the low HRQoL group either reported constantly low HRQoL scores at both baseline and follow-up, or experienced a clinically relevant deterioration from normal/high HRQoL scores at baseline to low scores at follow-up.
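The dichotomization rule described above can be illustrated with a minimal Python sketch. The function name, the example scores, and the MID value of 10 points are hypothetical and chosen only for illustration; the published MIDs per subscale come from [28] and are not reproduced here. The direction is reversed for fatigue, where higher scores are worse.

```python
from statistics import mean


def classify_low_hrqol(baseline_scores, followup_scores, mid, higher_is_worse=False):
    """Flag follow-up scores lying at least one MID on the unfavourable
    side of the baseline group mean (True = low HRQoL)."""
    baseline_mean = mean(baseline_scores)
    if higher_is_worse:
        # Fatigue: a higher score means more complaints.
        cutoff = baseline_mean + mid
        return [score >= cutoff for score in followup_scores]
    # Global QoL and functioning scales: a lower score means worse HRQoL.
    cutoff = baseline_mean - mid
    return [score <= cutoff for score in followup_scores]


# Hypothetical 0-100 subscale scores and a hypothetical MID of 10 points.
baseline = [80.0, 70.0, 90.0, 60.0]   # group mean = 75
followup = [85.0, 65.0, 50.0, 75.0]
low_flags = classify_low_hrqol(baseline, followup, mid=10.0)
print(low_flags)
```

With these illustrative numbers the cutoff is 65 points, so the second and third participants are classified as having low HRQoL at follow-up.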

Predictors
Data collected within the EnCoRe and COLON cohorts on the predictors included in the previously developed prediction models were used for the present analyses. Briefly, the original prediction models included the following baseline predictors, as described in more detail before [14]: age (years); sex; socio-economic status (high, medium, or low); number of comorbidities, as measured by the Self-Administered Comorbidity Questionnaire (none, 1, or 2+) [29]; time since diagnosis (years); stoma presence (yes/no); body mass index (kg/m²); adherence (yes/no) to Dutch physical activity guidelines of at least 150 minutes/week of moderate-to-vigorous physical activity measured by the Short QUestionnaire to ASsess Health-enhancing physical activity [30]; anxiety and depressive symptoms measured by the Hospital Anxiety and Depression Scale [31]; baseline fatigue and HRQoL domain scores (EORTC QLQ-C30) [27]; chemotherapy (yes/no); radiotherapy (yes/no); tumor stage (TNM stages I–IV); current working status (yes/no); current smoking (yes/no); adherence (yes/no) to meat consumption recommendation according to the 2007 lifestyle guidelines of the World Cancer Research Fund/American Institute for Cancer Research (<500 g meat per week [32]); social inhibition and negative affectivity measured with the Dutch 14-item Type D Personality Scale [33]; and subscale scores of micturition, pain, chemotherapy-related side effects, and stoma-related complaints, as well as a total score for gastrointestinal symptoms calculated by summing subscale scores of gastrointestinal problems, nausea/vomiting, constipation, defecation, and diarrhea (measured by the EORTC QLQ-CR29 module, version 2.1, with higher values indicating more complaints for all disease symptom scales) [34]. Education level was used as a proxy for socioeconomic status, which was included in the original prediction models, as educational level has been shown to have similar predictor properties in relation to health [35,36]. An overview of measurement instruments and methods for data collection in both the EnCoRe and COLON study is presented in Supplementary Table 1, as well as a comparison with the instruments and methods used in the development study.

Missing data
Prior to data analyses, incomplete data on predictors and HRQoL outcomes were imputed with multiple imputation using the mice package in R (Version 1.0.136 – © 2009–2016 RStudio, Inc.) [37]. Some predictors from the original models were not measured in the EnCoRe and/or COLON cohorts (Supplementary Table 1) and were therefore only partially available or not available for the present analyses. Anxiety and depressive symptoms, working status, and disease symptom scales (micturition, chemotherapy-related side effects, stoma-related complaints, pain, and gastrointestinal complaints) were not available for the COLON cohort, and negative affectivity and social inhibition were not available in either EnCoRe or COLON. To deal with these missing data, we added data from the cohort used for developing the models [14] to proceed with the multiple imputation, as recommended in simulation studies to minimize bias [38,39]. In the multiple imputation procedure, we also took into account differences in time since diagnosis between the development cohort (PROFILES) and the external validation cohorts (EnCoRe and COLON) by including time interactions.

Statistical analyses
To estimate the absolute risk of scoring low on seven HRQoL domains, multivariable logistic regression analyses were performed with the rms package in R [40]. The analyses consisted of two parts: (1) external validation of the originally developed and internally validated models, and (2) model updating. In both parts of the analyses, models were recalibrated when needed.
In the first part of the analyses, the prediction models with their original regression parameters were externally validated by evaluating their predictive performance in the pooled EnCoRe and COLON dataset (model 1A). The models were then recalibrated by updating the intercepts only (model 1B), to make the average predicted probability equal to the observed overall event rate (i.e., the prevalence of low HRQoL at follow-up in the pooled EnCoRe and COLON cohorts), or by updating both the intercept and the slope (model 1C). For the latter, a logistic regression model was fitted with the linear predictor, which is the sum of all regression coefficients multiplied by their predictor variable value, as the only covariate to estimate the new intercept and a shrinkage factor for adjusting all regression coefficients.
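As an illustration of the two recalibration strategies described above, the following is a minimal Python sketch using only the standard library. It is not the authors' R code: the function names, the simulated linear predictors and outcomes, and the fitting routines (bisection for the intercept shift, Newton-Raphson for intercept plus slope) are assumptions made for the example.

```python
import math
import random


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def update_intercept(lp, y):
    """Intercept-only update: find the shift that makes the mean
    predicted probability equal to the observed event rate."""
    target = sum(y) / len(y)
    lo, hi = -20.0, 20.0
    for _ in range(100):  # bisection; mean prediction is monotone in the shift
        mid = (lo + hi) / 2.0
        if sum(expit(mid + v) for v in lp) / len(lp) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def update_intercept_and_slope(lp, y, iterations=50):
    """Fit logit(P(y=1)) = a + b * lp by Newton-Raphson; b acts as the
    shrinkage factor applied to all original coefficients."""
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for v, obs in zip(lp, y):
            p = expit(a + b * v)
            w = p * (1.0 - p)
            g0 += obs - p
            g1 += (obs - p) * v
            h00 += w
            h01 += w * v
            h11 += w * v * v
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Simulated (hypothetical) data: original linear predictors that are too extreme.
rng = random.Random(42)
lp = [rng.gauss(-1.5, 1.2) for _ in range(2000)]
y = [1 if rng.random() < expit(-0.6 + 0.7 * v) else 0 for v in lp]

shift = update_intercept(lp, y)          # intercept-only recalibration
a, b = update_intercept_and_slope(lp, y)  # intercept and slope recalibration
print(f"intercept shift: {shift:.3f}; new intercept: {a:.3f}; slope: {b:.3f}")
```

In this sketch the recalibrated risk is `expit(shift + lp)` for the intercept-only update and `expit(a + b * lp)` for the intercept-and-slope update; a slope below 1 shrinks the original coefficients toward zero.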
In the second part of the analyses, the models (1A–1C) for each of the separate HRQoL domains showing the best calibration were selected for further updating to obtain a single set of predictors for the seven HRQoL domains. The goal was to downsize the prediction models to create more parsimonious models that included fewer predictors and would be easier to implement in practice, without loss of predictive power. Therefore, we examined to what extent removal of predictors altered model performance, while in any case keeping the predictors for which the strongest evidence was reported [16] in the updated models (model 2A). Lastly, the downsized models were recalibrated again by updating the intercept only (model 2B) or both the intercept and slope (model 2C), as described above.
In both parts of the analyses, evaluation of model performance included assessment of discrimination, calibration, overall performance, and classification. Discrimination reflects the ability of a model to distinguish between individuals with low vs. normal/high HRQoL based on estimated risks, as quantified by the area under the Receiver Operating Characteristic curve (AUC, with AUC > 0.8 indicating good discrimination) [41]. Calibration is a measure of the agreement between predicted probabilities and observed relative frequencies of low HRQoL, and is assessed by calibration-in-the-large (the average difference between predicted probabilities and observed relative frequencies, which should be small for adequately calibrated models), the Hosmer–Lemeshow goodness-of-fit test (H–L, with P > 0.05 indicating adequate calibration), and by visually inspecting calibration plots showing agreement between predicted risk and observed prevalence of low HRQoL [42]. We assessed overall model performance with Nagelkerke's R², a measure of predictive strength (range 0–1) with higher values for better performance, and Brier scores, a measure of model accuracy (range 0–0.25) with lower scores for better accuracy [43]. In addition, for a range of predicted probabilities, sensitivity and specificity were determined as measures of classification for the updated models after recalibration. Sensitivity reflects the percentage of true-positive predictions given low HRQoL, and specificity reflects the percentage of true-negative predictions given no low HRQoL. Optimal threshold probabilities to be used as cutoffs for classification were defined for each model based on a high sensitivity (≥80%), as also described in the development study [14]. High sensitivity was prioritized over high specificity, since we deemed false-negative misclassifications more problematic than false-positive misclassifications when considering the nonhazardous and noninvasive nature of interventions aimed at HRQoL promotion.
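Three of the performance measures described above (AUC, Brier score, and calibration-in-the-large) can be computed directly from predicted risks and observed outcomes. The sketch below is an illustrative Python implementation with hypothetical function names and toy data; it is not the rms-based code used for the analyses, and it omits Nagelkerke's R² and the Hosmer–Lemeshow test.

```python
def auc(probs, outcomes):
    """Probability that a randomly chosen case (outcome 1) receives a higher
    predicted risk than a randomly chosen non-case; ties count as one half."""
    cases = [p for p, o in zip(probs, outcomes) if o == 1]
    controls = [p for p, o in zip(probs, outcomes) if o == 0]
    wins = 0.0
    for p in cases:
        for q in controls:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_in_the_large(probs, outcomes):
    """Mean predicted risk minus observed event rate."""
    return sum(probs) / len(probs) - sum(outcomes) / len(outcomes)


# Toy data (hypothetical): predicted risks and observed low HRQoL (1 = low).
probs = [0.1, 0.2, 0.3, 0.8, 0.6, 0.4]
outcomes = [0, 0, 1, 1, 0, 0]
print(auc(probs, outcomes))
print(round(brier_score(probs, outcomes), 4))
print(round(calibration_in_the_large(probs, outcomes), 4))
```

For these toy values the AUC is 0.75, and the positive calibration-in-the-large value indicates that the predicted risks are on average higher than the observed event rate.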
All analyses were performed using R software (R Foundation for Statistical Computing Platform, Version 1.0.136, © 2009–2016 RStudio, Inc.). The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis statement was used for the analyses and reporting [44,45].

Population characteristics
CRC survivors participating in the EnCoRe and COLON studies were on average 66 years old and 5.2 months postdiagnosis, and 35% were female (Table 1). Whereas the two studies were similar regarding most predictors, participants of the EnCoRe study had a slightly higher number of comorbidities, more frequently had a stoma, more often received radiotherapy, and less often adhered to physical activity guidelines. In the pooled cohort (N = 1,596), 11–19% of participants were categorized as having low HRQoL at follow-up in the seven domains (Table 2), with slightly more participants showing consistently low HRQoL (50–59%) than deteriorating HRQoL between baseline and follow-up (36–50%). The EnCoRe and COLON studies were similar with regard to the prevalence of low HRQoL at follow-up (Table 2). Of all participants, 868 (54.4%) had complete HRQoL data at follow-up. Participants who had missing HRQoL data at follow-up (N = 728) were more often diagnosed with stage IV disease (10% vs. 3%) and less often with stage I disease (24% vs. 28%, P = 0.004), and less frequently had received radiotherapy (17% vs. 21%, P = 0.03). There were no differences regarding baseline HRQoL scores between the participants with missing vs. complete data. Differences between the development cohort and the external validation cohorts are shown in Supplementary Tables 2 and 3.

External validation
First, the original models (model 1A) were externally validated, yielding good discrimination (AUC: 0.77–0.84), sufficient measures of overall performance, and reasonable calibration (Table 3). The calibration plots showed that most models overestimated the risk of CRC survivors to score low on HRQoL domains (Supplementary Figure 1). After recalibration (i.e., updating the intercepts only or both the intercepts and slopes of the models; models 1B and 1C, respectively), the calibration of the separate models improved, as the predictions were in closer agreement with observed relative frequencies of low HRQoL shown in the calibration plots (Supplementary Figure 1). Models with updated intercepts and slopes (model 1C) showed the best calibration for most HRQoL domains and were therefore selected for the model updating.

Model updating
In order to obtain a single set of predictors for the seven HRQoL domains (model 2A), we removed the following predictors from model 1C: all EORTC QLQ-CR29 symptom scales, negative affectivity and social inhibition scores, tumor stage, adherence to meat consumption guidelines, and current working status. The updated models thus included the following unified set of 15 predictors: age, sex, education, comorbidities, time since diagnosis, stoma, body mass index, physical activity, anxiety, depression, chemotherapy, radiotherapy, smoking, and baseline fatigue and HRQoL scores (Table 4). Discrimination did not change considerably following removal of predictors (AUC: 0.77–0.85), and overall model performance measures were similar to those of the original models (Table 3). The calibration plots showed that the models now underestimated the risk of scoring low on HRQoL (Supplementary Figure 1). Updating the intercept (model 2B) improved the calibration plots for the separate HRQoL domains, with most predictions now lying close to the diagonal, indicating good agreement between predicted probabilities and relative frequencies of low HRQoL. As updating the slope in addition to the intercept (model 2C) did not further improve the calibration, model 2B was chosen as the final updated model.

Measures of classification
Sensitivities and specificities of the updated models (model 2B) were derived at different thresholds of predicted risks of low HRQoL (plotted in Supplementary Figure 2). Sensitivity ≥80% was reached with the following optimal threshold probabilities as cut-off for positive predictions for the separate HRQoL domains (indicated by the gray areas in Supplementary Figure 2): 5% for role functioning; 5–10% for global QoL, cognitive functioning, emotional functioning, and fatigue; 10–15% for social functioning; and 5–15% for physical functioning.
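To make the threshold selection concrete, the sketch below computes sensitivity and specificity at candidate risk thresholds and picks the highest threshold that still reaches the required sensitivity. The selection rule, function names, and toy data are assumptions for illustration only; the paper reports threshold ranges per domain and does not spell out a single-value rule.

```python
def sensitivity_specificity(probs, outcomes, threshold):
    """Classify as positive when predicted risk >= threshold and return
    (sensitivity, specificity)."""
    tp = sum(1 for p, o in zip(probs, outcomes) if o == 1 and p >= threshold)
    fn = sum(1 for p, o in zip(probs, outcomes) if o == 1 and p < threshold)
    tn = sum(1 for p, o in zip(probs, outcomes) if o == 0 and p < threshold)
    fp = sum(1 for p, o in zip(probs, outcomes) if o == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def highest_threshold_meeting_sensitivity(probs, outcomes, candidates, min_sensitivity=0.80):
    """Return the largest candidate threshold whose sensitivity is at least
    min_sensitivity, or None if no candidate qualifies (assumed rule)."""
    eligible = [
        t for t in candidates
        if sensitivity_specificity(probs, outcomes, t)[0] >= min_sensitivity
    ]
    return max(eligible) if eligible else None


# Toy data (hypothetical): predicted risks and observed low HRQoL (1 = low).
probs = [0.02, 0.04, 0.08, 0.12, 0.20, 0.30, 0.07, 0.15, 0.40, 0.03]
outcomes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
candidates = [0.05, 0.10, 0.15, 0.20]

chosen = highest_threshold_meeting_sensitivity(probs, outcomes, candidates)
print(chosen, sensitivity_specificity(probs, outcomes, chosen))
```

In the toy data only the 5% threshold keeps sensitivity at 80%, at the cost of a specificity of 40%, which mirrors the trade-off of prioritizing sensitivity over specificity.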

Discussion
We externally validated previously developed prediction models for estimating the 1-year risk of low HRQoL in seven domains in CRC survivors within 6 months after diagnosis, using pooled data from two prospective cohorts. In addition, the models were updated by removing several predictors to yield a single predictor set for all HRQoL domains, producing downsized models that are more parsimonious and therefore easier to implement in oncology practice. All updated models had satisfactory predictive power after recalibration for identifying individual CRC survivors at increased risk of low future HRQoL in the seven domains, as indicated by adequate measures of overall model performance, discrimination, calibration, and classification. The updated models included 15 predictors in total, containing both nonmodifiable (age, sex, education, time since diagnosis, chemotherapy, radiotherapy, stoma, and comorbidities) and modifiable predictors (body mass index, physical activity, smoking, anxiety and depression, and baseline fatigue and HRQoL domain scores).

Notes to Table 2:
a Percentages are shown of the total number of participants with valid data.
b Higher scores (range 0–100) on global QoL and functioning domains represent better HRQoL, whereas higher fatigue scores represent worse fatigue complaints.
c Persons were classified as having "low HRQoL" when their follow-up score differed ≥1 minimal important deterioration (MID) [19] from baseline mean.
d Persons had consistently low HRQoL when both their baseline and follow-up scores were ≥1 MID above/below the baseline mean. Persons had deteriorating HRQoL when they decreased from normal/high HRQoL at baseline to a follow-up score ≥1 MID below/above the baseline group mean (below mean for global quality of life and functioning domains; above mean for fatigue). The numbers of subjects with consistently low and deteriorating HRQoL do not always add up to the total number in the low HRQoL group due to missing values at baseline.
The prediction models that had previously been developed in long-term CRC survivors (approximately 5 years postdiagnosis) [14] were now externally validated and updated in short-term CRC survivors (approximately 5 months postdiagnosis), to assess whether models were generalizable to a time frame situated closer to the diagnosis and treatment during which larger declines in HRQoL are expected [7]. We indeed observed that CRC survivors at baseline in the cohorts used for the current analyses reported somewhat lower HRQoL and functioning scores and higher fatigue scores on average than in the development cohort (Supplementary Table 2). This can likely be explained by the fact that having recently been diagnosed with and treated for cancer has a larger impact on an individual's physical and psychosocial well-being in the short term than in the long term. In the initial period after diagnosis and treatment, CRC survivors need time to mentally process and cope with the cancer experience as a major life event (e.g., fears of cancer recurrence and death) and to adjust to changes in their lives as a result of the cancer and treatment (e.g., living with a stoma and dealing with treatment complications). Although the prevalence of low HRQoL at follow-up was lower in all HRQoL domains in the current study than in the development study (Supplementary Table 2), the percentages of survivors showing a clinically relevant deterioration of HRQoL scores between baseline and follow-up were higher in the current study than in the development study. This would suggest that the early post-treatment period within 6 months after CRC diagnosis is a relevant time frame in the survivorship trajectory for the application of the risk prediction models to select high-risk survivors eligible for interventions aimed at HRQoL promotion. This time frame is considered a teachable moment during which survivors are open to interventions, providing a window of opportunity for taking tailor-made preventive actions.

Notes to Table 3:
a AUC = Area under the Receiver Operator Characteristic curve; 95% confidence intervals (AUC ≥0.80 indicates good discrimination).
b Nagelkerke's R² is a measure of explained variance, ranging from 0 to 1 (higher is better).
c Brier score is a measure of model accuracy, ranging from 0 (perfect) to 0.25 (worthless accuracy).
d Calibration-in-the-large is the difference between the predicted probability and the observed relative frequency of low HRQoL.
e H–L = Hosmer–Lemeshow goodness-of-fit test is an indicator of calibration (agreement between observed and predicted values); P > 0.05 represents a well-calibrated model (i.e., nonsignificant disagreement between observed and predicted values).
The original prediction models included biopsychosocial predictors that had been selected based on evidence from studies showing associations with HRQoL, as summarized in our previously published systematic literature review [16]. For the updated models, we selected predictors for which the evidence in the systematic review was strongest, and that at the same time are relatively easy to assess in clinical practice (e.g., no long questionnaires or burdensome tests required). Moreover, the models have the advantage that they contain several modifiable predictors (5 out of 15), including lifestyle and psychosocial factors which have been shown to affect HRQoL in previous intervention studies [17–24]. It must be noted, however, that causal interpretations of relations between predictors and outcomes in risk prediction models should be made with great caution, as the interpretation of such models should focus on the predictor set as a whole instead of focusing on isolated predictors. Prediction models are not meant to help unravel which predictors actually cause an outcome, such as low HRQoL.

Rows from Table 4 (each pair of values is a regression coefficient followed by its odds ratio, one pair per HRQoL domain):
Chemotherapy (no = 0; yes = 1): 0.15 1.16; 0.00 1.00; 0.00 1.00; 0.00 1.00; 0.00 1.00; −0.18 0.83; 0.00 1.00
Radiotherapy (no = 0; yes = 1): 0.16 1.17; 0.00 1.00; 0.00 1.00; 0.00 1.00; 0.00 1.00; 0.00 1.00; 0.00 1.00
Smoking (no = 0; yes = 1): 0.21 1.23; 0.00 1.00; 0.30 1.35; 0.37 1.45; 0.00 1.00; 0.29 1.34; 0.51 1.66
a Regression coefficients are the ln (odds) of change in the outcome, and can be used to calculate the probability of having low health-related quality of life = 1/(1 + exp[−linear predictor]); linear predictor = intercept + sum of (regression coefficient × predictor value); for categorical and dichotomous predictors the reference category was always coded as 0.
b Odds ratios (OR) are shown to give an estimation of the strength of each predictor, but no confidence intervals or error measures could be calculated after recalibration of the slopes of the models, i.e., the shrinkage of regression coefficients. Note that the magnitude of the OR depends on the scale on which the predictor is measured and that the OR should not be causally interpreted.
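The formula given in the Table 4 footnote (probability = 1/(1 + exp[−linear predictor]), with the linear predictor equal to the intercept plus the sum of coefficient times predictor value) can be written as a short Python sketch. The intercept, the coefficients for age and smoking, and the patient values below are hypothetical and serve only to show the arithmetic; the actual model 2B coefficients are those reported in Table 4.

```python
import math


def predicted_risk(intercept, coefficients, predictors):
    """Return the predicted probability of low HRQoL from a logistic model.

    coefficients and predictors are dicts keyed by predictor name;
    dichotomous predictors are coded 0 (reference) or 1.
    """
    linear_predictor = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model and patient, for illustration only.
intercept = -2.0
coefficients = {"smoking": 0.5, "age_years": 0.01}
patient = {"smoking": 1, "age_years": 50}

risk = predicted_risk(intercept, coefficients, patient)
print(round(risk, 3))  # linear predictor = -2.0 + 0.5 + 0.5 = -1.0

# An odds ratio is the exponentiated coefficient; for example, the smoking
# coefficient of 0.21 listed in Table 4 corresponds to an odds ratio of 1.23.
print(round(math.exp(0.21), 2))
```

A risk calculator or nomogram of the kind discussed in the text would wrap exactly this computation, once per HRQoL domain, around the published intercepts and coefficients.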
Information on potentially effective interventions for improving HRQoL in CRC survivors is provided by several systematic reviews. Multiple studies have reported increased HRQoL after physical, psychological, or behavioral interventions in cancer survivors, including CRC [17–24]. Most evidence is available for exercise interventions, indicating that increasing post-treatment physical activity levels can improve HRQoL and functioning domains and decrease fatigue complaints. There is also evidence for beneficial effects of psychosocial interventions, such as cognitive behavioral therapy and psychoeducational or mindfulness-based interventions, on HRQoL and mental functioning domains. Nevertheless, some studies found less consistent effects of psychosocial interventions [46] or reported that not all domains of HRQoL were improved after exercise [18]. Future studies should therefore investigate whether the application of the prediction models for HRQoL can help to make a better selection of CRC survivors for specific interventions. For that purpose, so-called clinical impact studies are needed to evaluate whether application of the prediction models in oncology practice facilitates decision-making regarding appropriate interventions, and thereby contributes to improving HRQoL outcomes in high-risk CRC survivors. In addition, to promote implementation of the prediction models in practice, they should ideally be transformed into, for example, an online risk calculator or nomogram for clinicians to use for estimating their patient's 1-year risks of low HRQoL in the seven domains. Such tools are likely to be very relevant as they provide individual risk profiles that may guide subsequent risk-stratified follow-up care and that could also be used for doctor-patient communication about future risks and (modifiable) predictors. Application of such tools, which includes the assessment of predictors as input for the risk estimation, is becoming more feasible in practice nowadays due to enhanced digitalization of patient records within hospitals and increased attention for patient-reported outcome measures (such as HRQoL scales) as part of patient care.
A strength of the current study was its large sample size. Despite the smaller number of persons with low HRQoL at follow-up in the current study as compared to the development study, statistical power was adequate for the external validation according to the recommended minimum of 100 events and 100 nonevents for all HRQoL outcomes [44,45]. Another strength of the study is that we succeeded in downsizing the prediction models without losing predictive performance, so that a single set of predictors can be used for accurate estimation of 1-year risks of low HRQoL in multiple domains. Some limitations should also be mentioned. First, some predictors of the original models were not or only partially available in the external validation cohorts (i.e., nine in the COLON and two in the EnCoRe cohort). As recommended based on results from simulation studies [38,39], we imputed missing predictors by adding the data from the cohort that was used for model development to the external validation datasets during the multiple imputation procedure, accounting for differences in time since diagnosis. Even though the missing predictors were eventually not included in the updated models, we cannot exclude the possibility that this procedure might have introduced bias in the external validation part of the analyses. A second potential limitation was that it was not known whether, and if so how many, participants of the COLON study still received treatment at the time point that was used as study baseline for the current analyses (i.e., 6 months postdiagnosis). Some COLON study participants could thus still have been receiving treatment (chemotherapy) at that time, which might have influenced HRQoL scores at study baseline. Finally, the prediction models were developed and validated in Dutch populations of CRC survivors. This may limit the generalizability and applicability of our findings to CRC survivor populations in other countries. External validation in populations other than Dutch CRC survivors is therefore recommended.

Conclusion
In CRC survivors within 6 months after diagnosis, we have externally validated and updated risk prediction models for estimating the 1-year risk of low HRQoL in multiple domains. To the best of our knowledge, these are the first validated models for predicting future HRQoL in the specific population of CRC survivors. Based on a parsimonious set of evidence-based biopsychosocial predictors, the performance of the models to provide accurate risk predictions and to identify high-risk CRC survivors was satisfactory. The models could be used to develop risk prediction tools that are applicable in oncology practice for identifying individual CRC survivors who have an increased risk of experiencing low HRQoL in the future and who may benefit from interventions aimed at improving or safeguarding their HRQoL. Clinical impact studies are warranted in order to evaluate the added value of applying the models in practice before the models can be implemented.

CRediT authorship contribution statement
MB, MW, SvK, and DR have contributed to the conception and design of the work. DR, SvK, LS, and MB have done the analyses and interpretation of the data and results. All authors have contributed to the writing of the manuscript and have approved the final version.

Fig. 1. Flowchart of participants from study enrollment to the baseline and follow-up measurements within the two prospective cohort studies used for the present analyses: the EnCoRe study and the COLON study. The baseline for prediction was approximately 4–5 months postdiagnosis in the pooled cohort. Health-related quality of life domains as outcomes for prediction were measured on average approximately 1 year after baseline. Abbreviations: EnCoRe, Energy for life after ColoRectal cancer; COLON, COlorectal cancer: Longitudinal, Observational study on Nutritional and lifestyle factors.

Table 1. Predictors measured at baseline in the two nonimputed sets of the EnCoRe (N = 276) and COLON study (N = 1,320) separately and in the pooled dataset (N = 1,596)
Abbreviations: EnCoRe, Energy for life after ColoRectal cancer; COLON, COlorectal cancer: Longitudinal, Observational study on Nutritional and lifestyle factors; SD, standard deviation.
a Baseline was 6 wk after the end of treatment in EnCoRe, whereas it was ~6 mo after diagnosis in COLON.
b Percentages are shown of the total number of participants with valid data.
c For all complaints a higher score (range 0–100) represents more complaints. A sum score (range 0–500) is created by summing up separate scores of gastrointestinal problems, nausea/vomiting, constipation, defecation, and diarrhea.
d Adherence to physical activity guidelines defined as ≥30 min/day of moderate-to-vigorous intensity physical activity on ≥5 days/wk.
e Adherence to WCRF/AICR meat consumption guidelines, defined as eating <500 g meat per week.

Table 2. Health-related quality of life (HRQoL) domains at baseline and follow-up of the two nonimputed sets of the EnCoRe (N = 276) and COLON study (N = 1,320) separately and in the pooled dataset (N = 1,596)
Abbreviations: EnCoRe, Energy for life after ColoRectal cancer; HRQoL, health-related quality of life; COLON, COlorectal cancer: Longitudinal, Observational study on Nutritional and lifestyle factors; SD, standard deviation.

Table 3. Model performance measures of prediction models for seven domains of health-related quality of life: performance measures of original models and updated models after external validation and recalibration are presented

Table 4. Regression coefficients and odds ratios of included predictors in models 2B for seven domains of health-related quality of life