Comorbidity Risk Score in Association with Cancer Incidence: Results from a Cancer Screenee Cohort

The combined effects of comorbidities may contribute to cancer incidence, even when the effects of individual conditions alone do not. This study was conducted to investigate the joint impact of comorbidities on cancer incidence. The dietary score for energy-adjusted intake was calculated by applying a Gaussian graphical model and was then categorized into tertiles representing light, normal, and heavy eating behaviors. The risk point for cancer, according to the statuses of blood pressure, total cholesterol, fasting glucose, and glomerular filtration rate, was computed from a Cox proportional hazards model adjusted for demographics and eating behavior. The comorbidity risk score was defined as the sum of the risk points for the four comorbidity markers. Finally, we quantified the hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between the strata of the comorbidity risk score and cancer incidence. A total of 13,644 subjects were recruited from the Cancer Screenee Cohort from 2007–2014. The comorbidity risk score was associated with cancer incidence in a dose-dependent manner (HR = 2.15, 95% CI = 1.39, 3.31 for those scoring 16–30 vs. those scoring 0–8, P-trend < 0.001). Subgroup analysis still showed significant dose-dependent relationships (HR = 2.39, 95% CI = 1.18, 4.84 for males and HR = 1.99, 95% CI = 1.11, 3.59 for females, P-trend < 0.05). In summary, there was a dose-dependent impact of comorbidities on cancer incidence.
Highlights: Previous studies have generally reported that hypertension, hypercholesterolemia, diabetes, and chronic kidney disease might predispose patients to cancer. Combining these chronic diseases into a single score, this study found a dose-dependent association between the data-driven comorbidity risk score and cancer incidence.


Introduction
Cancer has been a global health issue over the past few decades. According to GLOBOCAN, there were an estimated 18.1 million new cancer cases and 9.6 million cancer-related deaths in 2018 [1]. In addition to cancer, other noncommunicable diseases, including cardiovascular disease (CVD), chronic respiratory diseases, and diabetes, are leading causes of death worldwide [2,3]. Although cancer and these chronic conditions are known to share common risk factors, certain comorbidities might independently contribute to the development of cancer [4]. To date, several studies have found a large number of risk factors for cancer. The International Agency for Research on Cancer, which classified these carcinogens into different groups based on their relationship with cancer risk, did not consider comorbidity to be a risk factor for cancer [5]. Additionally, the cancer prevention program mostly focuses on lifestyle risk factors such as nutrition and physical activity [4].

Study Population
The details of the Cancer Screenee Cohort are described elsewhere [21]. Between October 2007 and July 2014, a total of 13,644 subjects completed the written consent form and volunteered to provide information on their medical history, clinical test results, and dietary consumption (Figure 1). Those with missing data on dietary intake or with unrealistic data for energy intake (<500 or >4000 kcal) were excluded. Among the 8597 individuals identified at baseline, 612 subjects who were previously diagnosed with cancer and 2342 subjects who had missing values regarding demographics and comorbidities were additionally excluded. Because cancer has a long latency, it was possible for a tumor to be present but clinically undetected at the time of participation in the study. Hence, we excluded subjects diagnosed with cancer within 1 year to minimize reverse causality. As a result, a total of 5606 subjects were included in the final analysis. The Institutional Review Board of the National Cancer Center approved this study protocol (number NCCNCS-07-077).

Variable Definition
Cancer incidence was identified via diagnosis and classified according to the International Classification of Diseases-10 codes. Time to cancer incidence was defined as the interval between the date of study enrollment and the date of receiving a new diagnosis of cancer. The censoring date was defined as the last follow-up date (31 December 2016) or the date of death from a noncancer cause.
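The time-to-event definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `follow_up` and `is_early_case`, the use of 365.25 days per year, and the strict "less than one year" boundary are assumptions introduced here. Only the administrative censoring date (31 December 2016) and the one-year exclusion rule come from the text.

```python
from datetime import date
from typing import Optional, Tuple

# Last follow-up date stated in the text.
ADMIN_CENSOR_DATE = date(2016, 12, 31)


def follow_up(enrolled: date,
              cancer_dx: Optional[date] = None,
              noncancer_death: Optional[date] = None) -> Tuple[float, int]:
    """Return (years of follow-up, event indicator) for one subject.

    event = 1 for an incident cancer diagnosis, 0 for a censored subject.
    """
    # Censoring date: the earlier of the administrative end of follow-up
    # and a death from a noncancer cause, if any.
    censor = ADMIN_CENSOR_DATE
    if noncancer_death is not None and noncancer_death < censor:
        censor = noncancer_death

    if cancer_dx is not None and cancer_dx <= censor:
        end, event = cancer_dx, 1
    else:
        end, event = censor, 0

    # 365.25 days per year is an illustrative convention (assumption).
    years = (end - enrolled).days / 365.25
    return years, event


def is_early_case(years: float, event: int) -> bool:
    """Flag cancers diagnosed within 1 year of enrollment (boundary assumed strict)."""
    return event == 1 and years < 1.0
```

For example, `follow_up(date(2010, 1, 1), cancer_dx=date(2010, 6, 1))` yields an event that `is_early_case` would flag for exclusion from the main analysis.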
Regarding comorbidities, we selected blood pressure, total cholesterol, fasting glucose, and the glomerular filtration rate (GFR), together with self-reported hypertension and diabetes. Following the Joint National Committee guidelines, blood pressure (mmHg) was divided into normal (<120/80, not receiving therapy and not previously diagnosed with hypertension), prehypertension (120-139/80-89, not receiving therapy and not previously diagnosed with hypertension), and hypertension (≥140/90, receiving therapy or previously diagnosed with hypertension) [22]. Total cholesterol (mg/dL) was divided into low (<180, not taking drugs and not previously diagnosed with dyslipidemia), normal (180-200, not taking drugs and not previously diagnosed with dyslipidemia), and elevated (>200, taking drugs or previously diagnosed with dyslipidemia) [23]. Fasting glucose (mg/dL) was classified as normal (<110, no underlying treatment and not previously diagnosed with diabetes) or prediabetes and diabetes (≥110, underlying treatment or previously diagnosed with diabetes). The GFR, which was calculated by using the Modification of Diet in Renal Disease equation [24], was categorized into ≥90, 60-89, and <60 mL/min/1.73 m².
The covariates included age (years), sex (male and female), marital status (married, cohabitant and others), education (<high school, high school graduate, and ≥college), monthly income (<2 million, 2-4 million, and ≥4 million South Korean won (KRW)), smoking (never, past, and current), drinking (never, past, and current), physical activity (yes and no), body mass index (BMI) (<23, 23-24.9, and ≥25 kg/m²), and eating behavior (light, normal, and heavy). Details of these co-factors of medical history are summarized in Tables A1 and A2. Figure A1 shows how the dietary intake score was calculated.
Briefly, we applied a Gaussian graphical model (GGM) to identify the dietary network, which was composed of 16 food groups as nodes and their pairwise partial correlations as edges. The network structure was estimated using extended Bayesian information criterion (EBIC) model selection, with the tuning parameter set at 0.5 [25,26]. The eigenvector centrality of each node in the GGM-identified network was used as that node's weight [27], and the dietary intake score was defined as the weighted sum of the amounts consumed across the food groups. The dietary intake score therefore indicates the total energy-adjusted intake (in g/day) for the whole diet of the 16 food groups; a higher score represents a higher amount of dietary consumption (g/day) after weighting the 16 food groups. The dietary intake score was then categorized into low, medium, and high tertiles, which correspond to light, normal, and heavy eating behaviors, respectively.
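The scoring steps described above (node weights from eigenvector centrality, a weighted sum of intakes, then tertiles) can be sketched as follows. This is a hedged, stdlib-only illustration rather than the authors' pipeline. It takes an already estimated partial correlation matrix as input and does not perform the EBIC-based network estimation. The function names, the use of absolute partial correlations as edge weights, the scaling of the largest centrality to 1, the tertile boundary convention, and the three-node toy network are all assumptions made for the example.

```python
import statistics
from typing import List, Sequence


def eigenvector_centrality(partial_corr: Sequence[Sequence[float]],
                           max_iter: int = 1000,
                           tol: float = 1e-12) -> List[float]:
    """Eigenvector centrality of an undirected weighted network.

    Edge weights are absolute partial correlations (assumption); the
    result is scaled so that the most central node has weight 1.
    """
    n = len(partial_corr)
    adj = [[abs(partial_corr[i][j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    vec = [1.0] * n
    for _ in range(max_iter):
        # Power iteration on (A + I): same eigenvectors as A, but the
        # shift prevents oscillation on bipartite-like networks.
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        top = max(nxt)
        nxt = [v / top for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec


def dietary_intake_score(intake_g_per_day: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Weighted sum of food-group intakes (g/day)."""
    return sum(w * x for w, x in zip(weights, intake_g_per_day))


def eating_behavior(scores: Sequence[float]) -> List[str]:
    """Label each score by tertile: light, normal, or heavy."""
    low_cut, high_cut = statistics.quantiles(scores, n=3)
    labels = []
    for s in scores:
        if s <= low_cut:
            labels.append("light")
        elif s <= high_cut:
            labels.append("normal")
        else:
            labels.append("heavy")
    return labels


# Hypothetical 3-node network: node 1 is linked to nodes 0 and 2.
toy_network = [[0.0, 0.5, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.5, 0.0]]
toy_weights = eigenvector_centrality(toy_network)
```

In the toy network the middle node receives the largest weight because it is the only node connected to both of the others, which is the behavior one would expect of a centrality-based weighting.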

Statistical Analysis
We computed the comorbidity risk score following the procedure in Sullivan et al. [28], which was developed based on the Framingham Heart Study. Briefly, we used a Cox regression model to obtain the coefficient of cancer incidence related to each chronic condition, adjusting for age and other demographic variables. Then, the risk point corresponding to each chronic condition was obtained by dividing the regression coefficient for that condition by the coefficient for a one-year increase in age. The comorbidity score for each individual was defined as the sum of the risk points for the four chronic conditions. The differences between the cancer and noncancer groups and among groups with different comorbidity scores according to anthropometric factors and the 16 food groups were explored using chi-square tests for categorical variables and t-tests and ANOVA for continuous variables. Time to cancer incidence according to the comorbidity risk score was described using Kaplan-Meier estimates. The association between the comorbidity risk score and cancer incidence was finally investigated using the Cox proportional hazards model. We assigned the mean value of the comorbidity risk score to each stratum to test the linear trend across the strata.
Given that a previous study did not support the association between total cholesterol and cancer risk [29], we excluded total cholesterol from the comorbidity risk score in the sensitivity analysis. We also performed the analysis while considering all the participants, regardless of the time to cancer diagnosis, and compared the results with our main findings.
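The risk-point step of the procedure can be illustrated with a short sketch. The hazard ratios used below (1.56 for hypertension and 1.51 for normal total cholesterol) are the fully adjusted estimates reported in the results. The age coefficient `ASSUMED_BETA_AGE` is not reported in this text and is a purely illustrative value. Rounding to the nearest integer is also an assumption, and `risk_points` is a hypothetical helper name.

```python
import math

# Illustrative value only: the Cox coefficient for a one-year increase
# in age is not reported in this text (assumption).
ASSUMED_BETA_AGE = 0.054


def risk_points(hazard_ratio: float, beta_age: float = ASSUMED_BETA_AGE) -> int:
    """Risk points = Cox coefficient of the condition / coefficient per year of age."""
    beta = math.log(hazard_ratio)   # Cox coefficient is the log hazard ratio
    return round(beta / beta_age)   # rounding to an integer is an assumption


points_hypertension = risk_points(1.56)
points_normal_cholesterol = risk_points(1.51)
```

Under the assumed age coefficient, each point corresponds to roughly the excess hazard associated with one additional year of age, which is what makes points from different conditions additive on a common scale.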

Dietary Scores of Study Participants
The partial correlation network of dietary intake was identified from the GGM (Figure 2). The partial correlation of diet consumption (g/day) between two food groups, controlling for the remaining food groups in the network, is reported in Table 1. The strongest regularized partial correlation was observed between 'oils and fats' and 'sugars and sweets' (0.70), followed by 'seasonings' and 'potatoes and starches' (0.37) or 'vegetables' (0.34).
The amount of daily consumption and the weights of the 16 food groups are reported in Table 2. The intake of the 16 food groups was not significantly different between the cancer incidence and noncancer groups (p > 0.05). The total energy-adjusted intake was 295.1 ± 69.0, 483.2 ± 54.5, and 834.4 ± 276.1 g/day for subjects with light, normal, and heavy eating behaviors, respectively.

Figure 2. Nodes reflect food groups, and edges reflect the conditional dependencies between food groups. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of the edges represents the strength of correlations.
Table 3 presents the associations between individual comorbidities and cancer incidence. A borderline increased incidence of cancer was observed in those with hypertension (adjusted HR = 1.53, 95% CI = 1.02, 2.29). In the model fully adjusted for all covariates, significant associations were observed for hypertension (HR = 1.56, 95% CI = 1.02, 2.39) and normal total cholesterol (HR = 1.51, 95% CI = 1.03, 2.19). The highest risk point was assigned to subjects with a GFR ≥90 mL/min/1.73 m² (score = 9). Intermediate risk points were assigned to those with hypertension (score = 8) and normal cholesterol (score = 8), followed by low cholesterol (score = 5) and prediabetes and diabetes (score = 5). Subjects with elevated blood pressure and those with a GFR of 60-89 mL/min/1.73 m² were at low risk (score = 2).
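To make the scoring concrete, the sketch below sums the risk points reported above into an individual comorbidity risk score and assigns a stratum. The point values come from the text. Assigning 0 points to the remaining categories (treated here as reference groups) and labelling the middle stratum as 9-15 are inferences made for illustration, since the text names only the 0-8 and 16-30 strata. The dictionary keys and function names are hypothetical.

```python
# Risk points reported in the text; categories set to 0 are assumed
# reference groups (assumption, not stated explicitly in the text).
RISK_POINTS = {
    "blood_pressure": {"normal": 0, "prehypertension": 2, "hypertension": 8},
    "total_cholesterol": {"low": 5, "normal": 8, "elevated": 0},
    "fasting_glucose": {"normal": 0, "prediabetes_or_diabetes": 5},
    "gfr": {">=90": 9, "60-89": 2, "<60": 0},
}


def comorbidity_risk_score(status: dict) -> int:
    """Sum the risk points of the four comorbidity markers for one subject."""
    return sum(RISK_POINTS[marker][status[marker]] for marker in RISK_POINTS)


def score_stratum(score: int) -> str:
    """Map a score to a stratum; the 9-15 label is inferred, not quoted."""
    if score <= 8:
        return "0-8"
    if score <= 15:
        return "9-15"
    return "16-30"


highest_risk_subject = {
    "blood_pressure": "hypertension",
    "total_cholesterol": "normal",
    "fasting_glucose": "prediabetes_or_diabetes",
    "gfr": ">=90",
}
highest_score = comorbidity_risk_score(highest_risk_subject)
```

The maximum attainable total (9 + 8 + 8 + 5 = 30) matches the upper bound of the highest stratum reported in the study.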

Baseline Characteristics of the Study Participants
During the median follow-up time of 5.34 years (interquartile range (IQR) = 4.03-6.45, total 29,145 person-years), 176 patients were newly diagnosed with cancer. Cancer subjects were significantly older than noncancer participants, with ages at baseline of 55.8 ± 8.7 years and 52.5 ± 8.2 years, respectively (p < 0.001). Except for employment status (p = 0.02), demographic characteristics and GGM-identified dietary scores were not significantly different between the cancer and noncancer groups (Table 4). In contrast, most of the factors, including age, sex, education, employment, tobacco smoking, alcohol consumption, and BMI, were unequally distributed among groups stratified by comorbidity risk scores (p < 0.001).
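As a quick check on scale, the follow-up figures reported in this section (176 incident cases over 29,145 person-years) imply a crude incidence rate of about 6 per 1000 person-years. This rate is derived here for illustration and is not a figure reported by the study; `crude_incidence_rate` is a hypothetical helper.

```python
def crude_incidence_rate(cases: int, person_years: float, per: int = 1000) -> float:
    """Crude incidence rate per `per` person-years."""
    return cases / person_years * per


# Cases and person-years as reported in the text.
rate_per_1000 = crude_incidence_rate(176, 29145)
```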

Comorbidity and Cancer Incidence
The association between the comorbidity risk score and cancer incidence is detailed in Table 5. Compared with participants scoring 0-8, those scoring 16-30 had a 127% increased risk of cancer in the crude model (HR = 2.27, 95% CI = 1.49, 3.46). A significant positive association remained in the fully adjusted model (HR = 2.15, 95% CI = 1.39, 3.31). The linear trend test suggested a dose-dependent association between the comorbidity risk score and cancer incidence (p < 0.001).
In the subgroup analysis by sex, the cancer incidence was higher among subjects who scored 16-30 than among those who scored 0-8, with HRs (95% CIs) of 2.39 (1.18, 4.84) for males and 1.99 (1.11, 3.59) for females. A dose-dependent relationship was also observed in the sex-specific analysis (p = 0.01 for both males and females).
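The percentage statements in this section follow directly from the hazard ratios: an HR of 2.27 corresponds to a 127% higher hazard, and an HR of 2.15 to a 115% higher hazard. The sketch below shows that arithmetic and, as a standard back-calculation that assumes a symmetric Wald interval on the log scale, recovers the approximate standard error of the log hazard ratio from a reported 95% CI. The function names are hypothetical, and the derived standard error and z statistic are illustrative rather than values reported by the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def percent_increase(hazard_ratio: float) -> float:
    """Relative increase in hazard, in percent, versus the reference group."""
    return (hazard_ratio - 1.0) * 100.0


def log_hr_standard_error(ci_lower: float, ci_upper: float) -> float:
    """Approximate SE of log(HR), assuming a Wald-type 95% CI."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z_95)


crude_pct = percent_increase(2.27)      # crude model
adjusted_pct = percent_increase(2.15)   # fully adjusted model
adjusted_se = log_hr_standard_error(1.39, 3.31)
adjusted_z = math.log(2.15) / adjusted_se
```

The same two helpers apply to the sex-specific estimates; the wider intervals for males and females translate into larger standard errors, consistent with the smaller number of cases in each subgroup.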

Discussion
In this prospective cohort study, we found nonsignificant associations between comorbidity markers, including blood pressure, fasting glucose, and the GFR, and cancer incidence. However, after combining the simultaneous effect of the four comorbidities into one risk score, the risk of incident cancer was 115% higher among subjects who scored 16-30 than among those who scored 0-8. Sex-specific subgroup analyses also showed significant associations. The relationship was observed to be dose-dependent.
Although categories of blood pressure (p < 0.001), fasting glucose (p < 0.001), and chronic kidney disease (p = 0.01) were significantly different among total cholesterol groups (Table A7), a further analysis of excluding total cholesterol from the comorbidity risk score was performed. The comorbidity risk score for each category was, therefore, lower than those in the main analysis (Tables 5 and A4). The small number of cases among males who scored 0-2 might result in the large 95% CIs. Furthermore, nonsignificant associations were observed for the subgroup analysis of females. Thus, considering total cholesterol with other chronic conditions was necessary to detect the simultaneously significant effect of comorbidities on sex-specific cancer incidence.
In the sensitivity analysis that included participants regardless of time to cancer diagnosis, the point estimates and their 95% CIs for the association between individual comorbidities and cancer incidence tended to be closer to null than in the main analysis (Tables 3 and A5). The risk points according to the chronic conditions of blood pressure, total cholesterol, and fasting glucose, therefore, tended to increase after excluding subjects with a cancer diagnosis within 1 year (Tables 3 and A5). Additionally, while males whose comorbidity risk scores were in the highest category had a significantly higher risk of cancer (HR = 2.39, 95% CI = 1.18, 4.84, Table 5), the association was not observed when including early cases in the analysis (HR = 1.79, 95% CI = 0.92, 3.52, Table A6). These changes can be partially explained by the reverse causality effect of subjects with undetected tumors at baseline [30,31]. Thus, excluding early cases from our analysis reduced the underestimation of the associations.
Regarding the dietary score, this study applied the GGM, which has been widely used in research on genetics, psychology, and climate. Iqbal et al. also used the GGM to identify major dietary patterns and investigate their associations with type 2 diabetes, cardiovascular disease, and cancer [32]. However, the quantitative measurement of the dietary score using centrality indices of nodes in the network has not yet been performed. Other data-derived methods, such as principal component analysis and reduced rank regression, have been used for data dimension reduction purposes [33]. However, the GGM is considered a novel approach to describe the partial correlations between food groups [32].
Several dietary scores have been developed to elucidate the role of dietary intake in the etiology of cancer [34,35]. Recently, Lassale et al. systematically reviewed the available dietary indices, which were used to investigate the association of diet with depression; these included the Mediterranean diet, the healthy eating index, dietary approaches to stop hypertension, and the dietary inflammatory index [36]. Although different approaches might be associated in the same direction with a health outcome, the a posteriori scores of the factor analysis method were reported to be better than the a priori scores of common methods in the evaluation of coronary heart disease risk [37].
In terms of the comorbidity risk score, the Charlson comorbidity index and the Elixhauser score are commonly used to determine the severity of health conditions [38,39]. The Charlson comorbidity index was originally developed with 19 conditions, based on general internal medicine service claims and the mortality rate of 607 patients over one month [40]. The Elixhauser comorbidity index consists of 30 conditions based on the International Classification of Diseases diagnosis codes and was developed to predict hospital resource use and in-hospital mortality [41]. However, data on some rare conditions might not be available for our dataset from the general healthy population; thus, we developed the data-driven approach with a risk score algorithm. The comorbidity risk score derived from the Cox model has been widely applied in several studies to predict the risk of hepatocellular carcinoma in the US, Taiwanese, Japanese, Chinese, and Korean populations [42].
Despite its strengths, this study has several limitations. First, we used data from a cancer screening program of the National Cancer Center; thus, the sample might not represent the general Korean population. Second, macro- and micronutrients were not involved in our dietary score assessment models. As nutrients generally did not have the same unit, we were unable to combine all the nutrients in a single dietary score. Because food intake was considered a covariate and we focused on the effects of comorbidities, the daily amount of food consumed was used to reflect the dietary status of individuals. Third, the number of incident cases was limited due to the short follow-up duration, which resulted in wider 95% CIs for the sex-specific subgroups than for the entire study population. Additionally, we were not able to perform subgroup analyses by cancer type. Last, the history of chronic diseases and the drugs currently being taken were obtained by subject self-report only and not from medical records. However, the information was obtained via several separate but related questions to minimize recall bias. Regarding medical history, subjects were asked about the names of the drugs they were taking, the duration of use in years and months, and the frequency of use. Additionally, subjects were asked about the year when a chronic disease was first diagnosed and whether the disease had been treated or not. These self-reported data were combined with blood test results to classify the comorbidity status.
The evidence for the association between high cholesterol and cancer incidence is controversial. In addition, cholesterol was reported to activate oncogenic Hedgehog signaling and mTORC1, which might result in the differentiation and proliferation of cells, tumor formation, and metastasis [43]. Furthermore, abnormal cholesterol levels could impact the structure of lipid rafts, which are known to be vital structures involved in cancer signaling [43]. Epidemiological studies also support positive associations between cholesterol levels and breast, prostate, and colorectal cancers [44]. However, low cholesterol was reported to be associated with a higher risk of cancer in Korean, Japanese, Taiwanese, and Caucasian populations [4,23]. In contrast, Kim et al. performed a meta-analysis of approximately 65,000 individuals from randomized controlled trials, and no beneficial effect of cholesterol-lowering medications on cancer prevention was found (pooled relative risk 0.97, 95% CI = 0.92, 1.02) [45]. Subgroup analysis by cancer type, statin type, and country showed nonsignificant associations [45]. Nevertheless, our study showed a significant relationship between total cholesterol and overall cancer incidence.
Meta-analyses reported an approximately 30% increased risk of colorectal cancer among the population with diabetes mellitus [46,47]. The Korean Multicenter Cancer Cohort study conducted from 1993 to 2005 with a median follow-up of 12 years found that the joint consideration of fasting glucose and a history of diabetes mellitus was not significantly associated with colorectal cancer (HR = 1.54, 95% CI = 0.97, 2.43) [48], which was consistent with our findings. However, an increased risk of colorectal cancer was observed in diabetes patients when the duration of the follow-up was more than five years (HR = 1.61, 95% CI = 1.02, 2.56) [48].
Evidence from a recent comprehensive meta-analysis showed significant associations between blood pressure and the risks of kidney, colorectal, and breast cancers, but not the risks of cancer of the stomach, gallbladder, pancreas, lung, cervix, prostate, bladder, and brain [49]. Because of the limited number of cancer cases, we did not perform a subgroup analysis by cancer type. However, we still found a significant association between hypertension and total cancer incidence. Despite the nonsignificant association, the difference between the point estimates of the coefficient for the effect of elevated blood pressure and the coefficient for a one-year increase in age was substantial, and elevated blood pressure still contributed to the risk score.
In the current study, the highest comorbidity risk score was assigned to those who had a normal GFR ≥90 mL/min/1.73 m². Several studies have suggested that a high GFR is associated with an increased risk of cancer [4]. This could be explained by patients whose kidney function was mildly or moderately impaired, as a decrease in the GFR is more likely to develop due to other kidney-related diseases and not cancer. Similarly, although we did not observe a significant association between the GFR and cancer incidence in our study, a high GFR was assigned a high-risk score.

Conclusions
In summary, the current study found that comorbidities had a joint dose-dependent impact on cancer incidence. These findings may be helpful for the development of cancer prevention programs targeting the management of comorbidities. Further population-based prospective studies with substantial follow-up periods are needed to confirm the association among different cancer subtypes.

Funding: This study was supported by a grant from the National Cancer Center Korea (1910330). The funders had no role in the design, analysis, or writing of this article.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Cancers 2020, 12, x
The daily weight intake (g/day) of food items was estimated according to the consumption frequency (never or rarely, once a month, 2-3 times per month, once or twice a week, 3-4 times per week, 5-6 times per week, once a day, twice a day, and 3 times per day) and the portion size (small, medium, and large).
Data are presented as counts (percentages). p-values were obtained from chi-square tests.
Figure A1. Process for identification of dietary intake score.
Figure A2. Kaplan-Meier estimates of cancer-free probability by comorbidity risk scores after removing total cholesterol from comorbidity risk score.
Figure A3. Kaplan-Meier estimates of cancer-free probability by comorbidity risk scores in subjects regardless of time to cancer diagnosis.