Epidemiologic Associations Between Inflammatory Bowel Disease and Hodgkin Lymphoma or Multiple Sclerosis

Background and Aims: Epidemiologic evidence suggests that Hodgkin lymphoma (HL) and multiple sclerosis (MS) share a common set of risk factors with Crohn’s disease (CD) and ulcerative colitis (UC). It was hypothesized that such shared risk factors would lead to similar geographic distributions of these 4 diagnoses and their concurrence in identical patients.
Methods: All subjects with HL, MS, CD, or UC were identified in the complete Inpatient Standard Analytic File of the Centers for Medicare and Medicaid Services from 2018. In a cross-sectional study, we evaluated whether the frequencies of HL, MS, CD, and UC occurrences among different US states were statistically correlated with each other. In a case-control study, the observed concurrences of each 2 of the 4 diagnoses were compared with their expected frequencies in the overall Medicare population by calculating odds ratios with their 95% confidence intervals.
Results: The total Centers for Medicare and Medicaid Services population comprised 6,462,321 unique patients, of whom 8027 presented with HL, 42,934 with MS, 40,623 with CD, and 32,521 with UC. Statistically significant positive correlations (r) with P < .001 were found between HL and MS (r = 0.50), HL and CD (0.46), HL and UC (0.68), MS and CD (0.66), MS and UC (0.72), and CD and UC (0.68). Any inflammatory bowel disease was significantly associated with a diagnosis of concurrent HL (odds ratio: 1.22, 95% confidence interval: 1.01–1.48) or MS (1.35, 1.25–1.46).
Conclusion: The epidemiologic associations of inflammatory bowel disease with HL or MS may reflect a common pathway in the etiology or pathogenesis of these diseases.


Introduction
The causation of both types of inflammatory bowel disease (IBD), that is, Crohn's disease (CD) and ulcerative colitis (UC), is still unknown. The study of their epidemiology serves to reveal patterns of disease variation with respect to time, geography, or different demographic groups. It is hoped that eventually such epidemiologic patterns will provide clues about potential environmental risk factors that influence the occurrence of IBD. A particular risk factor could be revealed if it varied by geography, time, or demographics in a similar fashion as IBD itself. In general, large patient populations are needed to study the epidemiologic variations associated with relatively rare diagnoses, such as CD and UC.
Several previous epidemiologic studies have shown similar epidemiologic variations of Hodgkin lymphoma (HL), multiple sclerosis (MS), CD, and UC, which could suggest that infection with Epstein-Barr virus (EBV) also contributes to the etiology of IBD. A previous meta-analysis showed that patients with CD or UC harbor a 1.5-fold increased risk for concurrent MS.4 Case series and population-based cross-sectional studies suggested an increased risk for HL in patients with CD and UC.5,6 A case-control study in the US veteran population suggested that all 4 diagnoses tend to significantly coincide in identical patients.7 The long-term time trends of HL, MS, CD, and UC are characterized by strikingly similar temporal variations.8,9 Lastly, mortality data for the 4 diagnoses also revealed their similar geographic distributions within the United States as well as among different countries across the globe.10,11 We hypothesized that an epidemiologic study of associations among HL, MS, CD, and UC in the US Medicare population would confirm the previously observed relationships among the 4 diagnoses. The aim of the present study was to utilize the database of the Centers for Medicare and Medicaid Services (CMS) for the analysis of the geographic distributions of these 4 diagnoses across the US and to test whether these diagnoses concur in identical patients.

Methods
The study utilized the Inpatient Standard Analytic File of the CMS of the year 2018. The dataset from 2018 represented the most recent annual datafile from CMS that was still unaffected by the subsequent COVID-19 epidemic. The data reside in the public domain and can be requested through the CMS website. The data of individual patients were deidentified, and all analyses dealt with aggregate data only. For these reasons, the study was exempt from the need to obtain informed consent from individual patients or approval by the institutional review board.
The present study included all inpatient records within the 2018 data file. In addition to an admitting diagnosis and a principal diagnosis, each record could contain up to 25 diagnoses. De-identified participant numbers were used to aggregate multiple records belonging to the same patient into a single entry used for analysis. Aggregate participant entries used the demographic information pertaining to their first available record and all diagnoses accumulated during multiple inpatient encounters. The occurrence of any diagnosis was determined based on its coding according to the 10th revision of the International Classification of Diseases (ICD10). Patients with CD or UC were identified based on their corresponding ICD10 codes K50 and K51, respectively. Patients with HL were identified based on the ICD10 codes C81 or Z85.71; patients with MS were identified based on the ICD10 code G35. No further 4-character subcodes or any other diseases were considered in the present analysis. Besides the ICD10 codes, we also extracted demographic data pertaining to the patients' ethnicity (White, Black, Hispanic, Asian, or others), sex, and age. No data pertaining to medications were available in the data file.
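The aggregation of records into patient-level entries can be illustrated with a minimal Python sketch. It is not the authors' actual code: the record layout (keys `patient_id`, `age`, `sex`, `ethnicity`, `codes`), the function names, and the use of prefix matching on dot-free codes are all assumptions made for illustration. Only the ICD10 codes themselves (K50, K51, C81, Z85.71, G35) come from the text.

```python
# ICD10 code prefixes named in the text; the dictionary layout is an assumption.
DIAGNOSIS_PREFIXES = {
    "CD": ("K50",),
    "UC": ("K51",),
    "HL": ("C81", "Z8571"),
    "MS": ("G35",),
}


def normalize(code):
    """Uppercase a code and drop the dot so 'Z85.71' and 'Z8571' compare equal."""
    return code.replace(".", "").strip().upper()


def aggregate_patients(records):
    """Collapse inpatient records into one entry per patient.

    Each record is a dict with the hypothetical keys 'patient_id', 'age',
    'sex', 'ethnicity', and 'codes' (a list of ICD10 codes). Records are
    assumed to arrive in chronological order, so the first record of a
    patient supplies the demographic information.
    """
    patients = {}
    for rec in records:
        if rec["patient_id"] not in patients:
            patients[rec["patient_id"]] = {
                "age": rec["age"],
                "sex": rec["sex"],
                "ethnicity": rec["ethnicity"],
                "codes": set(),
            }
        # Accumulate all diagnoses across the patient's encounters.
        patients[rec["patient_id"]]["codes"].update(
            normalize(c) for c in rec["codes"]
        )
    # Flag each of the 4 diagnoses by matching code prefixes.
    for entry in patients.values():
        for name, prefixes in DIAGNOSIS_PREFIXES.items():
            entry[name] = any(code.startswith(prefixes) for code in entry["codes"])
    return patients
```

With this layout, a patient with one record coded K50.90 and a later record coded G35 would end up as a single entry flagged for both CD and MS.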
In a cross-sectional study, we evaluated whether the relative frequencies of HL, MS, CD, and UC occurrence among different states were statistically correlated with each other. Frequency of occurrence was calculated as the prevalence of patients with HL, MS, CD, or UC per 10,000 of the total population of all inpatients from the same state and year. Using linear regression analysis, we calculated 6 Pearson correlation coefficients (r) between the geographic distributions of each 2 of the 4 diagnoses of interest. We also used weighted regression analysis to adjust for the different population sizes in different states.
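The rate and correlation calculations can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the weighted variant assumes that each state's population size serves as its weight, which the text does not specify in detail.

```python
import math


def rate_per_10000(cases, population):
    """Prevalence expressed per 10,000 patients of the reference population."""
    return 10000.0 * cases / population


def pearson_r(x, y, weights=None):
    """Pearson correlation coefficient between two lists of state rates.

    If weights are given (assumed here to be state population sizes),
    weighted means, variances, and covariance are used instead.
    """
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    mean_x = sum(w * a for w, a in zip(weights, x)) / total
    mean_y = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - mean_x) * (b - mean_y) for w, a, b in zip(weights, x, y))
    var_x = sum(w * (a - mean_x) ** 2 for w, a in zip(weights, x))
    var_y = sum(w * (b - mean_y) ** 2 for w, b in zip(weights, y))
    return cov / math.sqrt(var_x * var_y)
```

For a simple linear regression with one predictor, the correlation coefficient returned by `pearson_r` equals the square root of the regression's coefficient of determination, with the sign of the slope. Calling the function once for each of the 6 pairs of diagnoses would yield the 6 coefficients described above.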
In a case-control study, we evaluated whether the observed concurrences of HL, MS, CD, and UC were statistically different from the expected concurrences based on the overall frequency of each individual diagnosis in the total Medicare population of the same year. For univariate comparisons of disease frequency alone or concurrently with another disease, we calculated odds ratios with their 95% confidence intervals. The concurrence of any 2 diagnoses was considered statistically significant if the 95% confidence interval of the corresponding odds ratio did not include unity. Differences in the demographic characteristics (age, sex, and ethnicity) between case and control subjects were assessed using t-tests or chi-square analysis. We also used multivariate logistic regressions to adjust the odds ratios for the potential confounding influences of demographic characteristics (age, sex, and race/ethnicity).
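The univariate odds ratio and its significance criterion can be sketched in a few lines. The text does not state which interval method was used, so the sketch assumes the common log-based (Woolf) approximation; the function names and the 2-by-2 cell labels are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% confidence interval.

    Cells of the 2-by-2 table (all must be greater than zero):
      a: patients with both diagnoses
      b: patients with the first diagnosis only
      c: patients with the second diagnosis only
      d: patients with neither diagnosis
    The interval uses the log-based (Woolf) standard error, which is an
    assumption and not necessarily the method used in the study.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def is_significant(lower, upper):
    """Significant when the confidence interval does not include unity."""
    return lower > 1.0 or upper < 1.0
```

The same odds ratio can come out significant or not depending on sample size, because the standard error shrinks as the cell counts grow. This is why weak associations can still reach significance in a population of several million patients.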

Results
The Medicare data file of 2018 contained 10,982,347 records of 6,462,321 unique patients, of whom 40,623 presented with CD, 32,521 with UC, 8027 with HL, and 42,934 with MS. The comparison (control) population of Medicare patients comprised 6,340,437 unique patients without any of the 4 diagnoses. Table 1 contains additional stratifications of the data by age, sex, and ethnicity. Patients with either MS or CD tended to be younger than the control population (P < .001). The patient populations with CD, UC, and MS comprised more females, whereas the population with HL comprised fewer females than the comparison population (P < .001). Lastly, the fraction of Caucasians was slightly higher in all 4 disease groups than in the control population, the difference being more striking in IBD than in HL or MS (P < .01).
Table 2 contains the 4 disease populations and the total Medicare population stratified by state of residence. A group of 6003 patients resided outside the 50 states of the Union, Puerto Rico, or the District of Columbia. MS had the highest frequency of occurrence, followed by CD and UC. HL was the least frequent of the 4 diagnoses. For each diagnosis, the frequency rates of occurrence varied between 2.9-fold and 8.6-fold among different states, with HL and CD showing the smallest and largest state-related variation, respectively. The geographic distributions of disease prevalence were characterized by a slight north-south gradient, although with multiple exceptions to the rule. For instance, several of the northern states (Massachusetts, Michigan, New Hampshire, Oregon, Washington, and Wisconsin) were characterized by relatively higher rates, whereas several of the southern states (Alabama, Arizona, Arkansas, Georgia, Kentucky, and Tennessee) were characterized by relatively lower rates. Using the entire dataset of all 50 states, Puerto Rico, and the District of Columbia, statistically significant positive correlations (with P < .01) were found between HL and MS (r = 0.36), HL and CD (0.46), HL and UC (0.58), MS and CD (0.66), MS and UC (0.69), and CD and UC (0.68). The regression analyses including HL were partly compromised by the low patient counts in the smaller states. Restricting the analysis to the 35 states with the largest populations slightly improved the 6 correlation coefficients among the 4 diagnoses (Figure). Using a weighted regression analysis did not further improve the correlation coefficients.
Table 3 lists the concurrences of IBD with HL or MS. A diagnosis of CD was significantly associated with a concurrent diagnosis of HL or MS. A diagnosis of UC was significantly associated with a concurrent diagnosis of MS, but not HL. Overall, the diagnosis of any IBD was significantly associated with a concurrent diagnosis of HL or MS. No further improvement could be achieved by adjusting the odds ratios for the demographic characteristics.

Discussion
Using the electronic dataset of the entire Medicare inpatient population from 2018, the present epidemiologic analysis focused on potential associations between IBD and HL or MS. The investigators hypothesized that HL, MS, CD, and UC would be characterized by similar geographic variations within the US and that these 4 diagnoses would also tend to coincide more frequently in identical patients. Both hypotheses were confirmed by the results of the present analyses. In previous studies, the analysis of the geographic distribution was based solely on mortality data.10,11 The present analysis reveals that the similarity in the geographic distributions of the 4 diagnoses is found in other types of healthcare statistics as well. The coincidence of the 4 diagnoses in identical patients was initially observed in the US veteran population but had not yet been confirmed in other datasets of current healthcare records.7 Overall, the present results lend additional support to the hypothesis of a shared environmental risk factor or a common pathway in the pathogenesis of these 4 distinct diagnoses.
The long-term time trends of HL, MS, CD, and UC are shaped by strikingly similar birth-cohort patterns.8,9 Any type of birth-cohort pattern is highly suggestive of exposure to an environmental risk factor during early lifetime with long-lasting medical consequences. The risk factor itself or its lasting impact on pathophysiology can then affect the exposed subject's susceptibility to develop the disease many years after the initial exposure. Such patterns are frequently associated with bacterial or viral infections during childhood or early adulthood. In gastroenterology, the acquisition of Helicobacter pylori infection during childhood and the subsequent development of peptic ulcer or gastric cancer decades later represent a typical example of such behavior.12 A recent study of mortality from these 4 diagnoses in different countries revealed similar geographic patterns and significant correlations among their worldwide distributions.10 This study also suggested that the exposure to any relevant environmental risk factors must have started before the age of 5 years in UC and HL and before the age of 15 years in CD and MS.
The Medicare population is limited by its restriction to patients older than 65 years, whereas HL, MS, CD, and UC tend to mostly affect patients during early or mid-adult life. Previous diagnoses may have gone unnoticed or unrecorded during the inpatient encounter that entered the Medicare datafile. Our analysis was restricted to a single year, which did not provide the opportunity to study the sequence of disease occurrence in individual patients. Future studies, involving a younger patient population with follow-ups over prolonged time periods, may provide the opportunity to study such sequential disease occurrences in individual patients.
The analysis was also limited by the absence of additional information about social habits or types of immune-modulating therapy. Theoretically, the epidemiologic associations between IBD and HL or MS could relate to adverse effects of immune-suppressive therapy.13,14 As the mechanism by which immune-suppressive therapy would lead to an increased risk of HL or MS remains unknown, it is conceivable that such a mechanism also involves reactivation of dormant EBV infection. The shortcomings in trying to record the usage of immune-suppressive medications as a confounding variable relate to difficulties in retrieving reliable information about the length of time and overall amount of exposure to a large variety of such medications for each individual case subject, let alone the entire control population. Most importantly, the strong association between IBD diagnosis and immune-suppressive therapy also limits the usefulness of any information about medications as an independent covariable. Although the use of immune-suppressive medications for IBD treatment has markedly increased during the past 3 decades, there is no indication that the concurrence of HL or MS with IBD has increased likewise.7 It would be difficult to explain the similar birth-cohort patterns or geographic distributions of the 4 diagnoses based on the effects of immune-suppressive therapy.
Despite their statistical significance, the odds ratios for the associations between IBD and HL or MS are relatively weak. The epidemiologic data already indicate that exposure at different ages may underlie the occurrence of the 4 different diagnoses.9,10 Exposure at different ages and different mechanisms of pathophysiology may result in different phenotypes associated with exposure to the same agent. Although over 90% of the adult population harbors antibodies against EBV, only a minute fraction ever develops HL or MS. Referring again to the example of H. pylori, only a small fraction of all infected subjects develop peptic ulcers, and an even smaller fraction of ulcer patients subsequently go on to develop gastric cancer.12 At the present time, the nature of the environmental risk factors affecting the epidemiologic associations between IBD and HL or MS remains speculative. Besides EBV infection, other risk factors may contribute to the observed epidemiologic patterns. The north-south gradient in the occurrence of MS and IBD has led previous investigators to speculate that increased sun exposure with high vitamin D serum levels provides a protective influence with respect to IBD and MS.15,16 In general, cold climate may force people to spend more time indoors and expose themselves to rampant infections. As much as these epidemiologic parameters vary in their magnitude across the United States, they may also have contributed to the occurrence of similar geographic variations of MS, HL, and IBD.21,22

Conclusion
Using the electronic database of the US Medicare population, the present analysis revealed significant correlations between the geographic distributions of HL, MS, CD, and UC across the US and the increased concurrence of IBD with HL or MS in identical patients. These findings lend additional support to the hypothesis that IBD may share a common risk factor with HL and MS. Additional epidemiologic studies will be needed to further delineate the risk factors that underlie the epidemiologic associations of CD and UC with HL and MS.

Figure. Correlations in the geographic distributions of Hodgkin lymphoma (HL), multiple sclerosis (MS), Crohn's disease (CD), and ulcerative colitis (UC) among the 35 largest US states. Each data point represents a different state. Rates are expressed per 10,000 patients in the entire Medicare population.

Table 1. Medicare Population of 2018 Stratified by Diagnosis, Ethnicity, Sex, and Age

Table 2. Patient Counts and Frequency Rates of HL, MS, CD, and UC in Different States. POP, entire Medicare population.

Table 3. Concurrence of Hodgkin Lymphoma (HL) or Multiple Sclerosis (MS) With Crohn's Disease (CD) or Ulcerative Colitis (UC) in the Medicare Population. Column headers: 1st Diagnosis, 2nd Diagnosis, Control. CI, confidence interval.