Systemic Inflammatory Mediators Are Effective Biomarkers for Predicting Adverse Outcomes in Clostridioides difficile Infection

Each year in the United States, Clostridioides difficile causes nearly 500,000 gastrointestinal infections that range from mild diarrhea to severe colitis and death. The ability to identify patients at increased risk for severe disease or mortality at the time of diagnosis of C. difficile infection (CDI) would allow clinicians to allocate disease-modifying therapies effectively. In this study, we developed models consisting of only a small number of serum biomarkers that are capable of predicting both 30-day all-cause mortality and adverse outcomes at the time of CDI diagnosis. We were able to validate these models through experimental mouse infection, providing evidence that the biomarkers reflect the underlying pathophysiology and that our mouse model of CDI reflects the pathogenesis of human infection. Predictive models can not only assist clinicians in identifying patients at risk for severe CDI but also be utilized for targeted enrollment in clinical trials aimed at reducing adverse outcomes from severe CDI.

Clostridioides difficile is a spore-forming bacillus that causes nearly 500,000 cases of toxin-mediated gastrointestinal illness yearly in the United States, with 29,300 deaths and a cost of 1.5 billion dollars annually (1). The pathogenesis of C. difficile infection (CDI) involves local toxin production within the intestines, leading to diarrhea and intestinal wall inflammation. Some patients experience severe colitis, along with a systemic inflammatory response as previously characterized (2).
We currently lack highly accurate predictive tools to assist with clinical decisions following CDI diagnosis. The development of accurate predictive models for adverse outcomes could guide the use of emerging treatments for CDI that can ameliorate or prevent disease-related complications (DRCs) such as ICU admission, colectomy, or death (3). For instance, fidaxomicin is costlier than vancomycin, while fecal transplants carry the risk of alterations to the host microbiome with unknown long-term effects, as well as transmission of enteric pathogens (4). Widespread deployment of these novel treatments in patients with CDI is impractical due to expense, invasiveness, and undetermined safety profiles, necessitating the development of tools for patient risk stratification and treatment selection optimization.
The Infectious Diseases Society of America (IDSA) and Society for Healthcare Epidemiology of America (SHEA) guidelines use measurements of systemic immune response (white blood cell [WBC] count >15,000) or signs of renal dysfunction (creatinine >1.5) to define severe CDI (5). Further signs of organ failure (shock, hypotension, ileus, or megacolon) are used to define complicated CDI. While this classification system guides management decisions, the features used are late findings and do not always allow for early identification of high-risk individuals. For instance, in a study of two cohorts consisting of 156 and 272 unique CDI cases, of the 23 all-cause mortality cases, 10 of the patients (43.5%) did not meet IDSA severity criteria at the time of diagnosis. An ideal model would identify, at the time of diagnosis, cases of CDI that are progressing toward severe systemic disease, so that treatments to halt disease progression can be started. Models built from baseline clinical variables or standard laboratory measurements have met with limited success in accurately predicting adverse outcomes, or they do not validate externally (6)(7)(8)(9)(10)(11)(12). Therefore, we set out to determine if predictive models built from a panel of multiple inflammatory mediators measured at diagnosis of CDI can accurately predict adverse outcomes, specifically, 30-day all-cause mortality and DRCs defined as ICU admission, colectomy, and/or death attributed to CDI. To validate these findings and provide further evidence of the utility of mouse models for CDI, we employed an experimental C. difficile infection in mice and tested the capability of the biomarker-based model to determine high disease severity in these mice.

RESULTS
Serum markers of epithelial damage, inflammation, and neutrophilic migration are significantly associated with mortality and disease-related complications. We studied an initial pilot cohort of 156 patients with CDI, of whom 58 (37.2%) met IDSA severity criteria, 4 (2.6%) died within 30 days, and 10 (6.4%) had disease-related complications. Of the 4 patients with CDI who died within 30 days, 2 did not meet IDSA severity criteria at the time of diagnosis. Serum collected near the time of diagnosis was tested with a custom panel for serum biomarkers ranging from inflammatory markers to epithelial growth factors (Table 1). Biomarker profiles of serum from severe and nonsevere cases showed separation by principal-component analysis (see Fig. S1A and B in the supplemental material), while redundancy analysis (RDA) of biomarkers differentiated severe and nonsevere episodes by permutational multivariate analysis of variance (MANOVA) (P = 0.005) and differentiated cases that developed DRCs (P = 0.025) (Fig. S1C and D). These biomarkers did not distinguish patients who died within 30 days of diagnosis from those who survived, most likely due to the limited number of 30-day mortality cases in our pilot study (n = 4). Unadjusted logistic regression revealed that interleukin-6 (IL-6), procalcitonin (PCT), IL-8, IL-2Rα, and hepatocyte growth factor (HGF) were significantly associated with severity (P values of <0.001, <0.01, <0.05, <0.05, and <0.05, respectively). All of these biomarkers except procalcitonin were also significantly associated with DRCs but not overall 30-day mortality (Table S1).
We employed a validation cohort of 272 unique CDI cases among 253 patients, of whom 71 (26.1%) met IDSA severity criteria, 19 (7.0%) died within 30 days, and 18 (6.6%) had DRCs (Table 2). Eight of 19 patients experiencing 30-day all-cause mortality did not meet IDSA severity criteria at the time of diagnosis. There were 14 patients who experienced 30-day all-cause mortality and developed DRCs. As in the pilot, biomarker-based RDA of the validation cohort differentiated severe and nonsevere CDI cases by permutational MANOVA (P = 0.001) and DRCs (P = 0.002). With the increase in the number of patients who died, biomarker profiles from patients with 30-day mortality were also differentiated by RDA (P = 0.001) (Fig. S2). Characterization of biomarker associations with each outcome was performed with unadjusted logistic regression and showed that 12 of the 17 inflammatory markers were individually associated with at least one outcome, with 6 biomarkers (HGF, procalcitonin, IL-6, IL-2Rα, IL-8, and tumor necrosis factor alpha [TNF-α]) significantly associated with all three outcomes. With unadjusted inflammatory mediators, the biomarkers most significantly positively associated (P < 0.001) with IDSA severity were HGF, PCT, IL-6, and IL-2Rα; with 30-day mortality were IL-2Rα, PCT, IL-8, and IP-10; and with DRCs were PCT, IL-8, and IL-2Rα (Table 3). All associations are shown in Table S2. These findings validate the associations between biomarkers and adverse outcomes seen in the pilot cohort.
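The permutational MANOVA behind these RDA comparisons can be sketched numerically. The study used the vegan package in R; the following is a simplified Python stand-in on synthetic data (all values and group sizes here are illustrative, not the study's): it asks how often randomly shuffled group labels separate the multivariate biomarker profiles as well as the observed labels do.

```python
# Simplified permutational MANOVA: compare observed between-group separation
# (sum of squared centroid deviations) against a null built by permuting labels.
import numpy as np

def permutation_manova(X, groups, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)

    def between_group_ss(X, groups):
        grand = X.mean(axis=0)
        ss = 0.0
        for g in np.unique(groups):
            members = X[groups == g]
            ss += len(members) * np.sum((members.mean(axis=0) - grand) ** 2)
        return ss

    observed = between_group_ss(X, groups)
    perm = np.array([between_group_ss(X, rng.permutation(groups))
                     for _ in range(n_perm)])
    # One-sided permutation p value; the observed statistic counts as one draw.
    return (1 + np.sum(perm >= observed)) / (n_perm + 1)

# Synthetic "severe" profiles shifted relative to "nonsevere" across 17 markers.
rng = np.random.default_rng(1)
severe = rng.normal(1.0, 1.0, size=(40, 17))
nonsevere = rng.normal(0.0, 1.0, size=(60, 17))
X = np.vstack([severe, nonsevere])
groups = np.array([1] * 40 + [0] * 60)
print(permutation_manova(X, groups))  # small p value: profiles are separable
```

This mirrors the logic of the reported permutation tests, though vegan's adonis-style implementation computes a pseudo-F statistic rather than the raw between-group sum of squares used here.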
Development of high-performance, multivariable models to estimate CDI severity and predict adverse outcomes. While logistic regression models were initially produced to test the feasibility of predicting 30-day all-cause mortality and DRCs from serum biomarkers at diagnosis (Table S3), these models are often not useful outside the particular cohort in which they were built. To produce more refined and generalizable models, we used 5-fold cross-validated elastic net regression modeling. As our goal was not necessarily to produce the best model but to describe which biomarkers have the potential to predict adverse outcomes in a generalizable way that would be most likely to validate in external cohorts, we show the modeling results for a range of tuning parameters. Alpha values were tested from pure ridge regression (alpha = 0) to pure lasso regression (alpha = 1), allowing visualization of which biomarkers are retained in the model as the inclusion criterion becomes more stringent (toward lasso regression). Additionally, biomarker inclusion is impacted by the selection of lambda, chosen either where deviance was within 1 standard error of the minimum (1se) or at the minimum (min).
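The alpha sweep described above can be illustrated with scikit-learn standing in for R's glmnet (an assumption for illustration only; the data below are synthetic stand-ins for the 17 log-transformed biomarkers, not the study data):

```python
# Sweep the elastic net mixing parameter from ridge (l1_ratio = 0) toward
# lasso (l1_ratio = 1) and count how many "biomarkers" each model retains.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=272, n_features=17, n_informative=4,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)

nonzero = {}
for l1_ratio in [0.0, 0.5, 0.9, 1.0]:   # 0 = pure ridge, 1 = pure lasso
    model = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=l1_ratio, C=0.05, max_iter=10000)
    model.fit(X, y)
    nonzero[l1_ratio] = int(np.sum(model.coef_ != 0))

# Ridge shrinks but keeps every coefficient; moving toward lasso zeroes
# out the low-yield features, leaving a smaller panel.
print(nonzero)
```

The qualitative behavior matches the paper's Fig. 1: as alpha approaches 1, the inclusion penalty grows and only the most informative markers survive.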
As smaller models are more useful for clinical applications and performance did not differ drastically between the min (higher potential for overfitting) and 1se (higher potential for generalizability) models, biomarker inclusion for each model and the area under the receiver operating characteristic (ROC) curve (AUC) performance for the 1se models are shown in Fig. 1, while the results for the min models are shown in Fig. S3. To create the most parsimonious models, the alpha = 0.9 models, which are strongly weighted to exclude unnecessary biomarkers, were chosen as the highlighted models, although similar performance is seen across lambda and alpha values. ROCs and AUCs for the best elastic net models at each alpha value are shown in Fig. S4, highlighting the stability of model performance with decreasing biomarker inclusion.
For IDSA severity estimation, elastic net modeling shows that PCT and HGF are included in all models and are the only biomarkers in 1se models with alpha values of >0.5. The 1se (alpha = 0.9) model includes 2 biomarkers and produces an AUC of 0.74.

FIG 1 (A) Biomarkers included in each Glmnet model. Inclusion was determined by (i) the classification task (estimating IDSA severity or predicting adverse outcomes) and (ii) the penalty for including additional low-yield variables. Each model was run across 100 iterations with different initial seeds for each value of alpha. An alpha value closer to 0 weights toward ridge regression, and a value closer to 1 weights toward lasso regression. Lasso regression places a higher penalty on including additional biomarkers, resulting in fewer biomarkers in the final model at higher alpha values. The color of each square indicates how many of the 100 iterations included that biomarker in the produced models for the given alpha value. (B) Performance of the best model with an alpha value of 0.9 and the biomarkers included. (C) ROCs and AUCs for the best models with an alpha value of 0.9.

For 30-day mortality prediction, the 1se (alpha = 0.9) model produces an AUC of 0.89 (0.84 to 0.95) (Fig. 1), while the min (alpha = 0.9) model includes 12 biomarkers and produces an AUC of 0.91 (0.85 to 0.97) (Fig. S3). For DRC prediction, elastic net modeling shows that IL-8, PCT, HGF, and IL-2Rα were included in most 1se models and all min models. The 1se (alpha = 0.9) model includes 4 biomarkers and produces an AUC of 0.84 (0.73 to 0.94) (Fig. 1), while the min (alpha = 0.9) model includes 4 biomarkers and produces an AUC of 0.85 (0.74 to 0.96) (Fig. S3). The same biomarkers were included in both models, indicating that determining DRCs is highly dependent on these four markers.
Regardless of model parameters, performance was similar across the largest and smallest models for each outcome, with AUCs higher than that of the best individual-biomarker regression model. PCT was the only biomarker shared between the 30-day mortality and IDSA severity models. DRC models included the two most significant biomarkers for IDSA severity (HGF and PCT) as well as two others found in the 30-day mortality models (IL-2Rα and IL-8). This indicates that the task of predicting DRCs has a solution that overlaps at least in part with estimating IDSA severity and predicting 30-day mortality. Similar to the results from logistic regression modeling, the best-performing models were for 30-day mortality, followed closely by DRCs, with the worst performance seen in models for estimating IDSA severity.
Biomarker-based models outperform basic clinical models for predicting 30-day mortality and DRCs. IDSA severity is used clinically to assess the severity of CDI and inform treatment, while the Elixhauser comorbidity index (Elixhauser), developed to predict mortality, is used as an aggregate measure of the burden of comorbid disease at baseline. We used IDSA severity and Elixhauser to estimate adverse outcomes for comparison with our biomarker-based models. Simple logistic regression models showed that IDSA severity was significantly associated with 30-day all-cause mortality (P = 0.003; AUC = 0.67 [0.55 to 0.79]) and DRCs (P = 0.002; AUC = 0.69 [0.57 to 0.80]) but performed substantially worse than our biomarker models (Fig. 2A). Simple logistic regression models showed that the Elixhauser index was significantly associated with 30-day all-cause mortality (P < 0.001; AUC = 0.77 [0.69 to 0.84]) and DRCs (P = 0.018; AUC = 0.71 [0.63 to 0.80]), but not with IDSA severity (P = 0.51; AUC = 0.53 [0.45 to 0.61]), and similarly performed worse than our biomarker-based models (Fig. 2B).
The best biomarker-based elastic net model improves the correct classification of 30-day all-cause mortality cases at the time of diagnosis compared to the IDSA severity model. This is demonstrated by a positive continuous net reclassification improvement (NRI) (P = 0.022; NRI = 0.53 [0.078 to 0.98]) when comparing the two models. NRI ranges from −2 (100% of positives and 100% of negatives incorrectly reclassified) to +2 (100% of positives and 100% of negatives correctly reclassified); thus, an NRI of 0.53 represents a moderate improvement in classification of individuals with 30-day all-cause mortality by the biomarker-based model over the baseline IDSA severity model.
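The continuous NRI definition used above is simple enough to compute directly. The study used the PredictABEL package in R; this Python sketch (toy probabilities, not study data) implements the same index: events reclassified upward and nonevents reclassified downward count positively.

```python
# Continuous net reclassification improvement: for events, the proportion
# whose predicted risk moved up minus the proportion that moved down; for
# nonevents, the reverse. The sum ranges from -2 to +2.
import numpy as np

def continuous_nri(y, p_old, p_new):
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    up, down = p_new > p_old, p_new < p_old
    events, nonevents = y == 1, y == 0
    nri_events = up[events].mean() - down[events].mean()
    nri_nonevents = down[nonevents].mean() - up[nonevents].mean()
    return nri_events + nri_nonevents

# Toy example: 3 events, 4 nonevents.
y     = [1, 1, 1, 0, 0, 0, 0]
p_old = [0.2, 0.3, 0.6, 0.4, 0.5, 0.3, 0.2]
p_new = [0.5, 0.6, 0.4, 0.2, 0.3, 0.1, 0.4]
print(continuous_nri(y, p_old, p_new))  # (2/3 - 1/3) + (3/4 - 1/4) = 5/6
```

A perfect reclassification (every event up, every nonevent down) yields the maximum of +2, which is why the reported 0.53 reads as a moderate improvement.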
To test whether Elixhauser and IDSA severity would add additional information to the models, we incorporated Elixhauser and IDSA severity into the best elastic net biomarker-based models (Fig. 2C and D) and into the best logistic regression models (Fig. S5).

Multivariable, predictive models for 30-day mortality and DRCs do predict outcomes in a murine model of severe and nonsevere CDI. We and others have developed murine models of CDI in which experimentally infected animals develop disease ranging from mild diarrhea to severe colitis. These murine models of CDI allow us to test our predictive biomarker models in a model organism that can develop similar disease but lacks the potential comorbidities of human patients. We felt that this was important to assess: if our models perform well in an animal system, this supports the notion that our models are fit toward biologically relevant biomarkers for CDI rather than comorbid disease or other confounding features that would not be present in an animal system. For this validation, we used a CDI model employing antibiotic pretreatment followed by experimental infection with C. difficile spores. We have previously demonstrated that murine infection with VPI 10463 results in severe, rapidly fatal disease within 48 h, while infection with strain 630 results in a more indolent course (13). For these experiments we employed 78 antibiotic-treated mice that were challenged with either strain 630g (37 mice), strain VPI 10463 (30 mice), or water (11 mice) and assessed serum responses with a murine version of our multiplex panel. VPI 10463-infected mice exhibited greater weight loss, more histopathologic intestinal damage, and higher clinical severity (Fig. 3A to C and Fig. S6). Therefore, mice infected with VPI 10463 were classified as having severe and fatal CDI, while those infected with 630g were classified as having mild and nonfatal CDI.
The best models from our human cohort were applied to the mouse cohort (best 1se and min lambda models with alpha = 0.9; Fig. 1B and Fig. S3B). To apply the models to the mouse serum data, each outcome (severity/mortality/DRCs) was defined as positive for VPI 10463-infected and negative for 630g-infected mice (Fig. 3D) or by a cutoff of weight loss, cecum histopathology score, or colon histopathology score, as greater weight loss or histopathology represents more severe disease in mice with CDI (Fig. S7). The 1se models for prediction of 30-day all-cause mortality and DRCs accurately identified mice infected with high-virulence C. difficile. Specifically, applying each 1se model to the mouse cohort for high- versus low-virulence infections revealed AUCs of 0.59 (0.44 to 0.74) for the models built for IDSA severity, 0.96 (0.91 to 1.0) for the models built for 30-day mortality, and 0.85 (0.75 to 0.94) for the models built for DRCs.

DISCUSSION
CDI is associated with an increased risk of mortality, and at present, we are inadequately determining who will experience adverse outcomes. Multiple models have been produced to address this problem, including those utilizing electronic medical records, standard laboratory tests, and medical history (6)(7)(8)(9)(10)(11)(12). However, these models have met with limited success in external validation, and there is room for improvement in the prediction of CDI adverse outcomes. Additional studies have examined specific serum biomarkers that could be associated with severe CDI, but no study to date has looked across a wide spectrum of serum-based biomarkers to determine their effectiveness in predicting mortality or DRCs. Our results support the hypothesis that models built from a panel of multiple inflammatory mediators measured early in the course of CDI can accurately predict adverse outcomes and can do so better than measures commonly used to predict adverse outcomes upon CDI diagnosis.
Our panel and model could be utilized at the time of diagnosis to evaluate the risk of mortality for an individual patient. A negative result reduces the estimated risk of 30-day all-cause mortality, while a positive result increases the estimated mortality risk from ~10% at baseline to ~25%. Currently, the therapeutic options are limited in scope, but identifying a high-risk patient could tip the scale toward using more aggressive therapy, such as colectomy. A secondary use of the panel could be to enable the study of therapies targeted specifically at reducing mortality in CDI, which otherwise are infeasible due to lack of statistical power. For example, if a study were being performed for a therapy against standard of care with a theoretical 30% reduction of mortality in a standard population with a baseline ~10% mortality risk, a targeted power of 80%, and an alpha (i.e., type I error) of 0.05, 2,700 patients would be required. However, if our panel and model were used to identify only high-risk individuals for enrollment, the population mortality risk would increase to ~25%, reducing the required number of patients to 928. This would decrease the number of required subjects nearly 3-fold, substantially reducing cost and improving feasibility.
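The trial-size arithmetic above can be reproduced with the standard two-proportion sample size formula. Exact totals depend on the formula variant and rounding conventions, so this sketch lands near, rather than exactly on, the 2,700 and 928 quoted above.

```python
# Two-arm sample size for comparing proportions (two-sided alpha = 0.05,
# power = 0.80), using the pooled-variance normal approximation.
from math import ceil, sqrt
from statistics import NormalDist

def total_n(p1, relative_reduction, alpha=0.05, power=0.80):
    p2 = p1 * (1 - relative_reduction)          # mortality under therapy
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    n_per_arm = ceil(num / (p1 - p2) ** 2)
    return 2 * n_per_arm

print(total_n(0.10, 0.30))  # ~2,700 at baseline ~10% mortality
print(total_n(0.25, 0.30))  # ~930 in the enriched ~25%-risk population
```

The roughly 3-fold drop in enrollment falls directly out of the larger absolute risk difference (7.5 versus 3 percentage points) in the enriched population.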
Validation is an important step in determining if a model is overfit to the particular cohort and/or confounding factors rather than the disease process itself. Utilizing murine CDI allowed us to test the models in a separate system without potentially confounding factors such as age, treatments, and comorbidities. Our results show that the risk models for 30-day all-cause mortality and DRCs are related to the underlying biology of the infection, as the models are also predictive of severe outcomes in murine CDI. Additionally, this provides further support for the observation that murine CDI elicits an immune response similar to that of human CDI, supporting continued use of the animal model in the study of the biology of CDI.
Overall, our results confirm our hypothesis that a serum-based biomarker panel predicts adverse outcomes from CDI. Additionally, we show that models constructed from serum biomarkers outperform both IDSA severity criteria and the Elixhauser comorbidity index for predicting adverse outcomes. Therefore, serum biomarker-based models could be used to inform medical decision-making for patients with CDI, and this study has explored models from a range of modeling algorithms to inform which biomarkers are the most promising. Specific interest should be placed on continuing to study HGF, procalcitonin, IL-8, IL-2Rα, IP-10, and CXCL-5, as they were the most prevalent biomarkers selected in models of adverse outcomes from CDI in this study.

MATERIALS AND METHODS
Cohort design. Sera were collected within 48 h of diagnosis of CDI in two distinct cohorts of patients and frozen at −80°C until analysis. The pilot cohort collections ranged from October 2010 to November 2012, contemporaneous with our prior publications on various biomarkers in CDI. The validation cohort collections ranged from January to September 2016. We felt that it was important to separate these two cohorts as they were heterogeneous for several reasons: (i) the 4-year gap in time, (ii) the change in testing practices (e.g., collection of stool in Cary-Blair medium and no rejection of formed specimens in the pilot cohort era and use of best-practice alerts and educational alerts to modify testing protocols in the validation cohort era), and (iii) the change in treatment practices (movement away from metronidazole and toward fidaxomicin and vancomycin per new institutional guidelines). All patients were diagnosed with CDI by the clinical microbiology laboratory using a two-step algorithm including detection of C. difficile glutamate dehydrogenase (GDH) and toxins A and B by enzyme immunoassay (C. DIFF Quik Chek Complete; Alere, Waltham, MA), with reflex to PCR for the tcdB gene for discordant results (Focus Simplexa assay from DiaSorin, Saluggia, Italy [pilot cohort], and BD Geneohm assay from Becton, Dickinson and Company, Franklin Lakes, NJ [validation cohort]). We examined prediction tasks for three major outcomes of interest: Infectious Diseases Society of America (IDSA) severity, 30-day all-cause mortality, and disease-related complications (DRCs). An IDSA severe case is defined as leukocytosis with a white blood cell count of ≥15,000 cells/μl or a serum creatinine level of >1.5 mg/dl (5). DRCs included colectomy, death, or ICU admission within 30 days attributed to CDI as determined by infectious disease (ID) physicians on our team blinded to the biomarker results (D.A.P. and K.R.).
The pilot cohort was analyzed with a 14-plex assay to examine key serum biomarkers. After our preliminary results and further study, CXCL-5, IL-22, and IL-23 were added to the panel to produce the 17-plex assay that was utilized on our validation cohort (Table 1). A similar panel was produced for mice with identical inflammatory mediators or the closest homologues. Our analysis was split into classification of current disease using IDSA severity as the gold standard and outcome prediction. The two outcomes of each CDI case that we set out to model were 30-day all-cause mortality from time of diagnosis and disease-related complications (DRCs), which included ICU admission, colectomy, or death caused by CDI specifically. Attributable CDI severity was determined through physician-based chart review.
Data analysis methodology. Although measurements from Luminex assays are linear and thus accurate over a wide range of concentrations, generally spanning several orders of magnitude, the inflammatory mediator measurements were log transformed prior to analysis to correct for nonnormal distributions (positive skew). Principal-component analysis (PCA) was performed for the panel of inflammatory mediators, independent of our outcomes of interest, using princomp in the stats package in R (14). We performed redundancy analysis (RDA) with each binary variable (IDSA severity, 30-day all-cause mortality, and DRCs) as the predictor and the log-transformed inflammatory mediators as the response, to assess whether the biomarker profile differed between individuals positive and negative for the binary metric tested (e.g., those that experienced DRCs and those that did not). This was achieved by performing analysis of variance using Euclidean distance and a permutation test to find P values, using the vegan package in R (15). We assessed the impact of individual inflammatory mediators on the outcomes by performing unadjusted logistic regression for each inflammatory mediator.
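The preprocessing described above can be sketched briefly. The paper used princomp in R; scikit-learn's PCA is an equivalent stand-in here, and the log-normal data below are synthetic placeholders for the 14-plex pilot measurements.

```python
# Log-transform positively skewed concentration data, then run PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Log-normal values mimic analyte concentrations spanning orders of magnitude.
concentrations = rng.lognormal(mean=3.0, sigma=1.5, size=(156, 14))

log_markers = np.log(concentrations)   # corrects the positive skew
pca = PCA()
scores = pca.fit_transform(log_markers)

print(scores.shape)                    # one score row per patient
print(pca.explained_variance_ratio_.sum())
```

Without the log transform, a handful of extreme concentrations would dominate the Euclidean distances underlying both the PCA and the RDA permutation tests.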
We first attempted to model our outcomes using multivariable logistic regression with binomial deviance as our error measure. However, our overall goal was to identify important inflammatory mediators and construct models in a manner that avoided overfitting and would be more likely to generalize to an external cohort. With this in mind, we utilized 5-fold cross-validated elastic net multivariable logistic regression with the goal of testing the impact of adjusting the stringency of the inclusion criterion and tuning parameters. A lambda value was selected where deviance was within 1 standard error of the minimum (1se, more stringent) or at the minimum (min). Additionally, we swept through alpha values ranging from pure ridge regression (alpha = 0) to pure lasso regression (alpha = 1) to identify which biomarkers would be included under each condition. For each value of alpha tested, 100 iterations across different seeds were performed. All of these methods (regularized regression, cross-validation, evaluating different lambda values, and sweeping the alpha tuning parameter) were aimed at avoiding overfitting; even though this results in models that may not perform as well, the resulting claims about model performance are more conservative and more likely to validate externally. This was performed using the glmnet package in R (16). Comparison of elastic net models was performed by creating receiver operating characteristic (ROC) curves and calculating the area under the ROC (AUC) using the pROC package in R (17). Net reclassification improvement index analysis was done using the PredictABEL package in R (18). All analysis was performed using R (19) and RStudio (20).
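The "min" versus "1se" lambda rule used above can be stated as a few lines of code: take the lambda minimizing mean cross-validated deviance, or the largest (most regularized) lambda whose mean deviance stays within one standard error of that minimum. The sketch below uses illustrative deviance values, not fitted ones.

```python
# Select lambda.min and lambda.1se from a cross-validation deviance profile,
# mirroring glmnet's convention.
import numpy as np

def lambda_min_and_1se(lambdas, mean_dev, se_dev):
    lambdas, mean_dev, se_dev = map(np.asarray, (lambdas, mean_dev, se_dev))
    i_min = int(np.argmin(mean_dev))
    threshold = mean_dev[i_min] + se_dev[i_min]
    # 1se rule: most stringent lambda still within one SE of the minimum.
    return lambdas[i_min], lambdas[mean_dev <= threshold].max()

lambdas  = np.array([0.01, 0.05, 0.10, 0.50])
mean_dev = np.array([0.50, 0.51, 0.52, 0.70])
se_dev   = np.array([0.03, 0.03, 0.03, 0.04])
lam_min, lam_1se = lambda_min_and_1se(lambdas, mean_dev, se_dev)
print(float(lam_min), float(lam_1se))  # 0.01 0.1
```

Because larger lambdas shrink more coefficients to zero, the 1se choice trades a sliver of in-sample deviance for a sparser, more generalizable panel, which is exactly the paper's rationale for preferring 1se models.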
Mouse experimental methods. Eight- to 12-week-old, specific-pathogen-free (SPF) C57BL/6 mice were treated for 10 days with cefoperazone (0.5 g/liter) delivered in their drinking water to render them susceptible to Clostridioides difficile infection. The C57BL/6 mice used were produced by the Young Lab breeding colony at the University of Michigan, established from mice purchased from the Jackson Laboratory. After 2 days off antibiotics, mice were given an oral gavage of water, C. difficile 630g spores, or C. difficile VPI 10463 spores (Fig. 3). The inoculum was estimated at between 10³ and 10⁴ spores. While mock-infected mice gained weight over the course of the observation period, 630g-infected mice remained at the same weight, and VPI 10463-infected mice lost significant weight over 2 days. In our model, VPI 10463 infection following cefoperazone results in a high proportion of deaths if allowed to progress beyond 48 h. To obtain serum samples, VPI 10463-infected mice were sacrificed 2 days after infection, while half of the 630g-infected mice were sacrificed at day 2 as time controls and the rest at 4 days postinfection, when they reached their maximum disease. Cecum and colon histopathology were scored from 0 to 12 by a blinded pathologist for edema, epithelial damage, and inflammatory cell infiltration. Each mouse was given a clinical score from 0 to 20 at euthanasia based on posture, coat, activity, diarrheal signs, and weight change from day 0 (D0). Further description of the model can be found in the work of Leslie et al., 2019 (33).

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.