Predicting COVID-19 prognosis in hospitalized patients based on early status

ABSTRACT Predicting which patients are at greatest risk of severe disease from COVID-19 has the potential to improve patient outcomes and resource allocation. We developed machine learning models for predicting COVID-19 prognosis from a retrospective chart review of 969 hospitalized COVID-19 patients at Robert Wood Johnson University Hospital during the first pandemic wave in the United States, focusing on 77 variables from patients’ first day of hospital admission. Our best 77-variable model was better able to predict mortality (receiver operating characteristic area under the curve [ROC AUC] = 0.808) than CURB-65, a commonly used clinical prediction rule for pneumonia severity (ROC AUC = 0.722). After identifying highly predictive variables in our full models using Shapley additive explanations values, we generated two models, PLABAC (platelet count, lactate, age, blood urea nitrogen, aspartate aminotransferase, and C-reactive protein) and PRABLE (platelet count, red blood cell distribution width, age, blood urea nitrogen, lactate, and eosinophil count), that use age and five common laboratory tests to predict mortality (PLABAC: ROC AUC = 0.796, PRABLE: ROC AUC = 0.793) and that also outperformed CURB-65. We externally validated PLABAC using data from the National COVID Cohort Collaborative Data Enclave from 7901 hospitalized COVID-19 patients from the pre-vaccination period and 1547 from the vaccination period, yielding ROC AUCs of 0.755 and 0.766, respectively. This study demonstrates that our models can accurately predict COVID-19 outcomes from a small number of variables obtained early in a patient’s hospital stay in patients from institutions around the United States after the initial pandemic wave. These models can serve as a clinical prediction aid and accurately capture a patient’s prognosis using a small number of routinely obtained laboratory values.
IMPORTANCE COVID-19 remains the fourth leading cause of death in the United States. Predicting COVID-19 patient prognosis is essential to help efficiently allocate resources, including ventilators and intensive care unit beds, particularly when hospital systems are strained. Our PLABAC and PRABLE models are unique because they accurately assess a COVID-19 patient’s risk of death from only age and five commonly ordered laboratory tests. This simple design is important because it allows these models to be used by clinicians to rapidly assess a patient’s risk of decompensation and serve as a real-time aid when discussing difficult, life-altering decisions for patients. Our models have also shown generalizability to external populations across the United States. In short, these models are practical, efficient tools to assess and communicate COVID-19 prognosis.

Although some people infected with SARS-CoV-2 have few or no symptoms, others develop more severe complications, including acute respiratory distress syndrome and multiorgan failure (2). The ability to predict progression to severe illness has been the subject of intense study, especially as effective interventions have often been limited during outbreak scenarios, resulting in suboptimal resource allocation (3-5). Not allocating sufficient resources to patients at risk of serious illness may lead to unnecessary loss of life, while giving aggressive therapies to patients at low risk of dying can lead to unnecessary complications (6-8).
Particular demographic variables, comorbidities, and laboratory findings have been found to be significant risk factors for severe COVID-19, sometimes in ways that may not have been considered a priori (9-13). Since many front-line clinicians continue to manage numerous COVID-19 patients daily, quickly understanding the risk landscape of their patients is useful. Predictive algorithms based on machine learning techniques can assist clinicians in identifying which patients are most likely to have severe COVID-19 complications, allowing for early intervention to mitigate decompensation and improved utilization of scarce resources during high-volume scenarios (14). We now use data from a retrospective analysis collected at the start of the pandemic in the United States to build algorithms to improve prediction of COVID-19 outcomes and identify the key factors that predict the disease's outcome using SHapley Additive exPlanations (SHAP) values (Fig. S1 and "Feature importance and generation of simplified models" in Materials and Methods). We found that this approach outperformed CURB-65, a clinical prediction rule originally developed for community-acquired pneumonia and used by physicians for predicting COVID-19 outcomes (15-19), and performed well in two independent large national cohorts during both the prevaccination and vaccination periods.

Patient characteristics
The patient cohort used to generate our models was derived from patients hospitalized at Robert Wood Johnson University Hospital (RWJUH) with COVID-19 during the initial pandemic wave between 19 March 2020 and 31 May 2020. Of the 77 variables we chose to include in our model (Table S5), the variables significantly associated with COVID-19 mortality are summarized in Table 1. Variables related to patient outcomes, such as intensive care unit (ICU) admission and treatment data, were excluded from Table 1 and from our models. The patient population we studied had a relatively high frequency of comorbidities and often had deranged vital signs at baseline (Table 1), consistent with their acute illness, and a high COVID-19 mortality rate (30.8%). As expected, in univariate analyses, mortality was significantly associated with the number of comorbidities; with comorbidities related to cardiovascular fitness, hypercoagulable state, dementia, and hyperlipidemia; and with low systolic and diastolic blood pressures.

Predictive ability of full mortality models
Since our major aim was to develop a predictive model for mortality, we compared five models with CURB-65 (Table 2), using the validated cutoff value of 2 for the CURB-65 metric based on sensitivity analysis (Table S4) (15). The best-performing algorithm for predicting mortality by mean cross-validated receiver operating characteristic area under the curve (ROC AUC) was the voting classifier, but the result for extreme gradient boosted trees (XGBoost) was similar and had a slightly improved F1 score, which is relevant in predicting patients with high likelihood of dying (Table 2). The ROC AUC was much lower for CURB-65 than for our highest-performing algorithms, but the F1 score was comparable. All five algorithms gave better probability estimates for COVID-19 mortality than CURB-65, as measured by negative log loss. The voting classifier outperformed CURB-65 by providing both better mortality risk prediction, mostly driven by superior identification of low-risk patients, and proper probability calibration.
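The three metrics compared above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code used in the study; the labels and probabilities are hypothetical. ROC AUC is computed as the fraction of death/survivor pairs that the model ranks correctly, F1 is evaluated at a 0.5 probability cutoff, and log loss penalizes confident wrong probabilities (the negative log loss reported in the tables is assumed to be this quantity with its sign flipped, so that higher is better).

```python
import math

def roc_auc(y_true, y_prob):
    """Probability that a randomly chosen death is ranked above a randomly
    chosen survivor (ties count as half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((1.0 if p > n else 0.5 if p == n else 0.0)
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall at a probability cutoff."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical outcomes (1 = died) and predicted mortality probabilities.
y_true = [1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1]

print(round(roc_auc(y_true, y_prob), 3))
print(round(f1_score(y_true, y_prob), 3))
print(round(-log_loss(y_true, y_prob), 3))  # negative log loss
```

Because ROC AUC depends only on ranking while log loss depends on the probabilities themselves, a score can rank patients reasonably well and still be poorly calibrated, which is the distinction drawn between CURB-65 and the voting classifier.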

Feature importance in mortality determinations
SHAP values permit examining how single variables and combinations influence a model's prediction of risk. Using the mortality voting classifier, the 10 most important variables in the mortality-predicting model (Fig. 1) include important components of CURB-65, with age as the strongest risk factor. We generated a six-variable model called PLABAC (platelet count, lactate, age, blood urea nitrogen, aspartate aminotransferase, and C-reactive protein), which uses variables that are both top 10 performers by SHAP values in our mortality voting classifier and available as part of the National COVID Cohort Collaborative Data Enclave (N3C) data set we aimed to use for external validation. We also generated an additional model called PRABLE (platelet count, red cell distribution width, age, blood urea nitrogen, lactate, and eosinophil count) that utilized the top six variables from Fig. 1. Both performed similarly to the models generated with the full set of features and outperformed the ROC AUC score generated by CURB-65 for our population (Table 1). In particular, the PLABAC voting classifier and the PRABLE random forest model were the strongest in terms of ROC AUC. We developed a web tool that enables physicians to estimate their hospitalized patient's risk using the PLABAC voting classifier (https://plabac-bc6be52803c8.herokuapp.com/). These analyses show that age and either of two sets of five commonly obtained laboratory tests can be used for highly predictive models of mortality.

External validation of PLABAC model on N3C data set
We validated the performance of the PLABAC model in predicting COVID-19 mortality on an external data set utilizing the N3C database, dividing the test data into data before and after 1 March 2021, to evaluate the model's performance on patients before and during the era of mass vaccination for COVID-19 (see Table 4); two of the variables in the PRABLE model were not available in the N3C data set. We found that the PLABAC voting classifier was the strongest performer by ROC AUC in both our own data and in both external data sets (Tables 1 and 4). The model performed slightly worse on the two external data sets than on the population on which it was trained but still performed strongly on both. PLABAC had a slightly higher ROC AUC and a slightly lower F1 in predicting COVID-19 prognosis in the more recent (vaccine-era) patients, which may be due to the lower COVID-19 mortality in that time frame.

Decrease in full model performance for prediction of ICU admission and intubation
For predicting intubation (Table 3), we considered our voting classifier the best model for analysis because it had the best ROC AUC, but the support vector machine (SVM) classifier had a slightly inferior ROC AUC with a better F1 score. The XGBoost algorithm was our strongest for predicting ICU admission by ROC AUC, but the voting classifier had a far better F1 score (Table 3). However, none of these models was as accurate in predicting intubation and ICU admission as in predicting mortality.
The calculated SHAP values for the features weighed in predicting ICU admission and intubation are instructive (Fig. S2 and S3). Many of the requisite variables have been well validated as individual risk factors, are consistent with proposed COVID-19 disease mechanisms, and are predictive of mortality in our model and in other models (15-18, 20-24). However, other risk factors for intubation and ICU admission in our voting classifier model were more complex. Advanced age was associated with lower likelihood of intubation and ICU admission, despite its clear link to higher mortality likelihood (Fig. 2). Similarly, a history of dementia predicted a lower chance of ICU admission or intubation, but increased odds of death (Fig. S4). These conflicting observations may indicate that our models for these two outcomes captured the allocation of scarce resources (triaging) during the initial COVID-19 wave rather than a true reflection of patient necessity for intensive treatment.

DISCUSSION
Overall, we developed two algorithms that use a small number of variables and that outperformed CURB-65 in predicting COVID-19 mortality, one of which was shown to have external validity. Although our algorithms performed similarly to CURB-65 in identifying patients at high mortality risk, in general, they were far better in identifying low-risk patients, therefore improving reliability. Our ability to predict intubation and ICU admission was weaker than mortality prediction, which may reflect clinical decisions related to triage during an overwhelming epidemic wave (11, 25, 26). Our best classifiers for mortality prediction by ROC AUC performed better than or similarly to those in some studies (10, 11, 27-30) but worse than those in others (10, 31-39). However, the optimal way to compare these predictive models is to assess their generalizability to other populations. Variation in patient population, sample size, use of validation sets vs stratified cross-validation, hospital length of stay (LOS), and patient selection all affect algorithm performance. This is highlighted by the poor performance of CURB-65 in our population compared to other studies (15-18). While CURB-65 is reasonably sensitive for COVID-19 prognosis, it is limited by its lack of specificity, which reduces its applicability to the real world of clinical medicine. Comparisons with prior COVID-19 prediction model outcomes are limited by differences in populations, disease severity, treatment modalities available, and methodologies.
Although there have been prior risk scores for COVID-19 prognosis (10, 31-39), we believe ours is an improvement for several reasons. First, our risk models, PLABAC and PRABLE, include only a small number of high-impact variables, which to our knowledge is unique among similar works. Our algorithms, trained on a diverse US-based population, may be particularly useful for US clinicians, which is further supported by the generalizability of the PLABAC model. Finally, we validated our model, unlike in prior studies, showing that even for patients in 2021, past the worst days of triaging and ventilator shortages in the United States (40), the PLABAC model performed well in prognosticating across a wide range of hospital environments; this provides confidence that the model's predictions are generalizable.
A further strength is that using an algorithm based on only six routinely gathered variables to predict mortality may provide utility for frontline clinicians. Aggregating a patient's risk of death into a six-variable package is especially useful in circumstances in which a clinician needs to make rapid decisions. The models we have constructed are easy to adopt and are not only of theoretical interest but also practical aids for difficult, life-altering clinical decisions. Early identification of high-risk patients may enable physicians to monitor more closely, begin treatments earlier, and facilitate discussions about prognosis with patients and their families. Such scores can also be used to communicate quickly to other providers about prognosis and can be used in a manner similar to other prognostic scores such as the model for end-stage liver disease or the CHADS score used for stroke prognostication (41, 42). Importantly, our models are often able to predict a patient's course months before demise or hospital discharge. Perhaps more valuable to a clinician than the binary prediction of mortality is the predicted probability of death, which provides a sense of the model's confidence. We have used a 50% chance of mortality as the cutoff, but this treats a 90% and a 51% chance of predicted mortality in exactly the same way. In this respect, the PLABAC model represents a significant improvement over CURB-65. The combination of a small number of variables and external validation suggests that our models reflect the intrinsic pathophysiology of COVID-19 and not a temporal or institutional quirk. With novel vaccine-resistant COVID-19 variants appearing regularly and COVID-19 remaining the fourth leading cause of overall mortality in the United States (43), the need for a rapid method to assess COVID-19 outcome remains high (19).
Our predictive performances may be conservative compared to others since (i) we intentionally excluded patients hospitalized for <24 h, either due to death in that time frame or mild illness and discharge to home, because this represents an obvious dichotomy to clinicians; (ii) our time frame for mortality events was longer than in other studies (27-29), including patients who had protracted hospital courses before their demise; (iii) we limited analysis to variables likely available to physicians within the first 24 h of hospitalization, which excludes such factors as number of ICU admissions and later therapeutic interventions (30, 37); (iv) our models were designed to sacrifice specificity and overall ROC AUC to favor better identification of illness severity, which was less common but more important to recognize; and (v) several prior COVID-19 outcome predictive algorithms used radiographic biomarkers, which may not be readily available or consistently interpreted under epidemic conditions (32, 35, 37) and were not used in our study. This study has several important limitations. It was conducted using a data set from a single institution in early 2020 at the start of the COVID-19 pandemic in the United States; models generated from these data may not be fully representative of COVID-19 patients throughout the United States. Differing triage thresholds also may limit generalizability of algorithms predicting intubation and ICU admission since they depend on local epidemic status. The counterintuitive differences in the predictive values of age and dementia for mortality on the one hand and ICU admission/intubation on the other may reflect such physician triage, explaining the reduced algorithm accuracy predicting those events (25, 26). Because we were unable to obtain symptom data in our external 2021 validation data set, we were unable to assess how CURB-65 performed on the external data set.
It also is important to consider that the selection of the patients included in the training and testing of the model affects its performance and use cases. Our models are designed to be used as general-purpose tools to stratify the mortality risk for all hospitalized COVID-19 patients within the critical first 24 h of their hospital stay, which can aid in triaging patients before there is obvious clinical evidence of decompensation. This approach enables outcome prediction during the portion of a patient's hospital stay with the greatest prognostic uncertainty. While we attempted to tailor our models to this end by excluding patients hospitalized for less than 24 h, some of the predictive performance of the model will be affected by patients who are already decompensated upon arrival to the hospital or, conversely, represent hospital admissions with a low likelihood of serious illness. Future directions of our study include focusing on COVID-19 patients in the ICU and generating a rolling estimate of prognosis by continually incorporating new clinical information. An important limitation of observational studies such as the one we present is the confounding that is inherent to retrospective studies. We hope to address this limitation and to refine our model's performance using a prospective cohort of COVID-19 patients in the future.
Our study used SHAP values to identify important factors in predicting severe COVID-19 outcomes; however, most comorbidities and demographic variables were not highly predictive. Advanced age, a well-documented risk factor for severe COVID-19 (2, 9-12, 15, 20, 41, 42), was the strongest predictor of mortality in our study and most likely accounted for the highest mortality rates in non-Hispanic whites, who were older on average in our studied population. Abnormal blood urea nitrogen (BUN) also is a well-validated prognostic marker of severity (15, 23), as COVID-19 causes renal injury through hypotension, direct podocyte infection, and immune dysregulation following cytokine storm (23, 44, 45). Our models highlighted elevated lactate, reflecting both abnormal metabolism and hypoxia, previously implicated in poor COVID-19 prognosis (9, 18, 46). Low platelet count is an indicator of thrombotic activity and platelet turnover, which predispose to the severe COVID-19 complications of stroke and disseminated intravascular coagulation (13, 47, 48). Red blood cell distribution width (RDW), a marker of variability of red blood cell size, has been linked to adverse COVID-19 outcomes through its association with inflammation, hemolysis, and intravascular coagulopathy (49-51). Low eosinophil count also was a useful predictor of poor COVID-19 outcomes, consistent with prior reports documenting its perturbation (20, 21, 24, 52-54). Since eosinophil count was not identified in univariate analysis (Table 1), its relationship with COVID-19 may not be linear. We considered whether eosinophil count could be confounded by corticosteroid administration, which can lower eosinophil count and may hold a mortality benefit for COVID-19. Because only 66 (7.2%) of the 921 patients in our cohort were known to have been administered steroids while 421 members of the cohort had an absolute eosinophil count of 0, we believe that the association was not driven by steroid exposure. This is further supported by the body of evidence linking COVID-19 outcomes with eosinophil count, including one study that excluded patients given corticosteroids (55). Aspartate aminotransferase (AST) and C-reactive protein (CRP) are markers of liver damage and inflammation, respectively, and both are linked with poor COVID-19 prognosis (13, 48, 56-58). Overall, the risk factors identified are consistent with an illness leading to mortality due to coagulopathy, inflammation, and decreased oxygenation of critical organs. The variables that are most predictive in our models highlight that COVID-19 is a multisystemic disease with significant morbidity and mortality occurring because of coagulopathy and direct damage to critical organs such as the lung, kidney, heart, and liver. Such pathophysiology separates COVID-19 from other infections that are essentially respiratory, such as influenza and bacterial pneumonia. This difference may explain why the CURB-65 score, as well as other clinical prediction rules, may not suffice for COVID-19 prognosis.
Despite the limitations above, the fact that PLABAC still performed relatively well on the broader external data set is a sign of its utility and promise. We envision PLABAC and PRABLE being used by clinicians to predict prognosis of their patients with COVID-19, efficiently allocate resources, and facilitate communication about an individual patient's chances of survival. Overall, this work has generated models that can create clinically useful predictions for patients being hospitalized for COVID-19 and represents a practical first step for the inclusion of machine learning into clinical decision-making for COVID-19 that can serve as a template for future prognostic models.

Data collection and patient population
We performed a retrospective analysis of 969 adults who were admitted between 19 March 2020 and 31 May 2020 to the RWJUH in New Brunswick with a diagnosis of COVID-19 infection during the peak of the first wave in New Jersey. Through a retrospective chart review, the authors systematically collected data including demographics, comorbidities, symptoms, and inpatient labs, vitals, and medical management. The data were then cleaned over multiple courses of quality control, with each chart reviewed by ≥2 independent readers. Comorbidities, reported from the chart to remove surveyor bias, were later reclassified by relevant category as validated by two independent observers (Tables S1 and S2), and all laboratory values were downloaded from the hospital database. All 935 patients who tested positive for SARS-CoV-2 infection by nasopharyngeal swab using PCR were included in our study, except for 16 patients who had a hospital stay of <24 h. Patients who were diagnosed with COVID-19 clinically but were never PCR-confirmed were also excluded from the model (n = 34). Patients readmitted to the hospital, including after the prespecified end date, were included. Time 0 was defined as the first available time stamp of a patient's initial laboratory value obtained from either the hospital or the emergency room. Additional data included length of stay, number of readmissions, discharge location (rehabilitation facility, home, skilled nursing facility, or death), ICU LOS, and date of death.

Outcomes
We developed models to predict three clinical outcomes indicative of illness severity: intensive care unit admission, intubation, and death. Intubation was defined as any form of mechanical ventilation. Death was defined as a patient dying in the hospital or being discharged to hospice care, as detailed in Table S3.

Feature processing and selection
All models began with all variables recorded within the first day of hospital admission. We assumed that binary patient history terms such as pregnancy, smoking history, comorbidities, physical examination findings, and symptoms not recorded in a patient's chart were not present. The models included only categorical features that were present in >5% of our patient population, which excludes some of the comorbidities listed in Table S1. Ethnicity and sex, used in their constitutive categories, were treated as binary variables. Continuous variables with values in ≥50% of our patient population were included, and values that were obvious recording errors were removed. We imputed the results of missing continuous variables using the median value in our patient population. To reduce selection bias, we assessed whether the proportion of missing laboratory tests differed between patients who had a severe illness (ICU admission, intubation, or death) or not and excluded those tests whose values were not missing at random (independent t-test, P < 0.001). Tests quantifying urine blood or leukocyte esterase were converted to an ordinal scale. We calculated variance inflation factors (VIFs) for our data set to identify correlated variables, removing those with VIF of >5 from analyses (59). Some variables that are often used in concert clinically, such as history of chronic kidney disease and serum levels of BUN and creatinine, were included in our analyses despite their representing a similar underlying pathology. Our rationale was that the interaction between these variables may hold prognostic utility, which mirrors how they are used in clinical practice. We subsequently identified that patients who were unresponsive or had altered mental status were significantly (P < 0.05, independent t-test for each) less likely to have recorded symptoms, likely because such patients cannot provide a reliable history; we thus removed these symptoms from our analysis. Following this step, each variable was standardized by removing the mean and scaling to unit variance. These steps yielded a set of 77 variables relating to demographics, medical history, symptoms, physical exam findings, and laboratory values obtained early in the hospital course (Table S3).
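The handling of a single continuous variable described above (retain it if values exist for at least 50% of patients, impute missing values with the cohort median, then standardize) can be sketched as follows. This is a minimal standard-library illustration rather than the study's code; the function names and the lactate values are hypothetical, and the use of the population standard deviation for scaling is an assumption.

```python
import statistics

def has_enough_data(values, min_fraction=0.5):
    """Keep a continuous variable only if it was recorded for at least
    `min_fraction` of patients (None marks a missing value)."""
    observed = sum(v is not None for v in values)
    return observed / len(values) >= min_fraction

def impute_median(values):
    """Replace missing values with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

def standardize(values):
    """Remove the mean and scale to unit variance.
    Assumption: population standard deviation (divide by n)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]

# Hypothetical day 1 lactate values (mmol/L) for six patients.
lactate = [1.0, None, 3.0, 2.0, None, 2.0]

if has_enough_data(lactate):
    z_scores = standardize(impute_median(lactate))
    print([round(z, 3) for z in z_scores])
```

In a cross-validated setting, the median, mean, and standard deviation would normally be estimated on the training folds only and then applied to the held-out fold; the source does not state which approach was used.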

Machine learning analysis
We initially selected four machine learning algorithms (LASSO [least absolute shrinkage and selection operator] logistic regression, random forest, extreme gradient boosted trees, and SVM) based on prior utility for similar tasks (10, 11, 31). Each model was evaluated using 10-fold stratified cross-validation on our own data set. Algorithms were evaluated for sensitivity, specificity, positive predictive value, negative predictive value, F1 metric, receiver operating characteristic area under the curve (ROC AUC), and negative log loss value, as described (10, 12, 15, 16, 27, 30, 32, 33, 36, 38, 57, 60, 61). Class weight was balanced by the ratio of minority to majority class. Hyperparameters for each algorithm were optimized using a random search through the parameter space. A voting classifier was generated using input from the four machine learning models and used soft voting, averaging the predicted probabilities generated by each model. For analysis of mortality, algorithms were compared to CURB-65, a 5-point metric (including confusion; elevated age, BUN, and respiratory rate; and decreased systolic or diastolic blood pressure) for assessing community-acquired pneumonia severity, with scores corresponding to 30-day mortality risk (15). These score-specific mortality risks were used as predicted probabilities to calculate the negative log loss value of CURB-65, and a score of 2 was used as the mortality cutoff, as described (15) and validated on our own cohort (Table S3).
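The two decision rules compared in this section can be illustrated with a short sketch. It is not the study's implementation. The CURB-65 thresholds below come from the standard published definition of the score and are not stated in this article; the per-model probabilities and the patient values are hypothetical. Soft voting is shown as the unweighted mean of the component models' predicted probabilities, with the 50% cutoff mentioned in the Discussion.

```python
def curb65_score(confusion, bun_mg_dl, resp_rate, systolic, diastolic, age):
    """One point per criterion (standard CURB-65 thresholds, assumed here):
    confusion, BUN > 19 mg/dL, respiratory rate >= 30/min,
    systolic BP < 90 or diastolic BP <= 60 mm Hg, age >= 65 years."""
    return sum([
        bool(confusion),
        bun_mg_dl > 19,
        resp_rate >= 30,
        systolic < 90 or diastolic <= 60,
        age >= 65,
    ])

def curb65_predicts_death(score, cutoff=2):
    """Binary CURB-65 prediction using the cutoff of 2 described above."""
    return score >= cutoff

def soft_vote(probabilities):
    """Soft voting: average the probabilities from the component models."""
    return sum(probabilities) / len(probabilities)

# Hypothetical patient on day 1 of admission.
score = curb65_score(confusion=False, bun_mg_dl=28, resp_rate=24,
                     systolic=110, diastolic=58, age=71)
print(score, curb65_predicts_death(score))

# Hypothetical mortality probabilities from the four component models.
model_probs = {"lasso": 0.62, "random_forest": 0.48,
               "xgboost": 0.55, "svm": 0.41}
p_death = soft_vote(list(model_probs.values()))
print(round(p_death, 3), p_death >= 0.5)
```

Averaging probabilities, rather than counting binary votes, lets a confident model outweigh marginal ones and yields a continuous risk estimate that can be scored with log loss.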

Feature importance and generation of simplified models
Feature importance was calculated using SHapley Additive exPlanations (SHAP) (62). Shapley values use game theory to fairly assign payout in a task by measuring the contribution of each of the players in the task and their interactions and can be used in machine learning to explain how combinations of variables contribute to the final prediction in natural-logarithm odds (referred to herein as log odds). We used SHAP values to identify the most predictive variables for COVID-19 prognosis. Using the top 10 most predictive variables by SHAP values from our strongest 77-variable classifier (the voting classifier), we developed two six-variable models. PLABAC uses variables represented both in this top 10 list and in the National COVID Cohort Collaborative Data Enclave data set we used for external validation, and PRABLE uses the top six variables.
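To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a toy three-variable log-odds model. It is an illustration only: the model, its coefficients, the reference patient, and the example patient are all invented, and the SHAP library used in the study relies on its own estimation algorithms rather than this brute-force enumeration. Each variable's value is its marginal contribution to the log odds, averaged over every order in which variables could be switched from the reference value to the patient's value.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all coalitions of features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

# Hypothetical log-odds model with an age x BUN interaction term.
def log_odds(age, bun, lactate):
    return -6.0 + 0.05 * age + 0.03 * bun + 0.2 * lactate + 0.001 * age * bun

reference = {"age": 60, "bun": 15, "lactate": 1.5}  # hypothetical baseline
patient = {"age": 80, "bun": 40, "lactate": 4.0}    # hypothetical patient

def value_fn(coalition):
    """Model output when only the features in `coalition` take the
    patient's values; the rest stay at the reference values."""
    x = {k: (patient[k] if k in coalition else reference[k]) for k in reference}
    return log_odds(**x)

phi = shapley_values(value_fn, list(reference))
for name, contribution in phi.items():
    print(name, round(contribution, 3))
```

The contributions sum to the difference in log odds between the patient and the reference, and the interaction term is shared between age and BUN, which is how Shapley values attribute effects of combinations of variables.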

External validation
To assess our PLABAC model with clinical data external to our institution and time frame, we used data from the N3C data set, which includes patients with any encounter after 1 January 2020 and before 24 September 2021. Included patients had one of a set of a priori-defined SARS-CoV-2 laboratory tests, a strong positive diagnostic code, or two weak positive diagnostic codes during the same encounter or same date for patients admitted prior to May 2020 (39, 63). The cohort definition is publicly available on GitHub (64). For our validation data set, we required that each included patient have a recorded age and all five day 1 laboratory values and an inpatient hospital stay of >1 day with laboratory-confirmed COVID-19. We used 1 March 2021 as the breakpoint between the pre-vaccination and vaccination periods of the COVID-19 pandemic, with the latter extending to 24 September 2021. For these two periods, 7901 (with 1863 [23.6%] deaths) and 1547 (with 285 [18.4%] deaths) patient records, respectively, met our criteria and were used to evaluate the performance of our models (Table 4).
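The inclusion criteria and the date-based split can be expressed as a short filter. This is a hedged sketch, not the N3C query: the record layout and field names are hypothetical, and assigning admissions on 1 March 2021 itself to the vaccination period is an assumption, since the text specifies only the breakpoint date.

```python
from datetime import date

VACCINATION_ERA_START = date(2021, 3, 1)
# Age plus the five PLABAC day 1 laboratory values (hypothetical field names).
REQUIRED_FIELDS = ("age", "platelets", "lactate", "bun", "ast", "crp")

def is_eligible(record):
    """All six model inputs present, laboratory-confirmed COVID-19,
    and an inpatient stay longer than 1 day."""
    has_inputs = all(record.get(f) is not None for f in REQUIRED_FIELDS)
    return has_inputs and record["lab_confirmed"] and record["stay_days"] > 1

def era(admit_date):
    """Assumption: admissions on 1 March 2021 count as vaccination era."""
    if admit_date >= VACCINATION_ERA_START:
        return "vaccination"
    return "pre_vaccination"

def split_by_era(records):
    cohorts = {"pre_vaccination": [], "vaccination": []}
    for record in records:
        if is_eligible(record):
            cohorts[era(record["admit_date"])].append(record)
    return cohorts

# Three hypothetical records; the third lacks a lactate value.
labs = {"age": 70, "platelets": 180, "lactate": 2.1,
        "bun": 25, "ast": 40, "crp": 90}
records = [
    {**labs, "admit_date": date(2020, 11, 3), "stay_days": 6, "lab_confirmed": True},
    {**labs, "admit_date": date(2021, 6, 15), "stay_days": 3, "lab_confirmed": True},
    {**labs, "lactate": None, "admit_date": date(2021, 4, 2),
     "stay_days": 4, "lab_confirmed": True},
]
cohorts = split_by_era(records)
print({name: len(rows) for name, rows in cohorts.items()})

# Mortality rates implied by the counts reported above.
print(round(100 * 1863 / 7901, 1), round(100 * 285 / 1547, 1))
```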

Statistical analysis
For the univariate analyses listed in Table 1, continuous variable normality was assessed through the D'Agostino-Pearson test. Bonferroni-corrected Kruskal-Wallis H test was used for all non-normal continuous variables. Bonferroni-corrected two-sample t-test was used for albumin, the only normally distributed continuous variable. Chi-squared tests were used for all categorical variables.
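The Bonferroni correction mentioned above multiplies each raw P value by the number of tests performed (capped at 1) before comparing it with the significance level. The sketch below is illustrative only: the raw P values are hypothetical, and the significance level of 0.05 and the family of tests are assumptions, since neither is specified in this section.

```python
def bonferroni(p_values, alpha=0.05):
    """Return {name: (adjusted_p, significant)} for a family of tests."""
    m = len(p_values)
    results = {}
    for name, p in p_values.items():
        adjusted = min(p * m, 1.0)
        results[name] = (adjusted, adjusted < alpha)
    return results

# Hypothetical raw P values from three univariate tests.
raw = {"age": 0.0001, "bun": 0.004, "sodium": 0.03}
corrected = bonferroni(raw)
for name, (adjusted, significant) in corrected.items():
    print(name, round(adjusted, 4), significant)
```

With three tests, a raw P value of 0.03 no longer meets the 0.05 threshold after correction, which shows how the procedure guards against false positives when many variables are screened.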

FIG 1 The 10 most important features identified by SHAP values for predicting mortality by the 77-variable voting classifier. Each point on the plot is a patient's value for the specified variable, in descending ranked absolute feature importance for the voting classifier for mortality prediction. The numerical feature values are shown on a red (high)-blue (low) scale. Impact on model output is shown as log odds for mortality. Abbreviations: AST, aspartate aminotransferase; BUN, blood urea nitrogen; MPV, mean platelet volume; RDW, red blood cell distribution width.

TABLE 1
Demographics, comorbidities, vital signs, and laboratory findings used in full variable models significantly associated with COVID-19 mortality in univariate analysis a, c

TABLE 2
Comparison of CURB-65 with other models for mortality prediction a, b
a All metrics (other than CURB-65) were derived via stratified 10-fold cross-validation on our data set and averaged across the 10 folds by the mean. Top-performing models of each class are bolded.
b NPV, negative predictive value; PPV, positive predictive value; SVM, support vector machine.

TABLE 3
Comparison of full-feature algorithms for intubation and ICU admission a
a All metrics were derived via stratified 10-fold cross-validation on our data set and averaged across the 10 folds by the mean. Intubation and ICU models were evaluated to predict those outcomes, respectively. Top-performing models of each class are bolded.

FIG 2 Importance of patient age in the prediction of mortality, intubation, and ICU admission. SHAP values for age. The y axis represents the impact of a value on our model output expressed in log odds, where a more positive number is associated with higher risk, and the x axis is the numerical value of age, in years.

Research Article, mBio, September/October 2023, Volume 14, Issue 5, 10.1128/mbio.01508-23