Predicting Appropriate Hospital Admission of Emergency Department Patients with Bronchiolitis: Secondary Analysis

Gang Luo1, PhD; Bryan L Stone2, MD, MS; Flory L Nkoy2, MD, MS, MPH; Shan He3, PhD; Michael D Johnson2, MD 1Department of Biomedical Informatics and Medical Education, University of Washington, UW Medicine South Lake Union, 850 Republican Street, Building C, Box 358047, Seattle, WA 98195, USA 2Department of Pediatrics, University of Utah, 100 N Mario Capecchi Drive, Salt Lake City, UT 84113, USA 3Care Transformation, Intermountain Healthcare, World Trade Center, 16th floor, 60 East South Temple Street, Salt Lake City, UT 84111, USA luogang@uw.edu, bryan.stone@hsc.utah.edu, flory.nkoy@hsc.utah.edu, shan.he@imail.org, mike.johnson@hsc.utah.edu


Introduction
Bronchiolitis is an inflammation of the bronchioles, the smallest air passages in the lungs, mainly seen in children below age two [1]. More than 1/3 of children have been diagnosed with bronchiolitis by age two [1]. In children below age two, bronchiolitis causes 16% of hospitalizations and is the most common reason for hospitalization [2][3][4][5]. Each year in the United States, bronchiolitis leads to approximately 287,000 emergency department (ED) visits [6], 128,000 hospitalizations [2], and US $1.73 billion of total inpatient cost (2009) [2]. About 32%-40% of ED visits for bronchiolitis result in hospitalization [7][8][9]. Current clinical guidelines for bronchiolitis [10,11] acknowledge that due to a lack of evidence and objective criteria for managing bronchiolitis, clinicians often make ED disposition decisions on hospitalization or discharge subjectively [4,12]. This uncertainty in bronchiolitis management leads to large practice variation [3,[12][13][14][15][16][17][18][19][20][21][22][23], increased iatrogenic risk, suboptimal outcomes, and wasted healthcare resources resulting from unnecessary admissions and unsafe discharges [15,21,24]. Around 10% of infants with bronchiolitis encounter adverse events during hospital stay [25]. By examining the distributions of multiple relevant attributes of ED visits for bronchiolitis and using a data-driven method to determine two threshold values, we recently developed the first operational definition of appropriate hospital admission for ED patients with bronchiolitis [26]. As shown in Figure 1, appropriate admissions cover both necessary admissions (actual admissions that are necessary) and unsafe discharges. Appropriate ED discharges cover both safe discharges and unnecessary admissions. Unsafe discharges are defined based on early ED returns. Unnecessary admissions are defined based on no more than brief exposure to certain major medical interventions listed in Figure 1. 
Figure 1. Operational definition of appropriate hospital admission for ED patients with bronchiolitis:
Appropriate admissions = necessary admissions + unsafe discharges
Appropriate ED discharges = safe discharges + unnecessary admissions
Unnecessary admissions: actual admissions with exposure to one or more of the following major medical interventions for ≤6 hours: supplemental oxygen, intravenous fluids, nasopharyngeal suctioning, cardiovascular support, invasive positive pressure ventilation (mechanical ventilation), noninvasive positive pressure ventilation, chest physiotherapy, inhaled therapy (bronchodilators and mucolytics), and nutritional support (enteral feeding and total parenteral nutrition).
Unsafe discharges: actual emergency department discharges followed by an emergency department return within 12 hours with admission for bronchiolitis.

Brief exposure was defined as ≤6 hours, a threshold chosen conservatively based on the median duration of major medical interventions received by a subset of patients who tended to have been admitted unnecessarily. Based on this operational definition, we showed that 6.08% of ED disposition decisions for bronchiolitis were inappropriate [26].
So far, several models have been built for predicting hospital admission in ED patients with bronchiolitis [7][8][9][27][28][29]. As our review paper [30] pointed out, these models have low accuracy and incorrectly assume actual ED disposition decisions are always appropriate. An accurate model for predicting appropriate hospital admission can guide ED disposition decisions for bronchiolitis and improve outcomes. This model, which is yet to be built, would be particularly useful for less experienced clinicians, including those who are junior and those in general practice seeing children infrequently [31]. The current study's objective is to build the first model to predict appropriate hospital admission for ED patients with bronchiolitis. The dependent variable of the appropriate ED disposition decision is categorical and has two possible values: appropriate admission and appropriate ED discharge. Accordingly, the model uses clinical and administrative data to conduct binary classification.

Methods Study design and ethics approval
In this study, we performed a secondary analysis of retrospective data. The Institutional Review Boards of University of Washington Medicine, the University of Utah, and Intermountain Healthcare reviewed and approved this study and waived the requirement for informed consent for all patients.

Patient population
Our patient cohort consisted of children below age two with ED visits for bronchiolitis in 2013-2014 at any of the 22 Intermountain Healthcare hospitals. Intermountain Healthcare is the largest healthcare system in Utah, with 22 hospitals and 185 clinics delivering ~85% of pediatric care in Utah [32]. As in our prior paper [26], we adopted the approach used by Flaherman et al. [33][34][35]. Any of these discharge diagnosis codes, rather than only the discharge diagnosis code of bronchiolitis, could be assigned to an ED visit for bronchiolitis. In addition, this approach included all patients with any of the above as a non-primary diagnosis code, as long as the ICD-9-CM primary diagnosis code was any of the following: apnea

Data set
From Intermountain Healthcare's enterprise data warehouse, we extracted a clinical and administrative data set containing information of our patient cohort's inpatient stays, ED visits, and outpatient visits at Intermountain Healthcare in 2011-2014. Recall that our patient cohort included children below age two with Intermountain Healthcare ED visits for bronchiolitis in 2013-2014. By starting the data set in 2011, we ensured that for each ED visit by a patient below age two for bronchiolitis in 2013-2014, the data set included the patient's complete prior medical history recorded within Intermountain Healthcare and necessary for computing features (a.k.a. independent variables).

Features
The 35 candidate patient features fall into two disjoint categories: (1) Category 1 includes all known predictors of hospital admission in ED patients with bronchiolitis that were consistently recorded at Intermountain Healthcare facilities and available as structured attributes in our data set [30,31]. These 15 predictors are: age in days, gender, heart rate, respiratory rate, peripheral capillary oxygen saturation (SpO2), temperature, co-infection, rhinovirus infection, enterovirus infection, history of bronchopulmonary dysplasia, history of eczema, prior intubation, prior hospitalization, prematurity, and dehydration. For any vital sign recorded more than once during the ED visit, we used its last value as the feature value, because among all recorded values the last one most closely reflected the patient's status at ED disposition time.
(2) Category 2 consists of 20 features suggested by our team's clinical experts BS, MJ, and FN: race, ethnicity, insurance category (public, private, or self-paid or charity), the ED visit's acuity level (resuscitation, emergent, urgent, semi-urgent, or non-urgent), chief complaint, number of consults called during the ED visit, number of lab tests ordered during the ED visit, number of radiology studies ordered during the ED visit, number of X-rays ordered during the ED visit, length of ED stay in minutes, hour of ED disposition, whether the patient is current with his/her immunizations, diastolic blood pressure, systolic blood pressure, weight, wheezing (none, expiratory, inspiratory and expiratory, or diminished breath sounds), retractions (none, one location, two locations, or three or more locations), respiratory syncytial virus infection, language barrier to learning, and whether the patient has any other barrier to learning. For wheezing and retractions, if the attribute was recorded more than once during the ED visit, we used its last value as the feature value, because the last value most closely reflected the patient's status at ED disposition time. Based on their timestamps, all candidate features were available as structured attributes in our data set before ED disposition time. We used them to build predictive models.

Data analysis Data preparation
For each ED visit by a patient below age two for bronchiolitis in 2013-2014, we used our previously developed operational definition of appropriate admission [26] (see Figure 1) to compute the dependent variable's value. For each numerical feature, we examined its data distribution, used upper and lower bounds given by our team's ED expert MJ to identify invalid values, and replaced each invalid value with a null value. All temperatures <80 or >110 Fahrenheit, all weights >50 pounds, all systolic blood pressure values equal to 0, all SpO2 values >100%, all respiratory rates >120 breaths/minute, and all heart rates <30 or >300 beats/minute were regarded as physiologically impossible and invalid. To put all numerical features on the same scale, we standardized each by subtracting its mean and dividing by its standard deviation. We focused on two years of data for ED visits for bronchiolitis (2013-2014). Data from the first year (2013) were used to train predictive models. Data from the second year (2014) were used to evaluate model performance, reflecting use in practice.
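The value-cleaning and standardization steps described above can be sketched as follows. This is a minimal illustration, not our actual pipeline; the variable names are ours, and the lower bounds shown for SpO2 and respiratory rate are illustrative assumptions (the paper only states the upper bounds).

```python
from statistics import mean, pstdev

# Validity bounds from the text; lower bounds for spo2 and
# respiratory_rate are illustrative assumptions.
BOUNDS = {
    "temperature_f": (80, 110),    # <80 or >110 Fahrenheit is invalid
    "spo2": (0, 100),              # >100% is invalid
    "respiratory_rate": (0, 120),  # >120 breaths/minute is invalid
    "heart_rate": (30, 300),       # <30 or >300 beats/minute is invalid
}

def clean(values, lo, hi):
    """Replace physiologically impossible values with nulls (None)."""
    return [v if (v is not None and lo <= v <= hi) else None for v in values]

def standardize(values):
    """Z-score standardization over non-null values; nulls stay null."""
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), pstdev(observed)
    return [None if v is None else (v - mu) / sigma for v in values]

heart_rates = [150, 160, 20, 170, 350]  # 20 and 350 are invalid
cleaned = clean(heart_rates, *BOUNDS["heart_rate"])
scaled = standardize(cleaned)
```

Null values produced by cleaning are left for the downstream algorithm to handle (random forest tolerates missing feature values, as noted in the Results).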

Performance metrics
As shown in Table 1 and the formulas below, we used six standard metrics to measure model performance: accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Area Under the receiver operating characteristic Curve (AUC). For instance, the false negative (FN) count is the number of appropriate admissions that the model incorrectly predicts to be ED discharges. Sensitivity measures the proportion of appropriate admissions that the model identifies; specificity measures the proportion of appropriate ED discharges that the model identifies. For each of the six performance metrics, we used bootstrap resampling with 1,000 samples [36] to compute its 95% confidence interval: on each bootstrap sample of the 2014 data, we computed the metric, and the metric's 2.5th and 97.5th percentiles across the 1,000 bootstrap samples specified its 95% confidence interval.
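The bootstrap procedure for the 95% confidence intervals can be sketched as follows; this is a minimal illustration with toy data rather than our actual evaluation code, and the function names are ours.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, seed=0):
    """95% CI for a performance metric: resample (truth, prediction)
    pairs with replacement, recompute the metric on each sample, and
    take the 2.5th and 97.5th percentiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot)]

# Toy example: 1 = appropriate admission, 0 = appropriate ED discharge.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
low, high = bootstrap_ci(y_true, y_pred, accuracy, n_boot=1000)
```

The same `bootstrap_ci` helper would be applied to each of the six metrics in turn.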
To show the sensitivity-specificity tradeoff, we plotted the receiver operating characteristic curve. The calibration of a model refers to how well its predicted probabilities of appropriate admission match the fractions of appropriate admissions in subgroups of ED visits for bronchiolitis. To show model calibration, we drew a calibration plot [36], in which a perfectly calibrated model's curve coincides with the diagonal line. In addition, we used the Hosmer-Lemeshow goodness-of-fit test [36] to evaluate model calibration.
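The decile grouping that underlies both the calibration plot and the Hosmer-Lemeshow test can be sketched as follows: test instances are sorted by predicted probability, split into equal-size groups, and each group's mean predicted probability is compared with its observed fraction of appropriate admissions. The function name is ours.

```python
def calibration_by_decile(probs, outcomes, n_bins=10):
    """Return (mean predicted probability, observed fraction) per bin,
    with bins formed by rank of predicted probability."""
    pairs = sorted(zip(probs, outcomes))
    size = len(pairs) // n_bins
    points = []
    for b in range(n_bins):
        # Last bin absorbs any remainder so every instance is used.
        chunk = pairs[b * size:(b + 1) * size] if b < n_bins - 1 else pairs[b * size:]
        if chunk:
            mean_p = sum(p for p, _ in chunk) / len(chunk)
            obs = sum(o for _, o in chunk) / len(chunk)
            points.append((mean_p, obs))
    return points

# Toy example with two bins: a well-calibrated model would yield
# points lying near the diagonal of the calibration plot.
points = calibration_by_decile([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_bins=2)
```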

Classification algorithms
We used Weka [37], a widely used open-source machine learning and data mining toolkit, to build machine learning classification models. Machine learning, which wins most data science competitions [38], studies computer algorithms that learn from data, such as random forest, support vector machine, and neural network. Weka integrates many commonly used machine learning algorithms and feature selection techniques. We considered all 39 machine learning classification algorithms in the standard Weka package and used our previously developed automatic machine learning model selection method [39] with the 2013 training data to automatically select the algorithm, feature selection technique, and hyper-parameter values among all applicable ones. In a machine learning algorithm, hyper-parameters are parameters whose values are traditionally set manually by the user before model training; an example is the number of decision trees used in a random forest classifier. Our automatic model selection method [39] uses the Bayesian optimization (a.k.a. response surface) methodology to automatically explore numerous combinations of algorithm, feature selection technique, and hyper-parameter values, and performs 3-fold cross-validation to select the final combination maximizing the AUC. Compared to the other five performance metrics (accuracy, sensitivity, specificity, PPV, and NPV), the AUC has the advantage of not relying on the cut-off threshold for deciding between predicted admission and predicted ED discharge.
Tables 2 and 3 show the characteristics of the ED visits in the 2013 and 2014 data, respectively. Based on the χ² two-sample test, for the 2013 data, the ED visits discharged to home and those ending in hospitalization showed the same distribution for gender (P = .49) and different distributions for race (P < .001), ethnicity (P = .01), and insurance category (P < .001).
For the 2014 data, the ED visits discharged to home and those ending in hospitalization showed the same distribution for gender (P = .94) and race (P = .61), and different distributions for ethnicity (P < .001) and insurance category (P < .001). Based on the Cochran-Armitage trend test [41], for both the 2013 and 2014 data, the ED visits discharged to home and those ending in hospitalization showed different distributions for age (P < .001).
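The AUC criterion used for model selection above can be computed directly from the model's scores via the Mann-Whitney U statistic, with no cut-off threshold involved: it is the probability that a randomly chosen positive instance scores higher than a randomly chosen negative one (ties counting half). A minimal sketch, with a function name of our own choosing:

```python
def auc(y_true, scores):
    """AUC as the Mann-Whitney U statistic: the fraction of
    positive/negative pairs in which the positive instance gets the
    higher score, counting ties as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This quadratic pairwise form is fine for illustration; production implementations sort the scores once and use ranks instead.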

Results
Our automatic machine learning model selection method [39] chose the random forest classification algorithm, which can naturally handle missing feature values. Our model was built using this algorithm and the 33 features shown in Table 4. These features are sorted in descending order of their importance values, which were automatically computed by the random forest algorithm in Weka based on average impurity decrease. In general, the features related to the patient's history rank lower than those reflecting the patient's status in the current ED visit, which intuitively makes medical sense. Two candidate patient features, ethnicity and the ED visit's acuity level, were not used in our model because they did not increase model accuracy. Figure 2 shows the receiver operating characteristic curve of our model. Weka uses 50% as its default probability cut-off threshold for making binary classifications. Table 5 shows the error matrix of our model. Figure 3 shows the calibration plot of our model by decile of predicted probability of appropriate admission. The Hosmer-Lemeshow test showed imperfect calibration of the predicted probabilities against the actual outcomes (P < .001). When the predicted probability is <0.5, our model tends to overestimate the actual probability; when it is >0.5, our model tends to underestimate it.
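The importance values in Table 4 were computed by Weka from average impurity decrease. As an illustration of the underlying quantity only (not Weka's implementation, and with helper names of our own), here is the Gini impurity decrease produced by a single binary split:

```python
def gini(labels):
    """Gini impurity of a list of binary labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def impurity_decrease(labels, left, right):
    """Gini decrease from splitting `labels` into `left` and `right`:
    parent impurity minus the size-weighted child impurities."""
    n = len(labels)
    return (gini(labels)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))
```

A random forest's importance value for a feature averages such decreases over every tree node that splits on that feature; a split that perfectly separates the classes yields the maximum decrease, while an uninformative split yields zero.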

Discussion Principal results
We developed the first machine learning classification model to accurately predict appropriate hospital admission for ED patients with bronchiolitis. Our model is a significant improvement over the previous models for predicting hospital admission in ED patients with bronchiolitis [7][8][9][27][28][29]. It has good accuracy: five of the six performance metrics achieved values ≥90%, and the sixth achieved a value >80%. Although our model attained a 3.02% lower accuracy than Intermountain Healthcare clinicians' ED disposition decisions (90.66% vs. 93.68%), we still view it as a step forward with great potential. Our model can output the prediction result for a new patient within 0.01 seconds. With further improvement to boost its accuracy and to automatically explain its prediction results [42,43], our model could be integrated into an electronic health record system and become the basis of a decision support tool that helps make appropriate ED disposition decisions for bronchiolitis. A clinician could then use the model's output as a point of reference when considering the disposition decision. This could provide value, improve outcomes, and reduce healthcare costs for bronchiolitis regardless of whether our future final model achieves a higher accuracy than Intermountain Healthcare clinicians' ED disposition decisions. Our confidence stems from the following considerations: (1) Intermountain Healthcare has several collaborative partnerships among its EDs and hospitals to facilitate coordination of pediatric specialty care, and has completed multiple quality improvement projects for bronchiolitis management. About 52.16% (=3,963/7,598) of ED visits for bronchiolitis within Intermountain Healthcare occur at a tertiary pediatric hospital with an ED staffed by pediatric-specific clinicians.
On average, the ED disposition decisions for bronchiolitis made at Intermountain Healthcare may therefore be more accurate than those made at some other healthcare systems, especially systems with general practice physicians or fewer pediatricians working in their EDs. Our model could be valuable for those systems if it reaches a higher accuracy than the clinicians' ED disposition decisions made there. There is some evidence indicating this possibility: most inappropriate ED disposition decisions are unnecessary admissions [26], and while 14.36% of hospital admissions from the ED were deemed unnecessary in our data set [26], the literature [44,45] reports a larger percentage, between 20% and 29%. To understand our model's value for other systems, additional studies need to be conducted using those systems' data. This is an interesting area for future work.
(2) Figure 4 shows the degree of missing values for each feature with missing values, and Figure 5 shows the probability mass function of the number of features with missing values per data instance. In our data set, several attributes have numerous missing values because those values were either recorded on paper or occasionally undocumented, and therefore were not available in Intermountain Healthcare's electronic health record system. In particular, wheezing and retractions values were missing for 73.56% (=5,589/7,598) of ED visits for bronchiolitis, and systolic and diastolic blood pressure values were missing for 46.49% (=3,532/7,598). This could lower model accuracy. In the future, these attributes are expected to be recorded more completely in Intermountain Healthcare's newly implemented Cerner-based electronic health record system. After re-training our model on more complete Intermountain Healthcare data from future years, we would expect its accuracy to increase. In addition, multiple other healthcare systems, such as Seattle Children's Hospital, have been using the Cerner electronic health record system to record these attributes relatively completely for many years. Our model could possibly achieve a higher accuracy if trained on those systems' data. Both of these are interesting areas for future work. (3) We previously developed a method [42,43] to automatically provide rule-based explanations for any machine learning model's classification results with no accuracy loss. When reporting the performance metrics, we used the default cut-off threshold Weka chose for deciding between predicted admission and predicted ED discharge. Different healthcare systems could emphasize different performance metrics and give divergent weights to false positives and false negatives. As is the case with predictive modeling in general, a healthcare system can always adjust the cut-off threshold based on its preferences.

Comparison with prior work
Previously, researchers had constructed several models to predict hospital admission in ED patients with bronchiolitis [7][8][9][27][28][29]. Table 7 gives a comparison of these previous models with our model. Compared to our model that predicts the appropriate ED disposition decision, the previous models are much less accurate and incorrectly assume actual ED disposition decisions are always appropriate. Our model uses many more patients' data, many more predictive features, and a more sophisticated classification algorithm than the previous models. As is the case with predictive modeling in general, all of these help improve our model's accuracy. Some aspects of our findings are similar to those of previous studies. In our data set, 39.59% (=3,008/7,598) of ED visits for bronchiolitis ended in hospitalization. This percentage is within 32%-40%, the range of hospital admission rates in ED visits for bronchiolitis reported in the literature [7][8][9].

Limitations
This study has several limitations: (1) This study used data from a single healthcare system, Intermountain Healthcare, and did not test our results' generalizability. In the future, it would be desirable to validate our predictive models on other healthcare systems' data. We are reasonably confident in our results, as our study was conducted in a realistic setting for finding factors generalizable to other US healthcare systems. "Intermountain Healthcare is a large healthcare system with EDs at 22 heterogeneous hospitals spread over a large geographic area, ranging from community metropolitan and rural hospitals attended by general practitioners and family doctors with constrained pediatric resources to tertiary care children's and general hospitals in urban areas attended by sub-specialists. Each hospital has a different patient population, geographic location, staff composition, scope of services, and cultural background" [26]. (2) Despite being an integrated healthcare system, Intermountain Healthcare does not have complete clinical and administrative data on all of its patients. Our data set missed information on patients' healthcare use at non-Intermountain Healthcare facilities. Including data from those facilities may lead to different results, although we do not expect this to significantly change our findings: Intermountain Healthcare delivers ~85% of pediatric care in Utah [32], so our data set is reasonably complete with regard to capturing bronchiolitis patients' healthcare use in Utah. (3) Our operational definition of appropriate hospital admission is imperfect and ignores factors such as patient transportation availability, preference of the patient's parents, and hour of ED disposition [26]. Many of these factors are often undocumented in patient records.
For some hospital admissions from the ED that were regarded as unnecessary under our operational definition, the original admission decisions could have been made because of these factors. (4) Beyond the features used in this paper, there could be other features that would help improve model accuracy. Finding new predictive features is an interesting area for future work.

Conclusions
Our model can predict appropriate hospital admission for ED patients with bronchiolitis with good accuracy. In particular, our model achieved an AUC of 0.960, whereas an AUC ≥0.9 is considered outstanding discrimination [46]. With further improvement, our model could be integrated into an electronic health record system to provide personalized real-time decision support for making ED disposition decisions for bronchiolitis. This could help standardize care and improve outcomes for bronchiolitis.

Authors' contributions
GL was mainly responsible for the paper. He conceptualized and designed the study, performed literature review and data analysis, and wrote the paper. BLS, MDJ, and FLN provided feedback on various medical issues, contributed to conceptualizing the presentation, and revised the paper. SH took part in retrieving the Intermountain Healthcare data set and interpreting its detected peculiarities.