Data Science for Extubation Prediction and Value of Information in Surgical Intensive Care Unit

Besides the traditional indices such as biochemistry, arterial blood gas, rapid shallow breathing index (RSBI), acute physiology and chronic health evaluation (APACHE) II score, this study suggests a data science framework for extubation prediction in the surgical intensive care unit (SICU) and investigates the value of the information our prediction model provides. A data science framework including variable selection (e.g., multivariate adaptive regression splines, stepwise logistic regression and random forest), prediction models (e.g., support vector machine, boosting logistic regression and backpropagation neural network (BPN)) and decision analysis (e.g., Bayesian method) is proposed to identify the important variables and support the extubation decision. An empirical study of a leading hospital in Taiwan in 2015–2016 is conducted to validate the proposed framework. The results show that APACHE II and white blood cells (WBC) are the two most critical variables, and then the priority sequence is eye opening, heart rate, glucose, sodium and hematocrit. BPN with selected variables shows better prediction performance (sensitivity: 0.830; specificity: 0.890; accuracy 0.860) than that with APACHE II or RSBI. The value of information is further investigated and shows that the expected value of experimentation (EVE), 0.652 days (patient staying in the ICU), is saved when comparing with current clinical experience. Furthermore, the maximal value of information occurs in a failure rate around 7.1% and it reveals the “best applicable condition” of the proposed prediction model. The results validate the decision quality and useful information provided by our predicted model.


Introduction
Extubation decision is critical during a surgical intensive care unit (SICU) stay. Assessment of a patient's readiness for removal of the endotracheal tube in the intensive care unit (ICU) is usually based on respiratory, airway, neurological measures, etc. Extubation is mostly decided right after a weaning readiness test involving spontaneous breathing trial (SBT) or low levels of assisted ventilation. Even among patients who meet all weaning criteria and successfully perform a weaning readiness test, 10 to 20% still experience extubation failure (EF) [1]. Patients who suffer EF are usually associated with extremely poor outcomes, including high probability of mortality from 25 to 50% [2].
The extubation failure is defined as inability to sustain spontaneous breathing after removal of the artificial device such as an endothracheal or tracheostomy tube, and need for reintubation within a prespecified time window ranging from 24 h to one week [3]. The reasons for EF are diverse and often short of recognition. There is usually a significant respiratory distress episode accompanied with

Data Science Framework
This section describes the proposed data science framework and its methodologies. Figure 1 shows that a proposed data science framework of endotracheal extubation involves data preprocessing, variable selection, extubation prediction and Bayesian decision analysis. Finally, the framework identifies the significant variables, uses them to build the prediction model and investigates the value of information regarding an extubation decision. framework identifies the significant variables, uses them to build the prediction model and investigates the value of information regarding an extubation decision.

Data Preprocessing
Data preprocessing is used to deal with the incomplete or inconsistent dataset collected from diverse information systems in the ICU. Data preprocessing enhances the data quality which significantly affects the performance of the prediction model. The data is collected from IntelliVue Clinical Information Portfolio (ICIP) including patient data and several electronic health records. We first remove the null and redundant columns and then combine the patients' ID with their corresponding time they entered the ICU as a unique key (i.e., ID) for binding all the data sheets. Second, we address each data sheet by variable combination and data type transformation. Then, we merge each data frame according to the key ID. Based on the previous studies [1,19] and the experts' instruction, for one specific observation we only select the related data within the 48 h before or after extubation. Next, we transformed the categorical data into binary (i.e., dummy) variables to fit some machine learning models. Finally, we use the variance inflation factor (VIF) to address the collinearity problem potentially resulting in wrong identification of relevant predictors in statistical models [20][21][22]. We do stepwise procedure to remove one highly-correlated variable every iteration, and these removed variables are sent to clinical validation.

Variable Selection and Prediction Model
The variable selection and extubation prediction are described. First, due to the data imbalance problem (i.e., number of successes is much larger than number of failures), the under-sampling technique is suggested. In particular, we keep all failure cases and randomly sample the same number of failure cases from success cases for making the success-failure ratio equal to 1. Thus, we generate a balance dataset for variable selection and extubation prediction and we repeat the data generating process (DGP) 100 times for repeated random sub-sampling validation (i.e., Monte Carlo cross-validation).
Variable selection is a method to select important variables (or remove the redundant or insignificant factors) in order to (1)

Data Preprocessing
Data preprocessing is used to deal with the incomplete or inconsistent dataset collected from diverse information systems in the ICU. Data preprocessing enhances the data quality which significantly affects the performance of the prediction model. The data is collected from IntelliVue Clinical Information Portfolio (ICIP) including patient data and several electronic health records. We first remove the null and redundant columns and then combine the patients' ID with their corresponding time they entered the ICU as a unique key (i.e., ID) for binding all the data sheets. Second, we address each data sheet by variable combination and data type transformation. Then, we merge each data frame according to the key ID. Based on the previous studies [1,19] and the experts' instruction, for one specific observation we only select the related data within the 48 h before or after extubation. Next, we transformed the categorical data into binary (i.e., dummy) variables to fit some machine learning models. Finally, we use the variance inflation factor (VIF) to address the collinearity problem potentially resulting in wrong identification of relevant predictors in statistical models [20][21][22]. We do stepwise procedure to remove one highly-correlated variable every iteration, and these removed variables are sent to clinical validation.

Variable Selection and Prediction Model
The variable selection and extubation prediction are described. First, due to the data imbalance problem (i.e., number of successes is much larger than number of failures), the under-sampling technique is suggested. In particular, we keep all failure cases and randomly sample the same number of failure cases from success cases for making the success-failure ratio equal to 1. Thus, we generate a balance dataset for variable selection and extubation prediction and we repeat the data generating process (DGP) 100 times for repeated random sub-sampling validation (i.e., Monte Carlo cross-validation).
Variable selection is a method to select important variables (or remove the redundant or insignificant factors) in order to (1) avoid the curse of dimensionality which may lead to computational complexity and poor performance of the prediction model; (2) provide a better understanding of the causal relationship between predictors and response variable; (3) suggest a cost-effective monitoring with fewer control charts regarding these important variables [22]. In this phase, we suggest three variable selection techniques (including linear and nonlinear models)-Multivariate adaptive regression splines (MARS), stepwise logistic regression (SLR), random forest (RF)-to rank the relevant importance of factors. The reason we suggest three techniques is because we are not familiar with the geometric relation and property between predictors and response variable in the dataset. Thus we apply three methods rather than one and suggest (1) total frequency (TF) of the selected variables by three techniques [23]; (2) 100 times sampling cross-validation for the robustness to identify the important variables. In addition, because we merged each data frame according to the corresponding ID, the number of observations decreased dramatically since some patients' IDs do not match others from a variety of data sheets. Thus, we suggest repeating the data merger again with respect to the selected variables to increase the number of samples. In our case study, the number of observations approximately doubled after data re-merger.
Finally, we use these selected variables to construct the extubation prediction models, including support vector machine (SVM), boosting logistic regression (BLR) and backpropagation neural network (BPN), and assess the performance of each prediction model by the confusion matrix for each method. Due to a relatively small testing dataset, 10 times sampling cross-validation is used. For performance benchmarking, we also compare the proposed framework to the single index RSBI and APACHE II commonly used in clinical practice.

Bayesian Decision Analysis
Based on the extubation prediction model mentioned above, this section uses the prediction results to enhance the extubation decision by applying Bayesian decision analysis and assessing the value of information provided by the prediction model. Bayesian analysis is a method to modify the probability (posterior) by collecting the observed results from the uncertain event. Since the Bayesian analysis estimates the posterior probability (i.e., given an observed event, it estimates the probability of the hypotheses/population/unknown parameters that may explain the observed data), we can treat the probability distribution of collected dataset as a prior distribution (i.e., it gives the probability of observed data for a given hypothesis) and the extubation prediction result in the testing dataset as the likelihood function (i.e., it quantifies the possibility that the observed data would have been observed as a function of the hypotheses). We summarize the Bayesian inference as investigating an uncertain event as an unverified hypothesis presented byθ and all the possible results (i.e., state of nature space) are θ 1 , θ 2 , . . . , θ m . Let set P θ = θ j be the prior probability ofθ = θ j , that is, the probability given by the dataset without further information. When the sample space x i is observed by the decision maker, the probabilities under the given x i are corrected for each θ j based on the Bayesian theorem's so-called posterior probability. That is, where P(x i |θ = θ j ) is the likelihood function ofθ = θ j when x i occurred (i.e., when we observe the sample x i , the probability thatθ is equal to the θ j ). Here, we introduce a new idea and replace all the observed data and likelihood function by the predicted results from the prediction model. That is, this study assumes that we believe the prediction model. Thus, based on the Bayesian decision analysis, we can enhance the decision quality and integrate the data science technique into the decision framework (i.e., the proposed framework as Figure 1). From the previous phase, the BPN is suggested for estimating the likelihood function due to a higher accuracy of prediction. The decision tree and Bayesian analysis are used to enhance the decision quality and quantify the value of information presented by the expected monetary value (EMV) and the expected value of experimentation (EVE) [24]. The decision tree presents a tree structure which can display the details concerning the status in the decision process and demonstrates all possible actions that the decision maker would take and use the probability to show the possible scenarios for the uncertain factors. We calculate the expected profit/loss value of all possible actions for supporting decision-making. The performance of each node is usually characterized by expected monetary value (EMV), which can be calculated by the folding-back method and we select the best one and its corresponding decision node as the value for the next backward iteration. Finally, we conduct a sensitivity analysis of failure rate to assess the value of information provided by the prediction model and identify the best failure rate that can maximize the value of information to validate our proposed framework. Note that this study assesses the value of information from the "cost" aspect and thus we aim to minimize the expected loss.

Data and Results
Because all data being used in the study were part of routine clinical practice, the protocol was approved by the Institutional Review Board of National Cheng Kung University Hospital (approval no.: B-ER-105-362) with a waiver of informed consent. In our empirical study, the data is collected from IntelliVue Clinical Information Portfolio (ICIP) including patient data and several electronic health records from October 2015 to September 2016. The imbalanced panel datasets including several tables regarding biochemistry, arterial blood gas (ABG), blood cell, Glasgow coma scale (GCS), APACHE, extubation, etc. are collected from different information systems, in our case hospitals. There are 23 variables with the number of observations being between 1565 and 626,894. Through data preprocessing, the processed data (i.e., several data sheets combined into a single table) is with 359 observations including 49 failure cases (i.e., reintubation). Table 1 shows the patients' characteristics. It shows that APACHE II indeed presents a significant difference between success and failure of extubation. The cross validation with 100 times the result of variable selection is shown in Table 2 and we find both MARS and SLR have similar results; however, RF shows difference in some variables such as eye opening, RSBI and pO 2 _FiO 2 . Based on a scree plot we select eight important variables suggested by TF, we then repeat the data preprocessing (i.e., re-preprocess) to expand the number of observations to 704 based on these eight variables and then build SVM, BLR and BPN for extubation prediction. The hyperparameters of SVM, BLR and BPN are optimized by the grid search and cross validation. Note that based on 20% data for testing and success-failure ratio equal to 1 for data balance, we randomly choose 10 from 49 failure cases and 10 from 655 success cases for building testing datasets. The prediction results of different models in the testing dataset were shown in Table 3 via 10-time cross validation. We also list BLR and BPN with single index RSBI and APACHE II to compare with typical methods used in practice (SVM with single index is ignored due to its relatively poor performance with selected variables by TF).  In fact, the penalty in false positive (i.e., predict success in extubation but actually fail) is more serious than false negative (i.e., predict failure in extubation but actually succeed). In fact, there is a trade-off between false positive and false negative, thus we aim to select the prediction model with high accuracy and low false positive. In Table 3, BPN with selected variables by TF shows better performance. Note that this study did not suggest BPN with APACHE II as a prediction model since zero false positive is too ideal.
According to the prediction result of BPN, the value of information provided by the data science prediction model can be investigated by Bayesian decision analysis. In particular, building a decision tree shows the status along with the expected cost/loss after the decision of extubation as shown in Figure 2. Comparing with successfully extubated patients, the patients who need reintubation are more likely to spend more time in the ICU and the intubated patients should spend one more day to recheck the status before extubation [5]. Thus, in our case hospital, the cost (also called penalty/loss) is characterized by "more time-spent in ICU (i.e., more days of stay in the ICU)" shown in the right-hand side of Figure 2.  Figure 2 shows the calculated expected cost (i.e., days of stay in the ICU) of each chance node for decisions made. When the case of perfect information is considered, the expected cost is equal to 4.5696 days (expected costs under perfect information, ECPI); however, the expected cost without extra information (ECWI) is 5.662 days. Based on clinical experience, the cost of deciding not to extubate was 5.5 days less than 5.662 and we make the decision to "not extubate". Thus, the expected value of perfect information (EVPI) is = 5.5 4.5696(ECPI) = 0.9304 (days). That is, if we have perfect information, we will on average save 0.9304 days (patients staying in the ICU) when making an extubation decision. In addition, the expected cost of using the prediction model (i.e., expected costs of experiment, ECE) on an extubation decision is 4.8480 days less than 5.5 days and thus we suggest the decision use prediction model". Therefore, the expected value of experimentation (EVE) is calculated as = 5.5 4.8480 = 0.652 (days). That is, using the prediction model will roughly save 0.652 days (patient staying in the ICU) compared with current clinical experience. It implies useful information provided by the prediction model and validates the proposed data science framework. Note that we ignore the cost of building a prediction model since it is relatively small. Finally, a sensitivity analysis is conducted to characterize the uncertain events in the decision-making process in Figure 2. The failure rate is regarded as the prior probability provided by the prediction model (i.e., BPN) in this study. Since the failure rate in extubation usually ranges from 2% to 47% in the literature (the failure rate collected from our case hospital is 0.0696 after re-preprocess, i.e., success-versus-failure is about 13:1), we performed sensitivity analysis of the failure rate from 0.5% to 50%; that is, we consider different scenarios in general hospitals and validate the value of information. The failure rate directly affects the prior shown as the bottom branch in Figure 2. The EVPI and EVE are calculated as Figure 3. The result shows that our proposed framework is superior with positive EVE when the failure rate is between 1.5% and 25%; in particular, the maximal EVE occurs in a failure rate around 7.1%. At the moment, the proposed data science framework shows the best value of information just like our case study. On the other hand, though BPN provides prediction with high accuracy, the proposed framework may not be helpful when the failure rate is lower than 1.1% or over 33.3%.  Figure 2 shows the calculated expected cost (i.e., days of stay in the ICU) of each chance node for decisions made. When the case of perfect information is considered, the expected cost is equal to 4.5696 days (expected costs under perfect information, ECPI); however, the expected cost without extra information (ECWI) is 5.662 days. Based on clinical experience, the cost of deciding not to extubate was 5.5 days less than 5.662 and we make the decision to "not extubate". Thus, the expected value of perfect information (EVPI) is EVPI = 5.5 − 4.5696(ECPI) = 0.9304 (days). That is, if we have perfect information, we will on average save 0.9304 days (patients staying in the ICU) when making an extubation decision. In addition, the expected cost of using the prediction model (i.e., expected costs of experiment, ECE) on an extubation decision is 4.8480 days less than 5.5 days and thus we suggest the decision use prediction model". Therefore, the expected value of experimentation (EVE) is calculated as EVE = 5.5 − 4.8480 = 0.652 (days). That is, using the prediction model will roughly save 0.652 days (patient staying in the ICU) compared with current clinical experience. It implies useful information provided by the prediction model and validates the proposed data science framework. Note that we ignore the cost of building a prediction model since it is relatively small.
Finally, a sensitivity analysis is conducted to characterize the uncertain events in the decision-making process in Figure 2. The failure rate is regarded as the prior probability provided by the prediction model (i.e., BPN) in this study. Since the failure rate in extubation usually ranges from 2% to 47% in the literature (the failure rate collected from our case hospital is 0.0696 after re-preprocess, i.e., success-versus-failure is about 13:1), we performed sensitivity analysis of the failure rate from 0.5% to 50%; that is, we consider different scenarios in general hospitals and validate the value of information. The failure rate directly affects the prior shown as the bottom branch in Figure 2. The EVPI and EVE are calculated as Figure 3. The result shows that our proposed framework is superior with positive EVE when the failure rate is between 1.5% and 25%; in particular, the maximal EVE occurs in a failure rate around 7.1%. At the moment, the proposed data science framework shows the best value of information just like our case study. On the other hand, though BPN provides prediction with high accuracy, the proposed framework may not be helpful when the failure rate is lower than 1.1% or over 33.3%.

Discussion
More and more machine learning techniques are used in medical care, in particular, ICU [25]. This study focuses on extubation prediction. In literature, weaning parameters such as tidal volume, minute ventilation, maximum expiratory pressure, etc. are used to support the weaning process; however, they may not support predicting extubation well [26]. In addition, there are several criteria such as APACHE, RSBI and SOFA to support the extubation decision; however, the contribution is limited since the single index, constructed with several variables or partially distinct variables, did not provide a comprehensive view for extubation decisions. This study proposes a data science framework including variable selection, a prediction model and Bayesian decision analysis to support the extubation decision. The framework identifies the significant variables related to the endotracheal extubation by MARS, SLR and RF, and then provides excellent prediction performances by SVM, BLR and BPN. The results are compared with the current indices such as APACHE II and RSBI. In particular, the variable selection phase suggest that APACHE II and WBC are two critical factors affecting EF. Prediction with the BPN model provides high accuracy, and this result is consistent with previous studies, which reported that the predictive performance of artificial neural networks (ANNs) was better than those of RSBI and maximum expiratory pressure [26] and better than those of RSBI and maximal inspiration pressure (PIMAX) [27]. In previous studies, the factors that affect the EF are APACHE II, RSBI, sex, creatinine, PIMAX, ABG, etc. [3,4,[26][27][28]. All these factors are included in our data science framework and thus it provides a robust prediction based on comprehensive information. Finally, the predictive results of BPN are used for Bayesian decision analysis. This phase is critical to provide a connection from predictive analytics to prescriptive analytics [29]; that is, data science not only provides a model for prediction but also enhances the decision-making process in practice by investigating the value of information and decision risk (i.e., days of stay in the ICU) [30]. The results, showing a positive value of information, enhance confidence in applying data science for supporting extubation decisions in clinical practice (i.e., the prediction model will roughly save 0.652 days of a patient staying in the ICU). In fact, the space and beds in SICUs are limited and to shorten the patient's stay in the SICU will improve the bed turnover rate and the service quality. The most interesting thing derived from this study is that the maximal value of information occurs in a failure rate around 7.1%. This reveals the "best applicable condition" of the proposed prediction model.

Conclusions
This study proposes a data science framework for extubation prediction and quantifies the value of information. The analysis results validate the decision quality and useful information provided by our predicted model. The proposed data science framework is general and can be extended to other cohorts of ICU patients (e.g., medical ICU, neuro ICU, cardiac ICU, etc.) by

Discussion
More and more machine learning techniques are used in medical care, in particular, ICU [25]. This study focuses on extubation prediction. In literature, weaning parameters such as tidal volume, minute ventilation, maximum expiratory pressure, etc. are used to support the weaning process; however, they may not support predicting extubation well [26]. In addition, there are several criteria such as APACHE, RSBI and SOFA to support the extubation decision; however, the contribution is limited since the single index, constructed with several variables or partially distinct variables, did not provide a comprehensive view for extubation decisions. This study proposes a data science framework including variable selection, a prediction model and Bayesian decision analysis to support the extubation decision. The framework identifies the significant variables related to the endotracheal extubation by MARS, SLR and RF, and then provides excellent prediction performances by SVM, BLR and BPN. The results are compared with the current indices such as APACHE II and RSBI. In particular, the variable selection phase suggest that APACHE II and WBC are two critical factors affecting EF. Prediction with the BPN model provides high accuracy, and this result is consistent with previous studies, which reported that the predictive performance of artificial neural networks (ANNs) was better than those of RSBI and maximum expiratory pressure [26] and better than those of RSBI and maximal inspiration pressure (PIMAX) [27]. In previous studies, the factors that affect the EF are APACHE II, RSBI, sex, creatinine, PIMAX, ABG, etc. [3,4,[26][27][28]. All these factors are included in our data science framework and thus it provides a robust prediction based on comprehensive information. Finally, the predictive results of BPN are used for Bayesian decision analysis. This phase is critical to provide a connection from predictive analytics to prescriptive analytics [29]; that is, data science not only provides a model for prediction but also enhances the decision-making process in practice by investigating the value of information and decision risk (i.e., days of stay in the ICU) [30]. The results, showing a positive value of information, enhance confidence in applying data science for supporting extubation decisions in clinical practice (i.e., the prediction model will roughly save 0.652 days of a patient staying in the ICU). In fact, the space and beds in SICUs are limited and to shorten the patient's stay in the SICU will improve the bed turnover rate and the service quality. The most interesting thing derived from this study is that the maximal value of information occurs in a failure rate around 7.1%. This reveals the "best applicable condition" of the proposed prediction model.

Conclusions
This study proposes a data science framework for extubation prediction and quantifies the value of information. The analysis results validate the decision quality and useful information provided by our predicted model. The proposed data science framework is general and can be extended to other cohorts of ICU patients (e.g., medical ICU, neuro ICU, cardiac ICU, etc.) by collecting specific and related factors. In fact, the methodologies of variable selection and prediction models are more generalized to other industries (e.g., manufacturing and services) while decision analysis needs to specify the limitation and practical experience based on the applied domain. For further research, besides the factors regarding biochemistry, arterial blood gas (ABG), Glasgow coma scale, etc., there are several environmental factors (e.g., family care, time interval of doctor's review, etc.) and neurological dysfunctions (e.g., dysphagia) [31] which can be considered to improve the extubation decision and patient's status.