This was a retrospective single-center study performed between October 2013 and January 2022 at Jackson Memorial Hospital/Ryder Trauma Center. The study was approved by the University of Miami IRB and the Jackson Memorial Research office. Data were collected in REDCap (a secure web application for managing databases, approved by the University IRB).14,15 For data collection, we used the National Institute of Neurological Disorders and Stroke Common Data Elements for TBI.16
Subjects. We retrospectively identified 134 patients with TBI (18 years and older) who were admitted to the intensive care units at our institution and were monitored with EEG during their hospital stay. All patients were comatose on admission, defined as eyes closed with the inability to follow commands as evaluated by a neurologist or a neurosurgeon.
Clinical and Imaging Data Collection. We collected the patients’ basic demographics (age, sex, race/ethnicity), mechanism of injury, type of injury (subdural hematoma (SDH), epidural hematoma (EDH), subarachnoid hemorrhage (SAH), contusion, and diffuse axonal injury (DAI)), Glasgow Coma Scale (GCS) score on admission, Charlson Comorbidity Index (CCI) on admission, pupils’ reactivity, surgical treatments (neurological, tracheostomy, gastrostomy (PEG)), and seizures. Type of injury and Marshall CT classification were adjudicated by a neurosurgeon.17 We collected the cumulative doses received over the 2 preceding half-lives of midazolam (24 hours), propofol (2 hours), fentanyl (8 hours), ketamine (6 hours), and dexmedetomidine (4 hours). Sedation levels were classified as follows: “none” if no sedation was given; “minimal” if only pushes of sedatives were given; “low” for continuously administered doses of midazolam (≤ 0.15 mg/kg/h), propofol (≤ 4 mg/kg/h), fentanyl (≤ 2 µg/kg/h), or dexmedetomidine (any dose); “moderate” for continuously administered doses of midazolam (> 0.15 mg/kg/h), propofol (> 4 mg/kg/h), fentanyl (> 2 µg/kg/h), or ketamine (any dose); and “deep” for barbiturate infusions.
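The sedation categories can be written as a small rule-based function. The following is a minimal sketch, not the study's code: the function name, the input format (a mapping from drug name to continuous infusion rate), and the choice to report the deepest category when several agents run together are assumptions made for illustration.

```python
# Upper limits of the "low" category, taken from the text (units as stated there).
LOW_LIMITS = {
    "midazolam": 0.15,  # mg/kg/h
    "propofol": 4.0,    # mg/kg/h
    "fentanyl": 2.0,    # µg/kg/h
}
LEVELS = ["none", "minimal", "low", "moderate", "deep"]


def sedation_level(infusions, pushes_given=False):
    """Classify sedation depth from continuous infusion rates.

    infusions: dict mapping drug name -> continuous rate (hypothetical format).
    pushes_given: True if only intermittent pushes of sedatives were given.
    Assumption: when several agents are infused, the deepest category wins.
    """
    level = "minimal" if pushes_given else "none"
    for drug, rate in infusions.items():
        if rate <= 0:
            continue  # drug not running
        if drug == "barbiturate":
            drug_level = "deep"
        elif drug == "ketamine":
            drug_level = "moderate"  # any dose
        elif drug == "dexmedetomidine":
            drug_level = "low"  # any dose
        elif drug in LOW_LIMITS:
            drug_level = "low" if rate <= LOW_LIMITS[drug] else "moderate"
        else:
            raise ValueError(f"unknown sedative: {drug}")
        if LEVELS.index(drug_level) > LEVELS.index(level):
            level = drug_level
    return level
```

For example, `sedation_level({"propofol": 3.0, "ketamine": 0.5})` falls in the "moderate" category because ketamine at any dose is classified as moderate.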
We collected the following outcomes: hospital and ICU length of stay, Glasgow Outcome Scale-Extended (GOS-E) on discharge, mortality, withdrawal of life-sustaining therapies, and discharge disposition. GOS-E is an eight-category scale and is the most commonly used scale for global outcomes after TBI: (1) dead, (2) vegetative state (patient has no clinical evidence of awareness), (3) lower severe disability (patient is dependent and cannot be left alone for more than 8 hours at home), (4) upper severe disability (patient is dependent and can be left alone for more than 8 hours at home), (5) lower moderate disability (patient is independent at home but not able to return to work), (6) upper moderate disability (patient is independent at home and able to return to work with special arrangements), (7) lower good recovery (patient is able to resume normal life with the capacity to work with disabling neurological or psychological deficits), and (8) upper good recovery (patient is able to resume normal life with the capacity to work without disabling neurological or psychological deficits).18 Our primary outcome was recovery of consciousness at discharge, defined as eye opening and the ability to follow commands. We selected re-emergence of consciousness as the primary outcome because it is an early stage of recovery that precedes functional recovery and is a reasonable target to capture in a retrospective study. The ability to follow commands before discharge was determined by reviewing the chart notes of the neurologist, neurosurgeon, neurointensivist, and ICU nurses. It was defined as the ability to follow one-step commands (such as sticking the tongue out or showing two fingers).
Resting-State EEG. EEGs were obtained as a standard of care at our institution to exclude seizures in comatose patients. EEGs were obtained using the 10–20 system of electrode placement, with 16–19 EEG channels (adjusted for drains/wounds) referenced online to Cz. EEGs were recorded using digital video EEG bedside monitoring (Xltek; Natus Medical, Oakville, ON, Canada; low-pass filter 70 Hz, high-pass filter 0.1 Hz, sampling rate 256–512 Hz; impedances < 10 kOhm). EEG recordings had short traces of data followed by long periods of no data across all channels due to recording or data export settings. These periods of missing data could not be recovered; we therefore required a minimum of 10 minutes of continuous resting-state data to include a patient in the analyses. Each 10-minute segment was visually assessed, and segments with pervasive muscle or movement artifacts or with interictal/ictal activity were discarded.
EEG analyses were carried out in Python using the MNE-Python 0.24.1 and Nice 0.1 packages in custom scripts.19,20 A 60 Hz notch filter (one-pass zero-phase filter with length 1,691 samples) followed by a 0.5 Hz high-pass filter and a 40 Hz low-pass filter (finite impulse response one-pass zero-phase filter with length 1,691 samples) were applied, and the data were referenced to the average of all channels and split into 2-second epochs. Noisy channels or epochs were either interpolated automatically or rejected using Autoreject 0.4.0.21 Finally, recordings with a higher sampling frequency were downsampled to 256 Hz. Each included segment had to contain at least 10 continuous minutes of EEG data and meet the following requirements. For group analyses, features for one segment per patient were used. Selected segments had more than 200 clean epochs and fewer than 7 rejected channels, yielding 70 patients who had a good recovery and 37 who had a bad recovery. The number of non-rejected epochs (good = 271.5 ± 29.6, bad = 279.8 ± 22.5, U = 1094, p = .19) and rejected channels (good = 2.1 ± 1.6, bad = 2.4 ± 1.5, U = 1132, p = .28) did not differ across groups. Under these criteria, we constructed two datasets: one with 107 (80%) patients who met the EEG criteria (E) and another with 92 (69%) patients who met both the EEG and HEP criteria (EH). For subjects with more than one 10-minute segment of data, one was selected at random from among those with the best perceived data quality.
We calculated the following EEG measures based on prior literature on the assessment of consciousness in the acute and chronic states, as well as the anesthesia literature9,22–27:
- Power Spectral Density (PSD) for frequencies from 1 to 30 Hz was calculated using the Welch method with a window length of 128 samples, an overlap of 100 samples, and an nfft of 4096 samples. Normalized and non-normalized spectral data were computed in four different frequency bands: delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), and beta (13–30 Hz). PSD is a commonly used quantitative measure in EEG studies calculated as the squared EEG amplitude in a specific frequency range.22,23,28
- Permutation Entropy (PE) is a measure of signal complexity, for which the EEG data are transformed into a symbolic representation, and the distribution of the obtained patterns is quantified for each channel, giving a measure of how irregular the signal is. The transformation takes consecutive sub-vectors of length n from the signal, with the parameter τ setting the number of samples between successive elements of each sub-vector, which makes the transformation frequency-specific. PE in the theta band (τ = 8, n = 3) has proven informative for the classification of DoC patients and is the measure used in this work.22,29
- Weighted Symbolic-Mutual-Information (wSMI) is a measure of long-range connectivity that quantifies global information sharing by evaluating the nonrandom joint fluctuations between two EEG signals following the same symbolic transformation as for the PE.25
- Kolmogorov complexity (Kolcom) measures the complexity of the EEG signal by quantifying the compressibility of the signal in each channel. It has been used more recently in acute disorders of consciousness.22,30
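To make the symbolic transformation behind PE (and wSMI) concrete, the sketch below computes a plain permutation entropy for one channel with n = 3 and τ = 8. It is an illustrative implementation written for this example, not the Nice package code; any filtering or other preprocessing that package performs is not reproduced, and the function name and the normalization by log(n!) are assumptions.

```python
import math
from collections import Counter


def permutation_entropy(signal, n=3, tau=8, normalize=True):
    """Permutation entropy of a 1-D sequence.

    n: length of each sub-vector; tau: number of samples between its elements.
    With normalize=True the result is divided by log(n!) so it lies in [0, 1].
    """
    n_vectors = len(signal) - (n - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for the chosen n and tau")
    counts = Counter()
    for start in range(n_vectors):
        sub = [signal[start + k * tau] for k in range(n)]
        # The symbol is the ordering pattern of the sub-vector's elements.
        symbol = tuple(sorted(range(n), key=sub.__getitem__))
        counts[symbol] += 1
    probs = [c / n_vectors for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    if normalize:
        entropy /= math.log(math.factorial(n))
    return entropy
```

A strictly increasing signal produces a single symbol and therefore an entropy of zero, whereas an irregular signal spreads its sub-vectors over all n! = 6 ordering patterns and approaches the maximum.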
Since changes in these EEG markers over time and sensors have proven useful for categorizing patients with DoC, we performed four types of dimensionality reduction for each marker.20 These included calculating the average across trials and channels (mEmCh), the average across trials and the standard deviation across channels (mEsdCh), the standard deviation across trials and channels (sdEsdCh), and the standard deviation across trials and the average across channels (sdEmCh). This resulted in 44 total EEG features.
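The four reductions can be sketched for a single marker as follows. This is a minimal illustration assuming the marker is stored as a NumPy array of shape (epochs, channels) and that the epoch dimension is collapsed first; the function name and array layout are assumptions, and the population standard deviation (NumPy's default) is used because the text does not specify the estimator.

```python
import numpy as np


def reduce_marker(values):
    """Collapse an (n_epochs, n_channels) marker array into four scalars."""
    values = np.asarray(values, dtype=float)
    mean_over_epochs = values.mean(axis=0)  # one value per channel
    sd_over_epochs = values.std(axis=0)     # one value per channel
    return {
        "mEmCh": float(mean_over_epochs.mean()),   # mean over epochs, mean over channels
        "mEsdCh": float(mean_over_epochs.std()),   # mean over epochs, SD over channels
        "sdEmCh": float(sd_over_epochs.mean()),    # SD over epochs, mean over channels
        "sdEsdCh": float(sd_over_epochs.std()),    # SD over epochs, SD over channels
    }
```

The feature count follows from the markers listed above: normalized and non-normalized power in four bands (8 markers) plus PE, wSMI, and Kolcom give 11 markers, and 11 markers × 4 reductions = 44 features.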
Heart-Evoked Potentials (HEP) and Heart Rate Variability (HRV). The standard-of-care EEG at our institution includes at least one electrocardiogram (EKG) channel that is recorded and time-synchronized with the EEG data in the digital video EEG bedside monitoring (Xltek; Natus Medical).
For each recording, we visually inspected the EKG and selected the channel showing the clearest QRS complex. For each 10-minute recording, Neurokit 0.2.0 functions implemented in custom Python scripts were used to remove slow drifts from the EKG signal (0.5 Hz high-pass Butterworth filter of order 5), to remove power line noise (the signal was smoothed with a moving average kernel with a width of one period of 60 Hz), and to automatically detect heartbeats.31 A patient-specific threshold was defined to reject wrongly detected heartbeats, and a final visual inspection was carried out to reject those missed by the chosen threshold. Recordings for which no clear QRS complexes were observed were not analyzed. Heart rate (HR) and HRV were calculated on the remaining segments. The HR was computed as the inverse of the average difference between consecutive R peaks (RR intervals). HRV was measured as the root mean square of successive differences between RR intervals.32
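The HR and HRV definitions reduce to a few lines of arithmetic on the R-peak times. The sketch below is illustrative only and assumes that peak detection and rejection have already been done; the function name, the input in seconds, and the output units (beats per minute and milliseconds) are assumptions.

```python
import math


def heart_rate_and_rmssd(r_peak_times_s):
    """Return (heart rate in beats/min, RMSSD in ms) from R-peak times in seconds."""
    # RR intervals: differences between consecutive R peaks.
    rr = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    if len(rr) < 2:
        raise ValueError("need at least three R peaks")
    mean_rr = sum(rr) / len(rr)
    heart_rate_bpm = 60.0 / mean_rr  # inverse of the mean RR interval
    # RMSSD: root mean square of successive differences between RR intervals.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    rmssd_ms = 1000.0 * math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return heart_rate_bpm, rmssd_ms
```

For instance, RR intervals alternating between 0.8 s and 1.0 s have a mean of 0.9 s (about 66.7 beats/min), and every successive difference has magnitude 0.2 s, so the RMSSD is 200 ms.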
To obtain HEPs, we extracted the EEG data from −200 ms to 800 ms relative to each R peak (corresponding to the QRS complex), linearly detrended each epoch, automatically rejected noisy epochs, and computed the averaged signal at each EEG electrode over 10 minutes of good data. For HR, HRV, and HEP results, we performed a group analysis (classified by recovery of consciousness on discharge). For this, one segment per patient with more than 300 clean epochs and fewer than 7 rejected channels was selected, resulting in 68 patients who had a good recovery and 32 who had a bad recovery. The number of epochs (good = 695.9 ± 160.7, bad = 737.0 ± 173.8, U = 871, p = .25) and rejected channels (good = 1.1 ± 1.1, bad = 1.2 ± 1.3, U = 947, p = .55) did not differ across groups. We analyzed the data for 100 (75%) patients.
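The epoching and averaging steps for a single EEG channel can be sketched as follows. This is a simplified illustration, not the study pipeline: automatic rejection of noisy epochs is omitted, heartbeats too close to the recording edges are simply skipped, and the function name and arguments are assumptions.

```python
import numpy as np


def heartbeat_evoked_potential(eeg, r_peaks, sfreq, tmin=-0.2, tmax=0.8):
    """Average R-peak-locked epochs of one EEG channel.

    eeg: 1-D array of samples; r_peaks: sample indices of R peaks; sfreq: Hz.
    Returns (times in seconds, averaged signal).
    """
    eeg = np.asarray(eeg, dtype=float)
    start_off = int(round(tmin * sfreq))
    stop_off = int(round(tmax * sfreq)) + 1
    times = np.arange(start_off, stop_off) / sfreq
    epochs = []
    for peak in r_peaks:
        lo, hi = peak + start_off, peak + stop_off
        if lo < 0 or hi > eeg.size:
            continue  # epoch would run past the recording edges
        segment = eeg[lo:hi]
        # Linear detrend: subtract the least-squares line fitted to the epoch.
        slope, intercept = np.polyfit(times, segment, 1)
        epochs.append(segment - (slope * times + intercept))
    if not epochs:
        raise ValueError("no complete epochs")
    return times, np.mean(epochs, axis=0)
```

Because each epoch is detrended before averaging, slow linear drifts within the window do not contribute to the averaged response.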
Statistical Analysis. Descriptive data were generated to describe patients who recovered consciousness versus patients who did not recover. Continuous and categorical variables were summarized using means and frequencies (%), respectively. Chi-square tests were used to examine relations between categorical variables. Mann-Whitney U tests were used to examine group differences for continuous variables. A value was considered an outlier and discarded if it was more than 3 standard deviations below or above the mean.
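The outlier rule can be sketched as a single-pass filter. This is a minimal example under stated assumptions: the reference point is taken to be the sample mean, the sample standard deviation is used, and the rule is applied once rather than iteratively; the function name is hypothetical.

```python
import statistics


def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]
```

Note that a single extreme value inflates both the mean and the standard deviation, so in small samples this rule may fail to flag it.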
We used the following machine learning models to predict recovery of consciousness on discharge: Random Forest with 500 decision trees (RF), Support Vector Machine with a linear kernel (SVM), Histogram-Based Gradient Boosting (HGB), and XGBoost (XGB). We calculated the models’ area under the receiver operating characteristic curve (AUC-ROC). We also computed the reduction in the Residual Sum of Squares to report the most important variables contributing to our models. Class imbalance was accounted for using SMOTE with one neighbor.33 We used stratified 10-fold cross-validation, where the models were trained on 90% of the samples and tested on the remaining 10%, repeated such that each sample appeared in the testing set once and the class balance was maintained in each fold. We repeated this process 100 times, yielding a series of 1000 AUC values, which we used to compare model performances. We trained models on basic clinical characteristics (age, motor response, and pupils’ reactivity; the core IMPACT score), the Marshall CT score, HEP (Cz channel values), and rsEEG, as well as every combination of those features for both datasets.5 Training samples were normalized using a standard scaler, which removed the need for non-normalized spectral power features. We performed manual feature reduction on the remaining 28 rsEEG features through an iterative test, concluding that mEmCh and mEsdCh features together yielded the highest AUCs. Therefore, for the remainder of this paper, rsEEG refers to the 14 mEmCh and mEsdCh features. All data analysis was conducted using the scikit-learn package in a custom Python script. Statistics on the HEP responses were done using a nonparametric cluster-corrected permutation test for two time windows (0 to 600 ms and 600 to 800 ms over all channels).34
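The repeated stratified cross-validation scheme can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the study's script: only the linear-kernel SVM is shown, the SMOTE step is omitted (it is provided by a separate package), fitting the scaler inside each training fold is an assumption, and the function name, seed, and demo data are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def repeated_cv_auc(X, y, n_splits=10, n_repeats=100, seed=0):
    """AUC-ROC for every test fold of a repeated stratified k-fold scheme."""
    # The scaler sits inside the pipeline so it is fitted on training folds only.
    model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    return cross_val_score(model, X, y, cv=cv, scoring="roc_auc")


# Synthetic stand-in with 107 samples and 14 features (not the study data).
X_demo, y_demo = make_classification(
    n_samples=107, n_features=14, weights=[0.35, 0.65], random_state=0
)
# Five repeats keep the demo fast; 100 repeats of 10 folds give 1000 AUC values.
aucs_demo = repeated_cv_auc(X_demo, y_demo, n_repeats=5)
```

The returned array holds one AUC per test fold, so its length is `n_splits * n_repeats`, and the distributions obtained for different models or feature sets can then be compared.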
Data Availability. The data supporting the findings of our work and the scripts written are available upon reasonable request.