Design
This was a prospective, multicentre, observational study. In contrast to studies of prognostic models, this study developed diagnostic models, that is, models designed to determine whether a patient was in the compensated or decompensated phase of their disease (exacerbation of COPD and/or HF decompensation).
Sample
The criteria for admission to the study and the recruitment process have been previously reported16. Patients older than 55 years, able to walk at least 30 m, with a main diagnosis of decompensated HF and/or exacerbation of COPD, and hospitalized at the Department of Internal Medicine, Cardiology or Pneumology, were included. Participants with a pacemaker or intracardiac device, domiciliary oxygen therapy users prior to admission and patients with HF functional class IV of the New York Heart Association (NYHA) were excluded26.
Four hospitals participated: two tertiary university hospitals (600–900 hospital beds) and two regional secondary care hospitals (150–400 hospital beds) from the provinces of Barcelona and Madrid.
Each centre had a trained interviewer, and each department had a reference physician who was accessible to the interviewer. Each day, the interviewer contacted the reference physicians to review the hospitalization census and identify patients with the diagnosis of interest. Next, the interviewer confirmed the main diagnosis (decompensated HF and/or exacerbation of COPD) with the physician responsible for the patient and then contacted the participant (the same day or the next day) to obtain informed consent and verify compliance with all admission criteria of the study. The sample was obtained through convenience sampling, and all patients were enrolled consecutively as they were identified.
The recruitment and follow-up periods lasted 18 months from November 2010.
Evaluation of the participants
Each patient received three identical evaluations: the first in the hospitalization unit (V1) and the other two consecutively and at least 24 hours apart in the participant's home at 30 days after hospital discharge (V2 and V3). Thus, each participant received one evaluation in the decompensated phase (V1) and two in the compensated phase (V2, V3) of their disease.
The evaluation protocol16 included documentation of symptoms (dyspnoea according to the NYHA26 and Modified Medical Research Council (mMRC)27 scales) and physiological parameters (HR and Ox) in two consecutive periods: effort (walking at a normal pace and on flat terrain for a maximum of 6 minutes) and recovery (seated for 4 minutes after the end of the effort period).
HR and Ox were treated as time series with a sampling frequency of 1 Hz and were collected throughout the evaluation with a pulse oximeter (Model 3100, Nonin Medical, Inc., Plymouth, MN, USA) placed on the left index finger.
Reference standard diagnostic test
Given the absence of a single standard diagnostic test to verify whether a patient was in the compensated or decompensated phase of their disease, the clinical judgment of the participant’s responsible physician was taken as the reference standard. Thus, in the decompensated phase, the diagnosis of decompensated HF and/or COPD exacerbation corresponded to the confirmed diagnosis from the participant’s attending physician (in cases of diagnostic doubt, the patient was excluded). For the compensated phase, the standard diagnosis of compensated HF and/or stable COPD was confirmed by a study physician through telephone contact with the participant 30 days after hospital discharge. During this telephone contact, the patient was considered to be in the compensated phase if none of the following events had occurred since hospital discharge: increased cough, sputum or dyspnoea; initiation of or an increase in corticosteroid use; initiation of antibiotic treatment; or medical consultation for worsening of the clinical situation from any cause. In cases of doubt or if the compensated phase could not be confirmed, successive telephone contacts were made until the phase could be confirmed. The interviewer scheduled the home visits for the respective evaluations (V2, V3) only after this confirmation, and within 24–48 hours of receiving it.
Index test: diagnostic algorithms
Initial preparation of potentially predictive variables or characteristics
Given the objective of the study (development of an “online” algorithm capable of detecting the proximity of an exacerbation from HR and Ox data), various characteristics of each of the evaluations were extracted (V1, V2, V3). For this purpose, the effort phase (walking) and recovery phase of each evaluation were separated by verifying the times recorded manually in the data collection records at the beginning and end of each phase of the test and visually reviewing the signals to confirm the manual records. Once the signals were separated according to the evaluation phase, the corresponding characteristics of the available measures were extracted.
Two phases of each test were considered and treated separately: effort and recovery. In each phase, three signals were considered: HR, Ox and the normalized difference between them. From each of these three time series, characteristics were extracted in the time domain (the mean and standard deviation) and in the frequency domain (the characteristics of the first and second harmonics, the sum of all harmonics and the first six indexes of the principal component analysis [PCA] of the normalized fast Fourier transform [FFT] of the signal). Accordingly, 13 characteristics were obtained for each phase (26 characteristics in total for each evaluation).
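As an illustration only, the time-domain and harmonic characteristics described above could be computed along the following lines. This is a minimal sketch, not the study's code: the function names, the length-based normalization of the spectrum and the z-score definition of the "normalized difference" are assumptions, and the PCA indexes are omitted because they require the spectra of the whole sample.

```python
import numpy as np

def phase_features(signal, n_harmonics=2):
    """Time- and frequency-domain summaries of one 1 Hz signal segment."""
    x = np.asarray(signal, dtype=float)
    # Time domain: mean and (sample) standard deviation.
    feats = {"mean": float(x.mean()), "sd": float(x.std(ddof=1))}
    # Frequency domain: magnitude spectrum of the mean-removed signal,
    # divided by the segment length (an assumed normalization).
    spectrum = np.abs(np.fft.rfft(x - x.mean())) / x.size
    for k in range(1, n_harmonics + 1):
        feats[f"harmonic_{k}"] = float(spectrum[k])
    feats["harmonic_sum"] = float(spectrum[1:].sum())
    return feats

def zscore(values):
    """Centre a signal on its mean and scale it by its standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def normalized_difference(hr, ox):
    """Assumed definition: difference between the z-scored HR and Ox signals."""
    return zscore(hr) - zscore(ox)
```

Each phase (effort, recovery) would be passed separately, once for each of HR, Ox and their normalized difference.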
Labelling and definition of the events to be detected
Given that the main objective of the study was detection of a transition from a state considered normal or stable (HF or COPD in the compensated phase [V2, V3]) to a state of decompensation or exacerbation (decompensated phase [V1]), a methodological scheme was applied based on calculation of the differences between the evaluations of each available characteristic. Thus, if a patient had three evaluations (V1, V2 and V3), six differences or useful comparative signals were obtained from these evaluations (V1-V2, V1-V3, V2-V1, V2-V3, V3-V1, V3-V2). The label of each of these comparative signals is illustrated in Fig. 1.
Although the differences V1-V2 and V1-V3 might be more appropriately considered as “decompensation recovery” rather than “no decompensation”, we decided against a third label category (“decompensation recovery”) because of the small sample size and because the main objective of the study was the detection of a decompensation.
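The construction of the comparative signals can be sketched as follows. The sketch assumes that a pair "A-B" is the element-wise difference of the characteristics of evaluation A minus those of evaluation B, which the text does not state explicitly. The labels follow the two-category scheme used in this study, in which V2-V1 and V3-V1 represent a change to decompensation and every other pair represents no decompensation. All names are hypothetical.

```python
from itertools import permutations

# Ordered pairs labelled 1 ("a change to decompensation"); all others are 0.
POSITIVE_PAIRS = {("V2", "V1"), ("V3", "V1")}

def comparative_signals(evaluations):
    """Build labelled feature differences from one patient's evaluations.

    `evaluations` maps a visit name ("V1", "V2", "V3") to a list of
    characteristics. Assumption: "A-B" means features of A minus features of B.
    """
    rows = []
    for a, b in permutations(sorted(evaluations), 2):
        diff = [fa - fb for fa, fb in zip(evaluations[a], evaluations[b])]
        label = 1 if (a, b) in POSITIVE_PAIRS else 0
        rows.append((f"{a}-{b}", diff, label))
    return rows
```

A patient with all three evaluations yields six rows, two of them positive; a patient with only two valid evaluations yields two rows.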
Selection of predictor variables or characteristics
As a first step, potential predictive characteristics were selected using the random forest28, gradient boosting classifier28 and light gradient-boosting machine (LGBM)29 classification algorithms, which rank characteristics by their importance as part of the fitting process.
Figure 2 shows an outline of the process for preparation and selection of the characteristics of the signals.
During the process of selecting characteristics, all those that were redundant or had very low variabilities were discarded. In this study, by definition, we did not have variables with perfect separation that could cause overestimation of the diagnostic capacity of the models (overfitting)23.
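The text does not specify how redundancy or low variability was judged. One plausible screening step is sketched below under explicit assumptions: a characteristic is dropped if its variance falls below a threshold or if it is highly correlated with a characteristic already kept. The function name and both thresholds are hypothetical.

```python
import numpy as np

def drop_uninformative(X, min_variance=1e-8, max_abs_corr=0.95):
    """Return the column indices kept after a simple screening pass.

    Assumed criteria (not taken from the study): drop near-constant columns,
    then drop any column highly correlated with a column already kept.
    """
    X = np.asarray(X, dtype=float)
    kept = []
    for j in range(X.shape[1]):
        col = X[:, j]
        if col.var() < min_variance:
            continue  # very low variability
        redundant = any(
            abs(np.corrcoef(col, X[:, k])[0, 1]) > max_abs_corr for k in kept
        )
        if not redundant:
            kept.append(j)
    return kept
```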
In addition to the characteristics selected from the HR and Ox signals, the age, sex and baseline disease (HF or COPD) of the patients were considered potential predictors.
Development and validation of algorithms
For the development of the algorithms, the ML techniques most used in the studies of classification models were considered: (i) decision trees, (ii) random forest, (iii) k-nearest neighbour (KNN), (iv) support vector machine (SVM), (v) logistic regression, (vi) naive Bayes classifier, (vii) gradient-boosting classifier and (viii) LGBM.
For each of these techniques, hyperparameters were selected by an exhaustive (brute-force) search over candidate values, evaluated on all available data with K-fold cross-validation (k = 5). A normalization process based on the median and interquartile range (IQR) was applied to all characteristics28.
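A median and IQR normalization of this kind can be sketched as below. The sketch assumes the common form (x - median) / IQR, which is also the default behaviour of scikit-learn's `RobustScaler`; the function names are hypothetical, and fitting the parameters on the training data only is a design choice of the sketch, not something the text specifies.

```python
import numpy as np

def fit_robust_scaler(X_train):
    """Estimate the per-characteristic median and interquartile range."""
    X_train = np.asarray(X_train, dtype=float)
    median = np.median(X_train, axis=0)
    q1, q3 = np.percentile(X_train, [25, 75], axis=0)
    iqr = q3 - q1
    iqr[iqr == 0] = 1.0  # leave constant characteristics unscaled
    return median, iqr

def apply_robust_scaler(X, median, iqr):
    """Centre each characteristic on its median and scale it by its IQR."""
    return (np.asarray(X, dtype=float) - median) / iqr
```

Unlike mean and standard deviation scaling, this form is little affected by a few extreme values.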
Once the best hyperparameters of each technique were identified, internal validation was performed with a leave-one-patient-out method. Thus, a new model was fitted for each patient: that patient’s data were withheld from the training set and used only for validation. Figure 3 shows an outline of the training and validation process.
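The leave-one-patient-out partition can be illustrated with a short generator. This is a sketch with hypothetical names, assuming that each observation unit carries the identifier of the patient it came from; scikit-learn's `LeaveOneGroupOut` provides an equivalent split.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) per patient.

    `patient_ids[i]` is the patient that observation unit i belongs to, so
    all units of the held-out patient stay out of the training set.
    """
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test
```

Splitting by patient rather than by observation unit matters here because each patient contributes several correlated differences.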
The observation units (inputs) on which the algorithms were applied were the differences between two different evaluations, as illustrated in Fig. 1. Thus, the algorithms classified the evaluated difference as a state of “no decompensation” (label = 0) or “a change to decompensation” (label = 1). Therefore, the following parameters were defined:
- True positive (TP): “a change to decompensation” as classification result for V3-V1 or V2-V1 comparison.
- True negative (TN): “no decompensation” as classification result for V1-V2, V1-V3, V2-V3 or V3-V2 comparison.
- False positive (FP): “a change to decompensation” as classification result for V1-V2, V1-V3, V2-V3 or V3-V2 comparison.
- False negative (FN): “no decompensation” as classification result for V3-V1 or V2-V1 comparison.
The parameters used to evaluate the diagnostic performance of the algorithms were the sensitivity (S), specificity (E) and accuracy (A). Each patient could have up to six observation units or inputs; therefore, up to six classification results were obtained per patient, each of which was defined as TP, TN, FP or FN. The S, E and A were then obtained for each patient. The final S, E and A of the entire sample were calculated as the mean of the values obtained for each patient.
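The per-patient calculation and the averaging over patients can be sketched as follows, using the usual definitions S = TP / (TP + FN), E = TN / (TN + FP) and A = (TP + TN) / total. How patients with an undefined S or E (for example, no positive comparisons after exclusions) were handled is not stated; the sketch assumes they are simply left out of that particular mean. Function names are hypothetical.

```python
from statistics import mean

def patient_metrics(y_true, y_pred):
    """Return (S, E, A) for one patient; S or E is None when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sens = tp / (tp + fn) if tp + fn else None
    spec = tn / (tn + fp) if tn + fp else None
    acc = (tp + tn) / len(pairs)
    return sens, spec, acc

def sample_metrics(patients):
    """Average per-patient S, E and A over a list of (y_true, y_pred) pairs.

    Assumption: patients with an undefined S or E are skipped for that mean.
    """
    per_patient = [patient_metrics(t, p) for t, p in patients]

    def average(i):
        return mean([m[i] for m in per_patient if m[i] is not None])

    return average(0), average(1), average(2)
```

Averaging per patient gives each patient equal weight regardless of how many of the six comparisons were available.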
The predictive values were not considered because the proportions of evaluations in the decompensated phase (33% [V1]) and compensated phase (66% [V2, V3]) did not correspond to the usual proportion found in clinical practice (the vast majority of patients in the community are usually in the compensated phase).
Missing data, excluded data and indeterminate results
Missing data were not included in the analysis, but patients with missing data were not excluded (all available patient data were included in the analysis). No imputation of missing data was performed.
During the review of the signals and the verification of the start and end times of each evaluation against the manual records, lost sections of HR and/or Ox data due to poor contact between the skin and the sensor were observed. This finding led to the introduction of filters to exclude these lost sections from the analysis. Thus, an evaluation was excluded if it had a loss rate (measures lost divided by the total number of measures) greater than 10% in any phase. In addition, evaluations performed at home (V2, V3) that did not reveal an improvement in the patient’s sensation of dyspnoea (of at least one point on the mMRC scale27) with respect to the decompensated phase evaluation (V1) were also excluded to ensure that home assessments were performed in the “compensated phase”.
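The two exclusion filters can be expressed compactly. The sketch below makes several assumptions that are not in the text: lost samples are represented as `None` or NaN, the loss rate is checked for each signal within each phase, and a lower mMRC grade means less dyspnoea. All names are hypothetical.

```python
import math

MAX_LOSS_RATE = 0.10  # evaluations above this rate in any phase are excluded

def loss_rate(samples):
    """Measures lost divided by the total number of measures."""
    if not samples:
        return 1.0
    lost = sum(1 for s in samples if s is None or math.isnan(s))
    return lost / len(samples)

def keep_evaluation(phases):
    """Keep an evaluation only if no phase exceeds the loss-rate limit.

    `phases` maps a phase name to a dict of signal name -> list of samples.
    Assumption: the limit applies to each signal of each phase.
    """
    return all(
        loss_rate(samples) <= MAX_LOSS_RATE
        for signals in phases.values()
        for samples in signals.values()
    )

def keep_home_evaluation(mmrc_v1, mmrc_home):
    """Keep a home evaluation only if mMRC improved by at least one point."""
    return mmrc_v1 - mmrc_home >= 1
```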
No indeterminate results were noted in the index test (algorithms); in all cases, the model produced a “no decompensation” or “a change to decompensation” result. On the other hand, all evaluations were always performed after a definitive result of the standard diagnostic reference test: clinical diagnosis of the decompensated phase by the doctor responsible for the patient in the hospital evaluation (V1) and clinical diagnosis of the compensated phase by the doctor who contacted them by phone before home evaluations (V2, V3). Thus, the algorithms were developed and applied on evaluations clearly labelled as the compensated or decompensated phase by the reference diagnostic test.
Approval of the Ethics Committee
The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics and Research Committee (ERC) of the centre promoting the study (ERC of the Mataró Hospital, approval number 1851806). Informed consent was obtained from all participants and/or their legal guardians.