Prediction of high airway pressure using a non-linear autoregressive model of pulmonary mechanics

For mechanically ventilated patients with acute respiratory distress syndrome (ARDS), suboptimal PEEP levels can cause ventilator induced lung injury (VILI). In particular, high PEEP and high peak inspiratory pressures (PIP) can cause over distension of alveoli that is associated with VILI. However, PEEP must also be sufficient to maintain recruitment in ARDS lungs. A lung model that accurately and precisely predicts the outcome of an increase in PEEP may allow dangerous high PIP to be avoided, and reduce the incidence of VILI. Sixteen pressure-flow data sets were collected from nine mechanically ventilated ARDs patients that underwent one or more recruitment manoeuvres. A nonlinear autoregressive (NARX) model was identified on one or more adjacent PEEP steps, and extrapolated to predict PIP at 2, 4, and 6 cmH2O PEEP horizons. The analysis considered whether the predicted and measured PIP exceeded a threshold of 40 cmH2O. A direct comparison of the method was made using the first order model of pulmonary mechanics (FOM(I)). Additionally, a further, more clinically appropriate method for the FOM was tested, in which the FOM was trained on a single PEEP prior to prediction (FOM(II)). The NARX model exhibited very high sensitivity (> 0.96) in all cases, and a high specificity (> 0.88). While both FOM methods had a high specificity (> 0.96), the sensitivity was much lower, with a mean of 0.68 for FOM(I), and 0.82 for FOM(II). Clinically, false negatives are more harmful than false positives, as a high PIP may result in distension and VILI. Thus, the NARX model may be more effective than the FOM in allowing clinicians to reduce the risk of applying a PEEP that results in dangerously high airway pressures.

Whenever damage is caused to the lung tissue, there is a release of biological mediators, which can lead to organ failure and increased mortality [5]. Hence, setting PEEP levels to maintain recruited alveoli, while avoiding distension, is necessary to minimise VILI and improve outcomes [6][7][8][9].
Physiological modelling can capture patient-specific pulmonary mechanics and aid determination of optimal ventilator settings for each patient [10][11][12][13][14]. We have previously proposed a nonlinear autoregressive (NARX) model of respiratory mechanics [15,16]. The model uses basis functions to capture pulmonary elastance as a function of airway pressure, and multiple time-dependent resistance coefficients to capture impedance and viscoelastic effects. The NARX model has successfully captured airway pressure waveforms in ARDS patients during recruitment manoeuvres (RM) [15], and successfully predicted airway pressure at higher PEEP [16].
High airway pressure (> 40 cmH 2 O) due to excessive PEEP can contribute to over-distension and VILI [17][18][19][20]. Thus, limited pressure, as well as low tidal volume and other parameters, is considered lung protective [21]. Using prediction to avoid PEEP levels that induce high PIP may reduce the incidence of VILI, and the associated mortality and morbidity. Such prediction would provide clinicians with an indication of the risk associated with an increase in PEEP and a further metric to aid decision support. The aim of this work is to assess the ability of the NARX model to predict the high pressure that may cause pulmonary distension in ARDS patients, where the prediction horizon is a PEEP step increase of 2, 4, or 6 cmH 2 O. A comparison is made with the well-known and frequently used first order model (FOM) [6], which, in contrast with the NARX model, uses a single elastance term.

Data
This analysis uses data from a pilot Clinical Utilisation of Respiratory Elastance (CURE) software trial [22]. Airway pressure and flow data were collected from 10 fully sedated ARDS patients, of which seven were ventilated in pressure controlled mode, and three in volume controlled mode. Ethics approval for this study and subsequent use of collected data was granted by the New Zealand South Regional Ethics Committee. Informed consent was obtained from the patients. Patient age ranged from 18 to 88, with a mean of 55 years. The mean breathing rate was approximately 18 breaths/min, with no endinspiratory pause. Pressure and flow were recorded from a Puritan Bennett 840 ventilator at a sampling rate of 50 Hz. Volume was calculated from continuous integration of the flow, with compensation for volume drift to maintain a volume of 0 mL at PEEP.
Patients underwent one or more RMs with initial PEEP = 8-16 cmH 2 O. PEEP was then increased in steps of 2 cmH 2 O after approximately 10 breaths at each PEEP level. The RMs used differing numbers of PEEP steps, and reached different PIP levels. The data sets contained a range of 5-9 PEEP steps. It is also important to note these patients were in the middle of care and PEEP = 16 cmH2O is not unreasonable for some patients. If the patient is not responding at this level a RM is an acceptable course of action, leading to the wider range of PEEP at RM commencement in this data. One of the ten patients was excluded as they were not ventilated to a PIP of 40 cmH 2 O. Of the nine remaining patients, three underwent multiple RMs during the trial. In total, there were 16 data sets available for analysis.

Respiratory models
The first order model (FOM) of pulmonary mechanics forms the basis of the NARX model. The FOM contains single resistive and elastic components: where P is the measured airway pressure (cmH 2 O), t is time (s), R is the airway resistance (cmH 2 O s/L), V is the airway flow rate (L/s), E is the elastance of the lung tissue and chest wall (cmH 2 O/L), V is the air volume in the lung above volume at PEEP (L), and P 0 is the offset pressure (cmH 2 O).
The NARX model builds upon the FOM by incorporating pressure dependent basis functions to describe elastance, and multiple time dependent terms that represent the effect of airway resistance to flow and changes in flow. The NARX model is defined [15]: where φ i (P(t)) is the particular basis function value for a given pressure measurement (dimensionless); E i is the elastance coefficient for a given basis function (cmH 2 O/L), M is the number of basis functions used, R j is a series of terms that capture resistance and inertance of the flow (cmH 2 O s/L); L is the number of resistive terms. The subscript − j in the second term refers to the previous time samples. Thus, each P(t) is calculated from information from the previous 170 data points. The sum of the basis functions multiplied by their E i coefficients represent elastance as a function of pressure. The FOM can be replicated with M = L = 1, and zero order basis functions. Figure 1 shows the four basis functions used to describe elastance. These functions were selected previously [16] as a linear combination of these shapes were able to capture the range of elastance profiles observed in a larger patient cohort:  16:126 To identify patient-specific models, the non-linear elastance terms are decomposed into linear terms using Eqs. 3a, 3b, 3c, 3d and a linear system is evaluated across certain subsets of the full data range (t 0 − t N, where N < N D and N D is the length of the full data set). This subset of data is the 'training set' . where: The linear system can be evaluated via matrix psuedo inverse to provide x.

Analysis
This analysis compares NARX and FOM predictions for the occurrence of PIP > 40 cmH 2 O at PEEP levels higher than the training set. This pressure threshold was arbitrarily chosen, but is generally considered an upper bound to acceptable ventilation pressures for critically ill ARDS patients. True positive, true negative, false positive, and false negative results were recorded for both models. The sensitivity and specificity were calculated. Receiver operating characteristic (ROC) curves were generated by varying the discrimination threshold.
For each RM, the NARX model was first trained on the lowest PEEP level (PEEP min ) in the data. The basis functions that formed the patient model were extrapolated to predict the PIP at PEEP min + {2, 4, 6} cmH 2 O, respectively. The predicted PIP was compared with the measured PIP at these higher PEEP levels. Then the PEEP min and PEEP min + 2 cmH 2 O levels were used as the training set, and PIP was predicted at (3a) . . .
The training set was enlarged to incorporate more PEEP levels until there was only a single PEEP step left for prediction. The predictive ability of the FOM was used as a comparator for the NARX. The FOM is generally considered to be most relevant over small pressure ranges. FOM(I), uses the same training set as the NARX model, FOM(II) uses the elastance determined from a single PEEP level to predict PIP at one to three PEEP steps above that level. FOM(I) allows a direct comparison with the NARX model method, and FOM(II) allows the FOM to operate in a scenario more suited to its strengths. Figure 2 shows one patient's identified and extrapolated elastance for the NARX and FOM methods. Figure 3 shows prediction results for one RM. In this particular case, the NARX and FOM(I) models were identified on six PEEP steps and extrapolated to the seventh PEEP step. The NARX model provided an accurate prediction of the pressure waveform and correctly determined that peak pressure at the seventh PEEP step would be greater than 40 cmH 2 O. The FOM(I) prediction was substantially lower than the true peak pressure at the seventh PEEP step and gave a false negative result. FOM(II) was able to correctly predict that peak pressure would exceed 40 cmH 2 O at the seventh PEEP step. However, the predicted waveform had larger residuals compared to the NARX model. Table 1 shows the confusion matrices for the NARX model, FOM(I), and FOM(II) at each prediction horizon. The NARX model sensitivity was substantially better than the FOM(I) or FOM(II) in all cases. As expected, FOM(II) was more sensitive than FOM(I).

Results
The specificity of the NARX model was lower than FOM(I) and FOM(II). Note that there is a general, but imperfect, trend for the prediction metric scores to be poorer as the prediction PEEP moves further from the training PEEP.
The relationship between measured and predicted peak pressures across the cohort is given in Fig. 4, for the prediction a single PEEP step up from the training data. The three cases of false positive detection by the NARX model occurred when the measured PIP was very close to 40 cmH 2 O. In all three of these cases, measured PIP was ≥ 38.6 cmH 2 O, indicating the NARX model prediction was very close to the clinical  16:126 outcome. In the Bland-Altman plots, dotted lines give the 25th, 50th, and 75th percentiles of the difference in measured and predicted PIP. There is a slight bias in the NARX model for predictions to be higher than measured. FOM(I) has a relatively large bias towards low predictions, and FOM(II) is also slightly biased low.
The ROC curves in Fig. 5, for a single PEEP step prediction, plot the true positive rate against the false positive rate as the discrimination threshold was varied (Fig. 5). While all models performed significantly better than a random classifier, the area under the curve was clearly greatest for the NARX model, indicating the best overall performance.

Discussion
Clinically, false negatives are much more harmful to patients than false positives. In the context of pulmonary distension, a false negative may result in higher ventilation pressures that lead to distension and VILI. In contrast, a false positive result may mean clinicians do not increase PEEP, and miss out on the potential for improved recruitment at the higher PEEP level. Hence, the comparative impact of a false negative is stronger than the impact of a false positive, and thus, a high sensitivity should be favoured when predicting the outcomes of potential treatment methods.   16:126 Elastance captures the static tidal pressure required for a given volume of inspired air. While PEEP is the model input that allows the model to fit to the pressure troughs in each breath, an accurate elastance allows an accurate model fit to PIP. If predicted elastance is too high, a predicted breath will have a PIP that is higher than the true PIP. Similarly, if predicted elastance is too low, the predicted PIP will be lower than the true PIP.
VILI often occurs at the alveoli, due to mechanical strain caused by high alveolar pressure [23]. Due to low flowrates in the lower bronchiole close to the alveoli and the compliance of the bronchial path, the PIP measured at the airway is generally higher than the pressure experienced at the alveoli. Since airway resistance reduces at high pressures, higher than expected PIP levels are more likely due to increases in elastance at high pressure. Such an outcome is indicative of alveolar distension. Although this clinical protocol did not utilise the end-inspiratory pause needed to approximate alveolar pressure, it utilised a proxy metric (PIP) that is easily available in typical clinical practice.
In this analysis, there were no false negative and only three false positive predictions from the NARX model over a prediction horizon of one PEEP step. In the three false positive cases, the mean difference between measured and predicted peak pressures was only 1.5 cmH 2 O. For the NARX false positives that occurred at prediction horizons of two and three PEEP steps, the mean difference between measured and predicted PIP was 2.4 cmH 2 O in both cases. Thus, the false positives did not actually represent poor prediction. However, the incidence of these false positives may imply that the basis function shapes slightly overestimate the onset of distension. FOM(I) and FOM(II) were unable to always predict when measured PIP was likely to be greater than 40 cmH 2 O ( Table 1). The FOM assumes a constant elastance (Eq. 1). In reality, elastance changes during inflation according to recruitment and distension effects [24,25]. In general, distension effects mean that elastance is likely to be higher at pressure close to 40 cmH 2 O than at lower pressures, as most of the possible recruitment has already been achieved. Thus, in a situation with increasing elastance with PEEP, the single elastance term identified in the FOM will lead to underestimated PIP. This led to the high rate of false negatives given by FOM(I) and FOM(II). In contrast, the extrapolation of the NARX elastance shape to higher pressures enables the prediction of a higher elastance at higher PEEP, and thus very few false negatives occurred. FOM(II) uses identification data from only one PEEP step previous to the prediction PEEP. This method better represented the clinical use of the FOM. While less data was used for parameter identification in comparison to FOM(I), FOM(II) was more successful at predicting PIP. Compared to FOM(I) the sensitivity of FOM(II) increased by an average of 0.14 percentage points for the three prediction horizons. This occurred because the single elastance of FOM(II) did not need to partially represent the lower elastance of the lower pressure ranges (Fig. 2). However, the predicted peak pressures of FOM(II) tended to be lower than the NARX prediction and had higher residuals (Fig. 3).
The FOM(I) and FOM(II) specificities were higher than the NARX model specificities for each prediction horizon. This was an expected result, as a FOM false positive would only occur if elastance was decreasing when PIP was near 40 cmH 2 O. When peak pressure is close to 40 cmH 2 O, elastance is very unlikely to decrease as all recruitable lung volumes are likely to be recruited at lower pressures. Alternatively, FOM false positives would be likely to occur in a decreasing PEEP scenario. A single elastance identified from high pressure data is likely to be too high for an accurate prediction when PEEP is decreased. In this scenario, the NARX would be expected to perform better than the FOM, due to the ability to easily extrapolate a continuous elastance to lower pressures.
The ROC curve analysis (Fig. 5) shows the possible diagnostic equivalence of the models as their diagnostic thresholds are varied. The area under the ROC curve for the NARX model was the largest and yielded a peak sensitivity and specificity of 0.98 and 1.00 when a threshold of 40.3 cmH 2 O is used. Since the optimum threshold of the NARX is very close to 40 cmH 2  Clinically, once a dangerous PIP has been recorded, PEEP would not be increased further as the risk of distension and VILI would be high. Thus, outside of an initial recruitment manoeuvre, PEEP might not be increased to allow peak pressures beyond the clinician's chosen threshold. In this analysis, we included all available data from the recruitment manoeuvres and thus contradicted the clinical process. However, this was necessary to establish that the models did not erroneously predict a low PIP at high PEEP. Furthermore, ceasing the evaluation at the first instance of PIP > 40 cmH 2 O would generate an incorrect low true positive rate, and thus the sensitivity calculation would be negatively affected.
In this analysis we used data from seven patients on pressure controlled ventilation and three patients on volume controlled ventilation. In pressure controlled data, the PIP is a setting defined by the clinician, and thus it may seem strange to analyse the ability of the model to predict PIP in pressure controlled mode. However, none of the modelling approaches used in this analysis incorporated a priori information on the applied ventilator settings. In contrast, the modelling approaches provide a transfer function between pressure and flow. Thus the ability to predict pressure from flow data remains scientifically valid, even in pressure controlled mode, when the model does not use the ventilator settings as an input. Furthermore, parameterisation must be considered when model residuals are used to assess model performance [26]. In particular, in such cases extraneous parameters could enable improved fitting, but they could also confound the precise identification of physiologically meaningful parameters and thus limit the applicability of the model for prediction or extrapolation of behaviour. However, the level of model parameterisation is insignificant when the ability to precisely extrapolate beyond the identification domain is used to assess model suitability.
This model was designed specifically to capture non-linear elastance effects and uses pressure dependent elastance terms. However, we note that this formulation, like all models has limitations in certain behaviours. In particular, the model may well not perform well in some clinical cases, such as atelectasis or high auto-PEEP. However, the primary goal of this research was determine the model's ability to predict the likelihood of exceeding a particular PIP threshold. Hence, it was critical to be able to extrapolate elastance beyond the pressure in the identification sets. Pressure and volume dependent elastance formulation are mathematically equivalent due to the generally monotonic behaviour of pressure and volume. However, integration of flow signals yields drift in volume values that can confound PIP prediction. Pressure signals can be more effectively calibrated across PEEP steps and are thus a much more stable parameter to extrapolate elastance.
While the cohort is representative of modern ICU patients, the sample size of nine patients and 16 RMs is relatively small. The method should be tested on a larger cohort under varying disease states and ventilation modes to confirm the results. Additionally, the threshold pressure of 40 cmH 2 O was somewhat arbitrarily chosen. While this pressure would normally be considered high, it may not necessarily cause over-distension in all patients. However, the findings of this analysis imply that the NARX model's variable elastance can enable accurate and precise PIP predictions at any pressures thresholds that are chosen with the intention to limit distension.
The NARX model predicted high peak pressures more accurately than the FOM, and importantly, had very low instances of false negatives. Zero false negatives occurred at the prediction horizon of one PEEP step. The NARX model may aid clinicians in deciding whether to raise the PEEP setting for individual patients, and avoid dangerous PEEP levels that may cause distension and VILI.