Impact of Software Settings on Multiple-Breath Washout Outcomes

Background and Objectives Multiple-breath washout (MBW) is an attractive test to assess ventilation inhomogeneity, a marker of peripheral lung disease. Standardization of MBW is hampered because few data exist on possible measurement bias. We aimed to identify potential sources of measurement bias based on MBW software settings.
Methods We used unprocessed data from nitrogen (N2) MBW (Exhalyzer D, Eco Medics AG) applied in 30 children aged 5–18 years: 10 with cystic fibrosis (CF), 10 formerly preterm, and 10 healthy controls. This setup calculates the tracer gas N2 mainly from measured O2 and CO2 concentrations. The following software settings for MBW signal processing were changed by at least 5 units or >10% in both directions or completely switched off: (i) environmental conditions, (ii) apparatus dead space, (iii) O2 and CO2 signal correction, and (iv) signal alignment (delay time). Primary outcome was the change in lung clearance index (LCI) compared to LCI calculated with the settings as recommended. A change in LCI exceeding 10% was considered relevant.
Results Changes in both environmental and dead space settings resulted in uniform but modest LCI changes that exceeded 10% in only two measurements. Changes in signal alignment and O2 signal correction had the most relevant impact on LCI. A decrease of O2 delay time by 40 ms (7%) led to a mean LCI increase of 12%, with >10% LCI change in 60% of the children. An increase of O2 delay time by 40 ms resulted in a mean LCI decrease of 9%, with LCI changing >10% in 43% of the children.
Conclusions Accurate LCI results depend crucially on signal processing settings in MBW software. Incorrect signal delay times in particular are a possible source of incorrect LCI measurements. Algorithms of signal processing and signal alignment should thus be optimized to avoid susceptibility of MBW measurements to this significant measurement bias.


Introduction
Assessment of impaired ventilation distribution in the lungs by multiple-breath washout (MBW) measurement has been increasingly used over the past few years. In children with cystic fibrosis (CF) but also with primary ciliary dyskinesia MBW has been shown to be more sensitive for detection of early structural lung changes than standard lung function tests [1][2][3]. MBW is able to assess treatment effects even in mild CF lung disease [4][5][6]. In some centers, MBW is already part of the routine clinical surveillance in patients with CF [7][8][9].
The recently published ATS/ERS consensus aims to standardize MBW signal recording, processing, and analysis [10]. Despite guidelines as to what software packages should be able to perform, most of the technical recommendations rely on little data only [11;12]. In particular, uncertainty exists as to what extent the different software settings impact upon MBW outcomes. Even if the MBW post-hoc quality criteria are met [10], technical flaws during or after the measurement may strongly impact on MBW results. Random measurement bias may particularly influence longitudinal MBW data. Further, some MBW data require post hoc offline analysis to adjust for incorrect settings of online measurements. This may introduce additional non-systematic bias on test results.
To avoid MBW measurement bias as far as possible, the most important software settings and their respective impact upon results need to be known. The aim of our study was thus to assess the influence of different software settings on MBW results. We used nitrogen (N2) MBW raw data from 30 children and adolescents with and without lung disease. We systematically evaluated the impact of changing different software parameters in order to identify the most important source of measurement bias. Primary outcome was the change in lung clearance index (LCI) and functional residual capacity (FRC); secondary outcome was the change in phase III slope indices (SIII).

Methods
Subjects
To cover a wide age and disease range, we used unprocessed raw N2 MBW data (A-files from the recording software Spiroware 3.1.6, Eco Medics AG, Duernten, Switzerland) from children aged 5 to 18 years: 10 with CF, 10 formerly preterm born and 10 healthy term born children. The study was approved by the Ethics Committee of the Canton of Bern, Switzerland. The children's assent was obtained and parents or caregivers provided written informed consent.

Nitrogen multiple-breath washout
We applied a previously described N2 MBW setup [13;14] (Exhalyzer D and Spiroware 3.1.6, Eco Medics AG) as recommended by the current consensus and the manufacturer to record raw data [10]. In this device, flow was measured by a mainstream ultrasonic flowmeter, from which tidal volumes were derived (Fig 1). Gas concentrations were measured by a side-stream laser O2 sensor and a mainstream infrared carbon dioxide (CO2) sensor. The N2 fraction was calculated indirectly from O2, CO2 and the (estimated) argon fraction. Flow-gas delay times were based on default settings as recommended by the manufacturer and further individually adjusted based on visual control of the N2 signal. Equipment dead space volume was divided into pre- and post-sampling-point volumes, defining pre- and post-capillary dead space as represented in Fig 1. Depending on the child's body weight, an appropriate apparatus dead space reducer was used: set 2 (9.5 mL) for children <35 kg and set 3 (22 mL) for children >35 kg, except that set 3 was used in two preterm children and two children with CF weighing <35 kg for logistic reasons.
All children performed triplicate N2 MBW according to current consensus [10]. During measurement children were sitting upright, wearing a nose clip and quietly breathing through a snorkel mouthpiece. N2 MBW was stopped after 3 breaths below 1/40th of the N2 starting concentration. We included the first high-quality measurement per child for analysis, i.e. one test without visible breathing irregularity or leak [10].

Software settings
All data were recorded, processed and analyzed using Spiroware 3.1.6 (Eco Medics AG). To generate a consistent baseline, original measurements were analyzed with averaged delay times per set (Table 1), averaged ambient temperature and pressure. This was the standard baseline analysis to which we compared MBW results after rerunning measurements with alternative software settings. We calculated LCI and FRC as primary outcomes, and parameters calculated from SIII (Scond and Sacin) as secondary outcomes. Outcomes were calculated as currently recommended [10], LCI from cumulative expired volume used to wash out to 1/40th of initial N2 concentration, divided by FRC.

Fig 1. Flow (and derived volume) are measured by a mainstream ultrasonic flowmeter. Gas concentrations are measured by the side-stream laser O2 sensor and the mainstream infrared CO2 sensor. The N2 fraction is calculated indirectly as $F_{N_2} = 1 - F_{O_2} - F_{CO_2} - F_{Argon}$. The gas sampling port divides pre- from post-capillary dead space. Star symbols indicate the approximate volumes, and thus delay times (offsets), between gas and flow sampling points. The gas supply illustrates the open bypass; during the N2 MBW the patient breathes 100% O2 through the mouthpiece.
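The two definitions used here, the indirect N2 fraction and LCI as cumulative expired volume over FRC, can be written out as a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from Spiroware. Choosing the first of three consecutive breaths below 1/40th of the starting concentration as the end of washout is an assumption about the convention, not something the text above specifies.

```python
def n2_fraction(f_o2, f_co2, f_argon):
    """Indirect N2 fraction: F_N2 = 1 - F_O2 - F_CO2 - F_Argon."""
    return 1.0 - f_o2 - f_co2 - f_argon


def washout_end_index(end_tidal_n2, start_n2, n_consecutive=3):
    """Index of the first breath of the first run of n_consecutive breaths
    whose end-tidal N2 is below 1/40th of the starting concentration.
    Returns None if the target is never reached (assumed convention)."""
    target = start_n2 / 40.0
    run = 0
    for i, conc in enumerate(end_tidal_n2):
        run = run + 1 if conc < target else 0
        if run == n_consecutive:
            return i - n_consecutive + 1
    return None


def lung_clearance_index(cev_litres, frc_litres):
    """LCI = cumulative expired volume at end of washout divided by FRC."""
    if frc_litres <= 0:
        raise ValueError("FRC must be positive")
    return cev_litres / frc_litres


# Hypothetical values for illustration only.
room_air_n2 = n2_fraction(f_o2=0.2095, f_co2=0.0004, f_argon=0.0093)
lci = lung_clearance_index(cev_litres=14.0, frc_litres=2.0)
```

Because N2 is not measured directly, any error in the O2 or CO2 signal propagates straight into the tracer gas signal in this formulation.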
We based the magnitude of change in the software settings either on given setting properties or on observed changes in clinical measurements. Accordingly, we inactivated certain settings (e.g. signal correction) or changed the baseline setting in both directions (increase and decrease) using realistic steps of at least 5 units (e.g. 5°C ambient temperature or 5 mL dead space) or a relative change of at least 10% (e.g. flow-O2 offset). We categorized our changes into four groups:
1. Environmental conditions. Default ambient temperature was 21°C and pressure 980 hPa. We changed temperature from 21°C to 16°C and 26°C (±5°C or ±24%), and pressure from 980 hPa to 960 hPa and 1000 hPa (±20 hPa or ±2%). Secondly, BTPS (body temperature pressure saturation) correction was completely switched off.
2. Apparatus dead space. Pre-capillary dead space was 24 mL; post-capillary dead space was 9.5 mL in set 2 and 22 mL in set 3. First, we altered the default pre- and post-capillary dead spaces separately. Pre-capillary dead space was changed from 24 mL to 19 mL and 29 mL (±5 mL or ±21%); post-capillary dead space from 9.5 mL to 7.5 mL and 11.5 mL (±2 mL or ±21%) in set 2, and from 22 mL to 17 mL and 27 mL (±5 mL or ±23%) in set 3. Secondly, we simultaneously lowered and elevated both pre- and post-capillary dead spaces by ±5 mL, or by ±2 mL for post-capillary dead space in set 2.
3. Signal processing and breath detection limits. We switched off one by one the default algorithms processing raw gas signals as recommended by the manufacturer and by previous work [13]: (i) We separately and individually deactivated: automated O2-drift correction, dynamic CO2-correction (adjusting the CO2 signal for high O2 fractions), O2 response-time correction (normally set to 30 ms), and the correction for re-inspired N2 volume. (ii) We decreased the sensitivity for breath detection by elevating the required minimum tidal volume from 25 mL to 100 mL. (iii) We changed the cut-offs determining limits for SIII calculation from 65%-95% to 50%-80% of expired volume. The latter limits may include phase II in paediatric tracer gas expirograms [10].
4. Signal delay times. In the current setup, side-stream O2 and mainstream CO2 signals must be aligned in time with the flow signal to allow calculation of the tracer gas N2 volumes as illustrated in Fig 1. Default flow-O2 offset was 601 ms in set 2 and 618 ms in set 3; default flow-CO2 offset was 51 ms in set 2 and 60 ms in set 3 (Table 1). We assessed the susceptibility of measurements to changes in delay times by the following steps. (i) First, we changed the signal delays individually using realistic steps: for flow-O2 offset from 601 ms in set 2 and from 618 ms in set 3 by ±20 ms (±3%), ±40 ms (±7%) and ±60 ms (±10%); for flow-CO2 offset from 51 ms by ±10 ms (±25%) in set 2, and from 60 ms by ±10 ms (±17%) in set 3 (Table 1). (ii) Secondly, we changed both O2 and CO2 signal delays simultaneously in four separate analyses, adding or subtracting the same times as before. (iii) Third, we always maintained a constant delay difference between the two signals in three separate analyses: we changed both O2 and CO2 signal delays identically by -40 ms, +40 ms and +80 ms, respectively.
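To illustrate why the delay setting in group 4 matters, the following Python sketch shifts a lagging gas signal against flow and integrates the expired tracer volume for one synthetic breath. It is a toy model under stated assumptions, not the Spiroware algorithm: the 5 ms sampling interval, the breath shape, the 600 ms true delay and all function names are hypothetical.

```python
import math

SAMPLE_MS = 5.0  # assumed sampling interval (hypothetical)


def align(signal, delay_ms, sample_ms=SAMPLE_MS):
    """Advance a lagging signal by delay_ms, rounded to whole samples.
    Samples shifted in from outside the recording repeat the edge value."""
    n = round(delay_ms / sample_ms)
    last = len(signal) - 1
    return [signal[min(max(i + n, 0), last)] for i in range(len(signal))]


def expired_tracer_volume(flow, conc, sample_ms=SAMPLE_MS):
    """Integrate flow * concentration over expiration (flow > 0), in litres."""
    dt = sample_ms / 1000.0
    return sum(f * c * dt for f, c in zip(flow, conc) if f > 0)


def toy_breath(true_delay_ms=600.0, sample_ms=SAMPLE_MS):
    """One synthetic 4 s breath: 2 s expiration, then 2 s inspiration."""
    n = int(4000 / sample_ms)
    t = [i * sample_ms / 1000.0 for i in range(n)]
    flow = [math.sin(math.pi * ti / 2.0) for ti in t]  # L/s, > 0 while expiring
    # Simplified expirogram: dead-space gas, a steep rise, then a flat plateau.
    conc = [0.0 if ti < 0.3 or ti >= 2.0 else min(0.02, 0.1 * (ti - 0.3))
            for ti in t]
    lag = round(true_delay_ms / sample_ms)
    measured = [conc[max(i - lag, 0)] for i in range(n)]  # lagging gas signal
    return flow, conc, measured


flow, conc, measured = toy_breath()
v_true = expired_tracer_volume(flow, conc)
for setting_ms in (560, 600, 640):  # true delay and a 40 ms error either way
    v = expired_tracer_volume(flow, align(measured, setting_ms))
    print(setting_ms, "ms:", round(100.0 * (v - v_true) / v_true, 2), "% volume error")
```

In this toy breath a delay setting that is too short undercounts expired tracer volume and one that is too long overcounts it, because the steep rise of the expirogram coincides with high flow. The size and direction of the effect on real recordings depend on the breathing pattern and on how the tracer signal is derived.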

Statistics
Primary outcome parameters were changes in LCI and FRC. A change in either outcome exceeding 10% was considered relevant, as 10% approximately reflects the day-to-day variability of LCI using the same equipment [13;14]. We further assessed changes in parameters calculated from SIII: Scond, reflecting convection-dependent ventilation inhomogeneity, and Sacin, reflecting diffusion-convection-dependent ventilation inhomogeneity. We report raw Scond and Sacin as well as values corrected for tidal volume, as currently recommended in children [15].
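The relevance criterion can be expressed as a short Python sketch. The function names and the LCI values are hypothetical and serve only to show the arithmetic of the 10% threshold.

```python
def percent_change(value, baseline):
    """Relative change of an outcome versus the standard baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def is_relevant(value, baseline, threshold_pct=10.0):
    """True if the absolute relative change exceeds the threshold (10% here)."""
    return abs(percent_change(value, baseline)) > threshold_pct


def share_relevant(pairs, threshold_pct=10.0):
    """Fraction of (baseline, altered) pairs with a relevant change."""
    flags = [is_relevant(altered, base, threshold_pct) for base, altered in pairs]
    return sum(flags) / len(flags)


# Hypothetical LCI values: (baseline, after changing one setting).
example = [(7.0, 7.84), (7.0, 6.5), (10.0, 11.5), (8.0, 8.4)]
print(share_relevant(example))
```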
FRC, LCI, Scond and Sacin data were not normally distributed, while differences between tests were normally distributed. Therefore data from paired tests were compared by the non-parametric Wilcoxon signed-rank test. Agreement between settings (differences between paired tests) was assessed by Bland-Altman plots [16]. P-values <0.05 were considered statistically significant. All analyses were done using Stata (Stata Statistical Software: Release 11. College Station, TX: StataCorp LP).
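The quantities behind a Bland-Altman plot are the mean difference (bias) between paired tests and the 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The study itself used Stata; the Python sketch below only illustrates the calculation, with hypothetical LCI values and a hypothetical function name.

```python
from statistics import mean, stdev


def bland_altman(baseline, altered):
    """Return (bias, lower limit, upper limit) for paired measurements.
    Limits of agreement are bias +/- 1.96 * SD of the paired differences."""
    if len(baseline) != len(altered) or len(baseline) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(altered, baseline)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired LCI values (standard baseline vs. one altered setting).
baseline_lci = [7.0, 8.0, 9.0, 10.0]
altered_lci = [7.5, 8.7, 9.4, 10.8]
bias, lower, upper = bland_altman(baseline_lci, altered_lci)
```

Plotting each difference against the mean of its pair then shows whether the bias depends on the magnitude of the outcome.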

Results
Environmental conditions
Temperature and pressure settings had small indirect proportional effects on FRC and LCI. Lowering the temperature by 5°C resulted in higher LCI and lower FRC values. Vice versa, 5°C higher temperature resulted in the opposite effect (Table 2). Lowering the pressure by 20 hPa led to both higher LCI and FRC, while higher pressure led to lower LCI and FRC (Fig 2). The effect size (mean change) of LCI across all children was lower than 10% (Table 3). Completely inactivating BTPS correction resulted in the same direction of change as lowering temperature, with >10% LCI change observed in two children (one with CF, one healthy).

Apparatus dead space
Dead space settings had small indirect proportional effects on FRC and LCI. Lowering pre- and post-capillary dead space settings separately and simultaneously resulted in higher LCI and FRC (Table 4). Vice versa, increased dead space decreased LCI and FRC. The mean effect on LCI was smaller than 2%, and the change was lower than 10% in all children (Tables 4 and 5). Interestingly, measurement bias due to dead space changes was strongly non-linearly associated with FRC, with smaller FRCs being most affected (Fig 3). LCI showed a trend for an analogous but much weaker relationship (data not shown).

Signal processing and breath detection limits
[…] Healthy). Thereby measurement bias was weakly linearly and inversely related to FRC magnitude. Inactivating the correction for re-inspired N2 and raising the minimum tidal volume for breath detection led to heterogeneous changes of LCI and FRC. Changing limits for SIII calculation from 65-95% to 50-80% of expired volume altered Scond non-systematically, while Sacin increased significantly, doubling on average. The increase in Sacin tended to be linearly larger the higher the standard Sacin. This effect was more pronounced in raw Sacin than in volume-corrected Sacin (Fig 4).

Signal delay times
Changes of O2 signal delay time settings had a significant impact on LCI and FRC, starting from changes of ±40 ms upwards (Tables 8 and 9). (i) Decreasing the flow-O2 offset by 40 ms (-7%) led to a mean LCI increase of 12% and a significant FRC decrease; 60% of the children showed >10% LCI change. The N2 signal typically showed spikes at end-expiration over the last washout breaths (Fig 5A). Elevating the flow-O2 offset by 40 ms (+7%) resulted in the opposite effect, with a mean LCI decrease of 9% and 43% of the children showing >10% LCI change. This resulted in divots at the beginning of the inspired N2 signal (Fig 5B). Changing the flow-CO2 offset showed the opposite effect on LCI and FRC, but without exceeding 10% and without visible change of the N2 signal. (ii) Combining changes in flow-O2 and flow-CO2 offset resulted in the same trend of changes as for flow-O2 offset alterations alone. (iii) When maintaining a constant delay difference between the two signals, effect sizes of changes were lower, but still significant. As expected, no effect was visible on the N2 signal even in those children with relevant LCI changes. Overall, Scond changed non-systematically while Sacin showed the same trend of changes as LCI.
Taken together, signal delay times had the largest impact upon the primary outcomes LCI (Table 10) and FRC, and the secondary outcomes Scond and Sacin. Outcomes were relatively more sensitive to changes in O2 delay than to changes in CO2 delay: while a 7% (40 ms) change in O2 delay showed a significant impact, a 17-25% (10 ms) change in CO2 delay had no significant effect. The largest effect on Sacin was caused by changed limits for SIII calculation, which doubled Sacin values on average. Except for inactivation of O2 drift correction or re-inspired N2 correction, which led to different changes among disease groups, changes of software factors led to the same trend of LCI and FRC changes in all children independent of disease state. This uniformity often generated an overall statistically significant difference which was not necessarily clinically relevant (LCI change >10%) in all subjects (Table 10).

Discussion
This is the first study that systematically examines the implications of different software settings for washout results in a commercially available N2 MBW software. We find that incorrect […] [13;17;18] or by comparison to mass spectrometry [19][20][21][22] using optimized software settings. While the ATS/ERS consensus statement gives recommendations for the optimal hardware and software, there are only few data on the point at which technical factors lead to relevant measurement errors [11;12].
There is preliminary evidence that changes of settings between two releases of the same software lead to different FRC and LCI results, as seen in infant MBW using sulfur hexafluoride (SF6) [23]. Two other studies specifically simulated flow-gas misalignment in different washout setups, both using 10 ms steps from -50 ms to +50 ms [24;25]. While Horsley et al. simulated flow-gas misalignment in a lung model using SF6 washouts with an Innocor gas analyzer (Innovision, Odense, Denmark), Buess et al. analyzed raw data files from N2 MBW (ndd Medical Technologies, Switzerland) [24;25]. Both found an almost linear increase of FRC between ±2.5% and ±7.5% over the -50 to +50 ms delay-time change and a higher FRC error with higher breathing frequency. While different setups and simulation conditions hamper direct translation of results, direction and effect size seem comparable to the FRC errors in our study. In any case, all those studies point towards the importance of precise flow and gas signal alignment for lung volume calculations, particularly in young children with faster ventilation. This has now been shown for all commercially available MBW setups.
One limitation of our study is that we simulated technical measurement errors rather than having the subjects themselves perform the test repeatedly under changed conditions. Changing certain settings might have a different impact on real-time washout measurement than on simulated tests, e.g. a change of post-capillary dead space would concurrently be associated with a change of signal delay times. On the other hand, we based our simulations on reloading raw, unprocessed storage files of the original measurement. Thus we are confident that the results reflect real-life impact. To enable multiple reloads of the tests, we used averaged delay times per set. As we did not change the setup, individual delay times of the tests varied only minimally within a narrow range (Table 1). In addition, we confirmed proper flow-gas alignment for each test by visual control of the N2 signal shape. Thus, we believe this approach does not impede the validity of the results.
Another limitation is that results are specific to the device used in our study. However, only certain findings, such as those on changes of software-specific algorithms, do not apply to other pieces of equipment. General findings such as those for signal alignment and BTPS correction are not specific to this apparatus. Sampling flow and gas concentration at different points will always result in a delay between signals. Most of the time this delay is even flow-dependent [26]. This applies to all available and customized washout setups, such as ultrasonic flowmeter based MBW using either N2 [14] or SF6 [27], the Innocor gas analyser using SF6 [18] and mass spectrometry using SF6 [28]. As mentioned above, this is underlined by comparable findings on the effect of incorrect flow-gas delays on MBW results for other devices [18;24;25]. We only tested single relevant changes for each software factor within our heterogeneous study population.
Thus we could not assess the effect over the complete range of each factor, or define clear relationships with outcome parameters. The Bland-Altman plots suggest non-linearity for many parameters. However, the primary aim of our study was to tease out the most important software setting by including measurements of good quality [10] in a wide age range of children with different lung diseases. The majority of our findings were consistent throughout the study population, resulting in the same trend of change for all children. Depending on the role of software settings within the algorithm for outcome calculation, the effects of changes in settings primarily affecting volume measurements (e.g. BTPS, temperature) were independent of underlying LCI, while the effect of changes in settings primarily affecting gas concentrations (e.g. re-inspired N2) was, as expected, related to underlying LCI.
The most relevant finding was the significant impact of flow-O2 signal misalignment on FRC and LCI results. The same applied to Scond and Sacin, with >10% change of their baseline values. The O2 signal mainly determines the tracer gas signal (N2) in the current setup. In order to derive FRC from MBW measurements, expired gas volumes are calculated by integrating the gas signal with flow. The same applies to measurement of respiratory dead space [26;29]. This requires precise alignment of gas and flow signals [30]. In the N2 MBW setup we encounter two additional challenges for this alignment. During N2 washout, gas composition and viscosity change as O2 concentration increases [30]. This leads to changing delay times in the side-stream sampling of O2 over the period of the washout. Moreover, mainstream flow is not constant because of breathing cycles, which further influences signal alignment. So far, the flow-gas delay has been calculated as a fixed correction factor for the whole measurement.
The susceptibility of this flow-gas synchronization and the clinical implications of misalignment shown in this study strongly suggest that this algorithm is prone to errors. Whether a dynamic flow-gas delay correction is superior, especially at younger age, in which these errors are more relevant [24;25], needs to be examined in future studies. Simulation of changes in equipment dead space also showed more impact in young children, where dead space is large relative to lung volumes. This is in line with recently published clinical data showing higher LCI values with increasing equipment dead space, especially in young children [31]. Technical quality control of washout tests becomes particularly important for longitudinal studies with potentially changing software settings. In the current setup, different signal processing settings (such as O2-drift correction, dynamic CO2 correction or correction for re-inspired N2 volume) were implemented continuously over three years. Moreover, the storage of unprocessed raw data makes it possible to rerun original measurements with changed settings. We now know that the most important technical factor for the operator to check is correct signal alignment throughout the entire washout, while other factors such as small deviations from environmental settings or BTPS correction have only a small and clinically negligible impact. Unfortunately, if the N2 signal shows relevant flow-gas misalignment, this becomes clearly visible to the operator only over the last washout breaths. Prior to software updates, companies should provide data showing the new software's impact (if any) on MBW outcomes. This is recommended by the current consensus [10] and would enable the user to account for possible measurement bias, which seems especially important for repeated MBW measurements in longitudinal studies.
To sum up, we found that software settings have a clear impact upon MBW results. The most important technical factor is flow-gas signal alignment. In particular, an inaccurate flow-O2 offset in N2 MBW can lead to wrongly elevated or falsely normal LCI in a wide age range of children with different lung diseases. Whether a new, flow-adapted algorithm for signal synchronization will lead to more robust results needs to be examined in the future.
Supporting Information
S1 Minimal Dataset. In this dataset the individual settings and respective outcomes are given for all changes in the software settings as detailed in the manuscript. (TXT)