Diagnostic performance of electronic nose technology in chronic lung allograft dysfunction

BACKGROUND: There is a need for reliable biomarkers for the diagnosis of chronic lung allograft dysfunction (CLAD). In this light, we investigated the diagnostic value of exhaled breath analysis using an electronic nose (eNose) for CLAD, CLAD phenotype, and CLAD stage in lung transplant recipients (LTR). METHODS: We performed eNose measurements in LTR with and without CLAD visiting the outpatient clinic. Through supervised machine learning, the diagnostic value of eNose for CLAD was assessed in a random training and validation set. Next, we investigated the diagnostic value of the eNose measurements combined with known risk factors for CLAD. Model performance was evaluated using ROC analysis. RESULTS: We included 152 LTR (median age 60 years, 49% females), of whom 38 had CLAD. eNose-based classification of patients with and without CLAD provided an AUC of 0.86 in the training set and 0.82 in the validation set. After adding established risk factors for CLAD (age, gender, type of transplantation, time after transplantation and prior occurrence of acute cellular rejection) to a model with the eNose data, the discriminative ability of the model improved to an AUC of 0.94 (p = 0.02) in the training set and 0.94 (p = 0.04) in the validation set. Discrimination between BOS and RAS was good (AUC 0.95). Discriminative ability for other phenotypes (AUCs ranging 0.50-0.92) or CLAD stages (AUC 0.56) was limited. CONCLUSION: Exhaled breath analysis using eNose is a promising novel biomarker for diagnosing and phenotyping CLAD. eNose technology could be a valuable addition to the diagnostic armamentarium for suspected graft failure in LTR.

Lung transplantation (LTx) is a lifesaving treatment for selected patients with end-stage lung disease. The number of LTx has been rising steadily over the past years along with reductions in morbidity and mortality. 1,2 Survival rates after LTx have improved with a current median worldwide survival of 6.2 years. 2,3 Nonetheless, further improvement of long-term survival after LTx is hampered by chronic lung allograft dysfunction (CLAD). CLAD concerns permanent loss of allograft function after exclusion of other causes. [4][5][6] Worldwide, around 50% of the patients are diagnosed with CLAD within five years after LTx. 7 CLAD is associated with high mortality and morbidity, and its exact pathophysiology is only partially understood. Several phenotypes of CLAD can be distinguished: a predominantly obstructive ventilatory pattern, known as bronchiolitis obliterans syndrome (BOS); a restrictive pattern, known as restrictive allograft syndrome (RAS); or a mixed obstructive and restrictive pattern (mixed CLAD). Although the phenotypes differ in spirometric and radiological characteristics, establishing the final diagnosis can be challenging. Yet, differentiation between BOS and RAS is crucial, since the prognosis of the phenotypes differs considerably. 8 Furthermore, CLAD can be divided into four stages depending on the decline in forced expiratory volume in 1 second (FEV1). 6 At the moment, treatment options for established CLAD are scarce, and effective treatment remains an unmet medical need. Being able to detect CLAD in an early or developing stage or with greater accuracy could enable quicker, personalized interventions directed at reversing or slowing the process and could lead to better overall survival outcomes. [9][10][11] Hence, there is a need for reliable biomarkers that can diagnose CLAD early.
From this perspective, it is of interest to assess the role of exhaled breath analysis in the early diagnosis of CLAD. Exhaled breath contains thousands of molecules known as volatile organic compounds (VOCs). Endogenous VOCs can be associated with normal physiology, but also with pathophysiological inflammatory or oxidative activity. Thermal-desorption-gas-chromatography-mass-spectrometry has shown the potential of individual VOCs to discriminate between BOS stages 1-2 and stage 3. 12 Although identification of individual VOCs is very specific, it is a very time-consuming process and hard to implement in routine clinical care. An entirely different approach is the analysis of exhaled breath by use of an electronic nose (eNose). An eNose can be used to capture the complete mixture of VOCs in exhaled air by several cross-reactive gas sensors without identifying individual components, resulting in a unique pattern, the so-called breathprint. 13,14 This breathprint reflects the current health status of the individual, and can be analyzed by pattern recognition using machine learning. Consequently, real-time measurements of the breathprint with an eNose have potential as a cheap and fast point-of-care tool in clinical practice. eNose technology has recently shown promising results in diagnosing and phenotyping asthma, chronic obstructive pulmonary disease, and interstitial lung diseases. [15][16][17] Despite its potential, eNose technology has so far scarcely been explored within the field of LTx. 18 In this study we aimed to assess the diagnostic accuracy of exhaled breath analysis using an eNose to detect CLAD in LTx recipients (LTR), and to assess the added value of the breathprint on top of known risk factors for CLAD. Furthermore, we aimed to assess whether eNose technology could discriminate between different CLAD phenotypes and CLAD stages.

Study design and population
We performed a cross-sectional analysis of a prospective cohort study conducted at the Erasmus University Medical Center, Rotterdam, the Netherlands. Between July 2020 and September 2021, all consecutive LTR, both CLAD and non-CLAD, irrespective of transplant date (median 4 [0.1-19.3] years after LTx), visiting the outpatient clinic were asked to participate. For all patients a single measurement was included. The measurement that was included in the analysis was the first measurement, unless the treating consultant indicated that the patient was "not stable" (i.e., signs of infection or acute cellular rejection (ACR) or reduced pulmonary function). When a patient was recently transplanted, the first stable measurement at least 3 months after LTx was included. When uncertainty remained about whether the patient was truly stable, we included the first consecutive measurement where the treating consultant and team were convinced a patient was stable. No subjects were excluded. This study was approved by the medical ethics committee (MEC-2019-0497). All patients provided written informed consent prior to eNose measurements.

Routine measurement and diagnosis of CLAD
Routine pulmonary function tests were performed to determine the forced expiratory volume in 1 second (FEV1) and the forced vital capacity (FVC). CLAD was diagnosed by the treating consultant and the transplant team according to the ISHLT criteria. 6 CLAD is defined by a substantial and persistent decline (≥20%) in measured FEV1 from baseline. Baseline FEV1 is calculated as the mean of the two best FEV1 measurements post-LTx, with a time interval of at least 3 weeks in between. After excluding other causes of lung function decline, CLAD is confirmed when the decrease of FEV1 persists for 3 months despite clinically appropriate therapies. 1,5,6 CLAD that presented as an obstructive ventilatory pattern (FEV1/FVC ratio <0.7) was classified as BOS. CLAD that presented as a restrictive ventilatory pattern (a ≥10% reduction in baseline total lung capacity [TLC]) along with chest computed tomography opacities was classified as RAS. In case there were insufficient TLC measurements available, a substantial and persistent decline (≥20%) in measured FVC from baseline was used as a proxy. 19 If both an obstructive and restrictive pattern existed cases were classified as mixed CLAD. The remaining cases were classified as undefined CLAD. CLAD stages 0 to 4 were classified according to current FEV1 compared to baseline FEV1, with CLAD 0 as current FEV1 > 80% FEV1 baseline, CLAD 1 as current FEV1 > 65% to 80% FEV1 baseline, CLAD 2 as current FEV1 > 50% to 65% FEV1 baseline, CLAD 3 as current FEV1 > 35% to 50% FEV1 baseline, and CLAD 4 as current FEV1 ≤ 35% FEV1 baseline. 6 Patient demographics, type of LTx, time after LTx, occurrence of any prior ACR episodes, medication use and CLAD phenotype and stage at time of the eNose measurement were collected from electronic medical records. 
ACR was defined as one of the following: (1) clinical suspicion for ACR with confirmatory transbronchial biopsy (classified as ≥A2 or ≥B1); (2) clinical suspicion for ACR with suspicious transbronchial biopsy (classified as A1), after exclusion of other causes (such as viral infection) and with clinical response to antirejection treatment; or (3) clinical suspicion for ACR after exclusion of other causes and with clinical response to antirejection treatment.
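The baseline and staging rules described above can be written down compactly. The following is a minimal sketch of those thresholds only; the function names are hypothetical, and it does not cover the clinical parts of the diagnosis (exclusion of other causes, persistence for 3 months, or the 3-week interval between the two best measurements, which is assumed to have been checked beforehand).

```python
from typing import Sequence


def baseline_fev1(post_ltx_fev1: Sequence[float]) -> float:
    """Mean of the two best post-LTx FEV1 values (e.g. in litres).

    Assumption: the caller has already verified that the measurements
    were taken at least 3 weeks apart, as the definition requires.
    """
    if len(post_ltx_fev1) < 2:
        raise ValueError("need at least two FEV1 measurements")
    best_two = sorted(post_ltx_fev1, reverse=True)[:2]
    return sum(best_two) / 2


def clad_stage(current_fev1: float, baseline: float) -> int:
    """Map current FEV1, as a percentage of baseline, to CLAD stage 0-4."""
    if baseline <= 0:
        raise ValueError("baseline FEV1 must be positive")
    pct = 100 * current_fev1 / baseline
    if pct > 80:
        return 0  # > 80% of baseline
    if pct > 65:
        return 1  # > 65% to 80%
    if pct > 50:
        return 2  # > 50% to 65%
    if pct > 35:
        return 3  # > 35% to 50%
    return 4      # <= 35%
```

For example, a patient with a baseline of 3.1 L and a current FEV1 of 2.2 L is at roughly 71% of baseline, which falls in CLAD stage 1.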

eNose - Exhaled breath measurements
Exhaled breath was analyzed in real time using a cloud-connected eNose, the SpiroNose (Breathomix, Leiden, the Netherlands). The SpiroNose consists of seven different types of cross-reactive metal-oxide semiconductor sensors. Each sensor is present on the sensor arrays in a duplex configuration, both inside and outside of the SpiroNose. With this configuration, the SpiroNose is able to measure the VOCs in both exhaled breath and ambient air. A detailed description of the methods and setup was published previously. 16 In short, a SpiroNose measurement consists of 5 tidal breaths, followed by an inspiratory capacity maneuver to total lung capacity, a five-second breath hold, and slow expiration to residual volume. All measurements were performed in duplicate. Sensor responses were sent directly to the online analysis platform, BreathBase. The BreathBase platform includes the secured online database of Breathomix and was developed in conformance with the requirements of ISO 27001 (information security) and NEN 7510 (information security in healthcare). eNose sensor responses were processed and corrected for ambient air as previously described. 16,20 For each sensor, the peak value was determined and normalized to the most stable sensor. To minimize inter-array differences, sensor-to-sensor ratios were used. Ratios between the sensor peaks and breath hold point were calculated. The normalized sensor peaks and the ratios between the sensor peaks and breath hold points were both used for data analysis.
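To make the two derived feature types concrete, the sketch below illustrates one possible reading of the processing steps described above. It is not the BreathBase implementation: the sensor names, the choice of reference sensor, and the interpretation of the peak-to-breath-hold ratio as a per-sensor quotient are all assumptions made for illustration, and the ambient-air correction is assumed to have been applied already.

```python
from typing import Dict


def normalize_peaks(peaks: Dict[str, float], reference: str) -> Dict[str, float]:
    """Divide each sensor peak by the peak of a reference sensor.

    `reference` stands in for the "most stable sensor"; which sensor
    that is in practice is not specified here (assumption).
    """
    ref = peaks[reference]
    if ref == 0:
        raise ValueError("reference sensor peak must be non-zero")
    return {name: value / ref for name, value in peaks.items() if name != reference}


def peak_to_hold_ratios(
    peaks: Dict[str, float], hold_values: Dict[str, float]
) -> Dict[str, float]:
    """Per-sensor ratio of the peak to the value at the breath-hold point."""
    return {name: peaks[name] / hold_values[name] for name in peaks}


# Hypothetical ambient-corrected readings for three of the seven sensor types.
peaks = {"s1": 2.0, "s2": 4.0, "s3": 1.0}
hold = {"s1": 1.0, "s2": 2.0, "s3": 0.5}

features = {
    **{f"norm_{k}": v for k, v in normalize_peaks(peaks, "s2").items()},
    **{f"ratio_{k}": v for k, v in peak_to_hold_ratios(peaks, hold).items()},
}
```

The resulting `features` dictionary plays the role of one row of the data matrix that is passed to the statistical analysis.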

Standard treatment protocols post LTx
Induction therapy consists of basiliximab (Simulect; Novartis Pharma, Basel, Switzerland). Maintenance immunosuppression consists of a calcineurin inhibitor (CNI)-based immunosuppressive regimen (trough levels 7-10) combined with prednisolone (0.05-0.15 mg/kg) and mycophenolate mofetil (CellCept; Roche, Basel, Switzerland). Tacrolimus (Prograf; Astellas Pharma, Staines, UK) is the most commonly used CNI in our center. Upon significant decline of renal function, patients receive a quadruple immunosuppressive scheme in which everolimus (Certican; Novartis Pharma, Basel, Switzerland) is added to enable CNI dose reduction (tacrolimus trough levels 3-4; everolimus trough levels 3-4). All patients receive Pneumocystis jirovecii prophylaxis (co-trimoxazole or inhaled pentamidine) and cytomegalovirus prophylaxis (valganciclovir) if applicable. Inhaled amphotericin B is used as fungal prophylaxis in the first months, and azithromycin is used as standard CLAD prophylaxis in all patients unless patients experience side effects.

Statistical analyses
Descriptive statistics were used to analyze baseline data. Between-group comparisons were conducted using independent samples t-tests and Pearson's Chi-squared tests as appropriate.
We investigated the diagnostic value of eNose-based exhaled breath analysis using the following strategy.
First, we achieved a dimensionality reduction of the eNose sensor data by applying a partial least squares discriminant analysis (PLS-DA). PLS-DA is a supervised machine learning approach. 21 In short, it is a modeling technique for data reduction, creating simplified new explanatory variables, known as latent variables, while retaining as much information as possible from the complete dataset. These variables are subsequently used for supervised classification and discrimination problems, and can be visualized using a scatter plot where the axes represent these latent variables. 22,23 Second, to perform the actual analyses, we randomly divided the dataset into a training set and an (internal) validation set using a ratio of 2:1. 24 The training set was used to train the model using a 10-fold internal cross-validation procedure, and the predictive potential of the fitted model was assessed in the validation set. 25 Third, we investigated the ability of the eNose sensors to distinguish LTR with CLAD from LTR without CLAD. Furthermore, we assessed the diagnostic value of the eNose measurements combined with available known risk factors of CLAD (age, gender, type of LTx, time after LTx and occurrence of any prior ACR episodes) by adding these risk factors to the diagnostic model from the eNose results. 26,27 Additionally, a multivariate logistic regression model with only clinical parameters was compared to a multivariate logistic regression model with eNose parameters added to the known predictors of CLAD. The final model ultimately consisted of the first two PLS-DA components, age, gender, type of LTx, time after LTx, and occurrence of any prior ACR episodes.
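The data partitioning in the second step can be sketched as follows. This is an illustration only: the helper names and the seed are hypothetical, the exact set sizes are not reported in this section (the numbers below simply follow from applying a 2:1 ratio to 152 patients), and whether the original split was stratified by CLAD status is not stated, so a plain random split is assumed.

```python
import random
from typing import List, Tuple


def split_two_to_one(n: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split patient indices 0..n-1 into training and validation (2:1)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(2 * n / 3)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def k_folds(indices: List[int], k: int = 10, seed: int = 0) -> List[List[int]]:
    """Partition the training indices into k roughly equal folds."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


train, validation = split_two_to_one(152)
folds = k_folds(train, k=10)

# In each cross-validation round one fold is held out and the model is
# fitted on the remaining nine; the validation set is never touched.
for held_out in folds:
    fit_on = [i for i in train if i not in set(held_out)]
    assert len(fit_on) + len(held_out) == len(train)
```

Keeping the validation indices out of the cross-validation loop is what allows the validation AUC to serve as an internal check on the model fitted in the training set.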
Discrimination between different CLAD phenotypes and stages was analyzed without dividing the dataset into a training and validation set. To assess discrimination between CLAD phenotypes, we first tested BOS versus RAS, and subsequently tested all four CLAD phenotypes. Additionally, the discrimination between CLAD stage 1 versus CLAD stage 2 or 3 versus non-CLAD was tested.
The performance of the obtained models was evaluated by computing the area under the receiver operating characteristic curve (AUC) and the associated model sensitivity, specificity, and accuracy. Furthermore, the 95% confidence interval (CI) was calculated. Additionally, the cross-validation accuracy and Cohen's kappa were calculated. 28,29 Cohen's kappa indicates how much better the model performs than a random classifier. A value >0.75 is considered excellent, 0.40 to 0.75 fair to good, and <0.40 poor agreement beyond chance. 29 Differences in performance between the models were estimated by performing a pairwise comparison of the obtained AUCs using a bootstrap method.
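For readers less familiar with these measures, the sketch below shows how they are defined for a binary classifier. It is a plain-Python illustration with hypothetical function names, not the software used in the study; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_summary(
    labels: Sequence[int], predictions: Sequence[int]
) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = 1.0 if expected == 1 else (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": kappa,
    }
```

A kappa of 0 corresponds to a classifier that agrees with the reference diagnosis no more often than expected by chance, and a kappa of 1 to perfect agreement.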
To check whether the primary disease in the native lung of unilateral transplanted patients influenced the discrimination between CLAD and non-CLAD using the eNose, a sensitivity analysis was performed by excluding unilateral transplanted patients. The influence of the exclusion of unilateral transplanted patients was estimated by performing a pairwise comparison of the obtained AUCs using a bootstrap method.
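A bootstrap comparison of two AUCs obtained on the same patients can be sketched as below. The exact variant used in the study is not described in this section, so this should be read as one common formulation under stated assumptions: patients are resampled with replacement within each class (a stratified bootstrap), the AUC difference is recomputed in every resample, and the observed difference is divided by the standard deviation of the bootstrap differences to obtain a two-sided p-value from the normal distribution. The function names, the number of resamples, and the seed are hypothetical.

```python
import math
import random
from statistics import stdev
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC; ties between a positive and a negative count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_test(
    labels: Sequence[int],
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed AUC difference, two-sided p-value) for paired models."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    for _ in range(n_boot):
        # Stratified resampling keeps both classes present in every resample.
        idx = [rng.choice(pos) for _ in pos] + [rng.choice(neg) for _ in neg]
        y = [labels[i] for i in idx]
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    sd = stdev(diffs)
    if sd == 0:
        return observed, (1.0 if observed == 0 else 0.0)
    z = observed / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return observed, p_value
```

Because both models are evaluated on the same resampled patients, the correlation between their AUCs is taken into account, which is the reason for using a paired rather than an independent comparison.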

Baseline characteristics
A total of 152 out of the 160 approached LTR participated in the study and were included at outpatient clinic visits. Of these patients 131 (86%) underwent a bilateral transplantation, 49% of the patients were female and the median age was 60 [range

Classification of CLAD versus non-CLAD
For the training set the estimated classification accuracy of the PLS-DA model achieved with cross-validation was 83% (±9.6% SD) with a kappa of 0.47 (±0.31 SD). The results of the PLS-DA model for the training set, without incorporation of the known predictors for CLAD, are shown in Figure 1, along with the corresponding ROC curve. In the training set, the AUC was 0.86 (95% CI 0.76-0.96) with a sensitivity of 88%, a specificity of 76%, and an accuracy of 79%. In the validation set, the AUC reached 0.82 (0.66-0.98) with a sensitivity of 83%, a specificity of 83%, and an accuracy of 83%.
Multivariate logistic regression was performed with only the known available predictors of CLAD and with these predictors supplemented with the eNose parameters (Table 2). The eNose parameters were significantly associated with CLAD on top of the known predictors of CLAD.
Subsequently, the logistic regression model for CLAD was trained using the two PLS-DA components obtained from the breathprints and known predictors for CLAD (age, gender, type of LTx, time after LTx and occurrence of any prior ACR episodes). For the training set, the estimated classification accuracy of the complete model achieved with cross-validation was 85% (±9% SD) with a kappa of 0.55 (±0.29 SD). The results for the training set are shown in Figure 2. In the training set, the AUC was 0.94 (95% CI 0.87-1.00) with a sensitivity of 96%, specificity of 85%, and an accuracy of 88%. In the validation set, the AUC reached 0.94 (95% CI 0.87-1.00) with a sensitivity of 100%, a specificity of 78%, and an accuracy of 83%.
Bootstrapped comparison of the AUCs showed that the logistic regression model with the additional parameters outperformed the PLS-DA model with breathprint only in both the training (p = 0.02) and the validation set (p = 0.04).

CLAD phenotype
Out of the 38 LTR with CLAD, 20 were diagnosed with BOS, 5 LTR with RAS, 5 LTR with mixed CLAD, and 8 LTR with undefined CLAD phenotype. The results of the PLS-DA model are shown in Figure 3. The SpiroNose accurately discriminated between BOS and RAS with an AUC of 0.95 (95% CI 0.87-1.00), a sensitivity of 100%, specificity of 90%, and an accuracy of 92%.
Likewise, as shown in Figure 4, the SpiroNose could adequately discriminate between the BOS and undefined phenotypes. However, the SpiroNose could not adequately discriminate between the other phenotypes (Table 3).
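As a plausibility check on the BOS versus RAS result, the reported percentages can be related to the group sizes of 20 BOS and 5 RAS patients. The confusion matrix below is not reported in the text; it is one set of counts that is consistent with the reported sensitivity, specificity, and accuracy, under the assumption that RAS was treated as the positive class.

```python
# Group sizes as reported in the text.
n_bos, n_ras = 20, 5

# Assumed counts (not reported): RAS is the positive class, all 5 RAS
# patients are classified correctly and 2 of the 20 BOS patients are not.
tp, fn = 5, 0
tn, fp = 18, 2

assert tp + fn == n_ras and tn + fp == n_bos

sensitivity = tp / (tp + fn)                 # 5 / 5
specificity = tn / (tn + fp)                 # 18 / 20
accuracy = (tp + tn) / (n_bos + n_ras)       # 23 / 25

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}, accuracy {accuracy:.0%}")
```

With only 5 RAS patients, a single misclassified RAS case would lower the sensitivity by 20 percentage points, which illustrates why the phenotype results should be interpreted with the small subgroup sizes in mind.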

CLAD stages
The results of the PLS-DA model are shown in Figure 5. The eNose could reliably distinguish between non-CLAD LTR (CLAD 0) and CLAD stages 1 and 2-3. However, for discrimination between CLAD 1 and CLAD 2-3, the AUC was only 0.56 (95% CI 0.38-0.75) with a sensitivity of 100%, specificity of 18%, and an accuracy of 54%.

Influence of unilateral lung transplantation
In a sensitivity analysis, unilateral LTR were excluded from the dataset. This resulted in a dataset of 131 bilateral LTR of which 31 LTR were diagnosed with CLAD.  Subsequently, the model was trained on the new dataset and the AUCs were obtained. In the training set the AUC was 0.94 (95% CI 0.89-0.99) with a sensitivity of 78%, specificity of 97%, and an accuracy of 92%. In the validation set the AUC reached 0.86 (95% CI 0.74-0.98) with a sensitivity of 100%, specificity of 66%, and an accuracy of 73%. Bootstrapped comparison of the AUCs showed no significant differences before and after exclusion of unilateral LTR in both the training (p = 0.84) and validation (p = 0.27) sets.

Figure 2
ROC curves of the logistic regression model.

Figure 3
Discrimination between the BOS and RAS phenotypes of CLAD using the SpiroNose and the corresponding ROC curve. BOS: Bronchiolitis obliterans syndrome; RAS: Restrictive allograft syndrome.

Discussion
In this study, we found that exhaled breath analysis using an eNose was able to confirm CLAD in LTR with 79% to 83% accuracy. The addition of known risk factors for CLAD to the breathprint further improved the diagnostic accuracy. Additionally, the eNose could discriminate between BOS and RAS CLAD phenotypes with 92% accuracy.
As for the technical model, eNose technology shows a good discriminative signal between LTR with and without CLAD, as indicated by the cross-validated machine learning model showing an AUC of at least 0.82 for the PLS-DA model (validation set). When known risk factors for CLAD development were incorporated into the model, the discriminative power of the model further increased. In the literature, underlying diagnosis pre-LTx, induction treatment, time after LTx, ACR, gastroesophageal reflux disease (GERD), bacterial colonization, infections, and the presence of donor-specific antibodies (DSAs) have been described as consistently associated with the development of CLAD and its main phenotypes. 26,27 In our data set induction treatment, GERD, bacterial colonization, infections, and presence of DSAs were missing and could therefore not be added to the model. A larger validation set should reveal whether the breathprint alone is sufficient as a predictor for CLAD or whether a larger model incorporating other known risk factors would be preferred.
The eNose could accurately discriminate between the BOS and RAS phenotypes. However, discrimination between the other CLAD phenotypes, especially the mixed and undefined phenotypes, was insufficiently accurate. Whereas this could be related to the overall small numbers in the subgroups, it may also give insight into the underlying characteristics of the various phenotypes. There are some putative mechanisms described for RAS including inflammation, innate immune response, and humoral immune response, which may result in irreversible tissue remodeling and fibrosis. Putative mechanisms described for BOS include different injurious processes, such as allo- and autoimmune responses, external stimuli (such as gastroesophageal reflux, bacterial or viral infections, pollutant exposure), and airway ischemia, ultimately leading to remodeling of the small airways while other compartments of the lung remain relatively intact. Nonetheless, the mechanisms of BOS and RAS often overlap. 30 The inability to discriminate undefined and mixed CLAD patients may be explained by the fact that these phenotypes express characteristics of both RAS and BOS. Although highly speculative, the patients with mixed phenotype seem to cluster more towards the BOS phenotype, and patients with undefined phenotype cluster more towards the RAS phenotype. This does not, however, align with data published by Levy et al. on prognosis of these phenotypes, as they found that patients with RAS or mixed phenotype had a worse allograft survival than those with BOS, while patients with an undefined phenotype did not differ in allograft survival from patients with BOS. 8 Unsupervised analyses of the breathprint in a larger, multicenter CLAD dataset may yield useful insights into how patients naturally cluster together as opposed to how they are grouped according to the current classification. This might confirm or challenge the current CLAD classification.

Figure 4
Discrimination between all CLAD phenotypes using the SpiroNose. BOS: Bronchiolitis obliterans syndrome; RAS: Restrictive allograft syndrome.
Sensitivity analysis of CLAD detection, in which patients who received unilateral lung transplantation were excluded, showed no significant differences. This suggests that the influence of the native lung of unilateral LTR on the breathprint can be disregarded for the discrimination between LTR with and without CLAD.
To date, there are limited treatment options for CLAD, with no treatment able to reverse CLAD once it has developed. Addition of azithromycin has been shown to delay CLAD onset and improve long-term survival, and is currently regarded as standard therapy before allowing a diagnosis of CLAD. 31,32 Therapeutic options for CLAD (BOS) include total lymphoid irradiation (TLI), extracorporeal photopheresis (ECP), or rescue treatment with alemtuzumab. None of these has been studied extensively or in controlled studies, but they may have some effect in slowing down progression of CLAD. 33 Also, a beneficial effect of ATG as a second-line therapy is suggested, with a response rate of 63%. 34 Some beneficial effects with add-on montelukast have also been suggested in two small studies, but its effect remains unclear. 35,36 As CLAD is histologically often accompanied by interstitial fibrosis, novel antifibrotic agents are currently under study in various clinical trials, both in BOS and RAS phenotypes of CLAD, with disappointing initial results for BOS. More results are expected later in 2022. 37 Finally, if all treatment options fail, re-transplantation can be a last resort to treat advanced CLAD. Nevertheless, only a minority of patients qualifies for re-transplantation and outcomes are inferior to primary transplantation. 38 Despite the limited treatment options for established CLAD, timely diagnosis is highly desirable. Earlier diagnosis could potentially enable more successful treatment. 5 Regarding future perspectives, if patients who are at risk of developing CLAD could be identified before pulmonary function actually deteriorates, early treatment, such as increasing the net state of immunosuppression, might be initiated, potentially leading to better outcomes. Our study shows that the eNose might be a suitable point-of-care test to rapidly and noninvasively establish a reliable diagnosis of CLAD.
Future studies will be directed at external validation of the current results as well as the potential of a more timely diagnosis or even predicting development of CLAD before actual decline in pulmonary function becomes apparent.

Figure 5
Discrimination between the different CLAD stages using the SpiroNose and the corresponding ROC curves.
This is the first time that exhaled breath analysis using eNose technology has been demonstrated to be a very promising novel biomarker for CLAD. Despite the potential of eNose technology, it has barely been explored in the field of lung transplantation. 18,39 The ability of eNose technology to discriminate between different pulmonary diagnoses has been evaluated by numerous clinical studies for a wide spectrum of lung diseases. 18 Accordingly, evidence is accumulating that eNose may be a valuable diagnostic tool for clinical and inflammatory phenotyping of, for example, asthma, COPD, or interstitial lung diseases. [15][16][17] Real-time measurement of the breathprint by an eNose is a low-cost and fast point-of-care tool that may prove useful as an addition to the current repertoire in clinical follow-up after LTx. Further study into the application and validation is warranted given these promising first results. Using eNose technology, only the patterns of VOCs in exhaled breath are captured by cross-reactive gas sensors. Subsequently, these patterns are analyzed and classified using machine learning algorithms. In order to identify individual VOCs, other techniques, such as gas chromatography-mass spectrometry (GC-MS), are needed, which are very complex and time-consuming, but could theoretically be combined with the current eNose. Despite these inherent characteristics and the inability to characterize individual VOCs, the main advantages are that this technique is cheap, easy to execute, fast, and can be made available in real time at the outpatient clinic, and thus offers opportunities for widespread implementation.
A potential limitation of this study is the relatively small validation set. The size of the validation set influences the random uncertainty of the observed model performance, and to avoid bias the validation set needs to be representative of the patient population. 40 Nonetheless, results from the training and validation sets are highly similar in all models tested, indicating good model fit. Also, eNose studies generally require small numbers to assess discrimination abilities. 24 The diagnostic accuracy found in this cohort is very high and reproducible in the validation sets and thus unlikely to be based on chance findings. Additionally, the study was underpowered for discrimination between the four phenotypes of CLAD. Analyses of the breathprint in a larger and multicenter CLAD dataset should reveal the accuracy of eNose technology to distinguish between the four phenotypes of CLAD. A strength is the large response rate and the well-defined cohort of patients.
In the present study we demonstrate for the first time that it is possible to discriminate between LTR with and without CLAD, and between the RAS and BOS phenotypes, with the use of eNose technology. Hence, exhaled breath analysis using eNose technology is a very promising novel biomarker for CLAD enabling timely diagnosis and phenotyping. Furthermore, eNose technology could be a valuable addition to the diagnostic armamentarium for suspected graft failure after LTx.

Disclosure statement
JA: reports personal fees and non-financial support from MSD; personal fees from BMS, Boehringer Ingelheim, Amphera, Eli Lilly, Takeda, Bayer, Roche, Astra Zeneca outside the submitted work. In addition, JA has a patent on allogenic tumor cell lysate licensed to Amphera, a patent combination immunotherapy in cancer pending, and a patent biomarker for immunotherapy pending. OM: reports personal fees from Astra Zeneca, Boehringer Ingelheim, Novartis outside the submitted work. The remaining authors have no conflicts of interest to disclose.