Identification of profiles of volatile organic compounds in exhaled breath by means of an electronic nose as a proposal for a screening method for breast cancer: a case-control study

The objective of the present study was to identify volatile prints from exhaled breath, termed breath-prints, from breast cancer (BC) patients and healthy women by means of an electronic nose and to evaluate their potential use as a screening method. A cross-sectional study was performed on 443 exhaled breath samples from women, of whom 262 had been diagnosed with BC by biopsy and 181 were healthy women (control group). Breath-print analysis was performed utilizing the Cyranose 320 electronic nose. Group data were evaluated by principal component analysis (PCA), canonical discriminant analysis (CDA), and support vector machine (SVM), and the test’s diagnostic power was evaluated by means of receiver operating characteristic (ROC) curves. The results obtained using the model generated from the CDA, which best describes the behavior of the assessed groups, indicated that the breath-print of BC patients was different from that of healthy women, with a variability of up to 98.8% and a correct classification of 98%. The sensitivity, specificity, negative predictive value, and positive predictive value reached 100% according to the ROC curve. The present study demonstrates the capability of the electronic nose to discriminate between healthy subjects and BC patients. This research could have a beneficial impact on clinical practice, as we consider that this test could probably be used as a first step before the application of established gold-standard tests (mammography, ultrasound, and biopsy) and substantially improve screening tests in the general population.


Introduction
Breast cancer (BC) is an oncological process in which healthy breast gland cells degenerate into tumor cells, proliferating and multiplying to form a tumor that can spread to other organs through the lymph nodes. This cancer is the most common type in women and is the second leading cause of death in the world. In 2018, the World Health Organization (WHO) estimated that there were 2.09 million cases of BC worldwide and more than 627 000 women died from this disease [1].
Several studies have shown that the risk of suffering from BC may be due to a combination of different factors, the most important of which are: biological factors, such as age over 40 years; environmental factors, such as exposure to ionizing radiation and environmental pollutants; lifestyle factors, such as obesity, sedentary life, and smoking; reproductive history (e.g. no experience of a first pregnancy after 30 years of age); and genetic factors, such as being a carrier of mutations in the BRCA1 or BRCA2 genes [2][3][4]. Although the direct causative agents of BC remain largely unknown, early detection has so far been the most important aspect of efforts to fight this disease. Strategies such as awareness, early detection, accurate diagnosis, timely treatment, and supportive care are critical to reducing the burden of BC [5][6][7].
Mammography is generally used for screening programs, and the WHO states that the use of this technology can reduce mortality by up to 20%. However, as with other screening tests, mammography leads to false positive or false negative results in over 20% of cases [8]. When used in settings with sufficient resources available, mammography screening programs are recommended for women aged 50-69 years, with testing every two years. For women aged 40-49 years, this test has been shown to be less sensitive, one of the main reasons being the high breast density in this age group, in which the sensitivity of the test in early stages of the disease is reduced to between 37.6% and 67.6% [9].
The WHO states that in resource-constrained settings and deficient health systems, population-based mammography screening programs may be neither cost-effective nor feasible. Therefore, early diagnosis and treatment should be the priority in these settings, where clinical breast examination appears to be a promising screening method [1].
To address these challenges, effort has been invested into the discovery of new diagnostic biomarkers and screening through genomic analysis, leading to improvements in clinical outcomes and survival rates [10][11][12][13]. Tests such as the identification of BRCA1 and BRCA2 mutations decrease the cost of timely BC identification programs for women with a family history [14,15]. However, such testing is costly and applies only to people with identified risk factors and not to the general population. Evidence-based frameworks for prevention and cost-effective management of BC are essential for achieving equitable outcomes and addressing the growing burden of disease [16].
The analysis of cancer biomarkers for screening based on the volatilome represents a promising development and has the characteristics of being minimally or non-invasive, fast, low-cost, and with potential for a greater coverage of the population. This technology analyzes and identifies the volatile organic compounds (VOCs) that are produced from the metabolism of normal and cancer cells and interact with their microenvironment to be detectable in biological fluids, depending on their tissue-blood/urine/breath partition coefficients [17]. This approach is based on the production of VOCs by metabolism in pathophysiological processes, including hypoxia, increased energy expenditure by hyperproliferation, excessive inflammatory activity, the generation of reactive oxygen species, and many other processes that occur during the evolution of cancer and lead to changes in the types and amounts of VOCs generated [17][18][19].
Gas chromatography coupled to mass spectrometry (GC-MS) has frequently been used for the identification and quantification of VOCs in exhaled breath. This approach is intended to identify the chemical nature of the composition of VOCs in women with BC, allowing the exploration of each compound through multivariate statistics, and to compare this with the composition of VOCs in healthy women in order to statistically discriminate the differences and, thus, determine a chemical print that acts as a biomarker of the disease [20,21]. In 2014, Wang et al evaluated VOCs in exhaled breath from 85 patients with histologically confirmed BC and 45 healthy subjects using GC-MS, and identified that the VOCs 2,5,6-trimethyloctane, 1,4-dimethoxy-2,3-butanediol, and cyclohexanone differed between the study groups [22]. This study suggested that these metabolites can be used in the identification of specific disease biomarkers; however, the economic cost of the analysis made it unfeasible to propose this approach as a screening method.
Another widely used methodology for the global identification of VOC patterns uses gas sensors such as the electronic nose. This approach measures the set of breath signals produced, without details of the chemical composition of the sample; this is defined as an olfactory print (breath-print), and its association with disease and differentiation from healthy subjects is accomplished through multivariate analysis, including principal component analysis (PCA), canonical discriminant analysis (CDA), support vector machine (SVM), artificial neural networks, and other methods. This approach is not able to identify every VOC, but it can reveal the overall pattern and, therefore, can be used in the future to identify similar cases [23]. In 2011, Shuster et al evaluated the exhaled breath of 7 healthy women, 16 women with benign breast conditions, and 13 women with malignant lesions using an artificial nanoscale nose. The authors managed to distinguish between the groups, and this pilot study has guided the use of the electronic nose as a cost-effective, fast, and reliable method in BC screening [24].
The present article aimed to analyze the olfactory print of patients with BC and healthy women using an electronic nose, in order to evaluate its potential as a complementary screening method. A secondary objective was to test the methodology in a pilot study in women categorized as BI-RADS-0.

Patients, healthy subjects, and study design
A prospective, observational, case-control study was conducted with patients from the Breast Cancer Foundation A.C. (FUCAM) in Mexico City (2240 m above sea level). This site serves 7% of the BC population of Mexico. Non-probabilistic convenience sampling was performed from March to December 2019. The sample size calculation was performed using the statistical package STATISTICA V7 (Power Analysis and Sample Size Calculation in Experimental Design-Calculating Required Sample Size), assuming the comparison of different means using a one-way analysis of variance. Conservative values were used, considering a normal distribution, 50% variability, two experimental groups, a power (1-β) of 90%, and a significance level (α) of 1%. For the calculation of sample size, the prevalence of BC in Mexico City was taken to be 26 000 cases per year. For the purposes of this work, the sample size required for each group was 166 subjects.
The study inclusion criteria for BC patients were: (i) recently diagnosed with BC by biopsy; (ii) with no previous oncological treatment for at least one year; (iii) stage I to IV ductal carcinoma. The study elimination criteria were: lung diseases such as COPD, lung cancer, recent respiratory infections, pregnancy, insufficient sample, not following the breath sample protocol, and not signing the informed consent.
Healthy women classified as BI-RADS 1 and 2 were included as healthy controls. Inclusion criteria for healthy subjects were: (i) absence of BC and (ii) no history of any type of cancer. In addition, a questionnaire covering family health history, comorbidities (hypertension, diabetes, and other diseases), BI-RADS category, socioeconomic characteristics, risk activities such as tobacco smoking, and number of births, among others, was applied to both groups. Control group elimination criteria included: diagnosed lung diseases, recent infections, failure to take a sample, failure to follow the breath sample protocol, and failure to sign the informed consent. The study was reviewed and approved by the Ethics Committee of INER, with the number C54-18, and the Scientific Committee of FUCAM. This protocol was approved by COFEPRIS to be carried out in humans.

Exhaled breath collection
The collection of exhaled breath samples was conducted based on previous studies by our research group and on the European Respiratory Society guide [23,25]. After relaxing, the participants took three deep breaths followed by a deep exhalation into breathing collection bags (BCB), each consisting of a 1.4-liter metalized plastic bag that had been pre-purged twice with ultra-pure nitrogen [26]. Every participant was asked to repeat the process.
The conditions under which the sample was collected, both for healthy subjects and patients with BC, were as follows: (i) fasting (minimum 5 h), (ii) no smoking before the study, (iii) no oral hygiene, and (iv) before taking any medication. The samples were transported at 4 °C and subsequently analyzed. In addition, an environmental control sample was taken to eliminate possible interferences [27].

Analysis of exhaled breath
The Cyranose 320 (Sensigent®, California, US) was employed to determine the breath-print of the study groups. This device is equipped with 32 polymer-based sensors, each with a different sensitivity to various VOCs, which cause an increase in the electrical resistance of each sensor.
The sample processing was as follows: first, the BCBs were incubated at 37 °C for 5 min before the reading. The electronic nose settings consisted of a constant flow rate of 120 ml min⁻¹, with 40 s of baseline with ultra-pure nitrogen. The duration of the Cyranose output signal collection was 90 s. Then, the flow rate was increased to 180 ml min⁻¹ of ultra-pure nitrogen for sample line purging and air intake, and the substrate temperature was 32 °C.
As an internal quality control, the resistance of the 32 sensors was recorded every day of the experiment to evaluate the quality of the analysis and determine the variation of the sensors' responses to methanol and undecane standards, which were used at a concentration of 1 part per million. In addition to this measurement, 10 ultrapure-nitrogen samples were randomly recorded to verify the baseline signal. Low coefficients of variation were obtained (Supplementary material-tables 1 and 2 (available online at stacks.iop.org/JBR/14/046009/mmedia)).
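As an illustration of the quality control described above, the coefficient of variation of repeated readings from one sensor could be computed as in the following minimal sketch. The function name and the example readings are hypothetical and are not taken from the study, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Return the coefficient of variation (in %) of repeated sensor readings."""
    average = mean(readings)
    if average == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    # Sample standard deviation (n - 1 denominator) is an assumption here.
    return 100.0 * stdev(readings) / average


# Hypothetical daily resistance readings (arbitrary units) for one sensor.
daily_readings = [1000.0, 1002.0, 998.0, 1001.0, 999.0]
cv_percent = coefficient_of_variation(daily_readings)
```

A small value of `cv_percent` would indicate a stable sensor response across days.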

Statistical analysis
The choice of the data pre-processing algorithm has been shown to affect the performance of the pattern recognition stage. Software written in MATLAB 6.1, CDAnalysis (Sensigent®), was used to extract features from the data in terms of the static change in sensor resistance. All data were normalized using a fractional difference model: ΔR/R0 = (Rmax − R0)/R0, where Rmax is the maximum response of the system to the sample gas and R0 is the baseline reading, the reference gas being the ultrapure nitrogen flow.
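The fractional difference normalization can be sketched as follows. This is a minimal illustration rather than the CDAnalysis implementation; the function names and the example resistances are hypothetical.

```python
def fractional_difference(r_max, r_0):
    """Return (Rmax - R0) / R0 for one sensor."""
    if r_0 <= 0:
        raise ValueError("baseline resistance must be positive")
    return (r_max - r_0) / r_0


def normalize_sample(r_max_values, baseline_values):
    """Apply the fractional difference to every sensor of one sample."""
    return [
        fractional_difference(r_max, r_0)
        for r_max, r_0 in zip(r_max_values, baseline_values)
    ]


# Hypothetical readings for two sensors: sample maximum and nitrogen baseline.
normalized = normalize_sample([1010.0, 2040.0], [1000.0, 2000.0])
```

Dividing by the baseline makes the responses of sensors with very different absolute resistances comparable.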
In addition, autoscaling was carried out to eliminate the effects of the magnitude of the sensor responses, by subtracting the mean across samples from each individual sensor response and dividing the result by the standard deviation across samples.
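The scaling step just described can be illustrated with a short sketch, assuming that it is applied separately to each sensor (column) and that the sample standard deviation is used; neither detail is stated in the text, and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def autoscale(samples):
    """Center and scale each sensor column to zero mean and unit standard deviation.

    `samples` is a list of rows, one row per breath sample, one value per sensor.
    """
    columns = list(zip(*samples))
    means = [mean(column) for column in columns]
    deviations = [stdev(column) for column in columns]
    if any(sd == 0 for sd in deviations):
        raise ValueError("a sensor with constant response cannot be scaled")
    return [
        [(value - m) / sd for value, m, sd in zip(row, means, deviations)]
        for row in samples
    ]


# Hypothetical responses of two sensors for three samples.
scaled = autoscale([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

After scaling, both hypothetical sensors contribute on the same scale even though their raw magnitudes differ tenfold.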
To capture the greatest amount of variability in the data, PCA was performed using the Chemometric Data Analysis software CDAnalysis (Sensigent®), thus reducing the data from the 32 sensors to three principal components. CDA and SVM discrimination models were used to assess clustering within the data sets of healthy women and BC patients. These exploratory techniques are used to investigate how the data cluster in the multi-sensor space.
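To make the idea of "variability captured" concrete, the sketch below extracts the first principal component of a small data set by power iteration on the covariance matrix and reports the fraction of total variance it explains. This is an illustrative stand-in, not the algorithm used by CDAnalysis; the function name and data are hypothetical.

```python
import math


def first_principal_component(data, iterations=500):
    """Return (direction, explained_variance_ratio) of the first principal component."""
    n, p = len(data), len(data[0])
    means = [sum(column) / n for column in zip(*data)]
    centered = [[value - m for value, m in zip(row, means)] for row in data]
    # Sample covariance matrix (n - 1 denominator).
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    total_variance = sum(cov[i][i] for i in range(p))
    if total_variance == 0:
        raise ValueError("data have no variance")
    direction = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(value * value for value in product))
        if norm == 0:
            raise ValueError("start vector lies in the null space of the covariance")
        direction = [value / norm for value in product]
    eigenvalue = sum(
        direction[i] * sum(cov[i][j] * direction[j] for j in range(p))
        for i in range(p)
    )
    return direction, eigenvalue / total_variance


# Hypothetical two-sensor data lying exactly on a line.
pc1, explained = first_principal_component(
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
)
```

Because the hypothetical points are perfectly collinear, a single component accounts for essentially all of the variance.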
The sensors with a higher importance index were used to obtain the CDA and SVM discrimination models, which were assessed through a cross-validation value (a leave-one-out procedure that predicts the group association and yields overall classification success rates) and the Mahalanobis distance between the group means in units of standard deviation.
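The leave-one-out procedure can be sketched as below. For brevity, a nearest-group-mean rule on a one-dimensional score stands in for the CDA model; this classifier, the function names, and the scores are assumptions made for illustration only.

```python
from statistics import mean


def nearest_mean_predict(train_scores, train_labels, score):
    """Assign `score` to the group whose training mean is closest."""
    groups = {}
    for value, label in zip(train_scores, train_labels):
        groups.setdefault(label, []).append(value)
    centroids = {label: mean(values) for label, values in groups.items()}
    return min(centroids, key=lambda label: abs(score - centroids[label]))


def leave_one_out_accuracy(scores, labels):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(scores)):
        train_scores = scores[:i] + scores[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nearest_mean_predict(train_scores, train_labels, scores[i]) == labels[i]:
            correct += 1
    return correct / len(scores)


# Hypothetical discriminant scores for three healthy women and three BC patients.
example_scores = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
example_labels = ["healthy"] * 3 + ["BC"] * 3
loo_rate = leave_one_out_accuracy(example_scores, example_labels)
```

Each sample is classified by a model that never saw it, so the resulting rate is less optimistic than the resubstitution rate on the training data.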
SVM is a kernel-based (Gaussian radial basis function) supervised learning classification method that determines the optimal boundaries (support vectors) that separate groups [28]. Given n training pairs $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, where $x_i$ is an input vector and $y_i \in \{-1, 1\}$, the SVM solves the following primal problem:

$$\min_{\beta,\,\beta_0,\,\xi}\ \frac{1}{2}\lVert\beta\rVert^2 + C\sum_{i=1}^{n}\xi_i \quad \text{subject to} \quad y_i\left(\beta^T x_i + \beta_0\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1, \dots, n$$

where $\beta$ is the weight vector normal to the separating hyperplane, $\beta_0$ is the bias term, T denotes the transpose, C is the tuning parameter denoting the trade-off between the margin width and the training data error, and $\xi_i \ge 0$ are slack variables. For an unknown input pattern x, we have the decision function:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n}\alpha_i y_i K(x, x_i) + \beta_0\right)$$

where $\alpha_i \ge 0$, $i = 1, 2, \dots, n$, are the Lagrange multipliers and $K(x, x_i)$ is the kernel function. The Gaussian radial basis function is used as the kernel function:

$$K(x_i, x_j) = \exp\left(-\gamma\lVert x_i - x_j\rVert^2\right)$$

where $\gamma > 0$ is a fixed parameter [28,29]. The vector boundary was displayed in blue for a matrix value equal to zero, indicating a low-confidence prediction, and with green outlines for a matrix value equal to 0.99, indicating a high-confidence prediction.

Cross-validation (CV) is a standard technique for adjusting the hyperparameters of predictive models. In K-fold CV, the available data S was partitioned into K subsets $S_1, \dots, S_K$. Each data point in S was randomly assigned to one of the subsets, such that these are of almost equal size (i.e. $\lfloor |S|/K \rfloor \le |S_i| \le \lceil |S|/K \rceil$). Furthermore, we defined $S_{\setminus i} = \bigcup_{j=1,\dots,K \,\wedge\, j \ne i} S_j$ as the union of all data points except those in $S_i$. For each $i = 1, \dots, K$, an individual model was built by applying the algorithm to the training data $S_{\setminus i}$. This model was then evaluated by means of a cost function using the test data in $S_i$. The average of the K outcomes of the model evaluations is called the cross-validation (test) performance or cross-validation (test) error and was used as a predictor of the performance of the algorithm when applied to S. A typical value for K is 10 [30].
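The K-fold partition described above can be sketched in a few lines. The function names and the fixed random seed are assumptions for illustration; only the partitioning logic follows the description in the text.

```python
import random


def k_fold_partition(n_samples, k, seed=0):
    """Randomly split sample indices into k subsets of almost equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled list gives sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (training indices, test indices) with each fold used once as test data."""
    for i, test in enumerate(folds):
        train = [index for j, fold in enumerate(folds) if j != i for index in fold]
        yield train, test


# 443 samples, as in this study, split into 10 folds.
folds = k_fold_partition(443, 10)
```

With 443 samples and K = 10, every fold contains 44 or 45 samples, which satisfies the floor and ceiling bounds on the subset size.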
We evaluated differences in the CDA model between the following groups: healthy vs BC, BC patients aged <39 years vs healthy, BC patients aged >40 years vs healthy, IDC, other types of cancer, BI-RADS 2 and 3, and BI-RADS 4, 5, and 6, in order to explore differences between the healthy and BC groups. The performance of the CDA model was evaluated by receiver operating characteristic (ROC) analysis using the PC1 axis values obtained from the training model, with a 95% confidence interval, and the threshold value with the highest specificity/sensitivity ratio was selected. An external validation was conducted by randomly assigning 70% of the population to define the groups and the remaining 30% to validate the model.
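A 70/30 hold-out split of this kind could look like the sketch below. Splitting within each group (stratification), the function name, and the seed are assumptions; the text only states that the validation subset was selected at random.

```python
import random


def stratified_holdout(labels, train_fraction=0.7, seed=0):
    """Return (train indices, validation indices), split separately within each group."""
    rng = random.Random(seed)
    by_group = {}
    for index, label in enumerate(labels):
        by_group.setdefault(label, []).append(index)
    train, validation = [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        validation.extend(indices[cut:])
    return sorted(train), sorted(validation)


# Group sizes taken from this study: 262 BC patients and 181 healthy women.
study_labels = ["BC"] * 262 + ["healthy"] * 181
train_idx, validation_idx = stratified_holdout(study_labels)
```

Stratifying keeps the proportion of BC patients and healthy women the same in both subsets, which a purely random split does not guarantee.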
The analysis was performed using XLSTAT.

Results
Table 1 presents the general characteristics of the study participants. A total of 443 women participated, including 262 women with BC diagnosed by biopsy and 181 healthy women whose characteristics did not differ significantly from those of the BC group. The women with BC had an average age of 53.3 ± 13.0 years, with 84.4% over 40 years; 32.3% presented with hypertension and 21.7% with T2DM. Ninety-five percent reported not having smoked. In 61.6% of cases the tumor was located in the left breast; 67.6% presented with invasive ductal carcinoma (IDC), 6.5% had in-situ ductal carcinoma (ISDC), and 6.1% had invasive lobular carcinoma (ILC). The most frequently reported BI-RADS categories were 4, 5, and 6, covering 93.8% of the study population. With respect to stages, 17.4% were reported as stage 0, I and IA, 44.5% as stage IIA and IIB, 31.7% as stage IIIA, IIIB and IIIC, and 6.4% as stage IV.
Figure 1(a-c) shows the PCA, CDA, and SVM models of the general comparison of healthy women against women with BC. In figure 1(a), corresponding to the PCA, an evident separation between the study groups is shown, with the percentage of variability between the groups being 98.87%. In figure 1(b), it can be seen that 28 sensors presented with a higher importance index and were used to construct the CDA model. The SVM model in figure 1(c) also shows a high discrimination between the two study groups (100%), indicating that there are differences between the VOCs of healthy women and those of BC patients.
With the information from the CDA model (Supplementary material-figure 1), the percentage of correct prediction between the groups of healthy women vs women with BC was projected to reach 98%. Regarding the subgroup comparisons (figure 1), the results indicate that the subgroups within the BC group do not influence the model's ability to distinguish the groups or the percentage of correct prediction.
Table 3 presents the cross-validation correct rate obtained with the support vector machine model. A correct classification of 100% was obtained in the general model of the BC group vs healthy women. The same analysis was applied to establish the classifications between healthy women vs women with BC under 39 years, women with BC over 40 years, women with IDC, and women with other types of BC, reaching 97.8%, 98.4%, 98.3%, and 97.4% correct classification, respectively. This can be observed clearly in figure 2 of the supplementary material.
Furthermore, with the values obtained for the CDA score, a cut-off point of 0.048 was established, which provided 100% sensitivity (95% confidence interval: 98.2%-100%) and 100% specificity (95% confidence interval: 97.4%-100%), with a negative predictive value and a positive predictive value of 100%, and 100% accuracy over the entire ROC curve (figure 2). The external validation of the CDA model yielded a percentage of correct prediction of 98.7%.
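The way such diagnostic indices follow from a cut-off can be shown with a minimal sketch. It assumes that scores at or above the cut-off are classified as BC, which is not stated in the text; the function name and the example scores are hypothetical, and only the cut-off value of 0.048 comes from the study.

```python
def diagnostic_metrics(scores, is_case, cutoff):
    """Compute diagnostic indices, classifying score >= cutoff as a case (assumed direction)."""
    tp = fp = tn = fn = 0
    for score, case in zip(scores, is_case):
        predicted_case = score >= cutoff
        if predicted_case and case:
            tp += 1
        elif predicted_case and not case:
            fp += 1
        elif not predicted_case and case:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical discriminant scores: three healthy women, then three BC patients.
separated = diagnostic_metrics(
    [-1.2, -0.8, -0.3, 0.5, 0.9, 1.4],
    [False, False, False, True, True, True],
    cutoff=0.048,
)
```

When the two groups do not overlap on the score axis, as in this hypothetical example, every index equals 1; any overlap lowers at least one of them.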
Finally, the methodology developed was applied to a double-blind study of seven women with BI-RADS 0, and the Mahalanobis distance and the probability of belonging to the group of healthy women or the group with BC were calculated (table 4). All seven women showed breath-print characteristics similar to those of the BC group (Supplementary material-figure 3).
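For a single discriminant axis, the distance and membership probability for a new sample could be obtained as sketched below. The sketch assumes a one-dimensional score, a pooled standard deviation, Gaussian groups, and equal prior probabilities; none of these choices is confirmed by the text, and the function names and scores are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_sd(group_a, group_b):
    """Pooled standard deviation of two groups of one-dimensional scores."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return math.sqrt(pooled_variance)


def group_membership(score, healthy_scores, bc_scores):
    """Return (distance to healthy, distance to BC, probability of BC) for one score."""
    sd = pooled_sd(healthy_scores, bc_scores)
    d_healthy = abs(score - mean(healthy_scores)) / sd
    d_bc = abs(score - mean(bc_scores)) / sd
    # Equal priors and a shared variance are assumed; the logistic form avoids underflow.
    z = 0.5 * (d_bc ** 2 - d_healthy ** 2)
    if z > 700:
        p_bc = 0.0
    elif z < -700:
        p_bc = 1.0
    else:
        p_bc = 1.0 / (1.0 + math.exp(z))
    return d_healthy, d_bc, p_bc


# Hypothetical training scores and one new sample located at the BC group mean.
healthy_training = [-2.0, -1.0, -3.0]
bc_training = [2.0, 3.0, 1.0]
d_h, d_b, prob_bc = group_membership(2.0, healthy_training, bc_training)
```

A sample far from the healthy mean and close to the BC mean receives a probability of BC membership near 1, which is the pattern reported for the BI-RADS 0 samples.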

Discussion
This study demonstrates the great potential of electronic noses for discrimination between the breath-print of women with BC and that of healthy women. The main contribution of this work is that of having well-characterized groups of BC, and the discrimination between healthy women and the types of BC (Supplementary material-figure 1). Moreover, in the pilot study, we managed to determine the similarity of the breath-print in women with BI-RADS 0 and women with BC diagnosed by biopsy. It is important to highlight that, in this group, the corresponding gold-standard tests were performed, and six women were ultimately diagnosed with infiltrating ductal carcinoma and one with a malignant phyllodes tumor (five women in grade III and two in grade II).
Similar studies using electronic noses have been shown to discriminate between healthy women, women with benign tumors, and women with malignant stages of cancer. In a study of women from China, 30 healthy women, 52 women with benign conditions, 25 with ductal carcinoma in situ, and 169 with malignant lesions were included. This model reached 83% accuracy [31] by using an artificially intelligent nanoarray based on gold nanoparticles (GNPs) and single-wall carbon nanotubes (SWCNTs) coated with different organic layers. In Israel, a model utilizing the Cyranose 320 was built using an artificial neural network (ANN) algorithm with 46 women with BC and 39 healthy women, achieving up to 86.1% accuracy, 87.5% sensitivity, and 85% specificity [32]. Although the Cyranose 320 was also used in the study by Herman-Saffar et al (2018) [32], the main difference from our study was the analytical method. It is worth highlighting the following differences: (i) the sample analysis time proposed was 330 s, (ii) the sample was taken by connecting a mask directly to the input of the electronic nose, and (iii) the patient exhaled for 40 s. The advantage of the sampling proposed in our study is the reduction of the effect of ambient air via our hermetic system, in addition to obtaining a greater amount of alveolar air, in which a greater amount of sample is concentrated. Our method also ensures a continuous flow that reduces the variability in the response of the sensors. On the other hand, the method of Herman-Saffar and colleagues obtained good sensitivity and specificity using 85 samples (46 from patients and 39 from healthy women). However, the limitations reported in that study indicated that the data extraction proposed for the elimination of sensor noise did not improve the percentages of sensitivity, specificity, and accuracy; therefore, data processing did not improve performance in the ANN model.
Other studies aimed at discriminating between the exhaled breath VOCs of healthy subjects and BC patients have applied gas chromatography coupled to mass spectrometry (GC-MS), and these have developed models with 10 to 80 patients with BC compared to healthy women. The results of these studies reached a sensitivity of 68.2% to 93.8%, a specificity of 61% to 91.7%, and an accuracy from 79% [26,31,33-35]. This shows that there are differences between the profiles of VOCs in these groups, indicating their ability to be used as biomarkers of disease. The studies using GC-MS concluded that increased oxidative stress in breast tissue, caused by cancer, is linked to the alteration of biomarkers in the breath. Saturated hydrocarbons (ethane, pentane, nonane, tridecane), methylated derivatives (5-methyl undecane, 3-methyl pentadecane), and alcohols (2-ethyl-1-hexanol) have been identified as candidates for possible biomarkers [26,31,36-39].
Remarkably, the electronic nose provides pattern recognition methods that allow strong discrimination between healthy and patient groups; however, in this study it was not able to distinguish between stages or genotypes of cancer (table 5, supplementary material-figure 4(e)). Although this specificity is clinically relevant, since it defines the guideline for the treatment of the disease, we consider that, based on the data obtained, breath-print analysis will be able to substantially improve BC screening capacity. Moreover, the mathematical models developed are able to discriminate patients in BI-RADS 0 (table 4, supplementary material-figure 3) and even manage to position the samples from early stages of the disease.
The limitations of our study include that it was conducted at only one center, and we believe that other types of cancer should be included to establish the differences between the diseases. As for the strengths of our work, the high number of samples used in our model allowed us to assess factors that can be considered confounding, such as age, smoking, diabetes, and hypertension. Several authors point out that age is an independent variable of the breath-print and that it does not change the accuracy of the test [39], in agreement with our results for the groups under 39 and over 40 years old (table 2). Furthermore, when evaluating the influence of comorbidities (diabetes and hypertension) and habits (smoking) on the model (Supplementary material-table 3 and figures 5 to 8), it was demonstrated that the separation between the groups is not due to these factors. This result is consistent with other studies evaluating VOCs in exhaled breath to identify chemical prints of lung diseases using mass spectrometry and the electronic nose [23,40-42]. We consider that further evaluations of this technique should include other types of cancer, as well as a multi-center study, to allow the development of this technology so that it can be used as a screening method, taking into account that our external validation results indicate a high percentage of correct classification (97.8%). Besides, although different authors show similar results to those obtained in this work with a lower number of samples, the lack of standardization of sample collection among the different studies must be addressed before this methodology can be implemented in clinical practice.
This research could have a beneficial impact on clinical practice [43], and we consider that this test could probably be used as a first point of contact before the application of established gold-standard tests (mammography, ultrasound, and biopsy) and substantially improve screening tests in the general population. Future development will involve innovation or a combination of advanced techniques for sampling and detection; in addition, the analysis of the breath-print needs to be validated and standardized for real-world clinical use.