Automatic cough detection based on airflow signals for portable spirometry system

We give a short introduction to cough detection efforts undertaken during the last decade and describe the solution for automatic cough detection developed for the AioCare portable spirometry system. In contrast to the more popular analysis of sound and audio recordings, our approach is based on airflow signals only. As the system is intended to be used in a large variety of environments and by different patients, we trained and validated the algorithm using AioCare-collected data and the large database of spirometry curves from the NHANES database by the American National Center for Health Statistics. We trained different classifiers, such as logistic regression, feed-forward artificial neural network (ANN), support vector machine, and random forest, to choose the one with the best performance. The ANN solution was selected as the final classifier. The classification results on the test set (AioCare data) are: 0.86 (sensitivity), 0.91 (specificity), 0.91 (accuracy) and 0.88 (F1 score). The classification methodology developed in this study is robust for detecting cough events during spirometry measurements. As far as we know, the solution presented in this work is the first fully reproducible description of an automatic cough detection algorithm based entirely on airflow signals and the first cough detection implemented in a commercial spirometry system to be published.


Introduction
A cough can be described as a sudden, and often repetitively occurring, air expulsion with a forceful expiratory effort. The cough reflex is initiated by irritation of cough receptors in the airways [1]. As a consequence, nerve impulses from the cough centre in the brainstem stimulate the diaphragm, intercostal muscles and larynx to produce the explosive expiration of cough. Cough often reflects respiratory irritation or illness and can also occur as an early symptom of asthma, cystic fibrosis or chronic obstructive pulmonary disease (COPD) [1,2].
Patients suffering from chronic pulmonary diseases should be regularly monitored to evaluate the progress of the disease or treatment. A common method for diagnosing and monitoring pulmonary function is spirometry. However, patients with COPD frequently complain of breathlessness and cough, which usually increase during exacerbations. Moreover, a spirometry maneuver requires physical effort from the patient and can irritate the airways, which results in cough during the examination. Worldwide spirometry standards require that correct spirometry maneuvers do not contain cough and are free of cough artefacts [3].
Recently, much effort has been focused on developing pocket-sized, mobile peak flow meters and spirometers with the same or similar functionality as stationary clinical spirometers, in order to expand the accessibility of this type of monitoring (e.g. [4,5]). These systems are designed to perform spirometry measurements without the supervision of a physician or under the supervision of a physician with limited experience (including general practitioners). Therefore, they need to include automatic real-time algorithms that detect cough accurately and efficiently and warn the user when a measurement is incorrect.
In the recent decade, cough detection has been extensively explored. The growth of computing power has made it possible to analyze cough signals in real time using smartphones or dedicated hardware, even if the algorithms are computationally intensive. However, virtually all of the research was related to audio signals [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21] or accelerometer recordings [22,23], in contrast to our contribution. In our solution, we do not analyze the sound but the airflow signal passing through the spirometer; thus, the troublesome influence of environmental noise is largely reduced. The main purpose of the cough detection and segmentation algorithms described in the literature was monitoring the patient's cough over time and counting cough occurrences [6][7][8][9][10][11][12][13][14][15][16][17]. Some of the research was dedicated to assessing the degree of pathology in patients suffering from cystic fibrosis [18] or to detecting cold [19], tuberculosis [20] or COPD [21]. Several studies examined the relevance of different sensors for cough detection (e.g. ECG sensor, thermistor, chest belt, oximeter) [24,25], but no airflow sensor was investigated. In recent years, owing to the constant reduction in the size of electronic equipment, there have been attempts to develop wearable cough detection systems [17,27,28] that are inexpensive to use and can monitor the patient's cough continuously without disturbing his or her activities. A very recent idea is to make use of smartwatches for ambulatory cough monitoring [29]. Advanced mathematics has also been exploited recently to increase the performance of cough detection algorithms, e.g. using octonions (octets of real numbers) [30] or so-called Hu moment invariants from the image processing domain [31]. Interestingly, cough detection has been applied not only to human patients but also to veterinary monitoring of farm animals [32,33].
Although cough detection seems to have been examined from many different perspectives, browsing the articles reveals a common issue: the low number of patients who produced the recordings for the dataset, usually not exceeding a dozen and sometimes reaching several dozen subjects. In some cases, each subject produced several cough samples or the recordings of subjects were divided into numerous segments. Therefore, the numerical results presented by the authors may not hold for larger or more diverse sets of patients, which limits their usefulness and reliability, especially in broad clinical or commercial applications. Algorithms for automatic cough detection are implemented in some stationary spirometers; however, the manufacturers do not disclose data on the performance or detection methodologies of their solutions, so no data is available for comparative analysis.
In this work, we describe the solution for automatic cough detection developed for the AioCare spirometry system (CE and ISO certificates, FDA pending) [34]. The system consists of three main elements: a portable spirometer (class IIa medical device), a mobile application for a smartphone and an Internet cloud to store the data. During the measurement, the airflow signal is transmitted from the spirometry device to the mobile application, where it is subsequently analyzed by dedicated algorithms. As a result, all of the clinically important parameters are presented to the user, e.g. forced vital capacity (FVC), forced expiratory volume in the first second (FEV1), their ratio (FEV1/FVC), peak expiratory flow (PEF), etc. Similarly, any technical errors that occurred during the maneuver are shown to the user; they determine the technical correctness of the examination as specified in the acceptability criteria in [3] and thus the need to repeat the measurement. The presence of cough is one of the indicators of an incorrect maneuver. As the system is intended to be used in a large variety of environments (clinical and in-home) and by both physicians and patients themselves, the cough detection algorithm presented in this work was developed to be accurate and robust and was tested on a large dataset of spirometry airflow recordings. Much attention was also paid to high specificity of the algorithm, to avoid the negative consequences of incorrectly classifying a maneuver as cough, which could lead to an unjustified repetition of the measurement or discourage the user.
The organization of the article is as follows. In Section 2 the database used for training and validation of the algorithm is briefly described. Section 3 provides the overview of analytical methods adapted to construct the algorithm. The results of training and testing are presented in Section 4. In Section 5 remarks and conclusions from the research are stated.

Data and preparation
The data for the research was obtained from the National Health and Nutrition Examination Survey (NHANES) database by the American National Center for Health Statistics [35]. It is a free data source containing raw spirometry curve data and additional information about the examinations. The spirometry testing procedures of the NHANES database met the recommendations of the American Thoracic Society (ATS). Three subsets of the database covering the years 2007-2012 were used (available on the Internet [36]). The patients were both males and females from 6 to 79 years old. According to the NHANES documentation, participants eligible for spirometry performed an initial (first test) spirometry examination. Then, if certain criteria were met, a subset of participants performed a repeat (second test) spirometry examination after inhaling a β2-adrenergic bronchodilator. Multiple individual spirometry curves were typically obtained during both examinations. The dataset contains the raw signals for all of these individual spirometry curves. While the majority of spirometry studies collected in NHANES are of high quality, some spirometry curves may show defects such as extra breaths, a cough, a back extrapolated volume error (BEV error) or a false start to the expiratory maneuver. These curves are divided into 4 subsets (A-D) in the NHANES database: subset A contains the curves of acceptable quality; B, curves with a large time to peak flow or a non-repeatable peak flow; C, curves that had either less than 6 seconds of exhalation or no plateau; and D, cough and BEV error curves. The cough-containing curves were extracted from the D-labeled examinations by 4 experienced human experts to create a dataset of two classes: ATS-acceptable and other error curves versus cough curves. Examples of ATS-acceptable and cough-containing maneuvers are shown in Figure 1.
Although the NHANES data is massive, the cough detection algorithm trained on that data is to run on signals collected using the AioCare spirometer; the signal-collecting devices are different and the signal properties may differ (e.g. sensitivity of airflow sensors, level of noise). To demonstrate the reliability of the NHANES-trained and tested algorithm on AioCare-collected curves, and therefore also the independence of the developed solution from hardware and data source, a second data set consisting of 218 curves (115 containing cough events and 103 without cough) was obtained from AioCare measurements during forced vital capacity maneuvers. These signals were mainly obtained from 5 healthy volunteers who performed normal spirometry maneuvers or imitated cough events during examinations. The additional AioCare test data set also includes 19 steady-flow signals generated by a Series 1120 Flow Volume Simulator by Hans Rudolph, Inc. Curves of this very specific kind were added to the AioCare test set to ensure that the spirometry system correctly recognizes such signals as non-cough ones. Moreover, the preliminary analysis of the NHANES data set showed that there are very few non-cough signals with a PEF of 1.5 L/s or lower. However, experience shows that such a low PEF can occur for children or patients with very severe symptoms. To overcome this issue, 55 additional AioCare measurements with low PEF (<1.5 L/s) and no cough were performed and added to the training set. Table 1 presents a short overview of the dataset consisting of NHANES and AioCare data.

Data preprocessing
Before features are extracted and supplied to the algorithms, some preprocessing of the raw data is needed. The preprocessing steps standardize the curves and clean the region of interest from noise and artefacts. They are performed automatically in the following order:
a. Segmentation of the forced exhale signal from the raw curve. The flow curve usually contains not only the forced exhale but also, e.g., inhales before or after the main maneuver. The segmentation covers the fragment from the starting point of the forced exhale up to the start of the first inhale (if it occurs) that follows the main maneuver.
b. If the forced exhale signal after segmentation is longer than 600 samples (6 seconds), it is cut down to 600 samples. The spirometry standard [3] requires the forced exhale to last at least 6 seconds. After that time the AioCare application allows the user to stop the maneuver; therefore, only the first 6 seconds of the forced exhale are considered.
c. Zeroing all of the negative values in the signal. This operation has no effect on feature extraction, as none of the features analyzes the negative part of the signal; however, it lowers the flow span of the data and zeroes the residual fragments of inhales if they were not entirely removed during the initial segmentation.
d. Filtering the signal with a moving average with a window length of 5 samples (i.e. 0.05 second). This slight filtering smooths the noise and removes minor artefacts.
e. Preprocessing for steady-flow detection. This step recognizes whether the signal is a steady-flow signal characteristic of generator measurements. The detection is based on counting how many of the samples in the segmented exhale signal differ significantly from the signal median. If this number is low, the signal can be assumed to be a steady-flow signal and not a cough (see Table 2 for the pseudocode algorithm), which ends the classification path for the signal.
Table 2. Pseudocode algorithm for determining whether the signal is a steady-flow signal (preprocessing step e). The main idea is to calculate the difference between the signal and its median.
If the signal is recognized as a steady-flow signal, it is then labeled as a non-cough one.
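The preprocessing steps b-e can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AioCare implementation: the 100 Hz sampling rate is inferred from "600 samples (6 seconds)", segmentation (step a) is assumed to be already done, the edge handling of the moving average is an arbitrary choice, and the steady-flow parameters `rel_tol` and `max_outlier_fraction` are hypothetical values, since the thresholds of Table 2 are not reproduced here.

```python
from statistics import median

MAX_SAMPLES = 600   # step b: 6 s at an assumed sampling rate of 100 Hz
WINDOW = 5          # step d: 5 samples = 0.05 s


def truncate(flow, max_samples=MAX_SAMPLES):
    """Step b: keep at most the first 6 s of the forced exhale."""
    return list(flow[:max_samples])


def zero_negatives(flow):
    """Step c: replace negative flow values with zero."""
    return [max(0.0, v) for v in flow]


def moving_average(flow, window=WINDOW):
    """Step d: centred moving average; the window shrinks at the edges
    (assumed edge handling, not specified in the text)."""
    half = window // 2
    out = []
    for i in range(len(flow)):
        lo, hi = max(0, i - half), min(len(flow), i + half + 1)
        out.append(sum(flow[lo:hi]) / (hi - lo))
    return out


def is_steady_flow(flow, rel_tol=0.1, max_outlier_fraction=0.2):
    """Step e: a signal with few samples far from its median is steady flow.
    rel_tol and max_outlier_fraction are hypothetical illustration values."""
    if not flow:
        return False
    med = median(flow)
    outliers = sum(1 for v in flow if abs(v - med) > rel_tol * abs(med))
    return outliers / len(flow) <= max_outlier_fraction


def preprocess(exhale):
    """Apply steps b-d to an already segmented forced exhale."""
    return moving_average(zero_negatives(truncate(exhale)))
```

A constant signal such as `[3.0] * 700` is truncated to 600 samples and flagged by `is_steady_flow`, whereas a decaying exhale-like curve is not, so only the latter would continue to feature extraction.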

Feature extraction
Several numerical features have been developed to characterize the presence of cough in a single spirometry curve. The input of the algorithms finally consists of 6 features of low computational cost, extracted for each curve. These are:
a. Number of spikes that are longer than 0.05 s (6 or more samples in width) on the right (descending) slope of the forced exhale signal (see Figure 2a). The threshold of 0.05 s is sufficiently sensitive to count cough-relevant local peaks while remaining insensitive to shorter fluctuations. Note that the moving average filtering (preprocessing step d) smooths but usually does not remove peaks (or valleys) if they were clearly visible before.
b. Number of local maxima with a right-slope amplitude of more than 0.25 L on the right (descending) slope of the forced exhale signal (see Figure 2b). This feature counts the peaks that are distinguishable enough from the background and can be markers of cough. The right-slope amplitude is the amplitude between the peak maximum and the first point in time where the first derivative of the signal changes its sign, i.e. where the signal starts to increase again.
c. Number of crossings (intersections) of the signal with horizontal lines at 15%, 25%, 50% and 75% of the maximum value of the signal. In this way 4 separate features are calculated (see Figure 2c). This methodology, especially zero crossing (intersections with the x-axis), is widely used in detecting fluctuations in signals from various domains, e.g. in heart rate analysis for both electro- and phonocardiograms [37][38][39][40].
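Features b and c can be illustrated with the following Python sketch. It is a simplified reading of the definitions above, not the production code: the function names are hypothetical, the descending slope is taken to be everything after the global maximum, and feature a (spike count) is omitted because its exact definition relies on details shown only in Figure 2a.

```python
LEVELS_PERCENT = (15, 25, 50, 75)  # horizontal lines of feature c


def count_level_crossings(flow, percent):
    """Feature c: how many times the signal crosses percent% of its maximum."""
    level = percent / 100 * max(flow)
    above = [v > level for v in flow]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


def count_prominent_maxima(flow, min_amplitude=0.25):
    """Feature b: local maxima after the global peak whose drop down to the
    point where the signal starts to rise again exceeds min_amplitude
    (expressed in the units of the signal). Expects a non-empty signal."""
    n = len(flow)
    i = flow.index(max(flow)) + 1  # start right after the global peak
    count = 0
    while i < n - 1:
        if flow[i - 1] < flow[i] >= flow[i + 1]:  # local maximum found
            j = i
            while j < n - 1 and flow[j + 1] <= flow[j]:  # follow the descent
                j += 1
            if flow[i] - flow[j] > min_amplitude:
                count += 1
            i = j + 1
        else:
            i += 1
    return count


def extract_features(flow):
    """Return features b and c for one preprocessed forced exhale."""
    features = {"prominent_maxima": count_prominent_maxima(flow)}
    for percent in LEVELS_PERCENT:
        features[f"crossings_{percent}"] = count_level_crossings(flow, percent)
    return features
```

For a smooth exhale each horizontal line is crossed twice (once on the rising and once on the descending slope); a cough bump on the descending slope adds two crossings at every level it spans and one prominent maximum.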
The features a-c are graphically outlined in Figure 2. Analyzing the correlation between different features, and between features and the binary curve labels (cough or no-cough), is often useful to determine the relevance and similarity of these features. A high correlation (positive or negative) of a specific feature with the data labels can indicate high usefulness of this feature in further classification. On the other hand, one should avoid processing features that are highly correlated with each other (correlation close to unity), as this increases the size of the input data and of the model while, at the same time, not providing any additional information. The correlation plot for this study is presented in Figure 3. No pair of features exceeds a correlation of 0.8; therefore, all of them were passed to the next step of machine learning training.
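The redundancy check described above can be sketched as a pairwise Pearson correlation screen. The snippet below is only an illustration: the function names and the dictionary layout are assumptions, and the 0.8 threshold is the value quoted in the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences.
    Assumes both sequences have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def highly_correlated_pairs(features, threshold=0.8):
    """features maps a feature name to its values over all curves.
    Returns (name_a, name_b, r) for every pair with |r| above threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(features.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            pairs.append((name_a, name_b, r))
    return pairs
```

An empty result corresponds to the situation reported for this study, in which no feature needs to be dropped before training.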

Machine learning algorithms and results
The dataset of feature-extracted NHANES curves was divided into the training (75%) and the test (25%) sets. The AioCare-collected data served as the additional test set. An overview is shown in Table 3. All input data were normalized by mean and standard deviation before being passed to the training models. Machine learning models for the study were implemented in the R-Studio environment (version, The results for all of the methods are presented in Table 4. For all of the algorithms, the accuracy and specificity are roughly similar; however, the sensitivity varies. The feed-forward artificial neural network achieved the highest scores for sensitivity and accuracy and was further tested on the AioCare data set to check the transferability of the model. A presentation of all machine learning methods used in this study could be tedious for the reader and we do not consider it necessary; therefore, we briefly present only the mathematical basics of the top-scoring neural network method. The mathematical model of an artificial neuron (perceptron) is defined as [41]:
$$y = f\left(\sum_{i=0}^{n} w_i x_i\right)$$
where $y$ is the neuron output, $f$ is the transfer function, $w_i$ is the weight of the $i$-th of $n$ inputs and $x_i$ is the input signal of the $i$-th input (see Figure 4). In this definition the element $w_0 x_0$ is the bias factor. Thus, if one constructs a network consisting of the input layer, a hidden (middle) layer and the output layer (1 neuron unit in our case), the network output can be described as:
$$y = f\left(w_0^{(2)} + \sum_{j=1}^{m} w_j^{(2)}\, f\left(\sum_{i=0}^{n} w_{ji}^{(1)} x_i\right)\right)$$
where $w^{(1)}$ stands for the weights between the input and the hidden layer, $w^{(2)}$ stands for the weights connecting the hidden layer and the output layer, and $w_0^{(2)}$ is the bias term of the output neuron. There are $n$ inputs and $m$ hidden neurons. The output value of $y$ varies from 0 to 1 and can be regarded as a probability of class 0 (no cough) or class 1 (cough). The final classification is performed by setting a threshold of 0.5 for these two classes.
In this work we use the logistic (sigmoid) function as the transfer function, which is widely used in statistical modeling and has the form:
$$f(x) = \frac{1}{1 + e^{-x}}$$
Supervised model training is equivalent to the problem of finding a set of weights that minimizes the output error of the model. In the case of an artificial neural network, this can be done with the backpropagation algorithm and a large variety of its optimizations. This step is fully managed by R's nnet package.
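To make the network output described above concrete, the following Python sketch evaluates a single-hidden-layer network with a sigmoid transfer function and applies the 0.5 threshold. It is an illustration only: the study trained its models in R, the weight layout (bias weight first in every row) is an assumption of this sketch, and any weights passed in are hypothetical rather than the trained ones.

```python
from math import exp


def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + exp(-x))


def standardize(features, means, stds):
    """Mean and standard deviation normalization of one feature vector,
    using statistics computed on the training set."""
    return [(v - m) / s for v, m, s in zip(features, means, stds)]


def forward(x, w_hidden, w_out):
    """Network output for n inputs and m hidden neurons.
    w_hidden: m rows of n + 1 weights, bias weight first (assumed layout).
    w_out: m + 1 weights of the output neuron, bias weight first."""
    x_ext = [1.0] + list(x)  # constant input that multiplies the bias weight
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x_ext)))
              for row in w_hidden]
    h_ext = [1.0] + hidden
    return sigmoid(sum(w * h for w, h in zip(w_out, h_ext)))


def classify(x, w_hidden, w_out, threshold=0.5):
    """Return 1 (cough) if the output reaches the threshold, else 0."""
    return 1 if forward(x, w_hidden, w_out) >= threshold else 0
```

With all weights set to zero the output is exactly 0.5, which is a convenient sanity check before plugging in trained weights.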
During the training stage, different models are evaluated, i.e. networks with different sizes of the hidden layer are tested. The ROC (receiver operating characteristic) metric was used to select the optimal model [42]. It is a performance measure for classification problems at various threshold settings. A ROC is a curve of the True Positive Rate (sensitivity) versus the False Positive Rate (defined as 1 − specificity). The area under the ROC curve (the so-called AUC) measures separability, i.e. how well the model is capable of distinguishing between classes. The model with the highest AUC is then chosen as the classifier with the best performance. The final artificial neural network architecture is presented in Figure 5. The ROC curve for this model is shown in Figure 6. The resulting performance on the AioCare data justifies the transfer of the model, as it is close to the performance on the NHANES test set (see Table 4). It suggests that the algorithm is not overfitted to the NHANES data and can be applied in the AioCare system. For both NHANES and AioCare data the specificity of the algorithm is higher than the sensitivity, which is regarded as an acceptable property, as minimizing the number of false positives was one of the goals of the algorithm development.
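The evaluation measures used here can be computed as in the following Python sketch. It assumes binary labels (1 for cough, 0 for no cough) and that both classes are present; the AUC is obtained through its rank interpretation (the probability that a randomly chosen cough curve receives a higher score than a randomly chosen non-cough curve), which is equivalent to the area under the ROC curve. The function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = cough."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and F1 score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (cough, non-cough) pairs ranked correctly;
    tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Model selection then amounts to computing `roc_auc` for each candidate hidden-layer size on validation data and keeping the candidate with the largest value.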

Discussion and summary
The detection of cough events in spirometry curves using the airflow signal only is not a simple task, as cough can manifest itself not only in a very clear way but also through small flow disturbances. As far as we know, this is the first complete attempt at effective cough classification based entirely on airflow signals of human patients. Additionally, we adopted the NHANES database to make sure that the training data is as large and diverse as possible.
Although the performance results of the classification algorithms presented in Table 4 were more or less comparable, the artificial neural network model was chosen as the most suitable one. The similar results across different algorithms suggest that none of the models was overfitted. The resulting specificity of the algorithm is higher than its sensitivity, which is acceptable, as minimizing the number of false positives was the property of interest due to the functional needs of the application.
The classification algorithm developed in this study is a sufficiently robust tool for detecting cough events during spirometry measurements and can be implemented in the AioCare mobile application. It is characterized by high specificity. Training the algorithm on the NHANES database and testing it with AioCare signals demonstrated the usefulness and universality of the model. Some curves are still misclassified by the algorithm; however, most of these maneuvers contain small cough disturbances or disturbances which can be similar to cough. Thus, the features developed in this study can be insufficient to distinguish these subtle cases. Further improvement of performance and reliability will be the aim of future work.