Cough Expired Volume and Cough Peak Flow Rate Estimation Based on GA-BP Method

School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
The State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China
North Automatic Control Technology Institute, Taiyuan, Shanxi 030006, China
Department of Respiratory and Critical Care Medicine, Beijing Engineering Research Center of Respiratory and Critical Care Medicine, Beijing Institute of Respiratory Medicine, Beijing Chao-Yang Hospital, Capital Medical University, Beijing 100043, China


Introduction
Cough is a respiratory reflex. When the respiratory tract is stimulated by inflammation, dust, or other particulate matter, a cough is triggered to clear respiratory secretions and keep the respiratory tract clean and unobstructed [1][2][3][4][5]. Because of the high pressure difference between the inside and outside of the thoracic cavity, a high airflow rate is generated that imposes a large shear force on the surface of secretions and propels them toward the mouth [6].
According to previous studies, the cough process lasts about 0.4 to 0.6 s and can be characterized by three parameters: cough peak flow rate (CPFR), peak velocity time (PVT), and cough expired volume (CEV) [7][8][9][10].
The CEV and CPFR are, respectively, the total exhaled air volume and the maximum airflow rate during the whole cough process, both measured under normal atmospheric reference conditions. Both are important for medical diagnosis, cough effectiveness assessment, and extubation decisions [11][12][13]. However, for patients on mechanical ventilation, or with neuromuscular or other diseases that impair the ability to cough, the CEV and CPFR values cannot be measured and used for medical diagnosis. Therefore, a relationship between the CEV and CPFR values and a person's physical information would be useful for diagnosing these patients: once the physical information is obtained, the CEV and CPFR values can be estimated.
Previous studies have developed relations between CEV and CPFR values and gender, height, and weight. Leiner et al. found that the CPFR is related to height and age [14]. Mahajan et al. and Singh et al. included gender as an influencing factor and developed relations between CPFR, CEV, and PVT [8,9]. Their research involved 100 healthy, nonsmoking volunteers (50 females and 50 males) and showed a direct relationship between CPFR and CEV values [8]. The results showed that the maximum value of CEV reached 5 L, with an average of 3 L. Zhu et al. measured the CEV values of three healthy subjects and found that they ranged from 0.8 to 2.2 L, with an average of 1.4 L [15]. Gupta et al. investigated 25 healthy subjects (12 females and 13 males) and developed a first-order relation between CEV and CPFR values and gender, height, and weight through linear regression analysis [7]. The results are expressed as equation (1).
Brandimore et al. established a linear relation among CEV values, airflow rates, and number of coughs by analyzing the measured data of 25 participants (14 females and 11 males, with an average age of 23 years). The results demonstrated significant linear relationships between expired volume, the total number of coughs, and cough airflow rates [6]. The particle image velocimetry (PIV) method has also been used to estimate ranges of cough velocities. Chao et al. collected the average velocity of 50 coughs from eleven healthy volunteers (3 men and 8 women). The estimated maximum cough velocities of males and females were 13.2 m/s and 10.2 m/s, respectively [16]. VanSciver et al. measured twenty-nine nonsmoking healthy volunteers (ten male and nineteen female) to obtain and analyze the cough velocity. The results showed no correlation between cough velocity and either sex or weight [10].
In this study, 700 healthy participants were involved. The CEV values, CPFR values, genders, heights, weights, ages, and smoking status were measured and recorded. A method that integrates a backpropagation neural network with a genetic algorithm, called the GA-BP method, was then developed to estimate the CEV and CPFR values.

Experiment Setup.
The CEV and CPFR values were measured using a portable pulmonary function device (Contec Ltd.), which is presented in Figure 1. The acquisition ranges for CEV and CPFR are 0 to 10 L and 0 to 16 L/s, respectively. The acquisition errors are within ±0.05 L and ±0.2 L/s, respectively. A disposable connector is installed on the front of the device during collection. The subjects hold the connector in their mouths and cough forcefully. After a single cough, the CEV and CPFR values are displayed on the device screen and stored in the device.
All participants were trained in how to use the device before measurement. A sitting posture was adopted during measurement.

Ethical Statement.
The CEV and CPFR values and the physical information of the 700 human subjects were obtained by a doctor at the Chao Yang Hospital. All subjects agreed to these measurements and signed the informed consent. The Chao Yang Hospital ethics committee approved the data collection (Approval number: 20175241).

Backpropagation Neural Network.
The BP neural network is a multilayer feedforward neural network with forward signal transmission and backward error transmission, and it can be trained to approximate nonlinear relations [17][18][19][20]. Typically, the BP neural network consists of an input layer, a hidden layer, and an output layer. The input signals are processed layer by layer, from the input layer through the hidden layer to the output layer. If the expected output is not achieved, the error is propagated backward. According to the prediction error, the network weights and threshold values are adjusted so that the output approaches the expected values [21][22][23][24].
In this study, the BP neural network has five inputs, which represent gender, height, weight, age, and smoking status, and two outputs, which represent the CEV and CPFR values. The structure of the BP neural network is presented in Figure 2. The number of neurons in the hidden layer is set to 11 based on the Hecht-Nielsen method.
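As a rough illustration of the structure described above, the sketch below builds a 5-11-2 network and runs one forward pass. It is a minimal sketch under stated assumptions, not the authors' implementation: the hidden-layer size uses the 2n + 1 rule commonly attributed to Hecht-Nielsen (which gives 11 for n = 5 inputs), the sigmoid hidden activation, the linear output layer, the convention of subtracting thresholds, the initial weight range, and the normalised input values are all hypothetical choices.

```python
import math
import random

N_INPUT, N_OUTPUT = 5, 2
N_HIDDEN = 2 * N_INPUT + 1  # assumed 2n + 1 rule, gives 11 hidden neurons


def init_network(rng):
    """Random weights and thresholds for a 5-11-2 network (illustrative range)."""
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_INPUT)]
    a = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]   # hidden thresholds
    w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(N_OUTPUT)] for _ in range(N_HIDDEN)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(N_OUTPUT)]   # output thresholds
    return w_ih, a, w_ho, b


def forward(x, net):
    """Forward pass: sigmoid hidden layer, linear output layer (assumed)."""
    w_ih, a, w_ho, b = net
    hidden = []
    for j in range(N_HIDDEN):
        z = sum(x[i] * w_ih[i][j] for i in range(N_INPUT)) - a[j]
        hidden.append(1.0 / (1.0 + math.exp(-z)))
    return [
        sum(hidden[j] * w_ho[j][k] for j in range(N_HIDDEN)) - b[k]
        for k in range(N_OUTPUT)
    ]


net = init_network(random.Random(0))
# Hypothetical normalised subject: gender, height, weight, age, smoking status.
x = [1.0, 0.62, 0.48, 0.25, 0.0]
cev_hat, cpfr_hat = forward(x, net)

# Total number of weights and thresholds in the network.
n_params = N_INPUT * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUT + N_OUTPUT
print(n_params)  # 55 + 11 + 22 + 2 = 90
```

With this layout the network has 90 adjustable values in total, which is the quantity a search over weights and thresholds would have to cover.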

BP Neural Network Improved by Genetic Algorithm.
Although a BP neural network can produce good estimates after training, the training time is long and the results may converge to a local optimum. The genetic algorithm (GA) is a parallel random search optimization method that imitates the natural genetic mechanism and Darwin's principle of biological evolution [25][26][27]. Therefore, the BP neural network is combined with the GA to address these problems; the result is called GA-BP [28][29][30]. The GA-BP algorithm searches for the most suitable weights and thresholds of the neural network. The whole operation process includes initialization, fitness calculation, selection, crossover, and mutation.
This process repeats until the end condition is satisfied. The data set consisting of the connection weights and thresholds of the neural network is regarded as an individual. As shown in Figure 3, real-number coding was used to create the initial values for each individual, which consists of $\omega_{ij}$, $a$, $\omega_{jk}$, and $b$. Here $\omega_{ij}$ and $\omega_{jk}$ are the weights between the input layer and hidden layer and between the hidden layer and output layer, respectively, and $a$ and $b$ are the thresholds of the hidden layer and output layer, respectively. The number of individuals was set to ten. The absolute error between the predicted and expected outputs of the BP neural network was used as the individual fitness value. The fitness function is

$$E = \sum_{i=1}^{m} \left| Y_i - O_i \right| \tag{2}$$

where $E$ is the individual fitness value, $m$ is the number of neurons in the output layer, and $Y_i$ and $O_i$ are the predicted output and expected output of the network, respectively. The goal of selection is to give individuals with a lower fitness value (that is, a lower error) a greater chance of passing to the next generation. The roulette method was used for the selection operation:

$$f_i = \frac{k}{E_i}, \qquad p_i = \frac{f_i}{\sum_{j=1}^{N} f_j} \tag{3}$$

where $p_i$ is the selection probability of individual $i$, $E_i$ is its fitness value, $k$ is a coefficient, and $N$ is the number of individuals. The crossover probability was set to 0.5. The crossover operation for individuals $a_k$ and $a_l$ at position $j$ is given in equation (4), and the diagram is shown in Figure 4.

$$a_{kj} = a_{kj}(1-\alpha) + a_{lj}\,\alpha, \qquad a_{lj} = a_{lj}(1-\alpha) + a_{kj}\,\alpha \tag{4}$$
where $\alpha$ is a random number in [0, 1]. The mutation probability was set to 0.4, and the diagram is shown in Figure 5. The mutated gene $a^{*}_{ij}$ is calculated through equation (5):

$$a^{*}_{ij} = \begin{cases} a_{ij} + (a_{ij} - a_{\max})\, f(g), & r > 0.5 \\ a_{ij} + (a_{\min} - a_{ij})\, f(g), & r \le 0.5 \end{cases} \qquad f(g) = r_2 \left(1 - \frac{g}{G_{\max}}\right)^{2} \tag{5}$$

where $a_{\max}$ and $a_{\min}$ are the upper bound and lower bound of gene $a_{ij}$, respectively, $r_2$ is a random number, $g$ is the current iteration number, $G_{\max}$ is the maximum evolution number, and $r$ is a random number between 0 and 1.

Figure 2: The BP neural network structure used in this study. A three-layer structure was adopted. There are five neurons in the input layer, eleven neurons in the hidden layer, and two neurons in the output layer.

The number of iterations was set to 40. The operation process of GA-BP is presented in Figure 6.
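The genetic operators described in this section can be sketched as follows. This is a minimal sketch, not the authors' code: the population size (10), crossover probability (0.5), mutation probability (0.4), and iteration count (40) come from the text, while the gene bounds, the coefficient k = 1, the pairing of neighbours for crossover, the clamping of mutated genes to the bounds, the tracking of the best individual, and the toy error function are assumptions made for illustration. In GA-BP the error function would instead decode an individual into network weights and thresholds and return the summed absolute output error.

```python
import random

POP_SIZE, P_CROSS, P_MUT, G_MAX = 10, 0.5, 0.4, 40
A_MIN, A_MAX = -3.0, 3.0  # assumed gene bounds, not taken from the paper


def selection_probs(errors, k=1.0):
    """Roulette probabilities: a lower error E gets a larger share k / E."""
    f = [k / e for e in errors]
    total = sum(f)
    return [fi / total for fi in f]


def roulette_select(pop, errors, rng):
    """Draw a new population of the same size, with replacement."""
    probs = selection_probs(errors)
    return [list(ind) for ind in rng.choices(pop, weights=probs, k=len(pop))]


def crossover(ind_k, ind_l, j, alpha):
    """Arithmetic crossover of two real-coded individuals at gene j."""
    a_k, a_l = ind_k[j], ind_l[j]
    ind_k[j] = a_k * (1 - alpha) + a_l * alpha
    ind_l[j] = a_l * (1 - alpha) + a_k * alpha


def mutate(ind, j, g, rng):
    """Non-uniform mutation whose step shrinks as g approaches G_MAX."""
    f_g = rng.random() * (1 - g / G_MAX) ** 2  # r2 * (1 - g / Gmax)^2
    if rng.random() > 0.5:
        new = ind[j] + (ind[j] - A_MAX) * f_g
    else:
        new = ind[j] + (A_MIN - ind[j]) * f_g
    ind[j] = min(A_MAX, max(A_MIN, new))  # clamping is an added safeguard


def evolve(error_fn, n_genes, seed=0):
    """Run selection, crossover, and mutation for G_MAX generations."""
    rng = random.Random(seed)
    pop = [[rng.uniform(A_MIN, A_MAX) for _ in range(n_genes)] for _ in range(POP_SIZE)]
    best = min(pop, key=error_fn)[:]
    for g in range(1, G_MAX + 1):
        errors = [error_fn(ind) + 1e-12 for ind in pop]  # avoid division by zero
        pop = roulette_select(pop, errors, rng)
        for k in range(0, POP_SIZE - 1, 2):  # pair neighbours (assumption)
            if rng.random() < P_CROSS:
                crossover(pop[k], pop[k + 1], rng.randrange(n_genes), rng.random())
        for ind in pop:
            if rng.random() < P_MUT:
                mutate(ind, rng.randrange(n_genes), g, rng)
        cand = min(pop, key=error_fn)
        if error_fn(cand) < error_fn(best):
            best = cand[:]
    return best


def toy_error(ind):
    """Hypothetical stand-in for the network error: distance to all-ones."""
    return sum(abs(v - 1.0) for v in ind)


best = evolve(toy_error, n_genes=5)
print(toy_error(best))
```

The individual returned by `evolve` would then serve as the starting weights and thresholds for ordinary BP training, which is the hand-off the section describes.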

Results
The statistical results of the collected physical information (gender, age, height, weight, and smoking status), CEV, and CPFR values of the 700 human subjects are presented in Table 1.
To validate the GA-BP method, ten-times tenfold cross-validation was adopted. The 700 groups of data were randomly divided into ten folds of 70 groups each. Nine folds were used to train the GA-BP network, and the remaining fold was used for validation. By using ten different random groupings and rotating the training and test data, we conducted ten times tenfold cross-validation. The statistical results of the data for the ten times tenfold cross-validations, from test 1 to test 10, are presented in Tables 2 and 3. The relative errors between estimated and test values for CEV and CPFR in the ten times tenfold cross-validations are calculated and presented in Figures 7 and 8, respectively.

Figure 6: The operation process of GA-BP. The GA-BP starts with GA, which optimizes the initial values of the weights and thresholds of the BP neural network. Then, the BP neural network begins training with the optimized weights and thresholds until the end condition is satisfied.
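The data splitting behind ten-times tenfold cross-validation can be sketched as below. It is a minimal sketch, assuming the sample count divides evenly by the number of folds (as 700 does by 10); the function name `repeated_kfold` and the fixed seed are hypothetical, and the model training step is left out.

```python
import random


def repeated_kfold(n_samples=700, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for each split."""
    rng = random.Random(seed)
    fold_size = n_samples // n_folds  # 70 when 700 samples are split tenfold
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random grouping for every repeat
        folds = [idx[f * fold_size:(f + 1) * fold_size] for f in range(n_folds)]
        for f in range(n_folds):
            test = folds[f]
            train = [i for h, fold in enumerate(folds) if h != f for i in fold]
            yield rep, f, train, test


splits = list(repeated_kfold())
print(len(splits))  # 10 repeats x 10 folds
```

Each of the resulting splits trains on 630 samples and validates on the remaining 70, and within one repeat every sample is used for validation exactly once.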

Discussion
There were 700 participants (430 males and 270 females) involved in this study. There were 106 smokers, accounting for about 15% of the total. The age, height, weight, CEV, and CPFR values were collected and analyzed. The averages of CEV and CPFR were 1.54 L and 6.25 L/s, respectively. These values are consistent with the statistics of the participants' physical information, which indicate a relatively young cohort. The ten times tenfold cross-validation method was adopted to verify the GA-BP method. From Tables 2 and 3, we found that the averages and standard deviations of the age, height, weight, CEV, and CPFR values of the training groups were close to those of the validation groups in all ten validations. This indicates that the training and validation groups were selected randomly and that the estimated results are reliable.
In this study, an estimated value is considered acceptable and relatively accurate when its relative error is within 10%. From the ten validation results in Figures 7 and 8, we find that the relative errors of most test samples are within 10% for both CEV and CPFR values. The accuracy of each of the ten validations and the average values for CEV and CPFR estimation are calculated and presented in Table 4. From Table 4, we found that the accuracy of the CEV and CPFR estimates exceeded 90% in all ten validations. The average CEV and CPFR estimation accuracies were 95% and 94.57%, respectively. The results indicate that the GA-BP method has high accuracy and can be used effectively for estimating CEV and CPFR values.
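One plausible reading of the accuracy figure is the share of test samples whose relative error falls within the 10% threshold. The sketch below computes it that way; this definition is an assumption, since the text does not spell out the formula, and the measured and estimated values are hypothetical numbers chosen only to exercise the functions.

```python
def relative_error(estimated, measured):
    """Absolute difference as a fraction of the measured value."""
    return abs(estimated - measured) / abs(measured)


def accuracy_within(estimates, measurements, tol=0.10):
    """Share of samples whose relative error is at most tol."""
    hits = sum(
        1 for e, m in zip(estimates, measurements) if relative_error(e, m) <= tol
    )
    return hits / len(measurements)


# Hypothetical CEV values in litres, not data from the study.
measured = [1.50, 1.20, 2.00, 1.80]
estimated = [1.55, 1.40, 1.95, 1.70]
print(accuracy_within(estimated, measured))  # 3 of 4 within 10% -> 0.75
```

Under this reading, an accuracy of 95% on a validation fold of 70 samples would correspond to roughly 66 or 67 estimates falling inside the 10% band.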
Because of the low sampling frequency of the portable pulmonary function device, PVT estimation was not completed in this study. Although the estimation accuracy is high under the current conditions, more data, especially data covering a wider span of age, height, weight, and other physical information, should be collected and used to improve the generalization ability.

Conclusions
The genders, heights, weights, ages, smoking status, CEV, and CPFR values of 700 participants were collected and analyzed in this paper.
The GA-BP method, which integrates a backpropagation neural network and a genetic algorithm, was developed to estimate the CEV and CPFR values. Additionally, the ten times tenfold cross-validation method was adopted to validate the GA-BP method. The results show that the estimation accuracy of the GA-BP method exceeds 90% for both CEV and CPFR values. The average CEV and CPFR estimation accuracies reached 95% and 94.57%, respectively. These results verify the accuracy of the GA-BP method.
In future work, PVT estimation will be completed. More information, such as compliance and resistance, will be measured and used in the GA-BP method to improve the generalization ability.

Data Availability
The datasets generated and analyzed during the current study are available in the Baidu cloud disk repository (https://pan.baidu.com/s/1NLTUx8lPO8l7LocGWpzJ1A&shfl=sharepset; code: 0qv1).

Conflicts of Interest
The authors declare that they have no conflicts of interest.