A Beat-to-Beat Atrial Fibrillation Detecting System

Atrial Fibrillation (AFib) is the most common cardiac arrhythmia, increasing in prevalence with age. AFib is often associated with structural heart disease and a substantial proportion of AFib patients lead to the significant morbidity, mortality, and cost. Thus, AFib is the most prevalent and costly health problems in the world and a major global healthcare challenge. This study presents a beat-to-beat AFib detection system to provide a healthcare system for AFib patients. For real-time Electrocardiogram (ECG) signals, the beat-to-beat AFib detection system consists of two methods in this study: an improved ECG R peak detection method and a beat-to-beat Gaussian voting AFib method. The improved R peak detection method proposes two different optimization algorithms to replace the knowledge-based theory in previous R peak detection method that consists of three stages: band-pass filter, interesting blocks and threshold. The beat-to-beat Gaussian voting AFib method extracts features based on the R-R intervals to identify the possibility of AFib. Based the R-R intervals, the heart rate can be estimated, and the system can detect the tachycardia and bradycardia in addition. The results using the MIT-BIH database show that the proposed R peak detection method can detect beats with 99.9984% accuracy in testing data. Clinical testing reveals that the proposed beat-to-beat Gaussian voting AFib method is about 94.72% accurate and 98.11% sensitivity for 6 normal subjects and 6 AFib patients.


Introduction
Atrial fibrillation (AFib) is the most common cardiac arrhythmia and affects nearly 1% of the population. Its prevalence increases with age: it is relatively infrequent in those under 40 years old, but occurs in up to 5% of those over 80 years of age. Normal people have a resting heart rate of between 60-80 beats per minute. But, the atrium of AFib patients contract rapidly and irregularly and produce a heart rate of between 400-800 beats per minute. Fortunately, the atrioventricular node compensates for this activity; only about 1 or 2 out of 3 atrial beats pass to the ventricles [1].
A typical AFib shows a rapid irregular tachycardia in which recognizable P waves are sometimes absent [2]. The ventricular rate in patients with untreated AFib is generally 110 to 180 beats per minute. In elderly patients, ventricular rates in untreated AFib are typically slower. Recent data from the Framingham study indicates that Chronic Heart Failure (CHF) is associated with a 4.5-and 5.9-fold risk of AFib for men and women, respectively [3]. Apart from the epidemiological data, most evidence on the prevalence of AFib in heart failure patients stems from analysis of a number of clinical trials in heart failure with populations conducted within the last 10-15 years. The AFib might have no detectable cardiovascular disease. Hemodynamic impairment and thromboembolic events related to AFib patients included in these trials were selected for different purposes, which are reflected in the varying prevalence of AFib. In addition, AFib is often associated with structural heart disease and a substantial proportion of AFib patients lead to the significant morbidity, mortality, and cost, which make AFib become a major global healthcare challenge [4].
Since the QRS complex is the most prominent feature of an ECG signal and the least sensitive to muscle movement, several methods have been reported for AFib detection based on R-R intervals (RRI) irregularity. Logan and Healey used the histogram of variance of RRI to identity AFib [5]; Tateno and Glass utilized a histogram of the difference between successive RRI [6]. Other accomplished methods, such as the coefficient of variation and Kolmogorov-Smirnov testTateno and Glass) [6], Markov models [7], neural networks [8] and Hidden Markov Models [9], are all proposed to have a robust detection results in AFib identification. Given the chaotic pattern, it is unlikely to model the exact behavior of R-R irregularity during AFib. All the methods mentioned above are based on the fact that the irregular R-R intervals of AFib can be expressed in a typical pattern of R-R interval distribution, which could be used to differentiate AFib from non-AFib rhythms. Besides, for paroxysmal AFib, the duration of AFib might be as short as 10 seconds. By using fewer R-R intervals, the algorithm has more chances to detect the symptom. Therefore, the sooner the reaction time or the fewer R peaks usage can make the expert system more efficient.
This study attempts to develop an intelligent expert system with a built-in abnormal ECG detection mechanism to facilitate diagnosis and management of patients with AFib and other rhythm disorders (tachycardia and bradycardia). The main purpose of this study is to establish a new method for real-time AFib detection for a quick and concise performance, which can be embedded to mobile or table tablet personal computer (PC) applications to advance a tele-health system. The remainder of this paper is organized as follows. In Section 2, an improved ECG R peak detection method and a beat-to-beat Gaussian voting AFib method are described. Section 3 compares the results obtained from obtained from R peak detection and against previous studies and discusses the performance of Beat-to-beat AFib detection system. Finally, Section 4 provides concluding remarks.

Methodology
The beat-to-beat AFib detection system consists of two parts for real-time clinical ECG signals: an improved ECG R peak detection part ( Figure 1A) and a beat-to-beat Gaussian voting AFib part ( Figure 1B). In the following sub-sections, the present study introduces the beat-tobeat AFib detection model consists of two parts in detail.

The improved ECG R peak detection method using optimization algorithm
This study adopts the two different optimization algorithms to replace the knowledge-based theory in proposed QRS peaks algorithm [10]. This study adds order of filter (N) into the band-pass filter. Thus, this study adopts two optimization algorithms to produce these six parameters (N, F1, F2, W1, W2 and β).
PSO algorithm: Particle swarm optimization (PSO) was proposed by Kennedy and Eberhart [11] according to the behaviour of birds. PSO is based on population search concept to explore the optimal solution. Because the movements of particles reference the global best position and past best position of particle, all particles will move towards the current global best position aggregately. Particles change their position continuously and update their best position until the terminating condition or the optimal solution of the problem is reached.
PSO is constructed on the basis of flocking behaviour in birds or fish. PSO assumes that birds foraged in a regional context where there is a food source (the position of best solution). Because birds don't know the position of food source, birds search within the region randomly in accordance with swarm intelligence. Each bird communicates experience of searching food source with each other thus birds move towards food source gradually. The searching behaviours of particles can move toward a new direction according to two kinds of experiences, one is cognitive model, and other is social model. In cognitive model, the locations of particle's optimal solution are records by itself, which is called personal best position (pb). In next generation, each particle moves on the basis of its own best position (pb). In Social model, particles share their location information with each other. The global best solution (gb) can be obtained by the fitness values of particles.

Canonical PSO:
The entire population of PSO is composed of particles. The position of each particle denotes X id , where i is the number of particle in population and d is the dimension of particle. The position of particle X id is adjusted using following equation: where V id is a velocity of each particle to determine the moving direction of particle and can be depicted as follows: V id (t+1) = w * V id (t) + c 1 * r 1 * (pb id (t) -x id (t+1)) + c 2 * r 2 * (gb id (t) -x id (t+1)) (2) where V id (t+1) is new updated velocity of particle; V id (t) is old velocity of particle; X id (t+1) is new updated position of particle and X id (t+1) is old position of the particle. t is the number of iteration. w is inertia weight and influences the convergence rate of PSO. PSO can control to adjust the influence of V id (t) on the V id (t+1). gb id (t) is the global best particle and pb id (t) is the personal best particle. C 1,2 are confidence factor value for swarm. r 1,2 are random number between 0 and 1 and can ensure the diversity of the particles in population. Equation (1) and (2) can regarded as the canonical PSO [12].

Standard PSO 2011:
Recently, the last standard PSO (SPSO-2011) has been defined [12]. In SPSO-2011, while particle position X id is update by Equation (1), velocity is revised as following equation: where H id (.) is hypersphere function and ‖. ‖ is the absolute value function. Parameter settings are defined in the study of Zambrano-Bigiarini et al. [12]. Swarm confidence factor c 1 and c 2 equal to 0.5 + ln (2) and inertia weight w equals to 1⁄(2 * ln( 2)).

DE algorithm:
Storn and Price proposed DE in 1995 [13]. DE produces new offspring by using three basic schemes: mutation, crossover, and selection. These schemes are different from those in the GA. DE adopts mutation to converge evolution, utilize crossover to control the information exchanged during convergence, and employs selection to verify the success of the convergence. In 1996, DE solved the numerical problems discussed during the first IEEE Congress on Evolutionary Computation (CEC) conference [14]. Furthermore, the DE algorithm and its variants have been achieving competitive rankings in various IEEE CEC conference competitions [15,16].
In general, three type of vectors was defined in DE algorithm: donor vector V i,G , trial vector U i,G , and target vector X i,G . V i,G is produced by mutation, U i,G is generated using crossover, and X i,G is the current vector of the population. There are three parameters in DE algorithm: population NP, crossover control rate Cr, and amplification factor of the difference vector F. Each member in the population is regarded as a D-dimensional parameter vector. In order to contain the entire search space, the population in the DE algorithm is initialized far. In the DE algorithm, a vector is represented by X i,G , where i=1,2,3,…,NP and G is the generation number. This study individually illustrates these three basic schemes.

Mutation:
The DE algorithm adopts mutation to change the vector's location. Each mutation strategy has a characteristic approach to change a vector [14,17,18]. The formula for mutation is the linear combination of the existing vectors. The mutant vector is the donor vector. An initial individual X i,G of S pop is generated randomly from a uniform distribution within the decision space D. At each generation, for an individual X i,G , three distinct individuals X r1,G , X r2,G , and X r3,G are pseudorandomly extracted from the population. The vector X i,G =[X i,1,g , X i,2,g , X i,3,g ,….,X i,D-1,g , X i,D,g ] is the target vector. According to the DE logic, the donor vector V i,G of the ith population member is generated through mutation as where F ∈ (0,2) is a scale factor to control the length of the exploration vector (X r2,G -X r3,G ). Random indices are r 1 , r 2 , r 3 ∈ {1, 2, 3,…,NP}, i ≠ r 1 ≠ r 2 ≠ r 3 . The mutation strategy in (1) is known as DE/rand/1 [14]. Other variants of the mutation strategies have been subsequently proposed [17,18]: -DE/best/1: -DE/current to best/1: -DE/current to rand/1: -DE/rand to best/1: where X best is the solution with the best performance among the individuals of the population.

Crossover:
The crossover strategy [19] can control the inherited components from the mutant vector to a target vector. When a provisional offspring is produced through mutation, each gene of individual V i,G+1 is exchanged with the corresponding gene of X i,G with a uniform probability to generate the trail vector ,..., u , g, u ] : (0,1) Cr or j j (10) where rand j (0,1) is a random number between 0 and 1. The parameter j is the index of the gene and j=[1, 2, 3,…,D− 1,D]. j rand is an integer randomly chosen from 1 to D. Cr ∈ [0,1] is a constant influencing the number of elements to be exchanged through crossover. Because of the range of j rand , U i,j,G , always differs from X i,j,G and i=1,2,3,…,NP. Equ. (10) is famous as uniform binary crossover.
Another crossover stagey is exponential crossover [20], which is similar to the two-point crossover strategy in which the first cutpoint l is randomly chosen from [1,D] and the second point is decided such that L consecutive components are chosen from the mutant vector. The trail vector U i,G =[u i,1,g , u i,2,g , u i,3,g ,…., u i,D-1,g , u i,D,g ] is created as follows: where i=1,2,3,…,NP, j=1,2,3,…,D, and 〈 〉 D is the modulo function with modulus D, and 〈j〉 D is j of j ≤ D and j − n if j > D. Price et al. [21] reported that Prob(L=h)=(1-Cr)Cr n-1 corresponds to a geometric distribution, the discrete counterpart of the continuous exponential distribution. This study derives (8) and (9) through the aforementioned exponential crossover.
, , , Cr (13) where j rand is an integer and randomly chosen between 1 and D and randCri ∈ [0,1) is a random number. The parameter expCr is the exponent of Cr.
In binomial crossover, Cr explicitly determines the probability that a component will be replaced with a mutated component. By contrast, in exponential crossover, Cr determines the number of components to be mutated.
Selection: New members in DE are formed using the selection strategy [19]. The selection strategy compares the fitness value of the trial vector U i,G+1 with that of the target vector X i,G . The vector with the best fitness value is selected as a new member in the population. The following equation is used for selecting the fitness value: where the fitness () function is the fitness value of the objective function, and X i,G+1 is the new offspring in the DE algorithm.

The revised ECG R peaks detection method:
This study adds order of filter (N) into the band-pass filter in previous study by Elgendi [10]. The ECG signal is processed separately with the following sub-steps in the detection method.

Bandpass filter:
The aim of band-pass filtering is to remove the baseline wander and high frequencies that do not contribute to QRS complex detection. The band-pass filter is designed using Matlab function "designfit" (Matlab signal processing Toolbox), which need to set the type of filter, order of filter, F1, F2 and sampling rate. In this study, the type of filter is band-pass IIR filter and sampling rate is obtained from training and testing data. Rang of order of filter is from 2 to 30 and its value need to be an even number. F1 is the starting frequency and the range is from 1 to 10. F2 is the cut-off frequency and the range is from F1+10 to 25. F1 affects the value of F2. The slop of filter, which described by how many decibels the filter gain drops off per (logarithmic) frequency interval above the cut-off frequency, depends on the order of the filter and as a general rule for filter. The slope increases by 6.02 * N⁄2 per octave for the order of filter N.

Squaring function:
The function is to square the filtering signals as show in Equation (15) and enhance the large values and boost highfrequency components.
where x[n] is the original ECG signal. In the all equations, n is the data point.

Generating blocks of interest
Blocks of interest can be produced by two event-related moving averages: MA QRS and MA QRS . MA QRS is used to extract the QRS features and calculated by Equation (16).
MA QRS can smooth out multiple peaks to emphasize and extract the QRS area. MA one Beat is extracted the QRS's beat and calculated by Equation (17).
If the MA QRS is higher than THR1, this area can as an interesting block. The meaning of the interesting block is that this block contains the ECG features and noise.
Thresholding and detecting R peaks: If a width of interesting block is more than or equal to W 1 , it is classified as a QRS complex. On the contrary, the block is classified as a P wave, T wave or noise. The last procedure is to find the maximum value within each interesting block, which is classed as a QRS complex. Then the maximum value is considered as R peak. This study adopts optimization algorithm to find out the suitable sets of parameters (F1, F2, N, W1, W2 and β) in the revised ECG R peaks method.

Beat-to-beat Gaussian voting AFib method
Two indicators were defined in this model, including △RRIi and △△RRIi. △RRIi refers to the variation of consecutive RRI. △△RRIi. represents the difference of △RRIi-1 and △RRIi. Accordingly, a fiveheartbeat moving window can acquire three △RRIi and two △△RRIi. These five parameters were applied to construct a beat-to-beat Gaussian voting AFib method. Three normal subjects and three AFib patients were used to train this model. △RRIi distribution of AFib patients training data were shown in Figure 2. Apparently, the △RRIi distribution of AFib patients had a wider distribution in all parameters compared to normal subjects, which was a normal distribution (data not shown). A two-Gaussian fitting curve was used to fit the distribution as well as a normal distribution was applied to normal subjects.
When an ECG signal started to receive, we can calculate the five parameters with a five-heartbeat moving window overlapped by four beats. Thus, the system can compute the data every heart beats to achieve a real-time monitor. Next, we applied the constructed beatto-beat Gaussian voting AFib method to estimate the probabilities of normal and AFib conditions. Each parameter had a comparative result between two probabilities that the higher the density won the voting and marked 1 point. Finally, these 5 parameters were summed as the final score for two conditions (normal subject and AFib patient) in a range of 0-5. System alarm as the event occurred when the score of AFib patient was greater than the score of normal subject ( Figure 1B). In addition, the heart rate (HR) measures the number of heartbeats taken per minute and the formula for calculating the HR is was 60 divided by averaged R-R interval of each 6 s. System alarm as the event (tachycardia and bradycardia) also occurred when the heart rate was smaller than 60 beats per minute (bpm) or bigger than 120 bpm.

Performance indexes
To evaluate the performance of the detection algorithm, we adopt two indices in this study. False negative (FN) indicates a failure to detect a cardiac patient. True positive (TP) is the total number of correctly detected cardiac patients. By using FN and TP, the sensitivity, accuracy and error rate can be calculated using (20)- (22). Sensitivity (%)=(TP/TP + FN) × 100 (20) Accuracy (%)=(TP/Total beats) × 100 (21) Error rate (%)=100-Accuracy (22) The algorithms were compared by calculating the numbers of TP and FN for each record.

The improved ECG R peak detection
In this study, the proposed algorithms are implemented in R2015a on Intel (R) Xeon (R) CPU E5-26500 at 2.00 GHz. The MIT−BIH Arrhythmia Database (https://www.physionet.org/physiobank/ database/mitdb/) includes different shapes of QRS complexes and noise, which makes it the most suitable database to test the robustness of the developed ECG algorithms. Database contains 48 ECG recordings. This study uses only 42 ECG recordings because the remaining ECG recordings exhibit the considerable noise in ECG signal. These are very difficult judgment QRS complexes, even for humans. The data are stored on four-channel FM magnetic tapes. Channels 1 and 2 are the two-channel ECG signals. Channel 3 is the annotation channel recorded in a standard binary format, and channel 4 is a binaryrecorded timing track. All the recordings were sampled at 360 Hz with 11-bit resolution over a 10-mV range. This study adopts leave one out cross validation (LOOCV) for data validation. LOOCV uses only one of the ECG recordings to as testing data, while the remainder are as training data.
This study adopts the training data and optimization algorithms to calculate the optimized parameters in the above-mentioned improved R peak detection algorithm. Fitness function estimates by the accuracy between the real R peaks and estimated R peaks. Testing data and the optimized parameters, which are found out by optimization algorithms and training data, can calculate the accuracy between the real R peaks and estimated R peaks.
All ECG recordings can regard as testing data once. Therefore, the above actions implement for 42 times and generate 42 different optimize parameters. This study adopts SPSO-2011, DE which has six different strategies to find out the optimize parameters in R peak algorithm. This study compares the error rates and computation times among SPSO-2011, DEs and knowledge-based theory in Table 1 and 2. In SPSO-2011 and DEs, number of particle in population is 40. The topping criterion occurs when the error rate in algorithm is smaller than one in knowledge-based theory or the number of iterations is greater than 25.
In original QRS peaks algorithm [10], the number of parameters is five (F1, F2, W1, W2 and β) and the order of filter is 10. Thus, knowledge-based theory can produce 34650 combinations and the computing operation spends 3 days. Table 1 shows that SPSO-2011 and DE/rand/1 spend shorter time than knowledge-based theory to find out the suitable parameters. Although the error rate of training data in SPSO-2011 and DE/rand/1 is little more than one in knowledge-based theory, two algorithms can obtain the minimum error rate of testing data. These results suggested that the combined parameters could an overfitting result after using knowledge-based theory.
When the number of parameters increases from five to six, the number of combinations changes from 34650 to 381150 and it spends 30 days. Table 1 shows that the knowledge-based theory takes a lot of time in trial and error and error rate is little less than other methods. Thus, Table 2 only compares the error rates and computation times among SPSO-2011and DEs when six parameters (F1, F2, N, W1 and W2) need to be obtained. The order of filter (N) affects the combination of other five parameters indirectly because the order of filter influences the slope of filter and applies to ECG signal first. In addition, the structure of the solution space of six parameters is different from one of five parameters. The results of Table 2 show that SPSO-2011 can obtain the best performance in error rate and computation times. The SPSO-2011and DEs can obtain the high accuracy for detecting R peaks (error rate=0) expect testing data 113, 114, 208, 215, 231 and 232. In order to observe the details, Table 3 shows the error rate of those testing data among SPSO-2011and DEs. The results of Table 3 indicate that the SPSO-2011 is the best algorithm to improve R peak detection, which consist with the results of Table 2.
As the above-mentioned results, the revised ECG R peaks detection method using SPSO-2011 can obtained the best performance. Thus, this study selects record 105 from the MIT/BIH database to compare this proposed method in this study with the previous study specially as shown in Table 4, because record 105 is the most difficult to analyse owing to a large induced noise factor and then widely used by researchers to test R peak detection algorithms. Table 4 shows that the proposed method has the highest positive prediction rate (100%). Table 5 compares the performance of various algorithms on all the records generated from the MIT/BIH database [23][24][25][26][27][28][29][30][31][32][33]. The proposed method can clearly achieve good performance.  Beat-to-beat AFib detection: Totally 6 normal (aged between 52~76 years) and 6 patients (aged between 54~80 years) records were mixed and served as the testing data. Comparison and calculation was performed according to the recommendations of the American National Standard for ambulatory ECG analysers (ANSI/AAMI EC38-1994). A true positive indicates that the algorithm successfully detected a true AFib episode and, on the contrary, a false negative indicates the fail of detection of AFib ECG. False positive represents a false AFib detection whereas true negative means normal subjects have nothing to be detected at all. Accuracy, Sensitivity and positive predictive value were used for further analysis. The recorded data was shown in Table 6.
Twelve subjects (including normal and AFib patients) with 150 trials each were test. Accuracy, sensitivity, specificity and positive predictive value were used to present system performance. Accordingly, the accuracy of this AFib detection model was 94.72%, followed with 98.11% of sensitivity, 97.97% of specificity and 91.88% of positive predictive value.
The results suggest that this system can offer a highly sensitive and specific monitoring of AFib for a pre-screen purpose. High sensitivity stands for low chance to omit the disease and high specificity indicates lesser possibility to erroneous diagnose of a normal person. Moreover, the false negative detection of AFib data was separated in five parameters from those of true positive data statistically in Figure 3. This implied that the false negative detection might due to the original data was in a normal state manner.

Discussion
Most previous studies emphasized the improvement of the whitening filter for QRS detection. This study proposed the improved R peak detection method using two different optimization algorithms to replace the knowledge-based theory in previous R peak detection method. This beat-to-beat Gaussian voting AFib method provides a new method for diseases classification and supports a high sensitivity and specificity for diseases detection. The voting mechanism can further apply to fuzzy neural network to construct a more precise and complete expert system.

Conclusion
In conclusion, this working model is capable for early AFib detection, and represents a successful first step toward improving efficiency and quality of care in cardiovascular disease (CVD). Further research on improving both hardware and software designs are necessary to enhance the efficiency and accuracy of this system in the future.   Wavelet de-noise 5 Chen et al. [25] Filter banks 16 Poli et al. [26] BPF 22 Hamilton and Tompkins [27] Linear adaptive filter 22 Lee et al. [28] Optimized filtering 21 Arzeno et al. [29] Topological mapping 4 De Chazal et al. [30] Wavelet transform 12 Faezipour et al. [31] Artificial neural network adaptive filter 4 Sufi et al. [32] 3M method 7 Benitez, et al. [33] Proposed method 0 - Proposed method 99.83 -