Article

DE-PNN: Differential Evolution-Based Feature Optimization with Probabilistic Neural Network for Imbalanced Arrhythmia Classification

BioComputing Lab., Institute for Bio-Engineering Application Technology, Department of Computer Science and Engineering, KOREATECH, Cheonan 31253, Korea
*
Author to whom correspondence should be addressed.
Sensors 2022, 22(12), 4450; https://doi.org/10.3390/s22124450
Submission received: 26 April 2022 / Revised: 31 May 2022 / Accepted: 6 June 2022 / Published: 12 June 2022

Abstract

In this research, a heartbeat classification method is presented based on evolutionary feature optimization using differential evolution (DE) and classification using a probabilistic neural network (PNN) to discriminate between normal and arrhythmic heartbeats. The proposed method follows four steps: (1) preprocessing, (2) heartbeat segmentation, (3) DE feature optimization, and (4) PNN classification. We employ the direct signal amplitude points constituting the heartbeat, acquired from an ECG Holter device, with no secondary feature extraction step of the kind usually required for hand-crafted, frequency-transformed or other features. The heartbeat types include normal, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature, ventricular escape, ventricular flutter and paced beat. Using ECG records from the MIT-BIH database, heartbeats are defined to start 250 ms before and end 450 ms after the respective R-peak positions. In the next step, the DE method is applied to reduce and optimize the direct heartbeat features. Although complex and computationally expensive ECG heartbeat classification algorithms have been proposed in the literature, they fail to achieve high performance in detecting some minority heartbeat categories, especially for imbalanced datasets. To overcome this challenge, we propose a feature optimization step for the PNN classifier driven by a classification metric called the Matthews correlation coefficient (MCC). This function focuses on arrhythmia (minority) heartbeat classes by increasing their importance. Maximum MCC is used as a fitness function to identify the optimum combination of features for the uncorrelated and non-uniformly distributed samples of the eight beat classes. The proposed DE-PNN scheme provides better classification accuracy for 8 classes with only 36 features optimized from a 253-element feature set, implying an 85.77% reduction in direct amplitude features. 
Our proposed method achieved overall 99.33% accuracy, 94.56% F1, 93.84% sensitivity, and 99.21% specificity.

1. Introduction

An electrocardiogram (ECG) represents electrical activity of the heart in a graphical manner. It is a non-invasive and commonly-used tool by clinicians and cardiology specialists to monitor the function of the heart and diagnose both critical and non-critical heart diseases. The ECG signal is defined by a standard PQRST sequence of waves as shown in Figure 1. The P wave indicates atrial depolarization. The QRS complex consists of a Q wave, R wave and S wave and represents ventricular depolarization. The T wave comes after the QRS complex and indicates ventricular repolarization. Each of these entities, i.e., P wave, QRS complex and T wave (possibly a U wave) have a unique pattern in terms of duration, amplitude and consecutive inter-beat correlation. A deviation in this normal pattern signifies an abnormal event. The diseased state in the case of cardiovascular monitoring is called an arrhythmia. The occurrence of an arrhythmic event is rare but critical and life threatening leading to a sudden cardiac arrest or sudden cardiac death incident. Recently, cardiovascular health monitoring has shifted from traditional in-clinic ECG machines [1,2] to portable and wearable ECG devices [3,4,5] that accumulate 24 h single-lead patient ECG data for long-term and continuous monitoring scenario. Identifying deviating patterns from the normal heartbeats in this large accumulated data is a tiresome and tedious job for clinicians and suffers from inter- and intra-observer variation error. This problem has led to the evolution in the development of computer-aided diagnostic methods for cardiovascular disease pathology indication for early referral to cardiac specialists and initiation of proper and timely medical attention.
Recent developments in the use of wearable ECG devices and on devices based on the Internet of Medical Things (IoMT) have led to an explosion of routinely collected individual ECG data. The use of feature engineering and computational intelligence methods to turn these ever-growing ECG monitoring data into clinical benefits seems as if it should be an obvious path to take. Computer-aided ECG arrhythmia classification systems that use intelligent techniques for the development of smart healthcare monitoring platforms are popular nowadays. A computer-aided early referral arrhythmia classification system [6,7,8] usually involves a feature extraction process in which a set of features is calculated for each individual heartbeat (the type of features used might be hand-crafted, statistical, morphological or spectral, etc.) and classifier construction to learn the features and classify incoming heartbeats. Using all the features calculated in the feature extraction step and a multi-layered classifier not only introduces heavy computational cost but also affects classifier performance due to the presence of redundant/corrupted features. The latest systems deploy a feature reduction/optimization step before classification to remove all unnecessary features. This also allows the use of a single layered or a computationally less intensive learning algorithm for classification.
In the latest competitive research, novel features and various classifiers have been utilized for ECG beat classification tasks. Sayantan et al. [9] learned a feature representation of the ECG using a Gaussian–Bernoulli deep belief network, followed by linear support vector machine (SVM) training in the consecutive phase. Elhaj et al. [10] investigated principal components of discrete wavelet transform coefficients and higher order statistics. Afkhami et al. [11] used parameters of Gaussian mixture modeling together with skewness, kurtosis and 5th moment and applied an ensemble of decision trees to classify the heartbeats using a class-oriented scheme. Liu et al. [12] improved the dictionary learning algorithm for vector quantization of ECG. Shen et al. [13] used wavelet transform-based coefficients, signal amplitude and interval parameters. A new classifier, which integrates k-means clustering, one-against-one SVMs, and a modified majority voting mechanism, is proposed to further improve the recognition rate for extremely similar classes. Qin et al. [14] developed wavelet multi-resolution analysis to extract time-frequency domain features and applied a one-versus-one support vector machine to characterize six types of ECG beats. Zhai [15] and Acharya et al. [16] used a CNN classifier. Oh et al. [17] combined CNN and LSTM to propose a refined classification method and generated synthetic data to overcome the imbalance problem, with accuracies of 94.03% and 93.47% with and without noise removal, respectively.
Recently, researchers have presented different feature reduction methods to reduce the input dimensions of ECG signals for neural classifiers. To name a few of the latest, Zhang et al. [18] extracted statistical features applying a combined method of frequency analysis and Shannon entropy and used information gain criteria to select 10 highly effective features to obtain a good classification on five types of heartbeats. Yildrim et al. [19] implemented a convolutional auto-encoder-based nonlinear compression structure to reduce the feature size of arrhythmic beats. Tuncer et al. [20] applied the neighborhood component analysis feature reduction technique to obtain 64, 128 and 256 features from a 3072 feature vector size. Wang et al. [21] proposed an effective ECG arrhythmia classification scheme consisting of a feature reduction method combining principal component analysis with linear discriminant analysis. Alonso-Atienza et al. [22] used a filter-type feature selection procedure which was proposed to analyze the relevance of the computed parameters. Chen and Yu [23] applied nonlinear correlation-based filters, calculated feature–feature correlation to remove redundant features prior to the feature selection process based on feature–class correlation. Asl et al. [24] proposed the feature reduction scheme based on generalized discriminant analysis. Haseena et al. [25,26] used a fuzzy C-mean (FCM) clustered probabilistic neural network (PNN) for the discrimination of eight types of ECG beats. The performance has been compared with FCM clustered multi layered feed forward network trained with the back propagation algorithm. Important parameters are extracted from each ECG beat and feature reduction has been carried out using FCM clustering. Polato et al. [27] used principal component analysis. 
Genetic algorithms have also been applied recently for the optimization of ECG heartbeat features [28,29,30,31] and proved to be advantageous in improving the time-cost value in heartbeat classification methods.
Previously proposed automated cardiovascular disease diagnosis systems have mostly followed the design objective of achieving high performance by maximizing accuracy, F1-score, sensitivity and precision measures. A major limitation in the case of general and particularly cardiovascular disease diagnosis is a highly unbalanced ratio or frequency of occurrence of normal to abnormal events. Furthermore, existing multi-class learning approaches mainly focus on exploiting label correlations to facilitate the learning process. However, an intrinsic characteristic of multi-class learning, i.e., class-imbalance [32] has not been well studied [33,34,35]. The Matthews correlation coefficient (MCC) was first used by B.W. Matthews for the performance assessment of protein secondary structure prediction [36]. Since then, it has become a widely used performance measure in biomedical research. MCC and Area Under ROC Curve (AUC) have been chosen as the elective metric in the US FDA-led initiative MAQC-II that aims to reach a consensus on the best practices for the development and validation of predictive models for personalized medicine [37].
This research employs Differential Evolution (DE) [38], a robust and highly effective metaheuristic search algorithm. DE has been used in applications across many fields, for example, surface and Bézier curve optimization [39], electronic circuitry [40], lithology [41] and solar cell optimization [42]. The current work implements DE to optimize direct ECG heartbeat amplitude features so as to maximize MCC for eight heartbeat classes having imbalanced and uncorrelated class distributions. The algorithm is tuned to find a minimal combination of features that performs better than the full feature set. The motivation here is to remove noisy or redundant signal points, specifically for the task of classification. Classification using PNN is performed with both the optimum and the full feature sets to show the difference. The proposed method is depicted in Figure 2. Using PNN to classify abnormal heartbeats from a reduced set of direct heartbeat amplitude points eliminates the computation of a secondary feature extraction step, produces higher classification performance due to the removal of unnecessary features, and is faster due to the minimized number of features and the less complex PNN learning algorithm. The rest of this paper is organized as follows. In Section 2, the clinical data, cardiac cycle identification and normalization, DE feature reduction and the PNN classification for arrhythmia identification are described in detail. Section 3 includes the performance evaluation measures and data division for training and testing. Results are presented in Section 4. A detailed discussion of the achieved results plus some future possibilities are presented in Section 5.

2. Materials and Methods

2.1. Clinical Data

ECG data for this study belong to the “MIT-BIH arrhythmia database” developed in 1987 and are available as open source on Physionet (https://physionet.org, accessed on 15 December 2020) [43,44]. The database consists of 48 two-channel ambulatory ECG records, each of approximately 30 min duration, digitized at a sampling rate of 360 Hz and acquired from 47 subjects, of whom 25 were men aged 32 to 89 years and 22 were women aged 23 to 89 years (2 records came from the same subject). Each record has simultaneous recordings from 2 leads, MLII and V5. For the purpose of testing a wearable ECG sensing scenario that mostly uses a single lead for acquisition [45], this work uses the ECG signal from only the MLII lead. Each record is supported by an annotation file providing the R-peak positions and corresponding beat labels (Lb). Hence, for this research, 107,800 heartbeats are used having corresponding labels for 8 classes, i.e., normal (NORM), left bundle branch block (LBBB), right bundle branch block (RBBB), premature ventricular contraction (PVC), atrial premature contraction (APC), ventricular escape (VESC), ventricular flutter wave (VFLT) and paced (PACE) beat. The selected 8 classes also include less frequent but clinically significant arrhythmic beats to prove the validity of the proposed algorithm. A sample of each beat pattern is shown in Figure 3.

2.2. Proposed Methodology

The proposed methodology, shown graphically in Figure 2 and in detail in Figure 4, is explained in four steps: (1) preprocessing, (2) cardiac cycle identification and normalization, (3) feature optimization, and (4) disease-based classification, as follows:

2.2.1. Preprocessing

In the preprocessing stage, power-line interference and low-frequency components are removed from the raw ECG signal by using a 6th-order bidirectional Butterworth band-pass filter with lower and upper cut-off frequencies of 0.5 and 40 Hz, respectively. The baseline is computed as a cubic spline interpolation of fiducial points placed 90 ms before the R-peak positions, as an approximation of the baseline PR-segment, and is subtracted from the band-pass-filtered signal.
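The two preprocessing operations described above can be sketched as follows. This is a minimal illustration using SciPy, not the authors' implementation: the function name `preprocess` is hypothetical, and it assumes that "6th-order" refers to the overall band-pass design (order 3 per band edge) and that the spline is allowed to extrapolate outside the first and last fiducial point.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt
from scipy.interpolate import CubicSpline

FS = 360  # MIT-BIH sampling rate in Hz (Section 2.1)

def preprocess(ecg, r_peaks, fs=FS):
    """Band-pass filter an ECG and subtract a cubic-spline baseline estimate."""
    # Assumption: order 3 per band edge yields a 6th-order band-pass design.
    # sosfiltfilt applies it forward and backward (zero-phase, "bidirectional").
    sos = butter(3, [0.5, 40.0], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, np.asarray(ecg, dtype=float))

    # Fiducial points 90 ms before each R peak approximate the PR segment.
    offset = int(round(0.090 * fs))
    fiducials = np.asarray(r_peaks) - offset
    fiducials = np.unique(fiducials[fiducials >= 0])

    # Interpolate the baseline through the fiducial points and subtract it.
    spline = CubicSpline(fiducials, filtered[fiducials])
    baseline = spline(np.arange(len(filtered)))
    return filtered - baseline
```

By construction the corrected signal is zero at every fiducial point, which is a convenient sanity check when adapting the sketch to real records.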

2.2.2. Cardiac Cycle Identification and Normalization

Using the R-peak positions provided with each record, a heartbeat sample is defined as starting 250 ms before and ending 450 ms after each R-peak position. At the 360 Hz sampling rate, this corresponds to 90 samples before and 162 samples after the R-peak, so that each heartbeat, including the R-peak sample itself, consists of 253 sampling points. This definition ensures that the important characteristic points of the ECG, such as the P, Q, R, S, and T waves, are included [46], as shown in Figure 5. The signal amplitude biases in the waveforms of the ECG beat samples are inconsistent due to instrumental and human errors. Hence, we utilize the Z-score method to reduce these differences in each ECG beat sample. In the Z-score method, the mean value of each ECG beat sample is first subtracted from that sample to eliminate the offset effect, and the result is then divided by its standard deviation [21]. This procedure results in a normalized ECG beat sample with zero mean and unit standard deviation. Figure 3 shows samples for all 8 ECG beat classes used in this research.
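As a minimal sketch of the segmentation and Z-score steps, the snippet below cuts a window around each annotated R-peak and normalizes it. The function name `segment_beats` and the choice to skip beats whose window crosses the record boundary are assumptions made for illustration.

```python
import numpy as np

FS = 360                        # sampling rate in Hz
PRE = int(round(0.250 * FS))    # 90 samples before the R peak
POST = int(round(0.450 * FS))   # 162 samples after the R peak

def segment_beats(ecg, r_peaks, labels):
    """Return Z-scored beats of 253 samples each and their labels."""
    beats, kept = [], []
    for r, lb in zip(r_peaks, labels):
        start, stop = r - PRE, r + POST + 1   # +1 keeps the R-peak sample
        if start < 0 or stop > len(ecg):
            continue  # assumption: drop beats truncated by the record edges
        beat = np.asarray(ecg[start:stop], dtype=float)
        # Z-score: zero mean, unit standard deviation per beat
        # (a perfectly flat beat would need a guard against std == 0).
        beats.append((beat - beat.mean()) / beat.std())
        kept.append(lb)
    return np.array(beats), np.array(kept)
```

Each row of the returned array is one 253-point feature vector of the kind the feature optimization step operates on.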

2.2.3. Feature Optimization

The mathematical model followed for feature optimization using DE to find the minimum number of features that result in maximum classification performance is explained as follows.

Population Initiation

An initial population matrix $P$ is generated as in Equation (1) to represent the possible solution/optimization space. It consists of $n_p$ binary row vectors $p$, called population individuals, each of length $n_f$ (the number of features in a heartbeat sample, in this case 253 as mentioned in Section 2.2.2).
$$
P_{n_p \times n_f} =
\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_i \\ \vdots \\ p_{n_p-1} \\ p_{n_p} \end{bmatrix}
=
\begin{bmatrix}
p_{1,1} & p_{1,2} & \cdots & \cdots & p_{1,n_f} \\
p_{2,1} & p_{2,2} & \cdots & \cdots & p_{2,n_f} \\
\vdots & \vdots & & & \vdots \\
p_{i,1} & p_{i,2} & \cdots\; p_{i,j} & \cdots & p_{i,n_f} \\
\vdots & \vdots & & & \vdots \\
p_{n_p-1,1} & p_{n_p-1,2} & \cdots & \cdots & p_{n_p-1,n_f} \\
p_{n_p,1} & p_{n_p,2} & \cdots & \cdots & p_{n_p,n_f}
\end{bmatrix}
\tag{1}
$$
where $p_{i,j}$ represents the bit value at the $j$th feature position in the $i$th population individual. Here, $j = 1$ to $n_f$ and $i = 1$ to $n_p$. 1's and 0's in each population individual represent the selected and non-selected features, respectively. The bits $p_{i,j}$ for $p_1$ to $p_{n_p-1}$ are generated with a probability of 0.8 for a bit being 1. The last population individual $p_{n_p}$ is set to $p_{all}$, defined as the population individual representing the 'All-feature' set in the optimization space, as expressed in Equation (2). This tunes the DE optimization process to find a final subset of optimized and reduced features that achieves even better fitness than the all-feature set.
$$
p_{n_p} = p_{all} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \end{bmatrix}_{1 \times n_f}
\tag{2}
$$
The number of individuals $n_p$ is chosen as 50 so that it is large enough to avoid stagnancy and small enough to avoid excessive computing time [47,48].
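The initialization described above can be illustrated with a short NumPy sketch. The function name `init_population` and its `seed` argument are hypothetical. The population size, feature count and bit probability are the values stated in the text.

```python
import numpy as np

def init_population(n_p=50, n_f=253, p_one=0.8, seed=None):
    """Binary population: rows are individuals, columns are feature bits."""
    rng = np.random.default_rng(seed)
    # Each bit is 1 (feature selected) with probability p_one.
    P = (rng.random((n_p, n_f)) < p_one).astype(np.uint8)
    # The last individual is the 'All-feature' vector, so the search
    # starts with the full feature set as one candidate solution.
    P[-1, :] = 1
    return P
```

Seeding the population with the all-feature individual means the final best fitness can never be lower than the fitness of the full feature set, because selection only replaces an individual with a fitter one.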

Fitness Evaluation

The fitness function, fit, is modeled on the k-category MCC [36,49], expressed in Equation (3), using a one-versus-rest strategy: each of the 8 classes is taken in turn as the positive class (P) and the remaining 7 classes as the negative class (N). Each feature subset represented by an individual $p$ in $P$ is selected from the dataset, a PNN is trained on it as explained in Section 2.2.4, and fit is calculated on the testing subset.
$$
\mathrm{MCC}_k = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\tag{3}
$$
Here, TP = number of samples for which the positive class was correctly identified, TN = number of samples for which the negative class was correctly identified, FP = number of samples wrongly identified as the positive class and FN = number of samples wrongly identified as the negative class. The index $k$ denotes the class taken as positive, with 8 classes in the current problem. Hence, FP and FN represent the misclassifications, or errors, made by the classification algorithm. The mean of the MCC values calculated individually for the 8 classes is modeled as fit, as in Equation (4). fit is maximized to find the optimum combination of features, using a maximum of 200 generations.
$$
fit = \max\left(\operatorname{mean}\left(\mathrm{MCC}_k\right)\right)
\tag{4}
$$
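To make the fitness computation concrete, the sketch below evaluates the mean one-versus-rest MCC from true and predicted labels. It uses the standard MCC definition, whose numerator is TP·TN − FP·FN. The function name `mean_ovr_mcc` and the convention of scoring a class as 0 when the denominator vanishes are assumptions.

```python
import numpy as np

def mean_ovr_mcc(y_true, y_pred, classes):
    """Mean of the per-class (one-versus-rest) Matthews correlation coefficients."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in classes:
        # Counts are cast to float to avoid integer overflow in the product below.
        tp = float(np.sum((y_true == c) & (y_pred == c)))
        tn = float(np.sum((y_true != c) & (y_pred != c)))
        fp = float(np.sum((y_true != c) & (y_pred == c)))
        fn = float(np.sum((y_true == c) & (y_pred != c)))
        denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        # Assumption: an undefined MCC (zero denominator) is scored as 0.
        scores.append((tp * tn - fp * fn) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))
```

A perfect prediction gives 1 and a fully inverted two-class prediction gives −1, which is why the sign of the numerator matters when the metric is used as a fitness value.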

Crossover

Two different individuals $p_{i1}$ and $p_{i2}$ are randomly selected from $P$, where $i1$ and $i2$ are randomly generated index values between 1 and $n_p$, and a 1-point crossover is performed with crossover probability ($CR = 0.8$). The population individual $v_i$ obtained after the crossover operation is called an offspring. Similarly, an offspring vector is created corresponding to every row in $P$ to create a trial population matrix $V$.

Mutation

A bit-flip is performed with mutation probability ($MR = 0.2$) for all $v_i$'s in $V$. Hence, at this point there exist a parent population $P^{G}$ and an offspring population $V^{G+1}$ (after crossover and mutation), both of size $n_p \times n_f$.
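The crossover and mutation operators can be sketched as below. The text does not say whether the mutation probability applies per bit or per individual, so this sketch assumes a per-bit flip. The function name `make_offspring` is hypothetical, and a row that does not undergo crossover is assumed to be copied from its parent before mutation.

```python
import numpy as np

def make_offspring(P, cr=0.8, mr=0.2, rng=None):
    """Build the trial population V from P by 1-point crossover and bit-flip mutation."""
    rng = np.random.default_rng() if rng is None else rng
    n_p, n_f = P.shape
    V = P.copy()  # assumption: rows without crossover start as copies of the parent
    for i in range(n_p):
        if rng.random() < cr:
            # Two distinct parents and a cut point in 1..n_f-1.
            i1, i2 = rng.choice(n_p, size=2, replace=False)
            cut = rng.integers(1, n_f)
            V[i] = np.concatenate([P[i1, :cut], P[i2, cut:]])
    # Assumption: each bit is flipped independently with probability mr.
    flips = rng.random(V.shape) < mr
    return np.where(flips, 1 - V, V).astype(P.dtype)
```

With `cr=0` and `mr=0` the function returns an unchanged copy of `P`, and with `mr=1` every bit is inverted, which gives two quick checks of the operator logic.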

Selection

The fitness function fit for each individual in $V$ is calculated using Equation (4). Applying the current-to-best strategy, if $v_i$ shows a higher fit value than the corresponding $p_i$, then $p_i$ in $P$ is replaced with $v_i$. Otherwise, $p_i$ retains its position. This comparison and replacement process is repeated for every ($p_i$, $v_i$) pair, and an evolved version of $P$ is obtained at the end of the generation. This process evolves and accumulates better individuals until the maximum number of generations, i.e., 200, is reached. After looping through all generations, every individual in $P$ has been replaced with the best candidate found for its position, i.e., the one having the highest fit value. The individual $p_{sel}$ with the best fit in the final $P$ is selected as the optimum feature subset, with 1's representing the selected features out of the total $n_f$.

Termination

The process terminates if the maximum number of generations, 200, is reached or fit remains stagnant for 20 consecutive generations. For every new generation, a new $V$ is generated using the updated $P$. Hence, crossover and mutation occur in every generation. The default control parameters are summarized in Table 1.

2.2.4. Disease-Based Classification

Training and testing subsets composed of the optimized feature subset $p_{sel}$ obtained in the last step are now extracted from the complete training and testing subsets and can be used to classify unseen beats using PNN [50]. The PNN consists of an input layer, a pattern layer, a summation layer, and an output layer. This architecture is illustrated in Figure 4 (Step 4). The neurons of the input layer convey the input features $a = [a_1, a_2, \ldots, a_j, \ldots, a_{n_s}]^T$ directly to the neurons of the pattern layer, where $n_s$ represents the number of optimized features in $p_{sel}$ and $n_s \le n_f$.
In the pattern layer of the PNN, a Gaussian function is used to calculate the output of the neuron $a_{ki}$ as in Equation (5), using the input vector $a$ passed on from the input layer:
$$
g_{ki}(a) = \frac{1}{\sqrt{(2\pi\sigma^2)^{n_s}}} \exp\left(-\frac{\lVert a - a_{ki} \rVert^2}{2\sigma^2}\right)
\tag{5}
$$
where $a_{ki}$ is the neuron vector (the $i$th stored pattern of class $k$), $\sigma$ is the standard deviation, also called the spread, of the Gaussian function, and $n_s$ is the size/dimension of the pattern vector $a$. $\lVert a - a_{ki} \rVert$ is the Euclidean distance between $a$ and $a_{ki}$. The neurons in the summation layer calculate the maximum likelihood of the pattern vector $a$ being categorized into class $k$ by averaging the output of all neurons in the pattern layer that belong to the same class, as in Equation (6).
$$
s_k(a) = \frac{1}{\sqrt{(2\pi\sigma^2)^{n_s}}} \, \frac{1}{n_k} \sum_{i=1}^{n_k} \exp\left(-\frac{\lVert a - a_{ki} \rVert^2}{2\sigma^2}\right)
\tag{6}
$$
where $n_k$ is the total number of samples in class $k$. The neuron in the output (decision) layer applies Bayes's decision rule to determine the class membership of the pattern $a$ by Equation (7).
$$
c(a) = \arg\max_{k} \; s_k(a)
\tag{7}
$$
where $k$ runs over the classes in the training samples and $c(a)$ is the estimated class of the pattern $a$. In this paper, the output of the PNN is represented as the Lb of the eight types of ECG beats (i.e., NORM, LBBB, RBBB, PVC, APC, VESC, VFLT and PACE are labeled as ‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, and ‘8’, respectively). The detailed pseudocode for the proposed DE-based feature optimization and PNN classification strategy is given in Algorithm 1.
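The pattern, summation and decision layers described above can be sketched in a few lines of NumPy. This is an illustrative implementation rather than the authors' code: the class name `PNN` and the default spread are hypothetical, and the Gaussian normalization constant is omitted because it is identical for all classes and does not change the arg max.

```python
import numpy as np

class PNN:
    """Minimal probabilistic neural network with a single shared spread."""

    def __init__(self, sigma=0.1):   # hypothetical default spread
        self.sigma = sigma

    def fit(self, X, y):
        # "Training" only stores the patterns: one pattern neuron per sample.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, A):
        A = np.asarray(A, dtype=float)
        # Pattern layer: squared Euclidean distance to every stored pattern.
        d2 = ((A[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        g = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: average activation of the neurons of each class.
        s = np.stack([g[:, self.y == k].mean(axis=1) for k in self.classes], axis=1)
        # Decision layer: class with the largest summed activation.
        return self.classes[np.argmax(s, axis=1)]
```

The distance matrix has one entry per (test beat, training beat) pair, so for a dataset of the size used here the prediction would need to be computed in batches. Selecting a feature subset simply means passing `X[:, mask == 1]` instead of `X`.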
Algorithm 1 Pseudocode of DE-PNN algorithm for feature optimization in heartbeat classification problem
Input: dataset constructed using beat samples (DS) and associated class labels (Lb), and the following control parameters
  • population size (n_p) = 50
  • maximum number of generations (maxGen) = 200
  • crossover probability (CR) = 0.8
  • mutation rate (MR) = 0.2
  • feature size (n_f) = 253
  • current generation (gen) = 1
  • procedure DE(DS, Lb, n_p, maxGen, CR, MR)
  • // Population initialization: each bit is 1 with probability 0.8
  • for i = 1 to n_p − 1
  •     for j = 1 to n_f
  •         if rand[0,1] < 0.8 then P[i, j] = 1 else P[i, j] = 0
  •     endfor
  • endfor
  • // Set the last population individual to the all-feature vector
  • P[n_p] = [1 1 1 … 1] of size 1 x n_f
  • // For each population individual (i.e., bit string P[i]) calculate ‘fit’
  • // representing the classification performance metric of PNN as in Equation (4)
  • for i = 1 to n_p
  •     compute fit|DS(P[i])
  • endfor
  • while gen <= maxGen
  •     // Crossover: generate trial vectors
  •     for i = 1 to n_p
  •         V[i] = P[i]
  •         if rand(0,1) < CR
  •             // Select two separate random population individuals P[i1] and P[i2]
  •             i1 = rand[1, n_p], i2 = rand[1, n_p] with i2 ≠ i1
  •             ri = rand[1, n_f]
  •             V[i] = [ P[i1][1 : ri], P[i2][ri + 1 : n_f] ]
  •         endif
  •     endfor
  •     // Mutation: bit-flip in V with probability MR
  •     // Selection: keep the individual with the better ‘fit’
  •     for i = 1 to n_p
  •         if fit|DS(V[i]) > fit|DS(P[i])
  •             P[i] = V[i]
  •         endif
  •     endfor
  •     // Early termination: stop if the best ‘fit’ is stagnant for 20 consecutive generations
  •     gen = gen + 1
  • endwhile
  • endprocedure
  • // Return optimized combination of features
Output:
  • p_sel = p_best with maximum ‘fit’ value as in Equation (4)
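For readers who prefer executable code, the loop of Algorithm 1 can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: `de_select` and its `fitness` argument are hypothetical names, mutation is assumed to act per bit, and `fitness` stands in for training a PNN on the masked features and returning the mean MCC on the testing subset.

```python
import numpy as np

def de_select(fitness, n_f, n_p=50, max_gen=200, cr=0.8, mr=0.2,
              patience=20, seed=0):
    """Binary DE-style feature selection; returns (best mask, best fitness)."""
    rng = np.random.default_rng(seed)
    P = (rng.random((n_p, n_f)) < 0.8).astype(np.uint8)
    P[-1] = 1                                   # all-feature individual
    fit = np.array([fitness(p) for p in P], dtype=float)
    best, stall = fit.max(), 0
    for _ in range(max_gen):
        V = P.copy()
        for i in range(n_p):                    # 1-point crossover
            if rng.random() < cr:
                i1, i2 = rng.choice(n_p, size=2, replace=False)
                cut = rng.integers(1, n_f)
                V[i] = np.concatenate([P[i1, :cut], P[i2, cut:]])
        V ^= (rng.random(V.shape) < mr).astype(np.uint8)   # bit-flip mutation
        fit_v = np.array([fitness(v) for v in V], dtype=float)
        better = fit_v > fit                    # greedy one-to-one selection
        P[better], fit[better] = V[better], fit_v[better]
        stall = 0 if fit.max() > best else stall + 1
        best = max(best, fit.max())
        if stall >= patience:                   # stagnation-based termination
            break
    return P[np.argmax(fit)].copy(), float(fit.max())
```

In practice `fitness` would be a closure over the training and testing subsets. It should also define a score for an all-zero mask, which random mutation can occasionally produce.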

3. Performance Evaluation

Out of the 108,700 beat samples, 50% were selected as the training subset and the remaining 50% as the testing subset. Table 2 summarizes the details of the available beat samples from each class. All the available class samples in the MIT-BIH database are used in the current test to keep the arrhythmia instance ratio as close to real as possible.
The reported classification metrics are Matthews correlation coefficient (MCC), macro F1-score (Macro-F1), accuracy (Acc) and area under the curve (AUC). MCC, Macro-F1, Acc and, additionally, sensitivity (Sen) and specificity (Spe) are reported according to Equation (3) and Equations (8)–(12), with fit modeled as MCC.
All the definitions below follow a one-versus-rest strategy [51]. Each classification measure is calculated for each of the eight classes (taking one class as positive and all the rest as negative) and then averaged to represent the mean classification measure. The PNN classification was performed for the All features set (as the exact solution) and for the Optimized features subset obtained after DE. Hence, all measures are reported for both the All features and Optimized features cases to present a comparison between the classification improvement and the feature reduction achieved using the proposed method. Here, TP, TN, FP, and FN follow the same definitions as in the ‘Fitness Evaluation’ part.
$$
\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}
\tag{8}
$$
$$
\mathrm{Sen} = \frac{TP}{TP + FN}
\tag{9}
$$
$$
\mathrm{Spe} = \frac{TN}{TN + FP}
\tag{10}
$$
$$
F1 = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}
\tag{11}
$$
$$
\text{Macro-}F1 = \frac{1}{N} \sum_{c=1}^{N} F1_c
\tag{12}
$$
where $N$ is the number of classes and $F1_c$ is the F1-score of class $c$.
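The one-versus-rest measures defined above can be computed from label vectors as in the following sketch. The function name `ovr_report` is hypothetical, and the code assumes that every listed class occurs in the true labels and that at least two classes are present, so that no denominator is zero.

```python
import numpy as np

def ovr_report(y_true, y_pred, classes):
    """Class-averaged accuracy, sensitivity, specificity and macro F1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rows = []
    for c in classes:
        # One class is positive, all remaining classes are negative.
        tp = float(np.sum((y_true == c) & (y_pred == c)))
        tn = float(np.sum((y_true != c) & (y_pred != c)))
        fp = float(np.sum((y_true != c) & (y_pred == c)))
        fn = float(np.sum((y_true == c) & (y_pred != c)))
        rows.append([(tp + tn) / (tp + tn + fp + fn),   # Acc
                     tp / (tp + fn),                    # Sen
                     tn / (tn + fp),                    # Spe
                     2 * tp / (2 * tp + fp + fn)])      # F1
    acc, sen, spe, f1 = np.mean(rows, axis=0)
    return {"Acc": acc, "Sen": sen, "Spe": spe, "Macro-F1": f1}
```

Because the negative class pools seven of the eight beat types, per-class accuracy and specificity tend to be high even when a rare class is poorly detected, which is the motivation for also reporting MCC and Macro-F1.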

4. Results

Table 3 shows a comparison of the proposed DE-PNN algorithm with the selected ‘All features’ standard. The confusion matrices for both are reported in Table 4. The optimized features which result in the maximum MCC are plotted in Figure 6. The average number of generations by which the optimization is achieved was 78 ± 12 (10 trials). After an average of 78 generations, the fitness value becomes stagnant, meaning that the fitness function has achieved its maximum value and is no longer improving.
Using the DE-PNN scheme, the best and worst were the accuracy of 99.84% for VESC and 95.41% for NORM, respectively. The DE-PNN scheme could classify NORM with an accuracy of 99.45%, PVC with 99.18%, PACE with 100.00%, RBBB with 99.94%, LBBB with 99.80%, APC with 99.76%, VFLT with 99.61%, and VESC with 99.94%. These results demonstrate the abilities of the above-mentioned ECG arrhythmia classification schemes to classify the eight ECG beats effectively. The overall accuracy of the DE-PNN scheme, the DE-PNN scheme, and the DE-PNN scheme were 99.61%, 98.26%, and 99.71%, respectively, as reported in Table 5.
To analyze the efficacy of optimized features in distinguishing between simulated cardiac conditions, the receiver operating characteristic (ROC) is plotted using a one-versus-all class strategy and area under the curve (AUC) is calculated. By analogy, the higher the AUC, the better the capability of recognition of the particular class by the classification algorithm. Figure 7 shows the ROCs and AUCs of every class in the case of optimized and all features. The AUC for all arrhythmia classes except paced beat has increased with maximum AUC improvement for VFLT (10%) which is the rarest class in the currently used dataset and secondarily PVC (4%), both representing critical pathological conditions. Overall, the recognition for all classes has improved or stayed consistent with 85.77% reduction in number of features.
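As a minimal illustration of the one-versus-all AUC, the sketch below computes it directly as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auc_ovr` is hypothetical, and the scores are assumed to be per-class classifier outputs such as the PNN summation-layer values.

```python
import numpy as np

def auc_ovr(scores, y_true, positive):
    """AUC of one class against all others from per-sample scores."""
    scores = np.asarray(scores, dtype=float)
    y_true = np.asarray(y_true)
    pos = scores[y_true == positive]
    neg = scores[y_true != positive]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))
```

The pairwise comparison uses memory proportional to the number of positive samples times the number of negative samples, so a rank-based formulation would be preferable for the full dataset.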

5. Discussion

The proposed method presents an accurate and computationally efficient arrhythmia classification method using direct ECG amplitude signal features. More than 100,000 ECG heartbeats are obtained with eight types of ECG beats including one normal and seven arrhythmic beat types. Feature optimization is performed using DE by modeling the optimization input as binary vectors representing different feature combinations. An optimized feature subset is obtained, which is then used with a simple PNN classifier. The proposed method achieved an 85.77% reduction in directly acquired features with comparable classification performance. Figure 6 shows the 36 optimized features selected out of the 253 total amplitude feature points. The higher classification performance achieved could be due to a better beat definition (250 ms before and 450 ms after the R-peak positions) as compared to [52], which arbitrarily used 200 samples around the R-peak. Our definition ensures the inclusion of the important physiological characteristics necessary to distinguish between the currently classified arrhythmia types, which are mostly ventricular types. Furthermore, at the algorithm design level, adding an all-feature combination to the solution space pushes the optimization process to find a solution better than the All features scenario.
Moreover, we compared the classification performance of the proposed DE-PNN scheme for ECG arrhythmia classification with those of other schemes simultaneously utilizing different feature reduction methods and neural classifiers presented in the literature as summarized in Table 6. Jun et al. [53] used the same direct ECG amplitude features as used in this work and presented a comparison between 2D-CNN, AlexNet, and VGGNet models. All three of the models were deployed using TensorFlow [54] which is a deep learning Python library proposed by Google especially for GPGPUs and yet used two Intel Xeon E5 CPUs and two NVIDIA K20m GPUs to reduce the learning time. All tested classifiers had complex architectures implying extremely high computational cost with no feature optimization/reduction function which is not suitable for continuous monitoring using wearable sensing modality. Yildrim et al. [19], Tuncer et al. [20], and Elhaj et al. [10] used wavelet features with multiple different combination of features to perform arrhythmia classification adding feature computation layer in the processing algorithms performing optimizations focused on classifier parameters rather than feature engineering.
DE-PNN aimed at searching for the optimum feature combination that provides maximum recognition capability for arrhythmic heartbeats removing redundant and selecting highly discriminating features. Overall, the achieved ECG arrhythmia classification result indicates that the detection of arrhythmia using 14.23% (85.77% reduced) features of a complete ECG heartbeat can be an effective approach to help general physicians and cardiology specialists to diagnose critical cardiovascular diseases in continuous and long-term, online or offline monitoring scenarios particularly well-suited for a wearable sensing setting. For future work, the current algorithm may be extended to recognize 16 classes (1 normal and 15 arrhythmic) for which the annotations are available with the MIT-BIH dataset. A future DE optimization might focus on a multi-objective approach to maximize arrhythmia recognition whilst minimizing percentage signal distortion (accuracy and compression being the two objective functions) to make the ECG signal reproducible for clinical analysis.

Author Contributions

Conceptualization, A.N. and Y.S.K.; methodology, A.N.; software, A.N.; formal analysis, A.N.; resources, Y.S.K.; writing—original draft preparation, A.N.; writing, review and editing, A.N. and Y.S.K.; supervision, Y.S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study belong to the MIT-BIH Arrhythmia Database and are openly available at PhysioNet (https://physionet.org/content/mitdb/1.0.0/, accessed on 15 December 2020).

Acknowledgments

This work was partially supported by the Post-Doc. Scholarship Program of KOREATECH.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Baig, M.M.; Gholamhosseini, H.; Connolly, M.J. A comprehensive survey of wearable and wireless ECG monitoring systems for older adults. Med. Biol. Eng. Comput. 2013, 51, 485–495.
  2. Davenport, C.; Cheng, E.Y.L.; Kwok, Y.T.T.; Lai, A.H.O.; Wakabayashi, T.; Hyde, C.; Connock, M. Assessing the diagnostic test accuracy of natriuretic peptides and ECG in the diagnosis of left ventricular systolic dysfunction: A systematic review and meta-analysis. Br. J. Gen. Pract. 2006, 56, 48–56.
  3. Pollonini, L.; Rajan, N.O.; Xu, S.; Madala, S.; Dacso, C.C. A novel handheld device for use in remote patient monitoring of heart failure patients—Design and preliminary validation on healthy subjects. J. Med. Syst. 2012, 36, 653–659.
  4. López, G.; Custodio, V.; Moreno, J.I. LOBIN: E-textile and wireless-sensor-network-based platform for healthcare monitoring in future hospital environments. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 1446–1458.
  5. Yoo, J.; Yan, L.; Lee, S.; Kim, H.; Yoo, H.J. A wearable ECG acquisition system with compact planar-fashionable circuit board-based shirt. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 897–902.
  6. Sahoo, S.; Dash, M.; Behera, S.; Sabut, S. Machine learning approach to detect cardiac arrhythmias in ECG signals: A survey. IRBM 2020, 41, 185–194.
  7. Sree, V.; Mapes, J.; Dua, S.; Lih, O.S.; Koh, J.E.; Ciaccio, E.J.; Acharya, U.R. A novel machine learning framework for automated detection of arrhythmias in ECG segments. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 10145–10162.
  8. Fujita, H.; Cimr, D. Decision support system for arrhythmia prediction using convolutional neural network structure without preprocessing. Appl. Intell. 2019, 49, 3383–3391.
  9. Sayantan, G.; Kien, P.; Kadambari, K. Classification of ECG beats using deep belief network and active learning. Med. Biol. Eng. Comput. 2018, 56, 1887–1898.
  10. Elhaj, F.A.; Salim, N.; Harris, A.R.; Swee, T.T.; Ahmed, T. Arrhythmia recognition and classification using combined linear and nonlinear features of ECG signals. Comput. Methods Programs Biomed. 2016, 127, 52–63.
  11. Afkhami, R.G.; Azarnia, G.; Tinati, M.A. Cardiac arrhythmia classification using statistical and mixture modeling features of ECG signals. Pattern Recognit. Lett. 2016, 70, 45–51.
  12. Liu, T.; Si, Y.; Wen, D.; Zang, M.; Lang, L. Dictionary learning for VQ feature extraction in ECG beats classification. Expert Syst. Appl. 2016, 53, 129–137.
  13. Shen, C.P.; Kao, W.C.; Yang, Y.Y.; Hsu, M.C.; Wu, Y.T.; Lai, F. Detection of cardiac arrhythmia in electrocardiograms using adaptive feature extraction and modified support vector machines. Expert Syst. Appl. 2012, 39, 7845–7852.
  14. Qin, Q.; Li, J.; Zhang, L.; Yue, Y.; Liu, C. Combining low-dimensional wavelet features and support vector machine for arrhythmia beat classification. Sci. Rep. 2017, 7, 6067.
  15. Zhai, X.; Tin, C. Automated ECG classification using dual heartbeat coupling based on convolutional neural network. IEEE Access 2018, 6, 27465–27472.
  16. Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M.; Gertych, A.; San Tan, R. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2017, 89, 389–396.
  17. Oh, S.L.; Ng, E.Y.; San Tan, R.; Acharya, U.R. Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats. Comput. Biol. Med. 2018, 102, 278–287.
  18. Zhang, Y.; Zhang, Y.; Lo, B.; Xu, W. Wearable ECG signal processing for automated cardiac arrhythmia classification using CFASE-based feature selection. Expert Syst. 2019, e12432.
  19. Yildirim, O.; Baloglu, U.B.; Tan, R.S.; Ciaccio, E.J.; Acharya, U.R. A new approach for arrhythmia classification using deep coded features and LSTM networks. Comput. Methods Programs Biomed. 2019, 176, 121–133.
  20. Tuncer, T.; Dogan, S.; Pławiak, P.; Acharya, U.R. Automated arrhythmia detection using novel hexadecimal local pattern and multilevel wavelet transform with ECG signals. Knowl.-Based Syst. 2019, 186, 104923.
  21. Wang, J.S.; Chiang, W.C.; Hsu, Y.L.; Yang, Y.T.C. ECG arrhythmia classification using a probabilistic neural network with a feature reduction method. Neurocomputing 2013, 116, 38–45.
  22. Alonso-Atienza, F.; Morgado, E.; Fernandez-Martinez, L.; García-Alberola, A.; Rojo-Alvarez, J.L. Detection of life-threatening arrhythmias using feature selection and support vector machines. IEEE Trans. Biomed. Eng. 2013, 61, 832–840.
  23. Chen, Y.H.; Yu, S.N. Selection of effective features for ECG beat recognition based on nonlinear correlations. Artif. Intell. Med. 2012, 54, 43–52.
  24. Asl, B.M.; Setarehdan, S.K.; Mohebbi, M. Support vector machine-based arrhythmia classification using reduced features of heart rate variability signal. Artif. Intell. Med. 2008, 44, 51–64.
  25. Haseena, H.H.; Mathew, A.T.; Paul, J.K. Fuzzy clustered probabilistic and multi layered feed forward neural networks for electrocardiogram arrhythmia classification. J. Med. Syst. 2011, 35, 179–188.
  26. Ceylan, R.; Özbay, Y. Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network. Expert Syst. Appl. 2007, 33, 286–295.
  27. Polat, K.; Güneş, S. Detection of ECG Arrhythmia using a differential expert system approach based on principal component analysis and least square support vector machine. Appl. Math. Comput. 2007, 186, 898–906.
  28. Pławiak, P. Novel methodology of cardiac health recognition based on ECG signals and evolutionary-neural system. Expert Syst. Appl. 2018, 92, 334–349.
  29. Yildirim, Ö.; Baloglu, U.B. Heartbeat type classification with optimized feature vectors. Int. J. Optim. Control. Theor. Appl. (IJOCTA) 2018, 8, 170–175.
  30. Houssein, E.H.; Ewees, A.A.; ElAziz, M.A. Improving twin support vector machine based on hybrid swarm optimizer for heartbeat classification. Pattern Recognit. Image Anal. 2018, 28, 243–253.
  31. Li, H.; Yuan, D.; Ma, X.; Cui, D.; Cao, L. Genetic algorithm for the optimization of features and neural networks in ECG signals classification. Sci. Rep. 2017, 7, 41011.
  32. Daskalaki, S.; Kopanas, I.; Avouris, N. Evaluation of classifiers for an uneven class distribution problem. Appl. Artif. Intell. 2006, 20, 381–417.
  33. Sun, K.W.; Lee, C.H. Addressing class-imbalance in multi-label learning via two-stage multi-label hypernetwork. Neurocomputing 2017, 266, 375–389.
  34. Maalouf, M.; Siddiqi, M. Weighted logistic regression for large-scale imbalanced and rare events data. Knowl.-Based Syst. 2014, 59, 142–148.
  35. Yu, H.; Sun, C.; Yang, X.; Yang, W.; Shen, J.; Qi, Y. ODOC-ELM: Optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data. Knowl.-Based Syst. 2016, 92, 55–70.
  36. Matthews, B.W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta (BBA)-Protein Struct. 1975, 405, 442–451.
  37. Shi, L.; Campbell, G.; Jones, W.D.; Campagne, F.; Wen, Z.; Walker, S.J.; Su, Z.; Chu, T.M.; Goodsaid, F.M.; Pusztai, L.; et al. The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat. Biotechnol. 2010, 28, 827.
  38. Storn, R.; Price, K. Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359.
  39. Zaman, M.A.; Chowdhury, S. Modified Bézier curves with shape-preserving characteristics using Differential Evolution optimization algorithm. Adv. Numer. Anal. 2013, 2013, 858279.
  40. Liu, X.F.; Zhan, Z.H.; Zhang, J. Resource-aware distributed differential evolution for training expensive neural-network-based controller in power electronic circuit. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–11.
  41. Saporetti, C.M.; Goliatt, L.; Pereira, E. Neural network boosted with differential evolution for lithology identification based on well logs information. Earth Sci. Inform. 2021, 14, 133–140.
  42. Sikder, U.; Zaman, M.A. Optimization of multilayer antireflection coating for photovoltaic applications. Opt. Laser Technol. 2016, 79, 88–94.
  43. Moody, G.B.; Mark, R.G. The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 2001, 20, 45–50.
  44. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220.
  45. Zhu, H.; Pan, Y.; Wu, F.; Huan, R. Optimized Electrode Locations for Wearable Single-Lead ECG Monitoring Devices: A Case Study Using WFEES Modules Based on the LANS Method. Sensors 2019, 19, 4458.
  46. Marinucci, D.; Sbrollini, A.; Marcantoni, I.; Morettini, M.; Swenne, C.A.; Burattini, L. Artificial Neural Network for Atrial Fibrillation Identification in Portable Devices. Sensors 2020, 20, 3570.
  47. Neri, F.; Tirronen, V. Recent advances in differential evolution: A survey and experimental analysis. Artif. Intell. Rev. 2010, 33, 61–106.
  48. Yang, M.; Cai, Z.; Li, C.; Guan, J. An improved adaptive differential evolution algorithm with population adaptation. In Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, Amsterdam, The Netherlands, 6–10 July 2013; pp. 145–152.
  49. Gorodkin, J. Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 2004, 28, 367–374.
  50. Specht, D.F. Probabilistic neural networks. Neural Netw. 1990, 3, 109–118.
  51. Xu, J. An extended one-versus-rest support vector machine for multi-label classification. Neurocomputing 2011, 74, 3114–3124.
  52. Wang, T.; Lu, C.; Sun, Y.; Yang, M.; Liu, C.; Ou, C. Automatic ECG classification using continuous wavelet transform and convolutional neural network. Entropy 2021, 23, 119.
  53. Jun, T.J.; Nguyen, H.M.; Kang, D.; Kim, D.; Kim, D.; Kim, Y.H. ECG arrhythmia classification using a 2-D convolutional neural network. arXiv 2018, arXiv:1804.06812.
  54. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: https://arxiv.org/abs/1603.04467 (accessed on 14 January 2022).
  55. Chen, S.; Hua, W.; Li, Z.; Li, J.; Gao, X. Heartbeat classification using projected and dynamic features of ECG signal. Biomed. Signal Process. Control. 2017, 31, 165–173.
  56. Garcia, G.; Moreira, G.; Menotti, D.; Luz, E. Inter-patient ECG heartbeat classification with temporal VCG optimized by PSO. Sci. Rep. 2017, 7, 10543.
Figure 1. PQRST wave and primary fiducial markers for ECG heartbeat.
Figure 2. Methodology flowchart.
Figure 3. Sample beats for eight ECG beat classes: (a) NORM, (b) LBBB, (c) RBBB, (d) PVC, (e) PAC, (f) VESC, (g) VFLT and (h) PACE.
Figure 4. Detailed methodology: Differential Evolution-based feature optimization with Probabilistic Neural Network for imbalanced arrhythmia classification.
Figure 5. Cardiac cycle identification.
Figure 6. Selected feature scan after DE with all beat classes (a), feature points on 1 representative beat (b).
Figure 7. ROC curves of 8 classes for Optimized feature subset (left panel) and All feature set (right panel).
Table 1. DE control parameters summary.

| Parameter | Value |
|---|---|
| Population size | 50 |
| Population type | Binary bits |
| Crossover | 1-point crossover |
| Mutation | Uniform |
| Selection scheme | Current-to-best |
| Population individual length | 253 |
| Maximum number of generations | 200 |
| Crossover probability | 0.8 |
| Mutation probability | 0.2 |

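
One possible reading of the control parameters in Table 1 is sketched below as a binary evolutionary loop. This is an illustration under stated assumptions, not the authors' code: it assumes NumPy, interprets "current-to-best" loosely as recombining each individual with the current best one, treats the mutation probability as the chance that an offspring is mutated at all, uses a hypothetical per-bit flip rate of 1/253, and keeps an offspring only when its fitness is at least that of its parent. The names `one_point_crossover`, `uniform_mutation`, and `evolve` are hypothetical.

```python
import numpy as np

# Values taken from Table 1.
POP_SIZE = 50
N_BITS = 253            # one bit per amplitude feature
MAX_GENERATIONS = 200
P_CROSSOVER = 0.8
P_MUTATION = 0.2


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two bit strings at a random interior cut point."""
    point = rng.integers(1, len(parent_a))
    child_a = np.concatenate([parent_a[:point], parent_b[point:]])
    child_b = np.concatenate([parent_b[:point], parent_a[point:]])
    return child_a, child_b


def uniform_mutation(individual, rng, p_bit):
    """Flip each bit independently with probability p_bit."""
    flips = rng.random(len(individual)) < p_bit
    return np.where(flips, 1 - individual, individual)


def evolve(fitness_fn, rng, pop_size=POP_SIZE, n_bits=N_BITS,
           generations=MAX_GENERATIONS):
    """Evolve binary feature masks; returns the best mask and fitness history."""
    population = rng.integers(0, 2, size=(pop_size, n_bits))
    scores = np.array([fitness_fn(ind) for ind in population])
    history = [scores.max()]
    for _ in range(generations):
        best = population[scores.argmax()].copy()
        for i in range(pop_size):
            child = population[i].copy()
            if rng.random() < P_CROSSOVER:
                # Assumed reading of "current-to-best": mix with the best mask.
                child, _ = one_point_crossover(child, best, rng)
            if rng.random() < P_MUTATION:
                # The 1/n_bits flip rate is an assumption, not from Table 1.
                child = uniform_mutation(child, rng, p_bit=1.0 / n_bits)
            child_score = fitness_fn(child)
            if child_score >= scores[i]:  # greedy one-to-one replacement
                population[i], scores[i] = child, child_score
        history.append(scores.max())
    return population[scores.argmax()], history
```

In practice `fitness_fn` would wrap a classifier evaluation such as the MCC of a PNN restricted to the selected features; any callable that maps a 0/1 array to a number works here. With the greedy replacement rule, the best fitness in the population cannot decrease from one generation to the next.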
Table 2. MIT-BIH data selection details.

| Beat Class | Training | Testing | Total |
|---|---|---|---|
| NORM | 36,907 | 36,907 | 73,814 |
| LBBB | 4031 | 4031 | 8062 |
| RBBB | 4533 | 4533 | 9066 |
| PVC | 3363 | 3363 | 6726 |
| PAC | 1270 | 1271 | 2541 |
| VESC | 53 | 53 | 106 |
| VFLT | 236 | 236 | 472 |
| PACE | 3506 | 3507 | 7013 |
| Total | 53,899 | 53,901 | 107,800 |

Table 3. Classification test result.

| Features | NumFeat | MCC | Acc (%) | Macro-F1 (%) | AUC |
|---|---|---|---|---|---|
| All | 253 | 0.1248 | 99.05 | 92.44 | 0.8242 |
| Optimized | 36 | 0.1250 | 99.33 | 94.56 | 0.8370 |
| Difference | −217 | +0.0002 | +0.28 | +2.12 | +0.0128 |

Table 4. Confusion matrices for testing subset with Optimized and All features with fit = MCC for 1 normal and 7 arrhythmia classes.
Optimized Features
T/PNORMLBBBRBBBPVCPACVESCVFLTPACE
NORM74674762140110
LBBB77780112200
RBBB8001141213130
PVC501152409105110
PAC1050154657000
VESC200005100
VFLT1511122131470
PACE0000000800
All Features
T/PNORMLBBBRBBBPVCPACVESCVFLTPACE
NORM7481371649000
LBBB7785042200
RBBB870114622120
PVC1099423645631
PAC790101691000
VESC200005100
VFLT2311219341371
PACE1000000799
Table 5. Classification results for testing subset with Optimized and All features with fit = MCC for 1 normal and 7 arrhythmia classes.

Optimized Features

| Class | Acc (%) | Sen (%) | Spe (%) | F1 (%) |
|---|---|---|---|---|
| NORM | 96.73 | 97.24 | 96.37 | 96.10 |
| LBBB | 99.65 | 99.06 | 99.73 | 98.60 |
| RBBB | 98.50 | 96.13 | 98.84 | 94.12 |
| PVC | 98.93 | 94.13 | 99.62 | 95.66 |
| PAC | 98.12 | 84.00 | 99.40 | 88.14 |
| VESC | 99.90 | 96.22 | 99.93 | 94.44 |
| VFLT | 99.28 | 84.78 | 99.86 | 90.06 |
| PACE | 99.90 | 99.20 | 99.96 | 99.39 |
| Average | 99.33 | 93.84 | 99.21 | 94.56 |

All Features

| Class | Acc (%) | Sen (%) | Spe (%) | F1 (%) |
|---|---|---|---|---|
| NORM | 94.46 | 97.52 | 92.30 | 93.58 |
| LBBB | 99.73 | 98.93 | 99.84 | 98.93 |
| RBBB | 98.44 | 93.46 | 99.14 | 93.71 |
| PVC | 97.97 | 86.26 | 99.64 | 91.38 |
| PAC | 98.34 | 86.40 | 99.42 | 89.62 |
| VESC | 99.91 | 96.22 | 99.95 | 95.32 |
| VFLT | 98.62 | 64.78 | 99.96 | 78.21 |
| PACE | 99.80 | 98.00 | 99.96 | 98.79 |
| Average | 99.05 | 90.20 | 98.77 | 92.44 |

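
Per-class accuracy, sensitivity, specificity, and F1 of the kind reported in Table 5 are conventionally derived from a confusion matrix by treating each class one-vs-rest. The sketch below shows that standard computation; it assumes NumPy and rows as true classes and columns as predicted classes (the T/P layout of Table 4). The function name `per_class_metrics` and the 3-class matrix in the usage note are hypothetical illustrations, not data from the paper.

```python
import numpy as np


def per_class_metrics(confusion):
    """Return an array of [Acc, Sen, Spe, F1] in percent, one row per class.

    Rows of `confusion` are true classes and columns are predicted classes.
    Classes with no true or no predicted samples are not handled here.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp   # true samples of the class that were missed
    fp = cm.sum(axis=0) - tp   # other classes predicted as this class
    tn = total - tp - fn - fp
    acc = (tp + tn) / total
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return 100 * np.stack([acc, sen, spe, f1], axis=1)


def macro_average(metrics):
    """Unweighted mean over classes, so every class counts equally."""
    return np.asarray(metrics).mean(axis=0)
```

For example, `per_class_metrics([[8, 1, 1], [2, 7, 1], [0, 1, 9]])` gives a sensitivity of 80% and a specificity of 90% for the first class. Macro averaging weights a rare class such as VESC the same as NORM, which is why it is informative for imbalanced data.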
Table 6. Comparison of the proposed DE-PNN scheme with latest literature.

| Research | Feature Type | No. of Classes | Feature Selection | Classification | Accuracy (%) |
|---|---|---|---|---|---|
| DE-PNN | Morphology | 8 | DE | PNN | 99.33 |
| [53] | Morphology | 8 | None | CNN | 98.90 |
| [53] | Morphology | 8 | None | AlexNet | 98.80 |
| [53] | Morphology | 8 | None | VGGNet | 98.70 |
| [19] | Morphology | 5 | Convolutional AE | LSTM | 99.00 |
| [29] | Wavelet | 5 | PSO | LS-SVM, RF | 98.95 |
| [10] | HOS+Wavelet | 5 | ICA+PCA | SVM+NN | 98.91 |
| [28] | PSD+DFT | 17 | GA | SVM, kNN, PNN, and RBFNN | 98.85 |
| [55] | DCT+weighted inter-beat | 5, 15 | None | SVM | 98.46 |
| [20] | Multilevel wavelet | 17 | NCA | 1-NN | 95.00 |
| [12] | k-medoids vector quantization | 4 | None | Parallel regression NN | 95.00 |
| [16] | Morphology | 5 | None | 9-layer Deep CNN | 94.03 |
| [56] | Temporal vectorcardiogram | 3 | PSO | SVM | 92.40 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Nasim, A.; Kim, Y.S. DE-PNN: Differential Evolution-Based Feature Optimization with Probabilistic Neural Network for Imbalanced Arrhythmia Classification. Sensors 2022, 22, 4450. https://doi.org/10.3390/s22124450
