
Proposing feature engineering method based on deep learning and K-NNs for ECG beat classification and arrhythmia detection


Abstract

An arrhythmia is a slow, fast, or irregular heartbeat. Manual ECG assessment and disease classification are error-prone tasks because of the vast differences in ECG morphology and the difficulty of accurately identifying ECG components. Moreover, a computer-aided diagnosis system for heartbeat classification can be useful when access to medical care centers is difficult or impossible. Therefore, the main aim of this study is to classify ECG beats for arrhythmia detection (four beat classes are considered). Previous studies have proposed different methods based on traditional machine learning and/or deep learning. In this paper, a novel feature engineering method is proposed based on deep learning and K-NNs. The features extracted by the proposed method are classified with different classifiers such as decision trees, SVMs with different kernels, and random forests. The proposed method has reasonably good performance for beat classification, achieving an average accuracy of 99.77%, AUC of 99.99%, precision of 99.75%, and recall of 99.30% using a fivefold cross-validation strategy. The main advantages of the proposed method are its low computational time compared to training deep learning models from scratch and its high accuracy compared to traditional machine learning models. The strength and suitability of the proposed method for feature extraction are shown by the good balance between sensitivity and specificity.


References

  1. Acharya UR, Oh SL, Hagiwara Y, Tan JH, Adam M, Gertych A, San TR (2017) A deep convolutional neural network model to classify heartbeats. Comput Biol Med 89:389–396


  2. Liu W, Huang Q, Chang S, Wang H, He J (2018) Multiple-feature-branch convolutional neural network for myocardial infarction diagnosis using electrocardiogram. Biomed Signal Process Control 45:22–32


  3. Sannino G, De Pietro G (2018) A deep learning approach for ECG-based heartbeat classification for arrhythmia detection. Future Gener Comput Syst 86:446–455


  4. Mondejar-Guerra V, Novo J, Rouco J, Penedo MG, Ortega M (2019) Heartbeat classification fusing temporal and morphological information of ECG via ensemble of classifiers. Biomed Signal Process Control 47:41–48


  5. Mathews SM, Kambhamettu C, Barner KE (2018) A novel application of deep learning for single-lead ECG classification. Comput Biol Med 99:53–62


  6. Huanhuan M, Yue Z (2014) Classification of electrocardiogram signals with deep belief networks. In: IEEE 17th International Conference on Computational Science and Engineering (CSE)

  7. Li W, Li J (2017) Local deep field for electrocardiogram beat classification. IEEE Sens J 18:1656–1664


  8. Wang D, Shang Y (2013) Modeling physiological data with deep belief networks. Int J Inf Educ Technol (IJIET) 3:505


  9. Park J, Kang K (2013) PcHD: personalized classification of heartbeat types using a decision tree. Comput Biol Med 54:79–88


  10. Ebrahimzadeh A, Shakiba B, Khazaee A (2014) Detection of electrocardiogram signals using an efficient method. Appl Soft Comput 22:108–117


  11. Qin Q, Li J, Zhang L, Yue Y, Liu C (2017) Combining low-dimensional wavelet features and support vector machine for arrhythmia beat classification. Sci Rep 7:6067–6078


  12. Alqudah AM, Albadarneh A, Abu-Qasmieh I, Alquran H (2019) Developing of robust and high accurate ECG beat classification by combining Gaussian mixtures and wavelets features. Australas Phys Eng Sci Med 42(1):149–157


  13. Tang X, Ma Z, Hu Q, Tang W (2019) A real-time arrhythmia heartbeats classification algorithm using parallel delta modulations and rotated linear-kernel support vector machines. IEEE Trans Biomed Eng. https://doi.org/10.1109/TBME.2019.2926104


  14. Oh SL, Ng EYK, Tan RS, Acharya UR (2018) Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats. Comput Biol Med 102:278–287


  15. Xu SS, Mak MW, Cheung CC (2018) Towards end-to-end ECG classification with raw signal extraction and deep neural networks. IEEE J Biomed Health Inform 3:505


  16. Wang G, Zhang C, Liu Y, Yang H, Fu D, Wang H, Zhang P (2018) A global and updatable ECG beat classification based on recurrent neural networks and active learning. Inf Sci 501:523–542


  17. Yildirim O, Baloglu UB, Tan RS, Ciaccio EJ, Acharya UR (2019) A new approach for arrhythmia classification using deep coded features and LSTM networks. Comput Methods Programs Biomed 176:121–133


  18. Qian Y, Bi M, Tan T, Yu K (2016) Very deep convolutional neural networks for noise robust speech recognition. IEEE/ACM Trans Audio Speech Lang Process 24:2263–2276


  19. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25 (NIPS 2012)

  20. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556

  21. Orlando JI, Prokofyeva E, del Fresno M, Blaschko MB (2018) An ensemble deep learning based approach for red lesion detection in fundus images. Comput Methods Programs Biomed 153:115–127


  22. Abidin AZ, Deng B, Dsouza AM, Nagarajan MB, Coan P, Wismuller A (2018) Deep transfer learning for characterizing chondrocyte patterns in phase contrast X-ray computed tomography images of the human. Comput Biol Med 95:24–33


  23. Vogado LHS, Veras RMS, Araujo FHD, Silva RRV, Aires KRT (2018) Leukemia diagnosis in blood slides using transfer learning in CNNs and SVM for classification. Eng Appl Artif Intell 72:415–422


  24. Zuiderveld K (1994) Contrast limited adaptive histogram equalization. In: Graphics gems IV. Academic Press, pp 474–485


  25. Ciocca G, Napoletano P, Schettini R (2018) CNN-based features for retrieval and classification of food images. Comput Vis Image Underst 176–177:70–77


  26. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252


  27. McAllister P, Zheng H, Bond R, Moorhead A (2018) Combining deep residual neural network features with supervised machine learning algorithms to classify diverse food image datasets. Comput Biol Med 95:217–233


  28. Mazo C, Bernal J, Trujillo M, Alegre E (2018) Transfer learning for classification of cardiovascular tissues in histological images. Comput Methods Programs Biomed 165:69–76


  29. Rawat W, Wang Z (2017) Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput 29:2352–2449


  30. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition

  31. Han J, Kamber M, Pei J (2012) Data mining: concepts and techniques. Morgan Kaufmann, Burlington


  32. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106


  33. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297


  34. Breiman L (2001) Random forests. Mach Learn 45:5–32


  35. Chen S, Hua W, Li Z, Gao X (2017) Heartbeat classification using projected and dynamic features of ECG signal. Biomed Signal Process Control 31:165–173



Funding

No funding was received for this study.

Author information


Contributions

KT: Conceptualization; KT and RNN: Data curation; KT and RNN: Formal analysis; KT and RNN: Investigation; KT: Methodology; KT: Project administration; KT and RNN: Software; KT: Supervision; KT and RNN: Validation; KT and RNN: Visualization; KT and RNN: Writing – original draft; KT: Writing – review & editing.

Corresponding author

Correspondence to Toktam Khatibi.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: more details about the classifiers used in this study

In this section, we review the classifiers used in this paper: decision trees, support vector machines (SVMs), and random forests (RFs).

Decision trees

In the 1970s and 1980s, the decision tree algorithm known as ID3 was proposed by Quinlan [32]. Later, other decision tree algorithms such as CART, C4.5 (later C5.0), and J48 were developed.

Decision trees are built in a top-down, divide-and-conquer manner: the training set is recursively partitioned to form the branches of the tree.

A decision tree consists of leaf (terminal) nodes and internal (parent) nodes. Each internal node splits the data on a feature; the split chosen is the one that reduces the impurity most. The most common impurity measures are the Gini index, entropy (on which information gain is based), and misclassification error, as formulated in Eqs. (A1)–(A3) [31]:

$$ Gini\,Index\left( D \right) = 1 - \sum_{j = 1}^{C} p_{j|D}^{2} $$
(A1)
$$ Entropy\left( D \right) = - \sum_{j = 1}^{C} p_{j|D} \log_{2} p_{j|D} $$
(A2)
$$ Misclassification\,Error\left( D \right) = 1 - \max_{1 \le j \le C} p_{j|D} $$
(A3)

where $D$ is a node of the decision tree, $C$ is the number of classes, and $p_{j|D}$ is the probability that an arbitrary data record in $D$ belongs to class $C_j$. Lower impurity values denote better splits.
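
As an illustration, here is a minimal NumPy sketch of the three impurity measures in Eqs. (A1)–(A3), computed from the per-class record counts at a node (the function and variable names are ours, for illustration only):

```python
import numpy as np

def impurities(class_counts):
    """Gini index, entropy, and misclassification error for a node D,
    given the number of records of each class at that node."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()                       # p_{j|D} for each class j
    p = p[p > 0]                          # drop empty classes (0*log2(0) := 0)
    gini = 1.0 - np.sum(p ** 2)           # Eq. (A1)
    entropy = -np.sum(p * np.log2(p))     # Eq. (A2)
    error = 1.0 - p.max()                 # Eq. (A3)
    return gini, entropy, error

# Balanced root node of Fig. 13 (8327 records per class): entropy is 1
print(impurities([8327, 8327]))           # -> (0.5, 1.0, 0.5)
```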

Each leaf node is labeled with the majority class of its data. Classification starts at the root node; at each internal node, the branch whose condition the data record satisfies is followed. This process continues until the record reaches a leaf node, whose class label is then assigned to the record.

Decision trees may be pruned to avoid overfitting [31].

Decision trees are easily interpretable because of their tree-like structure. Association rules can be extracted by traversing the branches of the tree; the number of rules extracted from a decision tree equals the number of its root-to-leaf paths.
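
As a sketch of this idea, scikit-learn's export_text prints a fitted tree so that each root-to-leaf path can be read off as one rule; the synthetic data and the feature names below are placeholders, not the paper's FS-CNN features:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))               # placeholder feature matrix
y = (X[:, 0] + X[:, 2] > 0).astype(int)     # placeholder "N"/"not N" labels

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3).fit(X, y)

# Each root-to-leaf path in the printout corresponds to one association rule
print(export_text(tree, feature_names=["X9", "X11", "X16", "X34"]))
```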

The decision tree induced from a random sample of our training data based on FS-CNN is displayed in Fig. 13. For simplicity, only two classes are considered for this sample tree: “N” (normal beat) and “not N” (abnormal beat).

Fig. 13 A decision tree induced from our training dataset based on FS-CNN

The decision tree illustrated in Fig. 13 consists of 15 internal nodes and 16 leaf nodes and has a depth of 5 levels. The most important feature, appearing at the root node, is X9; the second and third most important features are X11 and X34, respectively.

As shown in Fig. 13, the root node contains 8327 training records of each class, so its entropy is 1, indicating maximal impurity. The left child of the root, which is split on X11, has an entropy of 0.508 (much lower than the root) and contains 944 and 7681 records of the “N” and “not N” classes, respectively.

The first leaf node from the left side of the figure contains 176 samples, of which 136 belong to the “N” class and 40 to the “not N” class. Therefore, the leaf is labeled “N”, and every test record that reaches this node is assigned the “N” class label.

Node colors correspond to the majority class of the records they contain: blue nodes are dominated by the “N” class and orange nodes by the “not N” class. The color saturation of each node indicates its impurity; more saturated nodes have lower impurity.

A sample association rule, extracted by following the path from the root node to the first left leaf node, is:

“If X9 ≤ 0.559 AND X11 ≤ 0.403 AND X16 ≤ 0.431 AND X34 ≤ 0.456 THEN class label = N”.

Random forests (RF)

An RF is an ensemble classifier of T independent decision trees. Each tree is trained on a bootstrap sample of the RF training set. At each internal node, the best split, according to the impurity measures, is selected from a random subset of the features. The depth of each tree is limited to a user-defined maximal depth to avoid overfitting.

The user-defined RF hyperparameters are therefore the number of trees and the maximal tree depth.

In this paper, an RF with 100 decision trees and a maximum depth of 3 is applied to our data.
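
A minimal scikit-learn sketch of this configuration follows; the synthetic data merely stands in for the extracted beat features, which are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the extracted ECG beat features
X_train, y_train = make_classification(n_samples=1000, n_features=40,
                                       n_informative=10, n_classes=4,
                                       random_state=0)

# The RF configuration stated above: 100 trees, maximal depth 3
rf = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
rf.fit(X_train, y_train)
print(rf.score(X_train, y_train))   # resubstitution accuracy of the ensemble
```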

Support vector machines (SVM)

An SVM is a classifier that seeks the optimal hyperplane separating the data records of different classes. The original version of the SVM is appropriate for linearly separable data. Nonlinear kernels map the data into a space in which it becomes linearly separable; therefore, an SVM with a nonlinear kernel can classify nonlinearly separable data as well.

In a linear SVM, the optimal hyperplane separating the classes is formulated as Eq. (A4):

$$ \sum_{i = 1}^{m} w_{i} \cdot FP_{i} + \sum_{i = 1}^{m} b_{i} = 0 $$
(A4)

where $m$ is the number of features, $w_i$ is the weight of the $i$-th feature $FP_i$ in the hyperplane equation, and $b_i$ is the $i$-th element of the bias vector.

Each data record is classified by the linear SVM according to Eq. (A5):

$$ y = \begin{cases} +1 & \text{if } \sum_{i = 1}^{m} w_{i} \cdot FP_{i} + \sum_{i = 1}^{m} b_{i} \ge 0 \\ -1 & \text{if } \sum_{i = 1}^{m} w_{i} \cdot FP_{i} + \sum_{i = 1}^{m} b_{i} < 0 \end{cases} $$
(A5)
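
For concreteness, a direct NumPy rendering of the decision rule in Eq. (A5), keeping the summed-bias form used above (the function name and the example values are ours):

```python
import numpy as np

def linear_svm_predict(FP, w, b):
    """Classify one record by the sign of the hyperplane expression in Eq. (A5)."""
    return 1 if np.dot(w, FP) + np.sum(b) >= 0 else -1

# Example: w.FP + sum(b) = (0.5*0.2 + 0.5*(-1.0)) + 0.1 = -0.3 < 0 -> class -1
print(linear_svm_predict(FP=np.array([0.2, -1.0]),
                         w=np.array([0.5, 0.5]),
                         b=np.array([0.1, 0.0])))
```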

In this paper, linear (the default SVM), polynomial, sigmoid, and radial basis function (RBF) kernels are used for the SVM, with their parameters tuned to obtain the best accuracy.
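
One plausible way to compare the four kernels is a grid search with fivefold cross-validation, matching the evaluation strategy reported in the abstract; the sketch below uses scikit-learn, and the grids are illustrative rather than the paper's tuned settings:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=40, random_state=0)

# All four kernels used in the paper; C and gamma grids are illustrative
param_grid = {
    "svc__kernel": ["linear", "poly", "sigmoid", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}
pipe = make_pipeline(StandardScaler(), SVC())
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```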


Cite this article

Khatibi, T., Rabinezhadsadatmahaleh, N. Proposing feature engineering method based on deep learning and K-NNs for ECG beat classification and arrhythmia detection. Phys Eng Sci Med 43, 49–68 (2020). https://doi.org/10.1007/s13246-019-00814-w
