Improving the accuracy of diagnosing and predicting coronary heart disease using ensemble method and feature selection techniques

Cluster Computing

Abstract

Heart disease is a complex condition that affects many people around the world and, in part because of unhealthy lifestyles, is the most common cause of death worldwide. Machine learning plays an important role in medical treatment. The goal of this research is to develop a machine learning model that helps diagnose heart disease quickly and accurately. We propose a robust ensemble model that combines the three top-performing classifiers, namely Random Forest, XGBoost, and Gradient Boosting Machine, through ensemble voting to improve the prediction of heart disease. We used a combined heart disease dataset built from five different datasets (Hungary, Statlog, Switzerland, VA Long Beach, and Cleveland). Feature selection algorithms (Pearson Correlation, Univariate Feature Selection, Recursive Feature Elimination, Boruta Feature Selection, Random Forest, and LightGBM) are used to rank features and select the highly relevant ones to improve classification accuracy. The proposed ensemble model is built on seven highly relevant features, and individual machine learning algorithms and ensemble learning techniques are compared on the selected features. The proposed model is evaluated with several performance measures: accuracy, sensitivity, precision, F1-score, Matthews correlation coefficient (MCC), negative predictive value (NPV), and area under the ROC curve (AUC). The results show that the ensemble model achieves classification accuracy, sensitivity, and precision of 96.17%, 98.37%, and 94.53%, respectively. The proposed model performs better than existing models and individual classifiers, which indicates that the ensemble method can effectively predict the risk of heart disease.
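To make the voting step and the evaluation measures concrete, the following is a minimal sketch, not the authors' implementation. It assumes hard (majority) voting over binary labels, which the abstract does not specify, and it uses small hand-written prediction lists (`rf_pred`, `xgb_pred`, `gbm_pred`, all hypothetical) in place of trained Random Forest, XGBoost, and Gradient Boosting models. AUC is omitted because it requires predicted scores rather than labels.

```python
from math import sqrt


def majority_vote(*model_predictions):
    """Hard voting over binary (0/1) predictions from an odd number of models."""
    n_models = len(model_predictions)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "npv": _ratio(tn, tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical toy data: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
rf_pred = [1, 1, 1, 0, 0, 0, 1, 0]   # stand-in for Random Forest output
xgb_pred = [1, 1, 0, 1, 0, 0, 0, 1]  # stand-in for XGBoost output
gbm_pred = [1, 0, 1, 1, 0, 1, 1, 0]  # stand-in for Gradient Boosting output

ensemble_pred = majority_vote(rf_pred, xgb_pred, gbm_pred)
ensemble_scores = binary_metrics(y_true, ensemble_pred)

for name, pred in [("RF", rf_pred), ("XGB", xgb_pred), ("GBM", gbm_pred),
                   ("Ensemble", ensemble_pred)]:
    print(name, round(binary_metrics(y_true, pred)["accuracy"], 3))
```

On this toy data the majority vote corrects errors that only one model makes, which is the intuition behind combining classifiers. The numbers it prints are illustrative only and are unrelated to the results reported above.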

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2019YFA0706400 and Grant 2019YFA0706402, and in part by the Pre-Research Funds for Equipment of China under Grant 61409220115.

Author information

Contributions

Sohaib Asif: Data curation, Methodology, Validation, Writing – original draft. Wenhui Yi: Conceptualization, Investigation, Supervision, Writing – review & editing. Jin Hou and Jinhai Si: Formal analysis, Validation, Investigation. Qurrat ul Ain and Yueyang Yi: Analysis, Validation, Writing – review & editing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yi Wenhui.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

For this type of study, formal consent is not required.

Experiments involving human and/or animal participants

This paper does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Asif, S., Wenhui, Y., ul Ain, Q. et al. Improving the accuracy of diagnosing and predicting coronary heart disease using ensemble method and feature selection techniques. Cluster Comput 27, 1927–1946 (2024). https://doi.org/10.1007/s10586-023-04062-2

  • DOI: https://doi.org/10.1007/s10586-023-04062-2
