HyOPTXGBoost and HyOPTRF: Hybridized Intelligent Systems using Optuna Optimization Framework for Heart Disease Prediction with Clinical Interpretations

Multimedia Tools and Applications

Abstract

Fast-paced modern life leaves many people little time to maintain a healthy lifestyle. An unhealthy lifestyle contributes to several serious diseases, such as heart disease (HD), cancer, allergies, and asthma. These diseases increase morbidity and mortality every year, and of these, HD is the most fatal. Early and timely prediction of HD is regarded as one of the most crucial tasks in the medical field. In the healthcare industry, a large collection of datasets is freely available. Motivated by existing challenges, the authors propose two intelligent models, HyOPTRF (Model 1) and the HyOPTXGBoost Classifier (Model 2), implemented on the Statlog HD dataset with default and hyper-tuned parameters. HyOPTRF achieved its best results on trial 2: Accuracy 92.59%, Specificity 100%, Precision 100%, Negative Predictive Value 100%, Geometric Mean 93.93%, F1 Score 93.75%, Training Time 0.02 s, and Testing Time 0.005 s. The HyOPTXGBoost Classifier achieved its best results on trial 33: Accuracy 96.30%, Specificity 100%, Precision 100%, Negative Predictive Value 100%, Geometric Mean 96.82%, F1 Score 96.77%, Training Time 0.009 s, and Testing Time 0.002 s. The proposed models were validated with stratified K-fold cross-validation and compared with other existing models.
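The evaluation metrics named in the abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of how they could be computed, assuming the usual textbook definitions; the abstract does not spell them out, and the geometric mean is taken here as the square root of sensitivity times specificity. The names `ConfusionCounts` and `classification_metrics`, and the example counts, are hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the metrics listed in the abstract from confusion-matrix counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall / true positive rate
    specificity = _ratio(c.tn, c.tn + c.fp)   # true negative rate
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),      # negative predictive value
        # Assumed definition: sqrt(sensitivity * specificity).
        "g_mean": math.sqrt(sensitivity * specificity),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Hypothetical counts for illustration only; not reported in the paper.
example = ConfusionCounts(tp=15, fp=0, tn=11, fn=1)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.4f}")
```

Returning NaN for an empty denominator keeps the function usable on degenerate folds (for example, a fold with no predicted positives) without raising an exception.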

Data availability

This study uses the openly available Statlog (Heart) dataset, which can be obtained free of charge from the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/statlog+(heart).

Funding

This research study received no external funding.

Author information

Contributions

Sanjay Dhanka: Data curation, Resources, Investigation, Formal analysis, Conceptualization, Methodology, Project administration, Prepared original draft, Review and editing, Validation. Surita Maini: Methodology, Supervision, Project administration, Review and editing.

Corresponding author

Correspondence to Sanjay Dhanka.

Ethics declarations

Ethical and informed consent for data

There are no ethical issues related to this dataset. 

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this research study.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dhanka, S., Maini, S. HyOPTXGBoost and HyOPTRF: Hybridized Intelligent Systems using Optuna Optimization Framework for Heart Disease Prediction with Clinical Interpretations. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18312-x

  • DOI: https://doi.org/10.1007/s11042-024-18312-x
