Abstract
Many people today struggle to keep pace with rapid change and neglect a healthy lifestyle. An unhealthy lifestyle contributes to several serious diseases, such as heart disease (HD), cancer (including breast cancer), allergies, and asthma. These diseases increase morbidity and mortality rates every year, and of these, HD is the most fatal. Early, timely prediction of HD is regarded as one of the most crucial tasks in the medical field. In the healthcare industry, a large collection of datasets is freely available. Motivated by existing challenges, the authors propose two intelligent models, HyOPTRF (Model 1) and the HyOPTXGBoost Classifier (Model 2), implemented on the Statlog HD dataset with default and hyper-tuned parameters. HyOPTRF recorded its best results on trial 2: Accuracy 92.59%, Specificity 100%, Precision 100%, Negative Predictive Value 100%, Geometric mean 93.93%, F1 Score 93.75%, Training Time 0.02 s, and Testing Time 0.005 s. The HyOPTXGBoost Classifier recorded its best results on trial 33: Accuracy 96.30%, Specificity 100%, Precision 100%, Negative Predictive Value 100%, Geometric mean 96.82%, F1 Score 96.77%, Training Time 0.009 s, and Testing Time 0.002 s. The proposed models were validated with stratified k-fold cross-validation and compared with other existing models.
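The evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming the standard textbook definitions (in particular, geometric mean taken as the square root of sensitivity times specificity); the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def classification_metrics(tp, fn, fp, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class and each predicted
    label occurs at least once). Definitions are the standard ones, which
    is an assumption about what the paper uses.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive (disease) class
    specificity = tn / (tn + fp)   # recall on the negative (healthy) class
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "g_mean": sqrt(sensitivity * specificity),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts, chosen only for illustration.
example_counts = dict(tp=40, fn=5, fp=3, tn=52)
for name, value in classification_metrics(**example_counts).items():
    print(f"{name}: {value:.2%}")
```

Reporting specificity and negative predictive value alongside accuracy matters on clinical data, because a model can reach high accuracy while still missing positive cases.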
Data availability
This study uses an open-source dataset, which is freely available from the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/statlog+(heart).
Funding
This research study received no external funding.
Author information
Contributions
Sanjay Dhanka: Data curation, Resources, Investigation, Formal analysis, Conceptualization, Methodology, Project administration, Prepared original draft, Review and editing, Validation. Surita Maini: Methodology, Supervision, Project administration, Review and editing.
Ethics declarations
Ethical and informed consent for data
There are no ethical issues related to this dataset.
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this research study.
About this article
Cite this article
Dhanka, S., Maini, S. HyOPTXGBoost and HyOPTRF: Hybridized Intelligent Systems using Optuna Optimization Framework for Heart Disease Prediction with Clinical Interpretations. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18312-x
DOI: https://doi.org/10.1007/s11042-024-18312-x