DOI: 10.1145/3647444.3647927
Research article

Utilizing Data Mining Techniques to Develop Accurate Predictive Models for Cardiovascular Disease Risk Assessment and Early Detection

Published: 13 May 2024

ABSTRACT

Healthcare research prioritizes the accuracy and effectiveness of predictive modelling, especially in cardiovascular disease (CVD) risk assessment and early detection. This study uses data mining techniques to build a hybrid predictive model that reaches 99.37% accuracy. The hybrid framework is built on the Random Forest algorithm, which is adaptable and robust. Random Forest trains many decision trees, each on a random subset of the data, and has been successful in uncovering complex feature interactions and relationships in data. This trait suits it to the complexity of CVD risk assessment. Gradient Boosting is added to the Random Forest framework because it improves performance iteratively: each iteration corrects the errors of the previous model, which refines predictive performance and captures fine-grained patterns in the dataset. Building the hybrid model required careful design. It combines Random Forest's ability to capture complex feature interactions with Gradient Boosting's iterative refinement. The resulting model outperforms its individual components, predicting cardiovascular disease risk with 99.37% accuracy. The hybrid model has two major implications. First, it helps doctors identify potential cases of CVD and implement personalized care strategies by assessing risk accurately and quickly. Second, it shows the potential of data mining methods when they are integrated, opening the way to more precise predictive modelling. The combination of Random Forest and Gradient Boosting for cardiovascular disease risk assessment thus advances predictive modelling.
A novel hybrid model with 98.737% accuracy is presented in this study. The model addresses complex problems in cardiovascular disease detection and risk mitigation, making it a notable data mining advancement. The approach shows how data mining techniques can be applied to the challenges of CVD, and it guides the healthcare community towards better diagnostic precision and patient-centered care.
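The abstract does not say how the two algorithms are combined, so the following is only a minimal sketch of one possible reading: a small random forest produces an initial score, and gradient boosting stages then fit the forest's residual errors. Everything specific here is an assumption made for illustration, not the authors' method: the depth-1 trees (stumps), the squared-error loss, the synthetic three-feature data standing in for a real CVD dataset, and all function names and hyperparameters.

```python
import random

def fit_stump(X, y, features):
    """Fit a depth-1 regression tree: choose the (feature, threshold) split
    with the lowest squared error; return (feature, threshold, left, right)."""
    best = None
    for f in features:
        vals = sorted({row[f] for row in X})
        # Try roughly 16 candidate thresholds per feature to keep this fast.
        for t in vals[::max(1, len(vals) // 16)]:
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, f, t, ml, mr)
    if best is None:  # no valid split found: predict the mean everywhere
        m = sum(y) / len(y)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, ml, mr = stump
    return ml if row[f] <= t else mr

def fit_forest(X, y, n_trees, rng):
    """Random forest of stumps: bootstrap rows, random feature subset per tree."""
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        feats = rng.sample(range(d), max(1, int(d ** 0.5)))   # random features
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    return sum(predict_stump(s, row) for s in forest) / len(forest)

def fit_hybrid(X, y, n_trees=15, n_stages=20, lr=0.3, seed=0):
    """Assumed hybrid: forest score first, then boosting on its residuals."""
    rng = random.Random(seed)
    forest = fit_forest(X, y, n_trees, rng)
    scores = [predict_forest(forest, row) for row in X]
    stages = []
    for _ in range(n_stages):
        # Residuals are the negative gradient of the squared-error loss.
        residuals = [yi - s for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals, range(len(X[0])))
        stages.append(stump)
        scores = [s + lr * predict_stump(stump, row) for s, row in zip(scores, X)]
    return forest, stages, lr

def predict_hybrid(model, row):
    forest, stages, lr = model
    score = predict_forest(forest, row) + lr * sum(predict_stump(s, row) for s in stages)
    return 1 if score >= 0.5 else 0

def make_data(n, seed):
    """Synthetic stand-in data: label depends on the first two features only."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 1.0 else 0 for row in X]
    return X, y

X_train, y_train = make_data(150, seed=1)
X_test, y_test = make_data(100, seed=2)
model = fit_hybrid(X_train, y_train)
accuracy = sum(predict_hybrid(model, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print(f"hold-out accuracy: {accuracy:.2f}")
```

Accuracy is measured on a separately generated hold-out set rather than on the training rows, since a figure computed on training data would overstate how well such a model generalizes. Other ways of combining the two algorithms, such as stacking or voting over their outputs, would fit the abstract's description equally well.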


  • Published in

    ICIMMI '23: Proceedings of the 5th International Conference on Information Management & Machine Intelligence
    November 2023
    1215 pages

    Copyright © 2023 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited