Ensemble Machine Learning Models for Breast Cancer Identification

  • Conference paper
Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops (AIAI 2023)

Abstract

Advances in Machine Learning (ML), from pattern recognition to computational statistical learning, have increased its utility for breast cancer: ML supports screening strategies that involve diverse risk factors with complex relationships, and it enables personalized early prediction. In this work, we focus on ensemble ML models, applying the synthetic minority oversampling technique (SMOTE) and evaluating with 10-fold cross-validation. The models are compared in terms of accuracy, precision, recall and area under the curve (AUC). In the experimental evaluation, Rotation Forest outperformed the other models, achieving accuracy, precision and recall of 82% and an AUC of 87.4%.
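To make two of the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) the core SMOTE idea of creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and (b) how accuracy, precision, recall and AUC can be computed from labels and scores. The function names `smote` and `classification_metrics`, the parameter `k`, the seed and the 0.5 threshold are all assumptions made for illustration. Precision and recall are computed here for the positive class only; the abstract does not state which averaging convention the paper uses.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Hypothetical helper: each synthetic sample lies on the segment between a
    randomly chosen minority point and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding the point itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, positive-class precision and recall, and AUC from scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # AUC as the fraction of (positive, negative) pairs ranked correctly;
    # ties count as half a correct ranking.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "auc": wins / (len(pos) * len(neg)),
    }


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(minority, n_new=3, k=2))
    print(classification_metrics([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

When oversampling is combined with cross-validation, a common precaution is to oversample only the training portion of each fold so that synthetic points do not leak into the held-out data; the abstract does not say how the two were combined in the paper.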

Acknowledgements

This research was funded by the European Union and Greece (Partnership Agreement for the Development Framework 2014–2020) under the Regional Operational Programme Ionian Islands 2014–2020, project title: “Indirect costs for project ‘Smart digital applications and tools for the effective promotion and enhancement of the Ionian Islands bio-diversity’”, project number: 5034557.

Author information

Corresponding author

Correspondence to Elias Dritsas.

Copyright information

© 2023 IFIP International Federation for Information Processing

About this paper

Cite this paper

Dritsas, E., Trigka, M., Mylonas, P. (2023). Ensemble Machine Learning Models for Breast Cancer Identification. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_24

  • DOI: https://doi.org/10.1007/978-3-031-34171-7_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34170-0

  • Online ISBN: 978-3-031-34171-7

  • eBook Packages: Computer Science (R0)
