Abstract
Predictive modeling with machine learning has gained widespread recognition in recent years because it helps detect critical features in large datasets (Salama and Abdelhalim Int J Comput Inf Technol 01:2277–0764, 2012), and its areas of application are diverse. The conventional approach applies several data mining classifiers individually to build knowledge discovery systems. The emphasis has now shifted to deploying an ensemble of these classifiers in order to further improve the prediction capability and accuracy of the knowledge discovery in databases (KDD) process. This study proposes a model for predicting the recurrence of breast cancer within 3 years, based on an ensemble data mining classification technique called voting. The approach uses different combinations of four base classifiers: decision tree, multilayer perceptron (MLP), Naïve Bayes, and sequential minimal optimization (SMO). We compare the voting classifiers against the base classifiers to determine how much the ensemble approach improves performance. Across the seven combinations analyzed, the accuracy of the voting classifiers is consistently high, ranging between 81.0526% and 83.8596%. In contrast, the accuracy of the base classifiers varies widely, ranging between 75.7895% and 84.2105%. The voting classifier therefore performs far more consistently than the individual base classifiers, and voting also improves on the performance of weak classifiers such as MLP and SMO. The dataset used in our experiment comprises 575 samples with 23 attributes, obtained from the Mizoram Cancer Institute, Aizawl, Mizoram, India.
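As an illustration of the voting idea described above, the following is a minimal sketch of hard (majority) voting in plain Python. It rests on assumptions: the abstract does not state which combination rule the study used, so majority voting stands in here as one common choice; the function names `majority_vote` and `accuracy`, the tie-breaking rule, and all labels are hypothetical and are not data or results from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` maps a classifier name to its list of predicted labels.
    Ties go to the tied label that appears first in classifier order
    (an assumption made for this sketch).
    """
    combined = []
    # zip(*...) regroups the lists so each tuple holds all votes for one sample
    for votes in zip(*predictions.values()):
        counts = Counter(votes)
        best = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy, made-up labels (1 = recurrence, 0 = no recurrence); not study data.
actual = [1, 0, 1, 1, 0, 0]
toy_predictions = {
    "decision_tree": [1, 0, 0, 1, 0, 1],
    "mlp":           [1, 1, 1, 0, 0, 0],
    "naive_bayes":   [0, 0, 1, 1, 0, 0],
}

voted = majority_vote(toy_predictions)
for name, labels in toy_predictions.items():
    print(name, round(accuracy(labels, actual), 4))
print("voting", round(accuracy(voted, actual), 4))
```

In this toy case the base classifiers err on different samples, so the majority label corrects each individual mistake. With real data the gain depends on how diverse the base classifiers' errors are.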
References
Balentine: Breast cancer: signs, symptoms, causes, treatment, stages & survival rates (2019). https://www.medicinenet.com/breast_cancer_facts_stages/article.htm
World Cancer Research Fund and American Institute for Cancer Research: Breast cancer statistics | World Cancer Research Fund. Cancer Trends (2018). https://www.wcrf.org/dietandcancer/cancer-trends/breast-cancer-statistics
Sabharanjak, S.: Strand genomics – enabling precision medicine, issue 10. Strand Life Sciences (2017)
Frawley, W.J., Piatetsky-Shapiro, G., Matheus, C.J.: Knowledge discovery in databases: an overview. Assoc. Adv. Artif. Intell. 13(3), 57–70 (1992)
Abouelnadar, N.A., Saad, A.A.: Towards a better model for predicting cancer recurrence in breast cancer patients, pp. 887–899 (2019)
Yarabarla, M.S., Ravi, L.K., Sivasangari, A.: Breast cancer prediction via machine learning, no. Icoei, pp. 121–124 (2019)
Salama, G.I., Abdelhalim, M.B., Zeid, M.A.: Breast Cancer diagnosis on three different datasets, using multi-classifiers. Int. J. Comput. Inf. Technol. 01(01), 2277–0764 (2012)
Sivakami, K.: Mining big data: breast cancer prediction using DT – SVM hybrid model. Int. J. Sci. Eng. Appl. Sci. 5, 418–429 (2015)
Safiyari, A., Javidan, R.: Predicting lung cancer survivability using ensemble learning methods. In: 2017 Intelligent Systems Conference IntelliSys 2017, vol. 2018, pp. 684–688 (2018)
Kumar, U.K., Nikhil, M.B.S., Sumangali, K.: Prediction of breast cancer using voting classifier technique. In: 2017 IEEE International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials ICSTM 2017 – Proceedings, Aug, pp. 108–114 (2017)
Mohebian, M.R., Marateb, H.R., Mansourian, M., Angel, M., Mokarian, F.: A hybrid computer-aided-diagnosis system for prediction of breast cancer recurrence (HPBCR) using optimized ensemble learning. Comput. Struct. Biotechnol. J. 15, 75–85 (2017)
Lavanya, D., Usha Rani, K.: Ensemble decision making system for breast cancer data. Int. J. Comput. Appl. 51(17), 19–23 (2012)
Avula, A., Asha, A.: Improving prediction accuracy using hybrid machine learning algorithm on medical datasets. IJSER 9(10), 1461–1467 (2018)
Dietterich, T.G.: Ensemble methods in machine learning. Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence Lecture Notes Bioinformatics), vol. 1857 LNCS, pp. 1–15 (2000)
Lutin, E.: Ensemble methods in machine learning: what are they and why use them? Medium: towards data science (2017). https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f
Polikar, R.: Ensemble learning (2012). https://doi.org/10.1007/978-1-4419-9326-7
Brown, G., Kuncheva, L.I.: ‘Good’ and ‘Bad’ diversity in majority vote ensembles, pp. 124–125 (2010)
Scikit-Learn Developers: Ensemble methods (2019). https://scikit-learn.org/stable/modules/ensemble.html
Sharma, M.: What steps should one take while doing data preprocessing? Hackernoon (2018). https://hackernoon.com/what-steps-should-one-take-while-doing-data-preprocessing-502c993e1caa
Rajaratne, M.: Machine learning – part 1 – data preprocessing (2018). https://towardsdatascience.com/data-pre-processing-techniques-you-should-know-8954662716d6
Hall, M., Frank, E., Holmes, G., Witten, I.H., Cunningham, S.J.: Weka: practical machine learning tools and techniques. In: Workshop on emerging knowledge engineering and connectionist-based information systems (2007)
Narkhede, S.: Understanding AUC – ROC curve – towards data science, pp. 1–9 (2018). https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
STARD: The area under an ROC curve. Area, pp. 2–3 (2012). https://doi.org/10.1177/0272989X03251246
Kumar, R., Abhaya, I.: Receiver operating characteristic (ROC) curve for medical researchers (2011)
Glen, S.: ROC curve explained in one picture – data science central (2019). https://www.datasciencecentral.com/profiles/blogs/roc-curve-explained-in-one-picture?fbclid=IwAR3aih-HiSK0AsuZ50TdSp34KEqrXh5h1ZJSmf3udXBimiKSiWUrcaLPrus
Simon, S.: What is a Kappa coefficient? (Cohen’s Kappa). http://www.pmean.com/definitions/kappa.htm
Acknowledgments
We would like to thank Dr. Jerry Lalrinsanga, Medical Oncologist of Mizoram Cancer Institute (MCI), Aizawl, for supporting this research work by permitting us to collect breast cancer datasets from MCI. The authors express their sincere gratitude to MLCU for facilitating and extending all possible help to complete this particular research work.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Dawngliani, M.S., Chandrasekaran, N., Lalmawipuii, R., Thangkhanhau, H. (2021). Breast Cancer Recurrence Prediction Model Using Voting Technique. In: Raj, J.S. (eds) International Conference on Mobile Computing and Sustainable Informatics. ICMCSI 2020. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-49795-8_2
DOI: https://doi.org/10.1007/978-3-030-49795-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49794-1
Online ISBN: 978-3-030-49795-8
eBook Packages: Engineering, Engineering (R0)