Predicting Breast Cancer: A Comparative Analysis of Machine Learning Algorithms

Authors

  • Pulung Hendro Prastyo, Universitas Gadjah Mada
  • I Gede Yudi Paramartha, Universitas Gadjah Mada
  • Michael S. Moses Pakpahan, Universitas Gadjah Mada
  • Igi Ardiyanto, Universitas Gadjah Mada

DOI:

https://doi.org/10.14421/icse.v3.545

Keywords:

XGBoost, Machine Learning, Breast Cancer, Classification

Abstract

Breast cancer is the most common cancer among women (43.3 cases per 100,000 women) and has the highest mortality (14.3 deaths per 100,000 women). Early detection is critical for survival. Machine learning approaches can be used to classify, predict, and analyze breast cancer cases effectively. In this study, we compared eight machine learning algorithms: Gaussian Naïve Bayes (GNB), k-Nearest Neighbors (K-NN), Support Vector Machine (SVM), Random Forest (RF), AdaBoost, Gradient Boosting (GB), XGBoost, and Multi-Layer Perceptron (MLP). The experiments were conducted on the Breast Cancer Wisconsin dataset and evaluated using a confusion matrix and 5-fold cross-validation. Experimental results showed that XGBoost provides the best performance, achieving an accuracy of 97.19%, recall of 96.75%, precision of 97.28%, F1-score of 96.99%, and AUC of 99.61%. Our results show that XGBoost is the most effective of the eight compared methods for predicting breast cancer on the Breast Cancer Wisconsin dataset.
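
The evaluation protocol named in the abstract (confusion matrix, derived metrics, 5-fold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' code: the names ConfusionMatrix, confusion_from_labels, and k_fold_indices are hypothetical, the choice of label 1 as the positive class is an assumption, and the fold splitter is a plain unshuffled split, since the abstract does not say whether the folds were shuffled or stratified. AUC is omitted because it requires predicted scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


def confusion_from_labels(y_true, y_pred, positive=1):
    """Build a confusion matrix from true and predicted labels.

    Assumption: `positive` marks the class of interest (for example, malignant).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, fn=fn, tn=tn)


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    cm = confusion_from_labels(y_true, y_pred)
    print(cm, cm.accuracy(), cm.precision(), cm.recall(), cm.f1())
    for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
        print(len(train_idx), test_idx)
```

In a cross-validated comparison of this kind, each model would be trained on the training indices of every fold, scored on the held-out indices, and the per-fold metrics averaged; how the reported figures were aggregated is not stated in the abstract.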

Published

2020-04-30

How to Cite

Prastyo, P. H., Paramartha, I. G. Y., Pakpahan, M. S. M., & Ardiyanto, I. (2020). Predicting Breast Cancer: A Comparative Analysis of Machine Learning Algorithms. Proceeding International Conference on Science and Engineering, 3, 455–459. https://doi.org/10.14421/icse.v3.545

Section

Articles