The Effect of Feature Selection on Gray Level Co-Occurrence Matrix (GLCM) for the Four Breast Cancer Classifications

Abstract:

Breast cancer is the most common cancer affecting women worldwide. Early detection can increase patients' chances of survival. Detection depends on radiologists, who are often limited in the number of patients they can consult, and the result is subjective because it rests on the individual radiologist's judgment. In this work, we propose a system that detects and classifies breast cancer accurately in order to reduce delays in patient handling and subjectivity in the results. We propose a digital image processing method that uses mammograms to classify breast cancer into four categories based on tissue density, namely BI-RADS I, II, III, and IV. The main stages of this research are image processing, feature extraction, data normalization, feature selection, classification, and parameter optimization. The method uses GLCM to extract texture features and two feature selection methods, namely RFE-RF and Chi-Square. It was tested with several classifiers: SVM, KNN, Random Forest, and Decision Tree. The hyperparameters of each classifier were optimized using GridSearch, and the final result is measured using accuracy. Random Forest with RFE-RF gives the highest accuracy, 99.7%. Feature selection has a significant impact on improving accuracy. The results show that our system can classify breast cancer with high accuracy, so it can assist radiologists in screening mammograms and support decisions in diagnosing patients with breast cancer based on density.
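As an illustration of the GLCM feature-extraction stage named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the pixel offset, the symmetric counting, the toy image, and the three features shown (contrast, energy, homogeneity) are assumptions chosen for illustration, since the abstract does not state which offsets, gray-level quantization, or GLCM features were used.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Return the normalized co-occurrence matrix as {(i, j): probability}.

    image is a list of rows of integer gray levels. (dx, dy) is the pixel
    offset; the default pairs each pixel with its right-hand neighbor.
    These defaults are illustrative assumptions, not the paper's settings.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Compute three common GLCM texture features from probabilities p."""
    contrast = sum(v * (i - j) ** 2 for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())  # angular second moment
    homogeneity = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


# Hypothetical 4x4 image with four gray levels (0 to 3).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(toy_image)
features = texture_features(p)
print(features)
```

In a pipeline like the one the abstract describes, a feature vector of this kind would be computed per mammogram, normalized, and then passed to feature selection and a classifier.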

Pages: 168-179

Online since: March 2022
