Abstract
Cervical cancer (CC) predominantly affects women, and it can be cured if it is diagnosed early. This study aims to design and develop an effective deep learning-based classification model to detect early CC stages using clinical data. The proposed method, denoted SSAE-FAM, combines an unsupervised deep learning model, the sparse stacked autoencoder (SSAE), with a supervised neural network, the fuzzy adaptive resonance theory MAP (FAM). Specifically, SSAE is applied to tackle the data sparsity problem: it extracts representative features from a data set through feature transformation, and the transformed features are then classified by FAM. A CC data set obtained from the University of California Irvine (UCI) machine learning repository is used for evaluation. Owing to missing data in the original CC data set, two data sets are generated from the original CC data samples using two data preprocessing techniques. Both generated CC data sets have four target classes (i.e. Schiller, Cytology, Biopsy, and Hinselmann), which are evaluated as four independent binary-class problems. Mitigating the data sparsity problem improves the classification performance of FAM. In a series of experimental studies, SSAE-FAM outperforms other state-of-the-art methods, achieving mean accuracy rates of 99.47%, 99.34%, 99.48%, and 99.81%, respectively, on the first CC data set, and 99.74%, 99.86%, 99.77%, and 99.80%, respectively, on the second CC data set. The results indicate that SSAE-FAM is useful for early CC diagnosis.
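The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the SSAE loss or the FAM settings, so the KL-divergence sparsity penalty, the complement coding, the choice and match functions, and all names and parameter values below (`rho`, `alpha`, and so on) are assumptions taken from the standard formulations of sparse autoencoders and fuzzy ART.

```python
import math


def kl_sparsity_penalty(rho, mean_activations):
    """KL-divergence sparsity penalty commonly used in sparse autoencoders.

    Assumed form; the abstract does not state which penalty SSAE uses.
    rho: target mean activation of a hidden unit, in (0, 1).
    mean_activations: observed mean activation of each hidden unit, each in (0, 1).
    """
    return sum(
        rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
        for r in mean_activations
    )


def complement_code(features):
    """Fuzzy ART complement coding for features already scaled to [0, 1]."""
    return list(features) + [1.0 - f for f in features]


def fuzzy_and(x, w):
    """Element-wise minimum, the fuzzy AND operator."""
    return [min(xi, wi) for xi, wi in zip(x, w)]


def choice_value(x, w, alpha=0.001):
    """Category choice function |x AND w| / (alpha + |w|); alpha is a hypothetical setting."""
    return sum(fuzzy_and(x, w)) / (alpha + sum(w))


def match_value(x, w):
    """Match function |x AND w| / |x|, which is compared against the vigilance threshold."""
    return sum(fuzzy_and(x, w)) / sum(x)


# Hypothetical transformed features for one sample, scaled to [0, 1].
x = complement_code([0.25, 0.75])
uncommitted = [1.0] * len(x)  # weights of a category that has not learned yet
print(choice_value(x, uncommitted), match_value(x, uncommitted))
print(kl_sparsity_penalty(0.05, [0.05, 0.5]))
```

In this sketch the penalty is zero when every hidden unit's mean activation equals the target `rho` and grows as the activations move away from it, which is what pushes the learned features towards sparsity. A full fuzzy ARTMAP would add weight updates, a map field linking categories to class labels, and match tracking, all of which are omitted here.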
Data availability
The datasets used in this research are available from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml.
Acknowledgements
This research is supported by the Fundamental Research Grant Scheme (FRGS), FRGS/1/2019/ICT02/MMU/02/2.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
About this article
Cite this article
Liaw, L.C.M., Tan, S.C., Goh, P.Y. et al. Cervical cancer classification using sparse stacked autoencoder and fuzzy ARTMAP. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09706-x