Abstract
Billions of messages are sent over the Internet every day, and a majority of them are spam. As their number keeps growing, these spam messages have become a primary source of distraction and a security threat for users. Many researchers have addressed this problem using a variety of approaches. In the present study, Machine Learning algorithms, namely Naïve Bayes, Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Random Forest, Gradient Boosting, and Extra Trees Classifier, are used to predict whether an incoming message or e-mail is spam. Model performance is evaluated using accuracy, precision, F1-score, and the confusion matrix. Three datasets are used: two SMS datasets and one e-mail dataset. The Extra Trees Classifier achieves a maximum F1-score of 96.06% and an accuracy of 99.12%, which is 0.02% higher than the highest accuracy previously reported for the SMS Spam Collection Dataset.
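The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the "spam"/"ham" label strings, and the toy label lists are assumptions made for illustration, and the numbers are not results from the study. Recall is computed as well because the F1-score is the harmonic mean of precision and recall.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="spam"):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, F1 and a 2x2 confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rows are true classes (ham, spam); columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = ["spam", "ham", "ham", "spam", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "spam", "spam", "ham", "ham", "ham", "ham"]
metrics = evaluate(y_true, y_pred)
```

On a class-imbalanced spam dataset, accuracy alone can look high even when many spam messages are missed, which is one reason to report precision and F1-score alongside it.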
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bari, P., Mathew, V., Tandel, S.P., Aniket, P., Chaudhari, K.S., Naik, S. (2023). SMS and E-mail Spam Classification Using Natural Language Processing and Machine Learning. In: Singh, S.N., Mahanta, S., Singh, Y.J. (eds) Proceedings of the NIELIT's International Conference on Communication, Electronics and Digital Technology. NICE-DT 2023. Lecture Notes in Networks and Systems, vol 676. Springer, Singapore. https://doi.org/10.1007/978-981-99-1699-3_6
DOI: https://doi.org/10.1007/978-981-99-1699-3_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1698-6
Online ISBN: 978-981-99-1699-3
eBook Packages: Intelligent Technologies and Robotics (R0)