Int J Performability Eng, 2021, Vol. 17, Issue (6): 528-535. doi: 10.23940/ijpe.21.06.p5.528535

Collaborative and Early Detection of Email Spam using Multitask Learning

Balika J. Chelliah, Anand Sasidharan*, Dharmesh Kumar Singh, and Nilesh Dangi   

  1. Department of Computer Science, SRM IST, Chennai, 600089, India
  • Contact: * E-mail address: anand03111998@gmail.com
  • About author: Dr. Balika J. Chelliah is an Associate Professor in the field of Computer Science Engineering at SRM IST, Chennai. His areas of interest include service-oriented architecture, web services, cloud services, and software engineering. Anand Sasidharan is an undergraduate student at SRM IST, India. He is pursuing his B.Tech degree in Computer Science and Engineering at SRM IST, Chennai. His areas of interest are data structures and programming. Dharmesh Kumar Singh is an undergraduate student at SRM IST, India. He is pursuing his B.Tech degree in Computer Science and Engineering at SRM IST, Chennai. His areas of interest are algorithm design and analysis. Nilesh Dangi is an undergraduate student at SRM IST, India. He is pursuing his B.Tech degree in Computer Science and Engineering at SRM IST, Chennai. His areas of interest are programming and the modelling of microprocessors and microcontrollers.

Abstract: Email spam has become a major nuisance in recent years. It wastes valuable time and can also be dangerous. Existing spam-detection solutions rely on manually entered keywords or on filtering particular domains, and however much filtering is applied, checking for spam remains tedious and difficult. This paper presents a solution based on deep neural networks, a machine learning technique, to detect patterns of recurring words that have previously been classified as spam. Every other parameter of the email is treated as a feature and supplied to the machine learning algorithm. Deep neural networks are well suited to this task because they can learn to differentiate between proper (legitimate) and improper (spam) outputs.

Key words: deep neural network, synthetic minority over-sampling technique, term space partition algorithm
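
As a rough illustration of the idea in the abstract (recurring words become features that a neural network learns to associate with spam), the following is a minimal sketch, not the authors' implementation. The toy emails, the names `build_vocabulary`, `featurize`, `TinyNet`, and `spam_probability`, and the single small hidden layer are all assumptions made for this example; the paper's actual architecture, features, and use of over-sampling are not reproduced here.

```python
import math
import random
from collections import Counter

# Hypothetical toy data (1 = spam, 0 = legitimate); not from the paper.
EMAILS = [
    ("win free money now claim your free prize", 1),
    ("free prize winner click now to claim money", 1),
    ("cheap loans free money click now", 1),
    ("meeting agenda for the project review tomorrow", 0),
    ("please review the attached project report", 0),
    ("lunch tomorrow after the team meeting", 0),
]

def build_vocabulary(texts):
    """Map every distinct word to a column index."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    """Bag-of-words counts: how often each vocabulary word recurs."""
    counts = Counter(text.split())
    return [float(counts.get(w, 0)) for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One tanh hidden layer and a sigmoid output (a small stand-in
    for a deeper network, assumed for illustration only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, p

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the cross-entropy loss."""
        h, p = self.forward(x)
        d_out = p - y  # gradient w.r.t. the pre-sigmoid output
        for j, hj in enumerate(h):
            # Use the old w2[j] for backpropagation before updating it.
            d_hidden = d_out * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * d_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hidden * xi
            self.b1[j] -= lr * d_hidden
        self.b2 -= lr * d_out

def mean_loss(net, data):
    """Average binary cross-entropy over the data set."""
    total = 0.0
    for x, y in data:
        p = min(max(net.forward(x)[1], 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

vocab = build_vocabulary([t for t, _ in EMAILS])
data = [(featurize(t, vocab), y) for t, y in EMAILS]
net = TinyNet(len(vocab), n_hidden=4)

loss_before = mean_loss(net, data)
for _ in range(300):
    for x, y in data:
        net.train_step(x, y, lr=0.2)
loss_after = mean_loss(net, data)

def spam_probability(text):
    """Score a new message; words outside the vocabulary are ignored."""
    return net.forward(featurize(text, vocab))[1]
```

Calling `spam_probability("free money now")` returns the network's estimated probability that the message is spam. Other email parameters mentioned in the abstract (for example sender domain or header fields) could be appended to the feature vector in the same way, though which parameters the authors used is not stated here.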