Abstract
The paper proposes applying the multilayer perceptron model to the problem of detecting ham and spam e-mail patterns. It also proposes intensive data pre-processing and feature selection to simplify the perceptron's task of classifying ham and spam e-mails. The multilayer perceptron is trained and assessed on patterns extracted from the SpamAssassin Public Corpus, and it is required to classify novel types of ham and spam patterns. The results are presented and evaluated in the paper.
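The abstract does not give the details of the pre-processing, the feature selection criterion, or the network configuration, so the following is only a minimal sketch of the general idea (select a small set of discriminative tokens, then train a multilayer perceptron on them). Everything in it is an assumption made for illustration: the toy corpus, the document-frequency-difference selection rule, the hidden layer size, the learning rate, and all function and class names are hypothetical and are not taken from the paper or from the SpamAssassin Public Corpus.

```python
import math
import random
from collections import Counter

# Hypothetical toy corpus (NOT from the SpamAssassin Public Corpus); label 1 = spam, 0 = ham.
DOCS = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("cheap money offer free", 1),
    ("meeting agenda for monday", 0),
    ("project report attached for review", 0),
    ("lunch on monday after meeting", 0),
]

def tokenize(text):
    return text.lower().split()

def select_features(docs, k):
    """Keep the k tokens whose document frequency differs most between classes.
    This criterion is an assumption; the paper's actual method may differ."""
    spam_df, ham_df = Counter(), Counter()
    for text, label in docs:
        (spam_df if label else ham_df).update(set(tokenize(text)))
    vocab = sorted(set(spam_df) | set(ham_df))          # alphabetical tie-break
    vocab.sort(key=lambda t: -abs(spam_df[t] - ham_df[t]))
    return vocab[:k]

def vectorize(text, features):
    """Binary presence vector over the selected features."""
    tokens = set(tokenize(text))
    return [1.0 if f in tokens else 0.0 for f in features]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class MLP:
    """One hidden layer, sigmoid units, trained by plain backpropagation."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        delta_out = y - target  # cross-entropy gradient at the output pre-activation
        # Hidden deltas are computed before the output weights are updated.
        delta_h = [delta_out * self.w2[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for j, v in enumerate(h + [1.0]):
            self.w2[j] -= lr * delta_out * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * delta_h[j] * v

def mean_loss(net, data):
    """Mean binary cross-entropy over (vector, label) pairs."""
    total = 0.0
    for x, t in data:
        _, y = net.forward(x)
        y = min(max(y, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total / len(data)

features = select_features(DOCS, k=6)
data = [(vectorize(text, features), label) for text, label in DOCS]
net = MLP(n_in=len(features), n_hidden=3)

before = mean_loss(net, data)
for _ in range(500):                 # hypothetical epoch count
    for x, t in data:
        net.train_step(x, t, lr=0.5) # hypothetical learning rate
after = mean_loss(net, data)
print(features, round(before, 3), round(after, 3))
```

Reducing the vocabulary before training keeps the input layer small, which is the sense in which feature selection can simplify the classifier's task. A real experiment would hold out unseen messages for evaluation rather than measuring loss on the training set as this sketch does.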
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Carpinteiro, O.A.S., Lima, I., Assis, J.M.C., de Souza, A.C.Z., Moreira, E.M., Pinheiro, C.A.M. (2006). A Neural Model in Anti-spam Systems. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840930_88
DOI: https://doi.org/10.1007/11840930_88
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38871-5
Online ISBN: 978-3-540-38873-9
eBook Packages: Computer Science, Computer Science (R0)