Abstract
Phishing emails are more active than ever before, putting average computer users and organizations at risk of significant data, brand, and financial loss. Through an analysis of a collection of phishing and ham emails, this paper focuses on fundamental attacker behavior that can be extracted from the email header. It also puts forward a hybrid feature selection approach that combines content-based and behavior-based features. The approach mines attacker behavior from the email header. On a publicly available test corpus, our hybrid feature selection achieves a 96% accuracy rate. In addition, we successfully tested the quality of our proposed behavior-based feature using information gain.
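As an illustration of how information gain can rank a candidate feature, the following minimal sketch computes it for discrete features against phishing/ham labels. It is not the authors' implementation: the feature names (`sender_domain_mismatch`, `has_html`) and the toy data are hypothetical, chosen only to show one informative and one uninformative feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain


# Hypothetical toy corpus: three phishing and three ham emails.
labels = ["phish", "phish", "phish", "ham", "ham", "ham"]

# Hypothetical behavior-based feature taken from the header:
# 1 if the sender domain differs from the Message-ID domain.
sender_domain_mismatch = [1, 1, 1, 0, 0, 0]

# Hypothetical content-based feature: 1 if the body contains HTML.
has_html = [1, 0, 1, 1, 0, 1]

gain_behavior = information_gain(sender_domain_mismatch, labels)
gain_content = information_gain(has_html, labels)
print(f"sender_domain_mismatch: {gain_behavior:.3f} bits")
print(f"has_html: {gain_content:.3f} bits")
```

In this toy data the first feature separates the two classes perfectly, so its gain equals the full label entropy of 1 bit, while the second feature leaves the class mix unchanged in each split and yields a gain of about zero.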
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
A. Hamid, I.R., Abawajy, J. (2011). Hybrid Feature Selection for Phishing Email Detection. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_26
DOI: https://doi.org/10.1007/978-3-642-24669-2_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24668-5
Online ISBN: 978-3-642-24669-2
eBook Packages: Computer Science, Computer Science (R0)