Hybrid Feature Selection for Phishing Email Detection

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7017)

Abstract

Phishing emails are more active than ever, putting average computer users and organizations at risk of significant data, brand, and financial loss. Through an analysis of a collection of phishing and ham emails, this paper focuses on fundamental attacker behavior that can be extracted from the email header. It also puts forward a hybrid feature selection approach that combines content-based and behavior-based features. The approach mines attacker behavior from the email header. On a publicly available test corpus, our hybrid feature selection achieves a 96% accuracy rate. In addition, we successfully tested the quality of our proposed behavior-based feature using information gain.
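As a rough illustration of how information gain can rank a candidate feature, the following minimal sketch computes it for binary features over a toy labeled set. The feature names (`sender_mismatch`, `has_subject`) and the data are hypothetical and are not taken from the paper; the abstract does not list the actual behavior-based features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after
    partitioning the emails by the value of one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (assumption, not from the paper).
labels = ["phish", "phish", "phish", "ham", "ham", "ham"]
sender_mismatch = [1, 1, 1, 0, 0, 0]  # separates the two classes perfectly
has_subject = [1, 1, 1, 1, 1, 1]      # constant, so it carries no information

gain_mismatch = information_gain(sender_mismatch, labels)
gain_subject = information_gain(has_subject, labels)
print(gain_mismatch, gain_subject)
```

A feature that splits the corpus into pure phishing and pure ham groups attains the maximum gain (the full label entropy), while a constant feature scores zero, which is why the measure is a common way to compare candidate features.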

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

A. Hamid, I.R., Abawajy, J. (2011). Hybrid Feature Selection for Phishing Email Detection. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_26

  • DOI: https://doi.org/10.1007/978-3-642-24669-2_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24668-5

  • Online ISBN: 978-3-642-24669-2

  • eBook Packages: Computer Science, Computer Science (R0)
