Early Detection of Outgoing Spammers in Large-Scale Service Provider Networks

Cohen, Yehonatan; Gordon, Daniel; Hendler, Danny

doi:10.1007/978-3-642-39235-1_5

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 7967)

Included in the following conference series:

International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

Abstract

We present ErDOS, an Early Detection scheme for Outgoing Spam. The detection approach implemented by ErDOS combines content-based detection and features based on inter-account communication patterns. We define new account features, based on the ratio between the numbers of sent and received emails and on the distribution of emails received from different accounts.

Our empirical evaluation of ErDOS is based on a real-life dataset collected by an email service provider, much larger than the datasets previously used for outgoing-spam detection research. The evaluation establishes that ErDOS provides early detection for a significant fraction of the spammer population; that is, it identifies these accounts as spammers before a content-based detector does. Moreover, ErDOS requires only a single day of training data to produce a high-quality list of suspect accounts.
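The two account features named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so everything concrete here is an assumption made for illustration: the log format (one `(sender, recipient)` pair per email), the function name `account_features`, the add-one smoothing in the ratio, and the use of Shannon entropy to summarize how received emails are distributed across senders. It is not the ErDOS implementation.

```python
from collections import Counter, defaultdict
from math import log2


def account_features(email_log):
    """Compute illustrative per-account features from (sender, recipient) pairs.

    Hypothetical helper: the feature definitions are assumptions,
    not the definitions used by ErDOS.
    """
    sent = Counter()                      # emails sent per account
    received_from = defaultdict(Counter)  # recipient -> Counter of senders

    for sender, recipient in email_log:
        sent[sender] += 1
        received_from[recipient][sender] += 1

    features = {}
    for account in set(sent) | set(received_from):
        per_sender = received_from.get(account, Counter())
        n_received = sum(per_sender.values())

        # Ratio of sent to received emails; the +1 is an assumed
        # smoothing term so accounts that received nothing do not
        # cause a division by zero.
        ratio = sent[account] / (n_received + 1)

        # Shannon entropy (in bits) of the distribution of received
        # emails over distinct senders; 0.0 if nothing was received.
        entropy = 0.0
        for count in per_sender.values():
            p = count / n_received
            entropy += p * log2(1 / p)

        features[account] = {
            "sent": sent[account],
            "received": n_received,
            "sent_received_ratio": ratio,
            "sender_entropy": entropy,
        }
    return features


if __name__ == "__main__":
    log = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "a"), ("c", "b")]
    for account, feats in sorted(account_features(log).items()):
        print(account, feats)
```

Under these assumptions, an account that sends far more than it receives gets a high ratio, and an account whose incoming mail comes from very few senders gets low entropy. Whether and how such values separate spammers from legitimate accounts is an empirical question addressed by the paper, not by this sketch.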

Author information

Authors and Affiliations

Ben Gurion University of the Negev, Be’er Sheva, Israel
Yehonatan Cohen, Daniel Gordon & Danny Hendler

Editor information

Editors and Affiliations

Institute of Computer Science, Computer Security Group, University of Göttingen, Goldschmidtstr. 7, 37077, Göttingen, Germany
Konrad Rieck
Telekom Innovation Laboratories, Security in Telecommunications, Technische Universität Berlin, Ernst-Reuter-Platz 7, 10587, Berlin, Germany
Patrick Stewin & Jean-Pierre Seifert

About this paper

Cite this paper

Cohen, Y., Gordon, D., Hendler, D. (2013). Early Detection of Outgoing Spammers in Large-Scale Service Provider Networks. In: Rieck, K., Stewin, P., Seifert, JP. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2013. Lecture Notes in Computer Science, vol 7967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39235-1_5

DOI: https://doi.org/10.1007/978-3-642-39235-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39234-4
Online ISBN: 978-3-642-39235-1
eBook Packages: Computer Science, Computer Science (R0)
