DOI: 10.1145/3012071.3012078
research-article

Leveraging time for spammers detection on Twitter

Published: 01 November 2016

ABSTRACT

Twitter is one of the most popular microblogging social systems, providing a set of distinctive posting services that operate in real time. The flexibility of these services has attracted unethical individuals, so-called "spammers", who aim to spread malicious, phishing, and misleading information. Unfortunately, the existence of spam causes non-negligible problems related to search and user privacy. To fight spam, various detection methods have been designed that automate the detection process by combining "features" with machine learning methods. However, the existing features are not effective enough to adapt to spammers' tactics, because the features are easy to manipulate. Graph features, moreover, are not suitable for Twitter-based applications, despite the high performance obtainable when applying them.

In this paper, going beyond simple statistical features such as the number of hashtags and the number of URLs, we examine the time property by advancing the design of some features used in the literature and by proposing new time-based features. The new features are divided into robust advanced statistical features that explicitly incorporate the time attribute, and behavioral features that identify posting behavior patterns. The experimental results show that the new features correctly classify the majority of spammers, with an accuracy higher than 93% when using the Random Forest learning algorithm on a collected and annotated data set. These results outperform the accuracy of state-of-the-art features by about 6%, demonstrating the significance of leveraging time in detecting spam accounts.
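
The abstract does not define the individual features, so the following is only a minimal sketch of what "incorporating the time attribute" into simple counts could look like. The `Tweet` type, the `time_features` function, the feature names, and the crude substring counting of URLs and hashtags are all assumptions made for illustration; they are not the features proposed in the paper.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Tweet:
    # Hypothetical record: posting time in seconds since the epoch, and raw text.
    timestamp: float
    text: str


def time_features(tweets):
    """Return illustrative time-aware features for one account (not the paper's)."""
    if len(tweets) < 2:
        raise ValueError("need at least two tweets to measure posting gaps")

    ordered = sorted(tweets, key=lambda t: t.timestamp)
    # Gaps between consecutive posts, in seconds.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    span_hours = (ordered[-1].timestamp - ordered[0].timestamp) / 3600.0

    # Crude counts: substring matching stands in for real entity extraction.
    urls = sum(t.text.count("http") for t in ordered)
    hashtags = sum(t.text.count("#") for t in ordered)

    gap_mean = mean(gaps)
    return {
        # Plain counts rescaled by the observed time span.
        "urls_per_hour": urls / span_hours if span_hours > 0 else 0.0,
        "hashtags_per_hour": hashtags / span_hours if span_hours > 0 else 0.0,
        # Regularity of posting: a coefficient of variation near zero
        # means almost perfectly periodic posting.
        "gap_mean_s": gap_mean,
        "gap_cv": pstdev(gaps) / gap_mean if gap_mean > 0 else 0.0,
    }


# Example: a hypothetical account posting the same kind of tweet every 10 minutes.
periodic = [Tweet(600.0 * i, "#deal http://example.com") for i in range(7)]
features = time_features(periodic)
```

A feature dictionary of this kind, computed per account, could then be passed to a classifier such as Random Forest, the learner named in the abstract. The rate features play the role of time-aware statistical features, while the gap regularity is one possible behavioral signal.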


    • Published in

      ACM Other conferences
      MEDES: Proceedings of the 8th International Conference on Management of Digital EcoSystems
      November 2016
      243 pages
      ISBN:9781450342674
      DOI:10.1145/3012071

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 267 of 682 submissions, 39%
