research-article
Public Access

Graph-Based Fraud Detection in the Face of Camouflage

Published: 29 June 2017

Abstract

Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (spectral, etc.) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look “normal.” Even worse, some fraudsters use hijacked accounts from honest users, and then the camouflage is indeed organic.

Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4,000 accounts, of which a majority had tweets showing that they used follower-buying services.
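To make the idea of a "dense subgraph" in a bipartite review graph concrete, the following is a minimal sketch of a generic greedy peeling heuristic: it repeatedly removes the lowest-degree node and keeps the intermediate node set with the highest edges-per-node ratio. The function name `greedy_densest_subgraph`, the toy edge list, and the plain edges-per-node density are all assumptions made for illustration. This is not the FRAUDAR algorithm or its camouflage-resistant metric, whose details are not given in this abstract.

```python
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling on a bipartite (user, product) edge list.

    Repeatedly removes the minimum-degree node and returns the
    intermediate node set with the highest edges/nodes ratio.
    Density here is plain edges-per-node (an illustrative assumption).
    """
    adj = defaultdict(set)
    for user, product in edges:
        u, p = ("user", user), ("product", product)
        adj[u].add(p)
        adj[p].add(u)

    alive = set(adj)
    if not alive:
        return set(), 0.0

    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes, best_density = set(alive), n_edges / len(alive)

    while len(alive) > 1:
        # Linear scan for the minimum-degree node; ties are broken by the
        # node label so runs are deterministic (ids on the same side must
        # be mutually comparable). A heap would be faster on large graphs.
        victim = min(alive, key=lambda v: (len(adj[v]), v))
        for nbr in adj[victim]:
            adj[nbr].discard(victim)
        n_edges -= len(adj[victim])
        alive.remove(victim)

        density = n_edges / len(alive)
        if density > best_density:
            best_nodes, best_density = set(alive), density

    return best_nodes, best_density


# Toy data (hypothetical): three fake users review the same three products,
# and each also adds one "camouflage" review of an honest product.
fraud = [(f, p) for f in ("f1", "f2", "f3") for p in ("p1", "p2", "p3")]
honest = [("h1", "q1"), ("h2", "q1"), ("h2", "q2"), ("h3", "q3"), ("h4", "q4")]
camouflage = [("f1", "q1"), ("f2", "q2"), ("f3", "q3")]

nodes, density = greedy_densest_subgraph(fraud + honest + camouflage)
print(sorted(nodes), density)
```

On this toy graph the densest set found is the block of three fake users and their three target products. Larger or heavier camouflage could change that outcome under this simple density, which is the kind of weakness the abstract describes.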



      • Published in

        ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 4
        Special Issue on KDD 2016 and Regular Papers
        November 2017
        419 pages
        ISSN: 1556-4681
        EISSN: 1556-472X
        DOI: 10.1145/3119906
        • Editor: Jie Tang

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 29 June 2017
        • Accepted: 1 February 2017
        • Revised: 1 January 2017
        • Received: 1 November 2016

        Qualifiers

        • research-article
        • Research
        • Refereed
