Graph-Based Fraud Detection in the Face of Camouflage

Authors:
Bryan Hooi (ORCID 0000-0002-5645-1754), Carnegie Mellon University, Pittsburgh, USA
Kijung Shin (ORCID 0000-0002-2872-1526), Carnegie Mellon University, Pittsburgh, USA
Hyun Ah Song, Carnegie Mellon University, Pittsburgh, USA
Alex Beutel, Carnegie Mellon University, Pittsburgh, USA
Neil Shah, Carnegie Mellon University, Pittsburgh, USA
Christos Faloutsos, Carnegie Mellon University, Pittsburgh, USA

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 4, Article No. 44, pp. 1–26. https://doi.org/10.1145/3056563

Published: 29 June 2017

Abstract

Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (spectral, etc.) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look “normal.” Even worse, some fraudsters use hijacked accounts from honest users, and then the camouflage is indeed organic.

Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4,000 accounts, of which a majority had tweets showing that they used follower-buying services.
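As a rough illustration of the dense-subgraph idea the abstract refers to, the sketch below applies plain greedy peeling to a small user-product graph: it repeatedly removes the lowest-degree node and keeps the intermediate node set with the most edges per node. This is not FRAUDAR itself. The function name `greedy_dense_subgraph`, the density measure (edges divided by nodes), and the toy data, including the two "camouflage" edges, are assumptions made for illustration only.

```python
from collections import defaultdict


def greedy_dense_subgraph(edges):
    """Greedy peeling: repeatedly drop the lowest-degree node and keep the
    intermediate node set with the highest density (edges per node)."""
    adj = defaultdict(set)
    for user, product in edges:
        # Tag node ids so that users and products never collide.
        u, p = ("u", user), ("p", product)
        adj[u].add(p)
        adj[p].add(u)

    alive = set(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_density, best_size = 0.0, len(alive)
    removal_order = []

    while alive:
        density = n_edges / len(alive)
        if density > best_density:
            best_density, best_size = density, len(alive)
        # Linear scan for the minimum-degree node (ties broken by repr);
        # a priority queue would be faster on large graphs.
        victim = min(alive, key=lambda v: (len(adj[v]), repr(v)))
        for nbr in adj[victim]:
            adj[nbr].discard(victim)
        n_edges -= len(adj[victim])
        adj[victim].clear()
        alive.remove(victim)
        removal_order.append(victim)

    # The nodes removed last are exactly the best-scoring node set.
    best_nodes = set(removal_order[len(removal_order) - best_size:])
    return best_nodes, best_density


# Hypothetical toy data: three accounts review the same three products,
# and two of them add "camouflage" reviews of ordinary products.
fraud_block = [(f, x) for f in ("f1", "f2", "f3") for x in ("x1", "x2", "x3")]
honest = [("h1", "y1"), ("h2", "y2"), ("h3", "y1"), ("h4", "x1")]
camouflage = [("f1", "y1"), ("f2", "y2")]
example_edges = fraud_block + honest + camouflage

dense_nodes, dense_score = greedy_dense_subgraph(example_edges)
print(sorted(dense_nodes), dense_score)
```

In this toy graph the peeling removes the sparsely connected honest users and products first, so the camouflage edges disappear along with them and the 3-by-3 block remains as the densest set. Larger or more carefully constructed camouflage can defeat an unweighted density measure like this one, which is the setting the article addresses.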

Index Terms

1. Information systems
  1. World Wide Web
    1. Web applications
      1. Social networks
    2. Web searching and information discovery
      1. Content ranking

Published in
ACM Transactions on Knowledge Discovery from Data Volume 11, Issue 4
Special Issue on KDD 2016 and Regular Papers
November 2017
419 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3119906
Editor: Jie Tang, Tsinghua University, China
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 June 2017
- Accepted: 1 February 2017
- Revised: 1 January 2017
- Received: 1 November 2016
Author Tags
Fraud detection
link analysis
spam detection
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 46
- Total Downloads: 1,827
- Downloads (Last 12 months): 278
- Downloads (Last 6 weeks): 51