Abstract
Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (spectral, etc.) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look “normal.” Even worse, some fraudsters use hijacked accounts from honest users, and then the camouflage is indeed organic.
Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4,000 accounts, of which a majority had tweets showing that they used follower-buying services.
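To make the notion of "identifying dense subgraphs" concrete, the sketch below shows the classic greedy peeling heuristic for the densest-subgraph problem on a small user-product graph. It is an illustration only and is not FRAUDAR itself: the abstract does not specify FRAUDAR's density measure or procedure, so the function name `greedy_densest_subgraph`, the density definition (edges divided by nodes), and the toy data are all assumptions made for this example. Plain average-degree density, as used here, is not claimed to be camouflage resistant.

```python
import heapq
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling: repeatedly delete a minimum-degree node and return
    the intermediate node set with the highest density |E(S)| / |S|.

    Illustrative sketch (assumed density measure), not the FRAUDAR algorithm.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    alive = set(adj)
    num_edges = sum(degree.values()) // 2

    # Min-heap of (degree, node); outdated entries are skipped lazily.
    heap = [(d, node) for node, d in degree.items()]
    heapq.heapify(heap)

    best_density = num_edges / len(alive) if alive else 0.0
    removed_order = []
    best_removed = 0  # number of removals that produced the best set

    while heap:
        d, node = heapq.heappop(heap)
        if node not in alive or d != degree[node]:
            continue  # stale heap entry
        alive.remove(node)
        removed_order.append(node)
        num_edges -= degree[node]
        for nbr in adj[node]:
            if nbr in alive:
                degree[nbr] -= 1
                heapq.heappush(heap, (degree[nbr], nbr))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_removed = len(removed_order)

    best_set = set(adj) - set(removed_order[:best_removed])
    return best_set, best_density


# Toy data (hypothetical): three fraud accounts f1..f3 all review products
# x1..x3, and each adds one "camouflage" review of an honest product p1..p3.
edges = [(f, x) for f in ("f1", "f2", "f3") for x in ("x1", "x2", "x3")]
edges += [("f1", "p1"), ("f2", "p2"), ("f3", "p3")]                # camouflage
edges += [("h1", "p1"), ("h2", "p2"), ("h3", "p3"), ("h4", "p1")]  # honest users

block, density = greedy_densest_subgraph(edges)
print(sorted(block), density)
```

On this toy graph the peeling recovers the fraud block despite the camouflage edges. A fraudster who adds enough camouflage can still distort a simple density score, which is the evasion the abstract describes.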