Abstract
With the growth of online review data, detecting fake or fraudulent reviews has become an urgent problem. One barrier to effective detection of fake reviews and reviewers is the difficulty of collecting ground-truth data: fake reviews are hard to judge, even for human experts. Researchers have proposed many methods that detect review spam from different perspectives, e.g., text-based or behavior-based, so there is a need to combine these methods to improve overall detection performance. In this paper, we raise the question of how to integrate multiple ranking lists generated by different types of review spam detection algorithms into a single overall ranking list. To address this problem, we propose a novel unsupervised integration model, SpamVote, that combines multiple ranking lists by voting. Because review spam strategies are diverse, we model the fitness of a particular algorithm for detecting a specific item as a latent variable and learn these latent variables from the ranking data. Extensive experiments on real-world datasets with various kinds of algorithms show that the integrated ranking list produced by SpamVote outperforms the individual voting lists with high probability.
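As a point of reference only, the simplest way to combine ranking lists by voting is an unweighted Borda count. The sketch below is not SpamVote: it gives every algorithm the same weight for every item, whereas the model described above learns a latent fitness of each algorithm for each item. The function name `borda_aggregate`, the three detector lists, and the reviewer IDs are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def borda_aggregate(ranking_lists: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Merge ranking lists (most suspicious first) with a plain Borda count.

    Illustrative baseline only; every list gets an equal vote.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in ranking_lists:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The top item of a list of length n earns n points, the last earns 1.
            scores[item] += n - position
    # Highest total score first; ties are broken by string form for determinism.
    return sorted(scores, key=lambda item: (-scores[item], str(item)))


# Hypothetical output of three detectors ranking the same four reviewers.
text_based = ["r3", "r1", "r2", "r4"]
behavior_based = ["r1", "r3", "r4", "r2"]
graph_based = ["r3", "r2", "r1", "r4"]

combined = borda_aggregate([text_based, behavior_based, graph_based])
print(combined)
```

An equal-weight vote like this cannot express that one detector is better suited to some items than another, which is the gap that per-item latent fitness variables are meant to address.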
Acknowledgements
We are very grateful to Knectt Lendoye from Beijing Institute of Technology for proofreading our manuscript, and Dr. Feng Wen from Shenyang Ligong University for the support of this research. We also thank the anonymous reviewers who gave invaluable suggestions on this work. This work was supported by the 2021 Shenyang Ligong University Research and Innovation Team Development Program Support Project under Grant No. SYLUTD202105.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Wang, Z., Li, H. & Wang, H. Vote-based integration of review spam detection algorithms. Appl Intell 53, 5048–5059 (2023). https://doi.org/10.1007/s10489-022-03807-7
DOI: https://doi.org/10.1007/s10489-022-03807-7