DCWord: A Novel Deep Learning Approach to Deceptive Review Identification by Word Vectors

Zhang, Wen; Wang, Qiang; Li, Xiangjun; Yoshida, Taketoshi; Li, Jian

doi:10.1007/s11518-019-5438-4

Published: 11 December 2019

Volume 28, pages 731–746, (2019)

Journal of Systems Science and Systems Engineering

Wen Zhang¹, Qiang Wang¹, Xiangjun Li², Taketoshi Yoshida³ & Jian Li¹

Abstract

Because online forums are anonymous and open to all, it is very hard for human readers to distinguish deceptive reviews from truthful ones. This paper proposes a deep learning approach to text representation, called DCWord (Deep Context representation by Word vectors), for deceptive review identification. The basic idea is that deceptive reviews are written by people without real experience of the goods or services purchased online, whereas truthful reviews are written by people with such experience, so the contextual information of words should differ between the two. Unlike state-of-the-art techniques, which seek the best linguistic features for representation, we use word vectors to characterize the contextual information of words in deceptive and truthful reviews automatically. An average-pooling strategy (called DCWord-A) and a max-pooling strategy (called DCWord-M) are used to produce review vectors from word vectors. Experimental results on the Spam dataset and the Deception dataset demonstrate that the DCWord-M representation with LR (Logistic Regression) produces the best performance and outperforms state-of-the-art techniques on deceptive review identification. Moreover, the DCWord-M strategy outperforms the DCWord-A strategy in review representation for deceptive review identification. The outcome of this study has potential implications for online review management and for business intelligence related to deceptive review identification.
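
The two pooling strategies named in the abstract can be illustrated with a minimal sketch. It assumes that each review is a list of tokens and that every known token maps to a fixed-length word vector. The abstract does not say how the word vectors are trained or how out-of-vocabulary tokens are handled, so the helper name `pool_review`, the toy three-dimensional vectors, and the choice to skip unknown tokens are all hypothetical, not the authors' implementation.

```python
from typing import Dict, List, Sequence


def pool_review(
    tokens: Sequence[str],
    word_vectors: Dict[str, List[float]],
    strategy: str = "max",
) -> List[float]:
    """Collapse the word vectors of one review into a single review vector.

    strategy="avg" takes the element-wise mean (in the spirit of DCWord-A);
    strategy="max" takes the element-wise maximum (in the spirit of DCWord-M).
    Tokens without a vector are skipped, which is an assumption of this sketch.
    """
    rows = [word_vectors[t] for t in tokens if t in word_vectors]
    if not rows:
        raise ValueError("review contains no token with a known word vector")

    # Transpose so that each tuple holds one dimension across all words.
    columns = list(zip(*rows))

    if strategy == "avg":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling strategy: {strategy!r}")


# Toy vectors invented for illustration only; real word vectors would be
# learned from a review corpus and have many more dimensions.
toy_vectors = {
    "great": [0.5, -1.0, 2.0],
    "hotel": [1.5, 0.0, -2.0],
    "stay": [-0.5, 4.0, 0.0],
}

review = ["great", "hotel", "stay", "unknownword"]
avg_vec = pool_review(review, toy_vectors, strategy="avg")
max_vec = pool_review(review, toy_vectors, strategy="max")
```

Either review vector has the same length as a single word vector, so reviews of different lengths become fixed-size inputs that a classifier such as logistic regression could consume.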

Acknowledgments

This research is supported in part by the National Natural Science Foundation of China under Grant Nos. 71932002, 61379046, 91318302 and 61432001, and by the Innovation Fund Project of Xi’an Science and Technology Program (Special Series for Xi’an University under Grant No. 2016CXWL21). The authors also sincerely thank the referees for their practical help in improving the quality of this paper.

Author information

Authors and Affiliations

Research Base of Beijing Modern Manufacturing Development, College of Economics and Management, Beijing University of Technology, Beijing, 100124, China
Wen Zhang, Qiang Wang & Jian Li
School of Information Engineering, Xi’an University, Xi’an, 710065, China
Xiangjun Li
School of Knowledge Science, Japan Advanced Institute of Science and Technology, Ishikawa, 923-1292, Japan
Taketoshi Yoshida

Corresponding author

Correspondence to Wen Zhang.

Additional information

Wen Zhang is a professor in the College of Economics and Management at Beijing University of Technology (BJUT). He received his PhD degree in knowledge science from the Japan Advanced Institute of Science and Technology in 2009. His recent research interests include machine learning, data mining, and information systems.

Qiang Wang is a PhD candidate in the College of Economics and Management at Beijing University of Technology (BJUT). He received his BS degree in marketing from Qufu Normal University in 2016. His research interests include e-commerce big data analysis, data mining, and machine learning.

Xiangjun Li is a professor with the School of Information Engineering, Xi’an University. She received her PhD from Xidian University in 2013. Her current research interests include data mining, knowledge discovery, and machine learning.

Taketoshi Yoshida is a professor with the School of Knowledge Science, Japan Advanced Institute of Science and Technology. He received his PhD degree in systems engineering from Case Western Reserve University in 1984. His current research interests include knowledge management, knowledge discovery, and information systems.

Jian Li is a professor in the College of Economics and Management at Beijing University of Technology (BJUT). He received his PhD degree from the Chinese Academy of Sciences in 2007. His recent research interests include supply chain finance, blockchain technology, and emergency management.

About this article

Cite this article

Zhang, W., Wang, Q., Li, X. et al. DCWord: A Novel Deep Learning Approach to Deceptive Review Identification by Word Vectors. J. Syst. Sci. Syst. Eng. 28, 731–746 (2019). https://doi.org/10.1007/s11518-019-5438-4

Published: 11 December 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11518-019-5438-4
