How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis

  • Cognitive Computing for Intelligent Application and Service
Neural Computing and Applications

Abstract

Cognitive computing is an interdisciplinary research field that simulates human thought processes in a computerized model. One application of cognitive computing is sentiment analysis of online reviews, which reflect the opinions and attitudes of consumers toward the products and services they have experienced. A high level of classification performance facilitates decision making for both consumers and firms. However, while much effort has gone into proposing advanced classification algorithms to improve performance, the importance of the textual quality of the data has been ignored. This research explores the impact of two influential textual features, namely word count and review readability, on the performance of sentiment classification. We apply three representative deep learning techniques, namely the simple recurrent network (SRN), long short-term memory (LSTM), and the convolutional neural network (CNN), to sentiment analysis tasks on a benchmark movie review dataset. Multiple regression models are further employed for statistical analysis. Our findings show that a dataset of short, highly readable reviews achieves the best performance among all combinations of word count and readability levels, and that controlling review length is more effective for achieving higher accuracy than increasing readability. Based on these findings, a practical application, i.e., a text evaluator or a website plug-in for text evaluation, can be developed to provide review editing and quality control services for crowd-sourced review websites. These findings contribute to generating more valuable reviews with high textual quality to better serve sentiment analysis and decision making.
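As an illustration of the two textual features named in the abstract, the following minimal sketch computes a word count and a readability score for each review and then splits a corpus into the four length/readability combinations. Several choices here are assumptions rather than the authors' stated procedure: the Flesch Reading Ease formula stands in for the readability measure, syllables are estimated with a crude vowel-group heuristic, the short/long and high/low cut-offs are taken at the corpus median, and the function names `count_syllables`, `text_features`, and `split_by_median` are hypothetical.

```python
import re
from statistics import median


def count_syllables(word: str) -> int:
    """Rough estimate (assumption): one syllable per run of vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_features(review: str) -> dict:
    """Return the word count and Flesch Reading Ease score of one review."""
    words = re.findall(r"[A-Za-z']+", review)
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    n_words = len(words)
    n_sentences = max(1, len(sentences))
    n_syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores indicate easier text.
    readability = (
        206.835
        - 1.015 * (n_words / n_sentences)
        - 84.6 * (n_syllables / max(1, n_words))
    )
    return {"word_count": n_words, "readability": readability}


def split_by_median(reviews: list) -> dict:
    """Group reviews into (length, readability) cells using median cut-offs.

    The median split is an illustrative assumption, not the paper's thresholds.
    """
    feats = [text_features(r) for r in reviews]
    wc_cut = median(f["word_count"] for f in feats)
    rd_cut = median(f["readability"] for f in feats)
    groups = {}
    for review, f in zip(reviews, feats):
        key = (
            "short" if f["word_count"] <= wc_cut else "long",
            "high" if f["readability"] >= rd_cut else "low",
        )
        groups.setdefault(key, []).append(review)
    return groups


if __name__ == "__main__":
    sample = [
        "Great movie. I loved it!",
        "The cinematography demonstrates extraordinary sophistication "
        "notwithstanding occasionally incomprehensible narrative "
        "complications throughout.",
    ]
    for cell, members in split_by_median(sample).items():
        print(cell, len(members))
```

Each resulting cell could then be used as a separate training and evaluation set, so that classifier accuracy can be compared across the length and readability levels.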


Notes

  1. Feature selection is a strategy that aims at making classifiers more efficient and accurate.

  2. They used a gradient boosting algorithm, an ensemble learning technique, to implement the prediction task.

  3. https://www.amazon.com/.

  4. https://www.tripadvisor.com.au/.

  5. https://www.douban.com/.

  6. https://www.imdb.com/.

  7. Helpfulness is regarded as a quality assurance tool for online reviews (Huang et al. [39]).

  8. LSA, i.e., latent semantic analysis.

  9. MySpace is a social network site where comments contain informal English text.

  10. TF-IDF is a popular statistical method that employs word frequencies.

  11. IG, namely Information Gain, is one of the techniques for feature selection.

  12. http://ai.stanford.edu/~amaas/data/sentiment/.

  13. https://www.nltk.org/.

  14. https://www.tensorflow.org/.

  15. https://keras.io/.

  16. The statistical significance is tested by using an independent t test.

  17. https://www.taobao.com/.

  18. https://www.jd.com/.

  19. https://www.vip.com/.

  20. https://www.yelp.com/.

References

  1. Abbasi A, Chen H, Salem A (2008) Sentiment analysis in multiple languages: feature selection for opinion classification in web forums. ACM Trans Inf Syst 26(3):12


  2. Amarouche K, Benbrahim H, Kassou I (2015) Product opinion mining for competitive intelligence. Proc Comput Sci 73:358–365


  3. Araque O, Corcuera-Platas I, Sánchez-Rada JF, Iglesias CA (2017) Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst Appl Int J 77(C):236–246

  4. Ardire S, Roe C (2014) Cognitive computing: an emerging hub in IT ecosystems—data management’s new imperative. DATAVERSITY. August 18, 2014

  5. Arkhipenko K, Kozlov I, Trofimovich J, Skorniakov K, Gomzin A, Turdakov D (2016) Comparison of neural network architectures for sentiment analysis of Russian tweets. In: Proceedings of international conference, computational linguistics and intellectual technologies

  6. Cao Q, Duan W, Gan Q (2011) Exploring determinants of voting for the “helpfulness” of online user reviews: a text mining approach. Decis Support Syst 50(2):511–521


  7. Chaturvedi I, Ong YS, Tsang IW, Welsch RE, Cambria E (2016) Learning word dependencies in text by means of a deep recurrent belief network. Knowl Based Syst 108(C):144–154


  8. Chen J, Chen Y, Du X, Li C, Lu J, Zhao S et al (2013) Big data challenge: a data management perspective. Front Comput Sci 7(2):157–164


  9. Chen T, Xu R, He Y, Wang X (2017) Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl 72:221–230


  10. Chen Y, Argentinis JE, Weber G (2016) IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clin Ther 38(4):688–701


  11. Choi Y, Lee H (2017) Data properties and the performance of sentiment classification for electronic commerce applications. Inf Syst Front 19(5):993–1012. https://doi.org/10.1007/s10796-017-9741-7


  12. Chua AYK, Banerjee S (2016) Helpfulness of user-generated reviews as a function of review sentiment, product type and information quality. Comput Hum Behav 54(C):547–554


  13. Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

  14. Cireşan D, Meier U, Schmidhuber J (2012) Multi-column deep neural networks for image classification. Eprint Arxiv 157(10):3642–3649


  15. Coccoli M, Maresca P, Stanganelli L (2017) The role of big data and cognitive computing in the learning process. J Vis Lang Comput 38:97–103


  16. Coleman M, Liau TL (1975) A computer readability formula designed for machine scoring. J Appl Psychol 60(2):283–284


  17. Cortes C, Jackel LD, Chiang WP (1995) Limits on learning machine accuracy imposed by data quality. In: International Conference on Neural Information Processing Systems, MIT Press, pp 239–246

  18. Crossley SA, Kyle K, Allen LK, Guo L, McNamara DS (2014) Linguistic microfeatures to predict L2 writing proficiency: a case study in automated writing evaluation. J Writ Assess 7(1). Retrieved from http://www.journalofwritingassessment.org/article.php?article=74

  19. Dale E, Chall JS (1948) A formula for predicting readability. Educ Res Bull 27(1):37–54


  20. Demirkan H, Bess C, Spohrer J, Rayes A, Allen D, Moghaddam Y (2015) Innovations with smart service systems: analytics, big data, cognitive assistance, and the internet of everything. CAIS 37:35


  21. Dolamic L, Savoy J (2010) When stopword lists make the difference. J Assoc Inf Sci Technol 61(1):200–203


  22. Ebert S, Vu NT, Schütze H (2015) A linguistically informed convolutional neural network. In Proceedings of the 6th workshop on computational approaches to subjectivity, sentiment and social media analysis, pp 109–114

  23. Elman JL (1990) Finding structure in time. Cognit Sci 14(2):179–211


  24. Fink L, Rosenfeld L, Ravid G (2018) Longer online reviews are not necessarily better. Int J Inf Manag 39:30–37


  25. Flesch R (1948) A new readability yardstick. J Appl Psychol 32(3):221


  26. Forman C, Ghose A, Wiesenfeld B (2008) Examining the relationship between reviews and sales: the role of reviewer identity disclosure in electronic markets. Inf Syst Res 19(3):291–313


  27. Garrett MA (2014) Big data analytics and cognitive computing—future opportunities for astronomical research. In: IOP conference series: materials science and engineering, vol 67(1). IOP Publishing, p 012017

  28. Garrett MA (2015) SETI reloaded: next generation radio telescopes, transients and cognitive computing. Acta Astronaut 113:8–12


  29. Ghose A, Ipeirotis P (2007) Designing ranking systems for consumer reviews: the economic impact of customer sentiment in electronic markets. In: ICDSS

  30. Ghose A, Ipeirotis PG (2011) Estimating the helpfulness and economic impact of product reviews: mining text and reviewer characteristics. Soc Sci Electron Publ 23(10):1498–1512


  31. Graesser AC, McNamara DS, Louwerse MM, Cai Z (2004) Coh-Metrix: analysis of text on cohesion and language. Behav Res Methods Instrum Comput 36(2):193–202

  32. Gudivada VN (2016) Cognitive computing: concepts, architectures, systems, and applications. In: Handbook of statistics, vol 35. Elsevier, pp 3–38. https://doi.org/10.1016/bs.host.2016.07.004

  33. Gunning R (1952) The technique of clear writing. McGraw-Hill, New York


  34. Guo Y, Liu Y, Oerlemans A, Lao S, Wu S, Lew MS (2016) Deep learning for visual understanding. Neurocomputing 187(C):27–48

  35. Gupta S, Kar AK, Baabdullah A, Khowaiter WA (2018) Big data with cognitive computing: a review for the future. Int J Inf Manag 42:78–89


  36. Hensley N (2014) Data warehousing: the brains of the big data operation. In: IBM data management magazine. Retrieved from http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84900316486&origin=inward

  37. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780


  38. Hu N, Koh NS, Reddy SK (2014) Ratings lead you to the product, reviews help you clinch it? The mediating role of online review sentiments on product sales. Decis Support Syst 57(1):42–53


  39. Huang AH, Chen K, Yen DC, Tran TP (2015) A study of factors that contribute to online review helpfulness. Comput Hum Behav 48(C):17–27


  40. Huang AH, Yen DC (2013) Predicting the helpfulness of online reviews. Int J Hum Comput Interact 29(2):129–138


  41. Hurwitz J, Kaufman M, Bowles A (2015) Cognitive computing and big data analytics. Wiley, New York


  42. Hyman R, Schroder HM, Driver MJ, Streufert S (1967) Human information processing: individuals and groups functioning in complex situations. Am J Psychol 83(1):136


  43. Irsoy O, Cardie C (2014) Deep recursive neural networks for compositionality in language. In: Advances in neural information processing systems 27: annual conference on neural information processing systems (NIPS), pp 2096–2104

  44. Jacoby J, Speller DE, Berning CK (1974) Brand choice behavior as a function of information load. J Consum Res 1(1):33–42


  45. Jang H, Shin H (2010) Effective use of linguistic features for sentiment analysis of Korean. In: PACLIC, pp 173–182

  46. Johnson RC (2002) Darpa puts thought into cognitive computing, Electronic Engineering Times. Retrieved from https://www.eetimes.com/document.asp?doc_id=1227413

  47. Jones Q, Ravid G, Rafaeli S (2004) Information overload and the message dynamics of online interaction spaces: a theoretical model and empirical exploration. Inf Syst Res 15(2):194–210


  48. Jozefowicz R, Zaremba W, Sutskever I (2015) An empirical exploration of recurrent network architectures. In: International conference on machine learning, pp 2342–2350. JMLR.org

  49. Jurgovsky J, Granitzer M (2015) Comparing recursive autoencoder and convolutional network for phrase-level sentiment polarity classification. Natural language processing and information systems. Springer, New York


  50. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: proceedings of the 52nd annual meeting of the association for computational linguistics (ACL). Association for Computational Linguistics, Baltimore, pp 655–665

  51. Kelly JE III, Hamm S (2013) Smart machines: IBM’s Watson and the era of cognitive computing. Columbia University Press, New York

  52. Khatri I, Shrivastava VK (2016) A survey of big data in healthcare industry. In: Choudhary R, Mandal J, Auluck N, Nagarajaram H (eds) Advanced computing and communication technologies. Advances in intelligent systems and computing, vol 452. Springer, Singapore, pp 245–257


  53. Kim SM, Pantel P, Chklovski T, Pennacchiotti M (2006) Automatically assessing review helpfulness. In: Conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 423–430

  54. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1746–1751

  55. Kincaid JP, Fishburne RP, Rogers RL, Chissom BS (1975) Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Research Branch Report, 8-75. Naval Air Station Memphis, Millington, Tennessee

  56. Korfiatis N, García-Bariocanal E, Sánchez-Alonso S (2012) Evaluating content quality and helpfulness of online product reviews: the interplay of review helpfulness versus review content q. Electron Commer Res Appl 11(3):205–217


  57. Krishnamoorthy S (2015) Linguistic features for review helpfulness prediction. Expert Syst Appl 42(7):3751–3759


  58. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: International conference on neural information processing systems, vol 60. Curran Associates Inc, pp 1097–1105

  59. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551


  60. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436


  61. Lee EJ, Shin SY (2014) When do consumers buy online product reviews? Effects of review quality, product type, and reviewer’s photo. Comput Hum Behav 31(1):356–366


  62. Liu B (2010) Sentiment analysis and subjectivity. 30(36):152–153


  63. Zhang L, Liu B (2017) Sentiment analysis and opinion mining. In: Sammut C, Webb GI (eds) Encyclopedia of machine learning and data mining. Springer, Boston, MA, pp 1152–1161


  64. Liu B, Zhang L (2012) A survey of opinion mining and sentiment analysis. Mining text data. Springer, New York


  65. Liu J, Cao Y, Lin CY, Huang Y, Zhou M (2007) Low-quality product review detection in opinion summarization. In: EMNLP-CoNLL 2007, Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning. DBLP, Prague, Czech Republic, pp 334–342

  66. Liu P, Qiu X, Huang X (2016) Recurrent neural network for text classification with multi-task learning. In: International joint conference on artificial intelligence. AAAI Press, pp 2873–2879

  67. Maas AL, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, Stroudsburg, pp 142–150. Retrieved from http://www.aclweb.org/anthology/P11-1015

  68. Malhotra NK (1982) Information load and consumer decision making. J Consum Res 8(4):419–430


  69. Malik M, Hussain A (2017) Helpfulness of product reviews as a function of discrete positive and negative emotions. Elsevier, Amsterdam


  70. McLaughlin GH (1969) Smog grading—a new readability formula. J Read 12(8):639–646


  71. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. Int Conf Neural Inf Process Syst 26:3111–3119


  72. Moraes R, Valiati JF, Gavião Neto WP (2013) Document-level sentiment classification: an empirical comparison between SVM and ANN. Expert Syst Appl 40(2):621–633


  73. Mudambi SM, Schuff D (2010) What makes a helpful online review? A study of customer reviews on amazon.com. In: Society for information management and the management information systems research center

  74. National Governors Association (2010) Common core state standards. Light J 19:19


  75. Onan A, Korukoğlu S (2015) A feature selection model based on genetic rank aggregation for text sentiment classification. J Inf Sci 39(5):1103–1107


  76. Otterbacher J (2009) ‘Helpfulness’ in online communities: a measure of message quality. In: Sigchi conference on human factors in computing systems. ACM, pp 955–964

  77. Pande V, Khandelwal A (2017) Comparative study of various methods of classification techniques using different datasets. Int J Adv Res 5(5):1573–1583


  78. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of empirical methods in natural language processing

  79. Qazi A, Syed KBS, Raj RG, Cambria E, Tahir M, Alghazzawi D (2016) A concept-level approach to the analysis of online review helpfulness. Comput Hum Behav 58(C):75–81


  80. Roscoe RD, Varner LK, Crossley SA, McNamara DS (2013) Developing pedagogically-guided algorithms for intelligent writing feedback. Int J Learn Technol 8(4):362. https://doi.org/10.1504/ijlt.2013.059131


  81. Ruangkanokmas P, Achalakul T, Akkarajitsakul K (2016) Deep belief networks with feature selection for sentiment classification. In: International conference on intelligent systems, modelling and simulation. IEEE, pp 9–14

  82. Saidani FR, Rassoul I (2017) A weighted genetic approach for feature selection in sentiment analysis. Int J Comput Intell Appl 16(02):591–600


  83. Salehan M, Kim DJ (2016) Predicting the performance of online consumer reviews: a sentiment mining approach to big data analytics. Decis Support Syst 81:30–40


  84. Schroder HM, Driver MJ, Streufert S (1967) Human information processing: individuals and groups functioning in complex social situations. Holt, Rinehart and Winston

  85. Severyn A, Moschitti A (2015) Twitter sentiment analysis with deep convolutional neural networks. In: The international ACM SIGIR conference. ACM, pp 959–962

  86. Sheth A (2016) Internet of things to smart iot through semantic, cognitive, and perceptual computing. IEEE Intell Syst 31(2):108–112


  87. Sheth A, Anantharam P, Henson C (2016) Semantic, cognitive, and perceptual computing: paradigms that shape human experience. Computer 49(3):64–72


  88. Siering M, Muntermann J (2013) What drives the helpfulness of online product reviews? From stars to facts and emotions. WI, pp 103–118

  89. Siering M, Muntermann J, Rajagopalan B (2018) Explaining and predicting online review helpfulness: the role of content and reviewer-related signals. Decis Support Syst 108:1–12. https://doi.org/10.1016/j.dss.2018.01.004


  90. Singh J, Singh G, Singh R (2016) A review of sentiment analysis techniques for opinionated web text. CSI Trans ICT 4(2–4):241–247


  91. Singh JP, Irani S, Rana NP, Dwivedi YK, Saumya S, Roy PK (2017) Predicting the “helpfulness” of online consumer reviews. J Bus Res 70:346–355

  92. Smith EA, Senter RJ (1967) Automated readability index. AMRL-TR. Aerospace medical research laboratories (6570th). iii. pp 1–14

  93. Socher R, Pennington J, Huang EH, Ng AY, Manning CD (2011) Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Conference on empirical methods in natural language processing, EMNLP 2011, 27–31 July 2011. John Mcintyre Conference Centre, Edinburgh, UK, A Meeting of Sigdat, A Special Interest Group of the ACL. DBLP, pp 151–161

  94. Socher R, Perelygin A, Wu JY, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 conference on empirical methods in natural language processing (EMNLP). Citeseer, pp 1631–1642

  95. Spohrer J, Banavar G (2015) Cognition as a service: an industry perspective. AI Mag 36(4):71–86


  96. Tai KS, Socher R, Manning CD (2015) Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics

  97. Tang D, Qin B, Liu T (2015a) Document modeling with gated recurrent neural network for sentiment classification. In: EMNLP, pp 1422–1432

  98. Tang D, Qin B, Liu T (2015) Deep learning for sentiment analysis: successful approaches and future challenges. Wiley Interdiscip Rev Data Min Knowl Discov 5(6):292–303


  99. Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010) Sentiment strength detection in short informal text. J Assoc Inf Sci Technol 61(12):2544–2558


  100. Tripathy A, Anand A, Rath SK (2017) Document-level sentiment classification using hybrid machine learning approach. Knowl Inf Syst 1:1–27


  101. Van De Bogart W (2015) Information entanglement: developments in cognitive based knowledge acquisition strategies based on big data. In: International conference on intellectual capital and knowledge management and organisational learning. Academic Conferences International Limited, p 299

  102. Xu K, Liao SS, Li J, Song Y (2011) Mining comparative opinions from customer reviews for competitive intelligence. Decis Support Syst 50(4):743–754


  103. Yin W, Kann K, Yu M, Schütze H (2017) Comparative study of CNN and RNN for natural language processing. Arxiv.org. Retrieved from https://arxiv.org/abs/1702.01923

  104. Yousefpour A, Ibrahim R, Hamed HNA (2017) Ordinal-based and frequency-based integration of feature selection methods for sentiment analysis. Expert Syst Appl 75:80–93


  105. Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: a survey. Wiley Interdiscip Rev Data Min Knowl Discov 8(4):e1253. https://doi.org/10.1002/widm.1253

  106. Zhao H, Zhang X, Li K (2017) A sentiment classification model using group characteristics of writing style features. Int J Pattern Recog Artif Intell 31(12):12–453


  107. Zhou ZH, Chawla NV, Jin Y, Williams GJ (2014) Big data opportunities and challenges: discussions from data analytics perspectives [discussion forum]. IEEE Comput Intell Mag 9(4):62–74


  108. Vessey I (1991) Cognitive fit: a theory-based analysis of the graphs versus tables literature. Decis Sci 22(2):219–240



Author information

Corresponding author

Correspondence to Dawei Jin.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest associated with this article.

About this article


Cite this article

Li, L., Goh, TT. & Jin, D. How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis. Neural Comput & Applic 32, 4387–4415 (2020). https://doi.org/10.1007/s00521-018-3865-7


  • DOI: https://doi.org/10.1007/s00521-018-3865-7
