Abstract
Convolutional neural networks are known for their excellent performance in computer vision, where they achieve state-of-the-art results. Recent research has shown that these networks can also provide promising results in natural language processing. In this case, the basic idea is to concatenate the vector representations of the words into a single block and treat it as an image. Despite the good results, however, the difficulty with convolutional networks is the large number of design decisions that must be made a priori. These models require the definition of many hyper-parameters: the type of word embedding, which is the vectorized representation of the data; the activation function, which gives the model its non-linearity; the size of the filter applied in the convolution; the number of feature maps, which are responsible for identifying attributes; and the pooling method used for data reduction. In addition, one must predefine the regularization constant and the dropout rate, which help prevent over-fitting. Existing works present convolutional neural network architectures that outperform traditional machine learning models. Even though these architectures can compete with more complex models, how different hyper-parameter settings affect the performance of this type of network has not yet been explored. In this paper, we propose an efficient sentiment analysis classifier based on convolutional neural networks, obtained by analyzing the impact of the hyper-parameters on model performance. The main interest in sentiment analysis comes from the advent of social media and the technological advances that flood the Internet with opinions. Nonetheless, mining the Internet for opinions and sentiment is not an easy task, and it requires well-performing models with well-chosen hyper-parameter settings to produce pertinent answers.
The results were obtained using a GPU. They show that the different configurations exceed the reference models in most cases, with gains of up to 18%, and perform similarly to state-of-the-art models, with gains of up to 2% in some cases.
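As a rough illustration of the idea described in the abstract, the sketch below stacks word vectors into a sentence matrix, slides one filter per feature map over windows of consecutive words, and applies max pooling. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the random toy embeddings, the ReLU activation, the zero vector for unknown words, and the chosen sizes are all hypothetical, and regularization and dropout are omitted.

```python
import random

def sentence_matrix(tokens, embeddings, dim):
    """Stack one embedding vector per token into an (n_words x dim) matrix."""
    zero = [0.0] * dim  # assumption: unknown words map to a zero vector
    return [embeddings.get(tok, zero) for tok in tokens]

def conv_feature_map(matrix, filt, bias=0.0):
    """Slide an (h x dim) filter over windows of h consecutive words.

    Returns one activation per window (ReLU is an illustrative choice).
    """
    h = len(filt)
    out = []
    for i in range(len(matrix) - h + 1):
        s = bias
        for r in range(h):
            s += sum(w * x for w, x in zip(filt[r], matrix[i + r]))
        out.append(max(0.0, s))  # ReLU non-linearity
    return out

def max_pool(feature_map):
    """Reduce a feature map to its largest activation (0.0 if it is empty)."""
    return max(feature_map, default=0.0)

# Hypothetical hyper-parameters, chosen only for this toy example.
EMBEDDING_DIM = 4
FILTER_SIZE = 2
NUM_FEATURE_MAPS = 3

random.seed(0)
tokens = ["the", "movie", "was", "great"]
# Toy random embeddings stand in for pretrained word vectors.
embeddings = {w: [random.uniform(-1, 1) for _ in range(EMBEDDING_DIM)]
              for w in tokens}
filters = [[[random.uniform(-1, 1) for _ in range(EMBEDDING_DIM)]
            for _ in range(FILTER_SIZE)]
           for _ in range(NUM_FEATURE_MAPS)]

X = sentence_matrix(tokens, embeddings, EMBEDDING_DIM)
# One pooled value per feature map; a classifier layer would consume these.
features = [max_pool(conv_feature_map(X, f)) for f in filters]
print(len(X), len(X[0]), len(features))
```

Each constant at the top of the example corresponds to one of the design decisions listed in the abstract (embedding type and size, filter size, number of feature maps, activation, pooling), which is why changing any of them changes the features the classifier sees.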
Acknowledgements
This work is supported by CAPES, the Coordination for the Improvement of Higher Education Personnel of the Brazilian Federal Government. This study is also funded by FAPERJ (Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro) under grant number 203.111/2018.
Cite this article
Nedjah, N., Santos, I. & de Macedo Mourelle, L. Sentiment analysis using convolutional neural network via word embeddings. Evol. Intel. 15, 2295–2319 (2022). https://doi.org/10.1007/s12065-019-00227-4
DOI: https://doi.org/10.1007/s12065-019-00227-4