Solving the twitter sentiment analysis problem based on a machine learning-based approach

Zarisfi Kermani, Fatemeh; Sadeghi, Faramarz; Eslami, Esfandiar

doi:10.1007/s12065-019-00301-x

Research Paper
Published: 22 October 2019

Evolutionary Intelligence, volume 13, pages 381–398 (2020)

Fatemeh Zarisfi Kermani ORCID: orcid.org/0000-0003-4048-9296¹,
Faramarz Sadeghi¹ &
Esfandiar Eslami²

Abstract

Twitter Sentiment Analysis (TSA), a text classification task, has received wide attention from researchers in recent years. This paper presents a machine learning approach that solves the TSA problem in three phases. In the second phase, a suitable value for representing each feature in the Vector Space Model is determined through a weighted combination of the values obtained from four methods: Term Frequency and Inverse Document Frequency, semantic similarity, sentiment scoring using SentiWordNet, and sentiment scoring based on the class of tweets. Finding the contribution, or weight, of each method is defined as an optimization problem and solved using a genetic algorithm. The weighted values obtained from the four methods are then combined using the Einstein sum, an important T-conorm. Finally, the performance of the proposed method is evaluated by the accuracy of support vector machine and multinomial naïve Bayes classifiers on four well-known Twitter datasets: the Stanford testing dataset, the STS-Gold dataset, the Obama-McCain Debate dataset, and the Strict Obama-McCain Debate dataset. The results show that the proposed method outperforms the other methods considered.
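As a rough illustration of the combination step described in the abstract, the sketch below folds four weighted scores together with the Einstein sum, S(a, b) = (a + b) / (1 + ab). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each score is assumed to be already normalised to [0, 1], and the weights (which the paper obtains from a genetic algorithm) are simply passed in as given numbers.

```python
from functools import reduce


def einstein_sum(a: float, b: float) -> float:
    """Einstein sum of two values in [0, 1]; a T-conorm with identity 0."""
    return (a + b) / (1.0 + a * b)


def combine_feature_value(scores, weights):
    """Weight each method's score, then fold the results with the Einstein sum.

    Assumption (not stated in the abstract): every weighted score must lie
    in [0, 1] so that the Einstein sum stays in [0, 1].
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    weighted = [w * s for w, s in zip(weights, scores)]
    if any(not 0.0 <= v <= 1.0 for v in weighted):
        raise ValueError("weighted scores must lie in [0, 1]")
    # The Einstein sum is associative, so a left fold starting at 0 is enough.
    return reduce(einstein_sum, weighted, 0.0)


# Hypothetical scores for one term from the four methods named in the
# abstract (TF-IDF, semantic similarity, SentiWordNet score, class-based
# score) and hypothetical weights standing in for the optimised ones.
example_scores = [0.40, 0.25, 0.60, 0.10]
example_weights = [0.30, 0.20, 0.35, 0.15]
feature_value = combine_feature_value(example_scores, example_weights)
```

Because the Einstein sum is a T-conorm, the combined value is never smaller than the largest weighted score and never exceeds 1, which is one reason such an operator is convenient for aggregating evidence from several sources.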

Notes

https://zephoria.com/twitter-statistics-top-ten/.
https://www.internetlivestats.com/twitter-statistics/.
http://sentiment.christopherpotts.net/tokenizing.html.
https://en.wikipedia.org/wiki/List_of_emoticons.
https://www.noslang.com/dictionary.
https://www.netlingo.com/acronyms.php.
Stanford dataset official page: http://help.sentiment140.com/forstudents.
STS-Gold dataset: https://github.com/pollockj/world_mood/blob/master/sts_gold_v03/sts_gold_tweet.csv.
OMD dataset: https://github.com/pmbaumgartner/text-feat-lib.

References

1. Supriya BN, Kallimani V, Prakash S, Akki CB (2016) Twitter sentiment analysis using binary classification technique. In: International conference on nature of computation and communication ICTCC 2016: nature of computation and communication, pp 91–396
2. Haque MdA, Rahman T (2014) Sentiment analysis by using fuzzy logic. Int J Comput Sci Eng Inf Technol (IJCSEIT) 4:33–48
3. Shirdastian H, Laroche M, Richard M-O (2019) Using big data analytics to study brand authenticity sentiments: the case of Starbucks on Twitter. Int J Inf Manage 48:291–307
4. Mansour R, Hady MFA, Hosam E, Amr H, Ashour A (2015) Feature selection for twitter sentiment analysis: an experimental study. In: International conference on intelligent text processing and computational linguistics CICLing computational linguistics and intelligent text processing, pp 92–103
5. Bao Y, Quan Ch, Wang L, Ren F (2014) The role of pre-processing in twitter sentiment analysis. In: International conference on intelligent computing ICIC: intelligent computing methodologies, pp 615–624
6. Keshavarz H, Abadeh M-S (2017) ALGA: adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs. Knowl-Based Syst 122:1–16
7. Ismail H-M, Belkhouche B, Zaki N (2018) Semantic twitter sentiment analysis based on a fuzzy thesaurus. Soft Comput 22:6011–6024
8. Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 5:1093–1113
9. Asghar M-Z, Khan A, Khan F, Kundi F-M (2018) RIFT: a rule induction framework for twitter sentiment analysis. Arabian J Sci Eng 43:857–877
10. Le B, Nguyen H (2015) Twitter sentiment analysis using machine learning techniques. In: Advanced computational methods for knowledge engineering AISC: advances in intelligent systems and computing, pp 279–289
11. Pandey A-Ch, Rajpoot D-S, Saraswat M (2017) Twitter sentiment analysis using hybrid cuckoo search method. Inf Process Manage 53:764–779
12. Speriosu M, Sudan N, Upadhyay S, Baldridge J (2011) Twitter polarity classification with label propagation over lexical links and the follower graph. In: Conference on empirical methods in natural language processing, UK, pp 53–63
13. Masud F, Khan A, Ahmad S, Asghar M-Z (2014) Lexicon-based sentiment analysis in the social web. J Basic Appl Sci Res 4(6):238–248
14. Asghar M-Z, Kundi F-M, Ahmad Sh, Khan A, Khan F (2018) T-SAF: twitter sentiment analysis framework using a hybrid classification scheme. Exp Syst 35:1–19
15. Saif H, He Y, Fernandez M, Alani H (2016) Contextual semantics for sentiment analysis of Twitter. Inf Process Manage 52:5–19
16. Khan F-H, Qamar U, Bashir S (2016) SentiMI: introducing point-wise mutual information with SentiWordNet to improve sentiment polarity detection. Appl Soft Comput 39:140–153
17. Esuli A, Sebastiani F (2006) SentiWordNet: a publicly available lexical resource for opinion mining. In: Proceedings of the fifth international conference on language resources and evaluation, pp 417–422
18. Nielsen F-A (2011) A new ANEW: evaluation of a word list for sentiment analysis for microblogs. In: Proceedings of the ESWC2011 Workshop on ‘Making Sense of Microposts’: big things come in small packages, pp 93–98
19. Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Comput Linguist 37:267–307
20. Paltoglou G, Thelwall M (2010) A study of information retrieval weighting schemes for sentiment analysis. In: Proceedings of the 48th annual meeting of the association for computational linguistics: association for computational linguistics, pp 1386–1395
21. Yager RR, Kelman A (1996) Fusion of fuzzy information with considerations for compatibility, partial aggregation, and reinforcement. Int J Appr Reason 15:93–122
22. Appel O, Chiclana F, Carter J, Fujita H (2016) A hybrid approach to the sentiment analysis problem at the sentence level. Knowl-Based Syst 108:110–124
23. Gassert H (2018) Operators on fuzzy sets: zadeh and einsteinations on fuzzy sets properties of T-Norms and T-Conorms. https://pdfs.semanticscholar.org/a045/52b74047208d23d77b8aa9f5f334b59e65ea.pdf. Accessed 8 Dec 2018
24. Goldberg D-E (1989) Genetic algorithms in search optimization and machine learning. Addison-Wesley, Massachusetts
25. Effrosynidis D, Symeonidis S, Arampatzis A (2017) A comparison of pre-processing techniques. In: International conference on theory and practice of digital libraries TPDL: research and advanced technology for digital libraries, pp 394–406
26. Salton G, Wong A, Yang C-S (1975) A vector space model for automatic indexing. Commun ACM 18:613–620
27. Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Elsevier
28. Vieira S-M, Mendonca L-F, Farinha G-J, Sousa J-M-C (2013) Modified binary PSO for feature selection using SVM applied to mortality prediction of septic patients. Appl Soft Comput 13:3494–3504
29. Gen M, Cheng R (1997) Genetic algorithms and engineering design. Wiley
30. Vapnik V-N (1995) The nature of statistical learning theory. Springer, New York
31. Saif H, Fernandez M, He Y, Alani H (2013) Evaluation datasets for twitter sentiment analysis: a survey and a new dataset, the STS-Gold. In: 1st international workshop on emotion and sentiment in social and expressive media: approaches and perspectives from AI (ESSEM 2013), Turin, Italy, pp 9–21
32. Go A, Bhayani R, Huang L (2010) Twitter sentiment classification using distant supervision. Technical report, Stanford University
33. Shapiro SS, Wilk MB, Chen HJ (1968) A comparative study of various tests for normality. J Am Stat Assoc 63(324):1343–1372

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Mathematics and Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran
Fatemeh Zarisfi Kermani & Faramarz Sadeghi
Department of Pure Mathematics, Faculty of Mathematics and Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran
Esfandiar Eslami

Corresponding author

Correspondence to Fatemeh Zarisfi Kermani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

The Shapiro–Wilk test is a statistical test of normality, first published in 1965. It is an appropriate choice when the sample size is small; the ability to handle small samples (n < 20) is one of its advantages [33]. In this test, the null hypothesis is that the population is normally distributed. This hypothesis is rejected at significance level α when the test's p-value falls below α, which indicates that the tested data are not normally distributed. Table 9 indicates that the results reported in this research follow a normal distribution at the 0.05 significance level.

Table 9 The results of the Shapiro–Wilk test on all methods mentioned in this paper
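The decision rule described in this appendix can be sketched with SciPy's `scipy.stats.shapiro`, which returns the W statistic and a p-value. This is only an illustrative sketch: the helper name `looks_normal` is hypothetical, and the accuracy values below are made-up placeholders, not figures from Table 9.

```python
from scipy import stats


def looks_normal(sample, alpha=0.05):
    """Apply the Shapiro-Wilk test and report whether normality is retained.

    Returns (retained, statistic, p_value). The null hypothesis of
    normality is rejected when p_value < alpha.
    """
    statistic, p_value = stats.shapiro(sample)
    return bool(p_value >= alpha), float(statistic), float(p_value)


# Hypothetical accuracies (in percent) from repeated runs of one method.
# These numbers are placeholders for illustration only.
example_accuracies = [84.1, 85.3, 83.7, 86.0, 84.8, 85.5, 84.4, 85.1, 83.9, 85.8]

retained, w_statistic, p_value = looks_normal(example_accuracies, alpha=0.05)
print(f"W = {w_statistic:.4f}, p = {p_value:.4f}, normality retained: {retained}")
```

Note that a p-value at or above α means only that the test found no evidence against normality; it does not prove that the data are normally distributed.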

About this article

Cite this article

Zarisfi Kermani, F., Sadeghi, F. & Eslami, E. Solving the twitter sentiment analysis problem based on a machine learning-based approach. Evol. Intel. 13, 381–398 (2020). https://doi.org/10.1007/s12065-019-00301-x

Received: 09 January 2019
Revised: 29 July 2019
Accepted: 09 October 2019
Published: 22 October 2019
Issue Date: September 2020
DOI: https://doi.org/10.1007/s12065-019-00301-x
