An application of MOGW optimization for feature selection in text classification

The Journal of Supercomputing

Abstract

With the spread of web applications, sentiment classification (SC) has become a topic of interest among text mining experts. The sheer volume of online reviews makes it difficult to apply effective models in companies and in individual decision making. Pre-processing contributes greatly to sentiment classification. Traditional bag-of-words approaches do not capture the multiple relationships among words. This study emphasizes the pre-processing stage and data reduction techniques, which can make a substantial difference in sentiment classification efficiency. To classify opinions, a multi-objective grey wolf optimization algorithm is proposed whose two objectives are to decrease the errors of the Naïve Bayes and K-nearest neighbour classifiers, with a neural network serving as the final classifier. The proposed framework is evaluated on three datasets. It obtains 95.76% precision, 95.75% accuracy, 95.99% recall, and 95.82% f-measure, outperforming its counterparts.
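The two-objective formulation in the abstract can be illustrated with a Pareto dominance check over candidate feature subsets. The sketch below is only an illustration under stated assumptions: the feature masks and the error values are hypothetical, the names `dominates` and `pareto_front` are invented here, and the abstract does not describe how the article itself compares or archives candidates.

```python
from typing import List, Tuple

# A candidate is (feature_mask, nb_error, knn_error).
# The masks and error values below are hypothetical, for illustration only.
Candidate = Tuple[Tuple[int, ...], float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both errors and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates = [
    ((1, 0, 1, 1), 0.10, 0.14),  # lowest Naive Bayes error
    ((1, 1, 0, 1), 0.12, 0.09),  # lowest KNN error
    ((0, 1, 1, 0), 0.15, 0.16),  # worse on both objectives
    ((1, 1, 1, 1), 0.10, 0.15),  # ties on NB error, worse on KNN error
]

front = pareto_front(candidates)
for mask, nb_error, knn_error in front:
    print(mask, nb_error, knn_error)
```

With two errors to minimize, there is generally no single best feature subset. The non-dominated set keeps every subset that represents a distinct trade-off between the two classifiers, and a final choice can then be made from that set.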

Acknowledgements

The authors thank Islamic Azad University, Isfahan Branch, for supporting this study under Grant #23842006951003.

Author information

Corresponding author

Correspondence to S. Amirhassan Monadjemi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Asgarnezhad, R., Monadjemi, S.A. & Soltanaghaei, M. An application of MOGW optimization for feature selection in text classification. J Supercomput 77, 5806–5839 (2021). https://doi.org/10.1007/s11227-020-03490-w

  • DOI: https://doi.org/10.1007/s11227-020-03490-w
