Particle swarm optimization-based feature selection in sentiment classification

  • Focus
Soft Computing

Abstract

Sentiment classification is an important text mining task that classifies documents according to the opinion or sentiment they express. Documents in sentiment classification can be represented as feature vectors, which machine learning algorithms use to perform classification, and these feature vectors require a feature selection step. In this paper, we propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection. F-BPSO is a modification of BPSO that overcomes problems of traditional BPSO, including an unreasonable velocity update formula and the lack of evaluation of each individual feature. We then refine the original F-BPSO, notably by using the fitness sum instead of the average fitness in the fitness proportionate selection step; the resulting method is called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). We further modify FS-BPSO to make it better suited to feature selection for sentiment classification and name the result SCO-FS-BPSO, where SCO stands for “sentiment classification-oriented”. Experimental results show that, on benchmark datasets, the original F-BPSO outperforms traditional BPSO in feature selection performance and FS-BPSO outperforms the original F-BPSO. In the sentiment classification domain, SCO-FS-BPSO, which is modified specifically for sentiment classification, outperforms traditional feature selection methods on subjective consumer review datasets.
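
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of a standard BPSO particle update in the style of Kennedy and Eberhart (1997) and of generic fitness proportionate (roulette wheel) selection. It is not the F-BPSO, FS-BPSO, or SCO-FS-BPSO procedure, whose details the abstract does not give. The function names, the inertia weight `w`, the coefficients `c1` and `c2`, and the clamp `v_max` are illustrative assumptions.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(position, velocity, personal_best, global_best, rng,
              w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One standard BPSO update for a single particle.

    Each bit of `position` marks whether a feature is selected (1) or not (0).
    Parameter values are assumed defaults, not taken from the paper.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Velocity is pulled toward the particle's own best and the swarm's best.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))  # clamp to keep sigmoid(v) away from 0 and 1
        new_velocity.append(v)
        # The bit is resampled: it becomes 1 with probability sigmoid(v).
        new_position.append(1 if rng.random() < sigmoid(v) else 0)
    return new_position, new_velocity


def roulette_select(fitness_scores, rng):
    """Fitness proportionate selection: index i is drawn with probability f_i / sum(f)."""
    total = sum(fitness_scores)
    pick = rng.random() * total
    running = 0.0
    for i, f in enumerate(fitness_scores):
        running += f
        if pick < running:
            return i
    return len(fitness_scores) - 1  # guard against floating-point round-off
```

In a wrapper-style feature selection loop, the fitness of a particle would typically be the accuracy of a classifier trained on the features whose bits are 1; that evaluation step is omitted here.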

Notes

  1. http://archive.ics.uci.edu/ml/datasets/Madelon

  2. http://archive.ics.uci.edu/ml/datasets/Semeion+Handwritten+Digit

  3. http://www.searchforum.org.cn/tansongbo/index.htm

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC No. 61170180, NSFC No. 61403200).

Author information

Corresponding author

Correspondence to Zhe Zhou.

Ethics declarations

Conflict of interest

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Additional information

Communicated by B. Xue and A. G. Chen.

Cite this article

Shang, L., Zhou, Z. & Liu, X. Particle swarm optimization-based feature selection in sentiment classification. Soft Comput 20, 3821–3834 (2016). https://doi.org/10.1007/s00500-016-2093-2
