Cross-D-vectorizers: a set of feature-spaces for cross-domain sentiment analysis from consumer review

Multimedia Tools and Applications

Abstract

Supervised sentiment classification approaches require labeled training (source) and testing (target) datasets. Generating such datasets demands substantial time and effort, but cross-domain classification reduces that effort by drawing the source and target datasets from two different domains. In this paper, we propose Cross-D-Vectorizers, i.e., a set of three sentiment n-gram feature-spaces (Lexical-TFIDF, Lex-Delta-TFIDF and SEND) for cross-domain analysis. We construct the features by extracting sentiment unigrams in combination with intensifiers and negations from the source dataset. Using an existing lexicon, we compute the scores of these features in three ways: the sentiment value of each feature is multiplied by its TFIDF rating, its Delta-TFIDF rating, or its feature-importance-value (FIV), respectively. The importance-value of each SEND (Sentiment wEight of N-grams in Dataset) feature is calculated by multiplying the number of times the feature appears in the review by the logarithm of its inverse frequency in the corpus. We experiment with Maximum Entropy, Support Vector Machine and K-Nearest Neighbors classifiers on three benchmark datasets and one proposed dataset for cross-domain classification. The proposed approach shows improved results in comparison with existing methods. An advantage of our approach is that system complexity is reduced by treating only sentiment n-grams, rather than all n-grams, as domain-independent features.
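The SEND scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon entries and their values, the function names `extract_features` and `send_scores`, the whitespace tokenizer, and the reading of "inverse frequency" as the number of reviews divided by the number of reviews containing the feature (with a natural logarithm) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical sentiment lexicon: n-gram -> polarity value (illustrative numbers only).
LEXICON = {"good": 0.6, "very good": 0.9, "not good": -0.6, "bad": -0.7}


def extract_features(review, lexicon):
    """Count lexicon unigrams and bigrams in a whitespace-tokenized review.

    A bigram such as "not good" takes priority over the unigram "good",
    so a negated or intensified word is counted once, as a single feature.
    """
    tokens = review.lower().split()
    counts = Counter()
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if len(tokens) - i >= 2 and bigram in lexicon:
            counts[bigram] += 1
            i += 2  # consume the modifier and the sentiment word together
        elif tokens[i] in lexicon:
            counts[tokens[i]] += 1
            i += 1
        else:
            i += 1
    return counts


def send_scores(reviews, lexicon):
    """Return one {feature: score} dict per review.

    Assumed scoring: score = sentiment value * count in review * log(N / df),
    where N is the number of reviews and df the number of reviews
    that contain the feature.
    """
    per_review = [extract_features(r, lexicon) for r in reviews]
    n_reviews = len(reviews)
    doc_freq = Counter()
    for counts in per_review:
        doc_freq.update(counts.keys())
    return [
        {f: lexicon[f] * tf * math.log(n_reviews / doc_freq[f])
         for f, tf in counts.items()}
        for counts in per_review
    ]


reviews = ["very good phone", "not good battery", "good screen good price", "bad"]
for review, vector in zip(reviews, send_scores(reviews, LEXICON)):
    print(review, "->", vector)
```

Under this reading, a feature that occurs in every review receives a score of zero because log(1) = 0, which mirrors the usual behaviour of inverse-document-frequency weighting.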


Acknowledgments

We are grateful for access to the facilities of the “E-Business Centre of Excellence” Lab at the Indian Institute of Technology, Kharagpur. This work is supported by MHRD, Govt. of India, [Sanction Letter No.: F.No. 5-5/2014-TS.VII, Dt; 04-09-2014], Dept. of Higher Education, New Delhi, India.

Author information

Corresponding author

Correspondence to Atanu Dey.

Cite this article

Dey, A., Jenamani, M. & Thakkar, J.J. Cross-D-vectorizers: a set of feature-spaces for cross-domain sentiment analysis from consumer review. Multimed Tools Appl 78, 23141–23159 (2019). https://doi.org/10.1007/s11042-019-7553-0
