ABSTRACT
Owing to their popularity, weblogs (or blogs for short) offer a wealth of information that can be very helpful in assessing the general public's sentiments and opinions. In this paper, we study the problem of mining sentiment information from blogs and investigate ways to use such information for predicting product sales performance. Based on an analysis of the complex nature of sentiments, we propose Sentiment PLSA (S-PLSA), in which a blog entry is viewed as a document generated by a number of hidden sentiment factors. Training an S-PLSA model on the blog data enables us to obtain a succinct summary of the sentiment information embedded in the blogs. We then present ARSA, an autoregressive sentiment-aware model, which uses the sentiment information captured by S-PLSA to predict product sales performance. Extensive experiments were conducted on a movie data set. We compare ARSA with alternative models that do not take sentiment information into account, as well as with a model that uses a different feature selection method. The experiments confirm that the proposed approach is effective and outperforms these alternatives.
Index Terms
- ARSA: a sentiment-aware model for predicting sales performance using blogs