DOI: 10.1145/1277741.1277845
Article

ARSA: a sentiment-aware model for predicting sales performance using blogs

Published: 23 July 2007

ABSTRACT

Due to their high popularity, Weblogs (or blogs for short) present a wealth of information that can be very helpful in assessing the general public's sentiments and opinions. In this paper, we study the problem of mining sentiment information from blogs and investigate ways to use such information for predicting product sales performance. Based on an analysis of the complex nature of sentiments, we propose Sentiment PLSA (S-PLSA), in which a blog entry is viewed as a document generated by a number of hidden sentiment factors. Training an S-PLSA model on the blog data enables us to obtain a succinct summary of the sentiment information embedded in the blogs. We then present ARSA, an autoregressive sentiment-aware model, which uses the sentiment information captured by S-PLSA to predict product sales performance. Extensive experiments were conducted on a movie data set. We compare ARSA with alternative models that do not take sentiment information into account, as well as with a model that uses a different feature selection method. The experiments confirm the effectiveness and superiority of the proposed approach.
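The abstract does not give the ARSA equations, so the following is only a minimal sketch of the general idea of an autoregressive model augmented with sentiment features. It is not the authors' formulation. It assumes that each period's sales depend linearly on the previous p sales values and the previous q per-period sentiment vectors, and it fits the coefficients by ordinary least squares. The function names, the lag orders `p` and `q`, and the shape of the `sentiment` input are all hypothetical.

```python
def lag_features(sales, sentiment, t, p, q):
    """Features for period t: p lagged sales values, then q lagged sentiment vectors."""
    row = [sales[t - i] for i in range(1, p + 1)]
    for j in range(1, q + 1):
        row.extend(sentiment[t - j])
    return row


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ar_with_sentiment(sales, sentiment, p=2, q=1):
    """Least-squares fit via the normal equations (X^T X) beta = X^T y."""
    start = max(p, q)
    X = [lag_features(sales, sentiment, t, p, q) for t in range(start, len(sales))]
    y = [sales[t] for t in range(start, len(sales))]
    d = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(d)] for a in range(d)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(d)]
    return solve_linear(XtX, Xty)


def predict_next(sales, sentiment, coef, p=2, q=1):
    """Predict sales for the period immediately after the observed series."""
    row = lag_features(sales, sentiment, len(sales), p, q)
    return sum(c * v for c, v in zip(coef, row))
```

Here `sales` would be a list of per-period sales figures and `sentiment` a list of equal length holding one feature vector per period, for example an aggregate of the hidden-factor weights of that period's blog entries. How those vectors are actually derived from S-PLSA is described in the paper itself, not in this sketch.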


    • Published in

      SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
      July 2007
      946 pages
      ISBN:9781595935977
      DOI:10.1145/1277741

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
