DOI: 10.1145/2872427.2882985
research-article

Modeling a Retweet Network via an Adaptive Bayesian Approach

Published: 11 April 2016

ABSTRACT

Twitter and similar microblogging services have become a central nexus for discussion of the topics of the day. Twitter data contains rich content and structured information on users' topics of interest and behavior patterns. Correctly analyzing and modeling Twitter data enables the prediction of user behavior and preferences in a variety of practical applications, such as tweet recommendation and followee recommendation. Although a number of models have been developed on Twitter data in prior work, most of them model only users' tweets and neglect the valuable retweet information in the data. Incorporating users' retweet content and retweet behavior would enhance the predictive power of such models. In this paper, we propose two novel Bayesian nonparametric models, URM and UCM, for retweet data. Both integrate the analysis of tweet text and users' retweet behavior in the same probabilistic framework, and both jointly model users' interest in tweets and retweets. As nonparametric models, URM and UCM can automatically determine their parameters from the input data, avoiding arbitrary parameter settings. Extensive experiments on real-world Twitter data show that both URM and UCM are superior to all the baselines, and that UCM further outperforms URM, confirming the appropriateness of our models for retweet modeling.
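To illustrate what it means for a nonparametric model to determine its size from the data, the following minimal sketch simulates a Chinese restaurant process, a standard construction underlying Dirichlet process mixtures. The abstract does not say how URM or UCM are built, so this is not the authors' method; the function name `crp_assignments` and the concentration parameter `alpha` are assumptions made for illustration only.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Simulate a Chinese restaurant process (illustrative sketch only).

    Item i joins existing cluster k with probability counts[k] / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha), so the
    number of clusters is not fixed in advance.
    """
    rng = random.Random(seed)
    counts = []        # counts[k] = number of items currently in cluster k
    assignments = []   # assignments[i] = cluster index chosen for item i
    for _ in range(n_items):
        # Unnormalized weights: one per existing cluster, plus alpha for a new one.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    for n in (10, 100, 1000):
        _, counts = crp_assignments(n, alpha=1.0)
        print(n, "items ->", len(counts), "clusters")
```

In this sketch only `alpha` is set by hand; the number of clusters emerges from the sampled assignments and tends to grow slowly as more items arrive.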


    • Published in

      WWW '16: Proceedings of the 25th International Conference on World Wide Web
      April 2016
      1482 pages
      ISBN:9781450341431

      Copyright © 2016 held by the International World Wide Web Conference Committee (IW3C2)

      Publisher

      International World Wide Web Conferences Steering Committee

      Republic and Canton of Geneva, Switzerland

      Acceptance Rates

      WWW '16 Paper Acceptance Rate: 115 of 727 submissions, 16%
      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%