Modeling a Retweet Network via an Adaptive Bayesian Approach

Authors:
Bin Bi, Microsoft Research, Redmond, WA, USA
Junghoo Cho, University of California, Los Angeles, Los Angeles, CA, USA

WWW '16: Proceedings of the 25th International Conference on World Wide Web, April 2016, Pages 459–469. https://doi.org/10.1145/2872427.2882985

Published: 11 April 2016

ABSTRACT

Twitter and similar microblogging services have become a central nexus for discussion of the topics of the day. Twitter data contains rich content and structured information on users' topics of interest and behavior patterns. Correctly analyzing and modeling Twitter data enables prediction of user behavior and preferences in a variety of practical applications, such as tweet recommendation and followee recommendation. Although a number of models have been developed on Twitter data in prior work, most of them model only users' tweets and neglect the valuable retweet information in the data. Models can enhance their predictive power by incorporating users' retweet content as well as their retweet behavior. In this paper, we propose two novel Bayesian nonparametric models, URM and UCM, on retweet data. Both integrate the analysis of tweet text and users' retweet behavior in the same probabilistic framework, and both jointly model users' interests in tweets and retweets. As nonparametric models, URM and UCM can automatically determine the parameters of the models from the input data, avoiding arbitrary parameter settings. Extensive experiments on real-world Twitter data show that both URM and UCM are superior to all the baselines and that UCM further outperforms URM, confirming the appropriateness of our models for retweet modeling.
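
To illustrate what it can mean for a Bayesian nonparametric model to determine its own complexity from data, the following is a minimal sketch of the Chinese restaurant process, a standard construction related to the Dirichlet process in which the number of clusters is not fixed in advance. It is not the URM or UCM model described in the abstract; the function name `sample_crp_table_sizes`, the concentration value `alpha`, and the seed are illustrative assumptions.

```python
import random


def sample_crp_table_sizes(num_customers, alpha, seed=0):
    """Draw one seating arrangement from a Chinese restaurant process.

    Returns a list whose k-th entry is the number of customers at table k.
    The number of tables (clusters) is an outcome of sampling, not an input.
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    table_sizes = []
    for n in range(num_customers):
        # With n customers already seated, the next customer joins table k
        # with probability table_sizes[k] / (n + alpha) and opens a new
        # table with probability alpha / (n + alpha).
        threshold = rng.uniform(0.0, n + alpha)
        cumulative = 0.0
        for k, size in enumerate(table_sizes):
            cumulative += size
            if threshold < cumulative:
                table_sizes[k] += 1
                break
        else:
            # No existing table was chosen, so open a new one.
            table_sizes.append(1)
    return table_sizes


if __name__ == "__main__":
    sizes = sample_crp_table_sizes(num_customers=200, alpha=1.0)
    print("number of tables:", len(sizes))
    print("table sizes:", sizes)
```

In this sketch a larger `alpha` tends to produce more tables, which is the sense in which a single concentration value replaces a hand-picked cluster or topic count.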

Index Terms

1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing systems and tools
      1. Social networking sites

Published in
WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016
1482 pages
ISBN: 9781450341431
General Chairs:
Jacqueline Bourdeau, Tele-university (TELUQ), Montreal, QC, Canada
Jim A. Hendler, Rensselaer Polytechnic Institute, Troy, NY, USA
Roger Nkambou, Université du Québec à Montréal, Montreal, QC, Canada
Program Chairs:
Ian Horrocks, University of Oxford, UK
Ben Y. Zhao, University of California at Santa Barbara, CA, USA
Copyright © 2016 Copyright is held by the International World Wide Web Conference Committee (IW3C2)
Publisher
International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland
Author Tags
bayesian nonparametric
retweet
topic modeling
twitter modeling
Qualifiers
- research-article
Acceptance Rates
WWW '16 Paper Acceptance Rate: 115 of 727 submissions, 16%. Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%.