DOI: 10.1145/2505515.2505667
research-article

RAProp: ranking tweets by exploiting the tweet/user/web ecosystem and inter-tweet agreement

Published: 27 October 2013

ABSTRACT

The increasing popularity of Twitter makes better assessment of the trustworthiness and relevance of tweets much more important for search. However, given the limit on the size of tweets, it is hard to extract ranking measures from the tweets' content alone. We present a novel ranking method called RAProp, which combines two orthogonal measures of the relevance and trustworthiness of a tweet. The first, called Feature Score, measures the trustworthiness of the source of the tweet by extracting features from a 3-layer Twitter ecosystem consisting of users, tweets and webpages. The second, called agreement analysis, estimates the trustworthiness of the content of a tweet by analyzing whether that content is independently corroborated by other tweets. We view the candidate result set of tweets as the vertices of a graph, with the edges measuring the estimated agreement between each pair of tweets. The feature score is propagated over this agreement graph to compute the top-k tweets that have both trustworthy sources and independent corroboration. An evaluation of our method on 16 million tweets from the TREC 2011 Microblog Dataset shows that, for top-30 precision, we achieve 53% better precision than the current best-performing method on the data set, and an improvement of 300% over current Twitter Search.
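The propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how agreement is computed or how scores are combined, so the token-level Jaccard similarity, the `threshold` cutoff, the single additive propagation step, and all function names below are assumptions chosen only for illustration. Feature scores are taken as given inputs.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the agreement measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def propagate(tweets, feature_scores, threshold=0.2):
    """One propagation step over a pairwise agreement graph.

    tweets: list of tweet texts (the candidate result set).
    feature_scores: per-tweet source-trustworthiness scores, same order.
    threshold: hypothetical cutoff below which no edge is created.
    """
    ranked = list(feature_scores)  # each tweet starts with its own score
    for i, j in combinations(range(len(tweets)), 2):
        w = jaccard(tweets[i], tweets[j])
        if w >= threshold:
            # Agreement is symmetric here, so score flows in both directions.
            ranked[i] += w * feature_scores[j]
            ranked[j] += w * feature_scores[i]
    return ranked


def top_k(tweets, feature_scores, k=30, threshold=0.2):
    """Return the k highest-scoring (tweet, score) pairs after propagation."""
    scores = propagate(tweets, feature_scores, threshold)
    order = sorted(range(len(tweets)), key=lambda i: scores[i], reverse=True)
    return [(tweets[i], scores[i]) for i in order[:k]]


if __name__ == "__main__":
    tweets = [
        "quake hits city center",
        "quake hits city today",
        "buy cheap followers now",
    ]
    features = [0.6, 0.5, 0.8]
    for text, score in top_k(tweets, features, k=3):
        print(f"{score:.2f}  {text}")
```

In this toy input the third tweet has the highest feature score but no corroborating neighbor, so the two mutually agreeing tweets overtake it after propagation. That is the intended effect of combining source trustworthiness with independent corroboration, under the assumptions stated above.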


      • Published in

        CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
        October 2013
        2612 pages
        ISBN: 9781450322638
        DOI: 10.1145/2505515

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
        Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
