DOI: 10.1145/2185354.2185356
research-article

Credibility ranking of tweets during high impact events

Published: 17 April 2012

ABSTRACT

Twitter has evolved from a medium for conversation and opinion sharing among friends into a platform for sharing and disseminating information about current events. Events in the real world create a corresponding spurt of posts (tweets) on Twitter. Not all content posted on Twitter is trustworthy or useful in providing information about the event. In this paper, we analyzed the credibility of information in tweets corresponding to fourteen high-impact news events of 2011 around the globe. In the data we analyzed, on average 30% of the tweets posted about an event contained situational information about the event, while 14% were spam. Only 17% of the total tweets posted about an event contained situational awareness information that was credible. Using regression analysis, we identified the important content-based and source-based features that can predict the credibility of information in a tweet. Prominent content-based features were the number of unique characters, swear words, pronouns, and emoticons in a tweet; prominent user-based features were the number of followers and the length of the username. We adopted a supervised machine learning and relevance feedback approach using the above features to rank tweets according to their credibility score. The performance of our ranking algorithm improved significantly when we applied a re-ranking strategy. Results show that extraction of credible information from Twitter can be automated with high confidence.
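As a rough illustration of the features named in the abstract, the following minimal sketch computes them for a single tweet. It is not the authors' implementation: the function name `extract_features`, the small word lists, and the emoticon pattern are all assumptions made for this example, and the abstract does not specify which lexicons or tokenization the paper used.

```python
import re

# Tiny illustrative word lists. These are assumptions for the sketch,
# not the lexicons used in the paper.
SWEAR_WORDS = {"damn", "hell", "crap"}
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your",
            "he", "she", "they", "them", "it"}

# Matches a few common Western-style emoticons such as :) ;-( =D :P
# (a deliberately small, hypothetical pattern).
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


def extract_features(text, username, followers):
    """Return the content-based and user-based features for one tweet."""
    # Lowercased alphabetic tokens, used for the word-list lookups.
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        # Content-based features
        "unique_chars": len(set(text)),
        "swear_words": sum(t in SWEAR_WORDS for t in tokens),
        "pronouns": sum(t in PRONOUNS for t in tokens),
        "emoticons": len(EMOTICON_RE.findall(text)),
        # User-based features
        "followers": followers,
        "username_len": len(username),
    }


if __name__ == "__main__":
    print(extract_features("I saw it :)", "alice", 120))
```

A feature dictionary of this kind could then be fed to any supervised ranking model; the choice of model and the relevance feedback step are beyond what the abstract describes in detail.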

Published in

PSOSM '12: Proceedings of the 1st Workshop on Privacy and Security in Online Social Media
April 2012
43 pages
ISBN: 9781450312363
DOI: 10.1145/2185354

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

PSOSM '12 paper acceptance rate: 7 of 21 submissions, 33%. Overall acceptance rate: 7 of 21 submissions, 33%.