DOI: 10.1145/2736277.2741637
research-article

Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts

Published: 18 May 2015

ABSTRACT

Many previous techniques identify trending topics in social media, even topics that are not pre-defined. We present a technique to identify trending rumors, which we define as topics that include disputed factual claims. Putting aside any attempt to assess whether the rumors are true or false, it is valuable to identify trending rumors as early as possible. It is extremely difficult to accurately classify whether every individual post is or is not making a disputed factual claim. We are able to identify trending rumors by recasting the problem as finding entire clusters of posts whose topic is a disputed factual claim.

The key insight is that when there is a rumor, even though most posts do not raise questions about it, there may be a few that do. If we can find signature text phrases that are used by a few people to express skepticism about factual claims and are rarely used to express anything else, we can use those as detectors for rumor clusters. Indeed, we have found a few phrases that seem to be used exactly that way, including: "Is this true?", "Really?", and "What?". Relatively few posts related to any particular rumor use any of these enquiry phrases, but lots of rumor diffusion processes have some posts that do and have them quite early in the diffusion.
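As a rough illustration of the detector idea, the sketch below flags posts that contain one of the three enquiry phrases quoted above. The regular expressions, the name `is_enquiry_post`, and the case-insensitive matching are assumptions made for this example; the abstract does not give the full phrase set or the exact matching rules.

```python
import re

# Only the three phrases quoted in the abstract are covered here.
# The exact patterns (word boundaries, optional whitespace before "?")
# are illustrative assumptions, not the authors' published rules.
ENQUIRY_PATTERNS = [
    r"\bis\s+this\s+true\s*\?",
    r"\breally\s*\?",
    r"\bwhat\s*\?",
]
ENQUIRY_RE = re.compile("|".join(ENQUIRY_PATTERNS), re.IGNORECASE)


def is_enquiry_post(text: str) -> bool:
    """Return True if the post contains one of the enquiry phrases."""
    return ENQUIRY_RE.search(text) is not None
```

Requiring the question mark is what keeps an ordinary use such as "What a nice day" from matching, in line with the idea that the phrases should rarely express anything other than skepticism.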

We have developed a technique based on searching for the enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these simple phrases. We then rank the clusters by their likelihood of really containing a disputed factual claim. The detector, which searches for the very rare but very informative phrases, combined with clustering and a classifier on the clusters, yields surprisingly good performance. On a typical day of Twitter, about a third of the top 50 clusters were judged to be rumors, a high enough precision that human analysts might be willing to sift through them.
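The steps described above can be sketched roughly as follows. This is a minimal toy version under stated assumptions, not the authors' implementation: the similarity measure (Jaccard overlap of word sets), the threshold of 0.5, the single-link clustering, and all function names are hypothetical, and the final ordering by number of enquiry posts is a simple stand-in for the trained classifier that the abstract says ranks the clusters.

```python
import re
from itertools import combinations

# Assumed pattern covering only the phrases quoted in the abstract.
ENQUIRY_RE = re.compile(r"\b(is\s+this\s+true|really|what)\s*\?", re.IGNORECASE)


def tokens(text):
    """Lower-cased word set of a post (assumed representation)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_rumor_clusters(posts, sim_threshold=0.5):
    token_sets = [tokens(p) for p in posts]

    # Step 1: find the posts that contain an enquiry phrase.
    signal = [i for i, p in enumerate(posts) if ENQUIRY_RE.search(p)]

    # Step 2: cluster similar enquiry posts (single-link, via union-find).
    parent = {i: i for i in signal}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(signal, 2):
        if jaccard(token_sets[i], token_sets[j]) >= sim_threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in signal:
        clusters.setdefault(find(i), []).append(i)

    # Step 3: collect related posts that lack the enquiry phrases.
    signal_set = set(signal)
    results = []
    for members in clusters.values():
        related = [
            k for k in range(len(posts))
            if k not in signal_set
            and any(jaccard(token_sets[k], token_sets[m]) >= sim_threshold
                    for m in members)
        ]
        results.append({"signal": members, "related": related})

    # Step 4: rank clusters. Placeholder ordering only; the paper uses a
    # classifier over cluster features, which is not reproduced here.
    results.sort(key=lambda c: (len(c["signal"]), len(c["related"])), reverse=True)
    return results
```

The all-pairs comparison is quadratic in the number of enquiry posts, which is tolerable in a sketch only because those posts are rare; a production system would need a scalable near-duplicate technique instead.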


    Published in

      ACM Other conferences
      WWW '15: Proceedings of the 24th International Conference on World Wide Web
      May 2015
      1460 pages
      ISBN:9781450334693

      Copyright © 2015 Copyright is held by the International World Wide Web Conference Committee (IW3C2)

      Publisher

      International World Wide Web Conferences Steering Committee

      Republic and Canton of Geneva, Switzerland

      Acceptance Rates

      WWW '15 paper acceptance rate: 131 of 929 submissions (14%). Overall acceptance rate: 1,899 of 8,196 submissions (23%).
