research-article

Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts

Authors:
Zhe Zhao, University of Michigan, Ann Arbor, Ann Arbor, MI, USA
Paul Resnick, University of Michigan, Ann Arbor, Ann Arbor, MI, USA
Qiaozhu Mei, University of Michigan, Ann Arbor, Ann Arbor, MI, USA

WWW '15: Proceedings of the 24th International Conference on World Wide Web, May 2015, Pages 1395–1405. https://doi.org/10.1145/2736277.2741637

Published: 18 May 2015

ABSTRACT

Many previous techniques identify trending topics in social media, even topics that are not pre-defined. We present a technique to identify trending rumors, which we define as topics that include disputed factual claims. Putting aside any attempt to assess whether the rumors are true or false, it is valuable to identify trending rumors as early as possible. It is extremely difficult to accurately classify whether every individual post is or is not making a disputed factual claim. We are able to identify trending rumors by recasting the problem as finding entire clusters of posts whose topic is a disputed factual claim.

The key insight is that when there is a rumor, even though most posts do not raise questions about it, there may be a few that do. If we can find signature text phrases that are used by a few people to express skepticism about factual claims and are rarely used to express anything else, we can use those as detectors for rumor clusters. Indeed, we have found a few phrases that seem to be used exactly that way, including: "Is this true?", "Really?", and "What?". Relatively few posts related to any particular rumor use any of these enquiry phrases, but lots of rumor diffusion processes have some posts that do and have them quite early in the diffusion.
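The detector idea in this paragraph can be illustrated with a minimal sketch. It assumes only the three phrases quoted above; the paper's full pattern list is not given in this abstract, so `ENQUIRY_PATTERNS`, `is_enquiry`, and the sample posts are hypothetical names and data chosen for illustration.

```python
import re

# Hypothetical pattern list: only the three phrases quoted in the abstract.
ENQUIRY_PATTERNS = [
    r"\bis\s+this\s+true\b",  # "Is this true?"
    r"\breally\s*\?",         # "Really?"
    r"\bwhat\s*\?",           # "What?"
]
ENQUIRY_RE = re.compile("|".join(ENQUIRY_PATTERNS), re.IGNORECASE)


def is_enquiry(post: str) -> bool:
    """Return True if the post contains one of the enquiry phrases."""
    return ENQUIRY_RE.search(post) is not None


# Made-up posts for illustration only.
posts = [
    "Explosion reported downtown, stay safe everyone",
    "Is this true? Someone said there was an explosion downtown",
    "Really? An explosion downtown?",
    "What a great game last night",
]
signals = [p for p in posts if is_enquiry(p)]
```

Requiring the question mark after "really" and "what" keeps ordinary uses such as "What a great game" from matching, which reflects the abstract's point that the phrases should rarely be used to express anything else.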

We have developed a technique based on searching for the enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these simple phrases. We then rank the clusters by their likelihood of really containing a disputed factual claim. The detector, which searches for the very rare but very informative phrases, combined with clustering and a classifier on the clusters, yields surprisingly good performance. On a typical day of Twitter, about a third of the top 50 clusters were judged to be rumors, a high enough precision that human analysts might be willing to sift through them.
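The steps described above (search, cluster, collect related posts, rank) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: Jaccard similarity over word sets, the greedy single-pass clustering, the `threshold` value, and ranking by cluster size are all placeholders. In particular, the paper ranks clusters with a trained classifier, which is not reproduced here.

```python
import re

# Assumed enquiry patterns: only the phrases quoted in the abstract.
ENQUIRY_RE = re.compile(
    r"\bis\s+this\s+true\b|\breally\s*\?|\bwhat\s*\?", re.IGNORECASE
)


def tokens(post):
    """Lowercased word set of a post."""
    return set(re.findall(r"[a-z0-9']+", post.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest(clusters, toks):
    """Most similar cluster by vocabulary overlap, or None if no clusters."""
    return max(clusters, key=lambda c: jaccard(toks, c["vocab"]), default=None)


def detect_rumor_clusters(posts, threshold=0.3):
    # Step 1: search for posts containing enquiry phrases.
    signal = [p for p in posts if ENQUIRY_RE.search(p)]
    others = [p for p in posts if not ENQUIRY_RE.search(p)]

    # Step 2: greedily cluster similar enquiry posts together.
    clusters = []
    for p in signal:
        toks = tokens(p)
        best = closest(clusters, toks)
        if best is not None and jaccard(toks, best["vocab"]) >= threshold:
            best["signal"].append(p)
            best["vocab"] |= toks
        else:
            clusters.append({"signal": [p], "related": [], "vocab": set(toks)})

    # Step 3: collect related posts that lack the enquiry phrases.
    for p in others:
        toks = tokens(p)
        best = closest(clusters, toks)
        if best is not None and jaccard(toks, best["vocab"]) >= threshold:
            best["related"].append(p)

    # Step 4: rank clusters. Placeholder score (cluster size), standing in
    # for the cluster-level classifier described in the abstract.
    return sorted(
        clusters,
        key=lambda c: len(c["signal"]) + len(c["related"]),
        reverse=True,
    )


# Made-up posts for illustration only.
posts = [
    "Is this true? The downtown bridge collapsed",
    "Really? The downtown bridge collapsed this morning",
    "Breaking: the downtown bridge collapsed",
    "What? Is the new phone delayed again",
    "Lovely weather for a picnic",
]
ranked = detect_rumor_clusters(posts)
```

In this toy run the two bridge enquiries form one cluster, the non-enquiry "Breaking" post is pulled into it in step 3, and the unrelated post is left out of every cluster.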

Index Terms

1. Information systems
  1. Information retrieval

Published in
WWW '15: Proceedings of the 24th International Conference on World Wide Web
May 2015
1460 pages
ISBN: 9781450334693
General Chairs: Aldo Gangemi (National Research Council, Italy & Paris 13 University-CNRS, France), Stefano Leonardi (Sapienza University of Rome, Italy), Alessandro Panconesi (Sapienza University of Rome, Italy)
Copyright © 2015 Copyright is held by the International World Wide Web Conference Committee (IW3C2)
Publisher: International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland
Author Tags
enquiry tweets
rumor detection
social media
Qualifiers
- research-article
Acceptance Rates
WWW '15 Paper Acceptance Rate: 131 of 929 submissions, 14%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%