Credibility ranking of tweets during high impact events

Authors:
Aditi Gupta, Indraprastha Institute of Information Technology, Delhi, India
Ponnurangam Kumaraguru, Indraprastha Institute of Information Technology, Delhi, India

PSOSM '12: Proceedings of the 1st Workshop on Privacy and Security in Online Social Media, April 2012, Article No.: 2, Pages 2–8
https://doi.org/10.1145/2185354.2185356

Published: 17 April 2012

ABSTRACT

Twitter has evolved from a medium for conversation and opinion sharing among friends into a platform for sharing and disseminating information about current events. Events in the real world create a corresponding spurt of posts (tweets) on Twitter. Not all content posted on Twitter is trustworthy or useful in providing information about the event. In this paper, we analyzed the credibility of information in tweets corresponding to fourteen high impact news events of 2011 around the globe. In the data we analyzed, on average 30% of the total tweets posted about an event contained situational information about the event, while 14% were spam. Only 17% of the total tweets posted about an event contained situational awareness information that was credible. Using regression analysis, we identified the important content-based and source-based features that can predict the credibility of information in a tweet. Prominent content-based features were the number of unique characters, swear words, pronouns, and emoticons in a tweet; prominent user-based features were the number of followers and the length of the username. We adopted a supervised machine learning and relevance feedback approach using the above features to rank tweets according to their credibility score. The performance of our ranking algorithm improved significantly when we applied a re-ranking strategy. Results show that extraction of credible information from Twitter can be automated with high confidence.
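To make the feature list in the abstract concrete, the following is a minimal sketch of how such content-based and user-based features might be computed and combined into a ranking. It is an illustration under stated assumptions, not the authors' implementation: the word lists, the emoticon pattern, the function names `extract_features` and `rank_tweets`, and the fixed linear weights are all hypothetical. The paper learns its ranking with supervised machine learning and relevance feedback, whereas this sketch simply applies caller-supplied weights.

```python
import re

# Hypothetical lexicons: the abstract names the feature types, not the word lists.
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your",
            "he", "she", "him", "her", "they", "them", "it"}
SWEAR_WORDS = {"damn", "hell"}  # placeholder list for illustration only

# Assumed pattern for simple Western emoticons such as :) ;-( =D
EMOTICON_RE = re.compile(r"[:;=][\-']?[)(DPp]")


def extract_features(text, username, followers):
    """Return the content-based and user-based features named in the abstract."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "unique_chars": len(set(text)),
        "swear_words": sum(t in SWEAR_WORDS for t in tokens),
        "pronouns": sum(t in PRONOUNS for t in tokens),
        "emoticons": len(EMOTICON_RE.findall(text)),
        "followers": followers,
        "username_length": len(username),
    }


def rank_tweets(tweets, weights):
    """Sort tweets by a linear score; the weights are assumptions, not learned values."""
    def score(tweet):
        feats = extract_features(tweet["text"], tweet["username"], tweet["followers"])
        return sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return sorted(tweets, key=score, reverse=True)
```

In practice the weights would come from a trained ranking model rather than being chosen by hand; the linear form here only shows how the extracted features could feed a credibility score.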

Index Terms

Credibility ranking of tweets during high impact events
1. Information systems
  1. Information retrieval
2. Social and professional topics
  1. Computing / technology policy

Published in
PSOSM '12: Proceedings of the 1st Workshop on Privacy and Security in Online Social Media
April 2012
43 pages
ISBN: 9781450312363
DOI: 10.1145/2185354
Conference Chairs: Ponnurangam Kumaraguru, Virgilio Almeida
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
credibility
high impact events
online social media
Qualifiers
- research-article
Acceptance Rates
PSOSM '12 Paper Acceptance Rate: 7 of 21 submissions, 33%
Overall Acceptance Rate: 7 of 21 submissions, 33%
Article Metrics
- Total Citations: 198
- Total Downloads: 2,076
- Downloads (Last 12 months): 63
- Downloads (Last 6 weeks): 17