Published May 9, 2019 | Version v1
Dataset Open

Cashtag Piggybacking dataset - Twitter dataset enriched with financial data

Description

This dataset is composed of:

  • Twitter dataset of ~9M tweets mentioning stocks (cashtags) traded on the most important US markets, shared between May and September 2017 (user data enriched with a bot classification label)
  • Financial information about ~30k companies found in those tweets, retrieved from Google Finance

Refer to the paper below for more details.

Cresci, S., Lillo, F., Regoli, D., Tardelli, S., & Tesconi, M. (2019). Cashtag Piggybacking: Uncovering Spam and Bot Activity in Stock Microblogs on Twitter. ACM Transactions on the Web (TWEB), 13(2), 11.

Files

companies.csv.zip

Files (964.6 MB)

Checksum (MD5)                          Size
md5:2a62b895e28e4420315990f6b84635ad    468.1 kB
md5:a8b09f5e82e4db4cbcb124847dd0bbfc    321.1 MB
md5:c2a9ea24f914f28e44c455d802038f20    1.4 kB
md5:8c85dff82999f6a7ffd7fd6dea3e836b    588.5 MB
md5:91949d7811b8bab33b53df7572d10114    54.6 MB
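
The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python, using only the standard library; the function names `md5_of_file` and `matches_listing` are illustrative and not part of the dataset. This listing does not say which checksum belongs to which file, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's digest with a listed value such as 'md5:2a62...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("downloads/some_file.zip", "md5:2a62b895e28e4420315990f6b84635ad")` returns `True` only if the local file hashes to that value; the path here is a placeholder for wherever the file was saved.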

Additional details

Related works

Is documented by
10.1145/3313184 (DOI)