Abstract
Twitter has been at the forefront of political discourse, with politicians choosing it as their platform for disseminating information to their constituents. We seek to explore the effectiveness of social media as a resource for both polling and predicting the election outcome. To this end, we create a dataset consisting of approximately 3 million tweets ranging from September 22nd to November 8th, 2016. Polling analysis is performed on two levels: national and state. Election prediction is performed only at the state level because of the electoral college process in the U.S. election system. Two approaches are used for predicting the election: a winner-take-all approach and a shared elector count approach. Twenty-one states are chosen, eleven categorized as swing states and ten as heavily favored states. Two metrics are used for polling and predicting the election outcome: tweet volume per candidate and positive sentiment per candidate. Our results show that, when polling at the national level, sentiment aggregated across the election period yields values close to the polls. At the state level, tweet volume is not a good indicator for polling state votes, whereas sentiment produces values closer to the polls in swing states, where the election is close.
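To make the two metrics and the two prediction approaches concrete, the following is a minimal sketch under stated assumptions rather than the authors' implementation. The abstract does not give the exact formulas, so the sketch assumes that each metric is normalized to a per-candidate share within a state and that the shared elector count approach splits a state's electors in proportion to those shares. The function names, the tweet record layout, and the toy data (candidates "A" and "B", a state with 10 electors) are hypothetical.

```python
from collections import Counter


def _normalize(counts):
    """Turn raw per-candidate counts into shares that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cand: n / total for cand, n in counts.items()}


def volume_share(tweets):
    """Tweet-volume metric: each candidate's share of all tweets."""
    return _normalize(Counter(t["candidate"] for t in tweets))


def positive_sentiment_share(tweets):
    """Positive-sentiment metric: each candidate's share of positive tweets."""
    positive = (t["candidate"] for t in tweets if t["sentiment"] == "positive")
    return _normalize(Counter(positive))


def winner_take_all(shares, electors):
    """Give every elector of the state to the candidate with the top share."""
    winner = max(shares, key=shares.get)
    return {cand: (electors if cand == winner else 0) for cand in shares}


def shared_electors(shares, electors):
    """Assumed proportional split of a state's electors (may be fractional)."""
    return {cand: electors * share for cand, share in shares.items()}


# Hypothetical toy state: 6 tweets about A (2 positive), 4 about B (3 positive).
toy_tweets = (
    [{"candidate": "A", "sentiment": "positive"}] * 2
    + [{"candidate": "A", "sentiment": "negative"}] * 4
    + [{"candidate": "B", "sentiment": "positive"}] * 3
    + [{"candidate": "B", "sentiment": "negative"}] * 1
)

by_volume = winner_take_all(volume_share(toy_tweets), electors=10)
by_sentiment = shared_electors(positive_sentiment_share(toy_tweets), electors=10)
```

In this toy state the two metrics disagree: candidate A leads on volume while candidate B leads on positive sentiment, which is the kind of divergence that makes the choice of metric matter at the state level.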
Acknowledgements
The authors would like to thank the anonymous reviewers and the Editor for the constructive evaluation of this paper and also the various members of the Data Mining and Machine Learning Laboratory, Florida Atlantic University, for assistance with the reviews. We also acknowledge partial support by the NSF (CNS-1427536). Opinions, findings, conclusions, or recommendations in this paper are those of the authors and do not reflect the views of the NSF.
Cite this article
Heredia, B., Prusa, J.D. & Khoshgoftaar, T.M. Social media for polling and predicting United States election outcome. Soc. Netw. Anal. Min. 8, 48 (2018). https://doi.org/10.1007/s13278-018-0525-y
DOI: https://doi.org/10.1007/s13278-018-0525-y