Social media for polling and predicting United States election outcome

  • Original Article
Social Network Analysis and Mining

Abstract

Twitter has been at the forefront of political discourse, with politicians choosing it as their platform for disseminating information to their constituents. We explore the effectiveness of social media as a resource for both polling and predicting the election outcome. To this end, we create a dataset of approximately 3 million tweets posted between September 22nd and November 8th, 2016. Polling analysis is performed at two levels: national and state. Election prediction is performed only at the state level because of the electoral college process in the U.S. election system. Two approaches are used for predicting the election: a winner-take-all approach and a shared elector count approach. Twenty-one states are chosen, eleven categorized as swing states and ten as heavily favored states. Two metrics are used for polling and predicting the election outcome: tweet volume per candidate and positive sentiment per candidate. Our results show that, when polling at the national level, sentiment aggregated across the election period yields values close to the polls. At the state level, tweet volume is not a good indicator for polling state votes. Sentiment produces values closer to swing state polls when the election is close.
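
As a minimal sketch of the two prediction approaches named in the abstract, the Python snippet below converts per-candidate counts for one state (total tweets for the volume metric, or positive tweets for the sentiment metric) into shares, then allocates that state's electors either winner-take-all or proportionally. The function names, the example counts, the elector total, and the largest-remainder rounding used for the shared allocation are illustrative assumptions; the abstract does not specify how the article computes either quantity.

```python
from typing import Dict


def shares(counts: Dict[str, float]) -> Dict[str, float]:
    """Turn per-candidate counts (tweets or positive tweets) into fractions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tweets counted for this state")
    return {cand: n / total for cand, n in counts.items()}


def winner_take_all(share: Dict[str, float], electors: int) -> Dict[str, int]:
    """Give every elector to the candidate with the largest share.

    Ties go to whichever candidate appears first in the dict.
    """
    winner = max(share, key=share.get)
    return {cand: (electors if cand == winner else 0) for cand in share}


def shared_electors(share: Dict[str, float], electors: int) -> Dict[str, int]:
    """Split electors in proportion to share.

    Assumption: leftovers from rounding down are handed out by the
    largest-remainder rule so the allocation sums to `electors`.
    """
    raw = {cand: s * electors for cand, s in share.items()}
    alloc = {cand: int(v) for cand, v in raw.items()}
    leftover = electors - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda c: raw[c] - alloc[c], reverse=True)
    for cand in by_remainder[:leftover]:
        alloc[cand] += 1
    return alloc


# Made-up counts for one hypothetical state with 7 electors.
example_share = shares({"Candidate A": 550, "Candidate B": 450})
print(winner_take_all(example_share, 7))   # all 7 electors to Candidate A
print(shared_electors(example_share, 7))   # 4 to Candidate A, 3 to Candidate B
```

Summing either allocation over all states gives a predicted elector total per candidate. The same functions apply to both metrics; only the input counts change.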

Acknowledgements

The authors would like to thank the anonymous reviewers and the Editor for their constructive evaluation of this paper, and the various members of the Data Mining and Machine Learning Laboratory, Florida Atlantic University, for assistance with the reviews. We also acknowledge partial support by the NSF (CNS-1427536). Opinions, findings, conclusions, or recommendations in this paper are those of the authors and do not reflect the views of the NSF.

Author information

Corresponding author

Correspondence to Brian Heredia.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 122 KB)

About this article

Cite this article

Heredia, B., Prusa, J.D. & Khoshgoftaar, T.M. Social media for polling and predicting United States election outcome. Soc. Netw. Anal. Min. 8, 48 (2018). https://doi.org/10.1007/s13278-018-0525-y

  • DOI: https://doi.org/10.1007/s13278-018-0525-y
