Identification of Political Hate Speech Using Machine Learning-Based Text Toxicity Analysis

  • Conference paper
ICT Systems and Sustainability

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 516))

Abstract

Civility in political campaigns has taken a heavy blow in recent times, as increasingly toxic speeches seek to polarize people for electoral gain. The resulting proliferation of cyber-hate speech is creating a deep divide in society and threatening its integrity and harmony. The term “hate speech” refers to the use of words or phrases that are threatening, derogatory, or insulting to a specific individual or group. The number of social media users in India is increasing rapidly, and with it the frequency with which cyber-hate speech targets specific segments of society or individuals based on their caste, color, or creed. An online environment free of hostility and bigotry has remained a key focus of academic attention. In the present study, contemporary political data from the Twitter platform is used to identify hate speech with machine learning methods, including text-based natural language processing (NLP). To this end, several political tweets across different ideologies have been taken into consideration. There are many different ways to collect and organize emotions and personality traits. The processed dataset has been analyzed using an ensemble of popular machine learning algorithms, and the results indicate the comparative performance of the methods.

Author information

Corresponding author

Correspondence to Priya.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Priya, Gupta, S. (2023). Identification of Political Hate Speech Using Machine Learning-Based Text Toxicity Analysis. In: Tuba, M., Akashe, S., Joshi, A. (eds) ICT Systems and Sustainability. Lecture Notes in Networks and Systems, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-19-5221-0_22
