Identification of Political Hate Speech Using Machine Learning-Based Text Toxicity Analysis

Priya; Gupta, Sachin

doi:10.1007/978-981-19-5221-0_22

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 516)

Abstract

Civility in political campaigns has suffered badly in recent times, as increasingly toxic speeches aim to polarize people for electoral gain. The resulting proliferation of cyber-hate speech is deepening divisions in society and threatening its integrity and harmony. The term “hate speech” refers to the use of words or phrases that are threatening, derogatory, or insulting to a specific individual or group. The number of social media users in India is growing rapidly, and with it the frequency with which cyber-hate speech targets individuals or segments of society on the basis of caste, color, or creed. Keeping the online environment free of hostility and bigotry has therefore remained a key focus of academic attention. The present study uses contemporary political data from the Twitter platform to identify hate speech with machine learning methods, including text-based natural language processing (NLP). To this end, political tweets from across different ideologies were considered. Emotions and personality traits can be collected and organized in many different ways. The processed dataset was analyzed with an ensemble of popular machine learning algorithms, and the results indicate the comparative performance of the methods.
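The abstract does not say which algorithms, features, or data the chapter uses, so the following is only a minimal sketch of what comparing text classifiers on labeled short texts can look like. Everything in it is an assumption made for illustration: the toy sentences, the labels `toxic` and `ok`, and the two models (a bag-of-words multinomial Naive Bayes and a hypothetical keyword baseline) are not taken from the chapter.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            n_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(self.class_counts[label] / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class KeywordBaseline:
    """Hypothetical baseline: flag texts containing words seen only in toxic examples."""

    def fit(self, texts, labels):
        toxic_words, ok_words = set(), set()
        for text, label in zip(texts, labels):
            (toxic_words if label == "toxic" else ok_words).update(tokenize(text))
        self.keywords = toxic_words - ok_words
        return self

    def predict(self, text):
        return "toxic" if self.keywords & set(tokenize(text)) else "ok"


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the given label."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Toy, made-up examples standing in for labeled political tweets.
train_texts = [
    "those voters are stupid and worthless",
    "that party is full of disgusting liars",
    "shut up you worthless idiot",
    "thanks to all volunteers at the rally",
    "the party released its manifesto today",
    "voters are queuing early this morning",
]
train_labels = ["toxic", "toxic", "toxic", "ok", "ok", "ok"]
test_texts = ["you are a disgusting idiot", "the rally starts this morning"]
test_labels = ["toxic", "ok"]

models = {"naive_bayes": NaiveBayes(), "keyword_baseline": KeywordBaseline()}
results = {
    name: accuracy(model.fit(train_texts, train_labels), test_texts, test_labels)
    for name, model in models.items()
}
for name, score in results.items():
    print(f"{name}: accuracy = {score:.2f}")
```

With so few examples the scores say nothing about real performance; a meaningful comparison would need a properly annotated corpus, held-out evaluation data, and metrics beyond accuracy such as precision and recall.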

Author information

Authors and Affiliations

MVN University, Aurangabad, India
Priya & Sachin Gupta

Authors

Priya
Sachin Gupta

Corresponding author

Correspondence to Priya.

Editor information

Editors and Affiliations

Singidunum University, Belgrade, Serbia
Milan Tuba
ITM University, Gwalior, Madhya Pradesh, India
Shyam Akashe
Global Knowledge Research Foundation, Ahmedabad, India
Amit Joshi

About this paper

Cite this paper

Priya, Gupta, S. (2023). Identification of Political Hate Speech Using Machine Learning-Based Text Toxicity Analysis. In: Tuba, M., Akashe, S., Joshi, A. (eds) ICT Systems and Sustainability. Lecture Notes in Networks and Systems, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-19-5221-0_22

DOI: https://doi.org/10.1007/978-981-19-5221-0_22
Published: 01 November 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5220-3
Online ISBN: 978-981-19-5221-0
eBook Packages: Intelligent Technologies and Robotics (R0)
