Abstract
Civility in political campaigns has suffered badly in recent times, as increasingly toxic speeches seek to polarize people for electoral gain. The resulting proliferation of cyber-hate speech is creating deep divisions and threatening the integrity and harmony of societies. The term “hate speech” refers to the use of words or phrases that are threatening, derogatory, or insulting to a specific individual or group. The number of social media users in India is increasing rapidly, and with it the frequency with which cyber-hate speech targets specific segments of society or individuals based on their caste, color, or creed. An online environment free of hostility and bigotry has remained a key focus of academic attention. In the present study, contemporary political data from the Twitter platform is used to identify hate speech with machine learning methods, including text-based natural language processing (NLP). To this end, political tweets from across different ideologies have been considered. There are many different ways to collect and organize emotions and personality traits. The processed dataset has been analyzed using an ensemble of popular machine learning algorithms, and the results indicate the comparative performance of the methods.
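The abstract does not name the algorithms in the ensemble, so the following is only a minimal sketch of the general idea: clean tweet text, train several simple classifiers, combine them by majority vote, and compare accuracy. The three toy classifiers, the labels "hate" and "neutral", and the placeholder tweets are all assumptions made for illustration; none of them comes from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and return its words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.total = len(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, doc):
        def score(c):
            denom = sum(self.counts[c].values()) + len(self.vocab)
            s = math.log(self.prior[c] / self.total)
            for w in tokenize(doc):
                if w in self.vocab:  # ignore words never seen in training
                    s += math.log((self.counts[c][w] + 1) / denom)
            return s
        return max(self.classes, key=score)


class CentroidCosine:
    """Picks the class whose summed word counts are closest by cosine similarity."""

    def fit(self, docs, labels):
        self.centroids = {}
        for doc, label in zip(docs, labels):
            self.centroids.setdefault(label, Counter()).update(tokenize(doc))
        return self

    def predict(self, doc):
        vec = Counter(tokenize(doc))

        def cosine(c):
            cen = self.centroids[c]
            dot = sum(vec[w] * cen[w] for w in vec)
            norm = math.sqrt(sum(v * v for v in vec.values())) * \
                math.sqrt(sum(v * v for v in cen.values()))
            return dot / norm if norm else 0.0
        return max(sorted(self.centroids), key=cosine)


class LexiconRule:
    """Flags a tweet if it contains a word seen only in 'hate' training tweets."""

    def fit(self, docs, labels):
        hate, other = set(), set()
        for doc, label in zip(docs, labels):
            (hate if label == "hate" else other).update(tokenize(doc))
        self.lexicon = hate - other
        return self

    def predict(self, doc):
        return "hate" if self.lexicon & set(tokenize(doc)) else "neutral"


def majority_vote(models, doc):
    """Hard-voting ensemble: return the label most models agree on."""
    votes = Counter(m.predict(doc) for m in models)
    return votes.most_common(1)[0][0]


def accuracy(predict, data):
    """Fraction of (text, label) pairs that `predict` labels correctly."""
    return sum(predict(text) == label for text, label in data) / len(data)


# Placeholder data, invented for this sketch only.
TRAIN = [
    ("those traitors are vermin and must be driven out", "hate"),
    ("filthy traitors like them deserve no rights", "hate"),
    ("our party will build new schools and roads", "neutral"),
    ("join the rally tomorrow to discuss the budget", "neutral"),
]
HELD_OUT = [
    ("they are vermin", "hate"),
    ("the party will discuss schools", "neutral"),
]

texts, labels = zip(*TRAIN)
models = [cls().fit(texts, labels)
          for cls in (NaiveBayes, CentroidCosine, LexiconRule)]
```

Calling `accuracy(m.predict, HELD_OUT)` for each model, and `accuracy(lambda t: majority_vote(models, t), HELD_OUT)` for the ensemble, gives the kind of side-by-side comparison the abstract refers to. A real evaluation would need a much larger labeled corpus and a proper train/test split.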
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Priya, Gupta, S. (2023). Identification of Political Hate Speech Using Machine Learning-Based Text Toxicity Analysis. In: Tuba, M., Akashe, S., Joshi, A. (eds) ICT Systems and Sustainability. Lecture Notes in Networks and Systems, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-19-5221-0_22
DOI: https://doi.org/10.1007/978-981-19-5221-0_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5220-3
Online ISBN: 978-981-19-5221-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)