short-paper

Transfer Learning for Multilingual Abusive Meme Detection

Authors:
Mithun Das, Department of Computer Science & Engineering, Indian Institute of Technology Kharagpur, India (ORCID: 0000-0003-1442-312X)
Animesh Mukherjee, Department of Computer Science & Engineering, Indian Institute of Technology Kharagpur, India (ORCID: 0000-0003-4534-0044)

WebSci '23: Proceedings of the 15th ACM Web Science Conference 2023, April 2023, Pages 245–250. https://doi.org/10.1145/3578503.3583607

Published: 30 April 2023

ABSTRACT

The exponential growth of social media platforms has enabled people to connect worldwide. However, it has also fueled the rise of harmful and abusive content on the Internet. Repeated exposure to abusive content may have psychological effects on the targeted users. It is therefore necessary to detect abusive content in all its forms to keep these platforms safe and healthy. Much work has been done on abusive speech detection, but most of it is text-based, whereas social media content is often multimodal, comprising text, images, videos, etc. Internet memes have recently emerged as a predominant mode of content shared on social media and are used to express vitriol or harm toward others. Hence it is essential to detect such abusive memes. Although several works address abusive/harmful meme detection, most are in English, with very few extending to non-English datasets. One immediate solution is therefore to train models to detect abusive memes in one language and transfer them to other languages. This work explores several model transfer techniques to bridge this gap and establishes various baseline models.

Index Terms

Transfer Learning for Multilingual Abusive Meme Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Social and professional topics
  1. Computing / technology policy
    1. Censorship

Published in
WebSci '23: Proceedings of the 15th ACM Web Science Conference 2023
April 2023
373 pages
ISBN: 9798400700897
DOI: 10.1145/3578503
General Chairs: Ágnes Horvát (Northwestern University, IL, USA), Wendy Hall (University of Southampton, UK), Noshir Contractor (Northwestern University, IL, USA)
Organizing Chair: Leon Fröhling (GESIS, Germany)
Program Chairs: Katherine Ognyanova (Rutgers University, NJ, USA), Harsh Taneja (University of Illinois Urbana-Champaign, IL, USA), Ingmar Weber (Saarland University, Germany)
Publications Chairs: Kristina Gligorić (Stanford University, CA, USA), Yelena Mejova (ISI Foundation, Italy)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Abusive meme
detection
multilingual
social media
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 218 of 875 submissions, 25%