
EmoffMeme: identifying offensive memes by leveraging underlying emotions

Multimedia Tools and Applications

Abstract

Facebook, Twitter, Instagram, and other social media sites allow anonymity and independence. People exercise their right to free expression without fear of repercussions. However, in the absence of thorough surveillance, people have fallen prey to offensive content, trolls, and social media predators. Memes, a type of multimodal media, are becoming increasingly popular online. While most memes are meant to be humorous, some use dark humor to disseminate offensive content. Our present research focuses on learning the dependency and correlation between three tasks, viz., detecting offensive memes, classifying offensive memes into fine-grained categories, and detecting emotions in a meme. For this, we created EmoffMeme, a large-scale multimodal dataset for Hindi. We aim to gain insight into social media users’ hidden emotions by studying a meme’s text and image. We present an end-to-end multitask deep neural network based on CLIP (Contrastive Language-Image Pre-training) to solve these correlated tasks simultaneously. We also employ Multimodal Factorized Bilinear (MFB) pooling to learn a common representation of a meme’s textual and visual parts. We demonstrate the effectiveness of our approach through extensive experiments. The evaluation shows that the proposed multitask framework yields better performance on the primary task, i.e., offensiveness identification, with the help of the secondary task, i.e., emotion analysis.
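As a rough illustration of the model described above, the following PyTorch sketch (not the authors’ released code) fuses pre-extracted CLIP text and image embeddings with Multimodal Factorized Bilinear pooling and passes the shared representation to two task-specific heads, one for offensiveness and one for emotion. The feature dimensions, the MFB factor count k, and the class counts are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code): a multitask meme classifier
# that fuses pre-extracted CLIP text and image embeddings with Multimodal
# Factorized Bilinear (MFB) pooling and predicts offensiveness and emotion.
# All dimensions, the factor count k, and the class counts are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MFBFusion(nn.Module):
    """Low-rank bilinear pooling of a text vector and an image vector."""

    def __init__(self, text_dim=512, img_dim=512, out_dim=256, k=5, p_drop=0.1):
        super().__init__()
        self.out_dim, self.k = out_dim, k
        self.text_proj = nn.Linear(text_dim, out_dim * k)
        self.img_proj = nn.Linear(img_dim, out_dim * k)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, text_feat, img_feat):
        joint = self.text_proj(text_feat) * self.img_proj(img_feat)  # element-wise product
        joint = self.dropout(joint)
        joint = joint.view(-1, self.out_dim, self.k).sum(dim=2)      # sum-pool over k factors
        joint = torch.sign(joint) * torch.sqrt(torch.abs(joint) + 1e-12)  # power normalisation
        return F.normalize(joint, dim=-1)                            # L2 normalisation


class MultitaskMemeClassifier(nn.Module):
    """Shared MFB representation feeding two task-specific heads."""

    def __init__(self, fused_dim=256, n_offense_classes=3, n_emotion_classes=6):
        super().__init__()
        self.fusion = MFBFusion(out_dim=fused_dim)
        self.offense_head = nn.Linear(fused_dim, n_offense_classes)  # primary task
        self.emotion_head = nn.Linear(fused_dim, n_emotion_classes)  # secondary task

    def forward(self, text_feat, img_feat):
        shared = self.fusion(text_feat, img_feat)
        return self.offense_head(shared), self.emotion_head(shared)


# Toy usage with random stand-ins for CLIP features of a batch of 4 memes.
model = MultitaskMemeClassifier()
text_feat, img_feat = torch.randn(4, 512), torch.randn(4, 512)
offense_logits, emotion_logits = model(text_feat, img_feat)
```

In a multitask setup of this kind, the two heads are typically trained jointly by minimising a weighted sum of their classification losses, which lets the emotion signal inform the offensiveness decision.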


Data Availability

The dataset generated and analysed during the current study is available in the journal1_memes-A48B repository at the link: https://github.com/Gitanjali1801/EmoffMeme.git.

Code Availability

The code of the current study is available at the link: https://github.com/Gitanjali1801/EmoffMeme.git.

Notes

  1. To maintain the anonymity of any individual, we replaced actual names with Person-XYZ throughout the paper.

  2. https://download-all-images.mobilefirst.me/

  3. https://github.com/tesseract-ocr/tesseract

  4. https://github.com/FreddeFrallan/Multilingual-CLIP

  5. https://pytorch.org/

  6. https://github.com/google-research/bert/blob/master/multilingual.md

  7. Our corpus has its textual part in Hindi, but VisualBERT and LXMERT are pre-trained on English corpora. For these two models only, we translated the Hindi text in our dataset into English with Google Translator and then used the translated text for training (a minimal preprocessing sketch follows these notes).
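A rough sketch of this preprocessing step, assuming the third-party deep-translator package as a stand-in for the Google Translator interface (the package choice, function name, and sample sentence are illustrative, not the authors’ tooling):

```python
# Hypothetical sketch: translate the Hindi text of each meme into English
# before feeding it to English-only models such as VisualBERT and LXMERT.
# deep-translator's GoogleTranslator is used here as a stand-in.
from deep_translator import GoogleTranslator


def translate_meme_texts(hindi_texts):
    """Translate a list of Hindi meme captions into English."""
    translator = GoogleTranslator(source="hi", target="en")
    return [translator.translate(text) for text in hindi_texts]


# Example: a single illustrative Hindi sentence.
english_texts = translate_meme_texts(["यह एक उदाहरण वाक्य है"])
```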


Funding

The authors gratefully acknowledge the project “HELIOS - Hate, Hyperpartisan, and Hyperpluralism Elicitation and Observer System”, sponsored by Wipro AI.

Author information

Authors and Affiliations

Authors

Contributions

Gitanjali Kumari: Corpus creation, Algorithm design, Implementation, Experiments, Analysis, Writing - original draft. Dibyanayan Bandyopadhyay: Implementation, Experiments, Analysis, Writing - original draft. Asif Ekbal: Supervision, Algorithm conceptualization.

Corresponding author

Correspondence to Gitanjali Kumari.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest regarding the work reported in this paper.

Ethical Considerations

1. Individual Privacy: To maintain the anonymity of any individual, we replaced the actual name with Person-XYZ throughout the paper. In addition, we also tried to anonymize known faces appearing in the visual part of a meme by masking them. These faces were masked only to preserve anonymity in the paper; during implementation, we used the original images.

2. Biases: Detecting and removing political and religious biases is an extensive research area. Previous annotation studies show that bias and subjectivity cannot be fully removed from the annotation process, even with a well-defined annotation scheme. Any biases present in our dataset are therefore unintentional, and we have no intention of harming any individual or group. To address concerns about political and religious bias, we ensured that data were collected equally and comparably across groups. Furthermore, by using a keyword-based data-gathering technique, we ensured that the topics cover a variety of issues relevant in the Indian context over the last seven years, and that the keywords were inclusive of all conceivable politicians, political organizations, young politicians, extreme groups, and religions, without prejudice against any one group. Following previous work on removing biases from datasets during annotation, annotators were strictly instructed not to make decisions based on their own beliefs but on what the social media user intends to convey through the meme.

3. Misuse Potential: We caution researchers that our dataset might be misused to filter memes based on prejudices that may or may not be connected to demographics or other textual information. Human intervention with moderation would be essential to prevent such misuse.

4. Intended Use: Our dataset is presented to encourage the study of humorous memes on the internet. We believe it represents a valuable resource when used appropriately.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kumari, G., Bandyopadhyay, D. & Ekbal, A. EmoffMeme: identifying offensive memes by leveraging underlying emotions. Multimed Tools Appl 82, 45061–45096 (2023). https://doi.org/10.1007/s11042-023-14807-1

