Abstract
Social media reflects many aspects of society, including social biases against individuals based on sensitive characteristics such as gender, race, religion, physical ability, and sexual orientation. Machine learning algorithms trained on social media data may therefore perpetuate or amplify discriminatory attitudes toward various demographic groups, leading to unfair decision-making. One important application of machine learning is the automatic detection of cyberbullying. In this context, bias could take the form of bullying detectors that produce false detections more frequently on messages by or about certain identity groups. In this paper, we present an approach for training bullying detectors from weak supervision while reducing the degree to which the learned models reflect or amplify discriminatory biases in the data. Our goal is to decrease the sensitivity of models to language describing particular social groups: an ideal, fair language-based detector should treat language describing subpopulations of any social group equitably. Building on a previously proposed weakly supervised learning algorithm, we penalize the model when discrimination is observed, encouraging the learning algorithm to avoid unfair behavior in its predictions and to treat protected subpopulations equitably. We introduce two unfairness penalty terms: one aimed at removal fairness and another at substitutional fairness. We quantitatively and qualitatively evaluate the resulting models’ fairness on a synthetic benchmark and on Twitter data, comparing against crowdsourced annotations.
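The two penalty terms can be illustrated with a minimal sketch. The identity lexicon, the additive linear scorer, and the lambda weighting below are assumptions for illustration only, not the paper's actual model: removal fairness penalizes any score change when identity terms are deleted from a message, and substitutional fairness penalizes score differences when one identity term is swapped for another.

```python
# Illustrative sketch of removal- and substitutional-fairness penalties.
# IDENTITY_TERMS, the additive scorer, and lam are hypothetical stand-ins.

IDENTITY_TERMS = {"groupA", "groupB"}  # hypothetical identity lexicon

def score(weights, message):
    """Additive bullying score: sum of learned weights for known tokens."""
    return sum(weights.get(token, 0.0) for token in message.split())

def removal_penalty(weights, message):
    """Penalize any score change when identity terms are removed."""
    stripped = " ".join(t for t in message.split() if t not in IDENTITY_TERMS)
    return abs(score(weights, message) - score(weights, stripped))

def substitution_penalty(weights, message):
    """Penalize the score spread across identity-term substitutions."""
    scores = []
    for term in IDENTITY_TERMS:
        swapped = " ".join(term if t in IDENTITY_TERMS else t
                           for t in message.split())
        scores.append(score(weights, swapped))
    return max(scores) - min(scores)

def penalized_loss(weights, message, label, lam=1.0):
    """Base squared error plus a fairness penalty (removal variant shown)."""
    base = (score(weights, message) - label) ** 2
    return base + lam * removal_penalty(weights, message)
```

During training, minimizing such a penalized loss pushes the learned weights on identity terms toward zero, so the detector's decisions depend on the abusive content of a message rather than on which group it mentions.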
Raisi, E., Huang, B. (2019). Reduced-Bias Co-trained Ensembles for Weakly Supervised Cyberbullying Detection. In: Tagarelli, A., Tong, H. (eds) Computational Data and Social Networks. CSoNet 2019. Lecture Notes in Computer Science(), vol 11917. Springer, Cham. https://doi.org/10.1007/978-3-030-34980-6_32