Reduced-Bias Co-trained Ensembles for Weakly Supervised Cyberbullying Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11917)

Abstract

Social media reflects many aspects of society, including social biases against individuals based on sensitive characteristics such as gender, race, religion, physical ability, and sexual orientation. Machine learning algorithms trained on social media data may therefore perpetuate or amplify discriminatory attitudes against various demographic groups, causing unfair decision-making. One important application of machine learning is the automatic detection of cyberbullying. Biases in this context could take the form of bullying detectors that make false detections more frequently on messages by or about certain identity groups. In this paper, we present an approach for training bullying detectors from weak supervision while reducing the degree to which learned models reflect or amplify discriminatory biases in the data. Our goal is to decrease the sensitivity of models to language describing particular social groups. An ideal, fair language-based detector should treat language describing subpopulations of particular social groups equitably. Building on a previously proposed weakly supervised learning algorithm, we penalize the model when discrimination is observed. By penalizing unfairness, we encourage the learning algorithm to avoid unfair behavior in its predictions and achieve equitable treatment for protected subpopulations. We introduce two unfairness penalty terms: one aimed at removal fairness and another at substitutional fairness. We quantitatively and qualitatively evaluate the resulting models’ fairness on a synthetic benchmark and on data from Twitter, comparing against crowdsourced annotation.
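
The abstract names two penalty terms without giving their form. The following is a minimal sketch of how such penalties might be computed, not the formulation used in the paper. It assumes a removal penalty that measures how much a score changes when sensitive keywords are deleted, and a substitutional penalty that measures how much it changes when a sensitive keyword is swapped for another keyword from the same category. The keyword groups, the function names, and the toy scorer are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical keyword groups, for illustration only.
SENSITIVE_GROUPS: Dict[str, List[str]] = {
    "religion": ["muslim", "christian", "jewish"],
    "gender": ["woman", "man"],
}

Scorer = Callable[[List[str]], float]


def removal_penalty(score: Scorer, tokens: List[str]) -> float:
    """Squared change in score when all sensitive keywords are removed."""
    sensitive = {w for words in SENSITIVE_GROUPS.values() for w in words}
    stripped = [t for t in tokens if t not in sensitive]
    return (score(tokens) - score(stripped)) ** 2


def substitution_penalty(score: Scorer, tokens: List[str]) -> float:
    """Mean squared change in score over same-category keyword swaps."""
    base = score(tokens)
    diffs = []
    for words in SENSITIVE_GROUPS.values():
        for i, tok in enumerate(tokens):
            if tok not in words:
                continue
            for alt in words:
                if alt != tok:
                    swapped = tokens[:i] + [alt] + tokens[i + 1:]
                    diffs.append((base - score(swapped)) ** 2)
    return sum(diffs) / len(diffs) if diffs else 0.0


def make_toy_scorer(weights: Dict[str, float]) -> Scorer:
    """Toy bag-of-words scorer: sum of per-token weights, capped at 1."""
    def score(tokens: List[str]) -> float:
        return min(1.0, sum(weights.get(t, 0.0) for t in tokens))
    return score


# A biased scorer gives weight to an identity term; a fair one does not.
biased = make_toy_scorer({"stupid": 0.6, "muslim": 0.3})
fair = make_toy_scorer({"stupid": 0.6})
message = ["you", "stupid", "muslim"]

biased_removal = removal_penalty(biased, message)
biased_substitution = substitution_penalty(biased, message)
fair_removal = removal_penalty(fair, message)
fair_substitution = substitution_penalty(fair, message)
```

In a training loop, terms like these would be added to the learning objective with a weighting coefficient, so that a model whose scores depend on identity terms incurs a higher loss. Under this sketch the fair scorer incurs zero penalty on the example message, while the biased scorer does not.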

Author information

Corresponding author

Correspondence to Elaheh Raisi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Raisi, E., Huang, B. (2019). Reduced-Bias Co-trained Ensembles for Weakly Supervised Cyberbullying Detection. In: Tagarelli, A., Tong, H. (eds) Computational Data and Social Networks. CSoNet 2019. Lecture Notes in Computer Science, vol 11917. Springer, Cham. https://doi.org/10.1007/978-3-030-34980-6_32

  • DOI: https://doi.org/10.1007/978-3-030-34980-6_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34979-0

  • Online ISBN: 978-3-030-34980-6

  • eBook Packages: Computer Science, Computer Science (R0)
