Abstract
The concept of fairness is gaining attention in both academia and industry. Social media is especially vulnerable to media bias, toxic language, and harmful comments. We propose a fair ML pipeline that takes a text as input and determines whether it contains biased or toxic content. It then uses pre-trained word embeddings to suggest a set of alternative words to substitute for the biased ones, the idea being to lessen the effect of those biases by replacing them with neutral alternatives. We compare our approach against existing fairness models to assess its effectiveness. The results show that the proposed pipeline can detect, identify, and mitigate biases in social media data.
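The substitution step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding vectors and the biased-word lexicon here are toy, hypothetical values, and a real pipeline would load pre-trained vectors such as word2vec or GloVe and a curated bias lexicon.

```python
import math

# Toy "pre-trained" embeddings: hypothetical 3-d vectors for illustration only.
# A real pipeline would load word2vec/GloVe vectors instead.
EMBEDDINGS = {
    "aggressive": [0.90, 0.10, 0.00],
    "assertive":  [0.85, 0.20, 0.05],
    "confident":  [0.80, 0.30, 0.10],
    "banana":     [0.00, 0.10, 0.90],
}
BIASED_WORDS = {"aggressive"}  # hypothetical lexicon of flagged terms


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def suggest_alternatives(word, k=2):
    """Rank non-biased vocabulary words by embedding similarity to `word`."""
    candidates = [w for w in EMBEDDINGS if w != word and w not in BIASED_WORDS]
    candidates.sort(key=lambda w: cosine(EMBEDDINGS[word], EMBEDDINGS[w]),
                    reverse=True)
    return candidates[:k]


def debias_text(text):
    """Replace each flagged token with its nearest non-biased neighbour."""
    return " ".join(
        suggest_alternatives(tok, k=1)[0] if tok in BIASED_WORDS else tok
        for tok in text.split()
    )


print(debias_text("an aggressive candidate"))  # -> "an assertive candidate"
```

With the toy vectors, "assertive" is the closest non-flagged neighbour of "aggressive", so it is chosen as the replacement; in practice the candidate list would come from the embedding model's full vocabulary.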
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Raza, S., Bashir, S.R., Sneha, Qamar, U. (2023). Addressing Biases in the Texts Using an End-to-End Pipeline Approach. In: Boratto, L., Faralli, S., Marras, M., Stilo, G. (eds) Advances in Bias and Fairness in Information Retrieval. BIAS 2023. Communications in Computer and Information Science, vol 1840. Springer, Cham. https://doi.org/10.1007/978-3-031-37249-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37248-3
Online ISBN: 978-3-031-37249-0
eBook Packages: Computer Science, Computer Science (R0)