Addressing Biases in the Texts Using an End-to-End Pipeline Approach

  • Conference paper
  • First Online:
Advances in Bias and Fairness in Information Retrieval (BIAS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1840)
Abstract

The concept of fairness is gaining traction in both academia and industry. Social media is especially vulnerable to media bias and to toxic language and comments. We propose a fair ML pipeline that takes a text as input and determines whether it contains biased or toxic content. Using pre-trained word embeddings, the pipeline then suggests alternative words to substitute for the biased ones, the idea being to lessen the effect of those biases through replacement. We compare our approach with existing fairness models to assess its effectiveness. The results show that the proposed pipeline can detect, identify, and mitigate biases in social media data.
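
To make the two stages of the pipeline concrete, the sketch below pairs an off-the-shelf toxicity classifier with nearest-neighbour lookups in pre-trained word embeddings. This is a minimal illustration of the detect-then-substitute idea described in the abstract, not the authors' implementation: the classifier (unitary/toxic-bert), the GloVe vectors, the biased-word lexicon, and the 0.5 threshold are all stand-in assumptions.

```python
# Minimal sketch: detect toxic/biased text, then suggest replacement words
# from pre-trained embeddings. Model choice, word list, and threshold are
# illustrative assumptions, not the paper's actual components.
from transformers import pipeline
import gensim.downloader as api

# Step 1: off-the-shelf toxicity classifier (a publicly available stand-in).
toxicity = pipeline("text-classification", model="unitary/toxic-bert")

# Step 2: pre-trained word embeddings for substitute suggestions
# (the paper only says "pre-trained word embeddings"; GloVe is used here).
embeddings = api.load("glove-wiki-gigaword-100")

# Hypothetical lexicon of words flagged as biased in context.
BIASED_WORDS = {"bossy", "hysterical"}

def suggest_substitutes(text: str, top_n: int = 3) -> dict:
    """If the text scores as toxic, return candidate replacements per flagged word."""
    result = toxicity(text)[0]
    # Simplified gate: only mitigate when the top label is "toxic" with score >= 0.5.
    if result["label"] != "toxic" or result["score"] < 0.5:
        return {}
    suggestions = {}
    for raw in text.lower().split():
        token = raw.strip(".,!?")
        if token in BIASED_WORDS and token in embeddings:
            # Nearest neighbours in embedding space as candidate substitutes.
            suggestions[token] = [w for w, _ in embeddings.most_similar(token, topn=top_n)]
    return suggestions

print(suggest_substitutes("She is so bossy and hysterical."))
```

In the paper's pipeline the substitution step is driven by the words the detection stage flags; here a fixed lexicon stands in for that stage.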



Author information

Corresponding author

Correspondence to Shaina Raza.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Raza, S., Bashir, S.R., Sneha, Qamar, U. (2023). Addressing Biases in the Texts Using an End-to-End Pipeline Approach. In: Boratto, L., Faralli, S., Marras, M., Stilo, G. (eds) Advances in Bias and Fairness in Information Retrieval. BIAS 2023. Communications in Computer and Information Science, vol 1840. Springer, Cham. https://doi.org/10.1007/978-3-031-37249-0_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-37249-0_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37248-3

  • Online ISBN: 978-3-031-37249-0

  • eBook Packages: Computer Science, Computer Science (R0)
