
Fairness by Explicability and Adversarial SHAP Learning

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

The ability to understand and trust the fairness of model predictions, particularly when considering the outcomes of unprivileged groups, is critical to the deployment and adoption of machine learning systems. SHAP values provide a unified framework for interpreting model predictions and feature attribution, but they do not address the problem of fairness directly. In this work, we propose a new definition of fairness that emphasises the role of an external auditor and model explicability. To satisfy this definition, we develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model. We focus on the binary classification task with a single unprivileged group and link our fairness-explicability constraints to classical statistical fairness metrics. We demonstrate our approaches using gradient and adaptive boosting on three datasets: a synthetic dataset, the UCI Adult (Census) dataset, and a real-world credit scoring dataset. The resulting models are both fairer and performant.
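The core idea described above (penalising a model using the SHAP attributions of an adversarial surrogate that tries to recover the protected attribute from the model's scores) can be sketched in a few lines. The following is a minimal numpy illustration on synthetic data, not the authors' implementation: the names (`fit_logistic`, `shap_penalty`) and the data-generating process are hypothetical, and a linear surrogate is used so that its SHAP values have a closed form, whereas the paper works with gradient and adaptive boosting.

```python
import numpy as np

# Hypothetical synthetic setup: a protected attribute `a` and a model score
# that leaks information about it.
rng = np.random.default_rng(0)
n = 2000
a = rng.integers(0, 2, size=n).astype(float)  # protected group membership
x = rng.normal(size=n) + 1.5 * a              # score correlated with `a`

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(z, y, lr=0.5, steps=500):
    """Single-feature logistic regression via plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * z + b)
        w -= lr * np.mean((p - y) * z)
        b -= lr * np.mean(p - y)
    return w, b

def shap_penalty(scores, a):
    """Mean |SHAP| attribution of a surrogate predicting `a` from `scores`.

    For a linear surrogate f(s) = w*s + b, the SHAP value of the score
    feature is exactly w * (s - E[s]), so no sampling is needed here.
    """
    w, _ = fit_logistic(scores, a)
    phi = w * (scores - scores.mean())
    return float(np.mean(np.abs(phi)))

biased_scores = x                    # scores that leak group membership
neutral_scores = rng.normal(size=n)  # scores independent of the group

print(shap_penalty(biased_scores, a), shap_penalty(neutral_scores, a))
```

When the scores carry information about the protected attribute, the surrogate's SHAP attributions are large and the penalty is high; when the scores are independent of the group, the penalty collapses towards zero. Adding such a term to the training loss is the regularization mechanism the abstract refers to.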

Supported by Experian Ltd.



Author information


Corresponding author

Correspondence to James M. Hickey.


Electronic supplementary material

Supplementary material 1 (PDF 122 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Hickey, J.M., Di Stefano, P.G., Vasileiou, V. (2021). Fairness by Explicability and Adversarial SHAP Learning. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol. 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_11


  • DOI: https://doi.org/10.1007/978-3-030-67664-3_11

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3

  • eBook Packages: Computer Science, Computer Science (R0)
