
Fairness by Explicability and Adversarial SHAP Learning

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

The ability to understand and trust the fairness of model predictions, particularly when considering the outcomes of unprivileged groups, is critical to the deployment and adoption of machine learning systems. SHAP values provide a unified framework for interpreting model predictions and feature attribution, but they do not address the problem of fairness directly. In this work, we propose a new definition of fairness that emphasises the role of an external auditor and model explicability. To satisfy this definition, we develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model. We focus on the binary classification task with a single unprivileged group and link our fairness-explicability constraints to classical statistical fairness metrics. We demonstrate our approaches using gradient and adaptive boosting on three datasets: a synthetic dataset, the UCI Adult (Census) dataset, and a real-world credit scoring dataset. The resulting models are both fairer and performant.
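The core idea described above (penalising a model using the SHAP attributions of an adversarial surrogate that tries to recover the protected attribute from the model's scores) can be sketched in a few lines. The following is a minimal numpy illustration on synthetic data, not the authors' implementation: the names (`fit_logistic`, `shap_penalty`) and the data-generating process are hypothetical, and a linear surrogate is used so that its SHAP values have a closed form, whereas the paper works with gradient and adaptive boosting.

```python
import numpy as np

# Hypothetical synthetic setup: a protected attribute `a` and a model score
# that leaks information about it.
rng = np.random.default_rng(0)
n = 2000
a = rng.integers(0, 2, size=n).astype(float)  # protected group membership
x = rng.normal(size=n) + 1.5 * a              # score correlated with `a`

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(z, y, lr=0.5, steps=500):
    """Single-feature logistic regression via plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * z + b)
        w -= lr * np.mean((p - y) * z)
        b -= lr * np.mean(p - y)
    return w, b

def shap_penalty(scores, a):
    """Mean |SHAP| attribution of a surrogate predicting `a` from `scores`.

    For a linear surrogate f(s) = w*s + b, the SHAP value of the score
    feature is exactly w * (s - E[s]), so no sampling is needed here.
    """
    w, _ = fit_logistic(scores, a)
    phi = w * (scores - scores.mean())
    return float(np.mean(np.abs(phi)))

biased_scores = x                    # scores that leak group membership
neutral_scores = rng.normal(size=n)  # scores independent of the group

print(shap_penalty(biased_scores, a), shap_penalty(neutral_scores, a))
```

When the scores carry information about the protected attribute, the surrogate's SHAP attributions are large and the penalty is high; when the scores are independent of the group, the penalty collapses towards zero. Adding such a term to the training loss is the regularization mechanism the abstract refers to.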

Supported by Experian Ltd.



Author information


Corresponding author

Correspondence to James M. Hickey.


Electronic supplementary material

Supplementary material 1 (PDF 122 KB)


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Hickey, J.M., Di Stefano, P.G., Vasileiou, V. (2021). Fairness by Explicability and Adversarial SHAP Learning. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol. 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_11


  • DOI: https://doi.org/10.1007/978-3-030-67664-3_11

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3

  • eBook Packages: Computer Science, Computer Science (R0)
