Fair Classifiers that Abstain without Harm

Yin, Tongxin; Ton, Jean-François; Guo, Ruocheng; Yao, Yuanshun; Liu, Mingyan; Liu, Yang

Abstract: In critical applications, it is vital for classifiers to defer decision-making to humans. We propose a post-hoc method that makes existing classifiers selectively abstain from predicting certain samples. Our abstaining classifier is incentivized to maintain the original accuracy for each sub-population (i.e., no harm) while achieving a set of group fairness definitions to a user-specified degree. To this end, we design an Integer Programming (IP) procedure that assigns an abstention decision to each training sample so as to satisfy a set of constraints. To generalize the abstention decisions to test samples, we then train a surrogate model to learn them from the IP solutions in an end-to-end manner. We analyze the feasibility of the IP procedure to determine the possible abstention rate for different levels of unfairness tolerance and of the accuracy constraint for achieving no harm. To the best of our knowledge, this work is the first to identify the theoretical relationships between the constraint parameters and the required abstention rate. Our theoretical results are important because a high abstention rate is often infeasible in practice due to a lack of human resources. Our framework outperforms existing methods in terms of fairness disparity without sacrificing accuracy at similar abstention rates.
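To make the abstention-assignment idea concrete, here is a minimal sketch of the kind of constrained selection the abstract describes. It is not the authors' IP formulation: it brute-forces every abstention mask on a toy dataset instead of calling an integer programming solver, it assumes demographic parity (the gap in positive-prediction rates) as the single fairness definition, and the names `solve_abstention`, `group_stats`, and `eps`, as well as the toy data, are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def group_stats(idx, labels, preds):
    """Return (accuracy, positive-prediction rate) over the sample indices idx."""
    n = len(idx)
    acc = Fraction(sum(labels[i] == preds[i] for i in idx), n)
    pos = Fraction(sum(preds[i] for i in idx), n)
    return acc, pos


def solve_abstention(groups, labels, preds, eps):
    """Find a smallest set of samples to abstain on (exhaustive search).

    Constraints (assumed for this sketch, not taken from the paper):
      * no harm: each group's accuracy on kept samples >= its original accuracy
      * fairness: gap in positive-prediction rate across groups <= eps
      * every group keeps at least one sample
    Returns a list of 0/1 abstention flags, or None if nothing is feasible.
    """
    n = len(groups)
    group_ids = sorted(set(groups))
    members = {g: [i for i in range(n) if groups[i] == g] for g in group_ids}
    base_acc = {g: group_stats(members[g], labels, preds)[0] for g in group_ids}

    # Try abstention sets in order of increasing size, so the first
    # feasible set found has the minimum number of abstentions.
    for k in range(n + 1):
        for abstained in combinations(range(n), k):
            dropped = set(abstained)
            kept = {g: [i for i in members[g] if i not in dropped]
                    for g in group_ids}
            if any(not kept[g] for g in group_ids):
                continue
            stats = {g: group_stats(kept[g], labels, preds) for g in group_ids}
            no_harm = all(stats[g][0] >= base_acc[g] for g in group_ids)
            rates = [stats[g][1] for g in group_ids]
            fair = max(rates) - min(rates) <= eps
            if no_harm and fair:
                return [int(i in dropped) for i in range(n)]
    return None


# Toy data (hypothetical): two groups of four samples each.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 0, 0, 1, 0, 0, 0]
preds = [1, 1, 1, 0, 1, 0, 0, 0]

mask = solve_abstention(groups, labels, preds, eps=Fraction(1, 4))
print("abstain flags:", mask)
print("abstention rate:", Fraction(sum(mask), len(mask)))
```

Exhaustive search is exponential in the number of samples, so it only serves to illustrate the constraints; a real instance would need an integer programming solver. Tightening `eps` in this toy setup forces more abstentions, which mirrors the trade-off between unfairness tolerance and abstention rate that the abstract says the paper analyzes.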

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2310.06205 [cs.LG]
	(or arXiv:2310.06205v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.06205
