Artificial Intelligence, Volume 302, January 2022, 103609

On fair selection in the presence of implicit and differential variance

https://doi.org/10.1016/j.artint.2021.103609

Abstract

Discrimination in selection problems such as hiring or college admission is often explained by implicit bias from the decision maker against disadvantaged demographic groups. In this paper, we consider a model where the decision maker receives a noisy estimate of each candidate's quality, whose variance depends on the candidate's group—we argue that such differential variance is a key feature of many selection problems. We analyze two notable settings: in the first, the noise variances are unknown to the decision maker who simply picks the candidates with the highest estimated quality independently of their group; in the second, the variances are known and the decision maker picks candidates having the highest expected quality given the noisy estimate. We show that both baseline decision makers yield discrimination, although in opposite directions: the first leads to underrepresentation of the low-variance group while the second leads to underrepresentation of the high-variance group. We study the effect on the selection utility of imposing a fairness mechanism that we term the γ-rule (an extension of the classical four-fifths rule that also includes demographic parity). In the first setting (with unknown variances), we prove that under mild conditions, imposing the γ-rule increases the selection utility—here there is no trade-off between fairness and utility. In the second setting (with known variances), imposing the γ-rule decreases the utility but we prove a bound on the utility loss due to the fairness mechanism.

Introduction

Discrimination in selection and the role of implicit bias  Many selection problems such as hiring or college admission are subject to discrimination [2], where the outcomes for certain individuals are negatively correlated with their membership in salient demographic groups defined by attributes like gender, race, ethnicity, sexual orientation or religion. Over the past two decades, implicit bias—that is, an unconscious negative perception of the members of certain demographic groups—has been put forward as a key factor in explaining this discrimination [3]. While human decision makers are naturally susceptible to implicit bias when assessing candidates, algorithmic decision makers are also vulnerable to it when the data used to train them or to make decisions were generated by humans.

To mitigate the effects of discrimination on candidates from underrepresented groups, various fairness mechanisms are adopted in many domains, either by law or through softer guidelines. For instance, the Rooney rule [4] requires that, when hiring for a given position, at least one candidate from the underrepresented group be interviewed. The Rooney rule was initially introduced for hiring American football coaches, but it is increasingly being adopted by many other businesses, in particular for hiring top executives [5], [6]. Another widely used fairness mechanism is the so-called four-fifths rule [7], which requires that the selection rate for the underrepresented group be at least 80% of that for the overrepresented group (otherwise one says that there is adverse impact). This rule is part of the “Uniform Guidelines On Employee Selection Procedures”. A stricter version of the four-fifths rule is the demographic parity constraint, which requires the selection rates for all groups to be equal. An overview of these and other fairness mechanisms can be found in [7].

Fairness mechanisms, however, have been the subject of frequent debates. On the one hand, they are believed to promote the inclusion of deserving candidates from underrepresented groups who would otherwise have been excluded, in particular due to implicit bias. On the other hand, they are viewed as requiring consideration of candidates from underrepresented groups at the expense of candidates from overrepresented groups, which may decrease the overall utility of the selection process, i.e., the overall quality of the selected candidates.

Formal analysis of fairness mechanisms in the presence of implicit bias  Perhaps surprisingly, the mathematical analysis of the effect of fairness mechanisms on utility in the context of selection problems was initiated only recently by Kleinberg and Raghavan [8] (see also an extension to ranking problems in [9]). The authors of [8] assume that each candidate i has a true latent quality $W_i$ that comes from a group-independent distribution. They model implicit bias by assuming that the decision maker sees an estimate of the quality $\hat{W}_i = W_i$ for candidates from the well-represented group and $\hat{W}_i = W_i/\beta$ for candidates from the underrepresented group, where $\beta > 1$ measures the amount of implicit bias. The factor $\beta$ is unknown (as the bias is implicit) and the decision maker selects candidates by ranking them according to $\hat{W}_i$. Then Kleinberg and Raghavan [8] show that, under a well-defined condition (that roughly qualifies scenarios where the bias is large), the Rooney rule improves in expectation the utility of the selection (measured as the sum of the true qualities of the candidates selected for interview). This result contradicts the conventional wisdom that fairness considerations in a selection process are at odds with the utility of the selection process. Rather, it formalizes the intuition that, in the presence of strong implicit bias (which makes it hard to compare candidates across groups), considering the best candidates across a diverse set of groups not only improves fairness but also has a positive effect on utility.

The phenomenon of differential variance and its role in discrimination  In this paper, we identify and analyze a fundamentally different source of discrimination in selection problems than implicit bias. Even in the absence of implicit bias in a decision maker's estimate of candidates' quality, the estimates may differ between the different groups in their variance—that is, the decision maker's ability to precisely estimate a candidate's quality may depend on the candidate's group. There are at least two main reasons for group-dependent variances in practice. The first arises from the candidates: different groups of candidates may exhibit different variability when their quality is estimated through a given test. For instance, students of different genders have been observed to show different variability on certain test scores [10], [11]. The second arises from the decision makers: decision makers might have different levels of experience (or different amounts of data in the case of algorithmic decision making) in judging candidates from different groups; consequently, their ability to precisely assess the quality of candidates belonging to different groups might differ. For instance, when hiring top executives, one may have less experience in evaluating the performance of female candidates because there have been fewer women in those positions in the past (in France, for instance, there was only one woman CEO amongst the top-40 companies in 2016–2020 [12]). The quality estimate's variance might also change from one decision maker to another. For example, in college admissions, recruiters might be able to judge candidates from schools in their own country more accurately than those from international schools.

We refer to the above phenomenon as differential variance, as the variance of the quality estimate is group-dependent. We posit that differential variance is an omnipresent and fundamental feature affecting selection problems (including in algorithmic decision making). Indeed, having different variances for the different groups is mostly inevitable and hardly fixable. In this paper, we model the differential variance phenomenon by assuming that the decision maker sees an estimate $\hat{W}_i$ of the quality of a candidate that is equal to the candidate's true latent quality $W_i$ (possibly with an additional bias term) plus additive noise whose variance depends on the group of the candidate.

We distinguish between two notable settings. In the first setting, the noise variance is assumed to be unknown to the decision maker—we then call it implicit variance. In this case, a natural baseline decision maker is the group-oblivious algorithm that simply selects the candidates with the highest estimated quality $\hat{W}_i$, irrespective of their group. The group-oblivious selection algorithm can represent not only a decision maker unaware of the implicit variance in their estimates, but also a decision maker determined not to use group information—as may be the case, for instance, in college admission based on standardized tests. In the second setting, the noise variance is known to the decision maker. In this case, a natural baseline is the Bayesian-optimal algorithm: this decision maker can use the group information as well as the knowledge of the distributions of latent quality and noise to select the candidates that maximize the expected quality given the noisy estimate.

As a first key result, our analysis shows that, in the presence of differential variance, both the group-oblivious and the Bayesian-optimal algorithms lead to discrimination (although in opposite directions, see the overview of our results below). A natural way to address this representation inequality is to adopt fairness mechanisms proposed to address discrimination in selection, such as the ones discussed above; but this raises the same question that was investigated by Kleinberg and Raghavan [8] in the case of implicit bias: what is the effect of fairness mechanisms on the quality of a selection in the presence of differential variance?

Our model and overview of our results  To answer this question, we propose a simple model with two groups of candidates A and B: for each candidate i, the decision maker receives a noisy (and possibly biased) quality estimate $\hat{W}_i = W_i - \beta_{G_i} + \sigma_{G_i}\varepsilon_i$, where $G_i$ is the group to which the candidate belongs and $\varepsilon_i$ is a standard normal random variable. The estimator has an additive bias $\beta_{G_i}$ and a variance $\sigma_{G_i}^2$ that depend on the candidate's group. We assume that the true quality $W_i$ comes from a distribution—assumed normal in our analytical results—that may be group-dependent. The decision maker then selects a fraction α (called the selection budget) of the candidates.
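To make the Bayesian-optimal baseline concrete under these normal assumptions, write $\mu_G$ and $s_G^2$ for the (possibly group-dependent) mean and variance of the latent quality distribution—notation introduced here only for illustration, as a sketch of what this baseline computes. By standard Gaussian conjugacy, and with the sign convention for the bias used above, the expected quality given the noisy estimate is
\[
\mathbb{E}\big[W_i \mid \hat{W}_i, G_i = G\big] \;=\; \mu_G + \frac{s_G^2}{s_G^2 + \sigma_G^2}\big(\hat{W}_i + \beta_G - \mu_G\big),
\]
so the Bayesian-optimal algorithm de-biases the estimate and shrinks it toward the group's prior mean, with stronger shrinkage for the high-variance group.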

The key feature of our model is the variance $\sigma_{G_i}^2$ that depends on the candidate's group—to model differential variance. In its general version, we also allow a bias and a latent quality distribution that depend on the candidate's group. Using this general model, we first show (Section 3.1) that both the group-oblivious and the Bayesian-optimal selection algorithms systematically lead to underrepresentation—i.e., a lower selection rate—of one of the groups of candidates. Specifically, we identify a cutoff budget such that the group-oblivious selection algorithm leads to underrepresentation of the low-variance group for any budget α smaller than the cutoff (the most common case) and underrepresentation of the high-variance group for any budget α larger than the cutoff. Conversely (and for a different cutoff), the Bayesian-optimal algorithm leads to underrepresentation of the high-variance group for low budgets and of the low-variance group for high budgets. In fact, we show (Section 4.1) that this is true even in the absence of bias and with group-independent latent quality distributions—that is, if the noise variance is the only thing that depends on the candidate's group. In this particular case, the cutoff budget for both algorithms is $\alpha = 1/2$.
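To illustrate these two opposite directions of underrepresentation, here is a minimal simulation sketch (with hypothetical parameter values chosen only for illustration; this is not the paper's code): it draws candidates from the model above and compares the per-group selection rates of the two baseline algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration parameters (not taken from the paper).
n, alpha = 100_000, 0.25          # number of candidates and selection budget
mu    = {"A": 0.0, "B": 0.0}      # group means of the latent quality
s     = {"A": 1.0, "B": 1.0}      # group std. dev. of the latent quality
beta  = {"A": 0.0, "B": 0.0}      # additive biases (none here)
sigma = {"A": 0.2, "B": 1.0}      # noise std. dev.: B is the high-variance group

groups = rng.choice(["A", "B"], size=n)
to_arr = lambda d: np.array([d[g] for g in groups])

# Model: W_i ~ N(mu_G, s_G^2), estimate W_hat_i = W_i - beta_G + sigma_G * eps_i.
W = rng.normal(to_arr(mu), to_arr(s))
W_hat = W - to_arr(beta) + to_arr(sigma) * rng.standard_normal(n)

m = int(alpha * n)

def rates(selected):
    chosen = np.zeros(n, dtype=bool)
    chosen[selected] = True
    return {g: round(chosen[groups == g].mean(), 3) for g in ("A", "B")}

# Group-oblivious baseline: top-m candidates by the raw estimate.
oblivious = np.argsort(-W_hat)[:m]

# Bayesian-optimal baseline: top-m candidates by the posterior mean E[W | W_hat, G].
shrink = to_arr(s) ** 2 / (to_arr(s) ** 2 + to_arr(sigma) ** 2)
posterior = to_arr(mu) + shrink * (W_hat + to_arr(beta) - to_arr(mu))
bayes = np.argsort(-posterior)[:m]

print("group-oblivious selection rates :", rates(oblivious))   # A (low variance) underrepresented
print("Bayesian-optimal selection rates:", rates(bayes))       # B (high variance) underrepresented
```

With a budget α below the cutoff (here α = 0.25 < 1/2) and identical priors, the simulated selection rates reproduce the two directions stated above: the group-oblivious rule underrepresents the low-variance group A, while the Bayesian-optimal rule underrepresents the high-variance group B.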

Then we investigate how the utilities of the group-oblivious and the Bayesian-optimal baselines are affected when imposing a fairness mechanism. Specifically, we study a generalization of the four-fifths rule that we call the γ-rule, which imposes that the selection rate for a given group be at least γ times that of the other group, for some parameter $\gamma \in [0,1]$. This includes both the four-fifths rule (γ = 0.8) and demographic parity (γ = 1) as special cases. In the general model, we identify conditions under which the γ-rule never decreases the utility of the group-oblivious algorithm (Section 3.2)—that is, there is no trade-off between fairness and selection quality for this baseline. The utility even strictly increases for γ close enough to one, including for demographic parity. Interestingly, in the special case without bias and with group-independent latent quality distributions—that is, with only implicit differential variance—this result holds for any parameters (Section 4.1). Compared to the Bayesian-optimal baseline, the γ-rule cannot increase the utility (since Bayesian-optimal is already optimal given the available information). We prove, however, a bound on the ratio of the utility of the Bayesian-optimal algorithm with and without the γ-rule imposed, which limits the decrease in utility due to imposing a fairness mechanism in this setting. Our bound is valid in the general model (Section 3.3) but takes a particularly simple form in the special case without bias and with group-independent latent quality distributions (Section 4.1).
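The γ-rule constraint itself is straightforward to impose on top of either baseline. The following brute-force sketch (ours, not the paper's method, which is analyzed in a large-n limit) takes per-candidate scores—the raw estimates for the group-oblivious baseline, or the posterior means for the Bayesian-optimal one—and searches over the per-group split of the budget for the feasible split with the highest total score.

```python
import numpy as np

def select_with_gamma_rule(scores, groups, m, gamma):
    """Pick m candidates maximizing the total score, subject to the gamma-rule:
    each group's selection rate must be at least gamma times the other's.
    Brute-force sketch for two groups labeled "A" and "B"."""
    idx = {g: np.flatnonzero(groups == g) for g in ("A", "B")}
    # Indices of each group sorted by decreasing score, plus prefix sums of scores.
    order = {g: idx[g][np.argsort(-scores[idx[g]])] for g in ("A", "B")}
    prefix = {g: np.concatenate(([0.0], np.cumsum(scores[order[g]]))) for g in ("A", "B")}

    best_util, best = -np.inf, None
    for x_a in range(max(0, m - len(idx["B"])), min(m, len(idx["A"])) + 1):
        x_b = m - x_a
        rate_a, rate_b = x_a / len(idx["A"]), x_b / len(idx["B"])
        if rate_a < gamma * rate_b or rate_b < gamma * rate_a:
            continue  # this split violates the gamma-rule
        util = prefix["A"][x_a] + prefix["B"][x_b]
        if util > best_util:
            best_util, best = util, (x_a, x_b)

    if best is None:  # e.g., gamma = 1 may admit no exact integer split
        raise ValueError("no integer split satisfies the gamma-rule for this budget")
    x_a, x_b = best
    return np.concatenate((order["A"][:x_a], order["B"][:x_b]))

# Example, continuing the simulation above: the four-fifths rule on the raw estimates.
# selected = select_with_gamma_rule(W_hat, groups, m, gamma=0.8)
```

Within each group the best candidates (according to the chosen scores) are always taken, so the only decision is how to split the budget between the groups; the constraint only restricts which splits are admissible.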

A typical case of differential variance arises when the decision maker has more uncertainty about one group, due to a lack of statistical confidence (e.g., in hiring). In such a case, the high-variance group naturally corresponds to the minority group. The group-oblivious algorithm would then overrepresent the minority group (for small selection budgets), and the fairness mechanism would lead to selecting fewer candidates from the minority group—which is counter-intuitive. We stress, however, that those are typically cases where the relevant baseline is the Bayesian-optimal algorithm, which behaves very differently. Through the Bayesian posterior quality computation, this baseline would disregard candidates for which the observed quality estimate is uninformative, that is, the high-variance group. As mentioned above, we indeed find that the Bayesian-optimal algorithm underrepresents the high-variance group (i.e., the minority), and that the fairness mechanism increases the proportion of selected high-variance candidates—which is consistent with intuition in that case. The group-oblivious baseline is meaningful in other scenarios, typically when the decision maker is not allowed to use the group information (e.g., in college admission based on standardized tests). In such cases, the high-variance group may not be a minority group (and our model does not require that it is).

At a high level, our results indicate that, with differential variance, the two decision makers (group-oblivious and Bayesian-optimal) lead to nearly opposite outcomes in terms of discrimination, and that the effect of imposing fairness mechanisms can be very different for the two. These results imply that a policy maker considering fairness mechanisms for a given problem should first evaluate which of the two decision makers the selection rule corresponds to, and then decide whether to recommend the γ-rule accordingly. Distinguishing between the two should be fairly easy in practice, since one conditions on group identity while the other does not.

Organization of the paper  The rest of the paper is organized as follows. We present the model in Section 2. We give all the results in the most general case in Section 3. Due to their generality, those results are sometimes complex. In Section 4, we analyze three notable cases for which the results are easier to interpret: the case without bias and with group-independent latent quality distributions (Section 4.1), the case with bias but with group-independent latent quality distributions (Section 4.2), and the case without bias but with group-dependent latent quality distributions (Section 4.3). Through numerical simulations in Section 5, we extend our analytical results, in particular to cases where the latent quality distribution is not normal. We conclude in Section 6.

Related work  There is an abundant literature on fairness in machine learning, in particular on classification, that tackles the question of how to learn a classifier while enforcing some fairness notion in the outcome [13], [14], [15], [16], [17], [18], [19], [20]. In this literature, fairness is usually seen as a constraint that reduces the classifier's accuracy, and the fairness–accuracy trade-off is analyzed. In contrast, in our work, we examine selection problems in which fairness can improve utility. Selection also differs from classification by the presence of selection budgets (i.e., a maximal number of class-1 predictions), which changes the problem significantly.

The problem of selection is considered in [8] in the presence of implicit bias [3]. In that work, the authors study the Rooney rule [4] as a fairness mechanism and show that, under certain conditions, it improves the quality of the selection. An extension of the Rooney rule is studied under a similar model in [9], where the authors investigate the ranking problem (of which the selection problem can be seen as a special case), also in the presence of implicit bias, and obtain similar results. In both papers, simple mathematical results expressing conditions under which the Rooney rule improves utility are obtained in the limit regime where the number of candidates is very large; we use the same limit regime in our work. In contrast to those papers, which only consider bias, we additionally introduce the notion of differential variance to capture the difference in precision of the quality estimate for different groups. We also consider an additive bias rather than a multiplicative one, as it makes more sense for normally distributed qualities. Although our model incorporates both an additive bias and differential variance (in Section 3), we purposely restrict it in Section 4 to the simplest possible form of differential variance so as to show its effect on the selection problem independently of bias. In our work, we also consider the four-fifths rule [7] (or rather an extension of it that we call the γ-rule and that includes demographic parity) rather than the Rooney rule. The main difference between the two is that the four-fifths rule imposes a constraint on the fraction of selected candidates from the underrepresented group, whereas the Rooney rule and its extension in [9] impose a constraint on the number of selected candidates from the underrepresented group.

Implicit bias, or simply bias (possibly from an algorithm trained on biased data), in the evaluation of candidates' quality is certainly a primary factor of discrimination; but it is also one that may reasonably be fixable through the use of algorithms combined with appropriate debiasing techniques and ground-truth data [21] (e.g., by learning fair representations of the data [22], [23]). The effects of bias can also be mitigated by introducing fairness constraints on the learned prediction models. For example, in [24], the binary classification problem in the presence of label bias is studied and it is shown that adding a demographic parity constraint to an empirical risk minimization problem can lead to better generalization. Similarly, in [25], the authors study the effects of label bias on binary classification and show that the equal opportunity fairness criterion (which ensures that true positive rates are equal across groups) can reduce the bias in prediction in most reasonable cases, as well as improve the accuracy of classification. In [26], the authors quantify a fairness–accuracy trade-off using an information-theoretic approach and, in addition, show that for the majority of traditional fairness criteria (like equal opportunity and demographic parity) there exists an ideal data distribution for which fairness and Bayesian optimality are in accordance.

The notion of differential variance first appeared (with different terminology) in the seminal work of Phelps [27] to explain racial inequality in wages. There, a Bayesian decision maker observes noisy signals of the productivity of each worker. Productivities are assumed to be drawn from a common distribution, while the precision of the signals differs across races. Phelps shows that a Bayesian decision maker who assigns wages equal to the expected productivity of a worker creates inequality of wages: for high signal values, the low-precision workers receive lower wages. Our model is similar to that of Phelps, with an additional bias and possibly group-dependent prior distributions. We also study cases where the variance is implicit—hence the decision maker cannot use Bayes' rule to estimate expected quality given the noisy estimates—and focus on utility for our main results.

This paper is an extended version of our paper “On Fair Selection in the Presence of Implicit Variance” [1]. We extend it by considering the general model with bias and group-dependent latent quality distributions, and by analyzing in parallel the two baselines of the group-oblivious and Bayesian-optimal algorithms (whereas [1] only looks at the group-oblivious baseline, that is, at implicit variance). On the other hand, we do not include the results on two-stage decision makers, for conciseness. Following [1], Garg et al. [28] studied a similar model (using the term differential variance that we also adopt here). The authors propose a model of school admission with students of two groups: advantaged and disadvantaged. Each student has an intrinsic quality which is not observable to schools: only noisy signals of the quality are available. The advantaged and disadvantaged students differ in the precision of their signals, and can also differ in their ability to access the tests. The authors consider the case of a Bayesian school of limited capacity. They study how different policies adopted by the school (group-aware and group-unaware) affect the diversity level, individual fairness, and overall merit of the admitted students. The authors also study how dropping test scores and different abilities to access tests affect the above characteristics.

Fairness mechanisms have been the subject of a number of studies in the economics literature, in particular empirical ones. In [29], the authors study whether affirmative action can remove stereotypes about a particular population. In [30], an empirical evaluation of the influence of affirmative action in recruiting is performed, showing that it can bring quality together with equality. Our work complements those studies through a theoretical model that leads to analytical results on the effect of fairness mechanisms in the presence of differential variance.

Section snippets

The model of selection with differential variance

We consider the following scenario. A decision maker is given n candidates, out of which a subset of size $m = \alpha n$ is selected, with $\alpha \in (0,1)$. We assume that the set of candidates can be partitioned into two groups: group A and group B. There are $n_A$ candidates from group A and $n_B = n - n_A$ candidates from group B. We refer to them as A-candidates and B-candidates.

Each candidate $i \in \{1, \dots, n\}$ is endowed with a true latent quality $W_i$. We assume that the qualities $W_i$ are drawn i.i.d. from an underlying probability

Analysis of the general model

In this section, we present the main technical results of the paper in the most general model. The results that we prove in this section are quite abstract; to make things more concrete and provide more intuitive results, we will instantiate this general model in important sub-cases in Section 4.

We start by showing why the two baseline algorithms lead to discrimination, in Theorem 1 for the group-oblivious and in Theorem 2 for the Bayesian-optimal algorithm. Then, we specify in Theorem 3

Notable special cases of the general model

The results in Section 3 might be difficult to interpret without considering some specific cases. In this section, we decompose the effects of different factors by tightening up some of the assumptions of our model while keeping the others in place. We consider the following important special cases: In Section 4.1 we assume that there is no bias in the estimation of quality and that the quality distribution is group-independent. This is the model studied in [1], where the only quantity that

Experiments

In this section, we challenge our theoretical results by using data that do not satisfy our assumptions. We show in Section 5.1 that the results are qualitatively similar when the candidates' true quality comes from a non-normal distribution. We also observe a similar behavior when considering, in Section 5.2, an artificial scenario that we construct using a real dataset coming from the national

Conclusion

In this work, we study a simple model of the selection problem that captures the phenomenon of differential variance, that is, the fact that the decision maker has estimates of the candidates' quality with different variances for different demographic groups. We distinguish two notable cases. In the first case, the decision maker does not have information about the properties of the estimates (variances and biases); as a result, they use a group-oblivious algorithm. In the second case, all the information about the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work has been partially supported by MIAI @ Grenoble Alpes (ANR-19-P3IA-0003), by the French National Research Agency (ANR) through grant ANR-20-CE23-0007, and by a European Research Council (ERC) Advanced Grant for the project “Foundations for Fair Social Computing” funded under the European Union's Horizon 2020 Framework Programme (grant agreement no. 789373). We thank the editor and the reviewers for their thoughtful comments.

References (32)

  • V. Emelianov et al., On fair selection in the presence of implicit variance, in: Proceedings of ACM EC'20 (2020).
  • M. Bertrand et al., Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination, Am. Econ. Rev. (2004).
  • A. Greenwald et al., Implicit bias: scientific foundations, Calif. Law Rev. (2006).
  • B. Collins, Tackling unconscious bias in hiring practices: the plight of the Rooney rule, N.Y. Univ. Law Rev. (2007).
  • M. Cavicchia, How to fight implicit bias? With conscious thought, diversity expert tells NABE, Am. Bar Assoc., Bar Lead. (2015).
  • C. Passariello, Tech firms borrow football play to increase hiring of women, Wall St. J. (2016).
  • H. Holzer et al., Assessing affirmative action, J. Econ. Lit. (2000).
  • J.M. Kleinberg et al., Selection problems in the presence of implicit bias.
  • L.E. Celis et al., Interventions for ranking in the presence of implicit bias.
  • A. Baye et al., Gender differences in variability and extreme scores in an international context, Large Scale Assess. Educ. (2016).
  • R.E. O'Dea et al., Gender differences in individual variation in academic grades fail to fit expected patterns for STEM, Nat. Commun. (2018).
  • Isabelle Kocher, seule femme dirigeante du CAC 40 [Isabelle Kocher, the only female CEO in the CAC 40].
  • D. Pedreshi et al., Discrimination-aware data mining.
  • M. Hardt et al., Equality of opportunity in supervised learning.
  • M.B. Zafar et al., Fairness constraints: mechanisms for fair classification.
  • M.B. Zafar et al., Fairness beyond disparate treatment & disparate impact: learning classification without disparate mistreatment.

This paper is an extended and revised version of our paper “On Fair Selection in the Presence of Implicit Variance” that appeared in the proceedings of EC'20 [1].

This paper is a participant in the 2020 ACM Conference on Economics and Computation (EC) Forward-to-Journal Program.
