volved in sensitive decision-making processes with adversarial implications on individuals. This paper presents mdfa, an approach that identifies the characteristics of the victims of a classifier's discrimination. We measure discrimination as a violation of multi-differential fairness. Multi-differential fairness is a guarantee that a black box classifier's outcomes do not leak information on the sensitive attributes of a small group of individuals. We reduce the problem of identifying worst-case violations to matching distributions and predicting where sensitive attributes and classifier's outcomes coincide. We apply mdfa to a recidivism risk assessment classifier and demonstrate that individuals identified as African-American with little criminal history are three times more likely to be considered at high risk of violent recidivism than similar individuals who are not African-American.

1 Introduction

Machine learning algorithms are increasingly used to support decisions that could impose adverse consequences on an individual's life: for example, in the judicial system, algorithms are used to assess whether a criminal offender is likely to recommit crimes; within banks, they determine the default risk of a potential borrower. At issue is whether classifiers are fair [Calsamiglia, 2009], i.e., whether classifiers' outcomes are independent of exogenous irrelevant characteristics or sensitive attributes like race and/or gender. Abundant examples of classifiers' discrimination can be found in diverse applications [Atlantic, 2016; ProPublica, 2016]. ProPublica [ProPublica, 2016] reported that the COMPAS machine-learning-based recidivism risk assessment tool assigns disproportionately higher risk to African-American defendants than to Caucasian defendants.

Establishing contestability is challenging for potential victims of machine learning discrimination because (i) many assessment tools are proprietary and usually not transparent; and (ii) a precedent in United States case law places the burden on the plaintiff to demonstrate disparate treatment, that is, to establish that characteristics irrelevant to the task affect the algorithm's outcomes (Loomis vs. the State of Wisconsin [vs. State of Wisconsin, 2016]). Identifying the definitive characteristics of a classifier's discrimination empowers the victims of such discrimination. Moreover, a classifier's user needs warnings for individual instances in which severe profiling/discrimination has been detected. In this paper, we present mdfa, an approach that identifies the characteristics of the victims of a classifier's discrimination.

To demonstrate a classifier's disparate treatment, we need to separate its discrimination from the social biases already encoded in the data. We define a classifier as multi-differentially fair if there is no subset of the feature space for which the classifier's outcomes are dependent on sensitive attributes. For any sub-population, disparate treatment is then measured as the maximum-divergence distance between the prior and posterior distributions of sensitive attributes.

mdfa audits for the worst-case violations of multi-differential fairness. Theoretically, our construction relies on a reduction to both matching-distribution and agnostic learning problems. First, in order to neutralize the social biases encoded in the data, we re-balance it by minimizing the maximum-mean discrepancy between distributions conditioned on different sensitive attributes. Secondly, we show that finding violations of differential fairness is a problem of finding correlations between sensitive attributes and classifier's outcomes. Therefore, mdfa searches for instances of violation of multi-differential fairness by predicting where in the feature space the binary values of sensitive attributes and classifier's outcomes coincide. Lastly, worst-case violations are extracted by incrementally removing the least unfair instances.

This paper makes the following contributions:
• The proposed multi-differential fairness framework is the first attempt to identify disparate treatment for groups of individuals as small as computationally possible.
• mdfa efficiently identifies the individuals who are the most severely discriminated against by black-box classifiers.
• We apply mdfa to a case study of a recidivism risk assessment in Broward County, Florida and find a sub-population of African-American defendants who are three times more likely to be considered at high risk of violent recidivism than similar individuals of other races.
• We apply mdfa to three other datasets related to crime, income and credit predictions and find that classifiers, even after being repaired for aggregate fairness, do discriminate against smaller sub-populations.

Related Work In the growing literature on algorithmic fairness (see [Chouldechova and Roth, 2018] for a survey), this paper relates mostly to three axes of work. First, our paper, as in [Hébert-Johnson et al., 2017; Kim et al., 2018; Kearns et al., 2017], provides a definition of fairness that
protects groups of individuals as small as computationally possible. Empirical observations in [Kearns et al., 2017; Dwork et al., 2012] support defining fairness at the individual level since aggregate-level fairness cannot protect specific sub-populations against severe discrimination.

Secondly, prior contributions on algorithmic disparate treatment [Zafar et al., 2017] have focused on whether sensitive attributes are used directly to train a classifier. This is a limitation when dealing with classifiers whose inputs are unknown. Differential fairness is inspired by differential privacy [Dwork et al., 2014] and offers a general framework to measure whether a classifier exacerbates the biases encoded in the data. Reinterpretations of fairness as a privacy problem can be found in [Jagielski et al., 2018; Foulds and Pan, 2018], but none of those contributions make a connection to disparate treatment. Similar to prior work on disparate impact [Feldman et al., 2015; Chouldechova, 2017], there is a need to re-balance the distribution of features conditioned on sensitive attributes. Our kernel matching technique deals with covariate shift [Gretton et al., 2009; Cortes et al., 2008], and has been used in domain adaptation (see e.g. [Mansour et al., 2009]) and counterfactual analysis (e.g. [Johansson et al., 2016]).

To the extent of our knowledge, there is no existing work on characterizing the most severe instances of algorithmic discrimination. However, recent contributions offer approaches to bolster individual recourse. For example, [Ustun et al., 2018; Russell, 2019] develop algorithms to answer what-if questions. However, unlike mdfa, these approaches do not account for the fact that individuals with different sensitive attributes are not drawn from similar distributions.

2 Individual and Multi-Differential Fairness

Preliminary An individual i is defined by a tuple ((x_i, s_i), y_i), where x_i ∈ X denotes i's audited features; s_i ∈ S denotes the sensitive attributes; and y_i ∈ {−1, 1} is a classifier f's outcome. The auditor draws m samples {((x_i, s_i), y_i)}_{i=1}^m from a distribution D on X × S × {−1, 1}. Features in X are not necessarily the ones used to train f. First, the auditor may not have access to all features used to train f. Secondly, the auditor may decide to deliberately leave some features used to train f out of X because those features should not be used to define similarity among individuals. For example, if f classifies individuals according to their probability of repaying a loan, the auditor may consider that zipcode (correlated with race) should not be an auditing feature, although it was used to train f.

Assumptions In our analysis we assume that the distributions of auditing features conditioned on sensitive attributes have common support.

Assumption 1. For all x ∈ X, Pr[S|X = x] > 0.

2.1 Individual Differential Fairness

We define differential fairness as the guarantee that, conditioned on features relevant to the task, a classifier's outcome is nearly independent of sensitive attributes:

Definition 2.1. (Individual Differential Fairness) For δ ≥ 0, a classifier f is δ-differentially fair if ∀x ∈ X, ∀s ∈ S, ∀y ∈ {−1, 1},

e^{−δ} ≤ Pr[Y = y | S = s, x] / Pr[Y = y | S ≠ s, x] ≤ e^{δ}.   (1)

The parameter δ controls how much the distribution of the classifier's outcome Y depends on the sensitive attributes S given that the auditing feature is x; a larger value of δ implies a less differentially fair classifier.

Differential Fairness and Disparate Treatment A δ-differentially fair classifier f bounds the disparity of treatment between individuals with different sensitive attributes, the disparity being measured by the maximum divergence between the distributions Pr(Y | S = s, x) and Pr(Y | S ≠ s, x):

max_{y ∈ Y} ln ( Pr[Y = y | S = s, x] / Pr[Y = y | S ≠ s, x] ) ≤ δ.

Relation with Differential Privacy Differential fairness re-interprets disparate treatment as a differential privacy issue [Dwork et al., 2014] by bounding the leakage of sensitive attributes caused by Y given what is already leaked by the auditing features x. Formally, the fairness condition (1) bounds the maximum divergence between the distributions Pr(S|Y, x) and Pr(S|x) by δ.

Individual Fairness Def. (2.1) is an individual-level definition of fairness, since it conditions the information leakage on auditing features x. Compared to the notion of individual fairness [Dwork et al., 2012], individual differential fairness does not require an explicit similarity metric. This is a strength of our framework since defining a similarity metric has been the main limitation of applying the concept of individual fairness [Chouldechova and Roth, 2018].

2.2 Multi-differential fairness

Although useful, the notion of individual differential fairness cannot be audited for in a computationally efficient way. Looking for violations of individual differential fairness would require searching over a set of 2^{|X|} individuals. Moreover, a sample from a distribution over X × S × {−1, 1} has a negligible probability of containing two individuals with the same auditing features x but different sensitive attributes s.

Therefore, we relax the definition of individual differential fairness and impose differential fairness for sub-populations. Formally, C denotes a collection of subsets or groups of individuals G in X. The collection Cα is α-strong if for G ∈ C and y ∈ {−1, 1}, Pr[Y = y & x ∈ G] ≥ α.

Definition 2.2. (Multi-Differential Fairness) Consider an α-strong collection Cα of sub-populations of X. For δ ≥ 0, a classifier f is (Cα, δ)-multi-differentially fair with respect to S if ∀s ∈ S, ∀y ∈ {−1, 1} and ∀G ∈ Cα:

e^{−δ} ≤ Pr[Y = y | S = s, G] / Pr[Y = y | S ≠ s, G] ≤ e^{δ}.   (2)

Multi-differential fairness guarantees that the outcome of a classifier f is nearly mean-independent of protected attributes within any sub-population G ∈ Cα. The fairness condition in Eq. 2 applies only to sub-populations with Pr[Y = y & x ∈ G] ≥ α for y ∈ {−1, 1}. This is to avoid trivial cases where {x ∈ G & Y = y} is a singleton for some y, implying δ = ∞.

Collection of Indicators. We represent the collection of sub-populations C as a family of indicators: for G ∈ C, there is an indicator c : X → {−1, 1} such that c(x) = 1 if and only if x ∈ G. The relaxation of differential fairness to a collection of groups or sub-populations is akin to [Kim et al., 2018; Kearns et al., 2017; Hébert-Johnson et al., 2017]. Cα is the computational bound on how granular our definition of fairness is. The richer Cα, the stronger the fairness guarantee offered by Def. 2.2. However, the complexity of Cα is limited by the fact that we identify a sub-population G via random samples drawn from a distribution over X × S × {−1, 1}.
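To make Definition 2.2 concrete, the following sketch (our own illustration, not code from the paper; the smoothing constant eps, the toy data and the particular indicator are ours, and Definition 2.2 would additionally maximize over y and s) estimates the empirical divergence of a candidate sub-population G = {x : c(x) = 1} from audit samples.

import numpy as np

def empirical_delta(c, X, S, Y, y=1, s=1, eps=1e-6):
    """Empirical divergence of Def. 2.2 for G = {x : c(x) = 1}:
    |ln Pr[Y=y | S=s, G] - ln Pr[Y=y | S!=s, G]| estimated from audit samples."""
    in_g = (c(X) == 1)
    g_s, g_ns = in_g & (S == s), in_g & (S != s)
    if g_s.sum() == 0 or g_ns.sum() == 0:
        return 0.0                           # the ratio cannot be estimated on this sample
    p_s = (Y[g_s] == y).mean() + eps
    p_ns = (Y[g_ns] == y).mean() + eps
    return abs(np.log(p_s) - np.log(p_ns))

# Toy audit: the classifier's outcome depends on S only where x[0] > 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
S = rng.choice([-1, 1], size=5000)
Y = np.where(X[:, 0] > 1.0,
             np.where(X[:, 1] + 0.8 * S > 0, 1, -1),   # inside the group, the outcome leans on S
             np.where(X[:, 1] > 0, 1, -1))             # outside, it is independent of S
group = lambda Z: np.where(Z[:, 0] > 1.0, 1, -1)
print(empirical_delta(group, X, S, Y))                       # noticeably large inside the group
print(empirical_delta(lambda Z: np.ones(len(Z)), X, S, Y))   # much smaller over the whole population

In this example an aggregate audit looks nearly fair while the sub-population defined by x[0] > 1 exhibits a clear violation, which is the gap multi-differential fairness is meant to capture.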
3 Fairness Diagnostics: Worst-case violations

The objective of mdfa is to find the sub-populations with the most severe violation of multi-differential fairness, that is, to solve for s ∈ S and y ∈ {−1, 1}

sup_{G ∈ Cα} ln ( Pr[Y = 1 | G, S = s] / Pr[Y = 1 | G, S ≠ s] ).   (3)

Our approach succeeds at tackling three challenges: (i) if Cα is large, an auditing algorithm linearly dependent on |C| can be prohibitively expensive; (ii) the data needs to be balanced conditioned on sensitive attributes; (iii) finding the most-harmed sub-population efficiently implies that we can predict a function c ∈ Cα for which we do not directly observe values c(x).

3.1 Reduction to Agnostic Learning

First, we reduce the problem of certifying for the lack of differential fairness to an agnostic learning problem.

Multi-Differential Fairness for Balanced Distributions In this section, we assume that for any s ∈ S, the conditional distributions p(x|S = s) and p(x|S ≠ s) are identical. This is not realistic for many datasets, but we will show how to handle unbalanced data in the next section. A balanced distribution does not leak any information on whether a sensitive attribute is equal to s: Pr[S = s|x] = Pr[S ≠ s|x]. A violation of (Cα, δ)-multi-differential fairness then simplifies to a sub-population G ∈ Cα, a y ∈ {−1, 1} and s ∈ S such that

Pr[G, Y = y] ( Pr[S = s | G, Y = y] − 1/2 ) ≥ γ,   (4)

with γ = α ( e^{δ}/(1 + e^{δ}) − 1/2 ). γ combines the size of the sub-population where a violation exists and the magnitude of the violation. We call a γ-unfairness certificate any triple (G, y, s) that satisfies Eq. (4). Further, we postulate that f is γ-unfair if and only if such a certificate exists. Unfairness for balanced distributions is equivalent to the existence of sub-populations for which sensitive attributes can be predicted once the classifier's outcomes are observed. Searching for a γ-unfairness certificate reduces to mapping the auditing features {x_i} to the labels {s_i y_i}.

Lemma 3.1. Let s ∈ S. Suppose that the data is balanced. f is γ-multi-differentially unfair for y ∈ {−1, 1} if and only if there exists c ∈ Cα such that Pr[rSY = c] ≥ 1 − ρ(y) + 4γ, where r = sign(y) and ρ(y) = Pr[S = rY].

Lemma 3.1 allows us to reduce searching for a (G, y, s) unfairness certificate to predicting where sensitive attributes and outcomes of f (if y = 1) or outcomes of ¬f (if y = −1) coincide. Our proposed approach is to solve the following empirical loss minimization:

min_{c ∈ C} (1/m) Σ_{i=1}^m l(c, a_i y_i) + Reg(c),   (5)

where a_i ∈ {−1, 1} is the binary encoding of the sensitive attribute s_i, l(·) is a 0−1 loss function and Reg(·) a regularizer.

The following result shows that (i) our reduction to a learning problem leads to an unbiased estimate of γ; (ii) there is a computational limit on how granular multi-differential fairness can be, since for many concept classes C agnostic learning is an NP-hard problem [Feldman et al., 2012].

Theorem 3.2. Let ε > 0, η ∈ (0, 1) and C ⊂ 2^X. Let γ′ ∈ (γ − ε, γ + ε).
(i) There exists an algorithm that, using O(log(|C|), log(1/η), 1/ε²) samples {(x_i, s_i), y_i} drawn from a balanced distribution D, outputs with probability 1 − η a γ′-unfairness certificate if the y_i are outcomes from a γ-unfair classifier;
(ii) C is agnostically learnable: there exists an algorithm that, with O(log(|C|), log(1/η), 1/ε²) samples {x_i, o_i} drawn from a balanced distribution D, outputs with probability 1 − η a hypothesis h such that Pr_D[h(x_i) = o_i] + ε ≥ max_{c∈C} Pr_D[c(x_i) = o_i].
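As a minimal sketch of this reduction (our own, assuming a balanced sample, the case y = 1, and a shallow decision tree as a stand-in for the concept class C), one can fit a classifier on the labels a_i y_i and convert its agreement rate into an estimate of γ using the relation γ = (opt + ρ − 1)/4 from the proof of Theorem 3.2.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def gamma_certificate(X, S, Y, max_depth=3):
    """Lemma 3.1 reduction: learn where the sensitive attribute and the outcome coincide.
    Fits c on the labels o_i = s_i * y_i and converts its agreement rate into
    gamma_hat = (Pr[c = SY] + rho - 1) / 4, with rho = Pr[S = Y]."""
    o = S * Y                                    # +1 where sensitive attribute and outcome coincide
    clf = DecisionTreeClassifier(max_depth=max_depth).fit(X, o)
    acc = (clf.predict(X) == o).mean()           # empirical Pr[c(x) = s*y]; use a held-out split in practice
    rho = (o == 1).mean()                        # empirical Pr[S = Y]
    return max((acc + rho - 1) / 4.0, 0.0), clf  # clf's positive region is the certified sub-population

The certified sub-population is then {x : clf predicts 1}, and Theorem 3.2 bounds how far such an estimate can be from the true γ.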
3.2 Unbalanced Data

Imbalance Problem Multi-differential fairness measures the max-divergence distance between the posterior distribution Pr(S|Y, x) and the prior one Pr(S|x). Therefore, it requires knowledge of Pr(S|x). In the previous section, we circumvented the issue by assuming Pr[S = s|x] = 1/2. To generalize our approach, we propose to rebalance the data with the following weights: for s ∈ S, w_s(x, s) = Pr[S ≠ s|x]/Pr[S = s|x] and, for s′ ≠ s, w_s(x, s′) = 1. Once reweighted, the conditional distributions Pr_w(X|S = s) and Pr_w(X|S ≠ s) are identical and our learning reduction from the previous section applies.

However, in practice we do not have direct access to w_s. One approach is to directly estimate the density Pr[S = s|x]. This method is used in propensity-score matching ([Rosenbaum and Rubin, 1983]) in the context of counterfactual analysis. But exact or estimated importance sampling results in large variance in finite samples ([Cortes et al., 2010]). Instead, we use a kernel-based matching approach ([Gretton et al., 2009] and [Cortes et al., 2008]). Our method considers a real-valued classifier h : X → R such that c(x) is equal to the sign of h.¹

¹ To prove our results, we will need c(x) = g(h(x)), where for small τ > 0, g(h) = sign(h) for |h| > τ/2 and g(h) = 1 with probability h for |h| < τ/2.
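For intuition, here is a minimal sketch (ours, not the paper's implementation) of the direct density-estimation alternative mentioned above: estimate Pr[S = s | x] with a logistic model and form the importance weights w_s(x, s) = Pr[S ≠ s | x]/Pr[S = s | x]; the clipping threshold is our addition to temper the large-variance issue noted in the text, and the kernel-matching route used by mdfa is sketched in the next section.

import numpy as np
from sklearn.linear_model import LogisticRegression

def propensity_weights(X, S, s=1, clip=1e-3):
    """Importance weights w_s: samples with attribute s are re-weighted by
    Pr[S != s | x] / Pr[S = s | x]; all other samples keep weight 1."""
    model = LogisticRegression(max_iter=1000).fit(X, (S == s).astype(int))
    p = np.clip(model.predict_proba(X)[:, 1], clip, 1 - clip)   # estimated Pr[S = s | x]
    w = np.ones(len(S))
    w[S == s] = (1 - p[S == s]) / p[S == s]
    return w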
The loss function l(h, sy) in Eq. (5) is assumed to be convex. Our setting includes, for example, support vector machines and logistic classification. The following result bounds from above the change in the solution of Eq. (5) when changing the weighting scheme from u to w.

Lemma 3.3. Let φ be a feature mapping and k be its associated kernel with k(x, x′) = ⟨φ(x), φ(x′)⟩ and ||k||_∞ < κ < ∞. Suppose that in Eq. (5), Reg(h) = λ_c ||h||²_k and that for x ∈ X, h(x) = ⟨h, k(x, ·)⟩. Suppose that l is σ-Lipschitz in its first argument. Denote h_u and h_w the solutions of the risk minimization Eq. (5) with weights u and w respectively. Then,

∀x ∈ X, |h_u(x) − h_w(x)| ≤ κ² σ (√cond(k) / λ_c) G_k(u, w),

where cond(k) is the condition number of the Gram matrix of k and G_k(u, w) is the maximum mean discrepancy between the distributions weighted by u and w:

G_k(u, w) = || Σ_i u(x_i)φ(x_i) − Σ_i w(x_i)φ(x_i) ||.

Note that when the distribution is weighted with the importance sampling weights w_s, the maximum mean discrepancy of Pr(X|S = s) and Pr(X|S ≠ s) is zero. By minimizing the maximum-mean discrepancy G_k(u, w_s), we minimize an upper bound on the pointwise difference between h_u and h_{w_s}, that is, the difference between the unfairness certificate we choose with weights u and the one we would have chosen if Pr(X|S = s) = Pr(X|S ≠ s). Therefore, mdfa solves:

min_{φ, u} Σ_i u_i(x_i) l(φ(x_i), a_i y_i) + Reg(c) + Ĝ_k(u, a).   (6)

In our implementation, the feature representation φ is learned via a neural network that is then shared with both tasks of minimizing the re-weighted certifying risk and the empirical counterpart Ĝ_k(u, s) of the maximum mean discrepancy between Pr(X|S = s) and Pr(X|S ≠ s). The following result bounds from above the bias in mdfa's estimate of γ:

Theorem 3.4. Let ε > 0 and η ∈ (0, 1). Suppose that C is a concept class of VC dimension d < ∞. Solving Eq. (6) finds a (γ − ε)-unfairness certificate if f is γ-unfair and at least O((1/ε²) log(d) log(1/η)) samples are queried.
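To illustrate the quantity being minimized, the sketch below (our own; the RBF kernel, bandwidth and uniform reference weights are assumptions, whereas mdfa learns the feature map φ with a neural network jointly with Eq. (6)) computes an empirical maximum-mean discrepancy between the u-weighted samples with S = s and the samples with S ≠ s.

import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gram matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def weighted_mmd(X, S, u, s=1, sigma=1.0):
    """Empirical MMD between the u-weighted samples with S = s and the samples with S != s.
    Driving this quantity toward zero is the re-balancing that Eq. (6) performs jointly with the risk."""
    A, B = X[S == s], X[S != s]
    a = u[S == s] / u[S == s].sum()           # normalized weights on the S = s group
    b = np.full(len(B), 1.0 / len(B))         # uniform weights on the S != s group
    mmd2 = (a @ rbf_kernel(A, A, sigma) @ a
            - 2 * a @ rbf_kernel(A, B, sigma) @ b
            + b @ rbf_kernel(B, B, sigma) @ b)
    return np.sqrt(max(mmd2, 0.0))

By Lemma 3.3, keeping this discrepancy small keeps the certificate learned with weights u pointwise close to the one that would be learned under the ideal balanced weighting.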
3.3 Worst-Case Violation

Solving the empirical minimization Eq. (6) allows certifying whether any black box classifier is multi-differentially fair, but the solution of Eq. (6) does not distinguish a large sub-population with a low value of δ from a smaller sub-population with a larger value of δ. For example, consider two sub-populations of the same size, G_0 and G_δ, for δ > 0. Assume that there is no violation of multi-differential fairness on G_0, but a δ-violation on G_δ. The risk minimization Eq. 6 will pick indifferently G_δ and G_δ ∪ G_0 as unfairness certificates, although the latter mixes the violation on G_δ with a sub-population without any violation of differential fairness.

Algorithm 1 Worst Violation Algorithm (WVA)
Input: {((x_i, a_i), y_i)}_{i=1}^m, C ⊂ 2^X, ξ, α, weights u, y ∈ {−1, 1}
1: α_0 = 1, u_{i,t} = u
2: while α_t > α do
3:   c_t = argmin_{c ∈ C} (1/m) Σ_{i=1}^m u_{i,t}(x_i) l(a_i y_i, c(x_i)) + λ_c Reg(c)
4:   δ̂_t ← Σ_{i: c(x_i)=1, y_i=y, a_i=1} u_i(x_i) / Σ_{i: c(x_i)=1, y_i=y, a_i=−1} u_i(x_i)
5:   α̂_t ← Σ_{i: c(x_i)=1, y_i=1} u_i(x_i) / Σ_{i=1}^m u_i(x_i)
6:   t ← t + 1; u_{i,t} ← u_{i,t} + u_i ξ if a_i ≠ y_i and y_i = y
7: end while
8: Return ln(δ̂_t)

Worst-Case Violation Algorithm (WVA) At issue in the previous example is that for the sub-population G_0, choosing c = 1 or c = −1 leads to the same empirical risk in Eq. 6. To force c(x) = −1 for x ∈ G_0, our approach is to put a slightly larger weight on samples whenever s_i ≠ y_i. Now the empirical risk is smaller for c = −1 wherever there is no violation of multi-differential fairness. More generally, our worst-violation Algorithm 1 iteratively increases the weight on samples with s_i ≠ y_i by a factor 1 + ξt, where ξ > 0. At iteration t, the solution c_t of the empirical risk minimization (5) identifies a sub-population G_t = {x | c_t(x) = 1} with a δ(c_t)-violation of differential fairness, with δ ≥ ln((1 − h(ξt))/h(ξt)), where h is a decreasing function. Algorithm 1 terminates whenever |G_t| ≤ α. At the second-to-last iteration T, Theorem 3.5 guarantees that Algorithm 1 will identify a sub-population G_T with a δ_T-multi-differential fairness violation and δ_T asymptotically close to δ_m.

Theorem 3.5. Suppose ξ > 0, ε > 0, η ∈ (0, 1) and C ⊂ 2^X is α-strong. Suppose that the classifier f has been certified with γ-multi-differential unfairness for y ∈ {−1, 1}. Denote δ_m the worst-case violation of multi-differential fairness for C as defined in (3). With probability 1 − η, with O((1/ε²) log(|C|) log(4/η)) samples and after O( (4(γ + α)/(2γ + 3α)) · (2(4γ − 2ρ(y) + 1)/ξ) ) iterations, Algorithm 1 learns c ∈ C such that

| ln ( Pr[Y = y | S = s, c(x) = 1] / Pr[Y = y | S ≠ s, c(x) = 1] ) − δ_m | ≤ ε.   (7)
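A compressed sketch of this loop follows (ours, not the paper's code; it fixes y = 1 by default, reuses a shallow decision tree in place of the generic empirical risk minimization of step 3, and takes the re-balancing weights u as given).

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def worst_violation(X, S, Y, u, alpha=0.05, xi=0.1, y=1, max_iter=100):
    """Sketch of Algorithm 1 (WVA): iteratively up-weight the samples with s_i != y_i
    so that the learned indicator c_t shrinks toward the most unfair sub-population."""
    w = u.astype(float).copy()                   # u_{i,t}, initialised with the re-balancing weights u
    o = S * Y                                    # labels s_i * y_i of the learning reduction
    log_delta, group = 0.0, np.zeros(len(Y), dtype=bool)
    for _ in range(max_iter):
        c_t = DecisionTreeClassifier(max_depth=3).fit(X, o, sample_weight=w)   # step 3
        group = c_t.predict(X) == 1              # G_t = {x : c_t(x) = 1}
        num = u[group & (Y == y) & (S == 1)].sum()    # step 4: weighted count with a_i = 1 ...
        den = u[group & (Y == y) & (S == -1)].sum()   # ... and with a_i = -1
        log_delta = np.log((num + 1e-6) / (den + 1e-6))
        alpha_t = u[group & (Y == y)].sum() / u.sum() # step 5: weighted size of G_t
        if alpha_t <= alpha:
            break                                # stop once the identified group is small enough
        w[(S != Y) & (Y == y)] += xi * u[(S != Y) & (Y == y)]   # step 6: escalate the re-weighting
    return log_delta, group

The escalating weights make the learner pay an increasing price for keeping nearly fair regions inside the certificate, so the surviving group concentrates on the worst-case violation, as Theorem 3.5 formalizes.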
3.4 mdfa Auditor

Putting the building blocks together allows us to design a fairness diagnostic tool, mdfa, that efficiently identifies the most severe violation of differential fairness.

Architecture Inputs are a dataset with a classifier's outcomes (labels ±1) along with auditing features. mdfa first uses a neural network with four fully connected layers of 8 neurons to express the weights u as a function of the features x and minimizes the maximum-mean discrepancy function Ĝ_k(u, s). The outputs of the last hidden layer in the neural network are used as a feature mapping φ and serve, along with the estimated weights u, as an input to the empirical minimization Eq. (6), which outputs a certificate (c, y, s) ∈ C × {−1, 1} × S of unfairness. The weights are re-adjusted until the identified worst-case violation has a size smaller than α. When terminating, mdfa outputs an estimate of the most-harmed sub-population c_m along with an estimate of δ_m.

Cross-Validation The auditor chooses the minimum size α of the worst-case violation they would like to identify. The advantage of our approach is that, although we do not have ground truth for unfair treatment, we can propose heuristics to cross-validate our choice of regularization parameters used in Eq. (6). First, we split the input data 70%/30% into a train and a test set. Using 5-fold cross-validation, mdfa is trained on four folds and a grid search looks for regularization parameters that minimize the maximum-mean discrepancy Ĝ_k(u, s) and the empirical risk on the fifth fold. Once mdfa is trained, the estimated δ_m and the corresponding characteristics of the most-harmed sub-population are computed on the test set.
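A sketch of this heuristic is below (ours; fit_mdfa, mmd_score and risk_score are hypothetical callables standing in for an mdfa-style trainer and its two validation criteria, which the paper does not spell out as code).

import numpy as np
from sklearn.model_selection import train_test_split, KFold

def select_regularization(X, S, Y, fit_mdfa, mmd_score, risk_score, grid):
    """70/30 split, then 5-fold grid search for the regularization parameters
    that minimize MMD + empirical risk on the held-out fold."""
    X_tr, X_te, S_tr, S_te, Y_tr, Y_te = train_test_split(X, S, Y, test_size=0.3, random_state=0)
    best_params, best_score = None, np.inf
    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    for params in grid:                                   # e.g. a list of dicts of regularizers
        fold_scores = []
        for tr, val in kf.split(X_tr):
            model = fit_mdfa(X_tr[tr], S_tr[tr], Y_tr[tr], **params)
            fold_scores.append(mmd_score(model, X_tr[val], S_tr[val])
                               + risk_score(model, X_tr[val], S_tr[val], Y_tr[val]))
        if np.mean(fold_scores) < best_score:
            best_params, best_score = params, np.mean(fold_scores)
    return best_params, (X_te, S_te, Y_te)                # delta_m is then estimated on the 30% test set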
4 Experimental Results

[Figure: (a) Effect of sample size; (b) Iteration for WVA. Estimated δ̂_m versus the true δ_m and across WVA iterations (under-/over-estimates), for sample sizes 1K and 5K.]

[Figure: bias γ̂ − γ under UW, IS and MMD re-weighting for DT, RF, SVM-Lin and SVM-RBF classifiers.]
Education             (3.94)         (3.85)         (0.11)         (0.06)         (0.16)         (0.10)
Years of Education    10.06 (2.38)   10.08 (2.66)   13.74 (0.04)   13.9 (0.04)    13.08 (0.13)   13.32 (0.08)
Hours/week            36.38 (12.22)  42.39 (12.12)  44.77 (1.19)   48.02 (0.75)   49.32 (0.94)   51.44 (0.51)
Occupation            6.2 (4.39)     6.78 (4.14)    7.75 (0.07)    7.93 (0.11)    6.88 (0.30)    7.65 (0.16)
Age                   37.07 (14.38)  39.62 (13.50)  49.72 (1.28)   50.47 (0.60)   49.48 (1.31)   48.46 (0.88)

Table 4: Adults: Identifying the worst-case violation of differential fairness for classifiers trained with LC and Red−LC. The sensitive attribute is whether the individual is self-identified as Female (F) or Male (M). ( ) indicates standard deviation.

44%−60% more likely to be of low income than males with similar characteristics. In the Crimes dataset, the disparate treatment DT_G is around 5.7 for both DI−LC and Red−LC: this means that there exist communities with dense African-American populations that are six times more likely to be classified at high risk than similar communities with lower percentages of African-Americans.

In Table 4, we identify the average characteristics of the worst-case violation sub-population G in the Adults dataset. For brevity, we report the results for the logistic classifier without repair (LC) and with [Agarwal et al., 2018]'s reduction repair (Red−LC). Similar results can be obtained for DI−LC. Compared to the overall population, for both LC and Red−LC, individuals in the worst-case violation sub-population work more hours per week, are older and have more years of education. Women in that group are 60%−80% more likely to be classified as low income by LC or Red−LC than men with the same high level of education, the same hours of work per week and the same age.

5 Conclusion

In this paper, we present mdfa, a tool that measures whether a classifier treats differently individuals with similar auditing features but different sensitive attributes. We hope that mdfa's ability to identify sub-populations with severe violations of differential fairness could inform decision-makers when to discount the classifier's outcomes. It also provides the victims with a framework to contest a classifier's outcomes.

6 Appendix

for any x, x′ ∈ {−1, 1}. The result from Lemma 3.1 follows by remarking that ⟨S, 1⟩ = ⟨S, c⟩ = 2 Pr_w[S = c] − 1 = 0, since Pr_w[S = s|x] = Pr_w[S ≠ s|x].

Theorem 3.2
Proof. (i) ⇒ (ii). Denote (x_i, s_i, o_i) a sample from a balanced distribution D over X × S × {−1, 1}. Denote c* ∈ C such that Pr[c*(x_i) = o_i] = max_{c∈C} Pr[c(x_i) = o_i] = opt. Construct a function f such that for (x_i, s_i, o_i), f(x_i, s_i) = s_i o_i. Therefore, f(x_i)s_i = o_i and Pr[c* = s_i f(x_i)] = Pr[c* = o_i] = opt: by Lemma 3.1, c* is a γ-unfairness certificate, with γ = (opt + ρ − 1)/4 and ρ = Pr[o_i = 1]. By (i), the certifying algorithm outputs a (γ − ε/4)-unfairness certificate c ∈ C with probability 1 − η and O(log(|C|), log(1/η), 1/ε²) sample draws. Hence, by Lemma 3.1, Pr[c(x_i) = o_i] = Pr[c(x_i) = f(x_i)o_i] = 4(γ − ε/4) + 1 − ρ = opt − ε, which concludes (i) ⇒ (ii).
(ii) ⇒ (i). Suppose that f is γ-unfair. Denote y_i = f(x_i, s_i). Samples {(x_i, s_i), y_i} are drawn from a balanced distribution over X × S × {−1, 1}. By Lemma 3.1, there exists c ∈ C such that Pr[c(x_i) = s_i y_i] = 4γ + 1 − ρ_r, with r = ±. Assume, without loss of generality, r = +. Then max_{c′} Pr[c′(x_i) = s_i y_i] ≥ 4γ + 1 − ρ_+. By (ii), there exists an algorithm that outputs, with probability 1 − η and O(log(|C|), log(1/η), 1/ε²) sample draws, c ∈ C such that Pr[c(x_i) = s_i y_i] ≥ max_{c′} Pr[c′(x_i) = s_i y_i] − ε/4. Therefore Pr[c(x_i) = s_i y_i] ≥ 4(γ − ε) + 1 − ρ_+. By Lemma 3.1, c is a (γ − ε)-unfairness certificate for f, which concludes (ii) ⇒ (i).

Theorem 3.4 We first show the following lemma.

Lemma 6.1. With the same assumptions as in Lemma 3.3, for any weights u, w, ||u − w|| ≤ G_k(u, w)/√λ_min(k), where λ_min(k) is the smallest eigenvalue of the Gram matrix associated with k.

Proof. First, note that G_k(u, w) = √((u − w)ᵀ k (u − w)) and, by a standard bound on the Rayleigh quotient, ||u − w|| ≤ G_k(u, w)/√λ_min(k).

Now we prove Lemma 3.3:
Proof. The proof relies on the fact that the solution of Eq. (5) is distributionally stable as in [Cortes et al., 2008]:

||h_u − h_w||_k ≤ κσ (√λ_max(k) / (2λ_c)) ||u − w||,

where λ_max(k) is the largest eigenvalue of the Gram matrix associated to k (see [Cortes et al., 2008], proof of Theorem 1). Moreover, |h_u(x) − h_w(x)| ≤ ||h_u − h_w||_k. The result in Lemma 3.3 follows from Lemma 6.1 and cond(k) = λ_max(k)/λ_min(k).

The next result follows from [Gretton et al., 2009] and bounds from above Ĝ_k(u, w_s), the empirical counterpart of G_k(u, w_s), where w_s(x) = Pr[S ≠ s|x]/Pr[S = s|x].

The randomized certificate of footnote 1 is

c(x) = sign(h(x)) if |h(x)| > τ, and c(x) = 1 with probability (h(x) + τ)/(2τ) if |h(x)| ≤ τ.   (8)

Lemma 6.3. Let τ, η > 0, ε > 0. Let ĉ_u, ĉ_w denote the certificates constructed from the solutions of the empirical risk minimization ĥ_u (with weights u) and ĥ_w (with weights w_s). There exists κ_2 > 0 such that with probability 1 − η,

|{i = 1, ..., m : ĉ_u(x_i) ≠ ĉ_w(x_i)}| / m ≤ κ_2 √( 2 log(2/η) B² (1/n_s + 1/n_¬s) ).

Proof. Denote ε(m) = 5 κ_2 √( 2 log(2/η) B² (1/n_s + 1/n_¬s) ). The first part of the inequality uses the results obtained in the previous paragraph for |ĥ_u| > τ; the second part uses the construction of ĉ_u and ĉ_w. Therefore, with probability 1 − η,

|{i = 1, ..., m : ĉ_u(x_i) ≠ ĉ_w(x_i)}| / m ≤ 5ε(m)/τ.

Lemma 6.4. Consider a random variable Z and ĉ_u, ĉ_w as in Lemma 6.3. Then |Pr[Z = 1 ∧ ĉ_u = 1] − Pr[Z = 1 ∧ ĉ_w = 1]| ≤ Pr[ĉ_u ≠ ĉ_w].

Proof. Note that Pr[Z = 1 ∧ ĉ_u = 1] − Pr[Z = 1 ∧ ĉ_w = 1] = Pr[Z = 1 ∧ ĉ_u = 1 ∧ ĉ_w = −1] − Pr[Z = 1 ∧ ĉ_w = 1 ∧ ĉ_u = −1] ≤ Pr[ĉ_u ≠ ĉ_w].

The last result we need to prove Theorem 3.4 links n_s and n_¬s to the sample size m.

Lemma 6.5. Denote α_s = Pr[S = s]. Let ε, η > 0.

Therefore, if m ≥ Ω( log(1/η) / (α_s² ε²) ), with probability 1 − η,

|Pr[Y = 1 | S = −1, ĉ_w = 1] − Pr[Y = 1 | S = −1, ĉ_u = 1]| ≤ ε

and

|Pr[ĉ_w = 1 ∧ Y = 1] − Pr[ĉ_u = 1 ∧ Y = 1]| ≤ ε′.

There exists κ_3 such that, for ε > 0, with probability 1 − η, |γ̂_u − γ̂_w| ≤ κ_3 ε. Moreover, by Theorem 3.2, if C has finite VC dimension, then with probability 1 − η and Ω( log(|C|), 1/ε², log(2/η) ) samples, |γ̂_w − γ_w| ≤ ε. It follows that, with probability 1 − η and Ω( log(|C|), 1/(ε² α_s²), log(2/η) ) samples, |γ̂_u − γ| ≤ ε.

where u_{i,t} are the weights at iteration t and the expectation is taken over all the samples of size m drawn from D_f:

u_{i,t} = u_i (1 + νt) if y_i ≠ s_i ∧ y_i = y, and u_{i,t} = u_i otherwise.   (10)

Lemma 6.6. Let s ∈ S. Assume that the classifier f is γ-unfair. At iteration t, denote δ_t = ln( Pr[Y = 1 | c_t(x) = 1, S = s] / Pr[Y = 1 | c_t(x) = 1, S ≠ s] ). Then

e^{δ_t} / (1 + e^{δ_t}) ≥ 1 − h(ξt),

where h(ξt) = (4γ + 1 − 2ρ_+)/(ξt).

Proof. Without loss of generality, we assume y = 1. Denote c_{−1} such that c_{−1}(x) = −1 for all x. By comparing the values of the empirical risk for c_t and c_{−1}, if c_t ≠ c_{−1}, then we can show that

Pr_u[c_t(x_i) = s_i y_i] ≥ Pr_u[s_i ≠ y_i] + ξt E[ Σ_{i: s_i ≠ y_i, c_t(x_i)=1, y_i=y} u_i ].

Moreover,

E[ Σ_{i: s_i ≠ y_i, c_t(x_i)=1, y_i=y} u_i ] = Pr_u[ s_i y_i ≠ c_t(x_i) ∧ c_t(x_i) = 1 ∧ y_i = y ].

Therefore,

Pr_u[SY = c_t | c_t = 1, Y = y] ≥ 1 − (4γ + 1 − 2ρ_+)/(ξαt) = 1 − h(ξt).

Lemma 6.7. Denote T = ((4γ + 1 − 2ρ(y))/(ξα)) e^{δ_m} + 1. The algorithm stops for t ≥ T with Pr[Y = y, C_t = 1] = α and δ_t = δ_m.

Proof.

2 Pr_u[SY = c_t] − 2 Pr_u[SY = c_m]
≥ ξt E[ Σ_{i: s_i ≠ y_i, c_t(x_i)=1, y_i=1} u_i − Σ_{i: s_i ≠ y_i, c_m(x_i)=1, y_i=1} u_i ]   (11)
= ξt ( Pr_u[c_t = 1 ∧ S ≠ Y ∧ Y = 1] − Pr_u[c_m = 1 ∧ S ≠ Y ∧ Y = 1] )
= ξt ( Pr_u[SY ≠ c_t | c_t = 1, Y = 1] Pr_u[c_t = 1, Y = 1] − Pr[SY ≠ c_m | c_m = 1, Y = 1] Pr[c_m = 1, Y = 1] ).

By definition of the worst-case violation for y = 1, Pr_u[SY ≠ c_t | c_t = 1, Y = 1] ≥ Pr_u[SY ≠ c_m | c_m = 1, Y = 1]. Moreover, when the algorithm stops, Pr_u[c_t = 1, Y = 1] = α = Pr_u[c_m = 1, Y = 1]. Therefore, the right-hand side of (11) is non-negative. It results that Pr_u[SY = c_t] ≥ Pr_u[SY = c_m]. On the other hand, since f is γ-unfair, Pr_u[SY = c_t] cannot be more than 4γ − ρ_+ + 1. Therefore, Pr_u[SY = c_t] = 4γ − ρ_+ + 1. It follows that at iteration t, when the algorithm stops,

Pr_u[Y = 1, c_t = 1] ( e^{δ(c_t)}/(1 + e^{δ(c_t)}) − 1/2 ) = γ.

The algorithm stops when Pr[Y = 1, c_t = 1] = α, which implies δ_t = δ_m. Therefore, t ≤ T.