Three naïve Bayes approaches for discrimination-free classification
Toon Calders and Sicco Verwer
ECML PKDD 2010
Overview
• Introduction:
  • What is discrimination-free classification?
  • Why do (current) trivial solutions fail?
• Our approach:
  • Modified naïve Bayes
  • 2 naïve Bayes models
  • Naïve Bayes with a latent variable
• Results and conclusions
3-1-2012
PAGE 2
Discrimination-free classification
• Consider a bank that wants to partially automate its loan issuing system:
  1. Obtain historic data
  2. Learn a classifier
  3. Use the classifier (as a guide) to decide on new loan applications
• The data will favor some ethnic groups
• A classifier will learn this favoritism as a rule
• Discrimination-free classification avoids learning (or using) such rules
Why discrimination-free classification?
• Discrimination laws do not allow the use of such rules for attributes such as ethnicity, gender, religion, etc.
  • We call these sensitive attributes
• Using decision rules that base their decision on these attributes in a classifier can lead to big fines
• Many companies and institutions therefore cannot use state-of-the-art machine learning or data mining techniques
  • At least, not without modifying them…
A trivial solution
• Why not simply remove the sensitive attribute from the data set?
• Suppose we want a classifier that decides whether a person has a high or low income
• In the census income data there is some favoritism:

                  male    female
    High income   3256     590
    Low income    7604    4831

• About 30% of all males and about 11% of all females have a high income
A trivial solution
• We learn a naïve Bayes classifier and obtain the following counts:

                  male    female
    High income   4559     422
    Low income    6301    4999

• 42% of all males and only 8% of all females get assigned a high income by the classifier
A trivial solution
• Removing the gender attribute and then learning an NB classifier results in the following:

                  male    female
    High income   4134     567
    Low income    6726    4854

• About 38% of all males and about 10% of all females get assigned a high income
• Males are favored even more than in the data itself!
• The reason is the red-lining effect…
Red-lining
• 1936 red-lining map of Philadelphia
• Households or businesses in the red zones do not get mortgages or business loans
Accuracy – Discrimination trade-off
• A classifier can use attributes correlated with the sensitive attribute to discriminate indirectly
• We want to remove not only direct, but also indirect discrimination
• Removing all such correlated attributes before training does remove discrimination, but at a high cost in classifier accuracy
• We want to remove all discrimination with minimal accuracy loss
Measuring discrimination
• Measuring true discrimination is very difficult:
  • Would this female have gotten a loan if she were male?
• We simply use the difference in positive-class (C+) probability between the sensitive attribute values (S+, S-):
  • disc = P(C+ | S+) - P(C+ | S-)
• This disc measure:
  • Is intuitive
  • Measures the severity of discrimination
  • Can distinguish between positive and negative discrimination
• Our aim is to learn a classifier with 0.0 discrimination
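As a quick illustration, the disc score for the census table shown earlier can be computed directly (a minimal sketch; the variable names are ours):

```python
# disc = P(C+ | S+) - P(C+ | S-), estimated from the census income
# counts on the earlier slide (S+ = male, S- = female, C+ = high income).
high = {"male": 3256, "female": 590}
low = {"male": 7604, "female": 4831}

def p_pos(sex):
    """Estimate P(high income | sex) from the counts."""
    return high[sex] / (high[sex] + low[sex])

disc = p_pos("male") - p_pos("female")
print(round(p_pos("male"), 2))    # 0.3  (about 30% of males)
print(round(p_pos("female"), 2))  # 0.11 (about 11% of females)
print(round(disc, 2))             # 0.19
```

A positive disc means the S+ group is favored; a discrimination-free classifier should drive this difference to 0.0.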
Naïve Bayes
P(C, S, A1, …, An) = P(C) P(S|C) P(A1|C) … P(An|C)
[Network structure: the class C is the parent of S and of every attribute A1, …, An]
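Under this factorization the joint probability is just a product of table lookups. A minimal sketch with made-up illustrative CPTs (none of these numbers come from the paper):

```python
# Naive Bayes joint: P(C, S, A1, ..., An) = P(C) P(S|C) prod_i P(Ai|C).
# All probability tables below are illustrative placeholders.
p_c = {1: 0.3, 0: 0.7}                        # P(C)
p_s_given_c = {1: {1: 0.6, 0: 0.4},           # P(S|C), indexed [c][s]
               0: {1: 0.4, 0: 0.6}}
p_a_given_c = [                               # one table P(Ai|C) per attribute
    {1: {1: 0.8, 0: 0.2}, 0: {1: 0.3, 0: 0.7}},
    {1: {1: 0.5, 0: 0.5}, 0: {1: 0.9, 0: 0.1}},
]

def joint(c, s, attrs):
    """P(C=c, S=s, A1..An=attrs) under the naive Bayes factorization."""
    p = p_c[c] * p_s_given_c[c][s]
    for table, a in zip(p_a_given_c, attrs):
        p *= table[c][a]
    return p

# P(C=1, S=1, A1=1, A2=0) = 0.3 * 0.6 * 0.8 * 0.5
print(round(joint(1, 1, [1, 0]), 3))  # 0.072
```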
Modified naïve Bayes
P(C, S, A1, …, An) = P(S) P(C|S) P(A1|C) … P(An|C)
[Network structure: S is the parent of C; C is the parent of every A1, …, An]
• Modify P(C|S) until disc = 0.0
2 naïve Bayes models
P(C, S, A1, …, An) = P(S) P(C|S) P(A1|C,S) … P(An|C,S)
[Network structure: S is a parent of C and of every Ai; C is also a parent of every Ai]
• Conditioning on S explains away the correlation between S and A1, …, An
Modification algorithm
• Pos = the number of positive labels in the data
• S+ and S- are the favored and discriminated values
• while (discrimination > 0):
  • If the classifier assigns fewer positive labels than Pos:
    − Modify the counts such that P(C+ | S-) increases a bit
  • Else:
    − Modify the counts such that P(C+ | S+) decreases a bit
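The loop above can be sketched as follows. This is our own minimal rendering, not the paper's exact bookkeeping: `counts[s]` holds [negative, positive] counts per group, the step size `delta` is an illustrative choice, and the update to `assigned_pos` is a crude stand-in for re-running the classifier after each count change. The branch condition follows the intended invariant of keeping the number of assigned positives close to Pos.

```python
# Count-modification sketch: shift mass in P(C|S) until
# disc = P(C+|S+) - P(C+|S-) reaches zero, while trying to keep the
# number of assigned positive labels close to Pos.

def disc(counts):
    p = {s: counts[s][1] / (counts[s][0] + counts[s][1]) for s in counts}
    return p["S+"] - p["S-"]

def remove_discrimination(counts, pos_in_data, assigned_pos, delta=1.0):
    while disc(counts) > 0:
        if assigned_pos < pos_in_data:
            # too few positives assigned: increase P(C+ | S-)
            counts["S-"][1] += delta
            counts["S-"][0] -= delta
            assigned_pos += delta
        else:
            # enough positives assigned: decrease P(C+ | S+)
            counts["S+"][1] -= delta
            counts["S+"][0] += delta
            assigned_pos -= delta
    return counts

# Counts from the census slide: {group: [low income, high income]}.
counts = {"S+": [7604.0, 3256.0], "S-": [4831.0, 590.0]}
remove_discrimination(counts, pos_in_data=3846, assigned_pos=4981)
print(disc(counts) <= 0)  # True
```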
Naïve Bayes with a latent variable
P(C, S, L, A1, …, An) = P(L) P(S) P(C|S,L) P(A1|L,S) … P(An|L,S)
[Network structure: L and S are parents of C and of every Ai]
• The latent variable L tries to discover the true label
Expectation maximization
• In this model C can be correlated with S
• But L is modeled to be independent of S
• We try to discover the most likely setting of the nondiscriminating L values using EM:
  • Initialize L randomly for all tuples
  • Update the model using the new L values
  • Assign to L its expected value for every tuple
  • Iterate until convergence
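The four steps can be sketched as a soft-EM loop. To stay self-contained, this sketch uses a deliberately simplified model with a single binary attribute and no S (the paper's model has S and many attributes); all numbers are illustrative:

```python
import random

random.seed(0)
data = [1] * 70 + [0] * 30            # observed attribute values A

# Step 1: initialize L randomly for all tuples (soft assignment P(L=1))
L = [random.random() for _ in data]

for _ in range(50):                   # Step 4: iterate (fixed budget here)
    # Step 2: update the model using the current L values.
    # Laplace smoothing (+1 / +2) keeps estimates away from 0 and 1.
    p_l = sum(L) / len(L)                                              # P(L=1)
    p_a_l1 = (sum(l * a for l, a in zip(L, data)) + 1) / (sum(L) + 2)  # P(A=1|L=1)
    p_a_l0 = (sum((1 - l) * a for l, a in zip(L, data)) + 1) \
             / (len(L) - sum(L) + 2)                                   # P(A=1|L=0)

    # Step 3: assign to L its expected value for every tuple
    def expected_l(a):
        num = p_l * (p_a_l1 if a else 1 - p_a_l1)
        den = num + (1 - p_l) * (p_a_l0 if a else 1 - p_a_l0)
        return num / den
    L = [expected_l(a) for a in data]
```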
Adding prior knowledge
• It makes no sense to increase discrimination:
  • Fix L = 0 for tuples with S = 1 and C = 0
  • Fix L = 1 for tuples with S = 0 and C = 1
• Try to keep the number of positive class labels roughly the same:
  • P(C+) = P(L+), P(C-) = P(L-)
• Since L is nondiscriminating:
  • P(L | S-) = P(L | S+)
• Together these constraints yield a unique setting of P(C | L, S)
• Pre-compute and fix this setting
Generating data
• Sample from the latent variable model, but:
  • bound the differences between P(A | L, S) for different L and S
  • fix the distribution P(C | L, S):

    L   S   P(C+ | L, S)
    0   0   0.1
    0   1   0.2
    1   0   0.8
    1   1   0.9

• Accuracy is measured using the L label, which is not part of the training set
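Sampling from this setup can be sketched as below. The fixed P(C+ | L, S) table is taken from the slide, while P(L), P(S), and the P(A | L, S) table are illustrative placeholders:

```python
import random

random.seed(1)

P_C_GIVEN_LS = {(0, 0): 0.1, (0, 1): 0.2,
                (1, 0): 0.8, (1, 1): 0.9}   # fixed table from the slide
P_L, P_S = 0.5, 0.5                          # illustrative placeholders
P_A_GIVEN_LS = {(0, 0): 0.3, (0, 1): 0.4,
                (1, 0): 0.6, (1, 1): 0.7}   # illustrative placeholder

def bern(p):
    return 1 if random.random() < p else 0

def sample():
    l, s = bern(P_L), bern(P_S)
    c = bern(P_C_GIVEN_LS[(l, s)])
    a = bern(P_A_GIVEN_LS[(l, s)])
    # the classifier is trained on (S, A, C); L is held out and used
    # only to measure accuracy
    return {"L": l, "S": s, "C": c, "A": a}

data = [sample() for _ in range(1000)]
```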
Results
[figure slides: result plots not reproduced in the text]
Results

                      S included                   Marginalizing over S
                      discrimination  accuracy     discrimination  accuracy
  Modified NB         -0.003          0.813        0.286           0.818
  2 NB                -0.003          0.812        0.047           0.807
  EM                   0.000          0.773        0.081           0.739
  EM prior             0.013          0.790        0.077           0.765
  EM stopped          -0.006          0.797        0.061           0.792
  EM prior stopped    -0.001          0.801        0.063           0.793
Conclusions
• We presented methods to remove discrimination from the naïve Bayes classifier:
  • Modifying the positive class probability
  • Learning two models, one per sensitive attribute value, and then modifying the positive class probability
  • Trying to discover the true class label using EM
• The two-model method works best, even when the data is sampled from the latent variable model
Future work
• Find a more sophisticated and computable measure of discrimination
• Use other base classifiers
• Investigate why EM converges poorly