
Lecture Slides for

INTRODUCTION
TO
MACHINE
LEARNING
3RD EDITION
ETHEM ALPAYDIN
The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e

CHAPTER 3:

BAYESIAN DECISION
THEORY

Probability and Inference



Result of tossing a coin is in {Heads, Tails}

Random variable $X \in \{1, 0\}$
Bernoulli: $P\{X = x\} = p_o^{\,x}\,(1 - p_o)^{1 - x}$
Sample: $\mathcal{X} = \{x^t\}_{t=1}^{N}$
Estimation: $\hat{p}_o = \#\{\text{Heads}\}/\#\{\text{Tosses}\} = \sum_t x^t / N$
Prediction of next toss:
Heads if $\hat{p}_o > 1/2$, Tails otherwise
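A minimal sketch in Python of the estimate above (not from the slides; the sample values are made up): the maximum-likelihood estimate of the Bernoulli parameter is just the fraction of heads, and the next toss is predicted by thresholding it at 1/2.

# Hypothetical illustration: ML estimate of the Bernoulli parameter p_o
# from a sample of coin tosses (1 = Heads, 0 = Tails).
sample = [1, 0, 1, 1, 0, 1, 1, 0]          # x^t, t = 1..N (made-up data)

p_hat = sum(sample) / len(sample)           # p_o = sum_t x^t / N

# Prediction of the next toss: Heads if p_o > 1/2, Tails otherwise
prediction = "Heads" if p_hat > 0.5 else "Tails"
print(p_hat, prediction)                    # 0.625 Heads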

Classification

Credit scoring: inputs are income and savings; output is low-risk vs. high-risk.

Input: $\mathbf{x} = [x_1, x_2]^T$, Output: $C \in \{0, 1\}$

Prediction:
choose $C = 1$ if $P(C = 1 \mid x_1, x_2) > 0.5$, $C = 0$ otherwise
or
choose $C = 1$ if $P(C = 1 \mid x_1, x_2) > P(C = 0 \mid x_1, x_2)$, $C = 0$ otherwise

Bayes Rule

$$P(C \mid x) = \frac{P(C)\, p(x \mid C)}{p(x)}$$
(posterior = prior $\times$ likelihood / evidence)

$P(C = 0) + P(C = 1) = 1$

$p(x) = p(x \mid C = 1)\, P(C = 1) + p(x \mid C = 0)\, P(C = 0)$

$P(C = 0 \mid x) + P(C = 1 \mid x) = 1$

Bayes Rule: K>2 Classes



$$P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{p(x)} = \frac{p(x \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(x \mid C_k)\, P(C_k)}$$

$P(C_i) \ge 0$ and $\sum_{i=1}^{K} P(C_i) = 1$

choose $C_i$ if $P(C_i \mid x) = \max_k P(C_k \mid x)$
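A minimal sketch of this $K$-class rule (my own illustration, not the book's code; the priors and likelihood values are made up): form the joint $p(x \mid C_k)\,P(C_k)$, normalize by the evidence, and pick the class with the largest posterior.

import numpy as np

# Hypothetical sketch of the K-class Bayes rule: posteriors from priors
# P(C_k) and likelihoods p(x|C_k) at a fixed x, then choose the argmax class.
def bayes_classify(priors, likelihoods):
    joint = np.asarray(priors) * np.asarray(likelihoods)   # p(x|C_k) P(C_k)
    posteriors = joint / joint.sum()                        # divide by evidence p(x)
    return posteriors, int(np.argmax(posteriors))           # choose C_i with max posterior

# Made-up numbers for illustration
posteriors, chosen = bayes_classify([0.5, 0.3, 0.2], [0.1, 0.4, 0.3])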

Losses and Risks

Actions: $\alpha_i$
Loss of $\alpha_i$ when the state is $C_k$: $\lambda_{ik}$
Expected risk (Duda and Hart, 1973):
$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x)$$

choose $\alpha_i$ if $R(\alpha_i \mid x) = \min_k R(\alpha_k \mid x)$
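As a sketch of the expected-risk rule (an assumed example, with a made-up loss matrix), the risks of all actions are a matrix-vector product of the losses with the posteriors, and the minimum-risk action is chosen.

import numpy as np

# Hypothetical sketch: loss matrix lambda[i, k] (rows = actions, columns = true
# classes) and posteriors P(C_k | x); choose the action with minimum risk.
def min_risk_action(loss, posteriors):
    risks = np.asarray(loss) @ np.asarray(posteriors)   # R(alpha_i|x) = sum_k lambda_ik P(C_k|x)
    return risks, int(np.argmin(risks))

loss = [[0.0, 10.0],    # cost of choosing C_0 when the truth is C_0 / C_1
        [1.0,  0.0]]    # cost of choosing C_1 when the truth is C_0 / C_1
risks, action = min_risk_action(loss, [0.7, 0.3])       # risks = [3.0, 0.7] -> action 1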

Losses and Risks: 0/1 Loss



$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \ne k \end{cases}$$

$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x) = \sum_{k \ne i} P(C_k \mid x) = 1 - P(C_i \mid x)$$

For minimum risk, choose the most probable class.

Losses and Risks: Reject


$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ \lambda & \text{if } i = K + 1, \quad 0 < \lambda < 1 \\ 1 & \text{otherwise} \end{cases}$$

$$R(\alpha_{K+1} \mid x) = \sum_{k=1}^{K} \lambda\, P(C_k \mid x) = \lambda$$
$$R(\alpha_i \mid x) = \sum_{k \ne i} P(C_k \mid x) = 1 - P(C_i \mid x)$$

choose $C_i$ if $P(C_i \mid x) > P(C_k \mid x)\ \forall k \ne i$ and $P(C_i \mid x) > 1 - \lambda$;
reject otherwise
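A minimal sketch of the reject option (my own illustration; the posterior values and the choice of $\lambda = 0.2$ are made up): accept the most probable class only if its posterior exceeds $1 - \lambda$, otherwise reject.

import numpy as np

# Hypothetical sketch: with rejection cost lambda, choose the most probable
# class only if its posterior exceeds 1 - lambda; otherwise reject.
def classify_with_reject(posteriors, lam=0.2):
    posteriors = np.asarray(posteriors)
    i = int(np.argmax(posteriors))
    if posteriors[i] > 1.0 - lam:       # confident enough: accept class C_i
        return i
    return "reject"                      # otherwise it is cheaper to reject

print(classify_with_reject([0.9, 0.1]))      # 0
print(classify_with_reject([0.55, 0.45]))    # 'reject' (threshold 0.8)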

Different Losses and Reject


[Figure: decision regions for the two-class example under equal losses, unequal losses, and with the reject option]

Discriminant Functions
$g_i(x)$, $i = 1, \ldots, K$

choose $C_i$ if $g_i(x) = \max_k g_k(x)$

$$g_i(x) = \begin{cases} -R(\alpha_i \mid x) \\ P(C_i \mid x) \\ p(x \mid C_i)\, P(C_i) \end{cases}$$

$K$ decision regions $\mathcal{R}_1, \ldots, \mathcal{R}_K$:
$$\mathcal{R}_i = \{\, x \mid g_i(x) = \max_k g_k(x) \,\}$$

K=2 Classes

Dichotomizer (K=2) vs Polychotomizer (K>2)


$g(x) = g_1(x) - g_2(x)$

choose $C_1$ if $g(x) > 0$, $C_2$ otherwise

Log odds:
$$\log \frac{P(C_1 \mid x)}{P(C_2 \mid x)}$$
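A small sketch of the dichotomizer as a log-odds test (my own illustration; the posterior values are made up):

import math

# Hypothetical sketch: two-class discriminant g(x) written as log odds.
# Given the two posteriors at x, choose C_1 if g(x) > 0.
def dichotomize(p_c1_given_x, p_c2_given_x):
    g = math.log(p_c1_given_x / p_c2_given_x)   # log P(C_1|x) - log P(C_2|x)
    return "C_1" if g > 0 else "C_2"

print(dichotomize(0.7, 0.3))   # C_1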

Utility Theory

Probability of state $S_k$ given evidence $x$: $P(S_k \mid x)$

Utility of $\alpha_i$ when the state is $S_k$: $U_{ik}$
Expected utility:
$$EU(\alpha_i \mid x) = \sum_k U_{ik}\, P(S_k \mid x)$$

Choose $\alpha_i$ if $EU(\alpha_i \mid x) = \max_j EU(\alpha_j \mid x)$


Association Rules

Association rule: X → Y
People who buy/click/visit/enjoy X are also likely to
buy/click/visit/enjoy Y.
A rule implies association, not necessarily causation.


Association measures

Support (X → Y):
$$P(X, Y) = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers}\}}$$

Confidence (X → Y):
$$P(Y \mid X) = \frac{P(X, Y)}{P(X)} = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers who bought } X\}}$$

Lift (X → Y):
$$\frac{P(X, Y)}{P(X)\, P(Y)} = \frac{P(Y \mid X)}{P(Y)}$$

Example
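A minimal stand-in example (my own sketch with made-up baskets; the items "bread" and "butter" are hypothetical) computing the three measures above for the rule X → Y:

# Hypothetical illustration of support, confidence and lift for the rule
# X -> Y, with X = "bread" and Y = "butter", over made-up baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread"},
    {"milk"},
]

n = len(baskets)
n_x = sum("bread" in b for b in baskets)                  # customers who bought X
n_xy = sum({"bread", "butter"} <= b for b in baskets)     # bought both X and Y
n_y = sum("butter" in b for b in baskets)                 # customers who bought Y

support = n_xy / n                      # P(X, Y)
confidence = n_xy / n_x                 # P(Y | X)
lift = confidence / (n_y / n)           # P(Y | X) / P(Y)
print(support, confidence, lift)        # 0.5 0.666... 1.333...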

Apriori algorithm (Agrawal et al., 1996)

For (X, Y, Z), a 3-item set, to be frequent (have enough support), (X, Y), (X, Z), and (Y, Z) should be frequent.
If (X, Y) is not frequent, none of its supersets can be frequent.
Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, ... and X → Y, Z, ...
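A minimal Apriori sketch (my own illustration, not the textbook's code; the basket data and the support threshold are made up): grow frequent itemsets level by level, prune candidates whose (k-1)-subsets are not all frequent, and keep those with enough support.

from itertools import combinations

# Hypothetical sketch of the Apriori pruning idea described above.
def apriori(baskets, min_support=0.5):
    n = len(baskets)

    def support(items):
        return sum(items <= b for b in baskets) / n

    # frequent 1-item sets
    items = {i for b in baskets for i in b}
    frequent = [{frozenset([i]) for i in items if support(frozenset([i])) >= min_support}]

    k = 2
    while frequent[-1]:
        prev = frequent[-1]
        # candidate k-item sets: unions of frequent (k-1)-item sets
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # prune: every (k-1)-subset must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        frequent.append({c for c in candidates if support(c) >= min_support})
        k += 1
    return [s for level in frequent for s in level]

baskets = [{"bread", "butter", "milk"}, {"bread", "butter"}, {"bread"}, {"milk"}]
print(apriori(baskets, min_support=0.5))   # singletons plus {bread, butter}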
