
Naive Bayes Classification

Bayes Recall
Bayes' theorem uses the concepts of hypothesis and evidence
A record is a piece of evidence
Bayes' theorem is based on probability and statistics

Using Bayes for Classification


Let X be a data record (X is therefore treated as the evidence)
X is described by a set of n attributes
Let H be a hypothesis, such as that the data record X belongs to a specified class C
P(H|X) is the probability that the hypothesis H holds given the evidence, i.e. the observed record X
Therefore, in naive Bayes classification, we look for the probability that X belongs to class C given that we know the attribute description of X

P(H|X): posterior probability of H conditioned on X
P(H): prior probability of H, independent of X
P(X|H): posterior probability of X conditioned on H
P(X): prior probability of X
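
For reference, Bayes' theorem ties these four quantities together. It is written here in LaTeX notation with the same symbols as above. Since P(X) does not depend on the hypothesis, comparing hypotheses for a fixed record X only requires the numerator.

```latex
P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)}
```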

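In order to predict the class label of X, P(X|Ci) P(Ci) is evaluated for each class Ci
where P(X|Ci) = P(x1|Ci) * P(x2|Ci) * ... * P(xn|Ci), assuming the attributes are conditionally independent given the class, and x1 to xn are the attribute values of X
where P(Ci) = |Ci| / |D|, the fraction of training records in D that belong to class Ci

The predicted class label is the class Ci for which P(X|Ci) P(Ci) is the maximum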
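
The decision rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `score` and `predict`, the dictionary layout, and the two-class toy probabilities are all assumptions made for this example and do not come from the slides.

```python
from math import prod

def score(x, prior, cond_prob):
    """Return P(X|Ci) * P(Ci) for one class.

    x         -- dict mapping attribute name -> observed value
    prior     -- P(Ci) for this class
    cond_prob -- dict mapping (attribute, value) -> P(value | Ci)
    """
    return prior * prod(cond_prob[(attr, val)] for attr, val in x.items())

def predict(x, priors, cond_probs):
    """Return the class label with the largest P(X|Ci) * P(Ci)."""
    return max(priors, key=lambda c: score(x, priors[c], cond_probs[c]))

# Hypothetical toy numbers, chosen only to exercise the rule.
priors = {"A": 0.5, "B": 0.5}
cond_probs = {
    "A": {("colour", "red"): 0.8, ("size", "big"): 0.5},
    "B": {("colour", "red"): 0.2, ("size", "big"): 0.5},
}
x = {"colour": "red", "size": "big"}
label = predict(x, priors, cond_probs)
print(label)
```

With many attributes the product of small probabilities can underflow, so practical implementations often sum logarithms instead. The plain product is kept here to mirror the formula.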

Example: classify the record X = (age = youth, income = medium, student = yes, credit_rating = fair) using the training data below


AGE     INCOME   STUDENT   CREDIT RATING   BUYS
youth   high     no        fair            No
youth   high     no        excellent       No
adult   high     no        fair            Yes
old     medium   no        fair            Yes
old     low      yes       fair            Yes
old     low      yes       excellent       No
adult   low      yes       excellent       Yes
youth   medium   no        fair            No
youth   low      yes       fair            Yes
old     medium   yes       fair            Yes
youth   medium   yes       excellent       Yes
adult   medium   no        excellent       Yes
adult   high     yes       fair            Yes
old     medium   no        excellent       No

There are two classes (i = 1, 2): C1 = buys computer and C2 = does not buy computer

P(Ci), the prior probability of each class
P(buys_computer = yes) = 9/14 = 0.643
P(buys_computer = no) = 5/14 = 0.357
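
These priors are class frequencies, so they can be reproduced by counting the BUYS column. The sketch below assumes the 14 labels are listed in the same order as the training table. The variable names are illustrative.

```python
from collections import Counter

# BUYS column of the 14 training records, in table order.
buys = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

counts = Counter(buys)                                   # |Ci| for each class
priors = {c: n / len(buys) for c, n in counts.items()}   # |Ci| / |D|

for label, p in priors.items():
    print(f"P(buys_computer = {label.lower()}) = "
          f"{counts[label]}/{len(buys)} = {p:.3f}")
```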

P(X|Ci), the posterior probability of X given Ci, is built from one conditional probability per attribute value of X

P(age=youth | buys_computer=yes) = 2/9 = 0.222
P(age=youth | buys_computer=no) = 3/5 = 0.600
P(income=medium | buys_computer=yes) = 4/9 = 0.444
P(income=medium | buys_computer=no) = 2/5 = 0.400
P(student=yes | buys_computer=yes) = 6/9 = 0.667
P(student=yes | buys_computer=no) = 1/5 = 0.200
P(credit_rating=fair | buys_computer=yes) = 6/9 = 0.667
P(credit_rating=fair | buys_computer=no) = 2/5 = 0.400
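
Each of these conditional probabilities is a count within one class divided by the size of that class. As a rough check, the sketch below recomputes them from the training table. The helper `cond_prob` and the tuple layout of `ROWS` are assumptions made for this example.

```python
from fractions import Fraction

# Training records from the table: (age, income, student, credit_rating, buys).
ROWS = [
    ("youth", "high",   "no",  "fair",      "No"),
    ("youth", "high",   "no",  "excellent", "No"),
    ("adult", "high",   "no",  "fair",      "Yes"),
    ("old",   "medium", "no",  "fair",      "Yes"),
    ("old",   "low",    "yes", "fair",      "Yes"),
    ("old",   "low",    "yes", "excellent", "No"),
    ("adult", "low",    "yes", "excellent", "Yes"),
    ("youth", "medium", "no",  "fair",      "No"),
    ("youth", "low",    "yes", "fair",      "Yes"),
    ("old",   "medium", "yes", "fair",      "Yes"),
    ("youth", "medium", "yes", "excellent", "Yes"),
    ("adult", "medium", "no",  "excellent", "Yes"),
    ("adult", "high",   "yes", "fair",      "Yes"),
    ("old",   "medium", "no",  "excellent", "No"),
]
ATTRS = ("age", "income", "student", "credit_rating")

def cond_prob(attr, value, label):
    """Estimate P(attr = value | buys = label) by counting rows."""
    col = ATTRS.index(attr)
    in_class = [r for r in ROWS if r[-1] == label]
    matches = sum(1 for r in in_class if r[col] == value)
    return Fraction(matches, len(in_class))

x = {"age": "youth", "income": "medium",
     "student": "yes", "credit_rating": "fair"}
for attr, value in x.items():
    for label in ("Yes", "No"):
        p = cond_prob(attr, value, label)
        print(f"P({attr}={value} | buys_computer={label.lower()}) "
              f"= {float(p):.3f}")
```

Counting like this gives a probability of zero to any attribute value that never occurs in a class, which would zero out the whole product. That case does not arise for this X.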

P(X | buys_computer=yes)
= 0.222 * 0.444 * 0.667 * 0.667
= 0.044

P(X | buys_computer=no)
= 0.600 * 0.400 * 0.200 * 0.400
= 0.019

P(X|Ci) P(Ci)
P(X | buys_computer=yes) P(buys_computer=yes)
= 0.044 * 0.643 = 0.028
P(X | buys_computer=no) P(buys_computer=no)
= 0.019 * 0.357 = 0.007
Since 0.028 > 0.007, the predicted class for X = (age=youth, income=medium, student=yes, credit_rating=fair) is buys_computer = yes.
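
The final comparison can be repeated with exact fractions to avoid rounding at intermediate steps. The sketch below uses the counts from the worked example. The dictionary layout is an assumption for illustration. Unrounded, the two scores come to about 0.0282 and 0.0069. Dividing each by their sum, which plays the role of P(X) under the naive independence assumption, gives a normalized posterior of roughly 0.80 for buys_computer = yes.

```python
from fractions import Fraction as F
from math import prod

# Counts taken from the worked example: prior first, then the four conditionals.
factors = {
    "yes": [F(9, 14), F(2, 9), F(4, 9), F(6, 9), F(6, 9)],
    "no":  [F(5, 14), F(3, 5), F(2, 5), F(1, 5), F(2, 5)],
}

scores = {c: prod(fs) for c, fs in factors.items()}   # P(X|Ci) * P(Ci)
predicted = max(scores, key=scores.get)               # class with the larger score

# Normalizing by the sum of the scores yields P(Ci|X) under the model.
total = sum(scores.values())
posteriors = {c: float(s / total) for c, s in scores.items()}

for c in scores:
    print(f"{c}: score = {float(scores[c]):.4f}, "
          f"posterior = {posteriors[c]:.3f}")
print("predicted:", predicted)
```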
