Schedule
Oct 26: Machine Learning, Version Space Method
Oct 28: Decision Trees
Nov 2: Supervised and Unsupervised Learning
Nov 4: Perceptron
Nov 9: Neural Network
Office hour: exam. (logic)
Outline
Introduction to machine learning
- What is machine learning?
- Applications of machine learning
http://www.jaist.ac.jp/~tsuruoka/lectures/
Hastie, Tibshirani and Friedman (2008). The Elements of Statistical Learning (2nd edition). Springer-Verlag.
Unsupervised learning
- No output is given
- Analyses relations between instances

Reinforcement learning
- Supervision is given via rewards
- Too many rules
- Hard to keep consistency
- Each rule may not be completely correct
Concept Learning
- Training examples
- Representing hypotheses
- Find-S algorithm
- Version space
- Candidate-Elimination algorithm
Training examples (attributes and target concept EnjoySport):

Weather  AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny    Warm     Normal    Strong  Warm   Same      Yes
Sunny    Warm     High      Strong  Warm   Same      Yes
Rainy    Cold     High      Strong  Warm   Change    No
Sunny    Warm     High      Strong  Cool   Change    Yes
Hypotheses
Representing hypotheses
h1 = <Sunny, ?, ?, Strong, ?, ?>
Weather = Sunny and Wind = Strong (the other attributes can take any value)
h2 = <Sunny, ?, ?, ?, ?, ?>
Weather = Sunny
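The representation above can be stated as a small matching function (a minimal Python sketch; the tuple encoding of hypotheses with '?' follows the slides):

```python
def matches(h, x):
    """A hypothesis h matches an instance x iff every attribute
    constraint is '?' (any value) or equals the instance's value."""
    return all(a == '?' or a == v for a, v in zip(h, x))

h1 = ('Sunny', '?', '?', 'Strong', '?', '?')                # Weather = Sunny, Wind = Strong
x1 = ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')  # first training example
x3 = ('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change')  # third training example

print(matches(h1, x1))  # True
print(matches(h1, x3))  # False: Weather is Rainy, not Sunny
```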
Find-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x:
     For each attribute constraint ai in h:
       If the constraint ai is satisfied by x, do nothing;
       otherwise, replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
Example
h0 = <0, 0, 0, 0, 0, 0>
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, yes
h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
x2 = <Sunny, Warm, High, Strong, Warm, Same>, yes
h2 = <Sunny, Warm, ?, Strong, Warm, Same>
x3 = <Rainy, Cold, High, Strong, Warm, Change>, no
h3 = <Sunny, Warm, ?, Strong, Warm, Same>
x4 = <Sunny, Warm, High, Strong, Cool, Change>, yes
h4 = <Sunny, Warm, ?, Strong, ?, ?>
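The trace above can be reproduced with a small Python implementation of Find-S (a sketch; the '0'/'?' encoding and the 'yes'/'no' labels follow the slides):

```python
def find_s(examples):
    """Find-S: maintain the most specific hypothesis consistent with
    the positive examples. '0' = no value allowed, '?' = any value."""
    n = len(examples[0][0])
    h = ['0'] * n  # start from the most specific hypothesis
    for x, label in examples:
        if label != 'yes':
            continue  # Find-S ignores negative examples
        for i, (ai, xi) in enumerate(zip(h, x)):
            if ai == '0':
                h[i] = xi   # first positive example: adopt its value
            elif ai != xi:
                h[i] = '?'  # generalize the constraint to "any value"
    return h

examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'no'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'yes'),
]
print(find_s(examples))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```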
Version Space
Definition
Hypothesis space H, training examples D. The version space is:

VS_{H,D} = { h ∈ H | Consistent(h, D) }

i.e., the subset of hypotheses from H consistent with the training examples in D.
LIST-THEN-ELIMINATE algorithm
1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>:
     Remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
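For this small task the hypothesis space can be enumerated explicitly, so LIST-THEN-ELIMINATE is directly runnable (a sketch; enumerating only the attribute values that appear in the slides, plus '?', is an assumption about the domains):

```python
from itertools import product

# Candidate values per attribute: values seen in the training data plus '?'
# (an assumption -- the true domains may contain further values).
domains = [
    ['Sunny', 'Rainy', '?'], ['Warm', 'Cold', '?'], ['Normal', 'High', '?'],
    ['Strong', '?'], ['Warm', 'Cool', '?'], ['Same', 'Change', '?'],
]

def matches(h, x):
    return all(a == '?' or a == v for a, v in zip(h, x))

examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'no'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'yes'),
]

# Keep every hypothesis whose predictions agree with all training labels.
version_space = [h for h in product(*domains)
                 if all(matches(h, x) == (y == 'yes') for x, y in examples)]
for h in version_space:
    print(h)
```

On this data the surviving list has six hypotheses, matching the version space derived later in the slides.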
Version Space
Specific boundary and General boundary
S: { <Sunny, Warm, ?, Strong, ?, ?> }
G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
The version space can be represented with S and G; you don't have to list all the hypotheses.
Candidate-Elimination algorithm

Initialization
G: the set of maximally general hypotheses in H
S: the set of maximally specific hypotheses in H

For each training example d:
If d is a positive example
- Remove from G any hypothesis inconsistent with d
- Replace each hypothesis in S that is inconsistent with d by its minimal generalizations that are consistent with d
- Remove from S any hypothesis that is more general than another hypothesis in S
If d is a negative example
- Remove from S any hypothesis inconsistent with d
- Replace each hypothesis in G that is inconsistent with d by its minimal specializations that are consistent with d
- Remove from G any hypothesis that is more specific than another hypothesis in G
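The update steps can be made concrete with a small Python implementation for conjunctive hypotheses (a sketch; the attribute domains are limited to the values appearing in the slides, and helper names such as `more_general` are my own):

```python
domains = [  # attribute values seen in the slides (an assumption about the full domains)
    ['Sunny', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
    ['Strong'], ['Warm', 'Cool'], ['Same', 'Change'],
]

def matches(h, x):
    return all(a == '?' or a == v for a, v in zip(h, x))

def more_general(h1, h2):
    """True if h1 covers every instance that h2 covers ('0' covers nothing)."""
    if '0' in h2:
        return True
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    S = [('0',) * len(domains)]  # maximally specific boundary
    G = [('?',) * len(domains)]  # maximally general boundary
    for x, label in examples:
        if label == 'yes':
            G = [g for g in G if matches(g, x)]
            new_S = []
            for s in S:
                if matches(s, x):
                    new_S.append(s)
                    continue
                # minimal generalization of s that covers x
                s2 = tuple(v if a == '0' else (a if a == v else '?')
                           for a, v in zip(s, x))
                if any(more_general(g, s2) for g in G):
                    new_S.append(s2)
            S = [s for s in new_S
                 if not any(s != t and more_general(s, t) for t in new_S)]
        else:
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # minimal specializations of g that exclude x
                for i, values in enumerate(domains):
                    if g[i] != '?':
                        continue
                    for v in values:
                        if v != x[i]:
                            g2 = g[:i] + (v,) + g[i + 1:]
                            if any(more_general(g2, s) for s in S):
                                new_G.append(g2)
            G = [g for g in new_G
                 if not any(g != h and more_general(h, g) for h in new_G)]
    return S, G

examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'no'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'yes'),
]
S, G = candidate_elimination(examples, domains)
print(S)  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)  # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]
```

Running it on the four training examples reproduces the S and G boundaries derived step by step in the worked example that follows.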
Example
1st training example: <Sunny, Warm, Normal, Strong, Warm, Same>, yes

S0: { <0, 0, 0, 0, 0, 0> }
S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
G0, G1: { <?, ?, ?, ?, ?, ?> }
Example
2nd training example: <Sunny, Warm, High, Strong, Warm, Same>, yes

S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
S2: { <Sunny, Warm, ?, Strong, Warm, Same> }
G1, G2: { <?, ?, ?, ?, ?, ?> }
Example
3rd training example: <Rainy, Cold, High, Strong, Warm, Change>, no

S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G2: { <?, ?, ?, ?, ?, ?> }
G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }
Example
4th training example: <Sunny, Warm, High, Strong, Cool, Change>, yes

S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
S4: { <Sunny, Warm, ?, Strong, ?, ?> }
G3: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same> }
G4: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

The final version space also contains the hypotheses lying between S4 and G4:
<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>