
Learning from Examples
Christoph Eick: Learning Models to Predict and Classify

Example of Learning from Examples

- Classification: Is car x a family car?
- Prediction: What is the amount of rainfall tomorrow?
- Knowledge extraction: What do people expect from a family car? What factors are important for predicting tomorrow's rainfall?

Noise and Model Complexity
Use the simpler one because:
- Simpler to use (lower computational complexity)
- Easier to train (needs fewer examples)
- Less sensitive to noise
- Easier to explain (more interpretable)
- Generalizes better (lower variance; Occam's razor)

Alternative Approach: Regression
Training set:      X = {(x^t, r^t)}, t = 1, ..., N
Noisy labels:      r^t = f(x^t) + ε
Linear model:      g(x) = w1*x + w0
Quadratic model:   g(x) = w2*x^2 + w1*x + w0

Empirical squared error:
    E(g | X) = (1/N) * Σ_{t=1..N} [r^t - g(x^t)]^2
and, for the linear model,
    E(w1, w0 | X) = (1/N) * Σ_{t=1..N} [r^t - (w1*x^t + w0)]^2

(Lecture Notes for E. Alpaydın, 2004, Introduction to Machine Learning, © The MIT Press, V1.1)
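A minimal sketch of computing this error in Python; the data arrays here are made up for illustration:

    import numpy as np

    def squared_error(w1, w0, x, r):
        # E(w1, w0 | X) = (1/N) * sum_t [r^t - (w1*x^t + w0)]^2
        return np.mean((r - (w1 * x + w0)) ** 2)

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # inputs x^t (made up)
    r = np.array([0.9, 3.1, 5.2, 6.8, 9.1])   # noisy labels r^t (made up)
    print(squared_error(2.0, 1.0, x, r))      # small error: the line fits this data well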
Finding Regression Coefficients
g x   w1x  w 0
X  x t ,r  
t N
t 1
t How to find w1 and w0?
r 
Solve: dE/dw1=0 and dE/dw0=0
rt  f xt     And solve the two obtained equations!
Group Homework!
1
  
N
E g | X  
Lecture  Notes
t
r g x t 2
for E Alpaydın
N t 1
2004 Introduction to Machine
1
 x  w 0 
 w 1Press
N
E w 1 ,Learning
w0 | X   ©  t t 2
The r MIT
N t 1
(V1.1)
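Since the derivation itself is left as homework, here is only a numerical sketch using the standard least-squares closed form (w1 = cov(x, r)/var(x), w0 = mean(r) - w1*mean(x)); the data is made up:

    import numpy as np

    def fit_line(x, r):
        # solves dE/dw1 = 0 and dE/dw0 = 0 for the linear model g(x) = w1*x + w0
        w1 = np.cov(x, r, bias=True)[0, 1] / np.var(x)   # w1 = cov(x, r) / var(x)
        w0 = r.mean() - w1 * x.mean()                    # w0 = mean(r) - w1 * mean(x)
        return w1, w0

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    r = np.array([0.9, 3.1, 5.2, 6.8, 9.1])
    print(fit_line(x, r))   # roughly recovers slope 2 and intercept 1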
Model Selection & Generalization
- Learning is an ill-posed problem; the data alone are not sufficient to find a unique solution.
- Hence the need for an inductive bias: assumptions about the hypothesis class H.
- Generalization: how well a model performs on new data.
- Overfitting: H is more complex than C or f.
- Underfitting: H is less complex than C or f.
Underfitting and Overfitting
[Figure: training and test error vs. the complexity of the used model; the low-complexity region is labeled "Underfitting", the high-complexity region "Overfitting". Complexity of a decision tree := the number of nodes it uses.]

Underfitting: when the model is too simple, both training and test errors are large.
Overfitting: when the model is too complex, test errors are large although training errors are small.
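To make this concrete, a small sketch (an assumed setup, not from the slides) that fits polynomials of increasing degree and compares training and test error:

    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda x: np.sin(2 * np.pi * x)              # "true" function f (an assumption)
    x_train = np.sort(rng.uniform(0, 1, 20))
    x_test = np.sort(rng.uniform(0, 1, 200))
    r_train = f(x_train) + rng.normal(0, 0.2, 20)    # noisy labels r^t = f(x^t) + ε
    r_test = f(x_test) + rng.normal(0, 0.2, 200)

    for degree in (1, 3, 9):                         # increasing model complexity
        w = np.polyfit(x_train, r_train, degree)     # least-squares polynomial fit
        train_err = np.mean((r_train - np.polyval(w, x_train)) ** 2)
        test_err = np.mean((r_test - np.polyval(w, x_test)) ** 2)
        print(degree, train_err, test_err)
    # training error keeps shrinking as the degree grows,
    # while test error eventually grows again (overfitting)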
Generalization Error
Error on new examples!
- Two errors: the training error, and the testing error, usually called the generalization error (http://en.wikipedia.org/wiki/Generalization_error). Typically, the training error is smaller than the generalization error.
- Measuring the generalization error is a major challenge in data mining and machine learning (http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-11.html).
- To estimate the generalization error, we need data unseen during training. We could split the data as follows (see the sketch after this list):
  - Training set (50%)
  - Validation set (25%): optional, for selecting ML algorithm parameters
  - Test (publication) set (25%)

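A minimal sketch of such a split (the 50/25/25 proportions follow the slide; the helper name and everything else are assumptions):

    import numpy as np

    def split_data(n, train=0.5, val=0.25, seed=0):
        # shuffle indices, then cut into training / validation / test (publication) sets
        idx = np.random.default_rng(seed).permutation(n)
        n_train, n_val = int(train * n), int(val * n)
        return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

    train_idx, val_idx, test_idx = split_data(100)
    print(len(train_idx), len(val_idx), len(test_idx))   # 50 25 25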
Triple Trade-Off
There is a trade-off between three factors (Dietterich, 2003):
1. complexity of H, c(H),
2. training set size, N,
3. generalization error, E, on new data.

- As N increases, E decreases.
- As c(H) increases, E first decreases and then increases.
- As c(H) increases, the training error decreases for some time and then stays constant (frequently at 0).

Notes on Overfitting
- Overfitting results in models that are more complex than necessary: after learning the knowledge, they "tend to learn noise".
- More complex models tend to have more complicated decision boundaries and tend to be more sensitive to noise, missing examples, ...
- The training error then no longer provides a good estimate of how well the tree will perform on previously unseen records.
- We need "new" ways of estimating errors.

Thoughts on Fitness Functions for Genetic Programming
1. Just use the squared training error (→ overfitting!).
2. Use the squared training error, but restrict model complexity.
3. Split the training set into a true training set and a validation set; use the squared error on the validation set as the fitness function.
4. Combine 1, 2, and 3 (many combinations exist).
5. Consider model complexity in the fitness function (see the sketch below):
   fitness(model) = error(model) + b*complexity(model)
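A minimal sketch of option 5, assuming an error measure and a complexity measure are already available (the value of the trade-off constant b and all names here are assumptions):

    def fitness(error, complexity, b=0.01):
        # option 5: penalize complexity to discourage overfitting;
        # b trades off goodness of fit against model size
        return error + b * complexity

    # e.g., for a decision tree, complexity could be its number of nodes (as defined earlier)
    print(fitness(error=0.12, complexity=35))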

