Christoph Eick: Learning Models to Predict and Classify
Noise and Model Complexity
Use the simpler model because it is:
Simpler to use (lower computational complexity)
Easier to train (needs fewer examples)
Less sensitive to noise
Easier to explain (more interpretable)
Likely to generalize better (lower variance; Occam’s razor)
Alternative Approach: Regression
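$X = \{x^t, r^t\}_{t=1}^{N}, \quad r^t \in \mathbb{R}$
$r^t = f(x^t) + \varepsilon$
$g(x) = w_1 x + w_0$
$g(x) = w_2 x^2 + w_1 x + w_0$
$E(g \mid X) = \frac{1}{N} \sum_{t=1}^{N} \left[ r^t - g(x^t) \right]^2$
$E(w_1, w_0 \mid X) = \frac{1}{N} \sum_{t=1}^{N} \left[ r^t - (w_1 x^t + w_0) \right]^2$
Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)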
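The error E(g | X) on this slide is the mean squared difference between the targets r^t and the predictions g(x^t). A minimal sketch of that computation follows; the function name `empirical_error`, the toy sample, and the two candidate models are made up for illustration and are not part of the lecture.

```python
def empirical_error(g, sample):
    """E(g | X) = (1/N) * sum over t of (r^t - g(x^t))^2."""
    return sum((r - g(x)) ** 2 for x, r in sample) / len(sample)


# Hypothetical toy sample X = {(x^t, r^t)}; values invented for illustration.
sample = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def linear(x):
    # g(x) = w1*x + w0 with w1 = 2, w0 = 1
    return 2.0 * x + 1.0


def quadratic(x):
    # g(x) = w2*x^2 + w1*x + w0 with w2 = 1, w1 = 0, w0 = 1
    return x ** 2 + 1.0


linear_error = empirical_error(linear, sample)
quadratic_error = empirical_error(quadratic, sample)
print(linear_error, quadratic_error)
```

On this sample the linear candidate matches every target, so its error is lower than that of the quadratic candidate.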
Finding Regression Coefficients
$X = \{x^t, r^t\}_{t=1}^{N}, \quad r^t \in \mathbb{R}$
$r^t = f(x^t) + \varepsilon$
$g(x) = w_1 x + w_0$
$E(g \mid X) = \frac{1}{N} \sum_{t=1}^{N} \left[ r^t - g(x^t) \right]^2$
$E(w_1, w_0 \mid X) = \frac{1}{N} \sum_{t=1}^{N} \left[ r^t - (w_1 x^t + w_0) \right]^2$
How to find w1 and w0?
Set dE/dw1 = 0 and dE/dw0 = 0, then solve the two resulting equations.
Group Homework!
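Setting the two derivatives to zero gives two linear equations in w1 and w0. Their standard solution is w1 = sum_t (x^t - mean(x)) (r^t - mean(r)) / sum_t (x^t - mean(x))^2 and w0 = mean(r) - w1 * mean(x). The sketch below computes exactly that; the function name `fit_line` and the toy sample are illustrative assumptions, not taken from the slides.

```python
def fit_line(sample):
    """Return (w1, w0) minimizing (1/N) * sum_t (r^t - (w1*x^t + w0))^2."""
    n = len(sample)
    x_mean = sum(x for x, _ in sample) / n
    r_mean = sum(r for _, r in sample) / n
    # Sums of centered products and centered squares.
    s_xr = sum((x - x_mean) * (r - r_mean) for x, r in sample)
    s_xx = sum((x - x_mean) ** 2 for x, _ in sample)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    w1 = s_xr / s_xx
    w0 = r_mean - w1 * x_mean
    return w1, w0


# Hypothetical sample lying exactly on r = 2x + 1.
sample = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w1, w0 = fit_line(sample)
print(w1, w0)
```

Because the toy sample has no noise, the fit recovers the slope and intercept used to generate it.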
Model Selection & Generalization
Learning is an ill-posed problem; the data alone is not sufficient to find a unique solution
Hence the need for inductive bias: assumptions about the hypothesis class H
Generalization: how well a model performs on new data
Overfitting: H is more complex than the underlying class C or function f
Underfitting: H is less complex than the underlying class C or function f
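The definitions above can be illustrated with a small experiment. The sketch below assumes a true function f(x) = 2x + 1 and uses alternating offsets of +0.5 and -0.5 as a deterministic stand-in for noise. It compares three hypothesis classes: a constant (less complex than f), a line (matching f), and a degree N-1 polynomial through every training point (more complex than f). All names and data are hypothetical.

```python
def mse(g, sample):
    """Mean squared error of predictor g on a list of (x, r) pairs."""
    return sum((r - g(x)) ** 2 for x, r in sample) / len(sample)


def fit_constant(sample):
    """Too simple: ignore x and always predict the mean of r."""
    r_mean = sum(r for _, r in sample) / len(sample)
    return lambda x: r_mean


def fit_line(sample):
    """Least-squares line g(x) = w1*x + w0."""
    n = len(sample)
    x_mean = sum(x for x, _ in sample) / n
    r_mean = sum(r for _, r in sample) / n
    s_xr = sum((x - x_mean) * (r - r_mean) for x, r in sample)
    s_xx = sum((x - x_mean) ** 2 for x, _ in sample)
    w1 = s_xr / s_xx
    w0 = r_mean - w1 * x_mean
    return lambda x: w1 * x + w0


def fit_interpolant(sample):
    """Too complex: Lagrange polynomial passing through every training point."""
    def g(x):
        total = 0.0
        for i, (xi, ri) in enumerate(sample):
            basis = 1.0
            for j, (xj, _) in enumerate(sample):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += ri * basis
        return total
    return g


def f(x):
    # Assumed true function for this illustration.
    return 2.0 * x + 1.0


# Training points at x = 0..7, validation points halfway between them.
train = [(float(t), f(float(t)) + (0.5 if t % 2 == 0 else -0.5)) for t in range(8)]
valid = [(t + 0.5, f(t + 0.5) + (-0.5 if t % 2 == 0 else 0.5)) for t in range(7)]

results = {}
for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("interpolant", fit_interpolant)]:
    g = fit(train)
    results[name] = (mse(g, train), mse(g, valid))
    print(f"{name:12s} train={results[name][0]:.3f} valid={results[name][1]:.3f}")
```

The interpolant reaches zero training error by fitting the offsets as well as the trend, yet on the held-out points it does worse than the line. The constant model does poorly on both sets.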
Underfitting and Overfitting
[Figure: regions labeled Underfitting and Overfitting]
Complexity of a decision tree := number of nodes
It uses
Triple Trade-Off
[Figure: overfitting]
Notes on Overfitting
Overfitting results in models that are more complex than necessary: after learning the knowledge in the data, they “tend to learn noise”
More complex models tend to have more complicated decision boundaries and tend to be more sensitive to noise, missing examples, etc.
Training error no longer provides a good estimate of how well the model will perform on previously unseen records
Need “new” ways of estimating errors
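One common way to estimate error on unseen records, which this slide does not name, is k-fold cross-validation: hold out part of the data, train on the rest, and measure error on the held-out part. A minimal sketch follows, assuming a deliberately trivial learner that predicts the mean target. The names `k_fold_error` and `fit_mean` and the toy data are hypothetical.

```python
def mse(g, sample):
    """Mean squared error of predictor g on a list of (x, r) pairs."""
    return sum((r - g(x)) ** 2 for x, r in sample) / len(sample)


def fit_mean(sample):
    """Trivial learner: always predict the mean target seen in training."""
    r_mean = sum(r for _, r in sample) / len(sample)
    return lambda x: r_mean


def k_fold_error(fit, sample, k):
    """Average held-out error over k folds (fold i takes every k-th record)."""
    folds = [sample[i::k] for i in range(k)]
    errors = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the one held out.
        training = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        g = fit(training)
        errors.append(mse(g, held_out))
    return sum(errors) / k


# Hypothetical records, invented for illustration.
sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0)]

training_error = mse(fit_mean(sample), sample)
held_out_error = k_fold_error(fit_mean, sample, k=2)
print(training_error, held_out_error)
```

On this contrived sample the error measured on the training records is lower than the held-out estimate, which is the gap the slide warns about.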