w4 PDF

Logistics and Reminders
Lab 1 due Friday August 19th 11.55pm

Moodle submission link is open
Enrolment key - csl603_201620171
Tomorrow follows Monday's schedule

Class 9.55-10.45am
Quiz 3
Thursday August 25th holiday
Quiz 3 will be held on Wednesday August 24th during
regular class times 11.45-12.35
Instance Based Learning
CSL465/603 - Machine Learning
Instance Based
Learning
CSL465/603 - Fall 2016
Narayanan C Krishnan
ckn@iitrpr.ac.in
Outline
K-nearest neighbor
Other forms of IBL
Nonparametric Methods
Radial Basis Functions
Key Ideas
Training: store all training examples (no explicit
learning)
Testing: compute the target function only locally, around the query instance
Advantages
Can learn very complex target function
Training is very fast
No loss of information
Disadvantages
Slow during testing
Easily fooled by irrelevant attributes
Example
K-Nearest Neighbor Learning

Just store all training examples $\{(x_i, y_i)\}_{i=1}^{N}$
Nearest neighbor
Given query instance $x$, first locate the nearest example $x_i$, then estimate $\hat{y} = y_i$
K-nearest neighbor
Given the query instance $x$,
take a vote among its $k$ nearest neighbors (if $y$ is discrete)
take the mean of the $y$ values of the $k$ nearest neighbors (if $y$ is real-valued): $\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_i$
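The two prediction rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance and a training set stored as a list of `(x, y)` pairs; the function names and the toy data are hypothetical and not part of the lecture.

```python
from collections import Counter
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def k_nearest(train, query, k):
    """Return the k (x, y) pairs whose x is closest to the query."""
    return sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]


def knn_classify(train, query, k):
    """Majority vote among the k nearest neighbors (discrete y)."""
    votes = Counter(y for _, y in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k):
    """Mean of the y values of the k nearest neighbors (real-valued y)."""
    neighbors = k_nearest(train, query, k)
    return sum(y for _, y in neighbors) / len(neighbors)


# Toy data (made up for illustration).
train_cls = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
             ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b")]
train_reg = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 2.0), ((10.0,), 10.0)]

print(knn_classify(train_cls, (0.5, 0.5), k=3))
print(knn_regress(train_reg, (1.1,), k=3))
```

Note that "training" is just keeping the list; all the work happens at query time, which is the source of the slow-testing disadvantage listed earlier.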
Distance Measures
Numeric features
Manhattan, Euclidean, $L^p$ norm
$L^p(x_1, x_2) = \left( \sum_{j=1}^{d} |x_{1j} - x_{2j}|^p \right)^{1/p}$
Normalized by: range, standard deviation.
Symbolic (categorical) features

Hamming/ overlap
Value difference measure (VDM):
$\delta(v_i, v_j) = \sum_{k=1}^{C} |P(c_k \mid v_i) - P(c_k \mid v_j)|$
In general - encode knowledge

Metric learning
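The distance measures on this slide can be sketched as follows. This is an illustrative sketch only: the function names, the optional `ranges` argument used for range normalization, and the estimate of $P(c \mid v)$ by simple counting are assumptions made for the example.

```python
def minkowski(a, b, p=2, ranges=None):
    """L^p distance; p=1 is Manhattan, p=2 is Euclidean.

    ranges (assumed helper): per-feature ranges used to normalize each
    coordinate difference before it enters the sum.
    """
    if ranges is None:
        ranges = [1.0] * len(a)
    total = sum((abs(ai - bi) / r) ** p for ai, bi, r in zip(a, b, ranges))
    return total ** (1.0 / p)


def hamming(a, b):
    """Overlap distance: number of symbolic features that differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)


def vdm(values, labels, v1, v2):
    """Value difference between two symbolic values of one feature.

    values/labels are the training column and its class labels.
    P(c | v) is estimated by counting; both v1 and v2 must occur in values.
    """
    def cond_prob(c, v):
        n_v = sum(1 for x in values if x == v)
        n_cv = sum(1 for x, y in zip(values, labels) if x == v and y == c)
        return n_cv / n_v

    return sum(abs(cond_prob(c, v1) - cond_prob(c, v2)) for c in set(labels))


print(minkowski((0, 0), (3, 4), p=1))
print(minkowski((0, 0), (3, 4), p=2))
print(hamming(("red", "small"), ("red", "large")))
print(vdm(["red", "red", "blue", "blue"], ["+", "+", "+", "-"], "red", "blue"))
```

Under VDM two symbolic values are close when they predict similar class distributions, which is one way of encoding knowledge about the task into the distance.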
Illustrating k-NN
Voronoi Diagram
Voronoi cell of $x$: all points in the input space closer to $x$ than to
any other instance in the training set
What is the target concept ?

Behavior in the limit

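$\epsilon^*(x)$: Error of optimal prediction
$\epsilon_{NN}(x)$: Error of nearest neighbor
Theorem: $\lim_{N \to \infty} \epsilon_{NN} \le 2\epsilon^*$
Cover and Hart, IEEE TIT 1967
Proof sketch: (2-class case)
Let $p_+$, $p_-$ be the probabilities that $x$ is positive or negative, and $p_{NN \in +}$, $p_{NN \in -}$ the probabilities that its nearest neighbor is positive or negative
$\epsilon_{NN} = p_+ \, p_{NN \in -} + p_- \, p_{NN \in +}$
$= p_+ (1 - p_{NN \in +}) + (1 - p_+) \, p_{NN \in +}$
$\lim_{N \to \infty} p_{NN \in +} = p_+$, $\lim_{N \to \infty} p_{NN \in -} = p_-$
$\lim_{N \to \infty} \epsilon_{NN} = p_+ (1 - p_+) + (1 - p_+) \, p_+ = 2\epsilon^* (1 - \epsilon^*) \le 2\epsilon^*$
$\lim_{N \to \infty}$ Nearest Neighbor = Gibbs Classifier
$\lim_{N \to \infty,\ k \to \infty,\ k/N \to 0}$ k Nearest Neighbor = Bayes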
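The bound in the 2-class proof sketch can be checked numerically for a few values of $p_+$. The snippet below is only an arithmetic illustration; the function names are made up, and it uses the standard fact that the optimal error at a point is $\min(p_+, 1 - p_+)$.

```python
def bayes_error(p_pos):
    """Error of the optimal prediction at a point with P(positive) = p_pos."""
    return min(p_pos, 1.0 - p_pos)


def nn_limit_error(p_pos):
    """Limiting nearest-neighbor error: p+(1 - p+) + (1 - p+)p+."""
    return 2.0 * p_pos * (1.0 - p_pos)


for p_pos in (0.5, 0.7, 0.9, 1.0):
    eps_star = bayes_error(p_pos)
    eps_nn = nn_limit_error(p_pos)
    # The theorem says the limiting 1-NN error is at most twice the optimal error.
    assert eps_nn <= 2.0 * eps_star + 1e-12
    print(p_pos, eps_star, eps_nn)
```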
Distance-Weighted k-NN
Simple refinement over k-NN
Might want to weight nearer neighbors more
heavily
$\hat{y} = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}$
Where
$w_i = \frac{1}{d(x, x_i)^2}$
And $d(x, x_i)$ is the distance between $x$ and $x_i$
Makes sense to use all the training examples
instead of just $k$
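A minimal sketch of the distance-weighted rule for real-valued targets, assuming Euclidean distance. The function name is hypothetical, and the handling of a query that coincides with a training point (where the weight would be infinite) is a choice made for this example rather than something the slide specifies.

```python
import math


def weighted_knn_regress(train, query, k=None):
    """Distance-weighted prediction with weights 1 / d(x, x_i)^2.

    train: list of (x, y) pairs. k=None uses all training examples.
    """
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    scored = sorted(((dist(x, query), y) for x, y in train), key=lambda t: t[0])
    if k is not None:
        scored = scored[:k]
    # Assumption: if the query matches stored examples exactly, return their mean y.
    exact = [y for d, y in scored if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d ** 2 for d, _ in scored]
    return sum(w * y for w, (_, y) in zip(weights, scored)) / sum(weights)


train = [((0.0,), 0.0), ((3.0,), 3.0)]
print(weighted_knn_regress(train, (1.0,)))
```

Because far-away examples receive very small weights, passing `k=None` (all examples) still gives a prediction dominated by the nearest points.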
Issues with k-NN (1)

Inductive bias
Classification of an instance will be most similar to the
classification of instances close to it
Distance computation depends on all attributes

Imagine instances described by 20 attributes, but only 2
are relevant to the target attribute.
Possible Solutions
Feature Selection
Filter Approach
Pre-select features individually using some measure
Wrapper approach
Experiment with different combinations of features using a
learner
Forward selection
Backward elimination
Feature Weighting
Stretch the $j$-th attribute by weight $z_j$, where $z_1, \dots, z_d$ is
chosen to minimize prediction error.
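The wrapper approach with forward selection can be sketched as below. This is one possible instantiation, not the lecture's procedure: it assumes leave-one-out 1-NN accuracy as the measure the wrapper optimizes, and the function names and toy data are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += (y[best_j] == y[i])
    return correct / len(X)


def forward_selection(X, y):
    """Greedily add the feature that most improves accuracy; stop when none helps."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, f = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break
        selected.append(f)
        remaining.remove(f)
        best_score = score
    return selected, best_score


# Toy data: feature 0 separates the classes, feature 1 is irrelevant.
X = [(0.0, 9.0), (0.1, 1.0), (0.2, 5.0), (1.0, 1.2), (1.1, 9.1), (1.2, 5.1)]
y = [0, 0, 0, 1, 1, 1]

print(loo_accuracy(X, y, [0, 1]))
print(forward_selection(X, y))
```

On this toy data the irrelevant attribute dominates the distance when both features are used, which is the failure mode described under "Distance computation depends on all attributes".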

Issues with k-NN (2)
Inductive bias
Distance computation depends on all attributes
Curse of dimensionality
Sensitive to dimensionality of the data
Low-dimensional intuitions do not apply in high dimensions
Curse of Dimensionality
Examples
Normal Distribution
Points on a hyper grid
Approximation of a sphere by a cube
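The sphere-versus-cube example can be made concrete with the standard formula for the volume of a $d$-dimensional ball of radius 1, $\pi^{d/2} / \Gamma(d/2 + 1)$, compared against the enclosing cube of side 2. The function name below is made up for this sketch.

```python
import math


def ball_to_cube_volume_ratio(d):
    """Volume of the unit-radius ball in d dimensions over the enclosing cube of side 2."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    cube = 2.0 ** d
    return ball / cube


for d in (1, 2, 3, 10, 20):
    print(d, ball_to_cube_volume_ratio(d))
```

The ratio shrinks rapidly with $d$: in high dimensions almost all of the cube's volume lies in its corners, outside the inscribed sphere, so a cube is a poor approximation of a sphere.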

Issues with k-NN (3)
Inductive bias
Distance computation depends on all attributes
Sensitive to dimensionality of the data
Low-dimensional intuitions do not apply in high dimensions
Computational cost
Distance computation while testing!
Reducing the Computational Cost

Efficient retrieval: k-D Trees (for lower dimensions)
Efficient Similarity computation
Use a cheap approximation to weed out most of the
instances
Use expensive measure on the remainder
Forming prototypes
Edited k-NN
Remove instances that do not affect the decision
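As an illustration of efficient retrieval, here is a small k-d tree with a nearest-neighbor query. The slide only names the data structure; the dictionary-based node layout, the median split, and the function names are assumptions made for this sketch.

```python
def build_kdtree(points, depth=0):
    """Recursively split on the median along one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to the query."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))
```

The pruning test is what saves work: whole subtrees are skipped when they cannot contain a closer point. The saving fades as the number of dimensions grows, which is why the slide restricts k-D trees to lower dimensions.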
Overfitting
What parameter of the model can indicate
overfitting?
Set the parameter through validation experiments
Remove noisy instances
Remove $x$ if all of $x$'s $k$ nearest neighbors are of a different
class
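The noise-removal rule can be sketched as a single pass over the training set. This is a minimal illustration with hypothetical names; it assumes squared Euclidean distance and judges every instance against the original, unedited training set.

```python
def remove_noisy(train, k):
    """Drop every (x, y) whose k nearest neighbors all carry a label different from y."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    kept = []
    for i, (x, y) in enumerate(train):
        others = [pair for j, pair in enumerate(train) if j != i]
        neighbors = sorted(others, key=lambda pair: sq_dist(pair[0], x))[:k]
        # Keep the instance if at least one neighbor agrees with its label.
        if any(label == y for _, label in neighbors):
            kept.append((x, y))
    return kept


# Toy 1-D data: one "b" instance sits inside the "a" cluster.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((1.5,), "b"),
         ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b")]

cleaned = remove_noisy(train, k=3)
print(len(train), len(cleaned))
```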
Nonparametric methods
Form of underlying distributions unknown
Still want to perform classification (or regression)
Nonparametric Density Estimation

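Given a training set $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$, drawn i.i.d. from $p(x)$
Divide data into bins of size $h$
Histogram: $\hat{p}(x) = \frac{\#\{x_i \text{ in the same bin as } x\}}{Nh}$
Naive estimator: $\hat{p}(x) = \frac{\#\{x - h/2 < x_i \le x + h/2\}}{Nh}$
$\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} w\left(\frac{x - x_i}{h}\right)$, where $w(u) = 1$ if $|u| < 1/2$ and $0$ otherwise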
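For one-dimensional data the histogram and naive estimators can be sketched as follows. The function names, the `origin` parameter that fixes where the bins start, and the sample values are assumptions made for this example.

```python
import math


def histogram_estimate(sample, x, h, origin=0.0):
    """Count the sample points that fall in the same bin as x, divided by N*h."""
    bin_index = math.floor((x - origin) / h)
    count = sum(1 for xi in sample if math.floor((xi - origin) / h) == bin_index)
    return count / (len(sample) * h)


def naive_estimate(sample, x, h):
    """Count the sample points in (x - h/2, x + h/2], divided by N*h."""
    count = sum(1 for xi in sample if x - h / 2 < xi <= x + h / 2)
    return count / (len(sample) * h)


sample = [1.0, 1.2, 1.4, 2.6, 3.1]
print(histogram_estimate(sample, 1.3, h=1.0), naive_estimate(sample, 1.3, h=1.0))
print(histogram_estimate(sample, 2.9, h=1.0), naive_estimate(sample, 2.9, h=1.0))
```

The histogram depends on where the bin boundaries fall, whereas the naive estimator centers the bin on the query point, so the two can disagree for the same `h`.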
Smoothing the estimate
Use a kernel function, e.g., radial basis function or
Gaussian kernel
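$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right)$
Parzen windows (Kernel estimator)
$\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)$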
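A one-dimensional sketch of the kernel estimator with a Gaussian kernel. The function names and the sample values are illustrative assumptions.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_estimate(sample, x, h):
    """Kernel density estimate: (1 / (N*h)) * sum of K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (len(sample) * h)


sample = [1.0, 1.2, 1.4, 2.6, 3.1]
for x in (0.0, 1.2, 2.0, 3.0):
    print(x, parzen_estimate(sample, x, h=0.5))
```

Every training point contributes to the estimate at every `x`, with a contribution that decays smoothly with distance, so the result has no jumps at bin edges. The width `h` controls how much smoothing is applied.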
K-Nearest Neighbor Estimator (1)

Instead of fixing bin width $h$ and counting the
number of instances, fix the number of instances
(neighbors) $k$ and check bin width
$\hat{p}(x) = \frac{k}{2 N d_k(x)}$
Where
$d_k(x)$ is the distance to the $k$th closest instance to $x$
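A one-dimensional sketch of the k-nearest-neighbor density estimate. The function name and sample values are made up for illustration, and the code assumes the $k$th nearest distance is nonzero.

```python
def knn_density_estimate(sample, x, k):
    """k-NN density estimate k / (2 * N * d_k(x)) for 1-D data."""
    distances = sorted(abs(x - xi) for xi in sample)
    d_k = distances[k - 1]  # distance to the k-th closest instance
    # Assumption: d_k > 0; duplicates of x in the sample would need special handling.
    return k / (2.0 * len(sample) * d_k)


sample = [1.0, 1.2, 1.4, 2.6, 3.1]
for x in (1.3, 2.0, 5.0):
    print(x, knn_density_estimate(sample, x, k=3))
```

Here the effective bin width $2 d_k(x)$ adapts to the data: it is narrow where instances are dense and wide where they are sparse.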
K-Nearest Neighbor Estimator (2)
Summary
Lazy learning
K-NN for classification
Issues with KNN and potential solutions
Density Estimation
Nonparametric methods