
Logistics and Reminders

Lab 1 due Friday August 19th 11.55pm


Moodle submission link is open
Enrolment key - csl603_201620171

Tomorrow follows Monday's schedule


Class 9.55-10.45am

Quiz 3
Thursday, August 25th is a holiday
Quiz 3 will be held on Wednesday, August 24th during the regular class time, 11.45-12.35

Instance Based Learning

CSL465/603 - Machine Learning

Instance Based
Learning
CSL465/603 - Fall 2016
Narayanan C Krishnan
ckn@iitrpr.ac.in

Outline
K-nearest neighbor
Other forms of IBL
Nonparametric Methods
Radial Basis Functions


Key Ideas
Training: store all training examples (no explicit learning)
Testing: compute the target function only locally
Advantages
Can learn very complex target functions
Training is very fast
No loss of information

Disadvantages
Slow during testing
Easily fooled by irrelevant attributes


Example


K-Nearest Neighbor Learning


Just store all training examples $\{(x_i, y_i)\}_{i=1}^{N}$

Nearest neighbor
Given query instance $x$, first locate the nearest example $x_i$, then estimate $\hat{y} = y_i$

K-nearest neighbor
Given the query instance $x$,
take a vote among its $k$ nearest neighbors (if $y$ is discrete)
take the mean of the $y$ values of its $k$ nearest neighbors (if $y$ is real-valued):
$\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_i$
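
A minimal Python sketch of the two prediction rules on this slide, assuming Euclidean distance and training data held as a list of (x, y) pairs. The function names and the toy data are illustrative and do not come from the course material.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest(train, x, k):
    """Return the k training pairs (x_i, y_i) closest to the query x."""
    return sorted(train, key=lambda pair: euclidean(pair[0], x))[:k]

def knn_classify(train, x, k):
    """Majority vote among the k nearest neighbors (discrete y)."""
    votes = Counter(y for _, y in k_nearest(train, x, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, x, k):
    """Mean of the y values of the k nearest neighbors (real-valued y)."""
    neighbors = k_nearest(train, x, k)
    return sum(y for _, y in neighbors) / len(neighbors)

# Toy data, made up for illustration.
labeled = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
valued = [((0.0,), 1.0), ((1.0,), 3.0), ((5.0,), 10.0)]

print(knn_classify(labeled, (0.2, 0.1), k=3))
print(knn_regress(valued, (0.4,), k=2))
```

Note that all the work happens at query time: "training" is just keeping the list, which is why testing is the slow part.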


Distance Measures
Numeric features
Manhattan, Euclidean, $L_p$ norm
$L_p(x_1, x_2) = \left( \sum_{d=1}^{D} |x_{1d} - x_{2d}|^p \right)^{1/p}$

Normalized by: range, standard deviation.
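
As a rough illustration of the numeric case, the sketch below computes the L_p distance and rescales features by their range before comparing them. The helper names are hypothetical, and range normalization is only one of the two options the slide lists.

```python
def lp_distance(a, b, p):
    """L_p (Minkowski) distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

def range_normalize(rows):
    """Rescale each feature to [0, 1] using its observed range."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

# Without normalization the second feature (larger scale) dominates the distance.
rows = [(1.0, 100.0), (3.0, 300.0), (2.0, 200.0)]
scaled = range_normalize(rows)
print(lp_distance(rows[0], rows[1], 2), lp_distance(scaled[0], scaled[1], 2))
```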

Symbolic (categorical) features


Hamming/overlap
Value difference measure (VDM):
$\delta(v_i, v_j) = \sum_{k=1}^{C} |P(c_k \mid v_i) - P(c_k \mid v_j)|$
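
A small sketch of the value difference measure for one categorical feature, assuming the class-conditional probabilities P(c | v) are estimated from simple co-occurrence counts. The function names and the toy colour data are made up for illustration.

```python
from collections import Counter, defaultdict

def class_conditionals(values, labels):
    """Estimate P(c | v) from counts of feature value v occurring with class c."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    return {v: {c: n / sum(cnt.values()) for c, n in cnt.items()}
            for v, cnt in counts.items()}

def vdm(v1, v2, cond, classes):
    """Sum over classes of |P(c | v1) - P(c | v2)|."""
    return sum(abs(cond[v1].get(c, 0.0) - cond[v2].get(c, 0.0)) for c in classes)

colors = ["red", "red", "blue", "blue", "green", "green"]
labels = ["yes", "yes", "no", "no", "yes", "no"]
cond = class_conditionals(colors, labels)
print(vdm("red", "blue", cond, ["yes", "no"]))
print(vdm("red", "green", cond, ["yes", "no"]))
```

Two values are close under VDM when they predict the classes in similar proportions, so "red" ends up nearer to "green" than to "blue" in this toy data.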

In general - encode knowledge


Metric learning

Illustrating k-NN


Voronoi Diagram
Voronoi cell of x: all points in the input space closer to x than to any other instance in the training set

What is the target concept?



Behavior in the limit


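$\epsilon^*(x)$: Error of optimal prediction
$\epsilon_{NN}(x)$: Error of nearest neighbor
Theorem: $\lim_{N \to \infty} \epsilon_{NN} \le 2\epsilon^*$

Cover and Hart, IEEE TIT 1967

Proof sketch: (2-class case)

$\epsilon_{NN} = p_+ \, p_{NN \in -} + p_- \, p_{NN \in +}$
$= p_+ (1 - p_{NN \in +}) + (1 - p_+) \, p_{NN \in +}$
$\lim_{N \to \infty} p_{NN \in +} = p_+$, $\lim_{N \to \infty} p_{NN \in -} = p_-$
$\lim_{N \to \infty} \epsilon_{NN} = p_+ (1 - p_+) + (1 - p_+) \, p_+ = 2\epsilon^*(1 - \epsilon^*) \le 2\epsilon^*$
$\lim_{N \to \infty}$ Nearest Neighbor = Gibbs Classifier
$\lim_{N \to \infty,\ k \to \infty,\ k/N \to 0}$ k Nearest Neighbor = Bayes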


Distance-Weighted k-NN
Simple refinement over k-NN
Might want to weight nearer neighbors more
heavily
$\hat{y} = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}$
Where
$w_i = \frac{1}{d(x, x_i)^2}$
And $d(x, x_i)$ is the distance between $x$ and $x_i$
Makes sense to use all the training examples instead of just $k$
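
The weighting scheme can be sketched as follows, assuming Euclidean distance and a real-valued target. The function name is hypothetical, and returning the stored target on an exact match is one common way to avoid dividing by zero, not something the slide specifies.

```python
import math

def weighted_knn_regress(train, x, k=None):
    """Distance-weighted prediction with w_i = 1 / d(x, x_i)^2.

    k=None uses every training example instead of only the k nearest.
    """
    scored = sorted((math.dist(xi, x), yi) for xi, yi in train)
    if k is not None:
        scored = scored[:k]
    # An exact match has zero distance; return its target to avoid dividing by zero.
    for d, yi in scored:
        if d == 0.0:
            return yi
    weights = [1.0 / d ** 2 for d, _ in scored]
    return sum(w * yi for w, (_, yi) in zip(weights, scored)) / sum(weights)

train = [((0.0,), 0.0), ((2.0,), 10.0)]
print(weighted_knn_regress(train, (0.5,)))
```

Because far-away examples receive very small weights, using the whole training set changes the prediction only slightly, at the cost of touching every instance per query.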

Issues with k-NN (1)


Inductive bias
Classification of an instance will be most similar to the
classification of instances close to it

Distance computation depends on all attributes


Imagine instances described by 20 attributes, but only 2
are relevant to the target attribute.


Possible Solutions
Feature Selection
Filter Approach
Pre-select features individually using some measure

Wrapper approach
Experiment with different combinations of features using a
learner
Forward selection
Backward elimination

Feature Weighting

Stretch the $d$-th attribute by weight $z_d$, where $z_1, \ldots, z_D$ is chosen to minimize prediction error.
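
One possible reading of the wrapper approach with forward selection is sketched below. It assumes the learner is 1-NN and that prediction error is measured by leave-one-out error on the training set; both choices, along with the function names and toy data, are assumptions made for illustration.

```python
import math

def loo_error(rows, labels, features):
    """Leave-one-out 1-NN error rate using only the given feature indices."""
    mistakes = 0
    for i, xi in enumerate(rows):
        best_j = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: math.dist([xi[f] for f in features],
                                    [rows[j][f] for f in features]),
        )
        mistakes += labels[best_j] != labels[i]
    return mistakes / len(rows)

def forward_selection(rows, labels):
    """Greedily add the feature that most reduces LOO error; stop when none helps."""
    selected, remaining = [], list(range(len(rows[0])))
    best_err = float("inf")
    while remaining:
        err, f = min((loo_error(rows, labels, selected + [f]), f) for f in remaining)
        if err >= best_err:
            break
        selected.append(f)
        remaining.remove(f)
        best_err = err
    return selected, best_err

# Feature 0 separates the classes; feature 1 is large-scale noise.
rows = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 1.2), (1.1, 8.8), (1.2, 5.1)]
labels = [0, 0, 0, 1, 1, 1]
print(forward_selection(rows, labels))
```

Backward elimination would start from all features and greedily remove the one whose removal lowers the error most.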


Issues with k-NN (2)


Inductive bias
Classification of an instance will be most similar to the
classification of instances close to it

Distance computation depends on all attributes


Imagine instances described by 20 attributes, but only 2
are relevant to the target attribute.

Curse of dimensionality
Sensitive to dimensionality of the data
Low dimension intuitions do not apply in high dimensions


Curse of Dimensionality
Examples
Normal Distribution
Points on a hyper grid
Approximation of a sphere by a cube
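
Two of these examples are easy to check numerically. The sketch below, with hypothetical helper names, compares the volume of a unit-radius ball with its bounding cube and counts the cells of a hyper grid as the dimension grows.

```python
import math

def ball_to_cube_ratio(dim):
    """Volume of the unit-radius ball divided by that of its bounding cube [-1, 1]^dim."""
    ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    return ball / 2.0 ** dim

def grid_points(bins_per_axis, dim):
    """Number of cells in a hyper grid with the same resolution on every axis."""
    return bins_per_axis ** dim

for dim in (1, 2, 3, 10, 20):
    print(dim, ball_to_cube_ratio(dim), grid_points(10, dim))
```

The ratio shrinks rapidly toward zero, so in high dimensions almost all of the cube's volume lies in its corners, and the number of grid cells that need data grows exponentially.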


Issues with k-NN (3)


Inductive bias
Classification of an instance will be most similar to the
classification of instances close to it

Distance computation depends on all attributes


Imagine instances described by 20 attributes, but only 2
are relevant to the target attribute.

Curse of dimensionality
Sensitive to dimensionality of the data
Low dimension intuitions do not apply in high dimensions

Computational cost
Distance computation while testing!

Reducing the Computational Cost


Efficient retrieval: k-D Trees (for lower dimensions)
Efficient Similarity computation
Use a cheap approximation to weed out most of the
instances
Use expensive measure on the remainder

Forming prototypes
Edited k-NN
Remove instances that do not affect the decision
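
To make the efficient-retrieval idea concrete, here is a compact k-d tree sketch in plain Python that answers single nearest-neighbor queries. It assumes Euclidean distance and median splits that cycle through the axes; the dictionary-based node layout and function names are illustrative choices.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split points at the median along one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return (distance, point) of the stored point closest to query."""
    if node is None:
        return best
    d = math.dist(node["point"], query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best match so far.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(nearest(tree, (9, 2)))
```

The pruning test is what saves work in low dimensions; in high dimensions the splitting plane is almost always closer than the best match, so most of the tree is visited anyway.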


Overfitting
What parameter of the model can indicate
overfitting?
Set the parameter through validation experiments

Remove noisy instances
Remove x if all of x's k nearest neighbors belong to a different class than x
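
The noise-removal rule might be sketched like this, assuming Euclidean distance and a single pass in which every instance is judged against the original, unedited training set. The function name and toy data are hypothetical.

```python
import math

def remove_noisy(train, k):
    """Drop (x, y) when none of x's k nearest neighbors shares the label y."""
    kept = []
    for i, (xi, yi) in enumerate(train):
        others = [pair for j, pair in enumerate(train) if j != i]
        neighbors = sorted(others, key=lambda pair: math.dist(pair[0], xi))[:k]
        if any(yn == yi for _, yn in neighbors):
            kept.append((xi, yi))
    return kept

# The lone "b" at 0.15 sits inside the "a" cluster and is treated as noise.
train = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"), ((0.15,), "b"),
         ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b")]
print(remove_noisy(train, k=3))
```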


Nonparametric methods
Form of underlying distributions unknown
Still want to perform classification (or regression)

Nonparametric Density Estimation


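Given a training set $\{x_i\}_{i=1}^{N}$ drawn i.i.d. from an unknown density $p(x)$

Divide data into bins of size $h$
Histogram: $\hat{p}(x) = \frac{\#\{x_i \text{ in the same bin as } x\}}{Nh}$

Naive estimator: $\hat{p}(x) = \frac{\#\{x - h/2 < x_i \le x + h/2\}}{Nh}$

Equivalently, $\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} w\left(\frac{x - x_i}{h}\right)$, where $w(u) = 1$ if $|u| < 1/2$ and $0$ otherwise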
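
Both estimators fit in a few lines for one-dimensional data. In this sketch the bin origin of the histogram is an assumed parameter (the slide does not fix one), and the function names and sample values are made up.

```python
import math

def histogram_estimate(x, data, h, origin=0.0):
    """Count of samples in the same bin as x, divided by N*h."""
    def bin_of(v):
        return math.floor((v - origin) / h)
    same_bin = sum(1 for xi in data if bin_of(xi) == bin_of(x))
    return same_bin / (len(data) * h)

def naive_estimate(x, data, h):
    """Count of samples within h/2 of x, divided by N*h (a bin centered on x)."""
    inside = sum(1 for xi in data if abs((x - xi) / h) < 0.5)
    return inside / (len(data) * h)

data = [0.5, 0.7, 1.2, 1.9, 2.1, 3.4, 3.6, 3.7]
print(histogram_estimate(0.6, data, h=1.0))
print(naive_estimate(1.1, data, h=1.0))
```

The naive estimator removes the dependence on the bin origin, but its estimate still jumps whenever a sample enters or leaves the window.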


Nonparametric Density Estimation (1)


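Nonparametric Density Estimation (2)

Smoothing the estimate
Use a kernel function, e.g., radial basis function or Gaussian kernel
$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right)$
Parzen windows (Kernel estimator)
$\hat{p}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)$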
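
A short sketch of the Parzen window estimate with a Gaussian kernel, for one-dimensional data. The function names and sample values are illustrative only.

```python
import math

def gaussian_kernel(u):
    """Standard normal density K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def parzen_estimate(x, data, h):
    """Kernel density estimate: (1 / (N*h)) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

data = [0.5, 0.7, 1.2]
for x in (0.0, 0.6, 1.0, 2.0):
    print(x, parzen_estimate(x, data, h=0.5))
```

Every sample contributes a smooth bump of width h, so the estimate is continuous; a larger h gives a smoother but flatter curve.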


Nonparametric Density Estimation (3)


K-Nearest Neighbor Estimator (1)


Instead of fixing the bin width $h$ and counting the number of instances, fix the number of instances (neighbors) $k$ and check the bin width

$\hat{p}(x) = \frac{k}{2 N d_k(x)}$
Where
$d_k(x)$ is the distance to the $k$th closest instance to $x$
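
For one-dimensional data the k-NN estimator reduces to a couple of lines. This sketch uses a hypothetical function name and does not guard against d_k(x) = 0, which occurs when x coincides with at least k samples.

```python
def knn_density(x, data, k):
    """One-dimensional k-NN density estimate: k / (2 * N * d_k(x))."""
    d_k = sorted(abs(x - xi) for xi in data)[k - 1]  # distance to the k-th closest sample
    return k / (2.0 * len(data) * d_k)

data = [0.0, 1.0, 2.0, 4.0]
print(knn_density(1.5, data, k=2))
print(knn_density(1.5, data, k=3))
```

The window adapts to the data: it is narrow where samples are dense and wide where they are sparse.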


K-Nearest Neighbor Estimator (2)


Summary
Lazy learning
K-NN for classification
Issues with k-NN and potential solutions
Curse of dimensionality

Density Estimation
Nonparametric methods

