
Completely Lazy Learning

ABSTRACT:
Local classifiers are sometimes called lazy learners because they do not
train a classifier until presented with a test sample. However, such
methods are generally not completely lazy because the neighborhood
size k (or other locality parameter) is usually chosen by cross validation
on the training set, which can require significant preprocessing and risks
overfitting. We propose a simple alternative to cross validation of the
neighborhood size that requires no preprocessing: instead of committing
to one neighborhood size, average the discriminants for multiple
neighborhoods. We show that this forms an expected estimated posterior
that minimizes the expected Bregman loss with respect to the
uncertainty about the neighborhood choice. We analyze this approach
for six standard and state-of-the-art local classifiers, including
discriminative adaptive metric kNN (DANN), a local support vector
machine (SVM-KNN), hyperplane distance nearest neighbor (HKNN),
and a new local Bayesian quadratic discriminant analysis (local BDA).
The empirical effectiveness of this technique versus cross validation is
confirmed with experiments on seven benchmark data sets, showing that
similar classification performance can be attained without any training.
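As a concrete reading of the Bregman-loss claim (the notation here is illustrative, not taken from the paper): let P_k(y | x) denote the discriminant (estimated class posterior) computed from the k nearest neighbors of a test point x, and let w_k be the weight placed on neighborhood size k, with the weights summing to one. Averaging the discriminants gives

P(y | x) = \sum_{k} w_k \, P_k(y | x), \qquad \sum_{k} w_k = 1,

which is the expectation of the estimated posterior under the uncertainty about k. Because the mean of a random quantity minimizes any expected Bregman divergence to it, this averaged discriminant minimizes the expected Bregman loss with respect to the uncertainty about the neighborhood choice.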
Index Terms:
Lazy learning, Bayesian estimation, cross validation, local learning, quadratic discriminant analysis.

EXISTING SYSTEM:

 Local classifiers are sometimes called lazy learners because they do not train a classifier until presented with a test sample.
 In the existing system, the neighborhood size k (or other locality parameter) is chosen by cross validation on the training set, which can require significant preprocessing and risks overfitting (a baseline sketch of this selection step appears after this list).
 The disadvantages of lazy learning include the large space requirement to store the entire training dataset.
 Particularly noisy training data enlarges the case base unnecessarily, because no abstraction is made during the training phase.
 Another disadvantage is that lazy learning methods are usually slower to evaluate, though this is coupled with a faster training phase.
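For contrast with the proposed approach, the following is a minimal sketch of the cross-validation step that the existing approach performs before any test sample arrives. The class and method names are illustrative (not from the paper); plain kNN with Euclidean distance and majority vote stands in for the local classifier, and class labels are assumed to be 0, 1, ..., C-1.

import java.util.Arrays;

// Illustrative leave-one-out cross validation of the neighborhood size k
// for a plain kNN classifier (Euclidean distance, majority vote).
public class CrossValidatedK {

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    // Majority vote among the k nearest neighbors of query, skipping index exclude.
    static int knnVote(double[][] X, int[] y, double[] query, int k, int exclude) {
        Integer[] idx = new Integer[X.length];
        for (int i = 0; i < X.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(dist(X[a], query), dist(X[b], query)));
        int numClasses = Arrays.stream(y).max().getAsInt() + 1;  // labels assumed 0..C-1
        int[] votes = new int[numClasses];
        int used = 0;
        for (int i = 0; i < idx.length && used < k; i++) {
            if (idx[i] == exclude) continue;   // leave the held-out sample out
            votes[y[idx[i]]]++;
            used++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) if (votes[c] > votes[best]) best = c;
        return best;
    }

    // Pick the k with the fewest leave-one-out errors on the training set.
    static int selectK(double[][] X, int[] y, int[] candidateKs) {
        int bestK = candidateKs[0], fewestErrors = Integer.MAX_VALUE;
        for (int k : candidateKs) {
            int errors = 0;
            for (int i = 0; i < X.length; i++)
                if (knnVote(X, y, X[i], k, i) != y[i]) errors++;
            if (errors < fewestErrors) { fewestErrors = errors; bestK = k; }
        }
        return bestK;
    }
}

This leave-one-out search over candidate values of k is exactly the preprocessing that the proposed method avoids.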
PROPOSED SYSTEM:
 In the proposed system, lazy learning can simultaneously solve multiple problems and deal successfully with changes in the problem domain.
 We propose a simple alternative to cross validation of the neighborhood size that requires no preprocessing: instead of committing to one neighborhood size, we average the discriminants over multiple neighborhoods.
 We analyze this approach for six standard and state-of-the-art local classifiers, including discriminative adaptive metric kNN (DANN), a local support vector machine (SVM-KNN), hyperplane distance nearest neighbor (HKNN), and a new local Bayesian quadratic discriminant analysis (local BDA).
 A Bayes estimator (or Bayes rule) is an estimator or decision rule that minimizes the posterior expected value of a loss function; the averaging above can be read in this way (see the sketch after this list).
 We show that this Bayesian neighborhood approach achieves error rates similar to those given by a cross-validated neighborhood size with six different local classifiers, but without the preprocessing.
 Finally, we show that similar classification performance can be attained without any training.
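The following is a minimal sketch of the proposed alternative, assuming a plain kNN discriminant in place of the six classifiers studied in the paper. Class and method names are illustrative, uniform weights over the candidate neighborhood sizes are assumed, and class labels are assumed to be 0, 1, ..., C-1. Nothing is computed until a test sample arrives, so no training or cross validation is needed.

import java.util.Arrays;

// Illustrative "completely lazy" kNN: average the estimated class posteriors
// over several neighborhood sizes instead of cross-validating a single k.
public class AveragedNeighborhoodKnn {

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    // Estimated posterior from the k nearest neighbors of query: the fraction of
    // each class among those neighbors. Assumes k <= number of training samples.
    static double[] posteriorForK(double[][] X, int[] y, int numClasses,
                                  double[] query, int k) {
        Integer[] idx = new Integer[X.length];
        for (int i = 0; i < X.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(dist(X[a], query), dist(X[b], query)));
        double[] p = new double[numClasses];
        for (int i = 0; i < k; i++) p[y[idx[i]]] += 1.0 / k;
        return p;
    }

    // Average the posteriors over all candidate neighborhood sizes (uniform
    // weights), then predict the class with the largest averaged discriminant.
    static int classify(double[][] X, int[] y, int numClasses,
                        double[] query, int[] candidateKs) {
        double[] avg = new double[numClasses];
        for (int k : candidateKs) {
            double[] p = posteriorForK(X, y, numClasses, query, k);
            for (int c = 0; c < numClasses; c++) avg[c] += p[c] / candidateKs.length;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) if (avg[c] > avg[best]) best = c;
        return best;
    }
}

The same averaging applies to any local classifier that produces a per-class discriminant: swapping posteriorForK for a DANN, SVM-KNN, HKNN, or local BDA discriminant would leave classify unchanged.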
SYSTEM SPECIFICATION

HARDWARE CONFIGURATION

• Hard disk : 40 GB
• RAM : 512 MB
• Processor : Pentium IV
• Monitor : 17'' Color Monitor

SOFTWARE CONFIGURATION

 Front End : Java
 Operating System : Windows XP
 Back End : SQL Server
REFERENCES:
• D. Aha, Lazy Learning. Springer, 1997.
• H. Zhang, A.C. Berg, M. Maire, and J. Malik, “SVM-KNN: Discriminative
Nearest Neighbor Classification for Visual Category Recognition,” Proc.
IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 2126-
2136, 2006.
• M.R. Gupta, R. Gray, and R. Olshen, “Nonparametric Supervised Learning
by Linear Interpolation with Maximum Entropy,” IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 28, no. 5, pp. 766-781, May 2006.
• W. Lam, C. Keung, and D. Liu, “Discovering Useful Concept Prototypes for
Classification Based on Filtering and Abstraction,” IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1075-1090, Aug.
2002.
• T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical
Learning. Springer-Verlag, 2001.
• C. Böhm, S. Berchtold, and D. Keim, “Searching in High-Dimensional Spaces: Index Structures for Improving the Performance of Multimedia Databases,” ACM Computing Surveys, vol. 33, no. 3, pp. 322-373, Sept. 2001.
• D. Cantone, A. Ferro, A. Pulvirenti, D. Reforgiato, and D. Shasha, “Antipole
Indexing to Support Range Search and K-Nearest Neighbor on Metric
Spaces,” IEEE Trans. Knowledge and Data Eng., vol. 17, no. 4, pp. 535-550,
Apr. 2005.
