
SUPPORT VECTOR MACHINES AND ASSOCIATIVE CLASSIFIERS
SVM: Support Vector Machines
A new classification method for both linear and
nonlinear data
It uses a nonlinear mapping to transform the original
training data into a higher dimension
With the new dimension, it searches for the linear
optimal separating hyperplane (i.e., decision
boundary)
With an appropriate nonlinear mapping to a
sufficiently high dimension, data from two classes
can always be separated by a hyperplane
SVM finds this hyperplane using support vectors
(essential training tuples) and margins (defined by
the support vectors)
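
A minimal end-to-end sketch of these ideas using scikit-learn (the library choice and the toy data are assumptions; the slides name no tools). The RBF kernel plays the role of the nonlinear mapping, and the fitted model exposes its support vectors:

import numpy as np
from sklearn.svm import SVC

# Toy 2-D data that is not linearly separable in the input space
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0], [2, 2], [-1, -1]])
y = np.array([0, 0, 1, 1, 0, 0])

# The RBF kernel implicitly maps tuples into a higher-dimensional space
clf = SVC(kernel="rbf", C=1.0).fit(X, y)

print(clf.support_vectors_)        # the essential training tuples
print(clf.predict([[0.5, 0.9]]))   # class of a new tuple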
SVM: History and Applications
Vapnik and colleagues (1992): groundwork from Vapnik & Chervonenkis' statistical learning theory in the 1960s
Features: training can be slow but accuracy is high
owing to their ability to model complex nonlinear
decision boundaries (margin maximization)
Used both for classification and prediction
Applications:
handwritten digit recognition, object recognition,
speaker identification, benchmarking time-series
prediction tests

SVM: General Philosophy

Data set D: {(X1, y1), (X2, y2), ..., (Xn, yn)}
Xi: training tuples
yi: class labels
Linearly Separable

Best separating line: minimum classification error

Plane / Hyperplane
Maximal Margin Hyperplane (MMH)

Larger margin: more accurate at classifying future samples

Gives largest separation among classes

Sides of margin are parallel to hyperplane

Margin: distance from MMH to the closest training tuple of either class
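
The margin itself has a closed form in the standard linear-SVM formulation (not written out on these slides, but consistent with them): for a separating hyperplane \(\mathbf{w}\cdot\mathbf{X} + b = 0\) with margin sides \(\mathbf{w}\cdot\mathbf{X} + b = \pm 1\),

\[ \text{maximal margin} = \frac{2}{\lVert \mathbf{w} \rVert}, \]

so maximizing the margin amounts to minimizing \(\lVert \mathbf{w} \rVert\).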

SVM: Linearly Separable High-Dimensional Data

Classification using SVM


Maximal Margin Hyperplane: d(X^T) = \sum_{i=1}^{l} y_i \alpha_i (X_i \cdot X^T) + b_0

Xi: support vectors (l of them); X^T: test data

Class depends on sign of result
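
A minimal sketch of this decision function in NumPy (the support vectors, multipliers alpha_i, and b0 below are hypothetical stand-ins for values a QP solver would produce):

import numpy as np

def svm_decision(x_test, support_vectors, y, alpha, b0):
    # d(X^T) = sum_i y_i * alpha_i * (X_i . X^T) + b_0; class = sign of d
    d = np.sum(y * alpha * (support_vectors @ x_test)) + b0
    return 1 if d >= 0 else -1

# Hypothetical trained values (in practice, output of quadratic programming)
sv = np.array([[1.0, 2.0], [2.0, 0.5], [0.5, 1.5]])  # support vectors X_i
y = np.array([1.0, -1.0, 1.0])                       # labels y_i in {-1, +1}
alpha = np.array([0.4, 0.7, 0.3])                    # Lagrange multipliers
b0 = -0.2
print(svm_decision(np.array([1.5, 1.0]), sv, y, alpha, b0))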

SVM vs. Neural Network


SVM
Relatively new concept
Deterministic algorithm
Nice generalization properties
Hard to learn: learned in batch mode using quadratic programming techniques

Using kernels, can learn very complex functions (see the kernel sketch after this comparison)


Neural Network
Relatively old
Nondeterministic algorithm
Generalizes well but doesn't have a strong mathematical foundation
Can easily be learned in incremental fashion
To learn complex functions, use a multilayer perceptron
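
The kernel point above, sketched minimally in NumPy: a kernel such as the RBF replaces the dot product X_i · X_j with a nonlinear similarity, so the same linear machinery can fit complex boundaries (the gamma value is an arbitrary assumption):

import numpy as np

def rbf_kernel(X1, X2, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2), computed for all pairs of rows
    sq = np.sum(X1**2, axis=1)[:, None] + np.sum(X2**2, axis=1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-gamma * sq)

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
print(rbf_kernel(X, X))  # 3x3 Gram matrix; diagonal entries are 1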

Associative Classification

Associative classification

Association rules are generated and analyzed for use in classification

Search for strong associations between frequent patterns (conjunctions of attribute-value pairs) and class labels

Classification: based on evaluating a set of rules of the form
p1 ^ p2 ^ ... ^ pl -> Aclass = C (conf, sup)
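
A minimal sketch of how such a rule could be represented and matched in Python (the Rule class and attribute names are hypothetical, not from any particular system):

from dataclasses import dataclass

@dataclass
class Rule:
    # p1 ^ p2 ^ ... ^ pl -> class = C, with its confidence and support
    conditions: dict      # cond-set: attribute -> required value
    label: str            # predicted class C
    confidence: float
    support: float

    def matches(self, tup):
        return all(tup.get(a) == v for a, v in self.conditions.items())

r = Rule({"age": "youth", "credit": "fair"}, "no", confidence=0.9, support=0.12)
print(r.matches({"age": "youth", "credit": "fair", "income": "high"}))  # True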

Effectiveness

It explores highly confident associations among multiple attributes

Overcomes some constraints introduced by decision-tree induction, which considers only one attribute at a time

Associative classification has been found to be more accurate than some traditional classification methods, such as C4.5

Typical Associative Classification Methods

CBA (Classification By Association)

Mine possible rules in the form of
Cond-set (a set of attribute-value pairs) -> class label

Iterative Approach similar to Apriori

Build classifier: organize rules according to decreasing precedence based on confidence and then support
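
A sketch of CBA's precedence ordering using the hypothetical Rule class above (first matching rule wins; the default label is an assumption):

def build_classifier(rules):
    # Decreasing precedence: confidence first, then support
    return sorted(rules, key=lambda r: (r.confidence, r.support), reverse=True)

def classify(ordered_rules, tup, default="unknown"):
    for r in ordered_rules:
        if r.matches(tup):
            return r.label
    return default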

CMAR (Classification based on Multiple Association Rules)

Uses a variant of FP-growth

Mined rules are stored and retrieved using a tree structure

Rule pruning is carried out: specialized rules with low confidence are pruned

Classification: statistical analysis on multiple rules
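
A sketch of just the pruning step, with the hypothetical Rule class above: a more general rule (its cond-set a subset of another's) prunes the more specialized rule when its confidence is at least as high:

def prune_specialized(rules):
    kept = []
    for r2 in rules:
        dominated = any(
            r1 is not r2
            and set(r1.conditions.items()) <= set(r2.conditions.items())
            and r1.confidence >= r2.confidence
            for r1 in rules
        )
        if not dominated:
            kept.append(r2)
    return kept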

CPAR (Classification based on Predictive Association Rules)

Generation of predictive rules (FOIL-like analysis)

High efficiency, accuracy similar to CMAR

Uses the best-k rules of each group to predict the class label
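
A sketch of the best-k prediction step with the hypothetical Rule class above (k = 5 and the use of confidence as a stand-in for CPAR's expected accuracy are assumptions):

from collections import defaultdict

def predict_best_k(rules, tup, k=5):
    # Group matching rules by class, average the strength of the best k per class
    by_class = defaultdict(list)
    for r in rules:
        if r.matches(tup):
            by_class[r.label].append(r.confidence)  # proxy for expected accuracy
    scores = {
        label: sum(sorted(confs, reverse=True)[:k]) / min(k, len(confs))
        for label, confs in by_class.items()
    }
    return max(scores, key=scores.get) if scores else None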
