Definition
Classification
Assigns input data to one or more of C pre-specified classes based on the extracted features.
Description
An alternative to classification, used when a conceptual or structural description of the input pattern is desired.
Recognition
The ability to classify and, in many cases, even to identify.
Preprocessing
Filtering and transformation of raw input to minimize noise and to extract reasonably reliable features.
Postprocessing
Refining the classification or recognition results by utilizing properties of the pattern classes
Noise
undesirable components in data
distortions or errors in raw input data (preprocessing can increase the distortion)
errors in feature extraction, errors in training data
Result: more classification errors
Searching for Noise Robust Features
Imagine how you would tell apples from oranges; such tasks are usually intuitive
But intuitive features can be very powerful in many applications
Still, the algorithm is rather hard to generalize, i.e., it depends on the training data
As practice, it is recommended to design one's own decision-based pattern recognizer before implementing the main PR algorithm.
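A decision-based recognizer of the kind recommended above can be sketched in a few lines. This is only an illustration: the two features (`orangeness`, `roughness`), their [0, 1] scaling, and the 0.5 thresholds are hypothetical choices, not values from the slides.

```python
def classify_fruit(orangeness, roughness):
    """Hand-written decision rules on two hypothetical features in [0, 1]."""
    # Each cue casts one vote for "orange"; thresholds are illustrative guesses.
    votes_for_orange = int(orangeness > 0.5) + int(roughness > 0.5)
    if votes_for_orange == 2:
        return "orange"
    if votes_for_orange == 0:
        return "apple"
    # The cues disagree, so the rule set declines to decide.
    return "undecided"

# Made-up measurements: (orangeness, roughness)
for sample in [(0.9, 0.8), (0.1, 0.2), (0.9, 0.1)]:
    print(sample, classify_fruit(*sample))
```

Writing the rules by hand makes the training-data dependency visible: every threshold encodes an assumption about the fruit that was actually observed.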
[Figure: mappings among the class membership (description) space C with classes i, j, k; the pattern space P with patterns p1 to p4; and the observation (measurement) space F with measurements m1 to m3.]
Preprocessing
Usually the first step after acquiring the raw data on which pattern recognition is performed.
Usually involves:
Noise reduction; filtering
Adjusting or subtracting the average value of the data (in many cases, including normalization)
Segmentation of the region (object) of interest from the acquired raw data
Processing of the raw data and mapping the results into different representation domains
Various transforms can be used, such as wavelet and fast Fourier transforms
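Two of the steps listed above, noise reduction and mean adjustment with normalization, can be sketched as follows. The moving-average filter, the window size, and the sample values are assumptions made for illustration; the slides do not prescribe a particular filter.

```python
import statistics

def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average (simple noise reduction)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)   # the window shrinks at the edges
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def standardize(signal):
    """Subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        return [0.0 for _ in signal]          # constant signal: nothing to scale
    return [(v - mean) / std for v in signal]

raw = [1.0, 1.2, 5.0, 1.1, 0.9, 1.0]          # made-up samples; 5.0 acts as a noise spike
clean = standardize(moving_average(raw))
print(clean)
```

Note that smoothing also spreads the spike into its neighbours, which is one way preprocessing itself can add distortion.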
Preprocessing segmentation
[Figure: region segmentation example. A region tree (Root with regions R1 to R13) is shown alongside an object tree with the nodes Body, TV, Jacket, Hair, Face, and Torso.]
[Figure: block diagram with the labels observation environment; feature extraction process; X (feature vector), X (string), P (structure); classification; description.]
Some Terminologies
Classifier
Partitions the feature space into class-labeled decision regions.
Classes may be separable or inseparable
If inseparable, what compromise is possible and acceptable?
A decision region
represents the corresponding class in feature space.
Discriminant function
computes the class-likelihood value of a feature (vector) with respect to a certain class.
Classification Rule
Classification on features can be done by comparing a set of discriminant functions.
Decision rule: assign $x$ to class $m$ (region $R_m$) if $g_m(x) > g_i(x)$ for all $i = 1, 2, \ldots, c$ with $i \neq m$.
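The decision rule, picking the class whose discriminant function gives the largest value, can be sketched as below. The linear form of each discriminant and the specific weights are assumptions for illustration; the rule itself works for any set of discriminant functions.

```python
def linear_discriminant(weights, bias):
    """Build g(x) = w . x + b (the linear form is an illustrative assumption)."""
    def g(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return g

# Hypothetical discriminants for c = 3 classes in a 2-D feature space.
discriminants = [
    linear_discriminant([1.0, 0.0], 0.0),     # g_1
    linear_discriminant([0.0, 1.0], 0.0),     # g_2
    linear_discriminant([-1.0, -1.0], 0.0),   # g_3
]

def classify(x, gs=discriminants):
    """Return the 1-based class index m whose g_m(x) is largest."""
    scores = [g(x) for g in gs]
    # On an exact tie, index() returns the first maximum; a tie means x lies
    # on a decision boundary, where the rule leaves the class undefined.
    return scores.index(max(scores)) + 1

print(classify([2.0, 1.0]), classify([0.5, 3.0]), classify([-2.0, -1.0]))
```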
Classification Rule
The decision boundary divides the decision regions in feature space and lies where $g_k(x) = g_l(x)$.
[Figure: decision regions R1 to R4 and their decision boundaries for linear, quadratic, and arbitrary discriminant functions.]
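For two linear discriminants of the assumed form g(x) = w1*x1 + w2*x2 + b, setting g_k(x) = g_l(x) gives (w_k - w_l) . x + (b_k - b_l) = 0, a straight line in a 2-D feature space. The sketch below solves that equation for x2; the function name and the example weights are hypothetical.

```python
def boundary_x2(wk, bk, wl, bl, x1):
    """Solve g_k(x) = g_l(x) for x2 at a given x1, for 2-D linear discriminants."""
    dw1 = wk[0] - wl[0]
    dw2 = wk[1] - wl[1]
    db = bk - bl
    if dw2 == 0:
        raise ValueError("boundary is vertical or undefined; solve for x1 instead")
    return -(dw1 * x1 + db) / dw2

# Hypothetical discriminants: g_k(x) = x1 + 2*x2, g_l(x) = 3*x1 + x2 - 1
x2 = boundary_x2([1.0, 2.0], 0.0, [3.0, 1.0], -1.0, x1=2.0)
g_k = 1.0 * 2.0 + 2.0 * x2
g_l = 3.0 * 2.0 + 1.0 * x2 - 1.0
print(x2, g_k, g_l)   # both discriminants agree on the boundary
```

A quadratic or arbitrary discriminant would replace the straight line with a curved boundary, but the defining condition g_k(x) = g_l(x) stays the same.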
Classification Rule
Which decision boundary is best: linear, quadratic, or arbitrary?
Classification Rule
Generalization problem
For robust classification performance on test data, the decision boundary needs to be formed in a robust way.
So the training samples should represent the nature of the underlying data well.
How many training samples are sufficient?
The more, the better, but collecting too many is not practical.
Experimentally or intuitively, we have to decide the number under constraints such as a small amount of available data.
The classifiers should be sufficiently trained! (Not always possible!)
There are some tricks to compensate for a shortage of training data, but sufficient training data is better!
More insight into the task makes for better pattern recognizers!
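The effect of training-set size on test performance can be explored with a small experiment. Everything here is an assumption made for illustration: the two synthetic 1-D Gaussian classes, the nearest-mean classifier, and the sample sizes. It is a sketch for experimenting, not a result from the slides.

```python
import random

def make_samples(n, rng):
    """Draw n labeled 1-D samples per class from two synthetic Gaussian classes."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n)]
    data += [(rng.gauss(4.0, 1.0), 1) for _ in range(n)]
    return data

def train_nearest_mean(train):
    """Estimate one mean per class; the means are the whole model."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    return means

def accuracy(means, test):
    """Fraction of test samples assigned to the class with the nearest mean."""
    correct = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test)

rng = random.Random(0)
test_set = make_samples(1000, rng)            # held-out data, never used for training
for n_train in (2, 20, 200):
    model = train_nearest_mean(make_samples(n_train, rng))
    print(n_train, accuracy(model, test_set))
```

Rerunning with different seeds shows how much the estimated boundary, and with it the test accuracy, can vary when only a handful of training samples is available.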
Summary
Relationship of pattern recognition to other areas
Fuzzy sets
Structural Modeling
Formal languages, ...
Summary
Pattern recognition applications
Image preprocessing, segmentation and analysis, coding
Seismic/weather/hurricane analysis and forecasting
Radar, Sonar signal classification/analysis
Face, Gesture recognition
Speech/audio recognition/understanding; speaker and language recognition
Fingerprint, DNA identification
Character (letter or number) recognition
Handwriting analysis; Electronic Signature for e-commerce
Human bio-signals such as ECG and EEG signal analysis/understanding
Medical diagnosis (tumor, mammogram, germ, virus, cancer, ...)
Others to be mentioned
Terms
The authors state on p. 12 that although the pattern classification techniques presented in this book cannot substitute for domain knowledge, they can be helpful in making the feature values less sensitive to noise.