Institute of Geographical Information Systems
Image Analysis/Classification
Image Classification
Classification is the process of assigning pixels to the land cover classes or themes in remotely sensed data
A process of converting remotely sensed data into useful information
Image classification can be described as a process that automatically assigns a land cover class to every pixel of the image
An important part of image analysis is identifying groups of pixels that have similar spectral characteristics and to determine the various
Visual Interpretation
A human analyst attempts to classify features in an image using the elements of visual interpretation
Digital image classification
Uses the spectral information represented by the digital numbers in one or more spectral bands and attempts, using a computer, to classify
An image represents both spectral and information classes, which must be distinguished while performing classification
Spectral classes: groups of pixels that have nearly uniform spectral characteristics in all the spectral bands
Information classes: the various themes or objects that represent the
Image classification can also be defined as the process of matching the spectral classes in the data to the information classes of interest
All pixels in an image are assigned to particular classes or themes, resulting in a classified image that is essentially a thematic map of the original image
Conventional classification approaches operate on a per-pixel basis
Per-pixel methods examine each pixel of the image independently and assign it to a class based on its spectral characteristics
Supervised classifications
Unsupervised classifications
Information Classes and Spectral Classes
A broad information class may contain a number of spectral sub-classes with unique spectral variations
In the Forest class, spectral sub-classes may be due to variations in age, species, and density, or perhaps a result of shadowing or variations in scene illumination
Ideal classification results are achieved when one spectral class relates to one information class.
Likely results are obtained when more than one spectral class relates to one information class.
However, when spectral classes relate to more than one information
Likely Results
Problematic Results
Spectral Class     Information Classes
1  Water           Water, Shadow
2  Bare land       Bare land, Parking area
Classification
Multispectral classification may be performed using
algorithms based on parametric and nonparametric statistics
nonmetric methods (decision tree based methods)
The use of
supervised or unsupervised classification logic
The use of
hard or soft (fuzzy) classification logic
The use of
per-pixel or object-oriented classification logic
Parametric methods
assume normally distributed remote sensor data
Nonparametric methods
may be applied to remote sensor data that are not normally distributed
Nonmetric methods
The identity of some of the land-cover types is known a priori
The analyst selects training sites; the spectral characteristics of these known areas are used to train the classification algorithm for eventual land-cover mapping of the remainder of the image
Univariate and multivariate statistical parameters are calculated
Every pixel both within and outside the training sites is then
The identities of land-cover types within a scene are not generally known a priori because ground reference information is lacking or surface features within the scene are not well defined
The computer is instructed to group pixels with similar spectral characteristics into unique clusters according to some statistically determined criteria
The analyst then re-labels and combines the spectral clusters into information classes
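The re-labeling step can be sketched in a few lines of Python. The cluster numbers, class names, and the tiny cluster map below are made up for illustration; in practice the analyst fills in the lookup table after inspecting each cluster.

```python
# Hypothetical lookup table built by the analyst after inspecting each cluster.
# Several spectral clusters may be combined into one information class.
CLUSTER_TO_CLASS = {
    1: "water",
    2: "forest",     # e.g. sunlit forest
    3: "forest",     # e.g. shadowed forest
    4: "bare land",
}


def relabel(cluster_map, lookup, unknown="unlabeled"):
    """Replace each cluster number in a 2-D cluster map with its class name."""
    return [[lookup.get(c, unknown) for c in row] for row in cluster_map]


# A made-up 2 x 3 cluster map, as a clustering algorithm might produce.
cluster_map = [[1, 2, 3],
               [4, 3, 5]]
thematic_map = relabel(cluster_map, CLUSTER_TO_CLASS)
```

Cluster 5 has no entry in the lookup table, so it stays "unlabeled" until the analyst decides what it represents.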
Unsupervised Classification
Supervised and unsupervised classification algorithms typically use hard classification logic to produce a classification map
Conversely, it is also possible to use fuzzy logic, which takes into account the imprecise nature of the real world
Per-pixel vs. Object-oriented Classification
Per-pixel classification
Digital image classification based on processing the entire scene pixel by pixel
Object-oriented (object-based) classification
Techniques that allow the analyst to decompose the scene into many relatively homogeneous image objects (segments) using a multi-resolution image segmentation process
The various statistical characteristics of these homogeneous image objects in the scene are then subjected to traditional statistical or fuzzy logic classification
Land use
refers to what people do on the land surface (e.g., agriculture, commerce, settlement)
relates to the human activity or economic functions associated with a specific piece of land (e.g., residential area, industry)
Land cover
U.S. Geological Survey Land-Use/Land-Cover Classification System for Use with Remote Sensor Data
The system is designed to be driven primarily by the interpretation of remote sensor data obtained at various scales and resolutions
Anderson paper uploaded on LMS
Level I                      Level II
1. Urban or built-up land    11. Residential
                             12. Commercial and service
                             13. Industrial
                             14. Transportation, communications and utilities
                             15. Industrial and commercial complexes
                             16. Mixed urban or built-up land
                             17. Other urban or built-up land
2. Agricultural land         21. Cropland and pasture
                             22. Orchards, groves, vineyards, nurseries, and ornamental horticultural areas
                             23. Confined feeding operations
                             24. Other agricultural land
3. Rangeland                 31. Herbaceous rangeland
                             32. Shrub and brush rangeland
                             33. Mixed rangeland
4. Forest land               41. Deciduous forest land
                             42. Evergreen forest land
                             43. Mixed forest land
5. Water                     51. Streams and canals
                             52. Lakes
                             53. Reservoirs
                             54. Bays and estuaries
6. Wetland                   61. Forested wetland
                             62. Non-forested wetland
                             72. Beaches
                             73. Sandy area other than beaches
Level III
Level I: 1. Urban or built-up land
Level II: 11. Residential; 12. Commercial and service; 13. Industrial; 14. Transportation, communications and utilities; 15. Industrial and commercial complexes
Level III (under 11. Residential): Single Family; Multifamily; Apartments; Others
Nominal spatial resolution requirements as a function of the mapping requirements for Levels I to IV land-cover classes in the United States (based on Anderson et al., 1976). Note the dramatic increase in spatial resolution required to map Level II classes.
Reference/Readings
Chapters on Image Classification
Unsupervised Classification
Commonly referred to as clustering
Compared to supervised classification, unsupervised classification normally requires only a minimal amount of initial input from the analyst
This is because clustering does not normally require training data
A process that searches for natural groupings of the spectral properties of pixels
The clustering process results in a classification map consisting of m spectral classes (clusters)
The analyst then attempts a posteriori (after the fact) to assign or transform the spectral classes into thematic information classes of interest (e.g., forest, agriculture)
Some spectral clusters may be meaningless because they represent mixed
The analyst must understand the spectral characteristics of the terrain well
Two-step process
Iterative formation of clusters
Cluster means are used to assign pixels to a class based on distance
To classify an unknown pixel, the distance between the unknown pixel and the mean value of each cluster (class) is determined
The unknown pixel is assigned to the closest class
The minimum distance to means rule is relatively easy to apply and involves only the mean values, but it ignores within-class variance
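A minimal Python sketch of this assignment rule follows. The class names and mean vectors are made-up values for two bands, and Euclidean distance is assumed as the distance measure.

```python
import math


def nearest_class(pixel, class_means):
    """Return the name of the class whose mean vector is closest to the pixel.

    Uses Euclidean distance in spectral space; within-class variance is ignored.
    """
    return min(class_means, key=lambda name: math.dist(pixel, class_means[name]))


# Made-up mean vectors (band 1, band 2) for three clusters.
class_means = {
    "water": (20.0, 10.0),
    "forest": (40.0, 90.0),
    "soil": (120.0, 100.0),
}

label = nearest_class((35.0, 80.0), class_means)  # closest mean is "forest"
```

Because only the means are compared, a pixel near a compact class and a pixel near a widely spread class are treated the same way, which is the weakness noted above.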
Two widely used unsupervised classification methods
K-means
Iterative Self-Organizing Data Analysis Technique (ISODATA)
K-means
The main idea of k-means clustering is to specify the number of clusters to be generated (as a constraint) and to define a set of arbitrary pixels as cluster centroids, one for each cluster (since the number of clusters is known a priori)
The clustering starts with the fixed, a priori specified number of clusters to be created.
It calculates the distance (normally Euclidean) between the initial cluster centroids and each pixel of the image, and then each pixel in the image is
It iteratively calculates the new cluster means, and the pixel assignment is
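The loop described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package: the pixel values are made up, the first pixel of each group is used as a starting centroid, and the loop stops when no pixel changes cluster or after a fixed number of iterations (both stopping choices are assumptions).

```python
import math


def kmeans(pixels, initial_centroids, max_iterations=20):
    """Cluster pixel vectors with a basic k-means loop.

    pixels: list of equal-length tuples (one value per band).
    initial_centroids: list of k starting centroids (k is fixed in advance).
    Returns (centroids, labels); labels[i] is the cluster index of pixels[i].
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each pixel goes to the nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in pixels
        ]
        if new_labels == labels:  # no pixel changed cluster: converged
            break
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Made-up two-band pixels forming two obvious groups.
pixels = [(10, 12), (11, 10), (12, 11), (80, 82), (81, 79), (79, 80)]
centroids, labels = kmeans(pixels, initial_centroids=[(10, 12), (80, 82)])
```

The number of clusters is fixed by the length of `initial_centroids`, which mirrors the a priori constraint described above.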
The ISODATA algorithm is a modification of the k-means clustering algorithm, which includes
Merging of clusters
Splitting of a single cluster into two clusters
ISODATA is iterative because it makes a large number of passes through the remote sensing dataset until specified
Uses the minimum spectral distance formula to form clusters
Begins with either arbitrary cluster means or the means of an existing signature set
Each time the clustering repeats, the means of these clusters are shifted
The new cluster means are used for the next iteration
The process stops when
ISODATA algorithms normally require the analyst to specify the following criteria:
Maximum number of clusters (classes) to be identified by the algorithm
Minimum distance: cluster pairs that have a Euclidean distance less than this value will be merged into one cluster
Minimum members in a cluster (%): if a cluster contains less than the minimum percentage of members (percent of pixels in the cluster relative to the total number of pixels in the image), it is deleted and its members are assigned to an alternative cluster
Maximum standard deviation (σmax): when the standard deviation for a cluster exceeds the specified maximum standard deviation in any band,
(Iterative Self-Organizing Data Analysis Techniques)
The ISODATA algorithm starts with an initial threshold
It assigns each pixel to the cluster with the closest mean, and the process continues iteratively until a predefined threshold is reached
It additionally merges clusters that have fewer pixels than the threshold or whose centers are too close together
It also splits a large cluster if the cluster standard deviation exceeds a predefined value
After the required number of clusters is formed, the analyst, using a priori knowledge about the area or some reference data, assigns the information
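The merge, split, and delete tests can be illustrated with a short Python sketch. It only reports which adjustments the current cluster statistics would trigger and does not carry them out; the function name, the cluster statistics, and the parameter values are all made up, and real ISODATA implementations differ in the order and details of these tests.

```python
import math


def isodata_actions(clusters, total_pixels, min_distance, max_std, min_percent):
    """List the ISODATA-style adjustments suggested by the cluster statistics.

    clusters: dict name -> {"mean": tuple, "std": tuple, "count": int}
    Returns a list of (action, cluster name or pair) tuples; nothing is modified.
    """
    actions = []
    names = sorted(clusters)
    for name in names:
        c = clusters[name]
        # Delete clusters holding less than the minimum percentage of pixels.
        if 100.0 * c["count"] / total_pixels < min_percent:
            actions.append(("delete", name))
        # Split clusters whose standard deviation is too large in any band.
        elif max(c["std"]) > max_std:
            actions.append(("split", name))
    # Merge cluster pairs whose means are closer than the minimum distance.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(clusters[a]["mean"], clusters[b]["mean"]) < min_distance:
                actions.append(("merge", (a, b)))
    return actions


# Made-up statistics for four clusters in two bands.
clusters = {
    "c1": {"mean": (20.0, 15.0), "std": (2.0, 3.0), "count": 480},
    "c2": {"mean": (22.0, 16.0), "std": (2.5, 2.0), "count": 300},
    "c3": {"mean": (90.0, 70.0), "std": (14.0, 4.0), "count": 210},
    "c4": {"mean": (150.0, 160.0), "std": (1.0, 1.0), "count": 10},
}
actions = isodata_actions(clusters, total_pixels=1000,
                          min_distance=5.0, max_std=10.0, min_percent=2.0)
```

With these values, c3 is flagged for splitting (standard deviation 14 in band 1), c4 for deletion (1% of the pixels), and c1 and c2 for merging (their means are about 2.2 apart).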
Post Classification
In the post-classification phase, the analyst compares the spectral classes with reference data to identify them
Spectral reflectance curves can be used to identify the spectral classes
Defining the level of classification
Initialize from Statistics: generates arbitrary clusters from the file statistics for the entire .img file
Use Signature Means: uses only the selected signatures in the Signature Editor to generate the clusters
Convergence Threshold: the maximum percentage of pixels whose cluster assignments can go unchanged between iterations
Raster Attribute Editor > Edit > Column Properties > Up/Down > Display Width
Cluster Assignment to classes
Display both the original and the classified image in the viewer
Use opacity to check individual classes
Figure 1. Washington DC Mall data set.
Figure 2. ISODATA cluster map of the Washington DC Mall. A total of 15 clusters were identified in order to discern the spectral heterogeneity of information classes.
Reference/Reading
Chapter 12, Section 12.3, Campbell
Chapter 8, Section 8.3.2, Mather
Supervised Classification
Supervised classification can be defined as the process of using samples of known classes to assign the remaining unknown pixels within the image to these classes
Supervised classification exploits a priori knowledge about the data to identify which of the predefined information classes most closely resembles a classification unit (prior decision)
The analyst first selects training samples for all representative classes
These training samples are a collection of representative data points for which the class labels are known to the analyst
Statistical characteristics (based on spectral information) of these training samples are used to train the classifier, to identify and
Because the classification process is based on the training data, the training data must be homogeneous and represent the classes as closely as possible
Pixels at the boundaries of different information classes (pixels with mixed spectral characteristics) should not be selected as training pixels, to avoid between-class confusion
mixtures
The training samples should be well spread over the entire image to adequately represent the information classes
An image with high within-class and low between-class spectral variability makes it hard to select truly representative class
The training pixels should be sufficient in number; the general rule is to collect at least 10N to 100N pixels for each class, where N is the number of bands of the image being used
The brightness values of the training pixels of each desired class are used to calculate the mean, standard deviation, variance-covariance matrix, and correlation matrix
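As an illustration, the training statistics for one class can be computed as sketched below. The pixel values are made up, the function name is hypothetical, and the sample (n - 1) form of the variance and covariance is assumed. Only five pixels are used to keep the arithmetic easy to follow, far fewer than the 10N minimum recommended above.

```python
import math


def training_statistics(samples):
    """Compute per-band mean and standard deviation plus the covariance and
    correlation matrices for one class.

    samples: list of pixel vectors (one brightness value per band).
    Uses the sample (n - 1) form of the variance and covariance.
    """
    n = len(samples)
    bands = len(samples[0])
    mean = [sum(p[b] for p in samples) / n for b in range(bands)]
    cov = [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in samples) / (n - 1)
         for j in range(bands)]
        for i in range(bands)
    ]
    std = [math.sqrt(cov[b][b]) for b in range(bands)]
    corr = [[cov[i][j] / (std[i] * std[j]) for j in range(bands)]
            for i in range(bands)]
    return mean, std, cov, corr


# Made-up training pixels for one class in two bands.
forest = [(40, 90), (42, 94), (38, 86), (44, 98), (36, 82)]
mean, std, cov, corr = training_statistics(forest)
```

The diagonal of `cov` holds the per-band variances, and the off-diagonal entries show how strongly the two bands vary together for this class.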
Classification: Feature Selection
After obtaining the training statistics, a judgment is often made to determine the bands (channels) that are most effective in discriminating each class from all others
This process is commonly called feature selection
The goal is to delete from the analysis the bands that provide
Training stage
The analyst needs to have significant prior knowledge of the area and ground cover
Training sample or training signature collection
Feature selection
Classification stage: categorization of each pixel into the landscape class it most closely resembles
Accuracy assessment
Several widely adopted nonparametric classification algorithms include:
Minimum distance
Box classifier or parallelepiped
The most widely adopted parametric classification algorithm is maximum likelihood
These algorithms require that
are to be allocated are known; these estimates are derived from the training samples
Minimum Distance to Means Classification Algorithm
Assigns unclassified pixels to their nearest classes
The minimum distance to means decision rule is computationally simple and commonly used
It requires that the user provide the mean vector μck for each class c in each band k from the training data
To perform a minimum distance classification, a program must
For Point 1, an unknown, the shortest straight-line distance to the several means is to the class "soil". Point 1, then, is assigned to "soil".
Point 2 is slightly closer to "soil" but lies at the edge of the "urban" spread. Here, the classification seems ambiguous. By the minimum distance rule, it would go to "soil", but this may be erroneous ("urban" would have been a greater likelihood).
Point 3 is not near any of the class DN clusters, but is about equidistant between "urban", "water", "forest", and "heather". If
Statistically, the range is the simplest measure of variability for a set of data values
Range: the minimum and maximum values obtained from the training set are used to bound the class
Thus, a rectangular decision region is defined for each class
The range in all bands describes a multidimensional box or parallelepiped
Parallelepiped Classification Algorithm
Requires the analyst to provide
an estimate of the lowest and highest pixel values for each class in each band
or
a range in terms of standard deviation units on either side of the mean of each band
For each pixel to be classified, its values are checked to see whether they lie inside any of the parallelepipeds
If the pixel lies inside just one of the parallelepipeds, it is labeled as a member of the class represented by that parallelepiped
If a pixel does not lie inside any of the regions defined by the parallelepipeds, it is of an unknown type and is left unclassified
A pixel may lie inside two or more overlapping parallelepipeds, and the decision then becomes more complicated. The easiest way around this is to allocate the pixel to the first parallelepiped inside whose boundaries it falls
or
Find the distance between the doubtful pixel and the center point of each
classification
If a pixel value lies above the low threshold and below the high threshold for all n
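The box test can be sketched in Python as follows. The class names and limits are made up, the function names are hypothetical, overlaps are resolved with the "first matching box" option described above, and the limits are treated as inclusive (an assumption).

```python
def parallelepiped_classify(pixel, boxes, unclassified="unclassified"):
    """Return the first class whose box contains the pixel in every band.

    boxes: dict class name -> list of (low, high) limits, one pair per band.
    Overlaps are resolved by taking the first matching box (dict order).
    """
    for name, limits in boxes.items():
        if all(low <= value <= high for value, (low, high) in zip(pixel, limits)):
            return name
    return unclassified


def box_from_stats(means, stds, k=2.0):
    """Build box limits as mean +/- k standard deviations in each band."""
    return [(m - k * s, m + k * s) for m, s in zip(means, stds)]


boxes = {
    # Limits given directly as lowest and highest training values per band.
    "water": [(10, 30), (5, 20)],
    # Limits derived from made-up training statistics: (32, 48) and (78, 102).
    "forest": box_from_stats(means=(40.0, 90.0), stds=(4.0, 6.0)),
}
labels = [parallelepiped_classify(p, boxes)
          for p in [(20, 10), (45, 100), (200, 200)]]
```

The third pixel falls outside every box and is therefore left unclassified, as described above.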
The maximum likelihood decision rule is based on probability
The maximum likelihood procedure assumes that the training data statistics for each class in each band are normally distributed
The probability of a pixel belonging to each of a predefined set of m classes is calculated, and the pixel is then assigned to the class for which the probability is the highest
density functions (a probability density function (PDF), or density of a continuous random variable, is a function that describes the
The mean, variance, and covariance of each training class are needed to compute
[Figure: histogram of number of pixels against digital number (0 to 255) in Band 1, and a Band 1 versus Band 2 feature-space plot showing class overlap]
What happens when the probability density functions of two or more training classes overlap in feature space? For example, consider two hypothetical normally distributed probability density functions associated with forest and agriculture training data measured in bands 1 and 2. In this case, pixel X would be assigned to forest because the probability density of unknown measurement vector X is greater for forest than for agriculture.
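A minimal two-band sketch of the maximum likelihood rule is given below. It assumes equal prior probabilities for all classes and a bivariate normal density per class; the class means and covariance matrices are made-up numbers, and the function names are hypothetical. Log densities are compared instead of densities, which gives the same ranking.

```python
import math


def gaussian_log_density(x, mean, cov):
    """Log of the bivariate normal density at x for a 2 x 2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Squared Mahalanobis distance via the closed-form inverse of a 2 x 2 matrix.
    maha = (d * dx[0] ** 2 - (b + c) * dx[0] * dx[1] + a * dx[1] ** 2) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * maha


def maximum_likelihood_classify(x, classes):
    """Assign x to the class with the highest density (equal priors assumed)."""
    return max(classes, key=lambda name: gaussian_log_density(x, *classes[name]))


# Made-up training statistics: (mean vector, covariance matrix) per class.
classes = {
    "forest": ((40.0, 90.0), ((10.0, 20.0), (20.0, 60.0))),
    "agriculture": ((60.0, 110.0), ((25.0, 5.0), (5.0, 30.0))),
}
label = maximum_likelihood_classify((45.0, 97.0), classes)
```

Unlike the minimum distance rule, this decision takes the spread and orientation of each class into account through the covariance matrix.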
Training Samples Guidelines
Minimum number of samples = number of parameters to be estimated
Practical minimum = 10 to 100 x number of bands
Distribute training pixels for a spectral class across the image to better represent the variation of the class in the image.
Do not include pixels in training areas that are adjacent to borders of spectral classes (like field boundaries).
Do not include pixels for groups containing obvious mixtures (e.g. deep
selection
Signature Styles
Merge Signature of Same Class
Select the classes that need to be merged (shift-click Class #)
Press the merge selected signatures button
A new class of merged signatures is added at the end
Delete the previously selected signatures and assign a name to the merged signature
Signature Alarm
Evaluate Signatures Accuracy
Rows: classified data. Columns: reference data.
Classified Data   Agriculture  Agriculture  Forest_1  Forest_2  Urban_1  Urban_2  Row Total
Agriculture            62           0           0         0        0        0         62
Agriculture             0          45           0         0        0        0         45
Forest_1                0           0         493        15        0        0        508
Forest_2                0           0           7       402        0        0        409
Water                   0           0           0         0        0        0          0
Urban_1                 0           0           0         0       75        0         75
Urban_2                 0           0           0         0        0       81         81
Total
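The accuracy figures implied by such a contingency table can be computed as sketched below. The counts are taken from the table above with the empty Water row left out so that the matrix is square; the terms user's and producer's accuracy are the usual names for the row-wise and column-wise ratios and are not used in the text above.

```python
# Rows are classified data, columns are reference data, in the order:
# Agriculture, Agriculture, Forest_1, Forest_2, Urban_1, Urban_2.
matrix = [
    [62, 0, 0, 0, 0, 0],
    [0, 45, 0, 0, 0, 0],
    [0, 0, 493, 15, 0, 0],
    [0, 0, 7, 402, 0, 0],
    [0, 0, 0, 0, 75, 0],
    [0, 0, 0, 0, 0, 81],
]

n = len(matrix)
total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(n))  # diagonal = agreement

# Overall accuracy: share of all pixels on the diagonal.
overall_accuracy = correct / total

# User's accuracy: diagonal value divided by the row total.
users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]

# Producer's accuracy: diagonal value divided by the column total.
producers = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(n)]
```

The only confusion in this table is between Forest_1 and Forest_2, so those two classes are the only ones with user's or producer's accuracy below 1.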
Signatures Accuracy
Which one will be a good signature?
[Figure: example signatures labeled good and bad]
Signatures Accuracy-Histogram
Original and classified image
Class thematic map
Washington DC Mall data set, distance to mean method
Maximum likelihood method
Supervised vs. Unsupervised
Supervised: Select training fields > Edit/evaluate signatures > Classify image > Evaluate classification
Unsupervised: Run clustering algorithm > Identify classes > Edit/evaluate signatures > Evaluate classification
Let the spectral classes for an image be represented by
$\omega_i,\ i = 1, \ldots, M$
To determine the class or category to which a pixel vector $x$ belongs, the probability is to be calculated
$p(\omega_i \mid x),\ i = 1, \ldots, M$
$p(\omega_i \mid x)$ gives the likelihood that the correct class is $\omega_i$ for a pixel at position $x$.
Classification is performed according to
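A common way to write such a decision rule is the maximum a posteriori form sketched below, with $\omega_i$ denoting the $i$-th spectral class. This is the standard textbook form and is an assumption here; the class-conditional density $p(x \mid \omega_i)$ and the prior $p(\omega_i)$ are not defined in the text above and are introduced only for illustration.

```latex
x \in \omega_i \quad \text{if} \quad p(\omega_i \mid x) > p(\omega_j \mid x) \quad \text{for all } j \neq i,
\qquad \text{where} \qquad
p(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, p(\omega_i)}{p(x)}
```

Because $p(x)$ is the same for every class, comparing $p(x \mid \omega_i)\, p(\omega_i)$ across classes gives the same assignment.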