
ADVANCED IMAGE PROCESSING
Sanjay K. GHOSH
Professor of Civil Engg
IIT ROORKEE
email: scangfce@iitr.ernet.in
scanghosh@yahoo.co.in

IMAGE PROCESSING AND ANALYSIS
The act of examining images for the purpose of identifying objects and judging their significance.
The image analyst studies the remotely sensed data and, through a logical process of
detection,
identification,
classification, and
measurement,
evaluates the significance of physical and cultural objects, their patterns and spatial relationships.

Representation of Data
Photograph
Image - the data is in digital form, where the area is subdivided into equal-sized picture elements, or pixels; the information is collected in a narrow wavelength range referred to as a BAND (see the array sketch below).
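As a minimal illustration of this representation, the sketch below builds a hypothetical multi-band digital image as a NumPy array; the 4 bands, 100 x 100 pixel grid and 8-bit digital numbers are assumptions chosen only for demonstration.

```python
import numpy as np

# Hypothetical 4-band image of 100 x 100 pixels, 8-bit digital numbers (DN).
# Shape convention assumed here: (rows, cols, bands).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 100, 4), dtype=np.uint8)

rows, cols, bands = image.shape
print(f"{rows} x {cols} pixels, {bands} spectral bands")

# The spectral response of a single pixel is its vector of DNs across the bands.
pixel_spectrum = image[42, 17, :]
print("DN values of pixel (42, 17):", pixel_spectrum)
```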

FCC OF ROORKEE AREA IRS LISS III DATA

IKONOS DATA OF ROORKEE AREA

PROCESSING & ANALYSIS


INTERPRETATION
Visual - Human based
Digital - Computer assisted

COMPARISON

VISUAL ANALYSIS
Single band or as FCC
Subjective
Slow
Analyst Bias

DIGITAL ANALYSIS
Multi Image
Objective
Fast with many options
Free of Analyst bias

Elements of Image Interpretation

Primary Elements: Black and White Tone, Color, Stereoscopic Parallax
Spatial Arrangement of Tone & Color: Size, Shape, Texture, Pattern
Based on Analysis of Primary Elements: Height, Shadow
Contextual Elements: Site, Association

DIGITAL IMAGE PROCESSING

Image classification and analysis: digitally identify and classify pixels
supervised
unsupervised

Image Classification and Analysis


Spectral pattern recognition
Digital image classification uses the spectral
information represented by the digital numbers in one
or more spectral bands, and attempts to classify each
individual pixel based on this spectral information
The resulting classified image comprises a mosaic of pixels, each of which belongs to a particular theme, and is essentially a thematic "map" of the original image.

Common classification procedures


Supervised classification
Unsupervised classification

Supervised classification
Training areas: the analyst identifies homogeneous, representative samples of the different surface cover types to determine the numerical "signatures" of each class.
Once the computer has determined the signatures for each class, each pixel in the image is compared to these signatures and labeled as the class it most closely "resembles" digitally, as sketched below.

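The following sketch illustrates this idea with a minimal minimum-distance-to-means classifier: signatures are taken as the mean spectral response of analyst-supplied training areas, and each pixel is labeled with the nearest signature. The decision rule, array shapes and toy data are assumptions for illustration; the slides do not prescribe a particular supervised classifier.

```python
import numpy as np

def train_signatures(image, training_mask, class_ids):
    """Mean spectral response (signature) per class from analyst-digitized training areas."""
    return {c: image[training_mask == c].mean(axis=0) for c in class_ids}

def classify_min_distance(image, signatures):
    """Label each pixel with the class whose signature it most closely resembles."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    class_ids = list(signatures)
    means = np.stack([signatures[c] for c in class_ids])          # (n_classes, bands)
    dists = np.linalg.norm(pixels[:, None, :] - means[None], axis=2)
    labels = np.array(class_ids)[dists.argmin(axis=1)]
    return labels.reshape(rows, cols)

# Toy data: 4-band image and a training mask (0 = unlabelled, 1..3 = classes).
rng = np.random.default_rng(1)
img = rng.random((50, 50, 4))
mask = np.zeros((50, 50), dtype=int)
mask[:5, :5], mask[20:25, 20:25], mask[40:45, 40:45] = 1, 2, 3

sigs = train_signatures(img, mask, class_ids=[1, 2, 3])
classified = classify_min_distance(img, sigs)
```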
Unsupervised classification
The reverse of supervised classification: spectral classes are grouped first and then matched to information classes.
The analyst specifies how many groups or clusters to form.
It is iterative in nature and not completely without human intervention (see the clustering sketch below).
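A minimal sketch of such iterative spectral clustering is given below, using k-means as the grouping algorithm; the choice of k-means, the number of clusters and the fixed iteration count are assumptions, since the slides only state that the analyst specifies the number of clusters and that the process is iterative.

```python
import numpy as np

def kmeans_cluster(image, n_clusters=5, n_iter=20, seed=0):
    """Group pixels into spectral clusters; the analyst later matches clusters to information classes."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_clusters, replace=False)]
    for _ in range(n_iter):                                    # iterative refinement
        d = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
        labels = d.argmin(axis=1)                              # nearest cluster centre
        for k in range(n_clusters):                            # update cluster means
            if np.any(labels == k):
                centers[k] = pixels[labels == k].mean(axis=0)
    return labels.reshape(rows, cols), centers
```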

Comparison

PROBLEM OF MIXED PIXEL


With coarse resolution data, the occurrence of mixed pixels was intense, and it was thought that this problem would reduce with an increase in spatial resolution.
However, the problem has remained the same in magnitude despite the increase in spatial resolution.
With coarse resolution, the chances of two or more classes contributing to a mixed pixel were high, but the number of such pixels was small.
With improved spatial resolution, the number of classes within a pixel has reduced, but the number of mixed pixels has increased.
In a way, the problem of mixed pixels has remained; only its direction of impact has changed.

PROBLEM OF MIXED PIXEL

Consider a simple land area consisting of two classes, namely, water and land (Fig.1).
Two pixels belong to only one class each, i.e., pixel 1 has water and pixel 4 has land; these are called pure pixels.
Pixels 2 and 3 have varying compositions of land and water, and are called mixed pixels.
A mixed pixel displays a composite spectral response that may be dissimilar to the spectral response of each of its component classes, and therefore the pixel may not be allocated to any of its constituent classes.
Therefore, an error is likely to occur in the classification of the image.
Conventional statistics-based image classification (also known as hard classification), which assumes that pixels contain pure information, would assign the pixel to one and only one class.
Thus pixel 2 may be classified as water and pixel 3 as land (Fig.1b).
Depending upon the proportion of mixed information, this may result in a loss of pertinent information present in a pixel and subsequently in the image.

Fig.1: (a) Actual land cover across pixels 1-4 (water and land); (b) hard classification output (each pixel assigned wholly to water or land); (c) fraction images for (i) water and (ii) land.

Mixed pixels have to be accommodated in the classification process in some way, by adopting sub-pixel or soft classification methods based on certain heuristic and logical reasoning.
The output from these methods is a set of class membership values for each pixel, known as soft, fuzzy or sub-pixel classification outputs, which represent probability, fraction or proportion images (Fig.1c).
These soft outputs relate strongly to the actual extents of the classes on the ground.

Soft classification methods


Spectral mixture analysis.
Fuzzy set theory.
Artificial neural network.

Linear Mixture Model (LMM)

Widely used for the decomposition of the class proportions of mixed pixels.
The method assumes that the spectral response of a pixel is a linear sum of the mean spectral responses of the various land cover classes weighted by their relative proportions on the ground.
The model can be mathematically expressed as

$x_i = \sum_{j=1}^{c} f_j M_{ij} + e_i$

where $M_{ij}$ is the end-member spectrum representing the mean class spectral response of the jth land cover class in the ith band,
$f_j$ is the proportion of the jth land cover class in a pixel, and
$e_i$ is the error term for the ith band, which expresses the difference between the observed spectral response and the model-derived spectral response of the pixel.


Linear Mixture Model (LMM)

It may be noted that the class proportions of a mixed pixel are non-negative and that the sum of all the class proportions is equal to one, which can be expressed as

$\sum_{j=1}^{c} f_j = 1$ and $f_j \ge 0$ for all j land cover classes.

The end-member spectra matrix M represents the spectral responses of the classes, and may be calculated by taking the average spectral response of the pure pixels of each class, estimated from laboratory and field measurements of the classes, or obtained by performing principal component analysis. A least-squares sketch of this unmixing is given below.
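The sketch below unmixes one pixel with an assumed end-member matrix M (rows = bands, columns = classes) by ordinary least squares, and then checks the logical constraints discussed above; the numerical values and the simple clip-and-renormalize step are illustrative assumptions, not the only way of enforcing the constraints.

```python
import numpy as np

# End-member matrix M: rows = bands (i), columns = classes (j).
# These mean class spectra are made-up values for illustration.
M = np.array([[0.10, 0.45, 0.30],
              [0.15, 0.50, 0.25],
              [0.60, 0.30, 0.20],
              [0.70, 0.25, 0.35]])

# Observed spectral response x of one mixed pixel (one value per band).
x = np.array([0.28, 0.31, 0.38, 0.44])

# Unconstrained least-squares solution of x = M f + e.
f, residual, *_ = np.linalg.lstsq(M, x, rcond=None)
e = x - M @ f                                   # per-band error term

# Logical checks: proportions should lie in [0, 1] and sum to ~1.
print("proportions:", f, "sum:", f.sum())

# A crude way to impose the constraints afterwards (illustrative only):
f_constrained = np.clip(f, 0, None)
f_constrained /= f_constrained.sum()
```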

When applied to remote sensing of semi-vegetated areas, the linear mixture model approach assumes that end-members can frequently be recognized from the image itself ('image end-members').
Setting aside theoretical considerations, such as the fact that the model assumes single scattering, it is the difficulty in locating end-member spectra that presents the main difficulty to the user.
Logic indicates that an end-member proportion cannot be negative and, if the model is properly specified, that the sum of the proportions of end-members at a given point must be less than or equal to unity.


It is possible to build these constraints into the linear mixture model so that the results derived for every individual pixel satisfy these logical requirements.
It is, however, more practical to consider the unconstrained model, which simply computes, from a library of end-member spectra, the end-member proportions at a given point.
If the model fits perfectly, then there should be no end-member proportions less than zero or greater than unity, and the sum of the proportions at a given point should not exceed 1.0.

Furthermore, the root mean squared error should not show any systematic pattern.
Only by using an unconstrained model is it possible to check that these conditions are met.
One constraint imposed by linear unmixing is that the number of end-members cannot exceed the number of spectral bands available.
Even so, the selection of end-members, which is crucial to the successful application of the linear mixing model, is fraught with difficulties.


Fuzzy c-Means (FCM)


FCM is an iterative clustering method employed to
partition pixels of remote sensing images into different
class membership values.
The key is to represent the similarity that a pixel shares
with each cluster with a function (membership function)
whose value lies between zero and one.
Each pixel will have membership in every cluster.
Memberships close to unity signify a high degree of
similarity between the pixel and that cluster.
The net effect of such clustering is to produce fuzzy c-partitions of the given data.
A fuzzy c-partition of the data is one that characterizes the membership of each pixel in all the clusters by a membership function that ranges from zero to one, as in the sketch below.
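A single FCM iteration under these definitions might look like the sketch below; the Euclidean distance and the fuzzifier m = 2 are assumptions, and a full implementation would repeat the step until the memberships stabilize.

```python
import numpy as np

def fcm_iteration(pixels, centers, m=2.0, eps=1e-9):
    """One iteration of fuzzy c-means: update memberships, then cluster centres.
    pixels: (N, bands) array, centers: (c, bands) array."""
    d = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2) + eps   # distances (N, c)
    # Membership of pixel i in cluster j, in [0, 1], summing to 1 over the clusters.
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    # Update the cluster centres as membership-weighted means.
    w = u ** m
    centers_new = (w.T @ pixels) / w.sum(axis=0)[:, None]
    return u, centers_new
```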

Possibilistic c-Means (PCM)


The main motivation behind the use of PCM relates to the relaxation of the probabilistic constraint of FCM.
The formulation of PCM is based on a modified FCM objective function in which an additional term, called the regularizing term, is included.
It is similar to FCM in that PCM clustering is also an iterative process where the class membership values are obtained by minimizing the generalized least-square error objective function
$J_m(U, V) = \sum_{i=1}^{N} \sum_{j=1}^{c} (\mu_{ij})^m \left\| x_i - v_j \right\|_A^2 + \sum_{j=1}^{c} \eta_j \sum_{i=1}^{N} (1 - \mu_{ij})^m$

where $\eta_j$ is a parameter that depends on the distribution of pixels in cluster j and is assumed to be proportional to the mean value of the intra-cluster distance of cluster j.
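Assuming the standard PCM update derived from this objective function, a membership (typicality) computation might look as follows; the value of m and the estimation of the eta_j parameters from FCM memberships are assumptions for illustration.

```python
import numpy as np

def pcm_memberships(pixels, centers, eta, m=2.0):
    """Possibilistic memberships: u_ij = 1 / (1 + (d_ij^2 / eta_j)^(1/(m-1))).
    Unlike FCM, rows are not forced to sum to one across clusters.
    pixels: (N, bands), centers: (c, bands), eta: (c,) regularizing parameters."""
    d2 = np.sum((pixels[:, None, :] - centers[None]) ** 2, axis=2)   # squared distances (N, c)
    return 1.0 / (1.0 + (d2 / eta[None, :]) ** (1.0 / (m - 1.0)))

def estimate_eta(pixels, centers, u_fcm, m=2.0):
    """eta_j proportional to the mean intra-cluster distance, here weighted by FCM memberships."""
    d2 = np.sum((pixels[:, None, :] - centers[None]) ** 2, axis=2)
    w = u_fcm ** m
    return (w * d2).sum(axis=0) / w.sum(axis=0)
```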


Neural Network Based Methods


Artificial neural networks have the capability to generalize the relation between the evidence (e.g., remote sensing data) and the conclusion (e.g., land cover classification) without developing any mathematical models.
Thus, unlike statistical parametric methods, they do not assume that the data follow a particular distribution.
A neural network contains interconnected layers, each containing a number of units, symbolizing the biological concept of a neuron.
The interconnections carry weights, which are adjusted in an iterative learning process to obtain the neural network solution.
The learning process may be supervised or unsupervised, depending on whether training data are required or not.
Accordingly, a number of supervised and unsupervised neural network algorithms have been developed.

Supervised Neural Network



Typically, a supervised neural network consists of three layers: an input layer, a hidden layer and an output layer.
The input layer receives the data (i.e., the multi-spectral remote sensing image data).
Figure: A three-layer network. The input layer (i) has one unit per spectral band (Band 1 to Band 4), the hidden layer (s) processes the weighted inputs, and the output layer (j) has one unit per land cover class (Class 1 to Class 5).

The number of units in the input layer is equal to the number of bands used for the classification.
Unlike the input layer, the hidden and output layers process the data.
The output layer produces the neural network results.
The number of units in the output layer is generally equal to the number of classes to be mapped.


Supervised Neural Network

Therefore, the numbers of units in the input and output layers are fixed by the application design.
Selection of the number of hidden layers and their units is a critical step for the successful operation of the neural network.
Using too few units in the hidden layer may result in inaccurate classification, as the network may not be powerful enough to process the data.
On the other hand, using a large number of hidden units makes the computational time large. It may also result in the network being over-trained.
The optimum number of units in the hidden layer is often determined by trial and error, though some empirical relations do exist.

Back Propagation Neural Network (BPNN)

The BPNN is a generalized least-squares algorithm that adjusts the connection weights between units to minimize the mean square error between the network output and the target output.
The target output is known from reference data.
Data provided to the input units are multiplied by the connection weights and summed to derive the net input to a unit in the hidden layer:

$net_s = \sum_i x_i W_{is}$

where $x_i$ is the magnitude of the ith input (i.e., the spectral response of the pixel in band i), and
$W_{is}$ is the matrix of connection weights between the ith input layer unit and the sth hidden layer unit.
Each unit in the hidden layer computes a weighted sum of its inputs and passes the sum via an activation function to the units in the jth output layer through the weight vector $W_{sj}$.


There is a range of activation functions to transform the data from a hidden layer unit to an output layer unit. These include pure linear, tangent hyperbolic, sigmoid functions, etc.
The use of these functions may lead to differences in classification accuracy. Generally, the sigmoid function has been widely used, and may be defined as

$O_s = 1 / [1 + \exp(-\lambda \, net_s)]$

where $O_s$ is the output from the sth hidden layer unit, and $\lambda$ is a gain parameter that controls the connection weights between the hidden layer unit and the output layer unit.
Outputs from the hidden units are multiplied by the connection weights and summed to produce the output of the jth unit in the output layer:

$O_j = \sum_s O_s W_{sj}$

where $O_j$ is the network output for the jth output unit (i.e., the land cover class) and $W_{sj}$ is the weight of the connection between the sth hidden layer unit and the jth output layer unit.
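A minimal sketch of this forward pass, with the sigmoid activation and the gain parameter, is given below; the layer sizes, weight matrices and gain value are placeholders, not values from the text.

```python
import numpy as np

def sigmoid(net, gain=1.0):
    """O_s = 1 / (1 + exp(-gain * net_s))."""
    return 1.0 / (1.0 + np.exp(-gain * net))

def forward_pass(x, W_is, W_sj, gain=1.0):
    """x: (bands,) pixel spectrum; W_is: (bands, hidden); W_sj: (hidden, classes)."""
    net_s = x @ W_is             # weighted sum into each hidden unit
    O_s = sigmoid(net_s, gain)   # hidden layer outputs
    O_j = O_s @ W_sj             # network output, one value per land cover class
    return O_j

# Hypothetical sizes: 4 bands, 6 hidden units, 5 classes.
rng = np.random.default_rng(3)
out = forward_pass(rng.random(4), rng.normal(size=(4, 6)), rng.normal(size=(6, 5)))
```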

An error function E, determined from a sample of target (known) outputs and network outputs, is minimized iteratively. The process continues until E converges to some minimum value, and the adjusted weights are obtained.

$E = 0.5 \sum_{j=1}^{c} (T_j - O_j)^2$

where $T_j$ is the target output vector, $O_j$ is the network output vector, and c is the number of classes.
The target vector is determined from the known class allocations of the training pixels, which are coded in binary form. For example, a pixel belonging to class 3 would be coded as 0 0 1 0 0 at the five output units, as in the sketch below.
The collection of known class allocations of all pixels forms the target vector.
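The binary target coding and the error function E can be sketched as follows; the five-class example mirrors the 0 0 1 0 0 coding in the text, while the network output values are hypothetical.

```python
import numpy as np

def one_hot_target(class_index, n_classes=5):
    """Binary target coding, e.g. class 3 of 5 -> [0, 0, 1, 0, 0]."""
    t = np.zeros(n_classes)
    t[class_index - 1] = 1.0        # classes numbered from 1, as in the text
    return t

def error(T, O):
    """E = 0.5 * sum_j (T_j - O_j)^2, minimized iteratively by adjusting the weights."""
    return 0.5 * np.sum((T - O) ** 2)

T = one_hot_target(3)                        # pixel known to belong to class 3
O = np.array([0.1, 0.2, 0.7, 0.05, 0.05])    # hypothetical network output
print(error(T, O))
```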


Figure: Target output coding for BPNN, showing the remote sensing data (Bands 1-4) at the input layer and the binary class allocation at the output layer.

Parameter / issue and comment:

Number of hidden units & layers: Determines the capacity of the network to learn and generalize. In general, a large network may learn more accurately but have poorer generalization ability than a small network. Larger networks are also slower to train. How many hidden units and layers should be used?

Learning algorithm: There is a range of learning algorithms available. Backpropagation is the most widely used but can be slow, and faster variants, which make assumptions about the error surface, are popular. Which should be used?

Learning parameters: Learning algorithms such as backpropagation have parameters (e.g. momentum and learning rate) that must be selected. These can significantly influence the performance of a network. What values should be selected, and should they be varied during training?

Data input and scaling: There is usually one input unit associated with each discriminating variable, but other approaches may be used. Also, the data input to the neural network generally have to be rescaled for the analysis, typically to a 0 to 1 or -1 to 1 scale. What method should be used to achieve this, and what allowance should be made for data that extend beyond the range observed in the training set?

Number of training iterations: The training error is generally negatively related to the number of training iterations. The accuracy of generalization may be non-monotonically related to the intensity of training: typically the accuracy of generalization increases as the network gradually learns the underlying relationship with greater accuracy, but will eventually decline as the network becomes over-trained. How many iterations of the learning algorithm should be used?

When/how to terminate training: There is a need to ensure that the network has learnt to correctly identify class membership from the training data but is not over-trained, and so has acceptable generalization ability. How is this to be assessed? Should verification sets be used?

Initial weights: The initial weight settings of the network before training can significantly influence network performance. Typically these are set randomly, but within what range?

CLASSIFICATION ACCURACY
ASSESSMENT
The accuracy assessment is a critical step in any mapping process, and
thus is an essential component that allows a degree of confidence to
be attached to maps for their effective use.
Traditionally, the accuracy of classification has been assessed using
error matrix based measures.
Here, each pixel in the image is assumed to be pure, containing one class per pixel on the ground.
Thus, in essence, the continuum of variation found in the landscape is divided into a finite set of classes such that pixels representing these classes become pure, and the error matrix based measures may be used.
However, these classes become less separable as the class mixture increases, and therefore the error matrix based measures may be inappropriate.
Alternative accuracy measures are therefore sought to evaluate the accuracy of soft classification, which represent the class mixture in a meaningful way.

18

CLASSIFICATION ACCURACY
ASSESSMENT

Euclidean distance
L1 distance
the cross-entropy
correlation coefficients
fuzzy error matrix (FERM)
All these measures may be treated as indirect methods of
assessing the accuracy of soft classification because the
accuracy evaluation is interpretative rather than a
representation of actual value as denoted by the traditional
error matrix based measures.

Correlation Coefficient CC
The correlation coefficient CC may also be used to indicate the accuracy on an individual class basis, estimated from a soft classification output and soft reference data.
The higher the correlation coefficient, the higher the classification accuracy of the class.
$CC = \dfrac{\mathrm{Cov}(\mu_{1ij}, \mu_{2ij})}{\sigma_{1ij} \, \sigma_{2ij}}$

where $\mathrm{Cov}(\mu_{1ij}, \mu_{2ij})$ is the covariance between the two distributions (i.e. the soft classified output and the soft reference data) and $\sigma_{1ij}$, $\sigma_{2ij}$ are the standard deviations of the two distributions.
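A per-class CC computation under these definitions might be sketched as below; the soft output and soft reference fractions are made-up values for illustration.

```python
import numpy as np

def class_correlation(soft_output, soft_reference):
    """CC = Cov(mu1, mu2) / (sigma1 * sigma2) for one class, over all pixels."""
    a = soft_output.ravel()
    b = soft_reference.ravel()
    cov = np.cov(a, b, ddof=1)[0, 1]
    return cov / (a.std(ddof=1) * b.std(ddof=1))

# Hypothetical per-pixel class proportions (soft classified output vs. soft reference).
rng = np.random.default_rng(2)
ref = rng.random((20, 20))
out = np.clip(ref + rng.normal(0, 0.1, ref.shape), 0, 1)
print("CC:", class_correlation(out, ref))
```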


THANK
YOU

