
Satellite Image Processing

and Analysis
Prof. B. Krishna Mohan
CSRE, IIT Bombay
bkmohan@csre.iitb.ac.in

About CSRE
One of the academic units of IIT Bombay, offering M.Tech. and
PhD programs in Geoinformatics and Natural Resources Engineering
Current faculty strength: 12; multidisciplinary in nature, with PhDs in
EE, CS, CE, Earth Sciences, Physics, Chemistry, and Maths
Research areas: satellite image analysis, GIS, GPS, microwave
remote sensing, hyperspectral remote sensing, applications to
resources exploration, environmental monitoring, engineering
applications, natural hazards and disaster management
Excellent infrastructure: ArcGIS, ERDAS, ENVI, eCognition,
Matlab, Geomatica, spectroradiometers, GPS receivers, large
format map scanner, Marine Research Lab, photogrammetry
workstations, wireless sensor modules

Today's Presentation
(Very brief) Introduction to Remote Sensing: the source of images
Steps in Satellite Image Processing and Analysis
Image Analysis and its relation to Geospatial Technologies
Some Applications:
Noise Filtering with Curvelets
High spatial resolution image analysis
High spectral resolution image analysis

What is Remote Sensing?


Remote sensing is the art and science of making
measurements about an object or the
environment without being in physical contact
with it

Importance of Remote Sensing


Remote Sensing provides vital data for many critical
applications
Resources management
Environmental monitoring
Defence
Urban / rural development and planning
Crop yield forecasting
Hazard zonation and disaster mitigation

Electromagnetic Spectrum

Visible and Reflective Infrared

Reflectance measurements in different wavelengths:
the ratio of reflected to incident energy
Ranges from 0% to 100%
Highly wavelength dependent

Basic Premise of RS
Each object on the earth's surface has a unique reflectance
pattern as a function of wavelength

Reflectance Spectra of Earth Objects

Atmospheric Windows
The atmosphere interferes with the radiation passing through it
It is essential to block the harmful UV rays in solar radiation
from reaching the earth
It should not block the radiation in wavelengths used for
earth observation
Choice of wavelengths:
Clear response of earth surface features
Minimal interference from atmospheric constituents

Atmospheric Windows

[Figure: atmospheric transmission (%) vs. wavelength (microns), with windows in the visible, near infrared, and far infrared]

Concept of Resolution
Four types of resolution in remote sensing:
Spatial resolution
Spectral resolution
Radiometric resolution
Temporal resolution

Spatial Resolution
Ability of the sensor to observe closely spaced features on the ground
A function of the instantaneous field of view (IFOV) of the sensor
Large IFOV: coarse spatial resolution; a pixel covers more area on the ground
Small IFOV: fine spatial resolution; a pixel covers less area on the ground
A sensor with pixel area 5x5 metres has a higher spatial resolution
than a sensor with pixel area 10x10 metres

[Figure: the same area (CSRE) imaged at pixel sizes of 0.6m x 0.6m, 5.8m x 5.8m, and 23.25m x 23.25m]

Effect of High Spatial Resolution


High resolution images are information rich:
Spatial information
Multispectral information
Textural information
The image can be viewed as a collection of objects with
spatial relationships: adjacent to, north of, south of, etc.

Spectral Resolution
Ability of the sensor to distinguish differences in
reflectance of ground features in different
wavelengths
Characterized by many sensors, each operating in a
narrow wavelength band
Essential to discriminate between sub-classes of a
broad class such as vegetation

High Spectral Resolution

[Figure: sensor response vs. wavelength, with a large number of contiguous sensors, each of narrow bandwidth]

Coarse Spectral Resolution

[Figure: sensor response vs. wavelength, with a small number of sensors, each of large bandwidth]

Reflectance Spectra

[Figure: reflectance vs. wavelength, showing the unique spectra of objects]

Steerable Sensor Systems

Steps in Digital Image Processing


Image Acquisition -> Image Corrections -> Image Enhancement -> Image Transforms -> Feature Selection -> Image Classification -> Final Interpretation

Concept of Image Classification


Image classification: assigning pixels in the image to
categories or classes of interest
Examples: built-up areas, waterbody, green vegetation,
bare soil, rocky areas, cloud, shadow, etc.

Concept of Image Classification


In order to classify a set of data into different classes or
categories, the relationship between the data and the
classes into which they are classified must be well understood
To achieve this by computer, the computer must be trained
Training is key to the success of classification
Classification techniques were originally developed out
of research in the field of Pattern Recognition

Concept of Image Classification


Computer classification of images involves the
process of the computer program learning the
relationship between the data and the
information classes
Important aspects of accurate classification
Learning techniques
Feature sets

Supervised Classification
The classifier has the advantage of an analyst's domain
knowledge, which can guide it to learn the relationship
between the data and the classes.
The number of classes and prototype pixels for each
class can be identified using this prior knowledge

Unsupervised Classification
When access to domain knowledge or the
experience of an analyst is missing, the data
can still be analyzed by numerical exploration,
whereby the data are grouped into subsets or
clusters based on statistical similarity

Partially Supervised Classification


When prior knowledge is available
for some classes and not for others, or
for some dates and not for others in a multitemporal dataset,
a combination of supervised and unsupervised methods can be
employed for partially supervised classification of images

Statistical Characterization of Classes


Each class has a conditional probability density
function (pdf), denoted p(x | ck)
The distribution of feature vectors in each class ck
is indicated by p(x | ck)
We estimate P(ck | x), the conditional probability of
class ck given that the pixel's feature vector is x
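
The step from the class-conditional density to the posterior is Bayes' rule; with class priors P(ck) (assumed known, or estimated from training data):

P(ck | x) = p(x | ck) P(ck) / [ p(x | c1) P(c1) + ... + p(x | cL) P(cL) ]

The pixel is then assigned to the class with the largest posterior.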

Supervised Classification Principles


Typical characteristics of classes:
Mean vector
Covariance matrix
Minimum and maximum gray levels within each band
Posterior probability P(Ci | x), where Ci is the ith class
and x is the feature vector
The number of classes L into which the image is to be
classified should be specified by the user

Parallelepiped Classifier: Example of a Supervised Classifier
Assign ranges of values for each class in each
band
Really a feature space classifier
Training data provide bounds for each feature for
each class
Results in bounding boxes for each class
A pixel is assigned to a class only if its feature
vector falls within the corresponding box

[Figure: parallelepiped classifier, showing bounding boxes for classes C1 to C6 in a feature space of Bands 1, 2, and 3]

[Figure: parallelepiped classifier, showing a pixel whose feature vector falls inside the C5 box being assigned to class 5; pixels falling outside all boxes remain unclassified]
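
A minimal Python sketch of the bounding-box rule described above; the function and variable names are illustrative, not from the original slides.

    import numpy as np

    def train_boxes(samples_per_class):
        # Per class: min and max of each band over the training samples
        return {c: (s.min(axis=0), s.max(axis=0))
                for c, s in samples_per_class.items()}

    def classify(x, boxes):
        # Assign x to the first class whose box contains it, else unclassified
        for c, (lo, hi) in boxes.items():
            if np.all(x >= lo) and np.all(x <= hi):
                return c
        return "unclassified"

    # Toy example: two classes in a 2-band feature space
    training = {
        "water":      np.array([[10, 40], [12, 45], [11, 42]]),
        "vegetation": np.array([[60, 90], [65, 95], [62, 92]]),
    }
    boxes = train_boxes(training)
    print(classify(np.array([11, 43]), boxes))   # water
    print(classify(np.array([30, 70]), boxes))   # unclassified

Overlapping boxes are a known weakness of this classifier; in this sketch a pixel falling in two boxes simply takes the first match.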

Hierarchical Classification
Decision Tree Classifier
Multistage classification technique
Series of decisions taken to determine the label to
be associated with a pixel
Different data sources, different features, and even
different classification algorithms can be used at
each decision stage
Classifier training is simpler, since fewer classes are
considered at each stage

[Figure: decision tree classifier hierarchy. The full class set {w1, w2, w3, w4, w5} first splits into {w1, w2, w3} and {w4, w5}; {w1, w2, w3} splits into w1 and {w2, w3}, which splits into w2 and w3; {w4, w5} splits into w4 and w5]

[Figure: supervised classification result]

Non-Parametric Classifiers

Nearest-Neighbor Classifier
Non-parametric in nature
The algorithm:
Find the distance of a given feature vector x from ALL
the training samples
Assign x to the class of the nearest training sample
(in the feature space)
This method does not depend on class statistics such as
the mean and covariance.

Concept
[Figure: 2-D feature space (Band 1 vs. Band 2) with clusters of feature vectors for Classes 1, 2, and 3; a sample to be classified is assigned to the class of its nearest neighbor]

K-NN Classifier
K-nearest neighbour classifier
Simple in concept, but computationally expensive
For a pixel to be classified, find the K closest training
samples (in terms of feature vector similarity, i.e.
smallest feature vector distance)
Among the K samples, find the most frequently
occurring class Cm
Assign the pixel to class Cm

K-NN Classifier
Let ki be the number of samples for class Ci (out of the K
closest samples), i = 1, 2, ..., L (number of classes)
Note that k1 + k2 + ... + kL = K
The discriminant for the K-NN classifier is
gi(x) = ki
The classifier rule is:
Assign x to class Cm if gm(x) > gi(x) for all i, i != m
(see the sketch below)

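
A minimal Python sketch of the K-NN rule above (illustrative names; with k = 1 it reduces to the nearest-neighbour classifier).

    import numpy as np
    from collections import Counter

    def knn_classify(x, train_X, train_y, k=5):
        # Distances from x to ALL training samples
        d = np.linalg.norm(train_X - x, axis=1)
        # Indices of the K closest samples
        nearest = np.argsort(d)[:k]
        # Votes per class among the K samples: the discriminant gi(x) = ki
        votes = Counter(train_y[i] for i in nearest)
        # Assign to the most frequently occurring class Cm
        return votes.most_common(1)[0][0]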

Spectral Angle Mapper


Given a large dimensional data set, computing the covariance
matrix S, its inverse, and the Mahalanobis distance
(x - m)^T S^(-1) (x - m) for each pixel is highly time
consuming; if the covariance matrix is close to singular, its
inverse can be unstable, leading to erroneous results
In such cases, alternate methods can be applied, such as the
Spectral Angle Mapper

S.A.M. Principle
If each class is represented by a vector vi, then the
angle θ between the class vector and the pixel feature
vector x is given by
cos θ = [vi · x] / [|vi| |x|]
For small values of θ, the value of cos θ is large
The likelihood of x belonging to different classes can be
ranked according to the value of cos θ.
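
A minimal Python sketch of the S.A.M. rule (illustrative names, not from the original slides).

    import numpy as np

    def spectral_angle(v, x):
        # Angle between class reference vector v and pixel spectrum x
        cos_t = np.dot(v, x) / (np.linalg.norm(v) * np.linalg.norm(x))
        return np.arccos(np.clip(cos_t, -1.0, 1.0))

    def sam_classify(x, class_vectors):
        # Smallest angle (largest cos) wins
        return min(class_vectors, key=lambda c: spectral_angle(class_vectors[c], x))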

S.A.M. Advantage
The value of the angle is not greatly affected by
minor changes in vi or x.
The computation is simpler than the Mahalanobis
distance computation involved in the ML
(maximum likelihood) method

Role of Image Classifier


The image classifier performs the role of a discriminant:
it discriminates one class against the others
Multiclass case: the discriminant value is highest for one
class and lower for the other classes
Two-class case: the discriminant value is positive for one
class and negative for the other

Discriminant Function
g(ck, x) is the discriminant function relating feature vector x
and class ck, k = 1, ..., L
Denote g(ck, x) as gk(x) for simplicity
Multiclass case:
gk(x) > gl(x) for all l = 1, ..., L, l != k  =>  x in ck
Two-class case:
g(x) > 0 => x in c1; g(x) < 0 => x in c2

[Figure: discriminant-based classifier. Inputs x1, x2, x3, ..., xn feed discriminant functions g1(x), g2(x), g3(x), ..., gK(x); a decision stage outputs the class]

Linear Discriminant Function


The discriminant in the 2-class case can be written as
g(x) = w^T x + w0, with decision boundary g(x) = 0
The weight vector w = [w1 ... wL]^T is an L-dimensional
vector, where L is the number of bands
If two samples x1 and x2 are on the boundary, we can write
w^T (x1 - x2) = 0, i.e., w is perpendicular to the
decision boundary

Algorithm for Linear Discriminant Classifier

Given two classes c1 and c2, we have:
w^T x > 0 when x is in class c1
w^T x < 0 when x is in class c2
Training this classifier involves determining the weights w
They may be determined from training data by
optimization methods

Support Vector Machines

Linear Classifiers

A linear classifier has the form f(x, w, b) = sign(w · x - b),
separating samples labelled +1 from samples labelled -1
[Figure: several candidate separating lines through 2-D training data]
Any of these would be fine... but which is best?

Margin

Define the margin of a linear classifier as the width that the
boundary could be increased by before hitting a datapoint.

Maximum Margin

The maximum margin linear classifier is the linear classifier
with the, um, maximum margin.
This is the simplest kind of SVM (called an LSVM, linear SVM)

Linear SVM

[Figure: maximum margin linear classifier; the support vectors are the datapoints that the margin pushes up against]

Why Maximum Margin?

1. Intuitively this feels safest.
2. If we've made a small error in the location of the boundary
(it's been jolted in its perpendicular direction), this gives us
the least chance of causing a misclassification.
3. The model is immune to removal of any non-support-vector
datapoints.
4. Empirically it works very, very well.

Estimate the Margin

For the separating hyperplane w · x + b = 0, the distance from
a point x to the hyperplane is

d(x) = |x · w + b| / ||w||2 = |x · w + b| / sqrt( sum_{i=1..d} wi^2 )

Margin

The margin is the distance from the hyperplane to the closest
training point:

margin = min_{x in D} d(x) = min_{x in D} |x · w + b| / sqrt( sum_{i=1..d} wi^2 )

Maximize Margin

Choose w and b to maximize the margin:

argmax_{w,b} margin(w, b, D)
= argmax_{w,b} min_{xi in D} d(xi)
= argmax_{w,b} min_{xi in D} |b + xi · w| / sqrt( sum_{i=1..d} wi^2 )

Maximize Margin

argmax_{w,b} min_{xi in D} |b + xi · w| / sqrt( sum_{i=1..d} wi^2 )
subject to: for all xi in D, yi (xi · w + b) >= 0

Maximize Margin

Strategy: fix the scale of (w, b) so that
min_{xi in D} |b + xi · w| = 1.
Then maximizing the margin is equivalent to

argmin_{w,b} sum_{i=1..d} wi^2
subject to: for all xi in D, yi (xi · w + b) >= 1

Maximum Margin Linear Classifier:

{w*, b*} = argmax_{w,b} 2 / sqrt( sum_{k=1..d} wk^2 )
subject to
y1 (w · x1 + b) >= 1
y2 (w · x2 + b) >= 1
....
yN (w · xN + b) >= 1

This is a standard constrained quadratic optimization problem,
solvable using known techniques described in detail in most
standard textbooks
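
As one concrete way to solve that QP, a minimal sketch using scikit-learn (not part of the original slides). Note the slides write f(x, w, b) = sign(w · x - b), while scikit-learn's decision function uses w · x + b; the data below are toy values.

    import numpy as np
    from sklearn.svm import SVC

    # Toy 2-band training data with labels +1 / -1
    X = np.array([[1.0, 2.0], [2.0, 3.0], [6.0, 5.0], [7.0, 8.0]])
    y = np.array([-1, -1, 1, 1])

    clf = SVC(kernel="linear", C=1e6)   # large C approximates the hard-margin LSVM
    clf.fit(X, y)

    print(clf.coef_, clf.intercept_)    # w and b
    print(clf.support_vectors_)         # the datapoints the margin pushes against
    print(clf.predict([[3.0, 3.0]]))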

Multilayer Perceptron
Neural Networks

Artificial Neural Networks for Engineering Problems

Face recognition
Detecting fraudulent use of credit cards
Controlling blast furnaces
Rating bond investment risk
Automatically diagnosing the cause of aircraft engine failures
Driving robotic automobiles down busy highways
Satellite image classification

Inspiration from Neurobiology

[Figure: a physical neuron and the elements of an artificial neuron: input layer, link weights, summation and activation, output layer]

Mathematical Representation

Inputs x1, x2, ..., xn with link weights w1, w2, ..., wn and bias b:

net = sum_{i=1..n} wi xi + b
y = f(net)

Mathematical Representation of the Activation Function

Examples of the nonlinear form of the neuron, y = f(x, w):

Sigmoidal function: y = 1 / (1 + e^(-w · x))
Gaussian function: y = e^(-||x - w||^2 / (2 a^2))

A Simple Perceptron Training Algorithm

It's a single-layer network
Change the weight by an amount proportional to the difference
between the target output and the actual output:
ΔW = η (T - Y) X
Wnew = Wold + ΔW

Perceptron Learning Rule

[Figure: single-layer perceptron with inputs x1, x2, ..., xm, weights w11, w12, ..., w2m, and outputs y1, y2; Y = hardlim(WX)]
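
A minimal Python sketch of the perceptron rule above, trained on the logical AND function (a linearly separable toy problem, so the rule converges); eta plays the role of the gain term.

    import numpy as np

    def train_perceptron(X, T, eta=0.1, epochs=100):
        # W_new = W_old + eta * (T - Y) * X, with hardlim activation
        W, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for x, t in zip(X, T):
                y = 1.0 if np.dot(W, x) + b >= 0 else 0.0   # hardlim(WX + b)
                W += eta * (t - y) * x
                b += eta * (t - y)
        return W, b

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([0, 0, 0, 1], dtype=float)
    W, b = train_perceptron(X, T)
    print([1.0 if np.dot(W, x) + b >= 0 else 0.0 for x in X])   # [0, 0, 0, 1]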

Delta Rule
The delta rule requires that the activation function be
differentiable. The learning takes place using a set of pairs of
inputs and corresponding known outputs, and weights are
adjusted accordingly.

If y_nk = t_nk, then change no weights.
If y_nk > t_nk, then Δw_kj = -η x_j^n
If y_nk < t_nk, then Δw_kj = +η x_j^n

If a perceptron can solve the problem, the perceptron learning
procedure will converge in a finite number of steps to a
solution.

Multilayer Perceptron Network

[Figure: input nodes feeding one or more hidden layers, which feed the output nodes]

Multilayered Perceptron
ANN research was in limbo for nearly two decades after the
single-stage perceptron network was found unable to deal with
many commonly encountered problems
The multistage perceptron network was felt to be the answer.
However, the problem is:
How is the multilayered perceptron network trained?

Training the Multilayer Perceptron Network

Notation for any given neuron j (with y0 = +1 and
wj0(n) = bj(n), the bias):

net_j(n) = sum_{i=0..m} w_ji(n) y_i(n)
y_j(n) = f_j(net_j(n))
e_j(n) = d_j(n) - y_j(n)

where d_j(n) is the desired output and y_j(n) is the computed output

Training the Multilayer Perceptron Network

Total error computed over all neurons in the output layer:
E(n) = (1/2) sum_{j in C} e_j^2(n)
If there are N training samples, then the average error is
E_av = (1/N) sum_{n=1..N} E(n)

Training the Multilayer Perceptron Network

Given that the weights are the unknowns, find the derivative of
the error function with respect to the weights and move in the
direction of the negative gradient (chain rule):

dE(n)/dw_ji(n) = [dE(n)/de_j(n)] [de_j(n)/dy_j(n)] [dy_j(n)/dnet_j(n)] [dnet_j(n)/dw_ji(n)]

We have to compute each of the terms on the RHS to obtain
the derivative on the LHS

Multilayer Feed-forward Network
Forward pass: signals flow from the input layer towards the
output layer
Backward pass: the error is backpropagated! The error signal
flows from the output layer backwards towards the input layer,
updating the weights along the way

Moving Towards Error Minimum


The error does not decrease monotonically towards the
minimum value
Oscillations and stagnation are common during the gradient
descent procedure

Local Minima of the Error Function

[Figure: error vs. weight values, showing local minima alongside the global minimum]

Network Configuration issues

The network size cannot be too small, or it cannot learn the
relationship between the input and the output
The network size cannot be too large, or it takes too long to
train and then generalizes poorly

Weight Initialization
Before training begins, what values should weights
have?
Too large weights or all zero weights are not
desirable.
Thus, weights are typically initialized to small random
values.
Optimization techniques can be used for smart
network weight initialization. Genetic algorithms are
one such approach

Momentum
Oscillations can be reduced by adding a momentum term:

w_ij(t+1) = w_ij(t) - η dE/dw_ij(t) + m [ w_ij(t) - w_ij(t-1) ]

Weights at iteration t+1 depend on the error derivative and on
the difference between the weights at iterations t and t-1.
m is the momentum term, usually smaller than the gain term η.
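
A minimal sketch of the momentum update above on a toy 1-D error function E(w) = w^2 (whose gradient is 2w); the eta and m values are illustrative.

    def momentum_step(w, grad, prev_dw, eta=0.1, m=0.05):
        # w(t+1) = w(t) - eta * dE/dw + m * (w(t) - w(t-1))
        dw = -eta * grad + m * prev_dw
        return w + dw, dw

    w, dw = 1.0, 0.0
    for _ in range(50):
        w, dw = momentum_step(w, 2.0 * w, dw)
    print(round(w, 6))   # close to the minimum at w = 0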

Problems with BP Algorithm

Inability to learn
Error on training patterns never reaches a low level. This
typically means that either the network architecture is
inappropriate or the learning process has pushed the
network into a bad part of weight space.

Inability to generalize
The network may master the training patterns but fail to
generalize to novel situations. This typically means that
either the training environment is impoverished or the
network has over-learned from the inputs

Selected Applications of MLP Classifier

Landuse/landcover classification
Edge and line detection

Supervised Image
Classification
Identify the number of classes
Identify the training data and generate the
training patterns
Define the network
Input/Output layers
Hidden layers
Gain and momentum terms

Supervised Image Classification


Size of input layer = number of bands in the input data
Size of output layer = number of classes into which
the data is mapped
Hidden layer(s): in practice, one or two hidden layers are used
Usually the first hidden layer has more nodes
and the second hidden layer has fewer nodes

Supervised Image
Classification
Gain term = 0.1 to 0.5 in practice
The higher the gain term, the faster the weights change,
but too high a value can lead to instability
Momentum term = 0.01 to 0.1 in practice
Gain and momentum values are modified if
convergence is slow
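
A minimal sketch of such a configuration using scikit-learn's MLPClassifier (an assumption of this note, not the software used in the original work); the data are random placeholders, and only the layer sizes, gain (learning_rate_init) and momentum settings matter here.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.random.rand(300, 4)           # 4 input nodes = 4 bands
    y = np.random.randint(0, 3, 300)     # 3 output classes

    clf = MLPClassifier(hidden_layer_sizes=(16, 8),  # first hidden layer larger
                        solver="sgd",
                        learning_rate_init=0.1,      # gain term
                        momentum=0.05,               # momentum term
                        max_iter=500)
    clf.fit(X, y)
    print(clf.predict(X[:5]))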

[Figure: low resolution image (WiFS) and a seven-class neural network classification using a 3-layer network]

Texture Analysis
[Figures: Mumbai, IRS-1C PAN data, 1024x1024 pixels; texture feature (IDM); texture classification by neural networks. Legend: water; marshy land / shallow water; highly built-up area; partially built-up area; open areas / grounds]

Genetic Algorithms

Genetic Algorithms
Genetic algorithms are one of the well-known tools for
optimization.
They are employed to generate optimal solutions using the
principles of genetic evolution.
They employ the concept of random search instead of
deterministic search
One application in the context of neural nets is smart
initialization of network weights

Basic Principle
A numerical approach to problem solving
Uses genetics as its model of problem solving
Applies rules of reproduction, gene crossover, and mutation
Starts with a number of candidate solutions that evolve over
genetic cycles towards the optimal solution
Solutions are evaluated using fitness criteria
The fittest survive

Genetic Algorithm Approach

[Figure: maximization by hill climbing. A single climber can get stuck on a local peak below the global peak]
Multiple candidates (multi-climbers i, j, k) search in parallel
Motivation: in the course of time, at least one candidate may
reach the global peak

Survival of the Fittest


The main principle of evolution used in GA is
survival of the fittest.
The good solutions survive, while bad ones die.
The definition of fitness is application dependent

Smart Initialization of Link Weights by GA

1. Initialize the population
2. Select individuals for the mating pool
3. Perform crossover
4. Perform mutation
5. Insert offspring into the population
6. Stop? If no, return to step 2; if yes, the end

Designing a GA...

How to represent genomes?
How to define the crossover operator?
How to define the mutation operator?
How to define the fitness function?
How to generate the next generation?
How to define stopping criteria?

Simple Crossover Step
[Figure: single-point crossover at random crossover point = 3, producing offspring 1 and offspring 2]

Multi-point Crossover
Shuffling Crossover

Single Point Mutation
[Figure: one bit flipped (1 -> 0) at a random mutation point]

Uniform Mutation

Swap Mutation
[Figure: two genes exchanged at random points chosen for swapping]

(A small sketch of these operators follows.)
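
A minimal Python sketch of single-point crossover and single-point mutation on bit-string genomes, matching the figures above (names illustrative).

    import random

    def single_point_crossover(p1, p2, point=3):
        # Swap the tails of the two parents at the crossover point
        return p1[:point] + p2[point:], p2[:point] + p1[point:]

    def single_point_mutation(chrom):
        # Flip one bit at a random mutation point
        i = random.randrange(len(chrom))
        return chrom[:i] + [1 - chrom[i]] + chrom[i + 1:]

    p1 = [1, 0, 1, 1, 0, 0, 1, 0]
    p2 = [0, 1, 1, 0, 1, 1, 0, 1]
    o1, o2 = single_point_crossover(p1, p2)   # offspring 1 and 2
    print(o1, o2, single_point_mutation(o1))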

Fitness Function
Cost associated with a weight set =
Average error in classification for the entire set of
test samples
Lower error = Higher fitness
Using a number of candidate weight sets, a
multilayer perceptron network is initialized.
Image classification application using ANN

Selection
Roulette wheel selection
Rank selection
Tournament selection

Roulette Wheel Selection
[Figure: roulette wheel, with selection probability proportional to fitness]

Rank Selection
Combine all 2n members from both the old and new populations
and rank them according to their fitnesses.
Select the first n ranks as the population for the next iteration.

Tournament Selection
Combine all 2n members of both the old and new populations
From these 2n members, randomly select 2 members
Compare the fitnesses of the two members and select the fitter
one, returning the other member to the population
Continue the procedure until n members are selected for the
next iteration

Convergence Criteria
Image Classification
Accept the member if the corresponding classification error
is within the user-specified tolerance limit
(e.g. accuracy > 95%, or error < 5%)


Sample Input Data

InputDataFileForGA01
InputNodes 4; HiddenLayers 2
HiddenNodes 16 13; OutPutNodes 7
PopulationSize 20; No.OfGenerations 200
SearchMinValue -5.0; SearchMaxValue +5.0
AllowbleError 0.01; CrossOverProbability 0.80
MutationProbability 0.1
TrainingDataFileName rajtrpat.dat
NetWorkWeightsFile raj.wgt

Sample Input Data

InputDataFileForGA01
InputNodes 4; HiddenLayers 2
HiddenNodes 16 19; OutPutNodes 15
PopulationSize 20; No.OfGenerations 20
SearchMinValue -2.0; SearchMaxValue +2.0
AllowbleError 0.01; CrossOverProbability 0.80
MutationProbability 0.1
TrainingDataFileName kdatrpat.dat
NetWorkWeightsFile kdaga.wgt

[Figures: input image and GA-NN supervised classification; example 2 and its results]

High Resolution Image Analysis

High resolution may be spatial or spectral
[Figure: IRS-1C LISS3 (23.5 m) vs. a Quickbird window of the same area]

Per-pixel methods in high spatial resolution image analysis

Pixel-based classification uses only the spectral (color)
information of each pixel; object-based classification can
additionally use form/shape, area/size, texture, and context
Per-pixel methods give an excessively detailed classification
of the land surface
[Figure: object-based (OB) vs. pixel-based (PB) classification of the same scene]

Generic Framework for High Spatial Resolution Image Analysis

High Resolution Satellite Image
-> Pre-processing
-> Decompose image at different levels
-> Segment image at different resolutions
-> Link the regions of different resolutions
-> Connected component labeling
-> Spatial, spectral, and texture features; context
-> Object-specific classification / general purpose classification
-> Post-processing (context based process)
-> Classified Image

Rizvi, I.A. and Mohan, B.K., IEEE-TGRS, Dec. 2011

[Figure: illustration of the working of edge-preserving smoothing]

Classification Strategies

The same pipeline, with post-processing by a relaxation labeling
process:
High Resolution Satellite Image -> Pre-processing -> Decompose
image at different levels -> Segment image at different
resolutions -> Link the regions of different resolutions ->
Connected component labeling -> spatial, spectral, and texture
features; context -> object-specific / general purpose
classification -> Post-processing (relaxation labeling process)
-> Classified Image

[Figure: medium resolution input image and the classified image (Mumbai)]

Results

Study Area 1: OB segmentation and object based image
classification
Study Area 2: OB segmentation and object based classification
(legend: built-up, open ground, vegetation)
Study Area 3: OB segmentation and object based classification
(legend: grass, vegetation, roof top, concrete, open ground)
Parameters for OB segmentation: merging threshold = 15;
number of regions reduced from 3541 to 1986; AFI value = 0.022

Study Area 4: OB segmentation; classification by Mod CBFNN and
CBFNN+RLP (legend: water, sand, vegetation, built-up, road,
slum, open area)
Study Area 5: OB segmentation and object based classification
(legend: buildings 1, open ground, road, shadow, buildings 2,
vegetation)
Study Area 6: OB segmentation; classification by Mod CBFNN and
CBFNN+RLP (legend: aeroplane, open area, vegetation, water,
shadow, settlements, vehicle, road)

Object Specific Classifiers

Extraction of buildings
Roads
Trees
Waterbodies
Airfields

Examples

Road extraction: Biplab Banerjee, Siddharth Buddhiraju and
Krishna Mohan Buddhiraju, Proc. ICVGIP 2012

Examples

Building outline extraction by object based image analysis:
Biplab Banerjee and Krishna Mohan Buddhiraju, UDMS 2013,
in Claire Ellul et al. (eds.), CRC Press, May 2013

High Spectral Resolution Image Analysis

Processing chain for a high spectral resolution image:
Atmospheric correction
Dimensionality reduction
Pure pixel / training data identification
Then, with spectral libraries: supervised classification,
mixture modeling, or spectral matching
Outputs: general purpose classification, abundance mapping,
and sub-pixel mapping & super-resolution classification

INTRODUCTION
Hyperspectral sensors:
Large number of contiguous bands
Narrow spectral bandwidth
Advantages:
Better discrimination among classes on the ground
Highly correlated bands
Huge information from a contiguous and smooth spectrum
[Figure: hyperspectral data of a scene (source: remotesensing.spiedigitallibrary.org)]

INTRODUCTION
Problems in Hyperspectral Remote Sensing:
Cost of the system
Computational complexity: huge storage memory, fast
processors, high transmission bandwidth
Hughes phenomenon: more dimensions require more training
samples to represent the class statistics; for a limited
training set size, the accuracy of classification decreases as
the dimension increases beyond a certain point
[Figure: the Hughes phenomenon (source: doi.ieeecomputersociety.org)]

INTRODUCTION
Dimension Reduction approaches:

Feature Selection:
Selects a subset of the available bands; original band
information is preserved
Involves a search algorithm with a criterion function
E.g. sequential search, tabu search, genetic algorithms, etc.

Feature Extraction:
Transformation of bands onto a lower dimensional space;
new projected axes are formed
Involves a transform that maximizes the de-correlation,
ranks the axes, and is invertible
E.g. PCA, MNF, projection pursuit, etc.

GA-Based IMPLEMENTATION
List of all parameters used:
Desired features = 30
Available features = 155
Population size = 30
Number of subpopulations = 2
Maximum generations = different runs for 10, 20, 30 and 40
Crossover probability = 0.75
Mutation probability = 0.05 (simple mutation)
Stopping criterion = maximum iterations
Migration rate = 0.5, i.e. half the candidates are exchanged
Migration policy = best-worst exchange
Migration interval = different runs for 0.5, 0.33 and 0.25

IMPLEMENTATION

Integer encoding: a chromosome is a string of integers equal to the desired number of bands
E.g. chromosome = { 4 12 14 20 21 23 25 29 32 35 41 45 46 57 60 67 69 72 86 91 93 99 105 115 120 131 141 145 149 154 }

Single point crossover

Fitness function: the classification error, i.e. the number of pixels that are wrongly classified by a 1-nearest neighbour classifier:
fitness(x) = | misclassified pixels |
The GA minimizes this fitness function (see the sketch below)

Selection: rank selection, where we first combine the members from both the old and the new population and then rank them according to their fitness values. Members having a low fitness value represent better candidates and are selected for further evolution.
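
A minimal sketch of the integer encoding and the 1-NN misclassification fitness described above (illustrative names, with the evolution loop omitted).

    import random
    import numpy as np

    D, d = 155, 30   # available bands, desired bands

    def random_chromosome():
        # A chromosome: d distinct band indices out of the D available bands
        return sorted(random.sample(range(D), d))

    def fitness(chrom, X_train, y_train, X_test, y_test):
        # Number of pixels wrongly classified by 1-NN using only the
        # selected bands; the GA minimizes this value
        A, B = X_train[:, chrom], X_test[:, chrom]
        errors = 0
        for x, t in zip(B, y_test):
            nearest = np.argmin(np.linalg.norm(A - x, axis=1))
            errors += int(y_train[nearest] != t)
        return errors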


IMPLEMENTATION

[Figure: the stepwise workflow adopted]

INPUT DATASET
Details of the hyperspectral scene:

Sensor: Hyperion on board the EO-1
Date of imagery: Jan 27, 2014
Scene Request ID: EO11480472014027110KZ
Site Latitude: 19.04
Site Longitude: 72.84
HYP Start Time: 2014 027 04:50:36
HYP Stop Time: 2014 027 04:54:56

Spatial co-ordinates of the input image:

Upper left corner: latitude 19.544844, longitude 72.916448
Upper right corner: latitude 19.531030, longitude 72.988297
Lower left corner: latitude 18.626129, longitude 72.701583
Lower right corner: latitude 18.612367, longitude 72.773605

It covers part of Mumbai and consists of 242 bands, some of
which do not contain any information due to atmospheric
absorption
The bad bands are removed and the remaining bands are
pre-processed for atmospheric correction, leaving a final
dataset of 155 bands
The original full scene is 3400 x 256 pixels with 16-bit
original radiometric resolution and BIL band interleave

[Figure: the original scene and two subsets, imageSubset1 (600 x 256 x 242) and imageSubset2 (500 x 256 x 242)]

RESULTS: MGA Selected Features

MGA parameters: d = 30, D = 155, Psize = 30, Pc = 0.75,
Pm = 0.05; migration interval = 0.25; migration rate = half;
migration policy = best-worst exchange

[Tables: the 30-band subsets selected by MGA for imageSubset1 and for imageSubset2, listed as band numbers with wavelengths (nm) from about 457 nm to 2315 nm]

RESULTS: Spectral Plot

[Figures: spectrum of a random pixel taken from dataset imageSubset2, and the spectrum of the same pixel from the dimension-reduced image of imageSubset2]

RESULTS: MGA vs SGA

Best fitness values for SGA and MGA with different migration
intervals:

Number of Generations | SGA | MGA (0.5) | MGA (0.33) | MGA (0.25)
10                    | 104 | 94        | 93         | 88
20                    | 94  | 88        | 80         | 79
30                    | 87  | 83        | 76         | 75
40                    | 87  | 82        | 75         | 73

[Figure: performance of SGA and MGA with different migration intervals vs. the number of generations]

RESULTS: MGA vs FE METHODS

[Figure: results of LU-LC classification on the dataset imageSubset1. All 155 bands, MGA selected 30 bands, 30 PCA components, 30 MNF components, 30 ICA components]

RESULTS: MGA vs FE METHODS

Accuracy and kappa coefficient for all datasets used for
classification of imageSubset1:

Dataset used for classification   | Overall Accuracy (%) | Kappa coefficient
Classification with all 155 bands | 92.6883              | 0.8969
MGA reduced 30 bands              | 94.7009              | 0.9251
PCA transformed 30 components     | 94.1654              | 0.9176
MNF transformed 30 components     | 93.8146              | 0.9127
ICA transformed 30 components     | 94.6455              | 0.9244

RESULTS: MGA vs FE METHODS

[Figure: results of LU-LC classification on the dataset imageSubset2. All 155 bands, MGA selected 30 bands, 30 PCA components, 30 MNF components, 30 ICA components]

RESULTS: MGA vs FE METHODS

Accuracy and kappa coefficient for all datasets used for
classification of imageSubset2:

Dataset used for classification   | Overall Accuracy (%) | Kappa coefficient
Classification with all 155 bands | 93.4378              | 0.9104
MGA reduced 30 bands              | 96.0134              | 0.9442
PCA transformed 30 components     | 94.7816              | 0.9271
MNF transformed 30 components     | 94.7144              | 0.9262
ICA transformed 30 components     | 94.7816              | 0.9271

Robust Watermarking of
Satellite Images

What is Watermarking?
A watermark is a pattern of bits inserted into a digital
image, audio, or video file that identifies the file's copyright
information (author, rights, etc.).
Objectives:
To find a technique to watermark satellite images robustly
against attacks
To implement and analyze the proposed technique
To find a suitable technique for point selection from a vector
dataset for watermarking
To implement and analyze the proposed technique

Transform and LSB combination Methodology

Comparison of methods:

Method          | Domain and Type   | Image           | Experimental Attacks
LSB             | Spatial, fragile  | Greyscale       | cannot handle compression and simple operations
DCT             | Frequency, robust | Greyscale       | robust to JPEG compression, filters, Gaussian noise, histogram equalization, stretching
Third Level DWT | Frequency, robust | Satellite image | robust to JPEG compression, suitable for copyright protection
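
For illustration, a minimal sketch of the (fragile) LSB scheme from the first row of the table: the watermark bits overwrite the least significant bit of the first pixels. Names are illustrative, not from the original work.

    import numpy as np

    def embed_lsb(image, wm_bits):
        # Overwrite the least significant bit of the first len(wm_bits) pixels
        flat = image.flatten()                       # flatten() returns a copy
        flat[:len(wm_bits)] = (flat[:len(wm_bits)] & 0xFE) | wm_bits
        return flat.reshape(image.shape)

    def extract_lsb(image, n_bits):
        return image.flatten()[:n_bits] & 1

    img = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
    wm = np.random.randint(0, 2, 16, dtype=np.uint8)
    marked = embed_lsb(img, wm)
    assert np.array_equal(extract_lsb(marked, 16), wm)

Any lossy compression or smoothing destroys these bits, which is why the table marks LSB as fragile.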

Texture Classification
Texture is the visual and especially tactile quality of a surface.
Each pixel of an image is classified into one of two classes:
1. High texture
2. Low texture
The method applied is a modified k-means (unsupervised
classification) algorithm
[Figures: k-means algorithm; texture based classification; suitable areas and embedding of the watermark; watermarking locations]

Proposed method for raster watermarking: texture based DWT
alpha method

Results of the texture based DWT alpha method:

No. | Input Image | Input Watermark | Attack type            | Correlation
1   | 2048x2048   | 16x16           | Smoothing 3x3          | 0.9482
2   | 2048x2048   | 16x16           | Smoothing 5x5          | 0.8768
3   | 2048x2048   | 16x16           | Smoothing 7x7          | 0.8254
4   | 2048x2048   | 16x16           | Contrast stretch       | 0.528
5   | 2048x2048   | 16x16           | salt_and_pepper (0.01) | 0.7837
6   | 2048x2048   | 16x16           | speckle noise (0.005)  | 0.8174

Vector Dataset point selection


The watermarking technique for vector images follows
Zope-Chaudhari et al. (2015).
There, all points are selected for watermarking, which makes
the watermarked vector data look visibly distorted.
To reduce this distortion, we propose two techniques:
1. Slope based point selection method
2. Alternate point selection method

Vector point selection method proposed

1. Slope based selection of points
The concept is to select points that do not fall in a straight
line and are not junction points, to reduce distortion in the
image.
We also reduce the alpha used by the watermarking method of
Zope-Chaudhari et al. (2015a) to embed the watermark in the
selected points.
During retrieval we use the same algorithm to select the points.

Algorithm for slope based selection of points

Read the input shape file (info).
List all x and y coordinates in vectors x and y. Remove the
junction points.
Calculate the slopes by taking three consecutive points, and
select the points whose two slopes have opposite signs and
differ by at least 0.3.
Save this set of points in vectors xn and yn.
Watermark the selected points and replace them in the vector
data. (A sketch of the selection step follows.)
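
A minimal Python sketch of the slope test above (junction removal and the watermark embedding itself are assumed to be done elsewhere; names are illustrative).

    def select_slope_points(xs, ys, min_slope_diff=0.3):
        # Keep interior points whose two adjacent segment slopes have
        # opposite signs and differ by at least min_slope_diff
        selected = []
        for i in range(1, len(xs) - 1):
            dx1, dx2 = xs[i] - xs[i - 1], xs[i + 1] - xs[i]
            if dx1 == 0 or dx2 == 0:      # skip vertical segments in this sketch
                continue
            s1 = (ys[i] - ys[i - 1]) / dx1
            s2 = (ys[i + 1] - ys[i]) / dx2
            if s1 * s2 < 0 and abs(s1 - s2) >= min_slope_diff:
                selected.append(i)
        return selected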

Results of the slope based point selection method:
[Figure: comparison of the original dataset and the watermarked dataset]

Vector point selection method proposed

2. Alternate point selection method
Select every alternate point in the dataset which is not a
junction.
Apply the watermarking algorithm on the selected points.
This method reduces the number of points used for watermarking,
thereby reducing error.
It is a simple method to implement.

Results of the alternate points method:
[Figure: comparison of the original dataset and the watermarked dataset]

Comparison of vector methods

Method                | Input points | Selected points | Average error | Maximum error | MSE
All points            | 58568        | 58568           | 0.0068        | 0.01          | 6.77E-05
Slope based selection | 58568        | 6814            | 6.89E-08      | 1.00E-07      | 6.89E-15
Alternate points      | 58568        | 28280           | 5.06E-04      | 0.001         | 5.06E-07

The above results show that the slope based method has the
minimum error, as it uses the minimum number of points for
watermarking.
The points embedded in the slope based method are not in a
straight line, so it causes less distortion.
The error decreases as the number of embedded points decreases.
Reducing the watermarking strength alpha also reduces the error,
but reducing alpha too much can reduce watermark robustness.

Multi-resolution Noise Filtering


Fast Discrete Curvelet Transform

Forward transform: 2-D data -> FFT -> frequency data ->
zero padding & windowing -> curvelet coefficients in the
frequency domain -> IFFT -> curvelet coefficients
Inverse transform: curvelet coefficients -> FFT -> curvelet
coefficients in the frequency domain -> windowing & truncation
-> reconstructed frequency data -> IFFT -> reconstructed data

Proposed Adaptive Thresholding

1. Input the curvelet coefficients.
2. Find the maximum and minimum values of the coefficients.
3. Iterate through thresholds from min to max, executing the
curvelet reconstruction algorithm for each.
4. Store the results in an array for each loop, as
temp = [threshold, PSNR].
5. When the loop is completed, select the threshold that gives
the maximum PSNR.
6. Perform curvelet reconstruction using the best threshold and
store the result.
(A sketch of the threshold scan follows.)
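
A minimal sketch of the threshold scan, assuming a reconstruct(coeffs, t) function that zeroes the curvelet coefficients below t and inverts the transform; that function is a placeholder for the reconstruction described above, not a real library call.

    import numpy as np

    def psnr(ref, test, peak=255.0):
        mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
        return 10.0 * np.log10(peak ** 2 / mse)

    def best_threshold(coeffs, reconstruct, reference, n_steps=50):
        # Scan thresholds from min to max coefficient magnitude and keep
        # the (threshold, PSNR) pair with the maximum PSNR
        lo, hi = np.abs(coeffs).min(), np.abs(coeffs).max()
        results = [(t, psnr(reference, reconstruct(coeffs, t)))
                   for t in np.linspace(lo, hi, n_steps)]
        return max(results, key=lambda r: r[1])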

MRA based Hybrid Approach for noise filtering

[Figures: original image; wavelet denoising; curvelet denoising]

Proposed Method
Curvelet patterns/artifacts are observed while denoising
These patterns are not present in wavelet based denoising
To overcome this problem and retain the advantages of curvelet
denoising, we propose a combined approach of wavelet and
curvelet algorithms:
Homogeneous areas => wavelets
Heterogeneous areas => curvelets

Proposed Method
Evolve a hybrid scheme whereby an image is first delineated
into smooth and textured regions, and
the denoising scheme is applied differently for each type
of region

Proposed Method
1) Find the edge magnitude by a standard edge detector on the
noisy image
2) Run a 5x5 or 7x7 window over the result of step 1 and find
the average edge gradient magnitude in this window
3) Apply k-means to select the threshold for segmentation,
apply thresholding, and declare homogeneous and heterogeneous
regions
4) Perform denoising using iterative thresholding on the
transform domain coefficients
(A sketch of steps 1-3 follows.)

Ansari, Rizwan Ahmed and Buddhiraju, Krishna Mohan, "k-means
based Hybrid Wavelet and Curvelet Transform Approach for
Denoising of Remotely Sensed Images", Remote Sensing Letters,
vol. 6 (12), pp. 982-991, 2015.
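
A minimal sketch of steps 1-3, assuming Sobel as the "standard edge detector" and a two-cluster 1-D k-means on the windowed edge magnitude; the actual wavelet/curvelet denoising of step 4 is omitted.

    import numpy as np
    from scipy import ndimage

    def region_map(noisy, window=5):
        # Step 1: edge magnitude from Sobel gradients
        gx = ndimage.sobel(noisy.astype(float), axis=0)
        gy = ndimage.sobel(noisy.astype(float), axis=1)
        mag = np.hypot(gx, gy)
        # Step 2: average edge magnitude in a window x window neighbourhood
        avg = ndimage.uniform_filter(mag, size=window)
        # Step 3: 1-D k-means (k = 2) to split pixels by average magnitude
        c = np.array([avg.min(), avg.max()])
        for _ in range(20):
            labels = (np.abs(avg - c[0]) > np.abs(avg - c[1])).astype(int)
            for k in (0, 1):
                if np.any(labels == k):
                    c[k] = avg[labels == k].mean()
        return labels   # 0 = homogeneous (wavelets), 1 = heterogeneous (curvelets)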

Extended Work

Four different methods are considered for segmentation into
homogeneous and heterogeneous areas:
Quadtree based
Entropy from GLCM based
Variance based
Edge magnitude based, with fuzzy c-means

Edge Preservation Analysis

Edge Enhancing Index (EEI): EEI = | V_f1 - V_f2 | / | V_1 - V_2 |,
where V_1 and V_2 are gray values on either side of an edge in
the original image and V_f1, V_f2 are the corresponding values
in the filtered image
Image Detail Preserving Coefficient (IDPC): the correlation
coefficient between the original image and the filtered image

[Figure: original Ikonos image (Powai area), 1m x 1m]
[Figure: noisy Ikonos image, noise level = 20]
[Figure: quadtree based segmentation]

Denoising results on the noisy Ikonos image:

Method                     | PSNR    | EEI  | MSE | IDPC
Wavelet based denoising    | 26.7 dB | 0.63 | 149 | 0.73
Contourlet based denoising | 25.9 dB | 0.69 | 157 | 0.75
Curvelet based denoising   | 28.9 dB | 0.88 | 127 | 0.91
ContCurv based denoising   | 29.1 dB | 0.81 | 119 | 0.88
WavCurv based denoising    | 32.1 dB | 0.92 | 99  | 0.90

Summary
Many remote sensing / image processing operations extract
information from images that can feed spatial problems with
useful data
The outcomes of satellite remotely sensed image analysis are
eventually linked to GIS
High spatial resolution image analysis outputs can be vectorized
into polygon layers, with some pre-processing to remove tiny
polygons
Hyperspectral image analysis and similar specialized tools can
provide information for GIS fields, such as identifying a
degraded forest or a polluted lake, which requires the detailed
investigation that such tools make possible
Image preprocessing through noise filtering can improve the
analysis procedures

Thank You!
