
Removing Camera Shake via Weighted Fourier Burst Accumulation


Abstract
Numerous recent approaches attempt to remove image blur due to camera shake, either with one
or multiple input images, by explicitly solving an inverse and inherently ill-posed deconvolution
problem.

If the photographer takes a burst of images, a modality available in virtually all modern digital
cameras, we show that it is possible to combine them to get a clean sharp version.

This is done without explicitly solving any blur estimation and subsequent inverse problem. The
proposed algorithm is strikingly simple: it performs a weighted average in the Fourier domain,
with weights depending on the Fourier spectrum magnitude.

The method can be seen as a generalization of the align and average procedure, with a weighted
average, motivated by hand-shake physiology and theoretically supported, taking place in the
Fourier domain.

The method’s rationale is that camera shake has a random nature, and therefore, each image in
the burst is generally blurred differently.

Experiments with real camera data, and extensive comparisons, show that the proposed Fourier
burst accumulation algorithm achieves state-of-the-art results an order of magnitude faster, and is
simple enough for on-board implementation on camera phones.

Finally, we also present experiments in real high dynamic range (HDR) scenes, showing how the
method can be straightforwardly extended to HDR photography.

Key Terms:

Multi-image deblurring,
burst fusion,
camera shake,
low light photography,
high dynamic range.
1. INTRODUCTION

One of the most challenging experiences in photography is taking images in low-light
environments. The basic principle of photography is the accumulation of photons in
the sensor during a given exposure time.

In general, the more photons that reach the sensor surface, the better the quality of the final
image, as the photonic noise is reduced. However, this basic principle requires the photographed
scene to be static and that there be no relative motion between the camera and the scene.

Otherwise, the photons will be accumulated in neighboring pixels, generating a loss of
sharpness (blur).

This problem is significant when shooting with hand-held cameras, the most popular
photography device today, in dim light conditions.

Under reasonable hypotheses, the camera shake can be modeled mathematically as a
convolution,

v = u ∗ k + n,

where v is the noisy blurred observation, u is the latent sharp image, k is an unknown
blurring kernel and n is additive white noise. For this model to be accurate, the camera
movement has to be essentially a rotation about its optical axis with negligible in-plane
rotation.

The kernel k results from several blur sources: light diffraction due to the finite aperture,
out-of-focus blur, light integration in the photo-sensor, and relative motion between the camera
and the scene during the exposure.

To get enough photons per pixel in a typical low light scene, the camera needs to capture
light for a period of tens to hundreds of milliseconds.

In such a situation (and assuming that the scene is static and the user/camera has correctly
set the focus), the dominant contribution to the blur kernel is the camera shake—mostly
due to hand tremors.

Current cameras can take a burst of images, a mode that is also popular in camera phones.
This has been exploited in several approaches for accumulating photons across the different
images and then forming an image with less noise (mimicking a longer exposure time a
posteriori).
However, this principle is disturbed if the images in the burst are blurred. The classical
mathematical formulation, as a multi-image deconvolution, seeks to solve an inverse
problem where the unknowns are the multiple blurring operators and the underlying sharp
image.

This procedure, although it produces good results, is computationally very expensive, and
very sensitive to a good estimation of the blurring kernels, an extremely challenging task
by itself.

Furthermore, since the inverse problem is ill-posed, it relies on priors for the blurring kernels,
for the latent sharp image, or for both. Camera shake originating from hand
tremor vibrations is essentially random.

This implies that the movement of the camera in an individual image of the burst is
independent of the movement in another one. Thus, the blur in one frame will be different
from the one in another image of the burst.

Our work is built on this basic principle. We present an algorithm that aggregates a burst
of images (or more than one burst for high dynamic range), taking what is least blurred from
each frame to build an image that is sharper and less noisy than all the images in the
burst.

The algorithm is straightforward to implement and conceptually simple. It takes as input
a series of registered images and computes a weighted average of the Fourier coefficients
of the images in the burst.
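To make the procedure concrete, here is a minimal sketch of such a weighted Fourier average, assuming the input is a list of already registered grayscale frames of equal size; the function name and the exponent parameter p are illustrative choices, not the authors' exact interface:

```python
import numpy as np

def fourier_burst_accumulation(images, p=11):
    """Weighted average of a registered burst in the Fourier domain.

    images: list of registered grayscale frames with identical shape.
    p:      exponent controlling how strongly large spectral magnitudes
            are favored (p = 0 reduces to a plain align-and-average).
    """
    ffts = [np.fft.fft2(im) for im in images]
    # Per-frequency weights proportional to the spectral magnitude raised to p.
    weights = [np.abs(F) ** p for F in ffts]
    total = np.sum(weights, axis=0) + 1e-12          # avoid division by zero
    fused_fft = np.sum([w * F for w, F in zip(weights, ffts)], axis=0) / total
    return np.real(np.fft.ifft2(fused_fft))
```

With p = 0 every frequency gets equal weight and the scheme reduces to a plain align-and-average; larger p increasingly favors, at each frequency, the frames whose spectrum was least attenuated by blur.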

Similar ideas have been explored by Garrel in the context of astronomical images,
where a sharp clean image is produced from a video affected by atmospheric turbulence
blur.

With the availability of accurate gyroscope and accelerometers in, for example, phone
cameras, the registration can be obtained “for free,” rendering the whole algorithm very
efficient for on-board implementation.

Indeed, one could envision a mode transparent to the user, where every time he/she takes
a picture, it is actually a burst or multiple bursts with different parameters each. The set is
then processed on the fly and only the result is saved.

Related modes are currently available in “permanent open” cameras. The explicit
computation of the blurring kernel, as commonly done in the literature, is completely
avoided.
The kernel is not only an unimportant hidden variable for the task at hand; estimating it,
as mentioned above, still leaves the ill-posed and computationally very expensive task of
solving the inverse problem.

Evaluation through synthetic and real experiments shows that the final image quality is
significantly improved. This is done without explicitly performing deconvolution, which
generally introduces artifacts and also a significant overhead.

Comparison to state-of-the-art multi-image deconvolution algorithms shows that our
approach produces similar or better results while being orders of magnitude faster and
simpler.

The proposed algorithm does not assume any prior on the latent image, relying exclusively
on the randomness of hand tremor. A preliminary short version of this work was
submitted to a conference.

The present version incorporates a more detailed analysis of the burst aggregation
algorithm and its implementation. We also introduce a detailed comparison to lucky
image selection techniques, where from a series of short exposure images, the sharpest
ones are selected, aligned and averaged into a single frame.

Additionally, we present experiments in real high dynamic range (HDR) scenes showing
how the method can be extended to hand-held HDR photography.

The remainder of the paper is organized as follows. Section II discusses the related work
and the substantial differences to what we propose. Section III explains how a burst can
be combined in the Fourier domain to recover a sharper image, while in Section IV we
present an empirical analysis of the algorithm performance through the simulation of
camera shake kernels.

Section V details the algorithm implementation, and in Section VI we present results of the
proposed aggregation procedure on real data, comparing the algorithm to state-of-the-art
multi-image deconvolution methods. Finally, we present experiments in real high dynamic
range (HDR) scenes, showing how the method can be straightforwardly extended to HDR
photography.

Proposed method:

• The proposed algorithm is strikingly simple: it performs a weighted average in the Fourier domain, with weights depending on the Fourier spectrum magnitude.

• The method can be seen as a generalization of the align and average procedure, with a weighted average, motivated by hand-shake physiology and theoretically supported, taking place in the Fourier domain.

• The method’s rationale is that camera shake has a random nature, and therefore, each image in the burst is generally blurred differently.

• Experiments with real camera data, and extensive comparisons, show that the proposed Fourier burst accumulation algorithm achieves state-of-the-art results an order of magnitude faster, with a simplicity that makes it suitable for on-board implementation on camera phones.

• Finally, we also present experiments in real high dynamic range (HDR) scenes, showing how the method can be straightforwardly extended to HDR photography.

II. RELATED WORK


• Removing camera shake blur is one of the most challenging problems in image processing. Although in the last decade several image restoration algorithms have emerged giving outstanding performance, their success is still very dependent on the scene.

• Most image deblurring algorithms cast the problem as a deconvolution with either a known (non-blind) or an unknown (blind) blurring kernel.
A. Single Image Blind Deconvolution

• Most blind deconvolution algorithms try to estimate the latent image without any other input than the noisy blurred image itself. A representative work is the one by Fergus et al.

• This variational method sparked many competitors seeking to combine natural image priors, assumptions on the blurring operator, and complex optimization frameworks, to simultaneously estimate both the blurring kernel and the sharp image. Fergus et al. approximated the heavy-tailed distribution of the gradient of natural images using a Gaussian mixture. Other authors exploited the use of sparse priors for both the sharp image and the blurring kernel. Cai et al. proposed a joint optimization framework that simultaneously maximizes the sparsity of the blur kernel and the sharp image in a curvelet and a framelet system, respectively. Krishnan et al. introduced as a prior the ratio between the ℓ1 and the ℓ2 norms of the high frequencies of an image. This normalized sparsity measure gives low cost to the sharp image.

• Other work discussed unnatural sparse representations of the image that mainly retain edge information. This representation is used to estimate the motion kernel. Michaeli and Irani recently proposed to use as an image prior the recurrence of small natural image patches across different scales. The idea is that the cross-scale patch occurrence should be maximal for sharp images.

• Several methods attempt to first estimate the degradation operator and then apply a non-blind deconvolution algorithm. For instance, the kernel estimation step can be accelerated by using fast image filters to explicitly detect and restore strong edges in the latent sharp image.

• Since the blurring kernel typically has a very small support, the kernel estimation problem is better conditioned than estimating the kernel and the sharp image together. Several authors concluded that it is better to first solve a maximum a posteriori estimation of the kernel than of the latent image and the kernel simultaneously.

• However, even in non-blind deblurring, i.e., when the blurring kernels are known, the problem is generally ill-posed, because the blur introduces zeros in the frequency domain. Thereby, avoiding explicit inversion, as proposed here, becomes critical.
B. Multi-Image Blind Deconvolution


• Rav-Acha and Peleg claimed that “Two motion-blurred images are better than one,” if the directions of the blurs are different. In [20] the authors proposed to capture two photographs: one having a short exposure time, noisy but sharp, and one with a long exposure, blurred but with low noise. The two acquisitions are complementary, and the sharp one is used to estimate the motion kernel of the blurred one.
• Close to our work are papers on multi-image blind deconvolution. It has been shown that, given multiple observations, the sparsity of the image under a tight frame is a good measurement of the clearness of the recovered image. Having multiple input images improves the accuracy of identifying the motion blur kernels, reducing the ill-posedness of the problem. Most of these multi-image algorithms introduce cross-blur penalty functions between image pairs.

• However, this has the problem of growing combinatorially with the number of images in the burst. This idea has been extended using a Bayesian framework that couples all the unknown blurring kernels and the latent image in a unique prior.

• Although this prior has numerous good mathematical properties, its optimization is very slow. The algorithm produces very good results, but it may take several hours to run.

Fig. 1. When the camera is set to a burst mode, several photographs are captured
sequentially. Due to the random nature of hand tremor, the camera shake blur is mostly
independent from one frame to the other. An image consisting of white dots was
photographed with a DSLR handheld camera to depict the camera motion kernels. The
kernels are mainly unidimensional regular trajectories that are not completely random
(perfect random walk) nor uniform.

III. FOURIER BURST ACCUMULATION


A. Rationale
Camera shake originating from hand tremor vibrations is undoubtedly random in
nature. The involuntary movement of the photographer's hand causes the camera to be
pushed randomly and unpredictably, generating blurriness in the captured image.

Figure 1 shows several photographs taken with a digital single-lens reflex (DSLR)
handheld camera.

The photographed scene consists of a laptop displaying a black image with white dots.
The captured picture of the white dots illustrates the trace of the camera movement in the
image plane.

If the dots are very small, mimicking Dirac masses, their photographs represent the
blurring kernels themselves. As one can see, the kernels mostly consist of one-dimensional
regular random trajectories. This stochastic behavior will be the key ingredient in our
proposed approach.
Let F denote the Fourier transform and k̂ = F(k) the Fourier transform of the kernel k. Images
are defined on a regular grid indexed by the 2D position x, and the Fourier domain is
indexed by the 2D frequency ζ.

Let us assume, without loss of generality, that the kernel k due to camera shake is
normalized such that ∫ k(x) dx = 1.

The blurring kernel is nonnegative, since the integration of incoherent light is always
nonnegative. This implies that the motion blur does not amplify the Fourier spectrum
magnitude: |k̂(ζ)| ≤ k̂(0) = 1 for all ζ.
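For completeness, this bound follows directly from the normalization and nonnegativity of k stated above; a one-line derivation (the constant in the Fourier exponent does not affect the bound):

```latex
\[
|\hat{k}(\zeta)|
  = \left| \int k(x)\, e^{-i 2\pi \zeta \cdot x}\, dx \right|
  \le \int |k(x)|\, dx
  = \int k(x)\, dx
  = 1
  = \hat{k}(0).
\]
```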
Fig. 2. Real camera shake kernels were computed using a sharp snapshot captured with a
tripod as a reference. The first row shows a crop of each input image (frames 1 to 14) and
the proposed Fourier Burst Accumulation (from (3), no additional sharpening). As noted
before, the kernels are generally regular unidimensional trajectories (second row). The
four last columns in the second row show the resultant point spread functions (PSF) after
the Fourier weighted average for different values of p. The kernel due to the Fourier
average with p > 0 is closer to a Delta function, showing the success of the method. The
two bottom rows show the Fourier real and imaginary parts of each blurring kernel (the
red box indicates the π/2 frequency). The real part is mostly positive and significantly
larger than the imaginary part, implying that the blurring kernels do not introduce
significant phase distortions. This might not be the case for large motion kernels,
uncommon in standard hand shakes.

IV. FOURIER BURST ACCUMULATION ANALYSIS

A. Anatomy of the Fourier Accumulation

The value of p sets a tradeoff between sharpness and noise reduction. Although one
would always prefer to get a sharp image, due to noise and the unknown Fourier phase
shift introduced by the aggregation procedure, the resultant image would not necessarily be
better as p → ∞.
Deblurring

Deblurring is the process of removing blurring artifacts from images, such as blur
caused by defocus aberration or motion blur. The blur is typically modeled as
the convolution of a (sometimes space- or time-varying) point spread function with a
hypothetical sharp input image, where both the sharp input image (which is to be
recovered) and the point spread function are unknown.

This is an example of an inverse problem. In almost all cases, there is insufficient
information in the blurred image to uniquely determine a plausible original image,
making it an ill-posed problem.

In addition, the blurred image contains noise, which complicates the task of
determining the original image. This is generally addressed by the use of
a regularization term to attempt to eliminate implausible solutions.
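As a concrete illustration of such a regularization term, the sketch below shows non-blind deconvolution with a simple quadratic (Tikhonov) penalty, solved in closed form in the Fourier domain; the kernel, the weight lam, and the function name are illustrative assumptions and not part of the method described in this report:

```python
import numpy as np

def tikhonov_deconvolve(blurred, kernel, lam=1e-2):
    """Non-blind deconvolution with a quadratic (Tikhonov) regularizer.

    Minimizes ||k * u - v||^2 + lam * ||u||^2, whose Fourier-domain solution is
        U = conj(K) * V / (|K|^2 + lam).
    Assumes the kernel is given with its center at index (0, 0); otherwise the
    result is translated.
    """
    K = np.fft.fft2(kernel, s=blurred.shape)
    V = np.fft.fft2(blurred)
    U = np.conj(K) * V / (np.abs(K) ** 2 + lam)
    return np.real(np.fft.ifft2(U))
```

The lam term keeps the division well behaved at frequencies where the kernel response is close to zero, which is exactly where an unregularized inverse would amplify noise.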

Image Fusion
In computer vision, multi-sensor image fusion is the process of combining relevant
information from two or more images into a single image. The resulting image will be
more informative than any of the input images.
In remote sensing applications, the increasing availability of space-borne sensors gives a
motivation for different image fusion algorithms.
Several situations in image processing require high spatial and high spectral resolution in
a single image. Most of the available equipment is not capable of providing such data
convincingly.
Image fusion techniques allow the integration of different information sources. The fused
image can have complementary spatial and spectral resolution characteristics. However,
the standard image fusion techniques can distort the spectral information of the
multispectral data while merging.
In satellite imaging, two types of images are available. The panchromatic image acquired
by satellites is transmitted with the maximum resolution available and the multispectral
data are transmitted with coarser resolution.
This will usually be two or four times lower. At the receiver station, the panchromatic
image is merged with the multispectral data to convey more information.

Many methods exist to perform image fusion. The very basic one is the high pass
filtering technique. Later techniques are based on Discrete Wavelet Transform, uniform
rational filter bank, and Laplacian pyramid.

Image fusion methods can be broadly classified into two groups - spatial domain fusion
and transform domain fusion.
The fusion methods such as averaging, Brovey method, principal component analysis
(PCA) and IHS based methods fall under spatial domain approaches.
Another important spatial domain fusion method is the high-pass filtering based
technique. Here the high-frequency details of the panchromatic image are injected into an
upsampled version of the multispectral (MS) images.
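A minimal sketch of this high-pass injection idea, assuming a panchromatic image pan and a single multispectral band ms_band already upsampled to the same size; the Gaussian low-pass and all parameter names are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hpf_fusion(ms_band, pan, sigma=2.0):
    """Inject the high-frequency detail of the PAN image into one MS band.

    ms_band: multispectral band already upsampled to the PAN resolution.
    pan:     panchromatic image of the same shape as ms_band.
    """
    pan = pan.astype(np.float64)
    detail = pan - gaussian_filter(pan, sigma)   # high-pass component of PAN
    return ms_band.astype(np.float64) + detail   # spectral content + spatial detail
```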
The disadvantage of spatial domain approaches is that they produce spatial distortion in
the fused image. Spectral distortion becomes a negative factor when further processing,
such as classification, is required.
Spatial distortion can be very well handled by frequency domain approaches on image
fusion. The multiresolution analysis has become a very useful tool for analysing remote
sensing images.
The discrete wavelet transform has become a very useful tool for fusion. Other fusion
methods also exist, such as Laplacian pyramid based and curvelet transform based
approaches.
These methods show a better performance in spatial and spectral quality of the fused
image compared to other spatial methods of fusion.
The images used in image fusion should already be registered. Misregistration is a major
source of error in image fusion. Some well-known image fusion methods are:

• High pass filtering technique
• IHS transform based image fusion
• PCA based image fusion
• Wavelet transform image fusion
• Pair-wise spatial frequency matching
IMAGE STABILIZATION
Image stabilization (IS) is a family of techniques used to reduce blurring associated with
the motion of a camera or other imaging device during exposure. Generally, it
compensates for pan and tilt (angular movement, equivalent to yaw and pitch) of the
imaging device, although electronic image stabilization can also be used to compensate
for rotation.

It is used in image-stabilized binoculars, still and video cameras,
astronomical telescopes, and also smartphones, mainly high-end ones. With still cameras,
camera shake is particularly problematic at slow shutter speeds or with long focal length
(telephoto or zoom) lenses.

With video cameras, camera shake causes visible frame-to-frame jitter in the recorded
video. In astronomy, the problem of lens shake is compounded by variations in the
atmosphere over time, which cause the apparent positions of objects to change.

Optical image stabilization


An optical image stabilizer, often abbreviated OIS, IS, or OS, is a mechanism used in a
still camera or video camera that stabilizes the recorded image by varying the optical path
to the sensor.
This technology is implemented in the lens itself, as distinct from In Body Image
Stabilization, which operates by moving the sensor as the final element in the optical
path.
The key element of all optical stabilization systems is that they stabilize the image
projected on the sensor before the sensor converts the image into digital information.
Different companies have different names for the OIS technology, for example:

• Vibration Reduction (VR) - Nikon (produced the first optical stabilized lens, a 38–105 mm f/4-7.8 zoom built into the Nikon Zoom 700VR (US: Zoom-Touch 105 VR) camera in 1994)[5]
• Image Stabilizer (IS) - Canon (introduced the first optically 2-axis stabilized lens (the EF 75-300mm F4-5.6 IS USM) in 1995. In 2009, they introduced their first lens (the EF 100mm F2.8 Macro L) utilizing their 4-axis Hybrid IS.)
• AntiShake (AS) - Konica Minolta (Minolta introduced the first sensor-based 2-axis image stabilizer with the DiMAGE A1 in 2003)
• IBIS - In Body Image Stabilisation - Olympus
• Optical SteadyShot (OSS) - Sony (for Cyber-shot and several Alpha E-mount lenses)
• MegaOIS, PowerOIS - Panasonic and Leica
• SteadyShot (SS), Super SteadyShot (SSS), SteadyShot INSIDE (SSI) - Sony (based on Konica Minolta's AntiShake originally, Sony introduced a 2-axis full-frame variant for the DSLR-A900 in 2008 and a 5-axis stabilizer for the full-frame ILCE-7M2 in 2014)
• Optical Stabilization (OS) - Sigma
• Vibration Compensation (VC) - Tamron
• Shake Reduction (SR) - Pentax
• PureView - Nokia (produced the first cell phone optical stabilised sensor, built into the Lumia 920)
• UltraPixel - HTC (Image Stabilization is only available for the 2013 HTC One with UltraPixel. It is not available for the HTC One (M8) or HTC Butterfly S which also have UltraPixel)
Most high-end smartphones as of late 2014 use optical image stabilization for photos
and videos.

ALGORITHM IMPLEMENTATION

The proposed burst restoration algorithm is built on three main blocks: Burst
Registration, Fourier Burst Accumulation, and Noise Aware Sharpening as a post-
processing. These are described in what follows.

A. Burst Registration

There are several ways of registering images (see [34] for a survey). In this work, we
use image correspondences to estimate the dominant homography relating every image of
the burst and a reference image (the first one in the burst).

The homography assumption is valid if the scene is planar (or far from the camera) or
the viewpoint location is fixed, e.g., the camera only rotates around its optical center.

Image correspondences are found using SIFT features [35] and then filtered out through
the ORSA algorithm [36], a variant of the so-called RANSAC method [37]. To mitigate
the effect of the camera shake blur we only detect SIFT features having a larger scale
than σmin = 1.8.

Recall that as in prior art, the registration can be done with the gyroscope and
accelerometer information from the camera.
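A possible sketch of this registration step using OpenCV, with plain RANSAC standing in for ORSA and without the SIFT-scale filtering mentioned above; function and variable names are illustrative:

```python
import cv2
import numpy as np

def register_to_reference(ref_gray, img_gray):
    """Estimate the homography mapping img_gray onto ref_gray and warp it.

    Uses SIFT matches filtered by Lowe's ratio test, followed by RANSAC
    (a stand-in for the ORSA step described above).
    """
    sift = cv2.SIFT_create()
    kp_ref, des_ref = sift.detectAndCompute(ref_gray, None)
    kp_img, des_img = sift.detectAndCompute(img_gray, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_img, des_ref, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    src = np.float32([kp_img[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    h, w = ref_gray.shape[:2]
    return cv2.warpPerspective(img_gray, H, (w, h))
```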

Noise Aware Sharpening

While the results of the Fourier burst accumulation are already very good,
considering that the process so far has been computationally non-intensive, one can
optionally apply a final sharpening step if resources are still available.

The sharpening must contemplate that the reconstructed image may have some
remaining noise. Thus, we first apply a denoising algorithm (we used the
NL-Bayes algorithm [38]), and a sharpening filter is then applied to the filtered image.
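A minimal sketch of this denoise-then-sharpen idea, with a Gaussian denoiser and unsharp masking standing in for the NL-Bayes step; all parameter values and names are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_aware_sharpen(img, denoise_sigma=1.0, sharpen_sigma=1.5, amount=1.0):
    """Denoise first, then sharpen the denoised image by unsharp masking."""
    img = img.astype(np.float64)
    denoised = gaussian_filter(img, denoise_sigma)        # stand-in for NL-Bayes
    blurred = gaussian_filter(denoised, sharpen_sigma)
    sharpened = denoised + amount * (denoised - blurred)  # unsharp mask
    return np.clip(sharpened, 0.0, 255.0)
```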

Memory and Complexity Analysis

Once the images are registered, the algorithm runs in O(M · m · log m), where m = mh ×
mw is the number of image pixels and M the number of images in the burst.

The heaviest part of the algorithm is the computation of M FFTs, an operation that is very
suitable for, and popular in, VLSI implementations. This is the reason why the method has a
very low complexity.

Regarding memory consumption, the algorithm does not need to access all the images
simultaneously and can proceed in an online fashion.

This keeps the memory requirements to only three buffers: one for the current image,
one for the current average, and one for the current weights sum.
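A sketch of this online accumulation, assuming registered frames arrive one at a time; the class and parameter names are illustrative:

```python
import numpy as np

class OnlineFBA:
    """Online Fourier Burst Accumulation with only three buffers:
    the current frame's FFT, the running weighted sum, and the running weight sum."""

    def __init__(self, shape, p=11):
        self.p = p
        self.weighted_sum = np.zeros(shape, dtype=np.complex128)
        self.weight_sum = np.zeros(shape, dtype=np.float64)

    def add_frame(self, frame):
        F = np.fft.fft2(frame)          # current frame buffer
        w = np.abs(F) ** self.p
        self.weighted_sum += w * F      # accumulate numerator
        self.weight_sum += w            # accumulate denominator

    def result(self):
        U = self.weighted_sum / (self.weight_sum + 1e-12)
        return np.real(np.fft.ifft2(U))
```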

Comparison to Multi-Image Blind Deblurring

Since this problem is typically addressed by multi-image blind deconvolution techniques,
we selected two state-of-the-art algorithms for comparison.

Both algorithms are built on variational formulations and estimate first the blurring
kernels using all the frames in the burst and then do a step of multi-image non-blind
deconvolution, requiring significant memory for normal operation. We used the code
provided by the authors.

The algorithms rely on parameters that were manually tuned to get the best possible
results.

We also compare to the simple align and average algorithm (which indeed is the
particular case p = 0).

In addition, we show two input images for each burst: the best one in the burst and a
typical one in the series.

The proposed algorithm obtains similar or better results than the one by Zhang et al. at
a significantly lower computational and memory cost.

Since this algorithm explicitly seeks to deconvolve the sequence, if the convolution
model is not perfectly valid or there is misalignment, the restored image will have
deconvolution artifacts.

This is clearly observed in the bookshelf sequence, where that method produces a slightly
sharper restoration but with ringing artifacts (see the Jonquet book).

Also, it is hard to read the word “Women” on the spine of the red book. Due to the
strong assumed priors, this approach generally leads to very sharp images, but it may also
produce overshooting/ringing in some regions, like the brick wall (parking night).

The proposed method clearly outperforms the second compared algorithm in all the sequences.

This algorithm introduces strong artifacts that degraded the performance in most of the
tested bursts.

Tuning the parameters was not trivial since this algorithm relies on 4 parameters that
the authors have linked to a single one (named γ). We swept the parameter γ to get the
best possible performance.

Our approach is conceptually similar to a regular align and average algorithm, but it
produces significantly sharper images while keeping the noise reduction power of the
average principle.

In some cases with numerous images in the burst (e.g., see the parking night
sequence), there might already be a relatively sharp image in the burst (lucky imaging).

Our algorithm does not need to explicitly detect such a “best” frame, and naturally uses
the others to denoise the frequencies not containing image information but noise.

B. Execution Time
Once the images are registered, the proposed approach runs in only a few seconds in
our Matlab experimental code, while the compared multi-image deconvolution method
needs several hours for bursts of 8-10 images.

Even if the estimation of the blurring kernels is done in a cropped version (i.e., 200 ×
200 pixels region), the multi-image non-blind deconvolution step is very slow, taking
several hours for 6-8 megapixel images.

C. Multi-Image Non-Blind Deconvolution

We show results on two sequences provided in [25]. The algorithm proposed there uses
gyroscope information present in new camera devices to register the burst and also to
obtain an estimation of the local blurring kernels.

Then a more expensive multi-image non-blind deconvolution algorithm is applied to
recover the latent sharp image.

Our algorithm produces similar results without explicitly solving any inverse problem
nor using any information about the motion kernels.
Restoration results with the data provided in [25] (anthropologie and tequila sequences).

D. HDR Imaging: Multi-Exposure Fusion

• In many situations, the dynamic range of the photographed scene is larger than that of the camera's image sensor.

• A popular solution to this problem is to capture several images with varying exposure settings and combine them to produce a single high dynamic range, high-quality image.

• However, in dim light conditions, large exposure times are needed, leading to the presence of image blur when the images are captured without a tripod.

• This presents an additional challenge. A direct extension of the present work is to capture several bursts, each one covering a different exposure level.

• Then, each of the bursts is processed with the FBA procedure, leading to a clean, sharper representation of each burst.

• The obtained sharp images can then be merged to produce a high-quality image using any existing exposure fusion algorithm (a sketch of this pipeline follows this list).

• We show the results of taking two image bursts with two different exposure times, and separately aggregating them using the proposed algorithm.

• We then applied an exposure fusion algorithm to get a clean, tone-mapped image.
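A minimal sketch of this two-stage pipeline, assuming each burst is already registered and stored as color arrays of shape (H, W, C); the Fourier accumulation is applied per channel (a simplification), and OpenCV's Mertens merger is used here as one possible existing exposure fusion algorithm:

```python
import cv2
import numpy as np

def fba(frames, p=11):
    """Weighted Fourier accumulation of one registered burst, per color channel."""
    stack = np.stack(frames).astype(np.float64)          # (N, H, W, C)
    out = np.empty(stack.shape[1:])
    for c in range(stack.shape[-1]):
        ffts = np.fft.fft2(stack[..., c], axes=(1, 2))
        w = np.abs(ffts) ** p
        fused = (w * ffts).sum(axis=0) / (w.sum(axis=0) + 1e-12)
        out[..., c] = np.real(np.fft.ifft2(fused))
    return out

def hdr_from_bursts(bursts):
    """Run FBA on each exposure burst, then exposure-fuse the sharp results.

    bursts: list of bursts; each burst is a list of registered color frames
            captured at one exposure setting.
    """
    sharp = [np.clip(fba(b), 0, 255).astype(np.uint8) for b in bursts]
    merger = cv2.createMergeMertens()     # exposure fusion, no camera response needed
    fused = merger.process(sharp)         # float result, roughly in [0, 1]
    return np.clip(fused * 255, 0, 255).astype(np.uint8)
```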
MATLAB
1. What is MATLAB?
MATLAB (matrix laboratory) is a multi-paradigm numerical
computing environment and fourth-generation programming language. A proprietary
programming language developed by MathWorks, MATLAB
allows matrix manipulations, plotting of functions and data, implementation
of algorithms, creation of user interfaces, and interfacing with programs written in other
languages, including C, C++, Java, Fortran and Python.

Although MATLAB is intended primarily for numerical computing, an optional toolbox
uses the MuPAD symbolic engine, allowing access to symbolic computing capabilities.
An additional package, Simulink, adds graphical multi-domain simulation and model-based
design for dynamic and embedded systems.

MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where
problems and solutions are expressed in familiar mathematical notation.

Typical uses include:

• Math and computation
• Algorithm development
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including Graphical User Interface building

MATLAB is an interactive system whose basic data element is an array that does not
require dimensioning.

This allows you to solve many technical computing problems, especially those with
matrix and vector formulations, in a fraction of the time it would take to write a program
in a scalar noninteractive language such as C or Fortran.
The name MATLAB stands for matrix laboratory.

MATLAB was originally written to provide easy access to matrix software developed by
the LINPACK and EISPACK projects, which together represent the state-of-the-art in
software for matrix computation.

MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses
in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for
high-productivity research, development, and analysis.

MATLAB features a family of application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow you to learn and apply specialized
technology.

Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend
the MATLAB environment to solve particular classes of problems.

Areas in which toolboxes are available include signal processing, control systems, neural
networks, fuzzy logic, wavelets, simulation, and many others.

B. AVERAGING FILTER:

Mean filter or averaging filter is a simple linear filter and an easy-to-implement method of
smoothing images. The average filter is often used to reduce noise and also to reduce the
amount of intensity variation from one pixel to another. Here, one first takes an average,
that is, the sum of the elements divided by the number of elements, and then replaces each
pixel in an image by the average of the pixels in a square window surrounding this pixel
[8, 9, 10]. Fig 2 depicts the functionality behind the averaging filter. The filter can be written as

h[i, j] = (1/M) Σ_{(k,l) ∈ N} f[k, l],

where M is the total number of pixels in the neighborhood N. For example, a 3x3
neighborhood about [i, j] yields

h[i, j] = (1/9) Σ_{k=i-1}^{i+1} Σ_{l=j-1}^{j+1} f[k, l].

The problem with the averaging filter is that, while it removes noise more effectively with
larger windows, it also blurs the details in an image.
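A one-function sketch of this averaging filter using scipy (the window size and function name are illustrative); by default uniform_filter handles image borders by reflection:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def averaging_filter(img, size=3):
    """Replace each pixel by the mean of a size x size window around it."""
    return uniform_filter(img.astype(np.float64), size=size)
```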
ARITHMETIC MEAN FILTER:

DESCRIPTION:
Applies an arithmetic mean filter to an image.

An arithmetic mean filter operation on an image removes short-tailed noise, such as uniform and
Gaussian type noise, from the image at the cost of blurring the image. The arithmetic mean filter
is defined as the average of all pixels within a local region of an image.

The arithmetic mean is defined as:

f̂(x, y) = (1/(mn)) Σ_{(s,t) ∈ S_xy} g(s, t),

where S_xy is the m × n neighborhood centered at (x, y).

Pixels that are included in the averaging operation are specified by a mask. The larger the
filtering mask becomes, the more predominant the blurring becomes and the less high spatial
frequency detail remains in the image.

GEOMETRIC MEAN FILTER:

DESCRIPTION
Applies a geometric mean filter to an image.

In the geometric mean method, the color value of each pixel is replaced with the geometric mean
of the color values of the pixels in a surrounding region. A larger region (filter size) yields a
stronger filter effect, with the drawback of some blurring.

The geometric mean is defined as:

f̂(x, y) = [ Π_{(s,t) ∈ S_xy} g(s, t) ]^{1/(mn)},

where S_xy is the m × n neighborhood centered at (x, y).

The geometric mean filter is better at removing Gaussian type noise and preserving edge features
than the arithmetic mean filter. The geometric mean filter is very susceptible to negative outliers.

MAXIMUM FILTER:

DESCRIPTION:
Applies a maximum filter to an image. The maximum filter is defined as the
maximum of all pixels within a local region of an image.
The maximum filter is typically applied to an image to remove negative outlier noise.

MINIMUM FILTER:

DESCRIPTION:
Applies a minimum filter to an image. The minimum filter is defined as the
minimum of all pixels within a local region of an image. The minimum filter is typically
applied to an image to remove positive outlier noise.

MIDPOINT FILTER:

DESCRIPTION:
Applies a midpoint filter to an image.

In the midpoint method, the color value of each pixel is replaced with the average of maximum
and minimum (i.e. the midpoint) of color values of the pixels in a surrounding region. A larger
region (filter size) yields a stronger effect.

The midpoint filter is typically used to filter images containing short tailed noise such as
Gaussian and uniform type noise.

YP MEAN FILTER:

DESCRIPTION
Applies a Yp Mean filter to an image.

The Yp mean filter is a member of a set of nonlinear mean filters which are better at removing
Gaussian type noise and preserving edge features than the arithmetic mean filter. The Yp mean
filter is very good at removing positive outliers for negative values of P and negative outliers for
positive values of P.
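For illustration, sketches of the geometric mean, midpoint and Yp (power) mean filters described above; scipy.ndimage already provides minimum_filter and maximum_filter for the minimum and maximum filters. The window size, the eps guard, and the reading of the Yp mean as a power mean are assumptions of this sketch:

```python
import numpy as np
from scipy.ndimage import uniform_filter, minimum_filter, maximum_filter

def geometric_mean_filter(img, size=3, eps=1e-6):
    """Geometric mean over a size x size window, computed in the log domain."""
    log_img = np.log(img.astype(np.float64) + eps)   # eps guards against log(0)
    return np.exp(uniform_filter(log_img, size=size))

def midpoint_filter(img, size=3):
    """Average of the local minimum and maximum (the midpoint)."""
    img = img.astype(np.float64)
    return 0.5 * (minimum_filter(img, size=size) + maximum_filter(img, size=size))

def yp_mean_filter(img, p=2.0, size=3, eps=1e-6):
    """Power (Yp) mean: ((1/mn) * sum g^p)^(1/p); negative p suppresses bright outliers."""
    img = img.astype(np.float64) + eps
    return uniform_filter(img ** p, size=size) ** (1.0 / p)
```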

MEDIAN FILTER:
Median filtering is a nonlinear operation often used in image processing to reduce noise. A
median filter is more effective than convolution when the goal is to simultaneously reduce noise
and preserve edges. The median filter, like the mean filter, considers each pixel in the image in
turn and looks at its nearby neighbors to decide whether or not it is representative of its
surroundings. Instead of simply replacing the pixel value with the mean of neighboring pixel
values, it replaces it with the median of those values. The median is calculated by sorting all the
pixel values from the surrounding neighborhood into numerical order and then replacing the
pixel being considered with the middle pixel value [11, 12]. Note that if the window has an odd
number of entries, then the median is simple to define: it is the middle value after all the entries
in the window are sorted numerically. For an even number of entries, there is more than one
possible median. In median filtering, the neighboring pixels are ranked according to brightness
(intensity) and the median value becomes the new value for the central pixel.

Fig.: Working principle of the median filter.


Advantages of the median filter are that there is no reduction in contrast across steps, since the
output values consist only of those present in the neighborhood (no averages), and that, because
the median is less sensitive than the mean to extreme values (outliers), those extreme values are
more effectively removed. The disadvantage of the median filter is that it is sometimes not as
good, subjectively, at dealing with large amounts of Gaussian noise as the mean filter.
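The median filter itself is a one-liner with scipy (the window size is illustrative):

```python
from scipy.ndimage import median_filter

def denoise_median(img, size=3):
    """Replace each pixel by the median of a size x size neighborhood."""
    return median_filter(img, size=size)
```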

WIENER FILTER
The important use of the Wiener filter is to reduce the amount of noise present in an image by
comparison with an estimation of the desired noiseless signal. It is based on a statistical
approach. Wiener filters are characterized by three important factors:
1) Assumption: the image and noise are stationary linear stochastic processes
with known spectral characteristics or known autocorrelation and cross-correlation
2) Requirement: the filter must be physically realizable/causal
3) Performance criterion: minimum mean-square error (MMSE).

This filter is frequently used in the process of deconvolution. Inverse filtering is a
restoration technique for deconvolution, i.e., when the image is blurred by a known low-pass
filter, it is possible to recover the image by inverse filtering or generalized inverse filtering.
However, inverse filtering is very sensitive to additive noise. The approach of reducing one
degradation at a time leads to the development of a restoration algorithm. The Wiener filtering
executes an optimal tradeoff between inverse filtering and noise smoothing [13, 14, 15]. It
removes the additive noise and inverts the blurring simultaneously.
The Wiener filtering is optimal in terms of the mean square error. In other words, it minimizes
the overall mean square error in the process of inverse filtering and noise smoothing. The Wiener
filtering is a linear estimation of the original image. The orthogonality principle implies that the
Wiener filter in the Fourier domain can be expressed as follows:

W(f1, f2) = H*(f1, f2) S_xx(f1, f2) / ( |H(f1, f2)|^2 S_xx(f1, f2) + S_nn(f1, f2) ),

where S_xx(f1, f2) and S_nn(f1, f2) are the power spectra of the original image and the additive noise, and
H(f1, f2) is the blurring filter. It is easy to see that the Wiener filter has two separate parts, an
inverse filtering part and a noise smoothing part. It not only performs the deconvolution by
inverse filtering (high-pass filtering) but also removes the noise with a compression operation
(low-pass filtering).
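A sketch of this Wiener deconvolution with the power-spectra ratio S_nn/S_xx replaced by a single constant noise-to-signal ratio, a common simplification; the kernel alignment assumption and the parameter values are illustrative:

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, nsr=1e-2):
    """Wiener deconvolution in the Fourier domain.

    nsr approximates S_nn / S_xx (noise-to-signal power ratio), assumed constant
    instead of using the true power spectra. Assumes the kernel is given with its
    center at index (0, 0); otherwise the output is translated.
    """
    H = np.fft.fft2(kernel, s=blurred.shape)
    V = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)   # Wiener filter with constant NSR
    return np.real(np.fft.ifft2(W * V))
```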

The goal of the Wiener filter is to filter out noise that has corrupted a signal. It is based on
a statistical approach, and a more statistical account of the theory is given in the literature on
MMSE estimation.
Typical filters are designed for a desired frequency response. However, the design of the Wiener
filter takes a different approach. One is assumed to have knowledge of the spectral properties of
the original signal and the noise, and one seeks the linear time-invariant filter whose output
would come as close to the original signal as possible. Wiener filters are characterized by the
following:

1. Assumption: signal and (additive) noise are stationary linear stochastic processes with
known spectral characteristics or known autocorrelation and cross-correlation
2. Requirement: the filter must be physically realizable/causal (this requirement can be
dropped, resulting in a non-causal solution)
3. Performance criterion: minimum mean-square error (MMSE)

This filter is frequently used in the process of deconvolution; for this application, see Wiener
deconvolution.

The realization of the causal Wiener filter looks a lot like the solution to the least
squares estimate, except in the signal processing domain. The least squares solution, for input
matrix X and output vector y, is

ŵ = (Xᵀ X)⁻¹ Xᵀ y.
The FIR Wiener filter is related to the least mean squares filter, but minimizing the error
criterion of the latter does not rely on cross-correlations or auto-correlations. Its solution
converges to the Wiener filter solution.

The Wiener filter can be used in image processing to remove noise from a picture; for example,
the Mathematica function WienerFilter[image, 2] applies such a filter to an image. It is commonly
used to denoise audio signals, especially speech, as a preprocessor before speech recognition.

This section formulates the general filtering problem and explains the conditions under which the
general filter simplifies to a Kalman filter (KF). A typical application of the Kalman Filter
illustrates the context in which it is used: a physical system (e.g., a mobile robot, a chemical
process, a satellite) is driven by a set of external inputs or controls, and its outputs are evaluated
by measuring devices or sensors, such that the knowledge of the system’s behavior is solely given
by the inputs and the observed outputs. The observations convey the errors and uncertainties in
the process, namely the sensor noise and the system errors. Based on the available information
(control inputs and observations), it is required to obtain an estimate of the system’s state that
optimizes a given criterion. This is the role played by a filter. In particular situations, explained in
the following sections, this filter is a Kalman Filter.
The general filtering problem may be formulated along the following lines. Let

x(k + 1) = f(x(k), u(k), w(k))
y(k) = h(x(k), v(k))

be the state dynamics of a general non-linear time-varying system, where

• x ∈ ℝⁿ is the system state vector,
• f(·, ·, ·) defines the system’s dynamics,
• u ∈ ℝᵐ is the control vector,
• w is the vector that conveys the system error sources,
• y ∈ ℝʳ is the observation vector,
• h(·, ·) is the measurement function,
• v is the vector that represents the measurement error sources.

Any type of filter tries to obtain an optimal estimate of the desired quantities (the system’s state)
from data provided by a noisy environment. The concept of optimality expressed by the words
best estimate corresponds to the minimization of the state estimation error in some respect.
Taking a Bayesian viewpoint, the filter propagates the conditional probability density function of
the desired quantities, conditioned on the knowledge of the actual data coming from the
measuring devices, i.e., the filter evaluates and propagates the conditional pdf of the state given
the observations.
Different optimization criteria may be chosen, leading to different estimates of the system’s state
vector. The estimate can be

• the mean, i.e., the center of the probability mass, corresponding to the minimum mean-square error criterion,
• the mode, which corresponds to the value of x that has the highest probability, corresponding to the Maximum a Posteriori (MAP) criterion,
• the median, where the estimate is the value of x such that half the probability weight lies to the left and half to the right of it.

For a general conditional pdf, these criteria lead to different state estimates. So far, we have
formulated the general filtering problem. Under a set of particular conditions related to the
linearity of the system (state and observation) dynamics and the normality of the random vectors
involved (e.g., initial condition, state and measurement noise), the conditional probability density
functions propagated by the filter are Gaussian for every k. The involved pdfs are thus completely
characterized by the mean vector and the covariance matrix. Rather than propagating the entire
pdf, the filter only propagates (recursively) the first and second moments of the conditional pdf.
The general filter then simplifies to what is known as the Kalman filter, whose dynamics will be
derived.
The Kalman filter dynamics will be derived as a general random parameter vector estimation.
The KF evaluates the minimum mean-square error estimate of the random vector that is the
system’s state. Basic results on the estimation of a random parameter vector based on a set of
observations are presented first. This is the framework in which the Kalman filter will be derived,
given that the state vector of a given dynamic system is interpreted as a random vector whose
estimation is required. Deeper presentations of the issues of parameter estimation may be found
in the literature; here those results are particularized to the filtering problem.

The Kalman Filter


The filtering problem was stated above for a general nonlinear system dynamics. Consider now
that the system has a linear time-varying dynamics, i.e., the model particularizes to

x_{k+1} = A_k x_k + B_k u_k + G w_k
y_k = C_k x_k + v_k,

where x(k) ∈ ℝⁿ, u(k) ∈ ℝᵐ, w(k) ∈ ℝⁿ, v(k) ∈ ℝʳ, y(k) ∈ ℝʳ, and {w_k} and {v_k} are
sequences of white, zero-mean Gaussian noise,

E[w_k] = E[v_k] = 0,

with a given joint covariance matrix.
Kalman Filter dynamics


When v_k, w_k and x_0 are Gaussian vectors, the random vectors x_k, x_{k+1} and Y_1^k are
jointly Gaussian. As discussed before, the Kalman filter propagates the Gaussian conditional pdf
p(x_k | Y_1^k, U_0^{k-1}), and therefore the filter dynamics defines the general transition from
p(x_k | Y_1^k, U_0^{k-1}) to p(x_{k+1} | Y_1^{k+1}, U_0^k), where both pdfs are Gaussian and
the input and observation information available at time instants k and k+1 are used. Rather than
being done directly, this transition is implemented as a two-step procedure, a prediction cycle
and a filtering or update cycle:

• p(x_{k+1} | Y_1^k, U_0^k), defined for time instant k + 1, represents what can be said about x(k + 1) before making the observation y(k + 1).

• The filtering cycle states how to improve the information on x(k + 1) after making the observation y(k + 1).

In summary, the Kalman filter dynamics results from a recursive application of prediction and
filtering cycles.
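A compact sketch of one such prediction + update cycle for the linear model above; here the matrix G is absorbed into the process-noise covariance Q (a simplification), and all names are illustrative:

```python
import numpy as np

def kalman_step(x, P, u, y, A, B, C, Q, R):
    """One predict + update cycle of the linear Kalman filter.

    x, P : previous state estimate and its covariance
    u, y : current control input and measurement
    A, B, C, Q, R : system, input and observation matrices, and noise covariances
    """
    # Prediction cycle: propagate the estimate through the linear dynamics.
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q

    # Filtering (update) cycle: correct the prediction with the new observation.
    S = C @ P_pred @ C.T + R                 # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```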

The Extended Kalman Filter


In this section we address the filtering problem when the system dynamics (state and
observations) is nonlinear. With no loss of generality we will consider that the system has no
external inputs. Consider the non-linear dynamics

x_{k+1} = f_k(x_k) + w_k
y_k = h_k(x_k) + v_k,

where x_k ∈ ℝⁿ, f_k: ℝⁿ → ℝⁿ, y_k ∈ ℝʳ, h_k: ℝⁿ → ℝʳ, v_k ∈ ℝʳ, w_k ∈ ℝⁿ,

and {v_k}, {w_k} are white Gaussian, independent random processes with zero mean and covariance
matrices
E[v_k v_kᵀ] = R_k,  E[w_k w_kᵀ] = Q_k, (5.4)
and x_0 is the system initial condition, considered as a Gaussian random vector,
x_0 ~ N(x̄_0, Σ_0).
Let Y_1^k = {y_1, y_2, . . . , y_k} be a set of system measurements. The filter’s goal is to obtain an
estimate of the system’s state based on these measurements. As presented earlier, the estimator
that minimizes the mean-square error evaluates the conditional mean of the pdf of x_k given
Y_1^k. Except in very particular cases, the computation of the conditional mean requires the
knowledge of the entire conditional pdf. One of these particular cases, referred to above, is the
one in which the system dynamics is linear, the initial condition is a Gaussian random vector, and
the system and measurement noises are mutually independent white Gaussian processes with
zero mean. As a consequence, the conditional pdf remains Gaussian.
To evaluate its first and second moments, the optimal nonlinear filter has to propagate the entire
pdf which, in the general case, represents a heavy computational burden. The Extended Kalman
filter (EKF) gives an approximation of the optimal estimate. The non-linearities of the system’s
dynamics are approximated by a linearized version of the non-linear system model around the
last state estimate. For this approximation to be valid, this linearization should be a good
approximation of the non-linear model in all the uncertainty domain associated with the state
estimate. Each EKF cycle consists of consecutive prediction and filtering updates with the
corresponding pdf transitions. Rather than propagating the non-Gaussian pdf, the Extended
Kalman filter considers, at each cycle, a linearization of the non-linear dynamics around the last
consecutive predicted and filtered estimates of the state, and, for the linearized dynamics, it
applies the Kalman Filter. One iteration of the EKF is composed of the following consecutive
steps:
1. Consider the last filtered state estimate x̂(k|k),
2. Linearize the system dynamics, x_{k+1} = f(x_k) + w_k, around x̂(k|k),
3. Apply the prediction step of the Kalman filter to the linearized system dynamics just obtained,
yielding x̂(k + 1|k) and P(k + 1|k),
4. Linearize the observation dynamics, y_k = h(x_k) + v_k, around x̂(k + 1|k),
5. Apply the filtering or update cycle of the Kalman filter to the linearized observation dynamics,
yielding x̂(k + 1|k + 1) and P(k + 1|k + 1).

Let F(k) and H(k + 1) be the Jacobian matrices of f(·) and h(·), denoted by

F(k) = ∇f_k |_{x̂(k|k)},  H(k + 1) = ∇h_{k+1} |_{x̂(k+1|k)}.
The Extended Kalman filter algorithm is stated below:

Prediction cycle:
x̂(k + 1|k) = f_k(x̂(k|k))
P(k + 1|k) = F(k) P(k|k) Fᵀ(k) + Q(k)

Filtering (update) cycle:
x̂(k + 1|k + 1) = x̂(k + 1|k) + K(k + 1) [y_{k+1} − h_{k+1}(x̂(k + 1|k))]
K(k + 1) = P(k + 1|k) Hᵀ(k + 1) [H(k + 1) P(k + 1|k) Hᵀ(k + 1) + R(k + 1)]⁻¹
P(k + 1|k + 1) = [I − K(k + 1) H(k + 1)] P(k + 1|k)

It is important to state that the EKF is not an optimal filter; rather, it is implemented based
on a set of approximations. Thus, the matrices P(k|k) and P(k + 1|k) do not represent the true
covariance of the state estimates. Moreover, as the matrices F(k) and H(k) depend on previous
state estimates and therefore on measurements, the filter gain K(k) and the matrices P(k|k) and
P(k + 1|k) cannot be computed off-line as occurs in the Kalman filter. Contrary to the Kalman
filter, the EKF may diverge if the consecutive linearizations are not a good approximation of the
linear model in all the associated uncertainty domain.
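The corresponding EKF cycle, assuming the user supplies the nonlinear functions and their Jacobians; a sketch only, with illustrative names:

```python
import numpy as np

def ekf_step(x, P, y, f, h, jac_f, jac_h, Q, R):
    """One prediction + update cycle of the Extended Kalman filter.

    f, h         : nonlinear state and observation functions
    jac_f, jac_h : functions returning their Jacobians at a given state
    """
    # Prediction: propagate through the nonlinear dynamics, linearize for the covariance.
    F = jac_f(x)
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q

    # Update: linearize the observation model around the predicted state.
    H = jac_h(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```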
Image fusion methods can be broadly classified into two groups: transform domain fusion and
spatial domain fusion.

Transform Domain Fusion: In transform domain fusion methods the input images are first
transformed then fused and the result is converted back by an inverse transform. In these
methods the fusing coefficients are calculated with fusion rules which are either pixel based or
region based (Oudre, 2007).

Spatial Domain Fusion: In spatial domain fusion input images are worked on directly. Weights
are estimated for each input image and for each pixel with iterative methods which optimize a
cost function (Oudre, 2007). The fusion methods such as averaging, Brovey Transformation (BT)
(Al-Wassai et al., 2011), principal component analysis (PCA) (Canga, 2002), and Intensity-Hue-
Saturation (IHS) based methods (Tu et al., 2001) fall under spatial domain approaches. Another
important spatial domain fusion method is the high pass filtering based technique. Here the high
frequency details are injected into up-sampled version of MS images. The disadvantage of spatial
domain approaches is that they produce spatial distortion in the fused image. Spectral distortion
becomes a negative factor when further processing, such as classification, is required.
Spatial distortion can be handled by transform domain approaches on image fusion. Some other
transform domain fusion methods have also been developed, such as Laplacian pyramid based,
curvelet transform based, etc. These methods show a better performance in spatial and spectral
quality of the fused image compared to other spatial methods of fusion. The images to be used in
image fusion should be registered before the process and misregistration is a major source of
error in image fusion. The most commonly used image fusion methods are listed below and will
be described together with their basic formulation in the following sections.
IHS transform based image fusion,
Brovey Transform (BT),
Wavelet transform image fusion (WT),
Principal component analysis (PCA) based image fusion,

Since the emergence of image fusion techniques in various applications, methods that can assess
or evaluate the performance of different fusion techniques objectively, systematically and
quantitatively have been recognized as an important necessity. In this chapter various fusion
assessment techniques that have been proposed in the field of image fusion, is discussed.
However, in general, the receiver of the fused images will not be a human viewer but some form
of automated image processing system. The loss of some information appears in the fused image
compare to the individual source images,and this loss sometimes becomes severe because, this
lost information in one par-ticular image processing application might be important for another.
Therefore, ageneral assessment method is always been needed, even if the application of image
fusion is unknown in advance. To assess the fusion performance Li et al. proposed a method to
calculate the standard deviation between the reference image (ground truth) and the fused image.
Other statistical measurements such as Signal to Noise Ratio (SNR), Peak Signal to Noise Ratio
(PSNR) and Mean Square Error (MSE) from digital signal processing are also commonly used
to assess image fusion methods when the reference image is available. However, in a
practical situation the reference image is rarely known. Some image fusion assessment methods
have recently been developed that evaluate the fusion method without any reference image. These
methods assess the fusion on the basis of the input-output relationship. In one such approach, the
Mutual Information (MI) principle is used: MI measures the quantity of information transferred
from the source images (input) to the fused image. Xydeas and Petrovic proposed a pixel-level
fusion assessment technique (Qp) in which visual or perceptual information is directly associated
with the edge information, while region information is ignored. For the assessment of structural
information, a Structural Similarity (SSIM) index framework has been developed and used to
measure the quality of the fused image. These methods (MI, edge-based Qp and SSIM) measure
the amount of information transferred from the input images to the fused image and therefore do
not need a reference image. This chapter is organized as
follows. Section 3.1 discusses some statistical measures such as the SNR, PSNR, and MSE,
which require an ideal or reference image to assess the fusion technique.
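
As a concrete illustration of these reference-based measures, the following MATLAB sketch computes the MSE, SNR and PSNR between a ground-truth image and a fused result. It is a minimal sketch: the file names are placeholders and both images are assumed to be of the same size, with PSNR computed against a peak value of 1 for double-scaled images.

% Hedged sketch: reference-based assessment of a fused image (file names are placeholders)
ref = im2double(imread('reference.png'));          % ground-truth (reference) image
fus = im2double(imread('fused.png'));              % fused image to be assessed
err = ref - fus;
mse  = mean(err(:).^2);                            % Mean Square Error
snr  = 10*log10(sum(ref(:).^2) / sum(err(:).^2));  % Signal to Noise Ratio (dB)
psnr = 10*log10(1 / mse);                          % Peak SNR (dB); peak value is 1 for double images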
Image fusion is the technique of merging several images from multi-modal sources with
respective complementary information to form a new image, which carries all the common as
well as complementary features of individual images. With the recent rapid developments in the
domain of imaging technologies, multi-sensor systems have become a reality in fields as wide-ranging
as remote sensing, medical imaging, machine vision and military applications. Image fusion
provides an effective way of reducing this increasing volume of information by extracting all the
useful information from the source images. Image fusion provides an effective method to enable
comparison and analysis of Multi-sensor data having complementary information about the
concerned region. Image fusion creates new images that are more suitable for the purposes of
human/machine perception, and for further image-processing tasks such as segmentation, object
detection or target recognition in applications such as remote sensing and medical imaging.
Images from multiple sensors usually have different geometric representations, which have to be
transformed to a common representation for fusion. This representation should retain the best
resolution of either sensor. The alignment of multi-sensor images is also one of the most
important preprocessing steps in image fusion. Multi-sensor registration is also affected by the
differences in the sensor images. However, image fusion does not necessarily imply multi-sensor
sources: there can be single-sensor or multi-sensor image fusion, both of which are described in
this report.
Analogous to other forms of information fusion, image fusion is usually performed at one of the
three different processing levels: signal, feature and decision. Signal level image fusion, also
known as pixel-level image fusion, represents fusion at the lowest level, where a number of raw
input image signals are combined to produce a single fused image signal. Object level image
fusion, also called feature level image fusion, fuses feature and object labels and property
descriptor information that have already been extracted from individual input images. Finally,
the highest level, decision or symbol level image fusion represents fusion of probabilistic
decision information obtained by local decision makers operating on the results of feature level
processing on image data produced from individual sensors.
A single system may employ image fusion at all three levels of processing; this general structure
could be used as a basis for almost any image processing system.
1.1 Single Sensor Image Fusion System
The basic single-sensor image fusion scheme is shown in Fig. 2.1. The sensor could be a
visible-band sensor or a sensor in some other matching band. It captures the real world as a
sequence of images, which are then fused together to generate a new image with optimum
information content. For example, in a noisy environment with varying illumination, a human
operator may not be able to detect objects of interest, which can however be highlighted in the
resultant fused image.

Fig. 2.1 Single Sensor Image Fusion System


The shortcoming of this type of system lies in the limitations of the imaging sensor that is
being used. The conditions under which the system can operate, the dynamic range, resolution,
etc. are all restricted by the capability of the sensor. For example, a visible-band sensor such as
the digital camera is appropriate for a brightly illuminated environment such as daylight scenes
but is not suitable for poorly illuminated situations found during night, or under adverse
conditions such as in fog or rain.

Multi-Sensor Image Fusion System


A multi-sensor image fusion scheme overcomes the limitations of single-sensor image fusion
by merging the images from several sensors to form a composite image. Figure 2.2 illustrates a
multi-sensor image fusion system. Here, an infrared camera accompanies the digital camera
and their individual images are merged to obtain a fused image. This approach overcomes the
issues referred to before: the digital camera is suitable for daylight scenes, while the infrared
camera is appropriate in poorly illuminated environments.

Fig.2.2 Multisensory Image Fusion System


The benefits of multi-sensor image fusion include :
i. Improved reliability – The fusion of multiple measurements can reduce noise and therefore
improve the reliability of the measured quantity.

ii. Robust system performance – Redundancy in multiple measurements can improve system
robustness: in case one or more sensors fail or the performance of a particular sensor
deteriorates, the system can depend on the other sensors.

iii. Compact representation of information – Fusion leads to compact representations. For


example, in remote sensing, instead of storing imagery from several spectral bands, it is
comparatively more efficient to store the fused information.

iv. Extended range of operation – Multiple sensors that operate under different operating
conditions can be deployed to extend the effective range of operation. For example, different
sensors can be used for day/night operation.

v. Extended spatial and temporal coverage – Joint information from sensors that differ in spatial
resolution can increase the spatial coverage. The same is true for the temporal dimension.
vi. Reduced uncertainty – Joint information from multiple sensors can reduce the uncertainty
associated with the sensing or decision process.

Image Preprocessing
As in signal processing, there are very often some issues that have to be addressed
before the final image fusion. Most of the time the images are geometrically misaligned.
Registration is the technique to establish a spatial correspondence between the sensor images and
to determine a spatial geometric transformation. The misalignment of image features is induced
by various factors including the geometries of the sensors, different spatial positions and
temporal capture rates of the sensors and the inherent misalignment of the sensing elements.
Registration techniques align the images by exploiting the similarities between sensor images.
The mismatch of image features in multi sensor images reduces the similarities between the
images and makes it difficult to establish the correspondence between the images.
The second issue is the difference in spatial resolution between the images produced by
different sensors. There are several techniques to overcome this issue, such as super-resolution
techniques. Another methodology is to use multi-resolution image representations so
that the lower resolution imagery does not adversely affect the higher resolution imagery.
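
The sketch below illustrates the registration step using Image Processing Toolbox registration functions. It is a hedged sketch: the file names are placeholders, single-channel (grayscale) source images are assumed, and the 'monomodal' configuration and rigid transformation are illustrative choices rather than settings prescribed by this report.

% Hedged sketch: intensity-based registration prior to fusion
fixed  = im2double(imread('sensor1.png'));             % reference image (placeholder file name)
moving = im2double(imread('sensor2.png'));             % image to be aligned to the reference
[optimizer, metric] = imregconfig('monomodal');        % optimizer/metric pair for same-modality images
registered = imregister(moving, fixed, 'rigid', optimizer, metric);  % rigid (rotation + translation) alignment
imshowpair(fixed, registered, 'blend');                % visual check of the alignment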

Image Fusion Techniques


The most essential question concerning image fusion is how to merge the sensor images.
In recent years, a number of image fusion methods have been proposed. One of the most primitive
fusion schemes is the pixel-by-pixel gray-level average of the source images. This simplistic
method often has severe side effects such as reducing the contrast. More refined approaches
began to develop with the introduction of the pyramid transform in the mid-1980s, and improved
results were obtained by performing the fusion in the transform domain. The basic idea is to
perform a multi-resolution decomposition on each source image, then integrate all these
decompositions to develop a composite representation, and finally reconstruct the fused image by
performing an inverse multi-resolution transform. A number of pyramidal decomposition
techniques have been developed for image fusion, such as the Laplacian pyramid, the
ratio-of-low-pass pyramid, the morphological pyramid, and the gradient pyramid. More recently,
with the evolution of wavelet-based multi-resolution analysis, the multi-scale wavelet
decomposition has begun to take the place of pyramid decomposition for image fusion. In fact,
the wavelet transform can be considered one special type of pyramid decomposition.
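
The sketch below illustrates the simple pixel-by-pixel averaging scheme mentioned above, together with a crude detail-selection rule that very roughly mimics the "keep the most salient coefficient" idea behind pyramid and wavelet fusion. The file names, the Laplacian filter and the selection rule are illustrative assumptions, and the source images are assumed to be registered and of the same size.

% Hedged sketch: two elementary fusion rules for registered, same-size source images
A = im2double(imread('sensor1.png'));      % first source image (placeholder file name)
B = im2double(imread('sensor2.png'));      % second source image (placeholder file name)
F_avg = (A + B) / 2;                       % pixel-by-pixel gray-level average (tends to lower contrast)
dA = imfilter(A, fspecial('laplacian'), 'replicate');   % local detail (Laplacian) of A
dB = imfilter(B, fspecial('laplacian'), 'replicate');   % local detail (Laplacian) of B
mask = abs(dB) > abs(dA);                  % where B carries more detail than A
F_sel = A;
F_sel(mask) = B(mask);                     % keep, at each pixel, the source with the larger detail
figure, imshow(F_avg), title('Average fusion');
figure, imshow(F_sel), title('Detail-selection fusion');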

MATLAB
INTRODUCTION TO MATLAB

What Is MATLAB?

MATLAB® is a high-performance language for technical computing. It integrates


computation, visualization, and programming in an easy-to-use environment where problems and
solutions are expressed in familiar mathematical notation. Typical uses include

Math and computation

Algorithm development

Data acquisition

Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

Scientific and engineering graphics

Application development, including graphical user interface building.

MATLAB is an interactive system whose basic data element is an array that does not
require dimensioning. This allows you to solve many technical computing problems, especially
those with matrix and vector formulations, in a fraction of the time it would take to write a
program in a scalar noninteractive language such as C or FORTRAN.

The name MATLAB stands for matrix laboratory. MATLAB was originally written to
provide easy access to matrix software developed by the LINPACK and EISPACK projects.
Today, MATLAB engines incorporate the LAPACK and BLAS libraries, embedding the state of
the art in software for matrix computation.

MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses in
mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-
productivity research, development, and analysis.

MATLAB features a family of add-on application-specific solutions called toolboxes.


Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized
technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that
extend the MATLAB environment to solve particular classes of problems. Areas in which
toolboxes are available include signal processing, control systems, neural networks, fuzzy logic,
wavelets, simulation, and many others.

The MATLAB System:

The MATLAB system consists of five main parts:

Development Environment:

This is the set of tools and facilities that help you use MATLAB functions and files. Many
of these tools are graphical user interfaces. It includes the MATLAB desktop and Command
Window, a command history, an editor and debugger, and browsers for viewing help, the
workspace, files, and the search path.

The MATLAB Mathematical Function:


This is a vast collection of computational algorithms ranging from elementary functions
like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix
inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.

The MATLAB Language:

This is a high-level matrix/array language with control flow statements, functions, data
structures, input/output, and object-oriented programming features. It allows both "programming
in the small" to rapidly create quick and dirty throw-away programs, and "programming in the
large" to create complete large and complex application programs.

Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as
annotating and printing these graphs. It includes high-level functions for two-dimensional and
three-dimensional data visualization, image processing, animation, and presentation graphics. It
also includes low-level functions that allow you to fully customize the appearance of graphics as
well as to build complete graphical user interfaces on your MATLAB applications.

The MATLAB Application Program Interface (API):

This is a library that allows you to write C and Fortran programs that interact with
MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling
MATLAB as a computational engine, and for reading and writing MAT-files.

MATLAB WORKING ENVIRONMENT:

MATLAB DESKTOP:-

The MATLAB Desktop is the main MATLAB application window. The desktop contains five sub
windows, the command window, the workspace browser, the current directory window, the
command history window, and one or more figure windows, which are shown only when the
user displays a graphic.
The command window is where the user types MATLAB commands and expressions at
the prompt (>>) and where the output of those commands is displayed. MATLAB defines the
workspace as the set of variables that the user creates in a work session. The workspace browser
shows these variables and some information about them. Double clicking on a variable in the
workspace browser launches the Array Editor, which can be used to obtain information about,
and in some instances edit, certain properties of the variable.

The current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be as follows: C:\MATLAB\Work, indicating that directory
"work" is a subdirectory of the main directory "MATLAB", which is installed in drive C.
Clicking on the arrow in the current directory window shows a list of recently used paths.
Clicking on the button to the right of the window allows the user to change the current
directory.

MATLAB uses a search path to find M-files and other MATLAB-related files, which are
organized in directories in the computer file system. Any file run in MATLAB must reside in the
current directory or in a directory that is on the search path. By default, the files supplied with
MATLAB and MathWorks toolboxes are included in the search path. The easiest way to see
which directories are on the search path, or to add or modify the search path, is to select Set Path
from the File menu on the desktop, and then use the Set Path dialog box. It is good practice to add
any commonly used directories to the search path to avoid repeatedly having to change the
current directory.

The Command History Window contains a record of the commands a user has entered in
the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right-clicking on a command or sequence of commands. This action launches a
menu from which to select various options in addition to executing the commands. This is a
useful feature when experimenting with various commands in a work session.
Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files and a graphical
MATLAB debugger. The editor can appear in a window by itself, or it can be a sub window in
the desktop. M-files are denoted by the extension .m, as in pixelup.m. The MATLAB editor
window has numerous pull-down menus for tasks such as saving, viewing, and debugging files.
Because it performs some simple checks and also uses color to differentiate between various
elements of code, this text editor is recommended as the tool of choice for writing and editing M-
functions. To open the editor, type edit filename at the prompt; this opens the M-file filename.m
in an editor window, ready for editing. As noted earlier, the file must be in the current directory,
or in a directory in the search path.

Getting Help:

The principal way to get help online is to use the MATLAB help browser, opened as a
separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or by
typing helpbrowser at the prompt in the command window. The Help Browser is a web browser
integrated into the MATLAB desktop that displays Hypertext Markup Language (HTML)
documents. The Help Browser consists of two panes: the help navigator pane, used to find
information, and the display pane, used to view the information. Self-explanatory tabs in the
navigator pane are used to perform a search.
DIGITAL IMAGE PROCESSING
Digital image processing

Background:

Digital image processing is an area characterized by the need for extensive experimental
work to establish the viability of proposed solutions to a given problem. An important
characteristic underlying the design of image processing systems is the significant level of
testing & experimentation that normally is required before arriving at an acceptable solution.
This characteristic implies that the ability to formulate approaches & quickly prototype candidate
solutions generally plays a major role in reducing the cost & time required to arrive at a viable
system implementation.

What is DIP
An image may be defined as a two-dimensional function f(x, y), where x & y are spatial
coordinates, & the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray
level of the image at that point. When x, y & the amplitude values of f are all finite discrete
quantities, we call the image a digital image. The field of DIP refers to processing digital image
by means of digital computer. Digital image is composed of a finite number of elements, each of
which has a particular location & value. The elements are called pixels.

Vision is the most advanced of our senses, so it is not surprising that images play the
single most important role in human perception. However, unlike humans, who are limited to the
visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging
from gamma rays to radio waves. They can also operate on images generated by sources that
humans are not accustomed to associating with images.

There is no general agreement among authors regarding where image processing stops &
other related areas such as image analysis & computer vision start. Sometimes a distinction is
made by defining image processing as a discipline in which both the input & output of a process
are images. This is a limiting & somewhat artificial boundary. The area of image analysis (image
understanding) is in between image processing & computer vision.

There are no clear-cut boundaries in the continuum from image processing at one end to
complete vision at the other. However, one useful paradigm is to consider three types of
computerized processes in this continuum: low-, mid-, & high-level processes. A low-level
process involves primitive operations such as image preprocessing to reduce noise, contrast
enhancement & image sharpening. A low-level process is characterized by the fact that both its
inputs & outputs are images.

Mid-level processing on images involves tasks such as segmentation, description of objects to
reduce them to a form suitable for computer processing, & classification of individual
objects. A mid-level process is characterized by the fact that its inputs generally are images but
its outputs are attributes extracted from those images. Finally, higher-level processing involves
"making sense" of an ensemble of recognized objects, as in image analysis, &, at the far end of
the continuum, performing the cognitive functions normally associated with human vision.

Digital image processing, as already defined, is used successfully in a broad range of


areas of exceptional social & economic value.

What is an image?

An image is represented as a two dimensional function f(x, y) where x and y are spatial
co-ordinates and the amplitude of ‘f’ at any pair of coordinates (x, y) is called the intensity of the
image at that point.

Gray scale image:


A grayscale image is a function I(x, y) of the two spatial coordinates of the image
plane.

I(x, y) is the intensity of the image at the point (x, y) on the image plane.

I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b],
I: [0, a] × [0, b] → [0, ∞).

Color image:

It can be represented by three functions, R(x, y) for red, G(x, y) for green and B(x, y)
for blue.

An image may be continuous with respect to the x and y coordinates and also in
amplitude. Converting such an image to digital form requires that the coordinates as well as the
amplitude to be digitized. Digitizing the coordinate values is called sampling. Digitizing the
amplitude values is called quantization.

Coordinate convention:

The result of sampling and quantization is a matrix of real numbers. We use two
principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the
resulting image has M rows and N columns. We say that the image is of size M X N. The values
of the coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use
integer values for these discrete coordinates.

In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The
next coordinate values along the first row of the image are (x, y) = (0, 1). It is important to keep
in mind that the notation (0, 1) is used to signify the second sample along the first row. It does
not mean that these are the actual values of physical coordinates when the image was sampled.
The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from
0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays is different from the
preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the
notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the
same as the order discussed in the previous paragraph, in the sense that the first element of a
coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that
the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to
N in integer increments. IPT documentation refers to these as pixel coordinates. Less frequently,
the toolbox also employs another coordinate convention called spatial coordinates, which uses x
to refer to columns and y to refer to rows. This is the opposite of our use of variables x and y.

Image as Matrices:

The preceding discussion leads to the following representation for a digitized image
function:

f(x, y) = [ f(0,0)      f(0,1)      ...   f(0,N-1)
            f(1,0)      f(1,1)      ...   f(1,N-1)
              .            .                 .
              .            .                 .
            f(M-1,0)    f(M-1,1)    ...   f(M-1,N-1) ]

The right side of this equation is a digital image by definition. Each element of this array
is called an image element, picture element, pixel or pel. The terms image and pixel are used
throughout the rest of our discussions to denote a digital image and its elements.

A digital image can be represented naturally as a MATLAB matrix:

f = [ f(1,1)   f(1,2)   ...   f(1,N)
      f(2,1)   f(2,2)   ...   f(2,N)
        .        .              .
        .        .              .
      f(M,1)   f(M,2)   ...   f(M,N) ]


where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities).
Clearly the two representations are identical, except for the shift in origin. The notation f(p, q)
denotes the element located in row p and column q. For example, f(6,2) is the element in the
sixth row and second column of the matrix f. Typically we use the letters M and N respectively
to denote the number of rows and columns in a matrix. A 1xN matrix is called a row vector
whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar.

Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array,
and so on. Variable names must begin with a letter and contain only letters, numerals and
underscores. As noted in the previous paragraph, all MATLAB quantities are written using
monospace characters. We use conventional Roman italic notation, such as f(x, y), for
mathematical expressions.
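
For illustration, the following lines show this matrix view from the MATLAB side; the file name reuses the example used later in this report, and the conversion to a single channel is an assumption made so that f is a plain M x N matrix.

% Hedged sketch: treating a digital image as a MATLAB matrix
f = imread('8.jpg');                       % example file used in this report
if ndims(f) == 3, f = rgb2gray(f); end     % assumption: reduce to one channel if the file is RGB
[M, N] = size(f);                          % M rows and N columns
pel = f(6, 2);                             % element in the sixth row and second column, as in the text
row1 = f(1, :);                            % a 1 x N row vector: the first row of the image
col1 = f(:, 1);                            % an M x 1 column vector: the first column of the image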

Reading Images:

Images are read into the MATLAB environment using function imread whose syntax is

imread('filename')

Format name   Description                          Recognized extensions

TIFF          Tagged Image File Format             .tif, .tiff
JPEG          Joint Photographic Experts Group     .jpg, .jpeg
GIF           Graphics Interchange Format          .gif
BMP           Windows Bitmap                       .bmp
PNG           Portable Network Graphics            .png
XWD           X Window Dump                        .xwd


Here filename is a string containing the complete name of the image file (including any
applicable extension). For example, the command line

>> f = imread('8.jpg');

reads the JPEG image (see the table above) into image array f. Note the use of single
quotes (') to delimit the string filename. The semicolon at the end of a command line is used by
MATLAB for suppressing output; if a semicolon is not included, MATLAB displays the results
of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a
command line, as it appears in the MATLAB command window.
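
Assuming the array f created by the command above, it can be inspected, displayed and written back to disk as follows; imshow is part of the Image Processing Toolbox and the output file name is illustrative.

>> whos f                          % report the size, memory and class of f in the workspace
>> imshow(f)                       % display the image in a figure window
>> imwrite(f, 'copy_of_8.jpg');    % write the array back to disk (illustrative file name)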

Data Classes:

Although we work with integer coordinates, the values of pixels themselves are not
restricted to be integers in MATLAB. The table below lists the various data classes supported by
MATLAB and IPT for representing pixel values. The first eight entries in the table are referred to
as numeric data classes. The ninth entry is the char class and, as shown, the last entry is referred
to as the logical data class.

All numeric computations in MATLAB are done in double quantities, so this is also a
frequent data class encountered in image processing applications. Class uint8 is also encountered
frequently, especially when reading data from storage devices, as 8-bit images are the most
common representations found in practice. These two data classes, class logical, and, to a
lesser degree, class uint16 constitute the primary data classes on which we focus. Many IPT
functions, however, support all the data classes listed in the table. Data class double requires 8
bytes to represent a number, uint8 and int8 require 1 byte each, uint16 and int16 require 2 bytes
each, and uint32, int32 and single require 4 bytes each.
Name      Description

double    Double-precision, floating-point numbers (8 bytes per element).
uint8     Unsigned 8-bit integers in the range [0, 255] (1 byte per element).
uint16    Unsigned 16-bit integers in the range [0, 65535] (2 bytes per element).
uint32    Unsigned 32-bit integers in the range [0, 4294967295] (4 bytes per element).
int8      Signed 8-bit integers in the range [-128, 127] (1 byte per element).
int16     Signed 16-bit integers in the range [-32768, 32767] (2 bytes per element).
int32     Signed 32-bit integers in the range [-2147483648, 2147483647] (4 bytes per element).
single    Single-precision, floating-point numbers (4 bytes per element).
char      Characters (2 bytes per element).
logical   Values are 0 or 1 (1 byte per element).

The char data class holds characters in Unicode representation. A character string is merely a
1 x n array of characters. A logical array contains only the values 0 and 1, with each element
stored in 1 byte. Logical arrays are created using the function logical or by using relational
operators.
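
The conversions below illustrate moving between some of these classes using standard IPT conversion functions; the file name is the example used earlier in this chapter.

% Hedged sketch: converting an image between data classes
g8  = imread('8.jpg');      % images read from 8-bit files are typically of class uint8
gd  = im2double(g8);        % class double, values scaled to the range [0, 1]
g16 = im2uint16(g8);        % class uint16, values scaled to the range [0, 65535]
whos g8 gd g16              % compare the classes and memory requirements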

Image Types:

The toolbox supports four types of images:

1. Intensity images;

2. Binary images;

3. Indexed images;

4. RGB images.

Most monochrome image processing operations are carried out using binary or intensity
images, so our initial focus is on these two image types. Indexed and RGB colour images are
discussed later in this chapter.
Intensity Images:
An intensity image is a data matrix whose values have been scaled to represent
intensities. When the elements of an intensity image are of class uint8 or class uint16, they have
integer values in the range [0, 255] and [0, 65535], respectively. If the image is of class double,
the values are floating-point numbers. Values of scaled, double intensity images are in the range
[0, 1] by convention.

Binary Images:

Binary images have a very specific meaning in MATLAB. A binary image is a logical
array of 0s and 1s. Thus, an array of 0s and 1s whose values are of a numeric data class, say
uint8, is not considered a binary image in MATLAB. A numeric array is converted to binary
using the function logical. Thus, if A is a numeric array consisting of 0s and 1s, we create an
array B using the statement

B=logical (A)

If A contains elements other than 0s and 1s, use of the logical function converts all
nonzero quantities to logical 1s and all entries with value 0 to logical 0s.

Using relational and logical operators also creates logical arrays.

To test if an array is logical, we use the islogical function: islogical(C).

If C is a logical array, this function returns a 1; otherwise it returns a 0. Logical arrays can be
converted to numeric arrays using the data class conversion functions.
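
For example, a binary image can be produced from an intensity image with a relational operator, as sketched below; the threshold value is arbitrary and the file name reuses the earlier example.

% Hedged sketch: creating and testing a binary (logical) image
g = imread('8.jpg');
if ndims(g) == 3, g = rgb2gray(g); end   % assumption: work with a single-channel image
bw = g > 128;                            % relational operator returns a logical array (binary image)
islogical(bw)                            % returns 1, confirming bw is a binary image in MATLAB
d = double(bw);                          % convert back to a numeric array when needed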

Indexed Images:

An indexed image has two components:

A data matrix of integers, x


A color map matrix, map

Matrix map is an m x 3 array of class double containing floating-point values in the range
[0, 1]. The length m of the map is equal to the number of colors it defines. Each row of map
specifies the red, green and blue components of a single color. An indexed image uses "direct
mapping" of pixel intensity values to color map values. The color of each pixel is determined by
using the corresponding value of the integer matrix x as a pointer into map. If x is of class
double, then all of its components with values less than or equal to 1 point to the first row in
map, all components with value 2 point to the second row, and so on. If x is of class uint8 or
uint16, then all components with value 0 point to the first row in map, all components with value
1 point to the second row, and so on.
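
The snippet below reads and displays an indexed image; trees.tif is assumed here to be one of the indexed sample images shipped with the Image Processing Toolbox, and ind2rgb is used to obtain an RGB version.

% Hedged sketch: working with an indexed image and its colour map
[X, map] = imread('trees.tif');   % assumed sample indexed image; X indexes rows of map
imshow(X, map)                    % display using the stored colour map
rgb = ind2rgb(X, map);            % convert to an M x N x 3 RGB array of class double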

RGB Image:

An RGB color image is an M x N x 3 array of color pixels, where each color pixel is a triplet
corresponding to the red, green and blue components of an RGB image at a specific spatial
location. An RGB image may be viewed as a "stack" of three gray-scale images that, when fed
into the red, green and blue inputs of a color monitor, produce a color image on the screen. By
convention, the three images forming an RGB color image are referred to as the red, green and
blue component images. The data class of the component images determines their range of
values. If an RGB image is of class double, the range of values is [0, 1].

Similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or
uint16, respectively. The number of bits used to represent the pixel values of the component
images determines the bit depth of an RGB image. For example, if each component image is an
8-bit image, the corresponding RGB image is said to be 24 bits deep.

Generally, the number of bits in all component images is the same. In this case the
number of possible colors in an RGB image is (2^b)^3, where b is the number of bits in each
component image. For the 8-bit case the number is 16,777,216 colors.
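
The component images of an RGB array can be accessed by indexing the third dimension, as sketched below; peppers.png is assumed to be one of the sample images distributed with MATLAB.

% Hedged sketch: accessing the component images of an RGB array
rgb = imread('peppers.png');      % assumed sample RGB image, class uint8 (24 bits deep)
R = rgb(:, :, 1);                 % red component image
G = rgb(:, :, 2);                 % green component image
B = rgb(:, :, 3);                 % blue component image
rgb_d = im2double(rgb);           % class double version with values in [0, 1]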
The MATLAB System

The MATLAB system consists of five main parts:

The MATLAB language.

This is a high-level matrix/array language with control flow statements, functions,


data structures, input/output, and object-oriented programming features. It allows both
"programming in the small" to rapidly create quick and dirty throw-away programs,
and "programming in the large" to create complete large and complex application
programs.

The MATLAB working environment.

This is the set of tools and facilities that you work with as the MATLAB user or
programmer. It includes facilities for managing the variables in your workspace and
importing and exporting data.

It also includes tools for developing, managing, debugging, and profiling M-files,
MATLAB's applications.

Handle Graphics.

This is the MATLAB graphics system. It includes high-level commands for two-
dimensional and three-dimensional data visualization, image processing, animation,
and presentation graphics.
It also includes low-level commands that allow you to fully customize the appearance
of graphics as well as to build complete Graphical User Interfaces on your MATLAB
applications.

The MATLAB mathematical function library.

This is a vast collection of computational algorithms ranging from elementary


functions like sum, sine, cosine, and complex arithmetic, to more sophisticated
functions like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier
transforms.

The MATLAB Application Program Interface (API).

This is a library that allows you to write C and Fortran programs that interact
with MATLAB. It includes facilities for calling routines from MATLAB
(dynamic linking), calling MATLAB as a computational engine, and for
reading and writing MAT-files.

History of MATLAB

Cleve Moler, the chairman of the computer science department at the University of New
Mexico, started developing MATLAB in the late 1970s.[4] He designed it to give his
students access to LINPACK and EISPACK without them having to learn Fortran. It
soon spread to other universities and found a strong audience within the applied
mathematics community.
Jack Little, an engineer, was exposed to it during a visit Moler made to Stanford
University in 1983.

Recognizing its commercial potential, he joined with Moler and Steve Bangert. They
rewrote MATLAB in C and founded MathWorks in 1984 to continue its development.
These rewritten libraries were known as JACKPAC.[5] In 2000, MATLAB was rewritten
to use a newer set of libraries for matrix manipulation, LAPACK.
MATLAB was first adopted by researchers and practitioners in control engineering,
Little's specialty, but quickly spread to many other domains. It is now also used in
education, in particular for the teaching of linear algebra and numerical analysis, and is popular
amongst scientists involved in image processing.

With 2004 marking the 20th anniversary of The MathWorks, it’s a good time to look
back at the origins of MATLAB.

MATLAB is now a full-featured technical computing environment, but it started as a


simple “Matrix Laboratory.”

Three men, J. H. Wilkinson, George Forsythe, and John Todd, played important roles in
the origins of MATLAB. Our account begins more than 50 years ago.

Wilkinson was a British mathematician who spent his entire career at the National
Physical Laboratory (NPL) in Teddington, outside London. Working on a simplified
version of a sophisticated design by Alan Turing, Wilkinson and colleagues at NPL built
the Pilot Automatic Computing Engine (ACE), one of Britain’s first stored-program
digital computers. The Pilot ACE ran its first program in May 1950.

Wilkinson did matrix computations on the machine and went on to become the world’s
leading authority on numerical linear algebra.

Strengths of MATLAB

 MATLAB is relatively easy to learn


 MATLAB code is optimized to be relatively quick when performing matrix
operations
 MATLAB may behave like a calculator or as a programming language
 MATLAB is interpreted, errors are easier to fix

Although primarily procedural (like C), MATLAB does have some object-oriented
elements (like C++).
Key Features
 High-level language for numerical computation, visualization, and application
development
 Interactive environment for iterative exploration, design, and problem solving
 Mathematical functions for linear algebra, statistics, Fourier analysis, filtering,
optimization, numerical integration, and solving ordinary differential equations
 Built-in graphics for visualizing data and tools for creating custom plots
 Development tools for improving code quality and maintainability and maximizing
performance
 Tools for building applications with custom graphical interfaces
 Functions for integrating MATLAB based algorithms with external applications and
languages such as C, Java, .NET, and Microsoft® Excel®

Analyzing and visualizing data using the MATLAB desktop. The MATLAB environment
also lets you write programs and develop algorithms and applications.

Numeric Computation

MATLAB provides a range of numerical computation methods for analyzing data,


developing algorithms, and creating models. The MATLAB language includes
mathematical functions that support common engineering and science operations. Core
math functions use processor-optimized libraries to provide fast execution of vector and
matrix calculations.
Available methods include:

 Interpolation and regression


 Differentiation and integration
 Linear systems of equations
 Fourier analysis
 Eigenvalues and singular values
 Ordinary differential equations (ODEs)
 Sparse matrices
MATLAB add-on products provide functions in specialized areas such as statistics,
optimization, signal analysis, and machine learning.

INTRODUCTION TO IMAGE PROCESSING

IMAGE

An image is an array, or a matrix, of square pixels (picture elements) arranged in columns


and rows.
In an (8-bit) greyscale image each picture element has an assigned intensity that ranges
from 0 to 255. A grey scale image is what people normally call a black and white image,
but the name emphasizes that such an image will also include many shades of grey.

A normal greyscale image has 8 bit colour depth = 256 greyscales. A “true colour” image
has 24 bit colour depth = 8 x 8 x 8 bits = 256 x 256 x 256 colours = ~16 million colours.

Some greyscale images have more greyscales, for instance 16 bit = 65536 greyscales. In
principle three greyscale images can be combined to form an image with
281,474,976,710,656 greyscales. There are two general groups of ‘images’: vector
graphics (or line art) and bitmaps (pixel-based or ‘images’).

Some of the most common file formats are: GIF — an 8-bit (256 colour), non-
destructively compressed bitmap format. Mostly used for web. Has several sub-standards
one of which is the animated GIF.
JPEG — a very efficient (i.e. much information per byte) destructively compressed 24 bit
(16 million colours) bitmap format. Widely used, especially for web and Internet
(bandwidth-limited).

TIFF — the standard 24 bit publication bitmap format. Compresses nondestructively


with, for instance, Lempel-Ziv-Welch (LZW) compression.

PS — Postscript, a standard vector format. Has numerous sub-standards and can be


difficult to transport across platforms and operating systems.

PSD – a dedicated Photoshop format that keeps all the information in an image including
all the layers.

COLOURS

For science communication, the two main colour spaces are RGB and CMYK.

RGB

The RGB colour model relates very closely to the way we perceive colour with the r, g
and b receptors in our retinas. RGB uses additive colour mixing and is the basic colour
model used in television or any other medium that projects colour with light.

It is the basic colour model used in computers and for web graphics, but it cannot be used
for print production. The secondary colours of RGB – cyan, magenta, and yellow – are
formed by mixing two of the primary colours (red, green or blue) and excluding the third
colour.

Red and green combine to make yellow, green and blue to make cyan, and blue and red
form magenta. The combination of red, green, and blue in full intensity makes white.

In Photoshop using the “screen” mode for the different layers in an image will make the
intensities mix together according to the additive colour mixing model.

This is analogous to stacking slide images on top of each other and shining light through
them.
CMYK

The 4-colour CMYK model used in printing lays down overlapping layers of varying
percentages of transparent cyan (C), magenta (M) and yellow (Y) inks. In addition a layer
of black (K) ink can be added. The CMYK model uses the subtractive colour model.

Gamut
The range, or gamut, of human colour perception is quite large. The two colour spaces
discussed here span only a fraction of the colours we can see. Furthermore, the two spaces do not
have the same gamut, meaning that converting from one colour space to the other may cause
problems for colours in the outer regions of the gamuts.

Astronomical images

Images of astronomical objects are usually taken with electronic detectors such as a CCD
(Charge Coupled Device). Similar detectors are found in normal digital cameras.
Telescope images are nearly always greyscale, but nevertheless contain some colour
information.

An astronomical image may be taken through a colour filter. Different detectors and
telescopes also usually have different sensitivities to different colours (wavelengths).

SOFTWARE AND HARDWARE REQUIREMENTS


 Operating system : Windows XP/7/8.
 Coding Language : MATLAB
 Tool : MATLAB R 2013 a

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

 System : Pentium IV 2.4 GHz.


 Hard Disk : 40 GB.
 Floppy Drive : 1.44 Mb.
 Monitor : 15 VGA Colour.
 Mouse : Logitech.
 Ram : 512 Mb.

CONCLUSION

We presented an algorithm to remove the camera shake blur in an image burst. The
algorithm is built on the idea that each image in the burst is generally differently blurred;
this being a consequence of the random nature of hand tremor.

By doing a weighted average in the Fourier domain, we reconstruct an image


combining the least attenuated frequencies in each frame.
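
A minimal sketch of this weighted Fourier accumulation is given below. It assumes the burst frames are already registered and stored in a cell array named burst; the exponent p and the omission of any smoothing of the weights are illustrative simplifications rather than the exact settings of the full method.

% Hedged sketch: Fourier Burst Accumulation for a registered grayscale burst
p = 11;                                     % weight exponent (illustrative value)
acc = 0; wsum = 0;
for i = 1:numel(burst)                      % 'burst' is a cell array of registered frames
    V = fft2(im2double(burst{i}));          % Fourier transform of the i-th frame
    W = abs(V).^p;                          % weight: spectrum magnitude raised to the power p
    acc  = acc  + W .* V;                   % weighted accumulation in the Fourier domain
    wsum = wsum + W;
end
u = real(ifft2(acc ./ (wsum + eps)));       % normalized weighted average, back in the spatial domain
imshow(u, [])                               % the fused, sharper estimate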

Experimental results showed that the reconstructed image is sharper and less noisy than
the original ones.

This algorithm has several advantages. First, it does not introduce typical ringing or
overshooting artifacts present in most deconvolution algorithms.

This is avoided by not formulating the deblurring problem as an inverse problem of
deconvolution. The algorithm produces similar or better results than state-of-the-art
multi-image deconvolution while being significantly faster and with a lower memory
footprint.

We also presented a direct application of the Fourier Burst Accumulation algorithm to


HDR imaging with a hand-held camera.

As future work, we would like to incorporate a gyroscope registration technique, to


create a real-time system for removing camera shake in image bursts.

A closely related problem is how to determine the best capture strategy. Given a total
exposure time, would it be more convenient to take several pictures with a short exposure
(i.e., noisy) or only a few with a longer exposure time? Variants of these questions have
been previously tackled in the context of the denoising/deconvolution tradeoff.

We would like to explore this analysis using the Fourier Burst Accumulation principle.
