
A DWT Based Approach for Image Steganography

ABSTRACT:
In this paper we propose a new steganography technique that embeds secret messages in the frequency domain. According to different users' demands on embedding capacity and image quality, the proposed algorithm is divided into two modes and five cases. Unlike spatial-domain approaches, secret messages are embedded in the high-frequency coefficients resulting from the Discrete Wavelet Transform. Coefficients in the low-frequency sub-band are preserved unaltered to maintain image quality. Some basic mathematical operations are performed on the secret messages before embedding. These operations, together with a well-designed mapping table, keep the messages from being stolen or destroyed by unintended users on the internet and hence provide satisfactory security.

INTRODUCTION
In today's highly digitalized world, computers help transform analog data into digital form before storing and/or processing it. Meanwhile, the internet has developed very fast and has become an important medium for digital data transmission. However, being a fully open medium, the internet brings us not only convenience but also hazards and risks. If the data to be transmitted are confidential, it is equally convenient for malicious users to illegally copy, destroy, or change them on the internet. As a result, information security has become an essential issue. Various schemes for data hiding have been developed recently. According to the purpose of data hiding, these schemes are classified into two categories: watermarking and steganography. Watermarking is a protection technique that claims the author's property rights for images by means of hidden watermarks. On the other hand, steganography techniques apply cover images to protect confidential data from unintended internet users. According to the domain where watermarks or confidential data are embedded, both categories can be further classified into spatial domain methods and frequency domain methods. Watermarking designs are usually consistent with the following features.
(1) Imperceptibility: Human eyes cannot distinguish the difference between the watermarked image and the original version. In other words, the watermarked image still preserves high image quality.
(2) Security: The watermarked image cannot be copied, modified, or deleted by any malicious observer.

(3) Robustness: The watermark can still be extracted with acceptable quality even if the image has endured some signal processing or noise before extraction.
(4) Statistical undetectability: It is extremely hard (or impossible) to detect the watermark by statistical and/or mathematical analysis.
(5) Blind detection: The extraction procedure does not need access to the original image.
In spatial domain watermarking methods, processing is applied to the image pixel values directly. In other words, the watermark is embedded in the image by modifying pixel values. The advantage of this type of watermarking is that it is simple and computationally fast; the disadvantage is its low robustness to signal processing or noise. In frequency domain methods, the first step is to transform the image data into frequency domain coefficients using some mathematical tool (e.g., the FFT, DCT, or DWT). Then, according to the data characteristics produced by these transforms, the watermark is embedded into the frequency domain coefficients. After the watermarked coefficients are transformed back to the spatial domain, the embedding procedure is complete. The advantage of this type of watermarking is its high robustness to signal processing and noise; however, methods of this type are computationally complex and hence slower. The second category of data hiding is steganography. These methods are designed to embed confidential data into some cover medium (such as text, voice, images, or video). After the confidential data are embedded, they are transmitted together with the cover medium. The major objective is to prevent unintended observers from stealing or destroying the confidential data. Two things must be considered when designing a steganography system:
(1) Invisibility: Human eyes cannot distinguish the difference between the original image and the stego-image (the image with confidential data embedded in it).
(2) Capacity: The more data an image can carry, the better. However, large embedded payloads usually degrade image quality significantly. How to increase capacity without ruining invisibility is the key problem. The design of a steganography system can also be categorized into spatial domain methods and frequency domain ones. The advantages and disadvantages are the same as those mentioned above for watermarking methods.
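The frequency domain embedding idea can be sketched in Python with NumPy. This is only an illustration of the general approach (a one-level Haar DWT with quantization-based embedding in the high-frequency HH sub-band, keeping the low-frequency LL sub-band untouched, as stated in the abstract); it is not the paper's exact two-mode, five-case algorithm, and all function names and the step size Q are assumptions.

```python
import numpy as np

def haar2d(img):
    """One-level 2-D Haar DWT (unnormalised averages and differences)."""
    a = img.astype(float)
    L = (a[:, 0::2] + a[:, 1::2]) / 2    # row-wise low-pass
    H = (a[:, 0::2] - a[:, 1::2]) / 2    # row-wise high-pass
    LL = (L[0::2] + L[1::2]) / 2
    LH = (L[0::2] - L[1::2]) / 2
    HL = (H[0::2] + H[1::2]) / 2
    HH = (H[0::2] - H[1::2]) / 2
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    h, w = LL.shape
    L = np.empty((2 * h, w)); H = np.empty((2 * h, w))
    L[0::2], L[1::2] = LL + LH, LL - LH
    H[0::2], H[1::2] = HL + HH, HL - HH
    out = np.empty((2 * h, 2 * w))
    out[:, 0::2], out[:, 1::2] = L + H, L - H
    return out

def embed(img, bits, Q=4.0):
    """Hide bits in the HH sub-band by quantising each used coefficient to an
    even (bit 0) or odd (bit 1) multiple of Q; LL is left untouched."""
    LL, LH, HL, HH = haar2d(img)
    flat = HH.flatten()
    for i, b in enumerate(bits):
        flat[i] = 2 * Q * np.round((flat[i] - b * Q) / (2 * Q)) + b * Q
    return ihaar2d(LL, LH, HL, flat.reshape(HH.shape))

def extract(stego, n, Q=4.0):
    """Recover n bits from the HH sub-band of the stego image."""
    HH = haar2d(stego)[3]
    return [int(np.round(c / Q)) % 2 for c in HH.flatten()[:n]]
```

On integer-valued images the stego pixel values would still need to be rounded back to valid intensities, which a real scheme must account for; the sketch keeps everything in floating point for simplicity.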

DIGITAL IMAGE PROCESSING


BACKGROUND:
Digital image processing is an area characterized by the need for extensive experimental work to establish the viability of proposed solutions to a given problem. An important characteristic underlying the design of image processing systems is the significant level of testing and experimentation that is normally required before arriving at an acceptable solution. This characteristic implies that the ability to formulate approaches and quickly prototype candidate solutions generally plays a major role in reducing the cost and time required to arrive at a viable system implementation.

What is DIP?
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y and the amplitude values of f are all finite discrete quantities, we call the image a digital image. The field of DIP refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are called pixels.
Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated by sources that humans are not accustomed to associating with images.
There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images. This is a limiting and somewhat artificial boundary. The area of image analysis (image understanding) lies in between image processing and computer vision.

There are no clear-cut boundaries in the continuum from image processing at one end to complete vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid- and high-level processes. A low-level process involves primitive operations such as image preprocessing to reduce noise, contrast enhancement and image sharpening. A low-level process is characterized by the fact that both its inputs and outputs are images. A mid-level process on images involves tasks such as segmentation, description of those objects to reduce them to a form suitable for computer processing, and classification of individual objects. A mid-level process is characterized by the fact that its inputs generally are images but its outputs are attributes extracted from those images. Finally, higher-level processing involves making sense of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with human vision.
Digital image processing, as already defined, is used successfully in a broad range of areas of exceptional social and economic value.

What is an image?
An image is represented as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity of the image at that point.

Gray scale image:


A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane. I(x, y) is the intensity of the image at the point (x, y) on the image plane. I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] x [0, b], we have I: [0, a] x [0, b] -> [0, infinity).

Color image:
It can be represented by three functions, R(x, y) for red, G(x, y) for green and B(x, y) for blue.
An image may be continuous with respect to the x and y coordinates and also in amplitude. Converting such an image to digital form requires that the coordinates as well as the amplitude be digitized. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.
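Sampling and quantization can be illustrated concretely in Python with NumPy (the continuous function f used here is an arbitrary example, not one from the text): the image is sampled on an M x N grid of coordinates and its amplitudes are then quantized to 256 gray levels (8 bits).

```python
import numpy as np

# Sample an example continuous image f(x, y) on an M x N grid of coordinates.
M, N = 4, 4
x = np.linspace(0.0, 1.0, M)               # sampled row coordinates
y = np.linspace(0.0, 1.0, N)               # sampled column coordinates
X, Y = np.meshgrid(x, y, indexing="ij")

# Continuous amplitudes in [0, 1] (an arbitrary illustrative function).
f = (np.sin(np.pi * X) * np.cos(np.pi * Y) + 1) / 2

# Quantization: map each amplitude to one of 256 discrete 8-bit levels.
digital = np.round(f * 255).astype(np.uint8)
```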

Coordinate convention:
The result of sampling and quantization is a matrix of real numbers. We use two principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns. We say that the image is of size M x N. The values of the coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates. In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate value along the first row of the image is (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row; it does not mean that these are the actual values of physical coordinates when the image was sampled. The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.
The coordinate convention used in the toolbox to denote arrays differs from the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as the order discussed in the previous paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer increments. IPT documentation refers to these as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the variables x and y.

Image as Matrices:
The preceding discussion leads to the following representation for a digitized image function:

f(x, y) = [ f(0,0)    f(0,1)    ...  f(0,N-1)
            f(1,0)    f(1,1)    ...  f(1,N-1)
            ...
            f(M-1,0)  f(M-1,1)  ...  f(M-1,N-1) ]

The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel, or pel. The terms image and pixel are used throughout the rest of our discussion to denote a digital image and its elements.
A digital image can be represented naturally as a MATLAB matrix:

f = [ f(1,1)  f(1,2)  ...  f(1,N)
      f(2,1)  f(2,2)  ...  f(2,N)
      ...
      f(M,1)  f(M,2)  ...  f(M,N) ]


where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities). Clearly, the two representations are identical except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6,2) is the element in the sixth row and second column of the matrix f. Typically we use the letters M and N, respectively, to denote the number of rows and columns in a matrix. A 1xN matrix is called a row vector, whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar.
Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array and so on. Variables must begin with a letter and contain only letters, numerals and underscores. As noted in the previous paragraph, all MATLAB quantities are written using monospace characters. We use conventional Roman italic notation, such as f(x, y), for mathematical expressions.
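As an illustrative aside (not part of the original MATLAB toolbox discussion), the same image-as-matrix representation in Python/NumPy uses zero-based indexing, so f[0, 0] corresponds to MATLAB's f(1,1):

```python
import numpy as np

# A small 3 x 4 digital image as a matrix of intensity values.
f = np.array([[10,  20,  30,  40],
              [50,  60,  70,  80],
              [90, 100, 110, 120]])

M, N = f.shape              # M rows, N columns
top_left = f[0, 0]          # NumPy is zero-based: this is MATLAB's f(1,1)
row_vector = f[0:1, :]      # a 1 x N matrix (row vector)
col_vector = f[:, 0:1]      # an M x 1 matrix (column vector)
```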

Reading Images:
Images are read into the MATLAB environment using the function imread, whose syntax is
imread('filename')
Format name   Description                        Recognized extensions
TIFF          Tagged Image File Format           .tif, .tiff
JPEG          Joint Photographic Experts Group   .jpg, .jpeg
GIF           Graphics Interchange Format        .gif
BMP           Windows Bitmap                     .bmp
PNG           Portable Network Graphics          .png
XWD           X Window Dump                      .xwd

Here filename is a string containing the complete name of the image file (including any applicable extension). For example, the command line
>> f = imread('chestxray.jpg');
reads the JPEG image (see the table above) chestxray into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a command line is used by MATLAB for suppressing output. If a semicolon is not included, MATLAB displays the results of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in the MATLAB command window.
When, as in the preceding command line, no path is included in filename, imread reads the file from the current directory and, if that fails, tries to find the file in the MATLAB search path. The simplest way to read an image from a specified directory is to include a full or relative path to that directory in filename. For example,
>> f = imread('D:\myimages\chestxray.jpg');
reads the image from a folder called myimages on the D: drive, whereas
>> f = imread('.\myimages\chestxray.jpg');
reads the image from the myimages subdirectory of the current working directory. The Current Directory window on the MATLAB desktop toolbar displays MATLAB's current working directory and provides a simple, manual way to change it. The table above lists some of the most popular image/graphics formats supported by imread and imwrite.

Data Classes:
Although we work with integer coordinates, the values of pixels themselves are not restricted to be integers in MATLAB. Various data classes are supported by MATLAB and IPT for representing pixel values: eight numeric data classes, the char class, and the logical data class.
All numeric computations in MATLAB are done in double quantities, so this is also a frequently encountered data class in image processing applications. Class uint8 is also encountered frequently, especially when reading data from storage devices, as 8-bit images are the most common representation found in practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary data classes on which we focus. Many IPT functions, however, support all of the data classes. Data class double requires 8 bytes to represent a number; uint8 and int8 require one byte each; uint16 and int16 require 2 bytes each; and uint32, int32 and single require 4 bytes each.

Image Types:
The toolbox supports four types of images:
1. Intensity images
2. Binary images
3. Indexed images
4. RGB images
Most monochrome image processing operations are carried out using binary or intensity images, so our initial focus is on these two image types. Indexed and RGB colour images are discussed in the sections that follow.

Intensity Images:
An intensity image is a data matrix whose values have been scaled to represent intensities. When the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range [0, 255] or [0, 65535], respectively. If the image is of class double, the values are floating-point numbers. Values of scaled, double intensity images are in the range [0, 1] by convention.

Binary Images:
Binary images have a very specific meaning in MATLAB. A binary image is a logical array of 0s and 1s. Thus, an array of 0s and 1s whose values are of a numeric data class, say uint8, is not considered a binary image in MATLAB. A numeric array is converted to binary using the function logical. Thus, if A is a numeric array consisting of 0s and 1s, we create a logical array B using the statement
B = logical(A)
If A contains elements other than 0s and 1s, use of the logical function converts all nonzero quantities to logical 1s and all entries with value 0 to logical 0s.
Using relational and logical operators also creates logical arrays.
To test whether an array is logical, we use the islogical function:
islogical(c)
If c is a logical array, this function returns a 1; otherwise it returns a 0. Logical arrays can be converted to numeric arrays using the data class conversion functions.
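The logical/islogical behaviour described above can be mimicked in NumPy with boolean arrays (shown here only as an illustrative analogy, not MATLAB itself):

```python
import numpy as np

A = np.array([[0, 2, 1],
              [3, 0, 1]])

# Like MATLAB's logical(A): every nonzero entry becomes True (1), zero becomes False (0).
B = A.astype(bool)

# Like islogical: check whether an array holds logical (boolean) values.
is_logical = (B.dtype == np.bool_)

# Converting back to a numeric class, as with MATLAB's conversion functions.
numeric = B.astype(np.uint8)
```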

Indexed Images:
An indexed image has two components:
An integer data matrix, x.
A colour map matrix, map.
Matrix map is an m x 3 array of class double containing floating-point values in the range [0, 1]. The length m of the map is equal to the number of colours it defines. Each row of map specifies the red, green and blue components of a single colour. An indexed image uses direct mapping of pixel intensity values to colour map values. The colour of each pixel is determined by using the corresponding value of the integer matrix x as a pointer into map. If x is of class double, then all of its components with values less than or equal to 1 point to the first row in map, all components with value 2 point to the second row, and so on. If x is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second, and so on.
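The pointer mechanism for an integer-class indexed image (index 0 points to the first row of the map, and so on) can be sketched in NumPy as follows; the map values here are arbitrary examples:

```python
import numpy as np

# A colour map: m x 3 array of floating-point values in [0, 1].
cmap = np.array([[0.0, 0.0, 0.0],   # row 0: black
                 [1.0, 0.0, 0.0],   # row 1: red
                 [1.0, 1.0, 1.0]])  # row 2: white

# Integer index matrix (like a uint8 indexed image): value 0 points to row 0, etc.
x = np.array([[0, 1],
              [2, 1]])

# Direct mapping: each pixel index becomes the RGB triple from the map.
rgb = cmap[x]
```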

RGB Image:

An RGB colour image is an M x N x 3 array of colour pixels, where each colour pixel is a triplet corresponding to the red, green and blue components of an RGB image at a specific spatial location. An RGB image may be viewed as a stack of three gray-scale images that, when fed into the red, green and blue inputs of a colour monitor, produce a colour image on the screen. By convention, the three images forming an RGB colour image are referred to as the red, green and blue component images. The data class of the component images determines their range of values. If an RGB image is of class double, the range of values is [0, 1]. Similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16, respectively. The number of bits used to represent the pixel values of the component images determines the bit depth of an RGB image. For example, if each component image is an 8-bit image, the corresponding RGB image is said to be 24 bits deep.
Generally, the number of bits in all component images is the same. In this case the number of possible colours in an RGB image is (2^b)^3, where b is the number of bits in each component image. For the 8-bit case the number is 16,777,216 colours.
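The colour count stated above follows directly from the bit depth, as this small computation shows:

```python
# Number of representable colours in an RGB image with b bits per component image.
def rgb_colours(b):
    return (2 ** b) ** 3

colours_24bit = rgb_colours(8)   # the common 24-bit ("true colour") case
```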

LITERATURE REVIEW:
Since the rise of the Internet one of the most important factors of information technology
and communication has been the security of information. Cryptography was created as a
technique for securing the secrecy of communication and many different methods have
been developed to encrypt and decrypt data in order to keep the message secret.
Unfortunately it is sometimes not enough to keep the contents of a message secret; it may also be necessary to keep the existence of the message secret. The technique used to implement this is called steganography.
Steganography is the art and science of invisible communication. This is accomplished
through hiding information in other information, thus hiding the existence of the

communicated information. The word steganography is derived from the Greek words stegos, meaning 'cover', and grafia, meaning 'writing', defining it as 'covered writing'. In image steganography the information is hidden
exclusively in images. The idea and practice of hiding information has a long history. In
Histories, the Greek historian Herodotus writes of a nobleman, Histaeus, who needed to communicate with his son-in-law in Greece. He shaved the head of one of his most trusted slaves and tattooed the message onto the slave's scalp. When the slave's hair grew back, the slave was dispatched with the hidden message. In the Second World War the
Microdot technique was developed by the Germans. Information, especially photographs,
was reduced in size until it was the size of a typed period. Extremely difficult to detect, a
normal cover message was sent over an insecure channel with one of the periods on the
paper containing hidden information. Today steganography is mostly used on computers
with digital data being the carriers and networks being the high speed delivery channels.
Steganography differs from cryptography in the sense that where cryptography focuses
on keeping the contents of a message secret, steganography focuses on keeping the
existence of a message secret. Steganography and cryptography are both ways to protect
information from unwanted parties but neither technology alone is perfect and can be
compromised. Once the presence of hidden information is revealed or even suspected, the
purpose of steganography is partly defeated. The strength of steganography can thus be
amplified by combining it with cryptography.

Two other technologies that are closely related to steganography are watermarking and
fingerprinting. These technologies are mainly concerned with the protection of
intellectual property, thus the algorithms have different requirements than steganography.
These requirements of a good steganographic algorithm will be discussed below. In
watermarking all of the instances of an object are marked in the same way. The kind of
information hidden in objects when using watermarking is usually a signature to signify

origin or ownership for the purpose of copyright protection. With fingerprinting on the
other hand, different, unique marks are embedded in distinct copies of the carrier object
that are supplied to different customers. This enables the intellectual property owner to
identify customers who break their licensing agreement by supplying the property to third
parties.

In watermarking and fingerprinting, the fact that information is hidden inside the files may be public knowledge (sometimes it may even be visible), while in steganography the imperceptibility of the information is crucial. A successful attack on a steganographic
system consists of an adversary observing that there is information hidden inside a file,
while a successful attack on a watermarking or fingerprinting system would not be to
detect the mark, but to remove it.
Research in steganography has mainly been driven by a lack of strength in cryptographic
systems. Many governments have created laws to either limit the strength of a
cryptographic system or to prohibit it altogether, forcing people to study other methods of
secure information transfer. Businesses have also started to realize the potential of
steganography in communicating trade secrets or new product information. Avoiding
communication through well-known channels greatly reduces the risk of information
being leaked in transit. Hiding information in a photograph of the company picnic is less
suspicious than communicating an encrypted file.

This project intends to offer a state of the art overview of the different algorithms used for
image steganography to illustrate the security potential of steganography for business and
personal use. After the overview it briefly reflects on the suitability of various image
steganography techniques for various applications. This reflection is based on a set of
criteria that we have identified for image steganography.

Steganography concepts
Although steganography is an ancient subject, the modern formulation of it is often given in terms of the prisoners' problem proposed by Simmons, where two inmates wish to communicate in secret to hatch an escape plan. All of their communication passes through a warden who will throw them in solitary confinement should she suspect any covert communication.

The warden, who is free to examine all communication exchanged between the inmates,
can either be passive or active. A passive warden simply examines the communication to
try and determine if it potentially contains secret information. If she suspects a
communication to contain hidden information, a passive warden takes note of the
detected covert communication, reports this to some outside party and lets the message
through without blocking it. An active warden, on the other hand, will try to alter the
communication with the suspected hidden information deliberately, in order to remove
the information.

Different kinds of steganography


Almost all digital file formats can be used for steganography, but the formats that are more suitable are those with a high degree of redundancy. Redundancy can be defined as the bits of an object that provide accuracy far greater than necessary for the object's use and display. The redundant bits of an object are those bits that can be altered without the alteration being detected easily. Image and audio files especially comply with this requirement, while research has also uncovered other file formats that can be used for information hiding.

Figure 1 shows the four main categories of file formats that can be used for
steganography.

Figure 1: Categories of steganography

Hiding information in text is historically the most important method of steganography.


An obvious method was to hide a secret message in every nth letter of every word of a text message. It is only since the beginning of the Internet and all the different digital file formats that it has decreased in importance. Text steganography using digital files is not used very often since text files have a very small amount of redundant data. Given the
proliferation of digital images, especially on the Internet, and given the large amount of
redundant bits present in the digital representation of an image, images are the most
popular cover objects for steganography.

This project will focus on hiding information in images in the next sections.
To hide information in audio files, techniques similar to those used for image files are applied. One technique unique to audio steganography is masking, which exploits the properties of the human ear to hide information unnoticeably. A faint but audible sound becomes inaudible in the presence of another, louder audible sound.

This property creates a channel in which to hide information. Although nearly equal to
images in steganographic potential, the larger size of meaningful audio files makes them
less popular to use than images.

The term protocol steganography refers to the technique of embedding information


within messages and network control protocols used in network transmission. In the
layers of the OSI network model there exist covert channels where steganography can be used. An example of where information can be hidden is in the header of a TCP/IP packet, in some fields that are either optional or are never used.

Image steganography
As stated earlier, images are the most popular cover objects used for steganography. In
the domain of digital images many different image file formats exist, most of them for
specific applications. For these different image file formats, different steganographic
algorithms exist.

Image definition
To a computer, an image is a collection of numbers that constitute different light
intensities in different areas of the image. This numeric representation forms a grid and
the individual points are referred to as pixels. Most images on the Internet consist of a rectangular map of the image's pixels (represented as bits), specifying the location and colour of each pixel. These pixels are displayed horizontally, row by row. The number of bits in a colour scheme, called the bit depth, refers to the number of bits used for each pixel.

The smallest bit depth in current colour schemes is 8, meaning that 8 bits are used to describe the colour of each pixel. Monochrome and greyscale images use 8 bits for each pixel and are able to display 256 different colours or shades of grey. Digital colour images are typically stored in 24-bit files and use the RGB colour model, also known as true colour. All colour variations for the pixels of a 24-bit image are derived from three primary colours: red, green and blue, and each primary colour is represented by 8 bits. Thus in one given pixel there can be 256 different quantities of red, green and blue, adding up to more than 16 million combinations and resulting in more than 16 million colours. Not surprisingly, the more colours that can be displayed, the larger the file size.
Image steganography techniques:
Image steganography techniques can be divided into two groups: those in the Image
Domain and those in the Transform Domain.

Image domain (also known as spatial domain) techniques embed messages in the intensity of the pixels directly, while for transform domain (also known as frequency domain) techniques, images are first transformed and then the message is embedded in the image.

Image domain techniques encompass bit-wise methods that apply bit insertion and noise manipulation and are sometimes characterised as simple systems. The image formats that are most suitable for image domain steganography are lossless, and the techniques are typically dependent on the image format.

Steganography in the transform domain involves the manipulation of algorithms and image transforms. These methods hide messages in more significant areas of the cover image, making them more robust. Many transform domain methods are independent of the image format, and the embedded message may survive conversion between lossy and lossless compression.

Image Domain
Least Significant Bit
Least significant bit (LSB) insertion is a common, simple approach to embedding information in a cover image. The least significant bit (in other words, the 8th bit) of some or all of the bytes inside an image is changed to a bit of the secret message. When using a 24-bit image, a bit of each of the red, green and blue colour components can be used, since they are each represented by a byte. In other words, one can store 3 bits in each pixel. An 800 x 600 pixel image can thus store a total of 1,440,000 bits, or 180,000 bytes, of embedded data.

For example a grid for 3 pixels of a 24-bit image can be as follows:

(00101101 00011100 11011100)


(10100110 11000100 00001100)
(11010010 10101101 01100011)
When the number 200, whose binary representation is 11001000, is embedded into the least significant bits of this part of the image, the resulting grid is as follows:
(00101101 00011101 11011100)
(10100110 11000101 00001100)
(11010010 10101100 01100011)
Although the number was embedded into the first 8 bytes of the grid, only 3 bits (in the second, fifth and eighth bytes) needed to be changed according to the embedded message. On average, only half of the bits in an image will need to be modified to hide a secret message using the maximum cover size. Since there are 256 possible intensities of each primary colour, changing the LSB of a pixel results in small changes in the intensity of the colours. These changes cannot be perceived by the human eye; thus the message is successfully hidden. With a well-chosen image, one can even hide the message in the least significant as well as the second-to-least significant bit.
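The worked example above can be reproduced in a few lines of Python (a minimal sketch; the function names are illustrative, and the nine bytes are the grid from the text):

```python
import numpy as np

def embed_lsb(cover_bytes, value):
    """Embed one byte (MSB first) into the LSBs of the first 8 cover bytes."""
    stego = cover_bytes.copy()
    bits = [(value >> (7 - i)) & 1 for i in range(8)]   # 200 -> 1,1,0,0,1,0,0,0
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | b                # clear the LSB, then set it to b
    return stego

def extract_lsb(stego_bytes):
    """Recover the byte from the LSBs of the first 8 stego bytes."""
    value = 0
    for i in range(8):
        value = (value << 1) | (stego_bytes[i] & 1)
    return value

# The 3-pixel grid from the text (9 bytes of a 24-bit image).
cover = np.array([0b00101101, 0b00011100, 0b11011100,
                  0b10100110, 0b11000100, 0b00001100,
                  0b11010010, 0b10101101, 0b01100011], dtype=np.uint8)
stego = embed_lsb(cover, 200)
```

Running this yields exactly the stego grid shown above, with only three bytes differing from the cover.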

In the above example, consecutive bytes of the image data from the first byte to the
end of the message are used to embed the information. This approach is very easy to
detect. A slightly more secure system is for the sender and receiver to share a secret key
that specifies only certain pixels to be changed. Should an
adversary suspect that LSB steganography has been used, he has no way of knowing
which pixels to target without the secret key.

In its simplest form, LSB makes use of BMP images, since they use lossless compression.
Unfortunately to be able to hide a secret message inside a BMP file, one would require a
very large cover image. Nowadays, BMP images of 800 × 600 pixels are not often used
on the Internet and might arouse suspicion. For this reason, LSB steganography has also
been developed for use with other image file formats.

LSB and Palette Based Images


Palette based images, for example GIF images, are another popular image file format
commonly used on the Internet. By definition a GIF image cannot have a bit depth
greater than 8, thus the maximum number of colours that a GIF can store is 256. GIF
images are indexed images where the colours used in the image are stored in a palette,
sometimes referred to as a colour lookup table. Each pixel is represented as a single byte
and the pixel data is an index into the colour palette. The colours of the palette are typically
ordered from the most used to the least used colour to reduce lookup time.

GIF images can also be used for LSB steganography, although extra care should be taken.
The problem with the palette approach used with GIF images is that should one change
the least significant bit of a pixel, it can result in a completely different colour since the
index to the colour palette is changed. If adjacent palette entries are similar, there might
be little or no noticeable change, but should the adjacent palette entries be very dissimilar,
the change would be evident. One possible solution is to sort the palette so that the colour
differences between consecutive colours are minimized. Another solution is to add new
colours which are visually similar to the existing colours in the palette. This requires the
original image to have fewer unique colours than the maximum number of colours (this
value depends on the bit depth used). Using this approach, one should thus carefully

choose the right cover image. Unfortunately any tampering with the palette of an indexed
image leaves a very clear signature, making it easier to detect.
A final solution to the problem is to use greyscale images. In an 8-bit greyscale GIF
image, there are 256 different shades of grey. The changes between the colours are very
gradual, making it harder to detect.
WAVELETS
A wavelet is a wave-like oscillation with an amplitude that starts out at zero, increases, and then
decreases back to zero. It can typically be visualized as a "brief oscillation" like one might see
recorded by a seismograph or heart monitor. Generally, wavelets are purposefully crafted to have
specific properties that make them useful for signal processing. Wavelets can be combined, using
a "shift, multiply and sum" technique called convolution, with portions of an unknown signal to
extract information from the unknown signal.
For example, a wavelet could be created to have a frequency of Middle C and a short duration of
roughly a 32nd note. If this wavelet were to be convolved at periodic intervals with a signal
created from the recording of a song, then the results of these convolutions would be useful for
determining when the Middle C note was being played in the song. Mathematically, the wavelet
will resonate if the unknown signal contains information of similar frequency - just as a tuning
fork physically resonates with sound waves of its specific tuning frequency. This concept of
resonance is at the core of many practical applications of wavelet theory.
As a mathematical tool, wavelets can be used to extract information from many different kinds of
data, including - but certainly not limited to - audio signals and images. Sets of wavelets are
generally needed to analyze data fully. A set of "complementary" wavelets will deconstruct data
without gaps or overlap so that the deconstruction process is mathematically reversible. Thus,
sets of complementary wavelets are useful in wavelet based compression/decompression
algorithms where it is desirable to recover the original information with minimal loss.
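A minimal sketch, in Python, of how a complementary analysis/synthesis pair deconstructs data reversibly. The unnormalized Haar averages and differences are used purely for illustration:

```python
# Illustrative complementary analysis/synthesis pair (unnormalized Haar):
# pairwise averages carry the coarse data, differences carry the rest,
# with no gaps or overlap, so the decomposition inverts exactly.

def haar_analyze(x):
    """One level: pairwise averages (approximation) and differences (detail)."""
    avg = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    dif = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return avg, dif

def haar_synthesize(avg, dif):
    """Invert the analysis step exactly."""
    x = []
    for a, d in zip(avg, dif):
        x.extend([a + d, a - d])
    return x

avg, dif = haar_analyze([9, 7, 3, 5])
print(avg, dif)                   # [8.0, 4.0] [1.0, -1.0]
print(haar_synthesize(avg, dif))  # [9.0, 7.0, 3.0, 5.0]
```

Because nothing is lost in the analysis step, the synthesis step returns the original samples exactly, which is the reversibility property exploited by wavelet compression.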

1.1. Wavelet transforms


A wavelet is a mathematical function used to divide a given function or continuous-time signal
into different scale components. Usually one can assign a frequency range to each scale
component. Each scale component can then be studied with a resolution that matches its scale. A
wavelet transform is the representation of a function by wavelets. The wavelets are scaled and
translated copies (known as "daughter wavelets") of a finite-length or fast-decaying oscillating
waveform (known as the "mother wavelet"). Wavelet transforms have advantages over traditional
Fourier transforms for representing functions that have discontinuities and sharp peaks, and for
accurately deconstructing and reconstructing finite, non-periodic and/or non-stationary signals.
Wavelet transforms are classified into discrete wavelet transforms (DWTs) and continuous
wavelet transforms (CWTs). Note that both DWT and CWT are continuous-time (analog)
transforms. They can be used to represent continuous-time (analog) signals. CWTs operate over
every possible scale and translation whereas DWTs use a specific subset of scale and translation
values or representation grid.
Wavelet Transform:
A wavelet series is a representation of a square-integrable (real- or complex-valued) function by a
certain orthonormal series generated by a wavelet. The integral wavelet transform is the integral
transform defined as

\[ [W_\psi f](a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} \overline{\psi\!\left(\frac{x-b}{a}\right)}\, f(x)\, dx \]

The wavelet coefficients c_{jk} are then given by

\[ c_{jk} = [W_\psi f]\left(2^{-j},\, k\,2^{-j}\right) \]

Here, a = 2^{-j} is called the binary dilation or dyadic dilation, and b = k 2^{-j} is the binary or
dyadic position.

There are a large number of wavelet transforms, each suitable for different applications. The most
common ones are listed below:

Continuous wavelet transform (CWT)

Discrete wavelet transform (DWT)

Fast wavelet transform (FWT)

Stationary wavelet transform (SWT)

Lifting Scheme

Complex wavelet transform (CWT)

1.1.1. Continuous Wavelet Transform (CWT)


A continuous wavelet transform (CWT) is used to divide a continuous-time function into
wavelets. Unlike the Fourier transform, the continuous wavelet transform possesses the ability to
construct a time-frequency representation of a signal that offers very good time and frequency
localization. In mathematics, the continuous wavelet transform of a continuous, square-integrable
function x(t) at a scale a > 0 and translational value b is expressed by the following integral:

\[ X_w(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \overline{\psi\!\left(\frac{t-b}{a}\right)}\, dt \]

where \psi(t) is a continuous function in both the time domain and the frequency domain, called
the mother wavelet, and the overline represents the operation of complex conjugation. The main
purpose of the mother wavelet is to provide a source function to generate the daughter wavelets,
which are simply the translated and scaled versions of the mother wavelet. To recover the original
signal x(t), the inverse continuous wavelet transform can be exploited:

\[ x(t) = C_\psi^{-1} \int_{0}^{\infty} \int_{-\infty}^{\infty} X_w(a,b)\, \frac{1}{\sqrt{|a|}}\, \tilde{\psi}\!\left(\frac{t-b}{a}\right) db\, \frac{da}{a^2} \]

where \tilde{\psi}(t) is the dual function of \psi(t), chosen so that the inversion formula holds.
Sometimes, \tilde{\psi}(t) = C_\psi^{-1}\, \psi(t), where

\[ C_\psi = \int_{-\infty}^{\infty} \frac{|\hat{\psi}(\omega)|^2}{|\omega|}\, d\omega \]

is called the admissibility constant and \hat{\psi} is the Fourier transform of \psi. For a successful
inverse transform, the admissibility constant has to satisfy the admissibility condition
0 < C_\psi < +\infty. It is possible to show that the admissibility condition implies
\hat{\psi}(0) = 0, so that a wavelet must integrate to zero.
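This zero-mean property is easy to check numerically. The sketch below integrates the Mexican hat (Ricker) wavelet psi(t) = (1 - t^2) exp(-t^2/2), a standard admissible mother wavelet, with a simple trapezoidal rule; the grid size and support bounds are arbitrary illustrative choices:

```python
# Numerical check that an admissible wavelet integrates to zero, using
# the Mexican hat (Ricker) wavelet psi(t) = (1 - t^2) * exp(-t^2 / 2).
import math

def ricker(t):
    return (1 - t * t) * math.exp(-t * t / 2)

N = 20001                # number of sample points (illustrative)
a, b = -20.0, 20.0       # wide support; the tails are negligible here
h = (b - a) / (N - 1)
ts = [a + i * h for i in range(N)]
integral = h * (sum(ricker(t) for t in ts) - 0.5 * (ricker(a) + ricker(b)))
print(abs(integral) < 1e-9)   # True: the wavelet has zero mean
```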


Mother wavelet:
In general, it is preferable to choose a mother wavelet that is continuously differentiable with a
compactly supported scaling function and a high number of vanishing moments. A wavelet
associated with a multiresolution analysis is defined by the following two functions: the wavelet
function \psi(t) and the scaling function \varphi(t).

The scaling function is compactly supported if and only if the scaling filter h has finite support,
and their supports are the same. For instance, if the support of the scaling function is [N1, N2],
then the support of the wavelet is [(N1-N2+1)/2, (N2-N1+1)/2]. On the other hand, the k-th
moments can be expressed by the following equation:

\[ m_k = \int_{-\infty}^{\infty} t^k\, \psi(t)\, dt \]

If m_0 = m_1 = m_2 = \dots = m_{p-1} = 0, we say \psi(t) has p vanishing moments. The number of
vanishing moments of a wavelet analysis represents the order of the wavelet transform. According
to the Strang-Fix conditions, the error of an orthogonal wavelet approximation at scale a = 2^{-i}
globally decays as a^L, where L is the order of the transform. In other words, a wavelet transform
of higher order will result in better signal approximations.
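For the Haar wavelet (psi = +1 on [0, 1/2) and -1 on [1/2, 1)) these moments can be computed in closed form, illustrating that Haar has exactly one vanishing moment and is therefore a transform of order L = 1. The helper below is illustrative:

```python
# Closed-form moments of the Haar wavelet (psi = +1 on [0, 1/2),
# -1 on [1/2, 1)): m_0 vanishes but m_1 does not, so Haar has
# exactly one vanishing moment (order L = 1).

def haar_moment(k):
    """m_k = integral of t^k over [0, 1/2] minus integral over [1/2, 1]."""
    def antideriv(t):
        return t ** (k + 1) / (k + 1)
    return (antideriv(0.5) - antideriv(0.0)) - (antideriv(1.0) - antideriv(0.5))

print(haar_moment(0))   # 0.0   -> vanishes
print(haar_moment(1))   # -0.25 -> does not vanish
```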
Mathematically, the process of Fourier analysis is represented by the Fourier transform

\[ F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \]

which is the sum over all time of the signal f(t) multiplied by a complex exponential. (Recall
that a complex exponential can be broken down into real and imaginary sinusoidal components.)
The results of the transform are the Fourier coefficients F(\omega), which when multiplied by a
sinusoid of frequency \omega yield the constituent sinusoidal components of the original signal.

Similarly, the continuous wavelet transform (CWT) is defined as the sum over all time of the
signal multiplied by scaled, shifted versions of the wavelet function \psi:

\[ C(\text{scale}, \text{position}) = \int_{-\infty}^{\infty} f(t)\, \psi(\text{scale}, \text{position}, t)\, dt \]

The results of the CWT are many wavelet coefficients C, which are a function of scale and
position. Multiplying each coefficient by the appropriately scaled and shifted wavelet yields the
constituent wavelets of the original signal.
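The coefficient C(scale, position) can be approximated numerically by sampling the integrand, as in the hedged Python sketch below. The Mexican hat wavelet, the sampling step dt, and the test signal are all assumptions made for illustration:

```python
# Numerical sketch of C(scale, position): sample the signal, multiply
# by the scaled, shifted wavelet, and sum.  The Mexican hat wavelet,
# the step dt and the test signal are illustrative assumptions.
import math

def ricker(t):
    return (1 - t * t) * math.exp(-t * t / 2)

def cwt_coeff(signal, dt, scale, position):
    total = 0.0
    for n, x in enumerate(signal):
        t = n * dt
        total += x * ricker((t - position) / scale)
    return total * dt / math.sqrt(scale)

dt = 0.01
# test signal: one Mexican-hat-shaped bump centred at t = 2, width 0.3
sig = [ricker((n * dt - 2.0) / 0.3) for n in range(400)]
on_bump = cwt_coeff(sig, dt, 0.3, 2.0)    # wavelet aligned with the bump
off_bump = cwt_coeff(sig, dt, 0.3, 0.5)   # wavelet far from the bump
print(on_bump > 10 * abs(off_bump))       # True: large only at the match
```

The coefficient is large where the scaled, shifted wavelet lines up with a matching feature in the signal, which is the resonance described earlier.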

Continuous wavelets
Real-valued

Beta wavelet

Hermitian wavelet

Hermitian hat wavelet

Mexican hat wavelet

Shannon wavelet

Complex-valued

Complex Mexican hat wavelet

Morlet wavelet

Shannon wavelet

Modified Morlet wavelet

1.1.2. Discrete Wavelet Transform (DWT)


A discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are
discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier
transforms is temporal resolution: it captures both frequency and location information (location
in time).
Definition:
One level of the transform
The DWT of a signal x is calculated by passing it through a series of filters. First the samples are
passed through a low-pass filter with impulse response g, resulting in a convolution of the two:

\[ y[n] = (x * g)[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[n-k] \]

The signal is also decomposed simultaneously using a high-pass filter h. The outputs give the
detail coefficients (from the high-pass filter) and the approximation coefficients (from the
low-pass filter). It is important that the two filters are related to each other; they are known as a
quadrature mirror filter pair.

However, since half the frequencies of the signal have now been removed, half the samples can
be discarded according to Nyquist's rule. The filter outputs are then subsampled by 2 (in Mallat's
notation and the common notation the roles are swapped, g denoting the high-pass and h the
low-pass filter):

\[ y_{\text{low}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[2n-k] \qquad y_{\text{high}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[2n-k] \]

This decomposition has halved the time resolution since only half of each filter output
characterizes the signal. However, each output has half the frequency band of the input, so the
frequency resolution has been doubled.
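One level of this filter bank can be sketched directly from the two equations above. The normalized Haar filters stand in for g and h here; the code is an illustration, not an optimized implementation:

```python
# One DWT level as described above: convolve with low-pass g and
# high-pass h, then keep every second sample.  Normalized Haar filters
# are used for concreteness; conv() is a plain full convolution.
import math

g = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # low-pass filter
h = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # high-pass (quadrature mirror)

def conv(x, f):
    """Full linear convolution y[n] = sum_k x[k] * f[n - k]."""
    y = [0.0] * (len(x) + len(f) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(f):
                y[n] += x[k] * f[n - k]
    return y

def dwt_level(x):
    """Return (approximation, detail) after filtering and subsampling by 2."""
    return conv(x, g)[1::2], conv(x, h)[1::2]

x = [4.0, 6.0, 10.0, 12.0]
cA, cD = dwt_level(x)
print(len(cA), len(cD))   # 2 2 -> half the samples in each band
# the orthogonal filters preserve energy: ||x||^2 == ||cA||^2 + ||cD||^2
print(abs(sum(v * v for v in x) - sum(c * c for c in cA + cD)) < 1e-9)
```

Note that the energies of x and of (cA, cD) agree, as expected for an orthogonal filter pair.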
Examples:
Haar wavelets:
The first DWT was invented by the Hungarian mathematician Alfréd Haar. For an input
represented by a list of 2^n numbers, the Haar wavelet transform may be considered to simply pair
up input values, storing the difference and passing the sum. This process is repeated recursively,
pairing up the sums to provide the next scale, finally resulting in 2^n - 1 differences and one
final sum.
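The pairing process can be sketched as follows; the function name is illustrative:

```python
# The recursive pairing described above: store pairwise differences,
# pass pairwise sums up to the next scale.  For 2**n inputs this yields
# 2**n - 1 differences and one final sum.

def haar_pyramid(x):
    diffs = []
    while len(x) > 1:
        diffs += [x[i] - x[i + 1] for i in range(0, len(x), 2)]
        x = [x[i] + x[i + 1] for i in range(0, len(x), 2)]
    return diffs, x[0]

diffs, total = haar_pyramid([5, 3, 2, 8])   # 2**2 = 4 inputs
print(diffs)    # [2, -6, -2] -> 2**2 - 1 = 3 differences
print(total)    # 18          -> the single final sum
```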
Daubechies wavelets
The most commonly used set of discrete wavelet transforms was formulated by the Belgian
mathematician Ingrid Daubechies in 1988. This formulation is based on the use of recurrence
relations to generate progressively finer discrete samplings of an implicit mother wavelet
function; each resolution is twice that of the previous scale. In her seminal paper, Daubechies
derives a family of wavelets, the first of which is the Haar wavelet. Interest in this field has
exploded since then, and many variations of Daubechies' original wavelets were developed.

Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates
an awful lot of data. What if we choose only a subset of scales and positions at which to make
our calculations?
It turns out, rather remarkably, that if we choose scales and positions based on powers of two
(so-called dyadic scales and positions), then our analysis will be much more efficient and just as
accurate. We obtain such an analysis from the discrete wavelet transform (DWT).
An efficient way to implement this scheme using filters was developed in 1988 by Mallat. The
Mallat algorithm is in fact a classical scheme known in the signal processing community as a
two-channel sub band coder. This very practical filtering algorithm yields a fast wavelet
transform -- a box into which a signal passes, and out of which wavelet coefficients quickly
emerge.
Given a signal s of length N, the DWT consists of log2N stages at most. Starting from s, the first
step produces two sets of coefficients: approximation coefficients cA1, and detail coefficients
cD1.

These vectors are obtained by convolving s with the low-pass filter Lo_ D for approximation,
and with the high-pass filter Hi_ D for detail, followed by dyadic decimation.
More precisely, the first step is

\[ cA_1 = (\downarrow 2)\,(s * \text{Lo\_D}), \qquad cD_1 = (\downarrow 2)\,(s * \text{Hi\_D}) \]

The length of each filter is equal to 2N. If n = length(s), the full convolutions F = s * Lo_D and
G = s * Hi_D are of length n + 2N - 1, and the coefficients cA1 and cD1 are of length

\[ \left\lfloor \frac{n-1}{2} \right\rfloor + N \]
The next step splits the approximation coefficients cA1 in two parts using the same scheme,
replacing s by cA1 and producing cA2 and cD2, and so on.
Applications of discrete wavelet transform:
Generally, an approximation to the DWT is used for data compression if the signal is already
sampled, and the CWT for signal analysis. Thus, the DWT approximation is commonly used in
engineering and computer science, and the CWT in scientific research.

Wavelet transforms are now being adopted for a vast number of applications, often replacing the
conventional Fourier Transform. Many areas of physics have seen this paradigm shift, including
molecular dynamics, astrophysics, density-matrix localization, seismology, optics, turbulence
and quantum mechanics. This change has also occurred in image processing, blood-pressure,
heart-rate and ECG analyses, DNA analysis, protein analysis, climatology, general signal
processing, speech recognition, computer graphics and multi fractal analysis. In computer vision
and image processing, the notion of scale-space representation and Gaussian derivative operators
is regarded as a canonical multi-scale representation.
One use of wavelet approximation is in data compression. Like some other transforms, wavelet
transforms can be used to transform data, then encode the transformed data, resulting in effective
compression. For example, JPEG 2000 is an image compression standard that uses bi-orthogonal
wavelets. This means that although the frame is overcomplete, it is a tight frame, and the same
frame functions (except for conjugation in the case of complex wavelets) are used for both
analysis and synthesis, i.e., in both the forward and inverse transform. For details see wavelet
compression.
compression.
A related use is for smoothing/de-noising data based on wavelet coefficient thresholding, also
called wavelet shrinkage. By adaptively thresholding the wavelet coefficients that correspond to
undesired frequency components smoothing and/or de-noising operations can be performed.
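A toy sketch of wavelet shrinkage with a one-level Haar transform; the signal and the threshold value are arbitrary illustrations:

```python
# Toy wavelet shrinkage: one-level Haar analysis, soft-threshold the
# detail coefficients, then synthesize.  The signal and threshold t
# are arbitrary illustrative choices.

def analyze(x):
    return ([(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)],
            [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)])

def synthesize(avg, dif):
    out = []
    for a, d in zip(avg, dif):
        out.extend([a + d, a - d])
    return out

def soft(c, t):
    """Soft thresholding: shrink a coefficient toward zero by t."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0

noisy = [10.1, 9.9, 10.05, 9.95, 0.2, -0.1, 0.05, -0.05]
avg, dif = analyze(noisy)
denoised = synthesize(avg, [soft(d, 0.15) for d in dif])
print(denoised)   # small detail coefficients are zeroed; the steps remain
```

Thresholding removes the small, noise-like detail coefficients while leaving the coarse structure of the signal intact.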
Wavelet transforms are also starting to be used for communication applications. Wavelet OFDM
is the basic modulation scheme used in HD-PLC (a power line communications technology
developed by Panasonic), and in one of the optional modes included in the IEEE 1901 standard.
Wavelet OFDM can achieve deeper notches than a traditional FFT OFDM system, and it does not
require a guard interval (which usually represents significant overhead in FFT OFDM systems).

1.1.3. Fast Wavelet Transform (FWT)


The Fast Wavelet Transform is a mathematical algorithm designed to turn a waveform or signal
in the time domain into a sequence of coefficients based on an orthogonal basis of small finite
waves, or wavelets. The transform can be easily extended to multidimensional signals, such as
images, where the time domain is replaced with the space domain.
It has as theoretical foundation the device of a finitely generated, orthogonal multiresolution
analysis (MRA). In the terms given there, one selects a sampling scale J with sampling rate of 2^J
per unit interval, and projects the given signal f onto the space V_J, in theory by computing the
scalar products

\[ s_n^{(J)} = 2^{J/2} \int_{-\infty}^{\infty} f(x)\, \varphi(2^J x - n)\, dx \]

where \varphi is the scaling function of the chosen wavelet transform, and in practice by any
suitable sampling procedure under the condition that the signal is highly oversampled, so that

\[ P_J[f](x) = \sum_{n} s_n^{(J)}\, \varphi(2^J x - n) \]

is the orthogonal projection, or at least some good approximation, of the original signal in V_J.

The MRA is characterized by its scaling sequence

\[ a = (a_{-N}, \dots, a_N) \quad \text{or, as Z-transform,} \quad a(z) = \sum_{n=-N}^{N} a_n z^{-n} \]

and its wavelet sequence

\[ b = (b_{-N}, \dots, b_N) \quad \text{or} \quad b(z) = \sum_{n=-N}^{N} b_n z^{-n} \]

(some coefficients might be zero). These allow one to compute the wavelet coefficients d_n^{(k)},
at least over some range k = M, ..., J-1, without having to approximate the integrals in the
corresponding scalar products. Instead, one can directly, with the help of convolution and
decimation operators, compute those coefficients from the first approximation s^{(J)}.

Forward DWT:
One computes recursively, starting with the coefficient sequence s^{(J)} and counting down from
k = J-1 to some M < J, a single application of a wavelet filter bank with filters g = a*, h = b*:

\[ s_n^{(k)} = \frac{1}{2} \sum_{m} a_{m-2n}\, s_m^{(k+1)} \quad \text{or} \quad s^{(k)} = (\downarrow 2)\left(s^{(k+1)} * a^*\right) \]

and

\[ d_n^{(k)} = \frac{1}{2} \sum_{m} b_{m-2n}\, s_m^{(k+1)} \quad \text{or} \quad d^{(k)} = (\downarrow 2)\left(s^{(k+1)} * b^*\right) \]

for k = J-1, J-2, ..., M and all n. These steps amount to the recursive application of the filter
bank. In the Z-transform notation:

The down-sampling operator (\downarrow 2) reduces an infinite sequence, given by its
Z-transform (which is simply a Laurent series), to the sequence of the coefficients with even
indices, (\downarrow 2)(c(z)) = \sum_k c_{2k} z^{-k}.

The starred Laurent polynomial a^*(z) denotes the adjoint filter; it has time-reversed adjoint
coefficients, a^*_n = \overline{a_{-n}}. (The adjoint of a real number is the number itself, of a
complex number its conjugate, of a real matrix the transposed matrix, of a complex matrix its
Hermitian adjoint.)

Multiplication is polynomial multiplication, which is equivalent to the convolution of the
coefficient sequences.

It follows that

\[ P_k[f](x) = \sum_{n} s_n^{(k)}\, 2^{k/2}\, \varphi(2^k x - n) \]

is the orthogonal projection of the original signal f, or at least of the first approximation
P_J[f](x), onto the subspace V_k, that is, with sampling rate of 2^k per unit interval. The
difference to the first approximation is given by

\[ P_J[f](x) = P_M[f](x) + D_M[f](x) + \dots + D_{J-1}[f](x), \]

where the difference or detail signals are computed from the detail coefficients as

\[ D_k[f](x) = \sum_{n} d_n^{(k)}\, 2^{k/2}\, \psi(2^k x - n), \]

with \psi denoting the mother wavelet of the wavelet transform.
1.1.4. Stationary Wavelet Transform (SWT)

We know that the classical DWT suffers a drawback: the DWT is not a time- invariant transform.
This means that, even with periodic signal extension, the DWT of a translated version of a signal
X is not, in general, the translated version of the DWT of X.
The Stationary wavelet transform (SWT) is a wavelet transform algorithm designed to
overcome the lack of translation-invariance of the discrete wavelet transform (DWT).
Translation-invariance is achieved by removing the down-samplers and up-samplers in the DWT
and instead up-sampling the filter coefficients by a factor of 2^(j-1) in the j-th level of the
algorithm. The SWT is an inherently redundant scheme, as the output of each level of the SWT
contains the same number of samples as the input, so for a decomposition of N levels there is a
redundancy of N in the wavelet coefficients.
The idea is to average some slightly different DWTs, called ε-decimated DWTs, to define the
stationary wavelet transform (SWT). This property is useful for several applications, such as
breakdown-point detection. The main application of the SWT is de-noising.
The principle is to average several de-noised signals: each of them is obtained using the usual
de-noising scheme, but applied to the coefficients of an ε-decimated DWT.
There is a restriction: we define the SWT only for signals of length divisible by 2^J, where J is the
maximum decomposition level, and we use the DWT with periodic extension.
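One SWT level can be sketched as plain circular (periodic) filtering with no decimation, which makes the redundancy visible: each band is as long as the input. The Haar averaging/differencing filters and the periodic extension are illustrative choices:

```python
# One SWT level sketched as circular (periodic) filtering with NO
# decimation: every band keeps the full input length, which is the
# redundancy described above.

def circular_conv(x, f):
    """Periodic convolution y[i] = sum_k x[(i - k) mod n] * f[k]."""
    n = len(x)
    return [sum(x[(i - k) % n] * f[k] for k in range(len(f))) for i in range(n)]

g = [0.5, 0.5]     # low-pass (averaging) Haar filter
h = [0.5, -0.5]    # high-pass (differencing) Haar filter

x = [1.0, 2.0, 3.0, 4.0]
approx = circular_conv(x, g)
detail = circular_conv(x, h)
print(len(approx), len(detail))   # 4 4 -> same length as the input
```

A further level would up-sample g and h by inserting zeros between their taps, as the filter-bank diagrams below describe.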
Implementation:
The following block diagram depicts the digital implementation of SWT.

A 3 level SWT filter bank


In the above diagram, filters in each level are up-sampled versions of the previous (see figure
below).

SWT filters
Applications:
A few applications of SWT are specified below.

Signal de-noising

Pattern recognition

1.1.5. Lifting scheme


The lifting scheme is a technique for both designing wavelets and performing the discrete
wavelet transform. Actually it is worthwhile to merge these steps and design the wavelet filters
while performing the wavelet transform. This is then called the second generation wavelet
transform. The technique was introduced by Wim Sweldens.
The discrete wavelet transform applies several filters separately to the same signal. In contrast to
that, for the lifting scheme the signal is divided like a zipper, and then a series of
convolution-accumulate operations is applied across the divided signals.
The basic idea of lifting is the following: if a pair of filters (h, g) is complementary, that is, it
allows for perfect reconstruction, then for every filter s the pair (h', g) with

\[ h'(z) = h(z) + s(z^2)\, g(z) \]

allows for perfect reconstruction, too. Of course, this is also true for every pair (h, g') of the form

\[ g'(z) = g(z) + t(z^2)\, h(z) \]

The converse is also true: if the filter banks (h, g) and (h', g) allow for perfect reconstruction,
then there is a unique filter s with h'(z) = h(z) + s(z^2) g(z).
Each such transform of the filter bank (or the respective operation in a wavelet transform) is
called a lifting step. A sequence of lifting steps consists of alternating lifts, that is, once the low
pass is fixed and the high pass is changed and in the next step the high pass is fixed and the low
pass is changed. Successive steps of the same direction can be merged.
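The split/predict/update structure can be sketched with the Haar wavelet; the step names follow common lifting terminology, and the normalization is an illustrative choice:

```python
# Haar lifting sketch: split the signal "like a zipper" into even and
# odd samples, then a predict step (lifting the high pass) and an
# update step (lifting the low pass).  Each step is trivially
# invertible, so reconstruction is exact.

def haar_lift(x):
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]       # predict: detail
    s = [e + di / 2 for e, di in zip(even, d)]   # update: running average
    return s, d

def haar_unlift(s, d):
    even = [si - di / 2 for si, di in zip(s, d)]  # undo the update step
    odd = [di + e for di, e in zip(d, even)]      # undo the predict step
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x

x = [6.0, 2.0, 9.0, 5.0]
s, d = haar_lift(x)
print(s, d)                    # [4.0, 7.0] [-4.0, -4.0]
print(haar_unlift(s, d) == x)  # True: perfect reconstruction
```

Because each lifting step only adds a function of one half of the data to the other half, inverting it is just subtracting the same quantity back, which is why lifting also enables in-place and integer-to-integer transforms.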

1.1.6. Complex Wavelet Transform (CWT)


The complex wavelet transform (CWT) is a complex-valued extension to the standard discrete
wavelet transform (DWT). It is a two-dimensional wavelet transform which provides multi
resolution, sparse representation, and useful characterization of the structure of an image.
Further, it purveys a high degree of shift-invariance in its magnitude. However, a drawback to
this transform is that it exhibits 2^d redundancy (where d is the dimension of the signal being
transformed) compared to a separable DWT.
The use of complex wavelets in image processing was originally set up in 1995 by J.M. Lina and
L. Gagnon in the framework of the Daubechies orthogonal filter banks. It was then generalized
in 1997 by Prof. Nick Kingsbury of Cambridge University.
In the area of computer vision, by exploiting the concept of visual contexts, one can quickly
focus on candidate regions, where objects of interest may be found, and then compute additional
features through the CWT for those regions only. These additional features, while not necessary
for global regions, are useful in accurate detection and recognition of smaller objects. Similarly,
the CWT may be applied to detect the activated voxels of cortex and additionally the temporal
independent component analysis (tICA) may be utilized to extract the underlying independent
sources whose number is determined by Bayesian information criterion.

INTRODUCTION TO MATLAB
What Is MATLAB?

MATLAB is a high-performance language for technical computing. It integrates


computation, visualization, and programming in an easy-to-use environment where problems and
solutions are expressed in familiar mathematical notation. Typical uses include

Math and computation


Algorithm development
Data acquisition
Modeling, simulation, and prototyping
Data analysis, exploration, and visualization
Scientific and engineering graphics
Application development, including graphical user interface building.

MATLAB is an interactive system whose basic data element is an array that does not
require dimensioning. This allows you to solve many technical computing problems, especially
those with matrix and vector formulations, in a fraction of the time it would take to write a
program in a scalar non-interactive language such as C or FORTRAN.

The name MATLAB stands for matrix laboratory. MATLAB was originally written to
provide easy access to matrix software developed by the LINPACK and EISPACK projects.
Today, MATLAB engines incorporate the LAPACK and BLAS libraries, embedding the state of
the art in software for matrix computation.

MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses in

mathematics, engineering, and science. In industry, MATLAB is the tool of choice for
high-productivity research, development, and analysis.
MATLAB features a family of add-on application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow you to learn and apply specialized
technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that
extend the MATLAB environment to solve particular classes of problems. Areas in which
toolboxes are available include signal processing, control systems, neural networks, fuzzy logic,
wavelets, simulation, and many others.

The MATLAB System:


The MATLAB system consists of five main parts:

Development Environment:
This is the set of tools and facilities that help you use MATLAB functions and files. Many
of these tools are graphical user interfaces. It includes the MATLAB desktop and Command
Window, a command history, an editor and debugger, and browsers for viewing help, the
workspace, files, and the search path.

The MATLAB Mathematical Function:


This is a vast collection of computational algorithms ranging from elementary functions
like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix
inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.

The MATLAB Language:


This is a high-level matrix/array language with control flow statements, functions, data
structures, input/output, and object-oriented programming features. It allows both "programming
in the small" to rapidly create quick and dirty throw-away programs, and "programming in the
large" to create complete large and complex application programs.

Graphics:
MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as
annotating and printing these graphs. It includes high-level functions for two-dimensional and
three-dimensional data visualization, image processing, animation, and presentation graphics. It
also includes low-level functions that allow you to fully customize the appearance of graphics as
well as to build complete graphical user interfaces on your MATLAB applications.

The MATLAB Application Program Interface (API):


This is a library that allows you to write C and Fortran programs that interact with
MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling
MATLAB as a computational engine, and for reading and writing MAT-files.

MATLAB working environment:


MATLAB desktop:

The MATLAB desktop is the main MATLAB application window. The desktop contains five
sub-windows: the command window, the workspace browser, the current directory window, the
command history window, and one or more figure windows, which are shown only when the
user displays a graphic.

The command window is where the user types MATLAB commands and expressions at
the prompt (>>) and where the output of those commands is displayed. MATLAB defines the
workspace as the set of variables that the user creates in a work session. The workspace browser
shows these variables and some information about them. Double clicking on a variable in the

workspace browser launches the Array Editor, which can be used to obtain information about,
and in some instances edit, certain properties of the variable.

The Current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be as follows: C:\MATLAB\Work, indicating that directory
work is a subdirectory of the main directory MATLAB, which is installed in drive C. Clicking
on the arrow in the current directory window shows a list of recently used paths. Clicking on the
button to the right of the window allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB-related files, which are
organized in directories in the computer file system. Any file run in MATLAB must reside in the
current directory or in a directory that is on the search path. By default, the files supplied with
MATLAB and MathWorks toolboxes are included in the search path. The easiest way to see
which directories are on the search path, or to add or modify the search path, is to select Set Path
from the File menu of the desktop, and then use the Set Path dialog box. It is good practice to
add any commonly used directories to the search path to avoid repeatedly having to change the
current directory.
The Command History Window contains a record of the commands a user has entered in
the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands.

This action launches a menu from which one can select various options in addition to executing
the commands. This is a useful feature when experimenting with various commands in a work
session.

Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files and a graphical
MATLAB debugger. The editor can appear in a window by itself, or it can be a sub window in
the desktop. M-files are denoted by the extension .m, as in pixelup.m. The MATLAB editor
window has numerous pull-down menus for tasks such as saving, viewing, and debugging files.
Because it performs some simple checks and also uses color to differentiate between various
elements of code, this text editor is recommended as the tool of choice for writing and editing
M-functions. Typing edit filename.m at the prompt opens the file filename.m in an editor
window, ready for editing. As noted earlier, the file must be in the current directory, or in a
directory in the search path.
