
International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)

Web Site: www.ijettcs.org Email: editor@ijettcs.org


Volume 5, Issue 5, September - October 2016 ISSN 2278-6856

IMAGE COMPRESSION USING GENETIC PROGRAMMING
Dr. Ahmed M. Ebid
Lecturer, Faculty of Engineering & Technology, Future University, Cairo, Egypt

Abstract

The fast growth of digital image applications such as web sites, multimedia and even personal image archives has encouraged researchers to develop advanced techniques to compress images. Many compression techniques have been introduced, both reversible and irreversible. Most of them are based on statistical analysis of repetition or on mathematical transforms that reduce the size of the image. This research concerns applying the Genetic Programming (GP) technique to image compression. To achieve that goal, a parametric study was carried out to determine the optimum combination of GP parameters for maximum quality and compression ratio. For simplicity, the study considered 256-level gray scale images. Special C++ software was developed to carry out all calculations, and the compressed images were rendered using Microsoft Excel. The study results were compared with JPEG results, as one of the most popular lossy compression techniques. It is concluded that using the optimum GP parameters leads to acceptable quality (objectively and subjectively) at compression ratios between 2.5 and 4.5.

Keywords: Image compression, Genetic Programming, GP.

1. INTRODUCTION

1.1 Image Compression
The rapid increase in the use of photos and images on the web, in social media, commercial applications, personal archives, etc. encourages developers to find new techniques that reduce image size without significant loss of quality; hence, image compression has become one of the main topics in digital data processing research. Image compression techniques can be divided into two main types: reversible (lossless) and irreversible (lossy).

Reversible techniques, such as variable-length, bit-plane and Huffman coding, are based on eliminating redundant information and keeping only the necessary information. The reconstructed image is exactly like the original (error-free reconstruction). GIF, PNG and TIFF are the most famous lossless image formats.

Irreversible techniques, such as the Karhunen-Loeve Transform (KLT), Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT), are based on approximating the image using a mathematical transformation. The reconstructed image looks acceptably like the original (the accepted accuracy depends on the purpose). JPEG and JPEG2000 are the most famous lossy image formats.

Compression efficiency can be expressed by the compression ratio CR (the ratio between the uncompressed and compressed image sizes), while measuring compression quality is a little more complicated. Compression quality can be evaluated objectively using the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR) as follows:

$$MSE = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\left[f(x,y)-f'(x,y)\right]^{2}$$

$$PSNR = 10\log_{10}\!\left(\frac{L^{2}}{MSE}\right)$$

where f(x,y) is the pixel value of the original image, f'(x,y) is the pixel value of the compressed image, and L is the dynamic range of allowable pixel intensities, L = 2^(No. of bits/pixel) - 1.

Also, compression quality can be evaluated subjectively by human observer judgment (Very Poor, Poor, Good, Very Good or Excellent) or using the structural similarity index (SSIM) as follows:

$$SSIM(x,y) = \left[l(x,y)\right]^{\alpha}\left[c(x,y)\right]^{\beta}\left[s(x,y)\right]^{\gamma}$$

where l(x,y), c(x,y) and s(x,y) represent the luminance, contrast and structure components. The relative importance factors (α, β, γ) are usually set equal to 1.0. SSIM for two images (x, y) can be expressed in terms of their means (μx, μy), standard deviations (σx, σy) and covariance (σxy) as follows:

$$SSIM(x,y) = \frac{(2\mu_{x}\mu_{y}+C_{1})(2\sigma_{xy}+C_{2})}{(\mu_{x}^{2}+\mu_{y}^{2}+C_{1})(\sigma_{x}^{2}+\sigma_{y}^{2}+C_{2})}$$

where C1 and C2 are small constants used to avoid mathematical instability when (μx² + μy²) or (σx² + σy²) is almost zero.
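As an illustration of the definitions above, a minimal C++ sketch of the objective metrics is shown below; it uses a single global SSIM window and the conventional small constants C1 and C2 (both assumptions made here for brevity), and is not the software used in this study:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Objective quality metrics for two 8-bit gray scale images of equal size.
// A single global SSIM window is used here for brevity; the implementation
// cited later (Chen and Bovik, 2010) evaluates SSIM over local windows.
struct Metrics { double mse, psnr, ssim; };

Metrics evaluate(const std::vector<double>& f, const std::vector<double>& g)
{
    const double L  = 255.0;                   // dynamic range: 2^8 - 1
    const double C1 = (0.01 * L) * (0.01 * L); // conventional small constants
    const double C2 = (0.03 * L) * (0.03 * L);
    const double n  = static_cast<double>(f.size());

    double se = 0.0, mx = 0.0, my = 0.0;
    for (std::size_t i = 0; i < f.size(); ++i) {
        const double d = f[i] - g[i];
        se += d * d;                           // accumulate squared error
        mx += f[i];
        my += g[i];
    }
    const double mse = se / n;
    mx /= n;
    my /= n;

    double vx = 0.0, vy = 0.0, cov = 0.0;      // variances and covariance
    for (std::size_t i = 0; i < f.size(); ++i) {
        vx  += (f[i] - mx) * (f[i] - mx);
        vy  += (g[i] - my) * (g[i] - my);
        cov += (f[i] - mx) * (g[i] - my);
    }
    vx /= n;  vy /= n;  cov /= n;

    const double psnr = 10.0 * std::log10(L * L / mse);
    const double ssim = ((2.0 * mx * my + C1) * (2.0 * cov + C2)) /
                        ((mx * mx + my * my + C1) * (vx + vy + C2));
    return { mse, psnr, ssim };
}
```

Passing the flattened original and reconstructed pixel arrays to evaluate() yields the three quantities reported later in tables (1) and (2).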


In order to achieve the maximum compression ratio, successive compression techniques may be combined; for example, the JPEG format applies Huffman coding (lossless) after the DCT (lossy).

1.2 Genetic Programming
Genetic Programming (GP), also known as Gene Expression Programming (GEP), is one of the most recently developed knowledge-based techniques and is the next development of the Genetic Algorithm (GA). GP can be classified as a multivariable regression technique. The basic concept of GP is to find the best-fitting formula for a certain set of given points using the GA technique (Koza, 1996).

A traditional GA is based on generating a random set of solutions for the considered problem (the population), testing the success (fitness) of each solution in the population, keeping the most fitting solutions (the survivors) and deleting the rest of the population, re-generating a new population by mixing parts of the survivors (crossover) and/or randomly changing some parts of the newly generated solutions (mutation), and repeating the previous steps until an acceptable fitting error is achieved.

For GP, the population is a set of randomly generated mathematical formulas. A mathematical formula can usually be written in many ways, but in order to facilitate applying genetic operations such as crossover and mutation, the formula must be presented in a unified form consisting of a series of parameters with a fixed length; this form is called the genetic form (or chromosome). The binary tree form is used to convert any mathematical formula into a standard fixed format which can easily be presented in genetic form. Generally, any mathematical or logical operator can be used in GP; the only restriction is that the operator should have two or fewer inputs and exactly one output. The five basic mathematical operators are (=, +, -, x, /), as shown in figure (1). Figure (2) shows an example of a formula presented in mathematical, binary tree and genetic formats, and it can be noted that:
 Any formula in binary tree form can be divided into levels; the more levels, the more complex the formula.
 The genetic form (chromosome) is divided into two parts: the upper part for operators and the lower part for variables.
 The length of the operators part is (2^No. of levels - 1), the length of the variables part is (2^No. of levels), and the total length of the chromosome is (2^(No. of levels+1) - 1).

Figure 1: The five basic mathematical operators in (GP)

Figure 2: Mathematical and genetic representation of a binary tree

The two-point crossover technique was suggested by Riccardo (1996); the re-generated member is produced by mixing the operators parts and variables parts of two randomly selected survivors. The author developed another crossover technique in 2004, called random crossover, where the re-generated member is produced gene by gene by random selection from many survivors (typically all survivors). Figure (3) shows both techniques.

Figure 3: (a) Two-point crossover, (b) Random crossover
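As an illustration of the genetic form and the random crossover described above, a minimal C++ sketch is given below; the data layout and names are illustrative assumptions, not the author's 2004 implementation:

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Genetic form of a formula with a fixed number of binary-tree levels.
// Operators part: 2^levels - 1 genes; variables part: 2^levels genes;
// total chromosome length: 2^(levels+1) - 1 genes.
struct Chromosome {
    int levels;
    std::vector<char> ops;   // operator genes: '=', '+', '-', 'x' or '/'
    std::vector<char> vars;  // variable genes: 'x' or 'y'

    explicit Chromosome(int nLevels)
        : levels(nLevels),
          ops((1 << nLevels) - 1),
          vars(1 << nLevels) {}
};

// Random crossover: every gene of the child is copied from a randomly chosen
// member of the survivor pool, gene by gene (the pool is assumed non-empty
// and all chromosomes have the same number of levels).
Chromosome randomCrossover(const std::vector<Chromosome>& survivors)
{
    Chromosome child(survivors.front().levels);
    for (std::size_t i = 0; i < child.ops.size(); ++i)
        child.ops[i] = survivors[std::rand() % survivors.size()].ops[i];
    for (std::size_t i = 0; i < child.vars.size(); ++i)
        child.vars[i] = survivors[std::rand() % survivors.size()].vars[i];
    return child;
}
```

With two levels, for example, the operators part holds 3 genes and the variables part 4 genes, i.e. the 7-gene chromosome length given above.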


2. RESEARCH PLAN
The research plan aims to determine the optimum combination of GP parameters that achieves maximum quality and compression ratio. Since GP is used to find a best-fitting formula, there will always be some error in the results, and therefore image compression using GP is irreversible (lossy).

Success in compression was measured using MSE and PSNR for objective image quality, SSIM for subjective image quality, and CR for compression efficiency.

A parametric study was carried out to evaluate the effect of several parameters on compressed image quality and compression ratio. The considered parameters are sampling size, number of levels, population size and number of generations.

Special C++ software was developed to carry out all compression and decompression calculations, while the rendering of both the original and reconstructed images was carried out using the built-in cell color function of Microsoft Excel. MSE, PSNR and SSIM were calculated using software written by Ming-Jun Chen (2010).

In order to maximize the compression ratio, the image compressed by GP was recompressed using WinZip.

The well-known gray scale photo "lena.tif" is used as the subject of the parametric study; it is 256x256 pixels with 256 levels of gray. The study results were verified using "Cameraman.tif". Both "lena.tif" and "Cameraman.tif" were compressed using the JPEG technique by the commercial software "Snagit Editor" at two different quality degrees (80% & 90%) to be compared with the study results.

3. METHODOLOGY
3.1 Parametric study
Because GP is a resource-consuming technique, it is important to figure out the optimum GP parameters that produce the best image quality and maximum compression ratio with minimum resources. In order to achieve that goal, a parametric study was designed to test the effect of changing each parameter on the quality and size of the compressed image. The considered parameters and their values are as follows:
Sample size (2x2, 4x4, 8x8) pixels
Number of chromosome levels (2, 3, 4) levels
Population size (50, 100, 300)
Number of generations (25, 50, 100)

3.2 Compression & decompression software
The author had developed a general-purpose GP software package back in 2004 using C++; this software is used as the core of the compression software. It starts by reading the original image, dividing it into zones according to the sample size, applying the GP technique to each zone, collecting the results in an output file and compressing that file using WinZip to produce the compressed image. In order to apply GP to a certain zone, the software generates a set of records that presents the data of this zone in usable form; each record contains the X, Y coordinates of a pixel (referred to the zone edge) and the gray level of that pixel. After generating the population, the software checks the singularities of each formula to avoid any mathematical, overflow or underflow errors during calculations; defective formulas are replaced by new randomly generated ones. Then the software picks the survivors based on their sum of squared errors (SSE), regenerates the population using random crossover and starts the next generation. Figure (4) shows the flowchart of the compression software.

The output file contains the best-fitting formula for each zone in genetic form, hence its size should be (chromosome size x number of zones). Because an operator gene can only be (+, -, x or /) and a variable gene can only be (x or y), both types of genes are stored in 2 bits, so the chromosome size in bytes is about a quarter of the chromosome length in genes (2^(No. of levels+1) - 1). For example, a 256x256 image divided into 2x2 zones with 2-level formulas gives 16384 zones of 2 bytes each, i.e. 32 KB before applying WinZip. This means that the size of the output file (before applying WinZip) depends only on the number of levels and the number of zones, regardless of the content of the image.

The decompression software reads the compressed image, unzips it, regenerates each zone by substituting into the corresponding formula and finally puts all the pieces together to form the reconstructed image. Figure (5) shows the flowchart of the decompression software.

Figure (4): Flowchart of compression software
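As an illustration of the per-zone fitting step, the sketch below builds the (X, Y, gray) records of one zone and evaluates the SSE fitness of a candidate chromosome by recursively evaluating its binary tree. The heap-style node indexing, the treatment of '=' as a pass-through and all names are illustrative assumptions, not the author's implementation:

```cpp
#include <vector>

// Genetic form of a zone formula (as sketched in section 1.2).
struct Chromosome {
    std::vector<char> ops;   // 2^levels - 1 operator genes: '=', '+', '-', 'x', '/'
    std::vector<char> vars;  // 2^levels variable genes: 'x' or 'y'
};

// One pixel record of a zone; X, Y are measured from the zone edge.
struct Record { double x, y, gray; };

// Collect the records of one sample x sample zone whose top-left pixel is (x0, y0).
std::vector<Record> makeRecords(const std::vector<std::vector<double>>& image,
                                int x0, int y0, int sample)
{
    std::vector<Record> recs;
    for (int dy = 0; dy < sample; ++dy)
        for (int dx = 0; dx < sample; ++dx)
            recs.push_back({ double(dx), double(dy), image[y0 + dy][x0 + dx] });
    return recs;
}

// Evaluate the chromosome as a full binary tree stored in heap order:
// nodes 0 .. 2^levels-2 are operators, the remaining nodes are variable leaves.
double evalNode(const Chromosome& c, int node, double x, double y)
{
    const int firstLeaf = static_cast<int>(c.ops.size());
    if (node >= firstLeaf)                          // leaf: a variable gene
        return (c.vars[node - firstLeaf] == 'x') ? x : y;
    const double a = evalNode(c, 2 * node + 1, x, y);
    const double b = evalNode(c, 2 * node + 2, x, y);
    switch (c.ops[node]) {
        case '+': return a + b;
        case '-': return a - b;
        case 'x': return a * b;
        case '/': return (b != 0.0) ? a / b : 0.0;  // guard against the singularities mentioned above
        default:  return a;                         // '=' treated here as a pass-through of the left branch
    }
}

// Fitness of a candidate formula for one zone: sum of squared errors (SSE).
double sse(const Chromosome& c, const std::vector<Record>& recs)
{
    double total = 0.0;
    for (const Record& r : recs) {
        const double e = evalNode(c, 0, r.x, r.y) - r.gray;
        total += e * e;
    }
    return total;
}
```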


Figure (5): Flowchart of decompression software

3.3 Image rendering
Both the original and reconstructed images are graphically presented using the cell color conditional formatting of Microsoft Excel.

4. RESULTS & COMMENTS
The results of the parametric study are summarized in table (1). The table shows the GP parameters of each trial and the corresponding MSE, PSNR, SSIM, compressed file size (before and after WinZip) and compression ratio. It should be noted that the number of survivors is kept in the optimum range (25-35% of the population size) in all trials to speed up the iterations and prevent sticking in a local minimum. The study was divided into two stages as follows:

First stage:
Nine trials (1 to 9) were dedicated to studying the effect of sampling size and number of levels on the quality and size of the compressed image. Figure (6-a) presents the relation between sample size and MSE, while figure (6-b) presents the relation between sample size and compression ratio for different numbers of levels.

Figure (6-a) shows that the MSE value decreases almost linearly with decreasing sample size, regardless of the number of levels, while figure (6-b) shows that the CR value also decreases with decreasing sample size, regardless of the number of levels. Also, the CR value decreases faster for simple formulas than for complicated ones.

The CR value of trial no. (7) is less than 1.00, which means it is an inefficient combination because the size of the compressed image is larger than that of the original one.

Figure (8) shows the reconstructed images of the first stage trials for visual quality judgment. From table (1) it can be noted that SSIM increases with decreasing sample size.

Figure (6): Effect of Sample size & No. of levels on (MSE) and (CR)

Second stage:
This stage aims to study the effect of population size and number of generations on the quality of the compressed image. Based on both the quality and compression ratio results, trials (1) and (5) were selected for the second stage. Although trial (8) has the same quality and compression ratio as trial (1), it was neglected because it uses more complex formulas (4 levels) and consumes much more time.

This stage consists of twelve trials (10 to 21). Trials (10) to (18) were dedicated to expanding the study of trial (1), while trials (19) to (21) expand the study of trial (5).

Figure (7-a) shows the relation between the number of generations and MSE for different population sizes; it


indicates that, regardless of population size, increasing the number of generations improves the MSE value, but the rate of improvement decreases once the number of generations exceeds 50. Figure (7-b) shows the relation between population size and MSE for different sample sizes; it can be noted that the MSE value decreases with increasing population size, and the rate of decrease for the small sample size is faster than for the large one.

Figure (7): Effect of No. of generations & Population size on (MSE)

Figure (9) shows the reconstructed images of the second stage trials of the study. Table (1) shows that the number of generations has a very minor effect on the SSIM values. Comparing the GP study results with the equivalent JPEG ones (trial 17 with trial 23 and trial 21 with trial 22) shows that, for the same compression ratio, the SSIM values are very close.

Based on the previous results, it is recommended to use a population size of not less than 1000 and up to 50 generations.

Verification:
"Cameraman.tif" was used to verify the results of the study; it was compressed using the parameter combinations of trials (17), (21) and (8), besides the JPEG trials, as shown in table (2). The verification shows good agreement with the study results. Figure (10) presents the reconstructed images of the three verification trials.

5. CONCLUSIONS
The results of this research can be concluded as follows:
 Using the GP technique in image compression shows acceptable quality (SSIM and MSE) for compression ratios between 2.5 and 4.5.
 Two GP parameter combinations showed both acceptable quality and compression ratio:
 Sample size 2x2 pixels with a 2-level formula and a compression ratio of 2.5
 Sample size 4x4 pixels with a 3-level formula and a compression ratio of 4.5
 For the same compression ratio, the quality of the GP-compressed images is very close to that of the JPEG ones.
 It is recommended to use a population size of not less than 1000 and up to 50 generations.
 Using WinZip as a lossless compression tool after GP compression enhances the compression ratio by about 20%.
 The results of this study are valid for gray scale images and need to be verified for colored ones.
 GP is a very time- and resource-consuming technique; it is recommended to use parallel processing in further studies.

ACKNOWLEDGEMENT
The author is very grateful to his colleague Dr. Omar M. Fahmy for his kind revision and useful comments.


Table (1): Summary of parametric study results

Trial No. | Sample Size | No. of Levels | Pop. Size | No. of Survivors | No. of Gen. | MSE | PSNR | SSIM | GP Size (KB) | GP+WinZip (KB) | CR

Effect of Sample size & No. of levels
1 | 2x2 | 2 | 1000 | 300 | 100 | 59 | 30.4 | 0.970 | 32.0 | 25.9 | 2.5
2 | 4x4 | 2 | 1000 | 300 | 100 | 175 | 25.7 | 0.926 | 8.0 | 6.6 | 9.7
3 | 8x8 | 2 | 1000 | 300 | 100 | 365 | 22.5 | 0.830 | 2.0 | 1.7 | 37.6
4 | 2x2 | 3 | 1000 | 300 | 100 | 17 | 35.8 | 0.983 | 64.0 | 50.6 | 1.3
5 | 4x4 | 3 | 1000 | 300 | 100 | 105 | 27.9 | 0.958 | 16.0 | 13.1 | 4.9
6 | 8x8 | 3 | 1000 | 300 | 100 | 240 | 24.3 | 0.884 | 4.0 | 3.4 | 18.8
7 | 2x2 | 4 | 1000 | 300 | 100 | 9 | 38.6 | 0.987 | 128.0 | 102.4 | 0.6
8 | 4x4 | 4 | 1000 | 300 | 100 | 82 | 29.0 | 0.967 | 32.0 | 25.7 | 2.5
9 | 8x8 | 4 | 1000 | 300 | 100 | 202 | 25.1 | 0.906 | 8.0 | 6.6 | 9.7

Effect of Population size & No. of Generations
10 | 2x2 | 2 | 100 | 30 | 25 | 95 | 28.4 | 0.955 | 32.5 | 25.7 | 2.5
11 | 2x2 | 2 | 100 | 30 | 50 | 86 | 28.8 | 0.958 | 32.5 | 25.8 | 2.5
12 | 2x2 | 2 | 100 | 30 | 100 | 81 | 29.0 | 0.960 | 32.5 | 25.8 | 2.5
13 | 2x2 | 2 | 500 | 150 | 25 | 74 | 29.4 | 0.965 | 32.5 | 25.8 | 2.5
14 | 2x2 | 2 | 500 | 150 | 50 | 67 | 29.9 | 0.966 | 32.5 | 25.9 | 2.5
15 | 2x2 | 2 | 500 | 150 | 100 | 63 | 30.1 | 0.968 | 32.5 | 25.9 | 2.5
16 | 2x2 | 2 | 1000 | 300 | 25 | 63 | 30.1 | 0.968 | 32.5 | 25.9 | 2.5
17 | 2x2 | 2 | 1000 | 300 | 50 | 60 | 30.3 | 0.970 | 32.5 | 25.9 | 2.5
18 | 2x2 | 2 | 1000 | 300 | 100 | 59 | 30.4 | 0.970 | 32.5 | 25.9 | 2.5
19 | 4x4 | 3 | 100 | 30 | 50 | 136 | 26.8 | 0.950 | 16.0 | 12.9 | 5.0
20 | 4x4 | 3 | 500 | 150 | 50 | 119 | 27.4 | 0.957 | 16.0 | 13.1 | 4.9
21 | 4x4 | 3 | 1000 | 300 | 50 | 110 | 27.7 | 0.960 | 16.0 | 13.1 | 4.9

JPEG technique
22 | JPEG, 80% quality | - | - | - | - | 29 | 33.5 | 0.98 | - | 14.0 | 4.6
23 | JPEG, 90% quality | - | - | - | - | 11 | 37.7 | 0.99 | - | 22.0 | 2.9
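As a quick consistency check of table (1), assuming L = 255 for 8-bit pixels and an uncompressed size of 256 x 256 bytes = 64 KB, trial (1) gives:

$$PSNR = 10\log_{10}\!\left(\frac{255^{2}}{59}\right)\approx 30.4\ \text{dB},\qquad CR=\frac{64\ \text{KB}}{25.9\ \text{KB}}\approx 2.5$$

which matches the tabulated values.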

Table (2): Summary of verification results

Trial No. | Sample Size | No. of Levels | Pop. Size | No. of Survivors | No. of Gen. | MSE | PSNR | SSIM | GP Size (KB) | GP+WinZip (KB) | CR
1 | 2x2 | 2 | 1000 | 300 | 50 | 72 | 29.6 | 0.965 | 32.5 | 25.9 | 2.5
2 | 4x4 | 3 | 1000 | 300 | 50 | 178 | 25.6 | 0.930 | 16.0 | 13.1 | 4.9
3 | 4x4 | 4 | 1000 | 300 | 50 | 161 | 26.1 | 0.960 | 32.0 | 25.7 | 2.5

JPEG technique
4 | JPEG, 80% quality | - | - | - | - | 20 | 35.1 | 0.98 | - | 13.0 | 4.9
5 | JPEG, 90% quality | - | - | - | - | 7 | 39.7 | 0.99 | - | 20.0 | 3.2


Figure (8): Effect of Sample size & No. of levels on visual quality


Figure (9): Effect of population size on visual quality

Figure (10): Verifying study results using “cameraman” image


References
[1] Ahmed M. Ebid, "Applications of Genetic Programming in Geotechnical Engineering", Ph.D. thesis, Ain Shams University, Cairo, Egypt, 2004.
[2] Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", 2nd Edition, Prentice Hall, 2002, ISBN: 0201180758.
[3] John R. Koza, "Genetic Programming: On the Programming of Computers by Means of Natural Selection", sixth printing, 1998, (c) 1992 Massachusetts Institute of Technology.
[4] Pengwei Hao, "C++ for Image Processing", lecture notes, Department of Computer Science, Queen Mary, University of London.
[5] James Rosenthal, "JPEG Image Compression Using an FPGA", M.Sc. thesis, University of California, USA, 2006.
[6] Wei-Yi Wei, "An Introduction to Image Compression", National Taiwan University, Taipei, Taiwan, ROC.
[7] Ja-Ling Wu, "Information Theory - Part II: Image Data Compression", lecture notes, Department of Computer Science and Information Engineering, National Taiwan University.
[8] Ming-Jun Chen and Alan C. Bovik, "Fast Structural Similarity Index Algorithm", Journal of Real-Time Image Processing, pp. 1-7, August 2010.
[9] M. Kudelka Jr., "Image Quality Assessment", WDS'12 Proceedings of Contributed Papers, Part I, pp. 94-99, 2012.
[10] M. R. Bonyadi, E. Dehghani and Mohsen Ebrahimi Moghaddam, "A Non-uniform Image Compression Using Genetic Algorithm", 15th International Conference on Systems, Signals and Image Processing (IWSSIP 2008), Bratislava, Slovak Republic.
