
SVM Based Model for Content Based Image Retrieval using Colour, Straight Line and Outline Signatures of the Image


L. Jayanthi 1, Dr. K. Lakshmi 2
1 Assistant Professor, Periyar Maniammai University, Thanjavur, Tamil Nadu, India
2 Dean, School of Computing Science and Engineering, Periyar Maniammai University


Abstract
In this era of the internet and multimedia, Content Based Image Retrieval (CBIR) plays a
vital role, especially where the search speed and the accuracy of the results are of utmost
importance. Many research scholars in the past have used various classifiers including Artificial
Neural Networks (ANN) and Genetic Algorithms (GA), while this work employs Support Vector
Machines (SVM). In this method, feature extraction is achieved by using combined colour
histograms for three regions (top, bottom and middle) of the image, which indirectly include the
spatial information within the image besides the colour information. In addition, straight line
signatures and outline signatures of the images are also extracted to improve the performance.
An empirical study was carried out using the Semantics Sensitive Integrated Matching for Picture
Libraries (SIMPLIcity) database. Average precision and recall are used as metrics to measure the
performance of the system. The proposed SVM based model for CBIR using colour, straight line
and outline signatures (SVM CSLOS) is compared with the earlier methods of Jhanwar et al.,
Huang et al., Chuen et al. and Elalami, which used the same data set and evaluation method. The
results show that the proposed SVM CSLOS method improves retrieval performance over these
methods.
Keywords: CBIR, Support Vector Machines, Colour Histogram, Straight Line Signature, Outline
Signature, SIMPLIcity.
___________________________________________________________________________
1. INTRODUCTION
An image retrieval system is used for browsing, searching and retrieving images from a large
digital image database. Large numbers of images are created every day in different areas
including remote sensing, crime prevention, fashion, publishing, medicine and architecture,
driven by the development of the internet and the availability of capturing devices such as
cameras [1]. One therefore has to develop efficient and effective methodologies to manage large
databases for retrieval.
Text Based Image Retrieval (TBIR) uses the text associated with an image to determine what the
image contains. The Google and Yahoo image search engines are examples of systems using this
type of approach. These search engines are easy to implement and fast, but at times fail to
retrieve relevant images. Manual annotation is laborious, time consuming and next to impossible
for large databases, and it is not accurate. It is difficult to describe the content of different types
of images with human languages, the polysemy problem also occurs, and the surrounding text
may not describe the image [2]. To overcome these difficulties encountered by text based image
retrieval systems, content based image retrieval (CBIR) was proposed in the early 1990s. CBIR
involves data collection, building a feature database, searching the database, and ranking and
returning the retrieval results. Features include colour, texture, shape and spatial layout;
similarity is measured based on the distance between the features, and images are retrieved
automatically [3].
The organization of this paper is as follows. After the introduction in section 1, existing
CBIR systems are discussed in detail in section 2. The algorithm of the refined Euclidean
distance matching technique for improved CBIR using colour, straight line and outline sketch
signatures of the image is explained in section 3. The design of the proposed CBIR system is
outlined in section 4. Feature extraction of the combined three region colour histogram
signatures (3RCS) is described in section 5. Extraction of straight line signatures is illustrated in
section 6. Extraction of the outline signature is described in section 7. The improved 3 region
colour, straight line and outline sketch based CBIR (ICSLOS) system is explained in section 8.
The algorithm of the proposed SVM based model using 3 region colour, straight line and outline
sketch signatures of the image (SVM CSLOS) is introduced in section 9. Experimental results
and the evaluation of the performance of the proposed system in terms of precision and recall are
reported in section 10. Finally, the conclusion and future work are presented in section 11.
2. CBIR SYSTEMS: A COMPARISON
Jhanwar et al. proposed a technique for content based image retrieval using the motif
co-occurrence matrix (MCM). The MCM is derived from the motif transformed image. The whole
image is divided into 2 x 2 pixel grids, and the number of scan motifs is reduced to six. The MCM
is defined as a 3D matrix indexed by (i, j, k). The transformed image is used to calculate the
probability of finding a motif i at a distance k from a motif j. The distance between the MCMs of
two different images is used as the similarity measure while retrieving images from the database.
The method is efficient in its computation and storage requirements, although the matching
remains expensive. The MCM combines information related to both colour and texture [4].
Huang et al. present an image retrieval system based on texture similarity. The composite
sub-band gradient (CSG) vector and the energy distribution pattern (EDP) string are the two
features extracted. Both features are generated from the sub-images of a wavelet decomposition
of the original image. At the first stage, a fuzzy matching process based on EDP strings serves as
a filter to remove undesired images in the database. At the second stage, the images passing
through the filter are compared with the query image based on their CSG vectors, which are
powerful in discriminating texture features [5][6].
Chuen Horng Lin et al. integrate three image features to facilitate the image retrieval process.
The first and second features are useful for describing the relationship between colours and
textures in an image; they are called the colour co-occurrence matrix (CCM) and the difference
between pixels of scan pattern (DBPSP) respectively. The third feature is based on colour
distribution and is called the colour histogram for K-mean (CHKM). The CCM calculates the
probability that each pixel and its adjacent pixels in an image have the same colour, and this
probability is taken as an attribute of the image. The DBPSP computes the difference between
pixels along a scan pattern and converts it into a probability of occurrence over the entire image.
Each pixel colour in an image is then replaced by the colour in a common colour palette that is
most similar to it, so as to classify all pixels in the image into k clusters; this is the CHKM
feature. Because images differ in their properties and contents, they contain different features,
and CCM, DBPSP and CHKM together facilitate image retrieval. Optimal features are selected
from the original features to enhance the detection rate [7].
Elalami introduced a CBIR method that depends on extracting the most relevant features.
Colour features are extracted from the colour histogram and texture features are extracted using
the Gabor filter algorithm. A genetic algorithm is used for feature discrimination. The most
relevant features are selected from the original feature set by two successive functions, a
preliminary and a deep reduction function. This method simplifies the calculation, maximises the
detection rate and reduces the retrieval time [8].

Elalami also presented a CBIR model using an artificial neural network (ANN) classifier. The
proposed model is composed of four major phases, namely feature extraction, dimensionality
reduction, ANN classification and a matching strategy. In the feature extraction phase, colour
features are extracted from the colour co-occurrence matrix (CCM) and texture features are
extracted from the difference between pixels of scan pattern (DBPSP). The dimensionality
reduction technique selects the effective features that have the largest dependency on the target
class and minimal redundancy among themselves. These features reduce the calculation work
and the computation time in the retrieval process. The ANN classifier takes the selected features
of the query image as input, and its output is the one of the multiple classes that has the largest
similarity to the query image. In addition, the proposed model presents an effective feature
matching strategy based on the idea of the minimum area between two vectors to compute the
similarity value between a query image and the images in the determined class [9].
3. THE PROPOSED ALGORITHMS
The refined Euclidean distance matching technique for CBIR using colour, straight line and
outline sketch signatures of the image is the core of the proposed algorithm and is described as
follows.
Let I be the input query image of size m x n.
Apply a median filter to the input image.
Convert the high colour input image into a 256 colour image.
Separate the top, middle and bottom regions R1, R2, R3 of the image.
Find the histograms of the regions R1, R2 and R3 as hi(r1, r2, r3).
Use the combined 3 region colour histograms as the signature S1 of the input image:
S1 = hi(r1, r2, r3).
Find all the long straight lines in the image above the threshold and use them as the straight
line signature S2 of the image.
Detect the outlines of the input image.
Find the FFT of the one dimensional representation of the binary outline image; this is the
outline signature S3 of the image.
Form the combined signature S = {S1, S2, S3} of the input image.
Let M be the image signatures of all the images in the dataset.
Find the Euclidean distance between the input image signature and the signatures of all the
images in the database:
D1 = Euclidean distance(M, S).
Sort the distance matrix D1 in ascending order.
Find the indices of the top N ranked minimum distances of D1.
Find the average image signature of the top N matching images.
Use that average image signature V as a virtual query image and repeat the search.
Again, find the Euclidean distance between the virtual query image signature V and the
signatures of all the images in the database:
D2 = Euclidean distance(M, V).
Find the indices of the top N ranked minimum distances of D2.
Display the top N ranked images from the image database using these indices.
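As an illustration, a minimal MATLAB sketch of this two-pass matching step is given below. It assumes that the signature matrix M holds one row per database image, that S is the query signature row vector, and that N takes an illustrative value; the variable names are not taken from the paper.

    % Refined two-pass Euclidean distance matching (sketch).
    N  = 20;                                  % illustrative value of N
    D1 = sqrt(sum((M - S).^2, 2));            % distance to every database image
                                              % (implicit expansion; use bsxfun before R2016b)
    [~, order1] = sort(D1, 'ascend');         % rank the database images
    top1 = order1(1:N);                       % indices of the first-pass matches

    V  = mean(M(top1, :), 1);                 % average signature = virtual query image
    D2 = sqrt(sum((M - V).^2, 2));            % repeat the search with the virtual query
    [~, order2] = sort(D2, 'ascend');
    resultIdx = order2(1:N);                  % indices of the images to display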

4. DESIGN OF THE PROPOSED CBIR SYSTEM

Figure 1 represents the indexing phase and Figure 2 represents the CBIR system. In the indexing
phase, all the images in the database are read, an image signature is created for each image using
colour, straight line and outline information, and the resulting image signatures are stored along
with the database images so that they can be used as an index for the image collection.

Figure 1. Indexing phase

Figure 2. CBIR system

5. COMBINED THREE REGION COLOUR HISTOGRAM SIGNATURES [3RCS]


Experiments were carried out with the database of the James Z. Wang research group,
SIMPLIcity: Semantics Sensitive Integrated Matching for Picture Libraries [10].
Figure 3 represents the colour signature built from 3 regions of the image. For an m x n image I,
the colours in the image are quantized to C1, C2, ..., Ck. The colour histogram is
H(I) = {h1, h2, ..., hk}, where hi represents the number of pixels of colour Ci. Here Ci is the i-th
colour index, hi is the number of pixels with that colour index, m x n is the total number of
pixels in the image, and i = 0, 1, 2, ..., L-1 with L = 256.
Let the image I1 be decomposed into three regions R1, R2, R3:
I1 = {R1, R2, R3}.
Let h1 and h2 represent the colour histograms over the three regions R1, R2, R3 of two images
I1 and I2. The combined histograms of the two images can be written as
h1 = h1(r1, r2, r3),
h2 = h2(r1, r2, r3).
The histogram Euclidean distance between the colour histograms h1 and h2 can be computed as
d^2(h1, h2) = Σ_{r1 ∈ R1} Σ_{r2 ∈ R2} Σ_{r3 ∈ R3} [h1(r1, r2, r3) - h2(r1, r2, r3)]^2.
Combining the histograms computed for different regions of the image indirectly includes the
spatial information within the signature [11].
Figure 3. Creating the colour signature from 3 regions of the image
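For concreteness, a minimal MATLAB sketch of the three-region colour signature S1 is shown below. The equal horizontal split into thirds, the default 3 x 3 median filter and the normalisation by the number of pixels are assumptions made for illustration, and the file name is hypothetical.

    % Three-region colour histogram signature S1 (sketch).
    I = imread('query.jpg');                   % hypothetical query image
    for c = 1:size(I, 3)
        I(:,:,c) = medfilt2(I(:,:,c));         % median filter, channel by channel
    end
    [X, map] = rgb2ind(I, 256);                % quantise to a 256 colour indexed image

    m  = size(X, 1);                           % split into top, middle and bottom thirds
    R1 = X(1:floor(m/3), :);
    R2 = X(floor(m/3)+1:floor(2*m/3), :);
    R3 = X(floor(2*m/3)+1:end, :);

    h1 = imhist(R1, map);                      % histogram of each region over the palette
    h2 = imhist(R2, map);
    h3 = imhist(R3, map);
    S1 = [h1; h2; h3]' / numel(X);             % combined, normalised colour signature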

6. STRAIGHT LINE SIGNATURES.


A line in the image space can be expressed with two parameters (r, θ) in the polar coordinate
system, and the line equation can be written as
y = -(cos θ / sin θ) x + r / sin θ,
which can be rearranged as
r = x cos θ + y sin θ.
In general, for each point (x0, y0) we can define the family of lines that pass through that point
as
r_θ = x0 cos θ + y0 sin θ.
Plotting r_θ against θ yields a sinusoid in the θ-r plane, shown in Figure 4, which represents the
lines passing through the point (x0, y0). Only points with r > 0 and 0 < θ < 2π are considered,
and the same operation is performed for all the points in the image. If the curves of two different
points intersect in the θ-r plane, then the two points belong to the same line.

Figure 4. Intersection of three plots in one single point (0.925, 9.6)

The three plots which intersect in one single point are shown in Figure 4; that point represents
the line on which (x0, y0), (x1, y1) and (x2, y2) lie. The Hough line transform keeps track of the
intersections between the curves of every point in the image. In general, a threshold on the
minimum number of intersections needed to detect a line can be defined. If the number of
intersections at a point is above this threshold, a line with the parameters (θ, r) of that
intersection point is declared [12].
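A minimal MATLAB sketch of the straight line signature S2, using the Hough transform functions of the Image Processing Toolbox, is given below. The peak count, fill-gap and minimum-length thresholds are illustrative assumptions, and the file name is hypothetical.

    % Straight line signature S2 via the Hough transform (sketch).
    G  = rgb2gray(imread('query.jpg'));        % hypothetical query image
    BW = edge(G, 'canny');                     % binary edge map fed to the Hough transform

    [H, theta, rho] = hough(BW);               % accumulator over the (theta, rho) plane
    P = houghpeaks(H, 10, 'Threshold', 0.3*max(H(:)));   % strongest intersection points
    L = houghlines(BW, theta, rho, P, 'FillGap', 5, 'MinLength', 40);

    % Keep only the long lines and use their (theta, rho) parameters as the signature.
    len = arrayfun(@(s) norm(s.point2 - s.point1), L);
    S2  = [ [L(len > 40).theta]; [L(len > 40).rho] ];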

7. OUTLINE SKETCH SIGNATURES

Figure 5. Colour, straight line and outline sketch signatures from an image

Figure 5 represents the combined colour signature, straight line signature and outline
signature. Edge detection is done by identifying and locating sharp discontinuities in an image.
The discontinuities are abrupt changes in pixel intensity which characterize the boundaries of
objects in a scene. Edge detection is implemented by convolving the image with an operator (a
2D filter) that is sensitive to large gradients in the image while returning zero values in uniform
regions. A good detector should not miss edges occurring in the image, should not respond to
non-edges, should keep the distance between the edge pixels it finds and the actual edge to a
minimum, and should give only one response to a single edge. The Canny edge detector first
smoothes the image to eliminate noise. It then finds the image gradient to highlight regions with
high spatial derivatives. The algorithm then tracks along these regions and suppresses any pixel
that is not at the maximum. If the gradient magnitude is above the high threshold, the pixel is
marked as an edge. After obtaining the outline sketch image using the Canny edge detection
operation, the outline sketch signature is created using the fast Fourier transform. For that, the
two dimensional sketch image is converted into a one dimensional array and the FFT operation is
applied. Since one half of the FFT output is the mirror image of the other half, only 50% of the
output is used as the signature, so 128 frequency values are used as the outline signature of the
image [13].
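The outline signature S3 can be sketched in MATLAB as follows. The 16 x 16 resampling is an illustrative assumption, chosen so that the flattened one-dimensional array has 256 samples and half of the FFT output gives the 128 frequency values mentioned above; the file name is hypothetical.

    % Outline sketch signature S3 via Canny edges and the FFT (sketch).
    G  = rgb2gray(imread('query.jpg'));        % hypothetical query image
    BW = edge(G, 'canny');                     % binary outline sketch image

    x  = imresize(double(BW), [16 16]);        % assumed: fix the size so that the
    x  = x(:);                                 % flattened 1-D array has 256 samples
    F  = fft(x);                               % FFT of the one dimensional representation
    S3 = abs(F(1:128))';                       % keep one half; the other half is its mirror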

8. IMPROVED 3 REGION COLOUR, STRAIGHT LINE AND OUTLINE SKETCH BASED CBIR (ICSLOS)
The combined 3 region colour histograms are used as the signature S1 of the input image. All the
long straight lines in the image above the threshold are used as the straight line signature S2 of
the image. The outlines of the input image are detected, the FFT of the one dimensional
representation of the binary outline image is determined, and the result is used as the outline
signature S3 of the image. The combined signature S = {S1, S2, S3} of the input image is then
formed. The Euclidean distance between the input image signature and the signatures of all the
images in the database is calculated, and the distances are sorted in ascending order. The average
image signature of the top N matching images is then determined. This average image signature
V is treated as the image signature of a virtual query image and the search is repeated: the
Euclidean distance between the virtual query image signature V and the signatures of all the
images in the database is computed again. The top N ranked images corresponding to the
resulting indices are then displayed from the image database [14].
9. ALGORITHM OF THE PROPOSED SVM BASED MODEL FOR CBIR USING CSLOS

Let I be the input query images of size mxn.


Apply median filter on the input image.
Convert the high colour input image in to a 256 color image.
Separate the top, middle and bottom regions R1, R2, R3, of the image.
Find the histogram of the regions R1, R2, and R3 and combine the histograms as
hi ( r1 , r2 , r3 ) .

Use that combined 3 region color histograms as the signature

of input image

S1 hi ( r1 , r2 , r3 ) .

Find all the long straight lines in the the image above the threshold and use it as straight
line signature S2 of the image.
Detect the outlines of the input image.
Find the FFT of the one dimentional representation of the binary outline Image . This is
represented as the outline signature S3 of the image

Form the combined signature S ={ S1, S2 S3} of input image.


Let M be the image signatures of all the images in the dataset.
Train an SVM with randomly selected sample image signatures of all the categories from
the Dataset M.
Find the category labels of all the image signatures of the Dataset M using the trained
network.
Find the category label of input image signature using the same trained network.
Find all the matching images based on the input image Category label and select all the
corresponding image features M1.
The distance D= Euclidean Distance(M1,S).
Sort the distance matrix D in ascending order.
Find the index of Top N ranked minimum distances of D
Display the Top n ranked Images from the image database using the index
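A minimal MATLAB sketch of this SVM stage is given below, using fitcecoc for multi-class SVM classification. The RBF kernel, the size of the random training split and the use of numeric category labels are assumptions; M, S and the labels are as described in the algorithm above.

    % SVM based category filtering followed by Euclidean matching (sketch).
    % M: signatures of all database images (one row each), labels: numeric
    % category labels of those images, S: signature of the query image.
    numTrain = 200;                                        % assumed training set size
    trainIdx = randperm(size(M, 1), numTrain);             % randomly selected samples
    tmpl     = templateSVM('KernelFunction', 'rbf');       % assumed kernel
    model    = fitcecoc(M(trainIdx, :), labels(trainIdx), 'Learners', tmpl);

    predAll   = predict(model, M);             % category label of every database image
    predQuery = predict(model, S);             % category label of the query image

    M1 = M(predAll == predQuery, :);           % signatures in the predicted category only
    D  = sqrt(sum((M1 - S).^2, 2));            % Euclidean distance within that category
    [~, order] = sort(D, 'ascend');            % the top N entries of 'order' index into M1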
10. RESULTS AND ANALYSIS
Repeated experiments were conducted with the 1000 images of the SIMPLIcity database [10],
which contains 10 different categories of images (African, Ocean, Building, Bus, Dinosaurs,
Elephant, Flower, Horse, Mountain, Food), to demonstrate the efficiency of the proposed method.
The retrieval results obtained using the proposed SVM based model for CBIR using three region
colour, straight line and outline sketch signatures were compared with some of the existing
retrieval systems [4,5,6,7,8,9]. The prototype of the CBIR system was developed using MATLAB.
Precision was computed using the entire image collection in the SIMPLIcity dataset. Each image
of the dataset was used as a query image, the entire dataset was searched for matching images,
and the average precision was calculated for each query. Since the total number of semantically
related images for each query is fixed at 100, only the 100 top ranked query results and the 100
images in each category were considered.
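For clarity, the precision and recall of a single query can be computed as sketched below, using the standard definitions (relevant retrieved over retrieved, and relevant retrieved over total relevant). The variable names and the cut-off of 100 top ranked results follow the description above but are otherwise illustrative.

    % Precision and recall for one query over the top ranked results (sketch).
    % rankedIdx: database indices sorted by increasing distance to the query,
    % category: true category of every database image, qCategory: query category.
    K         = 100;                                       % 100 top ranked results
    topK      = rankedIdx(1:K);
    relevant  = sum(category(topK) == qCategory);          % relevant images retrieved
    precision = relevant / K;                              % fraction of retrieved results that are relevant
    recall    = relevant / 100;                            % each category holds 100 relevant images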

Table 1. The average precision (P) and recall (R) of different methods

Category     Jhanwar et al.[4]  Huang et al.[5][6]  Chuen et al.[7]   Elalami [8]      Elalami [9]      CSLOS [13]       ICSLOS [14]      Proposed SVM CSLOS
               P       R          P       R           P       R         P       R        P       R        P       R        P       R        P       R
Africa       0.453   0.115      0.424   0.126       0.683   0.141     0.703   0.153    0.726   0.161    0.634   0.293    0.667   0.312    0.278   0.106
Beach        0.398   0.121      0.446   0.113       0.540   0.192     0.561   0.198    0.593   0.203    0.395   0.178    0.376   0.169    0.771   0.383
Buildings    0.374   0.127      0.411   0.132       0.562   0.174     0.571   0.182    0.587   0.191    0.473   0.208    0.476   0.211    0.699   0.336
Buses        0.741   0.092      0.852   0.099       0.888   0.121     0.876   0.116    0.891   0.126    0.703   0.329    0.739   0.348    0.890   0.444
Dinosaurs    0.915   0.072      0.587   0.104       0.992   0.101     0.987   0.098    0.993   0.109    0.989   0.499    0.988   0.498    0.982   0.495
Elephants    0.304   0.132      0.426   0.119       0.658   0.149     0.675   0.156    0.702   0.163    0.433   0.182    0.424   0.178    0.778   0.383
Flowers      0.852   0.087      0.898   0.093       0.891   0.112     0.914   0.118    0.928   0.129    0.755   0.352    0.774   0.363    0.895   0.445
Food         0.369   0.129      0.427   0.122       0.733   0.132     0.741   0.138    0.772   0.148    0.593   0.273    0.603   0.279    0.880   0.434
Horses       0.568   0.102      0.589   0.103       0.803   0.134     0.834   0.139    0.856   0.144    0.878   0.422    0.905   0.439    0.954   0.475
Mountains    0.293   0.135      0.268   0.152       0.522   0.213     0.536   0.228    0.562   0.236    0.482   0.218    0.478   0.219    0.837   0.414
Average      0.527   0.111      0.533   0.116       0.727   0.147     0.740   0.153    0.761   0.161    0.634   0.295    0.643   0.302    0.796   0.392

11. CONCLUSION
The proposed CBIR system has been successfully implemented using MATLAB, and the
performance of the system in terms of precision and recall was validated with images from the
SIMPLIcity image retrieval dataset. The results obtained were compared with previous methods,
and the performance of the proposed SVM based CBIR system was significantly better than that
of all the other compared methods.
Future work may address more efficient and meaningful ways of partitioning the image using
the colour information, so that the spatial information of the image is modelled in a better
manner. Future work may also address ways to reduce the size of the colour signature by using
suitable feature reduction techniques.
REFERENCES
[1] Meenkashi Shruti Pal, Sushil Kumar Garg, Image Retrieval: A Literature Review,
International Journal of Advanced Research in Computer Engineering and Technology
(IJARCET), Volume 2, Issue 6, 2013, 2077-2080.
[2] Yong Rui and Thomas S. Huang, Image Retrieval: Current Techniques, Promising
Directions and Open Issues, Journal of Visual Communication and Image Representation,
Volume 10, 1999, 39-62.
[3] A.W.M. Smeulders, et al., Content based image retrieval at the end of the early years,
IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 22, 2000, 1349-1380.
[4] N. Jhanwar, S. Chaudhuri, G. Seetharaman, B. Zavidovique, Content based image retrieval
using motif co-occurrence matrix, Elsevier, Image and Vision Computing, Volume 22, 2004,
1211-1220.
[5] P.W. Huang, S.K. Dai, Image retrieval by texture similarity, Pergamon, Pattern Recognition,
26, 2003, 665-679.
[6] P.W. Huang, S.K. Dai, Design of a two stage content based image retrieval system using
texture similarity, Elsevier, Information Processing and Management, 40, 2004, 81-96.
[7] Chuen Horng Lin, Rong Tai Chen, Yung-Kuan Chan, A smart content based image retrieval
system based on colour and texture feature, Elsevier, Image and Vision Computing, 27, 2009,
658-665.
[8] M.E. ElAlami, A novel image retrieval model based on the most relevant features, Elsevier,
Knowledge Based Systems, 24, 2011, 23-32.
[9] M.E. ElAlami, "A new matching strategy for content based image retrieval system",
Elsevier, Applied Soft Computing, 14, 2014, 407-418.
[10] James Z. Wang, Jia Li, Gio Wiederhold, SIMPLIcity: Semantics Sensitive Integrated
Matching for Picture Libraries, IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 23, Issue 9, 2001, 947-963.
[11] L. Jayanthi, K. Lakshmi, An Enhanced Content Based Image Retrieval using Three Region
Colour Histogram of the Image, International Journal of Scientific Research, Volume 3,
Issue 4, ISSN No. 2277-8179, 2014, 73-76.
[12] L. Jayanthi, K. Lakshmi, An Improved Content Based Image Retrieval using Three Region
Colour and Straight Line Signatures, IEEE International Conference on Communication and
Signal Processing (ICCSP 2014).
[13] L. Jayanthi, K. Lakshmi, An Improved CBIR using Three Region Colour, Straight Line and
Outline Sketch Signatures, IEEE International Conference on Information, Communication
and Embedded Systems (ICICES 2014).
[14] L. Jayanthi, K. Lakshmi, A Refined Euclidean Distance Matching Technique for Improved
CBIR Using Colour, Straight Line and Outline Sketch Features of the Image, Global Journal
for Research Analysis, Volume 3, Issue 11, ISSN No. 2277-8160, 2014, 41-44.
