* Faculty of Science and Technology, Computer Science Department, Laboratory of Information Processing and
Telecommunications, Sultan Moulay Slimane University, Beni Mellal, Morocco.
Emails: oujaouram@yahoo.fr, rachidieea@yahoo.fr, bra_min@yahoo.fr, & fakfad@yahoo.fr
** Higher School of Technology, Computer Science Department, Cadi Ayyad University, Essaouira, Morocco.
Email: bencharef98@gmail.com
Abstract: To perform a semantic search on a large dataset of images, we need to be able to transform the visual content of images (colors, textures, shapes) into semantic information. This transformation, called image annotation, assigns a caption or keywords to the visual content of a digital image. In this paper, we attempt to partially resolve the region homogeneity problem in image annotation by proposing an approach that annotates images based on grouping adjacent regions. We use the k-means algorithm as the segmentation algorithm, while texture and GIST descriptors are used as features to represent the image content. Bayesian networks are used as classifiers in order to find and assign the appropriate keywords to this content. The experimental results were obtained on the ETH-80 image database.
$$\mathrm{Var}(R_i) = \frac{\sum_{p_j \in R_i} \left\| p_j - m_i \right\|^2}{\left( \mathrm{Card}(R_i) \right)^2}$$

is the variance of the pixels in cluster or region $R_i$.

The k-means image segmentation algorithm finds the groups of pixels that minimize the quantity $E$ defined above. This amounts, for each cluster or region, to minimizing the following quantity:

$$\sum_{p_j \in R_i} \left\| p_j - m_i \right\|^2$$

Figure 2. Example of K-means image segmentation.

The adjacent regions are regrouped in order to obtain a compact object. As illustrated in Figure 3, the adjacent regions are grouped iteratively and annotated with the appropriate keyword if its probability is higher than 0.5.

Figure 3. The annotation process: Input Image → Image Segmentation → … → Annotation Results.
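As a minimal illustration of this segmentation step, the sketch below clusters the pixel colours of an RGB image with scikit-learn's KMeans and evaluates the per-region variance defined above; the function names, the choice of k = 4 and the use of scikit-learn are assumptions made for the example, not details taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segmentation(image: np.ndarray, k: int = 4):
    """Cluster the pixels of an RGB image into k regions (illustrative sketch)."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)

    # K-means assigns each pixel p_j to the region R_i with the nearest mean m_i,
    # minimising the sum over regions of sum_{p_j in R_i} ||p_j - m_i||^2.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    labels = km.labels_.reshape(h, w)
    return labels, km.cluster_centers_

def region_variance(image: np.ndarray, labels: np.ndarray, centers: np.ndarray):
    """Var(R_i) = sum_{p_j in R_i} ||p_j - m_i||^2 / Card(R_i)^2, as in the text."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    flat = labels.ravel()
    return [np.sum((pixels[flat == i] - m_i) ** 2) / max(np.sum(flat == i), 1) ** 2
            for i, m_i in enumerate(centers)]
```

The label map returned by such a sketch is what the subsequent grouping step would operate on, merging adjacent regions whose classification probability exceeds 0.5.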
We have, in Figure 4, the K-means automatic segmentation of an image representing the object "car" into multiple regions or clusters. It also shows the possibility of grouping these clusters or regions to obtain a semantically compact object. We can see from this figure that the grouping of clusters 1 and 3 gives a compact object that can be annotated more easily and more correctly than the objects of the other, non-grouped clusters. Hence the major interest of regrouping and merging adjacent regions for semantic image annotation.

The extracted features are stored in the features database in order to be used for the annotation by classification.

6.1. Texture Descriptors

Several images have textured patterns. Therefore, the texture descriptor is used as a feature extraction method for the segmented image. The texture descriptor is extracted using the co-occurrence matrix introduced by Haralick in 1973 [7]. For a color image $I$ of size $N \times N \times 3$ in a colour space $(C_1, C_2, C_3)$, for $(k, l) \in [1, \ldots, N]^2$ and $(a, b) \in [1, \ldots, G]^2$, the co-occurrence matrix counts the pairs of pixels separated by a given spatial offset whose quantised levels are $a$ and $b$ respectively.
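To make the co-occurrence computation concrete, here is a minimal sketch based on scikit-image's graycomatrix and graycoprops, applied to a grey-level version of a region; the quantisation to 32 levels, the chosen offsets and the chosen statistics are illustrative assumptions, not the exact 14 × 6 feature set used in the experiments.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_region: np.ndarray, levels: int = 32) -> np.ndarray:
    """Compute a few co-occurrence (GLCM) texture statistics for one region.

    `gray_region` is a 2-D uint8 array; intensities are quantised to `levels`
    grey levels before building the co-occurrence matrix.
    """
    quantised = (gray_region.astype(np.float64) / 256 * levels).astype(np.uint8)

    # Co-occurrence matrix for a one-pixel distance and four orientations.
    glcm = graycomatrix(quantised,
                        distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels,
                        symmetric=True,
                        normed=True)

    # A subset of Haralick-style statistics, averaged over the orientations.
    props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation", "ASM"]
    return np.array([graycoprops(glcm, p).mean() for p in props])
```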
… dimensions. They require no segmentation. The authors have tried to capture the gist of the image by analyzing its spatial frequencies and orientations. The global descriptor is constructed by combining the amplitudes obtained at the output of the K Gabor filters [10] at different scales E and orientations O. To reduce the feature vector size, each filtered output image is scaled and divided into N × N blocks (N between 2 and 16), which gives a vector of dimension N × N × K × E × O. This dimension can be further reduced by a principal component analysis (PCA), which also gives the weights applied to the different filters.

The computation and extraction of the GIST descriptor is carried out in several steps. After the pre-processing of the input image, the next step consists in transforming the image into different scales and orientations. Finally, the feature vectors are calculated for each scale, orientation and frequency. These feature vectors are combined to form a global feature descriptor, which is then reduced by a principal component analysis (PCA). Figure 5 shows an image and its GIST descriptors.
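The following is a rough sketch of such a Gabor-based descriptor, assuming scikit-image's gabor_kernel and SciPy's fftconvolve; the three frequencies, four orientations and 4 × 4 block averaging are illustrative choices, and the PCA reduction mentioned above (e.g. with sklearn.decomposition.PCA) is left out.

```python
import numpy as np
from scipy.signal import fftconvolve
from skimage.filters import gabor_kernel

def gist_like_descriptor(gray: np.ndarray,
                         frequencies=(0.1, 0.2, 0.3),
                         n_orientations: int = 4,
                         n_blocks: int = 4) -> np.ndarray:
    """A rough GIST-style descriptor: Gabor filter bank + block averaging."""
    gray = gray.astype(np.float64)
    features = []
    for frequency in frequencies:
        for o in range(n_orientations):
            kernel = gabor_kernel(frequency, theta=o * np.pi / n_orientations)
            # Amplitude of the Gabor-filtered image.
            response = np.abs(fftconvolve(gray, kernel, mode="same"))
            # Divide the response into n_blocks x n_blocks blocks and keep each block mean.
            h, w = response.shape
            for by in range(n_blocks):
                for bx in range(n_blocks):
                    block = response[by * h // n_blocks:(by + 1) * h // n_blocks,
                                     bx * w // n_blocks:(bx + 1) * w // n_blocks]
                    features.append(block.mean())
    # Length: n_blocks^2 * len(frequencies) * n_orientations.
    return np.asarray(features)
```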
… $= \{P(X_i \mid Pa(X_i))\}$ is the set of conditional probabilities of each node $X_i$ given the state of its parents $Pa(X_i)$ in $G$.

The graphical part of the Bayesian network indicates the dependencies between the variables and provides a visual knowledge representation tool that is easily understandable by users. Bayesian networks thus combine a qualitative part, the graph, and a quantitative part, the conditional probabilities associated with each node of the graph with respect to its parents. Pearl et al. [12] have also shown that Bayesian networks allow the joint probability distribution over all the variables to be represented compactly:

$$P(X) = P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$$

where $Pa(X_i)$ is the set of parents of node $X_i$ in the graph $G$ of the Bayesian network.

This joint probability can be simplified by the Bayes rule as follows [13]:

$$P(X) = P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$$
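As a small worked illustration of this factorisation, the sketch below evaluates $P(X) = \prod_i P(X_i \mid Pa(X_i))$ for a full assignment of a toy discrete network; the network structure, node names and probability tables are invented for the example and are not the paper's annotation network.

```python
# Hypothetical discrete Bayesian network: each node lists its parents and a CPT
# mapping (tuple of parent values) -> {value: probability}.
network = {
    "Cloudy":    {"parents": [], "cpt": {(): {"yes": 0.5, "no": 0.5}}},
    "Sprinkler": {"parents": ["Cloudy"],
                  "cpt": {("yes",): {"on": 0.1, "off": 0.9},
                          ("no",):  {"on": 0.5, "off": 0.5}}},
    "WetGrass":  {"parents": ["Sprinkler"],
                  "cpt": {("on",):  {"wet": 0.9, "dry": 0.1},
                          ("off",): {"wet": 0.2, "dry": 0.8}}},
}

def joint_probability(network: dict, assignment: dict) -> float:
    """P(X) = prod_i P(X_i | Pa(X_i)) for a full assignment of all nodes."""
    p = 1.0
    for node, spec in network.items():
        parent_values = tuple(assignment[parent] for parent in spec["parents"])
        p *= spec["cpt"][parent_values][assignment[node]]
    return p

# joint_probability(network, {"Cloudy": "yes", "Sprinkler": "off", "WetGrass": "wet"})
# = 0.5 * 0.9 * 0.2 = 0.09
```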
The posterior probability of each class $C_i$ given the feature vector $X = (X_1, \ldots, X_n)$ is then computed as:

$$P(C_i \mid X) = \begin{cases} P(C_i) \displaystyle\prod_{j=1}^{n} P\left(X_j \mid Pa(X_j), C_i\right) & \text{if } X_j \text{ has parents,} \\ P(C_i) \displaystyle\prod_{j=1}^{n} P\left(X_j \mid C_i\right) & \text{otherwise.} \end{cases}$$

For the naive Bayes classifier, the absence of parents and the assumption of independence between the variables are used to write the posterior probability of each class as given in the following equation [16]:

$$P(C_i \mid X) = P(C_i) \prod_{j=1}^{n} P(X_j \mid C_i)$$

Therefore, the decision rule $d$ for an attribute vector $X$ is given by:

$$d(X) = \arg\max_{C_i} P(C_i \mid X) = \arg\max_{C_i} P(C_i) \prod_{j=1}^{n} P(X_j \mid C_i)$$

The class with the maximum probability leads to the suitable keywords for the input image.
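A minimal sketch of this decision rule with scikit-learn's GaussianNB (a naive Bayes classifier with Gaussian class-conditional densities, in the spirit of [14]) is given below; the random feature matrices and keyword labels are placeholders rather than the paper's data, and the paper itself uses Bayesian networks rather than this particular classifier.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Placeholder training data: one feature vector per annotated region,
# with the corresponding keyword (class) taken from the reference database.
rng = np.random.default_rng(0)
X_train = rng.random((40, 84))                                  # e.g. 84 texture features per region
y_train = rng.choice(["car", "cow", "cup", "apple"], size=40)   # keywords

classifier = GaussianNB().fit(X_train, y_train)

# d(X) = arg max_Ci P(Ci) * prod_j P(Xj | Ci): predict the keyword with the
# highest posterior probability for a new region.
X_query = rng.random((1, 84))
keyword = classifier.predict(X_query)[0]
posteriors = classifier.predict_proba(X_query)[0]
```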
5. Experiments and Results

6.3. Experiments

In our experiments, for each region that represents an object in the query image, the number of input features extracted using the Texture extraction method is 14 × 6 = 84, while the number of input features extracted using the GIST extraction method is 32. These inputs are fed to the classifier, the Bayesian network, in order to select the appropriate keywords from the reference database.

Figure 6 shows some examples of image objects from the ETH-80 image database used in our experiments. The experiments are made on eight classes of objects (Apple, Car, Cow, Cup, Dog, Horse, Pears, and Tomato).

The general annotation rates and error rates of the Texture and GIST descriptors based on the Bayesian network classifier are given in Table 1. The experimental results show that the annotation rate based on the GIST descriptor is higher than the annotation rate based on the Texture descriptor.

Table 1. General Annotation and Error Rates with Global Learning and Execution Times of Texture and GIST Descriptors using the Bayesian Network classifier.

Extraction Method   Learning Time (s)   Execution Time (s)   Annotation Rate (%)   Error Rate (%)
Texture             9.41                413.68               55.00                 45.00
GIST                11.32               1796.17              62.50                 37.50

The results are also affected by the accuracy of the image segmentation method. In most cases, it is very difficult to obtain an ideal automatic segmentation. The predicate used to control the image segmentation is low level, which leads to regions that are not semantically compact. This problem decreases the annotation rates. Therefore, any annotation attempt must consider image segmentation as an important step, not only for the automatic image annotation system, but also for the other systems which require its use. So, to reduce this problem, we developed a new method based on regrouping adjacent regions in order to obtain more compact regions that can represent objects in the image.

Table 2. General Annotation and Error Rates with Global Learning and Execution Times of Texture and GIST Descriptors using the Bayesian Network classifier based on regrouping regions.

Extraction Method   Learning Time (s)   Execution Time (s)   Annotation Rate (%)   Error Rate (%)
Texture             9.41                1829.51              60.00                 40.00
GIST                11.32               1882.77              65.00                 35.00

The general annotation rates and error rates of the Texture and GIST descriptors based on the Bayesian network classifier and the regrouping approach are given in Table 2. The experimental results show that the annotation rate of the proposed approach based on regrouping regions is higher than the annotation rate obtained when using the k-means segmentation directly.

Figure 7 gives the confusion matrix of the annotation system based on the Texture descriptor and the Bayesian network classifier, in the case of using the k-means segmentation directly and in the case of regrouping regions.
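For completeness, the annotation rate, error rate and confusion matrix reported in the tables and figures can be computed from true versus predicted keywords as in the following sketch, which assumes scikit-learn's metrics and uses invented label lists:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted keywords for a set of test regions.
y_true = ["car", "cow", "apple", "car", "cup", "dog"]
y_pred = ["car", "horse", "apple", "car", "cup", "cow"]

annotation_rate = 100 * accuracy_score(y_true, y_pred)   # percentage of correctly annotated regions
error_rate = 100 - annotation_rate

labels = sorted(set(y_true) | set(y_pred))
cm = confusion_matrix(y_true, y_pred, labels=labels)      # rows: true class, columns: predicted class
```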
Figure 8 gives the confusion matrix of the annotation system based on the GIST descriptor and the Bayesian network classifier, in the case of using the k-means segmentation directly and in the case of regrouping regions.

Figure 8. Confusion matrix of the annotation system based on the GIST descriptor and the Bayesian network classifier.

6. Conclusion

In this paper, we developed and presented an image annotation system using k-means as the image segmentation algorithm. For this image annotation system, we discussed the effect of regrouping different regions in order to obtain compact objects. The texture and GIST descriptors are used with Bayesian networks to classify and annotate the input image with the suitable keywords selected from the reference image database. The performance of the proposed method was experimentally analysed. This approach increases the general annotation rates. The experimental results proved that the proposed image annotation system gives good results for images that are well and properly segmented. However, image segmentation remains a challenge that needs more attention in order to increase the precision and accuracy of the image annotation system. In addition, the gap between the low-level features and the semantic content of an image should be reduced and taken into account for more accuracy in any image annotation system. Other segmentation methods and other feature extraction methods will be considered in future work.

References

[1] M. Oujaoura, B. Minaoui, and M. Fakir, "A semantic approach for automatic image annotation," 8th International Conference on Intelligent Systems: Theories and Applications (SITA), pp. 1-8, 8-9 May 2013, doi: 10.1109/SITA.2013.6560800, 2013 IEEE.
[2] M. Oujaoura, B. Minaoui, and M. Fakir, "Combined descriptors and classifiers for automatic image annotation," International Conference on Multimedia Computing and Systems (ICMCS'14), pp. 270-276, 14-16 April 2014, Marrakesh, Morocco, doi: 10.1109/ICMCS.2014.6911218, 2014 IEEE.
[3] F. Y. Shih and S. Cheng, "Automatic seeded region growing for color image segmentation," Image and Vision Computing, 23, pp. 877-886, 2005.
[4] A. Likas, N. A. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, 36(2), pp. 451-461, 2003.
[5] L. Rokach and O. Maimon, Data Mining and Knowledge Discovery Handbook, Chapter 15: Clustering Methods, pp. 321-352, Springer, 2nd Edition, New York, October 1, 2010.
[6] R. S. Choras, "Image Feature Extraction Techniques and Their Applications for CBIR and Biometrics Systems," International Journal of Biology and Biomedical Engineering, Issue 1, Vol. 1, pp. 6-16, 2007.
[7] R. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on SMC, 3(6), pp. 610-621, 1973.
[8] A. Oliva and A. Torralba, "Modeling the shape of the scene: A holistic representation of the spatial envelope," International Journal of Computer Vision, 42, pp. 145-175, 2001.
[9] A. Oliva and A. Torralba, "Building the gist of a scene: the role of global image features in recognition," Progress in Brain Research, 2006.
[10] H. G. Feichtinger and T. Strohmer, Gabor Analysis and Algorithms, Birkhäuser, 1998.
[11] A. Becker and P. Naïm, Les réseaux bayésiens : modèles graphiques de connaissance, Eyrolles, 1999.
[12] J. Pearl, "Bayesian Networks," UCLA Cognitive Systems Laboratory, Technical Report (R-216), Revision I. In M. Arbib (Ed.), Handbook of Brain Theory and Neural Networks, MIT Press, pp. 149-153, 1995.
[13] S. Barrat, Modèles graphiques probabilistes pour la reconnaissance de formes, thèse de l'université Nancy 2, Spécialité informatique, décembre 2009.
[14] G. H. John and P. Langley, "Estimating continuous distributions in Bayesian classifiers," the Eleventh Conference on Uncertainty in Artificial Intelligence, 1995.
[15] P. Naïm, P.-H. Wuillemin, P. Leray, O. Pourret, and A. Becker, Réseaux bayésiens, Eyrolles, 3ème édition, Paris, 2008.
[16] T. Mitchell, "Generative and discriminative classifiers: Naïve Bayes and logistic regression," Machine Learning, Draft, 2010.
[17] ETH-80 database image. [Online]. Available: http://www.d2.mpi-inf.mpg.de/Datasets/ETH80