
Leaf Image Retrieval with Shape Features

Zhiyong Wang, Zheru Chi, Dagan Feng and Qing Wang

Center for Multimedia Signal Processing
Department of Electronic and Information Engineering
The Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong
Email Address:
Tel: (852)2766 6219 Fax: (852)2362 8439
Abstract. In this paper we present an efficient two-step approach that uses
a shape characterization function, the centroid-contour distance curve,
together with the object eccentricity (or elongation) for leaf image retrieval.
Both the centroid-contour distance curve and the eccentricity of a leaf
image are scale, rotation, and translation invariant after proper normalization.
In the first step, the eccentricity is used to rank leaf images; in the second
step, the top-scored images are further ranked using the centroid-contour
distance curve together with the eccentricity. A thinning-based method is used
to locate start point(s) in order to reduce the matching time. Experimental
results show that our approach achieves good performance with reasonable
computational complexity.

Keywords: Centroid-contour distance, Shape representation, Content-based
image retrieval, Leaf image processing.


1 Introduction

Plant identification is a process that assigns each individual plant to a
descending series of groups of related plants, as judged by common
characteristics. So far, this time-consuming process has mainly been carried out
by botanists. Plant identification has a very long history, dating from the dawn
of human existence. Currently, automatic (machine) plant recognition from color
images is one of the most difficult tasks in computer vision because of (1) the
lack of proper models or representations; (2) the great number of biological
variations that a plant species can take; and (3) imprecise image preprocessing
techniques, such as edge detection and contour extraction, which can result in
missing features.
Since leaf shape is one of the important features for characterizing plants,
the study of leaf image retrieval is an important step toward plant
identification. In this paper, leaf image retrieval based on shape features is
addressed. In particular, we discuss two issues: shape feature extraction and
shape feature matching. A number of shape representations, such as chain codes,
Fourier descriptors, moment invariants, and deformable templates [1, 2], as well
as various matching strategies [3], have been proposed for shape-based image
retrieval, and some successful applications have been reported [4-9]. In this
paper, we first present a leaf shape representation based on a centroid-contour
distance curve. It will be demonstrated that the representation is scale,
rotation, and translation invariant after proper normalization. In particular, a
thinning-based method is adopted to locate the start point(s), which reduces the
computation time in image retrieval. To further reduce the retrieval time, we
then propose a two-step approach that uses both the centroid-contour distance
curve and the eccentricity of the leaf object for shape-based leaf image
retrieval.
In Section 2, we define the centroid-contour distance curve and explain its
invariant properties. A similarity measure based on the distance curve is also
discussed in that section. A leaf image retrieval scheme based on the
eccentricity and the centroid-contour distance curve is presented in Section 3.
Experimental results and discussions are given in Section 4. Finally, concluding
remarks are drawn in Section 5.

R. Laurini (Ed.): VISUAL 2000, LNCS 1929, pp. 477-487, 2000.
Springer-Verlag Berlin Heidelberg 2000


2 Centroid-Contour Distance Curve


Tracing the leaf contour can be considered as circling around its centroid. The
trace path from a fixed start point represents a shape contour uniquely; that is
to say, a contour point sequence corresponds to exactly one shape if the start
point is fixed. This is the basic idea behind the chain code representation of
shape. As Figure 1 shows, a point P on the contour is determined by the centroid
C, the distance R between the point P and the centroid C, and the corresponding
angle. In fact, the object centroid does not have to be fixed in a coordinate
system, since a change of the object centroid only leads to a translation of the
object. The shape reconstruction can also be independent of the angle, given the
fixed start point. Therefore, for the same start point, the object contour can
be reconstructed from the centroid-contour distance curve. The distance between
the object centroid and a contour point is termed the centroid-contour distance.
The contour can thus be represented by a one-dimensional curve, the
centroid-contour distance curve.
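The definition above can be illustrated with a minimal sketch. It assumes the leaf is already available as a binary mask and that the contour has been traced into an ordered list of (x, y) points; the function name `centroid_contour_distance` and the toy square are illustrative choices, not part of the paper. The centroid is taken as the mean of the region pixels, which is how the region centroid is computed for a binary image.

```python
import numpy as np

def centroid_contour_distance(mask, contour):
    """Distance from the region centroid to each contour point.

    mask    : 2-D boolean array, True inside the object (assumed input).
    contour : sequence of (x, y) contour points in tracing order (assumed input).
    """
    ys, xs = np.nonzero(mask)            # pixel coordinates of the region
    xc, yc = xs.mean(), ys.mean()        # region centroid of a binary mask
    contour = np.asarray(contour, dtype=float)
    return np.hypot(contour[:, 0] - xc, contour[:, 1] - yc)

# Toy example: a 5x5 square whose centroid is at (3, 3).
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
corners = [(1, 1), (5, 1), (5, 5), (1, 5)]
curve = centroid_contour_distance(mask, corners)
```

In this toy case every corner lies at the same distance from the centroid, so the four entries of `curve` are equal.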

2.1 Properties of the Centroid-Contour Distance Curve

Generally, scale, translation, and rotation invariance are expected of shape
features used for image retrieval. After some normalization, we can
achieve these invariant properties with the centroid-contour distance curve.
Translation Invariant Property
The contour shape of an object is fully determined by the centroid-contour
distance function and has nothing to do with the coordinates of the centroid
position. Therefore, the distance curve is translation invariant. This is
elaborated as follows. As shown in Figure 1,
\[ |CP| = \sqrt{(x_C - x_P)^2 + (y_C - y_P)^2} \tag{1} \]

Fig. 1. Illustration of translation invariant property of the centroid-contour distance


where (xC , yC ) and (xP , yP ) are the coordinates of points C and P respectively.
And the object centroid (xC , yC ) is defined as follows.
\[ m_{pq} = \iint_R x^p y^q \, dx\,dy, \qquad x_C = \frac{m_{10}}{m_{00}}, \qquad y_C = \frac{m_{01}}{m_{00}} \tag{2} \]
If the object is translated by increments Δx and Δy along the x-axis and
y-axis, respectively, the point P on the object contour is moved to the point P1
with coordinates (xP + Δx, yP + Δy). According to Equation 2,
\[ x_{C_1} = \frac{\iint_{R_1} x \, dx\,dy}{\iint_{R_1} dx\,dy}
           = \frac{\iint_{R} (x + \Delta x) \, dx\,dy}{\iint_{R} dx\,dy}
           = \frac{\iint_{R} x \, dx\,dy}{\iint_{R} dx\,dy} + \Delta x
           = x_C + \Delta x \]
where R1 is the region of the newly positioned object.
Similarly, yC1 = yC + Δy; that is, the new centroid point C1 of the object is
(xC + Δx, yC + Δy). Obviously,
\[ |C_1 P_1| = \sqrt{(x_{C_1} - x_{P_1})^2 + (y_{C_1} - y_{P_1})^2}
             = \sqrt{(x_C - x_P)^2 + (y_C - y_P)^2}
             = |CP| \]
The above equations show the translation invariant property of the
centroid-contour distance curve.
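The derivation can be checked numerically with a small sketch. The random point set standing in for the region pixels, the shift vector, and the helper name `distance_curve` are all illustrative assumptions; only the idea (translate region and contour together, then compare the distance curves) comes from the text.

```python
import numpy as np

def distance_curve(region_xy, contour_xy):
    """Centroid-contour distances for a region given as an (n, 2) point array."""
    centroid = region_xy.mean(axis=0)
    return np.linalg.norm(contour_xy - centroid, axis=1)

rng = np.random.default_rng(0)
# Stand-in "region": 200 random pixel coordinates (not a real leaf).
region = rng.integers(0, 50, size=(200, 2)).astype(float)
contour = region[:20]                     # pretend these are contour points
shift = np.array([13.0, -7.0])            # translation (dx, dy)

curve_before = distance_curve(region, contour)
curve_after = distance_curve(region + shift, contour + shift)
```

Because the centroid moves by the same (Δx, Δy) as every contour point, the two curves agree up to floating-point rounding.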
Scale Invariant Property
Let us consider a circle first. The centroid-contour distance of a circular
object is a constant. For a larger circle, the distance is a larger constant and
there are more points on the contour. To make matching possible, we should
reduce the number of points on the contour of the larger object by down
sampling. Figure 2 illustrates the situation in which the arc AB of contour S1
is scaled to the arc CD of contour S2 without any other change. We assume that
|OA| = s|OC| and |OB| = s|OD|. For any point A on contour S1, a corresponding
point C can be found on contour S2. In other words, we can represent shape S2
with the same sample points as shape S1; the only difference is the distance,
which differs by the factor s. If the distance is normalized, the scale factor s
is eliminated. Therefore, after sampling and normalization, the centroid-contour
distance curve of the scaled object contour S2 is the same as that of the
original object contour S1. In our experiments, the centroid-contour distance is
normalized to the range [0, 1].

Fig. 2. Illustration of the scale invariant property of the centroid-contour distance

As Figure 3 shows, with proper sampling and normalizing operations, the
centroid-contour distance curve is scale invariant.
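A minimal sketch of the sampling and normalizing step follows. The paper does not spell out the exact scheme, so two assumptions are made here: the curve is down-sampled at evenly spaced indices, and it is normalized by dividing by its maximum value. The function name `resample_and_normalize` and the toy curves are hypothetical.

```python
import numpy as np

def resample_and_normalize(curve, n_samples):
    """Down-sample a distance curve to n_samples points and scale it into [0, 1].

    Assumes evenly spaced index sampling and division by the maximum distance.
    """
    curve = np.asarray(curve, dtype=float)
    idx = np.arange(n_samples) * len(curve) // n_samples   # evenly spaced indices
    sampled = curve[idx]
    return sampled / sampled.max()

# Toy curves: `small` mimics a half-size copy of `large`
# (half as many contour points, half the distances).
large = np.array([8.0, 6.0, 4.0, 6.0, 8.0, 6.0, 4.0, 6.0])
small = large[::2] / 2.0

large_norm = resample_and_normalize(large, 4)
small_norm = resample_and_normalize(small, 4)
```

After this step the two toy curves coincide, which is the behaviour the scale invariance argument relies on.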
Rotation Invariant Property
Rotation may be introduced in data acquisition, such as picture grabbing by a
camera or scanner. If a shape feature is sensitive to rotation, the same leaf
rotated by some angle will not be retrieved with that shape feature, which is
not desirable. As shown in Figure 4, an object contour with m points is rotated
clockwise by one pixel. If both contours are traced from the same start
position, the centroid-contour distance sequences will be
(|OP1|, |OP2|, ..., |OPm|) and (|OPm|, |OP1|, |OP2|, ..., |OPm-1|) for the
contour on the left and that on the right, respectively. The only difference
between the two sequences is the order of the contour points. If the latter
sequence is circularly shifted left by one item, the two sequences will be the
same. Generally, for an arbitrary rotation, we need to circularly shift the
sequence by n items. That is to say, the centroid-contour distance curve is
rotation invariant after being circularly shifted by n items.
At first sight, Figure 5(b) does not reveal the rotation invariant property,
because it does not look like Figure 3(d) at all. It is not difficult to see
that the different start points cause the problem, which can be solved by
circularly shifting the curve. The centroid-contour distance curve shown in
Figure 5(c) is generated by shifting that in Figure 5(b) 24 points to the left.
The resulting curve is similar to that shown in Figure 3(d). Fourier
transformation or correlation can be used to locate the start point; however,
both are computationally costly [3, 6]. A thinning-based start point(s) locating
method is proposed to reduce the computation. First, the binarized leaf image
is thinned to obtain its skeleton. Several end-points are then located on the
skeleton. For each end-point, the closest point on the leaf contour is a
possible start point. As shown in Figure 6, points A, B, C and D are the
end-points, and points A1, B1, C1 and D1 are their closest points on the
contour, respectively. Therefore, the distance curve only needs to be shifted to
those closest points during the matching process, which reduces the computation
greatly. Because of quantization error, some neighbor points of the closest
points are also considered (5 neighbor points are considered in our
experiments).

Fig. 3. Scale invariant property of center distance curve. (a) Original image; (b) Half-size
image; (c) Centroid-contour distance curve for original image; (d) Centroid-contour
distance curve for half-size image; (e) Sampled center distance curve of original image.

Fig. 4. Illustration of rotation invariant property of the centroid-contour distance

Fig. 5. Rotation invariant property of centroid-contour distance curve. (a) Rotated
image; (b) Centroid-contour distance curve of rotated image; (c) Wrappedly shifted
centroid-contour distance curve of rotated image.

Fig. 6. Illustration of the thinning-based start point(s) locating method.
(a) Skeleton; (b) Contour.
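The start point selection can be sketched as follows, under stated assumptions. The skeleton end-points are taken as given (the thinning step itself is not shown), and "5 neighbor points" is interpreted here as a window of five contour indices centred on the closest point, which is only one possible reading of the text. The function name `candidate_start_indices` and the 12-point circular contour are hypothetical.

```python
import numpy as np

def candidate_start_indices(contour, end_points, n_neighbors=2):
    """Contour indices worth trying as start points.

    For each skeleton end-point, take the closest contour point and
    n_neighbors points on each side of it (indices wrap around the contour).
    """
    contour = np.asarray(contour, dtype=float)
    n = len(contour)
    candidates = set()
    for p in np.asarray(end_points, dtype=float):
        nearest = int(np.argmin(np.linalg.norm(contour - p, axis=1)))
        for k in range(-n_neighbors, n_neighbors + 1):
            candidates.add((nearest + k) % n)
    return sorted(candidates)

# Toy contour: 12 points on the unit circle; one end-point near contour point 0.
angles = 2.0 * np.pi * np.arange(12) / 12
toy_contour = np.column_stack([np.cos(angles), np.sin(angles)])
starts = candidate_start_indices(toy_contour, [(0.9, 0.0)])
```

Only the returned indices need to be tried as circular shifts during matching, instead of every contour point.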




2.2 Similarity Measurement

The centroid-contour distance curve can be used to measure the dissimilarity
between image shapes. We define the following distance function D to measure
the dissimilarity between two images:
\[ D = \sqrt{\frac{\sum_{i=1}^{n} |f_1(i) - f_2(i)|}{n}} \]
where f1(i) and f2(i) are the centroid-contour distances of the i-th point of
the two object contours, and n is the number of sample points on the
centroid-contour distance curve.
Rotation invariance holds only when one of the curves is shifted by a suitable
number of points. In order to find the best matching result, we have to shift
one curve n times, where n is the number of possible start points, and record
the minimal distance Dmin. We define the dissimilarity Dc between two object
contours as
\[ D_c = \min\{D_1, \ldots, D_j, \ldots, D_n\} \]
where Dj is the distance between the two object contours when one of the
contours is shifted by j points.
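The shift-and-match procedure can be sketched as below. The sketch assumes that D is the square root of the mean absolute difference between the two sampled curves, and that both curves have already been resampled to the same length; the function names and toy values are illustrative only. `np.roll` with a negative shift performs the circular left shift.

```python
import numpy as np

def curve_distance(f1, f2):
    """Distance D between two equally sampled curves (assumed form, see lead-in)."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    return np.sqrt(np.abs(f1 - f2).sum() / len(f1))

def contour_dissimilarity(f1, f2, shifts=None):
    """Dc: minimum of D over the candidate circular shifts of f2."""
    if shifts is None:
        shifts = range(len(f2))           # exhaustive: try every start point
    return min(curve_distance(f1, np.roll(f2, -j)) for j in shifts)

# Toy example: f2 is f1 circularly shifted right by one point.
f1 = np.array([1.0, 0.5, 0.2, 0.5])
f2 = np.roll(f1, 1)

d_unshifted = curve_distance(f1, f2)
d_c = contour_dissimilarity(f1, f2)
```

Passing a short list of candidate shifts, such as those produced by a start point locating step, limits the number of evaluations of D.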

3 Leaf Image Retrieval with Feature Combination


Fig. 7. Block diagram of our two-step leaf image retrieval approach. Blocks
shown: color leaf image; gray scale and histogram-based segmentation; binarized
leaf image; shape feature extraction (including the distance curve); two-step
retrieval, which first retrieves the top m candidates in terms of eccentricity
and then scores the m candidates based on the centroid-contour distance curve
and eccentricity.

In order to make leaf image retrieval more efficient, a computationally simple
feature is used to select the top candidates first. We propose to use another
shape parameter, the eccentricity of an object, for the first-stage leaf image
retrieval. Eccentricity is defined with moments as [1]
\[ e_I = \frac{(u_{20} - u_{02})^2 + 4u_{11}}{A}, \qquad
   u_{pq} = \iint_R (x - x_C)^p (y - y_C)^q \, dx\,dy \]
where R is the whole region of an object, A is its area, and (xC, yC) is the
object centroid, which is the same as that used in computing the
centroid-contour distance (Equation 2). It is easy to verify that the
eccentricity is translation, scale and rotation invariant. The eccentricity
dissimilarity De between two leaf images I and J is defined as
\[ D_e(I, J) = |e_I - e_J| \]
The smaller De is, the more similar the leaf images I and J are.
Leaf shape is regular in some respects. For example, for rough classification,
leaves can be divided into a fat and round type and a thin and long type without
considering the details of their contours (sawtooth or not). Eccentricity is
therefore a promising feature for the first-stage leaf image retrieval.
For the second-stage leaf image retrieval, the combination of two features,
the eccentricity and the centroid-contour distance curve, is used to score the
retrieved leaf images:
\[ D_s(I, J) = \frac{w_1 D_e(I, J) + w_2 D_c(I, J)}{w_1 + w_2} \tag{8} \]
where I and J denote two leaf images, and De and Dc are the distance measures
for the two features. The weights w1 and w2 reflect the relative importance of
the two features and are determined by simulation tests.
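As a small worked example of the weighted score, the sketch below combines an eccentricity difference with a contour dissimilarity. The default weights 0.4 and 0.6 are the values reported in the experiments of this paper; the function name `combined_dissimilarity` and the input numbers are made up for illustration.

```python
def combined_dissimilarity(e_i, e_j, d_c, w1=0.4, w2=0.6):
    """Ds: weighted average of the eccentricity and contour dissimilarities."""
    d_e = abs(e_i - e_j)                  # De(I, J) = |eI - eJ|
    return (w1 * d_e + w2 * d_c) / (w1 + w2)

# Hypothetical feature values for a query image I and a database image J.
score = combined_dissimilarity(e_i=0.30, e_j=0.20, d_c=0.05)
```

With these inputs De is about 0.10, so the score is roughly 0.4 * 0.10 + 0.6 * 0.05 = 0.07. Dividing by w1 + w2 keeps the score on the same scale when the weights do not sum to one.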
Figure 7 shows the flow chart of our approach. In terms of the eccentricity
measure, the top-scored M images are retrieved from the image database.
From these M images, the top-scored N (N < M) images are selected based on
Ds(I, J) defined in Equation 8. In order to reduce the search time further,
images in the database can be indexed by their eccentricity values.
When a query is submitted, the leaf image with the closest eccentricity value
can be found with a fast search algorithm such as the half search (binary
search) algorithm. The top M leaf images with close eccentricity values can then
be easily found nearby. Since eccentricity is a one-dimensional feature, a
multi-dimensional k-nearest search algorithm, such as the k-d tree approach
[10], is not necessary.
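The first-stage lookup on a sorted eccentricity index can be sketched with Python's standard `bisect` module. The index layout (two parallel lists sorted by eccentricity), the function name `top_m_by_eccentricity`, and the toy values are assumptions made for illustration; the paper only states that a half search locates the closest value and that the top M images are found nearby.

```python
import bisect

def top_m_by_eccentricity(keys, ids, query_ecc, m):
    """Return the ids of the m images whose eccentricity is closest to the query.

    keys : eccentricity values sorted in ascending order (assumed index layout).
    ids  : image identifiers in the same order as keys.
    """
    hi = bisect.bisect_left(keys, query_ecc)   # binary search for the query
    lo = hi - 1
    picked = []
    # Expand outwards from the insertion point, always taking the closer side.
    while len(picked) < m and (lo >= 0 or hi < len(keys)):
        use_lo = hi >= len(keys) or (
            lo >= 0 and query_ecc - keys[lo] <= keys[hi] - query_ecc)
        if use_lo:
            picked.append(ids[lo])
            lo -= 1
        else:
            picked.append(ids[hi])
            hi += 1
    return picked

# Toy index of five images.
keys = [0.10, 0.25, 0.30, 0.55, 0.90]
ids = ["a", "b", "c", "d", "e"]
candidates = top_m_by_eccentricity(keys, ids, query_ecc=0.28, m=3)
```

The candidates returned here would then be re-ranked in the second step using the combined score Ds.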

4 Experimental Results and Discussion

In our experiments, two data sets are used to test our approach. Data set 1
contains 135 color leaf images of size 320 × 240 that seldom come from the same
plant. Data set 2 contains 233 color leaf images of arbitrary sizes, with about
10 samples collected from each plant. From Table 1 we can see that the
thinning-based start point(s) locating method has reduced the number of
shift-and-matching operations.
When a query request is submitted, the two-step retrieval approach is
performed. In the experiments, the top 30 closest images are returned in the
first step.
Figure 8 shows retrieval results for a query performed on data set 1. The
retrieved leaves are all roughly circular in overall shape, although the
boundary of the leaf in Figure 8(f) is round rather than sawtoothed. This result
indicates that the features pay more attention to global shape information.
Another retrieval example, performed on data set 2, is shown in Figure 9; the
top six retrieved images all come from the same plant. In these experiments, we
set w1 = 0.4 and w2 = 0.6 empirically.

Table 1. Comparison of the average numbers of shift-and-matching operations.

              Contour points    Closest points
Data Set 1
Data Set 2

Fig. 8. A retrieval example for a sawtooth contour shape in the data set 1. (a) Query
image; (b)-(g) Top 6 retrieved images with their Ds values: (b) 0.2128, (c) 0.2191,
(d) 0.2282, (e) 0.2393, (f) 0.2592, (g) 0.2639.

Fig. 9. A retrieval example for leaf image with stiff corners with the data set 2. (a)
Query image; (b)-(g) Top 6 retrieved images with their Ds values: (b) 0.0494,
(c) 0.0630, (d) 0.0661, (e) 0.0692, (f) 0.0740, (g) 0.0758.
These examples show that our approach can achieve retrieval results similar to
those from human visual perception. Table 2 shows the retrieval time for the
above two examples when the exhaustive search and our two-step search scheme are
used. Our experiments were carried out on a Pentium 333 MHz PC. The two-step
retrieval scheme reduces the retrieval time significantly, which shows that our
approach is computationally more efficient.

Table 2. Comparison of retrieval time.

              Exhaustive search (Sec.)    Two-step search (Sec.)
Example 1
Example 2

5 Conclusion

In this paper we present an efficient two-step leaf image retrieval scheme that
uses two shape features, the centroid-contour distance curve and eccentricity.
It is shown that both features are scale, rotation, and translation invariant
after proper normalization. Our approach has been tested on two data sets, with
135 and 233 leaf images respectively, with good results. Compared with the
exhaustive search, our two-step approach is computationally more efficient. In
addition, the identification of start point(s) using the skeleton of a leaf
image reduces the matching time in image retrieval to a great extent. Since the
features proposed in this paper pay much attention to global shape information,
some local features will be adopted in our future work in order to improve
retrieval.

The work described in this paper was substantially supported by a grant from
the Hong Kong Polytechnic University (Project No. G-V780).

References

1. A. K. Jain. Fundamentals of Digital Image Processing. Prentice Hall, London, UK.
2. B. M. Mehtre, M. S. Kankanhalli, and W. F. Lee. Shape measures for content
based image retrieval: a comparison. Information Processing & Management, 33(3).
3. Xianfeng Ding, Weixing Kong, Changbo Hu, and Songde Ma. Image retrieval
using Schwarz representation of one-dimensional feature. In Visual Information and
Information Systems, pages 443-450, Amsterdam, The Netherlands, June 1999.
4. C. W. Richard and H. Hemami. Identification of three-dimensional objects using
Fourier descriptors of the boundary curve. IEEE Trans. on Systems, Man and
Cybernetics, SMC-4(4), July 1974.
5. S. A. Dudani, K. J. Breeding, and R. B. McGhee. Aircraft identification by moment
invariants. IEEE Trans. on Computers, C-26(1), Jan. 1977.
6. E. Persoon and K. S. Fu. Shape discrimination using Fourier descriptors. IEEE
Trans. on Systems, Man and Cybernetics, SMC-7(3), Mar. 1977.
7. C. Chen. Improved moment invariants for shape discrimination. Pattern
Recognition, 26(5), 1993.
8. A. K. Jain and A. Vailaya. Image retrieval using color and shape. Pattern
Recognition, 29(8), 1996.
9. A. K. Jain and A. Vailaya. Shape-based retrieval: a case study with trademark
image database. Pattern Recognition, 31(9), 1998.
10. R. Egas, N. Huijsmans, M. Lew, and N. Sebe. Adapting k-d trees to visual retrieval.
In Visual Information and Information Systems, pages 533-540, Amsterdam, The
Netherlands, June 1999.
