
Grape Detection by Image Processing
R. Chamelat1 , E. Rosso1 , A. Choksuriwong2 , C. Rosenberger2 , H. Laurent2 , P. Bro3
1 ENSI de Bourges

10 boulevard Lahitolle, 18020 Bourges - France


email: {remi.chamelat,estelle.rosso}@ensi-bourges.fr
2 Laboratoire de Vision et Robotique - UPRES EA 2078
ENSI de Bourges
10 boulevard Lahitolle, 18020 Bourges - France
email: {christophe.rosenberger,helene.laurent}@ensi-bourges.fr
3 University of Talca

Department of Engineering Science, Curicó, Chile


email: pBro@utalca.cl

Abstract— We propose in this paper a new method for grape detection in outdoor images. Zernike moments are used to describe the grape shapes, and color information is also exploited. A support vector machine is used for the learning and recognition steps. The method is tested on real images acquired under different conditions, using a manually made ground truth. The proposed method recognizes grapes in 99% of cases with very few samples used in the learning step.

I. Introduction
The use of robotic systems for fruit harvesting has been of interest for almost four decades [13]. Many agricultural systems have been described in the literature for the automation of various harvesting processes, such as fruit location, detachment, transfer, obstacle avoidance and maneuvering [1], [2], [5], [12], [14]. Several good survey papers have covered different aspects of these systems [18], [19]. Of particular interest has been the harvesting of various fruits and vegetables [5], [21], [22], [23]. In this paper, we focus on a system for grape harvesting and transportation. The wine industry [20] is interested in using autonomous robotic systems to perform these tasks for multiple reasons.

1-4244-0136-4/06/$20.00 ©2006 IEEE

First of all, existing machines, such as those shown in Figure 1, harvest grapes by striking the vine. This harvesting process is not possible for some wines, such as champagne, for chemical reasons (oxidation). Moreover, some deposits are collected with the grapes. Finally, these machines need at least one operator to harvest.

Fig. 1. Some Existing Grape Harvesting Machines

Second, a vineyard can be harvested by humans. Robots should therefore be used only if the robot picking rate is higher than the manual picking rate. For large vineyards, a large number of workers is needed, so an autonomous system could also reduce the harvesting cost. Another important issue is that quality control using these systems must be superior to that achieved manually by humans.

One important step in this process is first to locate grapes in a vine area by using artificial vision. There are many applications of image processing in agriculture. Much work has been carried out on the quality control of fruits and vegetables by artificial vision [9], [15]. Such systems increase worker productivity, augment product throughput and improve selection reliability and uniformity. For a human, grape detection is not so easy, especially when the grapes and the leaves have a similar color. For such an application, the environment makes grape detection difficult. Indeed, the luminance of images can be very different (sun, shadow, etc.). A grape can appear at different scales and can be occluded by a leaf.

Fig. 3. Principle of the proposed method

Fig. 2. Grape harvesting robot developed by Monta et al. [10]

As we can see in Figure 4, the HSV space seems interesting for locating grapes. For each block of the original image, we compute the mean and standard deviation of the pixel color expressed in these two spaces. We thus have 12 values describing the color information.
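As an illustration of the block-based color description above, the following Python sketch (not the authors' code; the block size and pixel values are made up) subdivides an image into W × W blocks and computes the 12 color values of one block, i.e. the mean and standard deviation of each RGB and HSV channel:

```python
import colorsys
import math

def iter_blocks(width, height, w):
    """Yield the top-left corner of each full w x w block of the image."""
    for y in range(0, height - w + 1, w):
        for x in range(0, width - w + 1, w):
            yield x, y

def block_color_features(block):
    """block: list of (r, g, b) tuples with values in [0, 1].
    Returns 12 values: mean and standard deviation of R, G, B, H, S, V."""
    channels = [[] for _ in range(6)]            # R, G, B, H, S, V
    for r, g, b in block:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        for i, c in enumerate((r, g, b, h, s, v)):
            channels[i].append(c)
    features = []
    for values in channels:
        mean = sum(values) / len(values)
        std = math.sqrt(sum((c - mean) ** 2 for c in values) / len(values))
        features.extend((mean, std))
    return features

# A uniform red 4 x 4 block: all standard deviations are zero.
features = block_color_features([(1.0, 0.0, 0.0)] * 16)
print(len(features))   # 12 descriptors per block
```

Stacking these 12 values per block gives the color part of the descriptor vector fed to the classifier.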

To help solve this problem, we investigated techniques from invariant pattern recognition theory. Many works have been devoted to the definition of object descriptors invariant to simple geometric transformations [7], [11]. Among the available invariant descriptors, Zernike moments [8], [4] were developed to overcome the major drawbacks of regular geometrical moments regarding noise effects and image quantization error. Based on a complete and orthonormal set of polynomials defined on the unit circle, these moments achieve a near-zero redundancy measure. In [3], a comparative study showed the relative efficiency of Zernike moments compared with other invariant descriptors such as Fourier-Mellin descriptors or Hu moments.

We propose in this paper a method for the detection of grapes in an image using Zernike moments and color information. A support vector machine is used for the learning and recognition steps. In the next section, we detail the proposed method. In Section III, we illustrate the efficiency of the method through several experimental results on real images. Section IV is devoted to the conclusions of this work and its perspectives.

II. Developed method


Fig. 4. An image in the color spaces RGB and HSV
In order to locate a grape very quickly, we subdivide the image to process into blocks. For each block, we compute some parameters describing it. A support vector machine is then used to decide whether the block contains any grape or not (see Figure 3).

A. Color spaces

A color image can be expressed in the Red, Green, Blue (RGB) space or the Hue, Saturation, Value (HSV) one. Figure 4 shows an example of a color image represented in these two spaces.

B. Zernike moments

Zernike moments [8], [4] belong to the algebraic class for which the features are directly computed on the image. These moments use a set of Zernike polynomials that is complete and orthonormal in the interior of the unit circle. The Zernike moment formulation is given below:

A_{mn} = \frac{m+1}{\pi} \sum_{x} \sum_{y} I(x, y) \, [V_{mn}(x, y)]^*    (1)
with x^2 + y^2 \le 1. The values of m and n define the moment order and I(x, y) is a pixel gray level of the image I over which the moment is computed. The Zernike polynomials V_{mn}(x, y) are expressed in the radial-polar form:

V_{mn}(r, \theta) = R_{mn}(r) \, e^{-jn\theta}    (2)

where R_{mn}(r) is the radial polynomial given by:

R_{mn}(r) = \sum_{s=0}^{(m-|n|)/2} \frac{(-1)^s \, (m-s)!}{s! \left(\frac{m+|n|}{2}-s\right)! \left(\frac{m-|n|}{2}-s\right)!} \, r^{m-2s}    (3)

Classical methods for object recognition based on Zernike moments use a function f(x, y) equal to 1 or 0 (binary image). So, for two objects with the same shape, we have the same binary image (see Figure 5), in other words the same function f(x, y).

These moments yield invariance with respect to translation, scale and rotation. In order to differentiate objects with the same shape, we have applied Zernike moments to color images, namely using the f(x, y) gray-level image obtained as a weighted mean of the three RGB (Red, Green, Blue) color components. The function f(x, y) is defined by:

f(x, y) = 0.3R + 0.6G + 0.1B    (4)

For this study, the Zernike moments of order 10 and 11 have been computed (this represents 72 descriptors considering all the components). Figure 7 shows the values of the Zernike moments computed on the objects from Figure 5 by taking into account the color information. We can notice, in this case, that the Zernike moment values are different for the two objects.

Fig. 5. 2 objects with a similar shape

Fig. 7. Zernike moments values computed on the color image
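As a concrete illustration of equations (3) and (4), the sketch below (not the authors' code) evaluates the radial polynomial R_mn(r) and the gray-level function f; the angular factor e^{-jnθ} of equation (2) would multiply R_mn(r) to give the full polynomial V_mn:

```python
from math import factorial

def radial_poly(m, n, r):
    """Radial polynomial R_mn(r) of equation (3);
    requires |n| <= m and m - |n| even."""
    n = abs(n)
    total = 0.0
    for s in range((m - n) // 2 + 1):
        coeff = ((-1) ** s * factorial(m - s)
                 / (factorial(s)
                    * factorial((m + n) // 2 - s)
                    * factorial((m - n) // 2 - s)))
        total += coeff * r ** (m - 2 * s)
    return total

def gray_level(rgb):
    """Gray-level input of equation (4): f = 0.3 R + 0.6 G + 0.1 B."""
    r, g, b = rgb
    return 0.3 * r + 0.6 * g + 0.1 * b

# Two classical properties of the radial polynomials:
print(radial_poly(4, 4, 0.5))   # R_mm(r) = r**m, here 0.5**4 = 0.0625
print(radial_poly(4, 2, 1.0))   # R_mn(1) = 1 for any valid (m, n)
```

The checks in the comments (R_mm(r) = r^m and R_mn(1) = 1) are standard properties of the Zernike radial polynomials and make convenient unit tests for an implementation.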

Figure 6 plots 12 components of the 10th and 11th order Zernike moments applied on the binary images of the objects (object 1, object 2) from the COIL-100 database. We can notice that we obtain the same values for both objects.

Fig. 6. Zernike moments values of 2 objects having a similar shape

C. Training and recognition method

Suppose we have a training set {x_i, y_i} where x_i is the invariant descriptor vector described previously (x_i is composed of NBBLOCKS values corresponding to each block in the original image) and y_i the object class. For two-class problems, y_i ∈ {−1, 1}, the Support Vector Machines [17], [16] implement the following algorithm. First of all, the training points {x_i} are projected in a space H (of possibly infinite dimension) by means of a function Φ(·). Then, the goal is to find, in this space, an optimal decision hyperplane, in the sense of a criterion that we will define shortly. Note that for the same training set, different transformations Φ(·) lead to different decision functions. Figure 8 shows an illustration of the computation of a linear margin.

The transformation is achieved in an implicit manner using a kernel K(·, ·) and consequently the decision function can be defined as:

f(x) = \langle w, \Phi(x) \rangle + b = \sum_{i=1}^{\ell} \alpha_i^* \, y_i \, K(x_i, x) + b    (5)

Fig. 8. Example of computed margin in the linear case

with \alpha_i^* \in \mathbb{R}. The values w and b are the parameters defining the linear decision hyperplane. We use in the proposed system a radial basis function as kernel.

In SVMs, the optimality criterion to maximize is the margin, that is, the distance between the hyperplane and the nearest point Φ(x_i) of the training set. The \alpha_i^* optimizing this criterion are defined by solving the following problem:

\max_{\alpha_i} \; \sum_{i=1}^{\ell} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
subject to \; 0 \le \alpha_i \le C \; and \; \sum_{i=1}^{\ell} \alpha_i y_i = 0.    (6)

where C is a penalization coefficient for data points located in or beyond the margin; it provides a compromise between their number and the width of the margin (for this study, C = 100). For this application, we have a two-class problem, that is to say, a block contains a grape or not.
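The decision function of equation (5) with the Gaussian kernel of equation (7) can be sketched as follows (not the authors' code; the support vectors, weights α and bias b below are made-up toy values, not values learned by solving problem (6)):

```python
import math

def rbf_kernel(x, y, sigma=2.0):
    """Gaussian kernel of equation (7); sigma = 2 as in the paper."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def decision(x, support_vectors, labels, alphas, b):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b; its sign is the class."""
    s = sum(a * y * rbf_kernel(sv, x)
            for sv, y, a in zip(support_vectors, labels, alphas))
    return s + b

# Toy example: one "grape" block (+1) and one "no grape" block (-1).
svs = [[0.0, 0.0], [1.0, 1.0]]
labels = [1, -1]
alphas = [1.0, 1.0]
print(decision([0.1, 0.0], svs, labels, alphas, b=0.0) > 0)   # True
```

A point near the +1 support vector gets a positive score, i.e. it would be classified as a grape block; in the real system the α and b come from the quadratic program (6).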

III. Experimental results


In order to test the proposed approach, we used a database composed of 18 outdoor images. Figure 9 presents two examples. Note that these images had been acquired with different values of luminance, scale and even orientation. Each image has been manually segmented to differentiate the blocks containing grapes from the others.

Fig. 9. Some images and the associated ground truth used for the learning and recognition steps

The performance of our algorithm is analyzed with respect to the ratio of examples in the learning set. Hence, for a given ratio, the learning and testing sets have been built by randomly splitting all examples. Then, due to the randomness of this procedure, multiple trials have been performed with different random draws of the learning and testing sets.

The experimental setup is the following:
1. The learning set x_i: corresponding to the color Zernike moments computed in each block of the image database.
2. The classes y_i ∈ {−1, 1} of each block, that is to say, the block contains some grape or not.
3. Algorithm performance: the efficiency is given according to the number of examples selected from the learning dataset. For a given rate, the learning and the test sets are randomly constructed among all the examples.
4. Number of trials: fixed to 10, in order to compensate for the random drawing.
5. Kernel K: a Gaussian kernel of bandwidth σ is chosen:

K(x, y) = e^{-\|x - y\|^2 / (2\sigma^2)}    (7)

where x and y correspond to the descriptor vectors of two blocks.
6. Bandwidth σ: set to 2 (data are normalized) after optimization [6].
7. Penalization coefficient C: set to 100 after optimization [6].
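The evaluation protocol of points 3 and 4 can be sketched as follows (not the authors' code; the toy blocks and the majority-class stand-in classifier are made up and replace the trained SVM):

```python
import random

def evaluate(samples, ratio, trials=10, seed=0):
    """Average recognition rate over several random learning/test splits.
    samples: list of (descriptor, label) pairs, label in {-1, +1}."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    rates = []
    for _ in range(trials):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * ratio)
        learn, test = shuffled[:cut], shuffled[cut:]
        # Stand-in classifier: predict the majority class of the learning set.
        learn_labels = [y for _, y in learn]
        majority = max(set(learn_labels), key=learn_labels.count)
        rates.append(sum(1 for _, y in test if y == majority) / len(test))
    return sum(rates) / len(rates)

# 100 toy blocks: 70 without grapes (-1), 30 with grapes (+1).
blocks = [([0.0], -1)] * 70 + [([1.0], 1)] * 30
rate = evaluate(blocks, ratio=0.10, trials=10)
print(0.0 <= rate <= 1.0)   # True
```

Averaging over the trials smooths out the variance introduced by the random draw of the learning set, as in point 4 above.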
Figure 10 presents some results concerning grape recognition. Here we use only the color information, that is to say 12 parameters describing each block. The block size has been set to 6 and 10 pixels. We vary the size of the learning and test databases. For example, if we use 10% of the learning database, we use 10% of the samples of blocks with and without grapes. Using a block of size W = 10, we obtain better results than for W = 6. This is expected because the grape detection is then less precise in terms of localization. We obtain a recognition rate of 80%, which means the class of a block is correctly defined in 80% of cases. In this case, most of the blocks containing grapes are not correctly recognized.

Fig. 10. Efficiency of grape detection by taking into account only color information

Figure 11 presents the efficiency of the proposed approach using all the parameters (color parameters and invariant descriptors). We can see that with only 10% of the samples used in the learning database (which corresponds to 2160 samples with W = 16, for example), we have an error for the grape recognition of less than 0.5%.

Fig. 11. Efficiency of grape detection by taking into account all parameters

The computation time of the learning step for the blocks of size 16 × 16 pixels for the 17 images is 5 minutes on a Pentium 4 with a 2.8 GHz processor. The computation time of the recognition step (identification of each block of size 16 × 16 pixels) is less than one second.

IV. Conclusions and perspectives

We proposed in this paper a method for the detection of grapes in outdoor images. The proposed approach consists in determining if a block of the image contains a grape by using invariant descriptors. The use of these descriptors, namely Zernike moments, permits us to detect a grape at any rotation and scale in the image. Color information improves the recognition rate of the proposed approach. Support vector machines perform very well, with less than 0.5% errors for the recognition with few samples in the learning database.

For harvesting grapes with a robot, grape detection is an important step but not the only one. We have to determine the 3D location of the detected grape in order to pick it. This is one perspective of this work and stereovision will be investigated.

References
[1] Allotta, B., G. Buttazzo, P. Dario and F.A. Quaglia, "Force/torque sensor-based technique for robot harvesting of fruits and vegetables", Proceedings of the IEEE International Workshop on Intelligent Robots and Systems (IROS), Vol. 1, pp. 231-235, 1990.
[2] Ceres, R., J.L. Pons, A.R. Jimenez, J.M. Martin, and L. Calderon, "Design and Implementation of An Aided Fruit-Harvesting Robot (Agrobot)", Industrial Robot, Vol. 25, No. 5, pp. 337-346, 1998.
[3] A. Choksuriwong, H. Laurent and B. Emile, "Comparison of invariant descriptors for object recognition", Proceedings of the IEEE International Conference on Image Processing, 2005.
[4] C.-W. Chong, P. Raveendran and R. Mukundan, "A comparative analysis of algorithms for fast computation of Zernike moments", Pattern Recognition 36, pp. 731-742, 2003.
[5] Y. Edan, D. Rogozin, T. Flash and G.E. Miles, "Robotic melon harvesting", IEEE Transactions on Robotics and Automation, Vol. 16, pp. 831-835, 2000.
[6] R.-E. Fan, P.-H. Chen, and C.-J. Lin, "Working set selection using second order information for training SVM", Journal of Machine Learning Research 6, pp. 1889-1918, 2005.
[7] A.K. Jain, R.P.W. Duin and J. Mao, "Statistical Pattern Recognition: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence 22(1), pp. 4-37, 2000.

[8] A. Khotanzad and Y. Hua Hong, "Invariant Image Recognition by Zernike Moments", IEEE Transactions on Pattern Analysis and Machine Intelligence 12(5), pp. 489-497, 1990.
[9] J. Lu, P. Gouton, J.P. Guillemin, C. My, JC. Shell ”Utiliza-
tion of Segmentation of Color Pictures to Distinguish Onions
and Weeds in Field”, Proceeding of International Conference on
Quality Control by Artificial Vision (QCAV), Le Creusot, Vol-
ume 2, pp.557-562, 2001.
[10] M.N. Monta, Y. Kondo and Y. Shibano ”Agricultural Robot in
Grape Production System”, Proceedings of IEEE International
Conference on Robotics and Automation, pp. 2504-2509, 1995.
[11] M. Petrou and A. Kadyrov ”Affine Invariant Features from the
Trace Transform”, IEEE Transactions on Pattern Analysis and
Machine Intelligence 26(1) pp.30-44, 2004.
[12] Sabetzadeh, F., P. Medwell and B. Parking, ”Development of
a Grape Harvesting Mobile Robot,” Dept. of Mechanical Engi-
neering, The University of Adelaide, Adelaide, SA, Australia,
2001.
[13] Schertz, C.E., G.K. Brown, "Basic Considerations in Mechanizing Citrus Harvest", Trans. ASAE, pp. 343-346, 1968.
[14] Sistler, F., ” Robotics and intelligent machines in agriculture ”
IEEE Journal of Robotics and Automation, Vol. 3 , pages 3-6,
1987.
[15] C. Rosenberger, B. Emile, H. Laurent, ”Calibration and Quality
Control of Cherries by Artificial Vision”, International Journal of
Electronic Imaging, Special issue on quality control by artificial
vision, pp. 539-546, (13), n3, July 2004.
[16] B. Scholkopf and A. Smola, "Learning with Kernels", MIT Press,
2001.
[17] V. Vapnik, ”Statistical Learning Theory”, Wiley, 1998.
[18] Sarig, Y., "Robotics of Fruit Harvesting: A State-of-the-art Review", J. Agric. Engng Res., Vol. 54, pp. 265-280, 1993.
[19] Jimenez, A.R., A.K. Jain, R. Ceres, J.L. Pons, ” Auto-
matic Fruit Recognition : A Survey and New Results Using
Range/Attenuation Images, ” Pattern Recognition, Vol. 32, pp.
1719-1736, 1999.
[20] DoE Document, ”Assessment Study on Sensors and Automation
in the Industries of the Future: Reports on Industrial Controls,
Information Processing, Automation, and Robotics,” pp. 1 - 90,
2004.
[21] Monta, M., N. Kondo, K.C. Ting, ” End-Effectors for Tomato
Harvesting Robot, ” Artificial Intelligence Rev., Vol. 12, pp. 11-
25, 1998.
[22] Recce, M., J. Taylor, A. Plebe, G. Tropiano, ” Vision and Neural
Control for An Orange Harvesting Robot, ” 1996
[23] Van Henten, E.J., J. Hemming, B.A.J. Van Tuijl, J.G. Kornet,
J. Meuleman, J. Bontsema, E.A. Van Os, ” An Autonomous
Robot for Harvesting Cucumbers in Greenhouses,” Autonomous
Robots, Vol. 13, pp. 241-258, 2002.
