Professional Documents
Culture Documents
Instituto de Automtica e Informtica Industrial. Universidad Politcnica de Valencia. Camino de Vera s/n, 46022, Valencia, Spain
Centro de Agroingeniera. Instituto Valenciano de Investigaciones Agrarias. Cra. Moncada-Nquera, Km. 5, 46113 Moncada, Spain
Instituto en Bioingeniera y Tecnologa Orientada al Ser Humano. Universidad Politcnica de Valencia. Camino de Vera s/n, 46022 Valencia, Spain
a r t i c l e
i n f o
Article history:
Received 28 October 2009
Received in revised form 21 January 2010
Accepted 2 February 2010
Keywords:
Fruit Inspection
Automatic Quality Control
Multivariate Image Analysis
Principal Component Analysis
Unsupervised Methods
a b s t r a c t
One of the main problems in the post-harvest processing of citrus is the detection of visual defects in
order to classify the fruit depending on their appearance. Species and cultivars of citrus present a high
rate of unpredictability in texture and colour that makes it difcult to develop a general, unsupervised
method able of perform this task. In this paper we study the use of a general approach that was originally developed for the detection of defects in random colour textures. It is based on a Multivariate
Image Analysis strategy and uses Principal Component Analysis to extract a reference eigenspace from a
matrix built by unfolding colour and spatial data from samples of defect-free peel. Test images are also
unfolded and projected onto the reference eigenspace and the result is a score matrix which is used to
compute defective maps based on the T 2 statistic. In addition, a multiresolution scheme is introduced
in the original method to speed up the process. Unlike the techniques commonly used for the detection
of defects in fruits, this is an unsupervised method that only needs a few samples to be trained. It is
also a simple approach that is suitable for real-time compliance. Experimental work was performed on
120 samples of oranges and mandarins from four different cultivars: Clemenules, Marisol, Fortune, and
Valencia. The success ratio for the detection of individual defects was 91.5%, while the classication ratio
of damaged/sound samples was 94.2%. These results show that the studied method can be suitable for
the task of citrus inspection.
2010 Elsevier B.V. All rights reserved.
1. Introduction
The automatic inspection of quality in the agro-industry is
becoming of paramount importance in order to decrease production costs and increase quality standards as stated by Chen
et al. (2002) and Brosnan and Sun (2002). On the packing lines,
where external quality attributes are currently inspected visually,
machine vision is providing a way to perform this task automatically. Together, the reduction in the price of machine vision
equipment and the increased processing capacity of computers
now allow more complex analyses of images to be carried out, being
the detection of defects and quality control of the product some
of the major applications of machine vision in this eld (Rocha et
al., 2010; Davies, 2009). For instance, Vzhny and Felfldi (2000)
and Aleixos et al. (2002) reported automatic systems for grading
the quality of mushrooms and citrus, respectively, using machine
vision. More recently, Blasco et al. (2009a) developed a prototype
Corresponding author.
E-mail address: opez@disca.upv.es (F. Lpez-Garca).
0168-1699/$ see front matter 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.compag.2010.02.001
190
edge-based features and Hough transform features. A feature selection procedure was performed in the sets of colour percentiles,
LBP and edge-based features. For classication they used a k-NN
approach with k=3. An important drawback of the method is that it
uses supervised training and large number of samples are required
to build the reference model. Furthermore, the model is based on
the knowledge of defective and sound samples compiled in the
training stage, and thus, it is not adequate for the detection of
a-posteriori new types of defects (unpredictable new defects).
More recently, the Texems approach was proposed by Xie and
Mirmehdi (2007). This method is based on the use of texture primitives representing the texture layout of normal samples. Textural
primitive information is extracted, in an unsupervised manner,
from a small number of training samples using Gaussian Mixture
Modelling (GMM). Instead of generating them directly from the
RGB (red, green, blue) channels, they use a reference eigenspace
built from the colour data of locations belonging to a small set of
defect-free textures. RGB channels of a training image are then projected onto the reference eigenspace to achieve three decorrelated
eigenchannels from which texems are extracted. This is performed
to overcome the problem of high correlation among the RGB channels. In the defect detection stage, testing images are projected onto
the reference eigenspace. Then, textures in the resulting eigenchannels are compared with the texems collected in the training stage
and, nally, defects are detected using a probability function built
from Gaussian distributions.
In general, a desirable feature for the automatic inspection
methods is the capability to deal with new unpredictable defects
that may occur during production. In this case, the proper approach
should be to build a model of sound areas and compare new test
images against this model considering as being defective any area
that does not t the model. This kind of approach is called novelty
detection, since the method is designed to detect any sample that
does not t the learned model, and thus, what is detected is something novel or unknown to the system. Moreover, since conditions
at the production lines can change frequently, another desirable
feature is simple unsupervised training using few samples. In addition, real-time compliance is an important issue so that the overall
production can be inspected at on-line rates. Thus, approaches with
low computational costs are valuable.
In particular for the post-harvest inspection of citrus fruits, one
desirable feature is the capability to deal with new types of samples easily. This idea is closely related with the problems found
during the inspection. Due to the large variety of colours and textures present in citrus, inspection systems need frequent tuning to
adjust the classication to the features of each batch of fruit; moreover, if the cultivar changes, the system needs to be fully retrained.
Therefore, unsupervised and easy-to-train systems are needed.
The method we study in this paper (Lpez-Garca et al., 2006)
belongs to the group of methods oriented to the detection of
defects in random colour textures. It is based on a Multivariate
Image Analysis strategy, developed in recent years in the eld
of applied statistics, and uses the T 2 statistic to detect defective
locations and build the corresponding map of defects. Moreover,
this method has the advantage of being consistent with all the
desirable features mentioned above for the inspection methods:
it builds a model of defect-free colour textures using unsupervised
training and few samples, it considers any location that does not
match the model as being defective, and thus it is adequate for
the detection of new unpredictable defects by carrying out novelty detection, and also, it uses a simple approach that makes it
suitable for real-time compliance. In the present work we have
extended the original method to include a multiresolution framework which is able to signicantly speed-up the overall process, and
also, a post-processing stage has been added to improve detection
results.
191
Fig. 1. Six different types of surface defects. From top to bottom and left to right: wind scarring, thrips scarring, sooty mould, stem-end injury, scale infestation and green
mould.
The MIA approach presented by Lpez-Garca et al. (2006) follows the ow-diagram presented in Fig. 2. In both stages, training
and test, the rst step is to unfold the colour and spatial information of the image pixels to congure a matrix of raw data. In this
matrix, each row is composed of the RGB values of one pixel and
its vicinity. Pixels vicinity is set through a neighbourhood window.
Fig. 3 shows the unfolded RGB raw data of an image using a square
window of size 3x3. For a given pixel ith , its R value is translated
rst, next the R value of the top-left neighbour and then the rest
of neighbours following the clockwise direction. By following the
same approach with the G and B channels, each row of the matrix is
created. Apart of squares, other window shapes are suitable for use,
such as hexagons or crosses. Nevertheless, we have used the most
common window shape, also used by Lpez-Garca et al. (2006), consisting of a WxW square window where W can vary in odd numbers,
e.g. 3x3, 5x5, etc. Other unfolding orders are also possible since the
PCA analysis does not depend on the order of columns (data variables). However, we always must use the same unfolding order in
training and test stages for the correct functioning of the method.
The method presented by Lpez-Garca et al. (2006) works in the
following way. A training image free of defects is used to compile a
reference eigenspace that will be used to build the reference model
of locations (pixels) belonging to defect-free areas and to perform
defect detection in test images. Let INxM be the training image and
let
(1)
192
(2)
L
t2
il
l=1
sl2
(3)
where til is the score value of a given pixel ith in the lth principal
component (lth eigenvalue) with variance sl2 .
A behaviour model of normal pixels belonging to defect-free
areas is created by computing the T 2 statistic for every pixel in the
training image. Ti2 is, in fact, the Mahalanobis distance of the projection of the pixel neighbourhood onto the eigenspace with respect
to the centre of gravity of the model (the mean), and represents a
measure of the variation of each pixel inside the model. In order
to achieve a threshold level of the T 2 variable dening the normal
behaviour of pixels, a cumulative histogram is computed from the
T 2 values of the training image. The threshold is then determined
by choosing an extreme percentile in the histogram, commonly 90%
or 95%. Any pixel with a T 2 value greater than the threshold will
be considered a pixel belonging to a defective area. One or more
Fig. 3. Matrix of unfolded RGB data using a 3x3 square vicinity window. Arrows on
the top show pixels neighbours following the clockwise direction.
(4)
where pi are the pixels, Inew is the testing image, Dmap is the defective map image and TH2 is the threshold for a given percentile.
2.3. Multiresolution and Post-Processing
In order to speed-up the process and improve the results
of defect detection, we have extended the original method by
including a multiresolution scheme and a post-processing stage.
Multiresolution is introduced to capture defects, and parts of
defects, of different sizes with minimum computational cost. In this
case, we x the size of the vicinity window to the minimum size
of 3x3, and apply the method to the test sample at several scales
(lower of equal to 1.0). In this way, bigger defects are collected
at lower scales, where the 3x3 window covers a larger area than in
the original scale. Lower scales lead to smaller matrices of unfolded
data and then the process is accelerated because the major computational cost of the method is concentrated in the projection of data
matrix onto the reference eigenspace (a matrix multiplication). By
reducing the size of matrices the computational cost is signicantly
reduced. The nal map of defects is built by combining the defective
maps computed at each scale and then resized to the original size
of samples (scale 1.0). It is worthy noting that when we use a 0.5
scale the size of matrices is reduced to 1/4 and the computational
cost is reduced to 1/8.
193
Fig. 5. Several patches of defect-free peel areas extracted to build the model of sound colour textures corresponding to Clemenules cultivar.
194
0.25, 0.12), (1.00, 0.50, 0.12), (1.00, 0.50, 0.25), (1.0, 0.50, 0.25, 0.12).
This led to a total number of 462 different experiments carried out
for each cultivar. To tune the parameters, that is, to choose those
values that maximise the quality of the maps of defects, we marked
the defective areas in the samples of the corresponding cultivar
manually and then compared with the defective maps obtained by
the method using the measures of Precision, Recall and F-score,
which are dened as:
Precision =
Recall =
F =2
tp
tp + fp
tp
tp + fn
Precision Recall
Precision + Recall
(5)
Table 1
Best results of Precision, Recall and F-Score achieved for the Clemenules cultivar and
the corresponding combinations of factors. EV is the number of principal eigenvectors and P is the percentile. Timing results are provided in milliseconds.
Scales
EV
Preci.
Recall
F-Score
Timing
0.50, 0.12
0.50, 0.25
0.50, 0.25
1.00, 0.50
1.00, 0.50
1.00, 0.50
1.00, 0.50, 0.25
13
11
13
9
13
23
9
95%
95%
95%
95%
95%
99%
95%
0.57
0.60
0.59
0.66
0.57
0.66
0.62
0.65
0.61
0.64
0.59
0.69
0.61
0.62
0.55
0.54
0.55
0.55
0.55
0.57
0.55
557
593
633
2408
2690
3763
2487
(6)
(7)
where tp (true positives) is the number of pixels marked and correctly detected, fp (false positives) is the number of pixels not
marked but detected, and fn (false negatives) corresponds to the
number of pixels marked but not detected. Precision is dened as
the ratio between pixels correctly detected and the total number
of pixels detected, whereas Recall is the ratio between pixels correctly detected over the total number of pixels marked. Precision
can be seen as a measure of exactness (delity), Recall is a measure of completeness, and the F-score is a measure that combines
Precision and Recall through their harmonic mean. We chose these
measures because they are commonly used in binary classication
(Olson and Delen, 2008), which is the kind of classication performed here on pixels. Other approaches such as the Fishers exact
test are also available for this purpose.
Once the set of experiments was performed for each cultivar,
the mean values of previous measures were computed. We then
selected the most balanced result, taking into account the achieved
measures but also the corresponding computational cost. Table 1
shows the combinations of factors that achieved the best results of
Precision, Recall and F-score for the Clemenules cultivar.
Table 2
Best combinations of factors selected for each cultivar and the corresponding mean
values of Precision, Recall and F-Score. EV is the number of principal eigenvectors
and P is the percentile. Timing results are provided in milliseconds.
Cultivar
EV
Scales
Prec.
Recall
F-Score
Timing
Clemenules
Fortune
Marisol
Valencia
Mean
11
17
23
27
95%
90%
90%
95%
0.50, 0.25
0.50, 0.25
0.50, 0.25
0.50, 0.12
0.60
0.54
0.62
0.64
0.60
0.61
0.69
0.58
0.67
0.64
0.54
0.56
0.53
0.62
0.56
593
736
892
863
Fig. 6. Original samples (top), manually marked defects in green (middle) and defective areas detected using the MIA approach in dark blue (bottom).
195
Table 4
Correct detection of each type of defect in all cultivars.
Table 3
Detection results on individual defects.
Cultivar
Defects
Detected
False Detec.
Defect Type
Correct Detection
Clemenules
Fortune
Marisol
Valencia
Total
238
172
195
138
743
211 (88.7%)
159 (92.4%)
185 (94.9%)
125 (90.6%)
680 (91.5%)
4 (1.7%)
10 (5.5%)
7 (3.5%)
6 (4.2%)
27 (3.5%)
Scales
Wind scarring
Oleocellosis
Phytotoxicity
Thrips scarring
Anthracnose
Decay with spores
Stem-end injury
Sooty mould
Total
Stem end
92.8%
91.5%
87.2%
93.8%
90.5%
92.3%
91.7%
100.0%
100.0%
91.5%
100.0%
the defects marked manually and the defects detected using the
MIA approach. We can see there that although the method does not
t the marked defects precisely or extensively, it does provide an
estimation of them that is good enough to carry out the inspection
task.
Timing results in Tables 1 and 2 were achieved using: single
threaded code in a standard computer (Intel Core2 Quad CPU @
2.66GHz), Linux operating system, C programing language and GSL
(GNU Scientic Library) support for PCA and matrix calculation.
Each timing corresponds to the application of the MIA approach to
a test sample of a given cultivar using the corresponding combination of factors. They are provided in milliseconds and averaged
from 20 executions. The computational cost of the approach only
depends on the size of the images and the selected combination
of factors (scales, principal eigenvectors and percentile), thus, it is
independent of the cultivars and samples (the image content).
3. Results and Discussion
Once the parameters were tuned, from the corresponding
marked defects and defective maps we counted the actual defects,
the correctly detected defects and the false detections for each cultivar. These results are shown in Table 3 (the percentage of false
detections is provided with regard to the total number of defects
plus the false detections). In this case the method achieves a correct classication ratio of 91.5%. The Codex Standards for oranges
quality (FAO, 2005) are subjective and provide only imprecise definitions like only very slight defects in the skin of the fruits are
allowed in the extra category. However, they allow several tolerances such as 5% of misclassication for extra category and 10%
of misclassication for the rest of categories, which ts the results
observed in the experiments.
An example of the analysis performed by the MIA approach for
one sample of each cultivar is shown in Fig. 6. First row corresponds
to the original samples, second row shows the defects marked manually using a graphical tool, and the bottom row shows the defects
detected by the MIA approach. From left to right, the type of defect
corresponded to scale infestation, thrips scarring, wind scarring and
phytotoxicity. It can be seen the staircase effect due to the use of
low scales in the multiresolution approach. From these samples
and others, we have observed that the task of marking defects is
subjective and thus imperfect to certain point, in some cases very
small defects has been marked (see Clemenules sample) while in
other cases (very few) some bigger defects have not been marked
(see Valencia sample). But, although this could have introduced
some misclassication, it is not signicant with regard to the total
number of defects as it is observed in Table 3.
Results in Table 3 show that the method can be adequate for
the task of citrus inspection, although they are slightly lower than
those obtained by Blasco et al. (2009b) (94.2% of correct detections,
4.0% of false detections). It also has the advantage that it does not
need to assume the hypothesis that most of the orange is a sound
area to carry out the unsupervised training, as it is necessary in
the aforementioned work. With regard to the false detections issue,
they could become a problem for an automatic sorting system since
sound fruit could be classied as damaged. The elimination of false
Correct
Incorrect
Accuracy
Clemenules
Fortune
Marisol
Valencia
Total
28
28
29
28
113
2
2
1
2
7
93.3%
93.3%
96.6%
93.3%
94.2%
196
are over 93% of success ratio) and in total (94.2% of global performance). Incorrect detections were due in all cases to sound samples
that were detected as damaged.
The fruit used in the experiments belongs to different harvesting
times in Spain; the Marisol cultivar is an early mandarin harvested
from September, Clemenules cultivar is harvested from November and Fortune is a late mandarin cultivar harvested from March.
The Valencia cultivar, in turn, is a late orange harvested from April.
Looking at the results, the method seems to be effective with all of
them even though they present different textures, sizes and types
of defects, which shows the robustness of the method. This robustness is due to the use of the novelty detection approach, where
the different types of defects are not studied individually but considered globally as a single defective class which is novel with
regard to the only class studied by the method, the class of sound
non-defective areas. Moreover, the detection of individual defects
presented in Table 3 and the performance in the classication of
fruit presented in Table 5 obtain similar effectiveness for all the
studied cultivars. This is probably due to the fact that the MIA
approach works with texture information instead of single pixel
information, which makes this method appropriate for the analysis
of scenes that present a great amount of variability.
In this work, the time required to process one view of one fruit
varies from approximately 600 to 900 ms using single threaded
code (only one CPU core is used) what can be considered excessive
for a real-time application of inspection that must process multiple
views of each fruit. However, if several fruits are inspected in each
image, as happens in the real applications, the computational cost
per fruit could be reduced since it depends more on the size of
the image than of its content. Nevertheless, the timing cost can be
drastically reduced by working with more powerful computers or
by introducing multi-threaded code (all the cores in the CPU are
used) and/or simple and cheap parallelization techniques based on
computer clustering.
Finally, since the unfolded data in the method contains directional comparison in certain order and PCA compression may
assume some dependence on orientation, we could think that the
method is dependent to orientation (rotation of textures). However, this effect is minimised as we use the minimum size for the
vicinity window, which is 3x3. Using this window size the orientation dependence is very low as the number of combinations
resulting from texture rotations is small with regard to the number
of samples (pixels vicinities) introduced in the PCA stage to learn
the model. Moreover, the method only learns from sound colour
textures where the variability is much more smaller than in the
damaged areas.
4. Conclusions
In this paper we have studied the detection of defects during the
post-harvest quality inspection of citrus using a Multivariate Image
Analysis approach. The method uses PCA analysis to extract a reference eigenspace that models sound colour textures from a matrix
built by unfolding colour and spatial data from defect-free samples. Test images are also unfolded and projected onto the reference
eigenspace to obtain a score matrix which is used to compute T 2
defective maps. To speed up the process, a multiresolution scheme
was introduced and also a post-processing stage to improve the
detection results.
The method performs novelty detection, and also is able to identify new unpredictable defects, by using a model of sound colour
textures and considering those locations that do not t this model
as being defective. It also needs only a few samples to carry out
the unsupervised training. For this reason, it is suitable for citrus
inspection as these systems need frequent tuning to adjust to the
Acknowledgements
This work has been supported by the Spanish Ministry of Education (MEC) and by European FEDER funds, through the research
projects DPI2007-66596-C02-01 (VISTAC) and DPI-2007-66596C02-02.
References
Aleixos, N., Blasco, J., Navarrn, F., Molt, E., 2002. Multispectral inspection of citrus
in real-time using machine vision and digital signal processors. Computers and
Electronics in Agriculture 33 (2), 121137.
Bennedsen, B.S., Peterson, D.L., Tabb, A., 2005. Identifying defects in images of rotating apples. Computers and Electronics in Agriculture 48 (2), 92102.
Blasco, J., Aleixos, N., Cubero, S., Gmez-Sanchis, J., Molt, E., 2009a. Automatic sorting of satsuma (Citrus Unshiu) segments using computer vision
and morphological features. Computers and electronics in agriculture 66 (1),
18.
Blasco, J., Aleixos, N., Gmez-Sanchis, J., Molt, E., 2009b. Recognition and classication of external skin damage in citrus fruits using multispectral data and
morphological features. Biosystems Engineering 103 (2), 137145.
Blasco, J., Aleixos, N., Molt, E., 2007. Computer vision detection of peel defects in
citrus by means of a region oriented segmentation algorithm. Journal of Food
Engineering 81 (3), 535543.
Brosnan, T., Sun, D.W., 2002. Inspection and grading of agricultural and food products
by computer vision systems, a review. Computers and Electronics in Agriculture.
36 (23), 193213.
Chen, Y.R., Chao, K., Kim, M.S., 2002. Machine vision technology for agricultural
applications. Computers and Electronics in Agriculture. 33 (23), 173191.
Davies, E.R., 2009. The application of machine vision to food and agriculture: a
review. The Imaging Science Journal 57 (4), 197217.
Du, C.J., Sun, D.W., 2006. Learning techniques used in computer vision for food
quality evaluation: a review. Journal of Food Engineering 72 (1), 3955.
FAO (Food and Agriculture Organization of the United Nations), 2005. Codex Standard for Oranges (CODEX STAN 2452004). http://www.codexalimentarius.net.
Last accessed January 2010.
Gmez-Sanchis, J., Molt, E., Camps-Valls, G., Gmez-Chova, L., Aleixos, N., Blasco, J.,
2008. Automatic correction of the effects of the light source on spherical objects.
An application to the analysis of hyperspectral images of citrus fruits. Journal of
Food Engineering 85 (2), 191200.
Kavdir, I., Guyer, D.E., 2004. Comparison of Articial Neural Networks and Statistical
Classiers in Apple Sorting using Textural Features. Biosystems Engineering. 89
(3), 331344.
Leemans, V., Magein, H., Destain, M.F., 1999. Defect segmentation on Jonagold apples
using colour vision and a Bayesian classication method. Computers and Electronics in Agriculture 23 (1), 4353.
Li, Q., Wang, M., Gu, W., 2002. Computer vision based system for apple surface defect
detection. Computers and Electronics in Agriculture 36 (23), 215223.
Liming, X., Yanchao, Z., 2010. Automated strawberry grading system based on image
processing. Computers and Electronics in Agriculture 71, S32S39.
Lpez-Garca, F., Prats-Montalbn, J.M., Ferrer, A., Valiente, J.M., 2006. Defect Detection in Random Colour Textures using the MIA T 2 Defect Maps. Lecture Notes in
Computer Science 4142, 752763.
197
Rocha, A., Hauagge, D.C., Wainer, J., Goldenstein, S., 2009. Automatic fruit and vegetable classication from images. Computers and Electronics in Agriculture 70
(1), 96104.
Unay, D., Gosselin, B., 2007. Stem and calyx recognition on Jonagold apples by pattern
recognition. Journal of Food Engineering 78 (2), 597605.
Vzhny, T., Felfldi, J., 2000. Enhancing colour differences in images of diseased
mushrooms. Computers and Electronics in Agriculture 26, 187198.
Xiao-Bo, Z., Jie-Wen, Z., Yanxiao, L., Holmes, M., 2009. In-line detection of apple
defects using three color cameras system. Computers and Electronics in Agriculture 70 (1), 129134.
Xie, X., Mirmehdi, M., 2007. TEXEMS: Texture exemplars for defect detection on
random textured surfaces. IEEE Transactions on Pattern Analysis and Machine
Intelligence 29 (8), 14541464.
Zheng, C., Sun, D.W., Zheng, L., 2006. Recent applications of image texture for evaluation of food Qualities - a review. Trends in Food Science & Technology 17,
113128.