
Identification of Rice Varieties Using Co-occurrence Matrix Features of Bulk Samples and Support Vector Machine
S. J. MousaviRad
Department of Computer Engineering
University of Kurdistan
Sanandaj, Iran
jalalmoosavirad@gmail.com

F. Akhlaghian Tab
Department of Computer Engineering
University of Kurdistan
Sanandaj, Iran
Fardin.tab@gmail.com

K. Nasri
Department of Computer Engineering
University of Kurdistan
Sanandaj, Iran
kiamnasri@gmail.com

K. Mollazade
Dept. of Mech. Eng. of Agri. Machinery
University of Kurdistan
Sanandaj, Iran
kaveh.mollazade@gmail.com

Abstract- In this paper, an algorithm for classifying five
different varieties of rice bulk samples is presented. Forty-four
features from the co-occurrence matrix were extracted from bulk
samples of rice. The set of features contained redundant, noisy or
even irrelevant information, so the features were evaluated by the
standard sequential forward selection algorithm. Finally, ten
features were selected as the superior ones. A support vector
machine classifier was developed to classify the rice varieties.
The overall classification accuracy achieved was 93.55%.

Keywords- texture; co-occurrence matrix; rice classification; support vector machine

I. INTRODUCTION

Rice is one of the most important cereal grain crops. It
constitutes the world's principal source of food, being the
basic grain for the planet's largest population. For tropical
Asians it is the staple food and the major source of dietary
energy and protein. In Southeast Asia alone, rice is the staple
food for 80% of the population [1].
In current grain-handling systems, grain type and quality
are assessed by visual inspection. This evaluation process is,
however, tedious and time consuming. The decision-making
capabilities of a grain inspector can be seriously affected by
his/her physical condition (such as fatigue and eyesight),
mental state (caused by biases and work pressure), and working
conditions (such as improper lighting). Hence, there is a need
to automate the process by developing an imaging system that
acquires the rice grain images, rectifies them, and analyzes
them.
Texture is an important image feature for describing the
properties of objects in images. Image texture can be defined
as the spatial organization of intensity variations in an image
at various wavelengths, such as the visible and infrared
portions of the spectrum [2]. Image texture is an important
aspect of an image, and textural features play a major role in
image analysis [3]. These features provide summary
information derived from intensity maps of the scene, which
may be related to visual characteristics (coarseness of the
texture, regularity, presence of a privileged direction, etc.)
and also to characteristics that cannot be visually
differentiated [4]. The gray level co-occurrence matrix is a
tool for classifying textures in digital images.
During the last decades, several studies have been carried
out on the application of machine vision to the quality
evaluation of agro-food produce. Liu et al. [5] proposed an
identification method based on a neural network to classify
rice varieties using colour and shape features, with an accuracy
of 88.3%. In [6], researchers presented an algorithm for
classifying five different varieties of rice using colour and
texture features from individual rice kernels. Verma (2010)
extracted six morphological features (area, perimeter,
maximum length, maximum width, compactness, and
elongation) to classify three varieties of Indian rice; a neural
network was used, with an accuracy ranging from 90% to 95%
[7]. Pabamalie and Premaratne [8] focused on producing a
classification system based on neural network and image
processing concepts, with accuracies of 68% to 94% for the four
varieties. In another study, Van Dalen (2004) developed a
method for determining the rice size and the amount of
broken rice kernels using image analysis [9]. Regarding the
quality evaluation of rice, a new method has been developed
to estimate the breakage and fissures ratio [10]. The literature
review shows that identification of rice variety from bulk
samples has not yet been investigated. Hence, the objective of
this study was to demonstrate the effectiveness of
co-occurrence matrix features for objective identification of
rice varieties from their bulk samples.

II. MATERIALS AND METHODS

A. Grain Samples
Five Iranian rice varieties, namely Fajr, Hashemi, Mahali,
Gerde, and Domsiah, were selected for classification. The bulk
samples were prepared by pouring one kg of rice kernels into a
large plastic bag and shaking it to mix the grain thoroughly.
The rice was then slowly poured into a dish until it was
completely filled. The excess rice was gently shaken off the
dish so that the top level of the grain was almost horizontal
and level with the rim of the dish. This process was repeated
300 times for each rice variety. Thus, a total of 1500 images of
bulk samples were acquired (300 images per rice variety).
Figure 1 shows images acquired from the five selected
varieties of rice.

B. Imaging Device
A CCD colour camera (Olympus, model No. C-2000) was
used to acquire the images. The camera was mounted on a black
box equipped with a uniform lighting system. The distance
between the camera and the bulk samples was 20 cm.
C. Texture Feature Extraction
Texture is one of the most important defining
characteristics of images. It is characterized by the spatial
distribution of gray levels in a neighbourhood. There are a
number of methods for texture feature extraction, among
them geometrical, statistical, model-based and signal
processing methods. Statistical methods analyze the spatial
distribution of gray values by computing local features at
each point in the image and deriving a set of statistics from the
distributions of the local features. Geometrical methods
consider texture to be composed of texture primitives and of
rules governing their spatial organization. Model-based methods
hypothesize the underlying texture process, constructing a
parametric generative model which could have created the
observed intensity distribution. Signal processing methods
analyze the frequency content of the image. For food processing,
the most widely used approach is the statistical one, such as
the co-occurrence matrix method.

1) Gray Level Co-occurrence Matrix (GLCM) Method
The gray level co-occurrence matrix is one of the most widely
used statistical texture analysis methods, in which texture
features are extracted by statistical approaches from the
co-occurrence matrix $p(k,l)$ [2]. The use of the co-occurrence
matrix is based on the hypothesis that the same gray level
configuration is repeated in a texture. This pattern varies more
in fine textures than in coarse textures. The co-occurrence
matrix $p_{\theta,d}(i,j)$ counts the co-occurrences of pixels
with gray values $i$ and $j$ at a given distance $d$ and in a
given direction $\theta$ (Fig. 2). The direction can be selected
from four different values, 0°, 45°, 90° and 135°, while the
distance depends on the resolution of the texture.
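The counting step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `glcm_counts`, the `OFFSETS` table, the `symmetric` option (counting each pair in both directions) and the toy image are all assumptions introduced here.

```python
# Offsets (row step, column step) for the four directions at distance 1.
# Rows grow downward, so 45 degrees means one row up and one column right.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm_counts(image, levels, angle=0, d=1, symmetric=True):
    """Count co-occurrences of gray values (i, j) at distance d and direction angle."""
    dr, dc = OFFSETS[angle]
    dr, dc = dr * d, dc * d
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:  # assumption: also count the pair in reverse order
                    counts[j][i] += 1
    return counts


# Toy 4x4 image with 4 gray levels (0..3); not data from the paper.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm_0 = glcm_counts(toy, levels=4, angle=0)
glcm_90 = glcm_counts(toy, levels=4, angle=90)
```

Calling the function once per direction (0, 45, 90 and 135 degrees) gives the four matrices from which direction-specific features can then be derived.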
After the construction of the co-occurrence matrix, the
matrix is normalized as follows:

\[ p(k,l) = \frac{p(k,l)}{R} \]

where the left-hand side denotes the normalized entry and $R$
is the normalizing factor, which is set to the sum of all entries
of the matrix.
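The normalization step can be sketched as follows, together with four statistics commonly computed from a normalized GLCM (contrast, energy, entropy and homogeneity). The formulas are the standard textbook forms, which is an assumption about the exact variants used in the paper; the function names and the count matrix are hypothetical. These four statistics depend only on the difference between the indices or on the entries themselves, so 0-based indexing gives the same values as 1-based indexing.

```python
import math


def normalize(counts):
    """Divide every entry by R, the sum of all entries of the matrix."""
    R = sum(sum(row) for row in counts)
    return [[value / R for value in row] for row in counts]


def texture_stats(p):
    """Compute four statistics from a normalized co-occurrence matrix p."""
    k = len(p)
    cells = [(i, j, p[i][j]) for i in range(k) for j in range(k)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        # Zero entries are skipped because 0 * log2(0) is taken as 0.
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Hypothetical 4-level count matrix (not data from the paper).
counts = [
    [4, 2, 1, 0],
    [2, 4, 0, 0],
    [1, 0, 6, 1],
    [0, 0, 1, 2],
]
p = normalize(counts)
stats = texture_stats(p)
```

After normalization the entries of `p` sum to one, so each statistic can be read as an expectation over gray-level pairs.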
Figure 1. Images of five different rice varieties (A: Fajr, B: Hashemi, C: Mahali, D: Gerde, E: Domsiah).

Figure 2. The direction of pixel pairs and the distance d between the
pixel pairs used to construct the gray level co-occurrence matrix: (a)
illustration of direction θ and distance d in an image for the pixel
pair (x1, y1) and (x2, y2); (b)-(e) four examples at directions 0°, 45°,
90°, and 135°, respectively [11].

Using the normalized GLCM, the following textural features
were extracted, where $p_{ij}$ is the $(i,j)$ entry of the
normalized matrix and the sums run over the gray levels
$i, j = 1, \dots, k$:

Maximum probability:

\[ \max_{i,j} \, p_{ij} \]

Correlation: this shows how a pixel is correlated to its
neighbor over the entire image.

\[ \sum_{i=1}^{k} \sum_{j=1}^{k} (i\,j)\, p_{ij} \]

Mean:

\[ \mu_i = \sum_{i=1}^{k} \sum_{j=1}^{k} i \, p_{ij}, \qquad \mu_j = \sum_{i=1}^{k} \sum_{j=1}^{k} j \, p_{ij} \]

Cluster shade (CS) and cluster prominence (CP): cluster shade
and cluster prominence are measures of the skewness of the
matrix, in other words the lack of symmetry. When cluster shade
and cluster prominence are high, the image is not symmetric. In
addition, when cluster prominence is low, there is a peak in the
co-occurrence matrix around the mean value. For the image,
this means that there is little variation in gray scales.

\[ CS = \sum_{i=1}^{k} \sum_{j=1}^{k} \big( (i - \mu_i) + (j - \mu_j) \big)^{3} p_{ij} \]

\[ CP = \sum_{i=1}^{k} \sum_{j=1}^{k} \big( (i - \mu_i) + (j - \mu_j) \big)^{4} p_{ij} \]

Contrast: this measures the contrast between a pixel and
its neighbor.

\[ \sum_{i=1}^{k} \sum_{j=1}^{k} (i - j)^{2} \, p_{ij} \]

Uniformity (Energy):

\[ \sum_{i=1}^{k} \sum_{j=1}^{k} p_{ij}^{2} \]

Entropy: this measures the randomness of the GLCM
elements.

\[ - \sum_{i=1}^{k} \sum_{j=1}^{k} p_{ij} \log_2 p_{ij} \]

Homogeneity: this measures the closeness of the
distribution of elements in the GLCM to the GLCM diagonal.

\[ \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{p_{ij}}{1 + |i - j|} \]

Dissimilarity:

\[ \sum_{i=1}^{k} \sum_{j=1}^{k} p_{ij} \, |i - j| \]

So, 44 features (11 features × 4 orientations) were
extracted. After completing the feature extraction step, the
outputs were normalized to the range of 0 to 1.

D. Feature Selection
As stated in the previous section, a total of 44 features were
extracted from the images of the bulk rice samples. Naturally,
this set of features contains redundant, noisy or even
irrelevant information for classification purposes. To reduce
the features to those that contributed significantly to the
classification, the standard sequential forward selection
algorithm was implemented using the 1-nearest neighbour error
as the selection criterion [12]. The selection algorithm reduced
the features to a nearly optimal set of 10 features, which were
finally used to build the classifier.
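Sequential forward selection with a 1-nearest-neighbour criterion can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the error is estimated here by leave-one-out, which the paper does not specify, and the function names and toy data are hypothetical.

```python
def loo_1nn_error(X, y, features):
    """Leave-one-out error of a 1-nearest-neighbour classifier on the chosen features."""
    errors = 0
    for a in range(len(X)):
        best_dist, best_label = None, None
        for b in range(len(X)):
            if a == b:
                continue
            dist = sum((X[a][f] - X[b][f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[b]
        errors += best_label != y[a]
    return errors / len(X)


def sequential_forward_selection(X, y, n_select):
    """Greedily add the feature that gives the lowest 1-NN error at each step."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = min(remaining, key=lambda f: loo_1nn_error(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): feature 1 separates the classes, features 0 and 2 do not.
X = [
    [0.9, 0.10, 0.3],
    [0.1, 0.15, 0.8],
    [0.5, 0.20, 0.1],
    [0.2, 0.80, 0.7],
    [0.8, 0.85, 0.2],
    [0.4, 0.90, 0.9],
]
y = [0, 0, 0, 1, 1, 1]
chosen = sequential_forward_selection(X, y, n_select=1)
```

In the setting of the paper, `X` would hold the 44 normalized texture features per image and `n_select` would be 10. The greedy search never removes a feature once it has been added, which is why the resulting subset is described as nearly optimal rather than optimal.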

E. Support Vector Machine
Support vector machines (SVMs) are binary classifiers able
to separate data samples into two disjoint classes by a
hyperplane defined in a suitable space. The basic idea behind
this technique is that the two classes are linearly separable. In
general, more than one hyperplane satisfies this condition, and
one of them is chosen as the classifier on the basis of the
margin it creates between the two classes. A support vector
machine uses the support vectors for classification and
discards the other points. Support vectors are the points that
lie on the margin boundaries of the maximum-margin hyperplane.
SVMs can also be used for classifying data that are not linearly
separable: the data space is transformed using a kernel function
into a higher-dimensional space where the classes become
linearly separable.
Table 1. Classification accuracy using all features

Rice Variety      Classification accuracy (%)
Fajr              31.11
Neda              34.44
Hashemi           25.78
Mahali            22.22
Gerde             77.78
All varieties     38.27

Table 2. Classification accuracy using selected features

Rice Variety      Classification accuracy (%)
Fajr              88.89
Neda              91.11
Hashemi           94.44
Mahali            93.33
Gerde             100
All varieties     93.55

The classifier for the problem of binary classification is

\[ f(x) = \operatorname{sign}\left[ w^{T} \varphi(x) + b \right] \]

where the input vector $x$ is mapped into a feature space by a
mapping function $\varphi(x)$, and $w$ and $b$ are the classifier
parameters. Determining the classifier from SVM theory is
equivalent to solving the following optimization problem:

\[ \min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \xi_i \]

subject to

\[ y_i \left[ w^{T} \varphi(x_i) + b \right] \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, l \]

where $\xi_i$ is a non-negative slack variable and $C > 0$ is a
penalty parameter. This is a quadratic optimization problem
that can be solved using Lagrange multipliers. Therefore, the
hyperplane function can be written as:

\[ f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} y_i \alpha_i K(x, x_i) + b \right) \]

where

\[ K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j) \]

Support vectors have nonzero Lagrange multipliers $\alpha_i$,
and $m$ is the number of support vectors.
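To make the kernel form of the decision function concrete, the sketch below evaluates the sign of the weighted kernel sum for a new sample. It uses the RBF kernel that the paper adopts. The support vectors, labels, multipliers and bias are hand-picked hypothetical values, not the result of training and not values from the paper.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svm_decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Return sgn(sum_i y_i * alpha_i * K(x, x_i) + b) as +1 or -1."""
    score = sum(
        y_i * a_i * kernel(x, x_i)
        for x_i, y_i, a_i in zip(support_vectors, labels, alphas)
    )
    return 1 if score + b >= 0 else -1


# Hypothetical, hand-picked values (not trained, not from the paper).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
labels = [-1, +1]
alphas = [1.0, 1.0]
b = 0.0

near_first = svm_decision([0.1, 0.0], support_vectors, labels, alphas, b)
near_second = svm_decision([0.9, 1.0], support_vectors, labels, alphas, b)
```

Only the support vectors enter the sum, which is why the remaining training points can be discarded once the multipliers are known.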
SVMs were developed to solve binary classification
problems, but real problems often have more than two classes.
Two approaches have been introduced for constructing a
multi-class SVM: one combines several binary SVMs, and the
other performs direct multi-class classification. The first
approach includes the one-against-one and one-against-all
methodologies [13].
We used the RBF kernel and the one-against-one methodology.
Different kernel functions were tested for classifying the rice
bulk samples by SVM, and the RBF kernel function had the best
classification accuracy.
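The one-against-one scheme can be sketched as a majority vote over all pairwise classifiers. In this illustration the binary SVMs are replaced by simple threshold rules so that the example stays self-contained. The helper `make_threshold_classifier`, the thresholds and the one-dimensional input are hypothetical, and only the three variety names are taken from the paper.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(x, classes, binary_classifiers):
    """Let every pairwise classifier vote and return the class with the most votes."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        # Each pairwise classifier returns either a or b for the sample x.
        votes[binary_classifiers[(a, b)](x)] += 1
    return votes.most_common(1)[0][0]


def make_threshold_classifier(a, b, threshold):
    """Hypothetical stand-in for a trained binary SVM on classes a and b."""
    return lambda x: a if x[0] < threshold else b


# Three of the variety names from the paper; the thresholds are made up.
classes = ["Fajr", "Hashemi", "Mahali"]
binary_classifiers = {
    ("Fajr", "Hashemi"): make_threshold_classifier("Fajr", "Hashemi", 0.3),
    ("Fajr", "Mahali"): make_threshold_classifier("Fajr", "Mahali", 0.5),
    ("Hashemi", "Mahali"): make_threshold_classifier("Hashemi", "Mahali", 0.7),
}
n_pairs = len(list(combinations(range(5), 2)))  # pairwise SVMs needed for 5 classes
predicted = one_against_one_predict([0.4], classes, binary_classifiers)
```

With five rice varieties, this scheme needs 5 × 4 / 2 = 10 binary classifiers, one per pair of varieties.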
An RBF kernel with width $\sigma$ is defined as follows:

\[ K(x, y) = \exp\left( - \frac{\lVert x - y \rVert^{2}}{2 \sigma^{2}} \right) \]

III. RESULTS AND DISCUSSION

The objective of the presented analysis is to check whether
texture features based on the co-occurrence matrix can identify
the rice variety of bulk samples. The results obtained in this
work indicate that the SVM is able, in general, to classify rice
varieties with a success rate between 88% and 100% (see
Table 2). In the first step, all forty-four features extracted
from the co-occurrence matrix were used for classification by
the support vector machine. The classification results using
these features are shown in Table 1.
In the next step, feature selection was carried out. Ten
features were selected as the superior ones by the standard
sequential forward selection algorithm. Using these selected
features, the classification accuracy showed a considerable
increase, from 38.27% to 93.55% overall (Table 2).

IV. CONCLUSION AND FUTURE WORKS

This work presented an efficient model for identification
of rice varieties using the distribution pattern of bulk samples.
The gray level co-occurrence method was found to be an
efficient method for texture feature extraction. A support vector
machine classifier was designed for rice classification using
all extracted features and using the top selected features. The
total classification rate on the test data set showed an accuracy
of 93.55%. In this work, gray scale images were considered for
processing. The authors recommend that the potential of colour
features for better classification be evaluated in future work.
It is also recommended that other intelligent classifiers, such
as artificial neural networks, be evaluated.

REFERENCES

[1] V. N. Nguyen, FAO Rice Information. FAO, 2000.
[2] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man and Cybernetics, vol. 3, pp. 610-621, 1973.
[3] J. Li, J. Tan, F. A. Martz, and H. Heymann, "Image texture features as indicators of beef tenderness," Meat Science, vol. 53, pp. 17-22, 1999.
[4] O. Basset, B. Buquet, S. Abouelkaram, P. Delacharte, and J. Culioni, "Application of texture image analysis for the classification of bovine meat," Food Chemistry, vol. 69, pp. 437-445, 2000.
[5] Z. Liu, F. Cheng, Y. Ying, and X. Rao, "Identification of rice seed varieties using neural network," Journal of Zhejiang University Science B, vol. 6, pp. 1095-1100, 2005.
[6] S. J. Mousavi Rad, F. Akhlaghian Tab, and K. Mollazade, "Classification of rice varieties using optimal color and texture features and BP Neural Network," presented at the 7th Iranian Conference on Machine Vision and Image Processing, Iran University of Science and Technology, 2011.
[7] B. Verma, "Image processing techniques for grading & classification of rice," 2010, pp. 220-223.
[8] L. Pabamalie and H. Premaratne, "An intelligent rice quality classifier," International Journal of Internet Technology and Secured Transactions, vol. 3, pp. 386-406, 2011.
[9] G. Van Dalen, "Determination of the size distribution and percentage of broken kernels of rice using flatbed scanning and image analysis," Food Research International, vol. 37, pp. 51-58, 2004.
[10] F. Courtois, M. Faessel, and C. Bonazzi, "Assessing breakage and cracks of parboiled rice kernels by image analysis techniques," Food Control, vol. 21, pp. 567-572, 2010.
[11] C. Zheng, D. W. Sun, and L. Zheng, "Recent applications of image texture for evaluation of food qualities-a review," Trends in Food Science & Technology, vol. 17, pp. 113-128, 2006.
[12] A. Jain and D. Zongker, "Feature selection: Evaluation, application, and small sample performance," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 153-158, 1997.
[13] A. Mucherino, P. J. Papajorgji, and P. M. Pardalos, Data Mining in Agriculture, vol. 34. Springer Verlag, 2009.
