Robust Traffic Sign Recognition Based on Color Global and Local Oriented Edge Magnitude Patterns
Xue Yuan, Xiaoli Hao, Houjin Chen, and Xueye Wei
Abstract: Most of the existing traffic sign recognition (TSR) systems make use of the inner region of the signs or of local features such as Haar, histograms of oriented gradients (HOG), and the scale-invariant feature transform for recognition, whereas these features remain limited in handling rotation, illumination, and scale variations. A good traffic sign feature should be both discriminative and robust. In this paper, a novel Color Global and Local Oriented Edge Magnitude Pattern (Color Global LOEMP) is proposed. The Color Global LOEMP is a framework that is able to effectively combine color, global spatial structure, global direction structure, and local shape information and to balance the two concerns of distinctiveness and robustness. The contributions of this paper are as follows: 1) color angular patterns are proposed to provide color distinguishing information; 2) a context frame is established to provide global spatial information; because the context frame is established from the shape of the traffic sign, the cells remain well aligned with the inside part of the traffic sign even when rotation and scale variations occur; and 3) a LOEMP is proposed to represent each cell. In each cell, the distribution of the orientation patterns is described by the HOG feature, and each direction of the HOG is then represented in detail by the occurrence histogram of local binary patterns in that direction. Experiments are performed to validate the effectiveness of the proposed approach in TSR systems, and the experimental results are satisfying, even for images containing traffic signs that have been rotated, damaged, altered in color, or that have undergone affine transformations, as well as images photographed under different weather or illumination conditions.

Index Terms: Histogram of oriented gradients (HOG), local binary pattern (LBP), rotation invariant, traffic sign recognition (TSR).
I. INTRODUCTION

AT PRESENT, intelligent transportation system technology is developing at a very rapid pace. Traffic problems, such as driving safety, city traffic congestion, and transportation efficiency, are expected to be alleviated through the application of information technology and the intelligent transportation of vehicles. As an important subsystem in intelligent transportation system technology, traffic sign recognition (TSR) systems based on computer vision have gradually become an important research topic in the field of intelligent transportation system technology [1]-[10].

Manuscript received February 25, 2013; revised July 17, 2013, October 8, 2013, and December 14, 2013; accepted January 6, 2014. This work was supported in part by the Specialized Research Fund for the Doctoral Program of Higher Education under Grants 20110009120003 and 20110009110001, by the National Natural Science Foundation of China under Grants 61301186 and 61271305, and by the School Foundation of Beijing Jiaotong University under Grants W11JB00460 and 2010JBZ010. The Associate Editor for this paper was S. S. Nedevschi.

X. Yuan is with the School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing 100044, China, and also with the Chinese Academy of Surveying and Mapping, Beijing 100830, China (e-mail: xyuan@bjtu.edu.cn).

X. Hao, H. Chen, and X. Wei are with the School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing 100044, China.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TITS.2014.2298912
Traffic sign images are generally obtained from outdoor natural scenes by means of cameras installed on vehicles, and the images are then input to a computer for processing. Due to the many complicated factors present outdoors, outdoor environments are much more complex and challenging than indoor ones. The main difficulty of TSR systems is how to extract a descriptor that is both robust and rich in information under various lighting conditions, shape rotations, affine transformations, dimension changes, and so on.

Through in-depth study of traffic sign data sets, some common characteristics of traffic signs may be observed, such as the following.

1) Some traffic signs have the same local features but different background colors [see Fig. 1(a)].

2) Some traffic signs have the same local features and background colors but different distributions of global shapes [see Fig. 1(b)].

3) Some traffic signs share the same components but have different meanings [see Fig. 1(c)].
The proposed system consists of the following two stages: detection, performed using maximally stable extremal regions (MSERs), and recognition, performed with the novel Color Global and Local Oriented Edge Magnitude Pattern (Color Global LOEMP) features, which are classified using a support vector machine (SVM). To the best of our knowledge, this is the first paper that adopts local binary pattern (LBP)-based features for TSR; these features are more discriminative and more robust to shape rotations, various illumination conditions, and scale changes in traffic sign images than the features used in existing systems. The remainder of this paper is organized as follows. In Section II, we review previous work and describe our improvements. Section III presents the traffic sign detection algorithm, and Section IV details the Color Global LOEMP feature extraction. Recognition based on SVMs is presented in Section V. The experimental results are presented in Section VI. Finally, a conclusion is presented in Section VII.
II. RELATED WORK
In recent years, research on TSR has grown rapidly due to the significant need for such systems in future vehicles. The most common approach consists of two main stages: detection and recognition. The detection stage identifies the regions of interest and is performed mostly using color segmentation, followed by some form of shape recognition. The detected candidates are then either identified or rejected during the recognition stage. The features used in the recognition stage are the inner components of traffic signs, Haar features, and HOG features. Classifiers such as SVMs [1]-[3], neural networks [4], and fuzzy regression tree frameworks [5] have been reported in recent papers.

Fig. 1. Examples of traffic signs. (a) Traffic signs with the same local texture patterns but different background colors. (b) Traffic signs with the same local texture patterns and background colors but different distributions of global geometrics. (c) Traffic signs that share the same components but have different meanings.
In the detection stage, the majority of the systems make use of color information as a method for segmenting the image. The performance of color-based traffic sign detection is often reduced in scenes with strong illumination, poor lighting, or adverse weather conditions. Color models, such as hue-saturation-value (HSV), YUV, Y'CbCr, and CIECAM97, have been used in an attempt to overcome these issues. For example, Gao et al. [6] proposed a TSR system based on the extraction of the red and blue color regions in the CIECAM97 color model. References [1] and [2] extracted color information for traffic sign detection in the HSV color model. Reference [4] detected potential road signs by means of the distribution of red pixels within the image in the Y'CbCr color model. In contrast, there are several approaches that ignore color information entirely and instead use only shape information from gray-scale images. For example, Loy [7] proposed a system that used local radial symmetry to highlight the points of interest in each image and detect octagonal, square, and triangular traffic signs. Recently, Greenhalgh and Mirmehdi [3] proposed a traffic sign detection algorithm using a novel application of MSERs, and the authors validated the efficiency of the MSER method.
In the recognition stage, the majority of the systems make use of the inner region of the sign. For example, Fleyeh and Davami [8] extracted the binary inner component of traffic signs for recognition. They performed size and rotation normalization before recognition in order to reduce the effects caused by shape rotation, affine transformation, and dimension changes. Then, they used the principal component analysis algorithm to determine the most effective eigenvectors of the traffic signs. Escalera et al. [9] indicated that a sign is the sum of a color border, an achromatic (white and/or black) inner component, and a shape. They proposed a method that computes color energy, chromatic energy, gradient energy, and distance energy, together with two techniques for finding the minima of the energy function, which correspond to traffic sign regions, namely, simulated annealing and genetic algorithms. Maldonado-Bascón et al. also proposed a TSR system using gray images of inner regions [1], [2], in which the gray images were normalized and the contrasts were stretched and equalized before recognition to reduce the effect of illumination variations.
Some recent systems have made use of HOG, Haar, and scale-invariant feature transform (SIFT) features for TSR. For example, Ruta et al. [5] extracted Haar and HOG features from traffic sign images. Greenhalgh and Mirmehdi [3] proposed a TSR algorithm using HOG features. Takaki et al. [10] and Ihara et al. [11] proposed a TSR method based on keypoint classification by SIFT. In their system, two different feature subspaces are constructed from gradient and general images, and detected keypoints are then projected onto both subspaces; SIFT is a local descriptor that remains largely unchanged across different scales and small rotation angles. Abdel-Hakim and Farag [12] proposed a traffic sign detection and recognition technique by augmenting SIFT with new features related to the color of the keypoints. In [13], Yuan et al. proposed a context-aware SIFT-based algorithm for TSR. Furthermore, a method for computing the similarity between two images was proposed in their paper, which focuses on the distribution of the matching points rather than using the traditional SIFT approach of selecting the template with the maximum number of matching points as the final result. However, some issues still remain when illumination and rotation variations occur. For example, the performance of inner-region-based TSR is often reduced in scenes with various illumination conditions, rotation and scale variations, and affine transformations. HOG [14] is very effective in capturing gradients, which have long been known to be crucial for vision and are robust to appearance and illumination changes. However, images are clearly more than just gradients, and traditional HOG remains limited in handling rotation variations. SIFT was proposed to address these problems and is robust to various illumination conditions, shape rotations, and scale changes. However, the dimension of the SIFT feature depends on the number of detected keypoints, and the number of keypoints differs among images. Because the feature dimensions differ, it is difficult to design a suitable classifier based on the SIFT feature.
In this paper, an LBP-based feature is proposed that is robust to shape rotations, various illumination conditions, and scale changes in traffic sign images. Furthermore, the LBP-based feature is more suitable for combining with common classifiers.

Fig. 2. Examples in the process of traffic sign detection. (a) Original images. (b) Normalized red/blue images. (c) MSERs (each level of the MSER is painted a different color). (d) Borders of the MSERs. (e) Detection results.
Ojala et al. first proposed the concept of LBP [15], which may be converted to a rotation-invariant version for application in texture classification. Various extensions of LBP, such as LBP variance with global matching [16], dominant LBPs [17], completed LBPs [18], and the joint distribution of local patterns with Gaussian mixtures [19], have been proposed for rotation-invariant texture classification.
To the best of our knowledge, there has been no public report of using LBP for TSR; this is due to the fact that the following issues remain in traditional LBP.

1) LBP is unable to provide color information.

2) LBP only focuses on local textures while ignoring the distribution of global shapes.

3) The rotation-invariant version of LBP proposed in [15], named LBP^riu2, has a very small size, and such a small feature cannot effectively represent a complex traffic sign image.
In this paper, a novel traffic sign descriptor known as Color Global LOEMP is proposed, which is robust to illumination conditions, scale, and rotation variations and balances the two concerns of distinctiveness and robustness. The main contributions of this paper are as follows.

1) The proposed color angular feature is able to exploit the discriminative color information derived from the spatiochromatic texture patterns of different spectral channels in a local region, which reflects the abundant color information of traffic signs.

2) A novel context frame is established to provide global spatial information. The context frame is established from the shape of the traffic sign, thus allowing the cells to be aligned well with the inside part of the traffic sign, even when rotation and scale variations occur.

3) A novel descriptor known as LOEMP is extracted from each cell, which is robust to lighting conditions and rotation variations. We apply the concept of calculating both the HOG-based structure, to describe the distribution of the holistic orientations, and the LBP-based structure, to describe the distribution of the local shape for each orientation.

Fig. 3. Flow of the proposed TSR system.
Comparative experiments were performed to test the effectiveness of the proposed Color Global LOEMP. The public traffic sign data sets used were the Spanish traffic sign set [20], the German Traffic Sign Recognition Benchmark (GTSRB) data set [21], and an image data set captured from a moving vehicle on cluttered Chinese highways. The experimental results show that the proposed Color Global LOEMP feature is able to yield excellent performance when applied to challenging traffic sign images.
III. TRAFFIC SIGN DETECTION
This paper uses the MSER method and shape information to extract traffic signs. First, the candidate regions of the traffic signs are detected as MSERs [see Fig. 2(c), where each MSER level is painted a different color]; MSERs are regions that maintain their shapes when the image is thresholded at several levels. Then, the border of each MSER is extracted [see Fig. 2(d)]. Finally, elliptical, triangular, quadrilateral, and octagonal regions are located as the candidate regions for further recognition [see Fig. 2(e)]. Examples of this traffic sign detection process are illustrated in Fig. 2, and each step is presented in detail as follows.
Greenhalgh and Mirmehdi [3] proposed a traffic sign detection algorithm using a novel application of MSERs, and they proved that MSERs are robust to variations in both lighting and contrast. In this paper, MSERs are adopted for detecting the traffic sign candidate regions. For each pixel of the original image, values are found for the ratio of the blue channel to the sum of all channels and the ratio of the red channel to the sum of all channels. The greater of these two values is used as the pixel value of the normalized red/blue image, i.e.,

\Omega_{RB} = \max\left(\frac{R}{R+G+B}, \frac{B}{R+G+B}\right).    (1)
MSERs are found for the image [see Fig. 2(b)]. Each image
is binarized at a number of different threshold levels, and the
connected components at each level are found. The connected
components that maintain their shape through several threshold
levels are selected as MSERs [see Fig. 2(c)].
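As a rough, hedged illustration of this detection front end (not the authors' implementation), the normalized red/blue image of (1) and the MSER step could be sketched as follows, with OpenCV's built-in MSER detector standing in for the multi-threshold procedure described above; the function names are ours.

```python
import cv2
import numpy as np

def normalized_red_blue(bgr):
    """Normalized red/blue image of Eq. (1): per pixel, the larger of
    R/(R+G+B) and B/(R+G+B)."""
    img = bgr.astype(np.float32)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    s = r + g + b + 1e-6                       # avoid division by zero
    return np.maximum(r / s, b / s)

def detect_candidate_regions(bgr):
    """Detect MSERs on the normalized red/blue image (illustrative only)."""
    omega = normalized_red_blue(bgr)
    gray = (255.0 * omega / max(omega.max(), 1e-6)).astype(np.uint8)
    mser = cv2.MSER_create()                   # OpenCV MSER as a stand-in
    regions, _ = mser.detectRegions(gray)
    return regions                             # candidate pixel lists for the shape tests

# usage sketch: regions = detect_candidate_regions(cv2.imread("frame.jpg"))
```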
An efficient ellipse detection method [22] is adopted to locate ellipses from the MSER borders, and the general regular polygon detector proposed by Loy [7] is adopted to detect the triangular, quadrilateral, and octagonal regions.

It should be noted that the detected traffic sign candidate regions contain many noise blobs; the recognition system is used to judge whether a candidate region is a traffic sign. Examples of extracted traffic signs are shown in Fig. 2(e).
IV. COLOR GLOBAL LOEMP FEATURE EXTRACTION
The flow of the proposed TSR system is illustrated in Fig. 3. First, the input color image is divided into three color components, and the color angle patterns are computed as chromatic information. Then, a context frame is used to divide the traffic sign region into several cells, in order to describe the global spatial structure of each color angle pattern. After that, the LOEMP is extracted from each cell, and the LOEMPs extracted from all the color angle patterns and cells are combined as the final descriptor. Finally, an SVM is used as the classifier.
A. Global Feature Extraction
1) Color Angle Patterns: For the purpose of extracting the discriminative patterns contained among the different spectral bands, the ratio of the pixel values between a pair of the spectral-band images is calculated (see Fig. 4), using a method proposed by Choi et al. [23]. This directional information may be useful for extracting discriminative color angular patterns for classification.

The ratio of the pixel values between the spectral bands is defined as

\rho_{(i,j)} = \frac{v_j}{v_i + \varepsilon}, \quad \text{for } i < j,\ i = 1, \ldots, K    (2)
where v_i and v_j are the elements of the color vector c associated with the ith and jth spectral bands of the color image, respectively. Note that \varepsilon is a small-valued constant used to avoid a zero-valued denominator. The color angle between the ith and jth spectral bands is computed as

\theta_{(i,j)} = \tan^{-1}\left(\rho_{(i,j)}\right)    (3)

where the values of \theta_{(i,j)} fall between 0° and 90°. Note that, as shown in Fig. 4, \theta_{(i,j)} represents the angle computed between the axis corresponding to the ith spectral band and the reference line, which is formed by projecting C onto the plane associated with the ith and jth spectral bands.

Fig. 4. Illustration of extracting the color angular patterns from pixel C obtained from three color bands.

Fig. 5. Traffic sign detection and establishing the context frame.
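To make the color angle computation concrete, the following is a minimal sketch of (2) and (3) for the band pairs θ_RG and θ_BG used later in the experiments; the band ordering within each pair and the value of ε are our assumptions, and the function name is hypothetical.

```python
import numpy as np

EPS = 1e-6  # the small constant epsilon from Eq. (2); an assumed value

def color_angle_patterns(bgr):
    """Color angle maps theta_(i,j) of Eqs. (2)-(3) for the band pairs
    (R, G) and (B, G)."""
    img = bgr.astype(np.float32)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    theta_rg = np.degrees(np.arctan(g / (r + EPS)))   # angle for the R-G pair
    theta_bg = np.degrees(np.arctan(g / (b + EPS)))   # angle for the B-G pair
    return theta_rg, theta_bg                          # values lie in [0, 90) degrees
```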
2) Global Spatial Structure: In this paper, a novel context frame, which is robust to image rotation and scale variations, is established to provide global spatial structure information. In order to establish a more robust context frame, a frame of overlapping cells is built in polar coordinates. As shown in Figs. 5 and 6, the proposed context frame divides the image into M × N cells in polar coordinates [Fig. 5(a2)-(c2) shows examples of 2 × 4 cells, and Fig. 6 shows an example of 3 × 8 cells], where M is the number of cells on the radial coordinate and N is the number of cells on the angular coordinate. The proposed implementation is not exactly a logarithmic polar coordinate, since the radial increment is uniform.

Fig. 6. Context frame model. θ0 is the initial angle of the context frame model.

It should be noted that the cells overlap each other on both the radial and angular coordinates; the overlapping rate on the radial coordinate is denoted Lo, and the overlapping rate on the angular coordinate is denoted Ao. The initial angle θ0 used to build the context frame is equal to the angle between one border of the quadrilateral, octagonal, or triangular sign and the horizontal x-axis [see Figs. 5(a2) and (c2) and Fig. 6]. The context frame is established from the shape of the traffic sign, thus allowing the cells to be aligned well with the inside part of the traffic sign, even when rotation and scale variations occur.
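The following sketch shows one plausible way to assign a pixel to a polar cell of the context frame; it is illustrative only, assumes uniform radial increments, and omits the overlapping cells controlled by Lo and Ao.

```python
import numpy as np

def polar_cell_index(x, y, cx, cy, radius, theta0, M=3, N=12):
    """Map a pixel (x, y) inside a detected sign (center (cx, cy), outer
    radius `radius`, initial angle theta0 in degrees) to one of M x N
    non-overlapping polar cells; overlap (Lo, Ao) is not modeled here."""
    dx, dy = x - cx, y - cy
    r = np.hypot(dx, dy)
    ang = (np.degrees(np.arctan2(dy, dx)) - theta0) % 360.0
    m = min(int(M * r / (radius + 1e-6)), M - 1)   # radial bin (uniform increments)
    n = min(int(N * ang / 360.0), N - 1)           # angular bin measured from theta0
    return m, n
```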
B. Local Feature Extraction
Ojala et al. [15] proposed the rotation-invariant LBP in 2002. In order to build rotation-invariant features that retain distinctiveness for each cell, the authors propose calculating both the HOG-based structure, to describe the distribution of the holistic orientations, and the LBP-based structure, to describe the distribution of the local shape for each orientation. Then, the sequence of the bins of the holistic HOG is adjusted so that all histograms originate from their principal orientation. Based on the result of this step, all identical traffic sign images may be considered to be at the same rotation. Finally, the LBP codes of each orientation are integrated based on the distribution of the holistic edge information.
The image gradient is computed in one cell, and the gradient orientation of each pixel is evenly discretized over 0°-180°. Then, a HOG is formed from the gradient orientations of the image. As shown in Fig. 7, the HOG has K bins covering the 180° range of orientations. Each sample added to the histogram is weighted by its gradient magnitude. The maximal bin of the HOG is assigned as the principal orientation \theta_{main}. Then, all the bins of the histogram are shifted until the principal orientation is moved to the first position.
Fig. 7. Processes of building the LOEMP.
Supposing one cell is of size X × Y, after computing the image gradient and discretizing the orientation of each pixel, the entire cell is represented by building a histogram as

H(\theta_k) = \sum_{x=1}^{X} \sum_{y=1}^{Y} m(x, y)\, f(\theta_k, \theta_p), \quad k \in [1, K]    (4)

where \theta_p is the quantized orientation of pixel P, K is the number of bins of the histogram, m(x, y) is the gradient magnitude of pixel P, and f is defined as

f(a_1, a_2) = \begin{cases} 1, & \text{if } a_1 = a_2 \\ 0, & \text{otherwise.} \end{cases}    (5)
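A compact sketch of the per-cell HOG of (4) and of the principal-orientation shift described above is given below, under the assumptions of unsigned gradients and K bins over 180°; it is illustrative, not the authors' code.

```python
import numpy as np

def cell_hog(mag, ori, K=14):
    """Per-cell HOG of Eq. (4): ori holds orientations in [0, 180) degrees,
    and each pixel's vote is weighted by its gradient magnitude."""
    bins = np.minimum((ori / 180.0 * K).astype(int), K - 1)
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=K)

def align_to_principal_orientation(hist):
    """Circularly shift the histogram so that its maximal bin (the principal
    orientation theta_main) moves to the first position (rotation compensation)."""
    return np.roll(hist, -int(np.argmax(hist)))
```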
The second step is to compute the LBP at each pixel, which
is the same as the traditional LBP.
Finally, we build the occurrence histogram of LBP for each direction \theta_k (see Fig. 7). This procedure is applied to the accumulated gradient magnitudes across the different directions to build the LOEMP features. A LOEMP feature is calculated for each discretized direction \theta_k, i.e.,

\mathrm{LOEMP}_{\theta_k} = \sum_{x=1}^{M} \sum_{y=1}^{N} f(\theta_k, \theta_p)\, \mathrm{LBP}^{\theta_k}_{P,R}, \quad k \in [1, K]

where

\mathrm{LBP}^{\theta_i}_{P,R} = \sum_{j=1}^{P} f(\theta_i, \theta_p)\, S(m_j - m_c)\, 2^{j-1}

where \theta_p is the quantized orientation of each pixel P; K is the number of bins of the histogram; m_c and m_j are the gradient magnitudes of the central pixel c and the surrounding pixel j, respectively; and S(\cdot) thresholds the difference of the two gradient magnitudes, i.e.,

S(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0. \end{cases}    (6)

The final feature is determined as

\mathrm{GLOEMP} = \{\mathrm{LOEMP}_{\theta_1}, \mathrm{LOEMP}_{\theta_2}, \ldots, \mathrm{LOEMP}_{\theta_K}\}.
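Putting the pieces together, the following is a hedged sketch of a per-cell LOEMP descriptor. It assumes an 8-neighbor LBP at radius 1 computed on the gradient magnitudes (the paper uses R = 2), accumulates full 256-value LBP codes rather than the 10-bin LBP^riu2 histograms used in the experiments, and gates each pixel only by the center pixel's orientation bin instead of applying f(θ_i, θ_p) to every neighbor; it conveys the structure of the computation rather than reproducing the exact descriptor.

```python
import numpy as np

def cell_loemp(mag, ori, K=14, P=8):
    """Per-cell LOEMP sketch: for each discretized direction k, accumulate an
    occurrence histogram of LBP codes (built from gradient magnitudes) over the
    pixels whose orientation falls in bin k. Returns a flattened (K * 2**P) vector."""
    H, W = mag.shape
    bins = np.minimum((ori / 180.0 * K).astype(int), K - 1)
    # 8 neighbors at radius 1 (simplification; the paper uses radius R = 2)
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    feat = np.zeros((K, 2 ** P))
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            k = bins[y, x]
            code = 0
            for j, (dy, dx) in enumerate(offs):
                # S(m_j - m_c): 1 if the neighbor magnitude >= center magnitude
                if mag[y + dy, x + dx] >= mag[y, x]:
                    code |= 1 << j
            feat[k, code] += 1
    return feat.ravel()
```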
C. Properties of the Color Global LOEMP Feature

The Color Global LOEMP feature possesses the following three properties.

1) Color angle patterns that are able to encode the discriminative features derived from the spatiochromatic patterns of different spectral channels within a certain local region. This enables the descriptor to contain richer information than other LBP-based features.

2) A framework that is able to effectively combine color, global spatial structures, global direction structures, and local shape information. This enables it to contain richer image information.

3) A global-level rotation compensation method, which shifts the principal orientation of the HOG to the first position, making Color Global LOEMP robust to rotations.

The first two properties allow the features to convey rich image information, and the third one allows the algorithm to be robust to exterior variations.
V. RECOGNITION BASED ON SVMS
Recognition is implemented as multiclass classification with a linear SVM, as presented in [25], and the LibLinear library [26] is used in our system.

The recognition stage input is a vector of Color Global LOEMPs. To search for the decision region, all feature vectors of a specific class are grouped together against all vectors corresponding to the rest of the classes (including noisy objects here), following the one-versus-all classification algorithm, so that the system can recognize every sign.
To optimize the performance of the linear SVM classifier, an appropriate value of the regularization (cost) parameter C has to be selected. Cross validation on the training set is performed, and the value of C that produces the highest cross-validation accuracy is used. In this paper, C is set to 1.2.
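As an illustration, a minimal training sketch using scikit-learn's liblinear-backed LinearSVC is given below; the paper specifies only a linear SVM trained one-versus-all with C = 1.2, so the remaining details (library, helper name) are our assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_sign_classifier(features, labels, C=1.2):
    """features: (n_samples, n_dims) Color Global LOEMP vectors;
    labels: class ids, with one extra class reserved for noisy (non-sign) blobs.
    LinearSVC uses liblinear and a one-vs-rest scheme internally."""
    clf = LinearSVC(C=C)
    clf.fit(np.asarray(features), np.asarray(labels))
    return clf

# usage sketch: predictions = train_sign_classifier(X_train, y_train).predict(X_test)
```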
VI. EXPERIMENTS
In order to evaluate the effectiveness of the proposed method, a series of comparative experiments was performed using several traffic sign data sets. The experiments included two components, namely, traffic sign detection and classification.
A. Databases
We illustrate the effectiveness of the detection module by presenting experiments on two traffic sign data sets: the Spanish traffic sign set and the authors' data set. The effectiveness of our recognition module is illustrated by presenting experiments on two traffic sign data sets: the GTSRB data set and the authors' data set.
1) Spanish Traffic Sign Set [20]: Many sequences on different routes and under different lighting conditions were captured. Each sequence included thousands of images. With the aim of analyzing the most problematic situations, 313 images were selected from the thousands of 800 × 600 pixel images extracted in several sets in [20]. The images presented different detection and recognition problems, such as low illumination, rainy conditions, arrays of signs, similar background colors, and occlusions.
2) GTSRB Data Set [21]: The GTSRB data set was created from approximately 10 h of video that was recorded while driving on different road types in Germany during the daytime. The sequences were recorded in March, October, and November. The testing set contains 12 630 traffic sign images of the 43 classes, and the training set contains 39 209 training images.
3) Authors' Data Set: The authors collected a data set by capturing images on different roads and under different lighting conditions. The camera images have a resolution of 1024 × 768 pixels. Each sequence included several thousand frames, among which more than 5000 frames were analyzed. Visibility status included occluded, blurred, shadowed, and visible. In order to evaluate the effectiveness of the recognition module, the traffic signs were divided into two sets, known as the testing set and the training set. The testing set contains 4540 actual traffic signs, and the training set contains 4605 actual traffic signs in 41 classes; the training images were captured on routes different from those of the test images.
It is important to note that all the aforementioned data sets are unbalanced, and the number of images representing the different classes varies. Examples of the test images in the aforementioned detection databases are shown in Fig. 8(a) and (b), and examples of the test images used in the recognition database are shown in Fig. 9. As shown in Figs. 8(a) and (b) and 9, these images vary in rotation angle, geometric deformation, occlusion, and shadow, according to different weather and lighting conditions.
Fig. 8. Examples used in the experiments for traffic sign detection. (a) Examples from the Spanish traffic sign set. (b) Examples from the authors' data set.

Fig. 9. Examples used in the experiments for traffic sign recognition.
B. Experiments for Traffic Sign Detection

All the elliptical, triangular, quadrilateral, and octagonal regions were detected from the borders of MSERs. The traffic sign regions in the two data sets were all manually labeled by the authors; the sizes of the traffic signs varied between 15 × 15 and 156 × 193 pixels. The manually labeled traffic sign regions were used to evaluate the efficiency of the detection system.

The accuracy of the traffic sign detection was evaluated by observing the outputs of both the detection and recognition modules. If the final result was identified as a traffic sign, then it was considered a detection. If the algorithm failed to detect a sign that was present in the test image, then it was a miss. Finally, if the system detected a non-road-sign object and classified it as a traffic sign, then it was a false alarm. Tables I and II summarize the results generated during the detection processing of the two data sets and include the following information: 1) the total number of traffic signs that appear in the test sequence; 2) the number of traffic signs that were correctly detected by both the detection and recognition modules; 3) the number of false alarms in the output of the system; and 4) the number of misses.

TABLE I. TRAFFIC SIGN DETECTION RESULTS ON THE SPANISH TRAFFIC SIGN DATA SET

TABLE II. TRAFFIC SIGN DETECTION RESULTS ON THE AUTHORS' DATA SET

After the traffic sign detection step was completed, the traffic sign candidate regions were input into the recognition module for further classification.
C. Experiments for Traffic Sign Classification

Several comparative experiments were performed to validate the effectiveness of the proposed approach in TSR systems. The accuracy rate is computed in each comparative experiment with the formula

\text{Accuracy rate} = \frac{n_m}{n_t}

where n_m is the number of correctly classified images, and n_t is the number of test images.
1) Comparison With the LBP-Based Features: For comparison purposes, comparative experiments using nine sorts of LBP-based features (LBP^{u2}_{P,R}, LBP^{riu2}_{P,R}, Color + LBP^{riu2}_{P,R}, Global LBP^{riu2}_{P,R}, Color + Global LBP^{riu2}_{P,R}, LOEMP, Color + LOEMP, Global LOEMP, and Color + Global LOEMP) were performed to confirm the effectiveness of the proposed features, where the LBP parameters R and P were set to 2 and 8, respectively.
Additionally, in order to obtain the global spatial structure, the context frame was used to divide the whole image into M × N overlapping cells in polar coordinates, where M is the number of cells on the radial coordinate and N is the number of cells on the angular coordinate. In order to build the LOEMP feature, the HOG has K bins covering the 180° range of orientations. The number of cells (M × N) and the number of HOG bins (K) are two parameters that impact the performance of the proposed method. Fig. 10(a) and (b) shows the recognition rate obtained by varying the number of cells and HOG bins. As expected, a cell size that is too large or too small results in a decreased recognition rate because of the loss of spatial information or the sensitivity to local variations. A smaller number of HOG bins loses discriminative information, whereas a larger number increases the computational cost. Considering the tradeoff between the recognition rate and the computational cost, in the following experiments the traffic sign images were divided into 3 × 12 cells in polar coordinates, where the overlapping rates on both the angular and radial coordinates (Lo, Ao) were set to 0.5. The number of HOG bins was set to 14.
The comparison features are presented as follows.

LBP^{u2}_{8,2}: Extraction of the LBP^{u2}_{8,2} (uniform LBP) feature and computation of the occurrence histogram from the whole traffic sign region as the final descriptor for classification. The length of the LBP^{u2}_{8,2} feature vector was 59.

LBP^{riu2}_{8,2}: Extraction of the LBP^{riu2}_{8,2} (uniform rotation-invariant LBP) feature and computation of the occurrence histogram from the whole traffic sign region as the final descriptor for classification, resulting in feature vectors of length 10.

Color + LBP^{riu2}_{8,2}: Extraction of the LBP^{riu2}_{8,2} feature and computation of the occurrence histogram from each color angle pattern of the whole traffic sign region, followed by the combination of the occurrence histograms of all the color angle patterns as the final descriptor for classification. The length of the Color + LBP^{riu2}_{8,2} feature vector was 2 × 10 = 20. It should be noted that only two color angle patterns (θ_RG and θ_BG) were used in this experiment.

Global LBP^{riu2}_{8,2}: Division of the traffic sign region into several cells by the context frame, then extraction of the LBP feature and computation of the occurrence histogram from each cell, followed by the combination of the occurrence histograms of all the cells as the final descriptor for classification. The length of the Global LBP^{riu2}_{8,2} feature vector was 360.

Color + Global LBP^{riu2}_{8,2}: Division of the traffic sign region into several cells by the context frame, then extraction of the LBP feature and computation of the occurrence histogram from each cell and each color angle pattern, followed by the combination of the occurrence histograms of all the color angle patterns and cells as the final descriptor for classification. The length of the feature vector was 720.

LOEMP: Extraction of the LOEMP feature and computation of the occurrence histogram from the whole traffic sign region as the final descriptor for classification, resulting in feature vectors of length 14 × 10 = 140.

Color + LOEMP: Extraction of the LOEMP feature and computation of the occurrence histogram from each color angle pattern, followed by the combination of the occurrence histograms of all the color angle patterns as the final descriptor for classification, resulting in feature vectors of length 140 × 2 = 280.
Global LOEMP: Division of the traffic sign region into several cells by the context frame, then extraction of the LOEMP feature and computation of the occurrence histogram from each cell, followed by the combination of the occurrence histograms of all the cells as the final descriptor for classification. The length of the feature vector was 140 × 36 = 5040.

Color + Global LOEMP: Division of the traffic sign region into several cells by the context frame, then extraction of the LOEMP feature and computation of the occurrence histogram from each cell and each color angle pattern, followed by the combination of the occurrence histograms of all the color angle patterns and cells as the final descriptor for classification. The length of the Color + Global LOEMP feature vector was 10 080.

Fig. 10. TSR rates with different parameters. (a) TSR rates with different cell numbers. (b) TSR rates with different HOG bin numbers.

TABLE III. EXPERIMENTAL RESULTS OF COMPARATIVE EXPERIMENTS ON THE AUTHORS' DATA SET
Table III presents the recognition results of the comparison experiments for the LBP-based features on the authors' data set. As shown in Table III, the proposed Color Global LOEMP attains the highest recognition rate among all the LBP-based feature extraction methods.
2) Comparison With Other Features: For comparison purposes, comparative experiments using 1) candidate traffic sign regions (64 × 64 and 32 × 32 pixels) [1], [2]; 2) two sorts of HOG features (set 1 and set 2) [3], [5]; and 3) color histograms were performed to confirm the effectiveness of the proposed Color Global LOEMP.

Candidate traffic sign regions: The recognition stage inputs were candidate traffic sign regions that were scaled to sizes of 64 × 64 and 32 × 32 pixels in gray-scale images.
TABLE IV. EXPERIMENTAL RESULTS OF COMPARISON WITH OTHER DESCRIPTORS ON THE GTSRB DATA SET

TABLE V. EXPERIMENTAL RESULTS OF COMPARISON WITH OTHER DESCRIPTORS ON THE AUTHORS' DATA SET

HOG descriptors: Based on the gradients of the color images, different weighted and normalized histograms were calculated, first for the small nonoverlapping cells of multiple pixels that cover the whole image (set 1) and then for the larger overlapping blocks that integrate over multiple cells (set 2). Two sets of features from differently configured HOG descriptors were used. To compute the HOG descriptors, all images were scaled to a size of 128 × 128 pixels. For sets 1 and 2, the sign of the gradient response was ignored. Sets 1 and 2 used cells of size 16 × 16 pixels, a block size of 4 × 4 cells, and an orientation resolution of 8.

Color histograms: This set of features was provided to complement the gradient-based feature sets with color information. It contains a global histogram of the hue values in HSV color space, resulting in 256 features per image.
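A minimal sketch of this color histogram baseline is given below, assuming OpenCV's HSV conversion (hue range [0, 180)) and simple L1 normalization; the function name is ours.

```python
import cv2
import numpy as np

def hue_histogram(bgr, n_bins=256):
    """Global histogram of hue values in HSV space, giving n_bins features."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    hue = hsv[..., 0].ravel()                      # OpenCV hue range is [0, 180)
    hist, _ = np.histogram(hue, bins=n_bins, range=(0, 180))
    return hist / max(hist.sum(), 1)               # L1-normalized feature vector
```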
The experimental results of the comparisons with other features are listed in Tables IV and V. The best results are marked in bold font. As shown in Tables IV and V, the proposed Color Global LOEMP attains the highest recognition rate among all the feature extraction methods.

Additionally, the confusion matrix in Fig. 11 shows the recognition rates of the 41 classes in the authors' data set. The values on the x-axis represent the individual traffic sign classes, and the values on the y-axis represent the predictions made by the classifier. As shown in Fig. 11, most of the confusions occurred between very similar images, for example, some triangular signs.

Fig. 11. Confusion matrix obtained for a 41-class traffic sign problem.
3) Comparison With Other Systems: In [1], Maldonado-Bascón et al. adopted the inner region of the traffic sign, normalized to 31 × 31 pixels, as the descriptor, and the linear SVM was adopted as the classifier. Using their method, the accuracy rate on the GTSRB data set was 85.7869%. In [13], Yuan et al. proposed a context-aware SIFT-based algorithm for TSR. Furthermore, a method for computing the similarity between two images was proposed in their paper, which focuses on the distribution of the matching points rather than using the traditional SIFT approach of selecting the template with the maximum number of matching points as the final result. Because the dimensions of the SIFT features differ from one another, it is difficult to design a suitable classifier based on the SIFT feature; template matching is therefore used in their system. Using the context-aware SIFT method, the recognition rate on the GTSRB data set was 79.2381%.
The results of GTSRB Competition Phase I were listed in [21]. The experiments made use of the inner region of signs, Haar features, or HOG features for recognition. In the recognition stage, classifiers such as convolutional neural networks (CNNs) and subspace analysis were proposed for traffic sign recognition. The CNN-based method attained the highest accuracy rate among all the comparison experiments. In this paper, a novel traffic sign feature is proposed, and in the recognition stage of our system, the linear SVM is adopted as the classifier. The best accuracy rate based on a linear SVM classifier in the results list of GTSRB Competition Phase I was 95.89%.

The proposed approach, with an accuracy rate of 97.2581% on the GTSRB data set, outperformed all the aforementioned feature extraction methods of the traffic sign recognition systems.

4) Processing Time: The system ran on a 2.93-GHz Intel Core 2 Duo E7500 CPU with MATLAB 7.0, where the frame dimensions were 1360 × 1024; the Just-In-Time Accelerator was used to speed up our programs. The system speed was around 4 frames/s. The average processing time of each part is listed in Table VI.

TABLE VI. PROCESSING TIME OF THE TSR SYSTEM PER FRAME
D. Discussion
As shown in Table III, the recognition rates increased by about 8% when the color information was combined, due to the fact that some traffic signs have the same local features but different background colors. Combining the color information provided richer image features that are robust to illumination variations.

The recognition rates increased by about 23% when the global spatial structure information was combined, due to the fact that some traffic signs have the same local patterns and background colors but different distributions of global shapes.
The recognition rates increased by about 21% using LOEMP rather than LBP^riu2. The inner region of a traffic sign is formed by many line microstructures; therefore, most of the LBP^riu2 values detected in traffic sign images were the same, and the important issue for TSR was how to describe the relationship of the orientations among the local features. Because LOEMP characterizes image information across different directions, it contains richer image information than LBP^riu2. Furthermore, because LOEMP combines both the global direction structure and the local texture features, LOEMP is more suitable for describing traffic signs.

The recognition rates increased by about 8% using LOEMP rather than LBP^u2, due to the fact that the global-level rotation compensation method in LOEMP shifts the principal orientation of the HOG to the first position, making LOEMP robust to rotations; in contrast, LBP^u2 is liable to be affected by rotations.
VII. CONCLUSION
This paper has proposed a novel descriptor for a TSR system, known as the Color Global LOEMP. The Color Global LOEMP is a framework that is able to effectively combine color, global spatial structures, global direction structures, and local shape information. In addition, the proposed Color Global LOEMP is robust to illumination conditions, scale, and rotation variations. In order to verify the effectiveness of the detection module, two traffic sign data sets, i.e., the Spanish traffic sign set and the authors' data set, were tested. Furthermore, two traffic sign data sets, i.e., the GTSRB data set and the authors' data set, were tested to validate the efficiency of the recognition module. The images were captured under different conditions of geometric deformation, damage, weather, and lighting. Different image features, such as the HOG feature, color histogram features, and nine sorts of LBP-based features, were used for the purpose of comparison, and the experimental results demonstrated the effectiveness of the proposed method.
REFERENCES

[1] S. Maldonado-Bascón, S. Lafuente-Arroyo, P. Gil-Jiménez, H. Gómez-Moreno, and F. López-Ferreras, "Road-sign detection and recognition based on support vector machines," IEEE Trans. Intell. Transp. Syst., vol. 8, no. 2, pp. 264-278, Jun. 2007.

[2] S. Maldonado-Bascón, J. Acevedo-Rodríguez, S. Lafuente-Arroyo, A. Fernández-Caballero, and F. López-Ferreras, "An optimization on pictogram identification for the road-sign recognition task using SVMs," Comput. Vis. Image Understand., vol. 114, no. 3, pp. 373-383, Mar. 2010.

[3] J. Greenhalgh and M. Mirmehdi, "Real-time detection and recognition of road traffic signs," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 4, pp. 1498-1506, Dec. 2012.

[4] M. S. Prieto and A. R. Allen, "Using self-organizing maps in the detection and recognition of road signs," Image Vis. Comput., vol. 27, no. 6, pp. 673-683, May 2009.

[5] A. Ruta, Y. Li, and X. H. Liu, "Robust class similarity measure for traffic sign recognition," IEEE Trans. Intell. Transp. Syst., vol. 11, no. 4, pp. 846-855, Dec. 2010.

[6] X. W. Gao, L. Podladchikova, D. Shaposhnikov, K. Hong, and N. Shevtsov, "Recognition of traffic signs based on their colour and shape features extracted using human vision models," J. Vis. Commun. Image Represent., vol. 17, no. 4, pp. 675-685, Aug. 2006.

[7] G. Loy, "Fast shape-based road sign detection for a driver assistance system," in Proc. IEEE Int. Conf. Intell. Robots Syst., 2004, pp. 70-75.

[8] H. Fleyeh and E. Davami, "Eigen-based traffic sign recognition," IET Intell. Transp. Syst., vol. 5, no. 3, pp. 190-196, Sep. 2011.

[9] A. de la Escalera, J. M. Armingol, J. M. Pastor, and F. J. Rodriguez, "Visual sign information extraction and identification by deformable models for intelligent vehicles," IEEE Trans. Intell. Transp. Syst., vol. 5, no. 2, pp. 57-68, Jun. 2004.

[10] M. Takaki and H. Fujiyoshi, "Traffic sign recognition using SIFT features," IEEJ Trans. Electron., Inf. Syst., vol. 129, no. 5, pp. 824-831, 2009.

[11] A. Ihara, H. Fujiyoshi, M. Takaki, H. Kumon, and Y. Tamatsu, "Improvement in the accuracy of matching by different feature subspaces in traffic sign recognition," IEEJ Trans. Electron., Inf. Syst., vol. 129, no. 5, pp. 893-900, 2009.

[12] A. Abdel-Hakim and A. A. Farag, "CSIFT: A SIFT descriptor with color invariant characteristics," in Proc. IEEE CVPR, 2006, pp. 1978-1983.

[13] X. Yuan, X. L. Hao, H. J. Chen, and X. Y. Wei, "Traffic sign recognition based on a context-aware scale-invariant feature transform approach," J. Electron. Imaging, vol. 22, no. 4, p. 041105, Jul. 2013.

[14] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. IEEE CVPR, 2005, pp. 886-893.

[15] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971-987, Jul. 2002.

[16] Z. Guo, L. Zhang, and D. Zhang, "Rotation invariant texture classification using LBP variance with global matching," Pattern Recognit., vol. 43, no. 3, pp. 706-719, Mar. 2010.

[17] S. Liao, M. W. K. Law, and A. C. S. Chung, "Dominant local binary patterns for texture classification," IEEE Trans. Image Process., vol. 18, no. 5, pp. 1107-1118, May 2009.

[18] Z. Guo, L. Zhang, and D. Zhang, "A completed modeling of local binary pattern operator for texture classification," IEEE Trans. Image Process., vol. 19, no. 6, pp. 1657-1663, Jun. 2010.

[19] H. Lategahn, S. Gross, T. Stehle, and T. Aach, "Texture classification by modeling joint distributions of local patterns with Gaussian mixtures," IEEE Trans. Image Process., vol. 19, no. 6, pp. 1548-1557, Jun. 2010.

[20] H. Gómez-Moreno, S. Maldonado-Bascón, P. Gil-Jiménez, and S. Lafuente-Arroyo, "Goal evaluation of segmentation algorithms for traffic sign recognition," IEEE Trans. Intell. Transp. Syst., vol. 11, no. 4, pp. 917-930, Dec. 2010.

[21] J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel, "Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition," Neural Netw., vol. 32, no. 8, pp. 323-332, Aug. 2011.

[22] Y. Xie and Q. Ji, "A new efficient ellipse detection method," in Proc. 16th Int. Conf. Pattern Recognit., 2002, pp. 957-960.

[23] J. Y. Choi, Y. M. Ro, and K. N. Plataniotis, "Color local texture features for color face recognition," IEEE Trans. Image Process., vol. 21, no. 3, pp. 1366-1380, Mar. 2012.

[24] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 4, pp. 509-522, Apr. 2002.

[25] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273-297, Sep. 1995.

[26] C. Chang and C. Lin, A Library for Support Vector Machines, 2001. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm

[27] V. Perlibakas, "Distance measures for PCA-based face recognition," Pattern Recognit. Lett., vol. 25, no. 6, pp. 711-724, Apr. 2004.
Xue Yuan received the B.S. degree from Northeastern University, Shenyang, China, and the M.S. and Ph.D. degrees from Chiba University, Chiba, Japan, in 2004 and 2007, respectively.

In 2007 she joined the Intelligent Systems Laboratory, SECOM Company Ltd., Tokyo, Japan, as a Researcher. Since 2010 she has been with the School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing, China. She is also currently with the Chinese Academy of Surveying and Mapping, Beijing. Her research interests include image processing and pattern recognition.
Xiaoli Hao received the B.E., M.E., and Ph.D. degrees from Beijing Jiaotong University, Beijing, China, in 1992, 1995, and 2010, respectively.

In 1995 she joined the School of Electronics and Information Engineering, Beijing Jiaotong University, where she has been an Associate Professor since 2002. In 2006 she was a Visiting Scholar with the University of California, San Diego, CA, USA. Her research interests include optical imaging, signal processing, and machine vision.

Houjin Chen received the B.E. degree from Lanzhou Jiaotong University, Lanzhou, China, in 1986, and the M.E. and Ph.D. degrees from Beijing Jiaotong University, Beijing, China, in 1989 and 2003, respectively.

In 1989 he joined the School of Electronics and Information Engineering, Beijing Jiaotong University, where he became a Professor in 2000 and is currently the Dean of the School of Electronics and Information Engineering. In 1997 he was a Visiting Scholar with Rice University, Houston, TX, USA, and in 2000 he was a Visiting Scholar with the University of Texas at Austin, Austin, TX. His research interests include signal and information processing, image processing, and the simulation and modeling of biological systems.

Xueye Wei received the B.S. and M.S. degrees from Tianjin University, Tianjin, China, in 1985 and 1988, respectively, and the Ph.D. degree from Beijing Institute of Technology, Beijing, China, in 1994.

He is a Professor of electronics and information engineering at Beijing Jiaotong University, Beijing. His research interests are in the theory of automatic control.