
508 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 25, NO. 3, MARCH 2015

Efficient Feature Selection and Classification for Vehicle Detection

Xuezhi Wen, Ling Shao, Senior Member, IEEE, Wei Fang, and Yu Xue

Abstract—The focus of this paper is on the problem of Haar-like feature selection and classification for vehicle detection. Haar-like features are particularly attractive for vehicle detection because they form a compact representation, encode edge and structural information, capture information from multiple scales, and, especially, can be computed efficiently. Due to the large-scale nature of the Haar-like feature pool, we present a rapid and effective feature selection method via AdaBoost by combining a sample's feature value with its class label. Our approach is analyzed theoretically and empirically to show its efficiency. Then, an improved normalization algorithm for the selected feature values is designed to reduce the intra-class difference while increasing the inter-class variability. Experimental results demonstrate that the proposed approaches not only speed up the feature selection process with AdaBoost, but also yield better detection performance than the state-of-the-art methods.

Index Terms—AdaBoost, Haar-like features, support vector machine (SVM), vehicle detection, weak classifier.

I. INTRODUCTION

VISION-BASED vehicle detection for driver assistance has received considerable attention over the last 15 years. There are at least three reasons for the booming research in this field: 1) the startling losses both in human lives and finance caused by vehicle accidents; 2) the availability of feasible technologies accumulated within the last 40 years of computer vision research; and 3) the exponential growth in processor speeds that has paved the way for running computation-intensive video-processing algorithms even on a low-end PC in real time. On-board vehicle detection systems have high computational requirements, as they need to process the acquired images in real time or close to real time for instant driver reaction. Searching the whole image to locate potential vehicle locations is prohibitive for real-time applications. The majority of methods reported in the literature follow two basic steps: 1) hypothesis generation (HG), where the locations of possible vehicles in an image are hypothesized, and 2) hypothesis verification (HV), where tests are performed to verify the presence of vehicles in an image (Fig. 1).

Fig. 1. Vehicle detection process.

The input to the HV step is the set of hypothesized locations from the HG step. During HV, tests are performed to verify the correctness of a hypothesis. Approaches to HV can be classified mainly into two categories: 1) template based and 2) appearance based. Template-based methods use predefined patterns from the vehicle class and perform correlation. Appearance-based methods, which are also called machine learning methods, on the other hand, learn the characteristics of the vehicle class from a set of training images that should capture the variability in vehicle appearance. Usually, the variability of the nonvehicle class is also modeled to improve the performance.

The HV step plays an important role in vehicle detection. Template-based methods need to use thousands of predefined patterns of the vehicle class and perform correlation between the test image and the template, which makes them time consuming. In addition, template-based methods are sensitive to the varying background (e.g., buildings, bridges, and guardrails). Therefore, the appearance-based validation approaches have become more attractive. There are at least two fundamental challenges faced by the appearance-based validation methods: 1) processing time and 2) accuracy.

Manuscript received March 10, 2014; revised May 24, 2014 and August 9, 2014; accepted September 8, 2014. Date of publication September 15, 2014; date of current version March 3, 2015. This work was supported in part by the Jiangsu Overseas Research and Training Program for University Prominent Young and Middle-age Teachers and Presidents; in part by the Jiangsu Planned Projects for Post-Doctoral Research Funds under Grant 1102108C; in part by the National Natural Science Foundation of China under Grant 61403206; in part by the Natural Science Foundation of Jiangsu Province under Grant BK20141005; in part by the Natural Science Foundation, Jiangsu Higher Education Institutions of China, under Grant 14KJB520025; in part by the Research Project, Nanjing University of Information Science and Technology, Nanjing, China, under Grant 20110434; in part by the Priority Academic Program Development, Jiangsu Higher Education Institutions; and in part by the Open Research Project, State Key Laboratory of Novel Software Technology, under Grant KFKT2014B21. This paper was recommended by Associate Editor X. Wang. (Corresponding author: Ling Shao.)

X. Wen and Y. Xue are with the Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China, and also with the School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China (e-mail: ww_pub@163.com; xueyu@nuist.edu.cn).

L. Shao is with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne NE1 8ST, U.K. (e-mail: ling.shao@ieee.org).

W. Fang is with the Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China; the School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; and also with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210097, China (e-mail: hsfangwei@sina.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2014.2358031

In this paper, we focus on the investigation of the appearance-based validation approaches to HV. Seeking solutions that boost the vehicle detection accuracy and reduce the false alarm rate while meeting real-time requirements, we propose a machine learning algorithm based on Haar-like features and the support vector machine (SVM). Specifically, we first design a Haar-like feature extraction method to represent a vehicle's edges and structures, and then propose a rapid feature selection algorithm using AdaBoost, motivated by the large pool of Haar-like features. Finally, we present an improved normalization method for feature values. Experimental results demonstrate that the proposed approaches not only speed up the feature selection process with AdaBoost, but also outperform the state-of-the-art methods in terms of classification ability.

The rest of this paper is organized as follows. In Section II, we review the related work for vehicle detection using appearance-based approaches. In Section III, we present an algorithm for computing Haar-like features. A fast feature selection method based on AdaBoost is reported in Section IV. Section V gives an introduction to SVMs and introduces an improved normalization method for the original feature values used while training the SVM. The experimental results and analysis are described in Section VI. Finally, the conclusion is drawn in Section VII.

II. RELATED WORK

Machine learning methods are becoming increasingly popular for their high performance, good robustness, and easy operation, and have been applied to many fields (such as image retrieval, image annotation, visual recognition, and vehicle detection) [1]-[4]. HV using machine learning methods is treated as a two-class pattern classification problem: vehicle versus nonvehicle. In general, machine learning methods consist of two processes: 1) feature representation and 2) classification.

A. Feature Representation

Given the huge intra-class variability of the vehicle class, one feasible approach is to learn the decision boundary by training a classifier on feature sets extracted from a training set. Various feature extraction methods have been investigated in the context of vehicle detection. Based on the method used, the features extracted can be classified as either global or local.

Global features are obtained by considering all the pixels in an image. Usually, dimensionality reduction techniques [5], [6] are required for the high-dimensional features. Wu and Zhang [7] used standard principal component analysis (PCA) for feature extraction, together with a nearest-neighbor classifier, reporting an 89% accuracy on a vehicle data set. However, their evaluation database was quite small (93 vehicle images and 134 nonvehicle images), which makes it difficult to draw any useful conclusions. Although detection schemes based on global features such as those described in [7]-[13] perform reasonably well, an inherent problem with global feature extraction approaches is that they are sensitive to local or global image variations (e.g., viewpoint changes, illumination changes, and partial occlusion).

Local features, on the other hand, are less sensitive to the effects faced by global features. In addition, geometric information and constraints in the configuration of different local features can be utilized either explicitly or implicitly. An overcomplete dictionary of Haar wavelet features was utilized in [14] for vehicle detection. They argued that this representation provided a richer model and spatial resolution and that it was more suitable for capturing complex patterns. Sun et al. [15] went one step further by arguing that the actual values of the wavelet coefficients are not very important for vehicle detection. They proposed using quantized coefficients to improve detection performance. Using Gabor filters for vehicle feature extraction was investigated in [16]. Gabor filters [17] provide a mechanism for obtaining orientation- and scale-related features. The hypothesized vehicle subimages were divided into nine overlapping subwindows, and then Gabor filters were applied on each subwindow separately. Furthermore, Sun et al. [18] combined Haar wavelets with Gabor features to describe the properties of a vehicle. Scale-invariant feature transform features [19] were used in [20] to detect the rear faces of vehicles. In [21], histogram of oriented gradients (HOG) features were extracted in a given image patch for vehicle detection. In [22], a combination of speeded-up robust features [23] and edges was used to detect vehicles in the blind spot.

The main drawback of the above local features is that they are quite slow to compute. In recent years, there has been a transition from complex image features such as Gabor filters and HOG to simpler and more efficient feature sets for vehicle detection. Haar-like features are sensitive to vertical, horizontal, and symmetric structures, and they can be computed efficiently, making them well suited for real-time detection of vehicles [24], as also demonstrated by their good performance in the object detection literature [25]-[27]. Accordingly, we choose Haar-like features as the feature representation for our vehicle detection system.

B. Classification

Classification can be broadly split into two categories: 1) discriminative and 2) generative methods. Discriminative classifiers, which learn a decision boundary between two classes, have been more widely used in vehicle detection. Generative classifiers, which learn the underlying distribution of a given class, have been less common in the vehicle detection literature. While in [28] and [29] artificial neural network classifiers were used for vehicle detection, they have recently fallen somewhat out of favor. Neural networks have many parameters to tune, and the training tends to converge to a local optimum. The research community has moved toward classifiers whose training converges to a global optimum over the training set, such as SVMs and AdaBoost. SVMs have been widely used for vehicle detection. In [30] and [31], the SVM was used to classify feature vectors consisting of Haar wavelet coefficients. The combination of HOG features and the SVM classifier has also been used in [28], [32], and [33].
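The SVM-based classifiers cited above share the same test-time computation: a kernel expansion over support vectors (the formulation this paper details later in Section V). As a generic, illustrative Python sketch (not any cited paper's implementation; the toy support vectors, weights, and sigma below are invented for the example):

```python
import numpy as np

def rbf_kernel(x, xi, sigma=1.0):
    """RBF kernel k(x, xi) = exp(-||x - xi||^2 / (2 sigma^2))."""
    d = x - xi
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def svm_decision(x, support_vectors, labels, alphas, b, sigma=1.0):
    """Kernel expansion f(x) = sum_i y_i a_i k(x, x_i) + b;
    the predicted class is sign(f(x))."""
    s = sum(y * a * rbf_kernel(x, xi, sigma)
            for xi, y, a in zip(support_vectors, labels, alphas))
    return s + b

# Toy example: two support vectors with equal weight
sv = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
ys = [-1, +1]
al = [0.5, 0.5]
print(np.sign(svm_decision(np.array([0.9, 0.9]), sv, ys, al, b=0.0)))
```

A query point near the positive support vector gets a positive score, so the sign of the expansion is the predicted class.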

The HOG-SVM formulation was extended to detect and calculate vehicle orientation using multiplicative kernels in [34]. Edge features were classified for vehicle detection using the SVM in [35] and [36]. In [37], vehicles were detected using Haar and Gabor features, with the SVM used for classification. AdaBoost has also been widely used for classification, largely owing to its integration with cascade classification in [25]. In [38], AdaBoost was used for detecting vehicles based on symmetry feature scores. In [39], edge features were classified using AdaBoost. The combination of Haar-like feature extraction and AdaBoost classification has been used to detect the rear faces of vehicles in [40]-[44].

In addition, Szegedy et al. [45] defined a multiscale inference procedure that is able to produce high-resolution object detectors based on deep neural networks (DNNs).

Compared with the popular AdaBoost classifiers, the SVM is slower in the test stage. However, the training of the SVM is much faster than that of AdaBoost classifiers. Similarly, although DNNs can yield strong results for object detection, these results come at heavy computational costs during training. Therefore, we choose the SVM as the classifier in this paper.

III. HAAR-LIKE FEATURE EXTRACTION

Rather than using pixels, Viola and Jones [25]-[27] used simple Haar feature prototypes to extract features that encode an image patch for human face detection (the first row of Fig. 2). To further lower the false alarm rate at a given hit rate, Lienhart et al. [46], [47] introduced new Haar feature prototypes by rotating these simple ones by 45° (the second row of Fig. 2), and the results proved to be effective. Hence, we take all of these simple and rotated prototypes. In addition, we speed up the feature extraction procedure using an intermediate representation of the image patch, the integral image (see [25] for details). Fig. 3 gives a few examples of Haar-like features for the description of a vehicle's appearance [24].

Fig. 2. Haar-like feature prototypes used in our method: upright ones (the first row) and rotated ones (the second row).

Fig. 3. Haar-like feature examples for describing a vehicle's appearance.

TABLE I
NUMBER OF FEATURES FOR AN IMAGE PATCH WITH SIZE OF 32 × 32

Algorithm 1 Computing the Haar-Like Feature Pool
Input
  A ROI image patch in RGB color space
Begin
  1) Normalize the ROI to 32 × 32 in grayscale
  2) Compute the upright and rotated integral images
  3) Compute all Haar-like feature values with the integral images according to Table I
End
Output
  Haar-like feature pool

For a given image, the region of interest (ROI), i.e., the vehicle region, is segmented using shadow, symmetry, and aspect ratio information according to [30] and [31]. Considering the property of the structure of a vehicle, we add the diagonal features described in [46] and [47], and the whole Haar-like feature pool we deploy is summarized in Table I. The procedure for computing the Haar-like feature pool is shown in Algorithm 1.

IV. FEATURE SELECTION

The scale of the obtained Haar-like feature pool is far larger than the number of pixels of a 32 × 32 grayscale image. Even though each feature can be computed quickly, the whole process is still quite time consuming. Only a few features among them play an important role for classification; these can be regarded as key features. The AdaBoost algorithm is an effective way to select these key features. The traditional feature selection and the proposed one via AdaBoost are detailed, respectively, as follows.

A. Traditional Feature Selection

The traditional feature selection process with the AdaBoost algorithm is illustrated in Algorithm 2 according to [48]. From Algorithm 2, we can find that the time of feature selection is mostly consumed in finding the weak classifiers. In general, at each iteration, generating weak classifiers consists of three stages for each feature: 1) generate the latent classification locations; 2) compute the classification error at each latent classification location; and 3) select the best classifier (weak classifier), i.e., the one with the lowest error.
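The integral-image trick referenced above makes any rectangle sum a four-lookup operation, so each Haar-like feature value costs only a handful of additions. A minimal Python sketch of the upright case follows (the rotated integral image of Lienhart et al. is analogous but omitted; function names here are illustrative, not from the paper):

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a zero row and column prepended,
    so rectangle sums need no boundary checks."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(gray, axis=0), axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    """Sum of pixels in the h-by-w rectangle with top-left corner (r, c),
    using four lookups in the integral image."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def two_rect_vertical(ii, r, c, h, w):
    """Upright two-rectangle Haar-like feature: left half minus right half."""
    return rect_sum(ii, r, c, h, w) - rect_sum(ii, r, c + w, h, w)

# A 32x32 ramp patch as a stand-in for a grayscale ROI
patch = np.arange(32 * 32, dtype=np.int64).reshape(32, 32)
ii = integral_image(patch)
print(two_rect_vertical(ii, 0, 0, 32, 16))  # -8192
```

The zero-padded first row and column let the four-lookup formula apply even to rectangles touching the patch border.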

Algorithm 2 AdaBoost Algorithm for Feature Selection
Input
  1) A training set {x_i, y_i}_{i=1}^{n}, x_i ∈ X, y_i ∈ {−1, +1}, i = 1, 2, ..., n, where n is the size of the training set
  2) x_i denotes the feature vector of the ith sample
  3) y_i denotes the class label of the ith sample
  4) X denotes the feature space
Begin
  1) Initialize the weights: w_1(i) = 1/n, i = 1, 2, ..., n
  2) H = ∅ // Key feature set
  3) For t = 1 to T
     (1) Normalize the weights: w_t(i) = w_t(i) / Σ_{i=1}^{n} w_t(i), i = 1, 2, ..., n
     (2) For each feature j, train a weak classifier f_j
     (3) Evaluate the error ε_j of classifier f_j as ε_j = Σ_{i=1}^{n} w_{t,i} δ(x_i), where δ(x_i) = 0 if f_j(x_i) = y_i and δ(x_i) = 1 otherwise
     (4) Choose the classifier f_t with the lowest error ε_t and set H = H ∪ {t}
     (5) Compute α_t = (1/2) ln((1 − ε_t)/ε_t)
     (6) Update the weights: w_{t+1}(i) = w_t(i) exp(−α_t f_t(x_i) y_i)
  End for
  4) F(x) = sign( Σ_{t=1}^{T} α_t f_t(x) )
End
Output
  1) Key feature set H
  2) AdaBoost classifier F(x)

Fig. 4. Difference between the traditional feature selection method and the proposed approach. Hollow and solid circles denote two different classes, respectively. (a) Traditional feature selection method. (b) Proposed feature selection approach.

B. Proposed Feature Selection

The difference between the proposed feature selection method and the traditional one lies in stage (1): the traditional method only uses the feature values to generate the latent classification locations, whereas the proposed approach generates the latent classification locations by combining the feature values with their class labels. Without loss of generality, Fig. 4 shows an example of the difference between the two methods for ten given feature values.

The traditional method generates the latent classification locations exhaustively. Specifically, it takes the middle location of every two adjacent feature values as a latent classification location, whereas the proposed method takes the class labels into account, i.e., only the middle location of two adjacent feature values with different labels is considered as a latent classification location.

C. Theoretical Analysis for the Proposed Approach

In Section IV-B, we have presented the proposed feature selection method that combines the feature values with their class labels. In this section, we theoretically analyze our approach in terms of the property of the class labels. For convenience, we assume l is the latent classification location, the classification results to the left of l are α (α ∈ {−1, +1}), and, on the contrary, the results to the right of l are −α. The classification error ε can be computed as

  ε = (1/4) Σ_{j=1}^{n} w_j (f(x_j) − y_j)²    (1)

where w_j is the jth sample's weight, f(x_j) ∈ {−1, +1} is the classification result on the jth sample, and y_j ∈ {−1, +1} is the real class label of the jth sample. Therefore,

  ε = (1/4) Σ_{j=1}^{n} w_j (f(x_j) − y_j)²
    = (1/4) [ Σ_{j=1}^{l−1} w_j (α − y_j)² + Σ_{j=l+1}^{n} w_j (α + y_j)² ]
    = (1/4) Σ_{j=1}^{n} w_j (α² + y_j²) + (α/2) [ Σ_{j=l+1}^{n} w_j y_j − Σ_{j=1}^{l−1} w_j y_j ].    (2)

As w_j and y_j are known, Σ_{j=1}^{n} w_j y_j is also known. From Σ_{j=1}^{n} w_j y_j = Σ_{j=1}^{l−1} w_j y_j + Σ_{j=l+1}^{n} w_j y_j, α² = y_j² = 1, and Σ_{j=1}^{n} w_j = 1, we can compute ε as

  ε = (1/4) Σ_{j=1}^{n} w_j (α² + y_j²) + (α/2) ( Σ_{j=1}^{n} w_j y_j − 2 Σ_{j=1}^{l−1} w_j y_j )
    = 1/2 + (α/2) ( Σ_{j=1}^{n} w_j y_j − 2 Σ_{j=1}^{l−1} w_j y_j ).    (3)

Let us discuss the different cases of α.

1) When α = 1, (3) turns into

  ε = 1/2 + (1/2) ( Σ_{j=1}^{n} w_j y_j − 2 Σ_{j=1}^{l−1} w_j y_j ).    (4)

Therefore, finding min(ε) amounts to computing max( Σ_{j=1}^{l−1} w_j y_j ). As w_j > 0, only when y_{l−1} = +1 and y_{l+1} = −1 does Σ_{j=1}^{l−1} w_j y_j reach a maximum.

2) When α = −1, (3) turns into

  ε = 1/2 − (1/2) ( Σ_{j=1}^{n} w_j y_j − 2 Σ_{j=1}^{l−1} w_j y_j ).    (5)

Therefore, finding min(ε) amounts to computing min( Σ_{j=1}^{l−1} w_j y_j ). Only when y_{l−1} = −1 and y_{l+1} = +1 does Σ_{j=1}^{l−1} w_j y_j reach a minimum.
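The contrast between the two candidate-generation schemes, together with the stump error of (1), can be sketched in Python as follows. This is an illustrative sketch with invented toy data mirroring the ten-value example of Fig. 4, not the authors' code:

```python
import numpy as np

def candidate_locations(values, labels, use_labels=True):
    """Midpoints between adjacent sorted feature values.
    Traditional: every adjacent pair.  Proposed: only pairs whose
    class labels differ (the only places the error can change)."""
    order = np.argsort(values)
    v, y = values[order], labels[order]
    cands = []
    for j in range(len(v) - 1):
        if use_labels and y[j] == y[j + 1]:
            continue
        cands.append(0.5 * (v[j] + v[j + 1]))
    return cands

def weighted_error(values, labels, weights, theta, alpha):
    """Error of the stump that outputs alpha left of theta and -alpha
    right of it; equals (1/4) * sum_j w_j (f(x_j) - y_j)^2 as in (1)."""
    f = np.where(values < theta, alpha, -alpha)
    return 0.25 * np.sum(weights * (f - labels) ** 2)

# Ten samples whose labels change sign at a single adjacent pair
vals = np.array([1.0, 2, 3, 4, 5, 6, 7, 8, 9, 10])
labs = np.array([-1, -1, -1, -1, 1, 1, 1, 1, 1, 1])
w = np.full(10, 0.1)
print(len(candidate_locations(vals, labs, use_labels=False)))  # 9
print(len(candidate_locations(vals, labs, use_labels=True)))   # 1
```

On sorted data whose labels change sign at only a few boundaries, the label-aware scheme evaluates only those few thresholds, which is exactly the saving the analysis above predicts.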

The above analysis demonstrates that our proposed feature selection method, which combines feature values with their class labels, is reasonable and effective.

D. Scalability

According to the above theoretical analysis, the scalability of the proposed approach is further illustrated in Fig. 5: the traditional feature selection method needs to compute the classification errors of 19 latent classification locations, while the proposed method computes the classification errors of only three latent classification locations, which saves much training time. The larger the training data set, the more training time the proposed approach saves. Therefore, the proposed method is more advantageous for a larger number of training samples.

Fig. 5. Example for the analysis of the adaptability of the proposed approach with the number of training samples as 2 and 20, respectively. Hollow and solid circles denote two different classes, respectively. (a1) and (a2) denote the traditional feature selection method. (b1) and (b2) denote our proposed feature selection approach.

V. SVM CLASSIFIER

SVMs are primarily two-class classifiers that have been shown to be an attractive and more systematic approach to learning linear or nonlinear decision boundaries [49], [50]. If the training examples from the two classes make the margin between the two classes maximal, then the classification hyperplane satisfies

  f(x) = Σ_{i=1}^{m} y_i a_i k(x, x_i) + b    (6)

where x, x_i ∈ ℝⁿ are n-dimensional input feature vectors, m is the number of examples, y_i ∈ {−1, +1} is the label of the ith example, and k(x, x_i) is a kernel function. We use the radial basis function (RBF) as the kernel function, which is defined as

  k(x, x_i) = exp( −‖x − x_i‖² / (2σ²) ).    (7)

A. Data Normalization

Data normalization is an essential step for most object detection algorithms that learn the statistical characteristics of attributes extracted from the object images; it can effectively reduce the within-class variation and increase the between-class variability. Data normalization scales the values of each continuous attribute into a well-proportioned range such that the effect of one attribute cannot dominate the others. A statistical normalization method was used in [51] and [52] to convert the data into a standard normal distribution, while a min-max normalization method was adopted in [53] to directly convert the data into the range of 0 to 1.

The statistical normalization is defined as

  x̃_i = (x_i − μ)/σ    (8)

where μ is the mean of the n values for a given attribute, and σ is its standard deviation. However, for the statistical normalization to be appropriate, the data set should follow a normal distribution.

The min-max normalization is defined as

  x̃_i = (x_i − min(x_i)) / (max(x_i) − min(x_i)).    (9)

Normally, x̃_i is set to zero if the maximum is equal to the minimum. However, the min-max normalization method is sensitive to the lighting condition if it is directly applied to the image data.

To overcome the problem faced by the min-max normalization, we present an improved normalization algorithm based on the min-max method. The main idea follows this observation: the actual feature values are not very important for vehicle detection. The magnitudes indicate local oriented intensity differences, and this information could be very different even for the same vehicle under different lighting conditions. The proposed method first computes the magnitudes of the obtained feature values, which is the main difference from the traditional methods, and then normalizes the magnitudes to [0, 1] using the min-max method. The detailed process is presented in Algorithm 3.

Algorithm 3 Improved Normalization Algorithm
Input
  A training set {x_i, y_i}_{i=1}^{n} (y_i ∈ {−1, +1}), where x_i = (v_{i1}, ..., v_{il})ᵀ (l ≪ m)
Begin
  For j = 1 to l
    For k = 1 to n
      Compute the absolute value |v_{jk}|
    End for
    max_value = max{ |v_{jk}| }_{k=1}^{n}
    min_value = min{ |v_{jk}| }_{k=1}^{n}
    For k = 1 to n
      v′_{jk} = ( |v_{jk}| − min_value ) / ( max_value − min_value )
    End for
  End for
End
Output
  The normalized feature vector set { x′_i = (v′_{i1}, ..., v′_{il})ᵀ }, i = 1, ..., n

B. Training Process

After performing the improved normalization operation, all feature values are normalized to [0, 1]. Then, the normalized

feature vector set is used to train the RBF-SVM classifier with cross-validation to select the optimal parameters σ and C.

Fig. 6. Examples of training images. (a) Vehicle samples. (b) Nonvehicle samples.

Fig. 7. Examples of test images of Test data II. (a) Vehicle samples. (b) Nonvehicle samples.

C. Testing Process

For a given test ROI image patch, we first normalize it to a 32 × 32 grayscale patch, and then compute the feature values according to the selected Haar-like features and normalize them to [0, 1] according to the improved normalization algorithm shown in Algorithm 3. Finally, we assemble the normalized feature values into a vector, input it to the trained RBF-SVM classifier, and obtain the classification result.

VI. EXPERIMENTAL RESULTS AND ANALYSIS

To evaluate the proposed approaches, we apply them to a monocular vision-based detection system for static rear-vehicle images. This system includes two modules. The first module aims to segment ROIs accurately according to [30] and [31]. The second module, which is the focus of this paper, performs classification on the ROIs. Vehicle existence validation is a two-class pattern classification problem: vehicle versus nonvehicle.

Different videos recorded by a camera mounted on a vehicle were collected for evaluating the presented algorithms. The videos were taken in different daytime scenes, including highways, common urban roads, narrow urban roads, and so on. Some roads are covered with japanning, smear, and so on. At the first stage, 23 687 samples from the same videos were collected for training and testing; 17 647 samples were selected randomly for training, including 8774 vehicle samples (positive samples) and 8873 nonvehicle samples (negative samples), and the remaining 6040 samples (denoted as Test data I) were used for testing, including 4266 vehicle samples and 1774 nonvehicle samples. At the second stage, 29 698 samples from videos different from those at the first stage were collected for testing only (denoted as Test data II), including 7901 vehicle samples (positive samples) and 4602 nonvehicle samples (negative samples). The vehicle samples at both stages include various kinds of vehicles, such as cars, trucks, and buses, as well as different colors, such as red, blue, black, gray, and white. Furthermore, the vehicle samples include both vehicles near the camera-mounted vehicle and those that are far away. The nonvehicle samples at both stages include roads, buildings, green plants, advertisement boards, bridges, traffic signs, guardrails, and so on. Fig. 6 shows some training examples of vehicle and nonvehicle images, and Fig. 7 shows some test examples of vehicle and nonvehicle images at the second stage.

To evaluate the performance of the approaches, the true positive rate (or vehicle detection rate) t_p and false positive rate f_p were recorded. They are defined as

  t_p = N_TP / (N_TP + N_FN),  f_p = N_FP / (N_FP + N_TN)    (10)

where N_TP, N_FP, N_TN, and N_FN are the numbers of objects identified as true positives, false positives, true negatives, and false negatives, respectively. Three experiments were conducted on a PC (CPU: Intel Core2 2.13 GHz; memory: 2 GB; operating system: Windows 7; implementation: MATLAB 2012b).

The first experiment aims to validate the classification accuracy of the proposed machine learning method compared with state-of-the-art ones that perform reasonably well in vehicle classification and for which the code can be obtained or reproduced according to the original papers. The second experiment compares the designed normalization algorithm for the feature vector set with other normalization methods. The third experiment aims to validate the time efficiency of the proposed feature selection algorithm with AdaBoost compared with the state-of-the-art selection algorithms and the traditional one. All ROIs are normalized to 32 × 32 grayscale image patches.

In the first experiment, since different data sets will induce different optimal parameters for feature extraction methods and classifiers, we select the optimal parameters in terms of classification ability. For the feature extraction of PCA [7], we choose the 79 eigenvectors associated with the 79 biggest eigenvalues, which generate the best classification accuracy. For the feature extraction of Gabor [16], we select six angles and four orientations. For the feature extraction of wavelets, we select the simplest Haar wavelet, perform a 6-level decomposition, and then remove the HH part of the first level according to [15]. For the feature extraction of Gabor combined with wavelets according to [18], the computation of the Gabor features is similar to [16] and that of the wavelet features is similar to [15]. For the extraction of

TABLE II
EVALUATION RESULTS OF SEVEN VEHICLE DETECTION METHODS

TABLE III
TWO PUBLIC TESTING DATABASES

TABLE IV
EVALUATION RESULTS ON THE TWO PUBLIC DATA SETS

Fig. 8. ROCs of the seven detection methods on Test data II.

Fig. 9. ROCs of different normalization methods on Test data I.

Haar-like features, 57 519 features were obtained from each 32 × 32 grayscale image patch [24]. While training the RBF-SVM classifier, we use fivefold cross-validation to select the best parameters σ and C. While training the cascaded AdaBoost, the vehicle detection accuracy ratio at each stage is required to be no less than 99.9% and the false alarm ratio (classifying non-vehicle as vehicle) no more than 50%, the false alarm ratio of the whole cascaded classifier is required to be no more than 10%, and we select the classifier with the best performance by applying fivefold cross-validation. While selecting Haar-like features with AdaBoost, we likewise choose the classifier with the best performance through fivefold cross-validation and select 600 features from the 57 519 features. Table II shows the evaluation results. Fig. 8 shows the receiver operating characteristic curves (ROCs) of the seven vehicle detection methods on Test data II.

In addition, two public image data sets are also used to evaluate the above machine learning methods. As shown in Table III, the first set is published by the Massachusetts Institute of Technology Center for Biological and Computational Learning (MIT CBCL) Group and consists of rear- and frontal-viewed vehicle images, and the second set is published by the California Institute of Technology (Caltech) Vision Group and consists of rear-viewed vehicle images (1999 and 2001 versions). The vehicles in these databases have a wide variety of sizes and in-plane or out-of-plane orientations, and are shot against diverse background scenes with different lighting conditions and degrees of occlusion. Table IV shows the evaluation results.

In the second experiment, we use three schemes to normalize the data: 1) statistical normalization [51], [52]; 2) min–max normalization [53]; and 3) our proposed method. The normalized data as well as the original data are then fed into the RBF-SVM classifier for training and testing. The overall detection results under the different attribute normalization schemes are shown in Figs. 9 and 10.

In the third experiment, we compare the proposed rapid feature selection algorithm with that in [55] and with the traditional one. We conduct the experiment in five random trials. In each trial, we randomly divide the training sample set into five subsets and perform fivefold cross-validation. Table V shows the mean time as well as the variance of the three methods.
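The stage-wise targets above fix the cascade's depth through standard cascade arithmetic: if every stage retains at least a fraction d of vehicles and passes at most a fraction f of non-vehicles, then k stages give roughly d^k overall detection and f^k overall false alarms. The following Python sketch (an illustration of this arithmetic, not code from the paper) plugs in the quoted targets d = 0.999 and f = 0.5 against the 10% overall false-alarm requirement:

```python
import math

def cascade_depth(f_stage: float, f_overall: float) -> int:
    """Smallest number of stages k such that f_stage**k <= f_overall."""
    return math.ceil(math.log(f_overall) / math.log(f_stage))

def cascade_rates(d_stage: float, f_stage: float, k: int):
    """Overall (detection, false-alarm) rates of a k-stage cascade,
    assuming each stage meets its targets independently."""
    return d_stage ** k, f_stage ** k

k = cascade_depth(0.5, 0.1)       # stages needed for <=10% overall false alarms
d, f = cascade_rates(0.999, 0.5, k)
print(k, round(d, 4), f)          # 4 stages, ~99.6% detection, 6.25% false alarms
```

Under these targets, four stages already meet the 10% requirement while the expected detection ratio stays above 99.5%, which is why such loose per-stage false-alarm targets are affordable.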

Fig. 10. ROCs of different normalization methods on Test data II.

TABLE V
EVALUATION RESULTS BEFORE AND AFTER IMPROVING THE ADABOOST ALGORITHM

From the evaluation in Tables II and IV, it can be observed that the performance achieved by the proposed system is only slightly better than that of the state-of-the-art methods. That is because those methods can learn sufficient knowledge from the large-scale training data set effectively and already perform well. The enhanced performance of the proposed system is due to its use of all types of Haar-like features, which improves the tolerance of the vehicle validation process toward geometric variance and partial occlusion, and to its improved attribute normalization, which reduces the intra-class differences while increasing the inter-class variability, making the validation process easier.

From Table V, one can conclude that the proposed feature selection method saves more than 15 h of training time compared with the traditional one, and more than 5 h compared with the method that relies on the forward feature selection (FFS) algorithm to speed up feature selection with AdaBoost [55], leading to a more efficient feature selection process.
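For reference, the two baseline attribute-normalization schemes used in the second experiment can be sketched as follows (a minimal NumPy illustration with hypothetical helper names, not the paper's code; the proposed improved normalization is defined earlier in the paper and is not reproduced here):

```python
import numpy as np

def statistical_norm(X: np.ndarray) -> np.ndarray:
    """Z-score each column (attribute): zero mean, unit variance.
    Implicitly assumes roughly normal attribute distributions."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / np.where(sigma == 0, 1.0, sigma)

def minmax_norm(X: np.ndarray) -> np.ndarray:
    """Rescale each column to [0, 1]; bounded, but a global brightness
    shift still moves every normalized value."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    return (X - lo) / span

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
print(minmax_norm(X))  # both columns are mapped to [0, 0.5, 1]
```

As discussed with Figs. 9 and 10, the z-score variant presumes near-normal attribute distributions, and the min–max variant, while bounded, remains sensitive to illumination changes; the improved scheme targets both weaknesses.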

From Table II, one can conclude that, compared with the state-of-the-art detection methods, the proposed algorithm produces not only a higher vehicle detection rate (tp) but also a lower false positive rate (fp) on both Test data I and Test data II. On Test data II, although the vehicle detection rate of the proposed algorithm is only 0.44% better than that of the method in [46] and [47], its false positive rate is 7.84% lower than that achieved by that method. From Fig. 8, one can conclude that the proposed algorithm shows the best performance among all methods.

From Table IV, one can conclude that the proposed algorithm shows its superiority on the two public data sets compared with the other methods. In Table IV, all methods have better classification results on MIT CBCL than on the Caltech rear-viewed vehicle data set, because most of the vehicle images in MIT CBCL are frontal-viewed vehicles, which are more similar to our training samples in distribution.

From Figs. 9 and 10, one can conclude that, compared with the original data, attribute normalization improves the classification performance significantly, and that, compared with the other two popular normalization methods in vehicle detection, our improved normalization algorithm is the best choice for the RBF-SVM classifier on both Test data I and Test data II. The original data are sensitive to illumination and easily dominated by overly large attribute values in classification, and the statistical normalization method requires that the attribute data follow a normal distribution, which is not always satisfied in real applications. Although min–max normalization applied directly to the original attribute data overcomes the domination of overly large attribute values in classification, it is still sensitive to illumination. The improved normalization method overcomes both of these problems.

VII. CONCLUSION

In this paper, we have proposed a solution based on Haar-like features and RBF-SVM for vehicle detection. First, due to the huge pool of Haar-like features, a fast feature selection algorithm via AdaBoost has been proposed by combining a sample's feature value with its class label. Then, an improved normalization algorithm for feature values has been presented, which can effectively reduce the within-class variation and increase the between-class variability. The experimental results show that the proposed approaches not only speed up the feature selection process but also show superiority in vehicle classification ability compared with the state-of-the-art methods.

ACKNOWLEDGMENT

The authors would like to thank all the anonymous reviewers for their valuable comments.

REFERENCES

[1] D. Tao, X. Tang, X. Li, and X. Wu, "Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 7, pp. 1088–1099, Jul. 2006.
[2] W. Liu and D. Tao, "Multiview Hessian regularization for image annotation," IEEE Trans. Image Process., vol. 22, no. 7, pp. 2676–2687, Jul. 2013.
[3] F. Zhu and L. Shao, "Weakly-supervised cross-domain dictionary learning for visual recognition," Int. J. Comput. Vis., vol. 109, nos. 1–2, pp. 42–59, Aug. 2014.
[4] S. Sivaraman and M. M. Trivedi, "Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1773–1795, Dec. 2013.
[5] J. Li and D. Tao, "Simple exponential family PCA," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 3, pp. 485–497, Mar. 2013.
[6] D. Tao, X. Li, X. Wu, and S. J. Maybank, "Geometric mean for subspace selection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 260–274, Feb. 2009.
[7] J. Wu and X. Zhang, "A PCA classifier and its application in vehicle detection," in Proc. Int. Joint Conf. Neural Netw., vol. 1, 2001, pp. 600–604.

[8] T. Kato, Y. Ninomiya, and I. Masaki, "Preceding vehicle recognition based on learning from sample images," IEEE Trans. Intell. Transp. Syst., vol. 3, no. 4, pp. 252–260, Dec. 2002.
[9] N. D. Matthews, P. E. An, D. Charnley, and C. J. Harris, "Vehicle detection and recognition in greyscale imagery," Control Eng. Pract., vol. 4, no. 4, pp. 473–479, Apr. 1996.
[10] S. L. Phung, D. Chai, and A. Bouzerdoum, "A distribution-based face/nonface classification technique," Austral. J. Intell. Inf. Process. Syst., vol. 7, nos. 3–4, pp. 132–138, 2001.
[11] A. N. Rajagopalan, P. Burlina, and R. Chellapa, "Higher order statistical learning for vehicle detection in images," in Proc. 7th IEEE Int. Conf. Comput. Vis., vol. 2, Sep. 1999, pp. 1204–1209.
[12] Z. Sun, G. Bebis, and R. Miller, "Object detection using feature subset selection," Pattern Recognit., vol. 37, no. 11, pp. 2165–2176, Nov. 2004.
[13] K.-K. Sung and T. Poggio, "Example-based learning for view-based human face detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 1, pp. 39–51, Jan. 1998.
[14] C. Papageorgiou and T. Poggio, "A trainable system for object detection," Int. J. Comput. Vis., vol. 38, no. 1, pp. 15–33, 2000.
[15] Z. Sun, G. Bebis, and R. Miller, "Quantized wavelet features and support vector machines for on-road vehicle detection," in Proc. 7th Int. Conf. Control, Autom., Robot. Vis., 2002, pp. 1641–1646.
[16] Z. Sun, G. Bebis, and R. Miller, "On-road vehicle detection using Gabor filters and support vector machines," in Proc. 14th Int. Conf. Digit. Signal Process., 2002, pp. 1019–1022.
[17] D. Tao, X. Li, X. Wu, and S. J. Maybank, "General tensor discriminant analysis and Gabor features for gait recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 10, pp. 1700–1715, Oct. 2007.
[18] Z. Sun, G. Bebis, and R. Miller, "Improving the performance of on-road vehicle detection by combining Gabor and wavelet features," in Proc. IEEE 5th Int. Conf. Intell. Transp. Syst., 2002, pp. 130–135.
[19] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. 7th IEEE Int. Conf. Comput. Vis., Sep. 1999, pp. 1150–1157.
[20] X. Zhang, N. Zheng, Y. He, and F. Wang, "Vehicle detection using an extended hidden random field model," in Proc. 14th Int. IEEE Conf. ITSC, Oct. 2011, pp. 1555–1559.
[21] M. Cheon, W. Lee, C. Yoon, and M. Park, "Vision-based vehicle detection system with consideration of the detecting location," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 3, pp. 1243–1252, Sep. 2012.
[22] B. F. Lin et al., "Integrating appearance and edge features for sedan vehicle detection in the blind-spot area," IEEE Trans. Intell. Transp. Syst., vol. 13, no. 2, pp. 737–747, Jun. 2012.
[23] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, "Speeded-up robust features (SURF)," Comput. Vis. Image Understand., vol. 110, no. 3, pp. 346–359, 2008.
[24] X. Wen and Y. Zheng, "An improved algorithm based on AdaBoost for vehicle recognition," in Proc. 2nd Int. Conf. Inf. Sci. Eng. (ICISE), Hangzhou, China, Dec. 2010, pp. 981–984.
[25] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jan. 2001, pp. 511–518.
[26] P. Viola and M. J. Jones, "Robust real-time face detection," Int. J. Comput. Vis., vol. 57, no. 2, pp. 137–154, 2004.
[27] P. Viola and M. J. Jones, "Robust real-time object detection," in Proc. IEEE ICCV Workshop Statist. Computat. Theories Vis., Vancouver, BC, Canada, Jul. 2001, pp. 1–30.
[28] R. Miller, Z. Sun, and G. Bebis, "Monocular precrash vehicle detection: Features and classifiers," IEEE Trans. Image Process., vol. 15, no. 7, pp. 2019–2034, Jul. 2006.
[29] O. Ludwig and U. Nunes, "Improving the generalization properties of neural networks: An application to vehicle detection," in Proc. 11th Int. IEEE Conf. ITSC, Oct. 2008, pp. 310–315.
[30] X. Wen, H. Zhao, N. Wang, and H. Yuan, "A rear-vehicle detection system for static images based on monocular vision," in Proc. 9th Int. Conf. Control, Autom., Robot. Vis., Singapore, Mar. 2006, pp. 2421–2424.
[31] W. Liu, X. Wen, B. Duan, H. Yuan, and N. Wang, "Rear vehicle detection and tracking for lane change assist," in Proc. IEEE Intell. Veh. Symp., Istanbul, Turkey, Jun. 2007, pp. 252–257.
[32] S. S. Teoh and T. Bräunl, "Symmetry-based monocular vehicle detection system," Mach. Vis. Appl., vol. 23, no. 5, pp. 831–842, Sep. 2012. [Online]. Available: http://dx.doi.org/10.1007/s00138-011-0355-7
[33] S. Sivaraman and M. M. Trivedi, "Active learning for on-road vehicle detection: A comparative study," Mach. Vis. Appl., vol. 25, no. 3, pp. 599–611, Dec. 2011.
[34] Q. Yuan, A. Thangali, V. Ablavsky, and S. Sclaroff, "Learning a family of detectors via multiplicative kernels," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 3, pp. 514–530, Mar. 2011.
[35] N. Blanc, B. Steux, and T. Hinz, "LaRASideCam: A fast and robust vision-based blindspot detection system," in Proc. IEEE Intell. Veh. Symp., Jun. 2007, pp. 480–485.
[36] Z. Kim, "Realtime obstacle detection and tracking based on constrained Delaunay triangulation," in Proc. IEEE ITSC, Sep. 2006, pp. 548–553.
[37] Y. Zhang, S. J. Kiselewich, and W. A. Bauson, "Legendre and Gabor moments for vehicle recognition in forward collision warning," in Proc. IEEE ITSC, Sep. 2006, pp. 1185–1190.
[38] T. Liu, N. Zheng, L. Zhao, and H. Cheng, "Learning based symmetric features selection for vehicle detection," in Proc. IEEE Intell. Veh. Symp., Jun. 2005, pp. 124–129.
[39] A. Khammari, F. Nashashibi, Y. Abramson, and C. Laurgeau, "Vehicle detection combining gradient analysis and AdaBoost classification," in Proc. IEEE Intell. Transp. Syst., Sep. 2005, pp. 66–71.
[40] J. Cui, F. Liu, Z. Li, and Z. Jia, "Vehicle localisation using a single camera," in Proc. IEEE Intell. Veh. Symp., Jun. 2010, pp. 871–876.
[41] D. Withopf and B. Jähne, "Learning algorithm for real-time vehicle tracking," in Proc. IEEE ITSC, Sep. 2006, pp. 516–521.
[42] I. Kallenbach, R. Schweiger, G. Palm, and O. Löhlein, "Multi-class object detection in vision systems using a hierarchy of cascaded classifiers," in Proc. IEEE Intell. Veh. Symp., 2006, pp. 383–387.
[43] T. T. Son and S. Mita, "Car detection using multi-feature selection for varying poses," in Proc. IEEE Intell. Veh. Symp., Jun. 2009, pp. 507–512.
[44] D. Acunzo, Y. Zhu, B. Xie, and G. Baratoff, "Context-adaptive approach for vehicle detection under varying lighting conditions," in Proc. IEEE ITSC, Sep./Oct. 2007, pp. 654–660.
[45] C. Szegedy, A. Toshev, and D. Erhan, "Deep neural network for object detection," in Advances in Neural Information Processing Systems. Red Hook, NY, USA: Curran Associates, Inc., 2013, pp. 2553–2561.
[46] R. Lienhart and J. Maydt, "An extended set of Haar-like features for rapid object detection," in Proc. Int. Conf. Image Process., Jan. 2002, pp. 900–903.
[47] R. Lienhart, A. Kuranov, and V. Pisarevsky, "Empirical analysis of detection cascades of boosted classifiers for rapid object detection," in Proc. 25th German Pattern Recognit. Symp., 2003, pp. 297–304.
[48] Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," in Proc. 13th Int. Conf. Mach. Learn., 1996, pp. 148–156.
[49] V. Vapnik, The Nature of Statistical Learning Theory. New York, NY, USA: Springer-Verlag, 1995.
[50] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining Knowl. Discovery, vol. 2, no. 2, pp. 955–974, 1998.
[51] Y. Li, B. Fang, L. Guo, and Y. Chen, "Network anomaly detection based on TCM-KNN algorithm," in Proc. 2nd ASIACCS, 2007, pp. 13–19.
[52] W. Ma, D. Tran, and D. Sharma, "A study on the feature selection of network traffic for intrusion detection purpose," in Proc. IEEE Int. Conf. ISI, Jun. 2008, pp. 245–247.
[53] Y. Liao, V. R. Vemuri, and A. Pasos, "Adaptive anomaly detection with evolving connectionist systems," J. Netw. Comput. Appl., vol. 30, no. 1, pp. 60–80, 2007.
[54] C.-C. R. Wang and J.-J. J. Lien, "Automatic vehicle detection using local features – A statistical approach," IEEE Trans. Intell. Transp. Syst., vol. 9, no. 1, pp. 83–96, Mar. 2008.
[55] J. Wu, S. C. Brubaker, M. D. Mullin, and J. M. Rehg, "Fast asymmetric learning for cascade face detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 3, pp. 369–382, Mar. 2008.

Xuezhi Wen received the Ph.D. degree in computer application technique from Northeastern University, Shenyang, China, in 2008.
He is an Associate Professor with the Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China, where he is also an Associate Professor with the School of Computer and Software.
Dr. Wen is a member of the Association for Computing Machinery. His research interests include pattern recognition, image processing, and intelligent transportation.

Ling Shao (M'09–SM'10) received the B.Eng. degree in electronic and information engineering from the University of Science and Technology of China, Hefei, China, and the M.Sc. degree in medical image analysis and the Ph.D. (D.Phil.) degree in computer vision from the Robotics Research Group, University of Oxford, Oxford, U.K.
He was a Senior Lecturer with the Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield, U.K., from 2009 to 2014, and a Senior Scientist with Philips Research, The Netherlands, from 2005 to 2009. He is currently a Full Professor with the Department of Computer Science and Digital Technologies, Northumbria University, Newcastle upon Tyne, U.K. He has authored or co-authored over 150 academic papers in refereed journals and conference proceedings, and holds over 10 European Union/U.S. patents. His research interests include computer vision, image/video processing, pattern recognition, and machine learning.
Dr. Shao is a fellow of the British Computer Society and the Institution of Engineering and Technology. He has been an Associate or Guest Editor of the IEEE TRANSACTIONS ON CYBERNETICS, IEEE TRANSACTIONS ON IMAGE PROCESSING, Pattern Recognition, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, and several other journals. He has organized several workshops with top conferences, such as ICCV, ECCV, and ACM Multimedia. He has served as a Program Committee Member for many international conferences, including ICCV, CVPR, ECCV, and ACM MM.

Wei Fang received the Ph.D. degree in computer science from Soochow University, Suzhou, China, in 2009.
He is an Associate Professor with the Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China. His research interests include data mining, big data analytics, and cloud computing.
Dr. Fang is a Senior Member of the China Computer Federation and a member of the Association for Computing Machinery.

Yu Xue received the Ph.D. degree from the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2013.
He is a Lecturer with the Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, where he is also a Lecturer with the School of Computer and Software. His research interests include computational intelligence, electronic countermeasure, and Internet of things.
Dr. Xue is a member of the China Computer Federation.
