92522084

國立中央大學
資訊工程研究所
碩士論文
以正規化色彩與邊緣資訊作車輛偵測
Vehicle Detection Using Normalized Color and

Edge Map
指導教授：范國清博士
研究生：蔡洛緯
中華民國九十四年六月
國立中央大學圖書館
碩博士論文授權書
(93 年 5 月最新修正版)
本授權書所授權之論文全文與電子檔，為本人於國立中央大學，撰寫之碩/博士
學位論文。(以下請擇一勾選)
( 9)同意 (立即開放)
( )同意 (一年後開放)，原因是：
( )同意 (二年後開放)，原因是：
( )不同意，原因是：
以非專屬、無償授權國立中央大學圖書館與國家圖書館，基於推動讀者
間「資源共享、互惠合作」之理念，於回饋社會與學術研究之目的，得
不限地域、時間與次數，以紙本、光碟、網路或其它各種方法收錄、重
製、與發行，或再授權他人以各種方法重製與利用。以提供讀者基於個
人非營利性質之線上檢索、閱覽、下載或列印。
研究生簽名：蔡洛緯
論文名稱：以正規化色彩與邊緣資訊作車輛偵測
指導教授姓名：范國清
系所：資訊工程所博士 ;碩士班
學號： 92522084
日期：民國九十四年六月二十四日
備註：
1. 本授權書請填寫並親筆簽名後，裝訂於各紙本論文封面後之次頁（全文電子
檔內之授權書簽名，可用電腦打字代替）。
2. 請加印一份單張之授權書，填寫並親筆簽名後，於辦理離校時交圖書館（以
統一代轉寄給國家圖書館）。
3. 讀者基於個人非營利性質之線上檢索、閱覽、下載或列印上列論文，應依著
作權法相關規定辦理
ii
Abstract
In this thesis, a novel approach for detecting vehicles using color and edge
information from static images is presented. Different from traditional methods
which use motion features to detect vehicles, the proposed method introduces a new
color transform model to find important “vehicle color” for the quick finding of
possible vehicle candidates. Since vehicles have various colors under different
weather and lighting conditions, seldom works were proposed for the detection of
vehicles using colors. The proposed new color transform model has extremely
excellent capabilities in identifying vehicle pixels from background ones even
though the pixels are under varying illuminations.
After finding possible vehicle candidates, three important features including
corners, edge maps, and coefficients of wavelet transform are used for constructing a
cascade and multi-channel classifier. According to this classifier, an effective
scanning is performed to verify all possible candidates. The scanning can be
quickly achieved because most background pixels are eliminated by the color feature.
Experimental results show that the integration of global color feature and local edge
feature is powerful in the detection of vehicles. The average accuracy rate of
vehicle detection is 94.5%.
iii
摘要
本篇論文提出一種新穎的車輛偵測方法。對於彩色影像中之車輛，利用色彩與邊
緣資訊加以偵測並確認。大多數過去的方法皆採用移動資訊 (Motion
features)，即假設車輛為影像中之移動物，但此法對於靜止之車輛完全失效。
本文提出一種新穎的色彩空間轉換方法，如同人臉偵測時會先尋找膚色區域一
般，快速地找出影像中屬於車輛顏色之像素點。由於車輛具有各種不同之顏色，
同時戶外環境伴隨著季節與天候不同，複雜的光線因素導致極少的論文採用色彩
資訊作車輛偵測。而本文所提之色彩空間轉換方式則為一強有力之工具，足以在
不同的光線條件下區分車輛與背景之像素點。
在找出可能屬於車輛之像素點後，本文結合三種有效之特徵，分別為角點
（corners）
、邊緣資訊(edge maps)與小波轉換之係數，用以建構一連串且多重
維度(multi-channel)之車輛分類器。此分類器對輸入影像中可能之車輛像素點
作有效之確認。由於先前已利用色彩資訊濾除大量無關的背景像素點，故此確認
步驟將可快速且有效的執行。實驗結果證明結合整體色彩資訊與局部邊緣資訊之
車輛偵測方式是強而有效的。平均偵測率達到 94.5％
iv
CONTENT
CHAPTER 1 INTRODUCTION........................................................................................................1
1.1 MOTIVATION.....................................................................................................................1
1.2 REVIEW OF RELATED WORKS .........................................................................................2
1.3 OVERVIEW OF THE PROPOSED SYSTEM ..........................................................................5
CHAPTER 2 CONVENTIONAL METHODS FOR DATA ANALYSIS ........................................7
2.1 KARHUNEN-LOE`VE TRANSFORM ...................................................................................8

2.2 BAYESIAN CLASSIFIER....................................................................................................10
2.3 NEAREST – NEIGHBOR CLUSTERING ALGORITHM ....................................................... 11
CHAPTER 3 VEHICLE COLOR DETECTOR ............................................................................13
3.1 COLOR FEATURES FOR DIMENSIONALITY REDUCTION ................................................15

3.2 PIXELS CLASSIFICATION USING BAYESIAN CLASSIFIER...............................................19
3.3 PIXELS CLASSIFICATION USING NEURAL NETWORK....................................................20
3.4 COLOR CLASSIFICATION RESULT ..................................................................................25
CHAPTER 4 VEHICLE VERIFICATION.....................................................................................27
4.1 VEHICLE HYPOTHESIS ...................................................................................................27

4.2 VEHICLE FEATURES .......................................................................................................28
4.2.1 Contour feature....................................................................................................28
4.2.2 Wavelet Coefficients.............................................................................................33
4.2.3 Corner Features...................................................................................................36
4.3 INTEGRATION AND SIMILARITY MEASUREMENT ..........................................................37
4.4 VERIFICATION PROCEDURE ...........................................................................................39
CHAPTER 5 EXPERIMENTAL RESULTS...................................................................................42
5.1 DATA SET ........................................................................................................................42

5.2 PERFORMANCE ANALYSIS OF PIXELS CLASSIFICATION ...............................................42
5.3 DETECTION RESULT IN VARIOUS ENVIRONMENTS .......................................................44
CHAPTER 6 DISCUSSIONS AND CONCLUSIONS ......................................................................47
6.1 DISCUSSIONS ..................................................................................................................47

6.2 CONCLUSIONS ................................................................................................................47
REFERENCES .....................................................................................................................................49
v
List of Figures
Fig. 1 Details of the proposed vehicle detector...........................................................5

Fig. 2 Two-dimension data representation of ten-dimension data set.........................8
Fig. 3 Bayes classifier (adapted from Tou and Gonzlaez[1974]) .............................10
Fig. 4 Effect of the threshold and starting points in a simple cluster seeking scheme.
(Adapted from [23]).........................................................................................12
Fig. 5 Vehicle detection procedure. (a) Vehicle hypotheses generation. (b) Vehicle
verification. ....................................................................................................14
Fig. 6 Road color distribution and (u, v) feature plane.............................................15
Fig. 7 Parts of vehicle training samples. (a) Vehicle training images. (b) Non-vehicle
training images...............................................................................................17
Fig. 8 Results of color transformations of background pixels. (a) Result of color
transformation using the (u, v) domain. (b) Result of color transformation
using the (s, t) domain.(c) Vehicle pixels plotting : result of color
transformation in the (u, v) domain................................................................18
Fig. 9 The basic perceptron model............................................................................21
Fig. 10 Vehicle color detection result. (a)~(b) Original images.(c)~(d) Color
classification result using Bayesian classifier. (e)~(f) Color classification
result using perceptron.................................................................................24
Fig. 11 Another vehicle color classification results. Compared with another
images, (b) is duskier but can still perform well. (a)~(b) Original
images.(c)~(d) Color classification result using Bayesian classifier. (e)~(f)
Color classification result using perceptron ................................................26
Fig. 12 A 3x3 averaging mask often used for smoothing .........................................29
Fig. 13 2-D Gaussian distribution with mean (0,0) and σ=1 ....................................29
Fig. 14 Discrete approximation to Gaussian function withσ=1.4.............................30
Fig. 15 The Sobel mask in x-direction and y-direction. ...........................................31
Fig. 16 Edge detection by Canny operator................................................................31
Fig. 17 The value of y is nonlinearly increased when x increases. ...........................32
Fig. 18 Result of distance transform. (a) Original Image. (b) Distance....................33
Fig. 19 Block diagram of discrete wavelet transform...............................................34
Fig. 20 Wavelet decomposition of three scales.........................................................35
Fig. 21. Corner detection of vehicles (a)Vehicle contains many corners
(b)Comparison with background, vehicle contains more corners than
background thus corners features can be adapted as features......................37
Fig. 22. The cascade classifier used for vehicle detection........................................38
vi
Fig. 23. Image pyramid structure. Assume the original image size is 320*240,
processing the image at each resolution rescaling the original size with 0.8
ratios until pre-defined resolution is achieved. ............................................39
Fig. 24 Red points represent the possible vehicle candidates with stronger responses.
These points should be clustered by nearest-neighbor algorithm. (a)
Original image (b) The white area denotes the region of possible vehicle
pixels. ...........................................................................................................41
Fig. 25. Result of vehicle color detection. (a) Original image. (b) Detection
result of vehicle color...................................................................................44
Fig. 26. Result of vehicle color detection. (a) Original image. (b) Detection result
of vehicle color.............................................................................................44
Fig. 27. Result of vehicle detection in a parking lot .................................................45
Fig. 28. Result of vehicle detection in a parking lot with different orientation. .......45
Fig. 29. Result of vehicle detection on road. ............................................................46
Fig. 30. Result of detecting vehicles from highway. Although these vehicles were
with different colors, all of them were correctly detected. ..........................46
Fig. 31. Result of vehicle detection in road with occlusion......................................46
vii
CHAPTER 1
INTRODUCTION
1.1 Motivation
Due to the development of the scientific and technological civilization, automobiles
have already become an indispensable tool in replacing walking in the modernized
society. The problem resulting from city civilization is the lacking of parking lot, i.e.,
fewer and fewer parking spaces are available especially in the city. It is difficult to
find vacant parking spaces within a short period, which results in the wasting of
precious resources and the problem of environmental pollution.
To fully utilize the precious space, it is a trend to construct the parking lot of every
building upwardly or down into underground in many floors. However, the supply
of the needing parking space still can not meet the urgent requirement comparing with
the speed of the fast growth of vehicles. If we can save the time in finding parking
space, the problems of fuel wasting, traffic violation, labor resource pending, and the
pollution of surrounding air can all be resolved even can improve the mobility of
transportation and the productivity economics. On the other hand, the problem of
traffic jam in city roads usually occurs in rush hour or vacation. Part of the reason is
due to the unawaring of road information, such as the information of nearby available
parking space. The detection of vehicles in parking lot can accurately provide the
information of available parking spaces. Hence, the providing of useful road
information to vehicle drivers may somewhat alleviate the traffic jam problem.
1
Moreover, the accurate detection of vehicles is also indispensable to the
measurement of various traffic parameters, such as vehicle count, speed, flow, and
concentration. Once a vehicle can be accurately tracked through a road network,
more complex traffic parameters, such as linked-travel-time, can be computed.
Therefore, vehicle detection is critical to traffic management systems and intelligence
transport systems for collision avoidance or traffic flow control.
In this thesis, we plan to develop an intelligent vehicle detection system based on
the techniques of computer vision with an eye to effectively managing the parking
spaces in the indoor/outdoor parking lots or along the roads.
1.2 Review of Related Works
Vehicle detection [1]-[2] is an important problem to be resolved in many related
applications, such as self-guided vehicles, driver assistance system, intelligent parking
system, and measurement of traffic parameters like vehicle count, speed, and flow.
One of the common approaches to vehicle detection is the using of vision-based
techniques to analyze vehicles from images or video sequences. However, due to the
variations of vehicle colors, sizes, orientations, shapes, and poses, the developing of a
robust and effective vision-based vehicle detection system is challenging. To
alleviate the above problems, different approaches using different features and
learning algorithms for locating vehicles have been investigated. For example, many
researchers [2]-[5] used background subtraction to extract motion features for
detecting moving vehicles from video sequences. However, this kind of motion
features is no longer usable in static images. To deal with static images, Wu et al. [6]
used wavelet transform to extract texture features for locating possible vehicle
candidates from roads. Then, each vehicle candidate is verified by using a PCA
2
classifier. In addition, Sun et al. [7] used Gabor filters to extract different textures
and then verified each candidate of vehicles using a SVM (support vector machine)
classifier. In addition to textures, “symmetry” is another important feature used in
vehicle detection. In [8], Broggi et al. described a detection system to search the
areas with a high vertical symmetry for locating vehicles. However, this cue is
prone to false detections, such as symmetrical objects in complex background.
Furthermore, in [9], Bertozzi et al. used corner features to build four templates of
vehicles for vehicle detection and verifications. In [10], Tzomakas and Seelen found
that the area shadow underneath a vehicle is a good cue in detecting vehicles. In
[11], Ratan et al. developed a scheme to detect the wheel features of vehicles as cues
to find possible vehicle positions and then used a method called Diverse Density to
verify each vehicle candidate. In addition, Bensrhari [12] and Aizawa [13] used
stereo-vision technique and 3D vehicle models to detect vehicles and obstacles. The
major drawback of the above methods for searching vehicles using local features is
that the system needs a very time-consuming search to scan all pixels of the whole
image. Although color is an important perceptual descriptor, there were seldom
color-based works addressed for vehicle detection because vehicles have very large
variations in the colors. In [14], Rojas and Crisman used a color transform to project
all road pixels on a color plane such that vehicles can be identified from road
backgrounds. Similarly, in [15], Guo et al. used several color balls to model road
colors in L*a*b* color space and then vehicle pixels can be identified if they are
classified as non-road regions. However, these color models are not compact and
general in modeling vehicle colors. Many cases of false detection will still be
produced and will thereby lead to the accuracy degradation vehicle detection.
As mentioned previously, vehicles have larger variations in appearance including
3
colors, sizes, and shapes changing, due to different viewing angles and lighting
conditions. All the variations will increase the difficulties and challenges in
selecting a general feature for describing vehicles. In this thesis, a novel vehicle
detection method using colors from still images is proposed. Once a general color
transform can be found to accurately capture the visual characteristics of vehicles
(even under different lighting conditions), color feature will become a very useful and
powerful cue to narrow down the search areas of possible vehicles. The main
contribution of this thesis is the presentation of a statistic linear color model that
makes vehicle colors be more compact in the specific feature space such that vehicles
pixels will be distributed sufficiently concentrating on a smaller area. The model is
leaned by observing how the vehicle colors change in static images under different
lighting conditions and cluttered backgrounds. The model is global and does not
need to be re-estimated for any new vehicle or new image. Without prior knowledge
of surface reflectance, weather condition, and view geometry in the training phase, the
model can still perform very well in separating vehicle pixels from background ones.
After that, three features including edge maps, corners, and wavelet coefficients are
devised to form a multi-channel classifier. The classifier is modeled using a Gaussian
model and can be automatically learned from a set of training examples. Since the
classifier records many vehicle appearance changes, it possesses good discriminative
properties in verifying the correctness of each vehicle candidate. Due to the usage of
color feature that can filter out most of background pixels in advance, only very few
candidates are still needed to be checked and thus the verification process can be
accomplished efficiently. Moreover, vehicle still can be detected successfully even
with occlusions because of the filtering effect and discriminative capabilities of the
proposed method. Experiments were conducted in various real cases and the
experimental results verify the superiority of the proposed method in detecting
4
vehicles.
Fig. 1 Details of the proposed vehicle detector.
1.3 Overview of the Proposed System
A novel system using edges and colors from static images to detect vehicles is
presented in this thesis. The flowchart of the proposed system is shown in Fig. 1.
At the beginning, a specific color transformation is proposed to project all the colors
of input pixels on a new feature space such that vehicle pixels can be easily
distinguished from non-vehicle ones. Here, a Bayesian network is adopted for
identification.
Since vehicles have different sizes and orientations, different vehicle hypothesis are
generated from each detected vehicle. Three kinds of vehicle features including
edges, coefficients of wavelet transform, and corners are employed to eliminate
non-vehicle candidates. Using the proper weights obtained from training samples,
these features can be combined together to form an optimal vehicle classifier. Then,
vehicles candidates can be verified robustly and accurately from static images.
The rest of the thesis is organized as follows. Chapter 2 introduces three
5
conventional methods for data analysis. The details of our proposed novel color
transform in finding vehicle colors are described in Chapter 3. Chapter 4 discusses
the details of feature extraction and the proposed multi-channel classifier.
Experimental results are demonstrated in Chapter 5 to verify the validity of the
proposed method. Finally, conclusions are given in Chapter 6.
6
CHAPTER 2
CONVENTIONAL MEHTODS FOR

DATA ANALYSIS
In this chapter, three conventional methods used for constructing the vehicle color
detector are firstly described including their basic theories, mathematical models, etc.
As we know, feature extraction, classification and clustering are significant issues that
were intensively discussed in pattern recognition. Feature extraction concentrates on
the looking of a subset that contains the critical information but can not destroy the
nature from the original vast data set. This subset can be utilized to replace the
original data set to avoid noise jamming. Dimensionality reduction is often
performed for feature extraction in reducing computational load. In addition, fewer
and critical features are remained in designing the classifier that decreases the
confused judgment. With the prior knowledge of the data distribution, Bayesian
classifier is adopted to separate the desired features from unconcerned ones and
clustering technique is employed to aggregate the critical features for recognition.
7
Fig. 2 Two-dimension data representation of ten-dimension data set
2.1 Karhunen-Loe`ve Transform
Dimensionality reduction is a useful skill in pattern recognition. Too many features
lead to more computation load and create confusion such as to decreasing the
classifier performance. The main idea of Dimensionality reduction is selecting a
subset of feature from original data and generating the lower dimension data still
preserving distinguishing characteristics. That is to reduce the dimensionality of a
high dimensional data without significant loss in accuracy. In practice,
high-dimensional data are often loose without tight clusters. By projecting data onto
an appropriate lower-dimensional space (feature space), data clusters would have a
local structure that makes the close neighborhood meaningful.
Human beings can not realize the shape, density of the data in high dimension.
Projecting data into lower-dimension space makes data clusters easy to observe by
human eyes. Fig. 2 shows a 10D data set projecting into 2D space. Obviously, this
data set contains four clusters in high-dimensional space through projecting into 2D
plane.
8
Karhunen-Loe`ve transform (K-L transform) is a well-know and widely used
technique for statistical analysis. It has various names, such as Principal component
analysis and Hotelling transform. This method usually was adopted in feature
extraction, data compression, image processing…etc. K-L transform is a linear
projection from m-dimension space to n-dimension space (n ≦m). It has to
compute a transformation matrix H which is constructed by the eigenvector of the
covariance matrix R. Suppose we have an input data set X = { x1, x2 ,...xn }, the
computation of the transformation matrix is as the following algorithm:
Step 1 : Let m denote the mean and C denote the covariance matrix
1 n
m= ∑ xk
n k =1
1 n
C= ∑
n i =1
( xi − m)( x i −m)T
Step 2 : Compute the eigenvalues d1 , d 2 ,..., d n and construct the associate

eigenvector λ of C . Sort them as λ 1 ≥ λ 2 ≥ ... λ n
Step 3 : form the matrix H = [λ1,λ2 ...λn ]T
After transformation, the covariance matrix of the feature becomes a diagonal
matrix. This matrix projects the input data into a subspace whose axes are in the
direction of the largest variation as follow:
Step 4 : yi = Hxi for i = 1,...N
9
Fig. 3 Bayes classifier (adapted from Tou and Gonzlaez[1974])
2.2 Bayesian classifier.
In statistical pattern recognition, we focus on how to developing decision or
classification strategies which form classifiers. The design of classifier attempts to
integrate all available information such as measurement of a priori probabilities of
data. Then, the classifier minimizes the total expected loss and using Bayes’ formula
as the optimum measure of performance. The class-conditional density function of
probability of a pattern x, when x belongs to class wi , can be given as follow:
p ( x / wi ), i = 1,2,..., K
All the class-conditional densities are completely know a prior, the decision
boundary between pattern classes can be established using the optimal Bayes decision
rules (as shown in Fig. 3)[24].
10
By way of introduction, consider a vector x with Gaussian distribution, the
probability density function of x is:
1  1  x − m 2 
P( x) = exp −   
2π * σ  2  σ  
m and σ are mean and variance respectively. We can get the decision boundary
function is:
1 1
[ ]
di ( x) = ln p(wi ) − ln Ci − ( x − mi ) Ci−1 ( x − mi ) , i = 1,2,..., M
2 2
T
C is covariance matrices. The Bayes classifier assigns a pattern x to class wi if
p( x / wi ) p(wi ) > p( x / wj ) p(wj ), j = 1,2,....M , j ≠ i
The detail derivate procedure can be found in [23].
2.3 Nearest – Neighbor Clustering Algorithm

Clustering is a data analysis method for discovering patterns and underlying cluster
structures. It’s also the formal study of algorithm and methods for grouping data.
It helps to exploring structure of the data and has many applications in engineering
and science. Unfortunately, there’s no predominant method that is the best for
different data sets. Various kinds of methods such as Hierarchical clustering,
partitioning based k-means and Self Organizing map (SOM) are wildly used. The
K-means algorithm needs choosing the value k, the number of clusters, at initial step.
Then it divided the input data set into k distinct cluster by minimizing the sum of the
distances between the input pattern and the each cluster center.
11
Fig. 4 Effect of the threshold and starting points in a simple cluster seeking scheme.
(Adapted from [23])
Here, a simple Nearest – Neighbor Clustering algorithm similar K-means without
initial value k is described below:
Step 1: set i = 1 and k = 1, assign pattern X i to cluster Ck (k = 1)
Step 2: i = i + 1. Find the nearest neighbor of X i and assign these patterns
to clusters. Let d denote the distance from X i to its nearest

neighbor.
Step 3: if d ≦ t (a pre-defined threshold), then assign X i to Ck .
Otherwise, set k = k +1, and assign X i to a new cluster Ck

Step4: if every pattern has been assigned to a cluster, stop. Else, go to
step2.
As shown in Fig. 4, the number of clusters k depends on the parameter t. As the
value of t increases, the fewer clusters are generated. The distance d is usually
measured as Euclidean distances.
12
CHAPTER 3
VEHICLE COLOR DETECTOR
Like the skin color used for face representation, a new color transform is introduced
in this chapter for projecting all pixels with (R, G, B) colors to a new domain. Then,
a specific “vehicle color” can be found and defined for effective vehicle detection.
The notion or vehicle color will be briefly introduced in the forthcoming contexts.
Section 3.1 describes the derived procedure of the transformation formula and Section
3.2 presents the classification of vehicle color pixels using Bayesian classifier.
Section 3.3 presents the neural network technique used for classification. Finally,
section 3.4 shows the results of color classification.
As a gray car goes into the general gray road surface, human vision will perceive
that the colors of the two are very similar. That is the feature of color alone is not
enough in distinguishing the two in human eyes. However, the color of an object
can be represented in several different color spaces, such as RGB, HSV, etc., and the
observation of an object’s color in images depends on incident light, reflectance of the
object, and viewing angle. In these feature spaces, vehicle color owns the different
and special characteristic to separate from the background even vehicles with various
colors. Thus, the distributions of road and vehicles color can be“learned” from
training samples. Once the distribution of vehicles color is evaluated precisely, the
difference between vehicles and background color pixels can be easily discriminated.
13
(a) (b)
Fig. 5 Vehicle detection procedure. (a) Vehicle hypotheses generation. (b) Vehicle
verification.
In general, vehicle detection contains two major steps: vehicle hypothesis and
vehicle verification. In the first step, the location of a vehicle or more vehicles in an
image is generating as shown in Fig. 5(a). Without extra information about vehicle
positions, a time-consuming scanning is performed from left to right, up to down in
the input image. Exhaustive search is the simplest method to verify by testing all
pixels in images and check all of them whether vehicles exist or not. Too many
vehicle hypotheses make the following verification step needing enormous
computation times. In the past, most papers focus on improving the performance of
verification steps but selecting an exhaustive search. Therefore, this paper provides
an efficient method which can quickly find possible vehicle positions reducing the
numbers of vehicle hypothesizes and decreasing the computation time without a full
search.
14
Fig. 6 Road color distribution and (u, v) feature plane.
3.1 Color Features for Dimensionality Reduction

In this section, a detail description about what we called “vehicle color” is given.
Assume that there are N images collected from roads, highways and parking places.
Through a statistic analysis, we can get the covariance matrix ∑ of the color
distributions of R, G, and B from N images. Using the Karhunen-Loe`ve transform,
the eigenvectors and eigenvalues of ∑ can be further obtained and represented as ei
and λi , respectively, for i = 1, 2, and 3. Then, three new color features Ci can be
formed and defined, respectively,
Ci = eir R + eig G + eib B for i =1, 2, and 3, (1)
where ei = ( eir , eig , eib ) . The analysis of Ohta et al[1] indicated the color feature C1
with the largest eigenvalue is the one used for color-to-gray transform, i.e.,
1 1 1
C1 = R + G + B . (2)
3 3 3
Other two color features C2 and C3 are orthogonal to C1 and can be obtained,
15
respectively, as follows:
R-B 2G - R - B
C2 = and C3 = . (3)
2 4
All the color features can be obtained by projected a pixel color (R, G, B) with the
vectors (1/3, 1/3, 1/3), (1/2, 0, -1/2), and (-1/4, 1/2, -1/4) respectively. In [17],
Healey pointed out that the colors of homogeneous dielectric surfaces (like roads or
clouds) moved along the axis directed by Eq.(2), i.e., (1/3, 1/3, 1/3). That means the
colors of homogeneous dielectric surfaces have no changes as time passes except
intensity strength. Compared with other metal surfaces, road surface are more easily
modeled. Fig. 6 indicates a RGB color space and the color of road surface
distribution are represented as red points. In [14], Rojas et al. also found that the
colors of roads concentrated around a small cylinder along the axis directed by Eq. (2).
Therefore, projecting all the road colors to a plane which is perpendicular to the axis
pointed by C1 , all the road colors concentrate around a small circle [14]. Once
feature vectors of color distribution of the object are founded, the object color
distribution would form a specific cluster in the feature space by feature projection.
Based on this observation, this paper proposed a new color model to transform all
color pixels on a 2D feature space. On this feature space, all vehicle color pixels
concentrated on a smaller area. By modeling the characteristics of this area, a
Bayesian classifier is developed to accurately identify vehicle pixels from background
ones.
16
(a) (b)
Fig. 7 Parts of vehicle training samples. (a) Vehicle training images. (b) Non-vehicle
training images.
At the beginning, thousands of training images are collected from different scenes
including roads, parking lots, building, and natural scenes. Fig. 7 shows some parts
of out training samples. Based on the training samples, using the KL transform, we
found that the eigenvector with the largest eigenvalue of this data set is (1/3, 1/3, 1/3)
(the same as in Eq. (2)). In addition, the color plane (u, v ) perpendicular to the axis
(1/3, 1/3, 1/3) expanded by other two eigenvectors is:
2Z p − G p − B p Z p − Gp Z p − Bp
up = and v p = Max{ , }, (4)
Zp Zp Zp
where ( R p , G p , B p ) is the color of a pixel p and Z p = (R p + G p + B p )/3 used for
normalization. The color transformation described in Eq.(4) concentrates all vehicle
pixels on a smaller area. There are also other color planes perpendicular to the axis
(1/3, 1/3, 1/3). For example, another color plane (s, t) perpendicular to the axis (1/3,
1/3, 1/3) can be found, i.e.,
Rp - B p - R p +2G p - B p
sp = and t p = . (5)
Zp Zp
17
(a) (b)
(c)
Fig. 8 Results of color transformations of background pixels. (a) Result of color

transformation using the (u, v) domain. (b) Result of color transformation using the (s,
t) domain.(c) Vehicle pixels plotting : result of color transformation in the (u, v)
domain
Plotting all background pixels of training images on the (u, v) and (s, t) planes
using Eq.(4) and Eq.(5), respectively, are 8.85384 and 40.1879, respectively. Fig. 8
shows the result of color transform. Clearly, the transformation described in Eq.(4)
makes background pixels more compact than the one in Eq.(5).
The feature space (u, v ) has better discrimination abilities to concentrate the
vehicle pixels forming a compact cluster with variance 13.2794. It makes us easy to
separate vehicle pixels from background ones. Using this color transformation, the
critical issue becomes a 2-class separating problem in the (u, v) feature space. With
18
an input image, the color transformation is performed to project all pixels of input
image into a 2D space. Then, How to find a decision boundary to distinguish these
pixels into two different classes (i.e., vehicles vs. non-vehicles) will be described in
the next section.
3.2 Pixels Classification Using Bayesian Classifier
After transformation, a Bayesian classifier should be designed to accurately identify
vehicle pixels from background ones with colors. We assume that the RGB color
component in the (u, v) color domain are multivariate Gaussian distribution.
Assume mv and mn are the mean of the vehicle and non-vehicle pixels calculated
from collected training images in the (u, v) color domain, respectively. In addition,
∑ v and ∑ n are their corresponding covariance matrices in the same color domain
respectively, which yields
1
mv = ∑ I ( x)
n x∈( u ,v )
1
∑
t
Σv = ( I ( x) − mv )( I ( x) − mv )
n x∈( u ,v )
where n is the total training images. The probability of the point x belongs to
vehicle class is:
1
p( x|vehicle) = exp(-d v ( x )) , (6)
2π Σ v
1
( x − mv ) Σ −v 1 ( x − mv ) . Similarly, the probability of point x
t
where d v ( x ) =
2
belonging to a non-vehicle class is defined as follows:
1
p( x|non - vehicle) = exp(-d n ( x )) , (7)
2π Σ n
19
1
( x − mn ) Σ n−1 ( x − mn ) . If a point x belongs to vehicle class, the
t
where d n ( x ) =
2
probability of a pixel x satisfy:
p ( vehicle|x ) > p ( non - vehicle|x ) , (8)
With the Bayesian rule, Eq.(8) can be rewritten as follows:
p ( x | vehicle ) P ( vehicle) > p ( x | non - vehicle ) P( non - vehicle) , (9)
where P ( vehicle) and P (non-vehicle) are the priori class probabilities of vehicle
pixels and non-vehicle ones, respectively. Plugging Eqs. (6) and (7) into Eq.(9),
we can get:
p( vehicle | x ) Σn P( non - vehicle)

= exp(-d v ( x ) + d n ( x )) > . (10)
p( non - vehicle | x ) | Σv | P( vehicle)
Taking the log form of Eq.(10), we have the following classification rule:
Assign a pixel x to class “vehicle” if
d n ( x ) - d v ( x )>λ , (11)
Σ v P( non - vehicle)
where λ = log[ ] . In this way, we can get a binary image that
| Σ n | P( vehicle)
denotes the vehicle pixels extraction result.
3.3 Pixels Classification Using Neural Network
Learning is an extremely important characteristic of the biological or artificial system
full of wisdom and can be divided into two styles: learning from examples or
learning from observation and discovery. The former is supervised learning and the
later is unsupervised learning. This section describes how to use neural network
method to classify vehicle pixels from background ones.
20
Input layer Hidden Layer Output layer
Fig. 9 The basic perceptron model
The operation of neural network can be divided into two stages: learning and recall.
In the learning stage, network learns from input data using different study rules. In
each repeated training recurrences, network adjusts the weight values in order to reach
the study and recall effects. The results of learning lies in the change of the network
weight values
In this section, a neural network model called perceptron is used for pixels
classification. The basic architecture of perceptron is shown in Fig. 9. The model
contains three layers which are called input layer, hidden layer and output layer,
respectively. Basic components of perceptron are accumulators which combine each
input pattern linearly with some proper weights. Let R denotes the response
activated by the ith neuron and wi is its corresponding weight. The total response is
given by:
n
R = ∑ wi x j = w' x
j =1
21
Then, the response are forwarded into a hard limiter or threshold function f(v). In
general, the responds of a neuron are positive causing the output of hard limiter
function 1, otherwise, the negative responds makes the output 0. f(v) can be described
as the following form:
1 if v ≥ 0
f (v ) = 
0 if v < 0
In section 3.1, we used the equation 5 to transform training images containing
vehicle and non-vehicle pixels into (u, v) domain. Here, we plus a label of each
pattern transformed of the two classes (i.e., vehicle is 1 and non-vehicle is -1).
Assume we have n examples from two classes
( xi , yi ) , i = 1...n x i ∈ (u , v), yi ∈ {+1,−1}
If we want to classify all input patterns into two classes, the discriminate
hyper-plane is defined as:
n
f ( x) = ∑ wk xk − θ
k =1
Let x denotes ( xi , yi ) and θ is the threshold of hard limit function pre-defined.
If f ( x) > 0 , we say that the pattern belongs to class ω1 ; if f ( x) < 0 , it belongs to
class ω2 . The perceptron algorithm is described below:
Step 1: Initialization
Random generating each w(0)i , i = 1...n as input patterns.
22
Step 2: Calculating the output of the network.
Assume input pattern is x(n) , n denotes the nth recurrence. The outputs
of neuron are:
T
y (n) = sgn[ w (n) x(n)]
+ 1 if v > 0
sgn(v) = 
− 1 if v < 0
Step 3: Adjust w(n)i , i = 1...n according to the following rules:
w(n) + η x(n) if x(n) ∈ C and ,wT (n) x(n) < 0

 1

w(n + 1) = w(n) − η x(n) if x(n) ∈ C and , wT (n) x(n) ≥ 0
2
w(n) if x(n) _ is _ correctly _ classified .


Step 4: n=n+1, go to step 2 until the network converges or learning recurrences n

exceed some pre-defined values.
23
(a) (b)
(c) (d)
(e) (f)
Fig. 10 Vehicle color detection result. (a)~(b) Original images.(c)~(d) Color
classification result using Bayesian classifier. (e)~(f) Color classification result using
perceptron
24
3.4 Color Classification Result
This section gives some illustration examples to present the effect about color
classification. The results of color classification are given in two ways which are
described in section 3.2 and 3.3. We plot the vehicle pixels with red color and
preserve the original image for comparison. Fig. 10(a) are shot at high way. It is
noticed that the color of vehicle and road surface are both very similar to gray color.
With the view of the human eye, these two kinds of color are very close to a certain
extent. However, these two kinds of color own different property in our experiment
after color classification. The road pixels with similar gray color are rejected and
vehicle pixels passed which drawn with red color. Fig. 10(c)and (e) demonstrate our
argument. In Fig. 10(b), two vehicles are occlusive by trees but there is no miss by
the power of color classification which has shown in Fig. 10(d) and (f).
In fact, color classification process is a supervised learning. There is no guarantee
which kinds of methods are the best. There are many techniques can achieve this
goal such as Support Vector Machine (SVM), Radial basis function network
(RBF)…etc. In this paper, we only give two methods including Bayesian and
perceptron to fulfill color classification. Other experimental results are given in Fig.
11 that vehicles are shot at short range. The experimental results proved that the
proposed method is robust dealing with large variety color changes including different
viewing angle, vehicle colors and lighting conditions. Fig. 11(b) was captured at
dusky lighting situation and the color classification still performs well.
25
(a) (b)
(c) (d)
(e) (f)
Fig. 11 Another vehicle color classification results. Compared with another images,
(b) is duskier but can still perform well. (a)~(b) Original images.(c)~(d) Color
classification result using Bayesian classifier. (e)~(f) Color classification result
using perceptron .
26
CHAPTER 4
VEHICLE VERIFICATION
In the previous chapter, a novel color model and classifiers (Bayesian and neural
network) were presented to extract vehicle pixels by using colors from static images.
After that, different vehicle hypothesizes are generated to tackle the variations of
vehicle appearances. Then, a verification process is applied to verify whether there
exists a vehicle in the scene or not.
4.1 Vehicle Hypothesis
Here, a vehicle hypothesis H sI ( X ) is a sub-image extracted from a static image I
with the size ws × hs and the center X. The minimum size of detected vehicles used
in this paper is assumed to be 36 × 36 . We build a set of classes Cθi of vehicle
templates to verify the correctness of H sI ( X ) estimating its maximum response at
different orientations. Here Cθi is a collection of different vehicle templates whose
orientations are at the same angle θi . The maximum response is defined as the
maximum similarity between H sI ( X ) and all vehicle templates.
27
4.2 Vehicle Features
In this thesis, three features including vehicle contour, wavelet coefficients and
corners are used to measure the similarity. In what follows, details of each feature
are introduced.
4.2.1 Contour feature
Contour is a good feature to describe vehicle’s shapes and usually represented by
chain coding. However, chain coding is easily affected by noise. Therefore, this
paper uses a distance transform to convert an object contour to a distance map
different from chain coding. Based on this map, different vehicle hypothesis can be
well discriminated.
4.2.1.1 Edge extraction
Edge detecting is a fundamental technique in image processing. Edges in images are
areas with strong intensity and characterize the object’s boundaries. Successful edge
detection filters out most useless information such as noise but preserving the
significant structure in images. The first step before detecting edges is to filter out
noise in the original image by Gaussian smoothing.
28
Fig. 12 A 3x3 averaging mask often used for smoothing
Fig. 13 2-D Gaussian distribution with mean (0,0) and σ=1
Image smoothing is a basic pre-processing method which is used for “blur” image
and suppressed image noise, especially for impulsive noise. The basic idea, what we
called “median filter”, is used the average of a small neighborhood of a pixel in an
input image to get a new gray value replacing the original pixel. The average of the
gray value in the neighborhoods is not affected by individual noise and so median
filter eliminates impulsive noise quite well. In real implementation, a 3x3 kernel
mask (or more large, i.e., 5x5) is used as shown in Fig. 12
29
Fig. 14 Discrete approximation to Gaussian function withσ=1.4
If the image I(x, y) with size M*N and the kernel mask K (k, l) with size m*n,
mathematically, we write the convolution as:
m n
O( x, y ) = ∑∑ I ( x + k − 1, y + l − 1) K (k , l )
k =1 l =1
The Gaussian smoothing operator is also a 2D convolution operator similar to
median filter which uses a different kernel that represents the shape of a Gaussian
distribution. 2D Gaussian has the form:
x2 + y 2
1 − −
G ( x, y ) = exp 2σ 2
2πσ 2
The 2-D Gaussian distribution with mean (0,0) and σ=1 is shown in Fig. 13. The
Gaussian filter can be created by the user in terms of mask size and standard deviation
σ . In practice, a discrete approximation to the Gaussian function should be
calculated. Then, the filtering process is a convolution by the filter mask and the
image. Fig. 14 shows a suitable integer Gaussian mask that approximates a
Gaussian distribution with a σ=1.4.
30
Fig. 15 The Sobel mask in x-direction and y-direction.
Fig. 16 Edge detection by Canny operator.
After smoothing the image and eliminating the noise, the second step is to compute
the gradient of the image. Here, the Sobel operator is used for estimating the
gradient in the x-direction and y-direction according to Fig. 15. Then, the magnitude
of edge gradient can be approximated by the following formula:
| G |=| Gx | + | G y |
Finally, Canny method are performed to get more precise edges including non-
maximum suppression thinning the edges and linking edges segments using two
thresholds. Fig. 16 shows some examples of our training vehicle images using
Canny method.
31
Fig. 17 The value of y is nonlinearly increased when x increases.
4.2.1.2 Distance transform
After getting the binary edge image, a 3×3 mask is used to detect all boundary points
from a vehicle image. When this mask is used and moved at a non-zero pixel p, if
one pixel in this mask is zero, then p is a boundary pixel. Assume that BV is a set
of boundary pixels extracted from a vehicle V. Then, the distance transform of a
pixel p in V is defined as
DTV ( p ) = min d ( p, q) , (12)

q∈BV
where d ( p, q) is the Euclidian distance between p and q. Eq.(12) is further modified
to enhance the strength of distance changes as follows
DT V ( p ) = min d ( p, q) × exp(κ d ( p, q)) , (13)

q∈BV
where κ = 0.1 . Like Fig. 17, when x increases more, the value of y will increase
more rapidly than x. Fig. 18(b) shows the result of the distance transform of Fig. 18
(a). Thus, according to Eq.(13), a set FC (V ) of contour features can be extracted
from a vehicle V. Scanning all pixels of V in a row major order, FC (V ) can be
32
then represented as a row vector, i.e.,
FC (V ) = [ DT V ( p0 ),...., DT V ( pi ),....] , (14)
where all pi belong to V and i is the scanning index.
(a) (b)
Fig. 18 Result of distance transform. (a) Original Image. (b) Distance

transform of (a).
4.2.2 Wavelet Coefficients
Wavelet transform is a very useful tool to represent images at different resolutions.
It has been successfully applied in many applications like compression, watermarking,
texture analysis, communications, and so on. The wavelet transform uses two kinds
of filers to decompose a signal into different resolutions, i.e., the low-pass filter h( k )
and the high-pass one g ( k ) . Then, given a discrete signal f(n) (assumed at the fine
resolution j=0 and represented as S0 f ( n ) ), with the low-pass filter h( k ) , the
approximation of f(n) at lower resolution j-1 can be calculated as follows:
33
∞
S j −1 f ( n ) = ∑S
k =−∞
j f ( k )h( k − 2 n ) . (15)
h(k) 2↓ Sj-1 f(n)

f(n)
g(k) 2↓ Wj-1 f(n)
(a)
h[k] 2↓ LL
h[k] 2↓ L
g[k] 2↓ LH
f(m,n)
h[k] 2↓ HL
g[k] 2↓ H
g[k] 2↓ HH
(b)
Fig. 19 Block diagram of discrete wavelet transform.

(a) 1D Wavelet transform. (b) 2D Wavelet transform.
In addition, information lost between S j f ( n ) and S j −1 f ( n ) can be obtained using
the high-pass filter g ( k ) as follows
∞
W j −1 f ( n ) = ∑S
k =−∞
j f ( k ) g ( k − 2n ) . (16)
From the view of signal processing, S j −1 f ( n ) and W j −1 f ( n ) are, respectively, the
components of low frequency and high frequency of S j f (n ) . The above procedure,
which is also known as the sub-band coding, can be repeatedly performed. Fig. 19(a)
shows the diagram of 1D wavelet transform. The 1D wavelet transform can be
easily extended to two dimensions. The simplest way to generate 2D wavelet
transform is to apply two 1D transforms to the rows and columns of a 2D signal f(m,
34
n), respectively. Fig. 19(b) shows the block diagram of 2D wavelet
Fig. 20 Wavelet decomposition of three scales
transform. Given f(m, n), convolving its rows with h( k ) and g ( k ) , we get two
sub-images whose horizontal resolutions are reduced by a factor 2. Both sub-images
are then filtered columnwise and down-sampled to yield four quarter-size output
subimages.
The filters h( k ) and g ( k ) we use are the D4 family of Daubeches’s basis, i.e.,
1+ 3 3 + 3 3 − 3 1− 3
{h(0), h(1), h(2), h(3)}= { , , , } and {g(0), g(1), g(2), g(3)}
4 2 4 2 4 2 4 2
= {h(3), -h(2), h(1), -h(0)}. In this paper, a three-scale wavelet transform is used to
process all vehicle images. Then, each wavelet coefficient is quantized to three
levels, i.e., 1, 0, -1, if its value is larger than 0, equal to 0, and less than 0, respectively.
After that, all the quantized coefficients are recorded for further recognition. As
shown in Fig. 20, when recording, each wavelet coefficient is further classified into
different bands, i.e., LL, LH, HL, and HH. According to this classification, a pixel p
is labeled as 1, 2, 2, and 4 if it locates in the LL, LH, HL, HH bands, respectively.
Let l ( p ) denote the labeling vale of p. Then, given a vehicle V, from its wavelet
coefficients, we can extract a set FW (V ) of WT features. Scanning V in a row-major
order, FW (V ) can be further represented as a row vector, i.e.,
35
FW (V ) = [l ( p0 )CoeffVW ( p0 ),...., l ( pi )CoeffVW ( pi ),....] , (17)
where all pi belong to V and i is the scanning index.
4.2.3 Corner Features
Corner is another type of image features like edge. Corner means interesting points
of the object which have stable invariance property even suffers noise, rotation,
compression, and scale or illumination variation. They are often used in image
alignment (homography, fundamental matrix), motion tracking, and image retrieval.
Corner happens in image intensity which has significant change in all directions, yet
edge has no change along the edge direction.
The Harris corner detector is a popular one using the locally averaged moment
matrix M computed from the image gradients:
I 2 Ix Iy 
M = ∑ w( x, y )  x 
x, y  I x I y I y2 
where w(x, y) means a window function, Ix and Iy means the first derivative of
image in x and y direction respectively. Then, combines the eigenvalues( λ1 and λ2 )
of M, which describes the intensity structure of local image, to measure of corner
response R:
R = det M − k (trace M ) 2
det M = λ1 * λ2
trace M = λ1 + λ2
36
where k is empirical constant, k = 0.04~0.06. The maximum value, larger than a
threshold, indicates the corner’s position. To avoid corners due to image noise, a
Gaussian filter can be used to smooth the image firstly.
(a) (b)
Fig. 21. Corner detection of vehicles (a)Vehicle contains many corners

(b)Comparison with background, vehicle contains more corners than background thus
corners features can be adapted as features.
Vehicles contain strong edges and lines with different orientation and scales. The
corners happen in areas of cross lines. In Fig. 21, we present the results of the corner
detector in two vehicle images. Obviously, the area with vehicles often contains
many corners than background.
4.3 Integration and Similarity Measurement
In Sections 4.2.1 and 4.2.2, two features have been illustrated to describe the visual
characteristics of a vehicle template V. We are now able to integrate these two
features together for computing the similarity between H sI ( X ) and V. Given V,
based on Eqs. (14) and (17), we can extract two feature vectors FC (V ) and
FW (V ) from its contour and wavelet transform, respectively. For convince, we
combine these two features together to form a new feature vector F (V ) , i.e.,
37
F (V ) = [FC (V ), FW (V )] . For a vehicle class Cθi , if there are Nθi templates in Cθi ,
we can calculate its mean µθi and variance Σθi of F (V ) from all samples V in
Cθi . Then, given a vehicle hypothesis H, the similarly between H and Cθi can be
measured by this equation:
S ( H , Cθi ) = exp(-( FH - µθi )Σθ-1i ( FH - µθi )t ) , (18)
where t means the transpose of a vector. Therefore, given a position X, its vehicle
response can be defined as follows
R( X ) = max S ( H sI ( X ), Cθi ) . (19)

s ,θi
When calculating Eq.(19), the parameter θi can be further eliminated if the direction
of the hypothesis H sI ( X ) is known in advance. In [18], a good moment-based
method is provided for estimating the orientation of the longest axis of a region. If
θ sI ( X ) is denoted the orientation of H sI ( X ) , Eq.(19) can be then rewritten
R( X ) = max S ( H sI ( X ), Cθ I ( X ) ) . (20)
s s
Fig. 22. The cascade classifier used for vehicle detection

38
Fig. 23. Image pyramid structure. Assume the original image size is 320*240,
processing the image at each resolution rescaling the original size with 0.8 ratios until
pre-defined resolution is achieved.
4.4 Verification Procedure
This section describes the method constructing a cascade classifier that improves
detection performance and reduces the computation time. We follow the idea from
Viola and Jones [19] to construct a simple cascade classifier. The classifiers are
connected in cascade to create a pipeline structure. In general, a classifier with low
threshold causes higher detection rates and higher false positive rates. Once many
features can be utilized, a progressive classifier should be designed.
As shown in Fig. 22, corner features formed a simple classifier which is used to
eliminate the almost “impossible” candidates though some false candidates survive.
The threshold of the corner classifier can be adjusted such that the detection rate is
39
close to 100%. In the second step, edge maps and Wavelet coefficients are combined
to form more complex classifiers to achieve low false positive rates. Thus, we
prevent to verify most candidates with all features and save a large number of
computation times. The negative candidates in any classifier are rejected and the
survivals get into the subsequent classifiers. Subsequent classifiers eliminate
additional negatives but require additional edges and wavelet features. Finally a real
vehicle is determined through the cascade classifier.
In real implementation, we borrow a well-known pyramid technique from face
detection methodologies [19]-[21] to speed up the calculation of Eq.(20). This
technique constructs a pyramid structure (see Fig. 23) of image by gradually resizing
the input image. For a vehicle pixel X at the full resolution, its corresponding
vehicle hypothesis H sI ( X ) will be generated at the pyramid level s. Then, R( X )
can be found by searching the optimal value of S ( H sI ( X ), Cθ I ( X ) ) across each

s
pyramid level. Two thresholds are used to remove spurious responses and to declare
whether a vehicle is detected at the position X. Let λR be the average value of
R( X ) for all the centers X of the training vehicle samples. For a vehicle pixel X, if
its response R( X ) is larger than 0.8 λR , it is considered a vehicle candidate. In
addition to λR , another threshold λC (threshold of corners) is used to remove false
detections of vehicles. If X contains a real vehicle, the number of corners around X
should be larger than λC . The parameters λR and Σθ-1i (the weight used in
Eq.(18)) can also be learned using the AdaBoost algorithm [22] for increasing the
accuracy of vehicle verification.
40
(a) (b)
Fig. 24 Red points represent the possible vehicle candidates with stronger responses.
These points should be clustered by nearest-neighbor algorithm. (a) Original image
(b) The white area denotes the region of possible vehicle pixels.
From experimental results, we can find that the above verification scheme performs
well enough in detecting all reveal vehicles. Finally, due to noise or shadows, there
would be many vehicle candidates which are overlapped together. These candidates
should be eliminated if they are inside other stronger candidates. Once an image
conations many vehicles and they parks very closely to each other, location of a
vehicle position is starting when the verification procedure is completed. A real
vehicle position may contain more than one vehicle pixel that has the stronger
respones. As shown in Fig. 24, these candidates are performed nearest-neighbor
algorithm for locating the best position of vehicles or separating the two vehicles that
are very close to each other.
41
CHAPTER 5
EXPERIMENTAL RESULTS
5.1 Data Set
To ensure our proposed method works well under large varieties of the data, training
examples were collected in different days, different weather conditions, and different
viewing angles. Especially, vehicle training examples are collected in different
orientations and various colors where vehicle parks on road, high way, and parking lot,
etc. The images used in our vehicle detection system were collected in the campus
of National Central University and Zong-shan high way during different seasons.
Highway images were captured in the summer of 2002 and the others were captured
from the summer of 2003 until the spring of 2004.
5.2 Performance Analysis of Pixels Classification
The dimension of training vehicles is clipped into the size of 36×36. To tackle the
variations of vehicle orientation, eight classes of vehicles with different orientations
were collected in training samples. In order to analyze the robustness and
effectiveness of the proposed method, several experiments under different conditions
were demonstrated in this paper. The first experiment was conducted to evaluate the
performance of our vehicle-color detection method. Fig. 25 shows the result of
detecting vehicle colors using Eq.(11). (a) is the original image and (b) the result
obtained from (a). To evaluate and measure the performances of our proposed
42
method to detect vehicle colors, the precision and false-alarm rates are defined.
Precision is the ratio of the number of correctly detected vehicle pixels to the number
of exactly existing vehicle pixels. False alarm rate is the ratio of the number of
background pixels but misclassified as vehicles to the number of all background
pixels. These two measures are defined as:
Precision = Cvehicle / Nvehicle and
Rate of False-Alarm = Fvehicle / Nbackground-pixels,
where Nvehicle is the total number of vehicle pixels, Cvehicle the number of correctly
detected vehicle pixels, Nbackground-pixels the number of all background pixels, and
Fvehicle the number of background pixels but misclassified as vehicle ones. When
calculating these two measures, the ground truth of vehicle pixels was manually
obtained. In Fig. 25 (a) and (b), the precision rate and false-alarm rate of vehicle
pixel detection were 86.1% and 6.3%, respectively. The lower false-alarm rate
implied that most of background pixels were filtered out and didn’t need to be further
verified. Thus, many redundant searches can be avoided in advance and the
verification process can be significantly speeded up to find desired vehicles. It is
noticed that none of vehicle candidates was missed at this stage of vehicle hypothesis
generation. Fig. 26 shows another result of vehicle color detection. The precision
rate and false-alarm rate of vehicle pixel detection are 89.9% and 2.1%, respectively.
All of possible vehicle candidates were also correctly extracted.
43
(a) (b)
Fig. 25. Result of vehicle color detection. (a) Original image. (b) Detection result
of vehicle color.
(a) (b)
Fig. 26. Result of vehicle color detection. (a) Original image. (b) Detection result of
vehicle color.
5.3 Detection Result in Various Environments
Some detection examples are given in this section. All testing images are collected
outdoor under different lighting and weather conditions even vehicles contain various
sizes, shapes and orientation. Although these vehicles have different colors, all of
them were correctly detected and located. Fig. 27 shows result of vehicle detection
in the parking lot. The proposed method is suitable for constructing a parking space
display system. This system can provide drivers real-time and accurate information
of free parking space.
44
Fig. 27. Result of vehicle detection in a parking lot
Fig. 28. Result of vehicle detection in a parking lot with different orientation.
Fig. 28 shows another result of vehicle detection when vehicles’ pose with another
orientation. Fig. 29 shows result of vehicle detection when vehicles driven on road.
Fig. 30 shows two results of vehicle detection when vehicles were driven on highways.
This technique can also be used for counting numbers of vehicle in duration to
estimate the traffic flow. Fig. 31 shows another result of vehicle detection on road.
It is noticed that although vehicles were occluded by a tree, they still were correctly
detected. The average processing time is 0.54 ~ 0.72 seconds per image depending
on vehicle numbers. The average accuracy rate of vehicle detection using the
proposed algorithm is 94.5% and the false alarm rate is 3.2%.
45
Fig. 29. Result of vehicle detection on road.
Fig. 30. Result of detecting vehicles from highway. Although these vehicles were
with different colors, all of them were correctly detected.
Fig. 31. Result of vehicle detection in road with occlusion.
46
CHAPTER 6
DISCUSSIONS AND CONCLUSIONS
6.1 Discussions
In this section, a brief discussion will be addressed about the proposed color model.
Although the experimental results demonstrate satisfactory results of color
classification, some false alarms still exist. They may be resulted from the following
two reasons.
1. The fraction of the proposed color model is calculated and generated by K-L
transform. In fact, the eigenvector of each eigenvalue is floating points
represented as the coefficient of R, G, and B components. For convenience
sake, we take them to be nearly an integer. The accuracy will be more or less
lost in this part to a certain extent.
2. The training set is made up of two groups (i.e., vehicle and non-vehicle). The
performance of the Bayesian classifier is heavily influenced by the collected
training set. Some erroneous judgments are due to the lack of representative
non-vehicle training set. The choosing of good “non-vehicle” examples will
efficiently decrease the false classification.
6.2 Conclusions
In this thesis, a novel vehicle detection method is presented to detect various vehicles
47
from static images. Firstly, a novel color projection method is presented. All
pixels of the input image are projected onto a 2D feature space such that vehicle
pixels form a compact cluster and can thus be easily identified from background ones.
Many redundant vehicle candidates are eliminated in advance using the Bayesian
classifier.
Then, three features including corners, edge maps, and wavelet coefficients are
employed to form a cascade and multi-channel classifier. The correctness of each
vehicle hypothesis can be effectively calculated even with different sizes and
orientations. Since the classifier can well record different changes of vehicle
appearances, real vehicles can be accurately detected from static images. The
contributions of this thesis can be summarized as follows:
(a) A novel color model is proposed to identify vehicle colors pixels from
background ones. This identification procedure eliminates most impossible
candidates before the performing of vehicle verification. Different from
other methods [6]-[7] which need an exhaustive search to find possible
vehicles candidates, the proposed method detects vehicles more quickly and
efficiently.
(b) A cascade and multi-channel classifier is proposed to verify each vehicle
hypothesis. According to this classier, an effective scan is performed to
verify all vehicle candidates from static images even though they have
different sizes and orientations.
The proposed method is robust in dealing with various outdoor images containing
different weather and lighting conditions. Experimental results demonstrate the
superiority of our proposed method in vehicle detection.
48
References
[1] Z. Sun, G. Bebis, and R. Miller, “On-road vehicle detection using optical sensors:
a view,” 2004 IEEE Intelligent Transportation Systems Conference, pp.585-590,
Washington, D.C., USA, Oct. 3-6, 2004.
[2] V. Kastinaki, M. Zervakis, and K. Kalaitozakis, “A survey of video processing
techniques for traffic applications,” Image, Vision, and Computing, vol. 21, no. 4,
pp.359-381, April 2003.
[3] R. Cucchiara, P. Mello, M. Piccardi, “Image analysis and rule-based reasoning
for a traffic monitoring,” IEEE Trans. on Intelligent Transportation Systems, vol.
3, no. 1, pp.37-47, March 2002.
[4] S. Gupte, O. Masoud, R. F. K. Martin, and N. P. Papanikolopoulos, “Detection
and classification of vehicles,” IEEE Trans. on Intelligent Transportation
Systems, vol. 1, no. 2, pp.119-130, June 2000.
[5] G. L. Foresti, V. Murino, C. Regazzoni, “Vehicle recognition and tracking from
road image sequences,” IEEE Trans. on Vehicular Technology, vol. 48, no. 1,
pp.301-318, Jan. 1999.
[6] J. Wu, X. Zhang, and J. Zhou, “Vehicle detection in static road images with
PCA-and- wavelet-based classifier,” 2001 IEEE Intelligent Transportation
Systems Conference, pp.740-744, Oakland, C.A., USA, Aug. 25-29, 2001.
[7] Z. Sun, G. Bebis, and R. Miller, “On-road vehicle detection using Gabor filters
and support vector machines,” IEEE International Conference on Digital Signal
Processing, Santorini, Greece, July 2002.
[8] A. Broggi, P. Cerri, and P. C. Antonello, “Multi-resolution vehicle detection
using artificial vision,” 2004 IEEE Intelligent Vehicles Symposium, pp. 310- 314,
June 2004.
[9] M. Bertozzi, A. Broggi, and S. Castelluccio, “A real-time oriented system for
vehicle detection,” Journal of Systems Architecture, pp. 317-325, 1997.
[10] C. Tzomakas and W. Seelen, “Vehicle detection in traffic scenes using shadow,”
Tech. Rep. 98-06, Institut fur neuroinformatik, Ruhtuniversitat, Bochum,
Germany, 1998.
49
[11] A. Lakshmi Ratan, W.E.L. Grimson, and W.M. Wells, “Object detection and
localization by dynamic template warping,” International Journal of Computer
Vision, vol. 36, no. 2, pp.131-148, 2000.
[12] A. Bensrhair, et al., “Stereo vision-based feature extraction for vehicle
detection,” 2002 IEEE Intelligent Vehicles Symposium, vol. 2, pp. 465-470, June
2002.
[13] T. Aizawa, et al., “Road surface estimation against vehicles’ existence for
stereo-based vehicle detection,” IEEE 5th International Conference on
Intelligent Transportation Systems, pp. 43-48, Sep. 2002.
[14] J. C. Rojas and J. D. Crisman, “Vehicle Detection in Color Images,” IEEE
Conference on Intelligent Transportation System, pp.403-408, Nov. 9-11, 1997.
[15] D. Guo et al., “Color modeling by spherical influence field in sensing driving
environment,” 2000 IEEE Intelligent Vehicles Symposium, pp. 249- 254, Oct. 3-5
2000.
[16] Y. Ohta, T. Kanade, and T. Sakai, “Color Information for Region Segmentation,”
Computer Graphics and Image Processing, vol. 13, pp. 222-241, 1980.
[17] G. Healey, “Segmenting Images Using Normalized Color,” IEEE Transactions
on Systems, Man, and Cybernetics, vol. 22, no. 1 , pp. 64-73, 1992.
[18] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis and Machine
Vision, London, U. K.: Chapman & Hall, 1993.
[19] P. Viola and M. J. Jones, “Robust Real-Time Face Detection,” International
Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, May 2004.
[20] E. Osuna, R. Freund, and F. Girosi, “Training support vector machines: an
application to face detection,” IEEE Proceeding of Computer Vision and Pattern
Recognition, vol. 6, pp.130-136, 1997.
[21] K.K. Sung and T. Poggio, “Example-Based Learning for View-Based Human
Face Detection,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 20, no. 1, 39-51, 1998.
[22] R. E. Shapire and Y. Singer, “Improving Boosting Algorithms Using
Confidence-rated Predictions,” Machine Learning, vol. 37, no. 3, pp. 297-336,
Dec. 1999.
50
[23] JULIUS T. Tou, RAFAEL C. GONALEZ, “Pattern Recognition Principles”,
Addison-Wesley Publishing Company
[24] Abjijit S.Pandya, Robert B. Macy, “Pattern Recognition with Neural Networks in
C++”, A CRC Book Published in Cooperation with IEEE Press.
51

92522084

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

92522084

Uploaded by

Copyright:

Available Formats

國 立 中 央 大 學

Vehicle Detection Using Normalized Color and

系所： 資 訊 工 程 所博士 ;碩士班

日期：民國 九十四 年 六 月 二十四 日

information from static images is presented. Different from traditional methods

excellent capabilities in identifying vehicle pixels from background ones even

though the pixels are under varying illuminations.

After finding possible vehicle candidates, three important features including

cascade and multi-channel classifier. According to this classifier, an effective

scanning is performed to verify all possible candidates. The scanning can be

feature is powerful in the detection of vehicles. The average accuracy rate of

vehicle detection is 94.5%.

CHAPTER 2 CONVENTIONAL METHODS FOR DATA ANALYSIS ........................................7

2.1 KARHUNEN-LOE`VE TRANSFORM ...................................................................................8

CHAPTER 3 VEHICLE COLOR DETECTOR ............................................................................13

3.1 COLOR FEATURES FOR DIMENSIONALITY REDUCTION ................................................15

CHAPTER 4 VEHICLE VERIFICATION.....................................................................................27

4.1 VEHICLE HYPOTHESIS ...................................................................................................27

CHAPTER 5 EXPERIMENTAL RESULTS...................................................................................42

5.1 DATA SET ........................................................................................................................42

CHAPTER 6 DISCUSSIONS AND CONCLUSIONS ......................................................................47

6.1 DISCUSSIONS ..................................................................................................................47

Fig. 1 Details of the proposed vehicle detector...........................................................5

Due to the development of the scientific and technological civilization, automobiles

have already become an indispensable tool in replacing walking in the modernized

precious resources and the problem of environmental pollution.

information of available parking spaces. Hence, the providing of useful road

concentration. Once a vehicle can be accurately tracked through a road network,

more complex traffic parameters, such as linked-travel-time, can be computed.

Therefore, vehicle detection is critical to traffic management systems and intelligence

transport systems for collision avoidance or traffic flow control.

In this thesis, we plan to develop an intelligent vehicle detection system based on

spaces in the indoor/outdoor parking lots or along the roads.

1.2 Review of Related Works

Vehicle detection [1]-[2] is an important problem to be resolved in many related

applications, such as self-guided vehicles, driver assistance system, intelligent parking

One of the common approaches to vehicle detection is the using of vision-based

robust and effective vision-based vehicle detection system is challenging. To

researchers [2]-[5] used background subtraction to extract motion features for

classifier. In addition to textures, “symmetry” is another important feature used in

prone to false detections, such as symmetrical objects in complex background.

image. Although color is an important perceptual descriptor, there were seldom

As mentioned previously, vehicles have larger variations in appearance including

transform can be found to accurately capture the visual characteristics of vehicles

pixels will be distributed sufficiently concentrating on a smaller area. The model is

devised to form a multi-channel classifier. The classifier is modeled using a Gaussian

classifier records many vehicle appearance changes, it possesses good discriminative

accomplished efficiently. Moreover, vehicle still can be detected successfully even

experimental results verify the superiority of the proposed method in detecting

Fig. 1 Details of the proposed vehicle detector.

1.3 Overview of the Proposed System

distinguished from non-vehicle ones. Here, a Bayesian network is adopted for

edges, coefficients of wavelet transform, and corners are employed to eliminate

The rest of the thesis is organized as follows. Chapter 2 introduces three

transform in finding vehicle colors are described in Chapter 3. Chapter 4 discusses

the details of feature extraction and the proposed multi-channel classifier.

Experimental results are demonstrated in Chapter 5 to verify the validity of the

proposed method. Finally, conclusions are given in Chapter 6.

CONVENTIONAL MEHTODS FOR

were intensively discussed in pattern recognition. Feature extraction concentrates on

original data set to avoid noise jamming. Dimensionality reduction is often

performed for feature extraction in reducing computational load. In addition, fewer

clustering technique is employed to aggregate the critical features for recognition.

2.1 Karhunen-Loe`ve Transform

國立中央大學

系所：資訊工程所博士 ;碩士班

日期：民國九十四年六月二十四日