
Vehicle Detection in Aerial Surveillance Using Dynamic Bayesian Networks

Abstract
We present an automatic vehicle detection system for aerial surveillance in this paper. In this system, we escape from the stereotype and existing frameworks of vehicle detection in aerial surveillance, which are either region based or sliding window based. We design a pixelwise classification method for vehicle detection. The novelty lies in the fact that, in spite of performing pixelwise classification, relations among neighboring pixels in a region are preserved in the feature extraction process. We consider features including vehicle colors and local features. For vehicle color extraction, we utilize a color transform to separate vehicle colors and non-vehicle colors effectively. For edge detection, we apply moment preserving to adjust the thresholds of the Canny edge detector automatically, which increases the adaptability and the accuracy for detection in various aerial images. Afterward, a dynamic Bayesian network (DBN) is constructed for the classification purpose. We convert regional local features into quantitative observations that can be referenced when applying pixelwise classification via DBN. Experiments were conducted on a wide variety of aerial videos. The results demonstrate flexibility and good generalization abilities of the proposed method on a challenging data set with aerial surveillance images taken at different heights and under different camera angles.

Introduction:
Aerial surveillance has a long history in the military for observing enemy activities and in the commercial world for monitoring resources such as forests and crops. Similar imaging techniques are used in aerial news gathering and search and rescue. Aerial surveillance has been performed primarily using film or electronic framing cameras. The objective has been to gather high resolution still images of an area under surveillance that could later be examined by human or machine analysts to derive information of interest. Currently, there is growing interest in using video cameras for these tasks. Video captures dynamic events that cannot be understood from aerial still images. It enables feedback and triggering of actions based on dynamic events and provides crucial and timely intelligence and understanding that is not otherwise available. Video observations can be used to detect and geo-locate moving objects in real time and to control the camera, for example, to follow detected vehicles or constantly monitor a site. However, video also brings new technical challenges. Video cameras have lower resolution than framing cameras. In order to get the resolution required to identify objects on the ground, it is generally necessary to use a telephoto lens with a narrow field of view. This leads to the most serious shortcoming of video in surveillance: it provides only a "soda straw" view of the scene. The camera must then be scanned to cover extended regions of interest. An observer watching this video must pay constant attention, as objects of interest move rapidly in and out of the camera field of view. The video also lacks a larger visual context: the observer has difficulty perceiving the relative locations of objects seen at one point in time to objects seen moments before. In addition, geodetic coordinates for objects of interest seen in the video are not available. One of the main topics in aerial image analysis is scene registration and alignment.
Another very important topic in intelligent aerial surveillance is vehicle detection and tracking. The challenges of vehicle detection in aerial surveillance include camera motions such as panning, tilting, and rotation. In addition, airborne platforms at different heights result in different sizes of target objects.

In this paper, we design a new vehicle detection framework that preserves the advantages of the existing works and avoids their drawbacks. The framework can be divided into the training phase and the detection phase. In the training phase, we extract multiple features including local edge and corner features, as well as vehicle colors, to train a dynamic Bayesian network (DBN). In the detection phase, we first perform background color removal. Afterward, the same feature extraction procedure is performed as in the training phase. The extracted features serve as the evidence to infer the unknown state of the trained DBN, which indicates whether a pixel belongs to a vehicle or not. In this paper, we do not perform region based classification, which would highly depend on results of color segmentation algorithms such as mean shift. There is no need to generate multiscale sliding windows either. The distinguishing feature of the proposed framework is that the detection task is based on pixelwise classification. However, the features are extracted in a neighborhood region of each pixel. Therefore, the extracted features comprise not only pixel level information but also relationships among neighboring pixels in a region. Such a design is more effective and efficient than region based or multiscale sliding window detection methods.

Literature Survey
Vehicle Detection in Aerial Images using Generic Features, Grouping and Context

This work introduces a new approach to automatic vehicle detection in monocular large scale aerial images. The extraction is based on a hierarchical model that describes the prominent vehicle features on different levels of detail. Besides the object properties, the model also comprises contextual knowledge, i.e., relations between a vehicle and other objects such as the pavement beside a vehicle and the sun causing a vehicle's shadow projection. This approach neither relies on external information like digital maps or site models, nor is it limited to very specific vehicle models.

Models
The knowledge about vehicles which is represented by the model is summarized in Fig. 1. Due to the use of aerial imagery, i.e., almost vertical views, the model comprises exclusively features of the vehicle's upper surface and neglects the features of the sides up to now. 3D information is used for predicting the shadow region, though. At the lowest level of detail (Level 1), we model a vehicle's outline as a convex, somewhat compact, and almost rectangular 2D region, possibly showing high edge density inside. This captures nearly all typical vehicles except buses and trucks. At the next level (Level 2), we incorporate basic vehicle substructures, e.g., the windshields and uniformly colored surfaces (front, top, and rear) that form a sequence of connected 2D rectangles, denoted as rectangle sequences in the sequel. This level still describes many types of passenger cars, vans, and small trucks since the rectangles may vary in their size and, what is more, an object instance is not constrained to consist of all of the modeled substructures (exemplified by two instances at Level 2 in Fig. 1). The last level (Level 3) contains 3D information and local context of vehicles. Depending on the substructures of the previous level, different simple height profiles are introduced which enrich the 2D model with the third dimension. As a vehicle's local context we include the bright and homogeneous pavement around a vehicle and a vehicle's shadow projection. The shadow projection in particular has proven to be a reliable feature, since aerial images are only taken under excellent weather conditions, at least for civil applications. Please note that the context model needs no external information except ground control points for geocoding the images. If all image orientation parameters and the date and time of day of the image acquisition are known, which is indeed the standard case for aerial imagery, then it is easy to compute the direction of the sun rays and derive from it the shadow region projected on the horizontal road surface.

Extraction strategy
The extraction strategy is derived from the vehicle model and, consequently, follows the two paradigms "coarse to fine" and "hypothesize and test". It consists of the following 4 steps: (1) Creation of Regions of Interest (RoIs), which is mainly based on edge voting for convex and compact regions, (2) Hypotheses formation for deriving rectangle sequences from extracted lines, edges, and surfaces, (3) Hypotheses validation and selection, which includes the radiometric and geometric analysis of rectangle sequences, and (4) Verification using 3D vehicle models and their local context. See also Fig. 2 for an illustration of the individual steps. In order to avoid time-consuming grouping algorithms in the early stages of extraction, we first focus on generic image features such as edges, lines, and surfaces.

Creation of RoIs: We start with the extraction of edges, link them into contours, and robustly estimate the local direction and curvature at each contour point. Similar to a generalized (i.e., elliptical) Hough Transform, we use direction and curvature as well as intervals for length and width of vehicles for deriving the centers and orientations of the RoIs (Fig. 2 a and b).

Hypotheses formation: One of the most obvious vehicle features in aerial images is the line-like dark front windshield. Therefore, we extract dark lines of a certain width, length, and straightness in the RoIs using the differential geometric approach of Steger and, because vehicles are usually uniformly colored, we select those lines as windshield candidates which have symmetric contrast on both sides (see bold white lines in Fig. 2 c). Front, top, and rear of a vehicle mostly appear as a more or less bright "blob" in the vicinity of a windshield. Thus, we robustly fit a second order gray value surface to each image part adjacent to windshield candidates (apexes are indicated by white crosses in Fig. 2 c) and keep only those blobs whose estimated surface parameters satisfy the condition of an ellipsoidal or parabolic surface. As a side effect, both algorithms for line and surface extraction return the contours of their bounding edges (bold black lines in Fig. 2 c). These contours and additional edges extracted in their neighborhood (thin white lines in Fig. 2 c) are approximated by line segments. Then, we group the segments into rectangles and, furthermore, rectangle sequences. Fitting the rectangle sequences to the dominant direction of the underlying image edges yields the final 2D vehicle hypotheses (Fig. 2 d).

Hypotheses validation and selection
Before constructing 3D models from the 2D hypotheses we check the hypotheses for validity. Each rectangle sequence is evaluated regarding the geometric criteria length, width, and length/width ratio, as well as the radiometric criteria homogeneity of an individual rectangle and gray value constancy of rectangles connected with a "windshield rectangle". The validation is based on fuzzy set theory and ensures that only promising, non-overlapping hypotheses are selected (Fig. 2 e).

3D model generation and verification
In order to approximate the 3D shape of a vehicle hypothesis, a particular height profile is selected from a set of predefined profiles. Please note that the hypothesis' underlying rectangles remain unchanged, i.e., the height values of the profile refer to the image edges perpendicular to the vehicle direction. The selection of the profiles depends on the extracted substructures, i.e., the shape of the validated rectangle sequence. We distinguish rectangle sequences corresponding to 3 types of vehicles: hatchback cars, saloon cars, and other vehicles such as vans, small trucks, etc. In contrast to hatchback and saloon cars, the derivation of an accurate height profile for the last category would require a deeper analysis of the hypotheses (e.g., for an unambiguous determination of the vehicle orientation). Hence, in this case, we approximate the height profile only roughly by an elliptic arc having a constant height offset above the ground. After creating a 3D model from the 2D hypothesis and the respective height profile we are able to predict the boundary of a vehicle's shadow projection on the underlying road surface. A vehicle hypothesis is judged as verified if a dark and homogeneous region is extracted beside the shadowed part of the vehicle and a bright and homogeneous region beside the illuminated part, respectively (Fig. 2 f).

Vehicle Detection Using Normalized Color and Edge Map
This work presents a novel vehicle detection approach for detecting vehicles from static images using color and edges. Different from traditional methods, which use motion features to detect vehicles, this method introduces a new color transform model to find important "vehicle colors" for quickly locating possible vehicle candidates. Since vehicles have various colors under different weather and lighting conditions, few works have been proposed for the detection of vehicles using colors. The new color transform model has excellent capabilities to identify vehicle pixels from background, even though the pixels are lit under varying illuminations. After finding possible vehicle candidates, three important features, including corners, edge maps, and coefficients of wavelet transforms, are used for constructing a cascade multichannel classifier. According to this classifier, an effective scanning can be performed to verify all possible candidates quickly. The scanning process can be quickly achieved because most background pixels are eliminated in advance by the color feature.

This work presents a novel system to detect vehicles from static images using edges and vehicle colors. The flowchart of this system is shown in Fig. 3. At the beginning, a color transformation is used to project all the colors of input pixels onto a color space such that vehicle pixels can be easily identified from backgrounds. Here, a Bayesian classifier is used for this identification. Then, each detected vehicle pixel will correspond to a possible vehicle. Since vehicles have different sizes and orientations, different vehicle hypotheses are generated from each detected vehicle. For verifying each hypothesis, we use three kinds of vehicle features to filter out all impossible vehicle candidates. The features include edges, coefficients of wavelet transform, and corners. Using proper weights obtained from a set of training samples, these features can then be combined together to form an optimal vehicle classifier. Then, desired vehicles can be very robustly and accurately verified and detected from static images.

Detection of Vehicles and Vehicle Queues for Road Monitoring Using High Resolution Aerial Images
This work introduces a new approach to automatic vehicle detection in monocular high resolution aerial images. The extraction relies upon both local and global features of vehicles and vehicle queues, respectively. To model a vehicle on local level, a 3D wireframe representation is used that describes the prominent geometric and radiometric features of cars including their shadow region. The model is adaptive because, during extraction, the expected saliencies of various edge features are automatically adjusted depending on viewing angle, vehicle color measured from the image, and current illumination direction. The extraction is carried out by matching this model "top down" to the image and evaluating the support found in the image. On global level, the detailed local description is extended by more generic knowledge about vehicles as they are often part of vehicle queues. Such groupings of vehicles are modeled by ribbons that exhibit the typical symmetries and spacings of vehicles over a larger distance. Queue extraction includes the computation of directional edge symmetry measures resulting in a symmetry map, in which dominant, smooth curvilinear structures are searched for. By fusing vehicles found using the local and the global model, the overall extraction gets more complete and more correct.

Vehicle detection is carried out by a top-down matching algorithm. A comparison with a grouping scheme that groups image features such as edges and homogeneous regions into car-like structures has shown that matching the complete geometric model top down to the image is more robust. A reason for this is that, in general, bottom-up grouping needs reliable features as seed hypotheses, which are hardly given in the case of such small objects as cars. Another disadvantage of grouping refers to the fact that we must constrain our detection algorithm to monocular images, since vehicles may move within the time of two exposures. Reconstructing a 3D object from monocular images by grouping involves many more ambiguities than matching a model of the object to the image. The steps of detection can be summarized as follows:
1. Extract edge pixels and compute gradient direction.
2. Project the geometric model including shadow region to edge pixel and align the model's reference point and direction with the gradient direction.
3. Measure reference color / intensity at roof region.
4. Adapt the expected saliency of the edge features depending on position, orientation, color, and sun direction.
5. Measure features from the image: edge amplitude support of each model edge, edge direction support of each model edge, color constancy, darkness of shadow.
6. Compute a matching score (a likelihood) by comparing measured values with expected values.
7. Based on the likelihood, decide whether the car hypothesis is accepted or not.

Vehicle Queue Detection
Vehicle queue detection is based on searching for one-vehicle-wide ribbons that are characterized by:
Significant directional symmetries of gray value edges with symmetry maxima defining the queue's center line.
Frequent intersections of short and perpendicularly oriented edges with homogeneous distribution along the center line.
High parallel edge support at both sides of the center line.
Sufficient length.

Fast Vehicle Detection and Tracking In Aerial Image Bursts
This work presents a combination of vehicle detection and tracking which is adapted to the special restrictions given on image size and flow but nevertheless yields reliable information about the traffic situation. By combining a set of modified edge filters it is possible to detect cars of different sizes and orientations with minimum computing effort, if some a priori information about the street network is used. The found vehicles are tracked between two consecutive images by an algorithm using Singular Value Decomposition. Based on their distance and correlation, the features are assigned pairwise with respect to their global positioning among each other. By choosing only the best correlating assignments it is possible to compute reliable values for the average velocities.

Preprocessing
To identify the active regions as well as the orientation of images among each other, the images have to be georeferenced, which means their absolute geographic position and dimension have to be defined. Using the GPS/IMU information and a digital terrain model, the image data is projected into GeoTIFF images, which are planar and oriented to the north. This is useful to combine the recorded images with existing datasets like maps or street data. To avoid examining the whole image data, only the street area given by a database is considered.

Detection
To provide fast detection of traffic objects in the large images, a set of modified edge filters that represent a two dimensional car model is used. Recent tests showed that the car's color information does not yield better results in detection than its gray value. Therefore the original images are converted into gray images. This conversion saves two thirds of filtering time. As there is additional information about street area and orientation, this knowledge is used as well. The databases provided by Navteq (www.navteq.com) and Atkis (www.atkis.de), for example, contain that information about the street network. For every street segment covered by the image a bounding box around it is cut out. The subimage is masked with the street segment so that the filters are applied only to the traffic area. We use neither a Hough transformation for finding straight edges nor a filter in the shape of the whole car. Instead, we create four specially shaped edge filters to represent all edges of the car model, which are elongated to the average expected size and turned into the direction given by the street database. To avoid filtering for all different car sizes, we only shift the filter answers (Fig. 4) to the expected car edges within a certain range. This has the same effect as positioning the filter kernels around an anchor point. In the conjunction image of the four thresholded and shifted edge images, blobs remain at the positions where all four filters have answered strongly enough to the related edge filter. The remaining regions are thinned by a non maxima suppression until one pixel each is left representing the car's center. For bigger vehicles like trucks the same filter answers are used.
To recognize long edges without using new filters, the given answers of the side edges are shifted along the side of the car and always combined by conjunction with each other. To avoid cars being detected twice, all observations are tested pairwise for their distances among each other. Some observations have more than one maximum, or vehicles are detected twice between two neighboring street segments. With respect to their size and orientation, objects below a certain distance to each other are discarded while only the one with the strongest intensity remains.

Vehicle Detection and Tracking Using the Block Matching Algorithm
This work describes an approach to vehicle detection and tracking fully based on the Block Matching Algorithm (BMA), which is the motion estimation algorithm employed in the MPEG compression standard. BMA partitions the current frame into small, fixed size blocks and matches them in the previous frame in order to estimate block displacements (referred to as motion vectors) between two successive frames. The detection and tracking approach is as follows. BMA provides motion vectors, which are then regularised using a Vector Median Filter. After the regularisation step, motion vectors are grouped based on their adjacency and similarity, and a set of vehicles is built for each single frame. Finally, the tracking algorithm establishes the correspondences between the vehicles detected in each frame of the sequence, allowing the estimation of their trajectories as well as the detection of new entries and exits. The tracking algorithm is strongly based on the BMA. BMA consists in partitioning each frame of a given sequence into square blocks of a fixed size (in pixels: 6x6 or 8x8 or ...) and detecting block displacement between the current frame and the previous one, searching inside a given scan area. It provides an associated field of displacement vectors (DVF). Each block encloses a part of the image and is a matrix containing the grey tones of that image part. The task is to estimate each block's position in the previous frame in order to calculate displacements and speeds (knowing the time Δt between frames). Each block defines in the previous frame a "scan area", centered on the block center. The block is shifted pixel by pixel inside the scan area, calculating a match measure at each shift position. The comparison is aimed at determining the pixel set most similar to the block among its possible positions in the scan area. Among these positions, the scan area subpart defined by its center will be the matrix with the best match measure.

Vehicle Detection and Tracking for the Urban Challenge
The Urban Challenge 2007 was a race of autonomous vehicles through an urban environment organized by the U.S. government. During the competition the vehicles encountered various typical scenarios of urban driving. Here they had to interact with other traffic which was human or machine driven. This work describes the perception approach taken by Team Tartan Racing, winner of the competition. A focus is set on the detection and tracking of other vehicles around the robot. The presented approach allows a situation specific interpretation of perception data through situation assessment algorithms, keeping the perception algorithms situation independent.

Vision-based detection, tracking and classification of vehicles using stable features with automatic camera calibration
A system for detection, tracking and classification of vehicles based on feature point tracking is presented in this chapter. An overview of the system is shown in Figure. Feature points are automatically detected and tracked through the video sequence, and features lying on the background or on shadows are removed by background subtraction, leaving only features on the moving vehicles. These features are then separated into two categories: stable and unstable. Using a plumb line projection (PLP), the 3D coordinates of the stable features are computed, these stable features are grouped together to provide a segmentation of the vehicles, and the unstable features are then assigned to these groups. The final step involves eliminating groups that do not appear to be vehicles, establishing correspondence between groups detected in different image frames to achieve long term tracking, and classifying vehicles based upon the number of unstable features in the group. The details of these steps are described in the following subsections.

Vehicle Detection and Tracking in Car Video Based on Motion Model
This work aims at real time in-car video analysis to detect and track vehicles ahead for safety, auto driving, and target tracing. This paper describes a comprehensive approach to localize target vehicles in video under various environmental conditions. The extracted geometry features from the video are projected onto a 1D profile continuously and are tracked constantly. We rely on temporal information of features and their motion behaviors for vehicle identification, which compensates for the complexity in recognizing vehicle shapes, colors, and types. We model the motion in the field of view probabilistically according to the scene characteristic and vehicle motion model. The Hidden Markov Model is used for separating target vehicles from background, and tracking them probabilistically.

Existing System
Hinz and Baumgartner utilized a hierarchical model that describes different levels of details of vehicle features. No specific vehicle models are assumed, making the method flexible. However, their system would miss vehicles when the contrast is weak or when the influences of neighboring objects are present. Cheng and Butler considered multiple clues and used a mixture of experts to merge the clues for vehicle detection in aerial images. They performed color segmentation via the mean shift algorithm and motion analysis via change detection. In addition, they presented a trainable sequential maximum a posteriori method for multiscale analysis and enforcement of contextual information. However, the motion analysis algorithm applied in their system cannot deal with the aforementioned camera motions and complex background changes. Moreover, in the information fusion step, their algorithm highly depends on the color segmentation results. Lin et al. proposed a method that subtracts the background colors of each frame and then refines vehicle candidate regions by enforcing size constraints of vehicles. However, they assumed too many parameters, such as the largest and smallest sizes of vehicles, and the height and the focus of the airborne camera. Assuming these parameters as known priors might not be realistic in real applications. Other authors proposed a moving vehicle detection method based on cascade classifiers. A large number of positive and negative training samples need to be collected for the training purpose. Moreover, multiscale sliding windows are generated at the detection stage. The main disadvantage of this method is that there are a lot of missed detections on rotated vehicles. Such results are not surprising given the experience of face detection using cascade classifiers. If only frontal faces are trained, then faces with poses are easily missed. However, if faces with poses are added as positive samples, the number of false alarms would surge.

Disadvantage
The hierarchical model system would miss vehicles when the contrast is weak or when the influences of neighboring objects are present.
The results of existing methods highly depend on the color segmentation.
There are a lot of missed detections on rotated vehicles.
A vehicle tends to be separated into many regions, since car roofs and windshields usually have different colors.
High computational complexity.

Proposed System
In this paper, we design a new vehicle detection framework that preserves the advantages of the existing works and avoids their drawbacks. The framework can be divided into the training phase and the detection phase. In the training phase, we extract multiple features including local edge and corner features, as well as vehicle colors, to train a dynamic Bayesian network (DBN). In the detection phase, we first perform background color removal. Afterward, the same feature extraction procedure is performed as in the training phase. The extracted features serve as the evidence to infer the unknown state of the trained DBN, which indicates whether a pixel belongs to a vehicle or not. In this paper, we do not perform region based classification, which would highly depend on results of color segmentation algorithms such as mean shift. There is no need to generate multiscale sliding windows either. The distinguishing feature of the proposed framework is that the detection task is based on pixelwise classification. However, the features are extracted in a neighborhood region of each pixel. Therefore, the extracted features comprise not only pixel level information but also relationships among neighboring pixels in a region. Such a design is more effective and efficient than region based or multiscale sliding window detection methods.

Advantage
More effective and efficient.
It does not require a large amount of training samples.
It increases the adaptability and the accuracy for detection in various aerial images.

Hardware requirements:

Processor     : Any Processor above 500 MHz.
Ram           : 1 GB.
Hard Disk     : 10 GB.
Compact Disk  : 650 Mb.
Input device  : Standard Keyboard and Mouse.

Software requirements:

Operating System : Windows XP.
Technology       : NetBeans 7.2, JDK 1.6

System Architecture

Modules
1. Frame Extraction
2. Background color removal
3. Feature Extraction
4. Classification
5. Post Processing

Module Description
1. Frame Extraction
In this module we read the input video and extract its frames.

[Flow: Video -> Frame Extraction -> Frame Image]

2. Background color removal
In this module we construct the color histogram of each frame and remove the colors that appear most frequently in the scene. These removed pixels do not need to be considered in subsequent detection processes. Performing background color removal can not only reduce false alarms but also speed up the detection process.

[Flow: Frame Image -> Construct Color Histogram -> Remove frequent colors -> Get the background-removed image]
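The histogram step of this module can be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: the class name BackgroundColorRemovalSketch, the quantization to 4 bits per channel, and the backgroundFraction threshold are all hypothetical choices, and pixels are assumed to be packed as 0xRRGGBB integers.

```java
final class BackgroundColorRemovalSketch {
    // Assumption: keep the top 4 bits of each channel, giving 16*16*16 = 4096 bins.
    static final int BITS_PER_CHANNEL = 4;

    /** Maps a packed 0xRRGGBB pixel to its histogram bin index. */
    static int binOf(int rgb) {
        int shift = 8 - BITS_PER_CHANNEL;
        int r = ((rgb >> 16) & 0xFF) >> shift;
        int g = ((rgb >> 8) & 0xFF) >> shift;
        int b = (rgb & 0xFF) >> shift;
        return (r << (2 * BITS_PER_CHANNEL)) | (g << BITS_PER_CHANNEL) | b;
    }

    /**
     * Returns a mask that is true for pixels to keep and false for pixels
     * whose color bin holds at least backgroundFraction of the frame
     * (hypothetical rule for "most frequent colors").
     */
    static boolean[] foregroundMask(int[] pixels, double backgroundFraction) {
        int[] histogram = new int[1 << (3 * BITS_PER_CHANNEL)];
        for (int rgb : pixels) {
            histogram[binOf(rgb)]++;
        }
        double threshold = backgroundFraction * pixels.length;
        boolean[] keep = new boolean[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            keep[i] = histogram[binOf(pixels[i])] < threshold;
        }
        return keep;
    }
}
```

For a java.awt.image.BufferedImage, the pixel array could be obtained with image.getRGB(0, 0, w, h, null, 0, w); the alpha byte in the top bits is ignored by binOf.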

42 !eature '%traction In this module we extract the feature from the image frame. In this module we do the following *dge &etection, "orner &etection, color Transformation and color classification.

[Figure: Frame Image -> Detect the Edge, Detect Corner, Transform color]

4. Classification
In this module we perform pixel-wise classification for vehicle detection using a dynamic Bayesian network (DBN). In the training stage, we obtain the conditional probability tables of the DBN model via the expectation-maximization algorithm by providing the ground-truth labeling of each pixel and its corresponding observed features from several training videos. In the detection phase, the Bayesian rule is used to obtain the probability that a pixel belongs to a vehicle.

[Figure: Frame Image -> Get pixel values -> Apply DBN -> Detect vehicle]
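The detection-phase use of the Bayesian rule can be sketched as below. This is a deliberately reduced illustration, not the paper's DBN: it assumes a single time slice, binary observations that are conditionally independent given the vehicle state, and a prior that could, for instance, be carried over from the previous frame. All names and probability values are hypothetical.

```java
final class PixelPosterior {
    /**
     * prior             P(vehicle) for this pixel before seeing the observations
     * likeVehicle[i]    P(observation i is true | vehicle)
     * likeBackground[i] P(observation i is true | not vehicle)
     * observed[i]       the binary observation i measured at this pixel
     * Returns P(vehicle | observations) by Bayes' rule.
     */
    static double posterior(double prior, double[] likeVehicle,
                            double[] likeBackground, boolean[] observed) {
        double pv = prior;        // unnormalized weight of the "vehicle" hypothesis
        double pb = 1.0 - prior;  // unnormalized weight of the "background" hypothesis
        for (int i = 0; i < observed.length; i++) {
            pv *= observed[i] ? likeVehicle[i] : 1.0 - likeVehicle[i];
            pb *= observed[i] ? likeBackground[i] : 1.0 - likeBackground[i];
        }
        double z = pv + pb;
        return z == 0.0 ? 0.0 : pv / z;
    }
}
```

In this simplified picture the two likelihood arrays play the role of the conditional probability tables learned in the training stage, and a pixel would be labeled as vehicle when the returned posterior exceeds a chosen threshold.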

5. Post Processing
In this module we use morphological operations to enhance the detection mask and perform connected component labeling to get the vehicle objects. The size and aspect-ratio constraints are applied again after the morphological operations to eliminate objects that cannot be vehicles.

[Figure: Detected vehicle -> Apply post processing -> Mask the detected vehicles -> Display the result]
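The labeling and filtering part of post processing might look like the following sketch. It assumes the detection mask is a boolean grid, uses 4-connectivity, and takes hypothetical area and aspect-ratio limits as parameters; the morphological operations mentioned above are left out to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class VehicleBlobs {
    /**
     * Labels 4-connected blobs in the mask and returns, for every blob that
     * passes the constraints, the array {minX, minY, maxX, maxY, area}.
     */
    static List<int[]> label(boolean[][] mask, int minArea, int maxArea, double maxAspect) {
        int h = mask.length;
        int w = h == 0 ? 0 : mask[0].length;
        boolean[][] seen = new boolean[h][w];
        List<int[]> boxes = new ArrayList<int[]>();
        int[] dx = {1, -1, 0, 0};
        int[] dy = {0, 0, 1, -1};
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!mask[y][x] || seen[y][x]) {
                    continue;
                }
                // Breadth-first flood fill of one connected component.
                int minX = x, maxX = x, minY = y, maxY = y, area = 0;
                ArrayDeque<int[]> queue = new ArrayDeque<int[]>();
                queue.add(new int[]{x, y});
                seen[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    area++;
                    minX = Math.min(minX, p[0]);
                    maxX = Math.max(maxX, p[0]);
                    minY = Math.min(minY, p[1]);
                    maxY = Math.max(maxY, p[1]);
                    for (int k = 0; k < 4; k++) {
                        int nx = p[0] + dx[k], ny = p[1] + dy[k];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && mask[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            queue.add(new int[]{nx, ny});
                        }
                    }
                }
                // Size and aspect-ratio constraints on the bounding box.
                int bw = maxX - minX + 1, bh = maxY - minY + 1;
                double aspect = (double) Math.max(bw, bh) / Math.min(bw, bh);
                if (area >= minArea && area <= maxArea && aspect <= maxAspect) {
                    boxes.add(new int[]{minX, minY, maxX, maxY, area});
                }
            }
        }
        return boxes;
    }
}
```

Each returned bounding box could then be drawn onto the frame to mask and display the detected vehicles.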

Data Flow Diagram

[Figure: Input Video -> Vehicle Detection -> Detected Frame]

[Figure: Input Video -> Frame Extraction -> Background color removal -> Feature Extraction -> Classification -> Detected Frame]

UML Diagram

Use Case Diagram

[Figure: the User actor is linked to the use cases Get Input Video, Extract Frames, Remove background color, Extract Feature, Classification, Post Processing, and Detected Vehicle]

Class Diagram

Main
+String fn
+Video video1
+Image frame
+void readInputVideo()
+void extractFrame()

VehicleDetection
+Image frame
+void removeBackground()
+void extractFeature()
+void classification()
+void postProcessing()
+void storeDetectedFrame()

Activity Diagram

[Figure: Input Video -> Extract Frame -> Remove Background color -> Extract Feature -> Classification -> Post Processing]

Sequence Diagram

[Figure: lifelines Training Phase and Detection Phase, with the following messages in order]
1 : Input Video()
2 : Extract Frame()
3 : Frames()
4 : Feature Extraction()
5 : Remove background()
6 : Features()
7 : Classification()
8 : Post Preprocessing()

Collaboration diagram

[Figure: objects Training Phase and Detection Phase, exchanging the following messages]
1 : Input Video()
2 : Extract Frame()
3 : Frames()
4 : Feature Extraction()
5 : Remove background()
6 : Features()
7 : Classification()
8 : Post Preprocessing()

Software Description

Java Technology
Java technology is both a programming language and a platform.

The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
Simple
Architecture neutral
Object oriented
Portable
Distributed
High performance
Interpreted
Multithreaded
Robust
Dynamic
Secure

With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, you first translate a program into an intermediate language called Java bytecodes, the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java bytecode instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.

WORKING OF JAVA

You can think of Java bytecodes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it's a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java bytecodes help make "write once, run anywhere" possible. You can compile your program into bytecodes on any platform that has a Java compiler. The bytecodes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.

The Java Platform
A platform is the hardware or software environment in which a program runs. We've already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it's a software-only platform that runs on top of other hardware-based platforms. The Java platform has two components:

The Java Virtual Machine (Java VM)
The Java Application Programming Interface (Java API)

You've already been introduced to the Java VM. It's the base for the Java platform and is ported onto various hardware-based platforms. The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights what functionality some of the packages in the Java API provide. The following figure depicts a program that's running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.

THE JAVA PLATFORM

Native code is code that, once compiled, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time bytecode compilers can bring performance close to that of native code without threatening portability.

What Can Java Technology Do?
The most common types of programs written in the Java programming language are applets and applications. If you've surfed the Web, you're probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser. However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.

An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet. A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java Web servers, configuring or tailoring the server.

How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:

The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.

Applets: The set of conventions used by applets.

Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.

Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.

Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.

Software components: Known as JavaBeans™, can plug into existing component architectures.

Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).

Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.

The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.

FIGURE 4 - JAVA 2 SDK

ODBC
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax that is needed to port their program from one database to another when business needs suddenly change.

Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.

Windows 95 does not install the ODBC system files on your system. Rather, they are installed when you set up a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program, and each maintains a separate list of ODBC data sources.

From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn't change whether it talks to Oracle or SQL Server. We only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.

The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn't as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to those who said that compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.

JDBC
In an effort to set an independent database standard API for Java, Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of "plug-in" database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, he or she must provide the driver for each platform that the database and Java run on.

To gain a wider acceptance of JDBC, Sun based JDBC's framework on ODBC. As you discovered earlier in this chapter, ODBC has widespread support on a variety of platforms. Basing JDBC on ODBC will allow vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution.

JDBC was announced in March of 1996. It was released for a 90-day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after. The remainder of this section will cover enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC. That would fill an entire book.

JDBC Goals

Few software packages are designed without goals in mind. JDBC is one that, because of its many goals, drove the development of the API. These goals, in conjunction with early reviewer feedback, have finalized the JDBC class library into a solid framework for building database applications in Java.

The goals that were set for JDBC are important. They will give you some insight as to why certain classes and functionalities behave the way they do. JDBC has eight design goals, seven of which are listed below:

1. SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows for future tool vendors to "generate" JDBC code and to hide many of JDBC's complexities from the end user.

2. SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC will allow any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.

3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must "sit" on top of other common SQL-level APIs. This goal allows JDBC to use existing ODBC-level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.

4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java's acceptance in the user community thus far, the designers feel that they should not stray from the current design of the core Java system.

5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.

6. Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors appear at runtime.

7. Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs, and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
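To make the "keep the common cases simple" and "strong, static typing" goals concrete, here is a small sketch of a parameterized SELECT through the standard java.sql API. The table and column names (detections, frame_no, vehicle_count) and the JDBC URL in the usage note are hypothetical and do not come from this project; the code also assumes a Java 7 or later compiler for try-with-resources, which is newer than the JDK listed in the software requirements.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class JdbcSketch {
    // Hypothetical table and columns, used only for illustration.
    static final String QUERY =
            "SELECT frame_no FROM detections WHERE vehicle_count >= ?";

    /** Runs a simple typed SELECT on an already opened connection. */
    static List<Integer> framesWithVehicles(Connection con, int minCount) throws SQLException {
        List<Integer> frames = new ArrayList<Integer>();
        try (PreparedStatement ps = con.prepareStatement(QUERY)) {
            ps.setInt(1, minCount); // typed parameter binding
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    frames.add(rs.getInt("frame_no"));
                }
            }
        }
        return frames;
    }

    /** Returns true if some registered JDBC driver accepts the URL and connects. */
    static boolean canConnect(String url) {
        try (Connection con = DriverManager.getConnection(url)) {
            return true;
        } catch (SQLException e) {
            return false; // no suitable driver, or the connection attempt failed
        }
    }
}
```

The same framesWithVehicles method would work unchanged against any database for which a driver is on the classpath, since only the URL passed to DriverManager.getConnection differs between vendors.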

NetBeans
The NetBeans IDE is open source and is written in the Java programming language. It provides the services common to creating desktop applications, such as window and menu management and settings storage, and is also the first IDE to fully support JDK 5.0 features. The NetBeans platform and IDE are free for commercial and noncommercial use, and they are supported by Sun Microsystems. It can be downloaded from http://www.netbeans.org/

Features and Tools
The NetBeans IDE has many features and tools for each of the Java platforms. Those in the following list are not limited to the Java SE platform but are useful for building, debugging, and deploying applications and applets:

Source Code Editor

Syntax highlighting for Java, JavaScript, XML, HTML, CSS, JSP, IDL
Customizable fonts, colors, and keyboard shortcuts
Live parsing and error marking
Pop-up Javadoc for quick access to documentation
Advanced code completion
Automatic indentation, which is customizable
Word matching with the same initial prefixes
Navigation of current class and commonly used features
Macros and abbreviations
Goto declaration and Goto class
Matching brace highlighting
JumpList allows you to return the cursor to previous modification

GUI Builder

Fully WYSIWYG designer with Test Form feature
Support for visual and nonvisual forms
Extensible Component Palette with preinstalled Swing and AWT components
Component Inspector showing a component's tree and properties
Automatic one-way code generation, fully customizable
Support for AWT/Swing layout managers, drag-and-drop layout customization
Powerful visual editor
Support for null layout
In-place editing of text labels of components, such as labels, buttons, and text fields
JavaBeans support, including installing, using, and customizing properties, events, and customizers
Visual JavaBean customization: ability to create forms from any JavaBean classes
Connecting beans using Connection wizard
Zoom view ability

Database Support

Database schema browsing to see the tables, views, and stored procedures defined in a database
Database schema editing using wizards
Data view to see data stored in tables
SQL and DDL command execution to help you write and execute more complicated SQL or DDL commands
Migration of table definitions across databases from different vendors
Works with databases, such as MySQL, PostgreSQL, Oracle, IBM DB2, Microsoft SQL Server, PointBase, Sybase, Informix, Cloudscape, Derby, and more

The NetBeans IDE also provides full-featured refactoring tools, which allow you to rename and move classes, fields, and methods, as well as change method parameters. In addition, you get a debugger and an Ant-based project system.

Screen Shot

Sample Coding

import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.media.*;
import javax.media.control.TrackControl;
import javax.media.format.*;
import javax.media.util.BufferToImage;
import javax.swing.*;

// Reads a video with the Java Media Framework (JMF) and writes every frame
// to the Frames folder as a JPEG file. MainFrame is the project's GUI class.
public class FrameCapture implements ControllerListener {
    Processor p;
    Object waitSync = new Object();
    boolean stateTransitionOK = true;
    public boolean alreadyPrnt = false;
    int temp = 1; // running frame number, used as the output file name

    MainFrame mf;

    FrameCapture(MainFrame m) {
        mf = m;
    }

    // Opens the media, attaches the frame-access codec to the video track and starts playback.
    public boolean open(MediaLocator ml) {
        try {
            p = Manager.createProcessor(ml);
        } catch (Exception e) {
            System.err.println("Failed to create a processor from the given url: " + e);
            return false;
        }
        p.addControllerListener(this);
        p.configure();
        if (!waitForState(Processor.Configured)) {
            System.err.println("Failed to configure the processor.");
            return false;
        }
        // Use the processor as a player: no output content descriptor.
        p.setContentDescriptor(null);
        TrackControl tc[] = p.getTrackControls();
        if (tc == null) {
            System.err.println("Failed to obtain track controls from the processor.");
            return false;
        }
        // Keep the video track and disable all other tracks.
        TrackControl videoTrack = null;
        for (int i = 0; i < tc.length; i++) {
            if (tc[i].getFormat() instanceof VideoFormat)
                videoTrack = tc[i];
            else
                tc[i].setEnabled(false);
        }
        if (videoTrack == null) {
            System.err.println("The input media does not contain a video track.");
            return false;
        }
        String videoFormat = videoTrack.getFormat().toString();
        Dimension videoSize = parseVideoSize(videoFormat);
        try {
            Codec codec[] = { new PostAccessCodec(videoSize) };
            videoTrack.setCodecChain(codec);
        } catch (Exception e) {
            System.err.println("The process does not support effects.");
        }
        p.prefetch();
        if (!waitForState(Processor.Prefetched)) {
            System.err.println("Failed to realise the processor.");
            return false;
        }
        p.start();
        return true;
    }

    // Extracts "<width>x<height>" from the textual description of the video format.
    public Dimension parseVideoSize(String videoSize) {
        int x = 300, y = 200;
        StringTokenizer strtok = new StringTokenizer(videoSize, ", ");
        strtok.nextToken();
        String size = strtok.nextToken();
        StringTokenizer sizeStrtok = new StringTokenizer(size, "x");
        try {
            x = Integer.parseInt(sizeStrtok.nextToken());
            y = Integer.parseInt(sizeStrtok.nextToken());
        } catch (NumberFormatException e) {
            System.out.println("unable to find video size, assuming default of 300x200");
        }
        return new Dimension(x, y);
    }

    // Blocks until the processor reaches the given state or a transition fails.
    boolean waitForState(int state) {
        synchronized (waitSync) {
            try {
                while (p.getState() != state && stateTransitionOK)
                    waitSync.wait();
            } catch (Exception e) {
            }
        }
        return stateTransitionOK;
    }

    public void controllerUpdate(ControllerEvent evt) {
        if (evt instanceof ConfigureCompleteEvent
                || evt instanceof RealizeCompleteEvent
                || evt instanceof PrefetchCompleteEvent) {
            synchronized (waitSync) {
                stateTransitionOK = true;
                waitSync.notifyAll();
            }
        } else if (evt instanceof ResourceUnavailableEvent) {
            synchronized (waitSync) {
                stateTransitionOK = false;
                waitSync.notifyAll();
            }
        } else if (evt instanceof EndOfMediaEvent) {
            p.close();
            System.exit(0);
        }
    }

    static void prUsage() {
        System.err.println("Usage: java FrameAccess <url>");
    }

    // Pass-through codec that gets a callback for every frame before it is rendered.
    public class PreAccessCodec implements Codec {
        void accessFrame(Buffer frame) {
            long t = (long) (frame.getTimeStamp() / 10000000f); // scaled timestamp, currently unused
            if (frame.getLength() == 0) {
                p.stop();
            }
        }

        protected Format supportedIns[] = new Format[] { new VideoFormat(null) };
        protected Format supportedOuts[] = new Format[] { new VideoFormat(null) };
        Format input = null, output = null;

        public String getName() {
            return "Pre Access Codec";
        }

        public void open() {}

        public void close() {}

        public void reset() {}

        public Format[] getSupportedInputFormats() {
            return supportedIns;
        }

        public Format[] getSupportedOutputFormats(Format in) {
            if (in == null)
                return supportedOuts;
            else {
                Format outs[] = new Format[1];
                outs[0] = in;
                return outs;
            }
        }

        public Format setInputFormat(Format format) {
            input = format;
            return input;
        }

        public Format setOutputFormat(Format format) {
            output = format;
            return output;
        }

        // Hands the frame to accessFrame, then swaps the data between input and output buffers.
        public int process(Buffer in, Buffer out) {
            accessFrame(in);
            Object data = in.getData();
            in.setData(out.getData());
            out.setData(data);
            out.setFlags(Buffer.FLAG_NO_SYNC);
            out.setFormat(in.getFormat());
            out.setLength(in.getLength());
            out.setOffset(in.getOffset());
            return BUFFER_PROCESSED_OK;
        }

        public Object[] getControls() {
            return new Object[0];
        }

        public Object getControl(String type) {
            return null;
        }
    }

    // Codec that converts each RGB frame to an image and saves it as Frames\<n>.jpg.
    public class PostAccessCodec extends PreAccessCodec {
        public PostAccessCodec(Dimension size) {
            supportedIns = new Format[] { new RGBFormat() };
            this.size = size;
        }

        void accessFrame(Buffer frame) {
            if (!alreadyPrnt) {
                BufferToImage stopBuffer = new BufferToImage((VideoFormat) frame.getFormat());
                Image stopImage = stopBuffer.createImage(frame);
                try {
                    BufferedImage outImage = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_RGB);
                    Graphics og = outImage.getGraphics();
                    og.drawImage(stopImage, 0, 0, size.width, size.height, null);
                    Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpg");
                    ImageWriter writer = writers.next();
                    System.out.println("fn name " + temp);
                    File f = new File("Frames\\" + String.valueOf(temp) + ".jpg");
                    temp++;
                    ImageOutputStream ios = ImageIO.createImageOutputStream(f);
                    writer.setOutput(ios);
                    writer.write(outImage);
                    ios.close();
                } catch (IOException e) {
                    System.out.println("Error :" + e);
                }
            }
            long t = (long) (frame.getTimeStamp() / 10000000f); // scaled timestamp, currently unused
            // An empty buffer marks the end of the video: stop and notify the GUI.
            if (frame.getLength() == 0) {
                System.out.println("stop");
                p.stop();
                p.close();
                JOptionPane.showMessageDialog(new JFrame(), "Frame Extracted");
                mf.jButton4.setEnabled(true);
            }
        }

        public String getName() {
            return "Post Access Codec";
        }

        private Dimension size;
    }
}

Conclusion
In this paper, we have proposed an automatic vehicle detection system for aerial surveillance that does not assume any prior information of camera heights, vehicle sizes, and aspect ratios. In this system, we have not performed region-based classification, which would highly depend on computationally intensive color segmentation algorithms such as mean shift. Nor have we generated multiscale sliding windows, which are not suitable for detecting rotated vehicles. Instead, we have proposed a pixelwise classification method for vehicle detection using DBNs. In spite of performing pixelwise classification, relations among neighboring pixels in a region are preserved in the feature extraction process. Therefore, the extracted features comprise not only pixel-level information but also region-level information. Since the colors of the vehicles would not dramatically change due to the influence of the camera angles and heights, we use only a small number of positive and negative samples to train the SVM for vehicle color classification. Moreover, the number of frames required to train the DBN is very small. Overall, the entire framework does not require a large amount of training samples. We have also applied moment preserving to enhance the Canny edge detector, which increases the adaptability and the accuracy of detection in various aerial images. The experimental results demonstrate flexibility and good generalization abilities of the proposed method on a challenging data set with aerial surveillance images taken at different heights and under different camera angles. For future work, performing vehicle tracking on the detected vehicles can further stabilize the detection results. Automatic vehicle detection and tracking could serve as the foundation for event analysis in intelligent aerial surveillance systems.
