Computer Engineering Department The University of Isfahan
1392/6/23
Common Approaches To Moving Target Detection
1. Optical Flow
2. Temporal Differencing
3. Background Subtraction
Optical Flow
If an observer (a camera or a human eye) moves through a 3D scene, the apparent motion of objects, surfaces, and edges caused by the relative motion between the observer and the scene is called optical flow.
Figure: (a) A Rubik's cube on a rotating turntable; (b) flow vectors calculated by comparing two images of the Rubik's cube (Russell and Norvig, 1995).

Optical Flow Advantages and Disadvantages
Target detection and tracking can be performed using optical flow even when there is no prior knowledge of the background or when the camera is moving.
Optical flow methods are computationally expensive and generally cannot run in real time without specialised hardware.
Optical flow methods are sensitive to noise.
For these reasons, optical flow is rarely preferred for real-time background generation.

Temporal Differencing
In the temporal differencing method, the arithmetic difference of corresponding pixels (same physical location) in two frames of an image sequence is computed. The two frames may be consecutive or non-consecutive. The difference image contains non-zero values wherever objects have moved and is black (zero) where no motion is detected.
The problems with differencing consecutive frames are:
1. A pre-selected threshold is needed to obtain the thresholded difference image, so the result depends on both the video sequence and the chosen threshold.
2. If an object moves so slowly that it appears stationary in two consecutive frames, it will not appear in the difference image.
3. Extending the method to use more than two frames is difficult.
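Temporal differencing of two greyscale frames can be sketched in Python with NumPy as follows. This is only an illustration: the function name `temporal_difference` and the default threshold of 25 grey levels are assumptions, not values taken from the slides.

```python
import numpy as np

def temporal_difference(prev_frame, curr_frame, threshold=25):
    """Return (difference image, binary motion mask) for two greyscale frames.

    `threshold` is an assumed, scene-dependent value (problem 1 above).
    """
    # Use a signed type so that subtracting uint8 images cannot wrap around.
    prev = np.asarray(prev_frame).astype(np.int16)
    curr = np.asarray(curr_frame).astype(np.int16)
    diff = np.abs(curr - prev).astype(np.uint8)
    # Pixels whose change exceeds the threshold are marked as moving.
    mask = diff > threshold
    return diff, mask
```

Passing the same frame twice gives an empty mask, which is exactly the slow-object failure described in problem 2.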
Background Subtraction
The most important approach to identifying and segmenting moving objects in a video stream is background subtraction, which maintains a reference (background) image that is recomputed for each new frame. Regions of the image that have changed are then identified by comparing the next input frame with the reference image. Thresholding the result produces a binary segmentation that separates the moving objects from the background regions:
$D(x, y, t) = | I(x, y, t) - B(x, y, t-1) |$
$T(x, y, t) = \mathrm{Threshold}(D(x, y, t))$
Because of several factors, the background image must be temporally adaptive and must be updated continuously to stay up to date.

Background Subtraction Example
[Figure panels: Fld249, Backgnd248, Dif_249, Threshld_249, Thresh_Sizefilt249]

Background Generation Requirements
Illumination changes: gradual variations in lighting conditions, and sudden illumination changes (such as turning a lamp on or off in an indoor environment).
Motion changes: camera oscillations, and small movements of background objects such as trees waving in the wind, sea waves, rain, or snow.
Background Generation Requirements (continued)
Changes in the background geometry: objects introduced into or removed from the scene (for example, a door that is opened and then left open, or a parked car that drives away).
Initialisation: a number of algorithms require an initialisation process.
Adaptive background removal algorithms must treat these problems and requirements as design constraints.

Background Removal Algorithms
Background removal (subtraction) methods can be classified into two broad categories: non-recursive and recursive.
Non-recursive techniques
A non-recursive technique uses a sliding-window approach that maintains a buffer holding the N previous input frames. The background image is estimated from the temporal variation of each pixel across the buffered frames. The main weakness of these techniques is their significant storage requirement.
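The sliding-window idea can be sketched with a fixed-length frame buffer and a per-pixel median, followed by the difference-and-threshold step of background subtraction. The class name `SlidingMedianBackground`, the buffer length, and the threshold are illustrative assumptions. The median is only one of several possible per-pixel estimators.

```python
from collections import deque

import numpy as np

class SlidingMedianBackground:
    """Non-recursive background model: per-pixel median of the last N frames."""

    def __init__(self, buffer_length=5):
        # deque(maxlen=N) drops the oldest frame automatically when full.
        self.frames = deque(maxlen=buffer_length)

    def update(self, frame):
        self.frames.append(np.asarray(frame, dtype=np.float32))

    def background(self):
        # Needs at least one buffered frame. Memory cost is N * frame size,
        # which is the storage weakness of non-recursive techniques.
        return np.median(np.stack(self.frames), axis=0)

    def segment(self, frame, threshold=25.0):
        # D = |I - B|, then threshold D to get a binary foreground mask.
        diff = np.abs(np.asarray(frame, dtype=np.float32) - self.background())
        return diff > threshold
```

A typical loop would call `segment(frame)` first and `update(frame)` afterwards, so that each frame is compared with a background built only from earlier frames.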
Recursive techniques
Instead of keeping a buffer, recursive techniques update a single background model recursively with each input frame. More weight is usually given to the most recent samples, so frames from the distant past have less effect on the current background model. Compared with non-recursive techniques, recursive techniques require much less storage. However, if an erroneous object is absorbed into the background image, it may remain there for a much longer period of time.

Non-recursive Techniques
Frame differencing
Mean filter
Mode filter
Median filter
Basic methods (i.e. mean, mode and median) with selectivity: each pixel in every input frame is classified as either a foreground or a background pixel. If the pixel is detected as a foreground point, it is ignored in the background update process.
Least Median of Squares:
$B_t(x, y) = \arg\min_b \, \mathrm{median}_t \, (I_t(x, y) - b)^2$
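For a single pixel, the least-median-of-squares estimate can be sketched as a search over candidate background values. Restricting the candidates to the observed intensities is an assumption made here to keep the example short, and the function name `lmeds_background_value` is hypothetical.

```python
import numpy as np

def lmeds_background_value(samples):
    """Least-median-of-squares background estimate for one pixel.

    `samples` holds the buffered intensities I_t of that pixel.
    """
    samples = np.asarray(samples, dtype=np.float64)
    # Assumption: only values that were actually observed are tried as b.
    candidates = np.unique(samples)
    # squared[j, t] = (I_t - b_j)^2 for every candidate b_j and sample I_t.
    squared = (samples[None, :] - candidates[:, None]) ** 2
    medians = np.median(squared, axis=1)
    # The estimate is the candidate with the smallest median squared residual.
    return candidates[np.argmin(medians)]
```

Because the median ignores up to half of the samples, a few frames in which a foreground object covers the pixel do not pull the estimate away from the background value.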
Non-recursive Techniques (continued)
Linear predictive filter
The background is modelled at three levels:
1. Pixel level: pixel-by-pixel linear prediction (Wiener filter) using colour information.
2. Region level: a region-filling algorithm that deals with the background object relocation problem.
3. Frame level: model switching that detects global illumination changes.
Non-parametric background model
The background pdf is obtained from the histogram of the N most recent pixel values as follows:
$f(I_t(x, y) = u) = \frac{1}{N} \sum_{i=t-N}^{t-1} K(u - I_i(x, y))$
where K(.) is a smoothing Gaussian kernel. The pixel $I_t(x, y)$ is classified as a foreground point if it is unlikely to come from this distribution, i.e., if $f(I_t(x, y))$ is smaller than a global threshold value.
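The foreground test of the non-parametric model can be sketched for one pixel as below. The kernel width `sigma`, the density threshold, and the function names are assumptions chosen for illustration. In practice the kernel width is usually estimated from the data.

```python
import numpy as np

def kde_density(value, samples, sigma=5.0):
    """Kernel density estimate at `value` from one pixel's N most recent samples.

    Uses a Gaussian kernel with an assumed standard deviation `sigma`.
    """
    samples = np.asarray(samples, dtype=np.float64)
    z = (value - samples) / sigma
    kernel = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # Average of the kernel responses over the N buffered samples.
    return kernel.mean()

def is_foreground(value, samples, threshold=1e-3, sigma=5.0):
    """Foreground if the estimated density falls below a global threshold."""
    return kde_density(value, samples, sigma) < threshold
```

A pixel value close to its recent history yields a high density and is kept as background. A value far from every buffered sample yields a density near zero and is flagged as foreground.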
Non-recursive Techniques (continued)
Standard mean-shift based estimation
In this gradient-ascent method, the modes of a multimodal distribution are detected together with their covariance matrices. The method is iterative, and its step size shrinks as it converges. For n data points $x_i$, $i = 1, \ldots, n$, in the d-dimensional space $\mathbb{R}^d$, the multivariate mean-shift vector computed with kernel g at the point x is given by:
$m_h(x) = \frac{\sum_{i=1}^{n} x_i \, g(\|(x - x_i)/h\|^2)}{\sum_{i=1}^{n} g(\|(x - x_i)/h\|^2)} - x$
where h is the kernel bandwidth. The major problems with the standard mean-shift method are that the algorithm is very slow and that it requires N (i.e. buffer length) × size(frame) memory.
Eigenbackground subtraction
Recursive Techniques
Running average
$B_{t+1}(x, y) = \alpha I_t(x, y) + (1 - \alpha) B_t(x, y)$
where $0 < \alpha < 1$. $\alpha$ is called the learning rate and is typically 0.05.
Running average with selectivity
$B_{t+1}(x, y) = \alpha I_t(x, y) + (1 - \alpha) B_t(x, y)$ if $I_t(x, y)$ is a background pixel
$B_{t+1}(x, y) = B_t(x, y)$ if $I_t(x, y)$ is a foreground pixel
Approximated median filter
The running estimate of the median is incremented by one if the input pixel is larger than the estimate, and decremented by one if it is smaller. The estimate eventually converges to a value for which half of the input pixels are larger and half are smaller, that is, the median.
Kalman filter

Recursive Techniques (continued)
Running Gaussian average
It models the background as a textured surface, each point of which is associated with a mean colour and a variance about the mean. It fits one Gaussian distribution $(\mu, \sigma)$ over the histogram, which gives the pdf of the background. It also applies the running average to update the background pdf as follows:
$\mu_{t+1}(x, y) = \alpha I_t(x, y) + (1 - \alpha) \mu_t(x, y)$
$\sigma^2_{t+1}(x, y) = \alpha (I_t(x, y) - \mu_t(x, y))^2 + (1 - \alpha) \sigma^2_t(x, y)$
The method uses a threshold to partition the background pixels into visible and occluded points by examining the test $| I_t(x, y) - \mu_t(x, y) | > k \sigma_t(x, y)$, i.e. the threshold is k standard deviations.
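The recursive updates above (running average with optional selectivity, approximated median, and running Gaussian average) can be sketched as three small NumPy functions. The function names, the step size of one grey level, and the default `k=2.5` are illustrative assumptions. The variance update uses the common form in which the squared deviation from the previous mean is blended in with the same learning rate.

```python
import numpy as np

def running_average(background, frame, alpha=0.05, foreground_mask=None):
    """B_{t+1} = alpha * I_t + (1 - alpha) * B_t.

    With a mask, foreground pixels keep their old value (selectivity).
    """
    background = np.asarray(background, dtype=np.float64)
    frame = np.asarray(frame, dtype=np.float64)
    updated = alpha * frame + (1.0 - alpha) * background
    if foreground_mask is not None:
        updated = np.where(foreground_mask, background, updated)
    return updated

def approximate_median(estimate, frame, step=1.0):
    """Move the estimate one step up or down toward the current frame."""
    estimate = np.asarray(estimate, dtype=np.float64)
    frame = np.asarray(frame, dtype=np.float64)
    return estimate + step * np.sign(frame - estimate)

def running_gaussian(mean, var, frame, alpha=0.05, k=2.5):
    """Return (new_mean, new_var, foreground_mask) for one Gaussian per pixel."""
    mean = np.asarray(mean, dtype=np.float64)
    var = np.asarray(var, dtype=np.float64)
    frame = np.asarray(frame, dtype=np.float64)
    # Test against the current model before updating it.
    foreground = np.abs(frame - mean) > k * np.sqrt(var)
    new_mean = alpha * frame + (1.0 - alpha) * mean
    new_var = alpha * (frame - mean) ** 2 + (1.0 - alpha) * var
    return new_mean, new_var, foreground
```

All three keep only one or two values per pixel, which is why recursive techniques need far less memory than a frame buffer.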
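Recursive Techniques (continued)
Mixture of Gaussians
The intensity of each background pixel is adaptively represented by a weighted sum of k Gaussians. The MOG model maintains a density function for each pixel and is therefore capable of handling multimodal background distributions. The number of modes k is usually predefined, between 3 and 5. The pixel distribution is modelled as a mixture of k Gaussians:
$f(I_t(x, y) = u) = \sum_{i=1}^{k} \omega_{i,t} \, \eta(u; \mu_{i,t}, \sigma_{i,t})$
where $\eta(u; \mu_{i,t}, \sigma_{i,t})$ is the i-th Gaussian component with intensity mean $\mu_{i,t}$ and standard deviation $\sigma_{i,t}$, and $\omega_{i,t}$ is the portion of the data accounted for by the i-th component. For each input pixel $I_t(x, y)$, the component $\hat{i}$ whose mean is closest to $I_t(x, y)$ is declared the matched component if $| I_t(x, y) - \mu_{\hat{i},t-1} | \le D \, \sigma_{\hat{i},t-1}$, where D is a small positive deviation threshold. At every new frame, the parameters of the matched component are then updated as
$\omega_{\hat{i},t} = (1 - \alpha) \, \omega_{\hat{i},t-1} + \alpha$
$\mu_{\hat{i},t} = (1 - \rho) \, \mu_{\hat{i},t-1} + \rho \, I_t(x, y)$
$\sigma^2_{\hat{i},t} = (1 - \rho) \, \sigma^2_{\hat{i},t-1} + \rho \, (I_t(x, y) - \mu_{\hat{i},t})^2$
where $\alpha$ is a user-defined learning rate with $0 \le \alpha \le 1$, and $\rho$ is the learning rate for the parameters, which can be approximated by $\rho \approx \alpha / \omega_{\hat{i},t}$.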
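The matching and update step for one pixel can be sketched as below. This is a hedged illustration rather than a full MOG implementation. The function name `mog_update`, the deviation threshold of 2.5 standard deviations, and the clipping of the parameter learning rate to at most 1 are assumptions. The handling of unmatched pixels (replacing the weakest component, decaying and renormalising the weights) is deliberately omitted.

```python
import numpy as np

def mog_update(pixel, weights, means, sigmas, alpha=0.01, deviation=2.5):
    """One match-and-update step for a single pixel's k Gaussian components.

    Returns (weights, means, sigmas, matched_index); matched_index is None
    when no component matches, in which case the parameters are unchanged.
    """
    # Work on copies so the caller's arrays are not modified in place.
    weights = np.array(weights, dtype=np.float64)
    means = np.array(means, dtype=np.float64)
    sigmas = np.array(sigmas, dtype=np.float64)

    # Candidate: the component whose mean is closest to the pixel value.
    i = int(np.argmin(np.abs(pixel - means)))
    if abs(pixel - means[i]) > deviation * sigmas[i]:
        return weights, means, sigmas, None

    # Weight moves toward 1 at the user-defined learning rate alpha.
    weights[i] = (1.0 - alpha) * weights[i] + alpha
    # Parameter learning rate, approximated by alpha / weight.
    # Assumption: clip to 1 so a tiny weight cannot overshoot the sample.
    rho = min(1.0, alpha / weights[i])
    means[i] = (1.0 - rho) * means[i] + rho * pixel
    variance = (1.0 - rho) * sigmas[i] ** 2 + rho * (pixel - means[i]) ** 2
    sigmas[i] = np.sqrt(variance)
    return weights, means, sigmas, i
```

Calling this once per pixel per frame keeps k weights, means, and standard deviations per pixel, which is where the intermediate memory cost of the mixture model comes from.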
Recursive Techniques (continued)
In order to determine whether $I_t(x, y)$ is a foreground or background pixel, all components are ranked by their $\omega_{i,t} / \sigma_{i,t}$. If $i_1, i_2, \ldots, i_k$ is the component order after sorting, the first M components that satisfy the following criterion are declared to be the background components:
$\sum_{j=1}^{M} \omega_{i_j,t} \ge \Gamma$
where $\Gamma$ is the weight threshold.
The advantages of Gaussian mixture models (GMMs) are that they deal with:
lighting changes
slow-moving objects
objects being introduced into or removed from the scene.
The drawbacks of GMMs are:
The number of mixture components is preset and fixed.
Foreground segmentation with GMMs is time-consuming because many parameters must be estimated, and their number is determined mostly by the number of mixture components.
Applying GMMs to background subtraction requires an efficient method for learning the GMM parameters, which is computationally expensive.
Therefore, the selection of the number of components and the initialisation process are two important problems of the GMM algorithm for background subtraction.

Recursive Techniques (continued)
Sequential kernel density approximation
The density is represented by a weighted sum of Gaussians whose number, weights, means and covariances are updated at each time step to incorporate the new data into the model. For each mode, a Gaussian component is created whose mean is given by the mode location. The covariance of the Gaussian is derived from the Hessian matrix computed at the mode location. The method relies on density modes that are propagated by adapting them with the new samples as follows:
$\mathrm{pdf}(x) = \alpha \, (\text{new\_mode}) + (1 - \alpha) \, (\text{existing\_modes})$
Comparison of Background Removal Algorithms
(Type: N = non-recursive, R = recursive)

Method                                   Type  Real-time              Required Memory
Frame Differencing                       N     Yes                    Low
Mean Filter                              N     Yes                    High
Mode Filter                              N     Yes                    High
Median Filter                            N     Yes                    High
Basic methods with selectivity           N     Yes                    High
Least Median of Squares (LMedS)          N     Yes                    High
Linear Predictive Filter                 N     Yes                    Intermediate
Non-parametric Background Model          N     No (Relatively Slow)   High
Standard Mean-Shift Based Estimation     N     No (Very Slow)         High
Eigenbackground Subtraction              N     No (Relatively Slow)   Intermediate
Running Average                          R     Yes                    Low
Running Average with Selectivity         R     Yes                    Low
Approximated Median Filter               R     Yes                    High
Kalman Filter                            R     No (Almost Fast)       Intermediate
Running Gaussian Average                 R     No (Almost Fast)       Intermediate
Mixture of Gaussians (MOG)               R     No (Relatively Slow)   Intermediate
Sequential Kernel Density Approximation  R     No (Relatively Slow)   Intermediate