
Motion Estimation for Moving Target Detection

VISHAL MARKANDEY, Member, IEEE
ANTHONY REID, Member, IEEE
SHENQ WANG, Member, IEEE
Texas Instruments

This paper describes a suite of techniques for the autonomous
detection of moving targets by processing electro-optical sensor
imagery (such as visible or infrared imagery). Specific application
scenarios that require moving target detection capability are
described, and solutions are developed under the constraints
imposed by the scenarios. Performance evaluation results are
presented using a test data set of over 300 images, consisting
of real imagery (visible and infrared) representative of the
application scenarios.

Manuscript received November 16, 1992; revised December 15, 1995.


IEEE Log No. T-AES/32/3/05861.
Authors' addresses: V. Markandey, Digital Video Products, Texas
Instruments, Dallas, TX 75265; A. Reid and S. Wang, Systems
Technology Center, Systems Group, Texas Instruments, Plano, TX
75086.
© 1996 IEEE
0018-9251/96/$10.00

I. INTRODUCTION

The autonomous detection of moving targets using
electro-optical sensor imagery is a requirement in
several defense application scenarios. Examples of
such applications are: detection of moving targets
from a surveillance post, a missile flyby search for
ground-based moving targets, and the detection of
airborne targets from an airborne platform. Each
application imposes a unique set of constraints, and
the solution developed needs to take these constraints
into account. For example, the detection of moving
targets from a stationary platform can be addressed by
detecting the presence of motion in an image sequence
and compensating for sensor drift if necessary. On
the other hand, in applications that involve significant
sensor motion, motion detection alone is not enough.
Sensor motion creates apparent background motion
in the imagery, and this motion can be significantly
greater than the target motion. Compensation for
apparent background motion by image registration and
segmentation of motion information into target and
background motion is required in such a scenario.
The first step in the detection of moving targets
in image sequences is the detection and estimation
of motion from sensor data. The detection of motion
can be performed by simple operations such as image
differencing followed by thresholding, and is thus
attractive from a computational standpoint. However
its applicability is limited to scenarios where there
is no background motion. Motion estimation can
provide a richer source of information that can be
useful not only in target detection but also in the
follow-on activity of target tracking. Motion estimation
techniques that provide motion measurement (optical
flow) at every pixel location in the image plane
are particularly attractive as they can be useful in
the detection of low contrast targets. While such
techniques have been extensively developed and
studied in the fields of computer vision and image
processing, not much has been done in applying
them to moving target detection. We have applied
motion estimation and analysis techniques to various
application scenarios within the domain of moving
target detection. Each scenario imposes its own unique
constraints on the problem and the solution has to
take these constraints into account. This has led us
to develop several new results in motion estimation
and analysis, primary among them being robust
motion estimation in the presence of noise, and
multiresolution motion estimation for moving sensor
scenarios. We have also developed new techniques
to process the motion estimates to produce usable
measures of target detection. In this work we describe
our development of motion estimation techniques,
their application to moving target detection application
scenarios, and the results of testing on real imagery.
Please note that we do not address target tracking


here. Our focus is on moving target detection. The


detection results can certainly be useful in tracking (as
say in initializing a tracker), but in this work we limit
ourselves to the detection problem alone.
The rest of this work is organized as follows. In
the rest of this section we describe the techniques
available in literature for both motion estimation and
moving target detection. In Section II we describe
motion estimation techniques that we have developed
for several moving target detection application
scenarios. In Section III we describe motion analysis
techniques that analyze the motion estimates provided
by Section II, to achieve moving target detection
functionality. In Section IV we provide results of
experimental testing of our techniques.
One of the common approaches to pixel level
motion estimation, or optical flow computation, is
based on the brightness constancy assumption [5],
which states that if I(x, y, t) is the image intensity at
pixel (x, y) at time t then

    \frac{dI}{dt} = 0.                                    (1)
Application of the chain rule to the term on the
left-hand side gives

    I_x u + I_y v + I_t = 0.                              (2)

Here I_x, I_y, and I_t are the partial derivatives of I
with respect to x, y, and t, respectively, and (u, v)
are the optical flow components. Horn and Schunck
[5] proposed an iterative technique for computing
optical flow. They solved (2) for (u, v) by imposing
a smoothness requirement on the flow field and
minimizing an error functional in terms of accuracy
and smoothness. A drawback of this technique from
the moving target detection standpoint is that it
smooths across motion boundaries of objects, smearing
motion discontinuities caused by occluding contours.
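
For concreteness, a minimal sketch of a Horn and Schunck style
iteration is given below, written in Python with NumPy and SciPy.
It is the standard textbook formulation rather than the authors'
implementation; the derivative estimators, the smoothness weight
alpha, and the iteration count are illustrative assumptions.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def horn_schunck(I1, I2, alpha=10.0, n_iters=100):
        """Dense optical flow from two grayscale frames via the iterative
        Horn-Schunck scheme: enforce (2) while smoothing the flow field."""
        I1, I2 = I1.astype(np.float32), I2.astype(np.float32)
        Ix = np.gradient(I1, axis=1)           # spatial derivatives
        Iy = np.gradient(I1, axis=0)
        It = I2 - I1                           # temporal derivative
        u = np.zeros_like(I1)
        v = np.zeros_like(I1)
        for _ in range(n_iters):
            # Neighborhood averages stand in for the smoothness term.
            u_bar = uniform_filter(u, size=3)
            v_bar = uniform_filter(v, size=3)
            t = (Ix * u_bar + Iy * v_bar + It) / (alpha**2 + Ix**2 + Iy**2)
            u = u_bar - Ix * t
            v = v_bar - Iy * t
        return u, v
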
Other techniques have developed solutions to (2)
for (u, v) using alternative additional constraints. Wohn,
et al. [14] applied constancy assumptions to image
functions such as spatial gradient magnitude, curvature,
and moments. In similar work, constraints for solving
(2) were obtained by applying constancy assumptions
to image functions such as contrast, and entropy in
[9]. Schunck [11] developed a technique that solves (2)
after transforming it into a polar form. This form is
considered convenient for representing image flows
with discontinuities, as the polar equation will not have
δ-functions at discontinuities. Koch, et al. [7] addressed
the problem of smoothing across discontinuities by
using the concept of binary line processes which
explicitly mark the presence of discontinuities. The
line process terms are encoded as a modification of
the Horn and Schunck [5] minimization terms for
smoothness.
While (1) has been used as the basis of many
optical flow techniques as discussed above, it is not

a realistic assumption in many cases. It requires that


the image brightness corresponding to a physical
surface patch remain unchanged over time. This is
not true when points on an object are obscured or
revealed in successive image frames, or when an object
moves such that the light is incident at a given point
on the object from a different angle. This causes the
surface shading to vary. In view of this, Cornelius
and Kanade [2] developed a variation of the Horn
and Schunck method. In their formulation (1) need
not hold true, and gradual changes are allowed in the
way an object appears in a sequence of images. This
is done by defining a smoothness measure for change
in brightness variation and an error measure between
(1) and (2), and minimizing a weighted sum of these
two quantities and the spatial gradient of the velocity
components defined by Horn and Schunck. Gennert
and Negahdaripour [3] expanded further on this idea.
Their approach allows a global linear transformation
between brightness values in consecutive images. The
approaches of Cornelius and Kanade, and of Horn
and Schunck, can be shown to be special cases of this
method. They provide qualitative results to show that
their technique performs better than the Horn and
Schunck, and Cornelius and Kanade techniques under
certain changing illumination conditions.
Instead of considering the brightness constancy
assumption (1) as the starting point of optical flow
computation, a gradient constancy requirement was
used in [4, 12]. This gradient constancy assumption is
embodied by the equation,
    \frac{d}{dt}(\nabla I) = 0                            (3)

where \nabla is the spatial gradient operator over the image
plane. Equation (3) can be rewritten as

    I_{xx} u + I_{xy} v + I_{xt} = 0                      (4)

    I_{xy} u + I_{yy} v + I_{yt} = 0                      (5)

where (u, v) are the optical flow components. Optical


flow computation algorithms based on the gradient
constancy assumption require the solution of two
linear equations to compute the optical flow, a
computationally simple noniterative operation. By
contrast the brightness constancy based algorithms are
much more computationally complex (e.g., some of
them are iterative). Thus gradient constancy algorithms
have a computational advantage over brightness
constancy algorithms. On the other hand, gradient
constancy algorithms tend to be more noise sensitive
than brightness constancy algorithms because they
use second-order image derivatives while brightness
constancy algorithms use first-order image derivatives.
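
Since (4) and (5) form a 2 x 2 linear system at each pixel, the
noniterative solve can be written compactly. The sketch below is an
illustration using simple finite-difference derivative estimates and
a small guard against singular systems (both our assumptions); it is
not the exact derivative scheme of any of the cited techniques.

    import numpy as np

    def gradient_constancy_flow(I1, I2, eps=1e-6):
        """Noniterative flow from the gradient constancy equations (4)-(5):
        solve the 2x2 system for (u, v) at every pixel by Cramer's rule."""
        I1, I2 = I1.astype(np.float32), I2.astype(np.float32)
        Ix, Iy = np.gradient(I1, axis=1), np.gradient(I1, axis=0)
        Ixx = np.gradient(Ix, axis=1)
        Ixy = np.gradient(Ix, axis=0)
        Iyy = np.gradient(Iy, axis=0)
        # Temporal derivatives of the spatial gradient (frame differences).
        Ixt = np.gradient(I2, axis=1) - Ix
        Iyt = np.gradient(I2, axis=0) - Iy
        det = Ixx * Iyy - Ixy * Ixy
        det = np.where(np.abs(det) < eps, eps, det)   # avoid division by zero
        u = (-Ixt * Iyy + Iyt * Ixy) / det
        v = (-Iyt * Ixx + Ixt * Ixy) / det
        return u, v
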
The motion estimation techniques discussed
so far are relevant to target detection from a
stationary sensor or where the apparent background
motion in the imagery (induced by sensor motion)


is of comparable magnitude to the target motion


as perceived in the imagery. However there are
application scenarios such as air-to-ground target
detection or air-to-air target detection, where the
camera-induced scene motion is often significantly
greater than that of the targets of interest. This means that in
order to achieve moving target detection functionality,
image analysis must be precise enough to detect the
small differential in the apparent motions of target and
background in the image plane. Also, camera-induced
scene motion can be nonuniform across the image,
due to perspective effects and sensor maneuvering. An
example is closure, where the apparent background
motion due to sensor motion is zero at the focus of
expansion and has varying magnitude and direction in
different parts of the image. Motion estimation and
analysis techniques should be able to distinguish such
effects from the effects of target motion in imagery.
The simple optical flow computation techniques
discussed above prove inadequate in such cases, and
more sophisticated techniques are needed for the
computation of optical flow fields accurate enough to
represent the subtle variations due to target motion.
A technique developed by Burt, et al. [1] computes
estimates of motion at various scales, or levels of
image resolution, for the moving sensor moving
target detection problem. Initial motion estimates
computed at low resolution are used to register
imagery at successive levels of resolution and residual
motion estimates are computed at each resolution
level. Registration continues until the background is
completely registered and the only apparent motion in
a sequence is due to target motion. Image differencing
is then used to discern the moving target regions.
A shortcoming of this method is that, depending
on the differential between magnitudes of target
and apparent background motion, unless one knows
a priori when to stop the registration process, one
could register the target as well as background, so
that the moving targets are not detected in the final
difference imagery. Also, the technique may not
work when the magnitudes of target and apparent
background motion are similar (e.g., some cases of
panning sensors), or if part of the apparent background
motion is smaller in magnitude than target motion
(e.g., in case of closure).
The preceding discussion focused on the estimation
of motion, the first step of moving target detection.
The motion estimates have to be further processed to
produce the final output of moving target detection:
a target list containing centroid and bounding box
information of detected targets. For stationary sensors
(including sensors with drift), optical flow discontinuity
detection and histogram segmentation approaches
have been used by Russo, et al. [10] to demonstrate
the feasibility of performing moving target detection.
These techniques provide only a visual output, and not
an explicit target list.

II. MOTION ESTIMATION FOR MOVING TARGET DETECTION

All of the techniques described in Section I for


brightness constancy and gradient constancy based
optical flow computation have one shortcoming in
common: they do not take into account the fact that
in real applications, noise is invariably present in
the image intensity measurements and will lead to
violations of their constancy assumptions. Optical
flow fields computed from these techniques using real,
noisy data can therefore be noise sensitive. Attempts
are usually made in these techniques to reduce noise
effects by performing spatial smoothing operations on
the flow field estimates computed from the constraints.
Such attempts have limited applicability because of
their ad hoc nature.
The least squares technique proposed by Kearney
[6] differs from the approaches enumerated above, in
that it provides a formal, mathematical mechanism
to account for the presence of noise in image
measurements. This technique is based on the
assumption that the optical flow field is constant
within a spatial neighborhood of a pixel. A brightness
constancy constraint (2) is obtained from each pixel
in this neighborhood, leading to an overconstrained
system of linear equations of the optical flow
components, i.e.,
    \begin{bmatrix}
    I_{x1} & I_{y1} \\
    I_{x2} & I_{y2} \\
    \vdots & \vdots \\
    I_{xn} & I_{yn}
    \end{bmatrix}
    \begin{bmatrix} u \\ v \end{bmatrix}
    = -\begin{bmatrix}
    I_{t1} \\
    I_{t2} \\
    \vdots \\
    I_{tn}
    \end{bmatrix}                                         (6)

where 1, 2, ..., n are pixel indices. This is a linear
system of equations of the general form

    A\vec{x} = \vec{b}                                    (7)

where A and \vec{b} are the measurements and \vec{x} is
the parameter being estimated. The least squares
technique assumes that noise is present only in \vec{b} (as
\Delta\vec{b}), and minimizes the cost function

    \epsilon_{LS} = \|\Delta\vec{b}\|_F^2 = \|A\vec{x} - \vec{b}\|_F^2          (8)

where \|\cdot\|_F is the Frobenius norm, defined for an
M \times N matrix C as

    \|C\|_F = \sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} |C_{ij}|^2}.                   (9)

The least squares solution thus obtained is

    \vec{x}_{LS} = (A^T A)^{-1} A^T \vec{b}.              (10)

While this approach provides a mechanism to account


for the presence of noise in image measurements,
it is unfortunately of limited applicability because it


assumes that the noise in (7) is present only in the
measurement of \vec{b} and not in the measurement of A.
In the case of optical flow computation, both A and \vec{b}
are noisy. Hence we have developed a more suitable
technique for estimating optical flow. This technique
is based on total least squares (TLS) [13]. It assumes
that in (7), measurements of both A and \vec{b} are noisy,
leading to

    (A + \Delta A)\vec{x} = \vec{b} + \Delta\vec{b}       (11)

where \Delta A and \Delta\vec{b} are the noise terms. The cost
function minimized is

    \epsilon_{TLS} = \|[\Delta A;\ \Delta\vec{b}]\|_F^2.  (12)

Minimization of this function leads to the solution

    \vec{x}_{TLS} = (A^T A - \sigma^2 I)^{-1} A^T \vec{b} (13)

where \sigma^2 is the minimum eigenvalue of
[A; \vec{b}]^T [A; \vec{b}], [A; \vec{b}] is the matrix formed by appending
\vec{b} to A, and I is the 2 \times 2 identity matrix.
Please note that while our discussion of least
squares and TLS methods has focused on optical
flow computation based on the brightness constancy
constraint (2), it is equally applicable to optical
flow computation based on the gradient constancy
constraint (4), (5). In our experimental work we
have performed implementation and testing of
both brightness and gradient constancy based
approaches.
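
As an illustration of the difference between the two estimators, the
sketch below computes a single flow vector from the constraints of a
small neighborhood using both the least squares solution (10) and the
TLS solution (13). The derivative arrays are assumed to be given;
stacking the constraints with b = -I_t follows (2) and (6), and the
lack of a guard against an ill-conditioned A^T A - sigma^2 I is a
simplification of this sketch, not part of the paper.

    import numpy as np

    def flow_ls_tls(Ix, Iy, It):
        """One flow vector from the brightness constancy constraints (2) of all
        pixels in a neighborhood; Ix, Iy, It are 1D arrays of derivatives there.
        Returns the LS estimate (10) and the TLS estimate (13)."""
        A = np.column_stack([Ix, Iy])            # measurement matrix of (6)
        b = -np.asarray(It, dtype=np.float64)    # right-hand side of (6)
        x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
        # sigma^2 is the minimum eigenvalue of [A; b]^T [A; b].
        Ab = np.column_stack([A, b])
        sigma2 = np.linalg.eigvalsh(Ab.T @ Ab)[0]
        x_tls = np.linalg.solve(A.T @ A - sigma2 * np.eye(2), A.T @ b)
        return x_ls, x_tls
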

Fig. 1. Image pyramid.

A. Moving Sensor Scenario

As was mentioned in Section I, there are several
problems unique to the moving sensor scenario.
Sensor-induced apparent background motion is
often significantly greater than the target motion
in the image, and image analysis has to be precise
enough to detect the small differential in the apparent
motions of the target and background in the image
plane. Also, the camera-induced scene motion can
be nonuniform, due to sensor maneuvering (such as
roll) and perspective projection. We have developed
a technique for the moving sensor scenario that
takes these issues into account. In this technique, the
multiresolution approach of [1] (please see Section
I for details) is enhanced to estimate optical flow
at various levels of resolution and the optical flow
estimates from each resolution level are combined with
previous estimates to incrementally build the complete
flow field. A flow field segmentation technique is
then used to isolate the moving target regions. The
following section describes the multiresolution optical
flow computation technique in detail. The flow field
segmentation technique is described in Section III.

The first step is the creation of multiresolution
imagery from the original imagery. An image pyramid
generation scheme [1] is used for this. Given an image
of size n × m pixels, it is successively reduced in size to
create smaller images which constitute components of
the image pyramid. Any of several techniques may be
used to obtain the pyramid components; the specific
technique used in our implementation is discussed in
the Appendix. The reduction factor that may be used
in creating the pyramid is variable in general, but has
been assigned the fixed value 2 in our implementation
for simplicity. Thus if the original image size is n × m
pixels, successive layers in the pyramid will be of
size n/2 × m/2, n/4 × m/4, ... pixels. Fig. 1 shows an
example of such an image pyramid.

Fig. 2. Multiresolution optical flow computation.

Fig. 2 provides an overview of motion estimation
using the multiresolution imagery. Pixel level motion
estimation or optical flow computation begins at the
top of the pyramid, corresponding to the smallest
image size (designated as pyramid layer 0). Any
technique that provides a dense optical flow field
(optical flow vector at every pixel in the image)
may be used. The specific technique used in our
implementation is described in the Appendix. Let the
optical flow field computed at the top of the pyramid
be designated O00.

Having computed the optical flow field at the top
of the pyramid, computation proceeds to the next
layer of the pyramid (designated as layer 1). O00 is
expanded to twice its size and each component of
O00 is multiplied by 2, leading to a new flow field O10.
The expansion technique is typically the inverse of
the process used for pyramid generation. The specific
technique used in our implementation is discussed
in the Appendix. Multiplication of components by a
factor of 2 is necessary to account for the increased
pixel resolution with respect to the previous pyramid
layer.
O10 is used to warp the second image at layer
1 of the pyramid towards the first image. Warping
is performed on a pixel-by-pixel basis. The specific
technique used in our implementation utilizes image
interpolation to achieve subpixel accuracy. Details
of the technique are provided in the Appendix.
Residual optical flow is then computed between the
first image of layer 1 and the warped image. Let this
vector field be called O01 . The sum of O10 and O01
provides a complete estimate of optical flow at layer
1. Computation then moves to the next lower layer of
the pyramid. The optical flow field from the previous
layer is expanded to twice its size, each component
multiplied by 2, and the above steps are repeated.
This computation is continued until the bottom of the
pyramid, corresponding to the original image size, is
reached. The optical flow field available at this point
has been incrementally computed, with contributions
from each level of the pyramid. Let this flow field
be represented in terms of its component arrays as
(U, V). This flow field may then be subjected to further
processing, such as segmentation, to isolate regions
corresponding to moving targets. A segmentation
technique is described in Section III.
III. USING MOTION ESTIMATES FOR MOVING
TARGET DETECTION
Having computed the motion estimates by one
of the gradient-based methods discussed above, the
next step in moving target detection is the processing
of these motion estimates to generate a list of target
detections. Next we present a technique to realize this
functionality for imagery acquired from a stationary
sensor (sensor may have some drift). A technique for
the case of moving sensors is presented in Section
IIIB.

A. Stationary Sensor Scenario


Given the optical flow field consisting of estimates
(u, v) at every pixel location, a measure of motion
energy in a pixel neighborhood is estimated by
considering a circular region centered at that pixel and
computing the sum of contributions from individual
optical flow components in the region. For a given
pixel location, the average motion energy computed
from the surrounding circular region is
    E = \frac{1}{N} \sum_{i=1}^{N} (u_i^2 + v_i^2)        (14)

where the index i specifies individual pixel location in


the region, and N is the total number of pixels in the
circular region of summation. A circular summation
region is used to account for the fact that the direction
of target motion is not known a priori. The target may
have purely lateral motion, it may be moving towards
or away from the sensor, or it may skew or swerve with
respect to the sensor. The nondirectional nature of the
circular region makes it a robust choice to account for
this possible variability in target motion.
The value E computed above is assigned as the
score for the region. Regions are ranked according to
their motion energy scores and a cascade algorithm
developed in earlier work [8] is used for region
merging and splitting before the final list of potential
moving target regions is generated. The final output
is a ranked target list. Its content includes candidate
moving target regions with associated statistics
such as region centroids, size estimates, confidence
measures in relative motion scale, etc. Any useful
auxiliary information in the motion analysis can be
tabulated here for on-line study. The default number
of candidate regions on the list is set at ten which
meets the requirements specified in many real
applications.
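
A simplified sketch of this scoring step is shown below. It uses a
square averaging window rather than a circular one, and replaces the
cascade merging and splitting of [8], which is not detailed here,
with simple non-maximum suppression; the window radius and the list
length are illustrative parameters.

    import numpy as np
    from scipy.ndimage import uniform_filter, maximum_filter

    def motion_energy_candidates(u, v, radius=5, n_candidates=10):
        """Average motion energy (14) over a local window at every pixel, then
        return the strongest locally-maximal locations as candidate targets."""
        energy = uniform_filter(u**2 + v**2, size=2 * radius + 1)
        # Keep only local maxima so one moving region yields one candidate.
        peaks = (energy == maximum_filter(energy, size=2 * radius + 1)) & (energy > 0)
        ys, xs = np.nonzero(peaks)
        order = np.argsort(energy[ys, xs])[::-1][:n_candidates]
        return [(int(ys[i]), int(xs[i]), float(energy[ys[i], xs[i]])) for i in order]
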


B. Moving Sensor Scenario


The technique described above for motion analysis
and target list generation is primarily suitable for
stationary sensors, or sensors that may have some
drift. It is not suitable for moving sensors as it relies
on detecting motion energy, and will pick up the
sensor-induced apparent background motion. We
have developed a separate technique for the case
of moving sensors. This new technique is general
enough that it can handle the case of stationary
sensors also, but it is more computationally complex
than the technique described above for stationary
sensors. Hence, from a computational standpoint, the
above technique is attractive when it is known that the
sensor is stationary. The new technique is based on
the detection of discontinuities in optical flow fields,
arising due to the presence of moving targets. The
technique consists of the following stages.
1) Discontinuity Detection: Given the optical
flow field (U, V), we first compute spatial derivatives
of its component arrays, (Ux , Uy , Vx , Vy ), where (x, y)
are spatial image coordinates. Any of various finite
differencing methods may be used for the spatial
derivative computation. Our implementation computes
these spatial derivatives by convolving the (U, V) arrays
with spatial derivatives of 2D Gaussians.
2) Initialization: Blob detection is performed on
the spatial derivative arrays (Ux , Uy , Vx , Vy ). A blob array
is initialized and for a given pixel location, if any of the
spatial derivative arrays has a value above a threshold,
then the blob array is assigned a value of 1 for that
pixel location, otherwise the blob array is assigned a
value of 0 for that pixel location. After the blob array
has been thus assigned values for all pixel locations,
the next step of labeling is invoked.
3) Labeling: Given the blob array from the blob
detection step, blob regions are formed by checking
the spatial connectivity of each pixel marked 1 in
the blob array to other pixels marked 1. If a pixel is
marked 1 and so is a neighboring pixel, then they are
both assigned the same label. If a pixel has a value
of 1 in the blob array but is spatially distinct from
the previous pixels with value 1, then the blob label
counter is incremented and it is assigned a new label.
At the end of this step, every pixel of the blob array
that was assigned a value of 1 in the previous step will
have a label associated with it. Spatially contiguous
pixels that had values of 1 in the previous step will
have the same label, while spatially distinct sets of
such pixels will have different labels. The label map
thus formed is then passed to the next step of
merging.
4) Merging: Blob size and spatial proximity
constraints are used to combine spatially proximate
blobs. This step is used to eliminate regions of small,
spatially fragmented blobs. The blobs thus created are
subjected to the next step of abstraction.

Fig. 3. Stationary sensor.

5) Abstraction: The minimum and maximum


spatial coordinates for a blob of each label specify
the bounding box coordinates of that blob. These
coordinates can be used to compute the centroid of
the target detection corresponding to that blob. A
target list is created consisting of each such target
detection centroid, bounding box dimensions, and a
confidence measure for the detection. The confidence
measure is the average value of the strength of the
spatial derivatives of the optical flow field within the
bounding box.
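
The following sketch strings the five stages together for
illustration. The derivative-of-Gaussian filtering, thresholding, and
connected-component labeling follow the description above, while the
merging stage is approximated by simply discarding small blobs;
sigma, the threshold, and the minimum blob size are assumed
parameters.

    import numpy as np
    from scipy import ndimage

    def detect_moving_targets(U, V, sigma=2.0, threshold=0.5, min_size=20):
        """Flow-field discontinuity analysis: derivatives of (U, V), blob array,
        labeling, small-blob removal, and abstraction into a target list."""
        # 1) Discontinuity detection: spatial derivatives of the flow components.
        derivs = [ndimage.gaussian_filter(F, sigma, order=o)
                  for F in (U, V) for o in ((0, 1), (1, 0))]   # Ux, Uy, Vx, Vy
        strength = np.sqrt(sum(d**2 for d in derivs))
        # 2) Initialization: mark pixels where any derivative exceeds the threshold.
        blob = np.any([np.abs(d) > threshold for d in derivs], axis=0)
        # 3) Labeling: group spatially connected blob pixels.
        labels, _ = ndimage.label(blob)
        targets = []
        for idx, sl in enumerate(ndimage.find_objects(labels), start=1):
            region = labels[sl] == idx
            if region.sum() < min_size:                  # 4) drop small fragments
                continue
            y0, x0, y1, x1 = sl[0].start, sl[1].start, sl[0].stop, sl[1].stop
            # 5) Abstraction: centroid, bounding box, mean derivative strength.
            targets.append({"centroid": ((y0 + y1) / 2.0, (x0 + x1) / 2.0),
                            "bbox": (y0, x0, y1, x1),
                            "confidence": float(strength[sl][region].mean())})
        return sorted(targets, key=lambda t: t["confidence"], reverse=True)
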
IV. EVALUATION RESULTS
In this section we present results of experimental
testing of the techniques described in the above
sections. Real data (visible and infrared) representative
of application scenarios was used to perform the
testing.
1) Stationary Sensor Scenarios: Here we describe
tests for the case of moving target detection from a
stationary sensor. The results of our first experiment
are shown in Fig. 3. Infrared imagery (8-12 μm)
was used in this experiment. Two moving targets are
present in the field of view. Target 2 begins to enter
the field of view of the sensor in the second image.
Although only a portion of target 2 is visible in the
second image, it is detected by the algorithm.
The results of our second experiment are presented
in Fig. 4. The imagery in this experiment is also
infrared (8-12 μm). This experiment tests the
long-range target detection capability of our technique.
A stationary as well as a moving target are present in
the field of view. The technique correctly detects the
moving target in all frames except the 12th where it
is completely obscured behind the stationary target.
The moving target reappears in the 13th frame and the
technique locks on to it immediately.
Results of 10 other experiments are presented in
Table I. Each experiment used a sequence of multiple

images, ranging from 24 to 49. The first column in
Table I, labeled Sequence, refers to the name of
each individual image sequence. The next two columns,
respectively, provide the number of images in each
sequence, and the total number of moving targets in
the sequence. Typically there are one or two moving
targets per frame. The last column of Table I displays
the percentage of correct target detections achieved
by our technique for each sequence. The sequence
referred to as tvfr03 is a sequence of visible radiation
images while all the other sequences are composed of
infrared radiation (8-12 μm) images.

Fig. 4. Stationary sensor.

TABLE I
Stationary Sensor Moving Target Detection

Sequence    Images    Targets    % Detection
irfr01a       49        98           55
irfr01b       29        29          100
irfr03        49        49          100
tvfr03        49        49          100
scen04        24        24           96
scen08        24        24           33
scen09        24        24          100
scen11        24        24          100
scen13        24        24          100
scen15        24        24          100

2) Moving Sensor Scenarios: The result of our
first experiment in the case of moving sensors is
shown in Fig. 5. The imagery is infrared (8-12 μm)
and was acquired from a downward-looking sensor
mounted on an aircraft. The aircraft visible in the
image was flying directly below the sensor and at
approximately the same speed. In the imagery, the
apparent aircraft displacement is subpixel while the
apparent background displacement is approximately 10
pixels between frames. In Fig. 5, the result of target
detection is superimposed over the original image, in
the form of a bounding box enclosing the target.

Fig. 5. Moving sensor.

Fig. 6 shows the result of our second experiment.
This visible radiation imagery was obtained from the
image database of the IEEE Workshop on Visual
Motion 1991. This imagery represents a situation
where the contrast between the background and
target is low, and so was used to test the contrast
sensitivity of our technique. The helicopter has an
interframe displacement of 1-2 pixels, while the
apparent background motion varies from 4-5 pixels at
the top of the image to 9-10 pixels at the bottom, due
to perspective projection. In Fig. 6, the result of target
detection is superimposed over the original image, in
the form of a bounding box enclosing the target.

Fig. 6. Moving sensor.

APPENDIX

Here we provide details of the multiresolution
optical flow technique discussed in Section IIA. The
components of this technique are as follows.
1) Pyramid Generation: The original image is
the starting point for pyramid generation. Starting
at the bottom of the pyramid containing the given
image data (designated as layer p), each value in
the next pyramid layer (designated as layer p-1) is
computed as a weighted average of pixel values in
layer p within a 5 × 5 window. Each value in layer
p-2 is then computed from values in layer p-1 by
applying the same pattern of weights. The size of the

weighting function is not critical and a 5 × 5 pattern
is used because it provides adequate filtering at low
computational cost. The weight values are selected to
provide an approximation to Gaussian filtering. The
filtering operation can be represented as

    I_{k-1}(i, j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n) I_k(2i + m, 2j + n)       (15)

where I_k(i, j) is the image intensity at pixel location
(i, j) in layer k of the pyramid, and w(m, n) is the
weighting function.
2) Optical Flow Field Computation: Any optical
flow computation technique that provides flow
estimates at pixel level resolution may be used.
Examples are several techniques based on the
brightness and gradient constancy assumptions [2-7, 9],
and techniques based on correlation or Fourier
transforms [1]. The technique used in
our implementation is based on the gradient constancy
assumption and computes the optical flow estimate
(u, v) at every pixel by solving the following equations:

    I_{xx} u + I_{xy} v + I_{xt} = 0                      (16)

    I_{xy} u + I_{yy} v + I_{yt} = 0                      (17)

where the terms I_{xx}, ..., I_{yt} represent spatio-temporal
derivatives of image intensity.
3) Expansion: Expansion is the inverse of the
filtering operation used to generate the image pyramid
described above. Expansion from layer k-1
to layer k of the pyramid is achieved by

    I_k(i, j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m, n) I_{k-1}\left(\frac{i - m}{2}, \frac{j - n}{2}\right)       (18)

where I_k(i, j) is the image intensity at pixel location
(i, j) in layer k of the pyramid, and w(m, n) is the
weighting function. Note that this weighting function
is the same as that used for pyramid generation. Only
terms for which (i - m)/2 and (j - n)/2 are integers
are used in the above sum.
4) Warping: Given an image pair I_1 and I_2 and
the optical flow field O between them, the purpose
of image warping is to warp I_2 back in the direction
of I_1 on a pixel-by-pixel basis using the flow field
components of O. This is achieved by creating a new
image I_2':

    I_2'(x, y) = I_2(x + u \Delta t, y + v \Delta t)      (19)

where (u, v) represents the optical flow vector at
location (x, y), and \Delta t is the time interval between
image frames I_1 and I_2. Note that as the vector
components (u, v) are typically real valued, the
quantities x + u \Delta t and y + v \Delta t may not correspond
to integer pixel locations. In such cases, bilinear
interpolation is used to compute the image intensity
values.
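
A sketch of the pyramid reduction (15) and the bilinear warping (19)
is given below. The 5-tap weights approximating a Gaussian are a
common choice assumed here, since the paper does not list its exact
values; border handling by edge replication is likewise an
assumption of this sketch.

    import numpy as np

    # Assumed 5-tap weights approximating a Gaussian; the paper's exact values
    # are not listed.
    W1 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    W = np.outer(W1, W1)          # separable 5 x 5 weighting function w(m, n)

    def reduce_layer(I):
        """One pyramid reduction step per (15): 5 x 5 weighted average followed
        by factor-2 subsampling, with edge-replicated borders."""
        I = I.astype(np.float32)
        Ip = np.pad(I, 2, mode="edge")
        h, w = I.shape[0] // 2, I.shape[1] // 2
        out = np.zeros((h, w), dtype=np.float32)
        for m in range(5):
            for n in range(5):
                out += W[m, n] * Ip[m : m + 2 * h : 2, n : n + 2 * w : 2]
        return out

    def warp(I2, u, v, dt=1.0):
        """Warp per (19) with bilinear interpolation: sample I2 at
        (x + u*dt, y + v*dt)."""
        h, w = I2.shape
        yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        xs = np.clip(xx + u * dt, 0, w - 1.001)
        ys = np.clip(yy + v * dt, 0, h - 1.001)
        x0, y0 = xs.astype(int), ys.astype(int)
        fx, fy = xs - x0, ys - y0
        return ((1 - fx) * (1 - fy) * I2[y0, x0] + fx * (1 - fy) * I2[y0, x0 + 1] +
                (1 - fx) * fy * I2[y0 + 1, x0] + fx * fy * I2[y0 + 1, x0 + 1])
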

[1]  Burt, P. J., et al. (1989)
     Object tracking with a moving camera.
     In Proceedings of the Workshop on Visual Motion, IEEE, 1989.
[2]  Cornelius, N., and Kanade, T. (1986)
     Adapting optical-flow to measure object motion in reflectance and
     X-ray image sequences.
     In N. I. Badler and J. K. Tsotsos (Eds.), Motion: Representation
     and Perception. Amsterdam: North-Holland Press, 1986.
[3]  Gennert, M. A., and Negahdaripour, S. (1987)
     Relaxing the brightness constancy assumption in computing optical flow.
     Memo 975, MIT AI Lab, 1987.
[4]  Girosi, F., Verri, A., and Torre, V. (1989)
     Constraints for the computation of optical flow.
     In Proceedings of the Workshop on Visual Motion, 1989.
[5]  Horn, B. K. P., and Schunck, B. G. (1981)
     Determining optical flow.
     In J. M. Brady (Ed.), Computer Vision.
     Amsterdam: North-Holland Publishing, 1981.
[6]  Kearney, J. K. (1983)
     Gradient-based estimation of optical flow.
     Ph.D. dissertation, Dept. Computer Science, University of Minnesota,
     Minneapolis, 1983.
[7]  Koch, C., et al. (1989)
     Computing optical flow in resistive networks and in the primate
     visual system.
     In Proceedings of the Workshop on Visual Motion, 1989.
[8]  Merickel, M. B., Lundgren, J. C., and Shen, S. S. (1984)
     A spatial processing algorithm to reduce the effects of mixed pixels
     and increase the separability between classes.
     Pattern Recognition, 17, 5 (1984), 525-533.
[9]  Mitiche, A. (1984)
     Computation of optical flow and rigid motions.
     In Proceedings of the Workshop on Computer Vision: Representation
     and Control, 1984.
[10] Russo, P., Markandey, V., Bui, T. H., and Shrode, D. (1990)
     Optical flow techniques for moving target detection.
     In Proceedings of Sensor Fusion III: 3-D Perception and Recognition,
     SPIE, 1990.
[11] Schunck, B. G. (1988)
     Image flow: Fundamentals and algorithms.
     In W. N. Martin and J. K. Aggarwal (Eds.), Motion Understanding:
     Robot and Human Vision. Boston: Kluwer Academic Publishers, 1988.
[12] Tretiak, O., and Pastor, L. (1982)
     Velocity estimation from image sequences with second order
     differential operators.
     In Proceedings of the International Conference on Pattern
     Recognition, 1982.
[13] Van Huffel, S., and Vandewalle, J. (1985)
     The use of total linear least squares techniques for identification
     and parameter estimation.
     1985, 1167-1172.
[14] Wohn, K., Davis, L. S., and Thrift, P. (1983)
     Motion estimation based on multiple local constraints and nonlinear
     smoothings.
     Pattern Recognition, 16, 6 (1983).


Vishal Markandey (S'84-M'90) received a Bachelor's degree in electronics and
communications engineering from Osmania University, India, in 1985, and a
Master's degree in electrical engineering from Rice University, Houston, TX, in
1988.
He is a Senior Member of the Technical Staff and Team Leader of the
Advanced Video Systems Team in Texas Instruments Digital Imaging Corporate
Venture Project. He is responsible for the development of algorithms and
architectures for video products based on the Digital Micromirror Device.
Mr. Markandey is a member of the Society of Motion Picture and Television
Engineers.

Anthony Reid (M'81) received his B.S.E.E. degree (cum laude) from Rensselaer
Polytechnic Institute, Troy, NY, in 1970, the M.S.E.E. degree from Stanford
University, Stanford, CA, in 1971, and the Ph.D. degree from Southern Methodist
University, Dallas, TX, in 1994.
His engineering career started at Sandia Laboratories, Livermore, CA as
an MTS doing survivability analysis of U.S. strategic nuclear defense systems.
Later, he was an MTS at AT&T Bell Labs, Indian Hill, IL doing system design,
development and analysis of digital communication synchronization receivers,
and fault-tolerant time division networks for long distance call switching. Prior
to joining Texas Instruments (TI), he was a Senior Research Scientist with
R. R. Donnelly and Sons, Chicago, IL. He joined TI in 1984 where he is now a
Senior MTS and Branch Manager of Advanced Signal Processing in the Systems
Technology Center of the Advanced Technology Entity in Systems Group.

Shenq Wang (M'79) received the B.S. degree from National Chiao Tung
University, Hsinchu, Taiwan, in 1970, the M.S. degree from University of
Connecticut, Storrs, in 1974, and the Ph.D. degree from State University of New
York, Buffalo, in 1979, all in electrical engineering.
From 1979 until 1982, he was an Imaging Scientist at Picker International, Inc.,
Cleveland, OH. In February 1982, he joined Bell Labs, Indianapolis, IN as an
MTS at the Advanced Communications Laboratory. Since May 1984, he has been
with Texas Instruments, Inc. and is currently a member of the Group Technical
Staff at TI Systems Group. His main interests are in the signal processing area,
in particular, DSP applications, image sequence analysis, target tracking, and
multiresolution analysis.
