Fast Ego-motion Estimation with Multi-rate Fusion of Inertial and Vision¹

Leopoldo Armesto
Josep Tornero
Dept. of Control Systems Engineering, Technical University of Valencia, Camino de Vera, s/n 46022, Valencia, Spain
leoaran@isa.upv.es, jtornero@isa.upv.es

Markus Vincze
Automation and Control Institute, Vienna University of Technology, Gusshausstr. 27-29/361, A-1040 Vienna, Austria
vincze@acin.tuwien.ac.at
Abstract

This paper presents a tracking system for ego-motion estimation which fuses vision and inertial measurements using EKF and UKF (Extended and Unscented Kalman Filters), where a comparison of their performance has been done. It also considers the multi-rate nature of the sensors: inertial sensing is sampled at a fast sampling frequency while the sampling frequency of vision is lower. The proposed approach uses a constant linear acceleration model and a constant angular velocity model based on quaternions, which yields a non-linear model for states and a linear model in measurement equations. Results show that a significant improvement is obtained on the estimation when fusing both measurements with respect to just vision or just inertial measurements. It is also shown that the proposed system can estimate fast motions even when the vision system fails. Moreover, a study of the influence of the noise covariance is also performed, which aims to select their appropriate values at the tuning process. The set-up is an end-effector mounted camera, which allows us to pre-define basic rotational and translational motions for validating results.

KEY WORDS—vision and inertial, multi-rate systems, sensor fusion, tracking

1. Introduction

One of the most common techniques for state estimation of non-linear discrete-time dynamic systems is the Extended Kalman filter (EKF) (Dissanayake et al. 2001) or, recently, the Unscented Kalman filter (UKF) (Julier and Uhlmann 2002). The Kalman filter gives a robust, optimal, recursive state estimation to fuse redundant information from different sensors. However, both approaches assume that the probability distribution function (pdf) of the noise is Gaussian, which is not true for non-linear systems. Other recent filtering methods are Particle Filters (PF) (Gordon et al. 1993; Doucet et al. 2001), where the main advantage is that the pdf can be accurately approximated with a large number of particles at a cost O(N). The most common approach to PF is the Sampling Importance Resampling (SIR) (Smith and Gelfand 1992; Carpenter et al. 1997). Another well-known approach is the Rao–Blackwellized PF (Doucet et al. 2000), which uses a PF for some variables of the state and an EKF filter for other variables.

In mobile robots, inputs and outputs have different sampling rates. Proprioceptive sensors, such as encoders, gyros and accelerometers, might be considered as inputs or outputs, depending on the considered approach, while exteroceptive sensors, such as laser and sonar rangers, vision systems, etc., are usually considered as outputs. Proprioceptive sensors are typically sampled several times faster than exteroceptive sensors, see for instance (Armesto and Tornero 2004; Huster and Rock 2003; Armesto et al. 2004). This is a problem that arises from inherent technological limitations of each type of sensor, communication channels, processing cost, etc. A typical solution to overcome this problem is to increase the overall sampling period to the slowest one. However, it is well known that this

The International Journal of Robotics Research, Vol. 26, No. 6, June 2007, pp. 577–589. DOI: 10.1177/0278364907079283. © 2007 SAGE Publications. Figures 1, 4, 6–9, 11, 12 appear in color online: http://ijr.sagepub.com

1. This work has been supported by the Spanish Government (Ministerio de Ciencia y Tecnología), research projects PDI2000-0362-P4-05 and BIA2005-09377-C03-02, and the Austrian Science Foundation (FWF) under grant P15748.
2. The paper was received on 10/04/2006, revised on 23/01/2007 and accepted on 03/03/2007.


Fig. 1. Applications to the tracking system for ego-motion estimation.

approach may decrease the overall system performance since high frequency dynamics are missed due to the temporal discretization as a consequence of the Nyquist sampling constraint.

Multi-rate systems have been extensively treated in the last four decades and it is possible to find many contributions dealing with modelling (Albertos 1990; Khargonekar et al. 1985; Kranc 1957; Tornero 1985) and analysis (Goodwin and Feuer 1992) as well as control design (Tornero et al. 1999) of periodic sampled-data systems. One of the most relevant modelling techniques is the Lifting Technique (Khargonekar et al. 1985), where an isomorphism between a linear periodic system and an enlarged linear time-invariant (LTI) system is defined via the lifting operator. Another interesting point of view for modelling multi-rate systems is the one provided by Tornero (1985) and Longhi (1994), where two periodic matrices relate inputs and outputs according to the multi-rate sampling pattern. The main advantage of this approach with respect to the lifting technique is that it is not restricted to linear systems and it is implemented at the fastest sampling rate, and therefore it is much more appropriate for "real-time" systems.

This paper investigates a new, generic approach to multi-rate tracking combining vision and inertial sensor data. The fusion of vision and inertial measurements provides complementary characteristics: visual sensing is very accurate at low velocities while inertial sensors can track fast motions but suffer from drift, particularly at low velocities. The set-up for this application is an end-effector mounted camera together with an inertial sensor based on accelerometers and gyroscopes. One of the motivations of this tracking system is to estimate arbitrary motions of mobile robots or people. The application is to use a monocular-camera vision system and an inertial measurement unit (IMU) mounted on a pan–tilt unit (PTU), as shown in Figure 1(a). The system diagram of the tracking system is shown in Figure 1(b). In addition, this tracking system will accurately estimate the pose of the end-effector of a manipulator, used for restoration tasks of old-building facades. This robot will be mounted on a mobile base and a lifting structure as shown in Figure 1(c). The purpose of this mechanical system is to extend the work area of the manipulator, although it has the disadvantage of suffering from perturbations and/or oscillations.

This paper presents a precise model description for pose estimation where jerks and angular accelerations are treated as noise, including the effects of centripetal accelerations. With respect to our previous work (Armesto et al. 2004), this paper is focused on the study of estimation of different tracking velocities in rotational and translational movements.

As a main contribution, fusion of vision and inertial measurements is performed with a generic multi-rate EKF (MR-EKF) and multi-rate UKF (MR-UKF), in order to deal with data at different sampling rates. This approach improves the overall performance with respect to single-rate methods. Moreover, this fusion concept is valid for other sensors such as laser rangers and encoders in mobile robot localisation and map building (Armesto and Tornero 2004; Armesto et al. 2007).

In the paper, the influence of uncertainties associated to both types of sensors has been studied. Results show that the combination of vision and inertial data gives better estimation in a wide variety of situations (slow and fast motions).

Results have been obtained offline using Matlab, although real-time results have also been obtained on an implemented version in Labview. Data, Matlab code and videos can be found in Extension 2 in the Appendix.

1.1. Related Work

Fusion of inertial and vision data is needed in many applications (Rehbinder and Ghosh 2001; Alves et al. 2003; Huster and Rock 2003; Abuhadrous et al. 2003; Chroust and Vincze 2003; Panerai et al. 2000), especially for designing new "intelligent" cameras with integrated low-cost accelerometers and gyroscopes (MEMS). In Rehbinder and Ghosh (2001), inertial and vision data (with a delay), acting at different sampling rates, are used for the rotation estimation with an observer. For modelling and calibration of vision and inertial sensors see Alves et al. (2003), where an algorithm is presented to estimate the internal calibration of the inertial sensor and also to estimate the relative orientation between camera and inertial system. A UKF approach of data fusion with vision and inertial sensors is described in Huster and Rock (2003). In Lobo and Dias (1998), a pose estimation algorithm is proposed to integrate measurements from a stereo camera system and an inertial sensor for mobile robots. Due to the restricted motion of the robot (only in a plane) the estimation routine is simplified significantly as only two positions and one angle have to be estimated to determine the pose of the robot. Sandini and co-workers investigated a gaze stabilization algorithm in order to obtain "steady-state" images (Panerai et al. 2000) using vision and inertial data. Their results show that stabilized images improve the reactivity to changes of the environment.

In Jekeli (2001) and Grewal et al. (2001) a thorough revision of pose estimation is given based on Inertial/GPS fusion. These systems take into account several aspects regarding Earth geodetic properties such as Earth rotation (Coriolis effect), Earth elliptic shape, complex gravity models, etc. Inertial/GPS fusion has many similarities with Inertial/Vision fusion, although they might be applied in different environments. In the present paper, we neglect Earth properties such as those used in Jekeli (2001) and Grewal et al. (2001), since the scope of our application is limited to smaller areas where the Earth plain surface assumption holds.

Huster and Rock (2001, 2003) developed an algorithm to estimate the pose of autonomous underwater vehicles. The state description combines the position, velocity and acceleration together with a quaternion representation of the orientation. The state also includes an estimation of the acceleration bias. They analyzed an EKF, a two-step estimator and a UKF, which has been shown to be the most appropriate filter for these vehicles. Their conclusion was that better estimation results were obtained if the non-linearity is in the states only, with linear measurement equations. In Gurfill and Kasdin (2002) similar conclusions are given. Based on these conclusions, our approach uses a non-linear model for states and a linear model for the measurement equations.

In Rehbinder and Ghosh (2001), a nonlinear estimator for the fusion of inertial and vision systems based on geometric properties has been developed. An architecture to estimate the ego-motion and the structure of an environment using inertial and vision sensors is also presented in Chai et al. (2002). The dynamic description of the system combines an acceleration model in the linear motion and a velocity model in the orientation. The measurement model directly uses image coordinates, which results in a nonlinear measurement model besides the nonlinear dynamic model.

Pose estimation based on inertial sensors using quaternions is treated in detail in Goddard and Abidi (1998) where dynamic equations are derived and linearised. In Ude (1999), an approach is developed based on exponential mapping for predicting quaternions depending on angular velocities, in a mathematical way. Another way to predict quaternions is the use of a so-called quaternion velocity (Chou 1992), which has two disadvantages: first, it leads to a nonlinear measurement equation, which results in poorer estimation results (Gurfill and Kasdin 2002), and secondly it is only an approximation valid for low angular velocities. The approach of Ude (1999) has been extended in this paper, taking into account angular accelerations in addition to angular velocities.

Fig. 2. General multi-rate filtering scheme.

2. Multi-rate Filtering

In this paper, a multi-rate filtering structure is used based on two different sampling frequency interfaces: multi-rate holds (MR-Holds) and samplers (MR-Samplers), as schematized in Figure 2. These MR-Interfaces allow us to implement conventional Prediction and Update steps at the fastest sampling rate. Multi-rate holds are used to interface inputs of the system at the prediction step, while multi-rate samplers interface outputs at the update step. It is interesting to note that the Prediction and Update steps run at the fastest frequency, while each signal is sampled at its own sampling rate.

In the context of mobile robot pose estimation, signals from proprioceptive sensors such as encoders, accelerometers, gyroscopes, etc. might be considered as inputs (u_t) or outputs (z_t), while signals from exteroceptive sensors such as laser and sonar rangers, GPS or vision systems are always considered as outputs (z_t). If all measurements are considered as outputs, only MR-Samplers are required, where such considerations depend on associated dynamic models.

Table 1. Multi-rate HOH primitive functions

Type — Function

Interpolation (Lagrange): $\tilde{u}(t) = \sum_{l=0}^{n} \Bigg[ \prod_{\substack{q=0 \\ q \neq l}}^{n} \frac{t - t_{j-q}}{t_{j-l} - t_{j-q}} \Bigg]\, u(t_{j-l})$

Approximation (Bezier): $\tilde{u}(t) = \sum_{l=0}^{n} \frac{n!}{l!\,(n-l)!} \left( \frac{t - t_j}{t_j - t_{j-n}} \right)^{n-l} \left( 1 + \frac{t - t_j}{t_j - t_{j-n}} \right)^{l} u(t_{j-l})$

Approximation (Taylor): $\tilde{u}(t) = \sum_{l=0}^{n} \frac{(t - t_j)^l}{l!}\, u^{(l)}(t_j)$
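The Lagrange primitive in Table 1, for instance, amounts to ordinary polynomial extrapolation through the stored samples; a minimal Python sketch with hypothetical sample values:

```python
def lagrange_extrapolate(t, times, values):
    """Evaluate the Lagrange polynomial through (times[l], values[l])
    at time t; for t beyond the last sample this extrapolates."""
    total = 0.0
    for l, (t_l, u_l) in enumerate(zip(times, values)):
        basis = 1.0
        for q, t_q in enumerate(times):
            if q != l:
                basis *= (t - t_q) / (t_l - t_q)  # Lagrange basis polynomial
        total += basis * u_l
    return total

# A quadratic signal u(t) = t^2 sampled at t = 0, 1, 2, extrapolated at t = 3:
print(lagrange_extrapolate(3.0, [0.0, 1.0, 2.0], [0.0, 1.0, 4.0]))  # 9.0
```

With three samples the polynomial reproduces the quadratic exactly, which is why the extrapolated value matches u(3) = 9.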

2.1. Multi-rate Holds

The mathematical background of multi-rate holds and samplers is described in our previous contributions (Tornero et al. 1999; Tornero and Tomizuka 2000, 2002). A multi-rate hold is a hybrid device for generating, from a sequence of inputs sampled at a slow sampling rate, a continuous signal which is discretized at a high sampling rate. In these contributions, multi-rate zero, first and second order holds (MR-ZOH, MR-FOH and MR-SOH) were presented. Later on, in Armesto and Tornero (2003), a wide variety of holds were obtained based on primitive functions. The idea behind Armesto and Tornero (2003) is to generate an extrapolated continuous signal based on previous input samplings {u(t_j), u(t_{j−1}), ..., u(t_{j−n})}, uniformly distributed or not, where j represents each time instant an input is sampled:

$\tilde{u}(t) = \sum_{l=0}^{n} f_l\big(t,\, u(t_{j-l}),\, t_{j-l}\big)$   (1)

where u(t_j) denotes an input that has been sampled at time instant t_j, with t_{j−n} < ··· < t_j ≤ t. The primitive function f_l(·) generates the continuous signal ũ(t), which afterwards is discretized at any desired sampling rate T to provide ũ_k, with t = kT. In order to implement the multi-rate hold, a shift register Ω_j = {u_j, ..., u_{j−n}} is required to maintain the history of the signal. Algorithm 1 implements the (discrete-time) multi-rate hold based on a general primitive function, where a constant base period T is assumed. If an input arrives during sampling period T, lines 3 to 5 are executed; otherwise lines 7 to 10 perform an extrapolation process based on the registered history of the signal Ω_j, while j represents the time instant of the most recent sample added to Ω_j. Asynchronous holds, with variable sampling periods, can also be found in Armesto (2007).

Algorithm 1 MR-Hold
1  MR-Hold(u_k, Ω_j, k, j)
2    if u_k is sampled then
3      shift out u_{j−n} and shift in u_k into Ω_j;
4      ũ_k = u_k;
5      j = k;
6    else
7      ũ_k = 0;
8      for l = 0 to n do
9        retrieve u_{j−l} from Ω_j;
10       ũ_k = ũ_k + f_l(kT, u_{j−l}, (j − l)T);
11     end
12   end
13   return ũ_k, Ω_j and j;

Table 1 summarizes some primitive functions that can be used in multi-rate holds. In particular, for Taylor holds, derivatives are computed using the backward approximation:

$u^{(n)}(t_j) = \frac{u^{(n-1)}(t_j) - u^{(n-1)}(t_{j-1})}{t_j - t_{j-1}}$   (2)

where (n) denotes the nth derivative.

2.2. Multi-rate Samplers

A multi-rate sampler is used to interface outputs of a system sampled at different sampling rates. As a result, it generates a size-varying signal at a fast sampling rate that combines measurements at different samplings. For a given time instant, the multi-rate sampler appends the measurement y_{i,k} to the measurement vector y^s_k if and only if the sensor has performed a valid acquisition, thus y_{i,k} → y^s_k if y_{i,k} is available (or sampled), where i denotes the ith measurement.

Although this concept is not new, it is important in sensor fusion techniques to coherently integrate measurements at different sampling rates into the estimation.

Take Figure 3 as an example of periodic sampling with N = 6, where N is the periodicity ratio within the frame-period, the sampling period where signals are periodically repeated. According to this, the resulting size-varying vector y^s_k is:

$y^s_{jN} = [y_{1,jN},\, y_{2,jN}]^T$,  $y^s_{jN+1} = [\ ]$,
$y^s_{jN+2} = [y_{2,jN+2}]$,  $y^s_{jN+3} = [y_{1,jN+3}]$,
$y^s_{jN+4} = [y_{2,jN+4}]$,  $y^s_{jN+5} = [\ ]$.   (3)
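A first-order Taylor hold (Table 1, n = 1) extrapolates between slow input samples using the backward-difference derivative of eq. (2); a minimal Python sketch with hypothetical sample values:

```python
def taylor_hold(t, samples):
    """First-order Taylor extrapolation u~(t) = u(tj) + (t - tj) * u'(tj),
    where u'(tj) is the backward difference over the last two samples (eq. 2)."""
    (t_prev, u_prev), (t_j, u_j) = samples[-2], samples[-1]
    du = (u_j - u_prev) / (t_j - t_prev)  # eq. (2) with n = 1
    return u_j + (t - t_j) * du           # Taylor primitive, Table 1

# Input sampled every 80 ms, hold evaluated at a 10 ms instant (hypothetical data):
history = [(0.00, 1.0), (0.08, 1.8)]      # the shift register Omega_j
print(taylor_hold(0.09, history))         # ~1.9
```

This is exactly the extrapolation branch (lines 7 to 10) of Algorithm 1, specialized to a single Taylor primitive.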

2.3. Multi-rate Extended Kalman Filter

A general discrete-time stochastic non-linear system can be described as:

$x_k = f[x_{k-1}, u_{k-1}, w_{k-1}]$   (4)

$y_k = h[x_k] + v_k.$   (5)

Equations (4) and (5) describe the dynamic behaviour of the system, with f[·] the non-linear vector-valued function relating the system dynamics, and h[·] the non-linear output measurement equations. The state vector x_k, the system noise w_k and the measurement noise v_k are assumed to be variables with Gaussian distribution, with covariances P_k, Q_k and R_k respectively.

The multi-rate Extended Kalman filter can be implemented as described in Algorithm 2. First the multi-rate hold estimates the control input (line 2), if available. Second, the multi-rate sampler is implemented so that it generates a size-varying output vector y^s_k, output equations h^s and output covariance R^s_k (lines 3 to 10, where sub-index i represents the element within a vector/matrix). Lines 11 to 15 implement the conventional EKF prediction equations, while the EKF update (lines 17 to 19) is only produced if any output has been sampled (length of y^s_k greater than zero). Otherwise, the prediction is taken as the best guess for the next iteration.

Algorithm 2 Multi-rate Extended Kalman Filter
1  MR-EKF(x̂_{k−1|k−1}, P_{k−1|k−1}, Q_{k−1}, R_k, u_{k−1}, y_k, Ω_j, k, j)
2    [ũ_{k−1}, Ω_j, j] = MR-Hold(u_{k−1}, Ω_j, k−1, j);
     // Multi-rate output sampler
3    y^s_k = [ ]; h^s = [ ]; R^s_k = [ ];
4    for i = 1 to length(y_k) do
5      if sensor i is sampled then
6        append in rows y_{i,k} to y^s_k;
7        append in rows h_i[·] to h^s[·];
8        append in rows and columns R_{i,k} to R^s_k;
9      end
10   end
11   f_x = ∂f[·]/∂x_{k−1}, evaluated at x̂_{k−1|k−1};
12   f_w = ∂f[·]/∂w_{k−1}, evaluated at x̂_{k−1|k−1};
13   h^s_x = ∂h^s[·]/∂x_k, evaluated at x̂_{k|k−1};
14   x̂_{k|k−1} = f[x̂_{k−1|k−1}, ũ_{k−1}, 0];
15   P_{k|k−1} = f_x P_{k−1|k−1} f_x^T + f_w Q_{k−1} f_w^T;
16   if length(y^s_k) > 0 then
17     K_k = P_{k|k−1} (h^s_x)^T [h^s_x P_{k|k−1} (h^s_x)^T + R^s_k]^{−1};
18     x̂_{k|k} = x̂_{k|k−1} + K_k (y^s_k − h^s[x̂_{k|k−1}]);
19     P_{k|k} = P_{k|k−1} − K_k h^s_x P_{k|k−1};
20   else
21     x̂_{k|k} = x̂_{k|k−1};
22     P_{k|k} = P_{k|k−1};
23   end
24   return x̂_{k|k}, P_{k|k}, Ω_j and j;
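The gated predict/update structure of Algorithm 2 can be sketched as follows (Python; a linear measurement model as used in this paper is assumed, the Jacobian F is passed in precomputed for brevity, and the toy numbers are hypothetical):

```python
import numpy as np

def mr_ekf_step(x, P, f, F, Q, Hs=None, ys=None, Rs=None):
    """One MR-EKF iteration: always predict; update only if the
    multi-rate sampler delivered a non-empty measurement vector ys."""
    x = f(x)                               # line 14: state prediction
    P = F @ P @ F.T + Q                    # line 15: covariance prediction
    if ys is not None and len(ys) > 0:
        S = Hs @ P @ Hs.T + Rs             # innovation covariance
        K = P @ Hs.T @ np.linalg.inv(S)    # line 17: Kalman gain
        x = x + K @ (ys - Hs @ x)          # line 18: state update
        P = P - K @ Hs @ P                 # line 19: covariance update
    return x, P                            # lines 21-22 otherwise

# Scalar toy example: random-walk state, direct measurement (hypothetical values).
x, P = np.array([0.0]), np.eye(1)
x, P = mr_ekf_step(x, P, lambda s: s, np.eye(1), 0.01 * np.eye(1))  # no measurement
x, P = mr_ekf_step(x, P, lambda s: s, np.eye(1), 0.01 * np.eye(1),
                   Hs=np.eye(1), ys=np.array([1.0]), Rs=0.1 * np.eye(1))
print(x[0] > 0.0)  # True: the estimate moved toward the measurement
```

The first call exercises the prediction-only branch (covariance grows); the second stacks one available measurement and applies the correction.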

2.4. Multi-rate Unscented Kalman Filter

In a similar way, a Multi-rate Unscented Kalman filter can be implemented as described in Algorithm 3. First the multi-rate hold estimates the control input (line 2) and the multi-rate sampler is implemented as before (lines 3 to 10). The Unscented Transform is implemented in lines 11–22, the prediction equations of the Unscented Kalman Filter are implemented in lines 23 to 30 and the update equations are implemented in lines 32–43. Again, if no measurement has been received, the prediction is taken as the best guess for the next iteration (lines 45 and 46).

3. Motion model

The state of the tracking system is composed of position and orientation variables. Position is described with Cartesian positions p_k together with their velocities v_k and accelerations a_k. The orientation is represented with quaternions q_k and angular velocities ω_k. Previous results (Huster and Rock 2001; Chroust and Vincze 2003) have shown an improvement if the biases of the acceleration measurements b_k are included and estimated on-line. The output vector y_k is formed with the measured accelerations a^m_k and angular velocities ω^m_k from the inertial sensor and the Cartesian positions p^m_k and quaternions q^m_k from the vision system, which have been obtained after the image processing procedure. Jerks j_k, in the Cartesian coordinates, angular accelerations α_k and bias velocities ḃ_k are considered as the system noise. As usual, the output vector has an associated measurement noise vector ν_k. In this case, the system has no inputs, since command movements are assumed to be unknown.

$x_k = [p^T\ v^T\ a^T\ b^T\ q^T\ \omega^T]^T_k$   (6)

Algorithm 3 Multi-rate Unscented Kalman Filter
1  MR-UKF(x̂_{k−1|k−1}, P_{k−1|k−1}, Q_{k−1}, R_k, u_{k−1}, y_k, Ω_j, k, j)
2    [ũ_{k−1}, Ω_j, j] = MR-Hold(u_{k−1}, Ω_j, k−1, j);
     // Multi-rate output sampler
3    y^s_k = [ ]; h^s = [ ]; R^s_k = [ ];
4    for i = 1 to length(y_k) do
5      if sensor i is sampled then
6        append in rows y_{i,k} to y^s_k;
7        append in rows h_i[·] to h^s[·];
8        append in rows and columns R_{i,k} to R^s_k;
9      end
10   end
11   x^a = [x̂^T_{k−1|k−1} 0 0]^T;
12   P^a = diag(P_{k−1|k−1}, Q_{k−1}, R^s_k);
13   χ = sqrt((n + κ) P^a); χ^a_0 = x^a; W_0 = κ/(n + κ);
14   for i = 1 to n do
15     χ^a_i = x^a + χ_i; W_i = 1/(2(n + κ));
16     χ^a_{i+n} = x^a − χ_i; W_{i+n} = 1/(2(n + κ));
17   end
18   for i = 0 to 2n do
19     χ_{i,k−1|k−1} = χ^a_{i,1:n};
20     ξ_{i,k−1} = χ^a_{i,n+1:n+g};
21     ζ^s_{i,k} = χ^a_{i,n+g+1:n+g+r};
22   end
23   x̂_{k|k−1} = 0; P_{k|k−1} = 0;
24   for i = 0 to 2n do
25     χ_{i,k|k−1} = f[χ_{i,k−1|k−1}, ũ_{k−1}, ξ_{i,k−1}];
26     x̂_{k|k−1} = x̂_{k|k−1} + W_i χ_{i,k|k−1};
27   end
28   for i = 0 to 2n do
29     P_{k|k−1} = P_{k|k−1} + W_i (χ_{i,k|k−1} − x̂_{k|k−1})(χ_{i,k|k−1} − x̂_{k|k−1})^T;
30   end
31   if length(y^s_k) > 0 then
32     ŷ^s_{k|k−1} = 0; P^{yy}_{k|k−1} = 0; P^{xy}_{k|k−1} = 0;
33     for i = 0 to 2n do
34       γ^s_{i,k|k−1} = h^s[χ_{i,k|k−1}] + ζ^s_{i,k};
35       ŷ^s_{k|k−1} = ŷ^s_{k|k−1} + W_i γ^s_{i,k|k−1};
36     end
37     for i = 0 to 2n do
38       P^{yy}_{k|k−1} = P^{yy}_{k|k−1} + W_i (γ^s_{i,k|k−1} − ŷ^s_{k|k−1})(γ^s_{i,k|k−1} − ŷ^s_{k|k−1})^T;
39       P^{xy}_{k|k−1} = P^{xy}_{k|k−1} + W_i (χ_{i,k|k−1} − x̂_{k|k−1})(γ^s_{i,k|k−1} − ŷ^s_{k|k−1})^T;
40     end
41     K_k = P^{xy}_{k|k−1} (P^{yy}_{k|k−1})^{−1};
42     x̂_{k|k} = x̂_{k|k−1} + K_k (y^s_k − ŷ^s_{k|k−1});
43     P_{k|k} = P_{k|k−1} − K_k P^{yy}_{k|k−1} K_k^T;
44   else
45     x̂_{k|k} = x̂_{k|k−1};
46     P_{k|k} = P_{k|k−1};
47   end
48   return x̂_{k|k}, P_{k|k}, Ω_j and j;

$y_k = [(a^m)^T\ (\omega^m)^T\ (p^m)^T\ (q^m)^T]^T_k$   (7)

$w_k = [j^T\ \alpha^T\ \dot{b}^T]^T_k$   (8)

$\nu_k = [\nu^T_{a^m}\ \nu^T_{\omega^m}\ \nu^T_{p^m}\ \nu^T_{q^m}]^T_k.$   (9)

The acceleration of a system subject to a rotational and translational motion in an inertial frame is a = a_t + ω × v, where a_t is the tangential acceleration. The derivative of the acceleration is:

$\dot{a} = \frac{da}{dt} = \frac{da_t}{dt} + \frac{d\omega}{dt} \times v + \omega \times \frac{dv}{dt} = j + \alpha \times v + \omega \times a.$   (10)

Thus, the discrete-time dynamic equations for the acceleration, velocity and position are:

$p_{k+1} = p_k + T\, v_k + \frac{T^2}{2}\, a_k + \frac{T^3}{6}\, \dot{a}_k$   (11)

$v_{k+1} = v_k + T\, a_k + \frac{T^2}{2}\, \dot{a}_k$   (12)

$a_{k+1} = a_k + T\, \dot{a}_k = a_k + T\, (j_k + \alpha_k \times v_k + \omega_k \times a_k)$   (13)

where T is the sampling period.

Biases are supposed to be static, but unknown, thus their dynamic equations are assumed to be:

$b_{k+1} = b_k + T\, \dot{b}_k.$   (14)

For the representation of the orientation we use quaternions, see Chou (1992) for definition and calculation rules. The dynamic description of a quaternion is given by Ude (1999):

$q_{k+1} = q_{k+1|k} \circ q_k = \exp\!\Big(\frac{\Delta\theta_k}{2}\Big) \circ q_k$   (15)
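Eqs. (11)–(13) translate directly into code; a minimal sketch with hypothetical values (the jerk noise j is set to zero):

```python
import numpy as np

def predict_translation(p, v, a, w, alpha, j, T):
    """Discrete constant-linear-acceleration model, eqs. (11)-(13);
    a_dot = j + alpha x v + w x a includes the centripetal coupling of eq. (10)."""
    a_dot = j + np.cross(alpha, v) + np.cross(w, a)            # eq. (10)
    p_next = p + T * v + (T**2 / 2) * a + (T**3 / 6) * a_dot   # eq. (11)
    v_next = v + T * a + (T**2 / 2) * a_dot                    # eq. (12)
    a_next = a + T * a_dot                                     # eq. (13)
    return p_next, v_next, a_next

# Pure linear motion (no rotation), T = 10 ms as for the inertial sensor:
p, v, a = np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])
p1, v1, a1 = predict_translation(p, v, a, np.zeros(3), np.zeros(3), np.zeros(3), 0.01)
print(v1)  # -> approximately [1.0, 0.02, 0.0]
```

With ω = α = 0 the cross terms vanish and the model collapses to plain constant-acceleration kinematics, which is an easy sanity check on the implementation.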

where,

$\exp\!\Big(\frac{\Delta\theta_k}{2}\Big) = \begin{cases} \begin{bmatrix} \cos\big(\frac{\|\Delta\theta_k\|}{2}\big) \\ \frac{\Delta\theta_k}{\|\Delta\theta_k\|} \sin\big(\frac{\|\Delta\theta_k\|}{2}\big) \end{bmatrix}, & \|\Delta\theta_k\| \neq 0 \\ [1\ 0\ 0\ 0]^T, & \|\Delta\theta_k\| = 0 \end{cases}$   (16)

and ∘ is the quaternion multiplication (Chou 1992), Δθ_k is the increment of the rotation and q_k = [q_0 q_1 q_2 q_3]^T_k. The norm of both quaternions q_{k+1|k} and q_k is 1 and therefore, the norm of the predicted quaternion is also 1.

Now, consider that Δθ_k is:

$\Delta\theta_k \approx \omega_k\, T + \tfrac{1}{2}\, \alpha_k\, T^2$   (17)

which is valid for low sampling periods.

For the prediction of quaternions, if Δθ_k = 0 then q_{k+1} = q_k; otherwise it can be expressed as,

$q_{k+1} = \begin{bmatrix} \cos\big( \big\| \omega_k + \frac{T}{2}\alpha_k \big\| \frac{T}{2} \big) \\ \frac{\omega_k + \frac{T}{2}\alpha_k}{\big\| \omega_k + \frac{T}{2}\alpha_k \big\|} \sin\big( \big\| \omega_k + \frac{T}{2}\alpha_k \big\| \frac{T}{2} \big) \end{bmatrix} \circ q_k$   (18)

$q_{k+1} = q_k \cos\Big( \Big\| \omega_k + \frac{T}{2}\alpha_k \Big\| \frac{T}{2} \Big) + \frac{T}{2}\, \Xi_k \Big( \omega_k + \frac{T}{2}\alpha_k \Big) \frac{\sin\big( \big\| \omega_k + \frac{T}{2}\alpha_k \big\| \frac{T}{2} \big)}{\big\| \omega_k + \frac{T}{2}\alpha_k \big\| \frac{T}{2}}$   (19)

where,

$\Xi_k = \begin{bmatrix} -q_1 & -q_2 & -q_3 \\ q_0 & q_3 & -q_2 \\ -q_3 & q_0 & q_1 \\ q_2 & -q_1 & q_0 \end{bmatrix}_k.$   (20)

The prediction of the angular velocity ω_k is simply performed as

$\omega_{k+1} = \omega_k + T\, \alpha_k.$   (21)

Finally, the output equations (5) are linear:

$y_k = h[x_k] + \nu_k = H\, x_k + \nu_k$   (22)

$H = \begin{bmatrix} 0_{3\times3} & 0_{3\times3} & I_{3\times3} & I_{3\times3} & 0_{3\times4} & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times4} & I_{3\times3} \\ I_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times4} & 0_{3\times3} \\ 0_{4\times3} & 0_{4\times3} & 0_{4\times3} & 0_{4\times3} & I_{4\times4} & 0_{4\times3} \end{bmatrix}$   (23)

where a^m_k and ω^m_k in y_k have been properly rotated based on the estimated orientation to avoid non-linearities in the output equation. A further important issue is the influence of gravity on the measured accelerations. At present, the implemented solution rotates the gravity vector g = [0 0 −g]^T around the estimated orientation and subtracts this from the measured acceleration.

Fig. 4. Experiment set-up.

4. Experimental Results

The proposed tracking system is tested with a predefined set of rotational and translational movements at different speeds. In particular, rotational movements consist of a sequence of turns in roll φ, pitch θ and yaw ψ angles:

R_z(ψ), R_z(−2ψ) and R_z(ψ).
R_x(−φ), R_x(2φ) and R_x(−φ).
R_y(−θ), R_y(2θ) and R_y(−θ).

Similarly, translational movements consist of:

T_x(−x), T_x(2x) and T_x(−x).
T_y(y), T_y(−2y) and T_y(y).

Four different speeds have been tested (percentages of the maximum velocity of the KUKA KR15/2 robot): 10%, 30%, 50% and 75%. The system is composed of a firewire camera (Allied Dolphin F145C) with a sampling period of T_V = 80 ms and an inertial sensor (XSens-MT9B) with a sampling period of T_I = 10 ms, as shown in Figure 4(a). See Extension 3 in the Appendix.

Pose estimations are computed by detecting squared features on a grid image, as shown in Figure 4(b), using the IMAQ Toolkit of Labview and the Zhang (1998) method. This simplified approach allows us to compute the 3D pose independently from previous estimations, since positions of grid points are assumed to be known. In addition, this approach allows us to focus the research on analyzing the effects of the multi-rate
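The quaternion prediction of eqs. (16)–(18) and the angular-velocity prediction (21) can be sketched as follows (Python, scalar-first quaternions and the Hamilton product; the constant-yaw-rate test motion is hypothetical):

```python
import numpy as np

def quat_mul(r, q):
    """Hamilton product r * q, scalar-first convention [q0, q1, q2, q3]."""
    r0, rv = r[0], r[1:]
    q0, qv = q[0], q[1:]
    return np.concatenate(([r0 * q0 - rv @ qv],
                           r0 * qv + q0 * rv + np.cross(rv, qv)))

def predict_orientation(q, w, alpha, T):
    """Rotate q by dtheta = w*T + alpha*T^2/2 (eqs. 16-18), update w (eq. 21)."""
    dtheta = w * T + 0.5 * alpha * T**2          # eq. (17)
    n = np.linalg.norm(dtheta)
    if n == 0.0:
        dq = np.array([1.0, 0.0, 0.0, 0.0])      # eq. (16), dtheta = 0
    else:
        dq = np.concatenate(([np.cos(n / 2)], np.sin(n / 2) * dtheta / n))
    return quat_mul(dq, q), w + T * alpha        # eqs. (18) and (21)

# Constant yaw rate pi/2 rad/s integrated for 1 s at T = 10 ms:
q, w = np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 0.0, np.pi / 2])
for _ in range(100):
    q, w = predict_orientation(q, w, np.zeros(3), 0.01)
print(np.round(q, 3))  # ~[0.707, 0, 0, 0.707]: a 90-degree rotation about z
```

Because each incremental quaternion is unit-norm, the predicted quaternion stays unit-norm as well, matching the remark after eq. (16).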

Fig. 5. Previous experiment set-up.

fusion as well as the influence of the covariance Rk on the es-


timation. Other well-known approaches of Computer Vision
techniques have been used previously in our application based
on artificial landmarks and common office objects (Vincze et Fig. 6. Covariance influence over the performance index Jq
al. 20011 Bradski 20001 Gemeiner et al. 2006) as shown in Fig- under different rotational movements with EKF.
ures 5(a) and 5(b).
A performance index is used to quantify the estimation er-
ror, that is the mean quadratic error between estimated posi-
tions and orientations and movement references, known com-
pletely known for this set-up:

1 1
N
Jp 4 1pre f 6 p22 3
N i40

1 1
N
Jq 4 1qre f 6 q22 4 (24)
N i40

Figures 6 and 7 show the performance indexes for differ-


ent covariance values of Rq and R2 related to the rotational
movements using EKF and UKF estimation. It can be appre-
ciated that for low speed motions the estimation decreases (in-
dex increases) with assumed low uncertainty on 1 and high
uncertainty on q. Thus, it is shown that, at low speeds, the esti-
mation relies on the vision system, as expected. In contrast, for
high speed motions, estimation mainly relies on inertial data.
It is interesting to remark that the numerical results do not just
demonstrate that vision is crucial for low motion movements Fig. 7. Covariance influence over the performance index Jq
and inertial is very important for faster movements, but also under different rotational movements with UKF.
give guidelines regarding the tuning process of covariance val-
ues in terms of accuracy of the estimated motions. These re-
sults highlight the benefits of fusing both inertial and vision Similarly, Figures 8 and 9 show the performance index for
data, since the sensors are complementary, and good perfor- different covariance values of p and a, related to translational
mance can be obtained when an intermediate trade-off is con- movements. For low speed experiments (10% and 30%) there
sidered for both sensors. Outside of the calculated bounds of is a region with higher covariance values for vision and lower
covariance, the estimation becomes numerically unstable in for inertial where the estimation is very poor. This is because
most cases, and they are not reported. It can also be appre- the influence of vision is lower and consequently biases of ac-
ciated that the UKF presents some peaks in the performance celerations are not correctly estimated. Slow motions are af-
index, which do not appear in the EKF, which indicates that fected by biases of acceleration measurements, while for fast
UKF has worse numeric behavior than EKF. motions, they can be neglected. The main difference with re-
Armesto, Tornero, and Vincze / Fast Ego-motion Estimation with Multi-rate Fusion of Inertial and Vision 585

Fig. 10. Blurred images examples.

 !
 1063  I33 3 1064  I33 3 "
Rk 4 diag (25)
 1067  I 3 1066  I #
33 44

 !
 047447  I33 3 0438  I33 3 "
Qk 4 diag 4 (26)
 0419  1066  I #
Fig. 8. Covariance influence on the performance index J p un- 33
der different translational movements with EKF.
Figures 11(a), 11(b), 11(c) and 11(d) show the estimation
results for rotational movements using EKF and UKF. Both
filters give nearly the same results for selected covariance val-
ues, and therefore their responses overlap.
In Figures 11(a) to 11(d), vision measurements are repre-
sented with dots1 large gaps between dots are due to failures
of the detection process of the vision system, which are much
more frequent with faster motions. Figure 10 shows the cap-
tured blurred image from the vision system on a detection fail-
ure. It can be appreciated that the number of detected features
is too low to provide an accurate pose estimation and therefore
the image is rejected. Despite this fact, the fusion integration of
inertial measurements aims to re-construct the correct motion.
Similarly, results for translational movements are depicted
in Figures 12(a), 12(b), 12(c) and 12(d), where the same con-
clusions as before can be derived.
According to these results, for this particular application,
EKF gives better performance than UKF, since both filters pro-
vide nearly the same estimation, but the computational cost of
UKF is about 7 times higher.
The main difference with respect to rotational movements is that, in this case, vision is much more crucial, since accelerations require a double integration to compute the pose, while angular velocities require only a single integration to compute the orientation.

Fig. 9. Covariance influence on the performance index J_p under different translational movements with UKF.

Based on the results obtained, we have selected the following values for R_k and Q_k, common to all velocities:

Finally, a study of the benefits of fusing vision and inertial measurements is performed. In that sense, the performance index is also calculated for pure inertial and pure vision estimation. Figure 13 shows that the fusion of inertial and vision gives better performance results than either single estimation. Fusion introduces more benefits to pure inertial than to pure vision estimation. This is mainly due to the double integration performed on the inertial measurements, where bias correction could not be performed.

5. Conclusions

The tracking of fast movements is a difficult task, particularly if it is performed with a pure vision or a pure inertial system.
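The impact of this double integration can be illustrated with a toy calculation: an uncorrected accelerometer bias produces a position error growing quadratically in time, whereas a gyro bias of the same magnitude, integrated only once, yields a linearly growing orientation error. The bias values and sampling rate below are made up for illustration:

```python
import numpy as np

dt = 0.01                          # 100 Hz inertial sampling (hypothetical)
t = np.arange(0.0, 5.0, dt)
acc_bias = 0.05                    # m/s^2, uncorrected accelerometer bias
gyro_bias = 0.05                   # rad/s, uncorrected gyro bias

# Double integration of the accelerometer bias -> position error ~ 0.5*b*t^2.
vel_err = np.cumsum(acc_bias * np.ones_like(t)) * dt
pos_err = np.cumsum(vel_err) * dt

# Single integration of the gyro bias -> orientation error ~ b*t.
ang_err = np.cumsum(gyro_bias * np.ones_like(t)) * dt

# After 5 s: pos_err[-1] is roughly 0.63 m (quadratic growth), while
# ang_err[-1] is 0.25 rad (linear growth).
```

This is why, without external corrections, the translational states drift much faster than the orientation, and why fusing vision helps the pure inertial estimate the most.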
586 THE INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH / June 2007

Fig. 11. Estimation results for rotational movements (continuous lines) and vision measurements (dots).

This problem can be solved if both sensors are fused, since they provide complementary properties.

In this paper, we have presented a tracking system for ego-motion estimation which can recover from vision failures during fast motions. The fusion is performed with an EKF and a UKF under multi-rate sampling of the measurements. Both have shown to be robust enough for this application, and they have also been tested alongside other non-Gaussian filters. In both fusion techniques (EKF and UKF), each sensor is sampled at the highest frequency at which it can provide measurements. The approach considered in this paper uses multi-rate holds and samplers to interface signals at different frequencies. Estimations with the MR-EKF (multi-rate EKF) and the MR-UKF (multi-rate UKF) provided very similar results, without significant differences between them, while the computational cost of the UKF is about 7 times higher than that of the EKF.

This approach has not only been validated for Vision/Inertial fusion, but has also recently been validated for Laser/Encoder fusion for mobile robot self-localization and map building using an asynchronous multi-rate FastSLAM (Armesto et al. 2007).

In addition to this, the paper investigates the influence of the noise covariance matrices. Conclusions from this analysis lead to the determination of appropriate tuning values between vision and inertial measurements. This aspect is crucial since it has a direct influence on the estimation performance. It has been shown that a common set of covariance values exists that gives good performance over a range of motion speeds. A set-up based on an industrial robot arm has been used to validate the estimation. This set-up allows us to pre-define basic rotational and translational motions, which can be combined to generate complex motions.

Future research is oriented towards estimating even more complex motions such as natural human movements. In addition to this, other fusion techniques such as Particle Filters will be tested, as well as SLAM techniques to estimate both the pose and the structure (map). In the context of the RESTAURO Research Project, fusion of Vision/GPS/Inertial data will be combined for better estimation, improving required tasks such as collision detection and avoidance, defect detection and recovery, and feedback position control.

Armesto, Tornero, and Vincze / Fast Ego-motion Estimation with Multi-rate Fusion of Inertial and Vision 587

Fig. 12. Estimation results for translational movements (continuous lines) and vision measurements (dots).

Fig. 13. Mean quadratic error versus movement power.

Appendix

The multimedia extensions to this article can be found online by following the hyperlinks from www.ijrr.org.

Table 2. Multimedia Extensions of Fast Ego-Motion Estimation with Multi-rate Fusion of Inertial and Vision.

Extension    Media Type   Description
Extension 1  Data         Raw data of experiments.
Extension 2  Code         Matlab code.
Extension 3  Videos       Videos of pre-defined movements.

References

Abuhadrous, I., Nashashibi, F., and Laurgeau, C. (2003). 3D land vehicle localization: a real-time multi-sensor data fusion approach using RT MAPS. International Conference on Advanced Robotics, pp. 71–76.
Albertos, P. (1990). Block multirate input-output model for sampled-data control systems. IEEE Transactions on Automatic Control, AC-35(9): 1085–1088.
Alves, J., Lobo, J., and Dias, J. (2003). Camera-inertial sensor modelling and alignment for visual navigation. International Conference on Advanced Robotics, pp. 1693–1698.
Armesto, L. (2007). An asynchronous multi-rate approach to probabilistic self-localisation and mapping. Accepted in Robotics and Automation Magazine.
Armesto, L. and Tornero, J. (2003). Dual-rate high order holds based on primitive functions. American Control Conference, pp. 1140–1145.

Armesto, L. and Tornero, J. (2004). SLAM based on Kalman filter for multirate fusion of laser and encoder measurements. IEEE International Conference on Intelligent Robots and Systems, pp. 1860–1865.
Armesto, L., Chroust, S., Vincze, M., and Tornero, J. (2004). Multi-rate fusion with vision and inertial sensors. International Conference on Robotics and Automation, pp. 193–199.
Bradski, G. (2000). An open-source library for processing image data. Dr Dobbs Journal, November.
Carpenter, J., Clifford, P., and Fearnhead, P. (1997). An improved particle filter for non-linear problems. Technical Report, Department of Mathematics, Imperial College.
Chai, L., Hoff, W. A., and Vincent, T. (2002). Three-dimensional motion and structure estimation using inertial sensors and computer vision for augmented reality. Presence: Teleoperators and Virtual Environments, 11(5): 474–492.
Chou, J. (1992). Quaternion kinematic and dynamic differential equations. IEEE Transactions on Robotics and Automation, 8(1): 53–64.
Chroust, S. and Vincze, M. (2003). Fusion of vision and inertia data for motion and structure estimation. Journal of Robotic Systems, 21(2): 73–83.
Dissanayake, M., Newman, P., Clark, S., Durrant-Whyte, H., and Csorba, M. (2001). A solution to the simultaneous localization and map building (SLAM) problem. IEEE Transactions on Robotics and Automation, 17(3): 229–241.
Doucet, A., de Freitas, N., Murphy, K., and Russell, S. (2000). Rao-Blackwellised particle filtering for dynamic Bayesian networks. In Uncertainty in Artificial Intelligence.
Doucet, A., Gordon, N., and Krishnamurthy, V. (2001). Particle filters for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 49(3): 613–624.
Gemeiner, P., Armesto, L., Montes, N., Tornero, J., Vincze, M., and Pinz, A. (2006). Visual tracking can recapture after fast motions with the use of inertial sensors. Digital Imaging and Pattern Recognition, OAGM/AAPR, pp. 141–150.
Goddard, J. and Abidi, M. (1998). Pose and motion estimation using dual quaternion-based extended Kalman filtering. SPIE, 3313: 189–200.
Goodwin, G. and Feuer, A. (1992). Linear periodic control: A frequency domain viewpoint. Systems and Control Letters, 19: 379–390.
Gordon, N., Salmond, D., and Smith, A. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2): 107–113.
Grewal, M. S., Weill, L. R., and Andrews, A. P. (2001). Global Positioning Systems, Inertial Navigation, and Integration. John Wiley and Sons, Canada.
Gurfil, P. and Kasdin, N. (2002). Two-step optimal estimator for three dimensional target tracking. American Control Conference, pp. 209–214.
Huster, A. and Rock, S. (2003). Relative position sensing by fusing monocular vision and inertial rate sensors. International Conference on Advanced Robotics, pp. 1562–1567.
Huster, A. and Rock, S. (2001). Relative position estimation for intervention-capable AUVs by fusing vision and inertial measurements. International Symposium on Unmanned Untethered Submersible Technology.
Jekeli, C. (2001). Inertial Navigation Systems with Geodetic Applications. Walter de Gruyter.
Julier, S. and Uhlmann, J. (2002). Reduced sigma point filters for the propagation of means and covariances through nonlinear transformations. American Control Conference, Vol. 2, pp. 887–892.
Khargonekar, P., Poolla, K., and Tannenbaum, A. (1985). Robust control of linear time-invariant plants using periodic compensation. IEEE Transactions on Automatic Control, AC-30(11): 1088–1096.
Kranc, G. (1957). Input-output analysis of multirate feedback systems. IEEE Transactions on Automatic Control, AC-3: 21–28.
Lobo, J. and Dias, J. (1998). Integration of inertial information with vision. IEEE Industrial Electronics Society, pp. 1263–1267.
Longhi, S. (1994). Structural properties of multirate sampled systems. IEEE Transactions on Automatic Control, 39(3): 692–696.
Panerai, F., Metta, G., and Sandini, G. (2000). Visuo-inertial stabilization in space-variant binocular systems. Robotics and Autonomous Systems, 30(1–2): 195–214.
Rehbinder, H. and Ghosh, B. (2001). Multi-rate fusion of visual and inertial data. International Conference on Multi-Sensor Fusion and Integration for Intelligent Systems, pp. 97–102.
Smith, A. and Gelfand, E. (1992). Bayesian statistics without tears: A sampling-resampling perspective. American Statistician, 46(2): 84–88.
Tornero, J. (1985). Non-conventional sampled-data systems modelling. University of Manchester (UMIST), Control System Center Report, 640/1985.
Tornero, J., Albertos, P., and Salt, J. (1999). Periodic optimal control of multirate sampled data systems. 14th World Congress of IFAC, China, pp. 211–216.
Tornero, J., Gu, Y., and Tomizuka, M. (1999). Analysis of multi-rate discrete equivalent of continuous controller. American Control Conference, pp. 2759–2763.
Tornero, J. and Tomizuka, M. (2000). Dual-rate high order hold equivalent controllers. American Control Conference, pp. 175–179.
Tornero, J. and Tomizuka, M. (2002). Modeling, analysis and design tools for dual-rate systems. American Control Conference, pp. 4116–4121.

Ude, A. (1999). Filtering in a unit quaternion space for model-based object tracking. Robotics and Autonomous Systems, 28(2–3): 163–172.
Vincze, M., Ayromlou, M., Ponweiser, W., and Zillich, M. (2001). Edge projected integration of image and model cues for robust model-based object tracking. International Journal of Robotics Research, 20(7): 533–552.
Zhang, Z. (1998). A flexible new technique for camera calibration. Technical Report, Microsoft Research, Microsoft Corporation.
