Scientific Colloquium
Prepared by
VISHAL PARGUNDE
Abstract
To deal with the complexity in the sensor and action models, the models
can be assumed to be Gaussian distributed and linearly related to the location.
By representing the sensor models as probability distributions, the complexity
can be reduced significantly.
We will start our discussion of the Kalman Filter (KF) with the need for it
and a quick overview of how it is used. The basic concepts involved are then
explained through a simple example. A more detailed explanation follows,
considering the system equations, some diagrams, and the kind of problem the
KF solves. We are then ready to derive the KF equations and discuss their
meaning.
Finally, a simple system is assumed, its equations are derived, and the
system is simulated in a very small error environment to get a better
understanding of how the errors are suppressed.
Contents
1. Introduction
2. A Simple Example
3. Least Squares Estimation
4. Discrete Kalman Filter
5. Estimating a Random Constant
6. References
Chapter 1
Introduction
When confronted with a physical system, an engineer first develops a
mathematical model that adequately describes its behavior. With such a model
and the tools provided by system and control theories, he is able to investigate
the system structure and modes of response. If desired, he can design
compensators that alter these characteristics and controllers that provide
appropriate inputs to generate desired system responses.
There are three basic reasons why deterministic system and control theories
do not provide a totally sufficient means of performing this analysis and design.
First of all, no mathematical system model is perfect: any such model depicts
only those characteristics of direct interest for the engineer’s purpose.
The second shortcoming of such models is that dynamic systems are driven
not only by our own control inputs, but also by disturbances which we can
neither control nor model deterministically.
The third shortcoming is that sensors do not provide perfect and complete
data about a system: they introduce their own dynamics and distortions, and
their outputs are corrupted by noise.
Thus, there is a need for a system model that explicitly accounts for these
sources of uncertainty.
Before delving into the details of the filter, it is useful to see where we
are going conceptually. The discussion here is at a very elementary level, but
it provides insight into the underlying concepts.
The first question that comes to mind is: what exactly is a Kalman Filter? A
Kalman Filter is simply an optimal recursive data-processing algorithm.
“Optimal” here means optimal with respect to virtually any criterion that makes
sense. One aspect of this optimality is that the Kalman Filter incorporates all
information that can be provided to it. It processes all the available information
to estimate the current values of the variables of interest, using (1) knowledge
of the system and measurement-device dynamics, (2) the statistical description
of the system noises, measurement errors, and uncertainty in the dynamic
models, and (3) any available information about the initial conditions of the
variables of interest.
The word “recursive” means that, unlike certain data-processing concepts,
the Kalman Filter does not require all previous data to be kept in storage and
reprocessed every time a new measurement is taken. This is of vital
importance to the practicality of filter implementation.
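To illustrate what “recursive” buys us, here is a minimal Python sketch (an
illustration added here, not part of the original discussion) contrasting a batch
average, which must store and reprocess every measurement, with a recursive
running average that keeps only its current estimate and a sample count:

```python
def batch_average(measurements):
    # Requires every past measurement to be kept in storage.
    return sum(measurements) / len(measurements)

def recursive_average(estimate, count, z):
    # Incorporates the new measurement z using only the previous
    # estimate and the number of samples seen so far.
    count += 1
    estimate += (z - estimate) / count   # correction = gain * residual
    return estimate, count

estimate, count = 0.0, 0
for z in [1.2, 0.9, 1.1, 1.0]:
    estimate, count = recursive_average(estimate, count, z)
print(estimate)  # same result as batch_average, but with O(1) storage
```

The recursive form produces exactly the same answer with constant storage,
and its estimate-plus-gain-times-residual shape is the same shape the Kalman
Filter update will take.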
In practical applications, the “filter” is just a computer program in a central
processor. It inherently incorporates discrete-time measurement samples rather
than continuous-time inputs.
Figure 1.1 below depicts a typical situation in which a Kalman Filter can be
used advantageously. A system of some sort is driven by some known controls,
and measuring devices provide the values of certain pertinent quantities.
Knowledge of these system inputs and outputs is all that is explicitly available
from the physical system for estimation purposes.
The need for a filter becomes apparent now. Often, the variables of interest,
some finite number of quantities to describe the “state” of the system, cannot be
measured directly, and some means of inferring these values from the available
data must be generated. Furthermore, any measurement will be corrupted to
some degree by noise, biases, and device inaccuracies, and so a means of
extracting valuable information from a noisy signal must be provided as well. A
Kalman filter combines all available measurement data, plus prior
knowledge about the system and measuring devices, to produce an estimate of
the desired variables in such a manner that the error is minimized statistically.
In other words, if we were to run a number of candidate filters many times for
the same application, then the average results of the Kalman Filter would be
better than the average results of any other.
Figure 1.1: A typical situation in which a Kalman Filter can be used
A Kalman filter performs this propagation of the conditional probability
density for problems in which the system is described through a linear model
and in which the system and measurement noises are white and Gaussian.
Under these conditions, a “best” estimate of the value of x can be propagated.
“Whiteness” implies that the noise value is not correlated in time. Stated
more simply, if you know the value of the noise now, this knowledge is of no
use in predicting its value at any other time. Whiteness also implies that the
noise has equal power at all frequencies. Since this would result in a noise with
infinite power, white noise obviously cannot exist. Why, then, do we consider
such a noise at all? The answer is simple.
Figure 1.2
First, any physical system of interest has a certain frequency “band pass”
i.e., a frequency range of inputs to which it can respond. Above this range, the
input either has no effect, or the system so severely attenuates the effect that it
essentially does not exist. In Fig. 1.2, a typical system band pass curve is drawn
on a plot of “power spectral density” (interpreted as the amount of power
content at a certain frequency) versus frequency.
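As a quick numerical aside (a sketch added here, assuming zero-mean,
unit-variance Gaussian samples), the “uncorrelated in time” property can be
checked by estimating the sample autocorrelation of generated noise:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, size=100_000)   # zero-mean Gaussian noise samples

def autocorr(x, lag):
    # Sample correlation between x(t) and x(t + lag).
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

print(autocorr(w, 1))    # ~0: knowing the noise now tells us nothing about
print(autocorr(w, 10))   # its value at any other time, at any lag
```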
Chapter 2
A Simple Example
Suppose that you are lost at sea during the night and have no idea at all of
your location, so you take a star sighting to establish your position (for the sake
of simplicity, consider only one dimension). At some time t1, you determine
your location to be z1. However, because of inherent measuring-device
inaccuracies, human error, and the like, the result of your measurement is
somewhat uncertain. Say you decide that the precision is such that the standard
deviation (one-sigma value) involved is σz1. The corresponding variance, or
second-order statistic, is σz1². Thus, you establish the conditional probability
density of x(t1), your position at time t1, conditioned on the observed value of
the measurement being z1, as depicted in the figure below (Fig. 2.1).
Figure 2.1
This plot tells the probability of being in any one location, based upon the
measurement taken.
Based on this conditional density, the best estimate of your position is

x̂(t1) = z1 (2.1)

and the variance of the error in that estimate is

σx²(t1) = σz1² (2.2)
Now, say a trained navigator friend takes an independent fix right after the
initial reading is taken, at time t2 ≈ t1 (so that the true position has not changed
at all), and obtains a measurement z2 with a variance σz2². Because of his
higher skill, assume the variance in this measurement to be smaller than in the
earlier one. Fig. 2.2 depicts the conditional density of your position at time t2,
based only on the measured value z2. Note the narrower peak due to the
smaller variance, indicating higher certainty of the position.
Figure 2.2
At this point, we have two measurements available for estimating your position.
The question is: how do we combine these data? It will be shown subsequently
that, based on the assumptions made, the conditional density of your position at
time t2 ≈ t1, x(t2), given both z1 and z2, is a Gaussian density with mean µ and
variance σ², as indicated in Fig. 2.3, with

µ = [σz2² / (σz1² + σz2²)] z1 + [σz1² / (σz1² + σz2²)] z2 (2.3)

1/σ² = 1/σz1² + 1/σz2² (2.4)

Note from equation (2.4) that σ is less than either σz1 or σz2, which is to say that
the uncertainty in our estimate of position has been decreased by combining the
two pieces of information.
Given this density, the best estimate of your position is

x̂(t2) = µ (2.5)
Here we can observe that the form of µ given in equation (2.3) makes good
sense. If σz1 were equal to σz2, which is to say the measurements are of equal
precision, the equation says the optimal estimate of position is simply the
average of the two measurements, as would be expected. On the other hand, if
σz1 were greater than σz2, which is to say the uncertainty involved in the
measurement z1 is greater than that in z2, then the equation dictates
“weighting” z2 more heavily than z1. Finally, the variance of the estimate is
less than σz1 even if σz2 is very large: even poor-quality data provide some
information, and should thus increase the precision of the filter output.
The estimate can be rewritten as

x̂(t2) = [σz2² / (σz1² + σz2²)] z1 + [σz1² / (σz1² + σz2²)] z2
      = z1 + [σz1² / (σz1² + σz2²)] (z2 - z1) (2.6)
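The combination rule of equations (2.3), (2.4), and (2.6) is easy to verify
numerically. Below is a minimal Python sketch; the function name and the
measurement values are illustrative, not from the text:

```python
def fuse(z1, var1, z2, var2):
    # Combined mean, eq. (2.3): precision-weighted average of z1 and z2.
    mu = (var2 * z1 + var1 * z2) / (var1 + var2)
    # Combined variance, eq. (2.4): 1/var = 1/var1 + 1/var2.
    var = (var1 * var2) / (var1 + var2)
    return mu, var

# Illustrative values: the navigator's fix z2 has the smaller variance.
mu, var = fuse(z1=10.0, var1=4.0, z2=12.0, var2=1.0)
print(mu, var)  # 11.6, 0.8 -- var is smaller than either 4.0 or 1.0
```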
Below is the diagram of the combined estimate after both measurements have been taken.
Figure 2.3
The final form that is actually used in Kalman Filter implementations can be
written as

x̂(t2) = x̂(t1) + K(t2) [ z2 - x̂(t1) ] (2.7)

where

K(t2) = σz1² / (σz1² + σz2²) (2.8)

These equations say that the optimal estimate at time t2, x̂(t2), is equal to
the best prediction of its value before z2 is taken, x̂(t1), plus a correction term of
an optimal weighting value times the difference between z2 and the best
prediction of its value before it is actually taken, x̂(t1).
When the next measurement is taken, the difference between it and its
predicted value, which is also called the “residual”, is used to correct the
prediction of the desired variables.
Using K(t2) from (2.8), the variance equation given by (2.4) can be
rewritten as

σx²(t2) = σx²(t1) - K(t2) σx²(t1) (2.9)
Note that the values of x̂(t2) and σx²(t2) embody all of the information in the
final conditional probability density shown in Fig. 2.3. Stated differently, by
propagating these two variables, the conditional probability density of your
position at time t2, given z1 and z2, is completely specified.
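Written in the gain form of (2.7) through (2.9), the same combination becomes
a single predictor-corrector step. A minimal Python sketch (names and values
illustrative, not from the text):

```python
def update(x_hat, var, z, var_z):
    # Gain, eq. (2.8); correction, eq. (2.7); variance update, eq. (2.9).
    K = var / (var + var_z)
    x_hat = x_hat + K * (z - x_hat)   # correct by gain * residual
    var = var - K * var               # uncertainty never increases here
    return x_hat, var

# Same numbers as before: start from the first fix, fold in the second.
x_hat, var = update(x_hat=10.0, var=4.0, z=12.0, var_z=1.0)
print(x_hat, var)  # 11.6, 0.8 -- identical to the direct fusion above
```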
Chapter 3
Least Squares Estimation
The Gaussian concept of estimation by least squares, originally
stimulated by astronomical studies, has provided the basis for a number of
estimation theories and techniques during the ensuing 170 years, probably none
as useful in terms of today’s requirements as the Kalman Filter.
Gauss invented and used the method of least squares, the estimation
principle on which the Kalman Filter ultimately rests. Let us consider Gauss’s
definition of the method.
He suggested that the most appropriate values for the unknown but
desired parameters are the most probable values, defined in the following
manner: “The most probable value of the unknown quantities will be that in
which the sum of the squares of the differences between the actually observed
and the computed values, multiplied by numbers that measure the degree of
precision, is a minimum”. The difference between the actual measured value
and the predicted value is called the “residual”.
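Gauss’s criterion can be demonstrated in a few lines. For repeated
measurements of a single unknown, minimizing the precision-weighted sum of
squared residuals yields the precision-weighted average, which is exactly the
form of equation (2.3). A small Python sketch with illustrative values:

```python
# Gauss's criterion for a single unknown x: minimize
#   J(x) = sum_i w_i * (z_i - x)^2,  with w_i = 1 / var_i (precision weights).
# Setting dJ/dx = 0 gives x* = sum(w_i * z_i) / sum(w_i).

measurements = [10.0, 12.0]   # observed values z_i (illustrative)
variances = [4.0, 1.0]        # measurement variances (illustrative)

weights = [1.0 / v for v in variances]
x_star = sum(w * z for w, z in zip(weights, measurements)) / sum(weights)
print(x_star)  # 11.6 -- identical to the combined estimate of Chapter 2
```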
Chapter 4
Discrete Kalman Filter
In 1960, R. E. Kalman published his famous paper describing a recursive
solution to the discrete-data linear filtering problem [Kalman60]. Since that
time, due in large part to advances in digital computing, the Kalman filter has
been the subject of extensive research and application, particularly in the area of
autonomous or assisted navigation. A very “friendly” introduction to the general
idea of the Kalman filter can be found in Chapter 1 of [Maybeck79], while a
more complete introductory discussion can be found in [Sorenson70], which
also contains some interesting historical narrative.
The Kalman filter addresses the general problem of trying to estimate the
state x of a discrete-time controlled process that is governed by the linear
stochastic difference equation

xj = a xj-1 + b uj-1 + wj-1 (4.1)

with a measurement given as

zj = h xj + vj (4.2)

The random variables wj and vj represent the process and measurement
noise, respectively. They are assumed to be independent of each other, white,
and normally distributed:

p(w) ~ N( 0, Q ) (4.3)
p(v) ~ N( 0, R ) (4.4)
The n × n matrix “a” in the difference equation (4.1) relates the state at the
previous time step to the state at the current step, in the absence of either a
driving function or process noise. In practice “a” might change with each time
step, but here we assume it is constant. The n × l matrix “b” relates the optional
control input “u” to the state “x”. The m × n matrix “h” in the measurement
equation (4.2) relates the state to the measurement zj. In practice “h” might
change with each time step or measurement, but here we assume it is constant.
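To make the model concrete, the following short sketch (with assumed scalar
values; nothing here is prescribed by the text) simulates the process (4.1) and
the measurement (4.2), with noise drawn according to (4.3) and (4.4):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, h = 1.0, 0.0, 1.0        # constant model parameters, as assumed above
Q, R = 1e-5, 0.01              # process and measurement noise variances

x = 0.5                        # assumed initial true state
for j in range(10):
    w = rng.normal(0.0, np.sqrt(Q))   # process noise, p(w) ~ N(0, Q)
    v = rng.normal(0.0, np.sqrt(R))   # measurement noise, p(v) ~ N(0, R)
    x = a * x + b * 0.0 + w           # state evolution, eq. (4.1), with u = 0
    z = h * x + v                     # measurement, eq. (4.2)
    print(j, x, z)
```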
We define x̂j⁻ (note the “super minus”) to be our a priori state estimate
at step j, given knowledge of the process prior to step j, and x̂j to be our a
posteriori state estimate at step j, given measurement zj. We can then define
a priori and a posteriori estimate errors as

ej⁻ = xj - x̂j⁻ and ej = xj - x̂j

The a priori estimate error variance is then

Pj⁻ = E{ (ej⁻)² } (4.5)

and the a posteriori estimate error variance is

Pj = E{ (ej)² } (4.6)
In deriving the equations for the Kalman filter, we begin with the goal of
finding an equation that computes an a posteriori state estimate x̂j as a linear
combination of an a priori estimate x̂j⁻ and a weighted difference between an
actual measurement zj and a measurement prediction h x̂j⁻, as shown below:

x̂j = x̂j⁻ + K ( zj - h x̂j⁻ ) (4.7)
The difference ( zj - h x̂j⁻ ) in (4.7) is called the measurement “residual”. It
reflects the discrepancy between the predicted measurement and the actual
measurement. A residual of zero means that the two are in complete agreement.
The n × m matrix K in (4.7) is chosen to be the gain, or blending factor,
that minimizes the a posteriori error variance in (4.6). This minimization can be
accomplished by first substituting (4.7) into the definition of ej, substituting that
into (4.6), performing the indicated expectations, taking the derivative of the
trace of the result with respect to K, setting the result equal to zero, and then
solving for K. One form of the resulting K that minimizes (4.6) is given by

Kj = Pj⁻ hᵀ ( h Pj⁻ hᵀ + R )⁻¹ (4.8)
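Rather than repeating the full derivation, we can check numerically that this
gain does minimize the a posteriori error variance. In the scalar case with h = 1,
substituting (4.7) into ej gives ej = (1 - K) ej⁻ - K vj, so P(K) = (1 - K)² Pj⁻ + K² R.
A brute-force Python sketch with illustrative values:

```python
import numpy as np

P_minus, R = 4.0, 1.0   # illustrative a priori variance and noise variance

def posterior_var(K):
    # A posteriori error variance as a function of the gain (h = 1):
    # P(K) = (1 - K)^2 * P_minus + K^2 * R
    return (1 - K) ** 2 * P_minus + K ** 2 * R

Ks = np.linspace(0.0, 1.0, 1001)
print(Ks[np.argmin(posterior_var(Ks))])  # ~0.8, found by brute force
print(P_minus / (P_minus + R))           # 0.8, the analytic gain from (4.8)
```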
We will begin this section with a broad overview, covering the “high-
level” operation of one form of the discrete Kalman filter. After presenting this
high-level view, we will narrow the focus to the specific equations and their use
in this version of the filter.
The Kalman filter estimates a process by using a form of feedback
control: the filter estimates the process state at some time and then obtains
feedback in the form of (noisy) measurements. As such, the equations for the
Kalman filter fall into two groups: time update equations and measurement
update equations. The time update equations are responsible for projecting
forward (in time) the current state and error covariance estimates to obtain the a
priori estimates for the next time step. The measurement update equations are
responsible for the feedback—i.e. for incorporating a new measurement into the
a priori estimate to obtain an improved a posteriori estimate.
The time update equations can also be thought of as predictor equations,
while the measurement update equations can be thought of as corrector
equations. Indeed, the final estimation algorithm resembles a predictor-
corrector algorithm for solving numerical problems, as shown in the figure
below.
Figure 4.1
The above figure shows the ongoing cycle: the time update projects the current
state estimate ahead in time, and the measurement update adjusts the projected
estimate by an actual measurement at that time.
Figure 4.2
The above figure gives the final set of equations. The first task during the
measurement update is to compute the Kalman gain, K. The next step is to
actually measure the process to obtain zj, and then to generate an a posteriori
state estimate by incorporating the measurement. The final step is to obtain an
a posteriori error variance estimate.
After each time and measurement update pair, the process is repeated, with the
previous a posteriori estimates used to project the new a priori estimates. This
recursive nature is one of the very appealing features of the Kalman Filter: it
makes practical implementations much more feasible, and it recursively
conditions the current estimate on all of the past measurements.
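The two groups of equations map directly onto a predict-then-correct loop.
Below is a minimal scalar Python sketch of one complete cycle under the
constant a, b, h assumptions of (4.1) and (4.2); the function and parameter
names are my own, not from the text:

```python
def time_update(x_hat, P, u, a=1.0, b=0.0, Q=0.0):
    # Predictor: project the state estimate and error variance ahead in time.
    x_hat_minus = a * x_hat + b * u
    P_minus = a * a * P + Q
    return x_hat_minus, P_minus

def measurement_update(x_hat_minus, P_minus, z, h=1.0, R=1.0):
    # Corrector: compute the gain per (4.8), fold in the measurement per
    # (4.7), and shrink the error variance.
    K = P_minus * h / (h * h * P_minus + R)
    x_hat = x_hat_minus + K * (z - h * x_hat_minus)
    P = (1 - K * h) * P_minus
    return x_hat, P
```

Each pass through time_update and then measurement_update conditions the
estimate on one more measurement, without revisiting any of the earlier ones.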
Chapter 5
Estimating a Random Constant
In the previous sections, we presented the form of the Kalman Filter and
derived its equations. To help develop a better feel for the operation and
capability of the filter, we present a very simple example here: estimating a
scalar random constant, a voltage for example, from noisy measurements.
The state does not change from step to step, so “a” = 1. There is no control
input, so “u” = 0. Our noisy measurement is of the state directly, so “h” = 1.
The time update equations are therefore

x̂j⁻ = x̂j-1
Pj⁻ = Pj-1 + Q
and our measurement update equations are
K = Pj⁻ ( Pj⁻ + R )⁻¹
x̂j = x̂j⁻ + K ( zj - x̂j⁻ )
Pj = ( 1 - K ) Pj⁻
Presuming a very small process variance, we let Q = 1e-5. (We could
certainly let Q = 0, but assuming a small non-zero value gives us more
flexibility in “tuning” the filter, as we will demonstrate below.) Let us assume
that from experience we know that the true value of the random constant has a
standard normal probability distribution, so we will “seed” our filter with the
guess that the constant is 0. In other words, x̂0 = 0.
Similarly, we need to choose an initial value for P, call it P0. If we were
absolutely certain that our initial state estimate x̂0 = 0 was correct, we would let
P0 = 0. However, given the uncertainty in our initial estimate, choosing P0 = 0
would cause the filter to initially and always believe that the state is zero. As it
turns out, the choice is not critical: we could choose almost any P0 ≠ 0 and the
filter would eventually converge. We will start our filter with P0 = 1.
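A minimal NumPy sketch of this simulation is given below. It follows the
parameters stated above (Q = 1e-5, x̂0 = 0, P0 = 1); the measurement noise
standard deviation of 0.1 (so R = 0.01) and the 50-iteration count are taken from
the first simulation described below:

```python
import numpy as np

rng = np.random.default_rng(0)
n_iter = 50
x = -0.37727                            # true value of the random constant
z = x + rng.normal(0.0, 0.1, n_iter)    # noisy measurements, R = 0.1**2

Q, R = 1e-5, 0.01                       # process and measurement variances
x_hat, P = 0.0, 1.0                     # seeds: x_hat_0 = 0, P_0 = 1

for zj in z:
    # Time update (a = 1, u = 0): estimate carries forward, variance grows.
    x_hat_minus, P_minus = x_hat, P + Q
    # Measurement update (h = 1): gain, correction, variance shrink.
    K = P_minus / (P_minus + R)
    x_hat = x_hat_minus + K * (zj - x_hat_minus)
    P = (1 - K) * P_minus

print(x_hat, P)  # estimate near -0.377; P settles to about 0.0002
```

Re-running the same loop with R = 1 or R = 0.0001 reproduces the “slower”
and “quicker” behavior discussed for Figures 5.3 and 5.4 below.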
Figure 5.1
Figure 5.1 depicts the first simulation: R = 0.01. The true value of the
random constant x = -0.37727 is given by the solid line, the noisy measurements
by the cross marks, and the filter estimate by the remaining curve.
Figure 5.2
As depicted in the figure above, after 50 iterations our initial (rough)
error variance choice of 1 has settled to about 0.0002 (Volts)².
Now we will observe the effect of changing the value of R. Here, we are
going to increase R by a factor of 100.
Figure 5.3
Figure 5.3 depicts the second simulation, with R = 1. We can see the filter is
slower to respond to the measurements, resulting in reduced estimate variance.
This signifies that it is “slower” to believe the measurements.
Figure 5.4
Figure 5.4 depicts the third simulation, with R = 0.0001. The filter responds
to the measurements quickly, increasing the estimate variance. So it is very
“quick” to believe the noisy measurements.
While the estimation of a constant is relatively straightforward, this example
clearly demonstrates the workings of the Kalman Filter. In Figure 5.3 in
particular, the Kalman “filtering” is evident, as the estimate appears
considerably smoother than the noisy measurements.
Chapter 6
References
[Kalman60] Kalman, R. E. “A New Approach to Linear Filtering and Prediction Problems,” Transactions of the ASME, Journal of Basic Engineering, Vol. 82, pp. 35-45, March 1960.
[Maybeck79] Maybeck, Peter S. Stochastic Models, Estimation, and Control, Volume 1, Academic Press, New York, 1979.
[Sorenson70] Sorenson, H. W. “Least-Squares Estimation: From Gauss to Kalman,” IEEE Spectrum, Vol. 7, pp. 63-68, July 1970.