
Kalman Filter: Lecture Notes

Prasanth Prahladan, 04 November 2012

1 Motivation for a Filter

A dynamical system, when formulated in the State-Space framework, is generally represented by a pair of equations representing two aspects of the system: the Process Model and the Measurement Model. Mathematically, a continuous-time dynamical system is described as follows:

ẋ = f(x, u)   : Process Model
y = g(x, u)   : Measurement Model    (1)

where (x, u) = (State vector, Control input).

Some questions to ponder about:

1. What information about the system do we get from the description above? What information does the Process Model carry? What information does the Measurement Model carry? What information does the model NOT CARRY, i.e. what do we NOT KNOW from the model?

2. Why do we need to determine the state of the system from the measurements? When is this possible? Is it only when the system is OBSERVABLE?

3. What do we mean by OBSERVABLE? How do we determine observability for a linear and a non-linear system?

We need to remember that, at any given time, we only have the measurements from the system. We do not have an exact idea of the state of the system, since it is an abstraction that we have developed to describe how the system evolves. The state vector as we know it may not be the state vector that Nature uses to describe the system.

So, once we have defined a system model as above, and we continuously obtain measurements about the system, do we have all the information we need about the system? What is the challenge we are trying to solve? It is easy to believe that, with all the information we have - the system model + measurements - we know everything needed to describe the system. However, we are missing one CRITICAL piece of INFORMATION: the INITIAL STATE (X_0) of the system. To understand the intuition behind this, consider the differential equation:

dx/dt = 1/x

The dynamical system above has the solution:

x(t_1)^2 - x(t_0)^2 = 2 (t_1 - t_0),   i.e.   x(t_1) = √( x(t_0)^2 + 2 (t_1 - t_0) )

This shows that the state at any time t_1, x(t_1), is a function of its initial state x(t_0) and the time interval of integration (t_1 - t_0).
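As a quick numerical sanity check of this dependence on the initial state (a sketch added here for illustration, not part of the original notes), the snippet below integrates dx/dt = 1/x with a standard ODE solver and compares the result against the closed-form expression for two different initial states:

```python
import numpy as np
from scipy.integrate import solve_ivp

def closed_form(x0, t0, t1):
    # x(t1)^2 - x(t0)^2 = 2 (t1 - t0)  =>  x(t1) = sqrt(x0^2 + 2 (t1 - t0))
    return np.sqrt(x0**2 + 2.0 * (t1 - t0))

t0, t1 = 0.0, 1.0
for x0 in (1.0, 3.0):                      # two different initial states
    sol = solve_ivp(lambda t, x: 1.0 / x, (t0, t1), [x0], rtol=1e-9)
    print(x0, sol.y[0, -1], closed_form(x0, t0, t1))
# Different x(t0) give different x(t1): the trajectory depends on the initial state.
```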

The generalized equations for a deterministic system, when modified to account for Noise/Uncertainty, become:

ẋ = f(x, u) + w
y = g(x, u) + v    (2)

where w, v represent noise vectors, which are generally assumed to be zero-mean, finite-covariance Gaussians, N(Mean, Covariance). The system is now classified as a Stochastic Process.

Why is this assumption acceptable? The distribution of the noise in the Process and Measurement Models is generally NOT EXACTLY known. However, for most of the system analysis we ASSUME that the distribution is exactly known, so that we can proceed with closed-form derivation and estimation. There are specific methods to identify the noise distributions; however, we shall not cover them in this tutorial.

Task 1: What are the methods to determine the noise distribution for a given system model? Are the methods for determining the Process Noise different from those for determining the Measurement Noise?

For ease of illustration, we shall consider a discrete-time linear time-invariant system as below:

x_{k+1} = A x_k + B u_k + w_k,   where w_k ~ N(0, Q)
y_k = C x_k + D u_k + v_k,   where v_k ~ N(0, R)

In the above system, we assume that the noise distributions are exactly known to be zero-mean Gaussians with covariance matrices Q, R. To summarize:

Information we EXACTLY KNOW: A, B, C, D, Q, R, and the recorded measurements y_k.
Information we LACK: the Initial State (X_0).

Trivia 1: What information do the Mean and the Covariance of the noise distribution convey? When considering a Gaussian noise distribution with (Mean, Covariance) = (M, C), the estimate of the noise amplitude is M; however, the true noise amplitude at any time instant is spread around M, with a spread determined by the covariance C.

Earlier, we considered the notion of observability for a deterministic system. We also realized that for a linear observable system there exists a closed-form expression for the initial state of the system. Thus, with knowledge of the system model and measurement data obtained from the model, we can EXACTLY determine the Initial State (X_0) of the system. But for a linear stochastic process with zero-mean Gaussian noise, is there a means to identify the Initial State? NO. The Initial State CANNOT be EXACTLY determined. However, we may now define a notion of Optimality of the state estimates, since for a stochastic process every estimate is associated with a measure of the reliability/trust-worthiness of that estimate.
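Returning to the discrete-time LTI model defined above, here is a minimal simulation sketch of such a system with Gaussian process and measurement noise. The matrices A, B, C, D, Q, R below are made-up illustrative values, not taken from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (made-up) system matrices for a 2-state, 1-output system.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))
Q = 0.01 * np.eye(2)          # process-noise covariance
R = np.array([[0.25]])        # measurement-noise covariance

def step(x, u):
    """One step of x_{k+1} = A x_k + B u_k + w_k, y_k = C x_k + D u_k + v_k."""
    w = rng.multivariate_normal(np.zeros(2), Q)
    v = rng.multivariate_normal(np.zeros(1), R)
    y = C @ x + D @ u + v
    x_next = A @ x + B @ u + w
    return x_next, y

x = np.array([2.0, -1.0])     # the "unknown" initial state X_0
u = np.array([0.0])
for k in range(5):
    x, y = step(x, u)
    print(k, y)                # only the noisy outputs y_k are observed
```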

Questions to ponder upon: What do we mean by Optimality of an Estimate? Is there a notion of an Optimal Filter? What are the criteria for determining the Optimality of an Estimate, or of a Filter? How does a change in the definition of Optimality change the algorithm adopted to find the Estimates? Is this feasible, or do we always have only one unique definition of Optimality?

2 Kalman Filter

With the motivation for the Kalman Filter described above, let us now proceed to derive the expressions for the filter. However, we shall adopt the procedure of first understanding the intuition behind the working of the filter, before we dive deep into the details.

2.1 Optimality using Measurement Model

At any given time, we obtain a measurement y_k of the system. For a given state estimate made from this measurement, it is possible for us to calculate the expected measurement using the Measurement Model:

Time | Measurement | State Estimate | Output Estimate
  k  |     y_k     |      x̂_k      |       ŷ_k

ŷ_k = C x̂_k   (since E[v_k] = 0)

So, what would we like to optimize?

minimize_{x̂}  (y_k - ŷ_k)

or, as a scalar criterion,

minimize_{x̂}  (y_k - ŷ_k)^T (y_k - ŷ_k)

The x̂ that minimizes the above expression is the optimal state estimate x̂_k for that given time step. However, is this the best we can do? The expression above says that all the measurements are equally weighted. However, in the measurement model we considered a covariance matrix R, which defined the reliability of each sensor measurement. Intuitively, wouldn't we like to give higher weights to the more reliable measurements and lower weights to the less reliable measurements? Thus, to improve the optimality criterion above, we need to take the noise covariance of the Measurement Model into account. This is incorporated as below:

minimize_{x̂}  (y_k - ŷ_k)^T R^{-1} (y_k - ŷ_k)

Assume that the measurement noise covariance matrix R is a diagonal matrix, i.e. the noise in each sensor is independent of the others; the cross-covariance terms for all sensors are zero in the covariance matrix. A higher covariance value C_Cov implies lower reliability, and hence an appropriate weight for those measurements is 1/C_Cov → 0. This idea is incorporated by including the weight matrix R^{-1} in the expression above.
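As a small illustration of why the R^{-1} weighting matters (the numbers and sensor setup below are invented for this sketch), consider a scalar state observed by two sensors of very different quality; the weighted least-squares estimate leans towards the more reliable sensor:

```python
import numpy as np

# Hypothetical example: a scalar state measured by two sensors of unequal quality.
C = np.array([[1.0],
              [1.0]])                 # both sensors observe the same state
R = np.diag([0.01, 1.0])              # sensor 1 is far more reliable than sensor 2
y = np.array([5.02, 6.3])             # one set of noisy readings

Rinv = np.linalg.inv(R)
# Weighted least-squares estimate: argmin_x (y - C x)^T R^{-1} (y - C x)
x_hat = np.linalg.solve(C.T @ Rinv @ C, C.T @ Rinv @ y)
print(x_hat)   # close to sensor 1's reading, since it carries the larger weight 1/R_11
```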

However, the estimate above is incomplete. We have incorporated only knowledge about the measurement noise distribution, while ignoring the process noise distribution. We now need to find a means of incorporating the Process Model into the optimal state estimation problem.

2.2 Optimality using Process Model

Consider a system model in which the Process Model is exact, i.e. without any noise:

x_{k+1} = x_k
y_k = x_k + v_k    (3)

We have assumed a steady-state process model: all future states are equal to the initial state X_0. However, the measurements made from this steady-state process contain noise, and hence we need to determine an estimate of the initial state. Since we know that E[v_k] = 0, the measurement model gives E[y_k] = E[x_k], which suggests estimating the state by averaging the measurements:

x̂_k = (1/k) Σ_{i=1}^{k} y_i
    = (1/k) [ Σ_{i=1}^{k-1} y_i + y_k ]
    = ((k-1)/k) [ (1/(k-1)) Σ_{i=1}^{k-1} y_i ] + (1/k) y_k
    = ((k-1)/k) x̂_{k-1} + (1/k) y_k

i.e.   x̂_{k|k} = (1 - K) x̂_{k-1|k-1} + K y_k,   with weight K = 1/k.

Meaning, the state estimate x̂_{k|k}, given all measurements from time 1 to k, is obtained by a linear combination of the state estimate at the previous time step, x̂_{k-1|k-1}, and the new measurement y_k. Therefore, to determine the optimal state estimate we need to determine the weight K. This optimization problem may be solved in many ways. The Kalman Filter provides a unique iterative method to solve it: it has been proved that the procedure delineated by the Kalman Filter provides the optimal solution for a linear system with Gaussian noise. Any alternative method can only provide solutions that are poorer than or equal to the KF solution; the estimates cannot get any better than the KF solution. The weight K is then called the Kalman Coefficient (Kalman gain).

However, if the process is non-linear, or the system model comprises non-Gaussian noise distributions, then the solutions obtained using the KF tend to be sub-optimal, and we need better strategies to handle those scenarios.

Further, we learnt above that x̂_{k|k} = f(x̂_{k-1|k-1}, y_k). Would it not be possible to obtain a better estimate by including the information obtained from all the previous measurements as well, i.e. is x̂_{k|k} = f(x̂_{k-1|k-1}, y_k, y_{k-1}, y_{k-2}, ..., y_1) a better state-estimate relationship?

Answer: The Kalman Filter methodology adopts the concept of Innovation, where the information contained in each measurement y_k is considered to be orthogonal/independent of that contained in any previous measurement y_{k-i}, i = 1, ..., k-1. Further, all the information contained in the measurements made from time 0 to k-1 that is required for state estimation is optimally summarized in the single value x̂_{k-1|k-1}, and any new measurement y_k contains information completely uncorrelated with the information contained in the previous measurements. However, we may obtain a better state estimate by considering future measurements as well, i.e. x̂_{k|k} = f(x̂_{k-1|k-1}, y_k, y_{k+1}, ..., y_{k+d}). Such an algorithm, however, ceases to be a FILTER and becomes a SMOOTHER.

Therefore, to summarize the lessons learnt from the two simple situations studied above:

1. Optimality of an estimate is obtained by determining the state estimate that minimizes the weighted difference between the current measurement and the predicted measurement obtained from the measurement model.

2. The complete state estimate should incorporate both the process and the measurement noise. An iterative procedure would thus obtain the current state estimate by linearly combining the current measurement with the previous state estimate.

3. A Filter for a stochastic process should provide two things: (an Estimate, a measure of the Quality of that Estimate). Generally, the Quality of the Estimate is a SCALAR value, computed as Trace[Covariance Matrix, P]. An optimal filter should attempt to minimize Trace[Covariance Matrix, P] (or maximize the Quality) of the estimates it produces as time progresses.
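Returning to the recursive-averaging update derived above, the following short Python sketch (purely illustrative, not from the notes) confirms that the recursion x̂_k = ((k-1)/k) x̂_{k-1} + (1/k) y_k reproduces the batch average of all measurements, which is the intuition behind combining the previous estimate with the new measurement:

```python
import numpy as np

rng = np.random.default_rng(1)
x_true = 4.2                                  # unknown constant state X_0
y = x_true + rng.normal(0.0, 1.0, size=200)   # noisy measurements y_k = x + v_k

x_hat = 0.0
for k, yk in enumerate(y, start=1):
    K = 1.0 / k                               # weight given to the new measurement
    x_hat = (1.0 - K) * x_hat + K * yk        # recursive form of the running mean

print(x_hat, y.mean())   # the two agree: the recursion summarizes all past data
```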

2.3 Kalman Filter Derivations

To derive an iterative filter for determining the optimal state estimate of the system, we consider the equation derived above:

x̂_{k|k} = (1 - K) x̂_{k-1|k-1} + K y_k
x̂_{k|k} = x̂_{k-1|k-1} + K (y_k - x̂_{k-1|k-1})    (4)

which is valid for the system model:

x_{k+1} = x_k
y_k = x_k + v_k    (5)

Now, considering a steady-state process, we can argue that:

x̂_{k|k-1} = x̂_{k-1|k-1}
ŷ_{k|k-1} = x̂_{k|k-1}   (since E[v_k] = 0)
          = x̂_{k-1|k-1}    (6)

where (x̂_{k|k-1}, ŷ_{k|k-1}) = (Predicted State, Predicted Output), given measurements up to time (k-1). Therefore, we have the final form of the iterative optimal state estimator:

x̂_{k|k} = x̂_{k|k-1} + K (y_k - ŷ_{k|k-1})

Qualitatively, it means: the optimal state estimate at time k, x̂_{k|k}, is a linear combination of the predicted state estimate at time k, x̂_{k|k-1}, and the difference between the current measurement y_k and the predicted measurement ŷ_{k|k-1}, all predictions being made with the measurement information up to the previous time step.

The Kalman Filter may thus be considered to be made up of two logical steps:

1. PREDICTOR STAGE: Predict/propagate a future state estimate and confidence measure.

2. CORRECTOR STAGE: Correct the predictions made, by assimilating the information from the new measurement.

Further, to derive an iterative filter, we need to understand two basic steps:

1. The initialization step of the filter.

2. How to progress from one time step to the next, i.e. k → (k + 1).

Why do we need to understand the initialization stage of an iterative filter? To understand the intuition behind this step, let us consider the following example. Assume we have the system model of a stochastic process, and assume that we know the initial state (X_0) and the confidence measure (P_0) of that estimate. What can you say about the procedure to be adopted for making estimates of the state vector at all future times, considering that we obtain measurements y_k for 0 ≤ k < ∞?

Case 1: P_0 = 0. Since P_0 = 0 implies that we are VERY CONFIDENT of the initial state X_0 obtained without any measurements from the system, by induction every open-loop prediction of the state vector would be a good estimate. Therefore, to determine the state estimates of this system, WE NEED NOT TAKE ANY MEASUREMENTS OF THE SYSTEM: using the system model and X_0 we can compute the future state estimates exactly. Question: Is it not possible for the confidence measure to become worse as time progresses? How do we account for that?

Case 2: P_0 = ∞. Since X_0 is COMPLETELY UNRELIABLE for obtaining any reliable future state estimate, we NEED to consider all measurements made from the system. Therefore, if we start off with ZERO CONFIDENCE (P_0 = ∞) in the initial state of the system and proceed with making measurements from the system, we can assume that a stage will arise when we become fairly confident about the state estimates obtained from the filter. At this stage of HIGH CONFIDENCE (P_k → 0) we can stop relying on the information obtained from each new measurement. The system is assumed to have attained its STATIONARY STATE.

With the above qualitative ideas in mind, let us proceed with the Kalman Filter derivations. The system model under study is:

x_{k+1} = A x_k + B u_k + w_k,   where w_k ~ N(0, Q)
y_k = C x_k + D u_k + v_k,   where v_k ~ N(0, R)

Without loss of generality, we may neglect the deterministic control input, which is exactly known and thus only adds a BIAS to the stochastic process.

The new system equations become:

x_{k+1} = A x_k + w_k,   where w_k ~ N(0, Q)
y_k = C x_k + v_k,   where v_k ~ N(0, R)

Known parameters: A, C, Q, R.

Assumptions about the Noise (what is the intuition behind these assumptions?):
E[w_k w_k^T] = Q,   E[w_k w_{k-1}^T] = 0,   E[v_k v_k^T] = R    (7)

PREDICTOR STAGE: At time instant k, measurement y_k is made. The state estimate x̂_{k|k} and the estimate confidence measure P_{k|k} are assumed to be known.

x̂_{k+1|k} = A x̂_{k|k}   (since E[w_k] = 0)

P_{k+1|k} = E[(x_{k+1} - x̂_{k+1|k})(x_{k+1} - x̂_{k+1|k})^T]
          = E[(A x_k + w_k - A x̂_{k|k})(A x_k + w_k - A x̂_{k|k})^T]
          = E[(A (x_k - x̂_{k|k}) + w_k)(A (x_k - x̂_{k|k}) + w_k)^T]

The predicted state estimate, estimate confidence and predicted measurement are:

x̂_{k+1|k} = A x̂_{k|k}
P_{k+1|k} = A P_{k|k} A^T + Q
ŷ_{k+1|k} = C x̂_{k+1|k} = C A x̂_{k|k}    (8)

CORRECTOR STAGE: The update equation derived above is:

x̂_{k|k} = x̂_{k|k-1} + K_k (y_k - ŷ_{k|k-1})

To derive K_k we solve the optimization problem seeking to minimize P_{k|k}. The estimation errors satisfy:

x_{k+1} - x̂_{k+1|k+1} = [I - K_{k+1} C](x_{k+1} - x̂_{k+1|k}) - K_{k+1} v_{k+1}
x_{k+1} - x̂_{k+1|k} = A (x_k - x̂_{k|k}) + w_k
x_{k+1} - x̂_{k+1|k+1} = [I - K_{k+1} C] A (x_k - x̂_{k|k}) + [I - K_{k+1} C] w_k - K_{k+1} v_{k+1}    (9)

P_{k+1|k+1} = [I - K_{k+1} C] P_{k+1|k} [I - K_{k+1} C]^T + K_{k+1} R K_{k+1}^T
P_{k+1|k} = A P_{k|k} A^T + Q    (10)

To minimize P_{k+1|k+1}, we seek to minimize the TRACE of the matrix. Then, since the trace is scalar valued, we can determine the minimum by equating its derivative with respect to K_{k+1} to zero. The optimization stage thus forms the heart of the filter derivations, and since it involves some involved matrix manipulations, we shall use some standard results to proceed. For finding the derivative of a scalar φ with respect to a matrix A = [a_1 a_2] (written column-wise), we have:

dφ/dA = [ dφ/da_1   dφ/da_2 ]    (11)

Further, from matrix theory, we also have the relationship (for symmetric B):

d trace[A B A^T] / dA = 2 A B    (12)
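Since eq. (12) carries much of the weight of the derivation, here is a quick, purely illustrative finite-difference check of the identity for a random matrix A and a symmetric B (not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 4))
B = 0.5 * (B + B.T)                      # make B symmetric, as in eq. (12)

def phi(A):
    return np.trace(A @ B @ A.T)

# Finite-difference gradient of phi with respect to each entry of A
eps = 1e-6
grad_fd = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        E = np.zeros_like(A)
        E[i, j] = eps
        grad_fd[i, j] = (phi(A + E) - phi(A - E)) / (2 * eps)

grad_formula = 2 * A @ B                 # the claimed closed form
print(np.max(np.abs(grad_fd - grad_formula)))   # close to zero: the two gradients agree
```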

Given the above relationships, and solving the optimization problem, we obtain:

P_{k+1|k+1} = [I - K_{k+1} C] P_{k+1|k} [I - K_{k+1} C]^T + K_{k+1} R K_{k+1}^T

d trace[P_{k+1|k+1}] / dK_{k+1} = 0,   for the minimum
-2 [I - K_{k+1} C] P_{k+1|k} C^T + 2 K_{k+1} R = 0
K_{k+1} (C P_{k+1|k} C^T + R) = P_{k+1|k} C^T
K_{k+1} = P_{k+1|k} C^T (C P_{k+1|k} C^T + R)^{-1}

Thus, the corrected expectations obtained are:

P_{k+1|k+1} = [I - K_{k+1} C] P_{k+1|k}
x̂_{k+1|k+1} = x̂_{k+1|k} + K_{k+1} (y_{k+1} - ŷ_{k+1|k})

where K_{k+1} = P_{k+1|k} C^T (C P_{k+1|k} C^T + R)^{-1}.
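The complete predictor-corrector cycle above fits in a few lines of code. Below is a minimal, hedged Python sketch of one Kalman filter step for the model x_{k+1} = A x_k + w_k, y_k = C x_k + v_k; the variable names follow the notes, but the function itself and the test values are illustrative, not from the original:

```python
import numpy as np

def kalman_step(x_hat, P, y_next, A, C, Q, R):
    """One predictor-corrector cycle: (x̂_{k|k}, P_{k|k}, y_{k+1}) -> (x̂_{k+1|k+1}, P_{k+1|k+1})."""
    # PREDICTOR STAGE (eq. 8)
    x_pred = A @ x_hat                       # x̂_{k+1|k} = A x̂_{k|k}
    P_pred = A @ P @ A.T + Q                 # P_{k+1|k} = A P_{k|k} A^T + Q
    y_pred = C @ x_pred                      # ŷ_{k+1|k} = C x̂_{k+1|k}

    # CORRECTOR STAGE
    S = C @ P_pred @ C.T + R                 # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)      # K_{k+1} = P_{k+1|k} C^T S^{-1}
    x_new = x_pred + K @ (y_next - y_pred)   # x̂_{k+1|k+1}
    P_new = (np.eye(P.shape[0]) - K @ C) @ P_pred   # P_{k+1|k+1} = [I - K C] P_{k+1|k}
    return x_new, P_new

# Illustrative 2-state example with made-up matrices.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
x_hat, P = np.zeros(2), 1e6 * np.eye(2)      # Case 2 initialization: almost no confidence in X_0
x_hat, P = kalman_step(x_hat, P, np.array([1.3]), A, C, Q, R)
print(x_hat, np.trace(P))                    # the trace of P is the scalar quality measure
```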

A Note on the Adaptation of the Kalman Filter for Nonlinear Systems: It is interesting to note that the above derivations have been further adapted to handle systems with non-linear dynamics. The derivative of the basic Kalman Filter modified for non-linear systems with Gaussian noise is called the Extended Kalman Filter (EKF). The system model equations are given by:

x_{k+1} = g(x_k) + w_k
y_k = h(x_k) + v_k    (13)

where g(·), h(·) are non-linear functions and w_k ~ N(0, Q), v_k ~ N(0, R). The prediction-stage equations for the state variable and the measurement remain the same; however, the equation for the predicted covariance matrix is modified slightly. At any given time k, the information available is x̂_{k|k} and P_{k|k}. The non-linear system model is linearised about this expected state, with A_k and C_k the Jacobians of g and h evaluated at the current estimate, to obtain the following linear model description:

x_{k+1} ≈ A_k x_k + w_k
y_k ≈ C_k x_k + v_k

The equations for the Extended Kalman Filter are:

Predictor Stage:
x̂_{k+1|k} = g(x̂_{k|k})
ŷ_{k+1|k} = h(x̂_{k+1|k})
P_{k+1|k} = A_k P_{k|k} A_k^T + Q

Corrector Stage:
P_{k+1|k+1} = [I - K_{k+1} C_k] P_{k+1|k}
x̂_{k+1|k+1} = x̂_{k+1|k} + K_{k+1} (y_{k+1} - ŷ_{k+1|k})
where K_{k+1} = P_{k+1|k} C_k^T (C_k P_{k+1|k} C_k^T + R)^{-1}    (14)
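For completeness, here is a compact, purely illustrative Python sketch of one EKF step; the functions g, h and their Jacobians below are a made-up mildly non-linear model, not taken from the notes:

```python
import numpy as np

# Hypothetical non-linear model (made-up for illustration):
#   x_{k+1} = g(x_k) + w_k,  y_k = h(x_k) + v_k
def g(x): return np.array([x[0] + 0.1 * np.sin(x[1]), 0.95 * x[1]])
def h(x): return np.array([x[0] ** 2])
def A_jac(x): return np.array([[1.0, 0.1 * np.cos(x[1])], [0.0, 0.95]])  # A_k = dg/dx
def C_jac(x): return np.array([[2.0 * x[0], 0.0]])                        # C_k = dh/dx

Q, R = 0.01 * np.eye(2), np.array([[0.1]])

def ekf_step(x_hat, P, y_next):
    A_k = A_jac(x_hat)                  # linearise the process model about x̂_{k|k}
    # Predictor stage (eq. 14)
    x_pred = g(x_hat)
    P_pred = A_k @ P @ A_k.T + Q
    C_k = C_jac(x_pred)                 # linearise the measurement model about the prediction
    y_pred = h(x_pred)
    # Corrector stage (eq. 14)
    K = P_pred @ C_k.T @ np.linalg.inv(C_k @ P_pred @ C_k.T + R)
    return x_pred + K @ (y_next - y_pred), (np.eye(2) - K @ C_k) @ P_pred

x_hat, P = np.array([1.0, 0.5]), np.eye(2)
x_hat, P = ekf_step(x_hat, P, np.array([1.2]))
print(x_hat, np.trace(P))
```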

However, the adoption of the above equations in the Predictor stage involves a certain amount of hand-waving rather than analytic rigour. We have willingly ignored the fact that, for a non-linear function g(·), in general

E[g(x)] ≠ g(E[x])    (15)

and have used g(E[x]) as the predicted mean anyway.

Nevertheless, the filter has found a good amount of practical use. For a more rigorous and versatile filter, the EKF has been succeeded by a novel Kalman Filter derivative called the Unscented Kalman Filter (UKF). Research on filter design thus proceeds in two basic directions:

1. Accounting for non-linearity in the system models.

2. Accounting for the inclusion of constraints on the state variables.
