Hirak Parikh
1. Introduction
While there are many application-specific approaches to computing (estimating) an
unknown state from a set of process measurements, many of these methods do not
inherently take into consideration the typically noisy nature of the measurements.
Consider, for example, tracking for interactive computer graphics. The noise in such
measurements is typically statistical in nature (or can be effectively modeled as such),
which leads us to stochastic methods for addressing the problem.
This is also known as the Observer Design Problem ([4] Welch & Bishop).
Observer Design Problem: There is a related general problem in the area of linear
systems theory, generally called the observer design problem. The basic problem is to
determine (estimate) the internal states of a linear system, given access only to the
system's outputs. Access to the system's control inputs may also be presumed. This is
akin to what people often think of as the "black box" problem, where you have access to
some signals coming from the box (the outputs) but you cannot directly observe what's
inside.
[Figure: a black box with input (I/P) and output (O/P); the output is the process
variable X plus the process noise Q]
The many approaches to this basic problem are typically based on a state-space model.
There is typically a process model that describes the transformation of the process state.
This can usually be represented as a linear stochastic difference equation:

x_{k+1} = A x_k + Q_k
In addition, there is some form of measurement model that describes the relationship
between the process state and the measurements. This can usually be represented with a
linear expression:

z_k = H_k x_k + W_k
The terms Q and W are random variables representing the process and measurement
noise, respectively.
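As a concrete illustration, the two model equations can be simulated directly. The following sketch (in Python/NumPy rather than the MATLAB of the original implementation; the dimensions and noise levels are arbitrary choices, not from the report) generates a noisy process and its measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant-velocity example of the state-space model above:
# x_{k+1} = A x_k + Q_k (process model), z_k = H x_k + W_k (measurement model)
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])    # state transition for [position, velocity]
H = np.array([[1.0, 0.0]])    # we measure position only
q_std, w_std = 0.05, 0.5      # process and measurement noise std devs

x = np.array([0.0, 1.0])      # initial state [position, velocity]
states, measurements = [], []
for k in range(50):
    x = A @ x + q_std * rng.standard_normal(2)     # process noise Q
    z = H @ x + w_std * rng.standard_normal(1)     # measurement noise W
    states.append(x.copy())
    measurements.append(z.copy())

states = np.array(states)
measurements = np.array(measurements)
```

The filter's job, in the sections that follow, is to recover `states` given only `measurements`.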
Why model measurement and process Noise
We consider here the common case of noisy sensor measurements. There are
many sources of noise in such measurements. For example, each type of sensor has
fundamental limitations related to the associated physical medium, and when pushing the
envelope of these limitations the signals are typically degraded. In addition, some amount
of random electrical noise is added to the signal via the sensor and the electrical circuits.
The time varying ratio of pure signal to the electrical noise continuously affects the
quantity and quality of the information. The result is that information obtained from any
one sensor must be qualified as it is interpreted as part of an overall sequence of
estimates, and analytical measurement models typically incorporate some notion of
random measurement noise or uncertainty as shown above.
There is the additional problem that the actual state transform model is completely
unknown. While we can make predictions over relatively short intervals using models
based on recent state transforms, such predictions assume that the transforms are
predictable, which is not always the case. The result is that like sensor information,
ongoing estimates of the state must be qualified as they are combined with measurements
in an overall sequence of estimates. In addition, process models typically incorporate
some notion of random motion or uncertainty as shown above.
Under these assumptions the Kalman filter has been shown to be the best (minimum
error variance) filter out of the class of linear unbiased filters.
Prediction Equations:
1) The first equation computes the a priori estimate of the state variable x, based on
the state transition matrix A.
2) The second equation computes the a priori estimate of the predictor Pk.

Measurement Update Equations:
1) Kalman Gain: This is the most crucial part of the Kalman filter equations. It has a
large value at the beginning, since the filter trusts the a priori estimate less than the
measurement. As the filter converges, the Kalman gain becomes small.
2) Then the a posteriori estimate of x is obtained from the a priori estimate and the
Kalman gain.
3) Finally, the value of Pk is corrected.
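One predict/update cycle described above can be sketched as follows (Python/NumPy; here Q and W denote the process and measurement noise covariances, matching the Q/W notation used earlier):

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, W):
    """One predict/update cycle of the Kalman filter."""
    # Prediction (time update): a priori state estimate and predictor P
    x_prior = A @ x
    P_prior = A @ P @ A.T + Q
    # Measurement update: Kalman gain, then the a posteriori estimates
    S = H @ P_prior @ H.T + W               # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)    # Kalman gain: large while P is large
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post
```

With a large initial P the gain starts near one (the filter follows the measurements), and as P shrinks the gain decays, as described above.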
Training of Parameters:
The A, H, W and Q parameters need to be retrained. This is done using a sliding
window that refines the parameter values of the model based on the last few values of
the measured variable and the estimated process variable.
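The report does not give the window update equations at this point; one common way to realize such a sliding-window refinement is to re-estimate the noise covariances from recent residuals. A hypothetical sketch (Python/NumPy; `retrain_noise` and its argument names are illustrative, not from the original code):

```python
import numpy as np

def retrain_noise(window_z, window_x_est, window_x_pred, H):
    """Re-estimate measurement noise W and process noise Q from the
    residuals accumulated over a sliding window of recent steps.
    window_z: (N, m) measurements; window_x_est: (N, n) filtered states;
    window_x_pred: (N, n) one-step-ahead predictions."""
    meas_resid = window_z - window_x_est @ H.T   # measurement residuals
    proc_resid = window_x_est - window_x_pred    # state prediction residuals
    W = np.cov(meas_resid.T)                     # refreshed measurement noise
    Q = np.cov(proc_resid.T)                     # refreshed process noise
    return Q, W
```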
If the switch (St) were observed, we would know when to apply each submodel, i.e.,
the segmentation would be known. But since St is hidden, we use a weighted
combination of the sub-models, where the weights are calculated from the error and the
measured variable. This is called soft switching. Hence the resulting system can be
thought of as a mixture of Kalman filters.
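A minimal sketch of soft switching (Python/NumPy; the Gaussian form of the likelihood and all names here are assumptions, since the report does not spell out the weight formula):

```python
import numpy as np

def soft_switch(xhats, errors, error_vars, priors):
    """Weight each submodel by the Gaussian likelihood of its prediction
    error, then blend the submodel estimates instead of hard-selecting one.
    xhats: per-submodel estimates; errors: per-submodel prediction errors;
    error_vars: innovation variances; priors: prior submodel probabilities."""
    lik = np.exp(-0.5 * errors**2 / error_vars) / np.sqrt(2 * np.pi * error_vars)
    w = priors * lik
    w = w / w.sum()      # normalized submodel weights
    x = w @ xhats        # weighted combination of the submodel estimates
    return x, w
```

A submodel whose prediction error is small relative to its innovation variance receives most of the weight, which is the "soft" analogue of observing the switch.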
Filtering:
We use the following equations, which are the same as in the simple Kalman case. The
only overhead is calculating the various probabilities of the different states. There are a
number of probabilities, depending on whether they refer to the current state, the
previous state, or a conditional probability.
We compute the error in the prediction (the innovation), the variance of the error, the
Kalman gain matrix, and the likelihood of detection.
We then update the estimates of the mean, variance, and cross variance.
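For one filter in the bank, these quantities can be sketched as follows (Python/NumPy; a scalar measurement is assumed and the names are illustrative):

```python
import numpy as np

def submodel_update(x_prior, P_prior, z, H, W):
    """Compute, for one filter in the bank: the innovation e, its
    variance S, the Kalman gain K, and the Gaussian likelihood L of the
    measurement, then the updated mean and variance."""
    e = (z - H @ x_prior).item()                   # innovation (prediction error)
    S = (H @ P_prior @ H.T + W).item()             # innovation variance
    K = (P_prior @ H.T / S).ravel()                # Kalman gain
    L = np.exp(-0.5 * e * e / S) / np.sqrt(2 * np.pi * S)  # likelihood of z
    x_post = x_prior + K * e                       # updated mean
    P_post = P_prior - np.outer(K, (H @ P_prior).ravel())  # updated variance
    return x_post, P_post, e, S, K, L
```

The likelihood L is what feeds the soft-switching weights: submodels that explain the measurement well are trusted more on the next step.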
Collapsing:
The technique is to approximate the mixture of M^t Gaussians with a mixture of r
Gaussians. This is called the Generalized Pseudo Bayesian algorithm of order r (GPB(r)).
When r = 1, we approximate a mixture of Gaussians with a single Gaussian using moment
matching. When r = 2, we collapse Gaussians which differ in their history two steps ago;
in general these will be more similar than Gaussians that differ in their more recent
history.
One worry is that errors introduced at each time step by approximating the posterior
might accumulate over time, leading to poor performance. However, the stochasticity of
the process ensures that the true distribution spreads out and, with high probability,
overlaps the approximate distribution; hence it can be shown that the error remains
bounded.
[x, new_predictor] = collapse(xhat, predictor, weight)
x = sum_i weight(i) * xhat(i)
new_predictor = sum_i weight(i) * (predictor(i) + (xhat(i) - x)(xhat(i) - x)^T)
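The collapse step amounts to moment matching: the new mean is the weighted sum of the component means, and the new predictor adds the spread of the means to the weighted sum of the component predictors. A vectorized sketch in Python/NumPy (the original implementation is in MATLAB):

```python
import numpy as np

def collapse(xhat, P, w):
    """Moment-matching collapse of a Gaussian mixture into one Gaussian.
    xhat: (M, n) component means; P: (M, n, n) component covariances;
    w: (M,) mixture weights summing to one. Vectorized with einsum
    instead of a per-component for loop."""
    x = np.einsum('i,ij->j', w, xhat)               # mixture mean
    d = xhat - x                                    # deviation of each mean
    spread = np.einsum('i,ij,ik->jk', w, d, d)      # spread-of-means term
    new_P = np.einsum('i,ijk->jk', w, P) + spread   # mixture covariance
    return x, new_P
```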
Training:
At the end of every iteration the filter parameters need to be retrained. The
parameters are trained over a window. Equations for A, C, W and Q follow.
(The details of the derivation can be found in [1], Kevin Murphy, Switching Kalman
Filter.)
3. X-Y Tracking:
This was a simple X-Y tracking task that was implemented using a single Kalman filter.
Sample Output:
Some results for a sample simulation for each of the above cases:

Case                  Mean Error     Mean Error          MSE            MSE
                      (given data)   (calculated data)   (given data)   (calculated data)
Two_State Plotting    -0.0614        0.0038              3.458          2.8394
Rate Tone Detection    0.3092        0.0956              1.4193         1.9134
Tracking              -2.882         7.186               106.18         105.18
5. Discussion:
1. The Switched Kalman filter performed quite well; it matched the data well within the
error bounds. One issue that cropped up during some of the simulations is that the
training data has to be reasonably good, or else some of the initial parameters are set
incorrectly and the filter might not converge quickly, though given enough time it does
converge, even if slowly.
2. The size of the window used for smoothing can play a role in quick or slow
convergence, and it has to be set depending on how much the signal is expected to
vary.
3. Since the number of states grows to M^2 during the filtering operation, increasing the
bank of filters can become computationally expensive. Since all the filters work
independently during filtering, they can be vectorized. The current code has a for loop in
the collapse routine, and this needs to be removed, as MATLAB's speed is compromised
by for loops.
4. This implementation needs to be checked for its operation in real-time situations.
6. References
[1] Kevin Murphy, Switching Kalman Filter (1998)
[2] Wei Wu, Michael Black, et al, A Switching Kalman Filter Model for Motor
Cortical Coding of Hand Motion (2003)
[3] Lawrence Rabiner, A Tutorial on Hidden Markov Models and Selected
Applications in Speech Recognition (1989)
[4] Welch & Bishop, An Introduction to the Kalman Filter (2001)
[5] Peter Maybeck, Stochastic Models, Estimation and Control (1979)
[6] T. Kailath, Lectures on Wiener and Kalman Filtering (1981, 2nd ed)