
A diagnostic tool analysing ablative heart surgery

Nikhil Chandaria
CID:00466532

Supervisor: Dr. Colin J. Cotter

1st of June, 2010

Abstract
This report describes the construction of a filter to classify an electrocardiogram. The Ornstein-Uhlenbeck process
is used to model the signal between heartbeats, and we investigate the use of the Ensemble Kalman
Filter to estimate the parameters of this stochastic process. We find that the filter is unable to estimate
the diffusion term in the stochastic differential equation. We then propose a modification involving the
Ensemble Square Root Filter and a Bayesian approach to estimating this parameter, which proves to be
more successful; however, we discover that the Ornstein-Uhlenbeck process is an insufficient
model for the signal between each heartbeat from a patient.
Acknowledgements
I would like to thank Dr. Colin Cotter for his encouragement, advice and insight throughout this
project. I would also like to thank Professor Nicholas Peters and Dr. Louisa Lawes for their explanation
of atrial fibrillation, the ablation procedure and for providing the electrocardiogram data.
Contents
1 Introduction 1
1.1 Atrial Fibrillation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Ablative Surgery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 Stochastic Processes 3
2.1 Euler-Maruyama Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Ornstein-Uhlenbeck Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

3 Kalman Filters 7
3.1 Linear Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

4 Ensemble Filters 12
4.1 Particle Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.2 Ensemble Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.2.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.3 Ensemble Square Root Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Estimating σ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.4.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.4.3 Changes in Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4.4 Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

5 Application to Heart Surgery Data 34

6 Final Remarks 37
6.1 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.2 Recommendations for Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Bibliography 39

A Solution to OU Process 41

B σ Estimation Issue 42

C Maximum Likelihood Estimator 44

D Ornstein-Uhlenbeck Process MATLAB Code 45

E Final Filter MATLAB code 46

F Removing Heartbeats from ECG data 49

1 Introduction
Atrial fibrillation (AF) is classified as a cardiac arrhythmia: an irregular heartbeat. It is associated with
problems within the electrical conduction system of the heart. Within the UK at least 46,000
people are diagnosed every year (Iqbal et al., 2005); as a consequence, £459 million is spent by the
National Health Service (Stewart et al., 2004), which is roughly 1% of the NHS budget.
There are a variety of treatments available for patients who suffer from AF, including medicinal treatment, electrical
and chemical cardioversion, and ablative surgery. Within this project we focus on catheter
ablation. We will consider the detection of AF and the ablative procedure. In section 2 we consider
stochastic processes as a means of modelling the heart in order to develop an analytical tool. In sections
3 and 4 we consider filtering techniques for estimation purposes. In section 5 we examine the
application of filtering techniques to data from the electrocardiogram (ECG).

1.1 Atrial Fibrillation


In this subsection we will examine atrial fibrillation and methods of detecting the signal.

We will first consider normal sinus rhythm (a regular heartbeat) and how the electrical signal is conducted
through the heart. It is possible to classify the heartbeat into different stages with the use of
an ECG. A typical heartbeat is shown in figure 1.1.2. We can identify the electrical impulse that
originates in the sinoatrial node (shown as the sinus node in figure 1.1.1) as the P wave; this is what causes
the contraction of the atria, which pushes the blood from the atria into the ventricles. The electrical
signal then travels to the atrioventricular (AV) node, upon which the signal causes the ventricles to
contract, forcing blood from the heart to the rest of the body. The delay between the contractions of
the atria and the ventricles is characterized by the PR segment; without this delay the entire heart would
beat at the same time. The QRS complex denotes the spread of the electrical activity from the atria to
the ventricles. Finally, the repolarization of the ventricles is shown by the presence of the T wave.

Fig. 1.1.1: Diagram illustrating the main areas of the heart

For a patient who suffers from AF there are two main methods for detection of the condition: examining the
regularity of the heartbeat or, the stronger indicator, the absence of P waves (American Heart
Association, 2008). The condition can cause palpitations, fainting and congestive heart failure.

Fig. 1.1.2: A typical heartbeat as shown on an ECG

1.2 Ablative Surgery


In this subsection we will discuss the ablative surgery method and the issues associated with it.

Ablative surgery is a procedure in which surgeons insert a number of electrodes into a patient’s heart
and measure the electrical activity (Sivakumaren, 2009). A catheter using a high-frequency alternating
current is then used to burn away any abnormal tissue that may exist. This is done by a surgeon searching for
any abnormal impulses using a roving electrode. Once the surgeon has located and ablated the site
they will continue to search for any other sources of impulses that may exist. The aim is that this
can aid in returning the heartbeat to a normal sinus rhythm.
This method, however, is subjective and can lead to varying success rates between surgeons and treatment
centers (Calkins et al., 2007). We believe that this is because there is no method for determining whether
a signal is displaying disorganized electrical activity, as is the case under AF. In this project we aim to
develop a method that will classify whether a signal is noise or whether it is indeed atrial fibrillation. We
hope that this research will help provide an objective decision-making process for surgeons in being able
to determine whether a signal is noise or an abnormal electrical activity causing AF. This may provide a
method for a surgeon to perform a post-operation analysis on the procedure, to understand the impact of
ablation on the patient and to be able to distinguish between noisy signals and anomalous electrical activity
in the heart.

2 Stochastic Processes
A stochastic process is a random process; the most common example is Brownian motion. A
stochastic process can only be described by a probability density function (pdf). In the case of an ordinary
differential equation (ODE) a given initial condition yields one realization evolving through time; however, for
a stochastic differential equation (SDE) with a given initial condition there can be any number of possible
paths (Risken, 1996). An example of a stochastic process is a Wiener process, which has the following
properties:

• W0 = 0

• Wt − Ws ∼ N (0, t − s) for 0 ≤ s < t, with increments over disjoint intervals independent

We can then generate a discrete Wiener process over a set time span using the following update:

xt = xt−1 + N (0, ∆t) (2.0.1)

with the initial condition to this equation being x0 = 0.
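Equation 2.0.1 is straightforward to implement. The report's own code is written in MATLAB (appendices D-F); the sketch below is an illustrative NumPy equivalent (the function name is ours, not the report's):

```python
import numpy as np

def wiener_paths(n_paths, n_steps, dt, seed=None):
    """Generate sample Wiener paths via equation 2.0.1:
    x_t = x_{t-1} + N(0, dt), with x_0 = 0."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    # prepend W_0 = 0, then accumulate the independent increments
    return np.hstack([np.zeros((n_paths, 1)), np.cumsum(steps, axis=1)])

# 10,000 paths with dt = 0.01 and 100 steps, matching the settings
# used for figures 2.0.1 and 2.0.2
paths = wiener_paths(10_000, 100, 0.01, seed=0)
```

The standard deviation of `paths[:, -1]` (time t = 1) comes out close to √t = 1, consistent with the spread seen in figure 2.0.2.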

Fig. 2.0.1: Demonstration of 4 sample Wiener process paths. This shows the randomness of generating a
Wiener process.

Figure 2.0.1 displays an example of 4 different Wiener processes generated using a time difference of 0.01
and 100 time steps.

Figure 2.0.2 contains the evolution of 10,000 Wiener processes using the same conditions used to generate
the paths shown in figure 2.0.1. Based on these plots and (Risken, 1996) we can infer that the evolution
of the standard deviation of the particles is given by:

σ(t) = √t (2.0.2)

Fig. 2.0.2: Demonstration of 10,000 Wiener processes. This demonstrates that the process displays a standard
deviation of √t.

2.1 Euler-Maruyama Method


In this subsection we will discuss the Euler-Maruyama method for solving SDEs numerically.

For many SDEs of the form


dXt = a(Xt )dt + b(Xt )dWt (2.1.1)
there is no explicit solution (Risken, 1996) that can be obtained, thus a numerical integrator is required
to obtain the approximate solution to an SDE. The simplest numerical method is the Euler-Maruyama
(EM) method which aims to find a solution to an SDE using the following equation:

Xk+1 = Xk + a(Xk )∆t + b(Xk )∆Wk (2.1.2)

where ∆Wk = W (tk ) − W (tk−1 ). This method has a strong order of convergence of n = 1/2 and a weak order of
convergence of n = 1 (Kloeden et al., 2003). This means that if we want to improve the precision of the
EM method by a factor of 10 we would need to reduce the time step by a factor of 100. We can see that the above
equation is the forward Euler method for ordinary differential equations with an added stochastic term.
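As a concrete sketch of equation 2.1.2 (an illustrative NumPy implementation, not the report's MATLAB code), driven here by the OU drift and diffusion used later in section 2.2:

```python
import numpy as np

def euler_maruyama(a, b, x0, dt, n_steps, seed=None):
    """Integrate dX = a(X) dt + b(X) dW with the Euler-Maruyama
    scheme of equation 2.1.2."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))   # Delta W_k ~ N(0, dt)
        x[k + 1] = x[k] + a(x[k]) * dt + b(x[k]) * dW
    return x

# Example: the OU process dX = theta*(mu - X) dt + sigma dW
theta, mu, sigma = 100.0, 0.0, 1.0
path = euler_maruyama(lambda x: theta * (mu - x), lambda x: sigma,
                      x0=2.0, dt=0.001, n_steps=1000, seed=1)
```

A small time step is used in the example because the explicit scheme only behaves well when θ∆t is well below 1.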

2.2 Ornstein-Uhlenbeck Process


This subsection deals with the mean-reverting process called the Ornstein-Uhlenbeck process, which will be
used as a model for the signal between each heartbeat. The application and validity of the method
will also be discussed.

The Ornstein-Uhlenbeck (OU) Process is a mean reverting stochastic process (Uhlenbeck and Ornstein,
1940) of the form:
dXt = θ(µ − Xt )dt + σdWt (2.2.1)

where θ, µ, σ > 0 and Wt is a Wiener process. The exact solution of the OU process has the following
properties:

E[Xt ] = X0 e−θt + µ(1 − e−θt ) (2.2.2)

and

var(Xt ) = (σ2 /2θ)(1 − e−2θt ) (2.2.3)

For proofs of these two statements please refer to appendix A. In order to generate a sample path simulation
of equation 2.2.1 we need to apply an AR(1) method:

xt = xt−1 e−θ∆t + µ(1 − e−θ∆t ) + σ √((1 − e−2θ∆t )/(2θ)) N (0, 1) (2.2.4)

An interesting note is that the OU process has the Markov property:

P (Xn+1 |X1 , X2 , ..., Xn ) = P (Xn+1 |Xn ) (2.2.5)

Applying the parameters laid out in table 2.2.1 results in figure 2.2.1.

θ 100
µ 0
σ 1
∆t 0.01
Number of points 100
Total time 1s

Table 2.2.1: Parameters for Sample OU Process

Fig. 2.2.1: A sample Ornstein-Uhlenbeck process generated using µ = 0, θ = 100 and σ = 1 with 100 points
and a total time for the process of 1 second. This demonstrates the mean reverting nature of the
process.
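The AR(1) update of equation 2.2.4 can be sketched as follows (an illustrative NumPy version; the report's MATLAB implementation is in appendix D):

```python
import numpy as np

def ou_exact(theta, mu, sigma, x0, dt, n_steps, seed=None):
    """Sample an OU path using the exact AR(1) update of equation 2.2.4."""
    rng = np.random.default_rng(seed)
    a = np.exp(-theta * dt)
    sd = sigma * np.sqrt((1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta))
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = a * x[k] + mu * (1.0 - a) + sd * rng.normal()
    return x

# Table 2.2.1 parameters: theta = 100, mu = 0, sigma = 1, dt = 0.01
x = ou_exact(100.0, 0.0, 1.0, x0=0.0, dt=0.01, n_steps=100, seed=0)
# A longer run: its sample variance approaches sigma^2/(2*theta) = 0.005,
# the t -> infinity limit of equation 2.2.3
long_run = ou_exact(100.0, 0.0, 1.0, 0.0, 0.01, 100_000, seed=0)
```

Because the update uses the exact transition density, it remains valid for any step size, unlike the Euler-Maruyama scheme of section 2.1.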

2.2.1 Applications
There are various applications of the Ornstein-Uhlenbeck process, such as financial modelling. A popular
use of the process is for commodity pricing, with the parameters µ as an equilibrium level, σ as the risk of the
product and θ the sensitivity to external factors. It has also been used in modelling the cardiovascular
system in animals (Gartner et al., 2010).

2.2.2 Validity
The importance of examining the Ornstein-Uhlenbeck process is to assess whether it will be a useful
model for devising a tool that will be able to process data from an electrocardiogram (ECG).
The main requirement is that the process can match the data taken from an ECG. The idea is that the
signal between heartbeats is mean reverting and thus we can use the OU process to generate a signal
that will be analogous to the signal between heartbeats. In order to do this we need to consider the
data from a surgery; as we discussed in section 1.1, the absence of P waves on an ECG is a
strong indicator of the presence of atrial fibrillation, thus it is important to examine the signal between
heartbeats to detect the presence of the P wave. To do this we must first filter out
the heartbeat itself to understand whether the stochastic process chosen is suitable.

Fig. 2.2.2: The bottom graph shows an unmodified ECG. The top graph is the same ECG with the heartbeat
removed. This shows that the signal between heartbeats displays a mean-reverting nature.

Figure 2.2.2¹ gives a comparison of the data with and without heartbeats. We can see that the process
is indeed mean reverting, with a mean of roughly −400. Another important check of whether the process is
suitable for the task is whether the data itself is Gaussian, assessed by taking a histogram of the data. As we can
see in figure 2.2.3, the filtered data does display a Gaussian distribution; therefore, for modelling purposes,
the Ornstein-Uhlenbeck process should be sufficient.

Fig. 2.2.3: A histogram of the ECG with the heartbeat removed. This shows that the information is distributed
normally.

¹ This was created by using a tool to check for monotonicity in a set number of points in an ECG and thus remove a
heartbeat. The method is displayed in appendix F.

3 Kalman Filters
A Kalman filter (Kalman, 1960) is a discrete recursive numerical method used to filter noisy signals,
estimate the true signal and infer other parameters associated with the system. It was originally
developed for use in trajectory estimation for the Apollo space program (McGee and Schmidt, 1985).
Since its inception it has come to be used in many everyday applications such as GPS and RADAR
tracking, numerical weather prediction (NWP) (Evensen, 1992), turbulence modelling (Majda et al.,
2010) and financial modelling (Krul, 2008).
The filter relies on the fact that the true state can be inferred from the state at the previous time
step; it is a Markov chain. This allows the filter to be used in modelling stochastic processes.
This section discusses the development of Kalman filters and their suitability for the
problem of estimating the Ornstein-Uhlenbeck process parameters.

3.1 Linear Kalman Filter


In this subsection we will deal with the linear Kalman filter.

The linear Kalman filter tries to estimate the state x ∈ Rn using the following stochastic difference
equation (Welch and Bishop, 2006):

xk = Axk−1 + Buk−1 + wk−1 (3.1.1)

where A is the state transition matrix of size n × n, B (size n × l) relates an optional control input (u ∈ Rl )
to the state x, and wk is white process noise of the form P (w) ∼ N (0, Q) where Q is the process noise
covariance. The filter has an observation equation of the form:

zk = Hxk + vk (3.1.2)

where zk ∈ Rm is the measurement of a signal, H is a measurement operator of size m × n and vk is
white measurement noise of the form P (v) ∼ N (0, R) where R is the measurement noise covariance. It
is important to note that the symbols used in the above definitions vary depending on the book, paper
or notes used to reference the Kalman filter. In order to understand how the Kalman filter works the
following terms also need to be defined:

x̂fk = prior estimate (3.1.3)
x̂ak = posterior estimate (3.1.4)

Along with these two terms, we can define the following covariance matrices:

Pkf = E[(xk − x̂fk )(xk − x̂fk )T ] = prior covariance (3.1.5)
Pka = E[(xk − x̂ak )(xk − x̂ak )T ] = posterior covariance (3.1.6)

With all the required terms now defined, it is possible to display the Kalman filter equations. The
equations can be broken down into two categories: prediction and correction steps.
The equations associated with the prediction step are as follows:

x̂fk = Ax̂ak−1 + Buk−1 (3.1.7)
Pkf = APk−1a AT + Q (3.1.8)

The equations associated with the correction step are as follows:

Kk = Pkf H T (HPkf H T + R)−1 (3.1.9)
x̂ak = x̂fk + Kk (zk − Hx̂fk ) (3.1.10)
Pka = (I − Kk H)Pkf (3.1.11)

The yet-to-be-defined term, Kk , is the Kalman gain matrix, which relates the prior and posterior estimates².

3.1.1 Algorithm
This section presents the pseudocode for the linear Kalman filter.

Algorithm 3.1.1 Linear Kalman Filter Algorithm

define A, B, u, Q and R
for k = 1 to N do
  xfk = A xak−1 + B uk−1
  Pkf = A Pk−1a AT + Q
  Kk = Pkf H T (H Pkf H T + R)−1
  xak = xfk + Kk (zk − H xfk )
  Pka = (I − Kk H) Pkf
end for
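The loop above reduces to scalar arithmetic for our one-dimensional problem. Below is an illustrative Python sketch (not the report's MATLAB code) using the parameters of table 3.1.1, with a synthetic OU signal standing in for the measured data:

```python
import numpy as np

theta, mu, sigma, dt = 100.0, 0.0, 1.0, 0.01
A = np.exp(-theta * dt)                     # state transition (table 3.1.1)
B = 1.0 - np.exp(-theta * dt)
u = mu
Q = sigma**2 / (2.0 * theta) * (1.0 - np.exp(-2.0 * theta * dt))  # process noise var
R = 0.01                                    # measurement noise variance

rng = np.random.default_rng(0)
n = 500
truth = np.empty(n)
truth[0] = 2.0
for k in range(1, n):                       # exact OU simulation
    truth[k] = A * truth[k - 1] + B * u + np.sqrt(Q) * rng.normal()
z = truth + np.sqrt(R) * rng.normal(size=n)   # noisy observations

xa, Pa = 2.0, 1.0                           # x0 and P0 from table 3.1.1
est = np.empty(n)
for k in range(n):
    xf = A * xa + B * u                     # predict (3.1.7)
    Pf = A * Pa * A + Q                     # predict covariance (3.1.8)
    K = Pf / (Pf + R)                       # Kalman gain with H = 1 (3.1.9)
    xa = xf + K * (z[k] - xf)               # correct (3.1.10)
    Pa = (1.0 - K) * Pf                     # posterior covariance (3.1.11)
    est[k] = xa
```

The mean squared error of the filtered estimate comes out below that of the raw measurements, which is the behaviour discussed in section 3.1.2.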

3.1.2 Application
In order to understand how the Kalman filter operates it is important to apply it to the underlying problem
at hand: estimating the Ornstein-Uhlenbeck process. However, we have to deal with the issue that the
process is nonlinear when we try to estimate all parameters (x, µ, θ and σ); thus, for the purpose of understanding the
filter, it is assumed that the only unknown parameter is x and all others are known, simplifying
the problem into a linear one.

We generate an Ornstein-Uhlenbeck process using the parameters defined in table 2.2.1 and,
using the Kalman filter parameters defined in table 3.1.1, we call the filter to estimate the
x state of the process.

x0   2
P0   1
A    e−θdt
B    1 − e−θdt
u    µ
Q2   (σ2 /2θ)(1 − e−2θdt )
R2   0.01

Table 3.1.1: Parameters used to estimate the position of the OU process when passed through a linear Kalman
Filter. It is assumed that µ, θ and σ are known.

Figure 3.1.1 shows that the Kalman filter works well in estimating the true state given the noise
added to the signal generated for this application. While it is not perfect, it gives a good
idea of how the filter operates by forecasting and correcting the signal: it does not completely believe
the incoming state and instead modifies it towards the true state. An important note is that the filter can
be further fine-tuned by reducing the value of R2 (the model measurement noise covariance).
This section has provided a grounding in the Kalman filter; however, as this filter is linear it is of little
use to us, since we are looking to estimate all parameters in the Ornstein-Uhlenbeck process, and we therefore need to
consider non-linear options.

² For a more comprehensive description of the Kalman filter and a derivation of its equations please refer to (Simon, 2006)

Fig. 3.1.1: Example of the Kalman Filter estimating the Ornstein-Uhlenbeck process. This demonstrates the
ability for the filter to estimate the true state from the noisy observations.

3.2 Extended Kalman Filter


In this subsection we will go on to discuss the extended Kalman filter used for non-linear problems.

The linear Kalman filter has an inherent problem in that it can only be applied to a linear system;
in the example of the Ornstein-Uhlenbeck process this means that the only element that can be estimated
is the position, x. The aim is to estimate all parameters associated with the OU process (x, µ, θ and σ);
therefore we need to consider the case where the filter takes non-linearity into account. The first step in
doing so is the extended Kalman filter (EKF).
The EKF works by linearizing the set of equations that we will be using to estimate the process. If
we use the generalised equations (Welch and Bishop, 2006):

xk = f (xk−1 , uk−1 , wk−1 ) (3.2.1)

zk = h(xk , vk ) (3.2.2)

we can then linearize the equations using a Taylor expansion and apply them to the filter equations,
provided we have the following Jacobian matrices:
Aij = ∂fi /∂xj (x̂k−1 , uk−1 , 0) (3.2.3)

Wij = ∂fi /∂wj (x̂k−1 , uk−1 , 0) (3.2.4)

Hij = ∂hi /∂xj (x̂k , 0) (3.2.5)

Vij = ∂hi /∂vj (x̂k , 0) (3.2.6)

where x̂k is a posterior estimate as defined in section 3.1. With the above definitions we can define the
extended Kalman filter equations. Again they can be split into prediction and correction steps.
The prediction equations are:

x̂−k = f (x̂k−1 , uk−1 , 0) (3.2.7)
Pkf = Ak Pk−1a AkT + Wk Qk−1 WkT (3.2.8)

The correction equations are:

Kk = Pkf HkT (Hk Pkf HkT + Vk Rk VkT )−1 (3.2.9)
x̂k = x̂−k + Kk (zk − h(x̂−k , 0)) (3.2.10)
Pka = (I − Kk Hk ) Pkf (3.2.11)

3.2.1 Algorithm
The algorithm for the extended Kalman filter is very similar to the linear variant as seen in algorithm
3.1.1 in section 3.1.1.

Algorithm 3.2.1 Extended Kalman Filter Algorithm

define f (x, u), A0 , W0 , H0 , V0 , Q and R
for k = 1 to N do
  xfk = f (xak−1 , uk−1 , 0)
  Pkf = Ak Pk−1a AkT + Wk Q WkT
  Kk = Pkf HkT (Hk Pkf HkT + Vk R VkT )−1
  xak = xfk + Kk (zk − h(xfk , 0))
  Pka = (I − Kk Hk ) Pkf
  redefine Ak , Wk , Vk and Hk {H and V will in most cases remain constant}
end for

3.2.2 Application
In order to apply the EKF to the OU process we have to derive the Jacobian matrices for the problem
at hand. We go back to equation 2.2.1 and apply persistence equations, with added artificial noise, for the
non-observed states. The reason for this noise is to prevent these parameters from completely settling to one
value: the patient undergoing surgery may be ablated, thus changing the shape of their ECG, so we need
to ensure that the model covariance for the parameters does not settle to 0 and thus stop believing the
data being taken in from electrodes attached to the patient:

µ̂fk = µ̂ak−1 + Cµ dWµ (3.2.12)
θ̂fk = θ̂ak−1 + Cθ dWθ (3.2.13)
σ̂fk = σ̂ak−1 + Cσ dWσ (3.2.14)
We obtain the following Jacobian matrices:

    ⎡ −θ   θ   µ − Xt   0 ⎤
A = ⎢  0   1     0      0 ⎥ (3.2.15)
    ⎢  0   0     1      0 ⎥
    ⎣  0   0     0      1 ⎦

    ⎡ 0   0    0    σ  ⎤
W = ⎢ 0   Cµ   0    0  ⎥ (3.2.16)
    ⎢ 0   0    Cθ   0  ⎥
    ⎣ 0   0    0    Cσ ⎦

H = [ 1   0   0   0 ] (3.2.17)

V = [ 0 ] (3.2.18)
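Note that A depends on the current estimates of x, µ and θ, so it must be re-evaluated at every step of algorithm 3.2.1. A small sketch of its construction for the state vector (x, µ, θ, σ) (the function name is illustrative, not from the report):

```python
import numpy as np

def drift_jacobian(x, mu, theta):
    """Jacobian A of equation 3.2.15: the first row differentiates the
    OU drift theta*(mu - x) with respect to (x, mu, theta, sigma); the
    remaining rows reflect the persistence equations 3.2.12-3.2.14."""
    return np.array([
        [-theta, theta, mu - x, 0.0],
        [0.0,    1.0,   0.0,    0.0],
        [0.0,    0.0,   1.0,    0.0],
        [0.0,    0.0,   0.0,    1.0],
    ])

# Evaluate at an example state estimate
A = drift_jacobian(x=0.5, mu=10.0, theta=200.0)
```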
Applying these matrices to the filter results in very poor performance, and as such the figure has not
been displayed. The filter diverges, with the xt estimate dropping to values of O(10^172) after 17 time steps.
The extended Kalman filter suffers from linearization error, which could explain the divergence of the
parameters. Other approaches to the extended Kalman filter could be pursued, such as hybrid filters, which
consider a continuous-time system with discrete-time measurements, or higher order approaches to the
linearization; however, despite these options, the EKF can be very difficult to tune and can give unreliable
estimates depending on the severity of the nonlinearity of the system (Simon, 2006). The linearization of the
covariance error associated with the EKF can result in unbounded linear instabilities for the error evolution
(Evensen, 1992). Therefore other filter types need to be examined as an alternative to higher order
linearizations, which brings us to the ensemble family of filters discussed in section 4.

4 Ensemble Filters
Ensemble filters are alternatives to the traditional filters discussed in section 3; the two most
well known are the ensemble Kalman filter and the particle filter. In ensemble filters the error
covariance matrix is represented by a large ensemble of model realizations: the uncertainty in the system
is represented by a set of model realizations rather than an explicit expression for the error covariance
(Evensen, 2009). The model states are then integrated forward in time to predict error statistics. Research
has also shown that the use of ensemble filters for non-linear models costs less computationally than an
extended Kalman filter (Evensen, 2006). Subsequently, ensemble filters have found widespread use
when handling a large state space, such as in NWP.

4.1 Particle Filter


In this subsection we will discuss particle filtering techniques, used for estimating non-linear systems where
the probability density function need not be Gaussian or unimodal.

The particle filter is a sequential Monte Carlo algorithm that, as mentioned in section 4, uses an ensemble
of N members, or particles, to estimate the characteristics of a system. It is a computational
method of implementing a Bayesian estimator. In order to understand how it works we must first look
at Bayes’ theorem, from which the particle filter computes the statistics of the system. We begin with our
system and measurement equations (Simon, 2006):

xk+1 = f (xk , wk ) (4.1.1)

zk = h(xk , vk ) (4.1.2)

p(xk |Zk ) = p(zk |xk ) p(xk |Zk−1 ) / ∫ p(zk |xk ) p(xk |Zk−1 ) dxk (4.1.3)

where Zk denotes the measurements z1 , z2 , ..., zk . Equation 4.1.3 does pose some problems because the
denominator can prove to be intractable; hence in many cases it is necessary to approximate the
probability density function of the system by a sum of delta functions, which makes the integral tractable.
By being able to evaluate equation 4.1.3 we will be able to integrate our model in time using the
Euler-Maruyama method or the exact solution to the OU process.
Now that we understand Bayes’ theorem we can begin to look at the particle filter and how to
apply it. Unlike the family of Kalman filters, the particle filter does not assume that the distribution
is Gaussian, which means evaluating the pdf is much more difficult; we therefore need to represent it
using a series of weighted particles, where πt(i) represents a normalized weight for the i’th particle at
time t:

P (xt |Zt ) = Σi πt(i) δ(xt − xt(i)) (4.1.4)

where Zt = (zt , zt−1 , ..., z0 ). To initialize the particle filter we distribute a set of N particles based on a
known pdf. We shall adopt the notation xak,i and xfk,i , where k is the time step, i is the particle number,
a denotes the analysis state and f denotes the forecast state. We begin by evaluating each particle
and generating a prior state:

xfk,i = f (xak−1,i , wk−1 ) (4.1.5)

We then compute the relative likelihood of each particle by evaluating the pdf p(zk |xfk,i ), which we will
denote as qi . We then normalize each likelihood to obtain the weight of each particle:

πki = qi / ΣNj=1 qj (4.1.6)

Once we have these normalized weights we are able to resample the particles according to their relative
likelihoods, generating the posterior states xak,i , and thus we have our pdf p(xk |zk ). The particle filter does
suffer from some problems, namely sample impoverishment, in which case all the particles collapse to the
same value (Simon, 2006). There are methods of reducing this impoverishment, such as adding random
noise or modifying the resampling step using a Metropolis-Hastings algorithm. While
this illustrates the use of particle filters, we can simplify the problem because we have shown that we are
working with a Gaussian distribution, allowing us to move to less computationally expensive methods.
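The bootstrap form of the filter described above can be sketched for a scalar OU position (an illustrative Python example with assumed parameters; a Gaussian likelihood supplies the weights, and plain multinomial resampling stands in for the more elaborate schemes just mentioned):

```python
import numpy as np

theta, mu, sigma, dt, R = 100.0, 0.0, 1.0, 0.01, 0.01
a = np.exp(-theta * dt)
q = sigma * np.sqrt((1 - np.exp(-2 * theta * dt)) / (2 * theta))

rng = np.random.default_rng(0)
n_steps, N = 200, 500
truth = np.zeros(n_steps)
for k in range(1, n_steps):                 # synthetic OU signal
    truth[k] = a * truth[k - 1] + mu * (1 - a) + q * rng.normal()
z = truth + np.sqrt(R) * rng.normal(size=n_steps)

particles = rng.normal(0.0, 1.0, N)         # initial particle cloud
est = np.empty(n_steps)
for k in range(n_steps):
    # forecast each particle (equation 4.1.5)
    particles = a * particles + mu * (1 - a) + q * rng.normal(size=N)
    # relative likelihood q_i under Gaussian measurement noise
    w = np.exp(-0.5 * (z[k] - particles) ** 2 / R)
    w /= w.sum()                            # normalized weights (equation 4.1.6)
    # resample the posterior ensemble according to the weights
    particles = rng.choice(particles, size=N, p=w)
    est[k] = particles.mean()
```

The posterior mean `est` tracks the truth more closely than the raw observations do, at the cost of propagating and resampling all N particles each step.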

4.2 Ensemble Kalman Filter


In this subsection we will examine the ensemble Kalman filter, used for estimating a non-linear problem
under the assumption that the model displays a normal distribution. This subsection will also discuss
modifications to the filter required to prevent members collapsing to a single value.

The ensemble Kalman filter (EnKF) works in a similar manner to the particle filter; however, it is
computationally less expensive because of the assumption that the distribution of the system is Gaussian
and that every member has an equal weighting. Evensen (2006) shows that the EnKF is a special version
of the particle filter where the update step is approximated by a linear update using just the mean
and covariance of the pdf. In order to avoid confusion the notation for the EnKF will be slightly different
from that proposed in section 3. From now on xfk,i is the i’th forecast ensemble member at time k, xak,i
is the corrected i’th member at time k and, in our particular application, x = (x, µ, θ, σ). We can then
revert to our system equations 3.2.1 and 3.2.2 and use them for our analysis step.
xfk,i = f (xak−1,i , uk−1 , 0) (4.2.1)
zk,i = h(xfk,i , 0) (4.2.2)
In section 3.1 we defined the covariance matrices for the prior and posterior distributions in equations
3.1.5 and 3.1.6; however, here we pursue a slightly different method to establish the covariance matrices.
We begin by defining the matrix Xfk ∈ Rn×N as the matrix of ensemble members:

Xfk = (xfk,1 , xfk,2 , ..., xfk,N ) (4.2.3)

We can then define a matrix X̄k ∈ Rn×N as the matrix of the ensemble mean:

X̄k = Xfk 1N (4.2.4)

where 1N is a matrix where all entries are equal to 1/N . Once we have these definitions we can assemble
a matrix of fluctuations:

X′fk = Xfk − X̄k (4.2.5)

And finally we can assemble our error covariance matrix:

Pkf = 1/(N − 1) X′fk (X′fk )T (4.2.6)

Now that we have our error covariance matrix we can proceed to perform our Kalman filter update
equations:

Kk = Pkf H T (HPkf H T + R)−1 (4.2.7)

xak,i = xfk,i + Kk (zk − Hxfk,i ) (4.2.8)

Pka = (I − Kk H) Pkf (4.2.9)


The most important changes are the way in which the prior error covariance matrix is assembled and the
way in which the Kalman update equation is applied: the update equation is applied to every ensemble
member.

4.2.1 Algorithm
Algorithm 4.2.1 demonstrates how to implement the EnKF.

Algorithm 4.2.1 Ensemble Kalman Filter Algorithm

define N {number of members}
distribute N members normally with a chosen variance
for k = 1 to T do
  for i = 1 to N do
    xfk,i = f (xak−1,i , uk−1 )
  end for
  X′fk = Xfk (I − 1N )
  Pkf = 1/(N − 1) X′fk (X′fk )T
  Kk = Pkf H T (HPkf H T + R)−1
  Pka = (I − Kk H) Pkf
  for i = 1 to N do
    xak,i = xfk,i + Kk (zk − Hxfk,i )
  end for
end for
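Algorithm 4.2.1 reduces to a few lines in the scalar case. The sketch below (an illustrative Python example, not the report's MATLAB code) estimates only the position x with µ, θ and σ known, using the common perturbed-observation variant of the update so the analysis ensemble keeps an appropriate spread:

```python
import numpy as np

theta, mu, sigma, dt, R = 100.0, 0.0, 1.0, 0.01, 0.01
a = np.exp(-theta * dt)
q = sigma * np.sqrt((1 - np.exp(-2 * theta * dt)) / (2 * theta))

rng = np.random.default_rng(0)
n_steps, N = 200, 100
truth = np.zeros(n_steps)
for k in range(1, n_steps):                 # synthetic OU signal
    truth[k] = a * truth[k - 1] + mu * (1 - a) + q * rng.normal()
z = truth + np.sqrt(R) * rng.normal(size=n_steps)

ens = rng.normal(0.0, 1.0, N)               # initial ensemble
est = np.empty(n_steps)
for k in range(n_steps):
    # forecast every member (equation 4.2.1)
    ens = a * ens + mu * (1 - a) + q * rng.normal(size=N)
    Pf = np.var(ens, ddof=1)                # sample covariance (equation 4.2.6)
    K = Pf / (Pf + R)                       # Kalman gain with H = 1 (equation 4.2.7)
    zpert = z[k] + np.sqrt(R) * rng.normal(size=N)   # perturbed observations
    ens = ens + K * (zpert - ens)           # update every member (equation 4.2.8)
    est[k] = ens.mean()
```

Compared with the particle filter sketch of section 4.1, there is no weighting or resampling: the Gaussian assumption lets a single gain update the whole ensemble.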

4.2.2 Application
We can now examine how well the EnKF performs and decide how much further the filter
can be taken towards our goal of estimating the parameters of an OU process.

xt, µ and θ Estimation  To start we will first look at estimating the parameters µ and θ along with xt in the OU process, while assuming that we know σ. The equations used in our analysis are therefore similar to those seen in section 3.2.2, including the persistence equations for µ and θ and the exact solution to the OU process for xt. In this section we present the results of running the EnKF. We create a single OU process as a benchmark, with the parameters in table 4.2.1.

µ 10
θ 200
σ 500
Number of points 1,000,000
Total time 100s

Table 4.2.1: Parameters for the benchmark OU process, which will be used as a control for the development of the filter.

With these parameters set and the OU process saved, we examine how well the filter performs with 100 ensemble members and a measurement noise variance of 0.01. The initial variance of the ensemble is 1 × 10^9, which has been chosen through empirical testing. In this test we have removed the Wiener processes from the µ and θ persistence equations. We can see in figure 4.2.1 that the xt estimator works very well. Plots of the first and last 100 points are included to demonstrate the ability of the filter to estimate the measured state. The numerical results of the simulation can be seen in table 4.2.2.
The total time taken is displayed to illustrate the trade-off between the number of ensemble members and the computational cost of running the simulation.

µ 27.513
θ 197.5599
Time taken 256.512845s

Table 4.2.2: The result of the EnKF estimating the benchmark OU process with 100 ensemble members.
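A benchmark process of this kind can be generated from the exact one-step transition of the OU process dx = θ(µ − x)dt + σdW. The sketch below is illustrative rather than the project code, and uses fewer points than the 1,000,000-point benchmark purely to keep it fast.

```python
import numpy as np

def simulate_ou(mu, theta, sigma, n_points, total_time, seed=0):
    """Simulate dx = theta*(mu - x) dt + sigma dW using the exact
    one-step transition density of the OU process."""
    rng = np.random.default_rng(seed)
    dt = total_time / n_points
    a = np.exp(-theta * dt)                            # one-step decay factor
    s = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))  # exact one-step noise std
    x = np.empty(n_points)
    x[0] = mu
    for k in range(1, n_points):
        x[k] = mu + (x[k-1] - mu) * a + s * rng.standard_normal()
    return x

# Parameters from table 4.2.1 (with a shorter run for illustration).
x = simulate_ou(mu=10.0, theta=200.0, sigma=500.0,
                n_points=100_000, total_time=10.0)
```

The stationary standard deviation of this process is σ/√(2θ) = 25, which is a useful sanity check on the generated path.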

Fig. 4.2.1: Graphical result of the EnKF with 100 members estimating the benchmark process. The top two
graphs show the first and last 100 points of the estimation process. The third graph shows the
estimation for µ and the final graph shows the estimation of θ. The filter demonstrates that the
method works well for estimating the position, xt however the method fails to correctly estimate
the non-observed parameters correctly using the persistence model.

While the θ and µ plots are not clear in figure 4.2.1, if we zoom in on the first 1000 points we can see the change in both parameters before they settle to their final values, as shown in figure 4.2.2.

From this single run it appears that the filter can estimate the unobserved parameters reasonably well. Further tests are required to see whether this performance is repeatable; the filter is therefore run twice more with the same parameters, and the results are presented in table 4.2.3. It is important to note that between each run of the filter the memory of the computer is cleared, and the only information that remains constant is the OU process itself. The results show that the second run did not prove promising, and the third run gave a better value of µ but a worse value of θ compared to the first simulation. Therefore further testing is required to understand whether the number of ensemble members is sufficient or whether further research into ensemble filters needs to be pursued.
We can see from figure 4.2.1 that the parameters settle down very quickly.

Fig. 4.2.2: Examining the first 1000 points of the unobserved parameter estimates from the EnKF with 100 ensemble members. The persistence model has an effect within this region; however, the members collapse to one value, which leads to a constant estimate.

µ2 -20.4883
θ2 20.9569
Time taken2 315.764881s
µ3 16.5418
θ3 158.1899
Time taken3 249.621532s

Table 4.2.3: Further results for the EnKF with 100 ensemble members. This shows that there is no consistency in estimating µ and θ.

For further testing we will subsample the OU process, adding an extra element of uncertainty in order to understand the robustness of the filter. Henceforth we shall use every 10th sample from the stochastic process. In doing so we need to modify our value of ∆t, which has been taken into account in the subsequent tests. It is also important to understand how the filter operates with 100 ensemble members under subsampling. Having run the test three times we obtain the averages seen in table 4.2.4.

µaverage 30.4354
θaverage 75.2114
Total timeaverage 29.30938s

Table 4.2.4: An example of the filter operating when the benchmark process is subsampled.

250 ensemble members  In this section we will look at how the filter works with subsampling and the use of 250 ensemble members.

µ1 5.4461
θ1 108.6679
Time taken1 108.6679s
µ2 11.2516
θ2 147.8386
Time taken2 107.158336s
µ3 11.1496
θ3 162.4759
Time taken3 105.324963s

Table 4.2.5: Results of the EnKF with 250 ensemble members. This shows that there is an increased level of
accuracy by using more members however this comes at an added computational cost.

We can see in table 4.2.5 that increasing the number of ensemble members improves the ability of the filter to estimate the parameters, although at the cost of increased computational time.

500 ensemble members  We will now consider the case where there are 500 ensemble members. Table 4.2.6 shows that the filter works better when the number of members is increased; however, the cost of doubling the members from the 250-member case is that the time taken is more than quadrupled. This demonstrates the trade-off between time taken and the accuracy required.

µ1 7.4670
θ1 166.6105
Time taken1 465.012905s
µ2 11.4385
θ2 161.7368
Time taken2 476.574365s
µ3 14.0925
θ3 171.1054
Time taken3 614.160173s

Table 4.2.6: Results from the EnKF with 500 ensemble members. This shows a greater level of accuracy over
the cases with 250 and 100 ensemble members however the computational cost is almost 4 times
that of the case with 250 members.

Added noise to µ and θ  The problem with pursuing further accuracy is that our persistence model causes the parameters to settle to a value, and the filter stops believing the data it is receiving; the covariance for both µ and θ becomes equal to 0. In applying this to the problem of AF, this is a concern, as we hope to look for changes in the parameters from the incoming ECG. If the model stops believing the incoming data then it will not be able to detect any changes once the value has settled. So far we have run the persistence models without artificial Wiener processes; if we run the filter again with the parameters Cµ = 0.5 and Cθ = 1.0 we can examine whether doing so improves the accuracy of the method and prevents the members from settling to one value. The result is shown in figure 4.2.3.

We can see that, while the values no longer settle to a constant, the added uncertainty is not large enough for θ. If we increase Cθ to 10.0, the Wiener process is now sufficient to cause θ to fluctuate around the true value, as we can see in figure 4.2.4.
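In code, adding this noise to a parameter's persistence equation is a one-line change to the forecast step (a sketch; the function name and values are illustrative):

```python
import numpy as np

def persistence_with_noise(b, C, dt, rng):
    """Persistence forecast b_k = b_{k-1} + C dW: a random walk on the
    parameter that keeps the ensemble from collapsing to a single value."""
    return b + C * np.sqrt(dt) * rng.standard_normal(b.shape)

# Example: re-spread a collapsed theta ensemble with C_theta = 10.0.
rng = np.random.default_rng(1)
theta_members = np.full(500, 200.0)
theta_members = persistence_with_noise(theta_members, C=10.0, dt=1e-3, rng=rng)
```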

Fig. 4.2.3: This demonstrates the parameter estimation used 500 ensemble members and adding Wiener
processes to the persistence models where Cµ = 0.5 and Cθ = 1.0. This prevents the persistence
model from settling to one value although the estimation for θ is not sufficient.

Fig. 4.2.4: Parameter estimation using 500 ensemble members and added Wiener processes to the persistence
model with the parameters Cµ = 0.5 and Cθ = 10.0. This shows that the added noise has an impact
on the estimation of θ however it causes the signal to fluctuate around the mean of the value.

We are now a step closer to being able to approach the entire problem and include σ in our parameter space. Running the simulation with the same parameters used for figure 4.2.4, and now including a Wiener process for the σ equation with Cσ = 10.0, we obtain figure 4.2.5. We can see that the method still works well for µ and θ, although the latter fluctuates more around its mean. The σ estimate performs very poorly; it fluctuates around a mean of roughly -200, which is very far from the 500 used to generate the process. We need to look at alternatives or modifications to the EnKF to ascertain whether the performance of the filter can be improved. It is stated in (Yang and DelSole, 2009) that while the method of augmenting our state space to include the parameters works well,

Fig. 4.2.5: Estimating all parameters using 500 ensemble members and added Wiener processes where Cµ =
0.5 and Cθ = Cσ = 10.0. This shows that method does allow for the estimation for µ and θ to
fluctuate around the true value however the estimation for σ fails.

we also need to consider how stable it is: problems can arise when the parameters are multiplicative, as they can cause the model to become dynamically unstable. Based on this we will now consider the ensemble square root filter, discussed in section 4.3.

4.3 Ensemble Square Root Filter


This subsection will deal with a modification to the EnKF called the ensemble square root filter which
applies the update to the perturbations from the mean as opposed to the member itself. We will also go
on to discuss temporal smoothing and artificial covariance inflation methods.

The ensemble square root filter (EnSRF) is a modified form of the ensemble Kalman filter. We propose splitting the state space vector into two vectors: an m-dimensional vector specifying the forecast state (x^f) and a q-dimensional vector specifying the uncertain model parameters (b^f). This requires a change to some of the equations used in the EnKF. In the EnKF we performed the update equation on each individual ensemble member; here we will apply the Kalman update to the mean of the ensemble and then propagate the fluctuations, as opposed to updating each individual member. We redefine our observation equation as

z_k = H_x x_k^f + v_k (4.3.1)
where H_x is a measurement operator that maps the forecast state to the observation, and v_k is as defined previously: white noise with mean 0 and covariance R. We can then define our augmented state vector x*_k^f as

x*_k^f = [x_k^f, b_k^f]^T (4.3.2)

and an augmented measurement operator as

H* = [H_x  0] (4.3.3)
Like the EnKF we will require a matrix of ensemble fluctuations. We defined this in equation 4.2.5; however, we now require an equivalent for the model parameters:

B'_k^f = B_k^f (I − 1_N) (4.3.4)

where B_k^f = (b_1^f, b_2^f, ..., b_N^f). With the fluctuation matrices defined we can assemble our error covariance matrix in a way similar to the EnKF, although it now requires three steps. We require the following three matrices (the subscript k is dropped in this notation):
P_xx = (1/(N−1)) X'^f (X'^f)^T (4.3.5)

P_bx = (1/(N−1)) B'^f (X'^f)^T (4.3.6)

P_bb = (1/(N−1)) B'^f (B'^f)^T (4.3.7)

Hence

P^f = [P_xx, P_bx^T; P_bx, P_bb] (4.3.9)

We now require two Kalman gain matrices:

K_x,k = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1} (4.3.10)

K_b,k = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1} (4.3.11)

Our update is now twofold. We have our regular Kalman update applied to the means of x_k^f and b_k^f:

x̄_k^a = x̄_k^f + K_x,k (z_k − H_x x̄_k^f) (4.3.12)

b̄_k^a = b̄_k^f + K_b,k (z_k − H_x x̄_k^f) (4.3.13)

and finally, assuming that our observations are independent, we can follow (Whitaker and Hamill, 2002) and apply the following updates to the fluctuations from the mean:

x'_j^a = x'_j^f − αK_x H_x x'_j^f (4.3.14)

b'_j^a = b'_j^f − αK_b H_x x'_j^f (4.3.15)

where

α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1} (4.3.16)

and we can see that 1/2 ≤ α < 1.
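For a scalar observation, the mean-and-perturbation update of equations 4.3.10–4.3.16 can be sketched as follows. This is a sketch of the scheme as described above; the toy ensembles at the bottom are illustrative placeholders, not values from the report.

```python
import numpy as np

def ensrf_update(Xf, Bf, z, Hx, R):
    """EnSRF analysis: Kalman-update the ensemble means, then shrink the
    fluctuations by the factor alpha instead of updating every member."""
    N = Xf.shape[1]
    xbar = Xf.mean(axis=1, keepdims=True)
    bbar = Bf.mean(axis=1, keepdims=True)
    Xp, Bp = Xf - xbar, Bf - bbar                     # fluctuation matrices
    Pxx = Xp @ Xp.T / (N - 1)
    Pbx = Bp @ Xp.T / (N - 1)
    S = Hx @ Pxx @ Hx.T + R                           # innovation covariance
    Kx = Pxx @ Hx.T @ np.linalg.inv(S)                # eq. 4.3.10
    Kb = Pbx @ Hx.T @ np.linalg.inv(S)                # eq. 4.3.11
    innov = z - Hx @ xbar
    xbar_a = xbar + Kx @ innov                        # mean update, eq. 4.3.12
    bbar_a = bbar + Kb @ innov                        # mean update, eq. 4.3.13
    alpha = 1.0 / (1.0 + np.sqrt(R[0, 0] / S[0, 0]))  # eq. 4.3.16
    Xp_a = Xp - alpha * Kx @ (Hx @ Xp)                # fluctuation update
    Bp_a = Bp - alpha * Kb @ (Hx @ Xp)
    return xbar_a + Xp_a, bbar_a + Bp_a

# Toy usage: scalar observed state x_t, parameters (mu, theta), 250 members.
rng = np.random.default_rng(2)
Xf = rng.normal(0.0, 1.0, size=(1, 250))
Bf = rng.normal([[10.0], [200.0]], 5.0, size=(2, 250))
Hx = np.array([[1.0]])
R = np.array([[0.01]])
Xa, Ba = ensrf_update(Xf, Bf, np.array([[0.5]]), Hx, R)
```

Because 1/2 ≤ α < 1, the analysed fluctuations are strictly smaller than the forecast fluctuations, which is what replaces the per-member update of the EnKF.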

4.3.1 Algorithm

Algorithm 4.3.1 Ensemble Square Root Filter Algorithm


define N {number of members}
distribute N members normally with a chosen variance

for k = 1 to T do

  for i = 1 to N do
    x_{i,k}^f = f(x_{i,k-1}^a, 0)
    b_{i,k}^f = g(b_{i,k-1}^a, 0) {g is generally a persistence equation}
  end for

  X'_k^f = X_k^f (I − 1_N)
  B'_k^f = B_k^f (I − 1_N)

  P_xx = (1/(N−1)) X'_k^f (X'_k^f)^T
  P_bx = (1/(N−1)) B'_k^f (X'_k^f)^T
  P_bb = (1/(N−1)) B'_k^f (B'_k^f)^T

  P = [P_xx, P_bx^T; P_bx, P_bb]

  K_x,k = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1}
  K_b,k = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1}

  x̄_k^a = x̄_k^f + K_x,k (z_k − H_x x̄_k^f)
  b̄_k^a = b̄_k^f + K_b,k (z_k − H_x x̄_k^f)

  α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1}

  for j = 1 to N do
    x'_{j,k}^a = x'_{j,k}^f − αK_x H_x x'_{j,k}^f
    b'_{j,k}^a = b'_{j,k}^f − αK_b H_x x'_{j,k}^f

    x_{j,k}^a = x̄_k^a + x'_{j,k}^a
    b_{j,k}^a = b̄_k^a + b'_{j,k}^a
  end for

end for

4.3.2 Application
As we have done previously, we will begin by applying the EnSRF to xt, µ and θ while assuming that σ is known, to understand whether the additional steps in the EnSRF provide better performance than the EnKF. We will examine the cases of 250 ensemble members and 500 ensemble members. In both examples we include the additional Wiener processes using the same parameters as discussed previously: Cµ = 0.5 and Cθ = 10.0.

250 ensemble members  Running the filter with 250 ensemble members results in figure 4.3.1. As we can see, the method does not settle and the values do indeed tend towards the true values, although θ seems to undershoot. Running the filter three times gives the averages displayed in table 4.3.1; in each run we take a time average, and we then average over all the runs.

Fig. 4.3.1: Using the ensemble square root filter with 250 ensemble members to estimate µ and θ. The persistence model still contains the added noise. This shows that this method improves the accuracy compared to the EnKF.

µ̄ 11.2213
θ̄ 178.6549
Time takenaverage 229.281044s

Table 4.3.1: Average results for 250 ensemble members in the EnSRF.

Already we can see that there is some improvement over the EnKF: the average value of θ is better, although again this comes at the cost of the time taken to run each simulation; the processing time roughly doubles due to the increased number of calculations required.

500 ensemble members  We will now examine the method with 500 ensemble members, using the same procedure as described for 250 members. Figure 4.3.2 shows an improvement over 250 ensemble members, with all values fluctuating closer to the true values. The average results are presented in table 4.3.2.

The results are very promising. As we can see, we have improved the estimates of both parameters, and it is now time to see how well this filter performs with σ added to the problem.

Fig. 4.3.2: Estimating µ and θ using 500 ensemble members in the EnSRF with added noise to the persistence model.

µ̄ 10.065
θ̄ 207.6561
Time takenaverage 1134.1052s

Table 4.3.2: Average results when estimating µ and θ using 500 ensemble members in the EnSRF with added noise to the persistence model.

500 ensemble members estimating µ, θ and σ  We will now consider the case with an added Wiener process on the σ persistence equation: we set Cσ = 10.0 and examine how well this performs. Figure 4.3.3 shows the performance over all 3 runs of the filter. The numerical results are presented in table 4.3.3.

µ̄ 11.0981
θ̄ 194.2767
σ̄ -53.2441
Time takenaverage 1260.2632s

Table 4.3.3: Average values from the EnSRF estimating all parameters in the OU process with 500 ensemble
members with added noise to the persistence model applied.

The results show that the extra computational cost of estimating σ is minimal; however, the filter fails to estimate the parameter. While µ and θ are estimated fairly well, the estimate of σ is poor: two of the three runs give a negative value and none are close to the true value. Further research is therefore still required.

Fig. 4.3.3: EnSRF with 500 ensemble members and added noise. The top and middle graphs demonstrate
the ability of the filter to estimate the diffusion terms in the OU process. The bottom graph shows
that the filter still fails to effectively estimate σ.

Temporal Smoothing  It is suggested in (Yang and DelSole, 2009) that the persistence model can be modified using temporal smoothing, which can mitigate model blow-up for a small number of ensemble members. The required modification is shown in equation 4.3.17.

b_{k+1,j}^f = (1 − β) b_{k,j}^a + β b_{k,j}^f (4.3.17)
It is suggested that β = 0.8 for the most effective result. We will consider the application of this using
the same parameters as described so far.
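Equation 4.3.17 amounts to a one-line blend of the previous analysis and forecast values (a sketch; the numbers are illustrative):

```python
def smoothed_forecast(b_analysis, b_forecast, beta=0.8):
    """Temporal smoothing of the parameter forecast (eq. 4.3.17): blend the
    old forecast into the new one to damp blow-up in small ensembles."""
    return (1.0 - beta) * b_analysis + beta * b_forecast

# Example: a theta member analysed at 195.0 whose previous forecast was 205.0.
b_next = smoothed_forecast(195.0, 205.0)
```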

As we can see in figure 4.3.4 and table 4.3.4, the result of the temporal smoothing is indeed beneficial. We are able to obtain accuracy close to the case with 500 members, thus reducing the computational time taken. An interesting note, however, is the instability that occurs in the third run of the filter; after about 8 × 10^4 points we see θ starting to 'blow up', and this is also reflected in the other two plots. Regardless of this, the other two attempts appear successful. The instability may be due to a spurious point generated from the normal distribution. However, we still encounter the problem of estimating σ.

Previously we could see a definite increase in accuracy when using 500 ensemble members compared to 250; however, this is no longer the case, as shown by table 4.3.5. With temporal smoothing applied there is very little difference between the two cases for µ and θ, and the number of members has little to no effect on estimating σ. Thus we can infer that applying temporal smoothing improves accuracy for a smaller ensemble, decreasing the computational cost. Henceforth we will consider the case of 250 ensemble members.

Fig. 4.3.4: The application of temporal smoothing to the EnSRF with 250 ensemble members with added noise. The top (µ) and middle (θ) graphs demonstrate estimation comparable to the case of 500 ensemble members. The bottom graph (σ) still fails to be estimated correctly.

µ̄ 9.9926
θ̄ 197.6678
σ̄ 39.2513
Time takenaverage 241.9382s

Table 4.3.4: Average results from the EnSRF with 250 members, temporal smoothing and added noise. This shows that the overall accuracy of the filter is increased by using temporal smoothing compared to the case with 500 members and no smoothing applied. σ continues to be estimated poorly.

Artificial Covariance Inflation  We have already discussed why it is important for the model parameters to have some form of artificial noise or inflation to prevent the filter from settling to one value, as we require the filter to be able to adapt to changes that may occur to the patient during the surgical procedure. We have already proposed one method of artificial inflation: introducing additional Wiener processes to the persistence models of the parameters we are trying to estimate. An alternative
to this is proposed by (Anderson, 2007), and involves applying an inflation coefficient to the update equation of the model parameters. This takes the form

b_{j,k}^inf = √λ_b (b_{j,k} − b̄_k) + b̄_k (4.3.18)

It is suggested that λ_b should be incorporated into the state space vector; however, for simplicity we will tune the value by hand to understand how this method works, adopting different values of λ for µ and θ. We begin by removing the Wiener processes from the persistence model and instead applying the inflation, which occurs prior to the persistence step. As we have failed to estimate σ so far, we will assume that it is known and once again simplify the problem in order to understand whether this extra step is viable. We will begin with the parameters λµ = 1.02 and λθ = 1.01. The result of this is shown in figure 4.3.5.

µ̄ 9.3593
θ̄ 197.4367
σ̄ -62.6541
Time takenaverage 1641.4369s

Table 4.3.5: Average results from the EnSRF with 500 members, temporal smoothing and added noise. This shows little-to-no improvement over the case with 250 ensemble members for µ and θ. σ continues to be estimated poorly.
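The inflation step of equation 4.3.18 can be sketched member-wise as follows (a sketch; the ensemble values below are illustrative):

```python
import numpy as np

def inflate(b, lam):
    """Artificial covariance inflation (eq. 4.3.18): scale each member's
    deviation from the ensemble mean by sqrt(lambda_b), preserving the mean."""
    bbar = b.mean()
    return np.sqrt(lam) * (b - bbar) + bbar

# Example: inflate a mu ensemble with lambda_mu = 1.02.
rng = np.random.default_rng(3)
mu_members = rng.normal(10.0, 1.0, size=250)
mu_infl = inflate(mu_members, lam=1.02)
```

The ensemble mean is unchanged while the ensemble variance grows by the factor λ, which is what stops the filter from ignoring new data.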

Fig. 4.3.5: Application of artificial covariance inflation instead of added noise to the EnSRF with 250 en-
semble members to estimate µ (top) and θ (bottom).

µaverage 9.6066
θaverage 196.8571
Time takenaverage 257.1983s

Table 4.3.6: Average data for EnSRF with 250 ensemble members, artificial inflation, temporal smoothing
and no σ estimation. This shows that this method is comparable to the method of adding noise
to the unobserved members.

We can see that the artificial inflation increases the covariance of both µ and θ. An interesting note is that during the third run of the µ estimation there is a very large amount of uncertainty, although the filter does settle down. This may be due to the initial distribution of the members in the filter; despite this, the severity of the fluctuations reduces. While µ does settle to a mean, this increases the amount of time it takes for θ to settle, which is to be expected: if θ controls how quickly the signal is pulled back to the mean, then the mean does indeed need to be discovered first. The average data is presented in table 4.3.6. We can see that this method also works well as an alternative to the additional Wiener processes. The most important issue, however, remains the estimation of σ. We will now consider alternative methods for estimating this parameter.

4.4 Estimating σ
In this subsection we will discuss the problems in estimating σ in an SDE and propose a Bayesian method for its estimation.

We have seen that estimating σ seems to be more complicated than estimating θ and µ. The reason for this was initially unknown; however, it is suggested in (DelSole and Yang, 2009) that if the initial covariance between xt and σt vanishes, then σt^a remains constant and equal to the initial guess of σ.³

So far we have been using Kalman filter methods to estimate the parameters by augmenting our state space vector. The proposed method instead involves two different vectors: a parameter vector β = [σ] and an augmented state vector x* = [x, b]^T = [xt, µ, θ]^T. Applying Bayes' theorem:

p(β, x*|z) ∝ p(z|x*, β) p(x*|β) p(β) (4.4.1)

We define the likelihood as

p(z|x*, β) = (2π)^{−M/2} |R|^{−1/2} exp(−(1/2) (z − Hx)^T R^{−1} (z − Hx)) (4.4.2)

The prior distribution p(x|β) is assumed to be Gaussian with mean x̄ and covariance P. The final distribution in equation 4.4.1, p(β), is also assumed to be Gaussian, with mean β̄ and covariance Σ_β. Taking the log of equation 4.4.1 and multiplying by −2 we obtain

−2L = (β − β̄)^T Σ_β^{−1} (β − β̄) + ln|P| + (z − Hx)^T R^{−1} (z − Hx) + (x − x̄)^T P^{−1} (x − x̄) (4.4.3)

Differentiating equation 4.4.3 with respect to x recovers the Kalman filter update equation. If instead we differentiate with respect to β, using the identities

∂P^{−1}/∂β_j = −P^{−1} (∂P/∂β_j) P^{−1}
∂ln|P|/∂β_j = tr[P^{−1} ∂P/∂β_j]

we can obtain an equivalent update equation for β:⁴

β = β̄ − (1/(2Σ_β)) tr[P^{−1} ∂P/∂β] + (1/(2Σ_β)) (z − Hx̄)^T (R + HPH^T)^{−1} H (∂P/∂β) H^T (R + HPH^T)^{−1} (z − Hx̄) (4.4.4)

The implication is that this is an iterative method; however, it is suggested that only one iteration is required per time step. The equation introduces an extra term that requires evaluation: ∂P/∂β. To evaluate it we require two simultaneous ensembles, which give P(β) and P(β − δβ). Once we have these two covariance matrices we can approximate the term to first order:

∂P/∂β = (P(β) − P(β − δβ)) / δβ (4.4.5)

In doing this it is implied that we will be able to obtain a better estimate of σ. The method assumes that Σ_β is constant, which is where it differs from the Kalman filter.
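Equations 4.4.4 and 4.4.5 can be combined into a single update function (a sketch under the assumptions above; the toy matrices in the usage example are placeholders, not values from the report):

```python
import numpy as np

def beta_update(beta, P, P_minus, dbeta, Sigma_beta, z, H, R, xbar):
    """One Bayesian update of beta (eq. 4.4.4), with dP/dbeta formed as a
    first-order difference of two ensemble covariances (eq. 4.4.5)."""
    dPdb = (P - P_minus) / dbeta                     # finite-difference dP/dbeta
    S = R + H @ P @ H.T                              # innovation covariance
    Sinv = np.linalg.inv(S)
    innov = z - H @ xbar
    trace_term = np.trace(np.linalg.solve(P, dPdb))  # tr[P^{-1} dP/dbeta]
    quad_term = (innov.T @ Sinv @ H @ dPdb @ H.T @ Sinv @ innov).item()
    return beta - trace_term / (2.0 * Sigma_beta) + quad_term / (2.0 * Sigma_beta)

# Toy usage with 2x2 covariances P(beta) and P(beta - dbeta).
P = np.array([[1.0, 0.1], [0.1, 2.0]])
P_minus = np.array([[0.98, 0.1], [0.1, 1.9]])
beta_new = beta_update(500.0, P, P_minus, dbeta=0.01, Sigma_beta=250.0,
                       z=np.array([[0.3]]), H=np.array([[1.0, 0.0]]),
                       R=np.array([[0.01]]), xbar=np.array([[0.0], [0.0]]))
```

Only a single iteration per time step is applied, as suggested in the text, and Σ_β stays fixed rather than being updated like a Kalman covariance.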

³ For the full proof please refer to appendix B.
⁴ For a full derivation please refer to (DelSole and Yang, 2009).

4.4.1 Algorithm
This section describes the procedure to implement a filter that estimates σ along with all other parameters.

Algorithm 4.4.1 Hybrid filter using an EnSRF for xt, µ and θ and a Bayesian estimator for σ

define N {number of members}
distribute N members normally with a chosen variance
define δβ and Σ_β

for k = 1 to T do
  for i = 1 to N do
    x_{k,i}^f = f(x_{k-1,i}^a, b_{k-1,i}^a, β_{k-1})
    xβ_{k,i}^f = f(x_{k-1,i}^a, b_{k-1,i}^a, β_{k-1} − δβ) {used to calculate P(β − δβ)}
    b_{k,i}^f = g(x_{k-1,i}^a, b_{k-1,i}^a, β_{k-1})
  end for

  X'_k^f = X_k^f (I − 1_N),  Xβ'_k^f = Xβ_k^f (I − 1_N),  B'_k^f = B_k^f (I − 1_N)

  P_xx = (1/(N−1)) X'_k^f (X'_k^f)^T,  Pβ_xx = (1/(N−1)) Xβ'_k^f (Xβ'_k^f)^T
  P_bx = (1/(N−1)) B'_k^f (X'_k^f)^T,  Pβ_bx = (1/(N−1)) B'_k^f (Xβ'_k^f)^T
  P_bb = (1/(N−1)) B'_k^f (B'_k^f)^T

  P(β) = [P_xx, P_bx^T; P_bx, P_bb],  P(β − δβ) = [Pβ_xx, Pβ_bx^T; Pβ_bx, P_bb]
  ∂P/∂β = (P(β) − P(β − δβ)) / δβ

  β_k = β_{k−1} − (1/(2Σ_β)) tr[P^{−1} ∂P/∂β] + (1/(2Σ_β)) (z − Hx̄)^T (R + HPH^T)^{−1} H (∂P/∂β) H^T (R + HPH^T)^{−1} (z − Hx̄)

  K_x,k = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1}
  K_b,k = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1}

  x̄_k^a = x̄_k^f + K_x,k (z_k − H_x x̄_k^f)
  b̄_k^a = b̄_k^f + K_b,k (z_k − H_x x̄_k^f)

  α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1}

  for j = 1 to N do
    x'_{j,k}^a = x'_{j,k}^f − αK_x H_x x'_{j,k}^f
    b'_{j,k}^a = b'_{j,k}^f − αK_b H_x x'_{j,k}^f

    x_{j,k}^a = x̄_k^a + x'_{j,k}^a
    b_{j,k}^a = b̄_k^a + b'_{j,k}^a
  end for
end for

Fig. 4.4.1: Using the proposed Bayesian approach to estimating σ. This has been taken from a simplified case
of only estimating xt and σ. This demonstrates that the method does not seem to work.

4.4.2 Application
We will now consider the case where we estimate σ. For testing purposes we simplify the problem and assume that µ and θ are known. We begin by choosing values for Σ_β and δβ: we aim for a high variance, Σ_β = 1 × 10^2, and set δβ = 0.01, starting from an initial guess of β_0 = 0.0.

From figure 4.4.1 we can see that this process does not work; however, this may be down to the parameters we have chosen for the estimation. The figure shows that the method is very slow to adapt to the incoming information, which can be addressed by increasing the set covariance of the process. We now consider the case where Σ_β = 2.5 × 10^2 and δβ = 1 × 10^-3. The result is more promising: an initial run gives a mean value of σ of 668.3535. While this is still not near the target of 500 used to generate the OU process, it is progress towards the goal set out. The filter was run twice more and the results are presented in figure 4.4.2. Note that we have not benchmarked the time taken for this task, as we are interested in the performance of the method as opposed to its efficiency.

Fig. 4.4.2: Further simplified tests with the Σβ and δβ changed. This demonstrates that there is some level
of consistency compared to the Kalman filtering approach to estimating the drift parameter.

We can see that, while the filter is not perfect, we at least have some consistency in the values obtained, whereas previous attempts showed none; we will therefore continue to estimate this parameter using the outlined method.

Estimating all parameters  We will now test all the parameters, using the EnSRF for xt, µ and θ and the Bayesian approach for σ. We apply artificial covariance inflation to the EnSRF with 250 ensemble members and temporal smoothing. Other parameters include λµ = λθ = 1.01 and the values of Σ_β and δβ chosen above for estimating σ. The result is shown in figure 4.4.3.

Fig. 4.4.3: Hybrid of EnSRF and Bayesian Estimation for all unobserved parameters. This approach uses
250 ensemble members, artificial inflation and temporal smoothing applied. This shows that with
the correct parameters for estimating σ the filter is able to obtain an estimate of parameters closer
to the true value compared to previous runs.

The result of this last test is very promising. We see µ and θ fluctuate around the expected values. σ still undershoots somewhat; however, the result is much more promising than our previous attempts. Taking the mean of each parameter we obtain the results presented in table 4.4.1.

µ̄ 10.6276
θ̄ 185.9806
σ̄ 404.3015

Table 4.4.1: Average Results for the hybrid filter with 250 ensemble members, temporal smoothing and arti-
ficial covariance inflation applied.

While the results are not perfect, they are a step in the right direction. To check that they are repeatable, we run the filter a further three times with the same parameters, to ensure that the estimation is not serendipitous as it was for θ in figure 4.2.1. The result is shown in figure 4.4.4.

Fig. 4.4.4: Hybrid filter with 3 separate runs. 250 ensemble members have been used as well as artificial
inflation and temporal smoothing. This demonstrates an improvement over the traditional Kalman
filter methods for estimating the drift parameter.

µ̄ 9.6139
θ̄ 184.1664
σ̄ 403.8067
Time takenaverage 215.5576s

Table 4.4.2: Average results for the hybrid filter with 250 members, inflation and smoothing applied.

An interesting note on this filter is that on average it is computationally less expensive than running the EnSRF with all parameters being estimated. This is because we are essentially only using one 'member' for σ, as opposed to the 250 or 500 we were using previously.
Now that we have a functional method for estimating the parameters in the OU process, we will consider fine-tuning the filter and ensuring that it is indeed suitable for the task of estimating data from a surgery.

4.4.3 Changes in Parameters


We have previously only considered the case where the parameters of the process are constant. The ultimate goal of this filter is to apply it to data during the surgery and look for changes that may occur during the ablation process. Such changes will most likely appear in either θ or σ; we will assume µ remains constant unless the surgeon moves the electrode. We will now consider the case where we merge two OU processes and examine whether the filter does indeed adapt to changes in the signal. We generate two OU processes with the parameters defined in table 4.4.3.

µ1 0    µ2 0
θ1 100    θ2 250
σ1 500    σ2 700
Number of points per process 5 × 10^5
Time per process 50s
∆t 1 × 10^-4

Table 4.4.3: Parameters for two separate OU processes to be stitched together, to examine the ability of the filter to adapt to changes.

We can see from figure 4.4.5 that the filter is able to adapt to changes in parameters. Although, as we stated earlier, the σ estimate is still not perfect, we can see some shift in the parameter. It is possible to reduce the fluctuations in the plot; however, the cost of doing so is an increase in the time it takes for the filter to adapt to any changes.

Fig. 4.4.5: The change of parameters in the OU process. The filter is able to adapt to changes in θ (middle) and is also able to sense a change in σ (bottom), although the method still demonstrates undershoot.

4.4.4 Robustness
In this section we will discuss how robust the filter is. We have already discussed the reasoning behind using 250 ensemble members over 500 for computational cost; we now consider the initial variance of the ensemble. Throughout these tests we have been using a variance of 1 × 10^9; however, we should consider how the filter performs with different variances. We consider the following initial variances: 1 × 10^1, 1 × 10^4 and 1 × 10^9. We found that 1 × 10^4 converges best: 1 × 10^9 requires almost 50 seconds' worth of information before the filter is able to correct itself, whereas the other two cases require less than 10 seconds.

5 Application to Heart Surgery Data
In this section we will discuss the application of the filtering techniques proposed to the ECG data.

Now that we have the tools required, we can begin analysing data from the surgical procedure. We have discussed how we can obtain a rough estimate of σ and fairly good estimates of µ and θ, and how the filter can adapt to changes in parameters that may occur during the surgery. Our final test is to use the data from a surgical procedure to test whether the filter works. Throughout this section we benchmark the filter against a maximum likelihood estimator. We first consider the case of long-standing persistent (LSP) atrial fibrillation. As we can see in figure 5.0.1, there is a major difference when we try to extract heartbeats from the data. In figure 5.0.1a we can see that the µ estimate begins to fail; this may be due to the presence of a heartbeat skewing the data. In figure 5.0.1b we see some interesting features: µ begins to change drastically towards the end of the data. While it may look as though it is about to diverge, this is not the case; if we superimpose the ECG data and µ, we see that µ does indeed track the data, as though the filter believes the mean is moving. This can be seen in figure 5.0.2. We also present the comparison between the maximum likelihood estimator and the filtered data in table 5.0.1.
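For context, maximum likelihood estimates of the OU parameters have a standard closed form based on the exact AR(1) discretisation x_{k+1} = µ + (x_k − µ)e^{−θ∆t} + noise. The report does not specify its implementation, so the sketch below assumes this standard derivation:

```python
import numpy as np

def ou_mle(x, dt):
    """Closed-form ML estimates of (mu, theta, sigma) from the AR(1) form
    x_{k+1} = mu + (x_k - mu) * a + eps, where a = exp(-theta * dt)."""
    x0, x1 = x[:-1], x[1:]
    n = len(x0)
    sx, sy = x0.sum(), x1.sum()
    sxx, sxy = (x0 * x0).sum(), (x0 * x1).sum()
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # AR(1) slope estimate
    mu = (sy - a * sx) / (n * (1.0 - a))            # intercept / (1 - a)
    theta = -np.log(a) / dt
    resid = x1 - mu - (x0 - mu) * a
    sigma = np.sqrt(resid.var() * 2.0 * theta / (1.0 - a * a))
    return mu, theta, sigma

# Sanity check on a simulated path with mu=10, theta=200, sigma=500.
rng = np.random.default_rng(5)
dt, n = 1e-4, 200_000
a = np.exp(-200.0 * dt)
s = 500.0 * np.sqrt((1.0 - a**2) / (2.0 * 200.0))
x = np.empty(n)
x[0] = 10.0
for k in range(1, n):
    x[k] = 10.0 + (x[k-1] - 10.0) * a + s * rng.standard_normal()
mu_hat, theta_hat, sigma_hat = ou_mle(x, dt)
```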

(a) Unfiltered Data (b) Filtered Data

Fig. 5.0.1: Unfiltered and filtered data from LSP through the hybrid filter. (a) demonstrates the filter perfor-
mance without the heartbeat. This shows that the OU process parameters could be changing after
every heartbeat. (b) demonstrates the filter once the data has been modified.

Fig. 5.0.2: µ and ECG data superimposed from data without heartbeats. This shows that µ seems to be
tracking the data implying the parameters could be changing after each heartbeat.

      Maximum Likelihood      Filter
µ     -96.0205                -144.0886
θ     12.6478                 12.8961
σ     973.6496                800.0923

Table 5.0.1: Comparison of the maximum likelihood estimator to the filter

We will examine another case using data from a patient suffering from persistent atrial fibrillation
(PsAF), measured on channel III. By removing the majority of the data where the heart beats and
running the remainder through the filter we obtain the information in figure 5.0.3.

Fig. 5.0.3: Persistent Atrial Fibrillation Channel III data passed through the hybrid filter. The bottom graph
(σ) also confirms that the parameters of the OU process are changing after each heartbeat.

      Maximum Likelihood      Filter
µ     -13.9613                66.8015
θ     27.4558                 18.9184
σ     2.6046×10³              1.4051×10³

Table 5.0.2: Comparison of the maximum likelihood estimator to the filter

We can see that there are some major discrepancies between the likelihood estimator and the filter.
This would be a cause for concern; however, the discrepancies arise because some information from the
heartbeat remains in the modified data. We can see the impact this has on σ: each jump corresponds
to a point where an artefact of the heartbeat has remained. The solution is to improve the method
used to remove the heartbeat, though this requires much more manual adjustment of the data than is
being used at the moment.

6 Final Remarks
6.1 Limitations
In this section we will discuss the limitations of the methods proposed.

We have discussed several versions of the Kalman filter, with particular focus on the ensemble
square root filter, itself a variant of the ensemble Kalman filter. We have tested each method
on the same Ornstein-Uhlenbeck process, and while we now have a method that can estimate all of the
parameters, we have discovered throughout this project that the method has limitations.

ECG data As we concluded in section 5, there are limitations in the data from the ECG. We are
interested in the signal between each heartbeat, and running this data through the filter has proven
harder than originally anticipated. The heartbeat was removed by testing for monotonicity over a set
number of points and then going back through the data and removing the data around each monotone
region. This worked for a small amount of the data, but given the variety of the data a better method
for removing the heartbeat is required: even a small artefact of the heartbeat can have a major impact
on all of the parameters.

Model Error Covariance Our first limitation is that we need to make an assumption about the model
error covariance. Throughout this project we have assumed that R = 0.01², however this is a parameter
that is almost impossible to determine accurately, and so we are forced to assume a value.
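To illustrate why this assumption matters, consider the scalar special case of the Kalman gain, K = P/(P + R). A small Python sketch (the values of P and R are illustrative, not taken from the filter runs):

```python
# Scalar Kalman gain K = P / (P + R) for a fixed forecast variance P
# and a range of assumed measurement error covariances R.
P = 0.5
gains = [P / (P + R) for R in (1e-4, 1e-2, 1.0)]

# A small assumed R makes the filter trust the measurements almost
# completely (K near 1); a large R makes it trust the model (K near 0).
assert gains[0] > 0.99 and gains[-1] < 0.5
assert gains == sorted(gains, reverse=True)
```

The assumed R therefore directly controls how strongly each observation corrects the state, which is why a wrong assumption cannot simply be absorbed elsewhere.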

Fine Tuning A major issue with the filter is the level of fine tuning required by the user. We have tested
artificial covariance inflation to prevent the ensemble from settling on a fixed value. As we demonstrated,
this allows the filter to adapt to changes that occur during a surgery, such as when a surgeon has ablated
parts of the patient's heart. We have used λµ = λθ = 1.01, however a lot of fine tuning was required to
arrive at this value. In many runs this method caused rank deficiency problems in the x∗ ensemble, most
noticeably during the θ estimation. We had initially forced the method to be log-positive, however some
spurious points could cause the method to fail, so this constraint was removed, which reduced the frequency
of the rank problems; using a pseudo-inverse where required improved matters further.
Once the Bayesian approach for σ was implemented we encountered further problems. We know that
σ is reliant on P; however, when coupled with inflation we found that the model error covariance
could grow to very large values, producing NaN (not a number) and causing the method to fail, therefore
requiring the previously used values of λ to be re-tuned.
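The inflation step itself is simple: each member's deviation from the ensemble mean is scaled by √λ, which scales the ensemble covariance by λ. A minimal Python sketch (the ensemble values are illustrative; the document's own implementation is in MATLAB in appendix E):

```python
import numpy as np

def inflate(ensemble, lam):
    """Multiplicative covariance inflation: scale each member's deviation
    from the ensemble mean by sqrt(lam), inflating the variance by lam."""
    mean = ensemble.mean()
    return np.sqrt(lam) * (ensemble - mean) + mean

# Illustrative ensemble of parameter values
rng = np.random.default_rng(0)
ens = rng.normal(10.0, 2.0, size=250)
inflated = inflate(ens, 1.01)

# The mean is unchanged; the variance grows by exactly a factor of lambda.
assert np.isclose(inflated.mean(), ens.mean())
assert np.isclose(inflated.var(), 1.01 * ens.var())
```

Because the variance never collapses to zero, the filter retains the ability to respond to later changes, at the cost of the tuning sensitivity described above.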

Bayesian Approach to Estimating σ As we have demonstrated, we have had issues estimating the
drift term in the OU process. While we have gained some insight into why it cannot be estimated using
conventional Kalman filtering techniques, the proposed method is not perfect. It requires a large amount
of tuning to obtain the correct Σβ and δβ values, and given the problems encountered with the ECG,
further testing may be required to determine whether the values used are correct. The method also does
not conform to the principle we have used elsewhere in the filter: we assume that as more data arrives
we can reduce the covariance of the parameter being estimated, whereas here the method has one
prescribed covariance that does not change. Furthermore, the estimated value of σ is typically about
25% below the required value, although as discussed this is a definite improvement over the EnKF and
EnSRF.

Convergence Time Another limitation of the proposed method is the convergence time of some pa-
rameters. We have noticed that µ converges very quickly, whereas both θ and σ can take a long time.
It is understandable that θ takes longer than the mean: θ controls how fast the signal reverts to the
mean, so it makes sense that the mean is found before this parameter. However, in many runs some
parameters took between 10 and 20 seconds to converge. This can have a major impact on use during
surgery, depending on how the filter is deployed: as a live tool it may lengthen the surgery, whereas as
a post-operative tool the impact is much smaller. Using a maximum likelihood estimator to obtain
values of µ, θ and σ and generating a corresponding OU process, we found that in many cases the filter
needs at least 20 seconds worth of information to fully converge. The problem is that each ECG sample
is only 42 seconds long, so if changes do occur during the surgical procedure the filter may not be able
to adapt quickly enough.

Ornstein-Uhlenbeck Process as a Model We have also briefly discussed the validity of the
OU process as a model for the heart signal. We have seen that a heartbeat can cause a large change in
the value of σ, implying that a heartbeat itself can change the OU parameters.

6.2 Recommendations for Future Work


In this subsection we will discuss recommendations for further work for this project.

Further Filter Research Further research into the filter could be conducted to obtain a better method
for estimating the drift term. The problem we face is that all of the non-observed parameters are multiplica-
tive, which increases the complexity of the problem. It is possible to increase the number of parameters
in our state space vector in the following form:

Ẋt = θ(µ − Xt ) + σ φ̇ (6.2.1)
φ̇ = Ẇt (6.2.2)

We would also include the parameters µ, θ and σ in our state space vector. In doing this we would turn
σ from being multiplicative to additive, which in turn reduces the error.
If this does not prove successful, further research into the Bayesian approach to estimating σ should
be performed. It would also be useful to approach this problem with a particle filter, although this
would increase both the computational cost and the complexity of the algorithm. We could simplify the
problem by running two separate ensembles with two different methods: the EnSRF for µ and θ, and a
particle filtering method for σ. While running this filter might be computationally expensive, the added
cost could be justified if it improves the accuracy of the method. Various methods described in (Simon,
2006) could be implemented to reduce the sample impoverishment that occurs with the particle filter.
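The augmented state space proposed above could be sketched as follows. This is a Python illustration (the document's filter itself is written in MATLAB), using an Euler-Maruyama forecast and a persistence model for the parameters; the variable names and values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def forecast(ensemble, dt):
    """One Euler-Maruyama forecast step for the augmented state
    z = [x, mu, theta, sigma, phi]. The parameters mu, theta, sigma
    follow a persistence model; phi accumulates the Brownian increments
    so that the noise enters the state vector additively."""
    x, mu, theta, sigma, phi = ensemble
    dW = rng.normal(0.0, np.sqrt(dt), size=x.shape)
    x_new = x + theta * (mu - x) * dt + sigma * dW
    phi_new = phi + dW
    return np.array([x_new, mu, theta, sigma, phi_new])

N = 250  # ensemble members, as in the experiments
z = np.array([np.full(N, 10.0),    # x
              np.full(N, 10.0),    # mu
              np.full(N, 650.0),   # theta
              np.full(N, 1000.0),  # sigma
              np.zeros(N)])        # phi
z = forecast(z, 1e-4)
assert z.shape == (5, 250)
```

The forecast leaves the parameter rows untouched (persistence) while the Brownian increment now also appears directly in the observable part of the state through φ.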

Improved ECG Filtering Due to time constraints we were unable to create a robust method for
removing the heartbeat from ECG data; we have seen that this is problematic when we pass the data
through our filter. The heartbeat has a large impact on σ, which in turn affects the ability of the filter
to estimate all of the parameters correctly. Creating such a tool would require a much more complex
system able to detect when a heartbeat is about to occur and stop recording the data for a given time.

Benchmark ECG data The aim of this filter is to classify the ECG; however, it is necessary to
benchmark the filter against a series of healthy patients before testing it on data from patients who
suffer from atrial fibrillation. In doing so we would be able to create a classification system that allows
the surgeon to understand the impact of the ablative surgery on the patient.

Bibliography
American Heart Association (2008), ‘Atrial fibrillation (for professionals)’,
http://www.americanheart.org/presenter.jhtml?identifier=1596. (Retrieved May 2010).

Anderson, J. L. (2007), ‘An adaptive covariance inflation error correction algorithm for ensemble filters’,
Tellus A 59(2), 210–224.

Calkins, H., Brugada, J., Packer, D. L., Cappato, R., Chen, S.-A. A., Crijns, H. J., Damiano, R. J., Davies,
D. W., Haines, D. E., Haissaguerre, M., Iesaka, Y., Jackman, W., Jais, P., Kottkamp, H., Kuck, K.
H. H., Lindsay, B. D., Marchlinski, F. E., McCarthy, P. M., Mont, J. L., Morady, F., Nademanee, K.,
Natale, A., Pappone, C., Prystowsky, E., Raviele, A., Ruskin, J. N., Shemin, R. J., Heart Rhythm So-
ciety, European Heart Rhythm Association, European Cardiac Arrhythmia Society, American College
of Cardiology, American Heart Association and Society of Thoracic Surgeons (2007), ‘HRS/EHRA/ECAS
expert consensus statement on catheter and surgical ablation of atrial fibrillation: recommendations
for personnel, policy, procedures and follow-up’, Europace 9(6), 335–379.
URL: http://dx.doi.org/10.1093/europace/eum120

DelSole, T. and Yang, X. (2009), A Bayesian method for estimating stochastic parameters. Submitted to
Physica D.

Evensen, G. (1992), ‘Using the extended Kalman filter with a multilayer quasi-geostrophic ocean model’,
J. Geophys. Res. 97, 17905–17924.

Evensen, G. (2006), Data Assimilation: The Ensemble Kalman Filter, Springer-Verlag New York, Inc.,
Secaucus, NJ, USA.

Evensen, G. (2009), ‘The ensemble Kalman filter for combined state and parameter estimation’, IEEE
Control Systems Magazine 29(3), 82–104.

Gartner, G. E. A., Hicks, J. W., Manzani, P. R., Andrade, D. V., Abe, A. S., Wang, T., Secor, S. M. and
Garland Jr., T. (2010), ‘Phylogeny, ecology, and heart position in snakes’, Physiological and Biochemical
Zoology 83(1), 43–54.
URL: http://www.journals.uchicago.edu/doi/abs/10.1086/648509

Iqbal, M. B., Taneja, A. K., Lip, G. Y. H. and Flather, M. (2005), ‘Recent developments in atrial
fibrillation’, British Medical Journal 330(7485), 238–243.

Kalman, R. E. (1960), ‘A new approach to linear filtering and prediction problems’, Transactions of the
ASME–Journal of Basic Engineering 82(Series D), 34–45.

Kloeden, P. E., Platen, E. and Schurz, H. (2003), Numerical solution of SDE through computer experi-
ments, Universitext, corr. 3. print. edn, Springer, Berlin [u.a.].

Krul, A. (2008), Calibration of stochastic convenience yield models for crude oil using the Kalman filter,
Master’s thesis, Delft University of Technology.

Majda, A. J., Harlim, J. and Gershgorin, B. (2010), ‘Mathematical strategies for filtering turbulent
dynamical systems’, Discrete and Continuous Dynamical Systems 27(2), 441–486.

McGee, L. A. and Schmidt, S. F. (1985), Discovery of the Kalman filter as a practical tool for aerospace
and industry, Technical Memorandum 86847, NASA.

Risken, H. (1996), The Fokker-Planck Equation, 2 edn, Springer.

Simon, D. (2006), Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, Wiley-
Interscience.

Sivakumaren, S. (2009), The detection of heart arrhythmia in electrograms, Master’s thesis, Imperial
College London.

Stewart, S., Murphy, N., Walker, A., McGuire, A. and McMurray, J. J. V. (2004), ‘Cost of an emerging
epidemic: an economic analysis of atrial fibrillation in the UK’, Heart 90(3), 286–292.

Uhlenbeck, G. E. and Ornstein, L. S. (1940), ‘On the theory of Brownian motion’, Phys. Rev. 36(5), 823–
841.

van den Berg, T. (2007), ‘Calibrating the Ornstein-Uhlenbeck model’,
http://www.sitmo.com/doc/Calibrating_the_Ornstein-Uhlenbeck_model. (Retrieved May 2010).

Welch, G. and Bishop, G. (2006), An Introduction to the Kalman Filter, Department of Computer Science,
University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3175.

Whitaker, J. S. and Hamill, T. M. (2002), ‘Ensemble data assimilation without perturbed observations’,
Mon. Wea. Rev. 130, 1913–1924.

Yang, X. and DelSole, T. (2009), ‘Using the ensemble Kalman filter to estimate multiplicative model
parameters’, Tellus A 61(5), 601–609.

Appendix

A Solution to OU Process
Let

Y_t = X_t − µ (A.1)

Thus

dY_t = dX_t = −θ Y_t dt + σ dW_t (A.2)

Now apply a variation of parameters by setting

Z_t = Y_t e^{θt} (A.3)

dZ_t = θ Y_t e^{θt} dt + e^{θt} dY_t (A.4)

This removes the drift term (−θY_t) from the equation:

dZ_t = θ Y_t e^{θt} dt + e^{θt} (−θ Y_t dt + σ dW_t)
     = σ e^{θt} dW_t (A.5)

Thus a solution to this problem can be found by integrating between s and t to obtain the following:

Z_t = Z_s + σ ∫_s^t e^{θu} dW_u (A.6)

By reverting back to Y_t the expression becomes:

Y_t = e^{−θt} Z_t
    = e^{−θ(t−s)} Y_s + σ e^{−θt} ∫_s^t e^{θu} dW_u (A.7)

and finally the solution using X_t becomes:

X_t = µ + e^{−θ(t−s)} (X_s − µ) + σ e^{−θt} ∫_s^t e^{θu} dW_u (A.8)

B σ Estimation Issue
(DelSole and Yang, 2009)
For simplicity we reduce the stochastic process to the following, where α is assumed to be a known
constant:

x_t = α x_{t−1} + σ dW_t (B.1)

Hence our AR(1) model becomes

x_t^f = α x_{t−1}^a + σ_{t−1}^a dW_t (B.2)

And our persistence model remains as

σ_t^f = σ_{t−1}^a (B.3)

If we then use the Kalman filter equations as described in section 3.1.1 we obtain the required quantities
in the filter update: the Kalman gain matrix and the error covariance matrices. In a simple 2D state
space where x_t = [x_t  σ_t]^T, our Kalman gain matrix becomes:

K = [ var[x_t^f] ; cov[x_t^f, σ_t^f] ] · 1/(var[x_t^f] + R) (B.4)

Thus our update equation for σ becomes

σ_t^a = σ_t^f + ( cov[x_t^f, σ_t^f] / (var[x_t^f] + R) ) (z_t − x_t) (B.5)

This implies that the update to σ_t is proportional to the covariance between x_t and σ_t, which can be
computed by revisiting equation B.2:

cov[x_t^f, σ_t^f] = α cov[x_{t−1}^a, σ_t^f] + cov[σ_t^f dW_t, σ_t^f] (B.6)

The last term in the above equation can be computed:

cov[σ_t^f dW_t, σ_t^f] = E[ σ_t^f dW_t (σ_t^f − E[σ_t^f]) ]
                       = E[ σ_t^f (σ_t^f − E[σ_t^f]) ] E[dW_t]
                       = 0 (B.7)

This is because dW_t is white noise and thus independent of σ_t^f. If we apply this analysis to the
persistence equation we obtain

cov[x_{t−1}^a, σ_t^f] = cov[x_{t−1}^a, σ_{t−1}^a] (B.8)

Thus equation B.6 becomes

cov[x_t^f, σ_t^f] = α cov[x_{t−1}^a, σ_{t−1}^a] (B.9)

Applying the update equation for the covariance matrix it is possible to obtain the following expression

cov[x_t^a, σ_t^a] = ( R / (R + var[x_t^f]) ) cov[x_t^f, σ_t^f] (B.10)

Hence

cov[x_t^f, σ_t^f] = α ( R / (R + var[x_{t−1}^f]) ) cov[x_{t−1}^f, σ_{t−1}^f] (B.11)

We know that

R / (R + var[x_t^f]) ≤ 1 (B.12)

And thus we have the following bound

cov[x_t, σ_t] ≤ α^t cov[x_0, σ_0] (B.13)

For stability we require |α| < 1, thus cov[x_t, σ_t] → 0 as t → ∞. The implication is that as
cov[x_t^f, σ_t^f] tends to 0, σ_t ceases to be updated: σ_t remains constant, equal to the initial guess
σ_0.
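The collapse derived above can be illustrated numerically. A small Python sketch, with illustrative values of α, R and the forecast variance, iterating the damping factor from (B.11):

```python
# Numerical illustration of (B.11)-(B.13): the cross covariance
# c_t = cov[x_t, sigma_t] is damped by alpha * R/(R + var[x]) at every
# step, so it decays to zero and sigma stops being updated.
# alpha, R and var_x are illustrative values, not taken from the thesis.
alpha, R, var_x = 0.9, 0.01, 0.05
c = 1.0                      # initial cross covariance cov[x_0, sigma_0]
history = [c]
for _ in range(200):
    c = alpha * (R / (R + var_x)) * c
    history.append(c)

# Each step shrinks the cross covariance at least as fast as alpha alone.
assert all(abs(history[t + 1]) <= abs(alpha * history[t]) + 1e-15 for t in range(200))
assert abs(history[-1]) < 1e-100  # the cross covariance has collapsed
```

Once `c` is numerically zero the Kalman update (B.5) leaves σ untouched, which is exactly the failure mode observed with the EnKF and EnSRF.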

C Maximum Likelihood Estimator
(van den Berg, 2007)

function [mu,sigma,theta] = maxlikely(S,delta)

n = length(S)-1;

Sx  = sum( S(1:end-1) );
Sy  = sum( S(2:end) );
Sxx = sum( S(1:end-1).^2 );
Sxy = sum( S(1:end-1).*S(2:end) );
Syy = sum( S(2:end).^2 );

mu = (Sy*Sxx - Sx*Sxy) / ( n*(Sxx - Sxy) - (Sx^2 - Sx*Sy) );

theta = -log( (Sxy - mu*Sx - mu*Sy + n*mu^2) / (Sxx - 2*mu*Sx + n*mu^2) ) / delta;
a = exp(-theta*delta);
sigma2 = (Syy - 2*a*Sxy + a^2*Sxx - 2*mu*(1-a)*(Sy - a*Sx) + n*mu^2*(1-a)^2)/n;
sigma = sqrt(sigma2*2*theta/(1-a^2));
end
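For reference, a Python transcription of the estimator above (a sketch; it assumes evenly spaced samples with spacing `delta` and requires NumPy), together with a check on synthetic OU data generated by the exact AR(1) discretisation:

```python
import numpy as np

def ou_max_likelihood(S, delta):
    """Maximum likelihood estimates of (mu, sigma, theta) for an OU process
    observed at evenly spaced intervals delta (van den Berg, 2007)."""
    S = np.asarray(S, dtype=float)
    n = len(S) - 1
    Sx = S[:-1].sum()
    Sy = S[1:].sum()
    Sxx = (S[:-1] ** 2).sum()
    Sxy = (S[:-1] * S[1:]).sum()
    Syy = (S[1:] ** 2).sum()

    mu = (Sy * Sxx - Sx * Sxy) / (n * (Sxx - Sxy) - (Sx ** 2 - Sx * Sy))
    theta = -np.log((Sxy - mu * Sx - mu * Sy + n * mu ** 2)
                    / (Sxx - 2 * mu * Sx + n * mu ** 2)) / delta
    a = np.exp(-theta * delta)
    sigma2 = (Syy - 2 * a * Sxy + a ** 2 * Sxx
              - 2 * mu * (1 - a) * (Sy - a * Sx)
              + n * mu ** 2 * (1 - a) ** 2) / n
    sigma = np.sqrt(sigma2 * 2 * theta / (1 - a ** 2))
    return mu, sigma, theta

# Synthetic data with illustrative parameters (not the thesis values).
rng = np.random.default_rng(1)
mu0, sigma0, theta0, dt, n = 10.0, 1.0, 5.0, 0.01, 20000
a0 = np.exp(-theta0 * dt)
sd0 = sigma0 * np.sqrt((1 - a0 ** 2) / (2 * theta0))
S = np.empty(n)
S[0] = mu0
for i in range(1, n):
    S[i] = mu0 + a0 * (S[i - 1] - mu0) + sd0 * rng.standard_normal()

mu_hat, sigma_hat, theta_hat = ou_max_likelihood(S, dt)
assert abs(mu_hat - mu0) < 0.5
assert abs(sigma_hat - sigma0) < 0.2
assert abs(theta_hat - theta0) < 1.5
```

With 200 relaxation times of data the estimates recover the true parameters to within sampling error, which is the sense in which this estimator serves as the benchmark in section 5.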

D Ornstein-Uhlenbeck Process MATLAB Code

function [Yt] = OU(T,N,mu,sigma,theta,Y0)
%OU generates an OU process using the
%analytical solution

dt = T/N;

%generate vector distributed ~N(0,1)
rnd = randn(1,N);

Yt = zeros(N,1);
Yt(1) = Y0;

for i = 2:N
    %AR(1) analytical solution
    Yt(i) = Yt(i-1)*exp(-theta*dt) + mu*(1-exp(-theta*dt)) + ...
        sigma*sqrt((1-exp(-2*theta*dt))/(2*theta))*rnd(i);
end

end
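For comparison, a Python sketch of the same exact-discretisation simulator, with a check that the sample statistics approach the stationary mean µ and variance σ²/(2θ) of the OU process (the parameter values are illustrative):

```python
import numpy as np

def ou_exact(T, N, mu, sigma, theta, Y0, rng):
    """Simulate an OU process using the exact AR(1) discretisation,
    mirroring the MATLAB function above."""
    dt = T / N
    a = np.exp(-theta * dt)
    sd = sigma * np.sqrt((1 - np.exp(-2 * theta * dt)) / (2 * theta))
    Yt = np.empty(N)
    Yt[0] = Y0
    for i in range(1, N):
        Yt[i] = a * Yt[i - 1] + mu * (1 - a) + sd * rng.standard_normal()
    return Yt

rng = np.random.default_rng(0)
Y = ou_exact(T=100.0, N=100_000, mu=10.0, sigma=1.0, theta=5.0, Y0=10.0, rng=rng)

# Stationary statistics: mean mu = 10, variance sigma^2 / (2 theta) = 0.1.
assert abs(Y.mean() - 10.0) < 0.1
assert abs(Y.var() - 0.1) < 0.03
```

Because the discretisation is exact, the path statistics are correct for any step size, unlike an Euler-Maruyama approximation.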

E Final Filter MATLAB code

%% Initialization of filter and OU Process

close all
clear
clc

tic

%Number of dimensions
%NOTE: This should be the number of dimensions minus one as the diffusion
%parameter in the stochastic equation is not included in the state space
%vector
D = 3;

%Generate OU Process
mu = 10.0;
sigma = 1000.0;
theta = 650.0;
Mo = 1e6;
T = 100;
dt = T/Mo;
%Number of ensemble members
N = 250;

% Call OU Process
Yt = OU(T,Mo,mu,sigma,theta,mu);

%Simulate measurement noise
R = 0.01^2;

%Artificial inflation parameters
lambdam = 1.02;
lambdat = 1.008;
Cmu = 0.0;
Cthe = 0.0;

%Temporal smoothing parameters
a = 0.8;
b = 0.8;

%Subsample OU Process
Y = Yt(1:10:end);
M = size(Y,1);

%Add measurement noise to process
Yn = Y+sqrt(R)*randn(M,1);

%Measurement operators
Hx = 1;
Hb = zeros(1,D-1);
H = [Hx Hb];

%Vector initialization
zhat = zeros(D,N);
zhatf = zeros(D,N);
extract = zeros(D+1,M); %NOTE: D+1 because we are also extracting \sigma
zbar = zeros(D,1);
zprimef = zeros(D,N);
zprimea = zeros(D,N);
zhats = zeros(D,N);
alpha = zeros(1,M);
K = zeros(D,M);

%dt time step change because of subsampling
dt = dt*Mo/M;

%Initial Conditions
zhat = repmat([Yn(1);zeros(D-1,1)],1,N)+7.5e2*randn(D,N);
zhatf = repmat([Yn(1);zeros(D-1,1)],1,N)+7.5e2*randn(D,N);

%\sigma estimation parameters
delP = 7.5e-4;
sigBeta = 2.5e2;

beta = 100;
dbeta = beta-delP;

%% ALGORITHM
for k = 1:M

    %Forecast step
    zhatf(1,:) = zhat(1,:).*exp(-(zhat(3,:))*dt)...
        + zhat(2,:).*(1-exp(-(zhat(3,:))*dt))...
        + (beta)*sqrt((1-exp(-2*(zhat(3,:))*dt))./(2*(zhat(3,:)))).*randn(1,N);
    %Forecast step to find P(beta-dbeta)
    zhats(1,:) = zhat(1,:).*exp(-(zhat(3,:))*dt)...
        + zhat(2,:).*(1-exp(-(zhat(3,:))*dt))...
        + (dbeta)*sqrt((1-exp(-2*(zhat(3,:))*dt))./(2*(zhat(3,:)))).*randn(1,N);

    %Artificial inflation as proposed by Jeffrey L. Anderson in the paper
    %"An adaptive covariance inflation error correction algorithm for
    %ensemble filters"
    zhat(2,:) = sqrt(lambdam).*(zhat(2,:)-repmat(zbar(2),1,N))+repmat(zbar(2),1,N);
    zhat(3,:) = sqrt(lambdat).*(zhat(3,:)-repmat(zbar(3),1,N))+repmat(zbar(3),1,N);

    %Mixture of persistence model and temporal smoothing as proposed by
    %X. Yang and T. DelSole in the paper "Using the Ensemble Kalman Filter
    %to estimate Multiplicative model parameters"
    zhatf(2,:) = (1-a)*zhat(2,:)+a*zhatf(2,:)+Cmu*randn(1,N)*sqrt(dt);
    zhatf(3,:) = (1-b)*zhat(3,:)+b*zhatf(3,:)+Cthe*randn(1,N)*sqrt(dt);

    %Calculate the residual
    vhat = Yn(k)-H*zhatf;

    %Calculate the fluctuations from the mean for the EnSRF
    Xprimef = zhatf(1,:)*(speye(N)-(1/N)*ones(N));
    Bprimef = zhatf(2:end,:)*(speye(N)-(1/N)*ones(N));
    %Fluctuations for P(beta-dbeta)
    Xprimef2 = zhats(1,:)*(speye(N)-(1/N)*ones(N));

    %Calculations for P(beta) and P(beta-dbeta)
    %{xxx}2 denotes the P(beta-dbeta) calculation
    Pxx = Xprimef*Xprimef'/(N-1);
    Pxx2 = Xprimef2*Xprimef2'/(N-1);
    Pbx = Bprimef*Xprimef'/(N-1);
    Pbx2 = Bprimef*Xprimef2'/(N-1);
    Pbb = Bprimef*Bprimef'/(N-1);

    %Assemble covariance matrices
    P = [Pxx Pbx'; Pbx Pbb];
    P2 = [Pxx2 Pbx2'; Pbx2 Pbb];
    %Calculate dP/dbeta
    dP = (P-P2)/(delP);
    %Bayesian update
    beta = beta + 1/(2*sigBeta)*trace(pinv(P)*dP)...
        - 1/(2*sigBeta)*mean(vhat)'/(H*P*H'+sqrt(R))*H*dP*H'...
        /(H*P*H'+sqrt(R))*mean(vhat);
    %Recalculate beta-dbeta
    dbeta = beta-delP;

    %Kalman gain matrix
    K(:,k) = P*H'/(H*P*H'+sqrt(R));

    %Analysis covariance
    Pa = (speye(D)-K(:,k)*H)*P;

    %Kalman update for mean values
    zbarf = mean(zhatf,2);
    zbar = zbarf+K(:,k)*mean(vhat);

    %Alpha calculation for EnSRF
    alpha(k) = 1/(1+sqrt((R)/(Hx*Pxx*Hx+sqrt(R))));

    %Update perturbations
    Xprime = Xprimef-alpha(k)*K(1,k)*Hx*Xprimef;
    Bprime = Bprimef-alpha(k)*K(2:end,k)*Hx*Xprimef;
    Zprime = [Xprime;Bprime];

    %Redistribute ensemble members around the analytical mean
    zhat = zbar*ones(1,N)+Zprime;

    %Extract values
    extract(:,k) = [zbar;beta];

end

F Removing Heartbeats from ECG data

clc;
clear;

% load file name
z = %data file

%size of buffer for examining monotonicity of points
bsize = 9;
%determine the size of the data coming in
n = size(z,1);
%clear the extract and temp arrays
extract = [];
temp = [];
%create a cell array for determining whether to keep or
%remove data
buff = cell(ceil(n/bsize),3);
%j is used to index the cell array
j = 1;
for i = 0:bsize:n-bsize

    %take a slice of data
    buff{j,1} = z(i+1:i+bsize);
    temp = buff{j,1};

    %extract the point in time where the data occurs
    buff{j,2} = (i+1:i+bsize)';

    %monotonicity check. If buff{j,3} = 0 then that slice of data
    %is monotone
    if all(diff(temp) <= 0)
        buff{j,3} = 0;
    elseif all(diff(temp) >= 0)
        buff{j,3} = 0;
    else
        buff{j,3} = 1;
    end
    %push j forward to the next row in the cell array
    j = j+1;
    %clear the temp slice for checking the monotonicity of the data
    temp = [];
end

%go back through the data and see if we have a peak (i.e. peak of a
%heartbeat) which will be denoted by 1. If it is surrounded by two
%0's on either side then convert the 1 to a 0 so that it is ignored.
for i = 3:ceil(n/bsize)-2
    if buff{i,3} && ~buff{i-1,3} && ~buff{i-2,3} && ~buff{i+1,3} && ~buff{i+2,3}
        buff{i,3} = 0;
    end
end

%finally extract the data
for i = 1:ceil(n/bsize)-2
    if buff{i,3}
        extract = [extract; buff{i,1} buff{i,2}];
    end
end
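The same slice-and-test logic can be sketched in Python for readers prototyping outside MATLAB (the helper name and buffer size are illustrative; only the monotonicity test from the MATLAB routine above is reproduced):

```python
import numpy as np

def non_monotone_slices(z, bsize=9):
    """Flag each bsize-sample slice of the signal z with 1 if it is
    non-monotone (candidate inter-beat data) and 0 if monotone
    (candidate heartbeat edge), mirroring the MATLAB routine above."""
    flags = []
    for start in range(0, len(z) - bsize + 1, bsize):
        d = np.diff(z[start:start + bsize])
        monotone = np.all(d <= 0) or np.all(d >= 0)
        flags.append(0 if monotone else 1)
    return flags

# A rising ramp (monotone) followed by a noisy oscillation (non-monotone).
z = np.concatenate([np.linspace(0.0, 1.0, 9),
                    np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])])
flags = non_monotone_slices(z, bsize=9)
assert flags == [0, 1]
```

As in the MATLAB version, the monotone slices mark the steep edges of the QRS complex, and the surrounding data is then discarded before filtering.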
