PhD Program in Business Administration and Quantitative Methods



FINANCIAL ECONOMETRICS

2006-2007

ESTHER RUIZ

CHAPTER 2. UNOBSERVED COMPONENT MODELS
2.1 Description and properties
When analysing the dynamic evolution of a given variable of interest, it is often helpful
to assume that it is made up of unobserved components that have a direct
interpretation. There are many applications of unobserved component models in
finance. Next, we describe some of them.

Fundamentals of prices
If we are analysing the evolution of the price of a financial stock in a given market, we
may be interested in the underlying fundamental price, while the observed price is
contaminated by market rigidities. In this case, if we assume that the
fundamental price is a random walk, the observed price is given by

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \eta_t$$

where $\mu_t$ is the underlying fundamental price and $\varepsilon_t$ are measurement errors. This model
can also be interpreted as a fads model where different types of traders give rise to
different unobserved components; see, for example, Poterba and Summers (1988).
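For illustration, this model is easy to simulate. The following minimal sketch (with arbitrary parameter values and variable names of our own choosing) generates a fundamental price and the contaminated observed price:

```python
import numpy as np

# Simulate the random walk plus noise model
#   y_t = mu_t + eps_t,   mu_t = mu_{t-1} + eta_t
# The parameter values below are illustrative, not taken from any study.
rng = np.random.default_rng(0)
T = 1000
sigma_eps, sigma_eta = 1.0, 0.7   # std. devs. of measurement and level noises

mu = np.cumsum(rng.normal(0.0, sigma_eta, T))   # random walk: fundamental price
y = mu + rng.normal(0.0, sigma_eps, T)          # observed (contaminated) price
```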

Ex ante real interest differentials
Cavaglia (1992) analyses the dynamic behaviour of ex ante interest differentials across
four countries (the United States, Germany, Switzerland and the Netherlands), using monthly
observations of ex post interest differentials from 1973 to 1987. The model proposed is:

$$y_t = y^*_t + \varepsilon_t$$
$$y^*_t = \phi(L)\, y^*_{t-1} + \eta_t$$

where $y_t$ is the ex post interest differential, $y^*_t$ is the ex ante real interest differential and
$\varepsilon_t$ is the cross-country differential in inflation forecast errors, which is assumed to be
independently and identically distributed. He concludes that ex ante real interest differentials
are short-lived and mean-reverting to zero, supporting theoretical models of economic
interdependence.

Factor models
There is a long tradition of factor models in finance. These models simplify the
computation of the covariance matrix of returns in the context of mean-variance
portfolio allocation. Furthermore, factors are central in two asset pricing theories: the
mutual fund separation theory, of which the CAPM is a special case, and the arbitrage
pricing theory (APT). In latent factor models, the observed variables depend on a few
factors that are modelled as GARCH processes. Multivariate latent factor models have
been used in several applications. For example, Diebold and Nerlove (1989) fitted a
one-factor model to represent the dynamic evolution of the volatilities of seven dollar
exchange rates. King, Sentana and Wadhwani (1994) used a factor model to assess the
extent of capital market integration across sixteen national stock markets. Sentana (2004)
analyses the statistical properties of alternative ways of creating actively and passively
managed mimicking portfolios from a finite number of assets. He proposes the
following model:

$$\begin{pmatrix} r_{1t} \\ r_{2t} \\ \vdots \\ r_{Nt} \end{pmatrix} =
\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1k} \\ c_{21} & c_{22} & \cdots & c_{2k} \\ \vdots & \vdots & & \vdots \\ c_{N1} & c_{N2} & \cdots & c_{Nk} \end{pmatrix}
\begin{pmatrix} f_{1t} \\ f_{2t} \\ \vdots \\ f_{kt} \end{pmatrix} +
\begin{pmatrix} v_{1t} \\ v_{2t} \\ \vdots \\ v_{Nt} \end{pmatrix}$$

where $r_{it}$ is the return of a risky asset, $E[f_t f_t' \,|\, R_{t-1}] = \Lambda_t$, which is a diagonal matrix,
and $E[v_t v_t' \,|\, R_{t-1}] = \Gamma_t$.
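As an illustration, the following sketch simulates a one-factor version of this model with made-up loadings and variances; for simplicity the factor is homoskedastic here, whereas in the latent factor models discussed above it would follow a GARCH process:

```python
import numpy as np

# Sketch of a one-factor model r_t = c f_t + v_t with N assets (k = 1).
# All numbers are illustrative.
rng = np.random.default_rng(1)
T, N = 500, 5
c = rng.uniform(0.5, 1.5, size=N)        # factor loadings c_{i1}
f = rng.normal(0.0, 1.0, size=T)         # common factor with unit variance
v = rng.normal(0.0, 0.5, size=(T, N))    # idiosyncratic noises
r = np.outer(f, c) + v                   # T x N matrix of returns

# The implied covariance matrix of returns has the simple structure
# Sigma = c c' Var(f) + Gamma, which is what makes factor models attractive
# for mean-variance portfolio allocation.
Sigma = np.outer(c, c) + 0.25 * np.eye(N)
```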

Term structure
Several authors consider models for the term structure of interest rates where the
observable variables are zero coupon rates and the unobserved variables are the factors
that drive the curve. The law of motion of these factors depends on the dynamic structure
chosen. For example, in the Vasicek model, it is an Ornstein-Uhlenbeck process; see
Babbs and Nowman (1999). Finally, the observed yields are given by the theoretical
rates implied by a no-arbitrage condition plus a stochastic disturbance. For example, the
model proposed by de Rossi (2004) is given by

$$\begin{pmatrix} y_t(\tau_1) \\ y_t(\tau_2) \\ \vdots \\ y_t(\tau_n) \end{pmatrix} = A_t + B_t r_t + C_t u_t + \varepsilon_t$$

$$\begin{pmatrix} r_{t+1} \\ u_{t+1} \end{pmatrix} =
\begin{pmatrix} \exp\{-\tilde{a}\} & \frac{\tilde{a}}{\tilde{a}-\tilde{b}}\left(\exp\{-\tilde{b}\} - \exp\{-\tilde{a}\}\right) \\ 0 & \exp\{-\tilde{b}\} \end{pmatrix}
\begin{pmatrix} r_t \\ u_t \end{pmatrix} + c + \eta_t$$

where $y_t(\tau)$ is the spot interest rate at time $t$ for maturity $t + \tau$, $r_t$ is the short rate and $u_t$ is its stochastic mean.

In a dynamic context, Dungey, Martin and Pagan (2000) analyse bond yield spreads
between five countries by decomposing international interest rate spreads into national
and global latent factors.

Modelling volatility
Consider that we are interested in modelling the volatility of the price. There are two
main types of models proposed for this goal. The most popular are the GARCH models,
where the volatility is assumed to be a non-linear function of past returns. Consider, for
example, the GARCH(1,1) model given by

$$y_t = \sigma_t \varepsilon_t$$
$$\sigma^2_t = \omega + \alpha y^2_{t-1} + \beta \sigma^2_{t-1}$$

where $\varepsilon_t$ is an IID white noise with variance 1. The parameters have to be restricted to
guarantee the positiveness of the conditional variance; in particular, $\omega > 0$, $\alpha \geq 0$ and
$\beta \geq 0$. The stationarity condition is $\alpha + \beta < 1$.
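A minimal simulation sketch of the GARCH(1,1) model, with illustrative parameter values satisfying these restrictions:

```python
import numpy as np

# Simulate y_t = sigma_t * eps_t with
#   sigma_t^2 = omega + alpha * y_{t-1}^2 + beta * sigma_{t-1}^2
rng = np.random.default_rng(3)
omega, alpha, beta = 0.1, 0.1, 0.85   # omega > 0, alpha, beta >= 0, alpha + beta < 1
T = 1000

y = np.empty(T)
sig2 = np.empty(T)
sig2[0] = omega / (1 - alpha - beta)  # start at the unconditional variance
y[0] = np.sqrt(sig2[0]) * rng.normal()
for t in range(1, T):
    sig2[t] = omega + alpha * y[t-1]**2 + beta * sig2[t-1]
    y[t] = np.sqrt(sig2[t]) * rng.normal()
```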

ARCH-type models assume that the volatility can be observed one-step-ahead.
However, a more realistic model for volatility can be based on modelling it as having a
predictable component that depends on past information plus an unexpected noise. In
this case, the volatility is a latent, unobserved variable. One interpretation of the latent
volatility is that it represents the arrival of new information into the market; see, for
example, Clark (1973). In the simplest case, the log-volatility follows an AR(1) process.
Then, we have the ARSV(1) model given by

$$y_t = \sigma_* \sigma_t \varepsilon_t$$
$$\log(\sigma^2_t) = \phi \log(\sigma^2_{t-1}) + \eta_t$$

where $\varepsilon_t$ is a strict white noise with variance 1. The noise of the volatility equation, $\eta_t$,
is assumed to be a Gaussian white noise with variance $\sigma^2_\eta$, independent of the noise of
the level, $\varepsilon_t$. The Gaussianity of $\eta_t$ may seem rather ad hoc. However, several
empirical studies support this assumption both for exchange rates and for stock returns;
see Andersen, Bollerslev, Diebold and Ebens (2001) and Andersen, Bollerslev, Diebold
and Labys (2001, 2003).
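A minimal simulation sketch of the ARSV(1) model, again with illustrative parameter values:

```python
import numpy as np

# Simulate y_t = sigma_star * sigma_t * eps_t with
#   log(sigma_t^2) = phi * log(sigma_{t-1}^2) + eta_t
rng = np.random.default_rng(4)
phi, sigma_eta, sigma_star = 0.98, 0.2, 1.0
T = 1000

logsig2 = np.empty(T)
logsig2[0] = rng.normal(0.0, sigma_eta / np.sqrt(1 - phi**2))  # stationary draw
for t in range(1, T):
    logsig2[t] = phi * logsig2[t-1] + rng.normal(0.0, sigma_eta)
y = sigma_star * np.exp(logsig2 / 2) * rng.normal(size=T)
```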
The necessity of assumptions about the dynamics of the underlying latent variables is
the main criticism against unobserved component models. However, given that the
variables of interest cannot be observed, we can only estimate them by somehow
restricting their behaviour. The main point is whether these assumptions are sensible
and compatible with the data under analysis.

2.2 State space models
In general, a linear unobserved component model can be written as a state space model
as follows:

$$y_t = Z_t \alpha_t + d_t + \varepsilon_t$$
$$\alpha_t = T_t \alpha_{t-1} + c_t + \eta_t$$

where $\alpha_t$ is the latent state at time $t$, which has $k$ components, $\varepsilon_t$ is a white noise
process with variance $H_t$, and $\eta_t$ is a $k$-dimensional white noise with covariance
matrix $Q_t$, uncorrelated with $\varepsilon_t$ at all leads and lags. The system matrices
$H_t$, $Q_t$, $Z_t$, $d_t$, $T_t$ and $c_t$ are assumed to be predetermined in the sense that they are
known at time $t-1$. When they are fixed, the model is said to be time-invariant.
The first equation is known as the measurement equation, while the second is the
transition equation.
Consider, for example, two of the models described above.
a) In the model for fundamental prices, $H_t = \sigma^2_\varepsilon$, $Q_t = \sigma^2_\eta$, $Z_t = 1$, $d_t = 0$, $T_t = 1$
and $c_t = 0$.
b) In the ex ante real interest differentials model, if $\phi(L)$ is of order $p$,

$$Z_t = (1, 0, \ldots, 0), \qquad \alpha_t = (y^*_t, y^*_{t-1}, \ldots, y^*_{t-p+1})', \qquad
T_t = \begin{pmatrix} \phi_1 & \phi_2 & \cdots & \phi_p \\ 1 & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & 1 & 0 \end{pmatrix},$$

$H_t = \sigma^2_\varepsilon$ and $Q_t = \sigma^2_\eta$.
When $\varepsilon_t$ and $\eta_t$ are assumed to be Gaussian, the model is a Gaussian state space
model.
Unobserved component models depend on several disturbances. Provided the model is
linear, the components driven by these disturbances can be combined to give a model with
a single disturbance. This is known as the reduced form. The reduced form is an ARIMA
model, and the fact that it is derived from a structural form will typically imply
restrictions on its parameters.
Consider, once more, the random walk plus noise model. Taking first differences, we
obtain the following expression:

$$\Delta y_t = \eta_t + \varepsilon_t - \varepsilon_{t-1}$$

The mean and variance of $\Delta y_t$ are given by

$$E(\Delta y_t) = E(\eta_t) + E(\varepsilon_t - \varepsilon_{t-1}) = 0$$
$$Var(\Delta y_t) = Var(\eta_t) + Var(\varepsilon_t - \varepsilon_{t-1}) = \sigma^2_\eta + 2\sigma^2_\varepsilon$$

The dynamic properties of $\Delta y_t$ can be analysed by looking at its autocorrelation
function, given by

$$\rho(h) = \begin{cases} -\dfrac{\sigma^2_\varepsilon}{\sigma^2_\eta + 2\sigma^2_\varepsilon} = -\dfrac{1}{q+2}, & h = 1 \\[1ex] 0, & h \geq 2 \end{cases}$$

The constant $q = \sigma^2_\eta / \sigma^2_\varepsilon$ is known as the signal to noise ratio. From the autocorrelation
function above, it is easy to see that the reduced form of the random walk plus noise
model is an IMA(1,1) model with negative parameter. Equating the autocorrelations of
first differences at lag one gives the following expression for the MA parameter:

$$\theta = \frac{\sqrt{q^2 + 4q} - q - 2}{2}$$
When $q = 0$, $\Delta y_t$ reduces to a non-invertible MA(1) model, i.e. $y_t$ is a white noise
process. On the other hand, as $q$ increases, the autocorrelation of order one and,
consequently, the MA parameter $\theta$ decrease in absolute value. In the limit, if
$\sigma^2_\varepsilon = 0$, $\Delta y_t$ is a white noise and $y_t$ is a
random walk.
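This mapping from $q$ to $\theta$ can be checked numerically; the following sketch (the function name is our own) verifies that the implied MA(1) lag-one autocorrelation equals $-1/(q+2)$:

```python
import numpy as np

# MA parameter of the IMA(1,1) reduced form implied by the signal-to-noise
# ratio q; theta lies in (-1, 0] and equals -1 when q = 0.
def theta_from_q(q):
    return (np.sqrt(q**2 + 4 * q) - q - 2) / 2

for q in [0.0, 0.49, 1.0, 10.0]:
    th = theta_from_q(q)
    # the lag-one autocorrelation of an MA(1), theta/(1 + theta^2),
    # should match -1/(q + 2), the lag-one autocorrelation of (1 - L) y_t
    print(f"q = {q:5.2f}  theta = {th:6.3f}  "
          f"rho(1): {th / (1 + th**2):6.3f} vs {-1 / (q + 2):6.3f}")
```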

2.3 The Kalman filter: filtered estimates of the unobserved components
The Kalman filter is made up of two sets of equations. First, we have the prediction
equations that give us the one-step ahead predictions of the unobserved components:

$$a_{t/t-1} = E[\alpha_t \,|\, Y_{t-1}] = T_t a_{t-1} + c_t$$

where $a_{t-1} = E[\alpha_{t-1} \,|\, Y_{t-1}]$. The one-step ahead MSE matrices of the components are
given by

$$P_{t/t-1} = E\left[(\alpha_t - a_{t/t-1})(\alpha_t - a_{t/t-1})' \,|\, Y_{t-1}\right] = T_t P_{t-1} T_t' + Q_t$$

where $P_{t-1} = E\left[(\alpha_{t-1} - a_{t-1})(\alpha_{t-1} - a_{t-1})' \,|\, Y_{t-1}\right]$ is the MSE matrix of $a_{t-1}$. Once we have
these one-step ahead estimates of the state, we can also obtain the one-step ahead
estimates of $y_t$ and the corresponding prediction errors and their MSEs as follows:

$$\tilde{y}_{t/t-1} = E[y_t \,|\, Y_{t-1}] = Z_t a_{t/t-1} + d_t$$
$$\nu_t = y_t - \tilde{y}_{t/t-1} = y_t - (Z_t a_{t/t-1} + d_t)$$
$$F_t = E(\nu^2_t) = Z_t P_{t/t-1} Z_t' + H_t$$
The one-step ahead estimates of the state, $a_{t/t-1}$, can be updated using the new
information provided by the observation $y_t$. The resulting equations are known as the
updating equations. These equations can be easily derived using the properties of the
multivariate normal distribution. In particular, consider the distribution of $\alpha_t$ and $y_t$
conditional on past information up to and including time $t-1$. The conditional means
and variances of $\alpha_t$ and $y_t$ have been derived above. The conditional covariance
between both variables can be easily derived taking into account that $y_t$ can be written
as

$$y_t = Z_t a_{t/t-1} + d_t + Z_t(\alpha_t - a_{t/t-1}) + \varepsilon_t$$

and, therefore,

$$Cov(y_t, \alpha_t \,|\, Y_{t-1}) = E\left[(\alpha_t - a_{t/t-1})(y_t - Z_t a_{t/t-1} - d_t)' \,|\, Y_{t-1}\right] = P_{t/t-1} Z_t'$$
Consequently, the required conditional distribution is given by

$$\begin{pmatrix} \alpha_t \\ y_t \end{pmatrix} \Bigg|\, Y_{t-1} \sim
N\left( \begin{pmatrix} a_{t/t-1} \\ Z_t a_{t/t-1} + d_t \end{pmatrix},
\begin{pmatrix} P_{t/t-1} & P_{t/t-1} Z_t' \\ Z_t P_{t/t-1} & F_t \end{pmatrix} \right)$$

from which we can see that the updating equations are given by

$$a_t = E[\alpha_t \,|\, Y_t] = a_{t/t-1} + P_{t/t-1} Z_t' F_t^{-1} \nu_t$$
$$P_t = E\left[(\alpha_t - a_t)(\alpha_t - a_t)' \,|\, Y_t\right] = P_{t/t-1} - P_{t/t-1} Z_t' F_t^{-1} Z_t P_{t/t-1}$$
The prediction error plays a central role in updating the estimates: the more the prediction
deviates from its realized value, the bigger the change made to the estimate of the state.
If the model is Gaussian then, given the initial conditions $a_0$ and $P_0$, the Kalman filter
delivers the conditional mean of the state, which is the minimum MSE estimator of the
state, as each new observation becomes available. When the disturbances are not
normally distributed, it is no longer true, in general, that the Kalman filter yields the
conditional mean of the state vector. In that case, the estimates are
minimum MSE estimators only within the class of linear estimators.
It is important to note that, in time-invariant models, the observations $y_t$ do not affect
the MSE matrices $P_{t/t-1}$ and $P_t$. Therefore, these matrices are both conditional and
unconditional MSE matrices.
Consider, for example, the random walk plus noise model with known $\sigma^2_\varepsilon$ and $\sigma^2_\eta$. To
initialize the filter, we need initial values $a_0$ and $P_0$. One alternative is to use what is
known as a diffuse prior distribution, which in this case is given by $m_0 = 0$ and
$P_0 = \infty$, where $m_0 = E(\mu_0)$. This says that nothing is known about the initial state.
Then, using the prediction equations, we obtain

$$m_{1/0} = m_0 = 0$$
$$P_{1/0} = P_0 + \sigma^2_\eta$$

We can update this estimate of the underlying level at time 1 by using the information
contained in $y_1$. Then, using the updating equations of the Kalman filter, we obtain

$$m_1 = m_{1/0} + \frac{P_{1/0}}{F_1}(y_1 - m_{1/0}) = \frac{P_0 + \sigma^2_\eta}{P_0 + \sigma^2_\eta + \sigma^2_\varepsilon}\, y_1$$

$$P_1 = P_{1/0} - P_{1/0}^2 F_1^{-1} = (P_0 + \sigma^2_\eta)\left(1 - \frac{P_0 + \sigma^2_\eta}{P_0 + \sigma^2_\eta + \sigma^2_\varepsilon}\right)$$

so that, letting $P_0 \to \infty$, $m_1 = y_1$ and $P_1 = \sigma^2_\varepsilon$.
Then, we can keep applying the prediction and updating equations recursively:

$$m_{2/1} = m_1$$
$$P_{2/1} = P_1 + \sigma^2_\eta$$

and

$$m_2 = m_{2/1} + \frac{P_{2/1}}{F_2}(y_2 - m_{2/1})$$
$$P_2 = P_{2/1} - P_{2/1}^2 F_2^{-1} = P_{2/1}\left(1 - \frac{P_{2/1}}{P_{2/1} + \sigma^2_\varepsilon}\right)$$

Note that initializing with a diffuse prior is equivalent to using the first observation as
an initial value at time $t = 1$.
If the state is generated by a stationary process, the initial conditions for the Kalman
filter are given by its marginal mean and variance.
Summarizing, if the system matrices are known at time $t-1$, the Kalman filter
yields optimal:
i) One-step ahead estimates of the unobserved components: $a_{t/t-1}$.
ii) Updated estimates of the unobserved components: $a_t$.
iii) One-step ahead prediction errors of $y_t$ and their variances: $\nu_t$ and $F_t$.
Consider, for example, the following series generated by a random walk plus noise
model with parameters $\sigma^2_\varepsilon = 1$ and $\sigma^2_\eta = 0.49$ (in red). The one-step ahead estimates of the
underlying level appear in blue.

[Figure: simulated series (Y) and one-step ahead filtered estimates of the level (YFILTERED).]

2.4 Smoothed estimation of the unobserved components
There are also other filters, known as smoothing algorithms, that give estimates of the
components based on the information contained in the whole sample. The fixed-interval
smoothing algorithm consists of a set of recursions which start with the final quantities
$a_T$ and $P_T$ given by the Kalman filter and work backwards. The smoothed estimate of
$\alpha_t$ is given by

$$a_{t/T} = E[\alpha_t \,|\, Y_T] = a_t + P^*_t\,(a_{t+1/T} - T_{t+1} a_t - c_{t+1})$$
$$P_{t/T} = P_t + P^*_t\,(P_{t+1/T} - P_{t+1/t})\,P^{*\prime}_t$$

where

$$P^*_t = P_t\, T_{t+1}'\, P_{t+1/t}^{-1}$$
Given that the smoothed estimate of $\alpha_t$ is based on more information than the filtered
estimates, its MSE, $P_{t/T}$, is, in general, smaller than that of the filtered estimator.
These smoothers are very useful because they also provide what are known as the
auxiliary residuals, which are estimates of the disturbances associated with each of the
different components of the model. These auxiliary residuals can be used to identify
outliers that affect different components (Harvey and Koopman, 1992) or to identify
whether the components of a given series are conditionally heteroscedastic (Broto and
Ruiz, 2005a,b). Expressions for the auxiliary residuals have been derived by Durbin and
Koopman (2001).
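For the local level model, the fixed-interval smoother reduces to a simple backward recursion. A sketch that reuses the output of the kalman_filter function above:

```python
import numpy as np

# Fixed-interval smoother for the local level model, working backwards from
# the final filtered quantities.  Here T_{t+1} = 1, so P_t* = P_t / P_{t+1/t}.
def kalman_smoother(m_pred, P_pred, m_upd, P_upd):
    T = len(m_upd)
    m_s, P_s = np.empty(T), np.empty(T)
    m_s[-1], P_s[-1] = m_upd[-1], P_upd[-1]     # start from a_T and P_T
    for t in range(T - 2, -1, -1):
        Pstar = P_upd[t] / P_pred[t + 1]
        m_s[t] = m_upd[t] + Pstar * (m_s[t + 1] - m_pred[t + 1])
        P_s[t] = P_upd[t] + Pstar**2 * (P_s[t + 1] - P_pred[t + 1])
    return m_s, P_s

m_smooth, P_smooth = kalman_smoother(m_pred, P_pred, m_upd, P_upd)
```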

The following figure represents the smoothed estimates of the underlying level together
with the one-step ahead estimates for the same series considered above.

[Figure: one-step ahead filtered (YFILTERED) and smoothed (YSMOOTHED) estimates of the level.]
2.5 Prediction
Once we reach $t = T$, we can run the prediction equations to obtain forecasts of future
values and their MSEs. For a time-invariant model with $c_t = 0$ and $d_t = 0$,

$$\tilde{y}_{T+k/T} = E[y_{T+k} \,|\, Y_T] = Z\, T^k a_T$$
$$E\left[(y_{T+k} - \tilde{y}_{T+k/T})(y_{T+k} - \tilde{y}_{T+k/T})' \,|\, Y_T\right] = Z\left( T^k P_T\, T^{k\prime} + \sum_{j=0}^{k-1} T^j Q\, T^{j\prime} \right) Z' + H$$

For example, in the random walk plus noise model, the forecast of the underlying level
and its MSE are

$$\tilde{m}_{T+k/T} = m_T, \qquad P_{T+k/T} = P_T + k\, \sigma^2_\eta$$

2.6 Estimation of the parameters
Up to now, we have assumed that the parameters of the model are known. However, in
practice they are unknown and have to be estimated from the available data. If the
model is conditionally Gaussian, the parameters can be estimated by Maximum
Likelihood (ML). Remember that the Kalman filter provides the innovations (one-step
ahead errors) and their variances.
The likelihood function can be written as follows:

$$L = \prod_{t=1}^{T} p(y_t \,|\, Y_{t-1})$$

The conditional distribution of $y_t$ can be easily derived by writing

$$y_t = Z_t a_{t/t-1} + d_t + Z_t(\alpha_t - a_{t/t-1}) + \varepsilon_t$$

Then, if $(\alpha_t - a_{t/t-1})$ and $\varepsilon_t$ are conditionally normal,

$$y_t \,|\, Y_{t-1} \sim N(Z_t a_{t/t-1} + d_t,\; F_t)$$
and the log-likelihood function can be written down immediately as

$$\log L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T} \log|F_t| - \frac{1}{2}\sum_{t=1}^{T} \frac{\nu^2_t}{F_t}$$

This expression is known as the prediction error decomposition form of the likelihood.
The parameters are estimated by maximizing the likelihood function numerically. The
asymptotic properties of the ML estimator are the usual ones as long as the parameters
lie in the interior of the parameter space. However, in many models of interest, the
parameters are variances, and it is of interest to know whether they are zero (i.e.
whether we have deterministic components). In this case, the asymptotic distribution
could still be related to the Normal, but it has to be modified to take account of the
boundary; see Harvey (1989).
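As an illustration, the prediction error decomposition can be maximized with a general-purpose numerical optimizer. The following sketch estimates the two variances of the local level model, reusing the kalman_filter function above; parametrizing the variances in logs is our own choice to keep them positive:

```python
import numpy as np
from scipy.optimize import minimize

# Gaussian log-likelihood of the local level model via the prediction error
# decomposition.  The first observations could be dropped to reduce the
# influence of the (approximately) diffuse initialisation.
def neg_loglik(params, y):
    sigma2_eps, sigma2_eta = np.exp(params)     # unrestricted -> positive
    *_, nu, F = kalman_filter(y, sigma2_eps, sigma2_eta)
    return 0.5 * np.sum(np.log(2 * np.pi * F) + nu**2 / F)

# y as generated in the filtering example above
res = minimize(neg_loglik, x0=np.log([0.5, 0.5]), args=(y,), method="BFGS")
sigma2_eps_hat, sigma2_eta_hat = np.exp(res.x)
```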
If the model is not conditionally Gaussian, then by maximizing the Gaussian log-
likelihood we obtain what is known as the Quasi-Maximum Likelihood (QML)
estimator. In this case, the estimator loses its efficiency. Alternatives based on the true
likelihood are more efficient but, when they can be defined, are more complicated from
the computational point of view. Furthermore, dropping the Normality assumption tends
to affect the asymptotic distribution of all the model parameters. In this case, the
asymptotic distribution is given by

$$\sqrt{T}(\hat{\theta} - \theta) \to N\left(0,\; J^{-1} I J^{-1}\right)$$

where $J = -E\left[\dfrac{\partial^2 \log L}{\partial \theta\, \partial \theta'}\right]$ and
$I = E\left[\dfrac{\partial \log L}{\partial \theta} \dfrac{\partial \log L}{\partial \theta'}\right]$, and the expectations are taken
with respect to the true distribution; see Gourieroux (1997).
Once the parameters are estimated, the Kalman filter is run again with the parameters
fixed at their estimated values to yield one-step ahead and updated estimates of the
unknown states, $a_{t/t-1}$ and $a_t$ respectively, and the smoother is run to yield estimates based
on the whole sample, $a_{t/T}$. As a by-product, we also obtain several residuals:

a) Standardized residuals: $\tilde{\nu}_t = \dfrac{\nu_t}{\sqrt{F_t}}$. Standard tests for Normality,
heteroscedasticity and serial correlation can be applied to them.

b) Auxiliary residuals:

$$\hat{\varepsilon}_t = y_t - Z_t a_{t/T} - d_t$$
$$\hat{\eta}_t = a_{t/T} - T_t a_{t-1/T} - c_t$$
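Continuing the running local level example, where $Z_t = T_t = 1$ and $d_t = c_t = 0$, these residuals are obtained directly from the filter and smoother output (a sketch building on the code above):

```python
# Standardised residuals: apply standard tests for Normality,
# heteroscedasticity and serial correlation to nu_tilde.
nu_tilde = nu / np.sqrt(F)

# Auxiliary residuals: estimates of eps_t and eta_t based on the smoothed level
eps_aux = y - m_smooth                  # y_t - Z_t a_{t/T}
eta_aux = m_smooth[1:] - m_smooth[:-1]  # a_{t/T} - T_t a_{t-1/T}
```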

References
Broto, C. and E. Ruiz (2005), Unobserved component models with asymmetric
conditional variances, Computational Statistics and Data Analysis, forthcoming.

Harvey, A.C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter,
Cambridge: Cambridge University Press. Chapter 4.

Harvey, A.C. (1993), Time Series Models, 2nd ed., Harvester Wheatsheaf, London.
Chapters 4 and 5.

Harvey, A.C. and S.J. Koopman (1992), Diagnostic checking of unobserved-components
time series models, Journal of Business & Economic Statistics, 10, 377-389.

Koopman, S.J. (1993), Disturbance smoother for state space models, Biometrika, 80,
117-126.

Koopman, S.J., A.C. Harvey, J.A. Doornik and N. Shephard (2000), STAMP: Structural
Time Series Analyser, Modeller and Predictor, Timberlake Consultants Press.

Wells, C. (1996), The Kalman Filter in Finance, Dordrecht: Kluwer Academic Publishers.

Exercises
1. (a) Consider an AR(1) model in which the first observation is fixed. Write down
the likelihood function when the observation at time $\tau$ is missing. (b) Given a value
of the AR parameter, $\phi$, show that the estimator of the missing observation
obtained by smoothing is

$$\tilde{y}_{\tau/T} = \frac{\phi\,(y_{\tau-1} + y_{\tau+1})}{1 + \phi^2}$$
2. Consider a random walk plus noise model. If $\sigma^2_\eta = 0$, show that running the
Kalman filter initialised with a diffuse prior yields an estimator of $\mu_t$ equal to
the mean of the first $t$ observations. Show that the variance of this estimator is
calculated by the Kalman filter to be $\dfrac{\sigma^2_\varepsilon}{t}$.
3. Using the Kalman filter, obtain estimates of the underlying level of the IBEX35.
Derive the reduced form model and check whether it is in concordance
with the model fitted by analysing the correlogram of $\Delta y_t$.
4. Obtain the reduced form of the local linear trend model given by

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t$$
$$\beta_t = \beta_{t-1} + \xi_t$$

where $\varepsilon_t$, $\eta_t$ and $\xi_t$ are mutually uncorrelated white noise processes with variances
$\sigma^2_\varepsilon$, $\sigma^2_\eta$ and $\sigma^2_\xi$, respectively.
