INTRODUCTORY FINANCIAL ECONOMETRICS
Review of Econometric Theory
3 Credits, 51 Hours

Jianhua Gang
School of Finance
Renmin University of China
Spring 2013

REVIEW TOPIC 1: SIMPLE REGRESSION

Readings: Wooldridge, Ch.2

REGRESSION ANALYSIS

Regression analysis involves the estimation and evaluation of the relationship between a variable of interest (dependent variable, explained variable, regressand) and one or more other variables (independent variables, explanatory variables, regressors).
What is estimation, prediction (forecast), the fitting?

CLASSICAL NORMAL SIMPLE REGRESSION MODEL

Generalized idea of a random sample of n independently and identically distributed (i.i.d.) observations from $N(\mu, \sigma^2)$.

Have sample of n independent observations $y_1, ..., y_n$, each of which is normally distributed with variance $\sigma^2$, but conditional mean governed by
$$E(y_i) = \alpha + \beta x_i, \quad i = 1, ..., n,$$
where,
1. $\alpha$ and $\beta$ are termed regression parameters/regression coefficients.
2. The term $x_i$ varies with i, but is not random (nonstochastic, fixed in repeated sampling).
What is sampling?

If $u_i = y_i - (\alpha + \beta x_i)$ denotes the error (or disturbance term), then write simple regression model as:
$$y_i = \alpha + \beta x_i + u_i, \quad u_i \sim NID(0, \sigma^2), \quad i = 1, ..., n. \tag{1}$$

If we regard $\alpha + \beta x_i$ as the equation of a straight line, then
1. the intercept $\alpha$ is the mean of y when $x_i$ equals zero;
2. the slope $\beta$ is the change in the mean of y when $x_i$ increases by one unit.
(The interpretation of the intercept is not always sensible in economic applications.)

The assumption that the regressor x is nonstochastic is inappropriate in many applications in economics and it is relaxed later.
More useful to think of the classical assumption as being appropriate when we condition on the values of $x_1, ..., x_n$. Thus, conditional upon the values of $x_1, ..., x_n$, the $y_i$ are independent normal variables with means $\alpha + \beta x_i$ and common constant variance $\sigma^2$ for $i = 1, ..., n$.

ESTIMATION OF PARAMETERS

The following general approaches to estimate $\alpha$, $\beta$ and $\sigma^2$ are considered: method of moments (MM); ordinary least squares (OLS); and maximum likelihood estimation (MLE).
These slides do not contain full mathematical details.

METHOD OF MOMENTS ESTIMATION

Population moment conditions (assumptions provided before as in (1)):
$$E(u_i) = 0, \qquad E(x_i u_i) = 0, \qquad E(u_i^2 - \sigma^2) = 0.$$
Let the MM estimator of $\alpha$ and $\beta$ be $\hat\alpha$ and $\hat\beta$, with associated residuals $\hat u_i = y_i - (\hat\alpha + \hat\beta x_i)$, $i = 1, ..., n$.

Obtain MM: solving the derived equations (replacing $E(\cdot)$ by $n^{-1}\sum(\cdot)$, and $u_i$ by $\hat u_i$), the equations are:
$$\sum_i \hat u_i = \sum_i [y_i - (\hat\alpha + \hat\beta x_i)] = 0,$$
$$\sum_i x_i \hat u_i = \sum_i x_i [y_i - (\hat\alpha + \hat\beta x_i)] = 0,$$
$$n^{-1}\sum_i \hat u_i^2 - \hat\sigma^2 = 0.$$
It can be proved that under weak conditions, MM estimators are consistent and asymptotically normally distributed.

ORDINARY LEAST SQUARES ESTIMATION (OLS)

Choose estimates $\hat\alpha$ and $\hat\beta$ to get "best fit" in the sense of minimizing
$$S(\alpha, \beta) = \sum_i [y_i - (\alpha + \beta x_i)]^2.$$
First order conditions (the F.O.C.s) are,
$$\frac{\partial S(\hat\alpha, \hat\beta)}{\partial \alpha} = 0, \qquad \frac{\partial S(\hat\alpha, \hat\beta)}{\partial \beta} = 0.$$
Ignoring an irrelevant factor of $-2$, these equations are,
$$\sum_i [y_i - (\hat\alpha + \hat\beta x_i)] = \sum_i \hat u_i = 0 \tag{2}$$
$$\sum_i x_i [y_i - (\hat\alpha + \hat\beta x_i)] = \sum_i x_i \hat u_i = 0 \tag{3}$$
Equations (2) and (3) are called the normal equations ($\hat u_i$ is an OLS residual).
It is clear that the normal equations imply that the OLS estimates of $\alpha$ and $\beta$ are equal to the corresponding MM estimates obtained previously.

The solutions $\hat\alpha$ and $\hat\beta$ which minimize the objective function $S(\alpha, \beta)$ are,
$$\hat\beta = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}, \qquad \hat\alpha = \bar y - \hat\beta \bar x,$$
where a bar denotes a sample average, e.g. $\bar x = n^{-1}\sum_i x_i$.
We have to postpone discussion of the estimation of $\sigma^2$ until later.
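The closed-form OLS solution for the simple regression can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values and the helper name `ols_simple` are made up for the example and do not come from the slides. The sketch computes the slope as the ratio of the cross-product sum to the sum of squares of x about its mean, the intercept from the sample means, and then checks numerically that the residuals satisfy the two normal equations.

```python
# Illustrative data (made up for this sketch, not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for the OLS regression of y on an intercept and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx                # slope: S_xy / S_xx
    alpha_hat = y_bar - beta_hat * x_bar  # intercept: y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


alpha_hat, beta_hat = ols_simple(x, y)
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]

# Normal equations: both sums should be zero up to rounding error.
sum_u = sum(residuals)
sum_xu = sum(xi * ui for xi, ui in zip(x, residuals))
print(alpha_hat, beta_hat, sum_u, sum_xu)
```

Because the MM estimators solve the same two equations, the same function also returns the MM estimates of the intercept and slope.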

MAXIMUM LIKELIHOOD ESTIMATION

Because $y_i \sim N(\alpha + \beta x_i, \sigma^2)$, $i = 1, ..., n$, so that
$$f(y_i) = (2\pi\sigma^2)^{-1/2} \exp\{-[y_i - (\alpha + \beta x_i)]^2 / 2\sigma^2\}, \quad \forall i.$$
We already assume that $y_1, ..., y_n$ are independent, so
$$f(y_1, ..., y_n) = \prod_i f(y_i) = L.$$
The log-likelihood is, therefore,
$$l(\alpha, \beta, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{\sum_i [y_i - (\alpha + \beta x_i)]^2}{2\sigma^2}.$$
The MLE of $\alpha$ and $\beta$ must minimize $\sum_i [y_i - (\alpha + \beta x_i)]^2$ and so equals OLS. The MLE of $\sigma^2$ is
$$\hat\sigma^2 = n^{-1}\sum_i \hat u_i^2 = \text{MM estimate}.$$

OLS DECOMPOSITION OF SUM OF SQUARES

Let $\hat y_i = (\hat\alpha + \hat\beta x_i)$ denote a typical OLS predicted value, then the normal equations for OLS yield several results.
$$\sum_i y_i = \sum_i (\hat y_i + \hat u_i) = \sum_i \hat y_i + \sum_i \hat u_i = \sum_i \hat y_i$$
$$\sum_i \hat y_i \hat u_i = \sum_i (\hat\alpha + \hat\beta x_i)\hat u_i = \hat\alpha \sum_i \hat u_i + \hat\beta \sum_i x_i \hat u_i = 0$$
$$\sum_i y_i^2 = \sum_i (\hat y_i + \hat u_i)^2 = \sum_i \hat y_i^2 + \sum_i \hat u_i^2 + 0$$
or put this in another way,
$$\sum_i (y_i - n^{-1}\textstyle\sum_i y_i)^2 = \sum_i (\hat y_i - n^{-1}\textstyle\sum_i \hat y_i)^2 + \sum_i \hat u_i^2.$$
Note sums of squares are measured about sample averages.

GOODNESS OF FIT

$$\sum_i (y_i - n^{-1}\textstyle\sum_i y_i)^2 = \sum_i (\hat y_i - n^{-1}\textstyle\sum_i \hat y_i)^2 + \sum_i \hat u_i^2$$
Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Residual Sum of Squares (RSS)
Coefficient of determination $R^2$ is index of goodness of fit of OLS line with
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}, \quad 0 \le R^2 \le 1.$$
$R^2 = r_{XY}^2$, where $r_{XY} = \hat\rho_{XY}$ (sample correlation coefficient between x and y).
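The sum-of-squares decomposition and the two equivalent expressions for the coefficient of determination can be checked numerically. The following is a minimal sketch on made-up data (the numbers and variable names are assumptions for illustration only); it fits the simple regression, forms TSS, ESS and RSS about the sample averages, and compares the coefficient of determination with the squared sample correlation between x and y.

```python
import math

# Illustrative data (made up for this sketch, not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.8, 4.1, 5.9, 8.3, 9.6, 12.2]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates for the simple regression.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

y_hat = [alpha_hat + beta_hat * xi for xi in x]  # predicted values
u_hat = [yi - fi for yi, fi in zip(y, y_hat)]    # residuals

# Sums of squares, each measured about the relevant sample average.
y_hat_bar = sum(y_hat) / n
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((fi - y_hat_bar) ** 2 for fi in y_hat)
rss = sum(ui ** 2 for ui in u_hat)

r_squared = ess / tss
r_xy = s_xy / math.sqrt(s_xx * tss)  # sample correlation between x and y
print(tss, ess + rss, r_squared, r_xy ** 2)
```

The first two printed values should agree (TSS = ESS + RSS), as should the last two.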

SAMPLING PROPERTIES OF OLS ESTIMATORS

Best linear unbiased estimator (BLUE) of $\alpha$ and $\beta$, even when errors $u_i$ are not normally distributed.
Consistent and asymptotically efficient (MLE).

SAMPLING DISTRIBUTION OF OLS ESTIMATORS

For the classical normal simple regression model, $\hat\alpha$ and $\hat\beta$ are jointly normally distributed with
$$E(\hat\alpha) = \alpha, \qquad E(\hat\beta) = \beta,$$
$$Var(\hat\beta) = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2},$$
$$Var(\hat\alpha) = \frac{\sigma^2}{n} + \bar x^2\, Var(\hat\beta),$$
$$Cov(\hat\alpha, \hat\beta) = -\bar x\, Var(\hat\beta).$$

The OLS estimator of the regression parameters can be written as
$$\hat\alpha = \alpha + \sum_i w_i u_i, \qquad \hat\beta = \beta + \sum_i z_i u_i,$$
where the nonstochastic terms $w_i$ and $z_i$ depend upon the regressor values, e.g.
$$z_i = (x_i - \bar x) \Big/ \sum_j (x_j - \bar x)^2.$$

ESTIMATION OF SIGMA-SQUARE

It can be shown that, in classical normal simple regression model,
$$\sum_i \hat u_i^2 = RSS \sim \sigma^2 \chi^2(n - 2)$$
is independent of $\hat\alpha$ and $\hat\beta$.
Note $(n - 2)$ is the number of observations minus the number of regression parameters estimated to derive the residuals and is called the degrees of freedom parameter for the regression.
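These sampling results can be illustrated by a small Monte Carlo experiment. The sketch below is not part of the slides: the parameter values, the fixed regressor values, the seed and the number of replications are all arbitrary assumptions. Holding x fixed across replications (fixed in repeated sampling), it draws normal errors, re-estimates the regression each time, and compares the simulated mean and variance of the slope estimator, and the simulated mean of RSS, with the theoretical values stated above.

```python
import random

random.seed(12345)  # arbitrary seed, for reproducibility only

# Assumed "true" parameter values for the simulation.
alpha, beta, sigma = 1.0, 0.5, 1.0
n, reps = 20, 4000

# Regressor values are held fixed across replications.
x = [float(i) for i in range(n)]
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

beta_draws, rss_draws = [], []
for _ in range(reps):
    u = [random.gauss(0.0, sigma) for _ in range(n)]
    y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]
    y_bar = sum(y) / n
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    beta_draws.append(b)
    rss_draws.append(rss)

mean_beta = sum(beta_draws) / reps
var_beta = sum((b - mean_beta) ** 2 for b in beta_draws) / reps
mean_rss = sum(rss_draws) / reps

theory_var_beta = sigma ** 2 / s_xx  # sigma^2 / sum (x_i - x_bar)^2
theory_mean_rss = sigma ** 2 * (n - 2)  # mean of sigma^2 * chi^2(n - 2)
print(mean_beta, var_beta, theory_var_beta, mean_rss, theory_mean_rss)
```

The simulated moments only approximate the theoretical ones; the gap shrinks as the number of replications grows.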

Hence,
$$E\Big(\sum_i \hat u_i^2\Big) = \sigma^2 (n - 2).$$
And so the newly-defined (sample) estimator
$$s^2 = \frac{\sum_i \hat u_i^2}{n - 2}$$
is unbiased. The ML estimator, however, $\hat\sigma^2 = [(n - 2)/n]\, s^2$, is biased (the bias matters mainly when the sample size is relatively small).

STATISTICAL INFERENCE: STOCHASTIC SPECIFICATION OF CLASSICAL MODEL

Study of statistical inference requires the specification of the probabilistic model for $y_1, ..., y_n$. We make the following assumptions.

A1 There exist observation invariant parameters $\alpha$ and $\beta$ such that $E(y_i) = \alpha + \beta x_i$ $\forall i$;
A2 The regressor x is nonrandom and satisfies $S_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2 > 0$ for $n > 1$. For the purpose of asymptotic theory, it is conventional to assume $0 < \lim_{n\to\infty} n^{-1} S_{xx} < \infty$;
A3 Let $u_i = y_i - E(y_i)$. Common variance (homoskedasticity): $var(u_i) = \sigma^2$ $\forall i$. If the $u_i$ do not have the same variance, have heteroskedasticity.
A4 Uncorrelated disturbances, so $E(u_i u_j) = 0$ if $i \ne j$. If have time series data and the assumption is false, then say have autocorrelation (or serial correlation).
A5 Normally distributed disturbances (so that A4 implies independence).

SAMPLING DISTRIBUTIONS FOR INFERENCE

$\hat\alpha$ and $\hat\beta$ are $N(\alpha, var(\hat\alpha))$ and $N(\beta, var(\hat\beta))$, respectively, so that
$$z(\hat\alpha) = (\hat\alpha - \alpha) \big/ \sqrt{var(\hat\alpha)} \sim N(0, 1)$$
$$z(\hat\beta) = (\hat\beta - \beta) \big/ \sqrt{var(\hat\beta)} \sim N(0, 1)$$
$RSS = \sum_i \hat u_i^2 \sim \sigma^2 \chi^2(n - 2)$ independently of $\hat\alpha$ and $\hat\beta$, so
$RSS/\sigma^2 \sim \chi^2(n - 2)$ independently of $z(\hat\alpha)$ and $z(\hat\beta)$, so
$$t(\hat\alpha) = \frac{z(\hat\alpha)}{\sqrt{RSS / [(n - 2)\sigma^2]}} \sim t(n - 2)$$
$$t(\hat\beta) = \frac{z(\hat\beta)}{\sqrt{RSS / [(n - 2)\sigma^2]}} \sim t(n - 2)$$

$RSS/(n - 2) = s^2$ so that, for example,
$$t(\hat\beta) = \frac{z(\hat\beta)}{\sqrt{s^2/\sigma^2}} = \frac{(\hat\beta - \beta)}{\sqrt{var(\hat\beta)\,(s^2/\sigma^2)}} \sim t(n - 2)$$
in which, under the square root in the denominator,
$$var(\hat\beta)\Big(\frac{s^2}{\sigma^2}\Big) = \Big(\frac{\sigma^2}{S_{XX}}\Big)\Big(\frac{s^2}{\sigma^2}\Big) = \frac{s^2}{S_{XX}}$$
is the estimator of $var(\hat\beta)$ and the square root of this quantity is called the estimated standard error, denoted by
$$SE(\hat\beta) = \sqrt{var(\hat\beta)\Big(\frac{s^2}{\sigma^2}\Big)} = \sqrt{\frac{s^2}{S_{XX}}} \approx \sqrt{var(\hat\beta)} \quad \text{(when n big)}.$$
Hence,
$$t(\hat\beta) = \frac{(\hat\beta - \beta)}{SE(\hat\beta)} \sim t(n - 2).$$
Similar for $t(\hat\alpha)$,
$$t(\hat\alpha) = \frac{(\hat\alpha - \alpha)}{SE(\hat\alpha)} \sim t(n - 2).$$

CONFIDENCE INTERVALS (C.I.s)

Let $d_1$ be such that
$$prob(-d_1 \le t(n - 2) \le d_1) = (1 - \epsilon),$$
where $\epsilon$ is the chosen significance level.
Then the $(1 - \epsilon) \times 100$ per cent confidence intervals (C.I.) for $\alpha$ and $\beta$ are given by,
$$\hat\alpha \pm d_1 SE(\hat\alpha) \quad \text{and} \quad \hat\beta \pm d_1 SE(\hat\beta),$$
respectively.

HYPOTHESIS TESTING: T STATISTIC

Consider the null hypothesis that restricts one of the regression parameters, e.g. $H_0: \beta = \beta_0$, where $\beta_0$ is some specified constant.
For whatever value of $\beta$,
$$t(\hat\beta) = \frac{(\hat\beta - \beta)}{SE(\hat\beta)} \sim t(n - 2),$$
and so if $H_0$ is true,
$$t_0(\hat\beta) = \frac{(\hat\beta - \beta_0)}{SE(\hat\beta)} \sim t(n - 2).$$
$t_0(\hat\beta)$ is termed the test statistic.
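The standard error, confidence interval and t statistic for the slope can be computed directly from the formulas above. The sketch below uses made-up data with n = 10, so the relevant distribution is t(8). The critical value 2.306 (the tabulated 97.5 per cent point of t(8), giving a 95 per cent two-sided interval) is typed in by hand, because the Python standard library has no t-distribution quantile function; all variable names and the choice of the null value zero are assumptions for illustration.

```python
import math

# Illustrative data (made up for this sketch, not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 9.0, 9.7, 11.2]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Unbiased estimator of the error variance: RSS / (n - 2).
rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
s2 = rss / (n - 2)

# Estimated standard error of the slope: sqrt(s^2 / S_xx).
se_beta = math.sqrt(s2 / s_xx)

# Test statistic for H0: beta = beta_0 (beta_0 = 0 is an assumed example).
beta_0 = 0.0
t0 = (beta_hat - beta_0) / se_beta

# Tabulated critical value for t(8), two-sided 5 per cent level (entered by hand).
d1 = 2.306
ci_beta = (beta_hat - d1 * se_beta, beta_hat + d1 * se_beta)
reject_h0 = abs(t0) > d1
print(beta_hat, se_beta, t0, ci_beta, reject_h0)
```

A two-sided test at this level rejects the null exactly when the hypothesized value lies outside the corresponding confidence interval.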

The critical/rejection region depends upon the nature of the alternative hypothesis and the prespecified significance level, denoted by $\epsilon$.
1. $H_1: \beta \ne \beta_0$: reject $H_0$ if $|t_0(\hat\beta)| > d_1$, where $prob(t(n - 2) > d_1) = \epsilon/2$
2. $H_1^+: \beta > \beta_0$: reject $H_0$ if $t_0(\hat\beta) > d_2$, where $prob(t(n - 2) > d_2) = \epsilon$
3. $H_1^-: \beta < \beta_0$: reject $H_0$ if $t_0(\hat\beta) < -d_2$, where $prob(t(n - 2) < -d_2) = \epsilon$
Just replace $\beta$ by $\alpha$ and $\hat\beta$ by $\hat\alpha$ in the above to obtain test procedures for $\alpha$ (the intercept).

RELAXING THE ASSUMPTION OF FIXED REGRESSORS

Suppose that x, like y, is a r.v. Consider the results above that can now be regarded as being derived, conditional upon the values $x_1, ..., x_n$.

1. $E(\hat\alpha | x_1, ..., x_n) = \alpha$, $E(\hat\beta | x_1, ..., x_n) = \beta$ and $E(s^2 | x_1, ..., x_n) = \sigma^2$. These expectations do not depend upon the x values and so OLS estimators are unconditionally unbiased. Similar remarks apply to probability limits;
2. $var(\hat\alpha | x_1, ..., x_n)$, $var(\hat\beta | x_1, ..., x_n)$ and $cov(\hat\alpha, \hat\beta | x_1, ..., x_n)$, as given above, do depend on the x values, and so do not correspond to unconditional characteristics.
Fortunately, 2 does not pose major problems for inference. The variables $(\hat\alpha - \alpha)/SE(\hat\alpha)$ and $(\hat\beta - \beta)/SE(\hat\beta)$ are, given x values, both still distributed as $t(n - 2)$. This distribution does not depend on x values, but just on the value of $(n - 2)$. Hence the t tests and confidence intervals described above are unconditionally valid.

It is, however, important to note,
1. It has been assumed that the errors $u_1, ..., u_n \sim NID(0, \sigma^2)$ whether or not we condition on the x values, i.e. the regressor values and error terms are statistically independent.
2. Assumptions in 1 can be weakened but we cannot expect to get results that are exact, i.e. valid for finite sample sizes, and often have to resort to asymptotically valid results in practical situations.

PRESENTATION OF RESULTS (EARNINGS ON SCHOOLING)

PREDICTION

Suppose wish to make predictions for period f, $f > n$ (the sample size), with $x_f$ known and assuming the data generation process for y is unchanged so that,
$$y_f = \alpha + \beta x_f + u_f, \quad u_f \sim N(0, \sigma^2).$$
Prediction of $E(y_f)$: use the predictor $\hat y_f = \hat\alpha + \hat\beta x_f$, where the OLS estimators use the data for $i = 1, ..., n$. This predictor is BLUE for $E(y_f) = \alpha + \beta x_f$.
The predictor $\hat y_f$ is a linear combination of the OLS estimators and so is normally distributed.
The variance of $\hat y_f$ can be estimated, and confidence intervals and tests of hypotheses are feasible.

Prediction of $y_f$: use same predictor which implies a forecast error of
$$(y_f - \hat y_f) = u_f - \big[(\hat\alpha - \alpha) + (\hat\beta - \beta) x_f\big],$$
which has zero expectation, given OLS unbiased and $E(u_f) = 0$.
The forecast error is normally distributed, being a linear combination of three normal variates, and has a variance that can be estimated. Confidence intervals and tests of hypotheses, e.g. $H_0: E(y_f - \hat y_f) = 0$, are feasible.
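The variances mentioned for the simple-regression predictor are not written out in the slides; a short derivation, sketched here using the sampling variances and covariance of the OLS estimators given earlier for the simple regression, and assuming in addition that $u_f$ is independent of the sample disturbances $u_1, ..., u_n$ (as under NID errors), runs as follows.

```latex
\begin{align*}
Var(\hat y_f) &= Var(\hat\alpha) + x_f^2\, Var(\hat\beta) + 2 x_f\, Cov(\hat\alpha, \hat\beta) \\
              &= \frac{\sigma^2}{n} + \bar x^2\, Var(\hat\beta) + x_f^2\, Var(\hat\beta) - 2 x_f \bar x\, Var(\hat\beta) \\
              &= \frac{\sigma^2}{n} + (x_f - \bar x)^2\, Var(\hat\beta)
               = \sigma^2 \left[ \frac{1}{n} + \frac{(x_f - \bar x)^2}{S_{XX}} \right], \\
Var(y_f - \hat y_f) &= Var(u_f) + Var(\hat y_f)
               = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x_f - \bar x)^2}{S_{XX}} \right].
\end{align*}
```

Replacing $\sigma^2$ by $s^2$ gives estimated variances, and both are smallest when $x_f$ equals the sample mean of the regressor.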

REVIEW TOPIC 2: MULTIPLE REGRESSION

READING
Wooldridge, Ch.3, 4

CLASSICAL MULTIPLE REGRESSION MODEL

Have sample of n independent observations $y_1, ..., y_n$, each of which is normally distributed with variance $\sigma^2$, but means vary according to
$$E(y_i) = \alpha + \beta_1 x_{1i} + ... + \beta_k x_{ki} = \alpha + \sum_j \beta_j x_{ji}, \quad i = 1, ..., n.$$
$\alpha$ and $\beta_j$ are parameters/coefficients.
Regressors $x_{ji}$ vary with i, but nonrandom (nonstochastic, i.e. fixed in repeated sampling).
$\alpha$ can be regarded as an intercept with $\alpha = E(y_i)$, given all $x_{ji} = 0$.
Slopes $\beta_j$ can often be regarded as partial derivatives: $\beta_j = \partial E(y_i) / \partial x_{ji}$.
Note: Regressor might be discrete or a nonlinear function of some other regressor; so that interpretations vary.

If $u_i = y_i - (\alpha + \sum_j \beta_j x_{ji})$ denotes the error or disturbance term, then write classical normal multiple regression model as:
$$y_i = \alpha + \sum_j \beta_j x_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2), \quad i = 1, ..., n,$$
where NID stands for normally and independently distributed.

STOCHASTIC SPECIFICATION OF CLASSICAL MODEL

The following assumptions are made in the classical normal regression model:
A1 There exist observation invariant parameters $\alpha$ and $\beta_j$, $j = 1, ..., k$ such that
$$E(y_i) = \alpha + \sum_j \beta_j x_{ji} \quad \forall i;$$
A2 The regressors $x_{ji}$ are nonrandom and satisfy
$$\sum_{i=1}^{n} (x_{ji} - \bar x_j)^2 > 0, \quad \bar x_j = n^{-1}\sum_i x_{ji},$$
where $n > 1$ and $j = 1, ..., k$. For the purpose of asymptotic theory, assume $0 < \lim_{n\to\infty} n^{-1}\sum_{i=1}^{n} (x_{ji} - \bar x_j)^2 < \infty$ for all j.

The following assumptions are made in the classical normal regression model (continued):
A3 Also need to assume that no regressor is just a linear combination of the other regressors and the intercept term.
A4 Common variance (homoskedasticity): $var(u_i) = \sigma^2$ $\forall i$. If the $u_i$ do not have the same variance, have heteroskedasticity.
A5 Uncorrelated disturbances, so $E(u_i u_j) = 0$ if $i \ne j$. If have time series data and the assumption is false, then say have autocorrelation/serial correlation.
A6 Normally distributed disturbances (so that A5 implies independence).

Assumption A2 is often too restrictive for economic applications in which some regressors are probably better regarded as random, rather than fixed in repeated sampling.
As in the case of the simple regression model, we can start by thinking about the conditional distribution of $y_i$, holding the values $x_{ji}$ ($i = 1, ..., n$; $j = 1, ..., k$) constant. Having derived results for the conditional model, we can see which of them will apply to the unconditional model for y.
For the former model, we have that, given the values of the regressors, the variates $y_i$ are independent with conditional distributions $N(\alpha + \sum_j x_{ji}\beta_j, \sigma^2)$ for $i = 1, ..., n$.

METHOD OF MOMENTS ESTIMATION

Have $E(u_i) = 0$ and $E(x_{ji} u_i) = 0$ for $j = 1, ..., k$.
Therefore, MM estimators, denoted by a hat, can be derived from
$$\sum_i \hat u_i = 0, \qquad \sum_i x_{ji} \hat u_i = 0$$
for $j = 1, ..., k$, where $\hat u_i$ is the residual $y_i - (\hat\alpha + \sum_j \hat\beta_j x_{ji})$, $i = 1, ..., n$.
The MM estimate of $\sigma^2$ can be derived from $E(u_i^2 - \sigma^2) = 0$; it is
$$\hat\sigma^2 = n^{-1}\sum_i \hat u_i^2.$$

ORDINARY LEAST SQUARES ESTIMATION

The OLS estimators are chosen to minimize,
$$S(\alpha, \beta_1, ..., \beta_k) = \sum_i \Big[ y_i - \Big( \alpha + \sum_j \beta_j x_{ji} \Big) \Big]^2.$$
The F.O.C.s yield the normal equations,
$$\sum_i \hat u_i = 0, \qquad \sum_i x_{ji} \hat u_i = 0$$
for $j = 1, ..., k$, where $\hat u_i$ is the OLS residual $y_i - (\hat\alpha + \sum_j \hat\beta_j x_{ji})$, $i = 1, ..., n$.
Hence the OLS estimators are equal to the MM estimators.

MAXIMUM LIKELIHOOD ESTIMATION

Using methods similar to those appropriate in the context of the simple regression model, it can be shown that the log likelihood function is given by,
$$l(\alpha, \beta_1, ..., \beta_k, \sigma^2) = -\Big(\frac{n}{2}\Big)\ln(2\pi\sigma^2) - \frac{S(\alpha, \beta_1, ..., \beta_k)}{2\sigma^2}.$$
The MLE of the regression parameters must minimize $S(\alpha, \beta_1, ..., \beta_k)$ and so OLSE = MLE.
The MLE of $\sigma^2$ is $RSS/n$, where $RSS = \sum_i \hat u_i^2$ is the OLS residual sum of squares.

SAMPLING PROPERTIES OF OLS ESTIMATORS

Best linear unbiased estimator (BLUE) of $\alpha$ and $\beta_j$, $j = 1, ..., k$, even when errors $u_i$ are not normally distributed.
Consistent and asymptotically efficient (MLE).

OLS DECOMPOSITION OF SUM OF SQUARES

Let $\hat y_i = \hat\alpha + \sum_j \hat\beta_j x_{ji}$ denote a typical OLS predicted value. The normal equations for OLS yield several results,
$$\sum_i y_i = \sum_i (\hat y_i + \hat u_i) = \sum_i \hat y_i + \sum_i \hat u_i = \sum_i \hat y_i$$
$$\sum_i \hat y_i \hat u_i = \sum_i \Big(\hat\alpha + \sum_j \hat\beta_j x_{ji}\Big)\hat u_i = \hat\alpha \sum_i \hat u_i + \sum_j \hat\beta_j \sum_i x_{ji} \hat u_i = 0$$
$$\sum_i y_i^2 = \sum_i (\hat y_i + \hat u_i)^2 = \sum_i \hat y_i^2 + \sum_i \hat u_i^2, \quad \text{given } 2\sum_i \hat y_i \hat u_i = 0$$
or put it another way,
$$\sum_i (y_i - n^{-1}\textstyle\sum_i y_i)^2 = \sum_i (\hat y_i - n^{-1}\textstyle\sum_i \hat y_i)^2 + \sum_i \hat u_i^2.$$
Note sums of squares are measured about sample averages.
Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Residual Sum of Squares (RSS)

GOODNESS OF FIT

Coefficient of determination $R^2$ is index of goodness of fit of OLS line with
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}, \quad 0 \le R^2 \le 1.$$
Some use degree-of-freedom adjusted $R^2$, denoted by $\bar R^2$, and defined by
$$\bar R^2 = 1 - \{[RSS/(n - k - 1)] / [TSS/(n - 1)]\}.$$
This index can be negative.
If add regressors to a model and re-estimate by OLS, $R^2$ cannot fall (it is monotonic in the number of parameters), but $\bar R^2$ can.

EXPRESSIONS FOR OLS ESTIMATORS

It can be shown that
$$\hat\alpha = \bar y - \sum_j \hat\beta_j \bar x_j,$$
with a typical slope estimator given by
$$\hat\beta_j = \frac{\sum_i \tilde x_{ji} y_i}{\sum_i \tilde x_{ji}^2},$$
where $\tilde x_{ji}$ is the ith residual from the OLS regression of the jth regressor on the other $(k - 1)$ regressors and the intercept term.
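The "partialling out" expression for a slope estimator can be verified numerically. The sketch below is an illustration under stated assumptions: it uses made-up data with k = 2 regressors, and the helper names `solve`, `multiple_ols` and `simple_ols` are hypothetical, written for this example with the standard library only. It estimates the full regression from the normal equations, then recomputes the first slope from the residuals of the auxiliary regression of the first regressor on the second regressor and an intercept.

```python
def solve(a, b):
    """Solve the small linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def multiple_ols(columns, y):
    """OLS of y on an intercept and the given regressor columns, via the normal equations."""
    design = [[1.0] * len(y)] + [list(col) for col in columns]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in design]
    return solve(xtx, xty)  # [alpha_hat, beta_1_hat, ..., beta_k_hat]


def simple_ols(x, y):
    """OLS of y on an intercept and a single regressor x; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope


# Illustrative data (made up for this sketch, not from the slides).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.5, 4.0, 3.0, 5.5, 5.0, 8.5, 7.0]
y = [3.4, 2.6, 7.5, 6.1, 11.0, 10.2, 15.9, 14.3]

alpha_hat, beta1_hat, beta2_hat = multiple_ols([x1, x2], y)

# Auxiliary regression of x1 on x2 and an intercept; keep its residuals.
a_aux, b_aux = simple_ols(x2, x1)
x1_tilde = [x1i - (a_aux + b_aux * x2i) for x1i, x2i in zip(x1, x2)]

# Slope for x1 from the partialled-out formula.
beta1_partial = sum(t * yi for t, yi in zip(x1_tilde, y)) / sum(t * t for t in x1_tilde)
print(beta1_hat, beta1_partial)
```

The two printed slopes should agree up to rounding error, and the fitted intercept equals the mean of y minus the slope-weighted means of the regressors.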

It can also be shown that
$$\hat\beta_j = \beta_j + \frac{\sum_i \tilde x_{ji} u_i}{\sum_i \tilde x_{ji}^2} = \beta_j + \frac{\sum_i \tilde x_{ji} u_i}{RSS_j},$$
where $RSS_j$ is the residual sum of squares from the OLS estimation of the auxiliary regression of the jth regressor on the other $(k - 1)$ regressors and the intercept term.

SAMPLING DISTRIBUTION OF OLS ESTIMATORS

For the classical normal multiple regression model, $\hat\alpha \sim N(\alpha, var(\hat\alpha))$.
Since the OLS estimators of the slope parameters can be written as $\hat\beta_j = \beta_j + \sum_i \tilde x_{ji} u_i / \sum_i \tilde x_{ji}^2 = \beta_j + \sum_i \tilde x_{ji} u_i / RSS_j$ and the disturbances $u_i$ are $NID(0, \sigma^2)$, they are all normally distributed with
$$E(\hat\beta_j) = \beta_j, \qquad var(\hat\beta_j) = \sigma^2 / RSS_j, \quad j = 1, ..., k.$$

ESTIMATION OF SIGMA-SQUARE

It can be shown that, in the classical normal multiple regression model,
$$\sum_i \hat u_i^2 = RSS \sim \sigma^2 \chi^2(n - k - 1)$$
independently of $\hat\alpha$ and $\hat\beta_j$, $\forall j$.
Note that $(n - k - 1)$ is the number of observations minus the number of regression parameters estimated to derive the residuals and is called the degrees of freedom parameter for the regression.
$E(\sum_i \hat u_i^2) = \sigma^2 (n - k - 1)$ and so the estimator $s^2 = (n - k - 1)^{-1} \sum_i \hat u_i^2$ is unbiased.
However, the ML estimator, $\hat\sigma^2 = [(n - k - 1)/n]\, s^2$, is biased (the bias matters mainly when the sample size is relatively small).

SAMPLING DISTRIBUTIONS FOR INFERENCE

$\hat\alpha$ and $\hat\beta_j$ are $N(\alpha, var(\hat\alpha))$ and $N(\beta_j, var(\hat\beta_j))$, respectively, so that
$$z(\hat\alpha) = (\hat\alpha - \alpha) \big/ \sqrt{var(\hat\alpha)} \sim N(0, 1)$$
$$z(\hat\beta_j) = (\hat\beta_j - \beta_j) \big/ \sqrt{var(\hat\beta_j)} \sim N(0, 1).$$
$RSS = \sum_i \hat u_i^2 \sim \sigma^2 \chi^2(n - k - 1)$ independently of $\hat\alpha$ and $\hat\beta_j$, so
$RSS/\sigma^2 \sim \chi^2(n - k - 1)$ independently of $z(\hat\alpha)$ and $z(\hat\beta_j)$, so
$$t(\hat\alpha) = z(\hat\alpha) \Big/ \sqrt{[RSS/(n - k - 1)]/\sigma^2} \sim t(n - k - 1)$$
$$t(\hat\beta_j) = z(\hat\beta_j) \Big/ \sqrt{[RSS/(n - k - 1)]/\sigma^2} \sim t(n - k - 1).$$

We know $RSS/(n - k - 1) = s^2$, so that, for example,
$$t(\hat\beta_j) = z(\hat\beta_j) \Big/ \sqrt{s^2/\sigma^2} = (\hat\beta_j - \beta_j) \Big/ \sqrt{var(\hat\beta_j)\,(s^2/\sigma^2)}.$$
$var(\hat\beta_j)(s^2/\sigma^2) = (\sigma^2/RSS_j)(s^2/\sigma^2) = s^2/RSS_j$, which is the estimator of $var(\hat\beta_j)$, and the square root of this quantity is called the (estimated) standard error, denoted by $SE(\hat\beta_j)$.
Hence,
$$t(\hat\beta_j) = (\hat\beta_j - \beta_j)/SE(\hat\beta_j) \sim t(n - k - 1),$$
similarly
$$t(\hat\alpha) = (\hat\alpha - \alpha)/SE(\hat\alpha) \sim t(n - k - 1).$$

CONFIDENCE INTERVALS

Let $d_1$ be such that $prob(-d_1 \le t(n - k - 1) \le d_1) = (1 - \epsilon)$, where $\epsilon$ is the chosen significance level.
The $(1 - \epsilon) \times 100$ per cent confidence intervals for $\alpha$ and $\beta_j$ are given by $\hat\alpha \pm d_1 SE(\hat\alpha)$ and $\hat\beta_j \pm d_1 SE(\hat\beta_j)$, respectively.

TEST OF HYPOTHESES USING T STATISTICS

Consider null hypothesis that restricts one of the regression parameters, e.g. $H_0: \beta_j = \beta_{j0}$ (some specified constant).
For whatever value of $\beta_j$, $t(\hat\beta_j) = (\hat\beta_j - \beta_j)/SE(\hat\beta_j) \sim t(n - k - 1)$, and so if $H_0$ is true $t_0(\hat\beta_j) = (\hat\beta_j - \beta_{j0})/SE(\hat\beta_j) \sim t(n - k - 1)$.
Then $t_0(\hat\beta_j)$ is the test statistic. The critical/rejection region depends upon the nature of the alternative hypothesis and the prespecified significance level, denoted by $\epsilon$.
1. $H_1: \beta_j \ne \beta_{j0}$: reject $H_0$ if $|t_0(\hat\beta_j)| > d_1$, where $prob(t(n - k - 1) > d_1) = \epsilon/2$
2. $H_1^+: \beta_j > \beta_{j0}$: reject $H_0$ if $t_0(\hat\beta_j) > d_2$, where $prob(t(n - k - 1) > d_2) = \epsilon$
3. $H_1^-: \beta_j < \beta_{j0}$: reject $H_0$ if $t_0(\hat\beta_j) < -d_2$, where $prob(t(n - k - 1) < -d_2) = \epsilon$
Just replace $\beta_j$ by $\alpha$ and $\hat\beta_j$ by $\hat\alpha$ in the above to obtain test procedures relevant to testing hypotheses concerning the intercept.

F TEST OF SEVERAL LINEAR RESTRICTIONS

EXAMPLE
Suppose that the null hypothesis to be tested is denoted by $H_0$ and consists of several linear restrictions on the parameters of the regression model. Thus $H_0$ specifies the values of, say, $q < (k + 1)$ linear combinations of the regression coefficients. For example, with $k = 4$ and $q = 3$, $H_0$ could consist of the following restrictions: $\alpha + \beta_1 = 0$; $\beta_2 = 1$; and $\beta_4 = 0$. We now need a joint test of all the restrictions of $H_0$, rather than a collection of separate t-tests.

Let $RSS(H_0)$ be the sum of squared residuals obtained under the restrictions of $H_0$. In the example above, $RSS(H_0)$ is derived by applying OLS to the restricted model:
$$(y_i - x_{2i}) = \beta_1 (x_{1i} - 1) + \beta_3 x_{3i} + u_i.$$
Let $RSS(H_1)$ be the RSS obtained by applying OLS to the unrestricted model. In the same example, $RSS(H_1)$ is derived by applying OLS to
$$y_i = \alpha + \sum_{j=1}^{4} \beta_j x_{ji} + u_i.$$

Definition
Define the F statistic by the following equation
$$F = \frac{RSS(H_0) - RSS(H_1)}{RSS(H_1)} \cdot \frac{df(H_1)}{q},$$
in which $df(H_1)$ is the degrees of freedom parameter for the unrestricted model, i.e. $df(H_1) = (n - k - 1)$.

If $H_0$ is true, then $F \sim F(q, df(H_1))$.

The null hypothesis is regarded as inconsistent with the data if the sample (observed) value of F is significantly large, i.e. the test is one-sided.

Prediction

Suppose we wish to make predictions for period $f$, $f > n$ ($n$ is the sample size), with the $x_{jf}$ known and it being assumed that the data generation process (DGP) for $y$ is unchanged, so that
$$y_f = \alpha + \sum_j \beta_j x_{jf} + u_f, \quad u_f \sim N(0, \sigma^2).$$

Prediction of $E(y_f)$: use the predictor $\hat y_f = \hat\alpha + \sum_j \hat\beta_j x_{jf}$, where the OLS estimators use the data for $i = 1, ..., n$. This predictor is BLUE for $E(y_f) = \alpha + \sum_j \beta_j x_{jf}$.

The predictor $\hat y_f$ is a linear combination of the OLS estimators and so is normally distributed. The variance of $\hat y_f$ can be estimated, and confidence intervals and tests of hypotheses are feasible.
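The point predictor and its estimated variance can be sketched as follows. The variance expressions used here, $s^2 x_f'(X'X)^{-1}x_f$ for the prediction of $E(y_f)$ and that quantity plus $s^2$ for the forecast of $y_f$ itself, are the standard textbook results and are not spelled out on the slides; the function name `ols_predict` is hypothetical.

```python
import numpy as np

def ols_predict(X, y, x_f):
    """Predict at x_f (which includes the leading 1 for the intercept).

    Returns (y_hat_f, var_mean, var_forecast):
      var_mean     estimates var(y_hat_f) as a predictor of E(y_f),
      var_forecast estimates the variance of the forecast error y_f - y_hat_f.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    s2 = float(resid @ resid) / (n - p)        # unbiased estimate of sigma^2
    y_hat_f = float(x_f @ coef)
    var_mean = s2 * float(x_f @ XtX_inv @ x_f)
    return y_hat_f, var_mean, var_mean + s2

# Tiny illustrative data set with an exact linear relation y = 1 + 2x.
X = np.column_stack([np.ones(4), np.array([0.0, 1.0, 2.0, 3.0])])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(ols_predict(X, y, np.array([1.0, 5.0])))
```

A confidence interval for $E(y_f)$ would then be $\hat y_f$ plus or minus a $t(n-k-1)$ critical value times the square root of `var_mean`.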

Prediction of $y_f$

Suppose we wish to make predictions for period $f$, $f > n$ ($n$ is the sample size), with the $x_{jf}$ known and it being assumed that the DGP for $y$ is unchanged, so that
$$y_f = \alpha + \sum_j \beta_j x_{jf} + u_f, \quad u_f \sim N(0, \sigma^2).$$

Prediction of $y_f$: use the same predictor, which implies a forecast error of
$$(y_f - \hat y_f) = u_f - \Big[ (\hat\alpha - \alpha) + \sum_j (\hat\beta_j - \beta_j) x_{jf} \Big],$$
which has zero expectation, given that OLS is unbiased and $E(u_f) = 0$.

The forecast error is normally distributed, being a linear combination of normal variates, and has a variance that can be estimated.

Confidence intervals and tests of hypotheses, e.g. of $H_0: E(y_f - \hat y_f) = 0$, are feasible.

Review Topic 3: Multicollinearity

Reading: Wooldridge, Ch. 3.


The information content of a sample available for the purpose of estimating the individual regression parameters depends, in part, upon the intercorrelations between the regressors.

Let $R_j^2$ denote the $R^2$ statistic from the OLS estimation of the auxiliary regression of the jth regressor on the other $(k-1)$ regressors and the intercept term. Since it has been assumed that no regressor is a linear combination of the other regressors and the intercept term, it follows that $R_j^2 < 1$ for all $j$.

If $R_j^2 = 1$ for some $j$, then we say that there is perfect multicollinearity. If $R_j^2$ is close to 1 for some $j$, then we have a high degree of multicollinearity.

It can be proved that
$$\mathrm{var}(\hat\beta_j) = \sigma^2 / RSS_j = \sigma^2 \Big/ \Big[ \sum_i (x_{ji} - \bar x_j)^2 \, (1 - R_j^2) \Big].$$

Thus, ceteris paribus, high degrees of multicollinearity lead to high values of sampling variances.

However, in practice, we cannot vary $R_j^2$ with $\sigma^2$ and $\sum_i (x_{ji} - \bar x_j)^2$ held constant. Variances may be small even when there is a high degree of multicollinearity, or large when the regressors are uncorrelated.

Note: imprecise estimators can lead to wide confidence intervals and weak tests of hypotheses.
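A minimal sketch of how the auxiliary $R_j^2$ statistics might be computed is given below. The function name `auxiliary_r2` and the small data set are illustrative assumptions; the ratio $1/(1 - R_j^2)$ printed at the end is the factor by which $\mathrm{var}(\hat\beta_j)$ is inflated in the formula above.

```python
import numpy as np

def auxiliary_r2(x):
    """R_j^2 from regressing each column of x on the other columns plus an intercept."""
    n, k = x.shape
    out = []
    for j in range(k):
        target = x[:, j]
        others = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        tss = float(((target - target.mean()) ** 2).sum())
        out.append(1.0 - float(resid @ resid) / tss)
    return np.array(out)

# Two nearly collinear regressors (illustrative numbers).
x = np.column_stack([[1.1, 1.9, 3.2, 3.8], [1.0, 2.0, 3.0, 4.0]])
r2 = auxiliary_r2(x)
print(r2, 1.0 / (1.0 - r2))  # R_j^2 and the implied variance inflation
```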

Note also that, although multicollinearity is indeed a problem, no assumptions of the classical multiple regression model have been violated.

Therefore, provided multicollinearity is not perfect, OLS estimators are BLUE and MLE. Similarly, the standard test procedures are valid and retain their optimality properties relative to other tests.

Klein proposes the rule of thumb that multicollinearity is a "problem" if $\max_j R_j^2 > R^2$.

When assessing multicollinearity, it is not sufficient to look only at pairwise correlations between regressors, because a near-linear relationship may involve several regressors at once.


Multicollinearity is a feature of the nonrandom regressor set and so we cannot test for it. Some measures of multicollinearity have been proposed, but they are open to objection, and the $R_j^2$ statistics are simple to calculate and interpret.

Models can be reparameterized to make the transformed regressors uncorrelated, but the transformed parameters may have no economic interest.

As noted above, multicollinearity can lead to large variances and weak tests, e.g. every individual slope estimate may be insignificant (as indicated by a t-test) while the F statistic for the hypothesis that all slopes equal zero is highly significant.


Multicollinearity can also lead to large changes in parameter estimates when there are small changes in the data.

Various "treatments" have been described, e.g. drop some variables, use first differences, or use outside estimates of some coefficients. These treatments usually introduce new problems, e.g. dropping an insignificant, but relevant, variable will lead to biased estimators in the amended model.

The real solution is to get more valid information, so using false restrictions is not a good strategy. One may also have to wait for more data.

Review Topic 4: The Mean Function

Reading: Wooldridge, Ch. 3, Ch. 7, Ch. 9.


Incorrect Specification in the Mean Function: Consequences

We have assumed that there exist observation-invariant parameters $\alpha$ and $\beta_1, ..., \beta_k$ such that the conditional mean is given by
$$E(y_i \mid x_{ji}, j = 1, ..., k) = \alpha + \sum_j \beta_j x_{ji},$$
where $x_{ji}$ is the ith value of the jth regressor.

Case 1
We may have included irrelevant regressors, i.e. some $\beta_j$ equals zero. OLS estimators are still unbiased and consistent, but no longer efficient (they fail to use the valid information that some coefficients are zero).

Case 2
We may have omitted some relevant regressors. Write the conditional mean function as
$$E(y_i \mid x_{ji}, j = 1, ..., k) = \alpha + \sum_j \beta_j x_{ji} + E(f_i \mid x_{ji}, j = 1, ..., k),$$
where $f_i$ stands for an omitted factor. In general, OLS estimators of the regression parameters $\alpha$ and $\beta_j$ are biased and inconsistent. The estimator $s^2$ is biased and inconsistent, and the standard t- and F-tests are no longer valid.
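The source of the bias in Case 2 can be illustrated numerically. In-sample, the slope from the regression that omits $f_i$ equals the slope from the full regression plus the estimated coefficient on $f_i$ times the slope from regressing $f_i$ on the included regressor. The sketch below checks this algebraic identity on simulated data; the data-generating numbers and variable names are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficient vector from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
f = 0.7 * x + rng.normal(size=n)            # omitted factor, correlated with x
y = 1.0 + 2.0 * x + 1.5 * f + rng.normal(size=n)

ones = np.ones(n)
long_coef = ols(np.column_stack([ones, x, f]), y)   # f included
short_coef = ols(np.column_stack([ones, x]), y)     # f omitted
aux_coef = ols(np.column_stack([ones, x]), f)       # f on intercept and x

# Slope of the short regression = long slope + (coef on f) * (aux slope).
implied_slope = long_coef[1] + long_coef[2] * aux_coef[1]
print(short_coef[1], implied_slope)
```

The second term vanishes only when the omitted factor is uncorrelated in the sample with the included regressor, which is why the bias is present "in general".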

Case 3
We may use an incorrect functional form, e.g. assume
$$y_i = \alpha + \sum_j \beta_j x_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2),$$
when the true model is a log-log form
$$\log(y_i) = \alpha + \sum_j \beta_j \log(x_{ji}) + v_i, \quad v_i \sim NID(0, \sigma^2).$$
The OLS estimators of the false linear-linear model do not correspond to parameters of economic interest.

Test Procedures: RESET Test

If we have a strong belief about the omitted factor, we can use a precise test. For example, if we are sure that $f_i$ is a linear combination of $q$ variables $z_{ji}$, we can apply an F-test of $H_0: \gamma_1 = ... = \gamma_q = 0$ in the expanded model
$$y_i = \alpha + \sum_j \beta_j x_{ji} + \sum_{j=1}^{q} \gamma_j z_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2).$$

If we do not have a strong belief, then we can use the "information parsimonious" RESET test. In this test, fit the null model
$$y_i = \alpha + \sum_j \beta_j x_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2),$$
by OLS to obtain predicted values $\hat y_i$, $i = 1, ..., n$.

Then test $H_0: \gamma_1 = ... = \gamma_q = 0$ in the artificial model
$$y_i = \alpha + \sum_j \beta_j x_{ji} + \sum_{j=1}^{q} \gamma_j (\hat y_i)^{j+1} + u_i, \quad u_i \sim NID(0, \sigma^2).$$

Notes:
1. There is no $\hat y_i$ term because this is a linear combination of the intercept term and the regressors $x_{ji}$;
2. The F-test is valid even though the added variables are random;
3. The choice of $q$ has an impact on power;
4. There is no rule for determining the best value of $q$;
5. Quite small values of $q$ are often used, e.g. 1 or 2;
6. We cannot expect RESET to indicate how a model should be re-specified;
7. We cannot assume RESET will always have high power.

Tests for Stability

Suppose we divide the sample into two subsamples, denoted by $S_1$ and $S_2$. Let $S_1$ contain $n_1$ observations and $S_2$ contain $n_2 = n - n_1$ observations. The unrestricted model of the alternative hypothesis is then written as
$$y_i = \alpha + \sum_j \beta_j x_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2), \quad \text{if } i \in S_1,$$
and
$$y_i = \alpha^{*} + \sum_j \beta_j^{*} x_{ji} + u_i, \quad u_i \sim NID(0, \sigma^2), \quad \text{if } i \in S_2,$$
so that changes in regression coefficients are permitted (under the unrestricted model!). Note the homoskedasticity: the error variance is the same in both subsamples.

The null hypothesis of constant coefficients consists of the $(k+1)$ restrictions $H_0: \alpha = \alpha^{*}$ and $\beta_j = \beta_j^{*}$, $j = 1, ..., k$.

Suppose that $n_s > (k+1)$, $s = 1, 2$. Let $RSS_s$ denote the residual sum of squares (RSS) for the OLS regression of $y_i$ on the intercept term and the $x_{ji}$ using only the observations for $S_s$, $s = 1, 2$, and let $RSS$ denote the residual sum of squares for this OLS regression using all $n$ observations. $H_0$ can be tested using the F statistic
$$F = \frac{RSS - (RSS_1 + RSS_2)}{RSS_1 + RSS_2} \cdot \frac{n - 2k - 2}{k + 1},$$
which is $F(k+1, n-2k-2)$ under $H_0$, with large values indicating that $H_0$ is inconsistent with the data.

If, say, $n_2 \le (k+1)$, then use the predictive failure test. Test the $n_2$ restrictions
$$E\Big[ y_i - \Big( \tilde\alpha + \sum_j \tilde\beta_j x_{ji} \Big) \Big] = 0, \quad i \in S_2,$$
where the tilde denotes an estimator derived using only the observations of $S_1$. The F-statistic is
$$F = \frac{RSS - RSS_1}{RSS_1} \cdot \frac{n_1 - k - 1}{n_2},$$
which is $F(n_2, n_1 - k - 1)$ when the model is stable.

However, in the case of $n_2 < k+1$, the $n_2$ restrictions being tested may be satisfied even though $H_0$ is false.
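RESET, the stability (Chow-type) test and the predictive failure test are all F ratios built from residual sums of squares, so they can be sketched with one shared helper. The function names are hypothetical, `X` is assumed to already contain the intercept column (so it has $k+1$ columns), and the first `n1` rows are assumed to form the first subsample.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ coef
    return float(e @ e)

def reset_f(X, y, q=2):
    """RESET: F test of the powers (y_hat)^(j+1), j = 1..q, added to the model."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    powers = np.column_stack([fitted ** (j + 1) for j in range(1, q + 1)])
    rss0 = rss(X, y)
    rss1 = rss(np.column_stack([X, powers]), y)
    return (rss0 - rss1) / rss1 * (n - p - q) / q

def chow_f(X, y, n1):
    """Stability test: F(k + 1, n - 2k - 2) under constant coefficients."""
    n, p = X.shape                      # p = k + 1
    rss_all = rss(X, y)
    rss_split = rss(X[:n1], y[:n1]) + rss(X[n1:], y[n1:])
    return (rss_all - rss_split) / rss_split * (n - 2 * p) / p

def predictive_failure_f(X, y, n1):
    """Predictive failure test: F(n2, n1 - k - 1) when the model is stable."""
    n, p = X.shape
    rss_all = rss(X, y)
    rss_1 = rss(X[:n1], y[:n1])
    return (rss_all - rss_1) / rss_1 * (n1 - p) / (n - n1)

# Intercept-only toy example in which the mean shifts after two observations.
X = np.ones((4, 1))
y = np.array([0.0, 2.0, 4.0, 6.0])
print(chow_f(X, y, 2), predictive_failure_f(X, y, 2))
```

In each case the observed value would be compared with an upper-tail critical value of the F distribution with the degrees of freedom stated in the text.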


Treatment

The only treatment that allows valid inference is the correct specification of the mean function.

Review Topic 5: Non-normal Disturbances

Reading: Wooldridge, Ch. 5.


Non-normal Disturbances

Now suppose that the regression model is
$$y_i = \alpha + \sum_j \beta_j x_{ji} + u_i, \quad i = 1, ..., n,$$
where the disturbances are independently and identically distributed (i.i.d.) with zero mean and variance $\sigma^2 < \infty$, but the common distribution is NOT normal.

Consequences

OLS estimators are still BLUE, but, in general, are NOT normally distributed. Therefore the t and F tests are no longer valid in finite samples.

The standard formulae for confidence intervals are also invalid in finite samples.

Under weak conditions, OLS estimators are consistent and a Central Limit Theorem can be used to show that they are asymptotically normally distributed, implying that t and F tests of linear restrictions on regression coefficients are asymptotically valid. The usual confidence intervals are also asymptotically valid.

The prediction error test is, however, not asymptotically valid.

Since the normality-based MLE maximizes the wrong likelihood function, it does not produce asymptotically efficient estimators.


Test Procedures

When the $u_i$ are $NID(0, \sigma^2)$, the following conditions are satisfied: $E(u_i^3) = 0$; and $E(u_i^4) - 3\sigma^4 = 0$.

If a typical OLS residual is denoted by $\hat u_i$, then it is natural to look at tests based upon the sample moments $n^{-1} \sum \hat u_i^3$ and $n^{-1} \sum \hat u_i^4 - 3\hat\sigma^4$, where $\hat\sigma^2 = n^{-1} \sum \hat u_i^2$. Jarque and Bera propose a test of the joint significance of these terms. However, this test is only asymptotically valid and, in large samples, there is little need to assume normality when examining OLS results for the linear multiple regression model.
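As an illustration, the usual form of the Jarque-Bera statistic combines the two sample moments as $n[S^2/6 + (K-3)^2/24]$, where $S$ and $K$ are the sample skewness and kurtosis of the residuals. That closed form is the standard textbook expression rather than something given on the slides, and the function name is hypothetical. The sketch assumes the residuals come from a regression with an intercept, so that they sum to zero.

```python
def jarque_bera(resid):
    """Jarque-Bera statistic from OLS residuals (assumed to have zero mean)."""
    n = len(resid)
    m2 = sum(u ** 2 for u in resid) / n   # sigma-hat squared
    m3 = sum(u ** 3 for u in resid) / n
    m4 = sum(u ** 4 for u in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)

print(jarque_bera([-1.0, 1.0, -1.0, 1.0]))
```

Under normality the statistic is asymptotically chi-squared with 2 degrees of freedom, so large values count against the null.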

Asymptotic theory sometimes provides a poor approximation to the actual finite sample behaviour of the Jarque-Bera statistic when the $u_i$ are normal.

The Jarque-Bera test can have low power under some nonnormal disturbance distributions.

Treatment

If we have precise information about the form of the disturbance distribution, then we can derive the likelihood function and obtain the asymptotically efficient MLE. Otherwise, use OLS and rely upon large sample results.

Review Topic 6: Autocorrelation and Heteroskedasticity

Reading: Wooldridge, Ch. 8, Ch. 12.

Heteroskedasticity: Introduction

Allow $\mathrm{var}(u_i)$ to vary with $i$, so that $y_1, ..., y_n$ are independent $N(\alpha + \sum_j \beta_j x_{ji}, \sigma_i^2)$ variables, where $\sigma_i^2$ denotes $\mathrm{var}(u_i)$.

Heteroskedasticity is often regarded as associated with cross-section data, grouped data, or random coefficient models, but it can also occur in time-series applications (GARCH-family models, for instance).


Consequences of Heteroskedasticity

OLS is still unbiased and consistent, but no longer efficient in either large or small samples.

OLS is no longer the MLE, because it maximizes the likelihood under the false assumption that all $u_i$ have the same variance.

Let $\tilde x_{ji}$ denote the residual from the auxiliary regression of regressor $j$ on the other regressors and the intercept, so that $\sum_i \tilde x_{ji}^2 = RSS_j$. Then
$$\hat\beta_j = \beta_j + \sum_i \tilde x_{ji} u_i \Big/ \sum_i \tilde x_{ji}^2 = \beta_j + \sum_i \tilde x_{ji} u_i / RSS_j,$$
so that
$$\mathrm{var}(\hat\beta_j) = \sum_i \tilde x_{ji}^2 \sigma_i^2 \Big/ \Big[ \sum_i \tilde x_{ji}^2 \Big]^2 = \sum_i \tilde x_{ji}^2 \sigma_i^2 \big/ RSS_j^2,$$
which is not equal to $E(s^2)/RSS_j$. Conventional standard errors are, therefore, biased.

The t- and F-tests are, therefore, invalid.


Tests for Heteroskedasticity

Goldfeld-Quandt Test

A finite sample test that requires normality of the disturbances. The null hypothesis is that the errors are homoskedastic. It is assumed that information is available about the relative magnitudes of the variances under the alternative hypothesis of heteroskedasticity.

Using this information, reorder the data so that $\sigma_1^2 \le \sigma_2^2 \le ... \le \sigma_n^2$. Split the sample into three parts containing $m$, $c$, and $m$ observations, with $m > (k+1)$ and $n = 2m + c$. Drop the middle set of $c$ observations.


Let $RSS_1$ and $RSS_2$ denote the OLS residual sums of squares for estimation using the first $m$ and last $m$ observations, respectively. Under the null hypothesis of homoskedasticity, the statistic $GQ = RSS_2 / RSS_1$ is distributed as $F(m-k-1, m-k-1)$, and large values indicate that the data are inconsistent with the null hypothesis.
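The Goldfeld-Quandt steps (reorder, drop the middle $c$ observations, compare the two residual sums of squares) can be sketched as below. The function name and the argument `order_by`, a variable assumed to rank the observations from smallest to largest variance, are hypothetical; `X` is assumed to include the intercept column.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ coef
    return float(e @ e)

def goldfeld_quandt(X, y, order_by, c):
    """Return (GQ, df), where GQ = RSS2 / RSS1 is F(df, df) under the null."""
    n, p = X.shape                       # p = k + 1
    if (n - c) % 2 != 0:
        raise ValueError("need n = 2m + c for some integer m")
    m = (n - c) // 2
    if m <= p:
        raise ValueError("need m > k + 1")
    idx = np.argsort(order_by, kind="stable")   # ascending presumed variance
    Xs, ys = X[idx], y[idx]
    gq = rss(Xs[n - m:], ys[n - m:]) / rss(Xs[:m], ys[:m])
    return gq, m - p

# Intercept-only toy data whose spread triples in the last four observations.
X = np.ones((9, 1))
y = np.array([1.0, -1.0, 1.0, -1.0, 0.0, 3.0, -3.0, 3.0, -3.0])
print(goldfeld_quandt(X, y, order_by=np.arange(9.0), c=1))
```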

Lagrange Multiplier/Score Test


Original form suggested by Breusch-Pagan and Godfrey requires
normal disturbances even for asymptotic validity, and is not
recommended.

Problems with the Goldfeld-Quandt test: a) the choice of $m$ and $c$; b) enough information is needed to reorder the data according to the values of the variances.


Studentized Score Test

Koenker's Studentized Score test is asymptotically robust to nonnormality. Estimate the model by OLS using all observations and obtain the residuals $\hat u_i$, $i = 1, ..., n$. Assume an alternative of the form
$$\sigma_i^2 = g\Big( \gamma_0 + \sum_{j=1}^{p} \gamma_j z_{ji} \Big),$$
where the precise form of $g(.)$ need not be specified.

Apply OLS to the artificial regression model
$$\hat u_i^2 = \gamma_0 + \sum_{j=1}^{p} \gamma_j z_{ji} + a_i, \quad i = 1, ..., n,$$
and obtain the coefficient of determination, denoted by $R_K^2$.

Koenker's test statistic is $nR_K^2$ and, under homoskedasticity, $nR_K^2$ is asymptotically distributed as $\chi^2(p)$, with large values indicating the rejection of the null model.

Problems: a) it is a large sample test; b) enough information is needed to select the variables $z_{ji}$, and an incorrect choice has an impact on power.
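The $nR_K^2$ statistic can be sketched in a few lines. The function name is hypothetical, `X` is assumed to include the intercept column, and `Z` is assumed to be an $n \times p$ array of the chosen $z_{ji}$ without an intercept column.

```python
import numpy as np

def koenker_nr2(X, y, Z):
    """Return (n * R_K^2, p) from regressing squared OLS residuals on 1 and Z."""
    n = len(y)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ coef) ** 2                      # squared OLS residuals
    W = np.column_stack([np.ones(n), Z])
    g, *_ = np.linalg.lstsq(W, u2, rcond=None)
    e = u2 - W @ g
    tss = float(((u2 - u2.mean()) ** 2).sum())
    r2 = 1.0 - float(e @ e) / tss
    return n * r2, Z.shape[1]

# Toy data: the squared residuals are explained exactly by a dummy variable.
X = np.ones((4, 1))
y = np.array([-1.0, 1.0, -2.0, 2.0])
Z = np.array([[0.0], [0.0], [1.0], [1.0]])
print(koenker_nr2(X, y, Z))
```

The returned statistic would be compared with an upper-tail critical value of $\chi^2(p)$. Choosing `Z` as the squares and cross-products of the regressors gives a White-type test, and choosing lagged squared residuals gives an ARCH-type check.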



White's Direct Test

White's test can be regarded as a Koenker-type test with the $z_{ji}$ being the nonredundant terms of $x_{iq}$ and $x_{iq} x_{ir}$, $q, r = 1, ..., k$.

Problems: a) it is a large sample test; b) enough information is needed to select the variables $z_{ji}$, and an incorrect choice has an impact on power.

Autoregressive Conditional Heteroskedasticity Tests

ARCH models are widely used: the conditional variance depends upon squared past values of $u_i$. The test for ARCH is a Koenker-type check with $z_{ji} = \hat u_{i-j}^2$; $i = p+1, ..., n$ and $j = 1, ..., p$.


Treatment of Heteroskedasticity

If we know the variances up to a constant of proportionality, we can apply OLS to transformed data to get efficient estimators. Suppose $\sigma_i^2 = \sigma^2 w_i^2$, with the $w_i^2$ being known; then $\mathrm{var}(u_i / w_i) = \sigma^2$ for all $i$. In this case, apply OLS to the transformed model
$$(y_i / w_i) = \alpha (1 / w_i) + \sum_j \beta_j (x_{ji} / w_i) + (u_i / w_i),$$
in which the $(u_i / w_i)$ are $NID(0, \sigma^2)$ variates.

Note: the transformed model may not contain an intercept.

If we suspect heteroskedasticity and do not have very precise information about its form, then we can use White's heteroskedasticity-consistent standard errors, denoted by $WSE(\hat\alpha)$ and $WSE(\hat\beta_j)$, $j = 1, ..., k$, for asymptotically valid inference after OLS estimation.

White shows that, if
$$WSE(\hat\beta_j) = \sqrt{ \sum_i \tilde x_{ji}^2 \hat u_i^2 \big/ RSS_j^2 },$$
then $(\hat\beta_j - \beta_j) / WSE(\hat\beta_j)$ is asymptotically distributed as $N(0, 1)$ in the presence of unspecified heteroskedasticity.
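Both treatments can be sketched briefly. The matrix "sandwich" expression used for the White standard errors below is the standard equivalent of the scalar formula above (its jth diagonal element equals $\sum_i \tilde x_{ji}^2 \hat u_i^2 / RSS_j^2$); the function names are hypothetical and `X` is assumed to include the intercept column.

```python
import numpy as np

def wls(X, y, w):
    """OLS on data divided by w_i, assuming var(u_i) = sigma^2 * w_i^2 with w known."""
    Xw = X / w[:, None]       # the intercept column becomes 1 / w_i
    yw = y / w
    coef, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return coef

def ols_white_se(X, y):
    """OLS coefficients with White heteroskedasticity-consistent standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    u = y - X @ coef
    meat = X.T @ (X * (u ** 2)[:, None])      # sum of u_i^2 * x_i x_i'
    cov = XtX_inv @ meat @ XtX_inv
    return coef, np.sqrt(np.diag(cov))

# Simulated example with error standard deviation proportional to 1 + |x|.
rng = np.random.default_rng(3)
n = 50
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
w = 1.0 + np.abs(x)
y = 1.0 + 2.0 * x + w * rng.normal(size=n)
print(wls(X, y, w))
print(ols_white_se(X, y))
```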

Hence, if $d_1$ is such that
$$\mathrm{prob}(-d_1 \le N(0, 1) \le d_1) = 1 - \epsilon,$$
the $(1 - \epsilon) \times 100$ per cent confidence intervals for $\alpha$ and $\beta_j$ are given by $\hat\alpha \pm d_1 WSE(\hat\alpha)$ and $\hat\beta_j \pm d_1 WSE(\hat\beta_j)$, respectively.

Asymptotically valid tests of hypotheses such as $H_0: \beta_j = \beta_{j0}$ are based upon
$$t_0^W(\hat\beta_j) = (\hat\beta_j - \beta_{j0}) / WSE(\hat\beta_j) \sim N(0, 1)$$
under $H_0$.

Since the procedures are only asymptotically valid, we can replace $N(0, 1)$ by $t(n-k-1)$, and this is often done. Thus we can use the following to obtain asymptotically valid tests of $H_0: \beta_j = \beta_{j0}$.


$H_1: \beta_j \neq \beta_{j0}$: reject $H_0$ if $|t_0^W(\hat\beta_j)| > d_1$, where $\mathrm{prob}(t(n-k-1) > d_1) = \epsilon/2$;

$H_1^{+}: \beta_j > \beta_{j0}$: reject $H_0$ if $t_0^W(\hat\beta_j) > d_2$, where $\mathrm{prob}(t(n-k-1) > d_2) = \epsilon$;

$H_1^{-}: \beta_j < \beta_{j0}$: reject $H_0$ if $t_0^W(\hat\beta_j) < -d_2$, where $\mathrm{prob}(t(n-k-1) > d_2) = \epsilon$.

Just replace $\beta_j$ by $\alpha$ and $\hat\beta_j$ by $\hat\alpha$ in the above to obtain test procedures relevant to testing hypotheses concerning the intercept.

Autocorrelation/Serial Correlation: Introduction

We have $y_t \sim N(\alpha + \sum_j \beta_j x_{jt}, \sigma^2)$, $t = 1, ..., n$, but no longer assume independence. If $u_t = y_t - (\alpha + \sum_j \beta_j x_{jt})$, then we allow $E(u_t u_s) \neq 0$ for some $t \neq s$. The $t$ subscript is used because autocorrelation is often discussed in a time-series framework, but spatial autocorrelation has also been examined.

The regressors are assumed to be nonrandom. (It would be straightforward to allow for random regressors with $x_{jt}$ independent of $u_s$, for all $j$, $s$ and $t$.) This assumption will be relaxed later. In particular, we will consider autocorrelation when the regressors include lagged values of the dependent variable.

Consequences of Autocorrelation

OLS is still unbiased and consistent, but no longer efficient in either large or small samples.

OLS is no longer the MLE, because it maximizes the likelihood under the false assumption that the $u_t$ are independent.

$$\hat\beta_j = \beta_j + \sum_t \tilde x_{jt} u_t \Big/ \sum_t \tilde x_{jt}^2 = \beta_j + \sum_t \tilde x_{jt} u_t / RSS_j,$$
and, since the $\tilde x_{jt} u_t$ are not independent,
$$\mathrm{var}\Big( \sum_t \tilde x_{jt} u_t \Big) \neq \sum_t \mathrm{var}(\tilde x_{jt} u_t),$$
and so $\mathrm{var}(\hat\beta_j) \neq \sigma^2 / RSS_j$. Conventional standard errors are, therefore, biased.

The t- and F-tests are invalid.

Tests for Autocorrelation

In the lectures given this term, it is assumed that the $u_t$ are covariance stationary with $E(u_t u_{t-g}) = \gamma(|g|)$ for all $t$, with $\gamma(0) = \sigma^2$. The autocorrelation of order $g$, denoted by $\rho(g)$, is the correlation between $u_t$ and $u_{t-g}$, i.e. $E(u_t u_{t-g}) / \sigma^2$, with the sequence $\rho(1), \rho(2), ...$ being called the autocorrelation function or ACF. Under the null hypothesis of serial independence, $\rho(g) = 0$ for all $g \neq 0$. Different tests check the significance of different sets of estimates of autocorrelations.


Durbin-Watson Test

Basically a test for nonzero values of $\rho(1)$, based upon OLS residuals. The test statistic is
$$d = \sum_{t=2}^{n} (\hat u_t - \hat u_{t-1})^2 \Big/ \sum_{t=1}^{n} \hat u_t^2,$$
which is approximately $2(1 - r(1))$, where
$$r(1) = \sum_{t=2}^{n} \hat u_t \hat u_{t-1} \Big/ \sum_{t=1}^{n} \hat u_t^2.$$

Lemma
Values of $d$ close to 0 (resp. 4) indicate a high level of positive (resp. negative) residual first order serial correlation. The distribution of $d$ under the null hypothesis of independent errors depends upon the values of the regressors, so critical values vary from one case to another.
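The statistic and its approximation can be checked directly on a short residual vector. The function names below are hypothetical; the gap between $d$ and $2(1 - r(1))$ is exactly $(\hat u_1^2 + \hat u_n^2) / \sum_t \hat u_t^2$, which is why the approximation improves as $n$ grows.

```python
def durbin_watson(resid):
    """d = sum_{t>=2} (u_t - u_{t-1})^2 / sum_t u_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(u ** 2 for u in resid)
    return num / den

def first_order_autocorr(resid):
    """r(1) = sum_{t>=2} u_t u_{t-1} / sum_t u_t^2."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(u ** 2 for u in resid)
    return num / den

# Residuals that alternate in sign: strong negative first order correlation.
u_hat = [1.0, -1.0, 1.0, -1.0]
d = durbin_watson(u_hat)
r = first_order_autocorr(u_hat)
print(d, 2.0 * (1.0 - r))
```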


Tables are available for combinations of $n$ and $k$ (and for models with and without an intercept) giving bounds for the critical values for testing $H_0$ of serial independence against $H_1: \rho(1) > 0$. These upper and lower bounds, denoted by $d_u$ and $d_l$, define an interval that contains the true critical value. If $d < d_l$, reject $H_0$. If $d > d_u$, do not reject $H_0$. If $d_l \le d \le d_u$, the test is inconclusive. For $H_1: \rho(1) < 0$, use $4 - d_u$ and $4 - d_l$ as bounds.
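The bounds procedure amounts to a three-way decision rule, sketched below. The function name is hypothetical and the bounds passed in the example are made-up numbers, not values taken from any table.

```python
def dw_decision(d, dl, du, positive=True):
    """Bounds test of serial independence.

    positive=True  tests against rho(1) > 0 using (dl, du);
    positive=False tests against rho(1) < 0, which is equivalent to
    comparing 4 - d with the same bounds.
    """
    if not positive:
        d = 4.0 - d
    if d < dl:
        return "reject"
    if d > du:
        return "do not reject"
    return "inconclusive"

# Illustrative bounds only.
print(dw_decision(1.0, 1.2, 1.6))
print(dw_decision(1.4, 1.2, 1.6))
print(dw_decision(3.0, 1.2, 1.6, positive=False))
```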

The Durbin-Watson procedure is a useful test against either the first order autoregressive (AR(1)) model $u_t = \phi_1 u_{t-1} + \varepsilon_t$, or the first order moving average (MA(1)) model $u_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$, in which the $\varepsilon_t$ are $NID(0, \sigma_\varepsilon^2)$. For reasons to be discussed later in time series, we assume $|\phi_1| < 1$ and $|\theta_1| \le 1$.

Problems:
Checks for nonzero values of $\rho(1)$ can be insensitive to $\rho(g) \neq 0$, $g \neq 1$, e.g. $g = 4$, when $\rho(1) = 0$.
The test is inconclusive when the sample value of $d$ falls between the bounds (the inconclusive region).
The test requires the errors to be normal and the regressors to be fixed, e.g. no lagged dependent variables.


Lagrange Multiplier/Score Tests

A very flexible asymptotic test based upon OLS results. It is asymptotically valid for models with nonnormal errors and lagged dependent variables in the regressor set.

If the null hypothesis of serial independence is to be tested against an autoregressive or moving average model of order $g$, then apply the asymptotically valid F-test of $H_0: \delta_1 = \delta_2 = ... = \delta_g = 0$ after OLS estimation of the model
$$y_t = \alpha + \sum_{j=1}^{k} \beta_j x_{jt} + \sum_{j=1}^{g} \delta_j \hat u_{t-j} + u_t,$$
in which the $\hat u_{t-j}$ are lagged values of the residuals from the OLS estimation of $y_t = \alpha + \sum_{j=1}^{k} \beta_j x_{jt} + u_t$. For "gaps" in the alternative model, omit selected $j$ terms. If $t - j$ is not positive, set $\hat u_{t-j} = 0$.

Estimation

If we have precise information about the form of autocorrelation, e.g. type (AR or MA) and order (value of $g$), we can use the asymptotically efficient MLE or an approximation to it.

The model can then be written as $y_t = \alpha + \sum_{j=1}^{k} \beta_j x_{jt} + u_t$, with either
$$u_t = \phi_1 u_{t-1} + ... + \phi_g u_{t-g} + \varepsilon_t, \quad \varepsilon_t \sim NID(0, \sigma_\varepsilon^2), \quad AR(g),$$
or
$$u_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_g \varepsilon_{t-g}, \quad \varepsilon_t \sim NID(0, \sigma_\varepsilon^2), \quad MA(g).$$
MLE, or approximations based upon minimizing $\sum_t \varepsilon_t^2$, are available in econometric software.
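Returning to the Lagrange multiplier check: the F-test of the lagged-residual terms can be sketched as follows, with pre-sample residuals set to zero as described. The function name is hypothetical, `X` is assumed to include the intercept column, and the simulated AR(1) example is purely illustrative.

```python
import numpy as np

def lm_autocorr_f(X, y, g):
    """F statistic for the g lagged OLS residuals added to the regression."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ coef
    lags = np.zeros((n, g))
    for j in range(1, g + 1):
        lags[j:, j - 1] = u[:-j]          # u_{t-j}; zero when t - j is not positive
    rss0 = float(u @ u)
    Xa = np.column_stack([X, lags])
    ca, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    e = y - Xa @ ca
    rss1 = float(e @ e)
    return (rss0 - rss1) / rss1 * (n - p - g) / g

# Simulated regression whose disturbances follow an AR(1) with coefficient 0.8.
rng = np.random.default_rng(5)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + eps[t]
print(lm_autocorr_f(X, 1.0 + 2.0 * x + u, g=1))
```

The statistic would be compared with an upper-tail critical value of $F(g, n - k - 1 - g)$, bearing in mind that the test is only asymptotically valid.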

Other Problems of Autocorrelation

Residual Serial Correlation or Genuine Disturbance Autocorrelation?

Significant outcomes of tests designed for autocorrelation can be caused by misspecification of the mean function, e.g. omitting relevant regressors or using the wrong functional form. In such cases, re-estimation allowing for autocorrelation is of little value.


A procedure, called the COMFAC test, has been developed to test the null hypothesis that the errors of a regression equation are generated by an autoregressive process of specified order. The COMFAC test uses as its alternative an expanded version of the original regression equation, obtained by adding lagged values of the dependent variable and of the initial set of regressors. Details are not provided because this test, while asymptotically valid, has finite sample properties that cause concern; see Gregory and Veall, Economics Letters, 1986, 22, 203-208. Moreover, the alternative adopted in the COMFAC procedure may be inadequate and yield a test that rarely detects a false null hypothesis.


Mizon ("A simple message for autocorrelation correctors: Don't", Journal of Econometrics, 1995, 69, 267-288) offers the following conclusions:
Although it is important to test for autocorrelation, it is rarely appropriate to "autocorrelation correct" in response to rejecting the null hypothesis of independent disturbances;
and, when re-estimation assuming autoregressive errors imposes invalid restrictions, inconsistent parameter estimators will result.

The nature of the restrictions to which Mizon refers can be illustrated by considering a simple case in which the model of the null is $y_t = \beta x_t + u_t$, with $u_t = \phi_1 u_{t-1} + \varepsilon_t$, $\varepsilon_t \sim NID(0, \sigma_\varepsilon^2)$, i.e. the disturbances $u_t$ are AR(1).

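Under this null,
$y_t = \beta x_t + \phi_1 (y_{t-1} - \beta x_{t-1}) + \varepsilon_t$,
or equivalently,
$y_t = \beta x_t + \phi_1 y_{t-1} - \beta \phi_1 x_{t-1} + \varepsilon_t$, $\varepsilon_t \sim NID(0, \sigma^2)$,
in which the coefficient of $x_{t-1}$ is restricted to be minus the product of the coefficients of $x_t$ and $y_{t-1}$. Note that this restriction is not linear.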
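The restriction described above can be illustrated numerically. The following is a minimal sketch, not part of the original slides: it simulates a static regression with AR(1) disturbances and fits the unrestricted dynamic regression by OLS. The names `beta` and `phi` (slope and AR(1) error coefficient), the sample size, and the parameter values are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_static_model_with_ar1_errors(T, beta, phi, seed=0):
    """Simulate y_t = beta*x_t + u_t with u_t = phi*u_{t-1} + eps_t (illustrative setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)
    eps = rng.normal(size=T)
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = phi * u[t - 1] + eps[t]
    return beta * x + u, x

def fit_unrestricted_dynamic_model(y, x):
    """OLS of y_t on x_t, y_{t-1}, x_{t-1}, with no restriction imposed."""
    regressors = np.column_stack([x[1:], y[:-1], x[:-1]])
    coef, *_ = np.linalg.lstsq(regressors, y[1:], rcond=None)
    return coef  # (coef on x_t, coef on y_{t-1}, coef on x_{t-1})

y, x = simulate_static_model_with_ar1_errors(T=20000, beta=2.0, phi=0.6)
b_x, b_ylag, b_xlag = fit_unrestricted_dynamic_model(y, x)

# Under the AR(1)-error null, coef on x_{t-1} = -(coef on x_t)*(coef on y_{t-1}),
# so this gap should be close to zero in a large sample.
restriction_gap = b_xlag + b_x * b_ylag
print(b_x, b_ylag, b_xlag, restriction_gap)
```

When the data are not generated by the AR(1)-error model, the gap need not be near zero, which is the situation in which imposing the restriction would be invalid.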


Introductory Financial Econometrics
Topic 1: Introduction of Time Series
3 Credits, 51 Hours

Jianhua Gang
School of Finance
Renmin University of China

Spring 2013

Time Series Data

Statistical analysis of data observed over time.

Definition (Time Series Data)

Data observed between two dates, normalized as $t = 1$ and $t = T$.
Equispaced, i.e. we observe $Y_1, Y_2, ..., Y_t, Y_{t+1}, ..., Y_{T-1}, Y_T$ and no intermediate observation is missing.
$Y_t$ depends on $Y_s$ (if there is any dependence) if and only if $s < t$.
$Y_t$ does not depend on $Y_s$ if $s > t$.
Then, the vector $\{Y_1, Y_2, ..., Y_t, Y_{t+1}, ..., Y_{T-1}, Y_T\}$ is a time series.

Moments

For a generic random variable we can define the mean and the variance, and for pairs of random variables we can also define the covariance, correlation, etc. In a time series we define these for each $Y_t$:

Definitions (Moments of Time Series)

Mean: $E(Y_t) = \mu_t$;
Variance: $E\{(Y_t - \mu_t)^2\} = \sigma_t^2$;
Covariance: $E\{(Y_t - \mu_t)(Y_{t+j} - \mu_{t+j})\} = \gamma_t(j)$;
Correlation: $\gamma_t(j) / (\sigma_t \sigma_{t+j})$.

Operators

Lag operator: $L$
$L Y_t = Y_{t-1}$
So, $L^{-1} Y_t = Y_{t+1}$

First difference operator: $\Delta = 1 - L$
$\Delta Y_t = (1 - L) Y_t = Y_t - Y_{t-1}$
Also, $\Delta^2 Y_t = (1 - L)^2 Y_t = Y_t - 2Y_{t-1} + Y_{t-2}$

Stationarity and Ergodicity

Problem
Suppose $\{Y_1, Y_2, ..., Y_t, Y_{t+1}, ..., Y_{T-1}, Y_T\}$ is a single realization from a stochastic process $\{Y_t\}$. We are interested in the model that generated the time series, but we do not know it. How can we make inference using one single realization?

Solution
We must use the fact that this is a T-dimensional observation:
Restrict heterogeneity over time;
Restrict dependence over time.
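The lag and first-difference operators introduced in this topic can be sketched on a short array. This is an illustrative sketch only: the helper names `lag` and `difference` are hypothetical, unavailable past values are marked with NaN by assumption, and only non-negative lag orders are handled.

```python
import numpy as np

def lag(y, k=1):
    """Apply L^k for k >= 0: entry t of the result is y[t-k]; NaN where no value exists."""
    y = np.asarray(y, dtype=float)
    if k == 0:
        return y.copy()
    out = np.full_like(y, np.nan)
    out[k:] = y[:-k]
    return out

def difference(y, d=1):
    """Apply (1 - L)^d by repeatedly subtracting the lagged series."""
    out = np.asarray(y, dtype=float)
    for _ in range(d):
        out = out - lag(out)
    return out

y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])
first_diff = difference(y, 1)    # Y_t - Y_{t-1}
second_diff = difference(y, 2)   # Y_t - 2*Y_{t-1} + Y_{t-2}
print(first_diff, second_diff)
```

The second difference computed by applying the operator twice agrees with the expanded expression in terms of the two previous observations.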

Restrict Heterogeneity

Assume some properties are common to all the $Y_t$'s in $\{Y_1, Y_2, ..., Y_T\}$. For example,

Definition (Covariance Stationarity)
For a time series $Y_t \in \{Y_t\}$,
$E(Y_t) = \mu, \ \forall t$
$E\{(Y_t - \mu)(Y_{t+j} - \mu)\} = \gamma(j), \ \forall t$
i.e. the first two moments are finite and do not depend on time (spatial equivalent).

Restrict Heterogeneity

In this way, we may try to estimate $\mu$ or $\gamma(j)$ using the sample counterparts. "Covariance stationarity" is also known as "weak stationarity" or simply as "stationarity" (without other references).

For stationary processes, we shorten the notation and introduce $\gamma_j$ for $\gamma(j)$ to indicate the autocovariance.

The plot of $\gamma_j$ against $j$ is called the autocovariance function.

Restrict Heterogeneity

An alternative restriction on heterogeneity is:

Definition (Strict Stationarity)
For any $j_1, ..., j_n$, the joint distribution of $(Y_{t+j_1}, ..., Y_{t+j_n})$ and of $(Y_{t+\tau+j_1}, ..., Y_{t+\tau+j_n})$ is the same for any $\tau$.

1. The joint distribution depends only on the spatial differences $j_1, ..., j_n$, not on time;
2. Strict and covariance stationarity do not imply each other.

Restrict Dependence over Time

Given $n \to \infty$, and given the process is stationary, then the sample moments would estimate the population moments consistently.

One may generalize this argument and allow for some dependence, provided that it is not too much: a sufficient condition for consistent estimation of $\mu$ is
$\sum_{j=0}^{\infty} |\gamma_j| < \infty$.

Definition
A restriction on the dependence that allows the population moments of a stationary process to be estimated consistently by the sample moments is called ergodicity.

Forecasts Based on a Linear Projection

Often we are interested in time series because we want to answer one of two questions:
1. Forecasting: What value do you expect for $Y_{t+1}$ if you observed $Y_1, ..., Y_t$?
2. Impulse response: What is the consequence on $Y_t$ of a shock that took place at time $t - j$, i.e. $j$ periods ago?

We first address these questions in the case of stationary processes.

Forecasts Based on a Linear Projection

Assume: $Y_t$ is stationary; $E(Y_t) = 0$ (if $E(Y_t) = \mu \neq 0$, then consider $Y_t - \mu$ instead). Then,
1. Linear forecast of $Y_{t+1}$ using $Y_t$ is
$\hat{Y}_{t+1|t} = \alpha_1^{(1)} Y_t$;
2. Linear forecast of $Y_{t+1}$ using $Y_t$ and $Y_{t-1}$ is
$\hat{Y}_{t+1|t,t-1} = \alpha_1^{(2)} Y_t + \alpha_2^{(2)} Y_{t-1}$;
3. Linear forecast of $Y_{t+1}$ using $Y_t, ..., Y_{t-m+1}$ is
$\hat{Y}_{t+1|t,...,t-m+1} = \alpha_1^{(m)} Y_t + \alpha_2^{(m)} Y_{t-1} + ... + \alpha_m^{(m)} Y_{t-m+1}$.

Forecasts Based on a Linear Projection

Now, which values of $(\alpha_1^{(m)}, \alpha_2^{(m)}, ..., \alpha_m^{(m)})$ characterise a good linear projection?

Let $X_t = (Y_t, ..., Y_{t-m+1})'$, $\alpha = (\alpha_1^{(m)}, \alpha_2^{(m)}, ..., \alpha_m^{(m)})'$; then $\alpha$ must meet $E[(Y_{t+1} - \alpha' X_t) X_t'] = 0$ (i.e., the forecast error $Y_{t+1} - \alpha' X_t$ is not correlated with $X_t$).
Then, given $Y_{t+1}' = Y_{t+1}$ ($Y_{t+1}$ being a single component),
$E(Y_{t+1} X_t') - \alpha' E(X_t X_t') = 0$
$\hat{\alpha} = [E(X_t X_t')]^{-1} E(X_t Y_{t+1})$
It can be proved that $\hat{\alpha}$ gives the best linear forecast.

Wold Decomposition

Of course, in some cases a non-linear forecast may be better.
However, a linear model is usually easier to use, so it is important that any stationary process may be given a linear representation. This can be discussed using the Wold Decomposition.

Definition (Wold Decomposition)
Any stationary process $Y_t$ may be represented in the form
$Y_t = k_t + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$
where
$\psi_0 = 1$, $\sum_{j=0}^{\infty} \psi_j^2 < \infty$

and $\varepsilon_t$, the error made in forecasting $Y_t$ on the basis of a linear function of its past,
$\varepsilon_t = Y_t - \hat{E}(Y_t | Y_{t-1}, ...)$
is such that, for any $t$, $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$, $E(\varepsilon_t \varepsilon_s) = 0$ if $t \neq s$.

$k_t$ is the linear deterministic component of $Y_t$: it can be predicted arbitrarily well as a linear function of past $Y_t$, i.e., $k_t = \hat{E}(k_t | Y_{t-1}, ...)$, and it is such that $E(k_t \varepsilon_{t-j}) = 0 \ \forall j$.

Impulse Response

For a process $Y_t$ that admits
$Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$
for $\varepsilon_t$ such that, for any $t$,
$E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$, $E(\varepsilon_t \varepsilon_s) = 0$, $s \neq t$,
notice that
$\partial Y_t / \partial \varepsilon_{t-j} = \psi_j$
so $\psi_j$ is the effect on $Y_t$ of a shock that took place at time $t - j$, i.e. $j$ periods before. A plot of $\psi_j$ (against $j$) is called the impulse response function.

Autocorrelation Function

Definition (ACF)
For a stationary $Y_t$, define the autocorrelation,
$\rho_j = \gamma_j / \gamma_0$
A plot of $\rho_j$ (against $j$) is called the autocorrelation function.

Partial Autocorrelation Function

Definition (PACF)
For a stationary $Y_t$ with $E(Y_t) = 0$, consider its linear projection,
$\hat{Y}_{t+1|t,...,t-m+1} = \alpha_1^{(m)} Y_t + \alpha_2^{(m)} Y_{t-1} + ... + \alpha_m^{(m)} Y_{t-m+1}$
For different values of $m$, $\alpha_1^{(1)}, \alpha_2^{(2)}, ..., \alpha_m^{(m)}$ are the first $m$ partial autocorrelations, and a plot of $\alpha_j^{(j)}$ (against $j$) is called the partial autocorrelation function.
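The ACF and PACF definitions can be sketched in code. The sketch below is illustrative only: the function names are hypothetical, the sample autocovariance divides by $T$ (one common convention, assumed here), and the partial autocorrelation at lag $m$ is taken as the last coefficient of the linear projection on $m$ lags, obtained by solving the moment equations built from the autocovariances.

```python
import numpy as np

def sample_autocovariances(y, max_lag):
    """Sample autocovariances gamma_0..gamma_max_lag (divisor T, an assumed convention)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    dev = y - y.mean()
    return np.array([np.dot(dev[: T - j], dev[j:]) / T for j in range(max_lag + 1)])

def acf_from_autocov(gamma):
    """Autocorrelations rho_j = gamma_j / gamma_0."""
    gamma = np.asarray(gamma, dtype=float)
    return gamma / gamma[0]

def pacf_from_autocov(gamma):
    """Partial autocorrelations for lags 1..len(gamma)-1 via linear projections."""
    gamma = np.asarray(gamma, dtype=float)
    max_lag = len(gamma) - 1
    pacf = np.empty(max_lag)
    for m in range(1, max_lag + 1):
        # Moment matrix with entries gamma_{|i-k|}; right-hand side (gamma_1, ..., gamma_m).
        Gamma_m = np.array([[gamma[abs(i - k)] for k in range(m)] for i in range(m)])
        coef = np.linalg.solve(Gamma_m, gamma[1 : m + 1])
        pacf[m - 1] = coef[-1]  # last projection coefficient
    return pacf

# Theoretical autocovariances of an AR(1) with coefficient 0.5 and unit innovation variance.
gamma_ar1 = 0.5 ** np.arange(4) / (1 - 0.5 ** 2)
rho_ar1 = acf_from_autocov(gamma_ar1)
alpha_ar1 = pacf_from_autocov(gamma_ar1)
print(rho_ar1, alpha_ar1)
```

For observed data, the output of `sample_autocovariances` can be passed to the other two functions in place of the theoretical values.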

Introductory Financial Econometrics
Topic 2: Moment Generating Function (MGF)
3 Credits, 51 Hours

Jianhua Gang
School of Finance
Renmin University of China

Spring 2013

Preliminaries

It is however essential to consider the MGFs in order to depict/solve relevant time series problems.

Preliminaries: Sample Space and Random Variables

Define,
$\Omega_x$: sample space;
$x$: random variable.
Then a probability density function (pdf) $f(x)$ is a mapping from $\Omega_x$ to $\mathbb{R}$ such that:
$\sum_{x \in \Omega_x} f(x) = 1$ (discrete case);
$\Pr\{x \in \Omega_x\} = \int_{x \in \Omega_x} f(x)dx = 1$ (continuous case).

Preliminaries: Binomial Distribution

Define as,
$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$, for $x = 0, 1, 2, ..., n$.
The density arises as a consequence of the binomial expansion of:
$(a + b)^n = \sum_{x=0}^{n} \frac{n!}{x!(n-x)!} a^x b^{n-x}$,
written as $x \sim Bin(n, p)$.

Preliminaries: Poisson Distribution

Define as,
$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, for $x = 0, 1, 2, ...$
The density arises from the identity:
$e^{\lambda} = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}$
in which $\lambda = E(x)$.

Preliminaries: Normal Distribution

Define as,
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$
written as $x \sim N(\mu, \sigma^2)$, where $-\infty < x < \infty$.

Expectations and Lower-Order Moments

The expectation (or the mean, or first moment) of a random variable is defined by,
$E(x) = \sum_{x \in \Omega_x} x f(x)$ (discrete)
$E(x) = \int_{x \in \Omega_x} x f(x) dx$ (continuous)
i.e., it is a weighted average of $x$ over all possible outcomes.
The expectation of a measurable function $g(x)$ of a r.v. $x$ is therefore defined by:
$E\{g(x)\} = \sum_{x \in \Omega_x} g(x) f(x)$ (discrete)
$E\{g(x)\} = \int_{x \in \Omega_x} g(x) f(x) dx$ (continuous)

Expectations and Lower-Order Moments

From which we obtain as a special case the raw moments:
$\mu_i' = E(x^i)$
And the central moments:
$\mu_i = E\{(x - \mu)^i\}$
Note that $\mu_2$ is the variance of $x$, i.e., $\mu_2 = \sigma^2$.

Higher-Order Moments

What about the higher-order (central) moments?
By definition, the third and the fourth moments measure the following properties:
$\mu_3$: Skewness of the distribution
$\mu_4$: Kurtosis of the distribution
For comparative purposes, skewness and kurtosis are usually measured by:
$\alpha_3 = \mu_3 / \sigma^3$
$\alpha_4 = \mu_4 / \sigma^4$

Calculation of Moments

It is simple to show that:
$g(x) = c \Rightarrow E\{g(x)\} = c$
$E\{c \, g(x)\} = c \, E\{g(x)\}$
$E\{a + b \, g(x)\} = a + b E\{g(x)\}$
$E\{g(x) + h(x)\} = E\{g(x)\} + E\{h(x)\}$
and hence,
$\sigma^2 = E\{(x - \mu)^2\} = E(x^2) - [E(x)]^2$

Moment Generating Functions (MGFs)

Calculating the moments of even simple r.v.s can be difficult. However, consider the following function:
$M_x(\theta) = E\{e^{\theta x}\} = \sum_{x \in \Omega_x} e^{\theta x} f(x)$ (discrete)
$M_x(\theta) = E\{e^{\theta x}\} = \int_{x \in \Omega_x} e^{\theta x} f(x) dx$ (continuous)

Moment Generating Functions (MGFs)

$M_x(\theta) = E\{e^{\theta x}\} = E\left\{1 + \theta x + \frac{\theta^2 x^2}{2!} + ...\right\}$
$= E\left\{\sum_{i=0}^{\infty} \frac{(\theta x)^i}{i!}\right\} = \int_{x \in \Omega_x} \left[\sum_{i=0}^{\infty} \frac{(\theta x)^i}{i!}\right] f(x) dx$
$= \int_{x \in \Omega_x} \left[1 + \theta x + \frac{\theta^2 x^2}{2!} + ...\right] f(x) dx$
$= 1 + \theta \mu_1' + \frac{\theta^2}{2!} \mu_2' + \frac{\theta^3}{3!} \mu_3' + ... + \frac{\theta^i}{i!} \mu_i' + ...$
so that,
$\frac{d^i [M_x(\theta)]}{d\theta^i} \Big|_{\theta=0} = \mu_i'$ (raw moments)
Hence we call the function $M_x(\theta)$ the MGF of $x$. Note that this property is true in either the discrete or the continuous case.

Moment Generating Functions (MGFs)

It is also easy to see that the MGF satisfies two very important properties.
$g(x) = ax + b \Rightarrow M_{g(x)}(\theta) = e^{b\theta} M_x(a\theta)$
$g(x) = x_1 + x_2 \Rightarrow M_{g(x)}(\theta) = M_{x_1}(\theta) M_{x_2}(\theta)$ (for independent $x_1$, $x_2$)
Therefore (given that $x_1, x_2, x_3, ..., x_n$ are independent copies of the r.v. $x$),
$M_{\sum_{i=1}^{n} x_i}(\theta) = \prod_{i=1}^{n} M_{x_i}(\theta) = [M_x(\theta)]^n$
$M_{\frac{1}{n}\sum_{i=1}^{n} x_i}(\theta) = \prod_{i=1}^{n} M_{x_i}\left(\frac{\theta}{n}\right) = \left[M_x\left(\frac{\theta}{n}\right)\right]^n$

Example of MGF

Example
Observations $x_1$ through $x_n$ are independent copies of the r.v. $x \sim Po(\lambda)$. Suppose we are interested in the properties (distribution, moments, etc.) of the sample mean:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Example of MGF

Problem
Calculate the MGF of $x$;

Solution
$M_x(\theta) = E\{e^{\theta x}\} = \sum_{x=0}^{\infty} e^{\theta x} f(x)$
$= \sum_{x=0}^{\infty} e^{\theta x} \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^{\theta})^x}{x!}$
$= e^{-\lambda} \exp\{\lambda e^{\theta}\} = \exp\{\lambda(e^{\theta} - 1)\}$

Example of MGF

Problem
Calculate the MGF of $S_n = n\bar{X}$;

Solution
$M_{S_n}(\theta) = \prod_{i=1}^{n} M_{x_i}(\theta) = \left[\exp\{\lambda(e^{\theta} - 1)\}\right]^n = \exp\{n\lambda(e^{\theta} - 1)\}$
Note that the MGF of $S_n$ is of the same form as that for $x$, i.e. letting $\lambda^* = n\lambda$,
$M_{S_n}(\theta) = \exp\{\lambda^*(e^{\theta} - 1)\}$
i.e. $S_n \sim Po(n\lambda) = Po(\lambda^*)$.

Example of MGF

Problem
Calculate the MGF of $\bar{X}$;

Solution
$M_{\bar{X}}(\theta) = M_{S_n/n}(\theta) = M_{\frac{1}{n}\sum x_i}(\theta) = \prod_{i=1}^{n} M_{x_i}\left(\frac{\theta}{n}\right)$
$= \left[M_x\left(\frac{\theta}{n}\right)\right]^n = \left[\exp\{\lambda(e^{\theta/n} - 1)\}\right]^n$
$= \exp\{n\lambda(e^{\theta/n} - 1)\}$

Example of MGF

Problem
The moments of $\bar{X}$.

Solution
$E[\bar{X}^i] = \frac{d^i \exp\{n\lambda(e^{\theta/n} - 1)\}}{d\theta^i} \Big|_{\theta=0}$
$E[\bar{X}] = \frac{d \exp\{n\lambda(e^{\theta/n} - 1)\}}{d\theta} \Big|_{\theta=0} = \lambda$
$E[\bar{X}^2] = \frac{d^2 \exp\{n\lambda(e^{\theta/n} - 1)\}}{d\theta^2} \Big|_{\theta=0} = \lambda^2 + \frac{\lambda}{n}$
$\sigma^2_{\bar{X}} = E[\bar{X}^2] - (E[\bar{X}])^2 = \frac{\lambda}{n}$ (central moments)

Example of MGF

That is, we immediately find that
$E[\bar{X}] = \lambda$
$\sigma^2_{\bar{X}} = \frac{\lambda}{n}$
If we consider $\bar{X}$ as an estimator for $\lambda$, we refer to these properties as unbiasedness and, because the variance tends to zero as $n$ grows, consistency.
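The moments of the Poisson sample mean can be checked numerically from its MGF. The sketch below is an illustration under stated assumptions, not the method of the slides: derivatives at zero are approximated by central finite differences with a hypothetical step size `h`, and the values of `lam` (the Poisson mean) and `n` (the sample size) are arbitrary.

```python
import numpy as np

def mgf_sample_mean_poisson(theta, lam, n):
    """MGF of the mean of n independent Poisson(lam) draws: exp{n*lam*(e^(theta/n) - 1)}."""
    return np.exp(n * lam * (np.exp(theta / n) - 1.0))

def first_two_raw_moments(mgf, h=1e-3):
    """Approximate M'(0) and M''(0) by central finite differences."""
    first = (mgf(h) - mgf(-h)) / (2 * h)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
    return first, second

lam, n = 3.0, 10
m1, m2 = first_two_raw_moments(lambda th: mgf_sample_mean_poisson(th, lam, n))
variance = m2 - m1 ** 2
print(m1, variance)  # compare with lam and lam / n
```

The first raw moment should be close to the Poisson mean and the implied variance close to the mean divided by the sample size, matching the closed-form results.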

Introductory Financial Econometrics
Topic 3: ARMA Models
3 Credits, 51 Hours

Jianhua Gang
School of Finance
Renmin University of China

Spring 2013

We said we are interested in the $\psi_j$ in the representation:
$Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$
for the impulse response analysis and for forecasting.

However, in general we don't know the $\psi_j$, and we can't hope to estimate an infinite number of parameters, so we have to propose parsimonious models.

The Simplest Model: White Noise

Definition
$\{\varepsilon_t\}$ is white noise if:
$E(\varepsilon_t) = 0 \ \forall t$
$E(\varepsilon_t^2) = \sigma^2 \ \forall t$
$E(\varepsilon_t \varepsilon_s) = 0 \ \forall t, s \ (t \neq s)$
so, $\gamma_j = 0$, $\rho_j = 0$, and $\alpha_j^{(j)} = 0 \ \forall j \neq 0$, i.e. the process has no memory.

The Simplest Model: White Noise

If $\varepsilon_t$ is $w.n.(0, \sigma^2)$,
$\varepsilon_t$ may be independent, but need not be;
$\varepsilon_t$ may be strictly stationary, but need not be;
$\varepsilon_t$ is covariance stationary;
and if $Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$, then $Y_t$ is stationary if $\sum_{j=0}^{\infty} \psi_j^2 < \infty$;
and if $Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$, then $Y_t$ is stationary and ergodic for the mean if $\sum_{j=0}^{\infty} |\psi_j| < \infty$.

MA(1)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then $Y_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}$ is the MA(1).
We can check stationarity noticing that $\psi_0 = 1$, $\psi_1 = \theta$, so
$\sum_{j=0}^{\infty} \psi_j^2 = 1 + \theta^2 < \infty$.
Otherwise, we can check that the first two moments do not depend on time.
1. Mean: $E(Y_t) = \mu$
2. Autocovariances:
$\gamma_0 = E[(Y_t - \mu)^2] = E[(\varepsilon_t + \theta \varepsilon_{t-1})^2] = (1 + \theta^2)\sigma^2$
$\gamma_1 = E[(Y_t - \mu)(Y_{t-1} - \mu)] = \theta \sigma^2$
$\gamma_{j \geq 2} = 0$
3. Autocorrelations: $\rho_1 = \frac{\theta}{1 + \theta^2}$, $\rho_{j \geq 2} = 0$.

Invertible MA(1)

Taking $\mu = 0$ for simplicity, rewrite $\varepsilon_t = Y_t - \theta \varepsilon_{t-1}$ as $\varepsilon_t = Y_t - \theta L \varepsilon_t$ using the lag operator. Then, $(1 + \theta L)\varepsilon_t = Y_t$, so, for $|\theta| < 1$,
$\varepsilon_t = \frac{Y_t}{(1 + \theta L)} = Y_t \sum_{j=0}^{\infty} (-\theta)^j L^j = \sum_{j=0}^{\infty} (-\theta)^j Y_{t-j}$,
i.e. $Y_t = -\sum_{j=1}^{\infty} (-\theta)^j Y_{t-j} + \varepsilon_t$.
However, for $|\theta| \geq 1$, the representation is not invertible.

MA(q)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then $Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_q \varepsilon_{t-q}$ is MA(q).
Mean: $E(Y_t) = \mu$
Autocovariances:
$\gamma_0 = E[(\varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_q \varepsilon_{t-q})^2] = (1 + \theta_1^2 + ... + \theta_q^2)\sigma^2$
$\gamma_{j \leq q} = E[(\varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_q \varepsilon_{t-q})(\varepsilon_{t-j} + \theta_1 \varepsilon_{t-1-j} + ... + \theta_q \varepsilon_{t-q-j})]$
$= (\theta_j + \theta_1 \theta_{j+1} + \theta_2 \theta_{j+2} + ... + \theta_{q-j} \theta_q)\sigma^2$
$\gamma_{j > q} = 0$

MA(q)

The autocorrelations drop to 0 after $q$ lags.
The impulse responses are $\theta_j$, $j \leq q$, and drop to 0 after $q$ lags.
Invertibility: set $\mu = 0$; recall $Y_t = (1 + \theta_1 L + ... + \theta_q L^q)\varepsilon_t$ and factor $(1 + \theta_1 L + ... + \theta_q L^q) = (1 - \lambda_1 L)(1 - \lambda_2 L)...(1 - \lambda_q L)$. In the MA(1) we asked that $|\lambda_1| < 1$: in the same way here we have to ask that $|\lambda_1| < 1, |\lambda_2| < 1, ..., |\lambda_q| < 1$. This is sometimes stated as asking that the roots of the equation in $z$ of the form $(1 + \theta_1 z + ... + \theta_q z^q) = 0$ lie OUTSIDE the unit circle.
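The MA(q) autocovariances and the invertibility condition above can be sketched numerically. This is an illustrative sketch: the function names are hypothetical, the MA coefficients are passed as a list `theta` (an assumed interface), and the root check relies on `numpy.roots`, which expects polynomial coefficients ordered from the highest power down.

```python
import numpy as np

def ma_autocovariances(theta, sigma2=1.0):
    """gamma_0..gamma_q for Y_t = mu + eps_t + theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}."""
    weights = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # weight 1 on eps_t
    q = len(weights) - 1
    return np.array([sigma2 * np.dot(weights[: q + 1 - j], weights[j:]) for j in range(q + 1)])

def ma_is_invertible(theta):
    """True if all roots of 1 + theta_1*z + ... + theta_q*z^q lie outside the unit circle."""
    coeffs_high_to_low = np.concatenate((np.asarray(theta, dtype=float)[::-1], [1.0]))
    roots = np.roots(coeffs_high_to_low)
    return bool(np.all(np.abs(roots) > 1.0))

gamma_ma1 = ma_autocovariances([0.5])        # MA(1) with coefficient 0.5
rho_1 = gamma_ma1[1] / gamma_ma1[0]          # first autocorrelation
print(gamma_ma1, rho_1, ma_is_invertible([0.5]), ma_is_invertible([2.0]))
```

An MA(1) with coefficient 2 has the same first autocorrelation as one with coefficient 0.5, but only the latter passes the invertibility check.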

MA(infinity)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then $Y_t = \mu + \varepsilon_t + \psi_1 \varepsilon_{t-1} + ... = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}$ is MA($\infty$). Under the additional assumption that $\sum_{j=0}^{\infty} |\psi_j| < \infty$, we can derive the moments replacing $\theta_j$ by $\psi_j$ in a MA(q) and taking the limit for $q \to \infty$.
1. Mean: $E(Y_t) = \mu$
2. Autocovariances:
$\gamma_0 = \sum_{k=0}^{\infty} \psi_k^2 \sigma^2$
$\gamma_j = \sum_{k=0}^{\infty} \psi_k \psi_{k+j} \sigma^2$

AR(1)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then $Y_t = c + \phi Y_{t-1} + \varepsilon_t$ is AR(1). Assume further that $|\phi| < 1$. Since $Y_{t-1} = c + \phi Y_{t-2} + \varepsilon_{t-1}$, substitute into the previous equation:
$Y_t = c + \phi(c + \phi Y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t$
$= (1 + \phi)c + \phi^2 Y_{t-2} + \phi \varepsilon_{t-1} + \varepsilon_t$
$= ...$ iterating
$Y_t = \sum_{j=0}^{n} \phi^j c + \phi^{n+1} Y_{t-n-1} + \sum_{j=0}^{n} \phi^j \varepsilon_{t-j}$
as $n \to \infty$, and $|\phi| < 1$,
$Y_t = \frac{1}{1 - \phi} c + 0 + \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j}$

AR(1)

So an AR(1) with $|\phi| < 1$ may be written as an MA($\infty$). Notice that the condition $\sum_{j=0}^{\infty} |\psi_j| < \infty$ is met, because $\psi_j = \phi^j$, so
$\sum_{j=0}^{\infty} |\psi_j| = \sum_{j=0}^{\infty} |\phi|^j = \frac{1}{1 - |\phi|} < \infty$,
then it also follows that the process is stationary and ergodic for the mean.
This can also be obtained by rewriting $Y_t$ as $Y_t = c + \phi L Y_t + \varepsilon_t$, using the lag operator, and then $(1 - \phi L)Y_t = c + \varepsilon_t$. Since $|\phi| < 1$,
$Y_t = (1 - \phi L)^{-1} c + (1 - \phi L)^{-1} \varepsilon_t$
$= \sum_{j=0}^{\infty} \phi^j c + \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j}$
$= \frac{c}{1 - \phi} + \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j}$

AR(1)

Mean:
$E(Y_t) = \frac{c}{1 - \phi} \ (= \mu)$
Autocovariances: using the formula for the MA($\infty$) process,
$\gamma_0 = \sum_{k=0}^{\infty} \psi_k^2 \sigma^2 = \sum_{k=0}^{\infty} \phi^{2k} \sigma^2 = \frac{1}{1 - \phi^2} \sigma^2$
$\gamma_j = \sum_{k=0}^{\infty} \psi_k \psi_{k+j} \sigma^2 = \sum_{k=0}^{\infty} \phi^k \phi^{k+j} \sigma^2 = \sum_{k=0}^{\infty} \phi^{2k} \phi^j \sigma^2 = \frac{\phi^j}{1 - \phi^2} \sigma^2$

AR(1)

Autocorrelations:
$\rho_j = \frac{\gamma_j}{\gamma_0} = \phi^j$
Impulse Response Function: $\psi_j = \phi^j$

AR(1)

Upon knowing that the process is stationary, we could derive the mean and autocovariances:
Mean: $E(Y_t) = E(c + \phi Y_{t-1} + \varepsilon_t) = c + \phi E(Y_{t-1}) + E(\varepsilon_t)$; using stationarity, $E(Y_t) = \mu$, $E(Y_{t-1}) = \mu$, so $\mu = c + \phi\mu$ and then $\mu = \frac{c}{1 - \phi}$.
Autocovariances: Replacing $c = \mu(1 - \phi)$, rewrite $Y_t$ as
$Y_t = \mu(1 - \phi) + \phi Y_{t-1} + \varepsilon_t$
$Y_t - \mu = \phi(Y_{t-1} - \mu) + \varepsilon_t$
then
$\gamma_0 = E(Y_t - \mu)^2 = E(\phi(Y_{t-1} - \mu) + \varepsilon_t)^2$
$= \phi^2 E(Y_{t-1} - \mu)^2 + E(\varepsilon_t^2) + 2\phi E((Y_{t-1} - \mu)\varepsilon_t)$
$= \phi^2 \gamma_0 + \sigma^2$

AR(1)

solving for $\gamma_0$,
$\gamma_0 = \frac{\sigma^2}{1 - \phi^2}$.
$\gamma_{j \geq 1} = E[(Y_t - \mu)(Y_{t-j} - \mu)]$
$= E[(\phi(Y_{t-1} - \mu) + \varepsilon_t)(Y_{t-j} - \mu)]$
$= \phi E[(Y_{t-1} - \mu)(Y_{t-j} - \mu)] + E(\varepsilon_t (Y_{t-j} - \mu))$
$= \phi \gamma_{j-1}$
So
$\gamma_{j \geq 1} = \frac{\phi^j}{1 - \phi^2} \sigma^2$.

AR(p)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then $Y_t = c + \phi_1 Y_{t-1} + ... + \phi_p Y_{t-p} + \varepsilon_t$ is AR(p).

Problem
How can we check for stationarity?

Solution
Factoring $(1 - \phi_1 L - ... - \phi_p L^p) = (1 - \lambda_1 L)...(1 - \lambda_p L)$, stationarity follows if $|\lambda_j| < 1$ for all $j$.
Another way to state this condition is to check that the solutions of the equation in $z$, $(1 - \phi_1 z - ... - \phi_p z^p) = 0$, are all OUTSIDE the unit circle.
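The root condition for AR(p) stationarity can be checked numerically. The following is a minimal sketch, assuming the AR coefficients are supplied as a list `phi` (a hypothetical interface); it uses `numpy.roots`, which takes coefficients ordered from the highest power of $z$ down to the constant.

```python
import numpy as np

def ar_is_stationary(phi):
    """Check whether all roots of 1 - phi_1*z - ... - phi_p*z^p lie outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    coeffs_high_to_low = np.concatenate((-phi[::-1], [1.0]))
    roots = np.roots(coeffs_high_to_low)
    return bool(np.all(np.abs(roots) > 1.0)), roots

ok_ar1, roots_ar1 = ar_is_stationary([0.5])            # single root at z = 2
ok_ar2, roots_ar2 = ar_is_stationary([1.2, -0.35])     # roots at 1/0.7 and 1/0.5
ok_explosive, _ = ar_is_stationary([1.1])              # root inside the unit circle
print(ok_ar1, ok_ar2, ok_explosive)
```

Coefficients that place a root exactly on the unit circle are a boundary case; with floating-point roots the strict comparison used here may be sensitive to rounding.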

AR(p)

Given stationarity,
Mean:
$E(Y_t) = E(c + \phi_1 Y_{t-1} + ... + \phi_p Y_{t-p} + \varepsilon_t)$
$\mu = c + \phi_1 \mu + ... + \phi_p \mu$
$\mu = \frac{c}{1 - \phi_1 - ... - \phi_p}$

AR(p)

Given stationarity,
Autocovariances:
$\gamma_0 = E(Y_t - \mu)^2$
$= E[(\phi_1 (Y_{t-1} - \mu) + ... + \phi_p (Y_{t-p} - \mu) + \varepsilon_t)(Y_t - \mu)]$
$= E[\phi_1 (Y_{t-1} - \mu)(Y_t - \mu) + ... + \phi_p (Y_{t-p} - \mu)(Y_t - \mu) + \varepsilon_t (Y_t - \mu)]$
$= \phi_1 \gamma_1 + ... + \phi_p \gamma_p + \sigma^2$
$\gamma_{j \geq 1} = E[(Y_t - \mu)(Y_{t-j} - \mu)]$
$= E[(\phi_1 (Y_{t-1} - \mu) + ... + \phi_p (Y_{t-p} - \mu) + \varepsilon_t)(Y_{t-j} - \mu)]$
$= E[\phi_1 (Y_{t-1} - \mu)(Y_{t-j} - \mu) + ... + \phi_p (Y_{t-p} - \mu)(Y_{t-j} - \mu) + \varepsilon_t (Y_{t-j} - \mu)]$
$= \phi_1 \gamma_{j-1} + ... + \phi_p \gamma_{j-p}$
This is a linear system in $\gamma_j$, $j = 0, ..., p$.

AR(p)

Now try AR(2).
$\gamma_0 = \phi_1 \gamma_1 + \phi_2 \gamma_2 + \sigma^2$
$\gamma_1 = \phi_1 \gamma_0 + \phi_2 \gamma_1$
$\gamma_2 = \phi_1 \gamma_1 + \phi_2 \gamma_0$
and notice that $\gamma_1 = \frac{\phi_1}{1 - \phi_2} \gamma_0$, so replacing $\gamma_1$ and $\gamma_2$,
$\gamma_1 = \frac{\phi_1}{1 - \phi_2} \gamma_0$
$\gamma_2 = \left(\frac{\phi_1^2}{1 - \phi_2} + \phi_2\right) \gamma_0$
$\gamma_0 = \frac{(1 - \phi_2)}{(1 + \phi_2)\left[(1 - \phi_2)^2 - \phi_1^2\right]} \sigma^2$
We can therefore also get autocorrelations.

AR(p)

Notice that if the roots of $(1 - \phi_1 z - \phi_2 z^2 = 0)$ are complex, then the autocorrelations show cyclical dynamics.
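The AR(2) autocovariance equations form a small linear system that can be solved directly and compared with the closed form. The sketch below is illustrative: the function names and the parameter values are assumptions, and the three equations are simply rearranged so that the unknowns are on the left-hand side.

```python
import numpy as np

def ar2_autocovariances(phi1, phi2, sigma2=1.0):
    """Solve the linear system for (gamma_0, gamma_1, gamma_2) of a stationary AR(2)."""
    # gamma_0 - phi1*gamma_1 - phi2*gamma_2 = sigma2
    # -phi1*gamma_0 + (1 - phi2)*gamma_1    = 0
    # -phi2*gamma_0 - phi1*gamma_1 + gamma_2 = 0
    A = np.array([[1.0, -phi1, -phi2],
                  [-phi1, 1.0 - phi2, 0.0],
                  [-phi2, -phi1, 1.0]])
    b = np.array([sigma2, 0.0, 0.0])
    return np.linalg.solve(A, b)

def ar2_gamma0_closed_form(phi1, phi2, sigma2=1.0):
    """Closed-form variance of a stationary AR(2)."""
    return (1 - phi2) * sigma2 / ((1 + phi2) * ((1 - phi2) ** 2 - phi1 ** 2))

phi1, phi2 = 0.5, 0.3
gamma = ar2_autocovariances(phi1, phi2)
rho = gamma / gamma[0]   # autocorrelations at lags 0, 1, 2
print(gamma, rho, ar2_gamma0_closed_form(phi1, phi2))
```

The same approach extends to AR(p) by writing the first $p + 1$ equations in matrix form; higher-lag autocovariances then follow from the recursion.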

Impulse Response Function (IRF)

In general we can compute the IRF by inverting $\phi(L)Y_t = \varepsilon_t$ into $Y_t = \phi(L)^{-1} \varepsilon_t$ (here we used stationarity), so $\phi(L)^{-1} = \psi(L)$.

ARMA(p, q)

Let $\varepsilon_t \sim w.n.(0, \sigma^2)$, then
$Y_t = c + \phi_1 Y_{t-1} + ... + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_q \varepsilon_{t-q}$ is ARMA(p, q).
Stationarity of the whole ARMA(p, q) depends on the autoregressive part only (whilst invertibility depends on the MA part only):
Using the lag operator, the ARMA(p, q) is
$(1 - \phi_1 L - ... - \phi_p L^p)Y_t = c + (1 + \theta_1 L + ... + \theta_q L^q)\varepsilon_t$.
For stationarity, we have to check if the roots of $(1 - \phi_1 z - ... - \phi_p z^p) = 0$ are all outside the unit circle.
For invertibility, we require that the roots of $(1 + \theta_1 z + ... + \theta_q z^q) = 0$ are outside the unit circle.

ARMA(p, q)

Given stationarity,
Mean:
$E(Y_t) = E(c + \phi_1 Y_{t-1} + ... + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + ... + \theta_q \varepsilon_{t-q})$
$\mu = c + \phi_1 \mu + ... + \phi_p \mu + 0 + ... + 0$
$\mu = \frac{c}{1 - \phi_1 - ... - \phi_p}$

ARMA(p, q)

Given stationarity,
Autocovariances: The autocovariances are a combination of those of an AR(p) and an MA(q), so for $j > q$,
$\gamma_j = \phi_1 \gamma_{j-1} + ... + \phi_p \gamma_{j-p}$
For example, ARMA(1, 1), $Y_t = c + \phi Y_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$ ($|\phi| < 1$):
Firstly notice that
$E[(Y_t - \mu)\varepsilon_t] = E[(\phi(Y_{t-1} - \mu) + \varepsilon_t + \theta \varepsilon_{t-1})\varepsilon_t] = 0 + \sigma^2 + 0 = \sigma^2$
$E[(Y_t - \mu)\varepsilon_{t-1}] = E[(\phi(Y_{t-1} - \mu) + \varepsilon_t + \theta \varepsilon_{t-1})\varepsilon_{t-1}] = \phi\sigma^2 + 0 + \theta\sigma^2 = (\phi + \theta)\sigma^2$

ARMA(p, q)

so
$\gamma_0 = E[(\phi(Y_{t-1} - \mu) + \varepsilon_t + \theta \varepsilon_{t-1})(Y_t - \mu)]$
$= \phi E[(Y_{t-1} - \mu)(Y_t - \mu)] + E[\varepsilon_t (Y_t - \mu)] + \theta E[\varepsilon_{t-1}(Y_t - \mu)]$
$= \phi \gamma_1 + \sigma^2 + \theta(\phi + \theta)\sigma^2$
and
$\gamma_1 = E[(Y_t - \mu)(Y_{t-1} - \mu)]$
$= \phi E[(Y_{t-1} - \mu)(Y_{t-1} - \mu)] + E[\varepsilon_t (Y_{t-1} - \mu)] + \theta E[\varepsilon_{t-1}(Y_{t-1} - \mu)]$
$= \phi \gamma_0 + 0 + \theta \sigma^2$

ARMA(p, q)

so
$\gamma_0 = \left(1 + \frac{(\phi + \theta)^2}{1 - \phi^2}\right)\sigma^2$
$\gamma_1 = \left(\phi + \theta + \frac{(\phi + \theta)^2}{1 - \phi^2}\phi\right)\sigma^2$
$\gamma_{j \geq 2} = \phi \gamma_{j-1}$
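The ARMA(1, 1) autocovariance formulas can be cross-checked against a truncated moving-average representation. This is a sketch under stated assumptions: the function names are hypothetical, the infinite sum is truncated at `n_terms` weights, and the moving-average weights are taken to be 1 at lag zero and (AR coefficient + MA coefficient) times the AR coefficient raised to the power lag minus one thereafter.

```python
import numpy as np

def arma11_autocovariances(phi, theta, sigma2=1.0, max_lag=5):
    """Closed-form gamma_0, gamma_1 and the recursion gamma_j = phi*gamma_{j-1} for j >= 2."""
    gamma = np.empty(max_lag + 1)
    gamma[0] = (1 + (phi + theta) ** 2 / (1 - phi ** 2)) * sigma2
    gamma[1] = phi * gamma[0] + theta * sigma2
    for j in range(2, max_lag + 1):
        gamma[j] = phi * gamma[j - 1]
    return gamma

def autocov_from_ma_weights(phi, theta, sigma2=1.0, max_lag=5, n_terms=200):
    """Approximate autocovariances from truncated MA(infinity) weights."""
    lags = np.arange(1, n_terms)
    weights = np.concatenate(([1.0], (phi + theta) * phi ** (lags - 1)))
    return np.array([sigma2 * np.dot(weights[: n_terms - k], weights[k:])
                     for k in range(max_lag + 1)])

gamma_closed = arma11_autocovariances(phi=0.6, theta=0.3)
gamma_ma = autocov_from_ma_weights(phi=0.6, theta=0.3)
print(gamma_closed)
print(gamma_ma)
```

The two arrays should agree up to truncation error, which is negligible here because the weights decay geometrically.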

ARMA(p, q)

The autocorrelations can be derived in the same way: for the generic ARMA(p, q), for $j > q$,
$\rho_j = \phi_1 \rho_{j-1} + ... + \phi_p \rho_{j-p}$

Impulse Response Function of ARMA(p, q)

Given stationarity, inverting $\phi(L)Y_t = \theta(L)\varepsilon_t$,
$Y_t = \phi(L)^{-1}\theta(L)\varepsilon_t$
$\phi(L)^{-1}\theta(L) = \psi(L)$
$\theta(L) = \phi(L)\psi(L)$
$(1 + \theta_1 L + ... + \theta_q L^q) = (1 - \phi_1 L - ... - \phi_p L^p)(1 + \psi_1 L + \psi_2 L^2 + ...)$
$(1 + \theta_1 L + ... + \theta_q L^q) = 1 - \phi_1 L + \psi_1 L - \phi_2 L^2 - \phi_1 \psi_1 L^2 + \psi_2 L^2 - \phi_3 L^3 - \phi_2 \psi_1 L^3 - \phi_1 \psi_2 L^3 + \psi_3 L^3 + ...$

Impulse Response Function of ARMA(p, q)

Solve this for the various powers of $L$:
$L^0$: $1 = 1$
$L^1$: $\psi_1 = \theta_1 + \phi_1$
$L^2$: $\psi_2 = \theta_2 + \phi_1 \psi_1 + \phi_2$
$L^3$: $\psi_3 = \theta_3 + \phi_3 + \phi_2 \psi_1 + \phi_1 \psi_2$

Impulse Response Function of ARMA(p, q)

In the ARMA(1, 1) case, then,
$\psi_1 = \phi + \theta$
$\psi_{j \geq 2} = \phi \psi_{j-1} = (\phi + \theta)\phi^{j-1}$
The ARMA(1, 1) could also be decomposed in impulse responses by looking at
$Y_t = \phi Y_{t-1} + \eta_t$, $\eta_t = \varepsilon_t + \theta \varepsilon_{t-1}$
(and $\mu = 0$ to keep notation short).

Impulse Response Function of ARMA(p, q)

Then,
$Y_t = \sum_{j=0}^{\infty} \phi^j \eta_{t-j} = \sum_{j=0}^{\infty} \phi^j (\varepsilon_{t-j} + \theta \varepsilon_{t-j-1})$
$= \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j} + \theta \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j-1}$
$= \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j} + \theta \sum_{l=1}^{\infty} \phi^{l-1} \varepsilon_{t-l}$
$= \varepsilon_t + \phi \sum_{j=1}^{\infty} \phi^{j-1} \varepsilon_{t-j} + \theta \sum_{j=1}^{\infty} \phi^{j-1} \varepsilon_{t-j}$
$= \varepsilon_t + (\phi + \theta) \sum_{j=1}^{\infty} \phi^{j-1} \varepsilon_{t-j}$

Common Factors

In ARMA modelling, it may be that the same factor appears both in $\phi(L)$ and in $\theta(L)$: in this case, the ARMA(p, q) process cannot be distinguished, on the basis of the autocorrelation structure (or from the weights in the MA($\infty$) representation), from an ARMA(p - 1, q - 1) process.

In this case, it is sometimes also said that the ARMA(p, q) model is overparametrised.

The ARMA(p, q) model may be simplified (and indeed it may be desirable to do so, especially if the parameters $\phi_1, ..., \phi_p$ and $\theta_1, ..., \theta_q$ have to be estimated).

Common Factors

Example:
$Y_t = 1.2Y_{t-1} - 0.35Y_{t-2} + \varepsilon_t - 0.7\varepsilon_{t-1}$
is
$Y_t - 1.2Y_{t-1} + 0.35Y_{t-2} = \varepsilon_t - 0.7\varepsilon_{t-1}$
$(1 - 1.2L + 0.35L^2)Y_t = (1 - 0.7L)\varepsilon_t$
$(1 - 0.7L)(1 - 0.5L)Y_t = (1 - 0.7L)\varepsilon_t$
so, simplifying $(1 - 0.7L)$, the process has the same autocorrelation structure (and the same weights in the MA($\infty$) representation) as
$(1 - 0.5L)Y_t = \varepsilon_t$
i.e.
$Y_t = 0.5Y_{t-1} + \varepsilon_t$

A Final Comment on Stationary and Invertible ARMA

We already saw that for a stationary ARMA(p, q), it is also possible to give an MA($\infty$) representation; in the same way, it is also possible to give an AR($\infty$) representation (indeed, this is a proper definition of "invertibility"). All these models have the same autocovariance / autocorrelation structures, and are therefore indistinguishable.

We can choose the representation that is more convenient for our purpose: for example, we may like the MA($\infty$) if we are interested in the impulse response function, the AR($\infty$) if we want to compute $\varepsilon_t$ given observations on $\{Y_t\}$ (and assuming we know the parameters), or we may prefer the ARMA(p, q) if we are interested in estimating the parameters.

T OPIC 3 ARMA M ODELS

T RANSFORMATION OF ARMA M ODELS

T OPIC 3 ARMA M ODELS

T RANSFORMATION OF ARMA M ODELS

T RANSFORMATION OF ARMA M ODELS

F ILTERS
Sometimes data are treated (by nature or by the researcher) by
summing / averaging / differencing ...
For Yt , a filter h(L) is applied as:
Xt = h(L)Yt

ARMA models are quite standard and typical behaviour in econometrics,


however, we may derive some transformations from this particular
framework.

where

h (L) =

hj Lj

j =

If

|hj | < ,

j =

j =

| j | <

then
Xt = + (L)t
where
= h(1)c, (L) = h(L)(L)
DOES AVERAGING REVEAL A SIGNAL?

Suppose $Y_t$ is w.n.$(0, \sigma^2)$, and

$$X_t = \frac{1}{k}\sum_{j=0}^{k-1} Y_{t-j}$$

as in a (moving) average of quarterly or monthly data on a yearly basis: then averaging induces dependence where there was none.

SUM OF ARMA PROCESSES

Our variable of interest (signal) may be obscured by noise.

Example:

$$Y_t = X_t + v_t$$

where

$$X_t = u_t + \delta u_{t-1}$$

and $u_t$ is w.n.$(0, \sigma_u^2)$, $v_t$ is w.n.$(0, \sigma_v^2)$, $E(u_t v_\tau) = 0$ for all $t, \tau$.

Suppose we are interested in $X_t$, but we can only observe $Y_t$. What are the properties of $Y_t$?
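Returning to the moving-average filter: the dependence induced by averaging can be made explicit. The sketch below assumes white-noise input and uses the fact that the autocovariance of the filtered series at lag $j$ is proportional to $\sum_i h_i h_{i+j}$. The function name and the choice $k = 4$ (as for quarterly data averaged over a year) are illustrative assumptions.

```python
import numpy as np

def filtered_white_noise_acf(h, max_lag):
    """Autocorrelations of X_t = sum_j h[j] * Y_{t-j} when Y_t is white noise."""
    h = np.asarray(h, dtype=float)
    gamma = np.array([
        np.sum(h[: len(h) - j] * h[j:]) if j < len(h) else 0.0
        for j in range(max_lag + 1)
    ])
    return gamma / gamma[0]

k = 4  # e.g. quarterly data averaged over one year (illustrative)
rho = filtered_white_noise_acf(np.full(k, 1.0 / k), max_lag=5)
print(rho)
```

For equal weights the autocorrelation at lag $j$ is $(k-j)/k$ for $j < k$ and zero afterwards, so the averaged series behaves like an MA($k-1$) even though the input is uncorrelated.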

SUM OF ARMA PROCESSES

$E(Y_t) = 0$ for all $t$.

$$\gamma_0 = E(X_t + v_t)^2 = E(X_t^2) + E(v_t^2) + 2E(X_t v_t) = (1 + \delta^2)\sigma_u^2 + \sigma_v^2$$

$$
\begin{aligned}
\gamma_1 &= E(X_t + v_t)(X_{t-1} + v_{t-1}) \\
&= E(X_t X_{t-1}) + E(v_t X_{t-1}) + E(X_t v_{t-1}) + E(v_t v_{t-1}) \\
&= \delta\sigma_u^2
\end{aligned}
$$

$$\gamma_j = 0, \qquad |j| \geq 2$$

so $Y_t$ is MA(1), i.e., we can represent it as

$$Y_t = \varepsilon_t + \theta\varepsilon_{t-1}, \qquad \varepsilon_t \sim \text{w.n.}(0, \sigma^2)$$

In order to find $\theta$, compute

$$\rho_1 = \frac{\delta\sigma_u^2}{(1 + \delta^2)\sigma_u^2 + \sigma_v^2}.$$

Since in an MA(1),

$$\rho_1 = \frac{\theta}{1 + \theta^2}$$

solve for $\theta$:

$$\theta = \rho_1 + \rho_1\theta^2$$

$$\theta_{1,2} = \frac{1 \pm \sqrt{1 - 4\rho_1^2}}{2\rho_1}$$

For $\theta = \theta_1$, where

$$\theta_1 = \frac{1 - \sqrt{1 - 4\rho_1^2}}{2\rho_1}$$

the process is invertible. We can also derive $\sigma^2$, for example from $\gamma_1 = \theta\sigma^2$, so

$$\sigma^2 = \frac{\delta}{\theta}\sigma_u^2.$$

(Notice that we observe $Y_t$, so we can estimate $\theta$ and $\sigma^2$. However, since there are three parameters of interest, $\delta$, $\sigma_u^2$, and $\sigma_v^2$, we cannot estimate them without an identification assumption. In other words, $Y_t$ contains less information than $X_t$ and $v_t$.)

LEMMA

In general, consider

$$Y_t = X_t + W_t$$

where $X_t$ and $W_t$ are (zero mean) stationary processes such that $X_t$ and $W_\tau$ are not correlated at any $t, \tau$; then

$$E(Y_t Y_{t-j}) = E(X_t X_{t-j}) + E(W_t W_{t-j})$$

i.e.

$$\gamma_j^Y = \gamma_j^X + \gamma_j^W$$
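The mapping from the signal-plus-noise parameters to the observable MA(1) parameters can be sketched as follows. The parameter values ($\delta = 0.5$, $\sigma_u^2 = \sigma_v^2 = 1$) and the function name are illustrative assumptions; the code only applies the formulas for $\gamma_0$, $\gamma_1$, $\rho_1$ and the invertible root.

```python
import math

def observed_ma1(delta, sigma2_u, sigma2_v):
    """MA(1) parameters (theta, sigma2) of Y_t = X_t + v_t,
    where X_t = u_t + delta * u_{t-1} and v_t is independent white noise."""
    gamma0 = (1 + delta ** 2) * sigma2_u + sigma2_v
    gamma1 = delta * sigma2_u
    rho1 = gamma1 / gamma0
    if rho1 == 0.0:
        return 0.0, gamma0  # Y_t is itself white noise
    # invertible root of theta = rho1 + rho1 * theta^2
    theta = (1 - math.sqrt(1 - 4 * rho1 ** 2)) / (2 * rho1)
    sigma2 = gamma1 / theta  # from gamma1 = theta * sigma2
    return theta, sigma2

theta, sigma2 = observed_ma1(delta=0.5, sigma2_u=1.0, sigma2_v=1.0)
print(theta, sigma2)
```

Two numbers come out, although three went in: this is the identification problem noted in the text.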

SUM OF TWO MA PROCESSES

LEMMA

If $X_t$ is MA($q_1$) and $W_t$ is MA($q_2$), then $Y_t$ is MA($\max[q_1, q_2]$).

SUM OF TWO AR PROCESSES

Suppose

$$Y_t = X_t + W_t$$

where

$$(1 - \pi L)X_t = u_t, \qquad (1 - \rho L)W_t = v_t \qquad (\pi \neq \rho)$$

then

$$(1 - \rho L)(1 - \pi L)X_t = (1 - \rho L)u_t,$$
$$(1 - \pi L)(1 - \rho L)W_t = (1 - \pi L)v_t,$$

so

$$(1 - \pi L)(1 - \rho L)(X_t + W_t) = (1 - \rho L)u_t + (1 - \pi L)v_t$$

So $Y_t$ is ARMA(2, 1). (If $\pi = \rho$, $Y_t$ is AR(1).)

SUM OF TWO ARMA PROCESSES

LEMMA

If $X_t$ is ARMA($p_1, q_1$) and $W_t$ is ARMA($p_2, q_2$), then $Y_t$ is ARMA($p, q$) with

$$p \leq p_1 + p_2$$

and

$$q \leq \max(p_1 + q_2, p_2 + q_1)$$
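For the sum of two AR(1) processes, the AR(2) polynomial and the MA(1) structure of the right-hand side can be checked with a few lines. The coefficient values and variances below are illustrative assumptions, and the two AR coefficients are simply called `a` and `b` in the code.

```python
import numpy as np

a, b = 0.5, 0.8                   # AR coefficients of X_t and W_t (illustrative)
sigma2_u, sigma2_v = 1.0, 2.0     # innovation variances (illustrative)

# AR polynomial of Y_t: (1 - a L)(1 - b L), coefficients in increasing powers of L
ar_poly = np.convolve([1.0, -a], [1.0, -b])

# Right-hand side e_t = (1 - b L) u_t + (1 - a L) v_t, with u and v uncorrelated
gamma0 = (1 + b ** 2) * sigma2_u + (1 + a ** 2) * sigma2_v
gamma1 = -b * sigma2_u - a * sigma2_v
gamma2 = 0.0  # no term of e_t reaches back two periods

print(ar_poly, gamma0, gamma1, gamma2)
```

The autocovariances of the right-hand side vanish beyond lag 1, so it has an MA(1) representation, which gives the ARMA(2, 1) stated above.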

TOPIC 4: ESTIMATION OF ARMA (MLE)

ESTIMATION: SAMPLE MOMENTS

We described the properties of some stationary processes by focusing on some population moments (mean, autocovariances, ...). However, we only have the data that we observed, $(y_1, ..., y_T)'$, so we can only compute estimates of these moments. Are these estimates useful?

Sample mean

$$\bar{Y} = \frac{1}{T}\sum_{t=1}^{T} Y_t$$

Sample autocovariance

$$\hat{\gamma}_j = \frac{1}{T}\sum_{t=j+1}^{T}(Y_t - \bar{Y})(Y_{t-j} - \bar{Y})$$

Sample autocorrelation

$$\hat{\rho}_j = \frac{\hat{\gamma}_j}{\hat{\gamma}_0}$$

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

Let

$$Y = (Y_1, ..., Y_T)'$$

be a Normally distributed vector with

$$E(Y) = \mu, \qquad E((Y - \mu)(Y - \mu)') = \Omega$$

The Gaussian density, computed at the points

$$y = (y_T, ..., y_1)'$$

in the support of $Y$ is

$$f_{Y_T,...,Y_1}(y_T, ..., y_1) = (2\pi)^{-T/2}|\Omega|^{-1/2}\exp\left(-\frac{1}{2}(y - \mu)'\Omega^{-1}(y - \mu)\right)$$
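The sample moments defined above translate directly into code. This is a minimal sketch that uses the divisor $T$ exactly as in the formulas; the function names and the four data points are made up for illustration.

```python
import numpy as np

def sample_autocov(y, j):
    """Sample autocovariance at lag j, with divisor T as in the formula above."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    return np.sum(d[j:] * d[: T - j]) / T

def sample_autocorr(y, j):
    """Sample autocorrelation at lag j."""
    return sample_autocov(y, j) / sample_autocov(y, 0)

y = [1.0, 2.0, 3.0, 4.0]  # illustrative data only
print(sample_autocov(y, 0), sample_autocov(y, 1), sample_autocorr(y, 1))
```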

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

Now assume that $y = (y_T, ..., y_1)'$ is the realization of $Y$, and consider $\Omega = \Omega(\boldsymbol{\theta})$, where $\boldsymbol{\theta}$ is a set of parameters of interest. Then,

$$f_{Y_T,...,Y_1}(y_T, ..., y_1; \boldsymbol{\theta}) = (2\pi)^{-T/2}|\Omega(\boldsymbol{\theta})|^{-1/2}\exp\left(-\frac{1}{2}(y - \mu)'\Omega(\boldsymbol{\theta})^{-1}(y - \mu)\right)$$

is the likelihood function. Maximizing that function w.r.t. $\boldsymbol{\theta}$ yields the (exact) maximum likelihood estimate.

Note the difference between $\boldsymbol{\theta}$ and $\boldsymbol{\theta}_0$.

EXAMPLES:

AR(1) $(|\phi_0| < 1)$:

$$Y_t = c_0 + \phi_0 Y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

$\boldsymbol{\theta} = (c, \phi, \sigma^2)'$, $(|\phi| < 1)$ and

$$
\Omega(\boldsymbol{\theta}) = \frac{\sigma^2}{1 - \phi^2}
\begin{pmatrix}
1 & \phi & \cdots & \phi^{T-2} & \phi^{T-1} \\
\phi & 1 & \cdots & \phi^{T-3} & \phi^{T-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\phi^{T-2} & \phi^{T-3} & \cdots & 1 & \phi \\
\phi^{T-1} & \phi^{T-2} & \cdots & \phi & 1
\end{pmatrix}
$$

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

EXAMPLES:

MA(1) $(|\theta_0| < 1)$:

$$Y_t = \mu_0 + \varepsilon_t + \theta_0\varepsilon_{t-1}, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

$\boldsymbol{\theta} = (\mu, \theta, \sigma^2)'$, and

$$
\Omega(\boldsymbol{\theta}) = \sigma^2(1 + \theta^2)
\begin{pmatrix}
1 & \frac{\theta}{1 + \theta^2} & 0 & \cdots & 0 \\
\frac{\theta}{1 + \theta^2} & 1 & \frac{\theta}{1 + \theta^2} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & \frac{\theta}{1 + \theta^2} & 1 & \frac{\theta}{1 + \theta^2} \\
0 & \cdots & 0 & \frac{\theta}{1 + \theta^2} & 1
\end{pmatrix}
$$

The likelihood function may be computed for a given set of observations and for any parameter (within the range of the parameter space).

For example, assume that we know $\mu_0 = 0$ and that we observed

| time | $y_1$ | $y_2$ | $y_3$ | $y_4$ |
| obs. | 0.5 | -0.8 | -0.2 | 2 |

and suppose you want to estimate $\theta_0$ in the MA(1) model with the additional assumption that $\mu_0 = 0$ and $\sigma_0^2 = 1$: consider five potential values for $\theta_0$: $-0.5, -0.25, 0, 0.25, 0.5$.

Then, we have to compute $\Omega(\theta)$ for each $\theta$: for example, when $\theta = 0.5$,

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

EXAMPLES:

then,

$$
\Omega(\theta) = (1 + 0.5^2)
\begin{pmatrix}
1 & \frac{0.5}{1 + 0.5^2} & 0 & 0 \\
\frac{0.5}{1 + 0.5^2} & 1 & \frac{0.5}{1 + 0.5^2} & 0 \\
0 & \frac{0.5}{1 + 0.5^2} & 1 & \frac{0.5}{1 + 0.5^2} \\
0 & 0 & \frac{0.5}{1 + 0.5^2} & 1
\end{pmatrix}
$$

and

$$
(y - \mu)'\Omega(\theta)^{-1}(y - \mu) =
\begin{pmatrix} 0.5 & -0.8 & -0.2 & 2 \end{pmatrix}
\begin{pmatrix}
1.25 & 0.5 & 0 & 0 \\
0.5 & 1.25 & 0.5 & 0 \\
0 & 0.5 & 1.25 & 0.5 \\
0 & 0 & 0.5 & 1.25
\end{pmatrix}^{-1}
\begin{pmatrix} 0.5 \\ -0.8 \\ -0.2 \\ 2 \end{pmatrix}
= 4.6903
$$

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

EXAMPLES:

So,

$$
\begin{aligned}
(2\pi)^{-T/2}|\Omega(\theta)|^{-1/2}\exp\left(-\frac{1}{2}(y - \mu)'\Omega(\theta)^{-1}(y - \mu)\right)
&= (2\pi)^{-4/2}(1.332)^{-1/2}\exp\left(-\frac{1}{2}\,4.6903\right) \\
&= 2.1033 \times 10^{-3}
\end{aligned}
$$

Therefore, we may get all the likelihoods for different $\theta$.

| $\theta$ | -0.5 | -0.25 | 0 | 0.25 | 0.5 |
| $10^3 f$ | 3.178 | 2.618 | 2.153 | 1.967 | 2.103 |

The function may be computed for all the $\theta$, $|\theta| < 1$ ($\hat{\theta} = -0.76$).
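The worked MA(1) example can be reproduced with the exact Gaussian likelihood. One assumption needs stating: the slide text does not show the signs of the observations, and the values used here, (0.5, -0.8, -0.2, 2), are the sign pattern (unique up to flipping all four signs) that reproduces the quoted quadratic form 4.6903 and the quoted likelihood values. The function name is hypothetical.

```python
import numpy as np

def ma1_exact_likelihood(y, theta, mu=0.0, sigma2=1.0):
    """Exact Gaussian likelihood of an MA(1), built from the full T x T covariance."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # variance (1 + theta^2) * sigma2 on the diagonal, theta * sigma2 at lag 1
    omega = sigma2 * ((1 + theta ** 2) * np.eye(T)
                      + theta * (np.eye(T, k=1) + np.eye(T, k=-1)))
    dev = y - mu
    quad = dev @ np.linalg.solve(omega, dev)
    det = np.linalg.det(omega)
    return (2 * np.pi) ** (-T / 2) * det ** (-0.5) * np.exp(-0.5 * quad)

y = [0.5, -0.8, -0.2, 2.0]  # assumed signs, see the note above
for theta in (-0.5, -0.25, 0.0, 0.25, 0.5):
    print(theta, 1e3 * ma1_exact_likelihood(y, theta))
```

Evaluating on a finer grid of $\theta$ in $(-1, 1)$ gives the whole likelihood curve. Each evaluation solves a $T \times T$ system, which is what makes this approach expensive for long series.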

ESTIMATION: MAXIMUM LIKELIHOOD (ML)

EXAMPLES:

The computation of

$$(2\pi)^{-T/2}|\Omega(\boldsymbol{\theta})|^{-1/2}\exp\left(-\frac{1}{2}(y - \mu)'\Omega(\boldsymbol{\theta})^{-1}(y - \mu)\right)$$

is very heavy, because it requires the inversion of the $T \times T$ matrix $\Omega(\boldsymbol{\theta})$ for all the admissible values $\boldsymbol{\theta}$.

Luckily, it is sometimes easy to rewrite the likelihood function in a way that does not require the inversion of $\Omega(\boldsymbol{\theta})$; otherwise, it is also possible to modify the problem so that, again, we can avoid the inversion of $\Omega(\boldsymbol{\theta})$.

ML OF AR(1)

$$Y_t = c_0 + \phi_0 Y_{t-1} + \varepsilon_t, \qquad |\phi_0| < 1, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

Then

$$Y_t \sim N\left(\frac{c_0}{1 - \phi_0}, \frac{\sigma_0^2}{1 - \phi_0^2}\right)$$

so, setting $\boldsymbol{\theta} = (c, \phi, \sigma^2)'$, look at the likelihood of $Y_1$

$$f_{Y_1}(y_1; \boldsymbol{\theta}) = (2\pi)^{-1/2}\left(\frac{\sigma^2}{1 - \phi^2}\right)^{-1/2}\exp\left(-\frac{1}{2}\frac{\left(y_1 - \frac{c}{1 - \phi}\right)^2}{\frac{\sigma^2}{1 - \phi^2}}\right)$$

ML OF AR(1)

Of course, the same likelihood may be expressed for $Y_2$; however, in this case we can also exploit the fact that we observed $Y_1$ in the period before:

$$Y_2 | Y_1 \sim N(c_0 + \phi_0 Y_1, \sigma_0^2),$$

so

$$f_{Y_2|Y_1}(y_2|y_1; \boldsymbol{\theta}) = (2\pi)^{-1/2}(\sigma^2)^{-1/2}\exp\left[-\frac{1}{2}\frac{(y_2 - c - \phi y_1)^2}{\sigma^2}\right]$$

We can then express the likelihood for $Y_1$ and $Y_2$ by factoring

$$f_{Y_2,Y_1}(y_2, y_1; \boldsymbol{\theta}) = f_{Y_2|Y_1}(y_2|y_1; \boldsymbol{\theta})\,f_{Y_1}(y_1; \boldsymbol{\theta})$$

and, by the same argument,

$$f_{Y_T,...,Y_1}(y_T, ..., y_1; \boldsymbol{\theta}) = \prod_{t=2}^{T} f_{Y_t|Y_{t-1},...,Y_1}(y_t|y_{t-1}, ..., y_1; \boldsymbol{\theta})\,f_{Y_1}(y_1; \boldsymbol{\theta})$$

where

$$f_{Y_t|Y_{t-1},...,Y_1}(y_t|y_{t-1}, ..., y_1; \boldsymbol{\theta}) = (2\pi)^{-1/2}(\sigma^2)^{-1/2}\exp\left[-\frac{1}{2}\frac{(y_t - c - \phi y_{t-1})^2}{\sigma^2}\right]$$

when $t = 2, ..., T$.

Also notice that, for the AR(1),

$$f_{Y_t|Y_{t-1},...,Y_1}(y_t|y_{t-1}, ..., y_1; \boldsymbol{\theta}) = f_{Y_t|Y_{t-1}}(y_t|y_{t-1}; \boldsymbol{\theta})$$

so in what follows we simplify the notation in this way.

ML OF AR(1)

The log-likelihood is

$$
\begin{aligned}
l(\boldsymbol{\theta}) &= \ln(f_{Y_1}(y_1; \boldsymbol{\theta})) + \sum_{t=2}^{T}\ln(f_{Y_t|Y_{t-1}}(y_t|y_{t-1}; \boldsymbol{\theta})) \\
&= -\frac{1}{2}\ln\left(2\pi\frac{\sigma^2}{1 - \phi^2}\right) - \frac{1}{2}\frac{\left(y_1 - \frac{c}{1 - \phi}\right)^2}{\frac{\sigma^2}{1 - \phi^2}} - \frac{T - 1}{2}\ln(2\pi\sigma^2) - \frac{1}{2}\sum_{t=2}^{T}\frac{(y_t - c - \phi y_{t-1})^2}{\sigma^2}
\end{aligned}
$$

We then succeed in writing the (log) likelihood in a way that does not require the inversion of a $T \times T$ matrix.

Maximizing that function would give the "maximum likelihood estimate" when $\varepsilon_t$ is normally distributed.

However, although we eliminated the problem of inverting $\Omega(\boldsymbol{\theta})$, we still cannot express our estimate $\hat{\boldsymbol{\theta}}$ as a closed form function of the observations, so we still have to compute the likelihood function on all the admissible parameters in order to find the maximum.

ML OF AR(1)

Consider, on the other hand, estimating $\boldsymbol{\theta}_0$ by maximizing

$$-\frac{T - 1}{2}\ln(2\pi\sigma^2) - \frac{1}{2}\sum_{t=2}^{T}\frac{(y_t - c - \phi y_{t-1})^2}{\sigma^2}$$

That estimate is known as "conditional maximum likelihood estimate", because it is the maximum likelihood estimate if $Y_1$ is not random (so, the log-likelihood above is called "conditional" log-likelihood). In this case, a closed form solution exists.

In order to find the closed form solution, first notice that $\sigma^2$ can be estimated and concentrated out:

$$\frac{\partial}{\partial\sigma^2}\left[-\frac{T - 1}{2}\ln(2\pi\sigma^2) - \frac{1}{2}\sum_{t=2}^{T}\frac{(y_t - c - \phi y_{t-1})^2}{\sigma^2}\right] = -\frac{T - 1}{2}\frac{1}{\sigma^2} + \frac{1}{2}\sum_{t=2}^{T}\frac{(y_t - c - \phi y_{t-1})^2}{(\sigma^2)^2}$$

and, equating the derivative to 0,

$$\hat{\sigma}^2 = \frac{1}{T - 1}\sum_{t=2}^{T}(y_t - c - \phi y_{t-1})^2$$

so, replacing this in the log-likelihood, we get the solutions:

$$\hat{\phi} = \frac{\frac{1}{T - 1}\sum_{t=2}^{T}(y_t - \bar{y}_{\cdot})(y_{t-1} - \bar{y}_{\cdot -1})}{\frac{1}{T - 1}\sum_{t=2}^{T}(y_{t-1} - \bar{y}_{\cdot -1})^2}$$

$$\hat{c} = \bar{y}_{\cdot} - \hat{\phi}\,\bar{y}_{\cdot -1}$$

where $\bar{y}_{\cdot} = \frac{1}{T - 1}\sum_{t=2}^{T} y_t$, $\bar{y}_{\cdot -1} = \frac{1}{T - 1}\sum_{t=2}^{T} y_{t-1}$.
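The closed-form conditional ML solution can be compared with a least-squares fit of $y_t$ on a constant and $y_{t-1}$. The simulated series below (parameter values, seed, sample size) is an illustrative assumption, and `ar1_conditional_mle` is a hypothetical helper name.

```python
import numpy as np

def ar1_conditional_mle(y):
    """Closed-form conditional ML estimates (c, phi, sigma2) for an AR(1)."""
    y = np.asarray(y, dtype=float)
    y_t, y_lag = y[1:], y[:-1]
    ybar, ybar_lag = y_t.mean(), y_lag.mean()
    phi_hat = (np.sum((y_t - ybar) * (y_lag - ybar_lag))
               / np.sum((y_lag - ybar_lag) ** 2))
    c_hat = ybar - phi_hat * ybar_lag
    # divisor T - 1, as in the concentrated-out variance estimate
    sigma2_hat = np.mean((y_t - c_hat - phi_hat * y_lag) ** 2)
    return c_hat, phi_hat, sigma2_hat

# Simulated AR(1) data (illustrative values only)
rng = np.random.default_rng(0)
T, c0, phi0 = 500, 1.0, 0.6
y = np.empty(T)
y[0] = c0 / (1 - phi0)
for t in range(1, T):
    y[t] = c0 + phi0 * y[t - 1] + rng.standard_normal()

c_hat, phi_hat, sigma2_hat = ar1_conditional_mle(y)

# OLS of y_t on a constant and y_{t-1}
X = np.column_stack([np.ones(T - 1), y[:-1]])
beta_ols, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
print(c_hat, phi_hat, beta_ols)
```

The two sets of coefficient estimates agree up to floating-point error.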

ML OF AR(1)

So for the "conditional maximum likelihood estimate" a closed form solution exists, and it is the OLS estimate in

$$Y_t = c_0 + \phi_0 Y_{t-1} + \varepsilon_t.$$

Notice that this is not the likelihood function of our original stationary AR(1) process, but the likelihood of the process

$$Y_t = c_0 + \phi_0 Y_{t-1} + \varepsilon_t, \qquad |\phi_0| < 1, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2), \text{ when } t > 1$$

$$Y_1 = y_1$$

(hence the name, "conditional maximum likelihood")

ML OF AR(P)

Assume

$$Y_t = c_0 + \phi_{0;1}Y_{t-1} + ... + \phi_{0;p}Y_{t-p} + \varepsilon_t,$$

where

$$\varepsilon_t \sim Nid(0, \sigma_0^2)$$

and the roots of $1 - \phi_{0;1}z - ... - \phi_{0;p}z^p = 0$ are outside the unit circle.

Introduce

$$Y_p = (Y_1, ..., Y_p)', \qquad \mu_p = E(Y_p), \qquad y_p = (y_1, ..., y_p)'$$

and

$$V_p = (\sigma^2)^{-1}E\left[(Y_p - \mu_p)(Y_p - \mu_p)'\right]$$

ML OF AR(P)

and take again the Gaussian density:

$$f_{Y_p,...,Y_1}(y_p, ..., y_1; \boldsymbol{\theta}) = (2\pi)^{-p/2}|\sigma^2 V_p(\boldsymbol{\theta})|^{-1/2}\exp\left(-\frac{1}{2\sigma^2}(y_p - \mu_p)'V_p(\boldsymbol{\theta})^{-1}(y_p - \mu_p)\right)$$

$$f_{Y_t|Y_{t-1},...,Y_{t-p}}(y_t|y_{t-1}, ..., y_{t-p}; \boldsymbol{\theta}) = (2\pi)^{-1/2}|\sigma^2|^{-1/2}\exp\left[-\frac{1}{2}\frac{(y_t - c - \phi_1 y_{t-1} - ... - \phi_p y_{t-p})^2}{\sigma^2}\right]$$

when $t = p + 1, ..., T$,

so the likelihood can be written as

$$f_{Y_T,...,Y_1}(y_T, ..., y_1; \boldsymbol{\theta}) = f_{Y_p,...,Y_1}(y_p, ..., y_1; \boldsymbol{\theta})\prod_{t=p+1}^{T} f_{Y_t|Y_{t-1},...,Y_{t-p}}(y_t|y_{t-1}, ..., y_{t-p}; \boldsymbol{\theta})$$

where the problem of inverting the $T \times T$ matrix $\Omega(\boldsymbol{\theta})$ is reduced to inverting a $p \times p$ matrix $V_p(\boldsymbol{\theta})$.

ML OF AR(P)

Taking logarithms, the log-likelihood is

$$
\begin{aligned}
l(\boldsymbol{\theta}) &= \ln(f_{Y_p,...,Y_1}(y_p, ..., y_1; \boldsymbol{\theta})) + \sum_{t=p+1}^{T}\ln\left[f_{Y_t|Y_{t-1},...,Y_{t-p}}(y_t|y_{t-1}, ..., y_{t-p}; \boldsymbol{\theta})\right] \\
&= -\frac{p}{2}\ln(2\pi) - \frac{1}{2}\ln|\sigma^2 V_p(\boldsymbol{\theta})| - \frac{1}{2\sigma^2}(y_p - \mu_p)'V_p^{-1}(\boldsymbol{\theta})(y_p - \mu_p) \\
&\quad - \frac{T - p}{2}\ln(2\pi\sigma^2) - \frac{1}{2}\sum_{t=p+1}^{T}\frac{(y_t - c - \phi_1 y_{t-1} - ... - \phi_p y_{t-p})^2}{\sigma^2}
\end{aligned}
$$

Maximizing the log-likelihood yields the "maximum likelihood estimate".

Again, a "conditional maximum likelihood estimate" can be considered instead: this is obtained by treating $Y_p, ..., Y_1$ as given, and maximizing

$$\prod_{t=p+1}^{T} f_{Y_t|Y_{t-1},...,Y_{t-p}}(y_t|y_{t-1}, ..., y_{t-p}; \boldsymbol{\theta})$$

instead. The value of $\boldsymbol{\theta}$ that maximizes this conditional likelihood is called the "conditional maximum likelihood estimate". This turns out to be the OLS estimate of $c_0, \phi_{0;1}, ..., \phi_{0;p}$ in the corresponding regression model.

ML OF MA(1)

Suppose

$$Y_t = \mu_0 + \varepsilon_t + \theta_0\varepsilon_{t-1}, \qquad |\theta_0| < 1, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

Under an additional assumption that

$$\varepsilon_0 = 0$$

we can also derive a "conditional maximum likelihood estimate" of $\theta$ in an MA(1).

In general, since $\varepsilon_t \sim Nid(0, \sigma_0^2)$, then

$$Y_t|\varepsilon_{t-1} \sim N(\mu_0 + \theta_0\varepsilon_{t-1}, \sigma_0^2)$$

i.e. the density of $Y_t|\varepsilon_{t-1}$ is

$$
\begin{aligned}
f_{Y_t|\varepsilon_{t-1}}(y_t|\varepsilon_{t-1}; \boldsymbol{\theta}_0) &= \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left[-\frac{1}{2}\frac{(y_t - \mu_0 - \theta_0\varepsilon_{t-1})^2}{\sigma_0^2}\right] \\
&= \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left[-\frac{1}{2}\frac{\varepsilon_t^2}{\sigma_0^2}\right]
\end{aligned}
$$

ML OF MA(1)

Unfortunately $\varepsilon_{t-1}$ is not observable.

However, suppose that we know $\varepsilon_0$; then $Y_1 = \mu_0 + \varepsilon_1 + \theta_0\varepsilon_0$, and, given $\varepsilon_0$ and $\boldsymbol{\theta}_0$, we can also compute

$$\varepsilon_1(\boldsymbol{\theta}_0) = y_1 - \mu_0 - \theta_0\varepsilon_0$$

(the notation $\varepsilon_1(\boldsymbol{\theta}_0)$ means that $\varepsilon_1$ is computed for a given value of the vector of parameters $\boldsymbol{\theta}$).

Having computed $\varepsilon_1(\boldsymbol{\theta}_0)$ (and given $\boldsymbol{\theta}_0$) we can also compute $\varepsilon_2(\boldsymbol{\theta}_0) = y_2 - \mu_0 - \theta_0\varepsilon_1(\boldsymbol{\theta}_0)$, and, iterating the procedure,

$$\varepsilon_t(\boldsymbol{\theta}_0) = y_t - \mu_0 - \theta_0\varepsilon_{t-1}(\boldsymbol{\theta}_0)$$

Then

$$f_{Y_t|\varepsilon_{t-1}}(y_t|\varepsilon_{t-1}; \boldsymbol{\theta}_0) = f_{Y_t|Y_{t-1},...,Y_1,\varepsilon_0}(y_t|y_{t-1}, ..., y_1, \varepsilon_0; \boldsymbol{\theta}_0)$$

and

$$
\begin{aligned}
f_{Y_T,Y_{T-1},...,Y_1|\varepsilon_0}(y_T, y_{T-1}, ..., y_1|\varepsilon_0; \boldsymbol{\theta}_0)
&= f_{Y_1|\varepsilon_0}(y_1|\varepsilon_0; \boldsymbol{\theta}_0)\prod_{t=2}^{T} f_{Y_t|Y_{t-1},...,Y_1,\varepsilon_0}(y_t|y_{t-1}, ..., y_1, \varepsilon_0; \boldsymbol{\theta}_0) \\
&= (2\pi)^{-T/2}(\sigma_0^2)^{-T/2}\exp\left[-\sum_{t=1}^{T}\frac{\varepsilon_t(\boldsymbol{\theta}_0)^2}{2\sigma_0^2}\right].
\end{aligned}
$$

Notice that this is not the density of $(Y_T, Y_{T-1}, ..., Y_1)'$ where each $Y_t$ has an MA(1) representation, but that density conditional on $\varepsilon_0$.

Moreover, we cannot compute a likelihood, because we can't observe $\varepsilon_0$.

ML OF MA(1)

However, consider the process

$$Y_t = \mu_0 + \varepsilon_t + \theta_0\varepsilon_{t-1},$$

with

$$\varepsilon_t \sim Nid(0, \sigma_0^2), \text{ when } t > 0, \qquad \varepsilon_0 = 0$$

This process is very similar to the stationary MA(1), and it has the density above (setting $\varepsilon_0 = 0$); given that we know $\varepsilon_0$, we can initialize the iterations (for all the admissible values of $\boldsymbol{\theta}$)

$$\varepsilon_t(\boldsymbol{\theta}) = y_t - \mu - \theta\varepsilon_{t-1}(\boldsymbol{\theta})$$

We can then compute the likelihood (which is, then, a "conditional likelihood") as a function of a set of observations $(y_T, y_{T-1}, ..., y_1)'$, and of a generic vector of unknown parameters $\boldsymbol{\theta}$.

Therefore,

$$
\begin{aligned}
f_{Y_T,Y_{T-1},...,Y_1|\varepsilon_0=0}(y_T, y_{T-1}, ..., y_1|\varepsilon_0 = 0; \boldsymbol{\theta})
&= f_{Y_1|\varepsilon_0=0}(y_1|\varepsilon_0 = 0; \boldsymbol{\theta})\prod_{t=2}^{T} f_{Y_t|Y_{t-1},...,Y_1,\varepsilon_0=0}(y_t|y_{t-1}, ..., y_1, \varepsilon_0 = 0; \boldsymbol{\theta}) \\
&= (2\pi)^{-T/2}(\sigma^2)^{-T/2}\exp\left[-\sum_{t=1}^{T}\frac{\varepsilon_t(\boldsymbol{\theta})^2}{2\sigma^2}\right].
\end{aligned}
$$

Taking the logs, the (conditional) log-likelihood is

$$l(\boldsymbol{\theta}) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\varepsilon_t^2(\boldsymbol{\theta})$$

The value of $\boldsymbol{\theta}$ that maximizes the (conditional) log-likelihood is called "conditional maximum likelihood estimate".
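The recursion for $\varepsilon_t(\boldsymbol{\theta})$ and the conditional log-likelihood can be sketched in a few lines. The four observations and the grid of candidate values for $\theta$ are illustrative assumptions, and the function name is hypothetical; $\mu$ and $\sigma^2$ are held fixed at 0 and 1 to keep the example one-dimensional.

```python
import numpy as np

def ma1_conditional_loglik(y, mu, theta, sigma2):
    """Conditional Gaussian log-likelihood of an MA(1), setting eps_0 = 0."""
    eps_prev = 0.0  # the conditioning assumption eps_0 = 0
    ssr = 0.0
    for y_t in y:
        eps = y_t - mu - theta * eps_prev  # eps_t = y_t - mu - theta * eps_{t-1}
        ssr += eps ** 2
        eps_prev = eps
    T = len(y)
    return -0.5 * T * np.log(2 * np.pi) - 0.5 * T * np.log(sigma2) - ssr / (2 * sigma2)

y = [0.5, -0.8, -0.2, 2.0]            # illustrative data only
grid = np.linspace(-0.95, 0.95, 39)   # candidate values for theta
values = [ma1_conditional_loglik(y, 0.0, th, 1.0) for th in grid]
theta_cml = grid[int(np.argmax(values))]
print(theta_cml)
```

No matrix inversion is needed: each evaluation is a single pass through the data.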

ML OF MA(Q)

Corresponding with the MA(1) case, the MA($q$) process can be written as:

$$Y_t = \mu_0 + \varepsilon_t + \theta_{0;1}\varepsilon_{t-1} + ... + \theta_{0;q}\varepsilon_{t-q}, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

and the roots of $1 + \theta_{0;1}z + ... + \theta_{0;q}z^q = 0$ are all outside the unit circle.

Introduce $\boldsymbol{\varepsilon}_0 = (\varepsilon_0, ..., \varepsilon_{-q+1})'$. Again, if $\boldsymbol{\varepsilon}_0 = 0$, we compute

$$\varepsilon_t(\boldsymbol{\theta}) = y_t - \mu - \theta_1\varepsilon_{t-1}(\boldsymbol{\theta}) - ... - \theta_q\varepsilon_{t-q}(\boldsymbol{\theta})$$

iteratively, and we can formulate a "conditional maximum likelihood":

$$
\begin{aligned}
f_{Y_T,Y_{T-1},...,Y_1|\boldsymbol{\varepsilon}_0=0}(y_T, y_{T-1}, ..., y_1|\boldsymbol{\varepsilon}_0 = 0; \boldsymbol{\theta})
&= f_{Y_1|\boldsymbol{\varepsilon}_0=0}(y_1|\boldsymbol{\varepsilon}_0 = 0; \boldsymbol{\theta})\prod_{t=2}^{T} f_{Y_t|Y_{t-1},...,Y_1,\boldsymbol{\varepsilon}_0=0}(y_t|y_{t-1}, ..., y_1, \boldsymbol{\varepsilon}_0 = 0; \boldsymbol{\theta}) \\
&= (2\pi)^{-T/2}|\sigma^2|^{-T/2}\exp\left[-\sum_{t=1}^{T}\frac{\varepsilon_t(\boldsymbol{\theta})^2}{2\sigma^2}\right].
\end{aligned}
$$

Taking the logs, the (conditional) log-likelihood is

$$l(\boldsymbol{\theta}) = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\varepsilon_t^2(\boldsymbol{\theta})$$

The value of $\boldsymbol{\theta}$ that maximizes the (conditional) log-likelihood is called "conditional maximum likelihood estimate".

ML OF ARMA(P, Q)

Similar methodologies can also be applied to the ARMA process:

$$Y_t = c_0 + \phi_{0;1}Y_{t-1} + ... + \phi_{0;p}Y_{t-p} + \varepsilon_t + \theta_{0;1}\varepsilon_{t-1} + ... + \theta_{0;q}\varepsilon_{t-q}, \qquad \varepsilon_t \sim Nid(0, \sigma_0^2)$$

and the roots of $1 - \phi_{0;1}z - ... - \phi_{0;p}z^p = 0$ and of $1 + \theta_{0;1}z + ... + \theta_{0;q}z^q = 0$ are all outside the unit circle, and there is no common factor.

Again, assume that $Y_p, ..., Y_1$ are given and that $\varepsilon_p = \varepsilon_{p-1} = ... = \varepsilon_{p-q+1} = 0$. Then we can compute

$$\varepsilon_t(\boldsymbol{\theta}) = y_t - c - \phi_1 y_{t-1} - ... - \phi_p y_{t-p} - \theta_1\varepsilon_{t-1}(\boldsymbol{\theta}) - ... - \theta_q\varepsilon_{t-q}(\boldsymbol{\theta})$$

for $t > p$.

The conditional likelihood is then

$$
\begin{aligned}
&f_{Y_T,...,Y_{p+1}|Y_p,...,Y_1,\varepsilon_p=0,...,\varepsilon_{p-q+1}=0}(y_T, ..., y_{p+1}|y_p, ..., y_1, \varepsilon_p = 0, ..., \varepsilon_{p-q+1} = 0; \boldsymbol{\theta}) \\
&\qquad = (2\pi)^{-(T-p)/2}|\sigma^2|^{-(T-p)/2}\exp\left[-\sum_{t=p+1}^{T}\frac{\varepsilon_t(\boldsymbol{\theta})^2}{2\sigma^2}\right].
\end{aligned}
$$

ML OF ARMA(P, Q)

Taking logarithms, the conditional log-likelihood is

$$l(\boldsymbol{\theta}) = -\frac{T - p}{2}\ln(2\pi) - \frac{T - p}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{t=p+1}^{T}\varepsilon_t^2(\boldsymbol{\theta})$$

The value of $\boldsymbol{\theta}$ that maximizes the conditional log-likelihood is called "conditional maximum likelihood estimate".

With respect to the mean parameters, the conditional log-likelihood is maximized when the sum of squares in the final term (the conditional RSS) is minimized.

PSEUDO MAXIMUM LIKELIHOOD

When $\varepsilon_t$ is not normally distributed, the density is different and then the maximum likelihood estimate is different as well. If we use the Gaussian density even if $\varepsilon_t$ is not normally distributed, then our estimate is no longer the maximum likelihood one. In this case it is usually known as Pseudo (or Quasi) maximum likelihood instead.

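APPENDIX: OPTIMIZATION OF THE OBJECTIVE FUNCTION

In general, it is not always possible to obtain a closed form formula for the estimate, and it may be extremely time consuming to compute the log-likelihood function (even the conditional log-likelihood) for all the potential $\boldsymbol{\theta}$.

The optimisation of the log-likelihood may be carried out using a numerical algorithm, such as the Newton-Raphson one. Introduce

$$g(\boldsymbol{\theta}^{(0)}) = \left.\frac{\partial l(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta} = \boldsymbol{\theta}^{(0)}} \text{ (gradient)}$$

$$H(\boldsymbol{\theta}^{(0)}) = -\left.\frac{\partial^2 l(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}'}\right|_{\boldsymbol{\theta} = \boldsymbol{\theta}^{(0)}} \text{ (negative of the Hessian)}$$

for a generic $\boldsymbol{\theta}^{(0)}$, and consider an approximate second order Taylor expansion of $l(\boldsymbol{\theta})$,

$$l(\boldsymbol{\theta}) \approx l(\boldsymbol{\theta}^{(0)}) + g(\boldsymbol{\theta}^{(0)})'\left[\boldsymbol{\theta} - \boldsymbol{\theta}^{(0)}\right] - \frac{1}{2}\left[\boldsymbol{\theta} - \boldsymbol{\theta}^{(0)}\right]'H(\boldsymbol{\theta}^{(0)})\left[\boldsymbol{\theta} - \boldsymbol{\theta}^{(0)}\right]$$

Recall that $l(\boldsymbol{\theta})$ is maximized at $\hat{\boldsymbol{\theta}}$ only if

$$\left.\frac{\partial l(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} = 0$$

Now, consider the approximation of the derivative around $\boldsymbol{\theta}^{(0)}$:

$$\frac{\partial l(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}} \approx g(\boldsymbol{\theta}^{(0)}) - H(\boldsymbol{\theta}^{(0)})\left[\boldsymbol{\theta} - \boldsymbol{\theta}^{(0)}\right].$$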

APPENDIX: OPTIMIZATION OF THE OBJECTIVE FUNCTION

If the approximation was perfect, we could have just computed $\hat{\boldsymbol{\theta}}$ by solving

$$g(\boldsymbol{\theta}^{(0)}) - H(\boldsymbol{\theta}^{(0)})\left[\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}^{(0)}\right] = 0,$$

i.e.,

$$\hat{\boldsymbol{\theta}} = \boldsymbol{\theta}^{(0)} + H(\boldsymbol{\theta}^{(0)})^{-1}g(\boldsymbol{\theta}^{(0)}).$$

However, this may be a rather poor estimate, because the approximation is not exact (there is a remainder, in this case of the third order, in the Taylor expansion of $l(\boldsymbol{\theta})$). Let's call this possibly poor estimate $\boldsymbol{\theta}^{(1)}$, where

$$\boldsymbol{\theta}^{(1)} = \boldsymbol{\theta}^{(0)} + H(\boldsymbol{\theta}^{(0)})^{-1}g(\boldsymbol{\theta}^{(0)})$$

Clearly, this is (in a certain probabilistic sense) better than a generic $\boldsymbol{\theta}^{(0)}$.

Next, we can improve, by considering a second order approximation of $l(\boldsymbol{\theta})$ in $\boldsymbol{\theta}^{(1)}$, and compute

$$\boldsymbol{\theta}^{(2)} = \boldsymbol{\theta}^{(1)} + H(\boldsymbol{\theta}^{(1)})^{-1}g(\boldsymbol{\theta}^{(1)}).$$

The procedure can then be iterated until convergence (which gives $\hat{\boldsymbol{\theta}}$).

In many cases, you may start the optimisation with any set of starting values, but this may result in a rather slow optimisation, or even in an "incorrect" solution (you may end up picking a local maximum, rather than the global maximum). It is then advisable to start from a "good" point, that is, from a consistent estimate of $\boldsymbol{\theta}$ (typically, an estimate that you may compute easily, even if it is less efficient than maximum likelihood): the correlogram based estimate is a good starting point (given certain regularity conditions, properties as in the pseudo-maximum likelihood estimate may be obtained after just one step).
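The Newton-Raphson iteration described above can be sketched generically. The toy objective below is an assumption made only for illustration (the log-likelihood of an exponential sample with rate $\lambda$, $l(\lambda) = n\ln\lambda - \lambda\sum x_i$, whose maximizer is $n/\sum x_i$); it does not come from the slides. The function `newton_raphson` is a hypothetical helper, and `neg_hess` returns $H$, i.e. minus the matrix of second derivatives.

```python
import numpy as np

def newton_raphson(grad, neg_hess, theta0, tol=1e-10, max_iter=100):
    """Iterate theta_new = theta + H(theta)^{-1} g(theta) until the step is tiny."""
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    for _ in range(max_iter):
        step = np.linalg.solve(neg_hess(theta), grad(theta))
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Toy example (assumption): exponential sample, l(lam) = n*ln(lam) - lam*sum(x)
x = np.array([1.0, 2.0, 3.0, 4.0])
n, s = len(x), x.sum()
grad = lambda th: np.array([n / th[0] - s])           # g(theta)
neg_hess = lambda th: np.array([[n / th[0] ** 2]])    # H(theta) = -d2l/dtheta2

theta_hat = newton_raphson(grad, neg_hess, theta0=[0.1])
print(theta_hat)  # should be close to n / sum(x)
```

The starting value matters even in this toy case: starting too far to the right of the maximum makes the first step overshoot to a negative value, which echoes the advice above about choosing a good starting point.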

TOPIC 5: MODELS OF HETEROSKEDASTICITY IN TIME SERIES

AN EXCURSION INTO NON-LINEARITY LAND

Motivation: the linear structural (and time series) models cannot explain a number of important features common to much financial data:

leptokurtosis
volatility clustering or volatility pooling
leverage effects

Our traditional structural model could be something like:

$$y_t = \beta_1 + \beta_2 x_{2t} + ... + \beta_k x_{kt} + u_t$$

or more compactly

$$y = X\beta + u, \qquad u \sim N(0, \sigma^2)$$

A SAMPLE ON SCALE (DATA: CHICAGO BOARD OPTIONS EXCHANGE)

FIGURE: CBOE VIX and SPX (S&P500 Index) Scale. [Time-series plot; series: SPX (S&P500 Index) and CBOE VIX; horizontal axis: Year, 1990 to 2009.]

A SAMPLE ON (LOG-)RETURN SERIES

FIGURE: SPX Return and Percentage Change of VIX (%VIX). [Two time-series panels: Total Return of SPX; Percentage Change of VIX; horizontal axis: Year, 1990 to 2009.]

NON-LINEAR MODELS: A DEFINITION

Campbell, Lo and MacKinlay (1997) define a non-linear data generating process (DGP) as one that can be written as

$$y_t = f(u_t, u_{t-1}, u_{t-2}, ...)$$

where $u_t$ is an i.i.d. error term and $f$ is a non-linear function.

They also give a slightly more specific definition as

$$y_t = g(u_{t-1}, u_{t-2}, ...) + u_t\,\sigma^2(u_{t-1}, u_{t-2}, ...)$$

where $g(\cdot)$ is a function of past error terms only and $\sigma^2$ is a variance term.

Models with nonlinear $g(\cdot)$ are non-linear in the mean, while those with nonlinear $\sigma^2(\cdot)$ are non-linear in variance.

TYPES OF NON-LINEAR MODELS

The linear paradigm is a useful one: many apparently non-linear relationships can be made linear by a suitable transformation.

On the other hand, it is likely that many relationships in finance are intrinsically non-linear.

There are many types of non-linear models, e.g.

ARCH / GARCH
switching models
bilinear models

TESTING FOR NON-LINEARITY

The traditional tools of time series analysis (acfs, spectral analysis) may find no evidence that we could use a linear model, but the data may still not be independent.

Portmanteau tests (discussed later) for non-linear dependence have been developed.

The simplest is Ramsey's RESET test, which takes the form:

$$\hat{u}_t = \beta_0 + \beta_1\hat{y}_t^2 + \beta_2\hat{y}_t^3 + ... + \beta_{p-1}\hat{y}_t^p + v_t$$

Many other non-linearity tests are available.

One particular non-linear model that has proved very useful in finance is the ARCH model due to Engle (1982).

HETEROSKEDASTICITY REVISITED

An example of a structural model is

$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t, \qquad u_t \sim N(0, \sigma_u^2)$$

The assumption that the variance of the error is constant is known as homoskedasticity, i.e. $var(u_t) = \sigma_u^2$.

What if the variance of the error is not constant?
This is heteroskedasticity;
it would imply that standard error estimates could be wrong.

Is the variance of the errors likely to be constant over time?

Not for financial data!!!

ARCH MODELS

Box and Jenkins (1970) ARMA models for the mean process have been extended to essentially analogous models for the variance (seminal paper of Engle (1982)).

Autoregressive conditional heteroscedasticity (ARCH) models are now commonly used to describe and forecast changes in the volatility of financial time series.

See Bollerslev et al. (1992, 1994), Bera and Higgins (1993), Pagan (1996), Palm (1996) and Shephard (1996), among others.

A UTOREGRESSIVE C ONDITIONALLY H ETEROSKEDASTIC (ARCH)


T OPIC 5 M ODEL OF H ETEROSKEDASTICITY IN T IME S ERIES M ODELS

ARCH M ODELS

10 / 30

A UTOREGRESSIVE C ONDITIONALLY H ETEROSKEDASTIC (ARCH)


T OPIC 5 M ODEL OF H ETEROSKEDASTICITY IN T IME S ERIES M ODELS

ARCH M ODELS

So use a model which does not assume that the variance is constant.
Recall the definition of the variance of ut :
n
o
2t = var (ut |ut 1 , ut 2 , ...) = E [ut E (ut )]2 |ut 1 , ut 2 , ...
we usually assume that E (ut ) = 0, so


2t = var (ut |ut 1 , ut 2 , ...) = E ut2 |ut 1 , ut 2 , ...

The full model would be


yt = 1 + 2 x2t + ... + k xkt + ut , ut N 0, 2t

where 2t = 0 + 1 ut21 .
We can easily extend this to the general case where the error variance
depends on q lags of squared errors:

This is an ARCH(q) model.


Instead of calling the variance 2t ,in the literature it is usually called
ht ,so the model is

Previous squared error terms.

This leads to the Autoregressive Conditionally Heteroscedastic


(ARCH) model:
2t = 0 + 1 ut21

yt = 1 + 2 x2t + ... + k xkt + ut , ut N (0, ht )


where
ht = 0 + 1 ut21 + 2 ut22 + ... + q ut2q

This is known as the ARCH(1) model.


I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 5 Models of Heteroskedasticity
S PRING 2013

2t = 0 + 1 ut21 + 2 ut22 + ... + q ut2q

Now, what could the current value of the variance of the errors
plausibly depend upon?

ANOTHER WAY OF WRITING ARCH MODELS

For illustration, consider an ARCH(1). Instead of the above, we can write

$$y_t = \beta_1 + \beta_2 x_{2t} + \ldots + \beta_k x_{kt} + u_t, \qquad u_t \sim N(0, \sigma_t^2),$$
$$\sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2},$$

or, equivalently, $u_t = v_t \sigma_t$ with $v_t \sim N(0, 1)$.
The two are different ways of expressing exactly the same model. The first form is easier to understand while the second form is required for simulating from an ARCH model, for example.

TESTING FOR ARCH EFFECTS

1 First, run any postulated linear regression of the form given in the equation above, e.g.

$$y_t = \beta_1 + \beta_2 x_{2t} + \ldots + \beta_k x_{kt} + u_t,$$

saving the residuals, $\hat{u}_t$.
2 Then square the residuals, and regress them on q own lags to test for ARCH of order q, i.e. run the regression

$$\hat{u}_t^2 = \gamma_0 + \gamma_1 \hat{u}_{t-1}^2 + \gamma_2 \hat{u}_{t-2}^2 + \ldots + \gamma_q \hat{u}_{t-q}^2 + v_t,$$

where $v_t$ is i.i.d. Obtain $R^2$ from this regression.
3 The test statistic is defined as $TR^2$ (the number of observations multiplied by the squared coefficient of multiple correlation) from the last regression, and is distributed as a $\chi^2(q)$ under the null hypothesis.
4 The null and alternative hypotheses are

$$H_0: \gamma_1 = \gamma_2 = \ldots = \gamma_q = 0$$
$$H_1: \gamma_1 \neq 0 \text{ or } \gamma_2 \neq 0 \text{ or } \ldots \text{ or } \gamma_q \neq 0$$

If the value of the test statistic is greater than the critical value from the $\chi^2$ distribution, then reject the null hypothesis.
Note that the ARCH test is also sometimes applied directly to returns instead of the residuals from Stage 1 above.

PROBLEMS WITH ARCH(q) MODELS

How do we decide on q?
The required value of q might be very large;
Non-negativity constraints might be violated;
When we estimate an ARCH model, we require $\alpha_i > 0$, $i = 1, 2, \ldots, q$ (since variance cannot be negative).
Therefore, a natural extension of an ARCH(q) model which circumvents some of these problems is a GARCH model.
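The ARCH test steps described above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function names (`simulate_arch1`, `solve`, `arch_lm_stat`), the simulated data and the parameter values are assumptions made for the example, the test is applied directly to the simulated errors (no mean equation is estimated), and 3.841 is the 5% critical value of a chi-squared distribution with one degree of freedom.

```python
import random

def simulate_arch1(n, alpha0, alpha1, seed=0):
    """Simulate u_t = v_t * sigma_t with sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2."""
    rng = random.Random(seed)
    u, prev = [], 0.0
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * prev ** 2
        prev = rng.gauss(0.0, 1.0) * sigma2 ** 0.5
        u.append(prev)
    return u

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def arch_lm_stat(resid, q):
    """Return T * R^2 from regressing squared residuals on q of their own lags."""
    s = [e * e for e in resid]
    y = s[q:]
    X = [[1.0] + [s[t - j] for j in range(1, q + 1)] for t in range(q, len(s))]
    k = q + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    g = solve(xtx, xty)  # OLS estimates of gamma_0, ..., gamma_q
    fitted = [sum(gi * xi for gi, xi in zip(g, row)) for row in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return len(y) * (1.0 - ss_res / ss_tot)

# Hypothetical example: errors simulated from an ARCH(1) with alpha1 = 0.5.
u = simulate_arch1(2000, alpha0=0.5, alpha1=0.5, seed=42)
lm = arch_lm_stat(u, q=1)
print("TR^2 =", lm, "reject H0 at 5%:", lm > 3.841)
```

With real data, `resid` would be the saved residuals from the Stage 1 regression rather than simulated errors.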

GENERALISED ARCH (GARCH) MODELS

Due to Bollerslev (1986). Allow the conditional variance to be dependent upon previous own lags.
The variance equation is now

$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2.$$

This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the variance equation.
We could also show that a GARCH(1,1) model can be written as an infinite order ARCH model.
We can again extend the GARCH(1,1) model to a GARCH(p,q):

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2.$$

But in general a GARCH(1,1) model will be sufficient to capture the volatility clustering in the data.

Why is GARCH better than ARCH?
more parsimonious, avoiding overfitting;
less likely to breach non-negativity constraints.
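The statement that a GARCH(1,1) model can be written as an infinite order ARCH model can be checked by repeated substitution. The derivation below is a sketch that assumes $0 \le \beta < 1$, so that the geometric terms converge.

```latex
\begin{aligned}
\sigma_t^2 &= \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 \\
           &= \alpha_0 + \alpha_1 u_{t-1}^2
              + \beta\left(\alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2\right) \\
           &= \alpha_0 (1 + \beta) + \alpha_1 u_{t-1}^2 + \alpha_1 \beta u_{t-2}^2
              + \beta^2 \sigma_{t-2}^2 \\
           &\;\;\vdots \\
           &= \frac{\alpha_0}{1 - \beta}
              + \alpha_1 \sum_{i=1}^{\infty} \beta^{\,i-1} u_{t-i}^2 .
\end{aligned}
```

The last line is an ARCH model of infinite order whose lag coefficients $\alpha_1 \beta^{i-1}$ decline geometrically, which is why three GARCH(1,1) parameters can stand in for a long ARCH(q).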

THE UNCONDITIONAL VARIANCE UNDER GARCH

The unconditional variance of $u_t$ is given by

$$\mathrm{var}(u_t) = \frac{\alpha_0}{1 - (\alpha_1 + \beta)}$$

when $\alpha_1 + \beta < 1$.
$\alpha_1 + \beta > 1$ is termed non-stationarity in variance;
$\alpha_1 + \beta = 1$ is termed integrated GARCH.
For non-stationarity in variance, the conditional variance forecasts will not converge on their unconditional value as the horizon increases.

ESTIMATION OF ARCH/GARCH MODELS

Since the model is no longer of the usual linear form, we cannot use OLS.
We use maximum likelihood as we already discussed.
The method works by finding the most likely values of the parameters given the actual data.
More specifically, we form a log-likelihood function and maximise it.

ESTIMATION OF GARCH MODELS USING MAXIMUM LIKELIHOOD

The steps involved in actually estimating an ARCH or GARCH model are as follows:
1 Specify the appropriate equations for the mean and the variance, e.g. an AR(1)-GARCH(1,1) model:

$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2), \qquad \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 \tag{1}$$

2 Specify the log-likelihood function to maximise:

$$l = -\frac{T}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \log(\sigma_t^2) - \frac{1}{2} \sum_{t=1}^{T} \frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2} \tag{2}$$

3 The computer will maximise the function and give parameter values and their standard errors.

Now we get model (1) and likelihood function (2).
Unfortunately, the LLF for a model with time-varying variances cannot be maximised analytically, except in the simplest of cases. So a numerical procedure is used to maximise the log-likelihood function. A potential problem: local optima or multimodalities in the likelihood surface.
The way we do the optimisation is:
1 Set up LLF.
2 Use regression to get initial guesses for the mean parameters.
3 Choose some initial guesses for the conditional variance parameters.
4 Specify a convergence criterion - either by criterion or by value.
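To make the "set up LLF" step concrete, the sketch below evaluates log-likelihood (2) for the AR(1)-GARCH(1,1) model (1) in standard-library Python. It is illustrative only: the function names, the simulated data, the parameter values and the choice to start the variance recursion at the sample variance of the residuals are assumptions, and a real application would pass `log_likelihood` to a numerical optimiser rather than compare two hand-picked parameter vectors.

```python
import math
import random

def simulate_ar1_garch11(n, mu, phi, alpha0, alpha1, beta, seed=0):
    """Simulate y_t = mu + phi*y_{t-1} + u_t with GARCH(1,1) errors."""
    rng = random.Random(seed)
    sigma2 = alpha0 / (1.0 - alpha1 - beta)  # start at the unconditional variance
    y_prev, u_prev = mu / (1.0 - phi), 0.0
    y = []
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * u_prev ** 2 + beta * sigma2
        u_prev = rng.gauss(0.0, 1.0) * math.sqrt(sigma2)
        y_prev = mu + phi * y_prev + u_prev
        y.append(y_prev)
    return y

def log_likelihood(params, y):
    """Gaussian log-likelihood, summed over the observations that have a lag."""
    mu, phi, alpha0, alpha1, beta = params
    resid = [y[t] - mu - phi * y[t - 1] for t in range(1, len(y))]
    sigma2 = sum(e * e for e in resid) / len(resid)  # assumed start value
    ll, u_prev = 0.0, None
    for e in resid:
        if u_prev is not None:
            sigma2 = alpha0 + alpha1 * u_prev ** 2 + beta * sigma2
        ll += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma2) + e * e / sigma2)
        u_prev = e
    return ll

# Hypothetical parameter values used only for this illustration.
true_params = (0.0, 0.3, 0.1, 0.1, 0.8)
wrong_params = (0.0, 0.3, 1.0, 0.1, 0.8)  # alpha0 ten times too large
y = simulate_ar1_garch11(2000, *true_params, seed=1)
ll_true = log_likelihood(true_params, y)
ll_wrong = log_likelihood(wrong_params, y)
print(ll_true, ll_wrong)
```

The likelihood is higher at the data-generating parameters than at the distorted ones, which is the property a numerical optimiser exploits.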

NON-NORMALITY AND MAXIMUM LIKELIHOOD

Recall that the conditional normality assumption for $u_t$ is essential.
We can test for normality using the following representation:

$$u_t = v_t \sigma_t, \qquad v_t \sim N(0, 1)$$
$$\sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2}, \qquad v_t = \frac{u_t}{\sigma_t}$$

The sample counterpart is $\hat{v}_t = \hat{u}_t / \hat{\sigma}_t$.
Are the $\hat{v}_t$ normal? Typically $\hat{v}_t$ are still leptokurtic, although less so than $\hat{u}_t$. Is this a problem? Not really, as we discussed before. We can use the ML with a robust variance/covariance estimator. ML with robust standard errors is called Quasi-Maximum Likelihood or QML (also known as pseudo-maximum likelihood).

EXTENSIONS OF THE BASIC GARCH

Since the GARCH model was developed, a huge number of extensions and variants have been proposed. Three of the most important examples are EGARCH, GJR, and GARCH-M models.
Problems with GARCH(p,q) Models:
Non-negativity constraints may still be violated;
GARCH models cannot account for leverage effects.
Possible solutions: the exponential GARCH (EGARCH) model or the GJR model, which are asymmetric GARCH models.

THE EGARCH MODEL

Suggested by Nelson (1991). The variance equation is given by

$$\log \sigma_t^2 = \omega + \beta \log \sigma_{t-1}^2 + \gamma \frac{u_{t-1}}{\sqrt{\sigma_{t-1}^2}} + \alpha \left[ \frac{|u_{t-1}|}{\sqrt{\sigma_{t-1}^2}} - \sqrt{\frac{2}{\pi}} \right]$$

Advantages of the model:
Since we model $\log \sigma_t^2$, even if the parameters are negative, $\sigma_t^2$ will be positive.
We can account for the leverage effect by noticing that a negative shock ($u_{t-1} < 0$) has an asymmetric effect on the dependent variable $\log \sigma_t^2$ as opposed to a positive shock.

THE GJR MODEL

Due to Glosten, Jagannathan and Runkle:

$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$$

where
$I_{t-1} = 1$, if $u_{t-1} < 0$
$I_{t-1} = 0$, otherwise
For a leverage effect, we would see $\gamma > 0$.
We require $\alpha_1 + \gamma \geq 0$ and $\alpha_1 \geq 0$ for non-negativity conditions.
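The asymmetry built into the EGARCH and GJR variance equations can be seen by feeding each equation a negative and a positive shock of the same size. The sketch below is illustrative only: the function names and every parameter value are hypothetical, chosen so that a negative shock raises next-period variance by more than a positive one.

```python
import math

def egarch_next_log_var(u_prev, sigma2_prev, omega, beta, gamma, alpha):
    """One step of the EGARCH equation for log(sigma_t^2)."""
    z = u_prev / math.sqrt(sigma2_prev)  # standardised lagged shock
    return (omega + beta * math.log(sigma2_prev) + gamma * z
            + alpha * (abs(z) - math.sqrt(2.0 / math.pi)))

def gjr_next_var(u_prev, sigma2_prev, alpha0, alpha1, beta, gamma):
    """One step of the GJR equation for sigma_t^2."""
    indicator = 1.0 if u_prev < 0 else 0.0
    return (alpha0 + alpha1 * u_prev ** 2 + beta * sigma2_prev
            + gamma * u_prev ** 2 * indicator)

# Hypothetical EGARCH parameters; note that omega and gamma are negative,
# yet exp(log variance) is still positive.
eg = dict(omega=-0.1, beta=0.9, gamma=-0.1, alpha=0.2)
egarch_neg = math.exp(egarch_next_log_var(-1.0, 1.0, **eg))
egarch_pos = math.exp(egarch_next_log_var(1.0, 1.0, **eg))

# Hypothetical GJR parameters with gamma > 0 (leverage effect).
gj = dict(alpha0=0.1, alpha1=0.05, beta=0.8, gamma=0.1)
gjr_neg = gjr_next_var(-1.0, 1.0, **gj)
gjr_pos = gjr_next_var(1.0, 1.0, **gj)

print(egarch_neg, egarch_pos, gjr_neg, gjr_pos)
```

In the GJR case the gap between the two responses is exactly $\gamma u_{t-1}^2$, which is the term switched on by the indicator.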

AN EXAMPLE OF GJR

Using monthly SPX returns, December 1979 - June 1998.
Estimating a GJR model, we obtain the following results (the figure in parentheses is reported below each coefficient):

$$y_t = \underset{(3.198)}{0.172}$$
$$\sigma_t^2 = \underset{(16.372)}{1.243} + \underset{(0.437)}{0.015}\, u_{t-1}^2 + \underset{(14.999)}{0.498}\, \sigma_{t-1}^2 + \underset{(5.772)}{0.604}\, u_{t-1}^2 I_{t-1}$$

NEWS IMPACT CURVES

The news impact curve plots the next period volatility ($h_t$) that would arise from various positive and negative values of $u_{t-1}$, given an estimated model.
News Impact Curves for SPX returns using coefficients from GARCH and GJR Model Estimates:

[Figure: news impact curves for the GARCH and GJR models. Horizontal axis: Value of Lagged Shock (-1 to 0.9). Vertical axis: Value of Conditional Variance (0 to 0.14).]
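A news impact curve can be traced by holding the lagged variance fixed and varying the lagged shock. The sketch below plugs the GJR coefficients reported in the example above into the variance equation; the function name `gjr_news_impact`, the grid of shocks and the fixed lagged variance of 1.0 are assumptions made for illustration and do not reproduce the scale of the original figure.

```python
def gjr_news_impact(shock, sigma2_prev=1.0,
                    alpha0=1.243, alpha1=0.015, beta=0.498, gamma=0.604):
    """Next-period variance implied by a lagged shock, lagged variance held fixed."""
    indicator = 1.0 if shock < 0 else 0.0
    return (alpha0 + alpha1 * shock ** 2 + beta * sigma2_prev
            + gamma * shock ** 2 * indicator)

# Evaluate the curve on a symmetric grid of lagged shocks from -1.0 to 1.0.
grid = [i / 10.0 for i in range(-10, 11)]
curve = [(s, gjr_news_impact(s)) for s in grid]
for shock, variance in curve:
    print(f"{shock:5.1f}  {variance:.4f}")

# Asymmetry at a shock of size 0.5: the negative shock adds gamma * 0.25.
asymmetry = gjr_news_impact(-0.5) - gjr_news_impact(0.5)
```

Setting `gamma=0.0` gives the symmetric curve of a plain GARCH(1,1) with the same remaining coefficients, which is the comparison the figure draws.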

GARCH-IN-MEAN

We expect a risk to be compensated by a higher return. So why not let the return of a security be partly determined by its risk?
Engle, Lilien and Robins (1987) suggested the ARCH-M specification. A GARCH-M model would be

$$y_t = \mu + \delta \sigma_{t-1} + u_t, \qquad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$

$\delta$ can be interpreted as a sort of risk premium.
It is possible to combine all or some of these models together to get more complex hybrid models, e.g. an ARMA-EGARCH(1,1)-M model.

WHY WE NEED THE GARCH FAMILY

GARCH can model the volatility clustering effect since the conditional variance is autoregressive. Such models can be used to forecast volatility.
We could show that

$$\mathrm{Var}(y_t \mid y_{t-1}, y_{t-2}, \ldots) = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \ldots)$$

So modelling $\sigma_t^2$ will give us models and forecasts for the variance of $y_t$ as well.
Variance forecasts are additive over time.
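The forecasting use of GARCH can be illustrated with the GARCH(1,1) variance equation. Because the expected squared error equals the expected conditional variance, forecasts beyond one step follow the recursion $f_h = \alpha_0 + (\alpha_1 + \beta) f_{h-1}$. The sketch below is illustrative only: the function name and all numerical values are hypothetical, and summing the per-period forecasts to get a multi-period variance assumes serially uncorrelated errors.

```python
def garch11_forecasts(next_var, alpha0, alpha1, beta, horizon):
    """Conditional variance forecasts for 1, ..., horizon steps ahead."""
    forecasts = [next_var]
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta) * forecasts[-1])
    return forecasts

# Hypothetical parameters with alpha1 + beta = 0.9 < 1 (stationary in variance).
alpha0, alpha1, beta = 0.1, 0.1, 0.8
unconditional = alpha0 / (1.0 - (alpha1 + beta))

# Start from a one-step-ahead variance that is twice the unconditional level.
path = garch11_forecasts(2.0, alpha0, alpha1, beta, horizon=200)

# Additivity: the forecast variance of the 5-period sum is the sum of the
# five one-period variance forecasts.
five_period_variance = sum(path[:5])

print(path[:3], path[-1], unconditional, five_period_variance)
```

With $\alpha_1 + \beta < 1$ the forecasts decay geometrically towards the unconditional variance; with $\alpha_1 + \beta \geq 1$ the same recursion would not converge.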


TOPIC 6: MULTIVARIATE GARCH

INTRODUCTORY FINANCIAL ECONOMETRICS
Topic 6 Multivariate GARCH
3 CREDITS, 51 HOURS
Jianhua Gang
School of Finance
Renmin University of China

Spring 2013

MGARCH FAMILY

PROBLEM
1 Is the volatility of a market leading the volatility of other markets?
2 Is the volatility of an asset transmitted to another asset directly (through its conditional variance) or indirectly (through its conditional covariances)?
3 Is the impact the same for negative and positive shocks of the same amplitude?
4 Do the correlations between asset returns change over time?
Are they higher during periods of higher volatility (sometimes associated with financial crises)?
Are they increasing in the long run, perhaps because of the globalization of financial markets?

The above questions can be answered using MGARCH.
For example, the impact of volatility in financial markets on real variables like exports and output growth rates, and the volatility of these growth rates.
Literature regarding MGARCH models: Bollerslev et al. (1988); Gourieroux (1997); De Santis and Gerard (1998); Hafner and Herwartz (1998); Franses and van Dijk (2000); Lien and Tse (2002).
MGARCH models were initially developed in the late 1980s and the first half of the 1990s. The 2000s brought another active phase in this field.

MODEL CONSTRUCTION IN GENERAL

DEFINITION (CONDITIONAL MEAN)

$$y_t = \mu_t(\theta) + \varepsilon_t$$

DEFINITION (CONDITIONAL VARIANCE)

$$\varepsilon_t = H_t^{1/2}(\theta)\, z_t$$

where $H_t^{1/2}(\theta)$ is an $N \times N$ positive definite matrix and the $N \times 1$ random vector $z_t$ satisfies

$$E(z_t) = 0, \qquad \mathrm{Var}(z_t) = I_N,$$

where $I_N$ is the identity matrix of order N.

CATEGORY 1: VEC AND BEKK

CAT. 1: DIRECT GENERALIZATION

VEC MODELS (BOLLERSLEV ET AL. 1988)

DEFINITION (CONDITIONAL VARIANCE)

$$h_t = c + A \eta_{t-1} + G h_{t-1}$$

where

$$h_t = \mathrm{vech}(H_t), \qquad \eta_t = \mathrm{vech}(\varepsilon_t \varepsilon_t')$$

and $\mathrm{vech}(\cdot)$ denotes the operator that stacks the lower triangular portion of an $N \times N$ matrix as an $N(N+1)/2 \times 1$ vector. A and G are square parameter matrices of order $N(N+1)/2$ and c is an $N(N+1)/2 \times 1$ parameter vector.
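The vech operator and the size of the VEC parameter space can be made concrete with a small sketch. The function names `vech` and `vec_param_count` are illustrative choices, and the column-by-column stacking order is one common convention assumed here.

```python
def vech(matrix):
    """Stack the lower triangular part (diagonal included) column by column."""
    n = len(matrix)
    return [matrix[i][j] for j in range(n) for i in range(j, n)]

def vec_param_count(n):
    """Parameters in h_t = c + A*eta_{t-1} + G*h_{t-1} for N = n series."""
    m = n * (n + 1) // 2      # length of vech(H_t)
    return m + 2 * m * m      # c has m elements; A and G are m x m

H = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]
print(vech(H))              # [1, 2, 3, 4, 5, 6]
print(vec_param_count(2))   # 21
print(vec_param_count(3))   # 78
```

The count grows with the fourth power of N, which is why the unrestricted VEC model is rarely estimated beyond the bivariate case.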

VEC AND DVEC (BOLLERSLEV ET AL. 1988)

The number of parameters is 78 for N = 3. So in practice, this model is used only in the bivariate case.
Bollerslev et al. (1988) suggest the DVEC (diagonal VEC) model to overcome this: the A and G matrices are assumed to be diagonal, each element $h_{ijt}$ depending only on its own lag and on the previous value of $\varepsilon_{it} \varepsilon_{jt}$.
DVEC hence reduces the number of parameters to 18 for N = 3 (6 elements in c plus 6 diagonal elements in each of A and G).
But, even under this diagonality, large-scale systems are still highly parameterized and difficult to estimate.
An even simpler version of the DVEC (Ding and Engle, 2001) restricts A and G to be positive scalars (scalar model).

RISKMETRICS (1996) (NOW MSCI)

Practitioners who study volatility processes often observe that their model is very close to the unit root case.
To take this into account, Riskmetrics uses the exponentially weighted moving average model (EWMA) and defines the variances and covariances as IGARCH-type models (Engle and Bollerslev, 1986):

$$h_t = (1 - \lambda)\, \eta_{t-1} + \lambda\, h_{t-1}$$

which is a scalar VEC. The decay factor $\lambda$ proposed by Riskmetrics is 0.94 for daily data and 0.97 for monthly data.
However, the decay factor is not estimated but merely suggested, and is therefore hard to justify.

BEKK

It is difficult to guarantee the positivity of $H_t$ in the VEC representation without imposing strong restrictions on the parameters.
Engle and Kroner (1995) propose an alternative parametrization of $H_t$ that ensures positivity: the BEKK model.

DEFINITION (BEKK(1,1,K))

$$H_t = C'C + \sum_{k=1}^{K} A_k' \varepsilon_{t-1} \varepsilon_{t-1}' A_k + \sum_{k=1}^{K} G_k' H_{t-1} G_k \tag{1}$$

where $C$, $A_k$ and $G_k$ are $N \times N$ matrices but $C$ is upper triangular.
K determines the generality of the process.
Parameters of the BEKK model do not represent directly the impact of the different lagged terms on the elements of $H_t$ like in the VEC model.
When K = 1, this is a VEC with $C$, $A_k$ and $G_k$ being restricted to be positive.

However, more parsimonious models are still preferred (at the cost of reduced generality).
Impose the diagonal BEKK model, i.e. take $A_k$ and $G_k$ to be diagonal. (This is also a DVEC model but a less general one; unlike the general DVEC, it is guaranteed to be positive definite.)
Scalar BEKK is also applicable, i.e. $A_k$ and $G_k$ are equal to a scalar times a matrix of ones.
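Why the BEKK form keeps $H_t$ positive definite can be seen from one step of recursion (1) with K = 1: each term is a quadratic form, so the sum of $C'C$ and two positive semidefinite matrices is positive definite whenever $C'C$ is. The bivariate sketch below is illustrative only; the helper names and all matrix entries are hypothetical, and some entries of A and G are deliberately negative.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def bekk_step(C, A, G, eps_prev, H_prev):
    """One step of H_t = C'C + A' eps eps' A + G' H_{t-1} G (BEKK with K = 1)."""
    outer = [[x * y for y in eps_prev] for x in eps_prev]
    terms = [matmul(transpose(C), C),
             matmul(matmul(transpose(A), outer), A),
             matmul(matmul(transpose(G), H_prev), G)]
    n = len(eps_prev)
    return [[sum(t[i][j] for t in terms) for j in range(n)] for i in range(n)]

# Hypothetical parameter matrices; C is upper triangular.
C = [[0.3, 0.1], [0.0, 0.2]]
A = [[0.3, -0.4], [0.2, 0.1]]
G = [[0.9, 0.05], [-0.1, 0.8]]
H_prev = [[1.0, 0.2], [0.2, 1.0]]
eps_prev = [-1.5, 0.7]

H = bekk_step(C, A, G, eps_prev, H_prev)
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(H, det)
```

A positive leading element and a positive determinant confirm that this 2 x 2 matrix is positive definite, with no sign restrictions placed on the individual elements of A and G.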
VEC AND BEKK

The difficulty when estimating a VEC or even a BEKK model is the high number of unknown parameters, even after imposing several restrictions.
It is thus not surprising that these models are rarely used when the number of series is larger than 3 or 4.
Factor and orthogonal models circumvent this difficulty by imposing a common dynamic structure on all the elements of $H_t$, which results in less parameterized models.

FACTOR MODEL (ENGLE ET AL. (1990B), BOLLERSLEV AND ENGLE (1993))

DEFINITION (FGARCH(1,1,K))
Lin (1992): the BEKK(1,1,K) model above is a factor GARCH model, denoted by F-GARCH(1,1,K), if for each $k = 1, \ldots, K$, $A_k$ and $G_k$ have rank one and have the same left and right eigenvectors, $\lambda_k$ and $w_k$, i.e.

$$A_k = \alpha_k w_k \lambda_k', \qquad G_k = \beta_k w_k \lambda_k', \tag{2}$$

where $\alpha_k$ and $\beta_k$ are scalars, and $w_k$ and $\lambda_k$ (for $k = 1, \ldots, K$) are $N \times 1$ vectors satisfying

$$w_k' \lambda_i = \begin{cases} 0 & \text{for } k \neq i \\ 1 & \text{for } k = i \end{cases} \tag{3}$$

$$\sum_{n=1}^{N} w_{kn} = 1. \tag{4}$$

DEFINITION (FGARCH(1,1,K))
Substituting (2) and (3) into (1) and defining $\Omega = C'C$, we get

$$H_t = \Omega + \sum_{k=1}^{K} \lambda_k \lambda_k' \left( \alpha_k^2 w_k' \varepsilon_{t-1} \varepsilon_{t-1}' w_k + \beta_k^2 w_k' H_{t-1} w_k \right) \tag{5}$$

Restriction (4) is an identification restriction.
The K-factor GARCH implies that the time-varying part of $H_t$, i.e. $H_t - \Omega$, has reduced rank K, but $H_t$ is of full rank because $\Omega$ is positive definite.
The vector $\lambda_k$ is defined as the factor loading, and the scalar $w_k' \varepsilon_t$ (denoted as $f_{kt}$) is the kth factor.
The expression between brackets can be replaced by other univariate GARCH specifications.

Consider, for instance, the two-factor F-GARCH model, F-GARCH(1,1,2):

$$H_t = \Omega + \lambda_1 \lambda_1' \left( \alpha_1^2 w_1' \varepsilon_{t-1} \varepsilon_{t-1}' w_1 + \beta_1^2 w_1' H_{t-1} w_1 \right) + \lambda_2 \lambda_2' \left( \alpha_2^2 w_2' \varepsilon_{t-1} \varepsilon_{t-1}' w_2 + \beta_2^2 w_2' H_{t-1} w_2 \right) \tag{6}$$

Alternatively, the two-factor model can be obtained from

$$\varepsilon_t = \lambda_1 f_{1t} + \lambda_2 f_{2t} + e_t$$

where $e_t$ represents an idiosyncratic shock with constant variance matrix and uncorrelated with the two factors.
Each factor $f_{kt}$ has zero conditional mean and conditional variance like a GARCH(1,1) process.

A K-factor model can then be written as

$$\varepsilon_t = \Lambda f_t + e_t$$

where $\Lambda$ is a matrix of dimension $N \times K$ and $f_t$ is a $K \times 1$ vector. A factor is observable if it is specified as a function of $\varepsilon_t$.
Variants of the factor model exist in the literature, e.g. Vrontos et al. (2003), the full-factor multivariate GARCH model (FF-GARCH).

CATEGORY 2: LINEAR COMBINATIONS OF UNIVARIATE GARCH

CAT. 2: LINEAR COMBINATIONS

ORTHOGONAL GARCH

DEFINITION (O-GARCH(1,1,m))
Kariya (1988) and Alexander and Chibumba (1997): the $N \times N$ time-varying variance matrix $H_t$ is generated by $m \leq N$ univariate GARCH models.

CATEGORY 3: NONLINEAR COMBINATIONS

CAT. 3: NONLINEAR COMBINATIONS

Models in this category allow one to specify separately the individual conditional variances, on the one hand, and the conditional correlation matrix or another measure of dependence between the individual series (like the copula of the conditional joint density), on the other.
A hierarchical procedure:
1 Choose a GARCH-type model for each conditional variance (the model may vary across the series within the multivariate system);
2 Model the conditional correlation matrix (imposing positive definiteness for any t).
For models of this category, theoretical results on stationarity, ergodicity and moments may not be so straightforward.
Nonetheless, they are less greedy in parameters than other categories, and therefore more easily estimable.

CCC MODEL

MGARCH models in which the conditional correlations are constant.
Thus the conditional covariances are proportional to the product of the corresponding conditional standard deviations.
This restriction greatly reduces the number of unknown parameters and thus simplifies the estimation. (Bollerslev (1990))

DEFINITION (CCC)
The CCC model is defined as:

$$H_t = D_t R D_t = \left( \rho_{ij} \sqrt{h_{iit} h_{jjt}} \right)$$

where

$$D_t = \mathrm{diag}\left( h_{11t}^{1/2} \ldots h_{NNt}^{1/2} \right) \tag{7}$$

$h_{iit}$ can be defined as any univariate GARCH model, and $R = (\rho_{ij})$ is a symmetric positive definite matrix with $\rho_{ii} = 1$, for any i.

R is the matrix containing the constant conditional correlations $\rho_{ij}$. The original CCC model has a GARCH(1,1) specification for each conditional variance in $D_t$:

$$h_{iit} = \omega_i + \alpha_i \varepsilon_{i,t-1}^2 + \beta_i h_{ii,t-1}, \qquad i = 1, 2, \ldots, N.$$

However, unconditional covariances are difficult to calculate.
He and Terasvirta (2002b) use a VEC-type formulation for $(h_{11t}, h_{22t}, \ldots, h_{NNt})$ to allow for interactions between the conditional variances. They call this the extended CCC model.
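The CCC construction $H_t = D_t R D_t$ amounts to scaling a fixed correlation matrix by the current conditional standard deviations. The sketch below builds $H_t$ for two series; the function name and the numerical values of the conditional variances and of the correlation are hypothetical.

```python
import math

def ccc_covariance(cond_variances, R):
    """Build H_t = D_t R D_t, where D_t holds conditional standard deviations."""
    sd = [math.sqrt(h) for h in cond_variances]
    n = len(sd)
    return [[R[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]

# Hypothetical conditional variances h_11t, h_22t from two univariate GARCH models.
h_t = [0.04, 0.09]
R = [[1.0, 0.5],
     [0.5, 1.0]]   # constant conditional correlation of 0.5

H_t = ccc_covariance(h_t, R)
print(H_t)
```

The off-diagonal element equals the correlation times the product of the two conditional standard deviations (0.5 x 0.2 x 0.3 here), so only the variances carry the dynamics.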

DCC MODEL

CCC: Assumption that the conditional correlations are constant: unrealistic in many empirical applications.
DCC: Generalization of the CCC by making the conditional correlation matrix time-dependent (Christodoulakis and Satchell (2002), Engle (2002) and Tse and Tsui (2002)).
An additional difficulty is that the time-dependent conditional correlation matrix has to be positive definite for any t. (The DCC models guarantee this under simple conditions on the parameters.)

The DCC model of Christodoulakis and Satchell (2002) only allows the bivariate case.
The DCC models of Tse and Tsui (2002) and Engle (2002) are useful when modelling high-dimensional data sets.

DEFINITION (DCC MODEL OF TSE AND TSUI (2002) OR $DCC_T(M)$)

$$H_t = D_t R_t D_t$$

where $D_t$ is defined in (7), $h_{iit}$ can be defined as any univariate GARCH model and

$$R_t = (1 - \theta_1 - \theta_2) R + \theta_1 \Psi_{t-1} + \theta_2 R_{t-1} \tag{8}$$

DEFINITION (CONT. DCC MODEL OF TSE AND TSUI (2002) OR $DCC_T(M)$)
In (8), $\theta_1$ and $\theta_2$ are non-negative parameters satisfying $\theta_1 + \theta_2 < 1$, $R$ is a symmetric $N \times N$ positive definite parameter matrix with $\rho_{ii} = 1$ and $\Psi_{t-1}$ is the $N \times N$ correlation matrix of $\varepsilon_\tau$ for $\tau = t - M, t - M + 1, \ldots, t - 1$. Its i, jth element is given by:

$$\psi_{ij,t-1} = \frac{\sum_{m=1}^{M} u_{i,t-m} u_{j,t-m}}{\sqrt{\left( \sum_{m=1}^{M} u_{i,t-m}^2 \right) \left( \sum_{m=1}^{M} u_{j,t-m}^2 \right)}} \tag{9}$$

where $u_{it} = \varepsilon_{it} / \sqrt{h_{iit}}$. The matrix $\Psi_{t-1}$ can be expressed as:

$$\Psi_{t-1} = B_{t-1}^{-1} L_{t-1} L_{t-1}' B_{t-1}^{-1},$$

where $B_{t-1}$ is an $N \times N$ diagonal matrix with ith diagonal element given by $\left( \sum_{h=1}^{M} u_{i,t-h}^2 \right)^{1/2}$ and $L_{t-1} = (u_{t-1}, \ldots, u_{t-M})$ is an $N \times M$ matrix, with $u_t = (u_{1t}\; u_{2t} \ldots u_{Nt})'$.
A necessary condition to ensure the positivity of $\Psi_{t-1}$, and therefore also of $R_t$, is that $M \geq N$ (otherwise $L_{t-1} L_{t-1}'$ cannot have full rank). Then $R_t$ is itself a correlation matrix if $R_{t-1}$ is also a correlation matrix (notice $\rho_{iit} = 1$ for any i).
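Equations (8) and (9) can be traced numerically. The sketch below computes the rolling correlation matrix $\Psi_{t-1}$ from a window of M standardized residual vectors and then applies one step of (8). The function names, the window of residuals and the parameter values are all hypothetical.

```python
import math

def window_correlation(u_window):
    """Psi_{t-1} from equation (9); u_window holds u_{t-1}, ..., u_{t-M}."""
    n = len(u_window[0])
    s = [[sum(u[i] * u[j] for u in u_window) for j in range(n)] for i in range(n)]
    return [[s[i][j] / math.sqrt(s[i][i] * s[j][j]) for j in range(n)]
            for i in range(n)]

def dcc_t_step(R_bar, psi_prev, R_prev, theta1, theta2):
    """One step of R_t = (1 - theta1 - theta2) R + theta1 Psi_{t-1} + theta2 R_{t-1}."""
    n = len(R_bar)
    w0 = 1.0 - theta1 - theta2
    return [[w0 * R_bar[i][j] + theta1 * psi_prev[i][j] + theta2 * R_prev[i][j]
             for j in range(n)] for i in range(n)]

# Hypothetical window of M = 3 standardized residual vectors for N = 2 series.
u_window = [[0.5, 0.8], [-1.2, -0.3], [0.9, -0.4]]
R_bar = [[1.0, 0.3], [0.3, 1.0]]

psi = window_correlation(u_window)
R_t = dcc_t_step(R_bar, psi, R_prev=R_bar, theta1=0.1, theta2=0.8)
print(psi, R_t)
```

Because $R_t$ is a convex combination of three matrices with unit diagonals, its diagonal stays at one, which is the property noted at the end of the definition.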

Alternatively, Engle (2002) proposes a different DCC model (see also Engle and Sheppard, 2001).

DEFINITION (DCC MODEL OF ENGLE (2002) OR $DCC_E(1,1)$)

$$H_t = D_t R_t D_t$$

where

$$R_t = \mathrm{diag}\left( q_{11,t}^{-1/2} \ldots q_{NN,t}^{-1/2} \right) Q_t \,\mathrm{diag}\left( q_{11,t}^{-1/2} \ldots q_{NN,t}^{-1/2} \right)$$

where the $N \times N$ symmetric positive definite matrix $Q_t = (q_{ij,t})$ is given by:

$$Q_t = (1 - \alpha - \beta) \bar{Q} + \alpha u_{t-1} u_{t-1}' + \beta Q_{t-1}$$

DEFINITION (CONT. DCC MODEL OF ENGLE (2002) OR $DCC_E(1,1)$)
with $u_t$ as in (9). $\bar{Q}$ is the $N \times N$ unconditional variance matrix of $u_t$, and $\alpha$ and $\beta$ are non-negative scalar parameters satisfying $\alpha + \beta < 1$.
The elements of $\bar{Q}$ can be estimated or set to their empirical counterpart to render the estimation even simpler.
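One step of the Engle (2002) recursion can be sketched as follows: update $Q_t$, then rescale it by its own diagonal to obtain the correlation matrix $R_t$. The function name `dcc_e_step` and all numerical inputs are hypothetical, and $Q_{t-1}$ is simply started at $\bar{Q}$.

```python
import math

def dcc_e_step(Q_prev, u_prev, Q_bar, alpha, beta):
    """Return (Q_t, R_t) for Q_t = (1-a-b) Q_bar + a u u' + b Q_{t-1}."""
    n = len(u_prev)
    w0 = 1.0 - alpha - beta
    Q = [[w0 * Q_bar[i][j] + alpha * u_prev[i] * u_prev[j] + beta * Q_prev[i][j]
          for j in range(n)] for i in range(n)]
    # Rescale by the diagonal so that R_t has ones on its diagonal.
    R = [[Q[i][j] / math.sqrt(Q[i][i] * Q[j][j]) for j in range(n)]
         for i in range(n)]
    return Q, R

# Hypothetical inputs: two series, shocks of opposite sign at t-1.
Q_bar = [[1.0, 0.3], [0.3, 1.0]]
u_prev = [1.2, -0.8]
Q_t, R_t = dcc_e_step(Q_bar, u_prev, Q_bar, alpha=0.05, beta=0.9)
print(Q_t, R_t)
```

Shocks of opposite sign pull the conditional correlation below its long-run value of 0.3, and the rescaling step is what keeps $R_t$ a valid correlation matrix.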
Explicit difference between $DCC_T(M)$ and $DCC_E(1,1)$, see conditional correlations:

$$\rho_{12t} = (1 - \theta_1 - \theta_2) \rho_{12} + \theta_2 \rho_{12,t-1} + \theta_1 \frac{\sum_{m=1}^{M} u_{1,t-m} u_{2,t-m}}{\sqrt{\left( \sum_{m=1}^{M} u_{1,t-m}^2 \right) \left( \sum_{m=1}^{M} u_{2,t-m}^2 \right)}} \tag{10}$$

$$\rho_{12t} = \frac{(1 - \alpha - \beta) \bar{q}_{12} + \alpha u_{1,t-1} u_{2,t-1} + \beta q_{12,t-1}}{\sqrt{\left( (1 - \alpha - \beta) \bar{q}_{11} + \alpha u_{1,t-1}^2 + \beta q_{11,t-1} \right) \left( (1 - \alpha - \beta) \bar{q}_{22} + \alpha u_{2,t-1}^2 + \beta q_{22,t-1} \right)}} \tag{11}$$

Unlike in $DCC_T$, the $DCC_E$ model does not formulate the conditional correlation as a weighted sum of past correlations.
For both $DCC_T$ and $DCC_E$ models, one can test $\theta_1 = \theta_2 = 0$ or $\alpha = \beta = 0$, respectively, to check whether imposing constant conditional correlations is empirically relevant.
A drawback of the DCC models is that $\theta_1$, $\theta_2$ in $DCC_T$ and $\alpha$, $\beta$ in $DCC_E$ are scalars, so that all the conditional correlations obey the same dynamics. This is however necessary to ensure that $R_t$ is positive definite for any t through sufficient conditions on the parameters.

DCC models can be estimated consistently in two steps, which makes this approach feasible when N is high.
Of course, when N is large, the restriction of common dynamics gets tighter, but for large N the problem of maintaining tractability also gets harder. In this respect, several variants of the DCC model are proposed in the literature.
Billio et al. (2003) argue that constraining the dynamics of the conditional correlation matrix to be the same for all the correlations is not desirable.
Pelletier (2003) proposes a model where the conditional correlations follow a switching regime driven by an unobserved Markov chain so that the correlation matrix is constant in each regime but may vary across regimes.

WHY DCC

DCC models open the door to using flexible GARCH specifications in the variance part.
As the conditional variances (together with the conditional means) can be estimated using N univariate models, one can easily extend the DCC-GARCH models to more complex GARCH-type structures.

General Dynamic Covariance Model

A model somewhat different from the previous ones, but that nests several of them, is the general dynamic covariance (GDC) model proposed by Kroner and Ng (1998).

Definition (GDC Model)

$$H_t = D_t R_t D_t + \Phi \odot \Theta_t \qquad (12)$$

where $D_t = (d_{ijt})$, $d_{iit} = \sqrt{\theta_{iit}}$, $d_{ijt} = 0$ for $i \neq j$, $\Theta_t = (\theta_{ijt})$.
$R_t$ is specified as DCC_T(M) or DCC_E(1,1). $\Phi = (\phi_{ij})$, $\phi_{ii} = 0$, $\phi_{ij} = \phi_{ji}$,

$$\theta_{ijt} = \omega_{ij} + \alpha_i' \varepsilon_{t-1}\varepsilon_{t-1}' \alpha_j + g_i' H_{t-1} g_j, \quad \text{for any } i, j. \qquad (13)$$

$\alpha_i$, $g_i$, $i = 1, \ldots, N$ are $(N \times 1)$ vectors of parameters, and $\Omega = (\omega_{ij})$ is positive definite and symmetric.


General Dynamic Covariance Model

Elementwise we have,

$$\begin{aligned} h_{ijt} &= \rho_{ijt}\sqrt{\theta_{iit}\theta_{jjt}} + \phi_{ij}\theta_{ijt}, \quad \text{for } i \neq j \\ h_{iit} &= |\theta_{iit}|, \quad \text{for any } i. \end{aligned} \qquad (14)$$

where the $\theta_{ijt}$ are given by the BEKK formulation in (13).
The GDC model contains several MGARCH models as special cases.

Copula-MGARCH Models

Any N-dimensional joint distribution function may be decomposed into its N marginal distributions, and a copula function that completely describes the dependence between the N variables (Sklar (1959), Nelsen (1999), Patton (2000), Jondeau and Rockinger (2001)).
Standard copula-GARCH:
1. GARCH for conditional variances;
2. marginal distributions for each series;
3. a conditional copula function.


These papers stress the need to allow for time-variation in the conditional copula, extending the DCC models to other specifications of the conditional dependence, so that the copula function is rendered time-varying through its parameters, which can be functions of past data.
Copula-GARCH models can be estimated using a two-step maximum likelihood approach.
Feature of copula-GARCH models: the ease with which very flexible joint distributions may be obtained in the bivariate case.
Their application to higher dimensions is a subject for further research.


Estimation Issues: Two-Step MLE

Two-step maximum likelihood estimation: Engle and Sheppard (2001) show that the loglikelihood can be written as the sum of a mean and volatility part and a correlation part, each depending on its own set of unknown parameters.
Therefore estimate the two sets of coefficients separately, say, $\hat{\theta}_1$, $\hat{\theta}_2$.
But maximizing the two parts separately is not fully efficient, since the resulting estimators are limited information estimators. However, one iteration of a Newton-Raphson algorithm applied to the total likelihood, starting at $(\hat{\theta}_1, \hat{\theta}_2)$, provides an estimator that is asymptotically efficient.


Estimation Issues: Variance Targeting (VTE)

MGARCH models are difficult: too many parameters!
A simple trick to ensure a reasonable value of the model-implied unconditional covariance matrix, which also helps to reduce the number of parameters in the maximization of the likelihood function, is referred to as VTE by Engle and Mezrich (1996).
VTE:
1. Re-parameterization of the model and the estimation of the unconditional variance;
2. QML estimation of the remaining parameters.
Merits: when the model is misspecified, the VTE can be superior to the QMLE for long-term prediction.

Diagnostic Checking

It is desirable to check,
1. ex ante: whether the data present evidence of multivariate ARCH effects;
2. ex post: the adequacy of the MGARCH specification.
Two kinds of tests: univariate and multivariate (very sparse) tests.


Diagnostic Checking

Since the dynamics of the series is assumed to be captured by the model (at least in the first two conditional moments), the standardized error term $z_t = H_t^{-1/2}\varepsilon_t$ should obey the following moment conditions (Ding and Engle, 2001):
1. $E(z_t z_t') = I_N$;
2. $\operatorname{Cov}(z_{it}^2, z_{jt}^2) = 0$, for all $i \neq j$;
3. $\operatorname{Cov}(z_{it}^2, z_{j,t-k}^2) = 0$, for $k > 0$.

Testing 1 has power to detect misspecification in the conditional mean;
Testing 2 is suited to check if the conditional distribution is Gaussian, which could be false even if $H_t$ is correctly specified.
Testing 3 aims at checking the adequacy of the dynamic specification of $H_t$, regardless of the validity of the assumption about the distribution of $z_t$.
Tse (2002): diagnostics for conditional heteroscedasticity models applied in the literature can be divided into three categories: portmanteau tests, residual-based diagnostics and Lagrange multiplier tests.


Topic 7: Multivariate Models

Introductory Financial Econometrics
Topic 7 Multivariate Models
3 Credits, 51 Hours
Jianhua Gang
School of Finance
Renmin University of China

Spring 2013


Simultaneous Equations Models

All the models we have looked at thus far have been single-equation models of the form: $y = X\beta + u$.
All of the variables contained in the $X$ matrix are assumed to be EXOGENOUS.
$y$ is an ENDOGENOUS variable.
An example from economics to illustrate: the demand and supply of a good:

$$Q_{dt} = \alpha + \beta P_t + \gamma S_t + u_t \qquad (1)$$
$$Q_{st} = \lambda + \mu P_t + k T_t + v_t \qquad (2)$$
$$Q_{dt} = Q_{st} \qquad (3)$$

in which $S_t$ = price of a substitute good; $T_t$ = some variable embodying the state of technology.

Simultaneous Equations Models: The Structural Form

Assuming that the market always clears, and dropping the time subscripts for simplicity:

$$Q = \alpha + \beta P + \gamma S + u \qquad (4)$$
$$Q = \lambda + \mu P + kT + v \qquad (5)$$

This is a simultaneous STRUCTURAL FORM of the model.
The point is that price and quantity are determined simultaneously (price affects quantity and quantity affects price). So, P and Q are endogenous variables, while S and T are exogenous.
We can obtain REDUCED FORM equations corresponding to (4) and (5) by solving equations (4) and (5) for P and for Q (separately).

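Obtaining the Reduced Form

Equating (4) and (5) to eliminate Q:

$$\alpha + \beta P + \gamma S + u = \lambda + \mu P + kT + v \qquad (6)$$

Solving (4) and (5) each for P and equating, to eliminate P:

$$\frac{Q}{\beta} - \frac{\alpha}{\beta} - \frac{\gamma S}{\beta} - \frac{u}{\beta} = \frac{Q}{\mu} - \frac{\lambda}{\mu} - \frac{kT}{\mu} - \frac{v}{\mu} \qquad (7)$$

Re-arranging (6):

$$P = \frac{\lambda - \alpha}{\beta - \mu} + \frac{k}{\beta - \mu}\,T - \frac{\gamma}{\beta - \mu}\,S + \frac{v - u}{\beta - \mu} \qquad (8)$$

Q can be ultimately calculated by multiplying (7) through with $\beta\mu$ and re-arranging:

$$Q = \frac{\mu\alpha - \beta\lambda}{\mu - \beta} - \frac{\beta k}{\mu - \beta}\,T + \frac{\mu\gamma}{\mu - \beta}\,S + \frac{\mu u - \beta v}{\mu - \beta} \qquad (9)$$

(8) and (9) are the reduced form equations for P and Q.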


Simultaneous Equations Bias

But what would happen if we had estimated equations (4) and (5), i.e. the structural form equations, separately using OLS?
Both equations depend on P. One of the CLRM assumptions was that $E(X'u) = 0$, where X is a matrix containing all the variables on the R.H.S. of the equation.
It is clear from (8) that P is related to the errors in (4) and (5), i.e. it is stochastic.
Hence the OLS estimate of the coefficient on P is biased, since $E(X'u) \neq 0$ in general.


Conclusion: Application of OLS to structural equations which are part of a simultaneous system will lead to biased coefficient estimates.
Is the OLS estimator still consistent, even though it is biased?
No. In fact the estimator is inconsistent as well.
Hence it would NOT be possible to estimate equations (4) and (5) validly using OLS.
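The inconsistency can be seen in a small simulation. The sketch below generates data from (4) and (5) using the reduced form (8), then compares OLS on the demand equation with the value of $\beta$ backed out from the reduced-form coefficients on T. All parameter values, the standard normal draws and the helper `ols` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical structural parameters
alpha, beta, gamma = 10.0, -1.0, 0.5   # demand, equation (4)
lam, mu, k = 2.0, 1.0, 0.8             # supply, equation (5)

S, T = rng.standard_normal(n), rng.standard_normal(n)   # exogenous variables
u, v = rng.standard_normal(n), rng.standard_normal(n)   # structural errors

# Market-clearing price from the reduced form (8), then quantity from (4)
P = ((lam - alpha) + k * T - gamma * S + (v - u)) / (beta - mu)
Q = alpha + beta * P + gamma * S + u

def ols(y, *cols):
    """OLS coefficients of y on a constant and the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# OLS on the structural demand equation: P is correlated with u
beta_ols = ols(Q, P, S)[1]

# Reduced forms regress P and Q on the exogenous T and S only.
# The coefficient on T is k/(beta-mu) for P and beta*k/(beta-mu) for Q,
# so their ratio recovers beta.
pi_P = ols(P, T, S)
pi_Q = ols(Q, T, S)
beta_reduced = pi_Q[1] / pi_P[1]

print(f"true beta {beta:.3f}, OLS {beta_ols:.3f}, via reduced form {beta_reduced:.3f}")
```

Under these assumed values the OLS estimate stays far from the true $\beta$ even with a very large sample, whereas the ratio of reduced-form coefficients settles close to it.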


Avoiding Simultaneous Equations Bias

So, what can we do?
Taking equations (8) and (9), we can rewrite them as

$$P = \pi_{10} + \pi_{11} T + \pi_{12} S + \varepsilon_1 \qquad (10)$$
$$Q = \pi_{20} + \pi_{21} T + \pi_{22} S + \varepsilon_2 \qquad (11)$$

We CAN estimate equations (10) and (11) using OLS since all the R.H.S. variables are exogenous.
But ... we probably don't care what the values of the $\pi$ coefficients are; what we wanted were the original parameters in the structural equations: $\alpha, \beta, \gamma, \lambda, \mu, k$.

Identification of Simultaneous Equations

Can we retrieve the original coefficients from the $\pi$s?
Short answer: sometimes.

Problem
As well as simultaneity, we sometimes encounter another problem: identification. Consider the following demand and supply equations:

$$Q = \alpha + \beta P \qquad (12)$$
$$Q = \lambda + \mu P \qquad (13)$$

We cannot tell which is which! (From the OLS point of view, the two equations have the same form.)
Both equations (12) and (13) are UNIDENTIFIED, or NOT IDENTIFIED, or UNDERIDENTIFIED.
We do not have enough information from the equations to estimate four parameters. Notice that we would not have had this problem with equations (4) and (5) since they have different exogenous variables.
If an equation (model) is identified, in general its coefficients can be estimated. The appropriate estimation technique will depend upon whether it is exactly identified or over-identified.


What Determines the Identification?

We could have three possible situations:
1. An equation is unidentified, like (12) and (13): we cannot get the structural coefficients from the reduced form estimates.
2. An equation is exactly identified, e.g. (4) or (5): we can get unique structural form coefficient estimates.
3. An equation is over-identified (examples given later): more than one set of structural coefficients could be obtained from the reduced form.


How do we tell if an equation is identified or not?
There are two conditions we could look at:
The order condition is a necessary but not sufficient condition for an equation to be identified.
The rank condition is a necessary and sufficient condition for identification. We specify the structural equations in a matrix form and consider the rank of a coefficient matrix.

The Order Condition

Definition
Statement of the Order Condition (from Ramanathan 1995, p. 666)
Let G denote the number of structural equations. An equation is just identified if the number of variables excluded from the equation is G-1. If more than G-1 are absent, it is over-identified. If fewer than G-1 are absent, it is not identified.

Example
In the following system of equations, the Ys are endogenous, while the Xs are exogenous. Determine whether each equation is over-, under-, or just-identified.

$$Y_1 = \alpha_0 + \alpha_1 Y_2 + \alpha_2 Y_3 + \alpha_3 X_1 + \alpha_4 X_2 + u_1 \qquad (14)$$
$$Y_2 = \beta_0 + \beta_1 Y_3 + \beta_2 X_1 + u_2 \qquad (15)$$
$$Y_3 = \gamma_0 + \gamma_1 Y_2 + u_3 \qquad (16)$$


Solution
G = 3.
If # excluded variables = 2, the equation is just identified.
If # excluded variables > 2, the equation is over-identified.
If # excluded variables < 2, the equation is not identified.
Hence,
Equation (14): not identified (no variables excluded)
Equation (15): just identified (Y1 and X2 excluded)
Equation (16): over-identified (Y1, X1 and X2 excluded)


The Rank Condition

In a system of G equations, any particular equation is identified if and only if it is possible to construct at least one non-zero determinant of order (G-1) from the coefficients of the variables excluded from that particular equation but contained in the other equations of the model.
Put differently, a sufficient condition for the identification of a relationship is that the rank of the matrix of parameters of all the variables (endogenous and pre-determined) excluded from that equation be equal to (G-1).


For example:

$$y_1 = 3y_2 - 2x_1 + x_2 + u_1$$
$$y_2 = y_3 + x_3 + u_2$$
$$y_3 = y_1 - y_2 - 2x_3 + u_3$$

Moving everything to the left-hand side:

$$-y_1 + 3y_2 + 0y_3 - 2x_1 + x_2 + 0x_3 + u_1 = 0$$
$$0y_1 - y_2 + y_3 + 0x_1 + 0x_2 + x_3 + u_2 = 0$$
$$y_1 - y_2 - y_3 + 0x_1 + 0x_2 - 2x_3 + u_3 = 0$$

Results (RC = rank condition, OC = order condition):
1. RC/OC: Equation 1 is exactly identified;
2. RC: Equation 2 is exactly identified, OC: over-identified;
3. RC: Equation 3 is not identified, OC: exactly identified.
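The order and rank conditions for this three-equation example can be checked mechanically. The sketch below stores the coefficients of the system (with everything moved to the left-hand side) in a matrix and applies both conditions to each equation. The function names `order_condition` and `rank_condition` are illustrative choices, not an established API.

```python
import numpy as np

# Columns: y1, y2, y3, x1, x2, x3; each row is one structural equation
A = np.array([
    [-1.0,  3.0,  0.0, -2.0, 1.0,  0.0],
    [ 0.0, -1.0,  1.0,  0.0, 0.0,  1.0],
    [ 1.0, -1.0, -1.0,  0.0, 0.0, -2.0],
])

def order_condition(A, eq):
    """Classify equation `eq` by counting the variables it excludes."""
    G = A.shape[0]
    n_excluded = int(np.sum(A[eq] == 0))
    if n_excluded < G - 1:
        return "not identified"
    return "just identified" if n_excluded == G - 1 else "over-identified"

def rank_condition(A, eq):
    """True if the coefficients, in the other equations, of the variables
    excluded from equation `eq` form a matrix of rank G-1."""
    G = A.shape[0]
    excluded = np.flatnonzero(A[eq] == 0)
    if excluded.size == 0:
        return False
    sub = np.delete(A, eq, axis=0)[:, excluded]
    return bool(np.linalg.matrix_rank(sub) == G - 1)

for eq in range(A.shape[0]):
    print(f"Equation {eq + 1}: OC -> {order_condition(A, eq)}, "
          f"RC satisfied -> {rank_condition(A, eq)}")
```

For equation 3 the excluded variables are x1 and x2, whose coefficients in equation 2 are both zero, so the sub-matrix has rank 1 and the rank condition fails even though the order condition is met.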


Test for Exogeneity: Hausman Test

How do we tell whether variables really need to be treated as endogenous or not?
Consider again equations (14)-(16). Equation (14) contains Y2 and Y3, but do we really need equations for them?
We can, however, formally test this using a Hausman test, which is calculated as follows:

1. Obtain the reduced form equations corresponding to equations (14)-(16). The reduced forms turn out to be:

$$Y_1 = \pi_{10} + \pi_{11} X_1 + \pi_{12} X_2 + v_1 \qquad (17)$$
$$Y_2 = \pi_{20} + \pi_{21} X_1 + v_2 \qquad (18)$$
$$Y_3 = \pi_{30} + \pi_{31} X_1 + v_3 \qquad (19)$$

Estimate the reduced form equations (17)-(19) using OLS, and obtain the fitted values: $\hat{Y}_1, \hat{Y}_2, \hat{Y}_3$.


Test for Exogeneity: Hausman Test

2. Run the regression corresponding to equation (14).
3. Run the regression (14) again, but now also including the fitted values $\hat{Y}_2$, $\hat{Y}_3$ as additional regressors:

$$Y_1 = \alpha_0 + \alpha_1 Y_2 + \alpha_2 Y_3 + \alpha_3 X_1 + \alpha_4 X_2 + \lambda_2 \hat{Y}_2 + \lambda_3 \hat{Y}_3 + u_1 \qquad (20)$$

4. Use an F-test to test the joint restriction $H_0: \lambda_2 = \lambda_3 = 0$. If the null hypothesis is rejected, Y2 and Y3 should be treated as endogenous.

Recursive Systems

Consider the following system of equations:

$$Y_1 = \beta_{10} + \gamma_{11} X_1 + \gamma_{12} X_2 + u_1 \qquad (21)$$
$$Y_2 = \beta_{20} + \beta_{21} Y_1 + \gamma_{21} X_1 + \gamma_{22} X_2 + u_2 \qquad (22)$$
$$Y_3 = \beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2 + \gamma_{31} X_1 + \gamma_{32} X_2 + u_3 \qquad (23)$$

Problem
Assume that the error terms are not correlated with each other. Can we estimate the equations individually using OLS?
(21) contains no endogenous variables on the R.H.S., and X1 and X2 are NOT correlated with u1. So we can use OLS on (21).
(22) contains the endogenous variable Y1. We can use OLS on (22) if all the R.H.S. variables are uncorrelated with the error u2, which is true here: Y1 is not correlated with u2 because there is no Y2 term in equation (21). So we can use OLS on (22).

Recursive Systems

Equation (23) contains both Y1 and Y2; we require these to be uncorrelated with u3. By similar arguments to the above, equations (21) and (22) do not contain Y3, so we can use OLS on (23).
This is known as a RECURSIVE or TRIANGULAR system. We do not have a simultaneity problem here.
But in practice not many systems of equations will be recursive...

Indirect Least Squares (ILS)

Cannot use OLS on structural equations, but we can validly apply it to the reduced form equations.

Definition
If the system is just identified, ILS involves estimating the reduced form equations using OLS, and then using them to substitute back to obtain the structural parameters.

However, ILS is not used much because:
1. Solving back to get the structural parameters can be tedious.
2. Most simultaneous equations systems are over-identified.


Estimation Using Two-Stage Least Squares

In fact, we can use this technique for just-identified and over-identified systems.
Two-stage least squares (2SLS or TSLS) is done in two stages:
Stage 1: Obtain and estimate the reduced form equations using OLS. Save the fitted values for the dependent variables.
Stage 2: Estimate the structural equations, but replace any R.H.S. endogenous variables with their Stage 1 fitted values.


Estimation Using Two-Stage Least Squares

Example: Say equations (14)-(16) are required.
1. Estimate the reduced form equations (17)-(19) individually by OLS and obtain the fitted values, $\hat{Y}_1, \hat{Y}_2, \hat{Y}_3$.
2. Replace the R.H.S. endogenous variables with their Stage 1 estimated values:

$$Y_1 = \alpha_0 + \alpha_1 \hat{Y}_2 + \alpha_2 \hat{Y}_3 + \alpha_3 X_1 + \alpha_4 X_2 + u_1 \qquad (24)$$
$$Y_2 = \beta_0 + \beta_1 \hat{Y}_3 + \beta_2 X_1 + u_2 \qquad (25)$$
$$Y_3 = \gamma_0 + \gamma_1 \hat{Y}_2 + u_3 \qquad (26)$$

Now $\hat{Y}_2$ and $\hat{Y}_3$ will not be correlated with $u_1$, $\hat{Y}_3$ will not be correlated with $u_2$, and $\hat{Y}_2$ will not be correlated with $u_3$.
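The two stages can be sketched in a few lines of NumPy. To keep the example small, it uses the demand equation (4) from the earlier supply-and-demand system rather than (14)-(16), with T as the exogenous variable excluded from the demand equation. The parameter values and the function `two_stage_least_squares` are assumptions for illustration only. The standard errors are computed from residuals based on the actual (not fitted) endogenous regressor, which is one common form of the correction to the OLS standard errors.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients; y may be a vector or a matrix of dependent variables."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, X_endog, X_exog, Z_excl):
    """2SLS for y = const + X_endog*b + X_exog*c + error.

    X_endog: R.H.S. endogenous variables, shape (n, m)
    X_exog:  exogenous variables included in the equation, shape (n, p)
    Z_excl:  exogenous variables excluded from the equation, shape (n, q)
    """
    n = len(y)
    const = np.ones((n, 1))
    Z = np.hstack([const, X_exog, Z_excl])       # regressors of the reduced form
    X_hat = Z @ ols(X_endog, Z)                  # Stage 1: fitted values
    X2 = np.hstack([const, X_hat, X_exog])
    coef = ols(y, X2)                            # Stage 2
    # Residuals use the actual endogenous regressors, not the fitted values
    X_actual = np.hstack([const, X_endog, X_exog])
    resid = y - X_actual @ coef
    sigma2 = resid @ resid / (n - X2.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X2.T @ X2)))
    return coef, se

# Hypothetical data generated from (4) and (5)
rng = np.random.default_rng(7)
n = 100_000
alpha, beta, gamma = 10.0, -1.0, 0.5
lam, mu, k = 2.0, 1.0, 0.8
S, T = rng.standard_normal(n), rng.standard_normal(n)
u, v = rng.standard_normal(n), rng.standard_normal(n)
P = ((lam - alpha) + k * T - gamma * S + (v - u)) / (beta - mu)
Q = alpha + beta * P + gamma * S + u

coef, se = two_stage_least_squares(Q, P[:, None], S[:, None], T[:, None])
print("alpha, beta, gamma estimates:", coef.round(3))
print("standard errors:", se.round(4))
```

Because the demand equation is just identified here (one excluded exogenous variable for one endogenous regressor), the 2SLS estimate coincides with what indirect least squares would give.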


Estimation Using Two-Stage Least Squares

It is still of concern in the context of simultaneous systems whether the CLRM assumptions are supported by the data.
If the disturbances in the structural equations are autocorrelated, the 2SLS estimator is not even consistent.
The standard error estimates also need to be modified compared with their OLS counterparts, but once this has been done, we can use the usual t- and F-tests to test hypotheses about the structural form coefficients.

Instrumental Variables (IV)

Recall that the reason we cannot use OLS directly on the structural equations is that the endogenous variables are correlated with the errors.
One solution: do not use Y2 or Y3 themselves; rather, use some other variables instead.
We want these other variables to be (highly) correlated with Y2 and Y3, but not correlated with the errors: these are the INSTRUMENTS.
Say we have some suitable instruments for Y2 and Y3, z2 and z3 respectively. We do not use the instruments directly, but run regressions of the form:

$$Y_2 = \lambda_1 + \lambda_2 z_2 + \varepsilon_1 \qquad (27)$$
$$Y_3 = \lambda_3 + \lambda_4 z_3 + \varepsilon_2 \qquad (28)$$

Obtain the fitted values from (27) and (28), $\hat{Y}_2$ and $\hat{Y}_3$, and replace Y2 and Y3 with these in the structural equation.
We do not use the instruments directly in the structural equation.
It is typical to use more than one instrument per endogenous variable.
If the instruments are the variables in the reduced form equations, then IV is equivalent to 2SLS.


Instrumental Variables (IV)

What happens if we use IV/2SLS unnecessarily?
The coefficient estimates will still be consistent, but will be inefficient compared to those that just used OLS directly.
The problem with IV:

Problem
What are the instruments?

Solution
2SLS is easier.

Other estimation techniques:
1. 3SLS: allows for non-zero covariances between the error terms.
2. LIML (Limited Information ML): estimating reduced form equations by maximum likelihood.
3. FIML (Full Information ML): estimating all the equations simultaneously using maximum likelihood.

Vector Autoregressive (VAR) Models

A natural generalisation of autoregressive models.
VAR: a systems regression model, i.e. there is more than one dependent variable.
One important feature of VARs is the compactness, e.g. bivariate VAR(1):

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + u_{2t}$$

or, in matrix form,

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

or even more compactly as

$$Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$$

VAR v.s. Structural Equations Models

Advantages of VAR modelling:
No need to specify endogeneity/exogeneity: all are (weakly) endogenous!
Allows a variable to depend on more than just its own lags or combinations of white noise terms, so more general than ARMA modelling.
Provided that there are no contemporaneous terms on the R.H.S. of the equations, we can simply use OLS separately on each equation.
Forecasts are often better than those from traditional structural models.

Problems with VARs:
1. VARs are a-theoretical (as are ARMA models). What if the data are not generated by a VAR process?
2. How to decide the appropriate lag length?
3. So many parameters to estimate!
4. Do we need to ensure all components of the VAR are stationary?
5. How do we interpret the coefficients?

Choose Optimal Lag Length for VAR (Cross-Equation Restrictions)

In the spirit of (unrestricted) VAR modelling, each equation should have the same lag length.
Suppose that we have a bivariate VAR(8), and we want to examine a restriction that the coefficients on lags 5 through 8 are jointly zero (de facto, $H_0$: VAR(4) against $H_A$: VAR(8)).
This can be done using a likelihood ratio (LR) test.
Denote the variance-covariance matrix of residuals (given by $\hat{u}\hat{u}'/T$) as $\hat{\Sigma}$. The LR test statistic for this joint hypothesis is:

$$LR = T\left[\ln|\hat{\Sigma}_r| - \ln|\hat{\Sigma}_u|\right]$$

where $\hat{\Sigma}_r$ and $\hat{\Sigma}_u$ are the residual variance-covariance matrices of the restricted and unrestricted models. LR is asymptotically distributed as $\chi^2(q)$, q = # of restrictions.
In our case above we restrict 4 lags of two variables in each of the two equations: $4 \times 2 \times 2 = 16$ restrictions.

Problem
Conducting the LR test is cumbersome and requires a normality assumption for the disturbances.

Choose Optimal Lag Length for VAR (Information Criteria)

Multivariate versions of the information criteria can be defined as:

$$MAIC = \ln|\hat{\Sigma}| + 2k/T$$
$$MSBIC = \ln|\hat{\Sigma}| + \frac{k}{T}\ln(T)$$
$$MHQIC = \ln|\hat{\Sigma}| + \frac{2k}{T}\ln(\ln T)$$

k is the total number of regressors in all equations.
The values of the information criteria are constructed for 0, 1, ... lags (up to some pre-specified maximum $k_{max}$).

Feedback Effect and Primitive Form of VARs

What if the equations had a contemporaneous feedback term?

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + \alpha_{12} y_{2t} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + \alpha_{22} y_{1t} + u_{2t}$$

in compact form:

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \alpha_{12} & 0 \\ 0 & \alpha_{22} \end{pmatrix} \begin{pmatrix} y_{2t} \\ y_{1t} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

This VAR is in primitive form.
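Returning to lag-length selection, the LR statistic and the multivariate information criteria can be computed as in the sketch below, for a VAR without contemporaneous terms. The function names, the simulated bivariate VAR(1), the choice to count the constants in k, and the use of a common estimation sample for all lag lengths are assumptions made for this illustration.

```python
import numpy as np

def fit_var_sigma(Y, p, pmax):
    """OLS-fit a VAR(p) with constant on the common sample t = pmax..T-1.

    Returns the residual covariance matrix, the total number of regressors
    in all equations (k), and the effective sample size.
    """
    T, N = Y.shape
    lhs = Y[pmax:]
    cols = [np.ones((T - pmax, 1))]
    for lag in range(1, p + 1):
        cols.append(Y[pmax - lag:T - lag])   # Y lagged by `lag` periods
    X = np.hstack(cols)
    coefs = np.linalg.lstsq(X, lhs, rcond=None)[0]
    U = lhs - X @ coefs
    T_eff = lhs.shape[0]
    return U.T @ U / T_eff, N * X.shape[1], T_eff

def information_criteria(Y, pmax):
    """MAIC, MSBIC and MHQIC for lag lengths 0..pmax."""
    out = {}
    for p in range(pmax + 1):
        Sigma, k, T = fit_var_sigma(Y, p, pmax)
        logdet = np.linalg.slogdet(Sigma)[1]
        out[p] = {
            "MAIC": logdet + 2 * k / T,
            "MSBIC": logdet + k / T * np.log(T),
            "MHQIC": logdet + 2 * k / T * np.log(np.log(T)),
        }
    return out

# Hypothetical data: a stationary bivariate VAR(1)
rng = np.random.default_rng(1)
A1_true = np.array([[0.5, 0.2], [0.1, 0.4]])
Y = np.zeros((2000, 2))
for t in range(1, 2000):
    Y[t] = A1_true @ Y[t - 1] + rng.standard_normal(2)

ic = information_criteria(Y, pmax=4)
best_sbic = min(ic, key=lambda p: ic[p]["MSBIC"])
print("lag length chosen by MSBIC:", best_sbic)

# LR test of H0: VAR(1) against HA: VAR(2); q = 1 lag x 2 variables x 2 equations = 4
Sig_r, _, T_eff = fit_var_sigma(Y, 1, 4)
Sig_u, _, _ = fit_var_sigma(Y, 2, 4)
LR = T_eff * (np.linalg.slogdet(Sig_r)[1] - np.linalg.slogdet(Sig_u)[1])
print("LR statistic:", round(LR, 3), "(5% critical value of chi2(4) is 9.488)")
```

Fitting every lag length on the same sample keeps the residual covariance matrices comparable across models, which both the LR test and the criteria rely on.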


Primitive v.s. Standard Form VARs

We can take the contemporaneous terms over to the L.H.S. and write

$$\begin{pmatrix} 1 & -\alpha_{12} \\ -\alpha_{22} & 1 \end{pmatrix} \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

or

$$B Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$$

We then pre-multiply both sides by $B^{-1}$:

$$Y_t = B^{-1}\beta_0 + B^{-1}\beta_1 Y_{t-1} + B^{-1}u_t$$

or

$$Y_t = A_0 + A_1 Y_{t-1} + e_t$$

This is known as a standard form VAR, which we can estimate using ML as before.

Block Significance and Causality Tests

It is likely that, when a VAR includes many lags of variables, it will be difficult to see which sets of variables have significant effects on each dependent variable and which do not. For illustration, consider the following bivariate VAR(3):

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \alpha_{10} \\ \alpha_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix} \begin{pmatrix} y_{1t-2} \\ y_{2t-2} \end{pmatrix} + \begin{pmatrix} \delta_{11} & \delta_{12} \\ \delta_{21} & \delta_{22} \end{pmatrix} \begin{pmatrix} y_{1t-3} \\ y_{2t-3} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

We might be interested in testing the following hypotheses, and their implied restrictions on the parameter matrices:

Hypothesis                                      Implied restriction
1. Lags of y1t do not explain current y2t       $\beta_{21} = 0$ and $\gamma_{21} = 0$ and $\delta_{21} = 0$
2. Lags of y1t do not explain current y1t       $\beta_{11} = 0$ and $\gamma_{11} = 0$ and $\delta_{11} = 0$
3. Lags of y2t do not explain current y1t       $\beta_{12} = 0$ and $\gamma_{12} = 0$ and $\delta_{12} = 0$
4. Lags of y2t do not explain current y2t       $\beta_{22} = 0$ and $\gamma_{22} = 0$ and $\delta_{22} = 0$

Each of these four joint hypotheses can be tested within the F-test framework, since each set of restrictions contains only parameters drawn from one equation.

Granger Causality Tests

These tests could also be referred to as Granger causality tests.
Granger causality tests seek to answer questions such as "Do changes in y1 cause changes in y2?"
If y1 causes y2, lags of y1 should be significant in the equation for y2. If this is the case, we say that y1 "Granger-causes" y2.
If y2 causes y1, lags of y2 should be significant in the equation for y1.
If both sets of lags are significant, there is "bi-directional causality".

Impulse Responses

VAR models are often difficult to interpret: one solution is to construct the impulse responses and variance decompositions.
Impulse responses trace out the responsiveness of the dependent variables in the VAR to shocks in the error term. A unit shock is applied to each variable and its effects are noted.
Consider for example a simple bivariate VAR(1):

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + u_{2t}$$

A change in $u_{1t}$ will immediately change y1. It will change y2 and also y1 during the next period.
We can examine for how long and to what degree a shock to a given equation affects all of the variables in the system.

J IANHUA G ANG (RUC)

I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 7 Multivariate


S PRING
Models2013

42 / 52

J IANHUA G ANG (RUC)

I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 7 Multivariate


S PRING
Models2013

43 / 52
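The impulse responses of a standard form VAR(1), $Y_t = A_0 + A_1 Y_{t-1} + e_t$, can be sketched numerically. The following is a minimal illustration, assuming non-orthogonalised unit shocks, so that the response after $s$ periods is simply the matrix power $A_1^s$. The function name `impulse_responses` and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def impulse_responses(A1, horizon):
    """Responses of a VAR(1) Y_t = A0 + A1 Y_{t-1} + e_t to unit shocks.

    Element [s, i, j] is the response of variable i at time t+s to a
    unit shock in error j at time t (non-orthogonalised shocks).
    """
    A1 = np.asarray(A1, dtype=float)
    m = A1.shape[0]
    out = np.empty((horizon + 1, m, m))
    out[0] = np.eye(m)              # impact period: a shock moves only its own variable
    for s in range(1, horizon + 1):
        out[s] = A1 @ out[s - 1]    # each further period applies A1 once more
    return out

# Hypothetical coefficient matrix (illustrative values only).
A1 = np.array([[0.5, 0.3],
               [0.0, 0.2]])
irf = impulse_responses(A1, horizon=3)
print(irf[1])  # one-period-ahead responses equal A1 itself
print(irf[2])  # two-period-ahead responses equal A1 @ A1
```

In this example the zero in the lower-left corner of `A1` means a shock to the first equation never reaches the second variable, while a shock to the second equation reaches the first variable one period later.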

VARIANCE DECOMPOSITIONS

Variance decompositions offer a slightly different method of examining VAR dynamics. They give the proportion of the movements in the dependent variables that are due to their "own" shocks, versus shocks to the other variables.
This is done by determining how much of the s-step-ahead forecast error variance for each variable is explained by innovations to each explanatory variable (s = 1, 2, ...).
The variance decomposition gives information about the relative importance of each shock to the variables in the VAR.

IMPULSE RESPONSES AND VARIANCE DECOMPOSITIONS: THE ORDERING OF THE VARIABLES

For calculating IRs and variance decompositions, the ordering of the variables is important.
Main reason: VAR errors are often not independent of one another. Instead, they are typically correlated to some degree.
Therefore, the notion of examining the effect of the innovations separately has little meaning, since they have a common component.
Thus, we must "orthogonalise" the innovations.
In the bivariate VAR, this problem would be approached by attributing all of the effect of the common component to the first of the two variables in the VAR.
In the general case where there are more variables, the situation is more complex but the interpretation is the same.
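One common way to orthogonalise the innovations is a Cholesky factorisation of the error covariance matrix, which attributes the common component to whichever variable is ordered first. The sketch below assumes a VAR(1) with coefficient matrix `A1` and error covariance `Sigma`. The slides do not say which orthogonalisation is used, so the Cholesky choice, the function name `fevd`, and the numbers are assumptions for illustration.

```python
import numpy as np

def fevd(A1, Sigma, steps):
    """Forecast error variance decomposition for a VAR(1).

    Uses a Cholesky factor of Sigma, so the ordering of the variables
    matters. Returns share[i, j]: the fraction of the `steps`-step-ahead
    forecast error variance of variable i attributed to shock j.
    """
    A1 = np.asarray(A1, dtype=float)
    P = np.linalg.cholesky(np.asarray(Sigma, dtype=float))  # Sigma = P P'
    m = A1.shape[0]
    contrib = np.zeros((m, m))
    Psi = np.eye(m)                  # Psi_s = A1**s, starting at s = 0
    for _ in range(steps):
        Theta = Psi @ P              # orthogonalised responses at this horizon
        contrib += Theta ** 2        # accumulate squared responses
        Psi = A1 @ Psi
    return contrib / contrib.sum(axis=1, keepdims=True)

# Hypothetical example: no dynamics, correlated errors.
A1 = np.zeros((2, 2))
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
shares = fevd(A1, Sigma, steps=1)

# Same system with the two variables swapped in the ordering.
perm = [1, 0]
shares_swapped = fevd(A1[np.ix_(perm, perm)], Sigma[np.ix_(perm, perm)], steps=1)
```

With these values the variable ordered first has all of its one-step forecast error variance attributed to its own shock, while the second variable attributes a quarter of its variance to the first shock. Swapping the order swaps which variable receives the common component.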
LAG NUMBER SELECTION

How do we choose the lags p, q in an ARMA(p, q) model?
1. By looking at the sample autocorrelations and the sample partial autocorrelations, and trying to recognize the pattern of a model with given p, q.
2. By using an automatic selection criterion (information criterion).

TESTS OF RANDOMNESS

If $Y_t$ is i.i.d. (and has finite variance) then $\rho_1, ..., \rho_k$ are all 0.
Then, the sample autocorrelations ($\hat\rho_j$, $\hat\rho_h$, $j \neq h$, $j \geq 1$, $h \geq 1$) are asymptotically independent and
$$\sqrt{T}\,\hat\rho_j \xrightarrow{d} N(0, 1), \quad (j \geq 1).$$
We can use this property to design two tests to check if the data are independently distributed.

TESTS OF RANDOMNESS

This test is so simple that it can be inspected visually, so computer packages usually plot two error bands at $\pm 1.96/\sqrt{T}$ with the sample autocorrelation function.
(Notice: although it is called a "test for randomness" by some software and some references, a more appropriate name would be "test for independent distribution".)

PORTMANTEAU TEST

We can also test a group of k autocorrelations jointly: under the null,
$$T \sum_{j=1}^{k} \hat\rho_j^2 \xrightarrow{d} \chi^2_k$$
(this test may be of particular interest when we suspect a seasonal structure in the data: for example, with quarterly data the first three autocorrelations may be zero, and then the fourth one may be non-zero). (The test may be sensitive to the choice of k on some occasions.)
The tests for independent distribution and the Portmanteau test may provide preliminary information about the sample AC.
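Both checks can be sketched in a few lines. The code below assumes the sample autocorrelation is computed from deviations about the sample mean and that the joint statistic is $T\sum_{j=1}^{k}\hat\rho_j^2$. The function names and the artificial alternating series are hypothetical and serve only as an illustration.

```python
import numpy as np

def sample_autocorr(y, k):
    """Sample autocorrelations rho_hat_1, ..., rho_hat_k of the series y."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[j:] * d[:-j]) / denom for j in range(1, k + 1)])

def portmanteau(y, k):
    """Joint statistic T * sum of squared autocorrelations (chi-square(k) under the null)."""
    T = len(y)
    rho = sample_autocorr(y, k)
    return T * np.sum(rho ** 2)

# Artificial, strongly dependent series: +1, -1, +1, -1, ...
y = np.array([1.0, -1.0] * 50)
rho1 = sample_autocorr(y, 1)[0]
band = 1.96 / np.sqrt(len(y))      # individual error band for each rho_hat_j
outside_band = abs(rho1) > band    # True means rho_hat_1 is individually significant
Q1 = portmanteau(y, 1)             # compare with a chi-square(1) critical value such as 3.84
```

For this series the first autocorrelation is close to minus one, so it lies far outside the band and the joint statistic greatly exceeds the 5% critical value for one degree of freedom.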

MODEL SELECTION: INFORMATION CRITERIA

An automatic way to select p, q. The idea: use "maximum likelihood" to choose p, q.
The problem: if you compare an ARMA(p, q) with an ARMA(p + 1, q), the ARMA(p, q) always has the smaller likelihood.
This is because the estimate from the ARMA(p, q) model maximises the likelihood under the constraint that $\phi_{p+1} = 0$, while the ARMA(p + 1, q) does not impose that constraint, so the ARMA(p + 1, q) has a higher maximum likelihood unless $\hat\phi_{p+1} = 0$ exactly (which is an event with probability zero in finite samples, even when the true value of $\phi_{p+1}$ is in fact 0). (Notice the analogy with regression here: when you increase the number of regressors, the $R^2$ does not decrease, and in general increases, even when the regressors are irrelevant.)

The solution: add a penalty which increases with p and q.
$$IC = -2\,l(\hat\theta) + \text{penalty}$$
penalty:
$2(p + q)$ for the Akaike IC,
$(p + q)\ln(T)$ for the Bayes IC.

BIC: consistent estimation of p, q.
AIC: inconsistent estimation of p, q (may select larger than correct p, q in large samples).
Both BIC and AIC may select smaller than correct p, q in finite samples (this, however, is not necessarily a bad thing in small samples).

An alternative approach: of course, we can also compare an ARMA(p, q) with an ARMA(p + 1, q), or with an ARMA(p, q + 1), using a likelihood ratio test. The criterion is then to keep adding lags as long as the likelihood ratio test statistic is above a user-chosen critical value (for example, 5% significance would have c.v. 3.84).
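The penalised criteria and the likelihood ratio rule can be compared side by side. The sketch below assumes the maximised log-likelihoods are already available from some estimation routine. The function name and the log-likelihood values are hypothetical and chosen only to illustrate the arithmetic.

```python
import math

def information_criteria(loglik, p, q, T):
    """Return (AIC, BIC) for an ARMA(p, q) fitted to T observations.

    Both are -2 * log-likelihood plus a penalty that grows with p + q.
    """
    n_params = p + q
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(T)
    return aic, bic

T = 100
# Hypothetical maximised log-likelihoods for two nested models.
ll_small, ll_large = -150.0, -149.5           # ARMA(1, 0) and ARMA(2, 0)
aic_small, bic_small = information_criteria(ll_small, 1, 0, T)
aic_large, bic_large = information_criteria(ll_large, 2, 0, T)

# Likelihood ratio statistic for the extra lag (one restriction).
lr_stat = 2.0 * (ll_large - ll_small)
add_lag = lr_stat > 3.84                      # 5% critical value used in the text
```

Here the larger model has the higher likelihood, as it must, but the gain is too small to offset either penalty, so both criteria and the likelihood ratio rule favour the smaller model.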

PARSIMONIOUS MODELLING

Large econometric models tend to do badly in terms of forecasting, and are outperformed by small ARMA models (Box & Jenkins).
Even in ARMA models, increasing the number of parameters reduces the precision with which each parameter is estimated. This is because, when the parameters are estimated, their variance contributes to the variance of the forecast.
Adding extra parameters may then help to reduce or eliminate the forecast bias, but the gain in terms of reduced bias$^2$ is outweighed by the loss in increased variance of the forecast.
We should balance the number of estimated parameters and the number of observations.
Information criteria have sometimes also been advocated as a way to select more parsimonious models.

TOPIC 8: TRENDING VARIABLES AND CO-INTEGRATION

In previous lectures, it has been assumed that all data are trend-free.
However, in many cases, this assumption turns out to be inappropriate.

EXAMPLE
Examine the series for consumption and income. The presence of trends can sometimes invalidate the usual asymptotic theory for OLS and test procedures.

A discussion of trends and related topics such as tests for unit roots and cointegration is, therefore, required.

AUTOREGRESSIVE DISTRIBUTED LAG (ADL) MODELS

Applied workers often specify models that include both lagged values of the dependent variable and a distributed lag component in the regression function. These models are called autoregressive distributed lag (ADL) models.
A very simple ADL relationship is
$$y_t = \alpha + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t, \quad |\phi_1| < 1.$$
The autoregressive component implies that a change in $x_t$ affects $y_t$ and all future values of $y_t$.
The short-run multiplier is $\partial E(y_t)/\partial x_t = \beta_0$.
It can be shown that
$$\partial E(y_{t+j})/\partial x_t = \phi_1^{j-1}(\beta_1 + \phi_1\beta_0), \quad j = 1, 2, ...$$
Such terms are called dynamic multipliers.
The cumulative effect corresponding to the long-run multiplier is
$$\sum_{j=0}^{\infty} \partial E(y_{t+j})/\partial x_t = (\beta_0 + \beta_1)/(1 - \phi_1) = \theta,$$
then, say, $\theta$ is the slope of the long-run relationship and the intercept is $\mu = \alpha/(1 - \phi_1)$.
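The link between the dynamic multipliers and the long-run multiplier can be checked numerically. The sketch below writes the ADL as $y_t = \alpha + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$. The symbol names and the parameter values are assumptions used only for illustration.

```python
def dynamic_multipliers(phi1, beta0, beta1, horizon):
    """Effects of a one-unit change in x_t on E(y_{t+j}) for j = 0..horizon."""
    mults = [beta0]                                   # j = 0: short-run multiplier
    for j in range(1, horizon + 1):
        mults.append(phi1 ** (j - 1) * (beta1 + phi1 * beta0))
    return mults

# Hypothetical parameter values with |phi1| < 1.
phi1, beta0, beta1 = 0.5, 0.4, 0.2
mults = dynamic_multipliers(phi1, beta0, beta1, horizon=200)

cumulative = sum(mults)                               # truncated infinite sum
long_run = (beta0 + beta1) / (1 - phi1)               # closed-form long-run multiplier
```

Because $|\phi_1| < 1$ the multipliers die out geometrically, so the truncated sum is already indistinguishable from the closed-form value.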

AUTOREGRESSIVE DISTRIBUTED LAG (ADL) MODELS

It is useful to note that the simple ADL can be written in a mathematically equivalent form that has a parameterization of direct economic interest, called the error correction model (ECM).
The ECM is derived from the ADL as follows:
$$y_t = \alpha + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$$
can be written as
$$(\Delta y_t + y_{t-1}) = \alpha + \phi_1 y_{t-1} + \beta_0(\Delta x_t + x_{t-1}) + \beta_1 x_{t-1} + u_t,$$
$$\Delta y_t = \beta_0 \Delta x_t - (1 - \phi_1)\,[y_{t-1} - \mu - \theta x_{t-1}] + u_t,$$
where $\mu = \alpha/(1 - \phi_1)$ and $\theta = (\beta_0 + \beta_1)/(1 - \phi_1)$.
Thus the ECM has first differences in y linked to first differences in x and the extent by which y deviates from the long-run expected value in the previous period. (Martingale)

The ECM is nonlinear in the coefficients $\beta_0$, $\phi_1$, $\mu$, and $\theta$, all of which have meanings:
The coefficient $\beta_0$ measures the contemporaneous effect.
The coefficient $\phi_1$ can be thought of as reflecting speed of adjustment.
The coefficients $\mu$ and $\theta$ are the intercept and the slope of the long-run relationship.

AUTOREGRESSIVE DISTRIBUTED LAG (ADL) MODELS

If the OLS estimates of the ADL are denoted by a hat and the nonlinear least squares estimates of the ECM are denoted by a tilde, then it can be shown that
$$\tilde\mu = \frac{\hat\alpha}{(1 - \hat\phi_1)}, \qquad \tilde\theta = \frac{(\hat\beta_0 + \hat\beta_1)}{(1 - \hat\phi_1)}, \qquad \tilde\beta_0 = \hat\beta_0, \qquad \tilde\phi_1 = \hat\phi_1.$$
While OLS estimation of the ADL and nonlinear least squares estimation of the ECM yield the same point estimates, the latter method (NLS) has the advantage of giving estimated standard errors for estimates of long-run parameters as part of standard output.

For future reference, it is important to note that if $\beta_0$ and $\phi_1$ are estimated by applying OLS to
$$\Delta y_t = \beta_0 \Delta x_t - (1 - \phi_1)\,[y_{t-1} - \tilde\mu - \tilde\theta x_{t-1}] + u_t,$$
the estimates of these parameters are $\hat\beta_0 = \tilde\beta_0$ and $\hat\phi_1 = \tilde\phi_1$, respectively.
This two-step approach can play an important role when the data contain trends and will be discussed later in further detail.
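The mapping from the OLS estimates of the ADL to the long-run parameters can be sketched as follows. The code assumes the ADL is written as $y_t = \alpha + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$ with long-run intercept $\alpha/(1-\phi_1)$ and slope $(\beta_0+\beta_1)/(1-\phi_1)$. The function name, the dictionary keys, and the simulated data are hypothetical.

```python
import numpy as np

def adl_long_run(y, x):
    """OLS on the ADL(1,1), then the implied long-run intercept and slope."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Regressors: constant, y_{t-1}, x_t, x_{t-1}
    X = np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:], x[:-1]])
    alpha, phi1, beta0, beta1 = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return {
        "alpha": alpha, "phi1": phi1, "beta0": beta0, "beta1": beta1,
        "mu": alpha / (1.0 - phi1),                 # long-run intercept
        "theta": (beta0 + beta1) / (1.0 - phi1),    # long-run slope
    }

# Simulated data from an ADL with assumed parameter values.
rng = np.random.default_rng(0)
n = 400
x = rng.standard_normal(n)
u = 0.1 * rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 1.0 + 0.5 * y[t - 1] + 0.4 * x[t] + 0.2 * x[t - 1] + u[t]

estimates = adl_long_run(y, x)
```

This only reproduces the point estimates. As the text notes, standard errors for the long-run parameters are what nonlinear least squares on the ECM provides as part of standard output.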

TRENDING VARIABLES

In the past, attention has been restricted to (covariance) stationary processes. Recall that a time series variable $z_t$ is covariance stationary if $E(z_t) = \mu_z$, $var(z_t) = \sigma_z^2$, and $cov(z_t, z_{t-g}) = \gamma(|g|)$ for all $t$.

Two models of trends are discussed.
1. First, $z_t$ is said to be a trend stationary process if $z_t = f(t) + u_t$, in which $f(.)$ is a deterministic function and $u_t$ is a stationary process. (Often $f(t)$ is a linear trend in $t$.)
2. Second, $z_t$ is said to be a difference stationary process if $\Delta z_t = z_t - z_{t-1} = u_t$, where $u_t$ is a stationary process.

Series that can be differenced to obtain stationary variables are called integrated series and it is useful to adopt the following terminology and notation: a stationary process is denoted $I(0)$ (integrated of order zero); and a series $z_t$ is said to be integrated of order $d$ if $\Delta^d z_t \sim I(0)$.

A variable $z_t$ follows a simple random walk if
$$\Delta z_t = z_t - z_{t-1} = u_t, \quad u_t \sim NID(0, \sigma^2). \qquad (1)$$
Hence,
$$z_t = \sum_{s=1}^{t} u_s + z_0.$$
If $z_0$ is zero, then $z_t$ is the sum of current and past innovations.
Clearly $z_t \sim I(1)$ and $\Delta z_t \sim I(0)$.

Model (1) can be extended so that $\Delta z_t$ has a nonzero mean, i.e. to allow for a drift,
$$\Delta z_t = z_t - z_{t-1} = a + u_t, \quad u_t \sim NID(0, \sigma^2),$$
so that, if $z_0 = 0$, then
$$z_t = \sum_{s=1}^{t} u_s + at, \qquad (2)$$
and equation (2) implies the existence of both deterministic and stochastic trend components.

I(0) AND I(1) PROCESSES: DIFFERENCES BETWEEN I(0) AND I(1) VARIABLES

We are able to illustrate the differences using two simple models:
1. $I(0)$: Let $z_t$ be an $I(0)$ variate generated by the stable AR(1) model
$$z_t = \phi z_{t-1} + u_t, \quad |\phi| < 1, \quad u_t \sim NID(0, \sigma^2).$$
It can be proved that for whatever value of $t$:
$$E(z_t) = 0; \qquad var(z_t) = \sigma^2/\left[1 - \phi^2\right] < \infty; \qquad corr(z_t, z_{t-s}) = \phi^s, \; s \geq 0.$$
Thus this $I(0)$ variable has constant mean, constant variance (hence large departures are rare), and autocorrelations that decline as the order increases.
Also can write $z_t = \sum_{j=0}^{\infty} \phi^j u_{t-j}$, so low weights are given to the distant past, with $\phi^j$ tending rapidly to zero as $j \to \infty$, i.e. finite memory.

DIFFERENCES BETWEEN I(0) AND I(1) VARIABLES

We are able to illustrate the differences using two simple models:
2. $I(1)$: Let $z_t$ be an $I(1)$ variate generated by the random walk
$$\Delta z_t = z_t - z_{t-1} = u_t, \quad u_t \sim NID(0, \sigma^2),$$
where $z_t = \sum_{s=1}^{t} u_s$, if $z_0 = 0$.
Hence,
$$E(z_t) = 0,$$
$$var(z_t) = t\sigma^2 \quad \text{(monotonic in } t\text{)},$$
$$corr(z_t, z_{t-g}) = (t - g)/\sqrt{t(t - g)} \quad \text{(dependence on } t\text{)}.$$
Even if $g$ is large, $corr(z_t, z_{t-g})$ can be close to 1 for $t \gg g$. Clearly $z_t$ is nonstationary.
Note that $z_t = \sum_{s=1}^{t} u_s$ implies an innovation affects all later values of $z_t$, i.e. there is an infinitely long memory.

EXAMPLES OF I(1) AND I(0)

GNP deflator as example (I(2) process): figure not reproduced.
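The contrast between the two sets of moments can be made concrete by evaluating the formulas directly. The sketch below uses the stable AR(1) results $var(z_t) = \sigma^2/(1-\phi^2)$ and $corr(z_t, z_{t-s}) = \phi^s$, and the random walk results $var(z_t) = t\sigma^2$ and $corr(z_t, z_{t-g}) = (t-g)/\sqrt{t(t-g)}$. The function names and the numerical values are hypothetical.

```python
import math

def ar1_var(phi, sigma2):
    """Variance of a stable AR(1): constant over time."""
    return sigma2 / (1.0 - phi ** 2)

def ar1_corr(phi, s):
    """Autocorrelation of a stable AR(1) at lag s: declines geometrically."""
    return phi ** s

def rw_var(t, sigma2):
    """Variance of a random walk started at zero: grows with t."""
    return t * sigma2

def rw_corr(t, g):
    """Correlation between z_t and z_{t-g} for a random walk started at zero."""
    return (t - g) / math.sqrt(t * (t - g))

# I(0): the correlation at lag 10 is already negligible.
print(ar1_var(0.5, 1.0), ar1_corr(0.5, 10))
# I(1): the correlation at lag 100 is still close to one when t = 1000.
print(rw_var(1000, 1.0), rw_corr(1000, 100))
```

The random walk correlation simplifies to $\sqrt{(t-g)/t}$, which makes the dependence on $t$ and the slow decay in $g$ easy to see.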

GNP deflator as example (AC function): figure not reproduced.

CONSEQUENCES FOR OLS ANALYSIS

Analysis is relatively straightforward when variables are trend-stationary with linear trends. In multiple regression with several regressors having linear trends, adding a trend term to the basic model and fitting
$$y_t = \alpha + \sum_j \beta_j x_{jt} + \delta t + u_t, \quad u_t \sim i.i.d.(0, \sigma^2),$$
provides a basis for valid estimation and inference, with the additional regressor serving as a trend-removing agent.
However, it has been established that the asymptotic theory of OLS estimators and tests developed for $I(0)$ variables can be misleading when applied to data from $I(1)$ processes. With nonstationary variables, OLS estimators may tend to nonstandard distributions, rather than normality, as $n \to \infty$.

TESTING FOR UNIT ROOTS

It is therefore important to test whether or not variables are generated by $I(1)$ difference stationary processes.
Many results have been derived for testing the null hypothesis that a series is difference stationary against the alternative of covariance stationarity.
- Consider a simple AR(1) model
$$z_t = \phi z_{t-1} + u_t, \quad u_t \sim NID(0, \sigma^2).$$
- If $\phi = 1$, then $z_t \sim I(1)$; if $|\phi| < 1$, then $z_t \sim I(0)$.
- Can define the polynomial $\phi(\lambda) = 1 - \phi\lambda$. Solving $\phi(\lambda) = 1 - \phi\lambda = 0$ yields the root $\lambda = 1/\phi$, which is unity if $\phi = 1$.
- Hence we talk of testing for a unit root. Several tests for unit roots are based upon work by Dickey and Fuller (hereafter DF).

UNIT ROOT TESTS

The correct identification of any deterministic component is crucial.
Let DP and RE denote the true data process and the regression equation used to compute the test, respectively. Will consider three cases.

CASE 1. $H_1$: $z_t$ stable AR(1) with zero mean.
DP(1): $z_t = u_t$, $u_t = \phi u_{t-1} + \epsilon_t$, $\epsilon_t \sim NID(0, \sigma^2)$, with $-1 < \phi \leq 1$ and $H_0: \phi = 1$. Apply OLS to
RE(1): $\Delta z_t = (\phi - 1)z_{t-1} + \epsilon_t$.

CASE 2. $H_1$: $z_t$ stable AR(1) with constant nonzero mean.
DP(2): $z_t = \mu_0 + u_t$, $u_t = \phi u_{t-1} + \epsilon_t$, $\epsilon_t \sim NID(0, \sigma^2)$, with $-1 < \phi \leq 1$ and $H_0: \phi = 1$. Apply OLS to
RE(2): $\Delta z_t = \mu + (\phi - 1)z_{t-1} + \epsilon_t$, $\mu \equiv \mu_0(1 - \phi)$. $H_0$ implies that $\mu = \phi - 1 = 0$.

CASE 3. $H_1$: $z_t$ linear trend + stable AR(1).
DP(3): $z_t = \mu_0 + \mu_1 t + u_t$, $u_t = \phi u_{t-1} + \epsilon_t$, $\epsilon_t \sim NID(0, \sigma^2)$, with $-1 < \phi \leq 1$ and $H_0: \phi = 1$. Apply OLS to
RE(3): $\Delta z_t = \mu + \delta t + (\phi - 1)z_{t-1} + \epsilon_t$, in which $\mu \equiv [\mu_0(1 - \phi) + \mu_1\phi]$ and $\delta \equiv \mu_1(1 - \phi)$. $H_0$ implies that $\delta = 0$.

Let $(\hat\phi - 1)$ denote an OLS estimator (same notation for all REs).
Can base tests on either $K(1) = n(\hat\phi - 1)$ or $t(1) = (\hat\phi - 1)/SE(\hat\phi - 1) = (\hat\phi - 1)/SE(\hat\phi)$. For RE(3), can also use $F(0, 1)$, the F statistic for testing the two restrictions $\delta = \phi - 1 = 0$.
Only $t(1)$ type tests are to be considered. DF show that, provided DP and RE are correctly matched, these test statistics have nonstandard asymptotic distributions under $H_0$. DF also provide estimated critical values for each of the three cases given above.
If RE is incorrectly matched with DP, results can be quite different and wrong.
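The $t(1)$ statistic for the three regression equations can be sketched with ordinary least squares. The code below is a minimal illustration, assuming the regression of $\Delta z_t$ on $z_{t-1}$ with no deterministic term, a constant, or a constant and trend. The function name `df_tstat` and its `trend` argument are hypothetical, and the resulting statistic must be compared with Dickey-Fuller critical values for the matching case rather than with the usual t tables.

```python
import numpy as np

def df_tstat(z, trend="c"):
    """t-ratio on z_{t-1} in the Dickey-Fuller regression.

    trend: "n" (no deterministic term, RE(1)), "c" (constant, RE(2)),
           "ct" (constant and linear trend, RE(3)).
    """
    if trend not in ("n", "c", "ct"):
        raise ValueError("trend must be 'n', 'c' or 'ct'")
    z = np.asarray(z, dtype=float)
    dz = np.diff(z)                      # dependent variable: first difference
    n = len(dz)
    cols = []
    if trend in ("c", "ct"):
        cols.append(np.ones(n))
    if trend == "ct":
        cols.append(np.arange(1, n + 1, dtype=float))
    cols.append(z[:-1])                  # lagged level, coefficient (phi - 1)
    X = np.column_stack(cols)
    coef = np.linalg.lstsq(X, dz, rcond=None)[0]
    resid = dz - X @ coef
    s2 = resid @ resid / (n - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return coef[-1] / np.sqrt(cov[-1, -1])

# Simulated stable AR(1) with phi = 0.5 (illustrative data only).
rng = np.random.default_rng(0)
e = rng.standard_normal(500)
z = np.zeros(500)
for t in range(1, 500):
    z[t] = 0.5 * z[t - 1] + e[t]

t_stat = df_tstat(z, trend="c")  # strongly negative for clearly stationary data
```

The sketch covers only the simple DF regression. Serially correlated differences would call for lagged differences as extra regressors.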

TESTING FOR UNIT ROOTS: CRITICAL VALUES OF DF-TAU TESTS

Table of critical values not reproduced.

TESTING FOR UNIT ROOTS: CRITICAL VALUES OF DF-GAMMA TESTS

Table of critical values not reproduced.

AUGMENTED DICKEY FULLER (ADF) TEST

Useful to relax the assumption that, under $H_0$, the $\Delta z_t$ are independent.
A fairly general specification is that, under the unit root hypothesis, $\Delta z_t$ follows a stationary mixed autoregressive-moving average process
$$\phi(l)\Delta z_t = \theta(l)\epsilon_t, \quad \epsilon_t \sim i.i.d.(0, \sigma^2),$$
in which $\phi(l)$ and $\theta(l)$ are polynomials in the lag operator.
Often approximate the autocorrelation of the $\Delta z_t$ by an autoregressive model to obtain the Augmented-DF (ADF) test.
For example, the ADF form corresponding to RE(3) can be written as
$$\Delta z_t = \mu_A + \delta_A t + (\phi_A - 1)z_{t-1} + \sum_{j=1}^{p} \gamma_j \Delta z_{t-j} + \epsilon_t;$$
it is asymptotically valid, for each of the three combinations of DP and RE, to use the same critical values for ADF and DF.

PROBLEMS OF ADF TESTS

Schwert points out that it may be difficult to obtain a satisfactory autoregressive approximation to the serial correlation when $\Delta z_t$ has a moving average component.
Choi finds that the use of a large value of p in ADF tests can lead to low power.
More generally, studies of power indicate that it may be very difficult to discriminate between trend stationary and difference stationary processes, e.g. DF tests for unit roots have low power when the data are trend stationary.
Perron finds that if the data have segmented trends, e.g. structural breaks, then unit root tests lack power.

CO-INTEGRATION

The recognition of the existence of unit roots coupled with ideas about long run equilibrium relationships leads to the study of co-integration. A simple case of a single relationship will be considered. More general treatments are available and the book by Banerjee et al. (1994) contains many useful discussions and references; also see the book by Harris (1995).
The topics to be covered are: the definition of co-integration and its links with equilibrium; testing for the absence of co-integration; and Granger's Representation Theorem, which concerns the Error Correction Models.
Co-integration theory was developed by Engle and Granger (1987) (Nobel Prize in Economics, 2003).

Definition: The variables $(z_{1t}, z_{2t}, ..., z_{mt})$ are said to be co-integrated if:
1. $z_{it} \sim I(d)$, $d > 0$, $\forall i$;
2. And there exist weights $\alpha_1, \alpha_2, ..., \alpha_m$ such that
$$a_t = \sum_j \alpha_j z_{jt} \sim I(d - b), \quad d \geq b > 0, \text{ with some } \alpha_j \neq 0.$$
If 1 and 2 are satisfied, then, in general, we write $(z_{1t}, z_{2t}, ..., z_{mt}) \sim CI(d, b)$.
Will only consider the case $d = b = 1$, i.e. a linear combination of $I(1)$ processes is a stationary $I(0)$ variable.

CO-INTEGRATION AND EQUILIBRIUM RELATIONSHIPS

Let $y_t$ and $x_t$ denote consumption and income, respectively. Suppose that both variables are $I(1)$. Consider $u_t$ defined by $y_t = \alpha + \beta x_t + u_t$. In general, a linear combination of $I(1)$ variables, such as $u_t = y_t - (\alpha + \beta x_t)$, is also $I(1)$.
Engle and Granger argue that there cannot be a meaningful equilibrium relationship between $y_t$ and $x_t$ unless $u_t \sim I(0)$, since an $I(1)$ error will wander widely and rarely cross the line through zero.
Thus co-integration is sometimes viewed as being required for a certain type of equilibrium relationship.
However, economists might wish to include $I(0)$ variables in an equilibrium relationship, as well as $I(1)$ regressors.

TESTING FOR THE ABSENCE OF CO-INTEGRATION

For technical reasons, the null hypothesis is taken to be no co-integration, so that, using the simple example above, the true $u_t$ is $I(1)$ (in which case, first difference the data and then apply classical methods for stationary processes).
Thus, if $\alpha$ and $\beta$ were known, could calculate the $u_t$ and apply an ADF test for a unit root. If the test indicated the rejection of the unit root restriction, the evidence could be viewed as supporting the assumption of co-integration with $u_t \sim I(0)$, but the parameters are, of course, unknown.

The parameters $\alpha$ and $\beta$ can be estimated by applying OLS; this is known as fitting the co-integrating regression (CR). If $y_t$ and $x_t$ are co-integrated, the OLS estimator of $\beta$ from the CR exhibits a property known as superconsistency because, as $n \to \infty$, it approaches the true value at a faster rate than in the classical stationary variables case. However, in small samples, biases may be important.

Having estimated the CR, the associated OLS residuals $\hat u_t$ can be used in a test for a unit root in the process $u_t$, which is a test for the absence of co-integration. Two simple tests can be used with the $\hat u_t$.
One procedure is to compute the DW statistic, which should be close to zero under the unit root hypothesis. This check is not recommended.
The second approach involves applying the ADF test in t-ratio form after OLS estimation of the equation
$$\Delta\hat u_t = \psi\,\hat u_{t-1} + \sum_{j=1}^{p} \gamma_j \Delta\hat u_{t-j} + e_t,$$
in which p is selected so that $e_t$ appears to be a sequence of i.i.d. variables.
This test is denoted CRADF. The DF tables are not valid for CRADF.
Asymptotic distributions under the unit root hypothesis depend upon the number of $I(1)$ regressors and whether or not the CR includes an intercept and/or a trend term.
Finite sample critical values have been estimated by computer methods for various cases and are available in some estimation programs, e.g. PcGive.

CO-INTEGRATION EXAMPLE 1: PPP THEORY

Consider for example the PPP theory for the exchange rate. In perfect markets there are no arbitrage opportunities, so the exchange rate $R$ is determined by the relative movements of the domestic price level $P$ and the foreign price level $P^*$, i.e.
$$R = \frac{P}{P^*}, \qquad r = p - p^* \text{ (in logs)}.$$
This can be seen as a long-run equilibrium. Data like exchange rates and price levels are usually $I(1)$, so they are quite volatile. However, if the PPP theory is correct they should not drift apart a lot over time, i.e.
$$r - (p - p^*) \text{ is small.}$$

T OPIC 8 T RENDING VARIABLES AND C O - INTEGRATION

C O - INTEGRATION E XAMPLE 1: PPP T HEORY

T OPIC 8 T RENDING VARIABLES AND C O - INTEGRATION

C O - INTEGRATION E XAMPLE 1: PPP T HEORY

C O - INTEGRATION E XAMPLE 2: T HE M ONEY D EMAND

C O - INTEGRATION E XAMPLE 2: T HE M ONEY D EMAND


Suppose: mt real money supply; rt interest rate; t inflation; Yt real
income.

If in fact there exists a vector of coefficients = ( 1 , 2 , 3 )such


that
1 rt 2 pt 3 pt = I (0)

Theories for demand for money suggest that

then rt , pt ,and pt are said to co-integrate and is the cointegrating


vector. Note: A co-integrating vector does not always exist, or there
might be more than one co-integrating vectors.

Moreover in practice the variables above are usually I (1) so they may
co-integrate.

mt = 1 rt + 2 t + 3 Yt

So now, in order to run out the cointegrating vector. One should:


1
2
3

J IANHUA G ANG (RUC)

I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 8 Trending and


S PRING
Co-Integration
2013

T OPIC 8 T RENDING VARIABLES AND C O - INTEGRATION

33 / 49

J OHANSEN (1988) A PPROACH

I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 8 Trending and


S PRING
Co-Integration
2013

34 / 49

J OHANSEN (1988) A PPROACH

J OHANSEN (1988) A PPROACH

In the previous lecture we have seen how we can test for


co-integration using the EG methodology. The EG approach involves
some serious drawbacks.
1 Suppose for example that x1t and x2t are I (1) and we want to test for
co-integration between those two variables. Recall the way to do it is
estimate the following regression by OLS1
x1t = b
1 + b
2 x2t + u
bt

and apply DF test on the residuals u


bt . Because x1t and x2t can be
treated in a symmetric fashion, hence an alternative regression can be
3 + b
4 x1t + bt
x2t = b

and do the same thing on bt .Theoretically the two approaches are


equivelant and they should give the same answer when the sample
used is large. In practice however, they may give different answers
because the sample sizes used are not large enough.
I NTRODUCTORY F INANCIAL E CONOMETRICS Topic 8 Trending and
S PRING
Co-Integration
2013

J IANHUA G ANG (RUC)

T OPIC 8 T RENDING VARIABLES AND C O - INTEGRATION

J OHANSEN (1988) A PPROACH

J IANHUA G ANG (RUC)

Use DF test to make sure that all the variables are I (1).
Use OLS to estimate the model mt = c
1 rt + c
2 t + c
3 Yt + ubt .
Conduct test on the residual. If there is co-integration, then the
residuals must be stationary, otherwise the residuals will be I (1).

35 / 49

2. Moreover, when m variables co-integrate, it is possible to have more than one distinct co-integrating relationship (up to $m - 1$ linearly independent co-integrating vectors can be found). The EG methodology cannot estimate distinct co-integrating relationships.

The Johansen (1988) procedure provides a framework that circumvents these problems. The Johansen approach involves estimation of a system of equations rather than a single equation. Before we consider this approach, we need to introduce the VAR and the VECM.
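The first drawback can be illustrated numerically. The following is a minimal sketch, not a definitive implementation: the data are simulated (an assumption made purely for illustration), the helper names `ols_simple`, `df_tstat` and `eg_residual_test` are hypothetical, and the DF regression is the simplest version without constant or lagged differences. It runs the EG two-step procedure in both directions on the same sample.

```python
import random

def ols_simple(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def df_tstat(u):
    """t-statistic of rho in the DF regression  du_t = rho * u_{t-1} + e_t."""
    lagged = u[:-1]
    diffs = [u[t] - u[t - 1] for t in range(1, len(u))]
    s_ll = sum(l * l for l in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / s_ll
    resid = [d - rho * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return rho / (sigma2 / s_ll) ** 0.5

def eg_residual_test(dep, reg):
    """Step 1: OLS of dep on reg. Step 2: DF t-statistic on the residuals."""
    a, b = ols_simple(reg, dep)
    resid = [d - a - b * r for d, r in zip(dep, reg)]
    return b, df_tstat(resid)

# Simulated data (assumption): x2 is a random walk and
# x1 = 1 + 2*x2 + stationary noise, so the pair co-integrates by construction.
rng = random.Random(0)
x2 = [0.0]
for _ in range(499):
    x2.append(x2[-1] + rng.gauss(0.0, 1.0))
x1 = [1.0 + 2.0 * v + rng.gauss(0.0, 1.0) for v in x2]

slope_12, t_12 = eg_residual_test(x1, x2)   # x1 regressed on x2
slope_21, t_21 = eg_residual_test(x2, x1)   # x2 regressed on x1
print(slope_12, t_12)
print(slope_21, t_21)
```

The two regressions generally produce different residual series and hence different test statistics in a finite sample. Note that residual-based statistics of this kind should be compared with critical values tabulated for the co-integration setting rather than with the standard DF tables.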

VAR MODEL
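The VAR model is, as the name suggests, an autoregression of a vector process. Consider the simplest example of a VAR: a two-variable VAR model with a lag of first order (VAR(1)),

$$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix}$$

It can be re-written as a more compact expression,

$$Y_t = \Phi_1 Y_{t-1} + \varepsilon_t$$

Now, to generalize, the VAR(p) is then

$$Y_t = \sum_{i=1}^{p} \Phi_i Y_{t-i} + \varepsilon_t$$

where $Y_t = [y_{1t}, \dots, y_{mt}]'$ and the $\Phi_i$ are $(m \times m)$ matrices of coefficients.

Recall the AR(p) model is $y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t$, and it can be reparameterized as

$$\Delta y_t = \rho y_{t-1} + \sum_{i=2}^{p} \gamma_i \Delta y_{t-i+1} + \varepsilon_t$$

or

$$\Delta y_t = \rho y_{t-p} + \sum_{i=1}^{p-1} c_i \Delta y_{t-i} + \varepsilon_t$$

Scalar autoregressive models are inappropriate for co-integration analysis, as they involve only one variable ($y_t$), whereas co-integration involves more than one variable.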


REPARAMETERISING THE VAR: THE VECM

Just like the scalar AR(p) model, the VAR(p) model can finally be reparameterised as follows:

$$\Delta Y_t = \Pi_p Y_{t-p} + \sum_{i=1}^{p-1} \Pi_i \Delta Y_{t-i} + \varepsilon_t$$

where $\Pi_i = -\left(I - \sum_{j=1}^{i} \Phi_j\right)$, $i = 1, \dots, p$, the $\Phi_j$ being the VAR(p) coefficient matrices.

The model is called the Vector Error Correction Model (VECM). Notice that the only term in levels is $Y_{t-p}$; the rest of the terms appear in differences. As will be explained in detail later, the Johansen approach relies on the VECM.

THE VECM, I(1) PROCESSES AND CO-INTEGRATION

Consider the VECM representation of a VAR(1) on two component I(1) series, $Y_t = [y_{1t}, y_{2t}]' \sim I(1)$:

$$\Delta Y_t = \Pi_1 Y_{t-1} + \varepsilon_t$$

This expression can be easily rearranged into

$$\Pi_1 Y_{t-1} = \Delta Y_t - \varepsilon_t.$$

Note: the right-hand side is I(0), so $\Pi_1 Y_{t-1}$ must be I(0) as well, i.e. the rows of the matrix $\Pi_1$ are co-integrating vectors and $y_{1t}$ and $y_{2t}$ co-integrate. The rank of $\Pi_1$ gives the number of linearly independent co-integrating vectors.

Note that $m = 2$, so we cannot have more than one linearly independent co-integrating vector.
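As a worked case of the reparameterisation of a VAR into a VECM, the VAR(2) can be written out explicitly. This is a sketch under one notational assumption: the VAR coefficient matrices are labelled $\Phi_1, \Phi_2$, a labelling chosen here for illustration. The derivation only adds and subtracts lagged levels:

```latex
\begin{aligned}
Y_t &= \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \varepsilon_t \\
\Delta Y_t &= (\Phi_1 - I)\, Y_{t-1} + \Phi_2 Y_{t-2} + \varepsilon_t \\
           &= (\Phi_1 - I)(Y_{t-1} - Y_{t-2}) + (\Phi_1 + \Phi_2 - I)\, Y_{t-2} + \varepsilon_t \\
           &= \underbrace{(\Phi_1 - I)}_{\text{short-run matrix}} \Delta Y_{t-1}
            + \underbrace{(\Phi_1 + \Phi_2 - I)}_{\text{long-run matrix}} Y_{t-2} + \varepsilon_t
\end{aligned}
```

Only $Y_{t-2}$ appears in levels, as in the general VECM with $p = 2$; every other term is in differences.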
The result from the last slide can be generalized easily to higher-order VECMs. Consider the model as before and suppose that $Y_t \sim I(1)$. Then:

1. The rows of $\Pi_p$ are co-integrating vectors of $Y_{t-p}$.
2. $\operatorname{rank}(\Pi_p) = r$, where $r$ is the number of linearly independent co-integrating vectors.

Since $r \le m - 1$, $\Pi_p$ is of reduced rank (singular).

At the centre of the subsequent analysis is the matrix $\Pi_p$. In particular, we are interested in the rank of $\Pi_p$.

PROPERTIES OF VECM REQUIRED FOR CO-INTEGRATION

So far we have considered what co-integration implies for the properties of the VECM. Now reverse the question and ask which properties of the VECM imply co-integration.

Let the rank of $\Pi_p$ be $r$. Then the following hold:

- $r = m$: all component series of $Y_t$ are I(0), so co-integration is not an issue.
- $0 < r < m$: all component series are at least I(1) and co-integration exists.
- $r = 0$: all component series are I(1), but co-integration does not exist.

TRACE TESTS AND MAX TESTS FOR CO-INTEGRATION VECTORS

Under Johansen's approach, the test statistics for co-integration are formulated as

$$\lambda_{trace}(r) = -T \sum_{i=r+1}^{g} \ln(1 - \hat{\lambda}_i)$$

and

$$\lambda_{max}(r, r+1) = -T \ln(1 - \hat{\lambda}_{r+1})$$

where $\hat{\lambda}_i$ is the estimated value of the $i$-th ordered eigenvalue from the $\Pi_p$ matrix, $r$ is the rank of the matrix $\Pi_p$, $T$ is the number of observations, and $g$ is the dimension of $\Pi_p$.
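The two statistics are simple functions of the ordered eigenvalues. The sketch below shows the arithmetic only: the eigenvalues and sample size are hypothetical numbers, not estimates from any data set, and the function names `trace_stat` and `max_stat` are illustrative.

```python
import math

def trace_stat(eigenvalues, r, T):
    """lambda_trace(r) = -T * sum over i = r+1..g of ln(1 - lambda_i)."""
    lam = sorted(eigenvalues, reverse=True)      # largest eigenvalue first
    return -T * sum(math.log(1.0 - l) for l in lam[r:])

def max_stat(eigenvalues, r, T):
    """lambda_max(r, r+1) = -T * ln(1 - lambda_{r+1})."""
    lam = sorted(eigenvalues, reverse=True)
    return -T * math.log(1.0 - lam[r])           # index r is the (r+1)-th eigenvalue

# Hypothetical estimated eigenvalues for a g = 2 system with T = 100.
eigs = [0.30, 0.10]
T = 100
trace_0 = trace_stat(eigs, 0, T)   # uses both eigenvalues
trace_1 = trace_stat(eigs, 1, T)   # uses only the smallest eigenvalue
max_0 = max_stat(eigs, 0, T)       # uses only the largest eigenvalue
print(trace_0, trace_1, max_0)
```

By construction the trace statistic at $r$ equals the max statistic at $r$ plus the trace statistic at $r + 1$. In practice the computed values are compared with tabulated, non-standard critical values.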

The $\lambda_{trace}(r)$ statistic tests the null that the number of co-integrating vectors is less than or equal to $r$ against an unspecified alternative, while $\lambda_{max}(r, r+1)$ tests the null that the number of co-integrating vectors is $r$ against the alternative of $r + 1$.

OBTAINING LINEARLY INDEPENDENT CO-INTEGRATING VECTORS FROM THE VECM

Recall that a reduced-rank matrix can be decomposed into a product of two full-rank matrices. If co-integration exists, then the $m \times m$ matrix $\Pi_p$ is of reduced rank ($r < m$) and can be expressed as

$$\Pi_p = \alpha\beta'$$

where $\alpha$, $\beta$ are $m \times r$ full-rank matrices.

Consider, for example, the case of $m = 2$. Then if $y_{1t}$, $y_{2t}$ co-integrate, $r = 1$ and

$$\Pi_p = \alpha\beta' = \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} \begin{bmatrix} \beta_1 & \beta_2 \end{bmatrix} = \begin{bmatrix} \alpha_1\beta_1 & \alpha_1\beta_2 \\ \alpha_2\beta_1 & \alpha_2\beta_2 \end{bmatrix}$$
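The rank-one decomposition in the $m = 2$ case can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical numbers for the loadings and the co-integrating vector; the normalisation of the first element of the co-integrating vector to one is an assumption made here, since the decomposition is not unique (any rescaling of one factor can be offset in the other).

```python
def outer(alpha, beta):
    """Return the matrix alpha * beta' for column vectors alpha and beta."""
    return [[a * b for b in beta] for a in alpha]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Hypothetical values: beta is the co-integrating vector, alpha the loadings.
alpha = [-0.5, 0.25]
beta = [1.0, -1.0]
Pi = outer(alpha, beta)            # [[-0.5, 0.5], [0.25, -0.25]]

# Recover a normalised decomposition from Pi alone (valid when rank(Pi) = 1
# and Pi[0][0] != 0): scale beta so that its first element equals 1.
beta_hat = [Pi[0][j] / Pi[0][0] for j in range(2)]
alpha_hat = [Pi[i][0] for i in range(2)]

print(det2(Pi))                    # zero determinant: Pi is singular
print(alpha_hat, beta_hat)
```

The zero determinant confirms that the product of a $2 \times 1$ and a $1 \times 2$ factor has reduced rank, and the recovered pair reproduces the original matrix.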
Now consider $\Pi_p Y_{t-p}$, the levels term in the VECM:

$$\Pi_p Y_{t-p} = \alpha\beta' \begin{bmatrix} y_{1,t-p} \\ y_{2,t-p} \end{bmatrix} = \begin{bmatrix} \alpha_1(\beta_1 y_{1,t-p} + \beta_2 y_{2,t-p}) \\ \alpha_2(\beta_1 y_{1,t-p} + \beta_2 y_{2,t-p}) \end{bmatrix}$$

$\beta_1 y_{1,t-p} + \beta_2 y_{2,t-p}$ is the co-integrating relationship and defines the co-integrating vector (here $r = 1$). The matrix $\alpha$ of weightings can be seen as a matrix of "speed of adjustment" coefficients.

ECONOMIC INTERPRETATION OF THE VECM

Error Correction Models (ECM) have been widely used in economics, e.g. in theories of the demand for money. The idea is as follows: let $x_t^*$ be the optimal money balance that the individual wants to hold in period $t$. Moreover, let $x_t$ be the actual money stock. Equilibrium is attained when $x_t = x_t^*$ (long run).

In practice, however, $x_t$ may be different from $x_t^*$ due to adjustment costs. The disequilibrium error in period $t$ is defined as

$$e_t = x_t - x_t^*$$

The ECM suggests that $x_t$ changes over time to correct disequilibrium errors that occurred in the past, i.e.

$$\Delta x_t = \gamma e_{t-1},$$

where $\gamma$ is a speed-of-adjustment coefficient.
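The error-correction mechanism can be traced out with a short simulation. This is a sketch under stated assumptions: the target balance is held constant, the error is measured as actual minus optimal, the adjustment coefficient is written `gamma`, and all numbers are hypothetical.

```python
def simulate_ecm(x0, x_star, gamma, steps):
    """Iterate  x_t = x_{t-1} + gamma * e_{t-1},  with  e_t = x_t - x_star."""
    path = [x0]
    for _ in range(steps):
        error = path[-1] - x_star              # disequilibrium error e_{t-1}
        path.append(path[-1] + gamma * error)  # partial correction of that error
    return path

# Hypothetical values: start at 0, target 10, half of the gap closed each period.
path = simulate_ecm(x0=0.0, x_star=10.0, gamma=-0.5, steps=20)
print(path[:4])
```

Under this sign convention the error evolves as $e_t = (1 + \gamma) e_{t-1}$, so the gap shrinks whenever $|1 + \gamma| < 1$; with `gamma = -0.5` it halves every period.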


In the context of the VECM, long-run equilibrium relation(s) exist when there is co-integration. The long-run relationship is defined by $\beta' Y_{t-p}$. When $\beta' Y_{t-p} = 0$ the system is in equilibrium. In the short run, however, $\beta' Y_{t-p} \neq 0$, and $\beta' Y_{t-p}$ gives the disequilibrium error. $Y_t$ changes over time in response

1. to the past error ($\beta' Y_{t-p}$), according to the adjustment coefficients given by the matrix $\alpha$, and
2. to past changes $\Delta Y_{t-1}, \Delta Y_{t-2}, \dots, \Delta Y_{t-p+1}$.

TOPIC 9: CAUSALITY, EXOGENEITY, AND SHOCK

INTRODUCTORY FINANCIAL ECONOMETRICS
Topic 9: Causality, Exogeneity and Shock
3 CREDITS, 51 HOURS
Jianhua Gang
School of Finance
Renmin University of China

Spring 2013

DYNAMIC MACROECONOMETRIC MODELS

GOALS

- Time-series properties of macro variables;
- How certain variables are related to each other (interactions, causality);
- Systemic dynamics/transmission of shocks.

OUR DATA

EXAMPLES (MACRO DATA)
Quantity, price. These data are the result of aggregation procedures with respect to economic agents, goods, and time.

VARIABLES

DEFINITION (ENDOGENOUS VARIABLE)
Variables that are specific to the phenomenon under study and whose evolution one wants to follow.

DEFINITION (EXOGENOUS VARIABLE)
Variables that help to explain the phenomenon: they may influence the endogenous variables, while their own values are fixed outside the phenomenon.

LINEAR SYSTEM

DEFINITION (LINEAR SYSTEM)
Denote by $y_t$ the vector of $n$ endogenous variables at time $t$ and by $x_t$ the vector of $m$ exogenous variables; then a linear system is

$$A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} + \mu = 0, \qquad (1)$$

where the $A_j$, $j = 0, 1, \dots, p$, are $n \times n$ matrices, the $B_j$ are $n \times m$ matrices, and $\mu$ is an $n \times 1$ vector. $A_0$ is assumed to be nonsingular, so that the system determines the current values of the endogenous variables uniquely.

EXAMPLE - KEYNESIAN MODEL

Aim of the model: derive the impact on the economy of an autonomous (exogenously decided) expenditure ($G_t$) policy.
Endogenous variables: $GDP_t$, $C_t$, $I_t$.
Exogenous variable: $G_t$.

The system is

$$\begin{aligned}
GDP_t &= C_t + I_t + G_t \\
C_t &= a\, GDP_{t-1} \\
I_t &= b\, (GDP_{t-1} - GDP_{t-2})
\end{aligned} \qquad (2)$$

where the first equation represents the equilibrium total supply = total demand; the second gives the demand for consumption ($a$ being some fraction between 0 and 1); and the third gives the propensity to invest (assuming a growth period).
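The three equations can be iterated directly to see how a constant expenditure policy feeds through the economy. The following is a minimal sketch: the parameter values, the starting values and the function name `simulate_keynesian` are all hypothetical choices made for illustration.

```python
def simulate_keynesian(a, b, g_path, gdp_init):
    """Iterate system (2); gdp_init = (GDP_{-2}, GDP_{-1}) are starting values."""
    gdp = list(gdp_init)
    rows = []
    for g in g_path:
        c = a * gdp[-1]                  # C_t = a * GDP_{t-1}
        i = b * (gdp[-1] - gdp[-2])      # I_t = b * (GDP_{t-1} - GDP_{t-2})
        y = c + i + g                    # GDP_t = C_t + I_t + G_t
        rows.append((y, c, i))
        gdp.append(y)
    return rows

# Hypothetical parameter values, chosen only for illustration.
a, b = 0.6, 0.3
rows = simulate_keynesian(a, b, g_path=[100.0] * 200, gdp_init=(0.0, 0.0))
gdp_final, c_final, i_final = rows[-1]
print(rows[0])
print(gdp_final, c_final, i_final)
```

With these values the path settles at the stationary point of the system, where investment is zero and $GDP = G / (1 - a)$. Other choices of $a$ and $b$ can produce oscillating or explosive paths.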
It is hence convenient to rewrite model (2) as

$$\begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} GDP_t \\ C_t \\ I_t \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ -a & 0 & 0 \\ -b & 0 & 0 \end{bmatrix} \begin{bmatrix} GDP_{t-1} \\ C_{t-1} \\ I_{t-1} \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ b & 0 & 0 \end{bmatrix} \begin{bmatrix} GDP_{t-2} \\ C_{t-2} \\ I_{t-2} \end{bmatrix} + \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} G_t = 0 \qquad (3)$$

RANDOMNESS

DYNAMICS AND DISTURBANCES

The dynamic model (3) is deterministic and does not reflect short-run disturbances.
If the whole dynamics has been correctly included in the initial specification as in (3), these disturbances should be independent.
With random factors, we may re-write model (2) as

$$\begin{aligned}
TD_t &= C_t + I_t + G_t \\
GDP_t &= TD_t \\
C_t &= a\, GDP_{t-1} + u_t \\
I_t &= b\, (GDP_{t-1} - GDP_{t-2}) + v_t
\end{aligned} \qquad (4)$$

where, given the equilibrium conditions, the price adjustment ensures total demand = total supply, while clearly the behaviors ($C_t$ and $I_t$) are determined by factors other than just revenue. So we add error terms to these behavioral equations.

Model (4) can be further written as

$$\begin{bmatrix} C_t \\ I_t \end{bmatrix} = \begin{bmatrix} a & a \\ b & b \end{bmatrix} \begin{bmatrix} C_{t-1} \\ I_{t-1} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ -b & -b \end{bmatrix} \begin{bmatrix} C_{t-2} \\ I_{t-2} \end{bmatrix} + \begin{bmatrix} a \\ b \end{bmatrix} G_{t-1} + \begin{bmatrix} 0 \\ -b \end{bmatrix} G_{t-2} + \begin{bmatrix} u_t \\ v_t \end{bmatrix} \qquad (5)$$

More compactly,

$$A_0 y_t + A_1 y_{t-1} + A_2 y_{t-2} + B_0 x_t + B_1 x_{t-1} + B_2 x_{t-2} + \mu = \varepsilon_t$$
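That the two-equation form in consumption and investment carries the same dynamics as model (4) can be checked numerically. The sketch below substitutes $GDP_{t-1} = C_{t-1} + I_{t-1} + G_{t-1}$ into the behavioral equations and compares the two simulated paths; the parameter values, shock draws and starting values are hypothetical.

```python
import random

a, b = 0.6, 0.3                      # hypothetical parameter values
rng = random.Random(1)
n = 50
G = [100.0 + rng.gauss(0.0, 5.0) for _ in range(n)]
u = [rng.gauss(0.0, 1.0) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Model (4): iterate on GDP, with arbitrary starting values for t = 0, 1.
C4 = [60.0, 60.0]
I4 = [0.0, 0.0]
gdp = [C4[t] + I4[t] + G[t] for t in range(2)]
for t in range(2, n):
    C4.append(a * gdp[t - 1] + u[t])
    I4.append(b * (gdp[t - 1] - gdp[t - 2]) + v[t])
    gdp.append(C4[t] + I4[t] + G[t])

# Two-equation form: the same dynamics written in (C, I) and lagged G only.
C5, I5 = C4[:2], I4[:2]
for t in range(2, n):
    C5.append(a * (C5[t - 1] + I5[t - 1] + G[t - 1]) + u[t])
    I5.append(b * (C5[t - 1] + I5[t - 1] + G[t - 1])
              - b * (C5[t - 2] + I5[t - 2] + G[t - 2]) + v[t])

# Largest discrepancy between the two representations (rounding error only).
max_gap = max(abs(p - q) for p, q in zip(C4 + I4, C5 + I5))
print(max_gap)
```

The two paths agree up to floating-point rounding, which is what the substitution implies: eliminating GDP changes the representation, not the model.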

CONTROL VARIABLES AND ENVIRONMENT VARIABLES

DEFINITIONS

DEFINITION (CONTROL VAR.)
Exogenous variables that can be controlled by the policy maker (a.k.a. instruments, economic policy, decision variables).

DEFINITION (ENVIRONMENT VAR.)
Other exogenous variables, which have their own evolution on which we cannot easily intervene.

It is hence possible to distinguish among the exogenous variables ($x_t$ environment, $z_t$ control) in the model,

$$\begin{aligned}
& A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} \\
& \quad + C_0 z_t + C_1 z_{t-1} + \dots + C_p z_{t-p} + \mu = \varepsilon_t
\end{aligned}$$

EVOLUTION OF THE ENVIRONMENT VAR.

Consider the control var.s as fixed, and suppose that the environment var.s are determined before $y_t$, so that they influence the endogenous var.s. Therefore,

$$\begin{aligned}
& A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} \\
& \quad + C_0 z_t + C_1 z_{t-1} + \dots + C_p z_{t-p} + \mu = \varepsilon_t \\
& x_t + D_1 x_{t-1} + \dots + D_p x_{t-p} + E_0 z_t + E_1 z_{t-1} + \dots + E_p z_{t-p} \\
& \quad + F_1 y_{t-1} + \dots + F_p y_{t-p} + \nu = u_t
\end{aligned} \qquad (6)$$

where $\{\varepsilon_t\}$ and $\{u_t\}$ are two mutually uncorrelated white noise (W.N.) processes.

BLOCK-RECURSIVE AND WEAK EXOGENEITY

Implied assumptions of model (6):

- The control var.s can have an impact on the endogenous var.s or on the environment var.s. However, they do not influence the dynamics directly (i.e. they do not alter the coefficients $A_j$, $B_j$, $D_j$, $F_j$);
- The $x$ are exogenous because the $x_t$'s are fixed prior to the $y_t$'s ($F_0 = 0$ and $cov(u_t, \varepsilon_t) = 0$).

The model (6) is called block-recursive (determination of $x$ and then of $y$).
The recursive model (6) corresponds to weak exogeneity (with information on the lagged endogenous variable $y$).

AUTONOMOUS ENVIRONMENT VAR.

The model is more restrictive if we assume that the $x_t$'s are determined autonomously (without any relationship to the lagged endogenous var.s). This corresponds to imposing $F_j = 0$, $\forall j$:

$$\begin{aligned}
& A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} \\
& \quad + C_0 z_t + C_1 z_{t-1} + \dots + C_p z_{t-p} + \mu = \varepsilon_t \\
& x_t + D_1 x_{t-1} + \dots + D_p x_{t-p} + E_0 z_t + E_1 z_{t-1} + \dots + E_p z_{t-p} + \nu = u_t
\end{aligned} \qquad (7)$$

where $\{\varepsilon_t\}$ and $\{u_t\}$ are two mutually uncorrelated W.N. processes.

CHARACTERIZATION OF THE ECONOMIC POLICY

The policy maker can intervene on the control var.s (their value or evolution) so as to affect the endogenous var.s.

EXAMPLE
In the Keynesian model, the government can alter $G_t$ so as to influence the economy, e.g. maintain a constant level of expenditure,

$$G_t = G_{t-1},$$

or modify government expenditure according to the observed evolution of investment,

$$G_t - G_{t-1} = \theta\, (I_{t-1} - I_{t-2}),$$

with $\theta$ a policy response coefficient. This is how the values of the control var.s are fixed in terms of the main aggregates, and it can be expressed by adding a policy equation.

DEFINITION (WITH POLICY EQUATION)

$$\begin{aligned}
& A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} \\
& \quad + C_0 z_t + C_1 z_{t-1} + \dots + C_p z_{t-p} + \mu = \varepsilon_t && (8) \\
& x_t + D_1 x_{t-1} + \dots + D_p x_{t-p} + E_0 z_t + E_1 z_{t-1} + \dots + E_p z_{t-p} \\
& \quad + F_1 y_{t-1} + \dots + F_p y_{t-p} + \nu = u_t && (9) \\
& z_t + G_1 z_{t-1} + \dots + G_p z_{t-p} + H_1 x_{t-1} + \dots + H_p x_{t-p} \\
& \quad + I_1 y_{t-1} + \dots + I_p y_{t-p} + \gamma = v_t && (10)
\end{aligned}$$

where $cov(\varepsilon_t, u_t) = cov(\varepsilon_t, v_t) = cov(u_t, v_t) = 0$.

From equations (8) to (10), additional recursiveness can be observed: determination of $z$, then of $x$, then of $y$.
However, the policy maker may only choose the values of the coefficients $G_j$, $H_j$, $I_j$; he does not have any influence on the other parameters of the model.

VARIOUS FORMS OF A DYNAMIC MODEL

THE STRUCTURAL FORM

The structural form, for example, corresponds to the initial equation:

$$A_0 y_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} + \mu = \varepsilon_t. \qquad (11)$$

Traditionally, $A_0$ is expressed with unit elements along its main diagonal; equation (11) can then be re-written as

$$y_t = -\mu + (I - A_0) y_t - A_1 y_{t-1} - \dots - A_p y_{t-p} - B_0 x_t - B_1 x_{t-1} - \dots - B_p x_{t-p} + \varepsilon_t, \qquad (12)$$

where $(I - A_0)$ has zero elements on the main diagonal.
A system as in (12) could be difficult to interpret without additional constraints.

SIMULTANEITY

- Simultaneity among the variables can be introduced through the coefficients of $A_0$ and through the nonzero contemporaneous correlation of the elements of the vector $\varepsilon_t$.
- While the simultaneity appearing in $A_0$ is easily interpretable in terms of equilibrium, the one appearing in $var(\varepsilon_t)$ is not!
- It is not possible to keep the two sources of simultaneity separate.

THE REDUCED FORM

DEFINITION (REDUCED FORM)
Each endogenous var. is expressed as a function of the lagged endogenous var.s, of the exogenous var.s, and of the disturbance term, i.e.

$$y_t = -A_0^{-1}\left(A_1 y_{t-1} + \dots + A_p y_{t-p} + B_0 x_t + B_1 x_{t-1} + \dots + B_p x_{t-p} + \mu\right) + A_0^{-1}\varepsilon_t. \qquad (13)$$

COMMENTS

- The initial parameters are therefore transformed into other, summarized forms. These are sometimes easy to calculate/estimate, but...
- Problem: do we really care about reduced-form estimates?

Model (13) can be simplified into

$$y_t = -A(0)^{-1}\left((A(L) - A(0))\, y_t + B(L) x_t + \mu\right) + A(0)^{-1}\varepsilon_t$$

where $A(L) = A_0 + A_1 L + \dots + A_p L^p$ and $B(L) = B_0 + B_1 L + \dots + B_p L^p$.
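As a worked illustration, the reduced form of the deterministic Keynesian example can be computed by hand. The sketch below assumes the structural matrices are read off model (3), with endogenous vector $(GDP_t, C_t, I_t)'$, a single exogenous variable $G_t$, no constant and no disturbance:

```latex
\begin{aligned}
A_0 &= \begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
A_0^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \\[4pt]
\begin{bmatrix} GDP_t \\ C_t \\ I_t \end{bmatrix}
&= -A_0^{-1}
\begin{bmatrix} -G_t \\ -a\,GDP_{t-1} \\ -b\,GDP_{t-1} + b\,GDP_{t-2} \end{bmatrix}
= \begin{bmatrix}
(a + b)\,GDP_{t-1} - b\,GDP_{t-2} + G_t \\
a\,GDP_{t-1} \\
b\,(GDP_{t-1} - GDP_{t-2})
\end{bmatrix}
\end{aligned}
```

Each current endogenous variable now depends only on lagged endogenous variables and on $G_t$; the simultaneity between $GDP_t$, $C_t$ and $I_t$ in the first structural equation has been solved out.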
THE FINAL FORM

DEFINITION (THE FINAL FORM)
The above expressions can be further transformed so as to express the current value of the endogenous var.s $y_t$ as a function of the exogenous variables and of the disturbances $\varepsilon_\tau$, $\tau \le t$. This is the final form.
Given that all roots of $\det A(L)$ lie outside the unit circle,

$$y_t = -A(L)^{-1} B(L) x_t - A(L)^{-1}\mu + A(L)^{-1} A_0 (A_0^{-1}\varepsilon_t),$$

which allows us to separate the influence of the exogenous var.s and of the disturbances on $y$.

CAUSALITY

DEFINITION

- Previous cases: we studied the distinction between endogenous and exogenous (control/environment) variables.
- Now consider another approach: analyzing the joint evolution of the various variables of interest, and examining whether some of them are fixed before others.
- This can be used on the processes $\{x_t\}$ and $\{y_t\}$ (and may also be used on the control var.s $\{z_t\}$).

DEFINITIONS - CAUSALITY

DEFINITION (GRANGER (1969))
Let $\underline{x}_{t-1}$ denote the history of $x$ up to and including date $t - 1$, and similarly for $y$.
1. $y$ causes $x$ at time $t$ iff
$$E(x_t \mid \underline{x}_{t-1}, \underline{y}_{t-1}) \neq E(x_t \mid \underline{x}_{t-1});$$
2. $y$ causes $x$ instantaneously at time $t$ iff
$$E(x_t \mid \underline{x}_{t-1}, \underline{y}_{t}) \neq E(x_t \mid \underline{x}_{t-1}, \underline{y}_{t-1}).$$

DEFINITION - NONCAUSALITY

As a result of the properties of linear regression, the forecast based on more information is necessarily the best one, i.e.

$$var(\varepsilon(x_t \mid \underline{x}_{t-1}, \underline{y}_{t-1})) \le var(\varepsilon(x_t \mid \underline{x}_{t-1})),$$

where $\varepsilon(x_t \mid \cdot)$ denotes the forecast error of $x_t$ given the stated information. We then have the following conditions.

DEFINITION (NONCAUSALITY)
1. $y$ does not cause $x$ at time $t$ iff
$$var(\varepsilon(x_t \mid \underline{x}_{t-1}, \underline{y}_{t-1})) = var(\varepsilon(x_t \mid \underline{x}_{t-1}));$$
2. $y$ does not cause $x$ instantaneously at time $t$ iff
$$var(\varepsilon(x_t \mid \underline{x}_{t-1}, \underline{y}_{t})) = var(\varepsilon(x_t \mid \underline{x}_{t-1}, \underline{y}_{t-1})).$$

COROLLARY (SYMMETRIC)
The two following statements are equivalent:
1. $y$ does not cause $x$ instantaneously at time $t$;
2. $x$ does not cause $y$ instantaneously at time $t$.
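The variance comparison in the noncausality definition can be illustrated on simulated data. This is a rough sketch, not a formal test: the data-generating process is hypothetical, only one lag stands in for the full history, the regressions omit an intercept because the simulated series have zero mean, and the helper names `resid_var_1` and `resid_var_2` are illustrative.

```python
import random

def resid_var_1(z, w):
    """Residual variance of the no-intercept regression of z on one regressor w."""
    beta = sum(wi * zi for wi, zi in zip(w, z)) / sum(wi * wi for wi in w)
    return sum((zi - beta * wi) ** 2 for wi, zi in zip(w, z)) / len(z)

def resid_var_2(z, w1, w2):
    """Residual variance of the no-intercept regression of z on w1 and w2."""
    s11 = sum(p * p for p in w1)
    s22 = sum(q * q for q in w2)
    s12 = sum(p * q for p, q in zip(w1, w2))
    s1z = sum(p * zi for p, zi in zip(w1, z))
    s2z = sum(q * zi for q, zi in zip(w2, z))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1z - s12 * s2z) / det      # solve the 2x2 normal equations
    b2 = (s11 * s2z - s12 * s1z) / det
    return sum((zi - b1 * p - b2 * q) ** 2
               for p, q, zi in zip(w1, w2, z)) / len(z)

# Simulated data (assumption): y feeds into x with one lag,
# while x never enters the equation for y.
rng = random.Random(2)
x, y = [0.0], [0.0]
for _ in range(2000):
    x_new = 0.5 * x[-1] + 0.4 * y[-1] + rng.gauss(0.0, 1.0)
    y_new = 0.5 * y[-1] + rng.gauss(0.0, 1.0)
    x.append(x_new)
    y.append(y_new)

x_now, x_lag, y_now, y_lag = x[1:], x[:-1], y[1:], y[:-1]
# Ratio of forecast-error variances: with versus without the other variable's past.
ratio_y_to_x = resid_var_2(x_now, x_lag, y_lag) / resid_var_1(x_now, x_lag)
ratio_x_to_y = resid_var_2(y_now, y_lag, x_lag) / resid_var_1(y_now, y_lag)
print(ratio_y_to_x, ratio_x_to_y)
```

A ratio clearly below one indicates that the past of the other variable reduces the forecast-error variance, which is the sense of causality used here; a ratio close to one is consistent with noncausality. In a sample the ratio never exceeds one, so a formal statistical test is needed to judge whether a small reduction is significant.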

CAUSALITY REVERSAL

The definitions of causality proposed are valid for any time $t$. In reality, for certain phenomena we could observe a causality reversal. Therefore, we need to provide a definition applicable in the absence of such reversals.

DEFINITION (ABSENCE OF REVERSAL)
$y$ does not cause $x$ (instantaneously) iff $y$ does not cause $x$ (instantaneously) at time $t$ for all possible times $t$.

When the process $(x, y)$ is stationary, the definitions for a certain date and for all dates coincide.

LIMIT

It is clear that the definitions of causality shown above involve conditions on the forecast error only. It might hence be preferable to use terms such as predictability and instantaneous predictability instead of causality and instantaneous causality.

However, academia still uses the term causality. Therefore, one should constantly keep in mind that the previous definitions are sometimes not suitable for describing real-world phenomena.

CAUSALITY AND VAR MODELS

Consider the expression of the VAR,

$$\begin{bmatrix} \Phi_y(L) & \Phi_{yx}(L) \\ \Phi_{xy}(L) & \Phi_x(L) \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} c_y \\ c_x \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix} \qquad (14)$$

where the usual conditions on the roots of the autoregressive characteristic polynomial are satisfied. We can hence choose a normalization of the type

$$\Phi(0) = \begin{bmatrix} \Phi_y(0) & \Phi_{yx}(0) \\ \Phi_{xy}(0) & \Phi_x(0) \end{bmatrix} = I$$

Therefore, in this case, all simultaneous links between the two processes are summarized in the covariance $cov(\varepsilon_{yt}, \varepsilon_{xt})$.