SPURIOUS REGRESSION
If we regress a series y with a unit root on regressors
that also have unit roots, the usual t-tests on the
regression coefficients indicate statistically significant
relationships even when, in reality, there are none.
The spurious regression problem can appear even with
I(0) series (see Granger, Hyung and Jeon (1998)).
This tells us that the problem comes from
using the WRONG CRITICAL VALUES!
In a spurious regression the errors are serially
correlated, and the standard t-statistic is
wrongly calculated because the variance of the
errors is not consistently estimated. In the I(0) case
the solution is to studentize with the long-run variance:

  t* = b^ / (long-run variance of b^)^{1/2} ~ t-distribution.
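To make the failure concrete, here is a minimal simulation (a sketch using numpy; all variable names are illustrative): two independent random walks, regressed on each other with conventional iid-error standard errors, routinely produce "significant" t-statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two independent random walks: both I(1), and truly unrelated.
y = np.cumsum(rng.standard_normal(n))
x = np.cumsum(rng.standard_normal(n))

# OLS of y on a constant and x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Conventional (iid-error) standard error of the slope -- the wrong one here,
# because the residuals of a spurious regression are highly autocorrelated.
s2 = resid @ resid / (n - 2)
se_slope = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
t_slope = beta[1] / se_slope
print(f"slope t-statistic: {t_slope:.2f}")
```

Repeating this over many seeds, |t| > 1.96 occurs far more often than the nominal 5%, which is exactly the wrong-critical-values problem described above.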
SPURIOUS REGRESSION
How do we detect a Spurious Regression
(between I(1) series)?
By looking at the correlogram of the residuals and
by testing for a unit root in them.
How do we convert a Spurious Regression into
a valid regression?
By taking differences.
Does this solve the SPR problem?
It solves the statistical problems but not the
economic interpretation of the regression. Note that
by taking differences we are losing information, and
that a regression involving growth rates does not
contain the same information as a regression
involving the levels of the variables.
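A quick check of the differencing fix (again a numpy sketch with illustrative names): the same regression run on first differences yields a slope t-statistic that behaves like a standard normal draw.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
y = np.cumsum(rng.standard_normal(n))  # independent random walks
x = np.cumsum(rng.standard_normal(n))

def slope_t(yv, xv):
    """Conventional t-statistic of the slope in OLS of yv on [1, xv]."""
    m = len(yv)
    X = np.column_stack([np.ones(m), xv])
    b, *_ = np.linalg.lstsq(X, yv, rcond=None)
    r = yv - X @ b
    s2 = r @ r / (m - 2)
    return b[1] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])

t_levels = slope_t(y, x)                   # spurious: |t| is typically large
t_diffs = slope_t(np.diff(y), np.diff(x))  # valid: |t| behaves like N(0, 1)
print(t_levels, t_diffs)
```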
SPURIOUS REGRESSION
Typical symptoms: high R2, t-values and F-value, but low DW.
1. Egyptian infant mortality rate (Y), 1971-1990, annual
data, on Gross aggregate income of American farmers
(I) and Total Honduran money supply (M)
Y^ = 179.9 - .2952 I - .0439 M,  R2 = .918, DW = .4752, F = 95.17
     (16.63) (-2.32)  (-4.26)    Corr = .8858, -.9113, -.9445
2. US Export Index (Y), 1960-1990, annual data, on
Australian males' life expectancy (X)
Y^ = -2943. + 45.7974 X,  R2 = .916, DW = .3599, F = 315.2
     (-16.70) (17.76)     Corr = .9570
SPURIOUS REGRESSION
3. US Defense Expenditure (Y), 1971-1990, annual data,
on Population of South Africa (X)
Y^ = -368.99 + .0179 X,  R2 = .940, DW = .4069, F = 280.69
     (-11.34)  (16.75)   Corr = .9694
4. Total Crime Rates in the US (Y), 1971-1991, annual data,
on Life expectancy of South Africa (X)
Y^ = -24569 + 628.9 X,  R2 = .811, DW = .5061, F = 81.72
     (-6.03)  (9.04)    Corr = .9008
5. Population of South Africa (Y), 1971-1990, annual data,
on Total R&D expenditure in the US (X)
Y^ = 21698.7 + 111.58 X,  R2 = .974, DW = .3037, F = 696.96
     (59.44)   (26.40)    Corr = .9873
SPURIOUS REGRESSION
Does a regression between two I(1) variables
make sense?
Yes, if the regression errors are I(0).
Can this be possible?
David Hendry once asked Clive Granger the same
question. Clive answered NO WAY! but he also said
that he would think about it. On the plane trip back
home to San Diego, Clive thought about it and
concluded that YES, IT IS POSSIBLE. It is possible
when both variables share the same source of the
I(1)-ness (co-I(1)), when both variables move
together in the long run (co-move), ... when both
variables are COINTEGRATED!
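A small numpy sketch of this idea (names illustrative): two I(1) series built from one common stochastic trend wander individually, but a suitable linear combination eliminates the trend and is I(0).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# One common source of I(1)-ness shared by both series.
trend = np.cumsum(rng.standard_normal(n))
y1 = trend + rng.standard_normal(n)        # I(1)
y2 = 0.5 * trend + rng.standard_normal(n)  # I(1)

# The cointegrating combination y1 - 2*y2 cancels the common trend:
z = y1 - 2.0 * y2   # = noise1 - 2*noise2, an I(0) series
print(np.std(z))    # stays bounded, unlike the levels of y1 and y2
```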
COINTEGRATION
An m x 1 vector time series Yt is said to
be cointegrated of order (d, b), CI(d, b),
where 0 < b <= d, if each of its component
series Yit is I(d) but some linear
combination a'Yt of the series is I(d - b)
for some nonzero constant vector a.
a is the cointegrating vector, or the
long-run parameter, and it is not unique.
The most common case is d = b = 1.
COINTEGRATION
More generally, if the mx1 vector series Yt
contains more than two components, each
being I(1), then there may exist k (<m) linearly
independent 1xm vectors 1, 2,, k, such
that Yt is a nonstationary kx1 vector process
where
1 , , k
EXAMPLE
Consider the following system of processes

  Yt = B Y_{t-1} + at,   or   dYt = -(I - B) Y_{t-1} + at.

Suppose we have a unit root, so that P = -(I - B) is
singular. More generally, for a VAR(p) we can write

  dYt = P Y_{t-1} + Sum_{i=1}^{p-1} G_i dY_{t-i} + at.

This is like a multivariate version of the
augmented Dickey-Fuller test. When P has reduced
rank k, it factors as P = a b', and

  dYt = a b' Y_{t-1} + Sum_{i=1}^{p-1} G_i dY_{t-i} + at

is called a vector error correction model (VECM).
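The algebra above can be checked numerically. Below is a small numpy sketch (the matrix B is an illustrative choice, not from the slides): a VAR(1) with one unit root yields a reduced-rank coefficient matrix Pi = B - I, which is exactly what makes the VECM representation possible.

```python
import numpy as np

# Illustrative bivariate VAR(1): Y_t = B Y_{t-1} + a_t, with one unit root.
B = np.array([[1.0, 0.0],
              [0.5, 0.5]])

# VECM form: dY_t = Pi Y_{t-1} + a_t with Pi = B - I.
Pi = B - np.eye(2)

eigvals = np.linalg.eigvals(B)
rank_Pi = np.linalg.matrix_rank(Pi)
print("eigenvalues of B:", eigvals)   # one of them equals 1 (the unit root)
print("rank of Pi:", rank_Pi)         # one cointegrating relation

# Pi factors as alpha @ beta.T with alpha = [0, 0.5]', beta = [1, -1]':
alpha = np.array([[0.0], [0.5]])
beta = np.array([[1.0], [-1.0]])
assert np.allclose(Pi, alpha @ beta.T)
```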
Consider the regression

  Y1t = g Y2t + ut.

You can apply a unit root test to the estimated OLS
residual u^t from estimation of the above equation,
to check whether u^t ~ I(0):

H0: r = 1 vs H1: r < 1 for the model

  u^t = r u^_{t-1} + Sum_{j=1}^{p-1} f_j du^_{t-j} + at,

or, equivalently,

H0: r* = 0 vs H1: r* < 0 for the model

  du^t = r* u^_{t-1} + Sum_{j=1}^{p-1} f_j du^_{t-j} + at.

Because u^t is an estimated residual, the standard
Dickey-Fuller critical values do not apply; the
Engle-Granger critical values must be used instead.
In the Johansen framework the VAR(p) is written in
error-correction form:

  dxt = F dt + Sum_{j=1}^{p-1} G_j dx_{t-j} + P x_{t-p} + at,   t = 1, ..., T,

where dxt = xt - x_{t-1}; dt is a vector of
deterministic variables, such as constant
and seasonal dummy variables;
G_j = -(I - B_1 - ... - B_j), j = 1, ..., p-1,
are m x m parameter matrices; P = -(I - B_1 - ... - B_p) = A C',
where A and C are m x k
parameter matrices; the at are i.i.d. Nm(0, S)
errors; and
det(I - B_1 z - ... - B_p z^p)
has all of its roots outside the
unit circle or equal to one.
The hypotheses are nested in the cointegrating rank:

  H_0 c ... c H_k c ... c H_m,

where H_k is the hypothesis that rank(P) <= k.
Stacking the regressors, write the model as

  Z0t = G Z1t + P Zpt + at,

where Z0t = dxt, Z1t = (dx'_{t-1}, ..., dx'_{t-p+1})',
Zpt = x_{t-p}, and G = (G_1, ..., G_{p-1}).
The likelihood ratio statistic for the hypothesis
H0: P = A C' (that is, rank(P) <= k) is given by

  -2 ln L = -n Sum_{i=k+1}^{m} ln(1 - l^_i),

where the l^_i denote the eigenvalues of
S_p0 S_00^{-1} S_0p with respect to S_pp,
ordered so that l^_1 >= l^_2 >= ... >= l^_m >= 0.
Here

  S_ij = M_ij - M_i1 M_11^{-1} M_1j,   where
  M_ij = n^{-1} Sum_{t=1}^{n} Z_it Z'_jt,   i, j = 0, 1, p.
The limiting distribution of the statistic is a
functional of Brownian motion:

  tr{ (Int_0^1 dY Y') (Int_0^1 Y Y' dt)^{-1} (Int_0^1 Y dY') },

where Y is an (m - k)-dimensional standard Brownian motion.
Analysis of U.S.
Economic Variables
(From SAS Online Doc)
SAS Code
symbol1 v=none height=1 c=black;
symbol2 v=none height=1 c=black;
title 'Analysis of U.S. Economic Variables';
data us_money;
   date=intnx( 'qtr', '01jan54'd, _n_-1 );
   format date yyq.;
   input y1 y2 y3 y4 @@;
   y1=log(y1);
   y2=log(y2);
   label y1='log(real money stock M1)'
         y2='log(GNP in bil. of 1982 dollars)'
         y3='Discount rate on 91-day T-bills'
         y4='Yield on 20-year Treasury bonds';
datalines;
... data lines omitted ...
;
legend1 across=1 frame label=none;
SAS Output
This example performs the Dickey-Fuller
test for stationarity, the Johansen
cointegration rank test allowing for
integration of order 2, and the weak
exogeneity test. A VECM(2) fits the data.
From the outputs shown below, you can see
that the series have unit roots and are
cointegrated with rank 1 and integration
order 1. The fitted VECM(2) estimates are
reported in the output.
SAS Output
Dickey-Fuller Unit Root Tests

Variable  Type          Rho      Pr < Rho   Tau     Pr < Tau
y1        Zero Mean      0.05    0.6934      1.14   0.9343
y1        Single Mean   -2.97    0.6572     -0.76   0.8260
y1        Trend         -5.91    0.7454     -1.34   0.8725
y2        Zero Mean      0.13    0.7124      5.14   0.9999
y2        Single Mean   -0.43    0.9309     -0.79   0.8176
y2        Trend         -9.21    0.4787     -2.16   0.5063
y3        Zero Mean     -1.28    0.4255     -0.69   0.4182
y3        Single Mean   -8.86    0.1700     -2.27   0.1842
y3        Trend        -18.97    0.0742     -2.86   0.1803
y4        Zero Mean      0.40    0.7803      0.45   0.8100
y4        Single Mean   -2.79    0.6790     -1.29   0.6328
y4        Trend        -12.12    0.2923     -2.33   0.4170
SAS Output
Cointegration Rank Test for I(2)

r\k-r-s       4           3           2          1       Trace of I(1)   5% CV of I(1)
0         219.6239    89.21508    27.32609      ...         55.9633         47.21
1                     73.61779    22.13279      ...         20.6542         29.38
2                                 38.29435      ...          2.6477         15.34
3                                              3.84000       0.0149          3.84
SAS Output
Long-Run Parameter Beta Estimates When RANK=1

Variable        1
y1         1.00000
y2        -0.46458
y3        14.51619
y4        -9.35520

Adjustment Coefficient Alpha Estimates When RANK=1

Variable        1
y1        -0.01396
y2        -0.02811
y3        -0.00215
y4         0.00510
Diagnostic Checks
Schematic Representation of Cross Correlations of Residuals

Variable/Lag    0     1     2     3     4     5     6
y1            ++..  ....  ++..  ....  +...  ..--  ....
y2            ++++  ....  ....  ....  ....  ....  ....
y3            .+++  ....  +.-.  ..++  -...  ....  ....
y4            .+++  ....  ....  ..+.  ....  ....  ....

+ is > 2*std error, - is < -2*std error, . is in between

Portmanteau Test for Cross Correlations of Residuals

Up To Lag   DF   Chi-Square   Pr > ChiSq
    3       16      53.90       <.0001
    4       32      74.03       <.0001
    5       48     103.08       <.0001
    6       64     116.94       <.0001
Diagnostic Checks
Univariate Model ANOVA Diagnostics

Variable   R-Square   Standard Deviation   F Value   Pr > F
y1          0.6754         0.00712          32.51    <.0001
y2          0.3070         0.00843           6.92    <.0001
y3          0.1328         0.00807           2.39    0.0196
y4          0.0831         0.00403           1.42    0.1963

Univariate Model White Noise Diagnostics

Variable   Durbin-Watson   Normality Chi-Square   Pr > ChiSq   ARCH F Value   Pr > F
y1            2.13418              7.19             0.0275         1.62       0.2053
y2            2.04003              1.20             0.5483         1.23       0.2697
y3            1.8689             253.76             <.0001         1.78       0.1847
Diagnostic Checks
Univariate Model AR Diagnostics

               AR1               AR2               AR3               AR4
Variable   F Value  Pr > F   F Value  Pr > F   F Value  Pr > F   F Value  Pr > F
y1           0.68   0.4126     2.98   0.0542     2.01   0.1154     2.48   0.0473
y2           0.05   0.8185     0.12   0.8842     0.41   0.7453     0.30   0.8762
y3           0.56   0.4547     2.86   0.0610     4.83   0.0032     3.71   0.0069
y4           0.01   0.9340     0.16   0.8559     1.21   0.3103     0.95   0.4358
Diagnostic Checks
Testing Weak Exogeneity of Each Variable

Variable   DF   Chi-Square   Pr > ChiSq
y1          1      6.55        0.0105
y2          1     12.54        0.0004
y3          1      0.09        0.7695
y4          1      1.81        0.1786