1. What is heteroscedasticity?
Recall that for the coefficient estimates and regression inference to be correct, the following assumptions must hold:
1. The equation is correctly specified
2. The error term has zero mean
3. The error term has constant variance
4. The error term is not autocorrelated
5. The explanatory variables are fixed
6. There is no linear relationship between the RHS variables
Case   Residual
1        -174.0
2        1971.3
3      -37638.2
4       18116.5
5       84261.8
6        9261.8
7       17116.5
8      -38238.2
9        -883.5
10       3971.3
[Figure: purchase price plotted against number of rooms, with positive (+ive) and negative (-ive) residuals marked.]
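$$
\operatorname{cov}(u_1, u_2, \ldots, u_n) =
\begin{pmatrix}
\operatorname{var}(u_1) & \operatorname{cov}(u_1, u_2) & \cdots & \operatorname{cov}(u_1, u_n) \\
\operatorname{cov}(u_2, u_1) & \operatorname{var}(u_2) & \cdots & \operatorname{cov}(u_2, u_n) \\
\vdots & & \ddots & \vdots \\
\operatorname{cov}(u_n, u_1) & \operatorname{cov}(u_n, u_2) & \cdots & \operatorname{var}(u_n)
\end{pmatrix}
=
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
$$
where $\sigma^2$ is a scalar.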
[Figure: purchase price plotted against number of rooms (8 to 14), with positive (+ive) and negative (-ive) residuals marked.]
Group Statistics: Unstandardized Residual, split at 3 rooms

Number of rooms   N     Mean        Std. Deviation   Std. Error Mean
>= 3              669   -680.676    31647.60         1223.567
< 3                96   4743.461    15024.51         1533.433

Group Statistics: Unstandardized Residual, split at 4 rooms

Number of rooms   N     Mean        Std. Deviation   Std. Error Mean
>= 4              452   -1575.28    36020.35         1694.255
< 4               313   2274.843    18350.73         1037.245
2. Causes
What might cause the variance of the residuals to change over the course of the sample?
The error term may be correlated with:
the dependent variable and/or the explanatory variables in the model,
some combination (linear or non-linear) of all the variables in the model,
or variables that should be in the model but are not.
But why?
(iii) Non-linearities
If the true relationship is non-linear:
$$y_i = a + b x_i^2 + u_i$$
but we estimate the linear regression:
$$y_i = a + b x_i + v_i$$
then the residual in this estimated regression will capture the non-linearity and its variance will be affected accordingly:
$$v_i = f(x_i^2, u_i)$$
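The effect of fitting a line to a quadratic relationship can be illustrated with a small simulation. This is only a sketch: the coefficients (1.0 and 0.5), the range of x, the sample size and the band boundaries are all assumed for illustration and do not come from the lecture data. The true error u is homoscedastic, yet the size of the residuals from the linear fit depends strongly on x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 10.0, n)
u = rng.normal(0.0, 1.0, n)        # true error: constant variance of 1
y = 1.0 + 0.5 * x**2 + u           # true relationship is quadratic (assumed values)

# Fit the misspecified linear model y = a + b*x + v by OLS.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
v = y - X @ coef                   # residuals absorb the omitted curvature

# Mean squared residual inside three (illustrative) bands of x.
bands = [(0.0, 1.0), (1.6, 2.6), (4.5, 5.5)]
msr = [float(np.mean(v[(x >= lo) & (x < hi)] ** 2)) for lo, hi in bands]
print([round(m, 2) for m in msr])
```

The mean squared residual differs sharply between bands even though u has the same variance everywhere, which is the pattern a residual-based heteroscedasticity check would pick up.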
(iv) Aggregation
Sometimes we aggregate our data across groups:
e.g. quarterly time series data on income = average income of a
group of households in a given quarter
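One way to see why aggregation matters: if individual errors are independent with a common variance sigma^2, the average over a group of m households has variance sigma^2 / m, so groups of different sizes have different error variances. The sketch below checks this by simulation; the value of sigma, the group sizes and the number of replications are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0                              # assumed s.d. of individual errors
group_sizes = np.array([5, 50, 500])     # assumed numbers of households per group
reps = 4000

# For each group size, simulate many group averages and measure their variance.
var_of_mean = np.array([
    rng.normal(0.0, sigma, size=(reps, m)).mean(axis=1).var()
    for m in group_sizes
])
theory = sigma**2 / group_sizes          # variance of an average of m independent errors

for m, sim, th in zip(group_sizes, var_of_mean, theory):
    print(m, round(float(sim), 4), round(float(th), 4))
```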
3. Consequences
Heteroscedasticity by itself does not cause OLS
estimators to be biased or inconsistent*
NB: neither bias nor consistency is determined by the
covariance matrix of the error term.
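The unbiasedness claim can be checked with a small Monte Carlo experiment. This is a sketch under assumed values: the true intercept and slope, the range of x and the form of the heteroscedasticity (error standard deviation proportional to x) are illustrative choices, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 2000
true_a, true_b = 1.0, 2.0                 # assumed true coefficients
x = rng.uniform(1.0, 10.0, n)             # regressor held fixed across samples
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

slopes = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, 1.0, n) * x       # heteroscedastic: s.d. proportional to x
    y = true_a + true_b * x + u
    slopes[r] = (XtX_inv @ X.T @ y)[1]    # OLS slope for this sample

mean_slope = float(slopes.mean())
print(round(mean_slope, 3))               # close to the true slope
```

The average of the OLS slopes stays close to the true slope despite the non-constant error variance.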
[Two figures: sampling distributions of the estimator (horizontal axis labelled "hat" in the extraction, presumably $\hat{\beta}$) for sample sizes n = 150, 200, 300, 500 and 1,000.]
4. Detection
Q/ How can we tell whether our model suffers from
heteroscedasticity?
$$\frac{SSR_L}{n_L - k} > \frac{SSR_S}{n_S - k}$$
Compute
$$G = \frac{SSR_L / (n_L - k)}{SSR_S / (n_S - k)}$$
(6) Compare G to the critical value for an F-statistic
with $(n_L - k)$ and $(n_S - k)$ degrees of freedom.
Goldfeld-Quandt Test:
An Example (cont.)
(1) Divide the n observations into men and women,
of sizes $n_m$ and $n_w$.
(2) We have only two groups, so choose both of them.
$H_0: \sigma_m^2 = \sigma_w^2$ against $H_a: \sigma_m^2 \neq \sigma_w^2$
(3) For the men, regress
$$\log(income)_i = \beta_0 + \beta_1 ed_i + \beta_2 exp_i + \beta_3 exp_i^2 + \varepsilon_i$$
(4) For the women, regress
$$\log(income)_i = \beta_0 + \beta_1 ed_i + \beta_2 exp_i + \beta_3 exp_i^2 + v_i$$
(5)
$$s_m^2 = \frac{SSR_m}{n_m - k} = \frac{1736.64}{3394 - 4} = 0.5123$$
$$s_w^2 = \frac{SSR_w}{n_w - k} = \frac{1851.52}{3146 - 4} = 0.5893$$
Compute
$$G = \frac{0.5893}{0.5123} = 1.15$$
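The arithmetic in step (5) can be reproduced in a few lines. The function name and argument names below are hypothetical; the numbers are the ones from the example, with the women's group supplying the larger estimated variance.

```python
def goldfeld_quandt_g(ssr_large, n_large, ssr_small, n_small, k):
    """Ratio of the two estimated error variances, larger one on top."""
    s2_large = ssr_large / (n_large - k)
    s2_small = ssr_small / (n_small - k)
    return s2_large / s2_small

# Figures from the example: women (larger variance) over men (smaller variance).
g = goldfeld_quandt_g(ssr_large=1851.52, n_large=3146,
                      ssr_small=1736.64, n_small=3394, k=4)
print(round(g, 2))  # 1.15
```

G would then be compared with the F critical value for (3146 - 4) and (3394 - 4) degrees of freedom.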
Assumes that:
$$\sigma_i^2 = a_1 + a_2 z_2 + a_3 z_3 + a_4 z_4 + \ldots + a_m z_m \qquad [1]$$
where the z's are all independent variables. The z's can be some
or all of the original regressors, some other variables, or
some transformation of the original regressors which you
think causes the heteroscedasticity:
e.g. $\sigma_i^2 = a_1 + a_2 \exp(x_1) + a_3 x_3^2 + a_4 x_4$
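A test built on equation [1] can be sketched as an auxiliary regression of the squared OLS residuals on the z's. The version below uses the n times R-squared form of the statistic (the studentized variant associated with Koenker), which is an assumption about which variant is intended; the function name and the simulated data are hypothetical.

```python
import numpy as np

def aux_regression_nr2(resid, Z):
    """n * R^2 from regressing squared residuals on a constant and the z's."""
    resid = np.asarray(resid, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(resid), -1)
    e2 = resid ** 2
    W = np.column_stack([np.ones(len(e2)), Z])
    coef, *_ = np.linalg.lstsq(W, e2, rcond=None)
    fitted = W @ coef
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return len(e2) * r2

# Simulated example (assumed values): error s.d. grows with x.
rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * x
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
stat = aux_regression_nr2(y - X @ b, x)   # here the only z is x itself
print(round(float(stat), 2))
```

Under the null of constant variance the statistic is compared with a chi-square critical value whose degrees of freedom equal the number of z's (3.84 at the 5% level for one z).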
$\sim \chi^2_P$
Problems:
Note that although t-tests become reliable when you
use White's standard errors, F-tests are still not
reliable (so Chow's first test is still not reliable).
White's SEs have been found to be unreliable in small
samples,
but revised methods have been developed
to allow robust SEs to be calculated for small n.
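For reference, the basic form of White's standard errors can be sketched as below. This is the plain "HC0" sandwich formula, (X'X)^-1 X' diag(e^2) X (X'X)^-1, with no small-sample correction; the function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def ols_with_white_se(X, y):
    """OLS coefficients with conventional and White (HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    conventional = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - k))
    meat = X.T @ (X * (e ** 2)[:, None])          # X' diag(e^2) X
    white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, conventional, white

# Simulated example (assumed values): error s.d. proportional to x squared.
rng = np.random.default_rng(4)
n = 2000
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * x**2
X = np.column_stack([np.ones(n), x])
beta, conventional, white = ols_with_white_se(X, y)
print(conventional[1], white[1])
```

In this simulated design the conventional slope standard error understates the uncertainty relative to the robust one.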
Heteroskedasticity Tests
Glejser test
This makes sense conceptually: you are testing to see if
one of your independent variables is significantly
related to the variance of your residuals.
$$|\hat{u}_i| = b_0 + b_1 X_i + e_i$$
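A minimal sketch of the Glejser regression, assuming the common form in which the absolute OLS residuals are regressed on one explanatory variable and the t-statistic on the slope is inspected. The function name and the simulated data are hypothetical.

```python
import numpy as np

def glejser_t(resid, x):
    """t-statistic on b1 in |resid_i| = b0 + b1 * x_i + e_i."""
    a = np.abs(np.asarray(resid, dtype=float))
    x = np.asarray(x, dtype=float)
    W = np.column_stack([np.ones(len(x)), x])
    WtW_inv = np.linalg.inv(W.T @ W)
    b = WtW_inv @ W.T @ a
    e = a - W @ b
    s2 = e @ e / (len(x) - 2)
    return b[1] / np.sqrt(s2 * WtW_inv[1, 1])

# Simulated example (assumed values): error s.d. proportional to x.
rng = np.random.default_rng(5)
n = 500
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * x
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
t_stat = glejser_t(y - X @ coef, x)
print(round(float(t_stat), 2))
```

A large t-statistic suggests that the spread of the residuals is related to x.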
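$$Y_i = \beta X_i + \varepsilon_i$$
$$E(\varepsilon_i) = 0$$
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 d_i^2$$
$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j$$
$X_i$, $d_i$ fixed across samples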
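Choose weights $w_j$ to minimise $\sum_j w_j^2 d_j^2$
such that
$$\sum_j w_j X_j = 1$$
Solution:
$$w_i = \frac{X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$
$$\hat{\beta}_{GLS} = \frac{\sum_i \frac{Y_i}{d_i} \frac{X_i}{d_i}}{\sum_j \left( \frac{X_j}{d_j} \right)^2}$$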
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 d_i^2$$
$\frac{X_i}{d_i}$ fixed across samples.
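1. First, divide $Y_i$, $X_i$ by $d_i$
2. Apply OLS to $\frac{Y_i}{d_i}$, $\frac{X_i}{d_i}$:
$$\hat{\beta}_{GLS} = \frac{\sum_i \frac{Y_i}{d_i} \frac{X_i}{d_i}}{\sum_j \left( \frac{X_j}{d_j} \right)^2}$$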
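The two steps can be sketched for the line-through-the-origin model with Var(error_i) = sigma^2 d_i^2. The data are simulated and the choice d_i = sqrt(X_i) is an assumption made purely for illustration; the closed-form ratio is checked against a generic least-squares solve on the transformed data.

```python
import numpy as np

rng = np.random.default_rng(6)
n, beta = 400, 3.0                            # assumed sample size and true slope
X = rng.uniform(1.0, 10.0, n)
d = np.sqrt(X)                                # assumed: error s.d. proportional to sqrt(X)
Y = beta * X + rng.normal(0.0, 1.0, n) * d

# Step 1: divide Y and X by d.
Y_star, X_star = Y / d, X / d

# Step 2: OLS through the origin on the transformed data.
beta_gls = float(np.sum(X_star * Y_star) / np.sum(X_star ** 2))

# Cross-check with a generic least-squares solver on the same transformed data.
beta_lstsq = float(np.linalg.lstsq(X_star[:, None], Y_star, rcond=None)[0][0])
print(round(beta_gls, 3), round(beta_lstsq, 3))
```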
$$rent_i = \beta_0 + \beta_1 income_i + \varepsilon_i$$
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 income_i^2$$
$$\frac{rent_i}{income_i} = \beta_0 \frac{1}{income_i} + \beta_1 + v_i$$
TABLE 10.1
Rent and Income in New York
Checking Understanding
An initial guess:
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 income_i^2$$
$$d_i = income_i$$
$$\frac{rent_i}{income_i} = \beta_0 \frac{1}{income_i} + \beta_1 + v_i$$
Checking Understanding
If we have the correct model of heteroskedasticity,
then OLS with the transformed data should be
homoskedastic.
$$\frac{rent_i}{income_i} = \beta_0 \frac{1}{income_i} + \beta_1 + v_i$$
$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{income_i} + \alpha_2 \frac{1}{income_i^2} + \eta_i$$
$$nR^2 = 7.17$$
The critical value at the 0.05 significance level for a
Chi-square statistic with 2 degrees of freedom is 5.99
We reject the null hypothesis.
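The comparison can be checked by hand because a chi-square variable with 2 degrees of freedom has upper-tail probability exp(-x/2). The sketch below uses that closed form, which holds only for 2 degrees of freedom; the function names are hypothetical.

```python
import math

def chi2_2df_critical(alpha):
    """Upper-tail critical value of a chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

def chi2_2df_pvalue(stat):
    """Upper-tail probability of a chi-square with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

crit = chi2_2df_critical(0.05)
p = chi2_2df_pvalue(7.17)
print(round(crit, 2), round(p, 3))  # 5.99 0.028
```

Since 7.17 exceeds 5.99 (equivalently, the p-value is below 0.05), the null of homoskedasticity is rejected at the 0.05 level.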
GLS: An Example
Our initial guess:
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 income_i^2$$
This guess didn't do very well. Can we do better?
GLS: An Example
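We seem to have overcorrected
for heteroskedasticity.
Let's try
$$\operatorname{Var}(\varepsilon_i) = \sigma^2 income_i$$
$$\frac{rent_i}{\sqrt{income_i}} = \beta_0 \frac{1}{\sqrt{income_i}} + \beta_1 \sqrt{income_i} + v_i$$
In general form: $\operatorname{var}(\varepsilon_i) = \sigma^2 X_i$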
GLS: An Example
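Unthinking application of the White test
procedures for the transformed data leads to
$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{\sqrt{income_i}} + \alpha_2 \frac{1}{income_i} + \alpha_3 \sqrt{income_i} + \alpha_4 income_i + \alpha_5 \frac{1}{\sqrt{income_i}} \sqrt{income_i} + \eta_i$$
The last regressor, the cross product, equals 1 for every observation, so it is perfectly collinear with the intercept.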
5. Solutions
A. Weighted Least Squares
B. Maximum likelihood estimation. (not covered)
C. White's Standard Errors
Problems:
Wrong choice of weights can produce biased estimates of the standard
errors.
Since we can never know for sure whether we have chosen the correct
weights, this is a real problem.
If the weights are correlated with the disturbance term, then the WLS
slope estimates will be inconsistent.
Also: Dickens (1990) found that errors in grouped data may be
correlated within groups, so that weighting by the square root of the
group size may be inappropriate. See Binkley (1992) for an assessment of
tests of grouped heteroscedasticity.
Summary
(1) Causes
(2) Consequences
(3) Detection
(4) Solutions
Reading:
Kennedy (1998) A Guide to Econometrics, Chapters 5, 6, 7 & 9
Maddala, G.S. (1992) Introduction to Econometrics, Chapter 12
Field, A. (2000) Chapter 4, particularly pages 141-162.
Greene, W.H. (1990) Econometric Analysis
Grouped Heteroscedasticity:
Binkley, J.K. (1992) Finite Sample Behaviour of Tests for Grouped
Heteroskedasticity, Review of Economics and Statistics, 74, 563-8.
Dickens, W.T. (1990) Error components in grouped data: is it ever
worth weighting?, Review of Economics and Statistics, 72, 328-33.
Breusch-Pagan critique:
Koenker, R. (1981) A Note on Studentizing a Test for
Heteroscedasticity, Journal of Econometrics, 17, 107-12.