
HETEROSKEDASTICITY

Dahlan Tampubolon, Ph.D.

1. What is heteroscedasticity?
Recall that for estimation of coefficients and for
regression inference to be correct:
1. Equation is correctly specified
2. Error Term has zero mean
3. Error Term has constant variance
4. Error Term is not autocorrelated
5. Explanatory variables are fixed
6. No linear relationship between RHS variables

When assumption 3 holds, i.e. the errors u_i in the regression equation have common (constant, scalar) variance,

$$\operatorname{Var}(u_i) = \sigma^2 \quad \text{for all } i,$$

then we have homoscedasticity, or equivalently a scalar error covariance matrix:

$$E(uu') = \sigma^2 I$$

When assumption 3 breaks down, i.e.

$$\operatorname{Var}(u_i) = \sigma_i^2,$$

we have what is known as heteroscedasticity, or a non-scalar error covariance matrix (which can also be caused by a failure of assumption 4).

Recall that the residual for each observation i is the vertical distance between the observed value of the dependent variable and the predicted value of the dependent variable, i.e. the difference between the observed value and the line-of-best-fit value:

$$\hat{u}_i = y_i - \hat{y}_i$$

Case    Price     Predicted Price   Residual
1        19000    19174.0             -174.0
2        30000    28028.7             1971.3
3         8100    45738.2           -37638.2
4        55000    36883.5            18116.5
5       130000    45738.2            84261.8
6        55000    45738.2             9261.8
7        54000    36883.5            17116.5
8         7500    45738.2           -38238.2
9        36000    36883.5             -883.5
10       32000    28028.7             3971.3

N.B. Predicted price is the value on the regression line that corresponds to the value of the explanatory variable (in this case, No. of rooms) for a particular observation.
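To make the arithmetic concrete, here is a minimal sketch that reproduces the table's fitted values and residuals. The room counts and coefficients are inferred from the fitted values above (predicted price rises by about 8854.7 per room), so treat them as illustrative:

```python
import numpy as np

# Inferred from the table: slope ~8854.7 per room, intercept ~1464.6
# (both hypothetical, backed out from the predicted prices above).
intercept, slope = 1464.6, 8854.7

rooms = np.array([2, 3, 5, 4, 5, 5, 4, 5, 4, 3])
price = np.array([19000, 30000, 8100, 55000, 130000,
                  55000, 54000, 7500, 36000, 32000])

predicted = intercept + slope * rooms   # value on the regression line
residual = price - predicted            # observed minus fitted

for case, (p, f, r) in enumerate(zip(price, predicted, residual), 1):
    print(f"{case:>4} {p:>8} {f:>10.1f} {r:>10.1f}")
```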

(Assume that this represents multiple observations of y for each given value of x.)

[Figure: scatter plot of purchase price (vertical axis, -100,000 to 400,000) against number of rooms (horizontal axis, up to 14), with the fitted regression line and examples of positive (+ive) and negative (-ive) residuals marked.]

Homoskedasticity means the variance of the error term is constant for each observation:

$$\operatorname{Cov}(u_1, u_2, \ldots, u_n) =
\begin{pmatrix}
\operatorname{var}(u_1) & \operatorname{cov}(u_1,u_2) & \cdots & \operatorname{cov}(u_1,u_n) \\
\operatorname{cov}(u_2,u_1) & \operatorname{var}(u_2) & \cdots & \operatorname{cov}(u_2,u_n) \\
\vdots & & \ddots & \vdots \\
\operatorname{cov}(u_n,u_1) & \operatorname{cov}(u_n,u_2) & \cdots & \operatorname{var}(u_n)
\end{pmatrix}
=
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}$$

where σ² is a scalar.

Each one of the residuals has a sampling distribution, and each should have the same variance (homoscedasticity). Clearly this is not the case within this sample, and so it is unlikely to be true across samples:

[Figure: the same scatter plot of purchase price against number of rooms; the spread of the residuals around the fitted line visibly increases with the number of rooms.]

This is confirmed when we look at the standard deviation of the residual for different parts of the sample.

Group Statistics (unstandardized residual, split at 3 rooms):

Number of rooms    N     Mean       Std. Deviation   Std. Error Mean
>= 3               669   -680.676   31647.60         1223.567
< 3                 96   4743.461   15024.51         1533.433

Group Statistics (unstandardized residual, split at 4 rooms):

Number of rooms    N     Mean       Std. Deviation   Std. Error Mean
>= 4               452   -1575.28   36020.35         1694.255
< 4                313   2274.843   18350.73         1037.245
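A split-sample check like the SPSS Group Statistics tables above can be reproduced with pandas; a minimal sketch (the file name houses.csv and the column names price and rooms are hypothetical):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical dataset with columns "price" and "rooms".
df = pd.read_csv("houses.csv")

fit = sm.OLS(df["price"], sm.add_constant(df["rooms"])).fit()
df["resid"] = fit.resid

# N, mean, standard deviation, and standard error of the mean of the
# unstandardized residuals on each side of the split point.
for split in (3, 4):
    groups = df.groupby(df["rooms"] >= split)["resid"]
    print(groups.agg(["count", "mean", "std", "sem"]))
```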

2. Causes
What might cause the variance of the residuals to change over the course of the sample?
The error term may be correlated with:
- the dependent variable and/or the explanatory variables in the model,
- some combination (linear or non-linear) of all the variables in the model, or
- variables that should be in the model but are omitted.

But why?

(i) Non-constant coefficient

Suppose that the slope coefficient varies across i:

$$y_i = a + b_i x_i + u_i$$

and suppose that it varies randomly around some fixed value b:

$$b_i = b + \varepsilon_i$$

Then the regression actually estimated by SPSS will be:

$$y_i = a + (b + \varepsilon_i) x_i + u_i = a + b x_i + (\varepsilon_i x_i + u_i)$$

where (ε_i x_i + u_i) is the error term in the SPSS regression. The error term will thus vary with x.
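A short simulation (with arbitrary illustrative parameter values) confirms that, with a random slope, the spread of the composite error ε_i x_i + u_i grows with x:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1, 10, n)

a, b = 2.0, 3.0                  # fixed intercept and mean slope
eps = rng.normal(0, 0.5, n)      # random part of the slope: b_i = b + eps_i
u = rng.normal(0, 1.0, n)

y = a + (b + eps) * x + u
composite = y - (a + b * x)      # = eps*x + u, the error OLS actually sees

# Sample variance of the composite error for small vs large x:
print(composite[x < 4].var(), composite[x > 7].var())  # second is much larger
```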

(ii) Omitted variables

Suppose the true model of y is:

$$y_i = a + b_1 x_i + b_2 z_i + u_i$$

but the model we estimate fails to include z:

$$y_i = a + b_1 x_i + v_i$$

Then the error term in the model estimated by SPSS, v_i, will be capturing the effect of the omitted variable, and so it will be correlated with z:

$$v_i = b_2 z_i + u_i$$

and so the variance of v_i will be non-scalar.

(iii) Non-linearities

If the true relationship is non-linear:

$$y_i = a + b x_i^2 + u_i$$

but the regression we attempt to estimate is linear:

$$y_i = a + b x_i + v_i$$

then the residual in this estimated regression will capture the non-linearity, and its variance will be affected accordingly:

$$v_i = f(x_i^2, u_i)$$

(iv) Aggregation

Sometimes we aggregate our data across groups:
e.g. quarterly time series data on income = average income of a group of households in a given quarter.

If this is so, and the size of the groups used to calculate the averages varies, then the sampling variation of the mean will vary: larger groups will have a smaller standard error of the mean. The measurement errors of each value of our variable will therefore be correlated with the sample size of the groups used.

Since measurement errors will be captured by the regression residual, the regression residual will vary with the sample size of the underlying groups on which the data are based.

3. Consequences

Heteroscedasticity by itself does not cause OLS estimators to be biased or inconsistent.
NB: neither bias nor consistency is determined by the covariance matrix of the error term.

However, if heteroscedasticity is a symptom of omitted variables, measurement errors, or non-constant parameters, then OLS estimators will be biased and inconsistent.

Unbiased and Consistent Estimator

[Figure: asymptotic distribution of the OLS estimate β̂ for sample sizes n = 150, 200, 300, 500 and 1,000. The estimate is unbiased and consistent: as the sample size increases, the distribution collapses around the population value of the slope coefficient.]

Biased but Consistent Estimator

[Figure: asymptotic distribution of the OLS estimate β̂ for the same sample sizes. The estimate is biased but consistent: in small samples the distribution is centred away from the population value, but as the sample size increases the mean of the distribution tends towards the population value of the slope coefficient.]

NB: it is not heteroskedasticity that causes the bias, but the failure of one of the other assumptions that happens to have heteroskedasticity as a side effect.
Testing for heteroskedasticity is therefore closely related to tests for misspecification generally.
Unfortunately, there is usually no straightforward way to identify the cause.

Heteroskedasticity does, however, bias the OLS estimated standard errors for the estimated coefficients, which means that the t-tests will not be reliable:

$$t = \hat{b} / SE(\hat{b})$$

F-tests are also no longer reliable:
e.g. Chow's second test is no longer reliable (Thursby).

4. Detection

Q: How can we tell whether our model suffers from heteroscedasticity?

4.1 Specific Tests/Methods

(A) Visual examination of residuals
See above.

(B) Levene's test
See last term.

(C) Goldfeld-Quandt test:
S.M. Goldfeld and R.E. Quandt, "Some Tests for Homoscedasticity," Journal of the American Statistical Association, Vol. 60, 1965.

H0: σ_i² is not correlated with a variable z
H1: σ_i² is correlated with a variable z

The G-Q test procedure is as follows:

(i) Order the observations in ascending order of x.
(ii) Omit p central observations (as a rough guide take p ≈ n/3, where n is the total sample size). This enables us to identify the difference in variances more easily.
(iii) Fit separate regressions to the two sets of observations. The number of observations in each sample is (n - p)/2, so we need (n - p)/2 > k, where k is the number of explanatory variables.
(iv) Calculate the test statistic G, where:

$$G = \frac{RSS_2 / \left(\tfrac{1}{2}(n-p) - k\right)}{RSS_1 / \left(\tfrac{1}{2}(n-p) - k\right)}$$

G has an F distribution: G ~ F[½(n - p) - k, ½(n - p) - k].
NB: G must be > 1. If not, invert it.

Problem: in practice we don't usually know what z is. But if there are various possible z's, then it may not matter which one you choose, provided they are all highly correlated with each other.
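A sketch of this procedure using statsmodels' implementation (the data are simulated purely for illustration; the split and drop fractions follow the p ≈ n/3 rule of thumb):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1, 10, n)
y = 2 + 3 * x + rng.normal(0, x)   # error spread grows with x

X = sm.add_constant(x)

# Sort by x (column 1 of X), drop roughly the middle third, and compare
# the residual variances of the two sub-regressions with an F test.
F, pval, order = het_goldfeldquandt(y, X, idx=1, split=1/3, drop=1/3)
print(F, pval)
```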

The Goldfeld-Quandt Test

Both the White test and the Breusch-Pagan test focus on smoothly changing variances for the disturbances.
The Goldfeld-Quandt test compares the variance of error terms across discrete subgroups.
Under homoskedasticity, all subgroups should have the same estimated variances.

The Goldfeld-Quandt Test (cont.)

The Goldfeld-Quandt test compares the variance of error terms across discrete subgroups.
The econometrician must divide the data into h discrete subgroups.

The Goldfeld-Quandt Test (cont.)

If the Goldfeld-Quandt test is appropriate, it will generally be clear which subgroups to use.

The Goldfeld-Quandt Test (cont.)

For example, the econometrician might ask whether men's and women's incomes vary similarly around their predicted means, given education and experience.
To conduct a Goldfeld-Quandt test, divide the data into h = 2 groups, one for men and one for women.

The Goldfeld-Quandt Test (cont.)

(1) Divide the n observations into h groups, of sizes n_1, ..., n_h.
(2) Choose two groups, say 1 and 2.

$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_a: \sigma_1^2 \neq \sigma_2^2$$

(3) Regress Y against the explanators for group 1.
(4) Regress Y against the explanators for group 2.

Goldfeld-Quandt Test (cont.)

(5) Relabel the groups as L and S, such that

$$\frac{SSR_L}{n_L - k} > \frac{SSR_S}{n_S - k}$$

Compute

$$G = \frac{SSR_L / (n_L - k)}{SSR_S / (n_S - k)}$$

(6) Compare G to the critical value for an F-statistic with (n_L - k) and (n_S - k) degrees of freedom.

Goldfeld-Quandt Test: An Example

Do men's and women's incomes vary similarly about their respective means, given education and experience?
That is, do the error terms for an income equation have different variances for men and women?
We have a sample with 3,394 men and 3,146 women.

Goldfeld-Quandt Test: An Example (cont.)

(1) Divide the n observations into men and women, of sizes n_m and n_w.
(2) We have only two groups, so choose both of them.

$$H_0: \sigma_m^2 = \sigma_w^2 \quad \text{against} \quad H_a: \sigma_m^2 \neq \sigma_w^2$$

(3) For the men, regress

$$\log(\text{income})_i = \beta_0 + \beta_1 ed_i + \beta_2 \exp_i + \beta_3 \exp_i^2 + \varepsilon_i$$

(4) For the women, regress

$$\log(\text{income})_i = \beta_0 + \beta_1 ed_i + \beta_2 \exp_i + \beta_3 \exp_i^2 + \nu_i$$

Goldfeld-Quandt Test: An Example (cont.)

(5)

$$s_m^2 = \frac{SSR_m}{n_m - k} = \frac{1736.64}{3394 - 4} = 0.5123, \qquad
s_w^2 = \frac{SSR_w}{n_w - k} = \frac{1851.52}{3146 - 4} = 0.5893$$

Compute

$$G = \frac{0.5893}{0.5123} = 1.15$$

(6) Compare G to the critical value for an F-statistic with 3142 and 3390 degrees of freedom. The F distribution function evaluated at G = 1.15 is about 0.99997, so the p-value is well below the 5% significance level.
We reject the null hypothesis at the 5% level.
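A two-line check of this arithmetic with scipy's F distribution (the SSR values are taken from the example above):

```python
from scipy.stats import f

G = (1851.52 / (3146 - 4)) / (1736.64 / (3394 - 4))  # = 1.15
pval = f.sf(G, 3146 - 4, 3394 - 4)                   # upper-tail p-value
print(G, pval)  # p is roughly 3e-5, well below 0.05, so reject H0
```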

4.2 General Tests

A. Breusch-Pagan test:
T.S. Breusch and A.R. Pagan, "A Simple Test for Heteroscedasticity and Random Coefficient Variation," Econometrica, Vol. 47, 1979.

Assumes that:

$$\sigma_i^2 = a_1 + a_2 z_{2i} + a_3 z_{3i} + a_4 z_{4i} + \ldots + a_m z_{mi} \qquad [1]$$

where the z's are all independent variables. The z's can be some or all of the original regressors, some other variables, or some transformation of the original regressors which you think causes the heteroscedasticity:
e.g. σ_i² = a_1 + a_2 exp(x_1) + a_3 x_3² + a_4 x_4

Procedure for the B-P test:

(i) Obtain the OLS residuals û_i from the original regression equation and construct a new variable g:

$$g_i = \hat{u}_i^2 / \hat{\sigma}^2, \quad \text{where } \hat{\sigma}^2 = RSS / n$$

(ii) Regress g_i on the z's (include a constant in the regression).

(iii) B = ½(REGSS) from the regression of g_i on the z's, where REGSS is the explained (regression) sum of squares. B has a Chi-square distribution with m - 1 degrees of freedom.

Problems with the B-P test:

The B-P test is not reliable if the errors are not normally distributed or if the sample size is small. Koenker (1981) offers an alternative calculation of the statistic which is less sensitive to non-normality in small samples:

$$B_{Koenker} = nR^2 \sim \chi^2_{m-1}$$

where n and R² are from the regression of û² on the z's; B_Koenker has a Chi-square distribution with m - 1 degrees of freedom.
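A minimal sketch with statsmodels, whose het_breuschpagan reports the LM statistic in the studentized (Koenker, nR²) form; the data are simulated for illustration:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, x)   # error variance rises with x

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

# Auxiliary regression of squared residuals on the z's (constant + x);
# the LM statistic is n * R-squared from that regression.
lm, lm_pval, fstat, f_pval = het_breuschpagan(resid, X)
print(lm, lm_pval)
```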

B. White (1980) Test

The most general test of heteroscedasticity: no specification of the form of the heteroscedasticity is required.

(i) Run an OLS regression and use it to calculate û² (i.e. the square of each residual).
(ii) Use û² as the dependent variable in another regression, in which the regressors are:
(a) all k original independent variables, and
(b) the square of each independent variable (excluding dummy variables), and all two-way interactions (or cross-products) between the independent variables.
The square of a dummy variable is excluded because it would be perfectly correlated with the dummy variable itself.

Call the total number of regressors (not including the constant term) in this second equation P.

(iii) From the results of equation 2, calculate the test statistic

$$nR^2 \sim \chi^2_P$$

where n = sample size and R² = the unadjusted coefficient of determination.
The statistic is asymptotically (i.e. in large samples) distributed as chi-squared with P degrees of freedom, where P is the number of regressors in the auxiliary regression, not including the constant.
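statsmodels also provides White's test directly; a sketch continuing from the Breusch-Pagan example above (het_white builds the auxiliary regression of levels, squares and cross-products itself):

```python
from statsmodels.stats.diagnostic import het_white

# resid and X are reused from the Breusch-Pagan sketch above.
lm, lm_pval, fstat, f_pval = het_white(resid, X)
print(lm, lm_pval)
```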

Notes on White's test:

The White test does not make any assumptions about the particular form of heteroskedasticity, and so is quite general in application.
It does not require that the error terms be normally distributed.
However, rejecting the null may be an indication of model specification error, as well as, or instead of, heteroskedasticity.

Its generality is both a virtue and a shortcoming:
it might reveal heteroscedasticity, but the null might also be rejected as a result of missing variables;
it is "nonconstructive" in the sense that its rejection does not provide any clear indication of how to proceed.

NB: if you use White's standard errors, eradicating the heteroscedasticity is less important.

Problems:
Note that although t-tests become reliable when you use White's standard errors, F-tests are still not reliable (so Chow's first test is still not reliable).
White's SEs have been found to be unreliable in small samples, but revised methods for small samples have been developed to allow robust SEs to be calculated for small n.
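In statsmodels, White's correction is a one-argument change at fit time, and HC3 is one of the revised small-sample variants; a sketch continuing the illustrative data above:

```python
# y and X are reused from the Breusch-Pagan sketch above.
ols_fit = sm.OLS(y, X).fit()                  # ordinary (unreliable) SEs
white_fit = sm.OLS(y, X).fit(cov_type="HC0")  # White's robust SEs
hc3_fit = sm.OLS(y, X).fit(cov_type="HC3")    # small-sample refinement

print(ols_fit.bse, white_fit.bse, hc3_fit.bse)
```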

Heteroskedasticity Tests

Glejser test
This makes sense conceptually: you are testing to see if one of your independent variables is significantly related to the variance of your residuals, by regressing the absolute residuals on it:

$$|\hat{u}_i| = b_0 + b_1 X_i + e_i$$
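A Glejser-style check, continuing the same illustrative data: regress the absolute residuals on the regressors and inspect the slope's significance.

```python
import numpy as np

# resid and X are reused from the Breusch-Pagan sketch above.
glejser_fit = sm.OLS(np.abs(resid), X).fit()
print(glejser_fit.tvalues[1], glejser_fit.pvalues[1])  # slope on x
```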

Generalized Least Squares

Under heteroskedasticity, OLS is unbiased, but not efficient: the OLS weights are not optimal.
Suppose we are estimating a straight line through the origin:

$$Y = \beta X + \varepsilon$$

Under homoskedasticity, observations with higher X values are relatively less distorted by the error term, so OLS places greater weight on observations with high X values.

Generalized Least Squares

Suppose observations with higher X values have error terms with much higher variances.
Under this DGP, observations with high X's (and high variances of ε) may be more misleading than observations with low X's (and low variances of ε).
In general, we want to put more weight on observations with smaller σ_i².

[Figure 10.3: Heteroskedasticity with smaller disturbances at smaller X's.]

Generalized Least Squares

To construct the BLUE estimator for β_S, we follow the same steps as before, but with our new variance formula. The resulting estimator is Generalized Least Squares:
1. Start with a linear estimator, Σ w_i Y_i
2. Impose the unbiasedness conditions: Σ w_i X_Ri = 0 for R ≠ S, and Σ w_i X_Si = 1
3. Find the w_i that minimize Σ w_i² σ_i²

Generalized Least Squares (cont.)

For an example, consider the DGP:

$$Y_i = \beta X_i + \varepsilon_i$$
$$E(\varepsilon_i) = 0$$
$$\operatorname{Var}(\varepsilon_i) = d_i^2 \sigma^2$$
$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad \text{for } i \neq j$$

X_i, d_i fixed across samples.

Generalized Least Squares (cont.)

We choose the weights to minimize the variance subject to unbiasedness:

$$\min_{w} \sum_i w_i^2 d_i^2 \sigma^2 \quad \text{such that} \quad \sum_j w_j X_j = 1$$

Solution:

$$w_i = \frac{X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)

$$\hat{\beta}_{GLS} = \frac{\sum_i Y_i X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)


In practice, econometricians choose a different method
for implementing GLS.
Historically, it was computationally difficult to program a
new estimator (with its own weights) for every
different dataset.
It was easier to re-weight the data first, and THEN apply
the OLS estimator.


Generalized Least Squares (cont.)

We want to transform the data so that it is homoskedastic. Then we can apply OLS.
It is convenient to rewrite the variance term of the heteroskedastic DGP as

$$\operatorname{Var}(\varepsilon_i) = d_i^2 \sigma^2$$

Generalized Least Squares (cont.)

If we know the d_i factor for each observation, we can transform the data by dividing through by d_i.
Once we divide all variables by d_i, we obtain a new dataset that meets the Gauss-Markov conditions.

GLS: DGP for Transformed Data

$$\frac{Y_i}{d_i} = \beta_0 \frac{1}{d_i} + \beta_1 \frac{X_i}{d_i} + \frac{\varepsilon_i}{d_i}$$

$$E\!\left(\frac{\varepsilon_i}{d_i}\right) = 0$$

$$\operatorname{Var}\!\left(\frac{\varepsilon_i}{d_i}\right) = \frac{1}{d_i^2}\operatorname{Var}(\varepsilon_i) = \frac{1}{d_i^2}\, d_i^2 \sigma^2 = \sigma^2$$

$$\operatorname{Cov}\!\left(\frac{\varepsilon_i}{d_i}, \frac{\varepsilon_j}{d_j}\right) = \frac{1}{d_i d_j}\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$$

X_i / d_i fixed across samples.

Generalized Least Squares

This procedure, Generalized Least Squares, has two steps:
1. Divide all variables by d_i
2. Apply OLS to the transformed variables

This procedure optimally weights down observations with high d_i's.
GLS is unbiased and efficient.
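In practice the two steps are packaged as weighted least squares; a statsmodels sketch assuming d_i is known (here d_i = x_i is an illustrative choice, with simulated data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(1, 10, n)
d = x                                # assumed known scale of the error SD
y = 2 + 3 * x + rng.normal(0, d)     # Var(eps_i) = d_i**2 * sigma**2

X = sm.add_constant(x)

# WLS with weights proportional to 1/d_i**2 is equivalent to dividing
# every variable through by d_i and then running OLS.
gls_fit = sm.WLS(y, X, weights=1.0 / d**2).fit()
print(gls_fit.params)
```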

Generalized Least Squares (cont.)

Example: a straight line through the origin.

1. First, divide Y_i and X_i by d_i.
2. Apply OLS to Y_i/d_i and X_i/d_i, which yields

$$\hat{\beta}_{GLS} = \frac{\sum_i Y_i X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)

Note: we derive the same BLUE estimator (Generalized Least Squares) whether we:
1. Find the optimal weights for heteroskedastic data, or
2. Transform the data to be homoskedastic, then use OLS weights.

GLS: An Example (Chapter 10.5)

We can solve heteroskedasticity by dividing our variables through by d_i.
The DGP with the transformed data is Gauss-Markov.
The catch: we don't observe d_i.
How can we implement this strategy in practice?

GLS: An Example (cont.)

We want to estimate the relationship

$$\text{rent}_i = \beta_0 + \beta_1 \text{income}_i + \varepsilon_i$$

We are concerned that higher-income individuals are less constrained in how much income they spend on rent. Lower-income individuals cram into what housing they can afford; higher-income individuals find housing to suit their needs/tastes.
That is, Var(ε_i) may vary with income.

GLS: An Example (cont.)

An initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2, \quad \text{i.e. } d_i = \text{income}_i$$

If we have modeled the heteroskedasticity correctly, then the BLUE estimator comes from:

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

[Table 10.1: Rent and Income in New York.]

[Table 10.5: Estimating a Transformed Rent-Income Relationship, Var(ε_i) = σ² X_i².]

Checking Understanding

An initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2, \quad \text{i.e. } d_i = \text{income}_i$$

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

How can we test to see if we have correctly modeled the heteroskedasticity?

Checking Understanding

If we have the correct model of heteroskedasticity, then OLS with the transformed data should be homoskedastic:

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

We can apply either a White test or a Breusch-Pagan test for heteroskedasticity to the model with the transformed data, as sketched below.
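A sketch of that check with simulated stand-in data (in this simulation the true variance is proportional to income, so dividing through by income overcorrects, as in the example that follows):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Synthetic stand-in for the rent-income data (illustrative only).
rng = np.random.default_rng(4)
income = rng.uniform(10, 100, 500)
rent = 5 + 0.3 * income + rng.normal(0, np.sqrt(income))

# Transform by d_i = income_i: regress rent/income on 1/income (plus constant).
y_t = rent / income
X_t = sm.add_constant(1.0 / income)

resid_t = sm.OLS(y_t, X_t).fit().resid
lm, lm_pval, _, _ = het_breuschpagan(resid_t, X_t)
print(lm, lm_pval)  # small p-value: the transformed model is still heteroskedastic
```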

Checking Understanding (cont.)

To run the White test, we regress

$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{\text{income}_i} + \alpha_2 \frac{1}{\text{income}_i^2} + \nu_i$$

nR² = 7.17
The critical value at the 0.05 significance level for a Chi-square statistic with 2 degrees of freedom is 5.99.
We reject the null hypothesis.

GLS: An Example

Our initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2$$

This guess didn't do very well. Can we do better?
Instead of blindly guessing, let's try looking at the data first.

[Figure 10.4: The rent-income ratio plotted against the inverse of income.]

GLS: An Example

We seem to have overcorrected for heteroskedasticity. Let's try

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i, \quad \text{i.e. } d_i = \sqrt{\text{income}_i}$$

$$\frac{\text{rent}_i}{\sqrt{\text{income}_i}} = \beta_0 \frac{1}{\sqrt{\text{income}_i}} + \beta_1 \sqrt{\text{income}_i} + v_i$$

[Table 10.6: Estimating a Second Transformed Rent-Income Relationship, Var(ε_i) = σ² X_i.]

GLS: An Example

Unthinking application of the White test procedure to the transformed data leads to

$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{\sqrt{\text{income}_i}} + \alpha_2 \frac{1}{\text{income}_i} + \alpha_3 \sqrt{\text{income}_i} + \alpha_4 \, \text{income}_i + \alpha_5 \frac{1}{\sqrt{\text{income}_i}} \sqrt{\text{income}_i} + \nu_i$$

The interaction term reduces to a constant, which we already have in the auxiliary equation, so we omit it and use only the first 4 explanators.

GLS: An Example (cont.)

nR² = 6.16
The critical value at the 0.05 significance level for a Chi-squared statistic with 4 degrees of freedom is 9.49.
We fail to reject the null hypothesis that the transformed data are homoskedastic.
Warning: failing to reject a null hypothesis does NOT mean we can accept it.

GLS: An Example (cont.)

Generalized Least Squares is not trivial to apply in practice.
Figuring out a reasonable d_i can be quite difficult.
Next time we will learn another approach to constructing d_i: Feasible Generalized Least Squares.

5. Solutions
A. Weighted Least Squares
B. Maximum likelihood estimation (not covered)
C. White's Standard Errors

[Figure 10.2: Homoskedastic disturbances are more misleading at smaller X's.]

A. Weighted Least Squares

If the differences in variability of the error term can be predicted from another variable within the model, the Weight Estimation procedure (available in SPSS) can be used. It computes the coefficients of a linear regression model using WLS, such that the more precise observations (that is, those with less variability) are given greater weight in determining the regression coefficients.

Problems:
A wrong choice of weights can produce biased estimates of the standard errors. Since we can never know for sure whether we have chosen the correct weights, this is a real problem.
If the weights are correlated with the disturbance term, then the WLS slope estimates will be inconsistent.
Also: Dickens (1990) found that errors in grouped data may be correlated within groups, so that weighting by the square root of the group size may be inappropriate. See Binkley (1992) for an assessment of tests of grouped heteroscedasticity.

C. White's Standard Errors

White (op. cit.) developed an algorithm for correcting the standard errors in OLS when heteroscedasticity is present. The correction procedure does not assume any particular form of heteroscedasticity, and so in some ways White has solved the heteroscedasticity problem.

Summary
(1) Causes
(2) Consequences
(3) Detection
(4) Solutions


Reading:
Kennedy, P. (1998) A Guide to Econometrics, Chapters 5, 6, 7 & 9.
Maddala, G.S. (1992) Introduction to Econometrics, Chapter 12.
Field, A. (2000) Chapter 4, particularly pages 141-162.
Greene, W.H. (1990) Econometric Analysis.

Grouped heteroscedasticity:
Binkley, J.K. (1992) "Finite Sample Behaviour of Tests for Grouped Heteroskedasticity," Review of Economics and Statistics, 74, 563-8.
Dickens, W.T. (1990) "Error Components in Grouped Data: Is It Ever Worth Weighting?," Review of Economics and Statistics, 72, 328-33.

Breusch-Pagan critique:
Koenker, R. (1981) "A Note on Studentizing a Test for Heteroscedasticity," Journal of Econometrics, 17, 107-12.
