
HETEROSKEDASTICITY

Dahlan Tampubolon, Ph.D.

1. What is heteroscedasticity?
Recall that for estimation of coefficients and for
regression inference to be correct:
1. Equation is correctly specified
2. Error Term has zero mean
3. Error Term has constant variance
4. Error Term is not autocorrelated
5. Explanatory variables are fixed
6. No linear relationship between RHS variables

When assumption 3 holds, i.e. the errors u_i in the regression equation have common (constant, scalar) variance,

$$\operatorname{Var}(u_i) = \sigma^2 \quad \text{for all } i,$$

then we have homoscedasticity, or equivalently a scalar error covariance matrix:

$$E(uu') = \sigma^2 I$$

When assumption 3 breaks down, i.e.

$$\operatorname{Var}(u_i) = \sigma_i^2,$$

we have what is known as heteroscedasticity, or a non-scalar error covariance matrix (which can also be caused by a failure of assumption 4).

Recall that the residual for each observation i is the vertical distance between the observed value of the dependent variable and the predicted value of the dependent variable, i.e. the difference between the observed value and the line-of-best-fit value:

$$\hat{u}_i = y_i - \hat{y}_i$$

Case    Price     Predicted Price   Residual
1        19000    19174.0             -174.0
2        30000    28028.7             1971.3
3         8100    45738.2           -37638.2
4        55000    36883.5            18116.5
5       130000    45738.2            84261.8
6        55000    45738.2             9261.8
7        54000    36883.5            17116.5
8         7500    45738.2           -38238.2
9        36000    36883.5             -883.5
10       32000    28028.7             3971.3

N.B. Predicted price is the value on the regression line that corresponds to the value of the explanatory variable (in this case, No. of rooms) for a particular observation.
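To make the arithmetic concrete, here is a minimal sketch that reproduces the table's fitted values and residuals. The room counts and coefficients are inferred from the fitted values above (predicted price rises by about 8854.7 per room), so treat them as illustrative:

```python
import numpy as np

# Inferred from the table: slope ~8854.7 per room, intercept ~1464.6
# (both hypothetical, backed out from the predicted prices above).
intercept, slope = 1464.6, 8854.7

rooms = np.array([2, 3, 5, 4, 5, 5, 4, 5, 4, 3])
price = np.array([19000, 30000, 8100, 55000, 130000,
                  55000, 54000, 7500, 36000, 32000])

predicted = intercept + slope * rooms   # value on the regression line
residual = price - predicted            # observed minus fitted

for case, (p, f, r) in enumerate(zip(price, predicted, residual), 1):
    print(f"{case:>4} {p:>8} {f:>10.1f} {r:>10.1f}")
```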

(Assume that this represents multiple observations of y for each given value of x.)

[Figure: scatter plot of purchase price (vertical axis, -100,000 to 400,000) against number of rooms (horizontal axis, up to 14), with the fitted regression line and examples of positive (+ive) and negative (-ive) residuals marked.]

Homoskedasticity means the variance of the error term is constant for each observation:

$$\operatorname{Cov}(u_1, u_2, \ldots, u_n) =
\begin{pmatrix}
\operatorname{var}(u_1) & \operatorname{cov}(u_1,u_2) & \cdots & \operatorname{cov}(u_1,u_n) \\
\operatorname{cov}(u_2,u_1) & \operatorname{var}(u_2) & \cdots & \operatorname{cov}(u_2,u_n) \\
\vdots & & \ddots & \vdots \\
\operatorname{cov}(u_n,u_1) & \operatorname{cov}(u_n,u_2) & \cdots & \operatorname{var}(u_n)
\end{pmatrix}
=
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}$$

where σ² is a scalar.

Each one of the residuals has a sampling distribution, and each should have the same variance (homoscedasticity). Clearly this is not the case within this sample, and so it is unlikely to be true across samples:

[Figure: the same scatter plot of purchase price against number of rooms; the spread of the residuals around the fitted line visibly increases with the number of rooms.]

This is confirmed when we look at the standard deviation of the residual for different parts of the sample.

Group Statistics (unstandardized residual, split at 3 rooms):

Number of rooms    N     Mean       Std. Deviation   Std. Error Mean
>= 3               669   -680.676   31647.60         1223.567
< 3                 96   4743.461   15024.51         1533.433

Group Statistics (unstandardized residual, split at 4 rooms):

Number of rooms    N     Mean       Std. Deviation   Std. Error Mean
>= 4               452   -1575.28   36020.35         1694.255
< 4                313   2274.843   18350.73         1037.245
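A split-sample check like the SPSS Group Statistics tables above can be reproduced with pandas; a minimal sketch (the file name houses.csv and the column names price and rooms are hypothetical):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical dataset with columns "price" and "rooms".
df = pd.read_csv("houses.csv")

fit = sm.OLS(df["price"], sm.add_constant(df["rooms"])).fit()
df["resid"] = fit.resid

# N, mean, standard deviation, and standard error of the mean of the
# unstandardized residuals on each side of the split point.
for split in (3, 4):
    groups = df.groupby(df["rooms"] >= split)["resid"]
    print(groups.agg(["count", "mean", "std", "sem"]))
```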

2. Causes
What might cause the variance of the residuals to change over the course of the sample?
The error term may be correlated with:
- the dependent variable and/or the explanatory variables in the model,
- some combination (linear or non-linear) of all the variables in the model, or
- variables that should be in the model but are omitted.

But why?

(i) Non-constant coefficient

Suppose that the slope coefficient varies across i:

$$y_i = a + b_i x_i + u_i$$

and suppose that it varies randomly around some fixed value b:

$$b_i = b + \varepsilon_i$$

Then the regression actually estimated by SPSS will be:

$$y_i = a + (b + \varepsilon_i) x_i + u_i = a + b x_i + (\varepsilon_i x_i + u_i)$$

where (ε_i x_i + u_i) is the error term in the SPSS regression. The error term will thus vary with x.
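A short simulation (with arbitrary illustrative parameter values) confirms that, with a random slope, the spread of the composite error ε_i x_i + u_i grows with x:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1, 10, n)

a, b = 2.0, 3.0                  # fixed intercept and mean slope
eps = rng.normal(0, 0.5, n)      # random part of the slope: b_i = b + eps_i
u = rng.normal(0, 1.0, n)

y = a + (b + eps) * x + u
composite = y - (a + b * x)      # = eps*x + u, the error OLS actually sees

# Sample variance of the composite error for small vs large x:
print(composite[x < 4].var(), composite[x > 7].var())  # second is much larger
```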

(ii) Omitted variables

Suppose the true model of y is:

$$y_i = a + b_1 x_i + b_2 z_i + u_i$$

but the model we estimate fails to include z:

$$y_i = a + b_1 x_i + v_i$$

Then the error term in the model estimated by SPSS, v_i, will be capturing the effect of the omitted variable, and so it will be correlated with z:

$$v_i = b_2 z_i + u_i$$

and so the variance of v_i will be non-scalar.

(iii) Non-linearities

If the true relationship is non-linear:

$$y_i = a + b x_i^2 + u_i$$

but the regression we attempt to estimate is linear:

$$y_i = a + b x_i + v_i$$

then the residual in this estimated regression will capture the non-linearity, and its variance will be affected accordingly:

$$v_i = f(x_i^2, u_i)$$

(iv) Aggregation

Sometimes we aggregate our data across groups:
e.g. quarterly time series data on income = average income of a group of households in a given quarter.

If this is so, and the size of the groups used to calculate the averages varies, then the sampling variation of the mean will vary: larger groups will have a smaller standard error of the mean. The measurement errors of each value of our variable will therefore be correlated with the sample size of the groups used.

Since measurement errors will be captured by the regression residual, the regression residual will vary with the sample size of the underlying groups on which the data are based.

3. Consequences

Heteroscedasticity by itself does not cause OLS estimators to be biased or inconsistent.
NB: neither bias nor consistency is determined by the covariance matrix of the error term.

However, if heteroscedasticity is a symptom of omitted variables, measurement errors, or non-constant parameters, then OLS estimators will be biased and inconsistent.

Unbiased and Consistent Estimator

[Figure: asymptotic distribution of the OLS estimate β̂ for sample sizes n = 150, 200, 300, 500 and 1,000. The estimate is unbiased and consistent: as the sample size increases, the distribution collapses around the population value of the slope coefficient.]

Biased but Consistent Estimator

[Figure: asymptotic distribution of the OLS estimate β̂ for the same sample sizes. The estimate is biased but consistent: in small samples the distribution is centred away from the population value, but as the sample size increases the mean of the distribution tends towards the population value of the slope coefficient.]

NB: it is not heteroskedasticity that causes the bias, but the failure of one of the other assumptions that happens to have heteroskedasticity as a side effect.
Testing for heteroskedasticity is therefore closely related to tests for misspecification generally.
Unfortunately, there is usually no straightforward way to identify the cause.

Heteroskedasticity does, however, bias the OLS estimated standard errors for the estimated coefficients, which means that the t-tests will not be reliable:

$$t = \hat{b} / SE(\hat{b})$$

F-tests are also no longer reliable:
e.g. Chow's second test is no longer reliable (Thursby).

4. Detection

Q: How can we tell whether our model suffers from heteroscedasticity?

4.1 Specific Tests/Methods

(A) Visual examination of residuals
See above.

(B) Levene's test
See last term.

(C) Goldfeld-Quandt test:
S.M. Goldfeld and R.E. Quandt, "Some Tests for Homoscedasticity," Journal of the American Statistical Association, Vol. 60, 1965.

H0: σ_i² is not correlated with a variable z
H1: σ_i² is correlated with a variable z

The G-Q test procedure is as follows:

(i) Order the observations in ascending order of x.
(ii) Omit p central observations (as a rough guide take p ≈ n/3, where n is the total sample size). This enables us to identify the difference in variances more easily.
(iii) Fit separate regressions to the two sets of observations. The number of observations in each sample is (n - p)/2, so we need (n - p)/2 > k, where k is the number of explanatory variables.
(iv) Calculate the test statistic G, where:

$$G = \frac{RSS_2 / \left(\tfrac{1}{2}(n-p) - k\right)}{RSS_1 / \left(\tfrac{1}{2}(n-p) - k\right)}$$

G has an F distribution: G ~ F[½(n - p) - k, ½(n - p) - k].
NB: G must be > 1. If not, invert it.

Problem: in practice we don't usually know what z is. But if there are various possible z's, then it may not matter which one you choose, provided they are all highly correlated with each other.
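A sketch of this procedure using statsmodels' implementation (the data are simulated purely for illustration; the split and drop fractions follow the p ≈ n/3 rule of thumb):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1, 10, n)
y = 2 + 3 * x + rng.normal(0, x)   # error spread grows with x

X = sm.add_constant(x)

# Sort by x (column 1 of X), drop roughly the middle third, and compare
# the residual variances of the two sub-regressions with an F test.
F, pval, order = het_goldfeldquandt(y, X, idx=1, split=1/3, drop=1/3)
print(F, pval)
```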

The Goldfeld-Quandt Test

Both the White test and the Breusch-Pagan test focus on smoothly changing variances for the disturbances.
The Goldfeld-Quandt test compares the variance of error terms across discrete subgroups.
Under homoskedasticity, all subgroups should have the same estimated variances.

The Goldfeld-Quandt Test (cont.)

The Goldfeld-Quandt test compares the variance of error terms across discrete subgroups.
The econometrician must divide the data into h discrete subgroups.

The Goldfeld-Quandt Test (cont.)

If the Goldfeld-Quandt test is appropriate, it will generally be clear which subgroups to use.

The Goldfeld-Quandt Test (cont.)

For example, the econometrician might ask whether men's and women's incomes vary similarly around their predicted means, given education and experience.
To conduct a Goldfeld-Quandt test, divide the data into h = 2 groups, one for men and one for women.

The Goldfeld-Quandt Test (cont.)

(1) Divide the n observations into h groups, of sizes n_1, ..., n_h.
(2) Choose two groups, say 1 and 2.

$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_a: \sigma_1^2 \neq \sigma_2^2$$

(3) Regress Y against the explanators for group 1.
(4) Regress Y against the explanators for group 2.

Goldfeld-Quandt Test (cont.)

(5) Relabel the groups as L and S, such that

$$\frac{SSR_L}{n_L - k} > \frac{SSR_S}{n_S - k}$$

Compute

$$G = \frac{SSR_L / (n_L - k)}{SSR_S / (n_S - k)}$$

(6) Compare G to the critical value for an F-statistic with (n_L - k) and (n_S - k) degrees of freedom.

Goldfeld-Quandt Test: An Example

Do men's and women's incomes vary similarly about their respective means, given education and experience?
That is, do the error terms for an income equation have different variances for men and women?
We have a sample with 3,394 men and 3,146 women.

Goldfeld-Quandt Test: An Example (cont.)

(1) Divide the n observations into men and women, of sizes n_m and n_w.
(2) We have only two groups, so choose both of them.

$$H_0: \sigma_m^2 = \sigma_w^2 \quad \text{against} \quad H_a: \sigma_m^2 \neq \sigma_w^2$$

(3) For the men, regress

$$\log(\text{income})_i = \beta_0 + \beta_1 ed_i + \beta_2 \exp_i + \beta_3 \exp_i^2 + \varepsilon_i$$

(4) For the women, regress

$$\log(\text{income})_i = \beta_0 + \beta_1 ed_i + \beta_2 \exp_i + \beta_3 \exp_i^2 + \nu_i$$

Goldfeld-Quandt Test: An Example (cont.)

(5)

$$s_m^2 = \frac{SSR_m}{n_m - k} = \frac{1736.64}{3394 - 4} = 0.5123, \qquad
s_w^2 = \frac{SSR_w}{n_w - k} = \frac{1851.52}{3146 - 4} = 0.5893$$

Compute

$$G = \frac{0.5893}{0.5123} = 1.15$$

(6) Compare G to the critical value for an F-statistic with 3142 and 3390 degrees of freedom. The F distribution function evaluated at G = 1.15 is about 0.99997, so the p-value is well below the 5% significance level.
We reject the null hypothesis at the 5% level.
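A two-line check of this arithmetic with scipy's F distribution (the SSR values are taken from the example above):

```python
from scipy.stats import f

G = (1851.52 / (3146 - 4)) / (1736.64 / (3394 - 4))  # = 1.15
pval = f.sf(G, 3146 - 4, 3394 - 4)                   # upper-tail p-value
print(G, pval)  # p is roughly 3e-5, well below 0.05, so reject H0
```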

4.2 General Tests

A. Breusch-Pagan test:
T.S. Breusch and A.R. Pagan, "A Simple Test for Heteroscedasticity and Random Coefficient Variation," Econometrica, Vol. 47, 1979.

Assumes that:

$$\sigma_i^2 = a_1 + a_2 z_{2i} + a_3 z_{3i} + a_4 z_{4i} + \ldots + a_m z_{mi} \qquad [1]$$

where the z's are all independent variables. The z's can be some or all of the original regressors, some other variables, or some transformation of the original regressors which you think causes the heteroscedasticity:
e.g. σ_i² = a_1 + a_2 exp(x_1) + a_3 x_3² + a_4 x_4

Procedure for the B-P test:

(i) Obtain the OLS residuals û_i from the original regression equation and construct a new variable g:

$$g_i = \hat{u}_i^2 / \hat{\sigma}^2, \quad \text{where } \hat{\sigma}^2 = RSS / n$$

(ii) Regress g_i on the z's (include a constant in the regression).

(iii) B = ½(REGSS) from the regression of g_i on the z's, where REGSS is the explained (regression) sum of squares. B has a Chi-square distribution with m - 1 degrees of freedom.

Problems with the B-P test:

The B-P test is not reliable if the errors are not normally distributed or if the sample size is small. Koenker (1981) offers an alternative calculation of the statistic which is less sensitive to non-normality in small samples:

$$B_{Koenker} = nR^2 \sim \chi^2_{m-1}$$

where n and R² are from the regression of û² on the z's; B_Koenker has a Chi-square distribution with m - 1 degrees of freedom.
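A minimal sketch with statsmodels, whose het_breuschpagan reports the LM statistic in the studentized (Koenker, nR²) form; the data are simulated for illustration:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, x)   # error variance rises with x

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

# Auxiliary regression of squared residuals on the z's (constant + x);
# the LM statistic is n * R-squared from that regression.
lm, lm_pval, fstat, f_pval = het_breuschpagan(resid, X)
print(lm, lm_pval)
```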

B. White (1980) Test

The most general test of heteroscedasticity: no specification of the form of the heteroscedasticity is required.

(i) Run an OLS regression and use it to calculate û² (i.e. the square of each residual).
(ii) Use û² as the dependent variable in another regression, in which the regressors are:
(a) all k original independent variables, and
(b) the square of each independent variable (excluding dummy variables), and all two-way interactions (or cross-products) between the independent variables.
The square of a dummy variable is excluded because it would be perfectly correlated with the dummy variable itself.

Call the total number of regressors (not including the constant term) in this second equation P.

(iii) From the results of equation 2, calculate the test statistic

$$nR^2 \sim \chi^2_P$$

where n = sample size and R² = the unadjusted coefficient of determination.
The statistic is asymptotically (i.e. in large samples) distributed as chi-squared with P degrees of freedom, where P is the number of regressors in the auxiliary regression, not including the constant.
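statsmodels also provides White's test directly; a sketch continuing from the Breusch-Pagan example above (het_white builds the auxiliary regression of levels, squares and cross-products itself):

```python
from statsmodels.stats.diagnostic import het_white

# resid and X are reused from the Breusch-Pagan sketch above.
lm, lm_pval, fstat, f_pval = het_white(resid, X)
print(lm, lm_pval)
```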

Notes on White's test:

The White test does not make any assumptions about the particular form of heteroskedasticity, and so is quite general in application.
It does not require that the error terms be normally distributed.
However, rejecting the null may be an indication of model specification error, as well as, or instead of, heteroskedasticity.

Its generality is both a virtue and a shortcoming:
it might reveal heteroscedasticity, but the null might also be rejected as a result of missing variables;
it is "nonconstructive" in the sense that its rejection does not provide any clear indication of how to proceed.

NB: if you use White's standard errors, eradicating the heteroscedasticity is less important.

Problems:
Note that although t-tests become reliable when you use White's standard errors, F-tests are still not reliable (so Chow's first test is still not reliable).
White's SEs have been found to be unreliable in small samples, but revised methods for small samples have been developed to allow robust SEs to be calculated for small n.
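In statsmodels, White's correction is a one-argument change at fit time, and HC3 is one of the revised small-sample variants; a sketch continuing the illustrative data above:

```python
# y and X are reused from the Breusch-Pagan sketch above.
ols_fit = sm.OLS(y, X).fit()                  # ordinary (unreliable) SEs
white_fit = sm.OLS(y, X).fit(cov_type="HC0")  # White's robust SEs
hc3_fit = sm.OLS(y, X).fit(cov_type="HC3")    # small-sample refinement

print(ols_fit.bse, white_fit.bse, hc3_fit.bse)
```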

Heteroskedasticity Tests

Glejser test
This makes sense conceptually: you are testing to see if one of your independent variables is significantly related to the variance of your residuals, by regressing the absolute residuals on it:

$$|\hat{u}_i| = b_0 + b_1 X_i + e_i$$
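A Glejser-style check, continuing the same illustrative data: regress the absolute residuals on the regressors and inspect the slope's significance.

```python
import numpy as np

# resid and X are reused from the Breusch-Pagan sketch above.
glejser_fit = sm.OLS(np.abs(resid), X).fit()
print(glejser_fit.tvalues[1], glejser_fit.pvalues[1])  # slope on x
```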

Generalized Least Squares

Under heteroskedasticity, OLS is unbiased, but not efficient: the OLS weights are not optimal.
Suppose we are estimating a straight line through the origin:

$$Y = \beta X + \varepsilon$$

Under homoskedasticity, observations with higher X values are relatively less distorted by the error term, so OLS places greater weight on observations with high X values.

Generalized Least Squares

Suppose observations with higher X values have error terms with much higher variances.
Under this DGP, observations with high X's (and high variances of ε) may be more misleading than observations with low X's (and low variances of ε).
In general, we want to put more weight on observations with smaller σ_i².

[Figure 10.3: Heteroskedasticity with smaller disturbances at smaller X's.]

Generalized Least Squares

To construct the BLUE estimator for β_S, we follow the same steps as before, but with our new variance formula. The resulting estimator is Generalized Least Squares:
1. Start with a linear estimator, Σ w_i Y_i
2. Impose the unbiasedness conditions: Σ w_i X_Ri = 0 for R ≠ S, and Σ w_i X_Si = 1
3. Find the w_i that minimize Σ w_i² σ_i²

Generalized Least Squares (cont.)

For an example, consider the DGP:

$$Y_i = \beta X_i + \varepsilon_i$$
$$E(\varepsilon_i) = 0$$
$$\operatorname{Var}(\varepsilon_i) = d_i^2 \sigma^2$$
$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad \text{for } i \neq j$$

X_i, d_i fixed across samples.

Generalized Least Squares (cont.)

We choose the weights to minimize the variance subject to unbiasedness:

$$\min_{w} \sum_i w_i^2 d_i^2 \sigma^2 \quad \text{such that} \quad \sum_j w_j X_j = 1$$

Solution:

$$w_i = \frac{X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)

$$\hat{\beta}_{GLS} = \frac{\sum_i Y_i X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)


In practice, econometricians choose a different method
for implementing GLS.
Historically, it was computationally difficult to program a
new estimator (with its own weights) for every
different dataset.
It was easier to re-weight the data first, and THEN apply
the OLS estimator.


Generalized Least Squares (cont.)

We want to transform the data so that it is homoskedastic. Then we can apply OLS.
It is convenient to rewrite the variance term of the heteroskedastic DGP as

$$\operatorname{Var}(\varepsilon_i) = d_i^2 \sigma^2$$

Generalized Least Squares (cont.)

If we know the d_i factor for each observation, we can transform the data by dividing through by d_i.
Once we divide all variables by d_i, we obtain a new dataset that meets the Gauss-Markov conditions.

GLS: DGP for Transformed Data

$$\frac{Y_i}{d_i} = \beta_0 \frac{1}{d_i} + \beta_1 \frac{X_i}{d_i} + \frac{\varepsilon_i}{d_i}$$

$$E\!\left(\frac{\varepsilon_i}{d_i}\right) = 0$$

$$\operatorname{Var}\!\left(\frac{\varepsilon_i}{d_i}\right) = \frac{1}{d_i^2}\operatorname{Var}(\varepsilon_i) = \frac{1}{d_i^2}\, d_i^2 \sigma^2 = \sigma^2$$

$$\operatorname{Cov}\!\left(\frac{\varepsilon_i}{d_i}, \frac{\varepsilon_j}{d_j}\right) = \frac{1}{d_i d_j}\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$$

X_i / d_i fixed across samples.

Generalized Least Squares

This procedure, Generalized Least Squares, has two steps:
1. Divide all variables by d_i
2. Apply OLS to the transformed variables

This procedure optimally weights down observations with high d_i's.
GLS is unbiased and efficient.
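In practice the two steps are packaged as weighted least squares; a statsmodels sketch assuming d_i is known (here d_i = x_i is an illustrative choice, with simulated data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(1, 10, n)
d = x                                # assumed known scale of the error SD
y = 2 + 3 * x + rng.normal(0, d)     # Var(eps_i) = d_i**2 * sigma**2

X = sm.add_constant(x)

# WLS with weights proportional to 1/d_i**2 is equivalent to dividing
# every variable through by d_i and then running OLS.
gls_fit = sm.WLS(y, X, weights=1.0 / d**2).fit()
print(gls_fit.params)
```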

Generalized Least Squares (cont.)

Example: a straight line through the origin.

1. First, divide Y_i and X_i by d_i.
2. Apply OLS to Y_i/d_i and X_i/d_i, which yields

$$\hat{\beta}_{GLS} = \frac{\sum_i Y_i X_i / d_i^2}{\sum_j X_j^2 / d_j^2}$$

Generalized Least Squares (cont.)

Note: we derive the same BLUE estimator (Generalized Least Squares) whether we:
1. Find the optimal weights for heteroskedastic data, or
2. Transform the data to be homoskedastic, then use OLS weights.

GLS: An Example (Chapter 10.5)

We can solve heteroskedasticity by dividing our variables through by d_i.
The DGP with the transformed data is Gauss-Markov.
The catch: we don't observe d_i.
How can we implement this strategy in practice?

GLS: An Example (cont.)

We want to estimate the relationship

$$\text{rent}_i = \beta_0 + \beta_1 \text{income}_i + \varepsilon_i$$

We are concerned that higher-income individuals are less constrained in how much income they spend on rent. Lower-income individuals cram into what housing they can afford; higher-income individuals find housing to suit their needs/tastes.
That is, Var(ε_i) may vary with income.

GLS: An Example (cont.)

An initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2, \quad \text{i.e. } d_i = \text{income}_i$$

If we have modeled the heteroskedasticity correctly, then the BLUE estimator comes from:

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

[Table 10.1: Rent and Income in New York.]

[Table 10.5: Estimating a Transformed Rent-Income Relationship, Var(ε_i) = σ² X_i².]

Checking Understanding

An initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2, \quad \text{i.e. } d_i = \text{income}_i$$

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

How can we test to see if we have correctly modeled the heteroskedasticity?

Checking Understanding

If we have the correct model of heteroskedasticity, then OLS with the transformed data should be homoskedastic:

$$\frac{\text{rent}_i}{\text{income}_i} = \beta_0 \frac{1}{\text{income}_i} + \beta_1 + v_i$$

We can apply either a White test or a Breusch-Pagan test for heteroskedasticity to the model with the transformed data, as sketched below.
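A sketch of that check with simulated stand-in data (in this simulation the true variance is proportional to income, so dividing through by income overcorrects, as in the example that follows):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Synthetic stand-in for the rent-income data (illustrative only).
rng = np.random.default_rng(4)
income = rng.uniform(10, 100, 500)
rent = 5 + 0.3 * income + rng.normal(0, np.sqrt(income))

# Transform by d_i = income_i: regress rent/income on 1/income (plus constant).
y_t = rent / income
X_t = sm.add_constant(1.0 / income)

resid_t = sm.OLS(y_t, X_t).fit().resid
lm, lm_pval, _, _ = het_breuschpagan(resid_t, X_t)
print(lm, lm_pval)  # small p-value: the transformed model is still heteroskedastic
```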

Checking Understanding (cont.)

To run the White test, we regress

$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{\text{income}_i} + \alpha_2 \frac{1}{\text{income}_i^2} + \nu_i$$

nR² = 7.17
The critical value at the 0.05 significance level for a Chi-square statistic with 2 degrees of freedom is 5.99.
We reject the null hypothesis.

GLS: An Example

Our initial guess:

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i^2$$

This guess didn't do very well. Can we do better?
Instead of blindly guessing, let's try looking at the data first.

[Figure 10.4: The rent-income ratio plotted against the inverse of income.]

GLS: An Example

We seem to have overcorrected for heteroskedasticity. Let's try

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \, \text{income}_i, \quad \text{i.e. } d_i = \sqrt{\text{income}_i}$$

$$\frac{\text{rent}_i}{\sqrt{\text{income}_i}} = \beta_0 \frac{1}{\sqrt{\text{income}_i}} + \beta_1 \sqrt{\text{income}_i} + v_i$$

[Table 10.6: Estimating a Second Transformed Rent-Income Relationship, Var(ε_i) = σ² X_i.]

GLS: An Example

Unthinking application of the White test procedure to the transformed data leads to

$$e_i^2 = \alpha_0 + \alpha_1 \frac{1}{\sqrt{\text{income}_i}} + \alpha_2 \frac{1}{\text{income}_i} + \alpha_3 \sqrt{\text{income}_i} + \alpha_4 \, \text{income}_i + \alpha_5 \frac{1}{\sqrt{\text{income}_i}} \sqrt{\text{income}_i} + \nu_i$$

The interaction term reduces to a constant, which we already have in the auxiliary equation, so we omit it and use only the first 4 explanators.

GLS: An Example (cont.)

nR² = 6.16
The critical value at the 0.05 significance level for a Chi-squared statistic with 4 degrees of freedom is 9.49.
We fail to reject the null hypothesis that the transformed data are homoskedastic.
Warning: failing to reject a null hypothesis does NOT mean we can accept it.

GLS: An Example (cont.)

Generalized Least Squares is not trivial to apply in practice.
Figuring out a reasonable d_i can be quite difficult.
Next time we will learn another approach to constructing d_i: Feasible Generalized Least Squares.

5. Solutions
A. Weighted Least Squares
B. Maximum likelihood estimation (not covered)
C. White's Standard Errors

[Figure 10.2: Homoskedastic disturbances are more misleading at smaller X's.]

A. Weighted Least Squares

If the differences in variability of the error term can be predicted from another variable within the model, the Weight Estimation procedure (available in SPSS) can be used. It computes the coefficients of a linear regression model using WLS, such that the more precise observations (that is, those with less variability) are given greater weight in determining the regression coefficients.

Problems:
A wrong choice of weights can produce biased estimates of the standard errors. Since we can never know for sure whether we have chosen the correct weights, this is a real problem.
If the weights are correlated with the disturbance term, then the WLS slope estimates will be inconsistent.
Also: Dickens (1990) found that errors in grouped data may be correlated within groups, so that weighting by the square root of the group size may be inappropriate. See Binkley (1992) for an assessment of tests of grouped heteroscedasticity.

C. White's Standard Errors

White (op. cit.) developed an algorithm for correcting the standard errors in OLS when heteroscedasticity is present. The correction procedure does not assume any particular form of heteroscedasticity, and so in some ways White has solved the heteroscedasticity problem.

Summary
(1) Causes
(2) Consequences
(3) Detection
(4) Solutions


Reading:
Kennedy, P. (1998) A Guide to Econometrics, Chapters 5, 6, 7 & 9.
Maddala, G.S. (1992) Introduction to Econometrics, Chapter 12.
Field, A. (2000) Chapter 4, particularly pages 141-162.
Greene, W.H. (1990) Econometric Analysis.

Grouped heteroscedasticity:
Binkley, J.K. (1992) "Finite Sample Behaviour of Tests for Grouped Heteroskedasticity," Review of Economics and Statistics, 74, 563-8.
Dickens, W.T. (1990) "Error Components in Grouped Data: Is It Ever Worth Weighting?," Review of Economics and Statistics, 72, 328-33.

Breusch-Pagan critique:
Koenker, R. (1981) "A Note on Studentizing a Test for Heteroscedasticity," Journal of Econometrics, 17, 107-12.
