Heteroskedasticity

Lecture 8: Heteroskedasticity
Xing Hong
Department of Economics
University of Maryland, College Park
Fall 2015
Xing Hong (UMD) ECON422 Econometrics Fall 2015 1 / 30

Gauss-Markov Assumptions Revisited
Assumption MLR.1 (Linear in Parameters): The model in the population can be written as
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$
where $\beta_0, \beta_1, \dots, \beta_k$ are the unknown parameters (constants) of interest and $u$ is an unobserved random error or disturbance term.
Linear regression: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2^2 + u$ (linear in the parameters, even though $x_2$ enters squared).
Non-linear regression: $y = \beta_0 + \beta_1 x_1 + \beta_1^2 x_2 + u$.
Assumption MLR.2 (Random Sampling): We have a random sample of $n$ observations $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i) : i = 1, 2, \dots, n\}$ following the population model in Assumption MLR.1.
This ensures that the error terms $\{u_1, u_2, \dots, u_n\}$ are i.i.d.

Assumption MLR.3 (No Perfect Collinearity): In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.
"None of the independent variables is constant": similar to the assumption made in the simple regression model.
In order to know the effect of $x_1$ on $y$, you need at least some variation in $x_1$ in the sample.
Implication: the sample size must satisfy $n \ge k + 1$.
No exact linear relationship among $x_1, x_2, \dots, x_k$:
Exact linear relationship: $x_3 = a x_1 + b x_2$ (not allowed)
Non-linear relationship: $x_3 = a \log(x_1) + b x_2^2$ (okay)
If a model suffers from perfect collinearity, then it cannot be estimated by OLS.

Assumption MLR.4 (Zero Conditional Mean): The error $u$ has an expected value of zero given any values of the independent variables. In other words,
$$E(u|x_1, x_2, \dots, x_k) = 0$$
A more realistic assumption under multiple regression than under simple regression.
Once again, it is the key assumption in deriving the unbiasedness of the OLS estimators.
When the assumption holds, we say that we have exogenous explanatory variables. If $x_j$ is correlated with $u$ for any reason, then $x_j$ is said to be an endogenous explanatory variable.
Suppose that the true model is $consumption = \beta_0 + \beta_1 inc + \beta_2 inc^2 + u$, but you estimate $consumption = \beta_0 + \beta_1 inc + u$. Then the error of the estimated model has $E(u|inc) \neq 0$, i.e. the zero conditional mean assumption fails and the OLS estimators will (generally) be biased.
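
The consumption example can be made explicit by writing out the error of the misspecified model. This is only a sketch: the label $v$ for that error is not in the original slides, and it assumes $E(u|inc) = 0$ holds in the true model.

```latex
v = \beta_2\, inc^2 + u
\quad\Longrightarrow\quad
E(v \mid inc) = \beta_2\, inc^2 + E(u \mid inc) = \beta_2\, inc^2 ,
\qquad \text{which is nonzero whenever } \beta_2 \neq 0 \text{ and } inc \neq 0 .
```

The omitted term $\beta_2 inc^2$ ends up in the error and moves with $inc$, which is exactly what MLR.4 rules out.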

Assumption MLR.5 (Homoskedasticity): The error $u$ has the same variance given any values of the explanatory variables. In other words,
$$\mathrm{Var}(u|x_1, \dots, x_k) = \sigma^2$$
This assumption is not needed to show unbiasedness.
It is the key assumption in deriving the variance of the OLS estimator and its efficiency property.
If the assumption fails, for example $\mathrm{Var}(u|x_1, \dots, x_k) = \sigma^2 x_1^2$, then we have heteroskedasticity and the OLS estimator no longer has the efficiency property stated in the Gauss-Markov Theorem.

Heteroskedasticity
$$\mathrm{Var}(u_i|x_i) = \sigma_i^2 = \sigma^2 h(x_i)$$
The variance of the error depends on the level of $x_i$, i.e., it varies across observations.
E.g. 1: $\mathrm{Var}(u_i|x_i) = x_i^2$
E.g. 2: $\mathrm{Var}(u_i|x_i) = e^{x_i}$
Heteroskedasticity is a violation of homoskedasticity (Assumption MLR.5).

What does heteroskedasticity look like?
Recall that
$$\mathrm{Var}(y_i|x_i) = \mathrm{Var}(u_i|x_i) = \sigma_i^2$$
Which figure has heteroskedasticity?

Consequences of Heteroskedasticity
Assuming the other Gauss-Markov assumptions (MLR.1 to MLR.4) hold:

1. OLS estimators remain unbiased (unbiasedness relies on MLR.4, not on MLR.5).
2. OLS estimators remain consistent (an asymptotic property; see Ch. 5).
3. The formula for the variance of OLS estimators is no longer valid.
4. The formula for the standard error of OLS estimators is no longer valid.
We will see how to adjust the standard error formula.
5. OLS is no longer BLUE.
We will show how to find a more efficient estimator than OLS.
But first, let's see how to test for heteroskedasticity.

Testing for Heteroskedasticity
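Homoskedasticity:
$$\mathrm{Var}(u|x) = E(u^2|x) = \sigma^2$$
(The first equality holds because $E(u|x) = 0$.)
We want to test $E(u^2|x) = \sigma^2$; in other words, that the expected value of $u^2$ does not depend on the values of $x$.
A simple approach is to assume a linear function
$$u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + v$$
The null hypothesis of homoskedasticity is
$$H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$$
However, since we do not observe the actual errors, we replace $u^2$ with the squared OLS residuals $\hat u^2$ after estimating the regression.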
Breusch-Pagan (BP) Test for Heteroskedasticity
Implementation:
1. Estimate the model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$ by OLS, as usual. Obtain the squared OLS residuals, $\hat u_i^2$ (one for each observation).
2. Run the auxiliary regression
$$\hat u_i^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + v$$
3. Test the null hypothesis $H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$ using the $F_{k,\,n-k-1}$ statistic from the auxiliary regression.
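
The three steps above can be sketched in a few lines of Python for the single-regressor case ($k = 1$). This is a minimal illustration, not part of the original slides: the function names `ols_simple` and `breusch_pagan_f` and the simulated data are assumptions made for the example, and the F statistic is computed from the R-squared of the auxiliary regression.

```python
import random

def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def breusch_pagan_f(x, y):
    """F statistic for H0: delta_1 = 0 in the auxiliary regression (k = 1)."""
    n, k = len(x), 1
    _, _, u_hat = ols_simple(x, y)           # step 1: OLS residuals
    u2 = [u ** 2 for u in u_hat]             # squared residuals
    _, _, v_hat = ols_simple(x, u2)          # step 2: auxiliary regression
    u2bar = sum(u2) / n
    sst = sum((a - u2bar) ** 2 for a in u2)
    ssr = sum(v ** 2 for v in v_hat)
    r2 = 1.0 - ssr / sst                     # R-squared of auxiliary regression
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))   # step 3: F_{k, n-k-1}

# Simulated data (illustration only): the error standard deviation equals x,
# so Var(u|x) = x^2 and the null of homoskedasticity is false.
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, xi) for xi in x]
f_stat = breusch_pagan_f(x, y)
print(f_stat)
```

A large value of `f_stat` relative to the $F_{1,\,n-2}$ critical value leads to rejecting homoskedasticity.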

Correcting for Heteroskedasticity
If we reject the null hypothesis of homoskedasticity (for example, because the p-value is sufficiently small), we must take corrective measures to account for heteroskedasticity:
Use heteroskedasticity-robust standard errors.
Use weighted least squares estimators, which are more efficient than OLS when the form of the variance is correctly specified.

Heteroskedasticity-Robust Standard Error: SLR
Consider the simple linear regression model
$$y = \beta_0 + \beta_1 x + u$$
The OLS estimator of $\beta_1$ can be written as
$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x) u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$

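With homoskedasticity: $\mathrm{Var}(u_i|x_i) = \sigma^2$,
$$\begin{aligned}
\mathrm{Var}(\hat\beta_1)
&= \mathrm{Var}\left(\beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x) u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}\right)
 = \mathrm{Var}\left(\frac{\sum_{i=1}^{n} (x_i - \bar x) u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}\right) \\
&= \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \,\mathrm{Var}(u_i)}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}
 = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \sigma^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2} \\
&= \frac{\sigma^2 \sum_{i=1}^{n} (x_i - \bar x)^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}
 = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar x)^2}
\end{aligned}$$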

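With heteroskedasticity: $\mathrm{Var}(u_i|x_i) = \sigma_i^2$,
$$\frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \,\mathrm{Var}(u_i)}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}
\neq \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \sigma^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}$$
Instead,
$$\mathrm{Var}(\hat\beta_1)
= \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \,\mathrm{Var}(u_i)}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}
= \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \sigma_i^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}$$
which cannot be simplified further.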

As before, we do not observe $\sigma_i^2$. We need to use an estimator: $\hat u_i^2$, where $\hat u_i$ is the residual from the initial regression of $y$ on $x$.
A valid estimator of $\mathrm{Var}(\hat\beta_1)$ for heteroskedasticity of any form (including homoskedasticity) is
$$\widehat{\mathrm{Var}}(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \hat u_i^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}$$
This is for simple linear regression: $k = 1$.
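
To make the simple-regression formulas concrete, the following Python sketch computes both the usual and the heteroskedasticity-robust standard error of the slope. The function name `slope_standard_errors` and the simulated data are assumptions for illustration; the usual standard error uses the degrees-of-freedom-adjusted estimate of $\sigma^2$, $\sum \hat u_i^2 / (n - 2)$.

```python
import math
import random

def slope_standard_errors(x, y):
    """Return (slope, usual s.e., robust s.e.) for OLS of y on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sst_x = sum(d * d for d in dx)                       # sum of (x_i - xbar)^2
    b1 = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sst_x
    b0 = ybar - b1 * xbar
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
    # Usual formula: sigma^2-hat / SST_x, valid under homoskedasticity.
    sigma2_hat = sum(u * u for u in u_hat) / (n - 2)
    se_usual = math.sqrt(sigma2_hat / sst_x)
    # Robust formula: sum of (x_i - xbar)^2 * u_hat_i^2 over SST_x squared.
    num = sum(d * d * u * u for d, u in zip(dx, u_hat))
    se_robust = math.sqrt(num / sst_x ** 2)
    return b1, se_usual, se_robust

# Simulated heteroskedastic data (illustration only): error s.d. equals x.
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, xi) for xi in x]
b1, se_usual, se_robust = slope_standard_errors(x, y)
print(b1, se_usual, se_robust)
```

The point estimate is the same in both cases; only the standard error attached to it changes.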

Heteroskedasticity-Robust Standard Error: MLR
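For multiple regression, a valid estimator of $\mathrm{Var}(\hat\beta_j)$ under Assumptions MLR.1 through MLR.4 is
$$\widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\sum_{i=1}^{n} \hat r_{ij}^2 \hat u_i^2}{\left[SST_j (1 - R_j^2)\right]^2}$$
where $\hat r_{ij}$ denotes the $i$-th residual from regressing $x_j$ on all other independent variables, $SST_j = \sum_{i=1}^{n} (x_{ij} - \bar x_j)^2$, and $R_j^2$ is the R-squared from this regression. The denominator is the square of the sum of squared residuals from that regression, since $SSR_j = SST_j (1 - R_j^2)$.
Heteroskedasticity-robust standard error, or simply robust standard error:
$$\widehat{\mathrm{se}}(\hat\beta_j) = \sqrt{\frac{\sum_{i=1}^{n} \hat r_{ij}^2 \hat u_i^2}{\left[SST_j (1 - R_j^2)\right]^2}} = \frac{\sqrt{\sum_{i=1}^{n} \hat r_{ij}^2 \hat u_i^2}}{SST_j (1 - R_j^2)}$$
The corresponding t-statistics are called heteroskedasticity-robust t-statistics.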
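
As a consistency check, the multiple-regression estimator $\sum_i \hat r_{ij}^2 \hat u_i^2 / [SST_j (1 - R_j^2)]^2$ can be specialized to $k = 1$. The sketch below assumes that with a single regressor, "all other independent variables" means the constant only, so the partialling-out regression is $x$ on a constant.

```latex
k = 1:\qquad
\hat r_{i1} = x_i - \bar x, \qquad
R_1^2 = 0, \qquad
SST_1 = \sum_{i=1}^{n} (x_i - \bar x)^2
\\[4pt]
\Longrightarrow\quad
\widehat{\mathrm{Var}}(\hat\beta_1)
= \frac{\sum_{i=1}^{n} \hat r_{i1}^2\, \hat u_i^2}{\left[SST_1 (1 - R_1^2)\right]^2}
= \frac{\sum_{i=1}^{n} (x_i - \bar x)^2\, \hat u_i^2}{\left(\sum_{i=1}^{n} (x_i - \bar x)^2\right)^2}
```

This is the simple-regression robust variance estimator, so the two formulas agree.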
Remarks
Question: If the heteroskedasticity-robust standard errors are valid more
often than the usual OLS standard errors, why do we bother with the
usual OLS standard errors and test statistics at all?
The robust standard errors and robust t statistics are justified
only as the sample size becomes large, even if the CLM
assumptions are true.
The approximation can be poor when sample sizes are small.
If the CLM assumptions (homoskedasticity and normality, in
particular) are true, then the usual OLS t-statistics have exact
t-distribution, regardless of the sample size.
Economists prefer the usual OLS standard errors and test statistics,
unless there is evidence of heteroskedasticity.
In practice:
The trend in applied work is to always report only the
heteroskedasticity-robust standard errors in cross-sectional
applications with large sample sizes.
It is also common to report both standard errors and let the reader determine whether any conclusions are sensitive to the standard error.
Remarks
There are also heteroskedasticity-robust F statistics.

You should use them in place of the usual F statistics in the presence
of heteroskedasticity.
Stata and other statistical packages can do this for you.
In Stata, if you want to estimate
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$
and obtain robust standard errors, use the command:

regress y x1 x2, robust
The OLS point estimates (the coefficients) are not changed, but the standard errors, t-statistics, F-statistic, p-values, and confidence intervals are.

Weighted Least Squares (WLS)
Gauss-Markov Theorem
Under Assumptions MLR.1 through MLR.5, the OLS estimators $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ are the best linear unbiased estimators (BLUEs) of $\beta_0, \beta_1, \dots, \beta_k$, respectively.
When the Gauss-Markov assumptions are true, we need not look for alternative unbiased linear estimators: none will be better.
With heteroskedasticity, the Gauss-Markov theorem no longer applies. OLS is no longer the best.
We can specify the form of heteroskedasticity and use Weighted
Least Squares (WLS).
WLS is more efficient than OLS in the presence of
heteroskedasticity.

Weighted Least Squares
We consider two cases here:

The form of heteroskedasticity is known up to a multiplicative
constant.
The heteroskedasticity function must be estimated.

Weighted Least Squares
Consider the model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u \qquad (1)$$
Assume
$$\mathrm{Var}(u|x) = \sigma^2 h(x)$$
where $h(x) > 0$ is some function of the explanatory variables.
The idea is to transform the regression equation above so that all the Gauss-Markov assumptions are satisfied in the transformed model, and then apply OLS.
Since the Gauss-Markov assumptions hold in the transformed model,
the OLS estimators there will be BLUE.

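$\mathrm{Var}(u|x) = \sigma^2 h(x)$: when $h(x)$ is known
Original model with heteroskedasticity:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$$
Write $h(x_i)$ as $h_i$, which is known. Divide both sides by $\sqrt{h_i}$:
$$\frac{y_i}{\sqrt{h_i}} = \beta_0 \frac{1}{\sqrt{h_i}} + \beta_1 \frac{x_{i1}}{\sqrt{h_i}} + \beta_2 \frac{x_{i2}}{\sqrt{h_i}} + \dots + \beta_k \frac{x_{ik}}{\sqrt{h_i}} + \frac{u_i}{\sqrt{h_i}}$$
to get
$$y_i^* = \beta_0 x_{i0}^* + \beta_1 x_{i1}^* + \beta_2 x_{i2}^* + \dots + \beta_k x_{ik}^* + u_i^* \qquad (2)$$
where $y_i^* = y_i / \sqrt{h_i}$, $x_{i0}^* = 1 / \sqrt{h_i}$, $x_{i1}^* = x_{i1} / \sqrt{h_i}$, and so on.
The transformed error is homoskedastic:
$$\mathrm{Var}(u_i^*|x_i) = \mathrm{Var}\left(\frac{u_i}{\sqrt{h_i}} \,\Big|\, x_i\right) = \frac{1}{h_i} \mathrm{Var}(u_i|x_i) = \frac{1}{h_i} h(x_i) \sigma^2 = \sigma^2$$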
The transformed model (2) satisfies all the Gauss-Markov assumptions.
The WLS estimators are the OLS estimators in the transformed model.
They minimize:
$$\sum_{i=1}^{n} \left(y_i^* - b_0 x_{i0}^* - b_1 x_{i1}^* - b_2 x_{i2}^* - \dots - b_k x_{ik}^*\right)^2$$
In terms of the original data, the objective function is:
$$\sum_{i=1}^{n} \frac{\left(y_i - b_0 - b_1 x_{i1} - b_2 x_{i2} - \dots - b_k x_{ik}\right)^2}{h_i}$$
That is, we weight the squared residuals by $1/h_i$. The estimators that minimize the weighted sum of squared residuals are therefore called weighted least squares estimators.
Less weight ($1/h_i$) is given to observations with a higher error variance ($\sigma^2 h_i$). (OLS gives all observations equal weights.)
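
The equivalence between OLS on the transformed data and the weighted objective can be checked numerically. The Python sketch below handles one regressor with known $h_i$; the function name `wls_via_transform` and the small data set are hypothetical, made up for illustration only. It runs OLS on the transformed variables (two regressors, no separate intercept) and then verifies the first-order conditions of the weighted sum of squares in the original data.

```python
import math

def wls_via_transform(x, y, h):
    """WLS for y = b0 + b1*x + u with Var(u|x) = sigma^2 * h_i,
    computed as OLS on the data divided by sqrt(h_i)."""
    s = [math.sqrt(hi) for hi in h]
    y_star = [yi / si for yi, si in zip(y, s)]
    x0_star = [1.0 / si for si in s]            # transformed "intercept" column
    x1_star = [xi / si for xi, si in zip(x, s)]
    # Normal equations of the two-regressor, no-intercept regression.
    a00 = sum(v * v for v in x0_star)
    a01 = sum(u * v for u, v in zip(x0_star, x1_star))
    a11 = sum(v * v for v in x1_star)
    c0 = sum(u * v for u, v in zip(x0_star, y_star))
    c1 = sum(u * v for u, v in zip(x1_star, y_star))
    det = a00 * a11 - a01 * a01
    b0 = (c0 * a11 - c1 * a01) / det            # Cramer's rule
    b1 = (a00 * c1 - a01 * c0) / det
    return b0, b1

# Hypothetical numbers, for illustration only; here h_i = x_i.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3]
b0, b1 = wls_via_transform(x, y, h=x)

# First-order conditions of the weighted problem in the original data:
# sum(resid / h) = 0 and sum(x * resid / h) = 0.
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
foc_const = sum(r / hi for r, hi in zip(resid, x))
foc_slope = sum(r * xi / hi for r, xi, hi in zip(resid, x, x))
print(b0, b1, foc_const, foc_slope)
```

Both first-order conditions are zero up to rounding, so the transformed-model OLS estimates also minimize the weighted sum of squared residuals.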

Example
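Consider a simple model for savings:
$$sav_i = \beta_0 + \beta_1 inc_i + u_i$$
and
$$\mathrm{Var}(u_i|inc_i) = \sigma^2 inc_i$$
That is, $h(x) = h(inc) = inc$.
The transformed model is
$$\frac{sav_i}{\sqrt{inc_i}} = \beta_0 \frac{1}{\sqrt{inc_i}} + \beta_1 \frac{inc_i}{\sqrt{inc_i}} + \frac{u_i}{\sqrt{inc_i}}$$
The error term in the transformed model is $u_i^* = u_i / \sqrt{inc_i}$. Therefore,
$$\mathrm{Var}(u_i^*|inc_i) = \mathrm{Var}\left(\frac{u_i}{\sqrt{inc_i}} \,\Big|\, inc_i\right) = \frac{1}{inc_i} \mathrm{Var}(u_i|inc_i) = \sigma^2$$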

Example
We obtain the weighted least squares (WLS) estimators of the original model
$$sav_i = \beta_0 + \beta_1 inc_i + u_i$$
by getting the OLS estimators of the transformed model
$$\frac{sav_i}{\sqrt{inc_i}} = \beta_0 \frac{1}{\sqrt{inc_i}} + \beta_1 \frac{inc_i}{\sqrt{inc_i}} + \frac{u_i}{\sqrt{inc_i}}$$
Stata and other statistical packages have features for computing weighted least squares.
Typically, along with the dependent and independent variables in the original model, we just specify the weighting function, $1/h_i$.
The estimates and standard errors will be different from OLS.
The interpretation of the estimates and test statistics remains the same.
Exercise
Consider the simple model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$
and
$$\mathrm{Var}(u_i|x_{i1}, x_{i2}) = \sigma^2 e^{x_{i1}}$$
Here, $h(x_1, x_2) = e^{x_1}$.
What is the transformed model here? What is the error term $u_i^*$ in the transformed model in terms of $u_i$ and $x_i$?
Show that
$$\mathrm{Var}(u_i^*|x_{i1}, x_{i2}) = \sigma^2$$

$\mathrm{Var}(u|x) = \sigma^2 h(x)$: when the form of $h(x)$ is unknown
In the previous examples, the form of heteroskedasticity is known up to a multiplicative constant:
$$\mathrm{Var}(u|x) = \sigma^2 h(x)$$
In most cases, the exact form of heteroskedasticity is not obvious.
We need to estimate the function $h(x)$ from data.
For each $h_i$, we have an estimate $\hat h_i$.
Using $\hat h_i$ in the transformed model instead of $h_i$ yields an estimator called the Feasible Generalized Least Squares (FGLS) estimator.

FGLS: Approach
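We still need to make some assumption about the form of $h(x)$. It is just that we are not as specific as before.
A fairly common and flexible approach is to assume:
$$\mathrm{Var}(u|x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k)$$
and estimate $\delta_0, \delta_1, \dots, \delta_k$ with data. Equivalently,
$$u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k)\, v$$
where $v$ has mean one conditional on $x$. Taking logs, and assuming $v$ is independent of $x$,
$$\log(u^2) = \alpha_0 + \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + e$$
where $\alpha_0$ collects the constants $\log \sigma^2$ and $E[\log v]$, and $e$ is a mean-zero error.
As usual, we must replace the unobserved $u$ with the OLS residuals.
Therefore, we run the regression of $\log(\hat u^2)$ on $x_1, x_2, \dots, x_k$.
The fitted values $\hat g_i$ from this regression give us the weights:
$$\hat h_i = \exp(\hat g_i)$$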
FGLS: Implementation
1. Run the OLS regression of $y$ on $x_1, x_2, \dots, x_k$ and obtain the residuals, $\hat u$.
2. Generate $\log(\hat u^2)$ by first squaring the OLS residuals and then taking the natural log.
3. Run the regression of $\log(\hat u^2)$ on $x_1, x_2, \dots, x_k$ and obtain the fitted values, $\hat g$.
4. Exponentiate the fitted values: $\hat h = e^{\hat g}$.
5. Estimate the equation $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$ by WLS, using weights $1/\hat h$.
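
The five FGLS steps can be sketched in Python for the single-regressor case. This is a minimal illustration under stated assumptions: the helper names `weighted_fit` and `fgls_simple` are hypothetical, the data are simulated with $\mathrm{Var}(u|x) = e^{x}$, and the weighted fit uses the closed-form solution for one regressor plus an intercept.

```python
import math
import random

def weighted_fit(x, y, w):
    """Minimize sum_i w_i * (y_i - b0 - b1*x_i)^2; w_i = 1 for all i gives OLS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw     # weighted means
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def fgls_simple(x, y):
    """FGLS for y = b0 + b1*x + u with Var(u|x) modeled as exp(linear in x)."""
    ones = [1.0] * len(x)
    b0, b1 = weighted_fit(x, y, ones)                    # step 1: OLS
    # Step 2: log of squared residuals (fails if a residual is exactly zero).
    log_u2 = [math.log((yi - b0 - b1 * xi) ** 2) for xi, yi in zip(x, y)]
    g0, g1 = weighted_fit(x, log_u2, ones)               # step 3: regress on x
    h_hat = [math.exp(g0 + g1 * xi) for xi in x]         # step 4: exponentiate
    weights = [1.0 / h for h in h_hat]
    return weighted_fit(x, y, weights)                   # step 5: WLS

# Simulated data (illustration only): true intercept 1, true slope 2,
# error s.d. exp(x / 2), so Var(u|x) = exp(x).
random.seed(1)
x = [random.uniform(0.0, 3.0) for _ in range(1000)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, math.exp(0.5 * xi)) for xi in x]
b0_fgls, b1_fgls = fgls_simple(x, y)
print(b0_fgls, b1_fgls)
```

Only the shape of $\hat h$ matters for the weights: multiplying every $\hat h_i$ by the same constant leaves the WLS estimates unchanged, which is why the intercept of the log regression does not need to be corrected.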
Remarks on FGLS:
FGLS is neither unbiased, nor BLUE.
But it is consistent and asymptotically more efficient than OLS.
It is an attractive alternative to OLS when there is heteroskedasticity
and the sample size is large.
Linear Probability Model Revisited
We call a multiple regression model a linear probability model (LPM) when its dependent variable is a dummy variable.
We showed in the previous lecture that homoskedasticity is always violated in a linear probability model:
$$\mathrm{Var}(y|x) = p(x)(1 - p(x))$$
where $p(x) = P(y = 1|x)$.
There are two ways to deal with heteroskedasticity in LPM:

use OLS estimation, but use heteroskedasticity-robust standard errors
in test statistics.
use feasible GLS: see textbook for detailed implementation.
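
For reference, the LPM variance formula follows in two lines from the fact that $y$ takes only the values 0 and 1. The sketch below uses $p(x) = P(y = 1 \mid x)$, which in the LPM equals $\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$.

```latex
y \in \{0, 1\} \;\Rightarrow\; y^2 = y
\;\Rightarrow\; E(y^2 \mid x) = E(y \mid x) = p(x)
\\[4pt]
\mathrm{Var}(y \mid x) = E(y^2 \mid x) - \left[E(y \mid x)\right]^2
= p(x) - p(x)^2 = p(x)\,\bigl(1 - p(x)\bigr)
```

Because $p(x)$ changes with $x$, the conditional variance does too, unless all slope coefficients are zero.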
