Chapter 6: Instrumental Variables (IV) estimator
Advanced Econometrics - HEC Lausanne
Christophe Hurlin
University of Orléans
Section 1
Introduction
References
Amemiya, T. (1985), Advanced Econometrics, Harvard University Press.
Greene, W. (2007), Econometric Analysis, sixth edition, Pearson - Prentice Hall (recommended).
Pelgrin, F. (2010), Lecture notes, Advanced Econometrics, HEC Lausanne (a special thank).
Ruud, P. (2000), An Introduction to Classical Econometric Theory, Oxford University Press.
Notations: In this chapter, I will (try to...) follow some conventions of notation.

f_Y(y)  probability density or mass function
F_Y(y)  cumulative distribution function
Pr()    probability
y       vector
Y       matrix
Section 2
Endogeneity
2. Endogeneity
Objectives
The objectives of this section are the following:

1. Define the endogeneity issue
2. Give examples of endogenous regressors (errors-in-variables, simultaneity, omitted variables)
3. Study the consequences of endogeneity on the OLS estimator
The reasons for suspecting E(ε | X) ≠ 0 are varied:

1. Errors-in-variables
2. Simultaneous equation bias
3. Omitted variables
1. Errors-in-variables

Consider the model:

y_i* = x_i*⊤ β + ε_i

where E(ε_i | x_i*) = 0. The variables y_i* and x_i* are not observed; instead we observe them with measurement errors:

y_i = y_i* + v_i
x_i = x_i* + w_i

with

E(v_i) = E(v_i ε_i) = E(v_i y_i*) = E(w_i⊤ x_i*) = 0

In terms of the observables, the model writes:

y_i − v_i = (x_i − w_i)⊤ β + ε_i
⟺ y_i = x_i⊤ β + ε_i + v_i − w_i⊤ β
⟺ y_i = x_i⊤ β + η_i

with η_i = ε_i + v_i − w_i⊤ β.

As a consequence:

E(η_i x_i) = E(ε_i x_i) + E(v_i x_i) − E(x_i w_i⊤) β = −E(w_i w_i⊤) β

since the first two terms vanish by assumption and x_i = x_i* + w_i with E(w_i⊤ x_i*) = 0. Hence:

E(η_i x_i) ≠ 0
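A quick simulation makes this inconsistency visible. The sketch below is in Python (not part of the original slides) with illustrative values: β = 2 and unit-variance measurement errors, so the plim of the OLS slope is β·Var(x*)/(Var(x*) + Var(w)) = 1, not 2.

```python
import numpy as np

# Errors-in-variables: y* = beta*x* + eps, but we observe y = y* + v, x = x* + w.
# OLS of observed y on observed x is inconsistent (attenuation bias towards 0).
rng = np.random.default_rng(42)
N, beta = 200_000, 2.0                 # illustrative values, not from the slides
x_star = rng.standard_normal(N)
eps = rng.standard_normal(N)
v = rng.standard_normal(N)             # measurement error on y
w = rng.standard_normal(N)             # measurement error on x
y = beta * x_star + eps + v            # observed y
x = x_star + w                         # observed x

b_ols = (x @ y) / (x @ x)              # OLS slope (no intercept, zero-mean data)
print(b_ols)                           # close to 1.0, far from the true beta = 2
```

The estimate concentrates around 1, exactly the attenuated probability limit, however large N is.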
2. Simultaneous equation bias

Consider a demand-supply system, with demand q_d = α1 p + α2 y + u_d, supply q_s = β1 p + u_s, and equilibrium condition q_d = q_s. Solving q_d = q_s, the reduced-form equations, which express the endogenous variables in terms of the exogenous variables, write:

p = (α2 y)/(β1 − α1) + (u_d − u_s)/(β1 − α1) = π1 y + w1

q = (β1 α2 y)/(β1 − α1) + (β1 u_d − α1 u_s)/(β1 − α1) = π2 y + w2

Therefore:

E(u_d p) = σ²_{u_d}/(β1 − α1) ≠ 0

The price p is correlated with the demand shock u_d: in the demand equation, p is an endogenous regressor.
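This correlation can be checked numerically. The sketch below (Python; the coefficient values α1 = −1, α2 = 0.5, β1 = 1 and the unit error variances are hypothetical choices, not values from the slides) simulates the system via its reduced forms and then estimates the demand equation by OLS:

```python
import numpy as np

# Demand: q_d = a1*p + a2*y + u_d ; Supply: q_s = b1*p + u_s ; equilibrium q_d = q_s.
# Reduced forms: p = (a2*y + u_d - u_s)/(b1 - a1), q = b1*p + u_s.
rng = np.random.default_rng(0)
N = 100_000
a1, a2, b1 = -1.0, 0.5, 1.0            # hypothetical structural coefficients
y_inc = rng.standard_normal(N)         # exogenous income
u_d = rng.standard_normal(N)
u_s = rng.standard_normal(N)

p = (a2 * y_inc + u_d - u_s) / (b1 - a1)
q = b1 * p + u_s

print(np.mean(u_d * p))                # close to Var(u_d)/(b1 - a1) = 0.5, not 0

# OLS on the demand equation q = a1*p + a2*y + u_d is therefore inconsistent:
X = np.column_stack([p, y_inc])
a1_hat = np.linalg.lstsq(X, q, rcond=None)[0][0]
print(a1_hat)                          # far from the true a1 = -1
```

With these parameter values the OLS estimate of the demand slope converges to 0 rather than −1: simultaneity can destroy even the sign of the estimated coefficient.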
3. Omitted variables
Consider the true model:
y_i = β1 + β2 x_{1i} + β3 x_{2i} + ε_i

with E(ε_i) = E(ε_i x_{1i}) = E(ε_i x_{2i}) = 0.

If we regress y on a constant and x_1 (omitted variable x_2):

y_i = β1 + β2 x_{1i} + η_i     with η_i = β3 x_{2i} + ε_i

If Cov(x_{1i}, x_{2i}) ≠ 0, then:

E(η_i x_{1i}) ≠ 0
Question
What is the consequence of the endogeneity assumption on the OLS
estimator?
Consider the (population) multiple linear regression model:

y = Xβ + ε

where (cf. chapter 3): y is a N × 1 vector and β = (β1 .. βK)⊤ is a K × 1 vector of parameters.
The OLS estimator is defined by:

β̂_OLS = (X⊤X)^{-1} X⊤ y

If we assume that E(ε | X) ≠ 0, then we have:

E(β̂_OLS | X) = β0 + (X⊤X)^{-1} X⊤ E(ε | X)

Since X⊤ E(ε | X) ≠ 0:

E(β̂_OLS) = E_X( E(β̂_OLS | X) ) ≠ β0

The OLS estimator is biased.
Assume that:

plim (1/N) X⊤ ε = γ ≠ 0_K

where γ = E(x_i ε_i) ≠ 0_K.
Given the definition of the OLS estimator:

β̂_OLS = β0 + (X⊤X)^{-1} X⊤ ε

We have:

plim β̂_OLS = β0 + ( plim (1/N) X⊤X )^{-1} ( plim (1/N) X⊤ ε )

Or equivalently:

plim β̂_OLS = β0 + Q^{-1} γ ≠ β0

where Q = plim (1/N) X⊤X. Since plim (1/N) X⊤ε = γ ≠ 0_K, the OLS estimator is inconsistent.
Remark

The bias and the inconsistency property is not confined to the coefficients on the endogenous variables.

Consider a case where all but the last variable in X are uncorrelated with ε:

plim (1/N) X⊤ ε = γ = (0, 0, .., γ_K)⊤

Then we have:

plim β̂_OLS = β0 + Q^{-1} γ

There is no reason to expect that any of the elements of the last column of Q^{-1} will equal zero: in general, every component of β̂_OLS is inconsistent, not only the coefficient of the endogenous variable.
Example (Endogeneity, OLS estimator and smearing)

Consider the multiple linear regression model:

y_i = 0.4 + 0.5 x_{i1} − 0.8 x_{i2} + ε_i

where (x_{i1}, x_{i2}, ε_i)⊤ ∼ N(0_{3×1}, Σ) with

Σ = [ 1    0.3  0
      0.3  1    0.5
      0    0.5  1   ]
Example (Endogeneity, OLS estimator and smearing (cont'd))

Write a Matlab code to (1) generate S = 1,000 samples {y_i, x_{i1}, x_{i2}}_{i=1}^{N} of size N = 10,000; (2) for each simulated sample s ∈ {1, .., S}, determine the OLS estimators β̂_s = (β̂_{1s}, β̂_{2s}, β̂_{3s})⊤ of the model

y_i = β1 + β2 x_{i1} + β3 x_{i2} + ε_i

(3) compare the true value of the parameters in the population (DGP) to the average OLS estimates obtained over the S simulations.
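The exercise can be sketched as follows, here in Python rather than Matlab (this is not the original code; it assumes the ordering (x_{i1}, x_{i2}, ε_i) for the covariance matrix Σ, as on the previous slide):

```python
import numpy as np

# Monte Carlo illustration of the smearing effect.
# DGP: y = 0.4 + 0.5*x1 - 0.8*x2 + eps, with corr(x1, x2) = 0.3,
# corr(x2, eps) = 0.5 (x2 endogenous), corr(x1, eps) = 0.
rng = np.random.default_rng(0)
S, N = 1_000, 10_000
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])
L = np.linalg.cholesky(Sigma)

beta_hat = np.empty((S, 3))
for s in range(S):
    x1, x2, eps = (rng.standard_normal((N, 3)) @ L.T).T
    y = 0.4 + 0.5 * x1 - 0.8 * x2 + eps
    X = np.column_stack([np.ones(N), x1, x2])
    beta_hat[s] = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_hat.mean(axis=0))   # roughly (0.4, 0.34, -0.25), not (0.4, 0.5, -0.8)
```

Only x2 is correlated with ε, yet the average estimate of β2 is also far from 0.5: the bias "smears" onto the coefficient of the exogenous regressor x1 because x1 and x2 are correlated.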
Key Concepts
1. Endogeneity issue
2. Smearing effect
Section 3
Instrumental Variables (IV) estimator
Objectives
The objectives of this section are the following:

1. Define the Instrumental Variables (IV) estimator
2. Study its asymptotic properties (consistency, asymptotic distribution)
Assumptions on the instruments Z:

plim (1/N) Z⊤Z = Q_ZZ, a finite H × H matrix

Relevance:

plim (1/N) Z⊤X = Q_ZX, a finite H × K matrix

Exogeneity:

plim (1/N) Z⊤ ε = 0_K

The matrix E(Z⊤Z) is non-singular, or equivalently:

rank( E(Z⊤Z) ) = H
The IV estimator is based on the sample moment conditions:

(1/N) Σ_{i=1}^{N} z_i ( y_i − x_i⊤ β ) = 0

where z_i = (z_{i1} .. z_{iH})⊤. For that, we have the following conditions:

(1) If H < K the model is not identified.
(2) If H = K the model is just-identified.
(3) If H > K the model is over-identified.
The exogeneity assumption implies:

plim (1/N) Z⊤ ε = plim (1/N) Z⊤ (y − Xβ) = 0_K

So, we have:

plim (1/N) Z⊤ y = ( plim (1/N) Z⊤X ) β

or equivalently:

β = ( plim (1/N) Z⊤X )^{-1} plim (1/N) Z⊤ y
The sample analogue gives the IV estimator:

β̂_IV = ( Z⊤X )^{-1} Z⊤ y
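The formula is one line of linear algebra. The sketch below (Python; the data-generating design and its coefficients are hypothetical, chosen only so that x is endogenous while z is a valid instrument) contrasts OLS and IV on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
z = rng.standard_normal(N)                    # instrument: drives x, unrelated to u
e = rng.standard_normal(N)                    # common shock creating endogeneity
u = 0.8 * e + 0.6 * rng.standard_normal(N)    # structural error, correlated with x
x = 0.5 * z + e                               # endogenous regressor
y = 1.0 * x + u                               # true beta = 1

X, Z = x[:, None], z[:, None]
b_ols = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^-1 X'y
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)      # (Z'X)^-1 Z'y
print(b_ols, b_iv)                            # OLS around 1.64, IV close to 1
```

OLS converges to β + Cov(x, u)/Var(x) = 1 + 0.8/1.25 = 1.64, while the IV estimator converges to the true β = 1.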
Definition (Consistency)

Under the assumption that plim N^{-1} Z⊤ε = 0, the IV estimator β̂_IV is consistent:

β̂_IV →p β0

where β0 denotes the true value of the parameters.
So, we have:

β̂_IV = β0 + ( (1/N) Z⊤X )^{-1} (1/N) Z⊤ ε

plim β̂_IV = β0 + ( plim (1/N) Z⊤X )^{-1} plim (1/N) Z⊤ ε

By the exogeneity assumption:

plim (1/N) Z⊤ ε = 0_K

So, we have:

plim β̂_IV = β0
The IV estimator is asymptotically normally distributed:

√N ( β̂_IV − β0 ) →d N( 0_K , σ² Q_ZX^{-1} Q_ZZ Q_XZ^{-1} )

where:

Q_ZZ = plim (1/N) Z⊤Z   (K × K)
Q_ZX = plim (1/N) Z⊤X   (K × K)
The asymptotic variance of the IV estimator is:

V_asy( β̂_IV ) = (σ²/N) Q_ZX^{-1} Q_ZZ Q_XZ^{-1}

It can be estimated by:

V̂_asy( β̂_IV ) = σ̂² ( Z⊤X )^{-1} ( Z⊤Z ) ( X⊤Z )^{-1}

Since X⊤Z = ( Z⊤X )⊤ and Q_ZX⊤ = Q_XZ, the estimator can also be written as:

V̂_asy( β̂_IV ) = σ̂² ( Z⊤X )^{-1} ( Z⊤Z ) ( ( Z⊤X )⊤ )^{-1}

where:

σ̂² = ( 1/(N − K) ) Σ_{i=1}^{N} ( y_i − x_i⊤ β̂_IV )²
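These formulas translate directly into code. Continuing the simulated single-regressor design used above (again a hypothetical design, not from the slides):

```python
import numpy as np

# Estimate sigma2_hat = (1/(N-K)) * sum (y_i - x_i' b_IV)^2 and
# V_hat = sigma2_hat * (Z'X)^-1 (Z'Z) (X'Z)^-1, as in the formulas above.
rng = np.random.default_rng(3)
N = 50_000
z = rng.standard_normal(N)
e = rng.standard_normal(N)
u = 0.8 * e + 0.6 * rng.standard_normal(N)
x = 0.5 * z + e                                # endogenous regressor
y = 1.0 * x + u                                # true beta = 1

X, Z = x[:, None], z[:, None]
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
resid = y - X @ b_iv                           # y_i - x_i' b_IV
sigma2_hat = (resid @ resid) / (N - X.shape[1])
V_hat = sigma2_hat * np.linalg.inv(Z.T @ X) @ (Z.T @ Z) @ np.linalg.inv(X.T @ Z)
se = np.sqrt(np.diag(V_hat))
print(b_iv, se)
```

Note that the residuals use the original regressors X, not any projected version of them.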
Key Concepts
1. Weak instrument
Section 4
Two-Stage Least Squares (2SLS) estimator
Introduction
If Z contains more variables than X, then much of the preceding derivation is unusable, because Z⊤X will be H × K with:

rank( Z⊤X ) = K < H

So, the matrix Z⊤X has no inverse and we cannot compute the IV estimator as:

β̂_IV = ( Z⊤X )^{-1} Z⊤ y
Introduction (cont'd)

The crucial assumption in the previous section was the exogeneity assumption:

plim (1/N) Z⊤ ε = 0_{K×1}
The Two-Stage Least Squares (2SLS) estimator is defined by:

β̂_2SLS = ( X⊤Z ( Z⊤Z )^{-1} Z⊤X )^{-1} X⊤Z ( Z⊤Z )^{-1} Z⊤ y
The 2SLS estimator can equivalently be written as:

β̂_2SLS = ( X̂⊤X )^{-1} X̂⊤ y

where X̂ = Z ( Z⊤Z )^{-1} Z⊤X corresponds to the projection of the columns of X in the column space of Z, or equivalently by:

β̂_2SLS = ( X⊤Z ( Z⊤Z )^{-1} Z⊤X )^{-1} X⊤Z ( Z⊤Z )^{-1} Z⊤ y
Since X̂ = Z ( Z⊤Z )^{-1} Z⊤X = P_Z X, where P_Z denotes the orthogonal projection matrix on the columns of Z, we have:

β̂_2SLS = ( X̂⊤X )^{-1} X̂⊤ y = ( X⊤ P_Z X )^{-1} X⊤ P_Z y

and, since P_Z is idempotent:

X̂⊤X = X⊤ P_Z X = X⊤ P_Z P_Z X = X̂⊤X̂

so that:

β̂_2SLS = ( X̂⊤X̂ )^{-1} X̂⊤ y
December 15, 2013
The estimator β̂_2SLS = ( X̂⊤X̂ )^{-1} X̂⊤ y corresponds to the OLS estimator obtained in the regression of y on X̂. Then, the 2SLS can be computed in two steps: first by computing X̂, then by the least squares regression of y on X̂. That is why it is called the two-stage LS estimator.
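The two computations above can be checked against each other. The sketch below (Python; the over-identified design with two instruments and its coefficients are hypothetical) computes 2SLS both ways:

```python
import numpy as np

# 2SLS two ways: (i) two stages, regressing y on X_hat = P_Z X, and
# (ii) one-shot formula (X'Z (Z'Z)^-1 Z'X)^-1 X'Z (Z'Z)^-1 Z'y.
rng = np.random.default_rng(7)
N = 20_000
z1, z2 = rng.standard_normal(N), rng.standard_normal(N)   # two instruments
e = rng.standard_normal(N)
x = 0.4 * z1 + 0.4 * z2 + e            # one endogenous regressor (H = 2 > K = 1)
u = 0.8 * e + 0.6 * rng.standard_normal(N)
y = 1.0 * x + u                        # true beta = 1

X = x[:, None]
Z = np.column_stack([z1, z2])

# First stage: X_hat = Z (Z'Z)^-1 Z'X ; second stage: OLS of y on X_hat.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_two_stage = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Direct formula.
A = X.T @ Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_direct = np.linalg.solve(A, X.T @ Z @ np.linalg.solve(Z.T @ Z, Z.T @ y))

print(b_two_stage, b_direct)           # identical, close to the true beta = 1
```

In practice the one-shot formula (or a dedicated routine) is preferable: running the second stage with a generic OLS command gives the right coefficients but, as noted above for the IV case, the residuals for the variance must be computed with the original X.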
The 2SLS estimator β̂_2SLS then corresponds to the OLS estimator obtained in this model.
Theorem

If any column of X also appears in Z, i.e. if one or more explanatory (exogenous) variable is used as an instrument, then that column of X is reproduced exactly in X̂.
Suppose that the first K − 1 columns of X are exogenous and used as instruments. Then:

X̂ = ( x_1 : .. : x_{K−1} : x̂_K )

where x̂_K denotes the projection of x_K on the columns of Z.
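The theorem is easy to verify numerically: projecting a vector onto a space that already contains it leaves it unchanged. A small sketch (Python; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
N = 1_000
x1 = rng.standard_normal(N)            # exogenous regressor, also used as instrument
xK = rng.standard_normal(N)            # endogenous regressor (not in Z)
z = rng.standard_normal(N)             # outside instrument for xK

X = np.column_stack([x1, xK])
Z = np.column_stack([x1, z])
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # X_hat = P_Z X

print(np.allclose(X_hat[:, 0], x1))    # True: x1 is reproduced exactly
print(np.allclose(X_hat[:, 1], xK))    # False: xK is replaced by its projection
```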
Key Concepts
1. Over-identified model
End of Chapter 6