Chapter 6: Instrumental Variables (IV) estimator
Advanced Econometrics - HEC Lausanne
Christophe Hurlin
University of Orléans
Section 1
Introduction
References
Amemiya, T. (1985), Advanced Econometrics, Harvard University Press.
Greene, W. (2007), Econometric Analysis, sixth edition, Pearson - Prentice Hall (recommended).
Pelgrin, F. (2010), Lecture notes, Advanced Econometrics, HEC Lausanne (a special thank).
Ruud, P. (2000), An Introduction to Classical Econometric Theory, Oxford University Press.
Notations: In this chapter, I will (try to...) follow some conventions of notation.

f_Y(y)  probability density or mass function
F_Y(y)  cumulative distribution function
Pr()    probability
y       vector
Y       matrix
Section 2
Endogeneity
2. Endogeneity
Objectives
The objectives of this section are the following:

1. Define the endogeneity issue
2. Give examples of endogenous regressors (errors-in-variables, simultaneity, omitted variables)
3. Study the consequences of endogeneity on the OLS estimator
The reasons for suspecting E(ε | X) ≠ 0 are varied:

1. Errors-in-variables
2. Simultaneous equation bias
3. Omitted variables
1. Errors-in-variables

Consider the model:

y_i* = x_i*⊤ β + ε_i

where E(ε_i | x_i*) = 0. The variables y_i* and x_i* are not observed; instead we observe them with measurement errors:

y_i = y_i* + v_i
x_i = x_i* + w_i

with

E(v_i) = E(v_i ε_i) = E(v_i y_i*) = E(w_i⊤ x_i*) = 0

In terms of the observables, the model writes:

y_i − v_i = (x_i − w_i)⊤ β + ε_i
⟺ y_i = x_i⊤ β + ε_i + v_i − w_i⊤ β
⟺ y_i = x_i⊤ β + η_i

with η_i = ε_i + v_i − w_i⊤ β.

As a consequence:

E(η_i x_i) = E(ε_i x_i) + E(v_i x_i) − E(x_i w_i⊤) β = −E(w_i w_i⊤) β

since the first two terms vanish by assumption and x_i = x_i* + w_i with E(w_i⊤ x_i*) = 0. Hence:

E(η_i x_i) ≠ 0
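A quick simulation makes this inconsistency visible. The sketch below is in Python (not part of the original slides) with illustrative values: β = 2 and unit-variance measurement errors, so the plim of the OLS slope is β·Var(x*)/(Var(x*) + Var(w)) = 1, not 2.

```python
import numpy as np

# Errors-in-variables: y* = beta*x* + eps, but we observe y = y* + v, x = x* + w.
# OLS of observed y on observed x is inconsistent (attenuation bias towards 0).
rng = np.random.default_rng(42)
N, beta = 200_000, 2.0                 # illustrative values, not from the slides
x_star = rng.standard_normal(N)
eps = rng.standard_normal(N)
v = rng.standard_normal(N)             # measurement error on y
w = rng.standard_normal(N)             # measurement error on x
y = beta * x_star + eps + v            # observed y
x = x_star + w                         # observed x

b_ols = (x @ y) / (x @ x)              # OLS slope (no intercept, zero-mean data)
print(b_ols)                           # close to 1.0, far from the true beta = 2
```

The estimate concentrates around 1, exactly the attenuated probability limit, however large N is.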
2. Simultaneous equation bias

Consider a demand-supply system, with demand q_d = α1 p + α2 y + u_d, supply q_s = β1 p + u_s, and equilibrium condition q_d = q_s. Solving q_d = q_s, the reduced-form equations, which express the endogenous variables in terms of the exogenous variables, write:

p = (α2 y)/(β1 − α1) + (u_d − u_s)/(β1 − α1) = π1 y + w1

q = (β1 α2 y)/(β1 − α1) + (β1 u_d − α1 u_s)/(β1 − α1) = π2 y + w2

Therefore:

E(u_d p) = σ²_{u_d}/(β1 − α1) ≠ 0

The price p is correlated with the demand shock u_d: in the demand equation, p is an endogenous regressor.
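This correlation can be checked numerically. The sketch below (Python; the coefficient values α1 = −1, α2 = 0.5, β1 = 1 and the unit error variances are hypothetical choices, not values from the slides) simulates the system via its reduced forms and then estimates the demand equation by OLS:

```python
import numpy as np

# Demand: q_d = a1*p + a2*y + u_d ; Supply: q_s = b1*p + u_s ; equilibrium q_d = q_s.
# Reduced forms: p = (a2*y + u_d - u_s)/(b1 - a1), q = b1*p + u_s.
rng = np.random.default_rng(0)
N = 100_000
a1, a2, b1 = -1.0, 0.5, 1.0            # hypothetical structural coefficients
y_inc = rng.standard_normal(N)         # exogenous income
u_d = rng.standard_normal(N)
u_s = rng.standard_normal(N)

p = (a2 * y_inc + u_d - u_s) / (b1 - a1)
q = b1 * p + u_s

print(np.mean(u_d * p))                # close to Var(u_d)/(b1 - a1) = 0.5, not 0

# OLS on the demand equation q = a1*p + a2*y + u_d is therefore inconsistent:
X = np.column_stack([p, y_inc])
a1_hat = np.linalg.lstsq(X, q, rcond=None)[0][0]
print(a1_hat)                          # far from the true a1 = -1
```

With these parameter values the OLS estimate of the demand slope converges to 0 rather than −1: simultaneity can destroy even the sign of the estimated coefficient.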
3. Omitted variables
Consider the true model:
y_i = β1 + β2 x_{1i} + β3 x_{2i} + ε_i

with E(ε_i) = E(ε_i x_{1i}) = E(ε_i x_{2i}) = 0.

If we regress y on a constant and x_1 (omitted variable x_2):

y_i = β1 + β2 x_{1i} + η_i     with η_i = β3 x_{2i} + ε_i

If Cov(x_{1i}, x_{2i}) ≠ 0, then:

E(η_i x_{1i}) ≠ 0
Question
What is the consequence of the endogeneity assumption on the OLS
estimator?
Consider the (population) multiple linear regression model:

y = Xβ + ε

where (cf. chapter 3): y is a N × 1 vector and β = (β1 .. βK)⊤ is a K × 1 vector of parameters.
The OLS estimator is defined by:

β̂_OLS = (X⊤X)^{-1} X⊤ y

If we assume that E(ε | X) ≠ 0, then we have:

E(β̂_OLS | X) = β0 + (X⊤X)^{-1} X⊤ E(ε | X)

Since X⊤ E(ε | X) ≠ 0:

E(β̂_OLS) = E_X( E(β̂_OLS | X) ) ≠ β0

The OLS estimator is biased.
Assume that:

plim (1/N) X⊤ ε = γ ≠ 0_K

where γ = E(x_i ε_i) ≠ 0_K.
Given the definition of the OLS estimator:

β̂_OLS = β0 + (X⊤X)^{-1} X⊤ ε

We have:

plim β̂_OLS = β0 + ( plim (1/N) X⊤X )^{-1} ( plim (1/N) X⊤ ε )

Or equivalently:

plim β̂_OLS = β0 + Q^{-1} γ ≠ β0

where Q = plim (1/N) X⊤X. Since plim (1/N) X⊤ε = γ ≠ 0_K, the OLS estimator is inconsistent.
Remark

The bias and the inconsistency property is not confined to the coefficients on the endogenous variables.

Consider a case where all but the last variable in X are uncorrelated with ε:

plim (1/N) X⊤ ε = γ = (0, 0, .., γ_K)⊤

Then we have:

plim β̂_OLS = β0 + Q^{-1} γ

There is no reason to expect that any of the elements of the last column of Q^{-1} will equal zero: in general, every component of β̂_OLS is inconsistent, not only the coefficient of the endogenous variable.
Example (Endogeneity, OLS estimator and smearing)

Consider the multiple linear regression model:

y_i = 0.4 + 0.5 x_{i1} − 0.8 x_{i2} + ε_i

where (x_{i1}, x_{i2}, ε_i)⊤ ∼ N(0_{3×1}, Σ) with

Σ = [ 1    0.3  0
      0.3  1    0.5
      0    0.5  1   ]
Example (Endogeneity, OLS estimator and smearing (cont'd))

Write a Matlab code to (1) generate S = 1,000 samples {y_i, x_{i1}, x_{i2}}_{i=1}^{N} of size N = 10,000; (2) for each simulated sample s ∈ {1, .., S}, determine the OLS estimators β̂_s = (β̂_{1s}, β̂_{2s}, β̂_{3s})⊤ of the model

y_i = β1 + β2 x_{i1} + β3 x_{i2} + ε_i

(3) compare the true value of the parameters in the population (DGP) to the average OLS estimates obtained over the S simulations.
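The exercise can be sketched as follows, here in Python rather than Matlab (this is not the original code; it assumes the ordering (x_{i1}, x_{i2}, ε_i) for the covariance matrix Σ, as on the previous slide):

```python
import numpy as np

# Monte Carlo illustration of the smearing effect.
# DGP: y = 0.4 + 0.5*x1 - 0.8*x2 + eps, with corr(x1, x2) = 0.3,
# corr(x2, eps) = 0.5 (x2 endogenous), corr(x1, eps) = 0.
rng = np.random.default_rng(0)
S, N = 1_000, 10_000
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])
L = np.linalg.cholesky(Sigma)

beta_hat = np.empty((S, 3))
for s in range(S):
    x1, x2, eps = (rng.standard_normal((N, 3)) @ L.T).T
    y = 0.4 + 0.5 * x1 - 0.8 * x2 + eps
    X = np.column_stack([np.ones(N), x1, x2])
    beta_hat[s] = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_hat.mean(axis=0))   # roughly (0.4, 0.34, -0.25), not (0.4, 0.5, -0.8)
```

Only x2 is correlated with ε, yet the average estimate of β2 is also far from 0.5: the bias "smears" onto the coefficient of the exogenous regressor x1 because x1 and x2 are correlated.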
Key Concepts
1. Endogeneity issue
2. Smearing effect
Section 3
Instrumental Variables (IV) estimator
Objectives
The objectives of this section are the following:

1. Define the Instrumental Variables (IV) estimator
2. Study its asymptotic properties (consistency, asymptotic distribution)
Assumptions on the instruments Z:

plim (1/N) Z⊤Z = Q_ZZ, a finite H × H matrix

Relevance:

plim (1/N) Z⊤X = Q_ZX, a finite H × K matrix

Exogeneity:

plim (1/N) Z⊤ ε = 0_K

The matrix E(Z⊤Z) is non-singular, or equivalently:

rank( E(Z⊤Z) ) = H
The IV estimator is based on the sample moment conditions:

(1/N) Σ_{i=1}^{N} z_i ( y_i − x_i⊤ β ) = 0

where z_i = (z_{i1} .. z_{iH})⊤. For that, we have the following conditions:

(1) If H < K the model is not identified.
(2) If H = K the model is just-identified.
(3) If H > K the model is over-identified.
The exogeneity assumption implies:

plim (1/N) Z⊤ ε = plim (1/N) Z⊤ (y − Xβ) = 0_K

So, we have:

plim (1/N) Z⊤ y = ( plim (1/N) Z⊤X ) β

or equivalently:

β = ( plim (1/N) Z⊤X )^{-1} plim (1/N) Z⊤ y
The sample analogue gives the IV estimator:

β̂_IV = ( Z⊤X )^{-1} Z⊤ y
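The formula is one line of linear algebra. The sketch below (Python; the data-generating design and its coefficients are hypothetical, chosen only so that x is endogenous while z is a valid instrument) contrasts OLS and IV on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
z = rng.standard_normal(N)                    # instrument: drives x, unrelated to u
e = rng.standard_normal(N)                    # common shock creating endogeneity
u = 0.8 * e + 0.6 * rng.standard_normal(N)    # structural error, correlated with x
x = 0.5 * z + e                               # endogenous regressor
y = 1.0 * x + u                               # true beta = 1

X, Z = x[:, None], z[:, None]
b_ols = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^-1 X'y
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)      # (Z'X)^-1 Z'y
print(b_ols, b_iv)                            # OLS around 1.64, IV close to 1
```

OLS converges to β + Cov(x, u)/Var(x) = 1 + 0.8/1.25 = 1.64, while the IV estimator converges to the true β = 1.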
Definition (Consistency)

Under the assumption that plim N^{-1} Z⊤ε = 0, the IV estimator β̂_IV is consistent:

β̂_IV →p β0

where β0 denotes the true value of the parameters.
So, we have:

β̂_IV = β0 + ( (1/N) Z⊤X )^{-1} (1/N) Z⊤ ε

plim β̂_IV = β0 + ( plim (1/N) Z⊤X )^{-1} plim (1/N) Z⊤ ε

By the exogeneity assumption:

plim (1/N) Z⊤ ε = 0_K

So, we have:

plim β̂_IV = β0
The IV estimator is asymptotically normally distributed:

√N ( β̂_IV − β0 ) →d N( 0_K , σ² Q_ZX^{-1} Q_ZZ Q_XZ^{-1} )

where:

Q_ZZ = plim (1/N) Z⊤Z   (K × K)
Q_ZX = plim (1/N) Z⊤X   (K × K)
The asymptotic variance of the IV estimator is:

V_asy( β̂_IV ) = (σ²/N) Q_ZX^{-1} Q_ZZ Q_XZ^{-1}

It can be estimated by:

V̂_asy( β̂_IV ) = σ̂² ( Z⊤X )^{-1} ( Z⊤Z ) ( X⊤Z )^{-1}

Since X⊤Z = ( Z⊤X )⊤ and Q_ZX⊤ = Q_XZ, the estimator can also be written as:

V̂_asy( β̂_IV ) = σ̂² ( Z⊤X )^{-1} ( Z⊤Z ) ( ( Z⊤X )⊤ )^{-1}

where:

σ̂² = ( 1/(N − K) ) Σ_{i=1}^{N} ( y_i − x_i⊤ β̂_IV )²
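These formulas translate directly into code. Continuing the simulated single-regressor design used above (again a hypothetical design, not from the slides):

```python
import numpy as np

# Estimate sigma2_hat = (1/(N-K)) * sum (y_i - x_i' b_IV)^2 and
# V_hat = sigma2_hat * (Z'X)^-1 (Z'Z) (X'Z)^-1, as in the formulas above.
rng = np.random.default_rng(3)
N = 50_000
z = rng.standard_normal(N)
e = rng.standard_normal(N)
u = 0.8 * e + 0.6 * rng.standard_normal(N)
x = 0.5 * z + e                                # endogenous regressor
y = 1.0 * x + u                                # true beta = 1

X, Z = x[:, None], z[:, None]
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
resid = y - X @ b_iv                           # y_i - x_i' b_IV
sigma2_hat = (resid @ resid) / (N - X.shape[1])
V_hat = sigma2_hat * np.linalg.inv(Z.T @ X) @ (Z.T @ Z) @ np.linalg.inv(X.T @ Z)
se = np.sqrt(np.diag(V_hat))
print(b_iv, se)
```

Note that the residuals use the original regressors X, not any projected version of them.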
Key Concepts
1. Weak instrument
Section 4
Two-Stage Least Squares (2SLS) estimator
Introduction
If Z contains more variables than X, then much of the preceding derivation is unusable, because Z⊤X will be H × K with:

rank( Z⊤X ) = K < H

So, the matrix Z⊤X has no inverse and we cannot compute the IV estimator as:

β̂_IV = ( Z⊤X )^{-1} Z⊤ y
Introduction (cont'd)

The crucial assumption in the previous section was the exogeneity assumption:

plim (1/N) Z⊤ ε = 0_{K×1}
The Two-Stage Least Squares (2SLS) estimator is defined by:

β̂_2SLS = ( X⊤Z ( Z⊤Z )^{-1} Z⊤X )^{-1} X⊤Z ( Z⊤Z )^{-1} Z⊤ y
The 2SLS estimator can equivalently be written as:

β̂_2SLS = ( X̂⊤X )^{-1} X̂⊤ y

where X̂ = Z ( Z⊤Z )^{-1} Z⊤X corresponds to the projection of the columns of X in the column space of Z, or equivalently by:

β̂_2SLS = ( X⊤Z ( Z⊤Z )^{-1} Z⊤X )^{-1} X⊤Z ( Z⊤Z )^{-1} Z⊤ y
Since X̂ = Z ( Z⊤Z )^{-1} Z⊤X = P_Z X, where P_Z denotes the orthogonal projection matrix on the columns of Z, we have:

β̂_2SLS = ( X̂⊤X )^{-1} X̂⊤ y = ( X⊤ P_Z X )^{-1} X⊤ P_Z y

and, since P_Z is idempotent:

X̂⊤X = X⊤ P_Z X = X⊤ P_Z P_Z X = X̂⊤X̂

so that:

β̂_2SLS = ( X̂⊤X̂ )^{-1} X̂⊤ y
December 15, 2013
The estimator β̂_2SLS = ( X̂⊤X̂ )^{-1} X̂⊤ y corresponds to the OLS estimator obtained in the regression of y on X̂. Then, the 2SLS can be computed in two steps: first by computing X̂, then by the least squares regression of y on X̂. That is why it is called the two-stage LS estimator.
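The two computations above can be checked against each other. The sketch below (Python; the over-identified design with two instruments and its coefficients are hypothetical) computes 2SLS both ways:

```python
import numpy as np

# 2SLS two ways: (i) two stages, regressing y on X_hat = P_Z X, and
# (ii) one-shot formula (X'Z (Z'Z)^-1 Z'X)^-1 X'Z (Z'Z)^-1 Z'y.
rng = np.random.default_rng(7)
N = 20_000
z1, z2 = rng.standard_normal(N), rng.standard_normal(N)   # two instruments
e = rng.standard_normal(N)
x = 0.4 * z1 + 0.4 * z2 + e            # one endogenous regressor (H = 2 > K = 1)
u = 0.8 * e + 0.6 * rng.standard_normal(N)
y = 1.0 * x + u                        # true beta = 1

X = x[:, None]
Z = np.column_stack([z1, z2])

# First stage: X_hat = Z (Z'Z)^-1 Z'X ; second stage: OLS of y on X_hat.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_two_stage = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Direct formula.
A = X.T @ Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_direct = np.linalg.solve(A, X.T @ Z @ np.linalg.solve(Z.T @ Z, Z.T @ y))

print(b_two_stage, b_direct)           # identical, close to the true beta = 1
```

In practice the one-shot formula (or a dedicated routine) is preferable: running the second stage with a generic OLS command gives the right coefficients but, as noted above for the IV case, the residuals for the variance must be computed with the original X.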
The 2SLS estimator β̂_2SLS then corresponds to the OLS estimator obtained in this model.
Theorem

If any column of X also appears in Z, i.e. if one or more explanatory (exogenous) variable is used as an instrument, then that column of X is reproduced exactly in X̂.
Suppose that the first K − 1 columns of X are exogenous and used as instruments. Then:

X̂ = ( x_1 : .. : x_{K−1} : x̂_K )

where x̂_K denotes the projection of x_K on the columns of Z.
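The theorem is easy to verify numerically: projecting a vector onto a space that already contains it leaves it unchanged. A small sketch (Python; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
N = 1_000
x1 = rng.standard_normal(N)            # exogenous regressor, also used as instrument
xK = rng.standard_normal(N)            # endogenous regressor (not in Z)
z = rng.standard_normal(N)             # outside instrument for xK

X = np.column_stack([x1, xK])
Z = np.column_stack([x1, z])
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # X_hat = P_Z X

print(np.allclose(X_hat[:, 0], x1))    # True: x1 is reproduced exactly
print(np.allclose(X_hat[:, 1], xK))    # False: xK is replaced by its projection
```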
Key Concepts
1. Over-identified model
End of Chapter 6