Marsh

1.0  The Role of Econometrics in Economic Analysis

Using Information:
1. Information from economic theory.
2. Information from economic data.
Copyright © 1997 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond
that permitted in Section 117 of the 1976 United States Copyright Act without the express written permission of the
copyright owner is unlawful. Request for further information should be addressed to the Permissions Department,
John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution
or resale. The Publisher assumes no responsibility for errors, omissions, or damages, caused by the use of these
programs or from the use of the information contained herein.
Examples: power of labor unions, capital gains tax, rent control, crime rate, laws.

Econometrics helps us combine economic theory and economic data.
Copyright 1996 Lawrence C. Marsh
How much? Listing the variables in an economic relationship is not enough; we are not describing a single individual or a single firm.

Answering the "How much?" question: consumption, c, is a function, f, of income, i, with error, e:
c = f(i) + e
Statistical Models: controlled (experimental) vs. uncontrolled (observational).

Econometric model: an economic model together with its economic variables and parameters.
Controlled experiment: values of explanatory variables are chosen with great care in accordance with an appropriate experimental design.

Discrete Random Variable: a discrete random variable can take only a finite number of values, which can be counted using the positive integers.
Probability, f(x), for a discrete random variable, X, can be represented by height:
[Figure: bar chart of f(x) for X = 0, 1, 2, 3, with heights 0.1 to 0.4, where X is the number on the Dean's List among three roommates.]

A continuous random variable uses area under a curve rather than the height, f(x), to represent probability:
[Figure: density over income, split at $34,000 and $55,000; red area 0.1324, green area 0.8676.]
Rule 2:  Σ (i=1 to n) a·xi  =  a Σ (i=1 to n) xi

Rule 3:  Σ (i=1 to n) (xi + yi)  =  Σ (i=1 to n) xi  +  Σ (i=1 to n) yi

Rule 5:  x̄  =  (1/n) Σ (i=1 to n) xi  =  (x1 + x2 + . . . + xn)/n

The definition of x̄ as given in Rule 5 implies the following important fact:
Σ (i=1 to n) (xi − x̄) = 0
Rules of Summation (continued)

Rule 6:  Σ (i=1 to n) f(xi)  =  f(x1) + f(x2) + . . . + f(xn)

The Mean of a Random Variable

Empirical mean:  x̄ = (1/n) Σ (i=1 to n) xi,  where n is the number of sample observations.

Analytical mean:  E[X] = Σ (i=1 to n) xi f(xi),  where n is the number of possible values of xi.
Notice how the meaning of n changes.

In the empirical case, as the sample size goes to infinity, the values of X occur with a frequency equal to the corresponding f(x) in the analytical expression, so the two means will produce the same value.
The expected value of X:

EX = Σ (i=1 to n) xi f(xi)
EX = 0(.1) + 1(.3) + 2(.3) + 3(.2) + 4(.1) = 1.9

The expected value of X-squared:

EX² = Σ (i=1 to n) xi² f(xi)
EX² = 0²(.1) + 1²(.3) + 2²(.3) + 3²(.2) + 4²(.1)
    = 0 + .3 + 1.2 + 1.8 + 1.6 = 4.9

The expected value of X-cubed:

EX³ = Σ (i=1 to n) xi³ f(xi)
EX³ = 0³(.1) + 1³(.3) + 2³(.3) + 3³(.2) + 4³(.1)
    = 0 + .3 + 2.4 + 5.4 + 6.4 = 14.5

It is important to notice that f(xi) does not change!
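The slide's three expectations can be checked directly. A minimal sketch using the slide's distribution (X = 0..4 with probabilities .1, .3, .3, .2, .1):

```python
# Moments of the discrete distribution used on this slide.
xs = [0, 1, 2, 3, 4]
fx = [0.1, 0.3, 0.3, 0.2, 0.1]

EX  = sum(x * p for x, p in zip(xs, fx))        # E[X]   = 1.9
EX2 = sum(x**2 * p for x, p in zip(xs, fx))     # E[X²]  = 4.9
EX3 = sum(x**3 * p for x, p in zip(xs, fx))     # E[X³]  = 14.5

print(EX, EX2, EX3)
```

Note that f(xi) appears unchanged in all three sums; only the function of xi changes.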
For g(X) = g1(X) + g2(X):

E[g(X)] = Σ (i=1 to n) [g1(xi) + g2(xi)] f(xi)
        = Σ (i=1 to n) g1(xi) f(xi)  +  Σ (i=1 to n) g2(xi) f(xi)

Similarly,  E(X + Y) = E(X) + E(Y).
Variance of a discrete random variable, X:

var(X) = Σ (i=1 to n) (xi − EX)² f(xi)

var(X) = E[(X − EX)²]
       = E[X² − 2X·EX + (EX)²]
       = E(X²) − 2·EX·EX + (EX)²
       = E(X²) − 2(EX)² + (EX)²
       = E(X²) − (EX)²

var(X) = E(X²) − (EX)²

The standard deviation is the square root of the variance.
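Both the definition and the shortcut var(X) = E(X²) − (EX)² give the same number. A quick check with the same distribution as above:

```python
# var(X) two ways for the slide's distribution (X = 0..4).
xs = [0, 1, 2, 3, 4]
fx = [0.1, 0.3, 0.3, 0.2, 0.1]

EX  = sum(x * p for x, p in zip(xs, fx))                 # 1.9
EX2 = sum(x * x * p for x, p in zip(xs, fx))             # 4.9

var_shortcut   = EX2 - EX**2                             # 4.9 − 3.61 = 1.29
var_definition = sum((x - EX)**2 * p for x, p in zip(xs, fx))
sd = var_shortcut ** 0.5                                 # standard deviation

print(var_shortcut, var_definition)
```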
Joint pdf

A joint probability density function, f(x,y), provides the probabilities associated with the joint occurrence of all of the possible pairs of X and Y.

Example: X = vacation homes owned, Y = college grads in household.

            Y = 1    Y = 2
X = 0        .45      .15        f(0,1) = .45,  f(0,2) = .15
X = 1        .05      .35        f(1,1) = .05,  f(1,2) = .35
Marginal pdf

The marginal probability density functions, f(x) and f(y), for discrete random variables can be obtained by summing over f(x,y): sum with respect to the values of Y to obtain f(x), and with respect to the values of X to obtain f(y).

Calculating the expected value of functions of two random variables:

E[g(X,Y)] = Σi Σj g(xi,yj) f(xi,yj)

E(XY) = Σi Σj xi yj f(xi,yj)
Covariance

The covariance between two random variables, X and Y, measures the linear association between them:

cov(X,Y) = E[(X − EX)(Y − EY)]

Note that variance is a special case of covariance:
cov(X,X) = var(X) = E[(X − EX)²]

Not independent: X and Y are independent only if f(x,y) = f(x)·f(y) for every cell.

            Y = 1    Y = 2    marginal pdf for X
X = 0        .45      .15        .60 = f(X = 0)
X = 1        .05      .35        .40 = f(X = 1)
marginal     .50      .50
pdf for Y:  f(Y=1)   f(Y=2)

The calculations .60×.50 = .30, .60×.50 = .30, .40×.50 = .20, and .40×.50 = .20 show the numbers required to have independence; they do not match the table, so X and Y are not independent.

The correlation is the covariance divided by the square roots of the variances (note var(X) = .40 − .40² = .24 here).
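The marginals, E(XY), covariance, correlation, and the independence check for the slide's table can all be computed in a few lines. A minimal sketch:

```python
# Joint pdf from the slide: X = vacation homes, Y = college grads.
f = {(0, 1): 0.45, (0, 2): 0.15, (1, 1): 0.05, (1, 2): 0.35}

fX = {x: sum(p for (xi, yi), p in f.items() if xi == x) for x in (0, 1)}
fY = {y: sum(p for (xi, yi), p in f.items() if yi == y) for y in (1, 2)}

EX  = sum(x * p for x, p in fX.items())             # 0.40
EY  = sum(y * p for y, p in fY.items())             # 1.50
EXY = sum(x * y * p for (x, y), p in f.items())     # 0.75
cov = EXY - EX * EY                                 # 0.15

varX = sum(x * x * p for x, p in fX.items()) - EX**2    # 0.24
varY = sum(y * y * p for y, p in fY.items()) - EY**2    # 0.25
corr = cov / (varX * varY) ** 0.5

# Independence would need f(x,y) = f(x) f(y) in every cell:
independent = all(abs(f[(x, y)] - fX[x] * fY[y]) < 1e-12 for (x, y) in f)
print(fX, fY, cov, corr, independent)
```

The cell (0,1) holds .45 but independence would require .30, so `independent` is False.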
Zero Covariance & Correlation

Independent random variables have zero covariance and, therefore, zero correlation.

Since expectation is a linear operator, it can be applied term by term. The expected value of a weighted sum of random variables is the sum of the expectations of the individual terms:

E[c1X + c2Y] = c1 EX + c2 EY
Standardizing:  Z = (Y − β)/σ,   Z ~ N(0,1)

P[Y > a] = P[ (Y − β)/σ > (a − β)/σ ] = P[ Z > (a − β)/σ ]

Standard normal density:
f(z) = (1/√(2π)) exp(−z²/2)
Y ~ N(β, σ²)
[Figure: density f(y) centered at β; the area between a and b gives P[a < Y < b].]

Linear combinations of jointly normally distributed random variables are themselves normally distributed:

Y1 ~ N(β1, σ1²),  Y2 ~ N(β2, σ2²),  . . . ,  Yn ~ N(βn, σn²)
Purpose of Regression Analysis

1. Estimate a relationship among economic variables, such as y = f(x).
2. Forecast or predict the value of one variable, y, based on the value of another variable, x.

Weekly Food Expenditures

y = dollars spent each week on food items.
x = consumer's weekly income.

The relationship between x and the expected value of y, given x, might be linear:

E(y|x) = β1 + β2 x
[Figure 3.1a: Probability distribution f(y|x=480) of food expenditures given income x = $480.]
[Figure 3.1b: Probability distributions of food expenditures given incomes x = $480 and x = $800.]
[Figure 3.2: The economic model, a linear relationship between average expenditure on food and income: E(y|x) = β1 + β2x, with intercept β1 and slope β2 = ∆E(y|x)/∆x.]
[Figure 3.3: The probability density function for yt at two levels of household income, xt (x1 = 480, x2 = 800).]
Heteroskedastic Case
[Figure: The variance of yt increases as household income, xt, increases; the densities f(yt) at incomes x1, x2, x3 grow wider.]

Assumptions of the Simple Linear Regression Model - I

1. The average value of y, given x, is given by the linear regression:
   E(y) = β1 + β2x
2. For each value of x, the values of y are distributed around their mean with variance:
   var(y) = σ²
3. The values of y are uncorrelated, having zero covariance and thus no linear relationship:
   cov(yi, yj) = 0
4. The variable x must take at least two different values, so that x ≠ c, where c is a constant.
[Figure 3.4: Probability density functions for e and y; f(e) is centered at 0 and f(y) at β1 + β2x.]
[Figure 3.7b: The fitted line ŷ = b1 + b2x with residuals ê*t measured from an arbitrary line y* = b*1 + b*2x; the sum of squared residuals from any other line will be larger.]
Minimize with respect to β1 and β2:

S(β1,β2) = Σ (t=1 to T) (yt − β1 − β2xt)²      (3.3.4)

∂S/∂β1 = −2 Σ (yt − β1 − β2xt)
∂S/∂β2 = −2 Σ xt (yt − β1 − β2xt)

[Figure: S(.) plotted against βi; the slope ∂S/∂βi is negative to the left of the minimum, zero at the minimum (βi = bi), and positive to the right.]

Set each of these two derivatives equal to zero and solve these two equations for the two unknowns:

−2 Σ (yt − b1 − b2xt) = 0
−2 Σ xt (yt − b1 − b2xt) = 0

When these two terms are set to zero, β1 and β2 become b1 and b2 because they no longer represent just any value of β1 and β2 but the special values that correspond to the minimum of S(.).

Σ yt − T b1 − b2 Σ xt = 0
Σ xt yt − b1 Σ xt − b2 Σ xt² = 0

The normal equations:

T b1 + b2 Σ xt = Σ yt
b1 Σ xt + b2 Σ xt² = Σ xt yt

Solving:

b2 = [T Σ xt yt − Σ xt Σ yt] / [T Σ xt² − (Σ xt)²]

Elasticity:  η = lim (∆x→0) (∆y/∆x)·(x/y) = (∂y/∂x)·(x/y)

b1 = ȳ − b2 x̄
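The normal equations above can be solved in a few lines. A minimal pure-Python sketch (the data are made up for illustration; y lies exactly on y = 1 + 2x, so the solution is exact):

```python
# Least squares slope and intercept from the normal equations,
# checked against the deviation-form b2 and b1 = ȳ − b2 x̄.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]          # exactly y = 1 + 2x
T = len(x)

Sx, Sy = sum(x), sum(y)
Sxx = sum(xi * xi for xi in x)
Sxy = sum(xi * yi for xi, yi in zip(x, y))

# Normal equations: T b1 + b2 Σx = Σy ;  b1 Σx + b2 Σx² = Σxy
b2 = (T * Sxy - Sx * Sy) / (T * Sxx - Sx ** 2)
b1 = (Sy - b2 * Sx) / T

xbar, ybar = Sx / T, Sy / T
num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
den = sum((xi - xbar) ** 2 for xi in x)
b2_dev = num / den                       # same slope, deviation form

print(b1, b2)    # 1.0, 2.0
```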
Applying elasticities

E(y) = β1 + β2x
∂E(y)/∂x = β2
η = [∂E(y)/∂x] · [x/E(y)] = β2 · x/E(y)

Elasticity of y with respect to x:
η = β2 · (x/y) = (x/y)·(∂y/∂x)

Estimating elasticities

η̂ = (∂y/∂x)·(x/y) = b2 · (x/y)

ŷt = b1 + b2xt = 4 + 1.5 xt
x̄ = 8 = average number of years of experience
ȳ = $10 = average wage rate

η̂ = b2 · (x̄/ȳ) = 1.5 × (8/10) = 1.2

Properties of Least Squares Estimators
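The elasticity-at-the-means arithmetic on this slide is a one-liner:

```python
# Elasticity evaluated at the sample means for ŷt = 4 + 1.5 xt.
b2   = 1.5
xbar = 8.0     # average years of experience
ybar = 10.0    # average wage rate

eta_hat = b2 * xbar / ybar
print(eta_hat)   # 1.2
```

At the means, a 1% increase in experience is associated with about a 1.2% increase in the wage rate.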
Simple Linear Regression Model

yt = β1 + β2xt + εt

yt = household weekly food expenditures
xt = household weekly income

For a given level of xt, the expected level of food expenditures will be:
E(yt) = β1 + β2xt

Assumptions of the Simple Linear Regression Model

1. yt = β1 + β2xt + εt
2. E(εt) = 0  <=>  E(yt) = β1 + β2xt
3. var(εt) = σ² = var(yt)
4. cov(εi, εj) = cov(yi, yj) = 0
5. xt ≠ c for every observation
The Expected Values of b1 and b2

The least squares formula (estimator) in the simple regression case:

b2 = [T Σ xtyt − Σxt Σyt] / [T Σxt² − (Σxt)²]

Substitute in yt = β1 + β2xt + εt to get:

b2 = β2 + [T Σ xtεt − Σxt Σεt] / [T Σxt² − (Σxt)²]
An Unbiased Estimator

b2 = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)²      (4.2.6)

Expand and multiply top and bottom by T to reach the formula above, from which E(b2) = β2.

In a similar manner, the estimator b1 of the intercept or constant term can be shown to be an unbiased estimator of β1 when the model is correctly specified.

Wrong Model Specification
Covariance of b1 and b2

cov(b1, b2) = σ² (−x̄) / Σ(xt − x̄)²

If x̄ = 0, the slope can change without affecting the variance.

What factors determine the variances and covariance?

1. σ²: greater uncertainty about the yt values means greater uncertainty about b1, b2 and their relationship.
2. The more spread out the xt values are, the more confidence we have in b1, b2, etc.
3. The larger the sample size, T, the smaller the variances and covariances.
4. The variance of b1 is large when the (squared) xt values are far from zero (in either direction).
5. Changing the slope, b2, has no effect on the intercept, b1, when the sample mean is zero. But if the sample mean is positive, the covariance between b1 and b2 will be negative, and vice versa.
Under the first five assumptions of the simple, linear regression model, the ordinary least squares estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of β1 and β2. This means that b1 and b2 are the Best Linear Unbiased Estimators (BLUE) of β1 and β2.

1. b1 and b2 are "best" within the class of linear and unbiased estimators.
2. "Best" means smallest variance within the class of linear/unbiased estimators.
3. All of the first five assumptions must hold to satisfy Gauss-Markov.
4. Gauss-Markov does not require assumption six: normality.
5. Gauss-Markov is not based on the least squares principle but on the properties of b1 and b2.
The least squares estimator of β2 can be expressed as a linear combination of the yt's, so if yt and εt are normally distributed, b1 and b2 are normally distributed; otherwise they are approximately normally distributed under the Central Limit Theorem.

Since the formulas for the variances of the least squares estimators b1 and b2 show that their variances do, in fact, go to zero as T grows, b1 and b2 are consistent estimators of β1 and β2.

σ̂² = Σ êt² / (T − 2)  is an unbiased estimator of σ².

Prediction:  ŷo = b1 + b2 xo      (4.7.2)
Assumptions of the Simple Linear Regression Model (as stated earlier)

Probability Distribution of Least Squares Estimators

b1 ~ N( β1 ,  σ² Σxt² / [T Σ(xt − x̄)²] )

σ̂² = Σ êt² / (T − 2)

(T − 2) σ̂² / σ²  ~  χ²(T−2)

Our decision is incorrect if the null hypothesis is false and we decide not to reject it. This is a type II error.
Create a Chi-Square

Sum of Chi-Squares
var(b2) = σ² / Σ(xi − x̄)²

Start from the standardized slope and divide by the root of the chi-square term over its degrees of freedom, noticing the cancellations:

t = [ (b2 − β2) / sqrt( σ² / Σ(xi − x̄)² ) ]  /  sqrt( [(T−2) σ̂² / σ²] / (T−2) )

  = (b2 − β2) / sqrt( σ̂² / Σ(xi − x̄)² )

t = (b2 − β2) / se(b2)  ~  t(T−2)

t has a Student-t distribution with T − 2 degrees of freedom.

[Figure: density f(t) with central area (1−α) between −tc and tc, and α/2 in each tail; the red area is the rejection region for a two-sided test.]
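The cancellation above means the t-ratio can be computed directly from σ̂² and Σ(xi − x̄)². A minimal sketch on made-up data:

```python
# se(b2) and the t-ratio for H0: β2 = 0, df = T − 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0, 7.0]
T = len(x)

xbar, ybar = sum(x) / T, sum(y) / T
Sxx = sum((xi - xbar) ** 2 for xi in x)
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Sxx
b1 = ybar - b2 * xbar

e = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]     # residuals
sig2_hat = sum(ei ** 2 for ei in e) / (T - 2)       # σ̂² = Σê²/(T−2)
se_b2 = (sig2_hat / Sxx) ** 0.5

t = (b2 - 0.0) / se_b2
print(b2, se_b2, t)
```

Compare |t| with the critical value tc from the t(T−2) table to decide the test.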
Student-t vs. normal:
2. The Student-t distribution has fatter tails than the normal.
3. The Student-t converges to the normal for an infinite sample.
4. The Student-t is conditional on the degrees of freedom (df).
5. The normal is a good approximation of the Student-t for the first few decimal places when df > 30 or so.

Components of a hypothesis test:
1. A null hypothesis, H0.
2. An alternative hypothesis, H1.
3. A test statistic.
4. A rejection region.
Prediction Intervals

A (1−α)×100% prediction interval for yo is:

ŷo ± tc se(f),   where  f = ŷo − yo  and  se(f) = sqrt( var̂(f) )

var̂(f) = σ̂² [ 1 + 1/T + (xo − x̄)² / Σ (t=1 to T) (xt − x̄)² ]

Chapter 6: The Simple Linear Regression Model
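The var̂(f) formula shows why prediction intervals widen as xo moves away from x̄. A small sketch with hypothetical values for σ̂², T, Σ(xt − x̄)², and x̄:

```python
# Prediction-error variance var̂(f) = σ̂²[1 + 1/T + (xo − x̄)²/Σ(xt − x̄)²].
sig2_hat = 0.5    # hypothetical σ̂²
T = 10            # hypothetical sample size
Sxx = 20.0        # hypothetical Σ(xt − x̄)²
xbar = 4.0        # hypothetical sample mean of x

def var_f(xo):
    return sig2_hat * (1 + 1 / T + (xo - xbar) ** 2 / Sxx)

print(var_f(4.0), var_f(10.0))   # narrowest at xo = x̄, wider far away
```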
Total variation:

SST = Σ (t=1 to T) (yt − ȳ)²  =  Σ (t=1 to T) (ŷt − ȳ)²  +  Σ (t=1 to T) êt²

where ŷt = b1 + b2xt. The cross-product term drops out, since Σêt = Σyt − Tb1 − b2Σxt = 0 and Σxtêt = 0, leaving:

SST = SSR + SSE
Explained Variation in yt

Fitted values:  ŷt = b1 + b2xt

SSR measures the variation of ŷt around ȳ:

SSR = Σ (t=1 to T) (ŷt − ȳ)²

Unexplained Variation in yt

Residuals:  êt = yt − ŷt = yt − b1 − b2xt

SSE measures the variation of yt around ŷt:

SSE = Σ (t=1 to T) (yt − ŷt)² = Σ (t=1 to T) êt²
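The decomposition SST = SSR + SSE can be verified numerically on any fitted line. A sketch with made-up data:

```python
# Check SST = SSR + SSE on a small least squares fit.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0]
T = len(x)

xbar, ybar = sum(x) / T, sum(y) / T
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b1 = ybar - b2 * xbar
yhat = [b1 + b2 * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)
SSR = sum((yh - ybar) ** 2 for yh in yhat)
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))

print(SST, SSR + SSE)   # equal: the cross-product term drops out
```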
Correlation Analysis

Population:

ρ = cov(X,Y) / sqrt( var(X) var(Y) )

Sample:

var̂(X) = Σ (t=1 to T) (xt − x̄)² / (T−1)
var̂(Y) = Σ (t=1 to T) (yt − ȳ)² / (T−1)
côv(X,Y) = Σ (t=1 to T) (xt − x̄)(yt − ȳ) / (T−1)

r = côv(X,Y) / sqrt( var̂(X) var̂(Y) )

r = Σ(xt − x̄)(yt − ȳ) / sqrt( Σ(xt − x̄)² Σ(yt − ȳ)² )

R² is also the squared correlation between yt and ŷt, measuring "goodness of fit".
Regression Computer Output
Effects of Scaling the Data

Changing the scale of x and y by the same factor c:

yt = β1 + β2xt + et
yt/c = (β1/c) + β2(xt/c) + et/c
y*t = β*1 + β2x*t + e*t,   where y*t = yt/c,  x*t = xt/c,  β*1 = β1/c,  e*t = et/c

No change in the R², the t-statistics, or the regression results for β2, but all other stats change.

Functional Forms

The term "linear" in a simple regression model does not mean a linear relationship between variables, but a model in which the parameters enter the model in a linear way.

Look at each form and its slope and elasticity:
1. Linear:  yt = β1 + β2xt + et;  slope: β2;  elasticity: β2 xt/yt
2. Reciprocal
3. Log-Log
4. Log-Linear
5. Linear-Log
6. Log-Inverse
Useful Functional Forms

Reciprocal:  yt = β1 + β2(1/xt) + et
  slope: −β2/xt²;   elasticity: −β2/(xt yt)

Log-Log:  ln(yt) = β1 + β2 ln(xt) + et
  slope: β2 yt/xt;   elasticity: β2

Log-Linear:  ln(yt) = β1 + β2xt + et

Linear-Log:  yt = β1 + β2 ln(xt) + et

Log-Inverse:  ln(yt) = β1 − β2(1/xt) + et
  slope: β2 yt/xt²;   elasticity: β2/xt

Error term assumptions:
1. E(et) = 0
2. var(et) = σ²
3. cov(ei, ej) = 0
4. et ~ N(0, σ²)
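In the log-log form the slope coefficient is itself the elasticity, so regressing ln(y) on ln(x) recovers it directly. A sketch using noise-free data with constant elasticity 2 (y = e·x², so ln y = 1 + 2 ln x exactly):

```python
# Log-log regression recovers a constant elasticity as the slope.
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.e * xi ** 2 for xi in x]      # elasticity of y w.r.t. x is 2

lx = [math.log(v) for v in x]
ly = [math.log(v) for v in y]
T = len(x)

lxbar, lybar = sum(lx) / T, sum(ly) / T
b2 = sum((a - lxbar) * (b - lybar) for a, b in zip(lx, ly)) / \
     sum((a - lxbar) ** 2 for a in lx)
b1 = lybar - b2 * lxbar

print(b1, b2)   # intercept 1, elasticity 2, recovered exactly
```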
Economic Models
5. Phillips Curve

Nonlinear in both variables and parameters: wage rate (wt) and time (t).

%∆wt = (wt − wt-1)/wt-1 = γα + γη (1/ut)

where ut is the unemployment rate.

Chapter 7: The Multiple Regression Model
Statistical Properties of yt

1. E(yt) = β1 + β2xt2 + . . . + βKxtK
2. var(yt) = var(et) = σ²
3. cov(yt, ys) = cov(et, es) = 0,  t ≠ s
4. yt ~ N(β1 + β2xt2 + . . . + βKxtK, σ²)

Assumptions

1. yt = β1 + β2xt2 + . . . + βKxtK + et
2. E(yt) = β1 + β2xt2 + . . . + βKxtK
3. var(yt) = var(et) = σ²
4. cov(yt, ys) = cov(et, es) = 0,  t ≠ s
5. The values of xtk are not random.
6. yt ~ N(β1 + β2xt2 + . . . + βKxtK, σ²)

(If height is normally distributed and the normal ranges from minus infinity to plus infinity, pity the man minus three feet tall.)

(T − K) σ̂² / σ²  ~  χ²(T−K)
Gauss-Markov Theorem

Under the assumptions of the multiple regression model, the ordinary least squares estimators have the smallest variance of all linear and unbiased estimators. This means that the least squares estimators are the Best Linear Unbiased Estimators (BLUE).

Variances

yt = β1 + β2xt2 + β3xt3 + et

var(b2) = σ² / [ (1 − r23²) Σ(xt2 − x̄2)² ]

var(b3) = σ² / [ (1 − r23²) Σ(xt3 − x̄3)² ]

where  r23 = Σ(xt2 − x̄2)(xt3 − x̄3) / sqrt( Σ(xt2 − x̄2)² Σ(xt3 − x̄3)² )

When r23 = 0 these reduce to the simple regression formulas.

The variance of b2 is smaller when:
3. The variable's values are more spread out: Σ(xt2 − x̄2)² is large.
4. The correlation is close to zero: r23² ≈ 0.
Normal: since bk is a linear function of the yt's,

bk ~ N( βk, var(bk) )

z = (bk − βk) / sqrt( var(bk) )  ~  N(0,1)    for k = 1, 2, ..., K

Student-t:

t = (bk − βk) / sqrt( var̂(bk) ) = (bk − βk) / se(bk)

t has a Student-t distribution with df = (T − K).

Interval endpoints:  bk − tc se(bk) ,  bk + tc se(bk)
One Tail Test

H0: β3 ≤ 0,  H1: β3 > 0
t = b3 / se(b3)  ~  t(T−K),   df = T − K = T − 4
[Figure: rejection region of size α in the right tail beyond tc; central area (1 − α).]

Two Tail Test

H0: β2 = 0,  H1: β2 ≠ 0
t = b2 / se(b2)  ~  t(T−K),   df = T − K = T − 4
[Figure: rejection regions of size α/2 in each tail beyond −tc and tc; central area (1 − α).]

Goodness of fit:

R² = SSR/SST = 1 − SSE/SST

where  SSR = Σ (t=1 to T) (ŷt − ȳ)²  and  SST = Σ (t=1 to T) (yt − ȳ)²

0 ≤ R² ≤ 1

Adjusted:  R̄² = 1 − [SSE/(T−K)] / [SST/(T−1)]
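The adjusted R̄² penalizes extra regressors through the degrees-of-freedom ratio. A sketch with hypothetical sums of squares:

```python
# R² and adjusted R̄² from SSE/SST (numbers are hypothetical).
SST, SSE = 100.0, 20.0
T, K = 30, 4

R2 = 1 - SSE / SST                                 # 0.80
R2_adj = 1 - (SSE / (T - K)) / (SST / (T - 1))     # penalized version

print(R2, R2_adj)
```

With K > 1 regressors, R̄² is always below R², and it can fall when an irrelevant variable is added even though R² never falls.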
Single Restriction F-Test and Multiple Restriction F-Test

yt = β1 + β2Xt2 + β3Xt3 + β4Xt4 + et
Collinear Variables

The term "independent variable" means an explanatory variable is independent of the error term, but not necessarily independent of other explanatory variables.

Since economists typically have no control over the implicit "experimental design", explanatory variables tend to move together, which often makes sorting out their separate influences rather problematic.

Effects of Collinearity

A high degree of collinearity will produce:
1. no least squares output when collinearity is exact.
2. large standard errors and wide confidence intervals.
3. insignificant t-values even with high R² and a significant F-value.
4. estimates sensitive to deletion or addition of a few observations or "insignificant" variables.
5. good "within-sample" (same proportions) but poor "out-of-sample" (different proportions) prediction.
Intercept Dummy Variables

yt = β1 + β2Xt + β3Dt + β4DtXt + et

Harvest weight of corn:  "miracle" seed Dt = 1,  regular seed Dt = 0.
"miracle":  yt = (β1 + β3) + (β2 + β4)Xt + et
regular:    yt = β1 + β2Xt + et
[Figure: two regression lines; the "miracle" line has intercept β1 + β3 and slope β2 + β4, the regular line intercept β1 and slope β2.]

Wage rate:  for men Dt = 1,  for women Dt = 0.
Men:    yt = (β1 + β3) + β2Xt + et
Women:  yt = β1 + β2Xt + et
[Figure: parallel wage lines with common slope β2 and intercepts β1 + β3 (men) and β1 (women).]

Testing for discrimination in starting wage:  H0: β3 = 0  vs.  H1: β3 > 0
An Ineffective Affirmative Action Plan

Slope dummy only:  yt = β1 + β5Xt + β6DtXt + et,  with Dt = 1 for men and Dt = 0 for women.
Men:    yt = β1 + (β5 + β6)Xt + et
Women:  yt = β1 + β5Xt + et
Men and women have the same starting wage, β1, but their wage rates increase at different rates (difference = β6). β6 > 0 means that men's wage rates are increasing faster than women's wage rates.
[Figure: wage rate vs. years of experience; common intercept β1, slopes β5 + β6 (men) and β5 (women).]

Intercept and slope dummies:  yt = β1 + β2Xt + β3Dt + β4DtXt + et,  with Dt = 1 for men and Dt = 0 for women.
Men:    yt = (β1 + β3) + (β2 + β4)Xt + et
Women:  yt = β1 + β2Xt + et
Women are started at a higher wage: women are given the higher starting wage, β1, while men get the lower starting wage, β1 + β3 (note: β3 < 0). But men get a faster rate of increase in their wages, β2 + β4, which is higher than the rate of increase for women, β2 (since β4 > 0).
[Figure: wage rate vs. years of experience; intercepts β1 (women) and β1 + β3 < β1 (men), with the men's line steeper.]
Testing jointly for intercept and slope effects:

H0: β3 = β4 = 0  vs.  H1: otherwise

yt = wage rate,  Xt = years of experience

SSER = Σ (t=1 to T) (yt − b1 − b2Xt)²      (restricted model)

This model assumes equal wage rate variance.
II. Allowing for unequal variances (running three regressions):

Forcing men and women to have the same β1, β2:
Everyone:    yt = β1 + β2Xt + et            SSER

Allowing men and women to be different:
Men only:    ytm = δ1 + δ2Xtm + etm         SSEm
Women only:  ytw = γ1 + γ2Xtw + etw         SSEw

F = [(SSER − SSEU)/J] / [SSEU/(T−K)]

where SSEU = SSEm + SSEw,  J = # restrictions (J = 2),  K = unrestricted coefs. (K = 4).

Interaction Variables

1. Interaction dummies
2. Polynomial terms (special case of continuous interaction)
3. Interaction among continuous variables

Rate income is changing as we age (polynomial in Xt):

∂yt/∂Xt = β2 + 2β3Xt + 3β4Xt²

The slope changes as Xt changes.

Sleep and study time do not act independently: more study time will be more effective when combined with more sleep, and less effective when combined with less sleep.
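The three-regression F test above compares the pooled SSER with SSEU = SSEm + SSEw. A minimal sketch on made-up "men" and "women" samples with clearly different slopes:

```python
# Chow-style F test: pooled (restricted) vs separate (unrestricted) fits.
def ols_sse(x, y):
    """Sum of squared residuals from a simple least squares fit."""
    T = len(x)
    xbar, ybar = sum(x) / T, sum(y) / T
    b2 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    b1 = ybar - b2 * xbar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

xm, ym = [1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0]   # hypothetical "men"
xw, yw = [1.0, 2.0, 3.0, 4.0], [1.0, 1.6, 2.1, 2.4]   # hypothetical "women"

SSE_R = ols_sse(xm + xw, ym + yw)                # everyone: common β1, β2
SSE_U = ols_sse(xm, ym) + ols_sse(xw, yw)        # separate equations
J, K, T = 2, 4, len(xm) + len(xw)

F = ((SSE_R - SSE_U) / J) / (SSE_U / (T - K))
print(SSE_R, SSE_U, F)
```

A large F (relative to the F(J, T−K) critical value) rejects the restriction that the two groups share the same coefficients.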
Continuous interaction

Exam grade = f(sleep: Zt, study time: Bt)

yt = β1 + β2Zt + β3Bt + β4ZtBt + et

∂yt/∂Bt = β3 + β4Zt:  your studying is more effective with more sleep.
∂yt/∂Zt = β2 + β4Bt:  your mind sorts things out while you sleep (when you have things to sort out).

If Zt + Bt = 24 hours, then Bt = (24 − Zt):

yt = β1 + β2Zt + β3(24 − Zt) + β4Zt(24 − Zt) + et
yt = (β1 + 24β3) + (β2 − β3 + 24β4)Zt − β4Zt² + et
yt = δ1 + δ2Zt + δ3Zt² + et

Sleep needed to maximize your exam grade:
∂yt/∂Zt = δ2 + 2δ3Zt = 0   so   Zt = −δ2/(2δ3),   where δ2 > 0 and δ3 < 0.
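The substitution Bt = 24 − Zt and the grade-maximizing sleep formula can be checked numerically. The coefficients below are hypothetical, chosen only to illustrate:

```python
# Verify β1 + β2Z + β3(24−Z) + β4Z(24−Z) == (β1+24β3) + (β2−β3+24β4)Z − β4Z²
b1, b2, b3, b4 = 50.0, 2.0, 1.5, 0.25    # hypothetical coefficients

for Z in (0.0, 4.0, 8.0, 12.0):
    lhs = b1 + b2 * Z + b3 * (24 - Z) + b4 * Z * (24 - Z)
    rhs = (b1 + 24 * b3) + (b2 - b3 + 24 * b4) * Z - b4 * Z ** 2
    assert abs(lhs - rhs) < 1e-9

# Grade-maximizing sleep: Z* = −δ2/(2δ3), with δ2 = β2−β3+24β4, δ3 = −β4
d2, d3 = b2 - b3 + 24 * b4, -b4
Z_star = -d2 / (2 * d3)
print(Z_star)   # 13.0 hours for these coefficients
```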
1. Linear Probability Model

yi = 1 if the worker quits the job;  yi = 0 if the worker does not quit.
Probit Model

Latent variable, zi:  zi = β1 + β2Xi2 + . . .

Normal probability density function:
f(zi) = (1/√(2π)) e^(−0.5 zi²)

Normal cumulative probability function:
F(zi) = P[Z ≤ zi] = ∫ (from −∞ to zi) (1/√(2π)) e^(−0.5 u²) du

Since zi = β1 + β2Xi2 + . . . , we can substitute in to get:
pi = P[Z ≤ β1 + β2Xi2] = F(β1 + β2Xi2)

Logistic form:

Define pi:  pi = 1 / (1 + e^−(β1 + β2Xi2 + . . .))

For β2 > 0, pi will approach 1 as Xi2 approaches +∞.
[Figure: S-shaped curve rising from pi near 0 to pi near 1.]

The resulting estimates are consistent and asymptotically efficient.
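The logistic formula above maps any value of β1 + β2Xi2 into a probability between 0 and 1. A sketch with hypothetical coefficients:

```python
# pi = 1/(1 + e^−(β1 + β2·x)): monotone in x, bounded in (0, 1).
import math

def p_logit(b1, b2, x):
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))

b1, b2 = -2.0, 0.5          # hypothetical coefficients, β2 > 0
probs = [p_logit(b1, b2, x) for x in (0.0, 4.0, 20.0)]
print(probs)    # increasing; at x = 4 the index is 0, so pi = 0.5
```

For β2 > 0 the probability rises with Xi2 and approaches 1 as Xi2 grows large, matching the S-shaped curve on the slide.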
The Nature of Heteroskedasticity

Heteroskedasticity is a systematic pattern in the errors where the variances of the errors are not constant.

Ordinary least squares assumes that all observations are equally reliable. For efficiency (accurate estimation/prediction), reweight observations to ensure equal error variance.

Regression model:      yt = β1 + β2xt + et
zero mean:             E(et) = 0
homoskedasticity:      var(et) = σ²
nonautocorrelation:    cov(et, es) = 0,  t ≠ s
heteroskedasticity:    var(et) = σt²
[Figure: consumption vs. income scatter; the spread of consumption around the regression line grows with income, so rich people show much more variation than poor people.]
Properties of Least Squares

yt = β1 + β2xt + et,   heteroskedasticity: var(et) = σt²

1. Least squares is still linear and unbiased.
2. Least squares is not efficient.
3. The usual formulas give incorrect standard errors for least squares.
4. Confidence intervals and hypothesis tests based on the usual standard errors are wrong.

Incorrect formula for the least squares variance:
var(b2) = σ² / Σ(xt − x̄)²

Correct formula for the least squares variance:
var(b2) = Σ σt²(xt − x̄)² / [Σ(xt − x̄)²]²
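The two variance formulas can disagree badly once σt² varies with xt. A sketch with hypothetical error variances that grow with xt:

```python
# Naive vs correct var(b2) when var(et) = σt² differs across observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
sig2_t = [0.1, 0.2, 0.4, 0.8, 1.6]     # hypothetical σt², growing with xt

xbar = sum(x) / len(x)
Sxx = sum((xi - xbar) ** 2 for xi in x)

sig2_avg = sum(sig2_t) / len(sig2_t)
var_naive = sig2_avg / Sxx             # usual formula: wrong under het.
var_correct = sum(s * (xi - xbar) ** 2
                  for s, xi in zip(sig2_t, x)) / Sxx ** 2

print(var_naive, var_correct)
```

Here the correct variance exceeds the naive one because the largest error variances sit at the x values farthest from x̄, exactly where they matter most for the slope.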
When the error variance is proportional to xt, divide the equation through by √xt:

yt/√xt = β1(1/√xt) + β2(xt/√xt) + et/√xt

y*t = β1x*t1 + β2x*t2 + e*t

σ̂² provides an estimator of σ² using the 20 observations on "sweet" corn.

Generalized Least Squares

These steps describe weighted least squares:
1. Decide which variable is proportional to the heteroskedasticity (xt in the previous example).
2. The Goldfeld-Quandt test checks for the presence of heteroskedasticity.
Residual Plots

Plot residuals against one variable at a time after sorting the data by that variable to try to find a heteroskedastic pattern in the data.
[Figure: residuals êt vs. xt fanning out as xt grows.]

Goldfeld-Quandt Test

The Goldfeld-Quandt test can be used to detect heteroskedasticity in either the proportional case or for comparing two groups in the discrete case.

For proportional heteroskedasticity, it is first necessary to determine which variable, such as xt, is proportional to the error variance. Then sort the data from the largest to smallest values of that variable.
H1: α1 ≠ 0 and/or α2 ≠ 0, tested with the usual F test.
The Nature of Autocorrelation

Autocorrelation is a systematic pattern in the errors that can be either attracting (positive) or repelling (negative) autocorrelation. For efficiency (accurate estimation/prediction), all systematic information needs to be incorporated into the regression model.

[Figure: residual plots over time. Positive autocorrelation: et crosses the line not enough (attracting). No autocorrelation: et crosses the line randomly. Negative autocorrelation: et crosses the line too much (repelling).]
Generalized Least Squares

AR(1):  et = ρet−1 + νt

yt = β1 + β2xt + et

Lag the errors once:  et−1 = yt−1 − β1 − β2xt−1

Substitute in for et:
yt = β1 + β2xt + ρet−1 + νt
yt = β1 + β2xt + ρ(yt−1 − β1 − β2xt−1) + νt

yt − ρyt−1 = β1(1 − ρ) + β2(xt − ρxt−1) + νt

y*t = β*1 + β2x*t2 + νt
where  y*t = yt − ρyt−1,  x*t2 = xt − ρxt−1,  β*1 = β1(1 − ρ)

Problems estimating this model with least squares:
1. One observation is used up in creating the transformed (lagged) variables, leaving only (T − 1) observations for estimating the model.
2. The value of ρ is not known. We must find some way to estimate it.
y1 = β1 + β2x1 + e1

with error variance  var(e1) = σe² = σν²/(1 − ρ²).
The other observations all have error variance σν².

Given any constant c:  var(ce1) = c² var(e1).
If c = √(1 − ρ²), then:
var( √(1 − ρ²) e1 ) = (1 − ρ²) var(e1) = (1 − ρ²) σe² = (1 − ρ²) σν²/(1 − ρ²) = σν²

Multiply through by √(1 − ρ²) to get:

√(1 − ρ²) y1 = √(1 − ρ²) β1 + √(1 − ρ²) β2x1 + √(1 − ρ²) e1

The transformed error  ν1 = √(1 − ρ²) e1  has variance σν². This transformed first observation may now be added to the other (T − 1) observations to obtain the fully restored set of T observations.
If we had values for the e t’s, we could estimate: Next, estimate the following by least squares:
e t = ρ e t−1 + νt ^e = ρ ^e + ^ν
t t−1 t
First, use least squares to estimate the model: The least squares solution is:
yt = β1 + β2xt + et
Σ e^t ^et-1
T
t=2
The residuals from this estimation are: ^ρ =
Σ ^et-1
T
2
^e = y - b - b x t=2
t t 1 2 t
11.20  Prediction with AR(1) Errors

When errors are autocorrelated, the previous period's error may help
us predict next period's error. The best predictor, ŷT+1, for next
period is:

    ŷT+1 = β̂1 + β̂2xT+1 + ρ̂ ẽT

where β̂1 and β̂2 are generalized least squares estimates and ẽT is
given by:  ẽT = yT − β̂1 − β̂2xT.

11.21  For h periods ahead, the best predictor is:

    ŷT+h = β̂1 + β̂2xT+h + ρ̂^h ẽT

Assuming |ρ̂| < 1, the influence of ρ̂^h ẽT diminishes the further we
go into the future (the larger h becomes).
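The h-step predictor above is simple arithmetic. A small sketch with hypothetical GLS estimates and residual (all numbers below are illustrative, not from the text):

```python
# Hypothetical GLS estimates and last-period residual (illustrative values)
b1_hat, b2_hat, rho_hat = 2.0, 0.5, 0.6
e_T = 1.2          # e~_T = y_T - b1_hat - b2_hat * x_T

def forecast(x_future, h):
    """Best predictor h periods ahead: b1 + b2*x_{T+h} + rho^h * e~_T."""
    return b1_hat + b2_hat * x_future + rho_hat ** h * e_T

y_next = forecast(10.0, 1)   # 2 + 5 + 0.6*1.2    = 7.72
y_h3   = forecast(10.0, 3)   # 2 + 5 + 0.6^3*1.2  = 7.2592
```

Note how the AR(1) correction term shrinks geometrically as h grows, so distant forecasts revert to the regression line.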
Each firm gets its own coefficients, β1i, β2i and β3i, but those
coefficients are constant over time:  i = G, W;  t = 1, . . . , 20.
12.5  Estimating Separate Equations

We make the usual error term assumptions:

    E(eGt) = 0        E(eWt) = 0

homoskedasticity assumption:  σG² = σW²

12.6  The dummy variable model assumes that σG² = σW².
12.11  Seemingly Unrelated Regressions

When the error terms of two or more equations are correlated,
efficient estimation requires the use of a Seemingly Unrelated
Regressions (SUR) type estimator to take the correlation into
account. Be sure to use the SUR procedure in your regression software
program to estimate any equations that you believe might have
correlated errors.

12.12  Separate vs. Joint Estimation

SUR will give exactly the same results as estimating each equation
separately with OLS if either or both of the following two conditions
are true:

1. Every equation has exactly the same set of explanatory variables
   with exactly the same values.
2. There is no correlation between the error terms of any of the
   equations.
Testing for correlated errors:

    σ̂GW = (1/T) Σ êGt êWt,    σ̂W² = (1/T) Σ êWt²   (similarly σ̂G²)

    rGW² = σ̂GW² / (σ̂G² σ̂W²)

    λ = T rGW²,   and  λ ~ χ²(1)  asymptotically.
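The λ = T·rGW² statistic can be computed directly from the two residual series. A minimal sketch; the residual values below are hypothetical, made up only to illustrate the arithmetic:

```python
# Correlation test for SUR: lambda = T * r_GW^2 ~ chi2(1) asymptotically.
# e_G, e_W are hypothetical OLS residual series from the two equations.
e_G = [0.5, -1.2, 0.8, 0.3, -0.7, 1.1, -0.4, 0.2, -0.9, 0.6]
e_W = [0.4, -0.9, 0.6, 0.1, -0.5, 0.9, -0.2, 0.3, -0.8, 0.5]
T = len(e_G)

s_GW = sum(g * w for g, w in zip(e_G, e_W)) / T      # sigma_hat_GW
s_GG = sum(g * g for g in e_G) / T                   # sigma_hat_G^2
s_WW = sum(w * w for w in e_W) / T                   # sigma_hat_W^2

r2_GW = s_GW ** 2 / (s_GG * s_WW)
lam = T * r2_GW
# Compare lam to the chi-square(1) 5% critical value, 3.84.
correlated = lam > 3.84
```

Here the two residual series move closely together, so λ exceeds 3.84 and SUR estimation would be preferred over equation-by-equation OLS.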
12.17  Test for Equality of Fixed Effects

    Ho:  β11 = β12 = β13 = β14
    H1:  Ho not true

The joint null hypothesis Ho may be tested with the F-statistic:

    F = [(SSER − SSEU)/J] / [SSEU/(NT − K)]  ~  F(J, NT − K)

SSER is the restricted error sum of squares (one intercept).
SSEU is the unrestricted error sum of squares (four intercepts).
N is the number of cross-sectional units (N = 4).
K is the number of parameters in the model (K = 6).
J is the number of restrictions being tested (J = N−1 = 3).
T is the number of time periods.

12.18  Random Effects Model

    yit = β1i + β2x2it + β3x3it + eit,    β1i = β1 + µi

β1 is the population mean intercept. µi is an unobservable random
error that accounts for the cross-sectional differences. The µi are
independent of one another and of eit, so

    yit = (β1 + µi) + β2x2it + β3x3it + eit

with  cov(νit, νjs) = 0  for  i ≠ j.
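The fixed-effects F-test is a one-line computation once the two sums of squares are in hand. The SSE values below are hypothetical; N, T, K and J follow the setup above:

```python
# F-test for equal intercepts across N = 4 firms (hypothetical SSE values).
SSE_R = 1536.8   # restricted: one common intercept (hypothetical)
SSE_U = 1110.3   # unrestricted: four separate intercepts (hypothetical)
N, T, K = 4, 20, 6
J = N - 1        # number of restrictions

F = ((SSE_R - SSE_U) / J) / (SSE_U / (N * T - K))
# Compare F to the F(3, 74) 5% critical value (roughly 2.73);
# reject H0 of equal intercepts if F exceeds it.
reject = F > 2.73
```

With these illustrative numbers F is well above the critical value, so the single-intercept restriction would be rejected in favor of separate fixed effects.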
13.2  Keynesian Macro Model

Assumptions of the Simple Keynesian Model:

1. Consumption, c, is a function of income, y:  c = β1 + β2 y
2. The income identity:  y = c + i

13.3  The Structural Equations

Consumption is a function of income:

    ct = β1 + β2 yt + et

The income identity:

    yt = ct + it

Since yt contains et, they are correlated.
13.8  Deriving the Reduced Form

    ct = β1 + β2 yt + et
    yt = ct + it

Substitute the identity into the consumption function:

    ct = β1 + β2(ct + it) + et
    (1 − β2)ct = β1 + β2 it + et

13.9  Dividing through by (1 − β2):

    ct = β1/(1−β2) + [β2/(1−β2)] it + [1/(1−β2)] et

The Reduced Form Equation:

    ct = π11 + π21 it + νt,   where  νt = [1/(1−β2)] et

The reduced form parameters are π11 and π21. Since ct and yt are
related through the identity yt = ct + it, the reduced form for yt is

    yt = π12 + π22 it + νt

with the same error term νt, and it is easy to show that:

    π11 = π12 = β1/(1−β2)
    π22 = 1 + π21 = 1/(1−β2)

Once the reduced form parameters are estimated, the identification
problem is to determine if the original structural parameters can be
expressed uniquely in terms of the reduced form parameters. Here:

    β̂1 = π̂11/(1 + π̂21)        β̂2 = π̂21/(1 + π̂21)
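This indirect-least-squares recovery can be sketched numerically: estimate the reduced form by OLS, then back out β1 and β2. The data below are simulated under hypothetical true values (β1 = 20, β2 = 0.6), purely to illustrate the mapping:

```python
import random

random.seed(1)
beta1, beta2 = 20.0, 0.6   # hypothetical true structural parameters
n = 500
inv = [random.uniform(10, 30) for _ in range(n)]
# Reduced form: c = beta1/(1-b2) + (b2/(1-b2))*i + e/(1-b2)
c = [beta1 / (1 - beta2) + (beta2 / (1 - beta2)) * i
     + random.gauss(0, 1) / (1 - beta2) for i in inv]

# OLS of c on i gives pi_hat_11 and pi_hat_21
ibar = sum(inv) / n
cbar = sum(c) / n
pi21 = sum((i - ibar) * (ci - cbar) for i, ci in zip(inv, c)) / \
       sum((i - ibar) ** 2 for i in inv)
pi11 = cbar - pi21 * ibar

# Recover structural parameters from the reduced form estimates
beta2_hat = pi21 / (1 + pi21)
beta1_hat = pi11 / (1 + pi21)
```

The recovered beta1_hat and beta2_hat land close to the values used to generate the data, confirming the equation is exactly identified.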
13.14  Identification

An equation is under-identified if its structural (behavioral)
parameters cannot be expressed in terms of the reduced form
parameters. An equation is exactly identified if its structural
(behavioral) parameters can be uniquely expressed in terms of the
reduced form parameters. An equation is over-identified if there is
more than one solution for expressing its structural (behavioral)
parameters in terms of the reduced form parameters.

13.15  The Identification Problem

A system of M equations containing M endogenous variables must
exclude at least M−1 variables from a given equation in order for the
parameters of that equation to be identified and to be able to be
consistently estimated.
2SLS: Stage I

Structural equations:

    yt1 = β1 + β2 yt2 + β3 xt1 + et1
    yt2 = α1 + α2 yt1 + α3 xt2 + et2

Reduced form equations:

    yt1 = π11 + π21 xt1 + π31 xt2 + νt1
    yt2 = π12 + π22 xt1 + π32 xt2 + νt2

Use least squares to get fitted values:

    ŷt1 = π̂11 + π̂21 xt1 + π̂31 xt2,   so   yt1 = ŷt1 + ν̂t1
    ŷt2 = π̂12 + π̂22 xt1 + π̂32 xt2,   so   yt2 = ŷt2 + ν̂t2

Substitute in for yt1, yt2:

    yt1 = β1 + β2(ŷt2 + ν̂t2) + β3 xt1 + et1
    yt2 = α1 + α2(ŷt1 + ν̂t1) + α3 xt2 + et2
13.20  2SLS: Stage II (continued)

    yt1 = β1 + β2 ŷt2 + β3 xt1 + ut1
    yt2 = α1 + α2 ŷt1 + α3 xt2 + ut2

where  ut1 = β2ν̂t2 + et1  and  ut2 = α2ν̂t1 + et2.

14.1  Chapter 14: Nonlinear Least Squares

For the model yt = α + et, minimizing SSE gives the first-order
condition ∂SSE/∂α = −2 Σ(yt − α̂) = 0. For the model with
et = yt − βxt, the normal equation is  Σ xtyt − β̂ Σ xt² = 0.
14.6  Find Minimum of Nonlinear SSE

    SSE = Σ (yt − xt^β)²

14.7  Conclusion
14.12  Gauss-Newton just runs OLS on this transformed truncated
Taylor series. Recall that:

    y*(m) ≡ y − f(X, b(m)) + f′(X, b(m)) b(m)

    b(m+1) = [f′(X, b(m))ᵀ f′(X, b(m))]⁻¹ f′(X, b(m))ᵀ y*(m)

14.13  Consider the model

    yt = b1 + b2 Xt2 + b3 Xt3 + εt,   for t = 1, . . . , n,

where  εt = ρ εt−1 + ut  and ut satisfies the conditions
E ut = 0,  E ut² = σu²,  E ut us = 0 for s ≠ t. Therefore, ut is
nonautocorrelated and homoskedastic.

Durbin's Method is to set aside a copy of the equation, lag it once,
multiply by ρ and subtract the new equation from the original
equation, then move the ρyt−1 term to the right side and estimate ρ
along with the b's by OLS.

Lag once and multiply by ρ:

    ρ yt−1 = ρ b1 + ρ b2 Xt−1,2 + ρ b3 Xt−1,3 + ρ εt−1

Subtract from the original and move ρ yt−1 to the right side:

    yt = b1(1−ρ) + b2(Xt2 − ρXt−1,2) + b3(Xt3 − ρXt−1,3) + ρyt−1 + ut
14.18  The structural (restricted, behavioral) equation is:

    yt = b1(1−ρ) + b2(Xt2 − ρXt−1,2) + b3(Xt3 − ρXt−1,3) + ρyt−1 + ut

Now Durbin separates out the terms as follows:

    yt = b1(1−ρ) + b2Xt2 − b2ρXt−1,2 + b3Xt3 − b3ρXt−1,3 + ρyt−1 + ut

The corresponding reduced form (unrestricted) equation is:

    yt = α1 + α2Xt,2 + α3Xt−1,2 + α4Xt,3 + α5Xt−1,3 + α6yt−1 + ut

where  α1 = b1(1−ρ),  α2 = b2,  α3 = −b2ρ,  α4 = b3,  α5 = −b3ρ,
α6 = ρ.

14.19  Given OLS estimates α̂1, α̂2, α̂3, α̂4, α̂5, α̂6, we can get three
separate and distinct estimates for ρ:

    ρ̂ = −α̂3/α̂2        ρ̂ = −α̂5/α̂4        ρ̂ = α̂6

These three separate estimates of ρ are in conflict! It is difficult
to know which one to use as "the" legitimate estimate of ρ. Durbin
used the last one. Consequently, the structural equation should be
estimated using a nonlinear method such as the Gauss-Newton algorithm
for nonlinear least squares, with

    f(Xt, b) = b1(1−ρ) + b2Xt2 − b2ρXt−1,2 + b3Xt3 − b3ρXt−1,3 + ρyt−1

and partial derivatives:

    ∂yt/∂b1 = (1 − ρ)
    ∂yt/∂b2 = (Xt,2 − ρXt−1,2)
    ∂yt/∂b3 = (Xt,3 − ρXt−1,3)
    ∂yt/∂ρ  = (−b1 − b2Xt−1,2 − b3Xt−1,3 + yt−1)
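To make the Gauss-Newton idea concrete, here is a minimal sketch for the simplest possible case: the one-parameter nonlinear model yt = xt^β + et. The true β, sample size, and noise level are hypothetical; the update is the single-parameter version of the iteration above (step = [Σf′²]⁻¹ Σf′(y − f)):

```python
import math
import random

random.seed(7)

# Gauss-Newton for the single-parameter model y_t = x_t**beta + e_t,
# where the derivative of f(x, b) = x**b with respect to b is x**b * ln(x).
true_beta = 1.7
x = [random.uniform(1.5, 4.0) for _ in range(100)]
y = [xi ** true_beta + random.gauss(0, 0.1) for xi in x]

b = 1.0  # starting value
for _ in range(25):
    f = [xi ** b for xi in x]
    fp = [xi ** b * math.log(xi) for xi in x]        # f'(x, b)
    num = sum(fpi * (yi - fi) for fpi, fi, yi in zip(fp, f, y))
    den = sum(fpi ** 2 for fpi in fp)
    step = num / den                                  # OLS of residual on f'
    b += step
    if abs(step) < 1e-10:
        break
```

Each pass is just an OLS regression of the current residuals on the derivative, exactly the "OLS on the truncated Taylor series" interpretation given above.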
15.2 15.3
Arithmetic lag:

    yt = α + γ[(n+1)xt + nxt−1 + (n−1)xt−2 + . . . + xt−n] + et

Step 5: Run the least squares regression on:

    yt = α + γ zt + et
15.8 15.9
Arithmetic Lag Structure

[Figure: lag weights βi decline linearly from β0 = (n+1)γ to zero,
where n = the length of the lag.]

Polynomial Lag Structure

Proposed by Shirley Almon (1965): the lag weights βi fit a polynomial
of degree p.

    yt = α + γ0 zt0 + γ1 zt1 + γ2 zt2 + et

Step 5: Express the β̂i's in terms of γ̂0, γ̂1, and γ̂2:

    β̂0 = γ̂0
    β̂1 = γ̂0 + γ̂1 + γ̂2
    β̂2 = γ̂0 + 2γ̂1 + 4γ̂2
    β̂3 = γ̂0 + 3γ̂1 + 9γ̂2
    β̂4 = γ̂0 + 4γ̂1 + 16γ̂2

[Figure 15.3: polynomial (inverted-U) lag weights β0, . . . , β4
plotted against i = 0, 1, 2, 3, 4.]
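The Almon mapping β̂i = γ̂0 + iγ̂1 + i²γ̂2, and the constructed regressors ztj = Σi i^j·xt−i, are easy to sketch. The γ values and x series below are hypothetical, chosen only to show the bookkeeping:

```python
# Recovering Almon lag weights from hypothetical polynomial coefficients
# (degree p = 2, lag length n = 4):
g0, g1, g2 = 0.30, 0.15, -0.05   # hypothetical OLS estimates of gamma_j
beta = [g0 + g1 * i + g2 * i ** 2 for i in range(5)]  # beta_i

# Constructing the z regressors from a hypothetical x series:
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
n = 4

def z(t, j):
    """z_tj = sum over i of i**j * x_{t-i}; usable once t >= n."""
    return sum((i ** j) * x[t - i] for i in range(n + 1))
```

With these γ's, the recovered weights trace the inverted-U pattern of Figure 15.3: β = 0.30, 0.40, 0.40, 0.30, 0.10 for i = 0, . . . , 4.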
15.14 15.15
Geometric Lag Structure

Infinite distributed lag model:

    yt = α + Σi=0..∞ βi xt−i + et        (15.3.1)

Geometric lag structure:  βi = β φ^i,  where |φ| < 1 and βφ^i > 0, so

    β0 = β,  β1 = βφ,  β2 = βφ²,  β3 = βφ³, . . .

Substituting βi = β φ^i gives the infinite geometric lag:

    yt = α + β(xt + φxt−1 + φ²xt−2 + φ³xt−3 + . . .) + et

Long-run multiplier:

    β(1 + φ + φ² + φ³ + . . .) = β/(1 − φ)

[Figure 15.5: geometrically declining lag weights against i.]

Lag everything once, multiply by φ and subtract from the original:
15.20 15.21
The Koyck Transformation

    yt − φyt−1 = α(1−φ) + βxt + (et − φet−1)

Solve for yt by adding φyt−1 to both sides:

    yt = α(1−φ) + φyt−1 + βxt + (et − φet−1)

Defining δ1 = α(1−φ), δ2 = φ, and δ3 = β, use ordinary least squares:

    yt = δ1 + δ2 yt−1 + δ3 xt + (et − φet−1)
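Once the δ's are estimated, the structural parameters follow by the inverse mapping. A sketch with hypothetical OLS estimates (the δ values are illustrative):

```python
# Recovering the geometric-lag parameters from hypothetical OLS estimates
# of y_t = delta1 + delta2*y_{t-1} + delta3*x_t + error:
d1, d2, d3 = 1.2, 0.6, 0.8      # hypothetical OLS estimates

phi = d2                         # delta2 = phi
beta = d3                        # delta3 = beta
alpha = d1 / (1 - phi)           # delta1 = alpha * (1 - phi)
long_run = beta / (1 - phi)      # long-run multiplier beta/(1 - phi)
```

So φ = 0.6, β = 0.8 imply α = 3.0 and a long-run multiplier of 2.0: a one-unit permanent increase in x eventually raises y by twice the impact effect.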
15.26 15.27  Adaptive Expectations
    yt = α + β x*t + et

Expectations adjust by:  x*t − x*t−1 = λ(xt−1 − x*t−1),
which rearranges to:

    x*t = λxt−1 + (1−λ)x*t−1

Lag the model once and multiply by (1−λ):

    (1−λ)yt−1 = (1−λ)α + (1−λ)β x*t−1 + (1−λ)et−1
15.32  Partial Adjustment

16.1  Chapter 16: Time Series Analysis
. . . 4. forecast the variable of interest using the estimated model.

Time Series Analysis is useful for short-term forecasting only.

Univariate Time Series Analysis can be used to relate the current
values of a single economic variable to:
1. its past values
2. the values of current and past random errors

Three types of Univariate Time Series Analysis processes will be
discussed in this chapter:
1. autoregressive (AR)
2. moving average (MA)
16.6 16.7
Multivariate Time Series Analysis can be used to relate the current
value of each of several economic variables to:

First-Order Autoregressive Processes, AR(1):

    yt = δ + θ1yt−1 + et,   t = 1, 2, . . . , T.        (16.1.1)

The θi's are parameters generally between −1 and +1. et is an
uncorrelated random error with mean zero and variance σe².

Consequently, least squares is no longer a best linear unbiased
estimator (BLUE), but it does have some good asymptotic properties,
including consistency.

An estimated AR(2) model:

    yt = 0.5051 + 1.5537 yt−1 − 0.6515 yt−2
        (0.1267)  (0.0707)     (0.0708)

The Partial Autocorrelation Function (PAF)

The PAF is the sequence of correlations between (yt and yt−1),
(yt and yt−2), (yt and yt−3), and so on, given that the effects of
earlier lags on yt are held constant.
16.12  Partial Autocorrelation Function

Data simulated from this model:  yt = 0.5yt−1 + 0.3yt−2 + et

16.13  Using the AR Model for Forecasting

Unemployment rate:  yT−1 = 6.63 and yT = 6.20.

    ŷT+1 = δ̂ + θ̂1 yT + θ̂2 yT−1
    ŷT+2 = δ̂ + θ̂1 ŷT+1 + θ̂2 yT

µ is the intercept; the αi's are unknown parameters. Minimize the
sum of squared deviations.
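The two forecasts are straightforward to compute. This sketch assumes the estimated AR(2) model shown earlier (yt = 0.5051 + 1.5537yt−1 − 0.6515yt−2) applies to the unemployment series, which is an assumption made here for illustration:

```python
# Two-step-ahead AR(2) forecast using the estimated coefficients shown
# earlier (assumed here to be the unemployment-rate model).
d, th1, th2 = 0.5051, 1.5537, -0.6515
y_Tm1, y_T = 6.63, 6.20

y_T1 = d + th1 * y_T + th2 * y_Tm1    # uses observed y_T and y_{T-1}
y_T2 = d + th1 * y_T1 + th2 * y_T     # uses forecast y_{T+1} and observed y_T
```

Note the second-step forecast recycles the first-step forecast in place of the unobserved yT+1.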
16.18 16.19
Autocorrelation Function

Choosing the lag length, q, for MA(q): the Autocorrelation Function
(AF) is the sequence of correlations between (yt and yt−1),
(yt and yt−2), (yt and yt−3), and so on, without holding the effects
of earlier lags on yt constant. The PAF controls for the effects of
previous lags, but the AF does not control for such effects.

Data simulated from this model:  yt = et − 0.9 et−1.

[Figure: sample AF plotted against k with ±2/√T significance bands.]
This sample AF suggests a first-order process, MA(1), which is
correct.

(rkk is the last, kth, coefficient in a kth order AR process.)
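The sample AF and its ±2/√T bands can be computed directly. A sketch that simulates the MA(1) model above (the seed and sample size are arbitrary choices):

```python
import random

random.seed(3)

# Sample autocorrelation function for a simulated MA(1) series
# y_t = e_t - 0.9*e_{t-1}; only lag 1 should stand out.
T = 400
e = [random.gauss(0, 1) for _ in range(T + 1)]
y = [e[t] - 0.9 * e[t - 1] for t in range(1, T + 1)]

ybar = sum(y) / T
c0 = sum((yt - ybar) ** 2 for yt in y) / T

def r(k):
    """Sample autocorrelation at lag k."""
    ck = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T)) / T
    return ck / c0

band = 2 / T ** 0.5
significant = [k for k in range(1, 9) if abs(r(k)) > band]
```

For an MA(1) with coefficient −0.9, the theoretical lag-1 autocorrelation is −0.9/(1 + 0.81) ≈ −0.5, so r(1) falls well outside the band while higher lags mostly stay inside it.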
16.24 16.25
Vector Autoregression (VAR)

Use a VAR for two or more interrelated time series:

    yt = θ0 + θ1yt−1 + . . . + θpyt−p + φ1xt−1 + . . . + φpxt−p + et
    xt = δ0 + δ1yt−1 + . . . + δpyt−p + α1xt−1 + . . . + αpxt−p + ut

1. extension of the AR model.
2. all variables endogenous.
3. no structural (behavioral) economic model.
4. all variables jointly determined (over time).
5. no simultaneous equations (same time).
16.30 16.31
The random error terms in a VAR model may be correlated if they are
affected by relevant factors that are not in the model, such as
government actions or national/international events.

Since VAR equations all have exactly the same set of explanatory
variables, the usual seemingly unrelated regressions estimation
produces exactly the same estimates as least squares on each equation
separately.

Least Squares is Consistent

Consequently, regardless of whether the VAR random error terms are
correlated or not, least squares estimation of each equation
separately will provide consistent regression coefficient estimates.
16.36 16.37
Error Correction Model

Step 1:    Step 2:
17.2 17.3
What the Book Has Covered

• Formulation: economic model ====> econometric model.
• Estimation: selecting the appropriate method.
• Interpretation: how the xt's impact on the yt.
• Inference: testing, intervals, prediction.

Topics for This Chapter

1. Types of Data by Source
2. Nonexperimental Data
3. Text Data vs. Electronic Data
4. Selecting a Topic
5. Writing an Abstract
6. Research Report Format
17.8 17.9
Quantitative vs. Qualitative

International Data
17.14 17.15
Resources for Economists

Resources for Economists by Bill Goffe
http://econwpa.wustl.edu/EconFAQ/EconFAQ.html

Bill Goffe provides a vast database of information about the
economics profession, including economic organizations, working
papers and reports, and economic data series.

Internet Data Sources

A few of the items on Bill Goffe's Table of Contents:
• Shortcut to All Resources.
• Macro and Regional Data.
• Other U.S. Data.
• World and Non-U.S. Data.
• Finance and Financial Markets.
• Data Archives.
• Journal Data and Program Archives.
17.20 17.21
Selecting a Topic

General tips for selecting a research topic:
• "What am I interested in?"
• Choose a well-defined, relatively simple topic.
• Ask your professor for ideas and references.
• Search the Journal of Economic Literature (ECONLIT).
• Make sure appropriate data are available.
• Avoid extremely difficult econometrics.
• Plan your work and work your plan.

Writing an Abstract

An abstract of less than 500 words should include:
(i) a concise statement of the problem.
(ii) key references to available information.
(iii) a description of the research design, including:
    (a) economic model
    (b) statistical model
    (c) data sources
    (d) estimation, testing and prediction
(iv) the contribution of the work.