Professional Documents
Culture Documents
Forecasting
(SW Chapter 12)
12-1
Example #1 of time series data: US rate of inflation
12-2
Example #2: US rate of unemployment
12-3
Why use time series data?
To develop forecasting models
oWhat will the rate of inflation be next year?
To estimate dynamic causal effects
oIf the Fed increases the Federal Funds rate now,
what will be the effect on the rates of inflation and
unemployment in 3 months? in 12 months?
oWhat is the effect over time on cigarette
consumption of a hike in the cigarette tax
Plus, sometimes you dont have any choice
oRates of inflation and unemployment in the US can
be observed only over time.
12-4
Time series data raises new technical issues
Time lags
Correlation over time (serial correlation or
autocorrelation)
Forecasting models that have no causal interpretation
(specialized tools for forecasting):
oautoregressive (AR) models
oautoregressive distributed lag (ADL) models
Conditions under which dynamic effects can be
estimated, and how to estimate them
Calculation of standard errors when the errors are
serially correlated
12-5
Using Regression Models for Forecasting
(SW Section 12.1)
12-8
Example: Quarterly rate of inflation at an annual rate
CPI in the first quarter of 1999 (1999:I) = 164.87
CPI in the second quarter of 1999 (1999:II) = 166.03
Percentage change in CPI, 1999:I to 1999:II
166.03 164.87 1.16
= 100 = 100 = 0.703%
164.87 164.87
Percentage change in CPI, 1999:I to 1999:II, at an
annual rate = 40.703 = 2.81% (percent per year)
Like interest rates, inflation rates are (as a matter of
convention) reported at an annual rate.
Using the logarithmic approximation to percent changes
yields 4100[log(166.03) log(164.87)] = 2.80%
12-9
Example: US CPI inflation its first lag and its change
CPI = Consumer price index (Bureau of Labor Statistics)
12-10
Autocorrelation
Y ,Y )
cov(
j = t t j
Y)
var( t
where
T
1
Y ,Y ) =
cov( t t j
T j 1 t j 1
(Yt Y j 1,T )(Yt j Y1,T j )
12-14
The inflation rate is highly serially correlated (1 = .85)
Last quarters inflation rate contains much information
about this quarters inflation rate
The plot is dominated by multiyear swings
But there are still surprise movements!
12-15
More examples of time series & transformations
12-16
More examples of time series & transformations, ctd.
12-17
Stationarity: a key idea for external validity of time
series regression
Stationarity says that the past is like the present and
the future, at least in a probabilistic sense.
12-18
Autoregressions
(SW Section 12.3)
against Yt1
th
o In a p order autoregression, Yt is regressed
against Yt1,Yt2,,Ytp.
12-19
The First Order Autoregressive (AR(1)) Model
Yt = 0 + 1Yt1 + ut
12-20
Example: AR(1) model of the change in inflation
Estimated using data from 1962:I 1999:IV:
(0.14) (0.106)
12-21
Example: AR(1) model of inflation STATA
First, let STATA know you are using time series data
generate time=q(1959q1)+_n-1; _n is the observation no.
So this command creates a new variable
time that has a special quarterly
date format
12-22
Example: AR(1) model of inflation STATA, ctd.
. gen lcpi = log(cpi); variable cpi is already in memory
12-23
Example: AR(1) model of inflation STATA, ctd
Syntax: L.dinf is the first lag of dinf
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf |
L1 | -.2109525 .1059828 -1.99 0.048 -.4203645 -.0015404
_cons | .0188171 .1350643 0.14 0.889 -.2480572 .2856914
------------------------------------------------------------------------------
if tin(1962q1,1999q4)
STATA time series syntax for using only observations between 1962q1 and
1999q4 (inclusive).
This requires defining the time scale first, as we did above
12-24
Forecasts and forecast errors
A note on terminology:
A predicted value refers to the value of Y predicted
(using a regression) for an observation in the sample
used to estimate the regression this is the usual
definition
A forecast refers to the value of Y forecasted for an
observation not in the sample used to estimate the
regression.
Predicted values are in sample
Forecasts are forecasts of the future which cannot
have been used to estimate the regression.
12-25
Forecasts: notation
Yt|t1 = forecast of Yt based on Yt1,Yt2,, using the
population (true unknown) coefficients
Yt|t 1 = forecast of Yt based on Yt1,Yt2,, using the
estimated coefficients, which were estimated using
data through period t1.
For an AR(1),
Yt|t1 = 0 + 1Yt1
Yt|t 1 = 0 + 1 Yt1, where 0 and 1 were estimated
using data through period t1.
12-26
Forecast errors
12-27
The root mean squared forecast error (RMSFE)
12-28
Example: forecasting inflation using and AR(1)
so
Inf 2000:I |1999:IV = Inf1999:IV + Inf 2000: I |1999: IV = 3.2 0.1 = 3.1
12-29
The pth order autoregressive model (AR(p))
.04Inft4, R 2 = 0.21
(.10)
12-31
Example: AR(4) model of inflation STATA
. reg dinf L(1/4).dinf if tin(1962q1,1999q4), r;
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf |
L1 | -.2078575 .09923 -2.09 0.038 -.4039592 -.0117558
L2 | -.3161319 .0869203 -3.64 0.000 -.4879068 -.144357
L3 | .1939669 .0847119 2.29 0.023 .0265565 .3613774
L4 | -.0356774 .0994384 -0.36 0.720 -.2321909 .1608361
_cons | .0237543 .1239214 0.19 0.848 -.2211434 .268652
------------------------------------------------------------------------------
NOTES
12-32
12-33
Example: AR(4) model of inflation STATA, ctd.
. test L2.dinf L3.dinf L4.dinf; L2.dinf is the second lag of dinf, etc.
( 1) L2.dinf = 0.0
( 2) L3.dinf = 0.0
( 3) L4.dinf = 0.0
F( 3, 147) = 6.43
Prob > F = 0.0004
12-34
Digression: we used Inf, not Inf, in the ARs. Why?
Inft = 0 + 1Inft1 + ut
or
Inft Inft1 = 0 + 1(Inft1 Inft2) + ut
or
Inft = Inft1 + 0 + 1Inft1 1Inft2 + ut
so
Inft = 0 + (1+1)Inft1 1Inft2 + ut
12-35
So why use Inft, not Inft?
AR(1) model of Inf: Inft = 0 + 1Inft1 + ut
AR(2) model of Inf: Inft = 0 + 1Inft + 2Inft1 + vt
12-36
Here, Inft is strongly serially correlated so to keep
ourselves in a framework we understand, the
regressions are specified using Inf
For optional reading, see SW Section 12.6, 14.3, 14.4
12-37
Time Series Regression with Additional Predictors
and the Autoregressive Distributed Lag (ADL) Model
(SW Section 12.4)
12-41
Example: dinf and unem STATA
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf |
L1 | -.3629871 .0926338 -3.92 0.000 -.5460956 -.1798786
L2 | -.3432017 .100821 -3.40 0.001 -.5424937 -.1439096
L3 | .0724654 .0848729 0.85 0.395 -.0953022 .240233
L4 | -.0346026 .0868321 -0.40 0.691 -.2062428 .1370377
unem |
L1 | -2.683394 .4723554 -5.68 0.000 -3.617095 -1.749692
L2 | 3.432282 .889191 3.86 0.000 1.674625 5.189939
L3 | -1.039755 .8901759 -1.17 0.245 -2.799358 .719849
L4 | .0720316 .4420668 0.16 0.871 -.8017984 .9458615
_cons | 1.317834 .4704011 2.80 0.006 .3879961 2.247672
------------------------------------------------------------------------------
12-42
Example: ADL(4,4) model of inflation STATA, ctd.
( 1) L2.dinf = 0.0
( 2) L3.dinf = 0.0
( 3) L4.dinf = 0.0
( 1) L.unem = 0.0
( 2) L2.unem = 0.0
( 3) L3.unem = 0.0
( 4) L4.unem = 0.0
The null hypothesis that the coefficients on the lags of the unemployment
rate are all zero is rejected at the 1% significance level using the F-
statistic
12-43
The test of the joint hypothesis that none of the Xs is a
useful predictor, above and beyond lagged values of Y, is
called a Granger causality test