
Chapter VII.

Slide 1
VII. Introduction to Time Series: AR(1) Model and Forecasting
a. Introduction to Dependent Observations
b. Checking for Independence
c. Autocorrelation
d. The AR(1) Model
e. Random Walks
f. Trend Models
g. An Example of Trend Modeling
h. An Example of a Time Series Regression
Chapter VII. Slide 2
a. Introduction to Dependent Observations
In autoregressive models, we consider observations taken
over time.
To denote this, we will index the observations with the letter
t rather than the letter i.
Our data will be observations on Y
1
, Y
2
, ...Y
t
, ...where t
indexes the day, month, year, or any time interval.
Key new idea:
Exploit the dependence in the series
Time series analysis is about uncovering, modeling, and
exploiting dependence
Chapter VII. Slide 3
a. Introduction to Dependent Observations
We will NOT assume that Y_{t-1} is independent of Y_t.
Example: Is tomorrow's temperature independent of today's?
Suppose Y_1, ..., Y_T are the temperatures measured daily for several
years. Which of the following two predictors would work better:
i. the average of the temperatures from the previous year
ii. the temperature on the previous day?
If the readings are iid N(μ, σ²), what would be your prediction for
Y_{T+1}?
This example demonstrates that we should handle dependent
time series quite differently from independent series.
Chapter VII. Slide 4
a. Introduction to Dependent Observations
The Lake Michigan Time Series
The mean June level of Lake Michigan in meters above sea
level (lmich_yr), 1918-2006
Use Minitab's Time Series Plot command (under the Graph menu) to produce
this graph.
[Photo: Storm off Promontory Point]
Chapter VII. Slide 5
a. Introduction to Dependent Observations
Monthly US Beer Production (millions of barrels)
Strong Seasonality
Time Series Plot of b_prod
Chapter VII. Slide 6
a. Introduction to Dependent Observations
What Does IID Data Look Like?
many (but not too many) crossings of the mean
Time Series Plot of IID
Chapter VII. Slide 7
b. Checking for Independence
It is not always easy just to look at the data and decide
whether a time series is independent.
So how can we tell?
Plot Y_t vs. Y_{t-1} to check for a relationship,
or
plot Y_t vs. Y_{t-s} for s = 1, 2, ...
Independence: knowing Y_t does not help you in predicting Y_{t+1}.
Chapter VII. Slide 8
b. Checking for Independence
How do we do this in Minitab? Use the lag command:
MTB > lag c2 c3
MTB > lag c3 c4

 t   Y(t)   Y(t-1)   Y(t-2)
     C2     C3       C4
 1    5      *        *
 2    8      5        *
 3    1      8        5
 4    3      1        8
 5    9      3        1
 6    4      9        3

C2 holds Y, C3 holds Y lagged once, and C4 holds Y lagged
twice. Now each row has Y at time t, Y one period ago, and
Y two periods ago.
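The slides use Minitab's lag command; as a sketch, the same operation can be done in Python with pandas (the series values below are the ones from the example above):

```python
import pandas as pd

# The six illustrative Y values from the example above
y = pd.Series([5, 8, 1, 3, 9, 4], name="Y")

# Minitab's "lag" command shifts a column down one period; pandas' shift()
# does the same, marking the undefined leading entries as missing (NaN,
# which plays the role of Minitab's *)
df = pd.DataFrame({"Y(t)": y, "Y(t-1)": y.shift(1), "Y(t-2)": y.shift(2)})
print(df)
```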
Chapter VII. Slide 9
b. Checking for Independence
Now let's return to the lake data.
First, let's plot Level_t vs. Level_{t-1}: Corr = .794
Each point is a pair of adjacent years,
e.g. (Level_1929, Level_1930).
Now let's plot Level_t vs. Level_{t-2}: Corr = .531
Chapter VII. Slide 10
c. Autocorrelation
Time series is about dependence. We use correlation as a
measure of dependence.
Although we have only one variable, we can compute the
correlation between Y_t and Y_{t-1}, or between Y_t and Y_{t-2}.
The correlations between Y's at different times are called
autocorrelations.
However, we must assume that all the Y's have:
the same mean (no upward or downward trends)
the same variance
Chapter VII. Slide 11
c. Autocorrelation
We will assume what is known as stationarity.
Roughly speaking, this means:
The time series varies about a fixed mean and has constant
variance
The dependence between successive observations does
not change over time
Let's define the autocorrelations for a stationary time series:

ρ_s = cov(Y_t, Y_{t-s}) / sqrt(Var(Y_t) Var(Y_{t-s})) = cov(Y_t, Y_{t-s}) / Var(Y_t)

Note that the autocorrelation does not depend on t because we
have assumed stationarity.
Chapter VII. Slide 12
c. Autocorrelation
We estimate the theoretical quantities by using sample
averages (as always).
The estimated or sample autocorrelations are:

r_s = Σ_{t=s+1}^{T} (Y_t − Ȳ)(Y_{t−s} − Ȳ) / Σ_{t=1}^{T} (Y_t − Ȳ)²
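The sample autocorrelation formula can be computed directly; a minimal Python sketch (not the Minitab implementation the slides use):

```python
import numpy as np

def sample_acf(y, s):
    """Sample autocorrelation r_s: the lag-s cross products of deviations
    from the overall mean, divided by the total sum of squared deviations."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.sum(d[s:] * d[:-s]) / np.sum(d ** 2)

# Example: a steadily rising series has a large positive r_1
print(sample_acf([1, 2, 3, 4, 5], 1))  # 0.4
```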
Chapter VII. Slide 13
c. Autocorrelation
The ACF command in Minitab computes the autocorrelations
There is a strong dependence between observations spaced
close together in time (e.g. only one or two years apart). As
time passes, the dependence diminishes in strength.
Chapter VII. Slide 14
c. Autocorrelation
Let's look at the autocorrelations for the IID series.
In contrast to the ACF
for the level series, the
sample autocorrelations
are much smaller.
Autocorrelation Function for ran
(with 5% significance limits for the autocorrelations)
Chapter VII. Slide 15
c. Autocorrelation
How do we know if the sample autocorrelations are good
estimates of the underlying theoretical autocorrelations?
and
How do we know if we have enough sample information to
reach definitive conclusions?
If all the true autocorrelations are 0, then the standard
deviation of the sample autocorrelations is about 1/sqrt(T).
Std Err(r_s) ≈ 1/√T

where T = total number of observations (time periods)
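As a sketch with simulated data (not from the slides), the 1/√T rule says that for an iid series the sample autocorrelations should fall within about ±2/√T of zero roughly 95% of the time:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 400
y = rng.normal(size=T)  # iid N(0,1), so every true autocorrelation is 0

# First sample autocorrelation, computed from the formula on the slide
d = y - y.mean()
r1 = np.sum(d[1:] * d[:-1]) / np.sum(d ** 2)

band = 2 / np.sqrt(T)  # 2 standard deviations = 0.1 here
print(abs(r1) < band)  # usually True for an iid series (about 95% of the time)
```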
Chapter VII. Slide 16
c. Autocorrelation
For the IID series
All of the sample autocorrelations are within 2 standard
deviations of 0 -- no evidence of positive autocorrelation in
the data.
For the level series
T=89 so the standard deviation is again about 0.1. The first
autocorrelation is many standard deviations away from 0,
suggesting strongly that the data are not iid.
Chapter VII. Slide 17
c. Autocorrelation
Another Example: Stock Returns
Monthly returns on IBM
Time Series Plot of IBM-ret
Chapter VII. Slide 18
c. Autocorrelation
Let's look at the ACF for the series.
The ACF is consistent with an independent series.
Autocorrelation Function for IBM-ret
(with 5% significance limits for the autocorrelations)
Chapter VII. Slide 19
d. The AR(1) Model
A simple way to model dependence over time is with the
autoregressive model of order 1.
This is a SLR model of Y_t regressed on the lagged value Y_{t-1}:

AR(1): Y_t = β_0 + β_1 Y_{t-1} + ε_t

What does the model say for the (T+1)st observation?

Y_{T+1} = β_0 + β_1 Y_T + ε_{T+1}
The AR(1) model expresses what we don't know in terms of
what we do know at time T.
Chapter VII. Slide 20
d. The AR(1) Model
How should we predict Y_{T+1}?

E[Y_{T+1} | Y_T] = β_0 + β_1 Y_T + E[ε_{T+1} | Y_T] = β_0 + β_1 Y_T
How do we use the AR(1) model? We simply regress Y on lagged
Y.
If our model successfully captures the dependence structure in
the data then the residuals should look iid. There should be no
dependence in the residuals!
So to check the AR(1) model, we can check the residuals from the
regression for any left-over dependence.
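This workflow (regress Y on lagged Y, then inspect the residuals) can be sketched in Python; the data here are simulated and the parameter values 1.0 and 0.8 are illustrative, not the lake estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(1) series: Y_t = 1.0 + 0.8 * Y_{t-1} + eps_t
T = 2000
y = np.empty(T)
y[0] = 5.0  # start near the mean beta0 / (1 - beta1) = 5
for t in range(1, T):
    y[t] = 1.0 + 0.8 * y[t - 1] + rng.normal(scale=0.5)

# Regress Y_t on Y_{t-1}: SLR with an intercept
X = np.column_stack([np.ones(T - 1), y[:-1]])
b0_hat, b1_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# If the model captures the dependence, these residuals should look iid
resid = y[1:] - (b0_hat + b1_hat * y[:-1])
print(b0_hat, b1_hat)
```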
Chapter VII. Slide 21
d. The AR(1) Model
Let's try it out on the lake water level data...
Regression Analysis: level versus level_t-1
The regression equation is
level = 36.8 + 0.792 level_t-1
88 cases used, 1 cases contain missing values
Predictor Coef SE Coef T P
Constant 36.79 11.55 3.18 0.002
level_t-1 0.79161 0.06543 12.10 0.000
S = 0.236208 R-Sq = 63.0% R-Sq(adj) = 62.6%
Analysis of Variance
Source DF SS MS F P
Regression 1 8.1675 8.1675 146.39 0.000
Residual Error 86 4.7983 0.0558
Total 87 12.9657
Chapter VII. Slide 22
d. The AR(1) Model
Now let's look at the ACF of the residuals.
Not much
autocorrelation
left!
Chapter VII. Slide 23
d. The AR(1) Model
Now let's try the beer data.
MTB > lag c1 c2
MTB > name c2 bprod-1
Regression Analysis
The regression equation is
b_prod = 4.78 + 0.704 bprod-1
71 cases used 1 cases contain missing values
Predictor Coef SE Coef T P
Constant 4.778 1.425 3.35 0.001
bprod-1 0.70429 0.08724 8.07 0.000
s = 1.386 R-sq = 48.6% R-sq(adj) = 47.8%
Chapter VII. Slide 24
d. The AR(1) Model
Now let's look at the ACF of the residuals.
There's a lot of autocorrelation left in!
Why at lags 6 and 12?
Autocorrelation Function for RESI1
(with 5% significance limits for the autocorrelations)
Chapter VII. Slide 25
d. The AR(1) Model
To gain a better feel for this model, let's simulate data series from
the model with various parameter settings.
The series fluctuates around a mean level with fairly long runs.
Time Series Plot of AR(1)
β_0 = 0, β_1 = .8
Chapter VII. Slide 26
d. The AR(1) Model
Now the ACF
The ACF reveals the strong dependence in the series!
Note the smooth decline from about .8
Autocorrelation Function for AR(1)
(with 5% significance limits for the autocorrelations)
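The smooth geometric decline can be checked numerically; a sketch with simulated data (sample size and seed are illustrative) compares the sample ACF of an AR(1) series against the theoretical values β_1^s:

```python
import numpy as np

rng = np.random.default_rng(2)

# Long AR(1) simulation with beta0 = 0, beta1 = 0.8
T = 100_000
y = np.empty(T)
y[0] = 0.0
for t in range(1, T):
    y[t] = 0.8 * y[t - 1] + rng.normal()

# Sample autocorrelations at lags 1, 2, 3
d = y - y.mean()
denom = np.sum(d ** 2)
acf = [np.sum(d[s:] * d[:-s]) / denom for s in (1, 2, 3)]

# Theory for a stationary AR(1): rho_s = beta1 ** s = 0.8, 0.64, 0.512, ...
print(acf)
```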
Chapter VII. Slide 27
d. The AR(1) Model
Now let's look at a series generated with a negative slope value.
Because β_1 is negative, an above-average Y tends to be followed
by a below-average Y (and vice versa); hence the jagged
appearance of the plot.
Time Series Plot of AR(1)-.8
β_0 = 0, β_1 = −.8
Chapter VII. Slide 28
d. The AR(1) Model
and the ACF
This choppy behavior is reflected in the ACF
Autocorrelation Function for AR(1)-.8
(with 5% significance limits for the autocorrelations)
Chapter VII. Slide 29
d. The AR(1) Model
Now let's look at a series generated with a slope value of 1.
Wanders around quite a lot!
Time Series Plot of RW
β_0 = 0, β_1 = 1.0
Chapter VII. Slide 30
d. The AR(1) Model
What about the ACF?
The first autocorrelation is close to 1. Does that mean the series
is very predictable? We will return to the case of β_1 = 1 shortly.
Autocorrelation Function for RW
(with 5% significance limits for the autocorrelations)
Chapter VII. Slide 31
d. The AR(1) Model
Some Intuition on Mean Reversion
We have seen that the slope parameter governs the rate at
which the AR(1) model returns or reverts to the mean
level of the series.
Fact for the AR(1) model:

E[Y_t] = μ = β_0 / (1 − β_1)
Chapter VII. Slide 32
d. The AR(1) Model
If we subtract μ from both sides of the AR(1) model
equation, we can write the model in terms of deviations from
the mean:

Y_t − μ = β_1 (Y_{t−1} − μ) + ε_t

Thus, β_1 governs the rate at which you revert to the mean
level of the series.
On average, Y_t is closer to the mean than Y_{t−1}.
If there is no mean reversion, then we have a random walk.
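The mean formula two slides back follows in one line; a short derivation using only the AR(1) equation and stationarity:

```latex
% Take expectations of Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t.
% Stationarity gives E[Y_t] = E[Y_{t-1}] = \mu, and E[\varepsilon_t] = 0:
\mu = \beta_0 + \beta_1 \mu
\quad\Longrightarrow\quad
\mu = \frac{\beta_0}{1 - \beta_1} \qquad (\beta_1 \neq 1)
```

Note the condition β_1 ≠ 1: when β_1 = 1 no such fixed mean exists, which is exactly the random walk case taken up next.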
Chapter VII. Slide 33
e. Random Walks
The case of β_1 = 1 deserves special attention because of its
importance in economic data series.
Many economic and business time series display a "random
walk character."
A random walk is an AR(1) model with β_1 = 1.

Random Walk: Y_t = β_0 + Y_{t−1} + ε_t
Chapter VII. Slide 34
e. Random Walks
The intercept, β_0, is called the drift parameter for the
random walk. Let's first consider the case of β_0 = 0.
This is called a random walk with zero drift:

Y_t = Y_{t−1} + ε_t

A random walk with zero drift meanders around zero with
no particular trend. However, it can take very long
excursions away from zero. These excursions can look
like trends until the series turns back toward 0.
Chapter VII. Slide 35
e. Random Walks
The random walk gets its name from the idea of a random
walker on the number line. A random walker is someone
who has an equal chance of taking a step forward or a step
backward. The sizes of the steps are random as well.
To see this, it is very useful to re-express the random walk in
terms of increments or steps. Subtract Y_{t−1} from both sides:

Z_t = Y_t − Y_{t−1} = ε_t

The increments are a random sample (an iid collection of rvs)!
Chapter VII. Slide 36
e. Random Walks
A random walk with zero drift:
"meanders" around zero with no particular trend.
can take long "excursions" away from zero that look like
trends. Don't get fooled!
but a random walk will always return to zero.
Random Walks with Non-zero Drift:
If β_0 is positive, we have a random walk with positive drift.
Here the average step size is β_0:

Y_t − Y_{t−1} = Z_t ~ iid N(β_0, σ²)
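Simulating a random walk with drift is a one-liner once the series is viewed through its increments; a sketch (the drift and step-size values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Random walk with drift: Y_t = beta0 + Y_{t-1} + eps_t, i.e. the
# cumulative sum of iid steps Z_t ~ N(beta0, sigma^2)
beta0, sigma, T = 0.1, 1.0, 500
steps = rng.normal(loc=beta0, scale=sigma, size=T)
y = np.cumsum(steps)

# Differencing the walk recovers the iid increments
increments = np.diff(y)
print(np.allclose(increments, steps[1:]))  # True
```

This is why differencing a random walk (as done for the trend-vs-RW example later) yields a series whose ACF looks like white noise.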
Chapter VII. Slide 37
f. Trend Models
Many times we want to allow for shifts in the mean of series
over time.
There are two trend models which can be used:
i. Linear Trend Model
ii. Exponential Trend Model
Linear Trend Model:

Y_t = β_0 + β_1 t + ε_t
Chapter VII. Slide 38
f. Trend Models
Exponential Trend Model:

log(Y_t) = β_0 + β_1 t + ε_t

β_1 ≈ % growth / 100
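Fitting the exponential trend model amounts to regressing log Y_t on t; a sketch with simulated data (the 3% growth rate and noise level are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Series growing about 3% per period: log(Y_t) = beta0 + beta1 * t + eps_t
T = 200
t = np.arange(T)
y = np.exp(1.0 + 0.03 * t + rng.normal(scale=0.05, size=T))

# OLS of log(Y) on t; np.polyfit returns (slope, intercept) for degree 1
b1_hat, b0_hat = np.polyfit(t, np.log(y), 1)
print(b1_hat)  # close to 0.03, i.e. about 3% growth per period
```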
Chapter VII. Slide 39
f. Trends and Random Walks
Is the above graph from a trend or a random walk with a
positive drift?

Y_t = .1 + Y_{t−1} + ε_t   or   Y_t = β_0 + β_1 t + ε_t
Chapter VII. Slide 40
f. Random Walks and Trends
Let's run the regression for the trend fit and look at the residual
ACF. Looks pretty autocorrelated! The trend model is not
appropriate.
Chapter VII. Slide 41
f. Random Walks and Trends
Difference the data and look at the ACF. Looks like a RW!
Why?
Chapter VII. Slide 42
g. Example of Trend Modeling
In a recent legal case, a downtown hotel claimed that it had suffered a
loss of business due to what was considered an illegal action by others.
In order to support its claim of lost business, the hotel had to predict
what its level of business would have been in the absence of the alleged
illegal action. To do this, experts testifying on behalf of the hotel
used data collected before the period in question and fit a relationship
between the hotel's occupancy rate and the overall occupancy rate in the
city of Chicago. This relationship was then used to predict the
occupancy rate during the period in question. This dataset is
HOTELOCC.MTP.
The regression equation is
HX_occ = 16.1 + 0.716 Chi_occ
Predictor Coef SE Coef T P
Constant 16.136 8.519 1.89 0.069
Chi_occ 0.7161 0.1338 5.35 0.000
S = 7.506 R-Sq = 50.6% R-Sq(adj) = 48.8%
Chapter VII. Slide 43
g. Example of Trend Modeling
Looks good but what about the independence of the residuals?
[Regression Plot: HX_occ vs. Chi_occ; Y = 16.1357 + 0.716132X, R-Sq = 0.506]
Chapter VII. Slide 44
g. Example of Trend Modeling
Sequence plot of the residuals
[Plot of the standardized residuals (SRES1) vs. index]
What's wrong here?
Chapter VII. Slide 45
g. Example of Trend Modeling
To take into account the downward trend in the hotel's
occupancy rate, we introduce a linear trend term in the
model:

HX_Occ_t = β_0 + β_1 CH_Occ_t + β_2 t + ε_t

(the β_2 t term is the time trend)
The regression equation is
HX_occ = 26.7 + 0.695 Chi_occ - 0.596 Time
Predictor Coef SE Coef T P
Constant 26.694 6.419 4.16 0.000
Chi_occ 0.69524 0.09585 7.25 0.000
Time -0.5965 0.1134 -5.26 0.000
S = 5.372 R-Sq = 75.6% R-Sq(adj) = 73.8%
Chapter VII. Slide 46
g. Example of Trend Modeling
We created the Time variable as an index from 1 to 30.
In Minitab, use the Calc menu's Make Patterned Data -
Simple Set of Numbers item.
[Plot of the standardized residuals (SRES4) vs. index]
Much better!
Chapter VII. Slide 47
h. Example of a Time Series Regression
With time series data, we want to use information available
now (time t) to forecast some variable of interest in the
future. In the AR(1) model, we use the current value of y to
forecast future values of y.
We might wish to use some other time series to forecast the
future value of variable of interest. A nice example of this is
available from the finance literature. A number of authors
have documented the relationship between Price-Dividend
ratios and future stock returns. This involves a time series
regression of the following form:
E[y_{t+1} | y_t] = β_0 + β_1 y_t

R_{t+1} = β_0 + β_1 (P_t / D_t) + ε_{t+1}
Chapter VII. Slide 48
h. Example of a Time Series Regression
Here R_t is the return on some market index portfolio and P/D is the
aggregate price-dividend ratio. Let's run that regression (return_pd.mtp
in the class dataset area). This data set has 1-year, 2-year, 3-year and
5-year returns on the value-weighted stock index and the P/D ratio for
the year prior to the return. For example, the data on 2-year returns
would be organized as follows:

YR     P/D        R
1925   P/D_1924   R_1925-1926
1926   P/D_1925   R_1926-1927
.
.
.
1996   P/D_1995   R_1996-1997
Chapter VII. Slide 49
h. Example of a Time Series Regression
Let's run a regression to check for predictability.
The regression equation is
1yrt = 0.350 - 0.0104 pd1yr
Predictor Coef SE Coef T P
Constant 0.34991 0.09523 3.67 0.001
pd1yr -0.010425 0.003652 -2.85 0.006
S = 0.1591 R-Sq = 14.8% R-Sq(adj) = 13.0%
There is a "significant" relationship here which suggests some
predictability in return. However, before we get carried away, we might
want to check our residual diagnostics. Let's check the residuals for
autocorrelation.
Chapter VII. Slide 50
h. Example of a Time Series Regression
ACF of sres
-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0
+----+----+----+----+----+----+----+----+----+----+
1 -0.127 XXXX
2 -0.253 XXXXXXX
3 0.162 XXXXX
4 0.354 XXXXXXXXXX
5 0.009 X
6 -0.167 XXXXX
7 0.104 XXXX
8 0.025 XX
9 0.124 XXXX
10 -0.101 XXXX
11 -0.061 XXX
12 -0.061 XXX
Residuals look pretty good.
Now let's try to see if we can predict five year returns.
Chapter VII. Slide 51
h. Example of a Time Series Regression
The regression equation is
5yrt = 2.06 - 0.0622 pd5yr
Predictor Coef SE Coef T P
Constant 2.0589 0.2025 10.17 0.000
pd5yr -0.062219 0.007991 -7.79 0.000
S = 0.3197 R-Sq = 58.5% R-Sq(adj) = 57.5%
Even stronger relationship (careful though, residuals are a bit
autocorrelated!)
Thus, it appears that there is some predictability in stock returns.
Does this mean you can make money? See John Cochrane, "Where is
the Market Going? Uncertain facts and Novel Theories" pp. 7-9 for a
good discussion of this. (available as the pdf file cochrane fed res art.pdf
on the course web site).
Chapter VII. Slide 52
Glossary of Symbols
ρ_s - sth order autocorrelation
r_s - sth order sample autocorrelation
Chapter VII. Slide 53
Important Equations
Population and Sample Autocorrelations:

ρ_s = cov(Y_t, Y_{t−s}) / sqrt(Var(Y_t) Var(Y_{t−s})) = cov(Y_t, Y_{t−s}) / Var(Y_t)

r_s = Σ_{t=s+1}^{T} (Y_t − Ȳ)(Y_{t−s} − Ȳ) / Σ_{t=1}^{T} (Y_t − Ȳ)²

Std error of sample autocorrelation:

Std Err(r_s) ≈ 1/√T
Chapter VII. Slide 54
Important Equations
Definition of the AR(1) model:

AR(1): Y_t = β_0 + β_1 Y_{t−1} + ε_t

Mean reversion form of AR(1):

Y_t − μ = β_1 (Y_{t−1} − μ) + ε_t

Random Walk:

Y_t = β_0 + Y_{t−1} + ε_t