
Time Series Analysis

Definition
A time series is a sequence of observations
taken sequentially in time
An intrinsic feature of a time series is that
adjacent observations are typically dependent
The nature of this dependence among
observations of a time series is of
considerable practical interest
Time Series Analysis is concerned with
techniques for the analysis of this dependence

Time Series Forecasting

Examine the past behavior of a time series in order to infer something about its future behavior
A sophisticated and widely used technique to forecast future demand
Examples
Univariate time series: AR, MA, ARMA, ARIMA, ARIMA-GARCH
Multivariate: VAR, Cointegration

Univariate Time-series Models


The term refers to a time series that consists of single (scalar) observations recorded sequentially over equal time increments
Univariate time-series analysis uses historical data of the variable concerned to construct a model that describes the behavior of this variable (time series)
This model can, subsequently, be used for forecasting purposes
It is an appropriate technique for forecasting high-frequency time series where data on independent variables are either non-existent or difficult to identify

Famous forecasting quotes


"I have seen the future and it is very much like
the present, only longer." - Kehlog Albran, The Profit
This nugget of pseudo-philosophy is actually a concise
description of statistical forecasting. We search for statistical
properties of a time series that are constant in time - levels, trends,
seasonal patterns, correlations and autocorrelations, etc. We then
predict that those properties will describe the future as well as the
present.
"Prediction

is very difficult, especially if it's


about the future." Nils Bohr, Nobel laureate in Physics
This quote serves as a warning of the importance of validating a
forecasting model out-of-sample. It's often easy to find a model
that fits the past data well--perhaps too well! - but quite another
matter to find a model that correctly identifies those patterns in the
past data that will continue to hold in the future

Time series data


Secular Trend: long-run pattern
Cyclical Fluctuation: expansion and contraction of the overall economy (business cycle)
Seasonality: annual sales patterns tied to weather, traditions, customs
Irregular or random component

https://www.youtube.com/watch?v=s9FcgJK9GNI

Ex-Post vs. Ex-Ante Forecasts


How can we compare the forecast performance of our model?
There are two ways.
Ex Ante: Forecast into the future, wait for the future to arrive, and then compare the actual to the predicted
Ex Post: Fit your model over a shortened sample
Then forecast over a range of observed data
Then compare actual and predicted.

Ex-Post and Ex-Ante


Estimation & Forecast Periods
Suppose you have data covering the
period 1980.Q1-2001.Q4

Ex-Post Estimation Period

Ex-Post
The
Forecast Future
Period

Ex-Ante
Forecast
Period

Examining the In-Sample Fit


One thing that can be done, once you have fit your model, is to examine the in-sample fit
That is, over the period of estimation, you can compare the actual to the fitted data
It can help to identify areas where your model is consistently under- or over-predicting, so that you can take appropriate measures
Simply estimate the equation and look at the residuals

Model Performance
RMSE = sqrt((1/n) Σ (fi - xi)²): the root mean squared difference between forecast and actual; smaller is better
MAE & MAPE: smaller is better

The Theil inequality coefficient always lies between zero and one, where zero indicates a perfect fit.
Bias proportion - should be zero
How far is the mean of the forecast from the mean of the actual series?

Model Performance
Variance proportion - should be zero
How far is the variance of the forecast from the variance of the actual series?
Covariance proportion - should be one
What portion of the forecast error is unsystematic (not predictable)?
If your forecast is "good", the bias and variance proportions should be small, so that most of the forecast error is concentrated in the covariance proportion
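
A minimal sketch, not from the slides, of how Theil's U and its bias/variance/covariance proportions can be computed; the formulas follow the standard decomposition reported by packages such as EViews, and the data are made up.

```python
# A minimal sketch (not from the slides) of Theil's inequality coefficient and
# its bias/variance/covariance decomposition of the MSE.
import numpy as np

def theil_decomposition(f, x):
    mse = np.mean((f - x) ** 2)
    u = np.sqrt(mse) / (np.sqrt(np.mean(f ** 2)) + np.sqrt(np.mean(x ** 2)))
    sf, sx = f.std(), x.std()                      # population std (ddof=0)
    r = np.corrcoef(f, x)[0, 1]
    bias = (f.mean() - x.mean()) ** 2 / mse        # should be near 0 for a good forecast
    var  = (sf - sx) ** 2 / mse                    # should be near 0
    cov  = 2 * (1 - r) * sf * sx / mse             # should be near 1
    return u, bias, var, cov                       # the three proportions sum to 1

f = np.array([99.0, 103.5, 104.0, 105.0, 107.0])   # hypothetical forecasts
x = np.array([100.0, 102.0, 105.0, 103.0, 108.0])  # hypothetical actuals
print(theil_decomposition(f, x))
```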

Autocorrelation function (ACF)


The autocorrelation function (ACF) of a random process describes the correlation between the process at different points in time.
Let Xt be the value of the process at time t (where t may be an integer for a discrete-time process or a real number for a continuous-time process).
If Xt has mean μ and variance σ², then the ACF is defined as
ρ(k) = E[(Xt - μ)(Xt+k - μ)] / σ²
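
For illustration (not from the slides), a small Python sketch that estimates the sample ACF at lag k directly from this definition; the simulated series is hypothetical.

```python
# A minimal sketch (not from the slides): the sample ACF at lag k,
# following the definition rho(k) = E[(X_t - mu)(X_{t+k} - mu)] / sigma^2.
import numpy as np

def sample_acf(x, k):
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    c0 = np.mean((x - mu) ** 2)                               # lag-0 autocovariance (variance)
    ck = np.mean((x[:-k] - mu) * (x[k:] - mu)) if k else c0   # lag-k autocovariance
    return ck / c0

x = np.random.default_rng(0).standard_normal(200).cumsum()    # a trending (random-walk) series
print([round(sample_acf(x, k), 3) for k in range(1, 6)])       # slowly decaying ACF
```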

ACF & PACF


The partial autocorrelation at lag k is the regression coefficient on Yt-k when Yt is regressed on a constant, Yt-1, ..., Yt-k
This is a partial correlation since it measures the correlation of values that are k periods apart after removing the correlation from the intervening lags
Correlogram: plot of the ACF & PACF against lags

Stationary Time Series


A stochastic process is said to be stationary if its mean and variance are constant over time and the covariance between two time periods depends only on the distance or gap or lag between the two periods and not on the actual time at which the covariance is computed
In the time series literature, such a stochastic process is known as weakly stationary or covariance stationary
In most practical situations, this type of stationarity often suffices
A time series is strictly stationary if all the moments of its probability distribution, and not just the first two (mean & variance), are invariant over time

Stationary Time Series


However, if the stationary process is normal, the weakly stationary process is also strictly stationary, as a normal stochastic process is fully specified by its first two moments, the mean & variance
Let Yt be a stochastic time series with properties:
Mean: E(Yt) = μ
Variance: var(Yt) = E(Yt - μ)² = σ²
Covariance: γk = E[(Yt - μ)(Yt+k - μ)], the autocovariance between Yt and Yt+k, i.e. between two Y values k periods apart
If k = 0, we obtain γ0, which is simply the variance of Y
If k = 1, γ1 is the covariance between two adjacent values of Y

Stationary Time Series


Now, if we shift the origin from Yt to Yt+m, the mean, variance and autocovariances of Yt+m must be the same as those of Yt
Thus, if a time series is stationary, its mean, variance and autocovariances remain the same no matter at what point we measure them, i.e. they are time invariant
Such a time series tends to return to its mean, which is called mean reversion

Non-stationary Series
A non-stationary time series will have a time-varying mean or variance or both
For a non-stationary time series, we can study its behavior only for the time period under consideration
Each set of time series data will therefore be for a particular episode
So it is not possible to generalize it to other time periods
Therefore, for the purpose of forecasting, non-stationary time series may be of little practical value

Forecasting
Most statistical forecasting methods
are based on the assumption that the
time series can be rendered
approximately stationary (i.e.,
"stationarized") through the use of
mathematical transformations
A stationarized series is relatively
easy to predict: you simply predict
that its statistical properties will be
the same in the future as they have
been in the past!

Forecasting
The predictions for the stationarized series
can then be "untransformed," by reversing
whatever mathematical transformations
were previously used, to obtain predictions
for the original series
The details are normally taken care of by
software
Thus, finding the sequence of
transformations needed to stationarize a
time series often provides important clues
in the search for an appropriate
forecasting model.

Random or White Noise Process


We call a stochastic process purely random or a white noise process if it has zero mean and constant variance and is serially uncorrelated
The error term in the CLRM is assumed to be a white noise process, as ut ~ iid(0, σ²)
The random walk model, non-stationary in nature, is observed in asset prices, stock prices or exchange rates (discussed later)

Trend: ACF & PACF

The ACF shows a definite pattern: it decreases with the lags.
This means there is a trend in the data.
Since the pattern does not repeat, we can conclude that the data do not show any seasonality.

Seasonality

Trend & Seasonality: ACF & PACF

The ACF plots clearly show a repetition in the pattern, indicating that the data are seasonal; there is periodicity after every 12 observations, i.e. they show seasonality and trend in the data
The PACF plots also show seasonality and trend

Estimation and Removal of Trend & Seasonality

Classical Decomposition of a Time Series
Xt = mt + st + Yt
mt: trend component (deterministic, changes slowly with t);
st: seasonal component (deterministic, period d);
Yt: noise component (random, stationary).
Aim: extract the components mt and st, and hope that Yt will be stationary. Then focus on modeling Yt.
We may need to do preliminary transformations if the noise or the amplitude of the seasonal fluctuations appear to change over time.
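
As an illustrative sketch (not from the slides), the classical decomposition can be carried out with statsmodels' seasonal_decompose; the monthly series below is simulated and the period d = 12 is an assumption of the example.

```python
# A minimal sketch (not from the slides) of classical decomposition
# X_t = m_t + s_t + Y_t using statsmodels' seasonal_decompose on a
# hypothetical monthly series (period d = 12, additive model).
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(1)
t = np.arange(120)
x = pd.Series(0.5 * t                                   # trend m_t
              + 10 * np.sin(2 * np.pi * t / 12)         # seasonal s_t, period 12
              + rng.standard_normal(120),               # noise Y_t
              index=pd.date_range("2010-01", periods=120, freq="MS"))

res = seasonal_decompose(x, model="additive", period=12)
y_hat = res.resid.dropna()    # estimate of the (hopefully stationary) noise component Y_t
print(y_hat.head())
```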

[Workflow diagram]
Time series data, Xt = mt + st + Yt → ACF, PACF, ADF tests
If the series is stationary (Xt = Yt): fit a model for Xt = Yt (AR, MA, ARMA) → residual series WN → estimate the AR, MA, ARMA parameters → forecast Xt = Yt (in-sample/out-of-sample)
If the series is non-stationary: de-trend and/or de-seasonalize → stationary series Yt → fit a model for Yt (AR, MA, ARMA) → residual series WN → estimate the AR, MA, ARMA parameters → forecast Xt (in-sample/out-of-sample)

Backward Shift Operator


The operator B plays an important role in the mathematics of TSA
BXt = Xt-1 and, in general, B^s Xt = Xt-s
A polynomial in the lag operator takes the form
θ(B) = 1 + θ1 B + θ2 B² + ... + θq B^q, where θ1, ..., θq are parameters
The roots of such a polynomial are defined as the q values of B which satisfy the polynomial equation θ(B) = 0

Backward Shift Operator


If q = 1, θ(B) = 1 + θB = 0 gives B = -(1/θ)
A root is said to lie outside the unit circle if its modulus is greater than one
The first difference operator is defined as
Δ = 1 - B, so ΔXt = Xt - Xt-1
Δ² = (1 - B)². More generally, Δ^d = (1 - B)^d is a d-th order polynomial

Elimination of Trend

Nonseasonal model with trend: Xt = mt + Yt, E(Yt) = 0

Methods:
(a) Moving Average Smoothing
(b) Exponential Smoothing
(c) Spectral Smoothing
(d) Polynomial Fitting
(e) Differencing k times to eliminate trend

Differencing k times to eliminate trend

Define the backward shift operator B as follows: BXt = Xt-1
We can remove trend by differencing, e.g.
(1-B)Xt = Xt - Xt-1, and
(1-B)²Xt = (1 - 2B + B²)Xt = Xt - 2Xt-1 + Xt-2
It can be shown that a polynomial trend of degree k will be reduced to a constant by differencing k times, that is, by applying the operator (1-B)^k to Xt
Given a sequence {xt}, we could therefore proceed by differencing repeatedly until the resulting series can plausibly be modeled as a realization of a stationary process.

Elimination of Seasonality
Seasonal model without trend: Xt = st + Yt, E(Yt) = 0
(a) Classical decomposition
Regress the level variable (Y) on seasonal dummy variables (with or without an intercept)
Calculate the residuals
Add these residuals to the mean value of Y
The resulting series is the deseasonalized time series
(b) Differencing at lag d to eliminate a component of period d
Since (1 - B^d)st = st - st-d = 0, differencing at lag d will eliminate a seasonal component of period d.

Elimination of Trend+Seasonality
Elimination of both the trend and the seasonal component in a series can be achieved by using trend as well as seasonal differencing
For example: (1-B)(1-B^12)Xt
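
A minimal sketch, not from the slides, showing first and seasonal (lag-12) differencing with pandas on a simulated monthly series.

```python
# A minimal sketch (not from the slides): applying (1-B)(1-B^12) to a monthly
# series with pandas, i.e. a first difference followed by a seasonal difference at lag 12.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
t = np.arange(144)
x = pd.Series(0.3 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.standard_normal(144))

detrended      = x.diff(1)                 # (1 - B) X_t, removes a linear trend
deseasonalized = detrended.diff(12)        # (1 - B^12)(1 - B) X_t, removes period-12 seasonality
print(deseasonalized.dropna().head())
```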

[Workflow diagram, repeated]
Time series data, Xt = mt + st + Yt → ACF, PACF, ADF tests
If the series Xt is stationary: fit a model for Xt = Yt (AR, MA, ARMA) → residual series WN → estimate the AR, MA, ARMA parameters → forecast Xt (in-sample/out-of-sample)
If the series is non-stationary: de-trend and/or de-seasonalize → stationary series → fit a model for the stationary series (AR, MA, ARMA) → residual series WN → estimate the AR, MA, ARMA parameters → forecast Xt after re-transformation (in-sample/out-of-sample)

Non-Seasonal & Seasonal


AR, MA & ARMA Process

Autoregressive Process
The AR(1) model specification is
Yt = m + φYt-1 + ut, {ut} ~ WN(0, σ²)
(1 - φL)Yt = m + ut
Yt = (1 + φL + φ²L² + ...)(m + ut)
Since a constant like m has the same value in all periods, applying the lag operator any number of times simply reproduces the constant. So

Autoregressive Process
Yt = (1 + φ + φ² + ...)m + (ut + φut-1 + φ²ut-2 + ...)
E(Yt) = (1 + φ + φ² + ...)m
This expression only exists if the infinite geometric series has a limit
The necessary & sufficient condition is |φ| < 1, and then E(Yt) = m/(1 - φ) = μ
Y has a constant unconditional mean, independent of time
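
For illustration (not from the slides), a short simulation of an AR(1) with assumed values m = 2 and φ = 0.7, checking that the sample mean is close to m/(1 - φ).

```python
# A minimal sketch (not from the slides): simulating the AR(1)
# Y_t = m + phi*Y_{t-1} + u_t with |phi| < 1 and checking that the sample
# mean is close to the unconditional mean m / (1 - phi).
import numpy as np

rng = np.random.default_rng(3)
m, phi, n = 2.0, 0.7, 50_000
y = np.empty(n)
y[0] = m / (1 - phi)                       # start at the unconditional mean
for t in range(1, n):
    y[t] = m + phi * y[t - 1] + rng.standard_normal()

print(y.mean(), m / (1 - phi))             # both should be close to 6.67
```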

Autoregressive Process
AR(2) process:
yt = φ1 yt-1 + φ2 yt-2 + ut
AR(p) process:
yt = φ1 yt-1 + φ2 yt-2 + ... + φp yt-p + ut
Defining the AR polynomial
φ(L) = 1 - φ1 L - ... - φp L^p
we can write the AR(p) model concisely as:
φ(L)yt = ut

Autoregressive Process
It is sometimes difficult to distinguish between AR processes of different orders solely on the basis of correlograms
A sharper discrimination is possible on the basis of the partial autocorrelation coefficients
For an AR(p), the PACF vanishes for lags greater than p, while the ACF of an AR(p) decays exponentially

Moving Average Process


In a pure MA process, a variable is expressed solely in terms of the current and previous white noise disturbances
MA(1) process: yt = ut + θ1 ut-1
MA(q) process:
yt = ut + θ1 ut-1 + ... + θq ut-q, {ut} ~ WN(0, σ²)
Defining the MA polynomial
θ(L) = 1 + θ1 L + ... + θq L^q
we can write the MA(q) model concisely as:
yt = θ(L) ut.

Moving Average Process


For parameter identifiability reasons, and in analogy with the concept of causality for AR processes, we require that all roots of θ(L) be greater than 1 in magnitude
The resulting process is said to be invertible
The PACF of an MA(q) decays exponentially
The ACF vanishes for lags beyond q

[Figure] The single negative spike at lag 1 in the ACF is an MA(1) signature

ARMA Process
We can put an AR(p) and an MA(q) process together to form the more general ARMA(p,q) process:
yt = φ1 yt-1 + ... + φp yt-p + ut + θ1 ut-1 + ... + θq ut-q
where {ut} ~ WN(0, σ²).
By definition, we require that {yt} be stationary.
Using the compact AR & MA polynomial notation, we can write the ARMA(p,q) as:
φ(L)yt = θ(L)ut, {ut} ~ WN(0, σ²)

ARMA Process
For stationarity and invertibility we require, as before, that all roots of φ(L) and θ(L) be greater than 1 in magnitude
AR & MA are special cases: an AR(p) = ARMA(p,0), and an MA(q) = ARMA(0,q)
The ACF & PACF both decay exponentially

Sample ACF/PACF
For an AR(p) the ACF decays geometrically, and the PACF is zero beyond lag p. The sample ACF/PACF should exhibit similar behavior, and significance at the 95% level can be assessed via the usual ±1.96/√n bounds
For an MA(q) the PACF decays geometrically, and the ACF is zero beyond lag q. The sample ACF/PACF should exhibit similar behavior, and significance at the 95% level can also be assessed via the ±1.96/√n bounds
For an ARMA(p,q), the ACF & PACF both decay exponentially.
Examining the sample ACF/PACF can therefore serve only as a guide in determining possible maximum values for p & q, to be properly investigated via AICC.

Sample ACF/PACF
The PACF shows a sharper "cutoff" than the ACF
In particular, the PACF has only two significant spikes, while the ACF has four
Thus, the series displays an AR(2) signature
If we therefore set the order of the AR term to 2, i.e. fit an ARIMA(2,1,0) model, we obtain the following ACF and PACF plots for the residuals

Order Selection/Model Identification


In real-life data, there is usually no underlying true model. The question then becomes how to select an appropriate statistical model for a given data set.
A breakthrough was made in the early 1970s by the Japanese statistician Akaike.
Using ideas from information theory, he discovered a way to measure how far a candidate model is from the true model.
We should therefore minimize the distance from the truth, and select the ARMA(p,q) model that minimizes Akaike's Information Criterion (AIC):

AIC = -2 log L(φ, θ, σ²) + 2(p + q)

Order Selection/Model Identification

where L(φ, θ, σ²) denotes the likelihood evaluated at the MLEs of φ, θ, and σ², respectively. (Nowadays we actually use a bias-corrected version of AIC called AICC.)
The first term in the AIC expression measures how well the model fits the data; the lower it is, the better the fit.
The second term penalizes models with more parameters.
Final model selection can then be based upon goodness-of-fit tests and model parsimony (simplicity).
There are several other information criteria currently in use, SBC, FPE, SIC, MDL, etc., but AIC and SBC seem to be the most popular.
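
As an illustration (not from the slides), a small grid search over ARMA(p,q) orders that picks the model with the lowest AIC using statsmodels; the simulated series and the 0-2 order grid are assumptions of the example.

```python
# A minimal sketch (not from the slides): choosing (p, q) by minimizing AIC
# over a small grid of ARMA(p, q) models with statsmodels' ARIMA class (d = 0).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
n = 400
y = np.zeros(n)
for t in range(1, n):                                   # true model: AR(1) with phi = 0.6
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

best = min(((ARIMA(y, order=(p, 0, q)).fit().aic, p, q)
            for p in range(3) for q in range(3)),
           key=lambda r: r[0])
print("lowest AIC:", best)                              # should usually pick (p, q) = (1, 0)
```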

Non-stationary Time Series


- Unit Root
- ARIMA

Random Walk Model


Although our interest is in stationary time series, we often encounter non-stationary time series
Classic example: the RWM (stock prices, exchange rates)
It can be of two types
Random walk without drift: Xt = Xt-1 + ut
Random walk with drift: Xt = δ + Xt-1 + ut

Random walk without drift


Let Xt = Xt-1 + ut
X1 = X0 + u1; X2 = X1 + u2 = X0 + u1 + u2; in general, Xt = X0 + Σut
E(Xt) = X0 and var(Xt) = tσ²
The mean value of X is its initial value, which is constant, but as t increases its variance increases indefinitely, thus violating the stationarity condition
In a RWM random shocks persist, and the impact of a particular shock does not die away
The RWM is said to have infinite memory
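
A minimal sketch, not from the slides, simulating many random walks without drift to illustrate that var(Xt) grows roughly like tσ².

```python
# A minimal sketch (not from the slides): simulating many random walks without drift
# and checking that var(X_t) grows roughly like t * sigma^2 (non-stationarity).
import numpy as np

rng = np.random.default_rng(6)
walks = rng.standard_normal((10_000, 200)).cumsum(axis=1)        # 10,000 walks, X_0 = 0, sigma = 1
print([round(walks[:, t].var(), 1) for t in (9, 49, 99, 199)])   # roughly 10, 50, 100, 200
```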

Random walk with drift


Xt = δ + Xt-1 + ut, so ΔXt = δ + ut
Xt drifts upward or downward depending on δ being positive or negative
The RWM is an example of what is known as a unit root process

Unit Root Process


Say Xt = ρXt-1 + ut, with -1 ≤ ρ ≤ 1
This is an AR(1) process
If ρ = 1, we get the RWM without drift, a non-stationary process
We call it the unit root problem
The term refers to the root of the polynomial in the lag operator
Thus the terms non-stationarity, random walk and unit root can be treated as synonymous

Unit root

Difference Stationary (DS) Process


If the trend of a time series is predictable and not variable, we call it a deterministic trend
If the trend is not predictable, we call it a stochastic trend
Say Xt = b1 + b2 t + b3 Xt-1 + ut, with ut ~ WN
If b1 = b2 = 0 and b3 = 1, we get the RWM without drift: non-stationary, but stationary in first differences

Trend Stationary Process


If b1, b2 ≠ 0 and b3 = 0, then Xt = b1 + b2 t + ut
This is called a TS process
Though the mean is not constant, the variance is
Once the values of b1 and b2 are known, the mean can be forecast perfectly
Thus, if we subtract the mean of Xt from Xt, the resulting series will be stationary

Dickey-Fuller unit root tests

Simple AR(1) model
xt = ρxt-1 + ut ... (1)
The null hypothesis of a unit root is
H0: ρ = 1 against H1: ρ < 1
Subtracting xt-1 from both sides of equation (1), we get
xt - xt-1 = ρxt-1 - xt-1 + ut
Δxt = (ρ - 1)xt-1 + ut
Δxt = δxt-1 + ut
Here the null hypothesis of a unit root is
H0: δ = 0 and H1: δ < 0

Detection of Unit Root - ADF Tests

The ADF test is conducted with the following model:
Δxt = α + βt + δxt-1 + Σi γi Δxt-i + ut
where Xt is the underlying variable at time t and ut is the error term
The lagged difference terms are introduced so that the errors are uncorrelated
For the above-specified model, the hypothesis of interest is:
H0: δ = 0

ADF Tests - EViews
To begin, double click on the series name to open the series window, and choose View/Unit Root Test
Specify whether you wish to test for a unit root in the level, first difference, or second difference of the series
Choose your exogenous regressors.
You can choose to include a constant, a constant and linear trend, or neither
EViews can automatically select the lag length using AIC, SBC and other criteria
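
Outside EViews, the same test can be run, for example, with statsmodels' adfuller; the sketch below (not from the slides) applies it to a simulated random walk, where the unit root null should not be rejected.

```python
# A minimal sketch (not from the slides): an ADF test with statsmodels' adfuller
# on a simulated random walk (the null of a unit root should not be rejected here).
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(7)
x = rng.standard_normal(500).cumsum()              # random walk without drift

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(x, regression="c", autolag="AIC")
print(f"ADF stat = {stat:.2f}, p-value = {pvalue:.3f}")   # large p-value: cannot reject unit root
```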

[EViews output] The null hypothesis of a unit root cannot be rejected

Other Unit Root Tests

Phillips-Perron (PP, 1988) tests
GLS-detrended Dickey-Fuller tests (Elliott, Rothenberg, and Stock, 1996)
Kwiatkowski, Phillips, Schmidt, and Shin tests (KPSS, 1992)
Elliott, Rothenberg, and Stock point optimal tests (ERS, 1996)
Ng and Perron (NP, 2001) unit root tests

Integrated Stochastic Process


The RWM is a specific case of a more general class of stochastic processes known as integrated processes
The order of integration is the minimum number of times the series needs to be differenced to yield a stationary series
The RWM is non-stationary, but its first difference is stationary: an I(1) series
A stationary series is called an I(0) series
The first difference of an I(0) series still yields an I(0) series

ARIMA Models
An integrated process Xt is designated as an ARIMA(p,d,q) if, taking differences of order d, a stationary process Zt of the ARMA(p,q) type is obtained.
The ARIMA(p,d,q) model is expressed by the equation
Zt = φ1 Zt-1 + φ2 Zt-2 + ... + φp Zt-p + ut - θ1 ut-1 - θ2 ut-2 - ... - θq ut-q
or
φ(L)(1 - L)^d Xt = θ(L) ut
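
As an illustrative sketch (not from the slides), an ARIMA(1,1,1) can be fitted and used for forecasting with statsmodels; the simulated I(1) series and the chosen order are assumptions of the example.

```python
# A minimal sketch (not from the slides): fitting an ARIMA(1,1,1) to a simulated
# I(1) series and producing an out-of-sample forecast with statsmodels.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(8)
n = 300
u = rng.standard_normal(n)
z = np.zeros(n)
for t in range(1, n):
    z[t] = 0.5 * z[t - 1] + u[t] + 0.3 * u[t - 1]   # stationary ARMA(1,1) differences Z_t
x = np.cumsum(z)                                    # integrate once: X_t is I(1)

fit = ARIMA(x, order=(1, 1, 1)).fit()
print(fit.summary().tables[1])                      # estimated AR and MA coefficients
print(fit.forecast(steps=5))                        # 5-step-ahead forecast of X_t
```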

Summary of ARMA/ARIMA modeling procedures
1. Perform preliminary transformations (if necessary) to stabilize the variance over time
2. Detrend and deseasonalize the data (if necessary) to make the stationarity assumption look reasonable
Trend and seasonality are also characterized by ACFs that are slowly decaying and nearly periodic, respectively
The primary methods for achieving this are classical decomposition and differencing

Summary of ARMA/ARIMA modeling procedures
3. If the data look nonstationary without a well-defined trend or seasonality, an alternative to the above option is to difference successively & use ADF tests
4. Examine the sample ACF & PACF to get an idea of potential p & q values. For an AR(p)/MA(q), the sample PACF/ACF cuts off after lag p/q
5. Estimate the coefficients for the promising models

Summary of ARMA/ARIMA modeling procedures
6. From the fitted ML models above, choose the one with the smallest AICC
Inspection of the standard errors of the coefficients at the estimation stage may reveal that some of them are not significant
If so, subset models can be fitted by constraining these to be zero in a second iteration of ML estimation
Check the candidate models for goodness-of-fit by examining their residuals
This involves inspecting their ACF/PACF for departures from WN, and carrying out the formal WN hypothesis tests

Seasonal part of ARIMA model


The seasonal part of an ARIMA model has the same
structure as the non-seasonal part: it may have an AR
factor, an MA factor, and/or an order of differencing
In the seasonal part of the model, all of these factors
operate across multiples of lag s (the number of periods in
a season)
A seasonal ARIMA model is classified as an ARIMA(P,D,Q)
model, where P=number of seasonal autoregressive
(SAR) terms, D=number of seasonal differences,
Q=number of seasonal moving average (SMA) terms
In identifying a seasonal model, the first step is to
determine whether or not a seasonal difference is needed,
in addition to or perhaps instead of a non-seasonal
difference

Seasonal part of ARIMA model


The seasonal models ARIMA(P, D, Q), which are not stationary but homogeneous of degree D, can be expressed as
Zt = Φ1 Zt-s + Φ2 Zt-2s + ... + ΦP Zt-Ps + ut - Θ1 ut-s - Θ2 ut-2s - ...
or
ΦP(L^s)(1 - L^s)^D Xt = δ + ΘQ(L^s) ut
The signature of pure SAR or pure SMA behavior is similar to the signature of pure AR or pure MA behavior, except that the pattern appears across multiples of lag s in the ACF and PACF
For example, a pure SAR(1) process has spikes in the ACF at lags s, 2s, 3s, etc., while the PACF cuts off after lag s

Seasonal part of ARIMA model


Conversely, a pure SMA(1) process has
spikes in the PACF at lags s, 2s, 3s, etc.,
while the ACF cuts off after lag s
An SAR signature usually occurs when the
autocorrelation at the seasonal period is
positive, whereas an SMA signature
usually occurs when the seasonal
autocorrelation is negative

General multiplicative seasonal models
ARIMA (p, d, q)(P, D, Q)s

An integrated process Xt is designated as an ARIMA(p,d,q) if, taking differences of order d, a stationary process Zt of the ARMA(p,q) type is obtained.
The ARIMA(p,d,q) model is expressed by the equation
Zt = φ1 Zt-1 + φ2 Zt-2 + ... + φp Zt-p + ut - θ1 ut-1 - θ2 ut-2 - ... - θq ut-q
or
φ(L)(1 - L)^d Xt = θ(L) ut
The seasonal models ARIMA(P, D, Q), which are not stationary but homogeneous of degree D, can be expressed as
Zt = Φ1 Zt-s + Φ2 Zt-2s + ... + ΦP Zt-Ps + ut - Θ1 ut-s - Θ2 ut-2s - ...
or
ΦP(L^s)(1 - L^s)^D Xt = δ + ΘQ(L^s) ut
General multiplicative seasonal models, ARIMA (p, d, q)(P, D, Q)s:
ΦP(L^s) φp(L) (1 - L^s)^D (1 - L)^d Xt = ΘQ(L^s) θq(L) ut
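
A minimal sketch, not from the slides, fitting a multiplicative seasonal ARIMA(1,1,1)(1,1,1)12 with statsmodels' SARIMAX; the monthly series is simulated and the chosen orders are assumptions of the example.

```python
# A minimal sketch (not from the slides): fitting a multiplicative seasonal
# ARIMA(1,1,1)(1,1,1)_12 with statsmodels' SARIMAX on a hypothetical monthly series.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(9)
t = np.arange(180)
y = pd.Series(0.4 * t + 8 * np.sin(2 * np.pi * t / 12) + rng.standard_normal(180),
              index=pd.date_range("2005-01", periods=180, freq="MS"))

model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
res = model.fit(disp=False)
print(res.aic)                       # information criterion for this specification
print(res.forecast(steps=12))        # forecast the next 12 months
```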

ARIMA Model Building


Identification
This stage basically tries to identify an appropriate ARIMA model for the underlying stationary time series on the basis of the ACF and PACF
If the series is nonstationary, it is first transformed to covariance-stationary, and then one can easily identify the possible values of the regular part of the model, i.e. the autoregressive order p and moving average order q in a univariate ARMA model, along with the seasonal part

ARIMA Model Building


Estimation
Point estimates of the coefficients can be obtained by the method of maximum likelihood
Associated standard errors are also provided, suggesting which coefficients could be dropped
Diagnostic checking
One should also examine whether the residuals of the model appear to be a white noise process
Forecasting

MSARIMA (p, d, q)(P, D, Q)

MSARIMA (2,1,1)(0,0,1)
MSARIMA (0,0,1)(1,1,0)
MSARIMA (1,0,1)(1,0,1)
MSARIMA (2,1,0)(0,1,1)
