
Models[edit]

Models for time series data can have many forms and represent different stochastic processes.
When modeling variations in the level of a process, three broad classes of practical importance are
the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models.
These three classes depend linearly on previous data points.[26] Combinations of these ideas
produce autoregressive moving average (ARMA) and autoregressive integrated moving
average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA)
model generalizes the former three. Extensions of these classes to deal with vector-valued data are
available under the heading of multivariate time-series models and sometimes the preceding
acronyms are extended by including an initial "V" for "vector", as in VAR for vector autoregression.
An additional set of extensions of these models is available for use where the observed time-series
is driven by some "forcing" time-series (which may not have a causal effect on the observed series):
the distinction from the multivariate case is that the forcing series may be deterministic or under the
experimenter's control. For these models, the acronyms are extended with a final "X" for
"exogenous".
Non-linear dependence of the level of a series on previous data points is of interest, partly because
of the possibility of producing a chaotic time series. However, more importantly, empirical
investigations can indicate the advantage of using predictions derived from non-linear models, over
those from linear models, as for example in nonlinear autoregressive exogenous models. Further
references on nonlinear time series analysis include Kantz and Schreiber[27] and Abarbanel.[28]
Among other types of non-linear time series models, there are models to represent the changes of
variance over time (heteroskedasticity). These models represent autoregressive conditional
heteroskedasticity (ARCH) and the collection comprises a wide variety of representations (GARCH,
TARCH, EGARCH, FIGARCH, CGARCH, etc.). Here changes in variability are related to, or
predicted by, recent past values of the observed series. This is in contrast to other possible
representations of locally varying variability, where the variability might be modelled as being driven
by a separate time-varying process, as in a doubly stochastic model.
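As a hedged sketch of the ARCH/GARCH idea (not from the original text), a GARCH(1,1) model can be fitted with the garch() function from the tseries package; the DAX log-returns below are simply a convenient built-in example series.

R code
library(tseries)                            # provides garch()
dax <- diff(log(EuStockMarkets[, "DAX"]))   # daily log-returns of the DAX index
fit <- garch(dax, order = c(1, 1))          # GARCH(1,1) fitted by quasi-maximum likelihood
summary(fit)                                # coefficient estimates and residual diagnostics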
In recent work on model-free analyses, wavelet transform based methods (for example locally
stationary wavelets and wavelet decomposed neural networks) have gained favor. Multiscale (often
referred to as multiresolution) techniques decompose a given time series, attempting to illustrate
time dependence at multiple scales. See also Markov switching multifractal (MSMF) techniques for
modeling volatility evolution.
A Hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is
assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered as
the simplest dynamic Bayesian network. HMMs are widely used in speech recognition, for
translating a time series of spoken words into text.

Notation[edit]
A number of different notations are in use for time-series analysis. A common notation specifying a
time series X that is indexed by the natural numbers is written
X = {X1, X2, ...}.
Another common notation is
Y = {Yt : t ∈ T},
where T is the index set.

Conditions[edit]
There are two sets of conditions under which much of the theory is built:

Stationary process

Ergodic process
However, the idea of stationarity must be expanded to distinguish two important forms: strict
stationarity and second-order stationarity. Both models and applications can be developed
under each of these conditions, although the models in the latter case might be considered
as only partly specified.
In addition, time-series analysis can be applied where the series are seasonally stationary or
non-stationary. Situations where the amplitudes of frequency components change with time
can be dealt with in time-frequency analysis which makes use of a time-frequency
representation of a time-series or signal.[29]

Economics and finance[edit]


Markov chains are used in finance and economics to model a variety of different phenomena,
including asset prices and market crashes. The first financial model to use a Markov chain was from
Prasad et al. in 1974.[66] Another was the regime-switching model of James D.
Hamilton (1989), in which a Markov chain is used to model switches between periods of high and low
GDP growth (or alternatively, economic expansions and recessions).[67] A more recent example is
the Markov Switching Multifractal model of Laurent E. Calvet and Adlai J. Fisher, which builds upon
the convenience of earlier regime-switching models.[68][69] It uses an arbitrarily large Markov chain to
drive the level of volatility of asset returns.
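As an illustrative sketch only (the transition probabilities and volatility levels below are hypothetical, not taken from any of the cited models), a two-state regime-switching chain of this kind can be simulated in base R:

R code
set.seed(1)
P <- matrix(c(0.95, 0.05,     # P[i, j]: probability of moving from regime i to regime j
              0.10, 0.90), nrow = 2, byrow = TRUE)
sigma <- c(0.01, 0.04)        # hypothetical low- and high-volatility levels
n <- 500
state <- numeric(n); state[1] <- 1
for (t in 2:n) state[t] <- sample(1:2, 1, prob = P[state[t - 1], ])
returns <- rnorm(n, mean = 0, sd = sigma[state])   # returns whose volatility is driven by the chain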

Dynamic macroeconomics heavily uses Markov chains. An example is using Markov chains to
exogenously model prices of equity (stock) in a general equilibrium setting.[70]

Credit rating agencies produce annual tables of the transition probabilities for bonds of different
credit ratings.[71]

Autocorrelation
Just as correlation measures the extent of a linear relationship between two variables,
autocorrelation measures the linear relationship between lagged values of a time series.
There are several autocorrelation coefficients, depending on the lag length. For
example, $r_1$ measures the relationship between $y_t$ and $y_{t-1}$, $r_2$ measures the
relationship between $y_t$ and $y_{t-2}$, and so on.

Figure 2.9 displays scatterplots of the beer production time series where the horizontal
axis shows lagged values of the time series. Each graph shows $y_t$ plotted
against $y_{t-k}$ for different values of $k$. The autocorrelations are the correlations
associated with these scatterplots.
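A display of this kind can be reproduced in R with the base functions lag.plot() and acf(); the AirPassengers series below is just a convenient built-in example, not the beer production data discussed above.

R code
lag.plot(AirPassengers, lags = 9)   # scatterplots of y_t against y_{t-k} for k = 1, ..., 9
acf(AirPassengers)                  # the corresponding autocorrelation coefficients r_k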
8.1 Stationarity and differencing
A stationary time series is one whose properties do not depend on the time
at which the series is observed.1 So time series with trends, or with seasonality, are
not stationary: the trend and seasonality will affect the value of the time series at
different times. On the other hand, a white noise series is stationary: it does not matter
when you observe it, it should look much the same at any period of time.

Some cases can be confusing: a time series with cyclic behaviour (but not trend or
seasonality) is stationary. That is because the cycles are not of fixed length, so before we
observe the series we cannot be sure where the peaks and troughs of the cycles will be.

In general, a stationary time series will have no predictable patterns in the long-term.
Time plots will show the series to be roughly horizontal (although some cyclic behaviour
is possible) with constant variance.
Figure 8.1: Which of these series are stationary? (a) Dow Jones index on 292 consecutive days; (b) Daily
change in Dow Jones index on 292 consecutive days; (c) Annual number of strikes in the US; (d)
Monthly sales of new one-family houses sold in the US; (e) Price of a dozen eggs in the US
(constant dollars); (f) Monthly total of pigs slaughtered in Victoria, Australia; (g) Annual total of
lynx trapped in the McKenzie River district of north-west Canada; (h) Monthly Australian beer
production; (i) Monthly Australian electricity production.

Consider the nine series plotted in Figure 8.1. Which of these do you think are
stationary? Obvious seasonality rules out series (d), (h) and (i). Trend rules out series
(a), (c), (e), (f) and (i). Increasing variance also rules out (i). That leaves only (b) and (g)
as stationary series. At first glance, the strong cycles in series (g) might appear to make
it non-stationary. But these cycles are aperiodic: they are caused when the lynx
population becomes too large for the available feed, so they stop breeding and the
population falls to very low numbers, then the regeneration of their food sources allows
the population to grow again, and so on. In the long-term, the timing of these cycles is
not predictable. Hence the series is stationary.
Differencing
In Figure 8.1, notice how the Dow Jones index data was non-stationary in panel (a), but
the daily changes were stationary in panel (b). This shows one way to make a time series
stationary: compute the differences between consecutive observations. This is known
as differencing.

Transformations such as logarithms can help to stabilize the variance of a time series.
Differencing can help stabilize the mean of a time series by removing changes in the
level of a time series, and so eliminating trend and seasonality.

As well as looking at the time plot of the data, the ACF plot is also useful for identifying
non-stationary time series. For a stationary time series, the ACF will drop to zero
relatively quickly, while the ACF of non-stationary data decreases slowly. Also, for
non-stationary data, the value of $r_1$ is often large and positive.

Figure 8.2: The ACF of the Dow-Jones index (left) and of the daily changes in the Dow-Jones index (right).

The ACF of the differenced Dow-Jones index looks just like that from a white noise
series. There is only one autocorrelation lying just outside the 95% limits, and the
Ljung-Box $Q$ statistic has a p-value of 0.153 (for $h = 10$). This suggests that
the daily change in the Dow-Jones index is essentially a random amount uncorrelated
with previous days.
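A minimal sketch of this calculation in R, assuming the fpp package supplies the Dow-Jones series as dj:

R code
library(fpp)                                       # assumed source of the dj (Dow-Jones) data
Box.test(diff(dj), lag = 10, type = "Ljung-Box")   # portmanteau test with h = 10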
Random walk model
The differenced series is the change between consecutive observations in the original
series, and can be written as

$y'_t = y_t - y_{t-1}.$

The differenced series will have only $T - 1$ values, since it is not possible to calculate
a difference $y'_1$ for the first observation.

When the differenced series is white noise, the model for the original series can be
written as

$y_t - y_{t-1} = e_t \quad\text{or}\quad y_t = y_{t-1} + e_t,$ where $e_t$ denotes white noise.

A random walk model is very widely used for non-stationary data, particularly financial
and economic data. Random walks typically have:

long periods of apparent trends up or down


sudden and unpredictable changes in direction.

The forecasts from a random walk model are equal to the last observation, as future
movements are unpredictable, and are equally likely to be up or down. Thus, the
random walk model underpins naïve forecasts.

A closely related model allows the differences to have a non-zero mean. Then

$y_t - y_{t-1} = c + e_t \quad\text{or}\quad y_t = c + y_{t-1} + e_t.$

The value of $c$ is the average of the changes between consecutive observations. If $c$ is
positive, then the average change is an increase in the value of $y_t$, so $y_t$ will tend
to drift upwards. But if $c$ is negative, $y_t$ will tend to drift downwards.

This is the model behind the drift method discussed in Section 2/3.
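Both models can be sketched in R with the forecast package (loaded automatically with fpp, which is assumed here to supply the dj data):

R code
library(fpp)
fc_naive <- naive(dj, h = 10)              # random walk model: forecasts equal the last observation
fc_drift <- rwf(dj, h = 10, drift = TRUE)  # random walk with drift: adds the average historical change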

Second-order differencing
Occasionally the differenced data will not appear stationary and it may be necessary to
difference the data a second time to obtain a stationary series:
$y''_t = y'_t - y'_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}.$

In this case, $y''_t$ will have $T - 2$ values. Then we would model the change in the
changes of the original data. In practice, it is almost never necessary to go beyond
second-order differences.
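In R this is simply diff() with differences = 2 (a sketch; as elsewhere in this chapter, x stands for any time series object):

R code
diff(x, differences = 2)   # equivalent to diff(diff(x)): the change in the changes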

Seasonal differencing
A seasonal difference is the difference between an observation and the corresponding
observation from the previous year. So

$y'_t = y_t - y_{t-m}, \quad\text{where } m = \text{the number of seasons}.$

These are also called lag-$m$ differences, as we subtract the observation after a lag
of $m$ periods.

If seasonally differenced data appear to be white noise, then an appropriate model for
the original data is

$y_t = y_{t-m} + e_t.$

Forecasts from this model are equal to the last observation from the relevant season.
That is, this model gives seasonal naïve forecasts.
Figure 8.3: Seasonal differences of the A10 (antidiabetic) sales data.

R code
library(fpp)   # assuming the 'fpp' package, which provides the a10 monthly drug sales data
plot(diff(log(a10), 12), xlab = "Year",
     ylab = "Annual change in monthly log A10 sales")

Figure 8.3 shows the seasonal differences of the logarithm of the monthly scripts for A10
(antidiabetic) drugs sold in Australia. The transformation and differencing have made the
series look relatively stationary.
To distinguish seasonal differences from ordinary differences, we sometimes refer to
ordinary differences as "first differences", meaning differences at lag 1.

Sometimes it is necessary to do both a seasonal difference and a first difference to obtain
stationary data, as shown in Figure 8.4. Here, the data are first transformed using
logarithms (second panel). Then seasonal differences are calculated (third panel). The
data still seem a little non-stationary, and so a further lot of first differences are
computed (bottom panel).
Figure 8.4: Top panel: US net electricity generation (billion kWh). Other panels show the same data after
transforming and differencing.

There is a degree of subjectivity in selecting what differences to apply. The seasonally
differenced data in Figure 8.3 do not show substantially different behaviour from the
seasonally differenced data in Figure 8.4. In the latter case, we may have decided to stop
with the seasonally differenced data, and not done an extra round of differencing. In the
former case, we may have decided the data were not sufficiently stationary and taken an
extra round of differencing. Some formal tests for differencing will be discussed later,
but there are always some choices to be made in the modeling process, and different
analysts may make different choices.

If $y'_t = y_t - y_{t-m}$ denotes a seasonally differenced series, then the twice-differenced series is

$y''_t = y'_t - y'_{t-1} = (y_t - y_{t-m}) - (y_{t-1} - y_{t-m-1}) = y_t - y_{t-1} - y_{t-m} + y_{t-m-1}.$

When both seasonal and first differences are applied, it makes no difference which is
done first; the result will be the same. However, if the data have a strong seasonal
pattern, we recommend that seasonal differencing be done first because sometimes the
resulting series will be stationary and there will be no need for a further first difference.
If first differencing is done first, there will still be seasonality present.
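A quick numerical check of this claim in R (again, x stands for any seasonal time series object):

R code
d1 <- diff(diff(x, lag = frequency(x)))   # seasonal difference first, then first difference
d2 <- diff(diff(x), lag = frequency(x))   # first difference first, then seasonal difference
all.equal(d1, d2)                         # both orderings give the same series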

It is important that if differencing is used, the differences are interpretable. First
differences are the change between one observation and the next. Seasonal
differences are the change from one year to the next. Other lags are unlikely to
make much interpretable sense and should be avoided.

Unit root tests


One way to determine more objectively if differencing is required is to use a unit root
test. These are statistical hypothesis tests of stationarity that are designed for
determining whether differencing is required.

A number of unit root tests are available, and they are based on different assumptions
and may lead to conflicting answers. One of the most popular tests is the Augmented
Dickey-Fuller (ADF) test. For this test, the following regression model is estimated:

$y'_t = \phi y_{t-1} + \beta_1 y'_{t-1} + \beta_2 y'_{t-2} + \cdots + \beta_k y'_{t-k},$

where $y'_t$ denotes the first-differenced series, $y'_t = y_t - y_{t-1}$, and $k$ is the
number of lags to include in the regression (often set to be about 3). If the original series, $y_t$,
needs differencing, then the coefficient $\hat\phi$ should be approximately zero. If $y_t$ is already
stationary, then $\hat\phi < 0$. The usual hypothesis tests for regression coefficients do not work
when the data are non-stationary, but the test can be carried out using the following R
command.
R code
library(tseries)   # adf.test() is provided by the tseries package
adf.test(x, alternative = "stationary")

In R, the default value of $k$ is set to $\lfloor (T-1)^{1/3} \rfloor$, where $T$ is the length of
the time series and $\lfloor x \rfloor$ means the largest integer not greater than $x$.

The null-hypothesis for an ADF test is that the data are non-stationary. So large p-values
are indicative of non-stationarity, and small p-values suggest stationarity. Using the
usual 5% threshold, differencing is required if the p-value is greater than 0.05.

Another popular unit root test is the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.


This reverses the hypotheses, so the null-hypothesis is that the data are stationary. In
this case, small p-values (e.g., less than 0.05) suggest that differencing is required.

R code
kpss.test(x)   # also provided by the tseries package

A useful R function is ndiffs() which uses these tests to determine the appropriate
number of first differences required for a non-seasonal time series.

More complicated tests are required for seasonal differencing and are beyond the scope
of this book. A useful R function for determining whether seasonal differencing is
required is nsdiffs() which uses seasonal unit root tests to determine the appropriate
number of seasonal differences required.

The following code can be used to find how to make a seasonal series stationary. The
resulting series stored as xstar has been differenced appropriately.

R code
library(forecast)   # provides nsdiffs() and ndiffs()
# Apply seasonal differencing first, if needed
ns <- nsdiffs(x)
if (ns > 0) {
  xstar <- diff(x, lag = frequency(x), differences = ns)
} else {
  xstar <- x
}
# Then apply ordinary (first) differencing, if still needed
nd <- ndiffs(xstar)
if (nd > 0) {
  xstar <- diff(xstar, differences = nd)
}

1. More precisely, if $y_t$ is a stationary time series, then for all $s$, the distribution of
$(y_t, \ldots, y_{t+s})$ does not depend on $t$.


Modelling procedure
When fitting an ARIMA model to a set of time series data, the following procedure
provides a useful general approach.
1. Plot the data. Identify any unusual observations.
2. If necessary, transform the data (using a Box-Cox transformation) to stabilize the
variance.
3. If the data are non-stationary: take first differences of the data until the data are
stationary.
4. Examine the ACF/PACF: Is an AR($p$) or MA($q$) model appropriate?
5. Try your chosen model(s), and use the AICc to search for a better model.
6. Check the residuals from your chosen model by plotting the ACF of the residuals,
and doing a portmanteau test of the residuals. If they do not look like white noise, try a
modified model.
7. Once the residuals look like white noise, calculate forecasts.
The automated algorithm only takes care of steps 3–5. So even if you use it, you will still
need to take care of the other steps yourself.
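The procedure can be sketched in R roughly as follows, assuming the forecast package's auto.arima() is the automated algorithm referred to above and that y holds the series of interest (both assumptions for illustration):

R code
library(forecast)
plot(y)                                   # step 1: plot the data, look for unusual observations
lambda <- BoxCox.lambda(y)                # step 2: choose a Box-Cox transformation if needed
fit <- auto.arima(y, lambda = lambda)     # steps 3-5: differencing and order selection by AICc
Acf(residuals(fit))                       # step 6: ACF of the residuals
Box.test(residuals(fit), lag = 10, type = "Ljung-Box")   # step 6: portmanteau test
plot(forecast(fit, h = 12))               # step 7: calculate and plot forecasts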
