11.2: Vector Autoregressive models, VAR(p) models
VAR models (vector autoregressive models) are used for multivariate time series. The structure is that each
variable is a linear function of past lags of itself and past lags of the other variables.
The vector autoregressive model of order 1, denoted as VAR(1), is as follows:

$$x_t = \alpha + \Phi x_{t-1} + w_t$$

where $x_t$ is the vector of variables at time $t$. Each variable is a linear function of the lag 1 values for all variables in the set.
In a VAR(2) model, the lag 2 values for all variables are added to the right side of each equation. In the case of three x-variables, there would be six predictors on the right side of each equation: three lag 1 terms and three lag 2 terms.
In general, for a VAR(p) model, the first p lags of each variable in the system would be used as regression
predictors for each variable.
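To make this structure concrete, here is a minimal base R sketch (simulated data with illustrative coefficients, not from the text) showing that fitting a bivariate VAR(1) amounts to two OLS regressions, each on the lag 1 values of both series:

```r
# VAR(1) as per-equation OLS: simulate a bivariate VAR(1), then recover the
# coefficient matrix by regressing each series on the lag-1 values of both.
# (Simulated data; the true coefficients are illustrative.)
set.seed(42)
n <- 500
Phi <- matrix(c(0.5, 0.2,
                0.1, 0.4), nrow = 2, byrow = TRUE)  # true lag-1 coefficients
y <- matrix(0, n, 2)
for (t in 2:n) y[t, ] <- Phi %*% y[t - 1, ] + rnorm(2)

y1 <- y[-1, 1];    y2 <- y[-1, 2]      # present-time values
y1lag <- y[-n, 1]; y2lag <- y[-n, 2]   # lag-1 values
fit1 <- lm(y1 ~ y1lag + y2lag)  # equation for the first variable
fit2 <- lm(y2 ~ y1lag + y2lag)  # equation for the second variable
round(coef(fit1), 2)  # slopes near the first row of Phi (0.5, 0.2)
```

The vars package used later in this lesson performs the same equation-by-equation least squares estimation, with options for an intercept and trend.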
VAR models are a specific case of more general VARMA models. VARMA models for multivariate time
series include the VAR structure above along with moving average terms for each variable. More generally
yet, these are special cases of ARMAX models that allow for the addition of other predictors that are outside
the multivariate set of principal interest.
Here, as in Section 5.8 of the text, we’ll focus on VAR models.
On page 304, the authors fit the model of the form
$$x_t = \Gamma u_t + \phi x_{t-1} + w_t$$

where $u_t = (1, t)'$ allows for an intercept and linear trend.
There is a not so subtle difference here from previous lessons in that we now are fitting a model to data that
need not be stationary. In previous versions of the text, the authors separately detrended each series using
a linear regression with t, the index of time, as the predictor variable. The detrended values for each of the
three series are the residuals from this linear regression on t. The detrending is useful conceptually because it removes the common steering force that time may have on each series and creates stationarity, as we have seen in past lessons. This approach results in similar coefficients, though slightly different, as we are now simultaneously fitting the intercept and trend together in a multivariate OLS model.
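A quick base R sketch of the detrending step (simulated series with illustrative trend and AR coefficients): the detrended values are just the residuals from a regression on the time index, and they carry no remaining linear trend.

```r
# Detrend a series by regressing it on the time index t and keeping residuals.
# (Simulated series; the trend and AR parameters are illustrative.)
set.seed(1)
t.index <- 1:200
x <- 5 + 0.3 * t.index + arima.sim(list(ar = 0.5), n = 200)  # trend + AR(1) noise
detrended <- residuals(lm(x ~ t.index))
# By construction the residuals are uncorrelated with t.index, so a second
# regression on t.index gives a slope of essentially zero:
coef(lm(detrended ~ t.index))[2]
```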
The R vars library authored by Bernhard Pfaff has the capability to fit this model with trend. Let's look at two examples: a difference-stationary model and a trend-stationary model.
Difference-Stationary Model
Example 5.10 from the text is a difference-stationary model in that first differences are stationary. Let’s
examine the code and example from the text by fitting the model above:
install.packages("vars") #If not already installed
install.packages("astsa") #If not already installed
library(vars)
library(astsa)
x = cbind(cmort, tempr, part)
plot.ts(x , main = "", xlab = "")
summary(VAR(x, p=1, type="both"))
The install.packages commands install the vars and astsa packages if they are not already present; the library commands then load the necessary functions from the vars library and the necessary data
from our text’s library.
The cbind command binds the three series into a single multivariate response (a necessary step for multivariate
responses).
The VAR command estimates the VAR model using ordinary least squares while simultaneously
fitting the trend, intercept, and AR coefficients. The p = 1 argument requests an AR(1) structure for each equation and
type = "both" fits both a constant and a trend. With the vector of responses, it’s actually a VAR(1).
Following is the output from the VAR command for the variable tempr (the text provides the output for cmort):
The coefficients for a variable are listed in the Estimate column. The .l1 attached to each variable name
indicates that they are lag 1 variables.
Using notation T = temperature, t=time (collected weekly), M = mortality rate, and P = pollution, the equation
for temperature is
$$\hat{T}_t = 67.586 - 0.007t - 0.244M_{t-1} + 0.487T_{t-1} + 0.128P_{t-1}$$
The equation for mortality rate is
$$\hat{M}_t = 73.227 - 0.014t + 0.465M_{t-1} - 0.361T_{t-1} + 0.099P_{t-1}$$
The equation for pollution is
$$\hat{P}_t = 67.464 - 0.005t - 0.125M_{t-1} - 0.477T_{t-1} + 0.581P_{t-1}$$
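To read a fitted equation, it can help to compute a one-step-ahead value by hand. A small sketch, with the coefficients copied from the temperature equation above and purely hypothetical lagged values:

```r
# One-step-ahead fitted value from the estimated temperature equation.
# The lagged mortality, temperature, and pollution values below are made up
# solely to illustrate plugging numbers into the equation.
predict_temp <- function(t, M.lag1, T.lag1, P.lag1) {
  67.586 - 0.007 * t - 0.244 * M.lag1 + 0.487 * T.lag1 + 0.128 * P.lag1
}
predict_temp(t = 100, M.lag1 = 90, T.lag1 = 75, P.lag1 = 50)  # 87.851
```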
The covariance matrix of the residuals from the VAR(1) for the three variables is printed below the estimation
results. The variances are down the diagonal and could possibly be used to compare this model to higher
order VARs. The determinant of that matrix is used in the calculation of the BIC statistic that can be used to
compare the fit of the model to the fit of other models (see formulas 5.89 and 5.90 of the text).
For further references on this technique, see Analysis of Integrated and Cointegrated Time Series with R by Pfaff, and also Campbell and Perron (1991).
In Example 5.11 on page 307, the authors give results for a VAR(2) model for the mortality rate data. In R,
you may fit the VAR(2) model with the command
summary(VAR(x, p=2, type="both"))
The output, as displayed by the VAR command is as follows:
Again, the coefficients for a particular variable are listed in the Estimate column. As an example, the
estimated equation for detrended temperature is
$$\hat{T}_t = 49.88 - 0.005t - 0.109M_{t-1} + 0.261T_{t-1} - 0.051P_{t-1} - 0.041M_{t-2} + 0.356T_{t-2} - 0.095P_{t-2}$$
We will discuss information criterion statistics to compare VAR models of different orders in the homework.
Residuals are also available for analysis. For example, if we assign the VAR command to an object titled
fitvar2 in our program,
fitvar2 = VAR(x, p=2, type="both")
then we have access to the matrix residuals(fitvar2). This matrix will have three columns, one column of
residuals for each variable.
For example, we might use
acf(residuals(fitvar2)[,1])
to see the ACF of the residuals for mortality rate after fitting the VAR(2) model.
Following is the ACF that resulted from the command just described. It looks good for a residual ACF. (The
big spike at the beginning is the unimportant lag 0 correlation.)
The following two commands will create ACFs for the residuals for the other two variables.
acf(residuals(fitvar2)[,2])
acf(residuals(fitvar2)[,3])
They also resemble white noise.
We may also examine these plots in the cross-correlation matrix provided by acf(residuals(fitvar2)):

The plots along the diagonal are the individual ACFs for each model’s residuals that we just discussed above. In addition, we now see the cross-correlation plots of each pair of residual series. Ideally, these would also resemble white noise; however, we do see remaining autocorrelations, especially between temperature and pollution. As our authors note, this model does not adequately capture the complete association between these variables in time.
Trend-Stationary Model
Let’s explore an example where the original data are stationary and examine the VAR code by fitting the
model above with both a constant and trend. Using R, we simulated n = 500 sample values using the VAR(2)
model
Using the VAR command explained above:
y1=scan("var2daty1.dat")
y2=scan("var2daty2.dat")
summary(VAR(cbind(y1,y2), p=2, type="both"))
We obtain the following output:
The estimates are very close to the simulated coefficients and the trend is not significant, as expected. For
stationary data, when detrending is unnecessary, you may also use the ar.ols command to fit a VAR model:
ar.ols(cbind(y1, y2), order=2)
In the first matrix given, read across a row to get the coefficients for a variable. The ", , 1" and ", , 2" headers in the printed array indicate whether a matrix contains the lag 1 or lag 2 coefficients, respectively. The intercepts of the equations are given under $x.intercept, one intercept per variable.
The matrix under $var.pred gives the variance-covariance matrix of the residuals from the VAR(2) for the
two variables. The variances are down the diagonal and could possibly be used to compare this model to
higher order VARs as noted above.
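A base R sketch of how the ar.ols output is organized, using simulated data (the true coefficients are illustrative) and assuming the $ar array is laid out as lag, then equation, then predictor:

```r
# Fit a bivariate VAR(2) with ar.ols and inspect the pieces described above.
# (Simulated data; the true coefficients below are illustrative.)
set.seed(7)
n <- 400
y <- matrix(0, n, 2)
for (t in 3:n) {
  y[t, 1] <- 0.3 * y[t - 1, 1] + 0.2 * y[t - 1, 2] + 0.25 * y[t - 2, 1] + rnorm(1)
  y[t, 2] <- 0.1 * y[t - 1, 1] + 0.4 * y[t - 1, 2] + rnorm(1)
}
fit <- ar.ols(y, order.max = 2, aic = FALSE)  # aic = FALSE forces order 2
dim(fit$ar)       # 2 x 2 x 2 array: lag, then equation, then predictor
fit$ar[1, , ]     # lag-1 coefficient matrix (read across a row per equation)
fit$x.intercept   # one intercept per variable
fit$var.pred      # residual variance-covariance matrix
```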
The standard errors of the AR coefficients are given by the $asy.se.coef component of the fitted object (assign the ar.ols fit to an object first, e.g., fit.ar = ar.ols(cbind(y1, y2), order = 2), then examine fit.ar$asy.se.coef). The output
is
As with the coefficients, read across rows. The first row gives the standard errors of the coefficients for the
lag 1 variables that predict y1. The second row gives the standard errors for the coefficients that predict y2.
You may note that the coefficients are close to those from the VAR command, except for the intercept. This is because
ar.ols estimates the model for x − mean(x). To match the intercept provided by the
summary(VAR(cbind(y1,y2), p=2, type="const")) command, you must calculate the intercept as
follows:
$$y_{t,1} - \hat{\mu}_1 = \alpha_1 + \phi_{11}(y_{t-1,1} - \hat{\mu}_1) + \phi_{12}(y_{t-1,2} - \hat{\mu}_2) + \phi_{21}(y_{t-2,1} - \hat{\mu}_1) + \phi_{22}(y_{t-2,2} - \hat{\mu}_2) + w_{t,1}$$

$$y_{t,1} = \alpha_1 + \hat{\mu}_1(1 - \phi_{11} - \phi_{21}) - \hat{\mu}_2(\phi_{12} + \phi_{22}) + \phi_{11}y_{t-1,1} + \phi_{12}y_{t-1,2} + \phi_{21}y_{t-2,1} + \phi_{22}y_{t-2,2} + w_{t,1}$$
In our example, the intercept for the simulated model for $y_{t,1}$ equals

$$0.043637 + 2.733607(1 - 0.2930 + 0.4523) - 15.45479(0.1913 - 0.6365) = 9.580768,$$

and the estimated equation for $y_{t,1}$ follows by combining this intercept with the estimated AR coefficients.
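The conversion can be checked numerically. A base R sketch with simulated data (the true coefficients and constants are illustrative, and the ar[lag, equation, predictor] layout of the ar.ols output is assumed): fit with ar.ols, rebuild the raw intercept for the first equation from the formula above, and compare it to a direct OLS fit on the uncentered series.

```r
# Verify the intercept conversion: ar.ols works with x - mean(x), so the raw
# intercept for equation 1 is alpha1 + mu1*(1 - phi11 - phi21) - mu2*(phi12 + phi22).
# (Simulated bivariate VAR(2); coefficients and constants are illustrative.)
set.seed(99)
n <- 1000
y <- matrix(0, n, 2)
for (t in 3:n) {
  y[t, 1] <- 5  + 0.3 * y[t - 1, 1] + 0.2 * y[t - 1, 2] - 0.25 * y[t - 2, 1] + rnorm(1)
  y[t, 2] <- 10 + 0.1 * y[t - 1, 1] + 0.4 * y[t - 1, 2] + 0.1 * y[t - 2, 2] + rnorm(1)
}
fit <- ar.ols(y, order.max = 2, aic = FALSE)

phi11 <- fit$ar[1, 1, 1]; phi12 <- fit$ar[1, 1, 2]  # lag-1 coefficients, eq. 1
phi21 <- fit$ar[2, 1, 1]; phi22 <- fit$ar[2, 1, 2]  # lag-2 coefficients, eq. 1
b0 <- fit$x.intercept[1] +
  fit$x.mean[1] * (1 - phi11 - phi21) -
  fit$x.mean[2] * (phi12 + phi22)

# Direct OLS on the raw (uncentered) series gives essentially the same intercept.
ols <- lm(y[3:n, 1] ~ y[2:(n - 1), 1] + y[2:(n - 1), 2] +
                      y[1:(n - 2), 1] + y[1:(n - 2), 2])
c(formula = b0, direct = coef(ols)[[1]])
```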
Estimation with Minitab
For Minitab users, here’s the general flow of what to do.
Read the data into columns.
Use Time Series > Lag to create the necessary lagged columns of the stationary values.
Use Stat > ANOVA > General MANOVA.
Enter the list of “present time” variables as the response variables.
Enter the lagged x variables as covariates (and as the model).
Click Results and select “Univariate Analysis” (to see the estimated regression coefficients for each
equation).
If desired, click “Storage” and select Residuals and/or Fits.
Source URL: https://onlinecourses.science.psu.edu/stat510/node/79