
Tutorial 4

Sunil
30 August 2017

Contents

1 A Model of the interest rate spread: An illustration of Box-Jenkins Methodology in R
  1.1 Introduction
  1.2 Model specification
  1.3 Estimation: First model
  1.4 Diagnostic Checking: AR(7)
  1.5 Second Attempt: AR(6)
  1.6 Third Attempt: AR(2)
  1.7 Fourth Model: AR(1,2,7)
  1.8 Fifth Model: ARMA(1,1)
  1.9 Sixth Attempt: ARMA(2,1)
  1.10 Seventh Model: ARMA(2,(1,7))

2 Assignments

1 A Model of the interest rate spread: An illustration of Box-Jenkins Methodology in R

1.1 Introduction

This example is a reproduction from Walter Enders (4th edition), Chapter 2, Section 10 (p. 88). The estimated values may differ slightly.
Data: Quarterly data from 1960Q1 to 2012Q4 on the interest rate spread, defined as the difference between the interest rate on 5-year U.S. government bonds (R5) and the rate on 3-month Treasury bills (TBILL).
R5 and TBILL are given in the data file named QUARTERLY. (The file has a .sas7bdat extension; use read.sas7bdat() to load it.)
The graph of the spread is given below.

1.2 Model specification

Always examine the time series visually before starting the analysis.
library("sas7bdat")
data1 <- read.sas7bdat("quarterly.sas7bdat")

sp <- data1$r5 - data1$Tbill


sp <- ts(sp, start = c(1960, 1), freq = 4)
mu <- mean(sp)
library(zoo)

df <- data.frame(date = as.Date(as.yearqtr(time(sp))), Y = as.matrix(sp))


library(ggplot2)

tssp <- ggplot(data = df, mapping = aes(x = date, y = Y)) + geom_line() +
theme_classic() + ggtitle("Figure 1: The interest rate spread") +
xlab("") + ylab("") + geom_hline(yintercept = mu)

Figure 1: The interest rate spread
[Line plot of the spread, 1960-2012, with a horizontal line at the sample mean]

The following inferences can be drawn from the plot:

- The series seems to be persistent.
- There are no visible breaks in the mean or variance of the series.
- From the graph, the sequence seems to be a covariance stationary process.
- We can check stationarity formally using a unit root test (yet to be discussed).

The next step is to identify a tentative model for the process. We can make an initial guess by examining the ACF and PACF. Plots of the ACF and PACF are given in Figures 2 and 3 (the 95% confidence bounds are shown by red dashed lines). The 95% confidence bounds are:
2/sqrt(length(sp))

## [1] 0.1373606
-2/sqrt(length(sp))

## [1] -0.1373606
The joint significance of the autocorrelations can be examined using the Ljung-Box Q-statistic. For example, the following code tests the joint significance of the first two autocorrelations.
Box.test(sp, lag = 2, type = "Ljung-Box")

##
## Box-Ljung test
##
## data: sp
## X-squared = 256.67, df = 2, p-value < 2.2e-16
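The reported statistic can be reproduced by hand. A minimal sketch, using simulated AR(1) data as a stand-in for the spread so that it runs without the QUARTERLY file: Q = n(n+2) * sum over k of rho_k^2 / (n - k), checked against Box.test().

```r
# Sketch: the Ljung-Box statistic computed by hand and checked
# against Box.test(); simulated data stand in for the spread.
set.seed(1)
x <- arima.sim(model = list(ar = 0.8), n = 200)
n <- length(x)
rho <- acf(x, lag.max = 2, plot = FALSE)$acf[2:3]  # rho_1, rho_2 (drop lag 0)
Q <- n * (n + 2) * sum(rho^2 / (n - 1:2))          # Q = n(n+2) sum rho_k^2/(n-k)
Q - unname(Box.test(x, lag = 2, type = "Ljung-Box")$statistic)  # essentially zero
```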
The first 12 Q-statistics, along with the numerical values of the ACF and PACF, are given in Table 1.
source("acfpacf.R")
acf.graph1(sp, 12) + ggtitle("Figure 2: ACF of interest rate spread")

Figure 2: ACF of interest rate spread
[Bar plot of the ACF at lags 1-12]

pacf.graph1(sp, 12) + ggtitle("Figure 3: PACF of interest rate spread")

Figure 3: PACF of interest rate spread
[Bar plot of the PACF at lags 1-12]

library("knitr")
# ACF
sacf <- acf(sp, plot = F, 12)
# PACF
spacf <- pacf(sp, plot = F, 12)
df1 <- as.matrix(rbind(ACF = sacf$acf[2:13], PACF = spacf$acf))

# Ljung Box Q stat

Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(sp, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

# Combine ACF, PACF and Q-stat
dimnames(df1) <- list(c("ACF", "PACF"), as.character(1:12))
df1 <- rbind(df1, Mjb)
cap1 <- "\\label{tab:tab2}ACF and PACF of Interest Rate Spread."
kable(df1, digits = 2, caption = cap1, booktabs = TRUE, padding = 2)

Table 1: ACF and PACF of Interest Rate Spread.

1 2 3 4 5 6 7 8 9 10 11 12
ACF 0.86 0.68 0.55 0.41 0.28 0.15 0.07 0.04 -0.03 -0.13 -0.20 -0.22
PACF 0.86 -0.21 0.11 -0.18 -0.01 -0.14 0.14 0.01 -0.18 -0.12 -0.04 0.08
Q-stat 157.69 256.67 322.03 358.66 375.68 380.57 381.69 382.04 382.27 385.90 394.72 405.84
pval 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

The ACF and PACF converge to zero quickly enough that we do not have to worry about a time-varying mean.
Recall that the theoretical ACF of a pure MA(q) process cuts off to zero after lag q. The plot of the ACF does not exhibit such behavior, so the possibility of a pure MA(q) process can be eliminated.
The theoretical ACF of an AR(1) model decays geometrically. Examination of the numerical values of the ACF rules out this possibility too.
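These two benchmark shapes are easy to verify with base R's ARMAacf(). A small sketch with illustrative coefficients (0.5 for the MA term, 0.8 for the AR term; both are hypothetical, not estimates from the spread):

```r
# Theoretical ACF of an MA(1): cuts off to zero after lag 1
round(ARMAacf(ma = 0.5, lag.max = 5), 3)
# Theoretical ACF of an AR(1): decays geometrically as 0.8^k
round(ARMAacf(ar = 0.8, lag.max = 5), 3)
```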
The estimated values of the PACF are phi_11 = 0.86, phi_22 = -0.21, phi_33 = 0.11, and phi_44 = -0.18. Although phi_55 is close to zero, phi_66 = -0.14 and phi_77 = 0.14. It is clear from the figure that phi_11, phi_22, phi_44, phi_66 and phi_77 are statistically different from zero. In a pure AR(p) model, the PACF cuts off to zero after lag p. Hence, if the series follows a pure AR(p) process, the value of p could be as high as six or seven.
There appears to be an oscillating pattern in the PACF in that the first seven values alternate in sign. Oscillating decay of the PACF is characteristic of a positive MA coefficient.
Due to the number of small and marginally significant coefficients, the ACF and PACF of the spread are somewhat ambiguous.
Examining the PACF and ACF, we can think of an AR(7) or an ARMA(1,1) process. Let us start with the AR(7).

1.3 Estimation: First model

library(forecast)
library(latex2exp)
mar7 <- Arima(sp, order = c(7, 0, 0))
# specify the order in order arg. First value stands for
# number of AR terms, second the order of integration and
# third for MA terms.

mar7

## Series: sp
## ARIMA(7,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 mean
## 1.1057 -0.4441 0.3891 -0.2899 0.2156 -0.2928 0.1328 1.1970
## s.e. 0.0680 0.1003 0.1038 0.1050 0.1031 0.0994 0.0675 0.1689
##
## sigma^2 estimated as 0.2169: log likelihood=-135.66
## AIC=289.32 AICc=290.21 BIC=319.53

# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar7 <- 2 * (1 - pnorm(abs(mar7$coef)/sqrt(diag(mar7$var.coef))))
### presenting the results with p values
df_mar7 <- as.matrix(rbind(mar7$coef, p_mar7))
dimnames(df_mar7) <- list(c("Coefficients", "P Value"), c("AR1",
    "AR2", "AR3", "AR4", "AR5", "AR6", "AR7", "mu"))
kable(df_mar7, digits = 2, caption = "\\label{tab:tab2}Estimates of AR(7) Model.",
    booktabs = TRUE, padding = 2)

Table 2: Estimates of AR(7) Model.

AR1 AR2 AR3 AR4 AR5 AR6 AR7 mu


Coefficients 1.11 -0.44 0.39 -0.29 0.22 -0.29 0.13 1.2
P Value 0.00 0.00 0.00 0.01 0.04 0.00 0.05 0.0

1.4 Diagnostic Checking: AR(7)

car7 <- mar7$coef[1:7]


rar7 <- 1/polyroot(c(1, -car7))  # inverse roots of the AR lag polynomial
abs(rar7)

## [1] 0.7499416 0.7644520 0.7644520 0.7499416 0.7573816 0.7573816 0.7041410


roots_arma(rar7) + ggtitle("Figure 4: Inverse of the AR roots")

Figure 4: Inverse of the AR roots
[Unit circle plot; all inverse roots lie inside the circle]

The inverses of the roots of the lag polynomial (the eigenvalues) are given above. The model is stable and stationary since all the eigenvalues lie inside the unit circle.

All the coefficients are statistically significant at the 5% level.
R reports the unconditional mean (mu) in the estimation. You can easily recover the intercept using the formula a0 = mu(1 - phi_1 - ... - phi_7).
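For the AR(7), a quick sketch of that computation using the rounded estimates reported in Table 2:

```r
# Recover the intercept a0 from the reported unconditional mean:
# a0 = mu * (1 - phi_1 - ... - phi_7)
phi <- c(1.1057, -0.4441, 0.3891, -0.2899, 0.2156, -0.2928, 0.1328)
mu_hat <- 1.1970
a0 <- mu_hat * (1 - sum(phi))
round(a0, 3)
```

The same value can be obtained from the unrounded estimates with mu_hat * (1 - sum(mar7$coef[1:7])).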
Next, we need to check the properties of the residuals using the ACF, PACF and Q-statistics.
Plots of the ACF and PACF, and estimates of the Q-statistics of the residuals, are given below.
res_ar7 <- mar7$residuals
acf.graph1(res_ar7, 12) + ggtitle("Figure 5: ACF of residuals from AR(7)")

Figure 5: ACF of residuals from AR(7)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar7, 12) + ggtitle("Figure 6: PACF of residuals from AR(7)")

Figure 6: PACF of residuals from AR(7)
[Bar plot of the residual PACF at lags 1-12]

# Ljung Box Q stat

Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar7, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

cmjbar7 <- "\\label{tab:tab3}Ljung Box Q statistics for residuals of AR(7)"
kable(Mjb, digits = 2, caption = cmjbar7, booktabs = TRUE, padding = 2)

Table 3: Ljung Box Q statistics for residuals of AR(7)

lag      1    2    3   4    5    6    7    8    9   10    11    12
Q-stat 0.00 0.18 0.19 0.2 0.74 0.83 1.52 5.40 6.22 6.22 13.02 13.11
pval   0.99 0.92 0.98 1.0 0.98 0.99 0.98 0.71 0.72 0.80  0.29  0.36

The plots of the ACF and PACF of the residuals hint at some autocorrelation around lag 11. To confirm, we can check the Q-statistics.
The significance levels of the Q-statistics indicate no remaining autocorrelation in the residuals, so there is no need to account for the autocorrelation at lag 11.
Although the AR(7) model has some desirable attributes, one reasonable estimation strategy is to eliminate the seventh lag and estimate an AR(6) model over the same sample period.

1.5 Second Attempt: AR(6)

Note that the data set begins in 1960Q1, so with seven lags the estimation of the AR(7) begins in 1961Q4. For comparable results, we estimate the AR(6) over the same sample.
sp1 <- window(sp, start = c(1960, 2))

mar6 <- Arima(sp1, order = c(6, 0, 0))

# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar6 <- 2 * (1 - pnorm(abs(mar6$coef)/sqrt(diag(mar6$var.coef))))
# presenting the results with p values
df_mar6 <- as.matrix(rbind(mar6$coef, p_mar6))
dimnames(df_mar6) <- list(c("Coefficients", "P Value"), c("AR1",
    "AR2", "AR3", "AR4", "AR5", "AR6", "mu"))
kable(df_mar6, digits = 2, caption = "\\label{tab:tab2}Estimates of AR(6) Model.",
    booktabs = TRUE, padding = 2)

Table 4: Estimates of AR(6) Model.

AR1 AR2 AR3 AR4 AR5 AR6 mu


Coefficients 1.09 -0.42 0.36 -0.24 0.16 -0.14 1.21
P Value 0.00 0.00 0.00 0.02 0.11 0.03 0.00

1.5.1 Diagnostic Checking: AR(6)

car6 <- mar6$coef[1:6]


rar6 <- 1/polyroot(c(1, -car6))  # inverse roots of the AR lag polynomial
abs(rar6)

## [1] 0.8476173 0.6737253 0.6737253 0.8476173 0.6665575 0.6665575


roots_arma(rar6) + ggtitle("Figure 7: Inverse of the AR roots")

Figure 7: Inverse of the AR roots
[Unit circle plot; all inverse roots lie inside the circle]

# Ljung Box Q stat


res_ar6 <- mar6$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar6, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

kable(Mjb, digits = 2, caption = "\\label{tab:tab3}Ljung Box Q statistics for residuals of AR(6)",
    booktabs = TRUE, padding = 2)

Table 5: Ljung Box Q statistics for residuals of AR(6)

lag      1    2    3    4    5    6    7     8     9    10    11    12
Q-stat 0.13 0.13 0.13 0.31 0.34 2.08 2.73 10.73 12.03 12.14 16.03 16.39
pval   0.72 0.94 0.99 0.99 1.00 0.91 0.91  0.22  0.21  0.28  0.14  0.17

acf.graph1(res_ar6, 12) + ggtitle("Figure 8: ACF of residuals from AR(6)")

Figure 8: ACF of residuals from AR(6)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar6, 12) + ggtitle("Figure 9: PACF of residuals from AR(6)")

Figure 9: PACF of residuals from AR(6)
[Bar plot of the residual PACF at lags 1-12]

Although there are autocorrelations in the residuals at some lags, the Q-statistics fail to reject the null of no autocorrelation at all lags.
Although AR5 appears to be statistically insignificant, it is generally not a good idea to use t-statistics to eliminate intermediate lags. As such, most researchers would not eliminate the fifth lag and estimate a model with lags 1 through 4 and lag 6. Recall that the appropriate use of a t-statistic requires that the regressor in question be uncorrelated with the other regressors. (But you can try; if you get a better model, you can use it.)
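If you do want to try dropping the fifth lag, the fixed argument can pin individual coefficients to zero. A sketch on simulated data (base stats::arima(), which requires transform.pars = FALSE when AR terms are fixed; forecast::Arima() accepts the same fixed argument):

```r
# Sketch: fix AR5 to zero while estimating lags 1-4 and 6 freely.
set.seed(42)
x <- arima.sim(model = list(ar = c(0.5, 0.2)), n = 300)  # stand-in series
fit <- arima(x, order = c(6, 0, 0),
             fixed = c(NA, NA, NA, NA, 0, NA, NA),  # AR1-AR6, then the mean; AR5 = 0
             transform.pars = FALSE)
fit$coef["ar5"]  # exactly zero by construction
```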
The overall result is that the diagnostic checks of the AR(6) model suggest that it is adequate. We can compare the two models using the AIC and SBC criteria.
AIC for AR(7)
mar7$aic

## [1] 289.3214
AIC for AR(6)
mar6$aic

## [1] 289.5437
SBC for AR(7)
mar7$bic

## [1] 319.5307

SBC for AR(6)
mar6$bic

## [1] 316.3586
Thus the AIC selects the AR(7) model, whereas the SBC selects the more parsimonious AR(6) model.
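The reported criteria follow the usual definitions AIC = -2 log L + 2k and SBC = -2 log L + k log(T), where k counts the estimated parameters (seven AR coefficients, the mean, and sigma^2 give k = 9 for the AR(7)) and T = 212 quarterly observations. A quick check against the AR(7) output above:

```r
# Sketch: AIC and SBC recomputed from the AR(7) log-likelihood in the output
logL <- -135.66
k <- 9          # 7 AR coefficients + mean + sigma^2
T_obs <- 212    # 1960Q1-2012Q4, quarterly
aic <- -2 * logL + 2 * k
sbc <- -2 * logL + k * log(T_obs)
round(c(AIC = aic, SBC = sbc), 2)
```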

Suppose that you try a very parsimonious model and estimate an AR(2).

1.6 Third Attempt: AR(2)

sp2 <- window(sp, start = c(1961, 2))  # comparable sample: fitting starts 1961Q4

mar2 <- Arima(sp2, order = c(2, 0, 0))

# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar2 <- 2 * (1 - pnorm(abs(mar2$coef)/sqrt(diag(mar2$var.coef))))
### presenting the results with p values
df_mar2 <- as.matrix(rbind(mar2$coef, p_mar2))
dimnames(df_mar2) <- list(c("Coefficients", "P Value"), c("AR1", "AR2", "mu"))
kable(df_mar2, digits = 2, caption = "\\label{tab:tab2}Estimates of AR(2) Model.",
    booktabs = TRUE, padding = 2)

Table 6: Estimates of AR(2) Model.

AR1 AR2 mu
Coefficients 1.04 -0.22 1.21
P Value 0.00 0.00 0.00

1.6.1 Diagnostic Checking: AR(2)

car2 <- mar2$coef[1:2]


rar2 <- 1/polyroot(c(1, -car2))  # inverse roots of the AR lag polynomial
abs(rar2)

## [1] 0.7554892 0.2853702


roots_arma(rar2) + ggtitle("Figure 10: Inverse of the AR roots")

Figure 10: Inverse of the AR roots
[Unit circle plot; all inverse roots lie inside the circle]

The inverse roots lie inside the unit circle, so the AR(2) model satisfies the stability condition.
# Ljung Box Q stat
res_ar2 <- mar2$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar2, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

kable(Mjb, digits = 2, caption = "\\label{tab:tab3}Ljung Box Q statistics for residuals of AR(2)",
    booktabs = TRUE, padding = 2)

Table 7: Ljung Box Q statistics for residuals of AR(2)

lag      1    2    3    4     5     6     7     8     9    10    11    12
Q-stat 0.19 3.79 9.17 9.22 10.53 12.87 16.86 21.84 22.61 23.68 29.34 29.34
pval   0.67 0.15 0.03 0.06  0.06  0.05  0.02  0.01  0.01  0.01  0.00  0.00

acf.graph1(res_ar2, 12) + ggtitle("Figure 11: ACF of residuals from AR(2)")

Figure 11: ACF of residuals from AR(2)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar2, 12) + ggtitle("Figure 12: PACF of residuals from AR(2)")

Figure 12: PACF of residuals from AR(2)
[Bar plot of the residual PACF at lags 1-12]
AIC for AR(2)
mar2$aic

## [1] 296.4717
SBC for AR(2)
mar2$bic

## [1] 309.8791
Comparing the AR(7) and AR(2), we can see that the AIC selects the AR(7) model, but the SBC selects the AR(2) model.
However, the residual autocorrelations from the AR(2) are problematic. The Q-statistics from the AR(2) model indicate significant autocorrelation in the residuals at the shorter lags.
As such, the AR(2) should be eliminated from further consideration.

1.7 Fourth Model: AR(1,2,7)

If you examine the AR(7) carefully, you might notice that AR3 almost offsets AR4 and that AR5 almost offsets AR6 (since AR3 + AR4 is approximately 0 and AR5 + AR6 is approximately 0). We can try another model by re-estimating the model without AR3, AR4, AR5, and AR6.

mar127 <- Arima(sp, order = c(7, 0, 0), fixed = c(NA, NA, 0,
0, 0, 0, NA, NA))

# use fixed argument to restrict coef values to zero.

# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar127 <- 2 * (1 - pnorm(abs(mar127$coef)/sqrt(diag(mar127$var.coef))))
### presenting the results with p values
df_mar127 <- as.matrix(rbind(mar127$coef[-(3:6)], p_mar127[-(3:6)]))
dimnames(df_mar127) <- list(c("Coefficients", "P Value"), c("AR1", "AR2", "AR7", "mu"))
kable(df_mar127, digits = 2, caption = "\\label{tab:tab2}Estimates of AR(1,2,7) Model.",
    booktabs = TRUE, padding = 2)

Table 8: Estimates of AR(1,2,7) Model.

AR1 AR2 AR7 mu


Coefficients 1.03 -0.2 -0.03 1.2
P Value 0.00 0.0 0.41 0.0

Since AR7 is now statistically insignificant, it might seem preferable to use the AR(2) instead. Yet the AR(2) has been shown to be inadequate relative to the AR(7) and AR(6) models.
# Ljung Box Q stat
res_ar127 <- mar127$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar127, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

kable(Mjb, digits = 2, caption = "\\label{tab:tab3}Ljung Box Q statistics for residuals of AR(1,2,7)",
    booktabs = TRUE, padding = 2)

Table 9: Ljung Box Q statistics for residuals of AR(1,2,7)

lag      1    2    3    4    5     6     7     8     9    10    11    12
Q-stat 0.19 4.39 8.41 8.42 9.45 12.47 15.35 22.18 23.68 24.27 28.68 28.79
pval   0.66 0.11 0.04 0.08 0.09  0.05  0.03  0.00  0.00  0.01  0.00  0.00

acf.graph1(res_ar127, 12) + ggtitle("Figure 13: ACF of residuals from AR(1,2,7)")

Figure 13: ACF of residuals from AR(1,2,7)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar127, 12) + ggtitle("Figure 14: PACF of residuals from AR(1,2,7)")

Figure 14: PACF of residuals from AR(1,2,7)
[Bar plot of the residual PACF at lags 1-12]
This model also exhibits autocorrelation in the residuals (see the Q-statistics).
Even though the AR(6) and AR(7) models perform relatively well, they are not necessarily the best forecasting models.
There are several possible alternatives, since the patterns of the ACF and PACF are not immediately clear.
As mentioned earlier, the decaying nature of the ACF and PACF can be taken as an indication of an ARMA(1,1) model.

1.8 Fifth Model: ARMA(1,1)

sp11 <- window(sp, start = c(1961, 3))

mar11 <- Arima(sp11, order = c(1, 0, 1))

# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar11 <- 2 * (1 - pnorm(abs(mar11$coef)/sqrt(diag(mar11$var.coef))))
### presenting the results with p values
df_mar11 <- as.matrix(rbind(mar11$coef, p_mar11))
dimnames(df_mar11) <- list(c("Coefficients", "P Value"), c("AR1", "MA1", "mu"))
kable(df_mar11, digits = 2, caption = "\\label{tab:tab2}Estimates of ARMA(1,1) Model.",
    booktabs = TRUE, padding = 2)

Table 10: Estimates of ARMA(1,1) Model.

AR1 MA1 mu
Coefficients 0.76 0.38 1.21
P Value 0.00 0.00 0.00

# Ljung Box Q stat


res_ar11 <- mar11$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar11, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

cMjb1 <- "\\label{tab:tab3}Ljung Box Q statistics for residuals of ARMA(1,1)"
kable(Mjb, digits = 2, caption = cMjb1, booktabs = TRUE, padding = 2)

Table 11: Ljung Box Q statistics for residuals of ARMA(1,1)

lag      1    2    3    4    5    6     7     8     9    10    11    12
Q-stat 0.29 0.59 6.49 6.57 8.30 9.90 13.49 18.16 18.27 18.67 23.91 23.91
pval   0.59 0.74 0.09 0.16 0.14 0.13  0.06  0.02  0.03  0.04  0.01  0.02

acf.graph1(res_ar11, 12) + ggtitle("Figure 15: ACF of residuals from ARMA(1,1)")

Figure 15: ACF of residuals from ARMA(1,1)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar11, 12) + ggtitle("Figure 16: PACF of residuals from ARMA(1,1)")

Figure 16: PACF of residuals from ARMA(1,1)
[Bar plot of the residual PACF at lags 1-12]

# AIC
mar11$aic

## [1] 289.5957
# SBC
mar11$bic

## [1] 302.9072
The AR and MA coefficients are less than one in absolute value, implying a stationary and invertible process.
The SBC from the ARMA(1,1) is smaller than those of the AR(7) and the AR(6).
Nevertheless, the ARMA(1,1) specification is inadequate because of remaining serial correlation in the residuals. The Ljung-Box Q-statistics indicate that the residuals from this model exhibit substantial serial autocorrelation. As such, we must eliminate the ARMA(1,1) model from consideration.
Since the ACF decays and the PACF seems to oscillate beginning with lag 2, it seems plausible to estimate an ARMA(2,1) model.

1.9 Sixth Attempt: ARMA(2,1)

sp21 <- window(sp, start = c(1961, 2))

mar21 <- Arima(sp21, order = c(2, 0, 1))


# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar21 <- 2 * (1 - pnorm(abs(mar21$coef)/sqrt(diag(mar21$var.coef))))
### presenting the results with p values
df_mar21 <- as.matrix(rbind(mar21$coef, p_mar21))
dimnames(df_mar21) <- list(c("Coefficients", "P Value"), c("AR1", "AR2", "MA1", "mu"))
kable(df_mar21, digits = 2, caption = "\\label{tab:tab2}Estimates of ARMA(2,1) Model.",
    booktabs = TRUE, padding = 2)


Table 12: Estimates of ARMA(2,1) Model.

AR1 AR2 MA1 mu


Coefficients 0.42 0.32 0.7 1.2
P Value 0.00 0.01 0.0 0.0

car21 <- mar21$coef[1:2]


rar21 <- 1/polyroot(c(1, -car21))  # inverse roots of the AR lag polynomial
abs(rar21)

## [1] 0.8076514 0.3908103


roots_arma(rar21) + ggtitle("Figure 17: Inverse of the AR roots")

Figure 17: Inverse of the AR roots
[Unit circle plot; all inverse roots lie inside the circle]

The inverse roots lie inside the unit circle, so the AR part satisfies the stability condition. The process is invertible since the MA coefficient is less than one in absolute value.
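The MA side can be checked the same way as the AR side, via the root of the MA polynomial. A sketch using the rounded MA1 estimate from Table 12:

```r
# Invertibility check: the inverse root of 1 + theta*z must lie
# inside the unit circle.
theta <- 0.70                      # MA1 estimate (rounded) from Table 12
inv_ma_root <- 1/polyroot(c(1, theta))
Mod(inv_ma_root)                   # 0.7 < 1, so the MA part is invertible
```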
# Ljung Box Q stat
res_ar21 <- mar21$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar21, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

cmjb21 <- "\\label{tab:tab3}Ljung Box Q statistics for residuals of ARMA(2,1)"
kable(Mjb, digits = 2, caption = cmjb21, booktabs = TRUE, padding = 2)

Table 13: Ljung Box Q statistics for residuals of ARMA(2,1)

lag      1    2    3    4    5    6    7     8     9    10    11    12
Q-stat 0.00 0.05 1.09 1.22 1.42 2.96 7.36 12.04 12.08 12.28 18.72 18.72
pval   0.99 0.97 0.78 0.87 0.92 0.81 0.39  0.15  0.21  0.27  0.07  0.10

acf.graph1(res_ar21, 12) + ggtitle("Figure 18: ACF of residuals from ARMA(2,1)")

Figure 18: ACF of residuals from ARMA(2,1)
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar21, 12) + ggtitle("Figure 19: PACF of residuals from ARMA(2,1)")

Figure 19: PACF of residuals from ARMA(2,1)
[Bar plot of the residual PACF at lags 1-12]

# AIC
mar21$aic

## [1] 287.1041
# SBC
mar21$bic

## [1] 303.7677
This model is an improvement over the ARMA(1,1) specification.
All the coefficients are statistically significant.
The AIC selects the ARMA(2,1) model over the AR(6), and the SBC selects the ARMA(2,1) over the AR(6) and the AR(7).
The values of the Q-statistics indicate that the autocorrelations of the residuals are not statistically significant at the 5% level.

1.10 Seventh Model: ARMA(2,(1,7))

In order to account for the serial correlation at lag 7, it might seem plausible to add an MA term to the model at lag 7.
mar27 <- Arima(sp, order = c(2, 0, 7), fixed = c(NA, NA, NA,
0, 0, 0, 0, 0, NA, NA))
# P-values of the ARIMA coefficients (two-sided, from the normal distribution)
p_mar27 <- 2 * (1 - pnorm(abs(mar27$coef)/sqrt(diag(mar27$var.coef))))
### presenting the results with p values
df_mar27 <- as.matrix(rbind(mar27$coef[-(4:8)], p_mar27[-(4:8)]))
dimnames(df_mar27) <- list(c("Coefficients", "P Value"), c("AR1", "AR2", "MA1", "MA7", "mu"))
cmar27 <- "\\label{tab:tab2}Estimates of ARMA(2,(1,7)) Model."
kable(df_mar27, digits = 2, caption = cmar27, booktabs = TRUE, padding = 2)

Table 14: Estimates of ARMA(2,(1,7)) Model.

AR1 AR2 MA1 MA7 mu


Coefficients 0.34 0.4 0.78 -0.14 1.19
P Value 0.01 0.0 0.00 0.00 0.00

# Ljung Box Q stat


res_ar27 <- mar27$residuals
Mjb <- matrix(nrow = 2, ncol = 12)
for (i in 1:12) {
    jb <- Box.test(res_ar27, lag = i, type = "Ljung-Box")
    Mjb[1, i] <- round(jb$statistic, 2)
    Mjb[2, i] <- round(jb$p.value, 2)
}
dimnames(Mjb) <- list(c("Q-stat", "pval"), as.character(1:12))

cmjb27 <- "\\label{tab:tab3}Ljung Box Q statistics for residuals of ARMA(2,(1,7))"
kable(Mjb, digits = 2, caption = cmjb27, booktabs = TRUE, padding = 2)

Table 15: Ljung Box Q statistics for residuals of ARMA(2,(1,7))

lag      1    2    3    4    5    6    7    8    9   10    11    12
Q-stat 0.01 0.18 0.58 0.83 0.83 2.06 2.34 2.80 4.71 6.51 11.08 11.24
pval   0.93 0.91 0.90 0.93 0.97 0.91 0.94 0.95 0.86 0.77  0.44  0.51

acf.graph1(res_ar27, 12) + ggtitle("Figure 20: ACF of residuals from ARMA(2,(1,7))")

Figure 20: ACF of residuals from ARMA(2,(1,7))
[Bar plot of the residual ACF at lags 1-12]

pacf.graph1(res_ar27, 12) + ggtitle("Figure 21: PACF of residuals from ARMA(2,(1,7))")

Figure 21: PACF of residuals from ARMA(2,(1,7))
[Bar plot of the residual PACF at lags 1-12]

All of the estimated coefficients are statistically significant.
The Q-statistics indicate that the autocorrelations of the residuals are not significant at conventional levels.
Both the AIC and SBC select the ARMA(2,(1,7)) specification over any of the other models.
Although the ARMA(2,(1,7)) model appears to be quite reasonable, other researchers might have selected a decidedly different model.
Consider some of the alternatives listed below.
1. Parsimony versus overfitting: Some researchers prefer small models like the ARMA(2,1) in this example. Also consider the economic plausibility of a particular coefficient for the problem at hand. For instance, is it really plausible that MA7 has a direct effect on the current value of the interest rate spread while lags 3, 4, 5, and 6 have no direct effects? In other words, do the markets for securities work in such a way that what happens 7 quarters in the past has a larger effect on today's interest rates than events occurring in the more recent past? Even though the AIC and SBC select the ARMA(2,(1,7)) model over the ARMA(2,1) model, some researchers would prefer the latter. More generally, overfitting refers to a situation in which an equation is fit to idiosyncrasies present in a particular sample that are not actually representative of the data-generating process. In applied work, no data set will perfectly correspond to every assumption required for the Box-Jenkins methodology. Since it is not always clear which characteristics of the sample are actually present in the data-generating process, the attempt to expand a model so as to capture every feature of the data may lead to overfitting.
2. Volatility: Given the volatility of the spread series during the late 1970s and early 1980s, transforming the spread using some sort of square root or logarithmic transformation might be appropriate. Moreover, the series has a number of sharp jumps, indicating that the assumption of normality might be violated, and that there is a possibility of heteroskedasticity in the variance.
3. Incorporation of trends.
Note: Forecasting will be covered in the next tutorial.

2 Assignments
Given the coefficients, we can compute the theoretical ACF and PACF for the models given above using ARMAacf(). Plot the theoretical ACF and PACF of all the models and compare them with the sample ACF of the spread.
Compute and plot the IRF.
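As a starting point, both tools are in base R's stats package; a sketch using the rounded ARMA(2,1) estimates from Table 12:

```r
# Theoretical ACF/PACF via ARMAacf() and the IRF (psi-weights) via ARMAtoMA()
phi <- c(0.42, 0.32)   # AR estimates from Table 12
theta <- 0.70          # MA estimate from Table 12
tacf <- ARMAacf(ar = phi, ma = theta, lag.max = 12)                # theoretical ACF
tpacf <- ARMAacf(ar = phi, ma = theta, lag.max = 12, pacf = TRUE)  # theoretical PACF
irf <- ARMAtoMA(ar = phi, ma = theta, lag.max = 12)                # psi_1, psi_2, ...
plot(0:12, c(1, irf), type = "h", xlab = "lag", ylab = "IRF")
```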
Attempt Questions 9, 11, and 13 from Enders (4th edition), Chapter 2, Exercises, pp. 114-115.
