Statisticians working with time series data uncovered a serious problem with standard
econometric techniques applied to time series. OLS estimation produces statistically
significant relationships between time series that contain a trend but are otherwise
random. This finding led to considerable work on how to determine what properties a
time series must possess if econometric techniques are to be used. The basic conclusion
was that any time series used in econometric applications must be stationary.
It is good practice to produce a plot of the time series you are investigating. Does the
series appear to have a stable mean or a trend? Does the variance of the series appear
to be constant over time?
Are the characteristics of a time series – the mean and the variance – constant over time?
If the mean and variance are constant over time, then the series is stationary. If the
mean and variance change, then the series is nonstationary. How can you know if a
series is stationary?
ε_t may have a mean of zero and a variance of σ², which implies that the best guess of
z_{t+1} is z_t and that the forecast error associated with z_{t+1} is σ.
ε_t may be accompanied by a constant, μ, which means that the best guess for z_{t+1}
is z_t + μ. This is designated a random walk with drift.
Rather than depending upon z_{t-1}, z_t may simply be a function of a deterministic
trend: z_t is a function of βT and ε_t. This is designated a trend-stationary process.
These possibilities are nested in a single equation:
z_t = λz_{t-1} + μ + βT + ε_t
or, subtracting z_{t-1} from both sides,
z_t − z_{t-1} = μ + (λ−1)z_{t-1} + βT + ε_t
We can use OLS to estimate the parameters of this equation. Notice the shorthand for
lags and differences in STATA: d.variable indicates the first difference, and l.variable
indicates the first lag.
------------------------------------------------------------------------------
D.approval | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
approval |
L1 | -.1642488 .042101 -3.90 0.000 -.24735 -.0811475
t | -.0133481 .0098681 -1.35 0.178 -.0328263 .0061301
_cons | 10.27947 2.869247 3.58 0.000 4.615999 15.94294
------------------------------------------------------------------------------
If λ−1 = 0, then the series is not stationary (the series contains what is called a unit
root). In the example above, λ−1 < 0.
If β > 0, then the series contains a trend. In the example above, β = 0.
If β = 0 and λ−1 is not zero, then the series is stationary. The approval series is
stationary.
We can use the same test to make sure that all of the variables in our model are
stationary.
To automatically run this test in STATA, use the dfuller command. Notice that I use
both the regress and trend options: regress requests that the regression table be included,
and trend indicates that the trend variable should be included. One important difference
between the standard OLS output and the dfuller output is that the critical values are
calculated from tables published by MacKinnon. We can conclude that the coefficient
on approval at t-1 is not zero, so the series is stationary.
. dfuller approval, regress trend
------------------------------------------------------------------------------
D.approval | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
approval |
L1 | -.1642488 .042101 -3.90 0.000 -.24735 -.0811475
_trend | -.0133481 .0098681 -1.35 0.178 -.0328263 .0061301
_cons | 10.26612 2.862827 3.59 0.000 4.615321 15.91692
------------------------------------------------------------------------------
(ii) detrending – remove a linear trend. Identify the linear trend (regress z on t).
Create a predicted value. Subtract the predicted value from the original series. The
challenge (and the reason this is often not done): is it reasonable to assume a linear
trend?
Model 1 (raw series). The level of presidential approval is a function of the level of
MICS. When consumer sentiment is high, approval is high.
The technical prescription (difference) may or may not coincide with what you think
happens substantively.
Once you are confident the variables are stationary, you can proceed to OLS.
------------------------------------------------------------------------------
approval | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mics | .5935344 .0696798 8.52 0.000 .456008 .7310608
_cons | 4.298118 6.091941 0.71 0.481 -7.725493 16.32173
3. Diagnosing high order autocorrelation.
What is the correlation between the values of the error term at t and t-1, t and t-2, t and
t-3? STATA permits us to visually inspect the level of correlation across error terms at
each lag – the plot is labeled a correlogram.
. corrgram res1
-1 0 1 -1 0 1
LAG AC PAC Q Prob>Q [Autocorrelation] [Partial Autocor]
-------------------------------------------------------------------------------
1 0.8108 0.8121 117.69 0.0000 |------ |------
2 0.6248 -0.0827 187.97 0.0000 |---- |
3 0.4673 -0.0269 227.52 0.0000 |--- |
4 0.3575 0.0318 250.79 0.0000 |-- |
5 0.2591 -0.0359 263.09 0.0000 |-- |
6 0.1722 -0.0362 268.55 0.0000 |- |
7 0.1130 0.0242 270.92 0.0000 | |
8 0.0508 -0.0721 271.4 0.0000 | |
9 0.0415 0.1188 271.72 0.0000 | |
10 0.0085 -0.1003 271.74 0.0000 | |
11 -0.0580 -0.1248 272.38 0.0000 | |
12 -0.0994 0.0322 274.26 0.0000 | |
13 -0.1583 -0.1327 279.08 0.0000 -| -|
14 -0.1642 0.0812 284.3 0.0000 -| |
15 -0.1130 0.1616 286.78 0.0000 | |-
16 -0.0398 0.0542 287.09 0.0000 | |
17 -0.0165 -0.0719 287.14 0.0000 | |
18 -0.0174 -0.0322 287.2 0.0000 | |
19 0.0167 0.0803 287.26 0.0000 | |
20 0.0369 0.0030 287.53 0.0000 | |
21 0.0484 -0.0360 288.01 0.0000 | |
22 0.0260 -0.0777 288.14 0.0000 | |
23 -0.0131 -0.0453 288.18 0.0000 | |
24 -0.0391 -0.0574 288.49 0.0000 | |
25 -0.0129 0.1483 288.53 0.0000 | |-
26 0.0151 -0.0053 288.57 0.0000 | |
27 0.0276 0.0582 288.74 0.0000 | |
28 0.0413 0.0673 289.1 0.0000 | |
29 0.0588 0.0675 289.83 0.0000 | |
30 0.0783 0.0837 291.15 0.0000 | |
31 0.1298 0.1569 294.79 0.0000 |- |-
32 0.1707 0.0391 301.12 0.0000 |- |
33 0.1709 0.0490 307.52 0.0000 |- |
34 0.1645 0.0425 313.49 0.0000 |- |
35 0.1878 0.0924 321.33 0.0000 |- |
36 0.1687 -0.1532 327.7 0.0000 |- -|
37 0.1425 -0.0394 332.28 0.0000 |- |
38 0.1135 0.0388 335.2 0.0000 | |
39 0.0952 0.0911 337.27 0.0000 | |
40 0.0542 -0.0576 337.95 0.0000 | |
(b) Q test
If the error is strictly a product of random error (e_t = v_t), where v_t has mean zero and
variance σ², then the autocorrelation function should be ρ_k = 0 for all k > 0. This is
described as a white noise process. If there is no serial correlation – at lag 1 or at other
lags – in our model, then the error term should appear to be white noise. There is a
formal test for this implemented in STATA – the Ljung-Box Q. Formally, the Q test
statistic is a function of the squared correlation coefficients at each lag (for j lags) and
the number of observations in the sample. STATA reports whether the test is
statistically significant. If the test is significant, then the residuals are correlated.
. wntestq res1
ARIMA regression
------------------------------------------------------------------------------
| OPG
approval | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
approval |
mics | .3900807 .0814409 4.79 0.000 .2304594 .549702
_cons | 22.18424 7.137865 3.11 0.002 8.19428 36.1742
-------------+----------------------------------------------------------------
ARMA |
ar |
L1 | .8286119 .0421585 19.65 0.000 .7459827 .911241
-------------+----------------------------------------------------------------
/sigma | 5.669955 .2467909 22.97 0.000 5.186254 6.153656
------------------------------------------------------------------------------
ARIMA regression
------------------------------------------------------------------------------
| OPG
approval | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
approval |
mics | .5090958 .0745687 6.83 0.000 .3629438 .6552478
_cons | 11.64684 6.569249 1.77 0.076 -1.228655 24.52233
-------------+----------------------------------------------------------------
ARMA |
ma |
L1 | .680474 .0664946 10.23 0.000 .5501471 .810801
-------------+----------------------------------------------------------------
/sigma | 7.124995 .3504756 20.33 0.000 6.438076 7.811915
------------------------------------------------------------------------------
3. AR(1 4) MA(1 4)
ARIMA regression
Sample: 1 to 176 Number of obs = 176
Wald chi2(5) = 460.81
Log likelihood = -554.6477 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| OPG
approval | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
approval |
mics | .3720738 .0794335 4.68 0.000 .2163869 .5277607
_cons | 23.70178 6.94251 3.41 0.001 10.09471 37.30885
-------------+----------------------------------------------------------------
ARMA |
ar |
L1 | .7803232 .0683468 11.42 0.000 .646366 .9142804
L4 | .0101643 .072298 0.14 0.888 -.1315372 .1518659
ma |
L1 | .134136 .0971602 1.38 0.167 -.0562946 .3245665
L4 | .0211127 .0855824 0.25 0.805 -.1466257 .1888511
-------------+----------------------------------------------------------------
/sigma | 5.635385 .2446229 23.04 0.000 5.155932 6.114837
------------------------------------------------------------------------------
Next week
ARIMA models
Last week we used a remedy for serial correlation that assumed the special case of "first
order autoregressive" error. This week we both use a more general remedy and tackle
new tests for stationarity and high order autocorrelation. We will again use the Green et
al. data on presidential approval and macropartisanship. You can take the same
approach with the data as in assignment #2 – model presidential approval as a function
of other variables in the data set.
Your assignment
1. What is the link between presidential approval and the variables you include in the
model? Describe your expectations
2. Estimate the model implied by your expectations with OLS. Report the results
Note: the augmented Dickey-Fuller test determines if the series has a unit root.
Stationary series do not have a unit root, so you would observe a rejection of the null
hypothesis (the absolute value of the test statistic is high) if the series is stationary.
(An estimated p-value<0.05 implies the series is stationary.)
a. Report and interpret the Ljung-Box Q test statistic for higher order serial
correlation (the default is to test up to lag 40). Note: if the value of the statistic
is high, then the null hypothesis – no high order serial correlation – must be
rejected. (An estimated p-value<0.05 implies serial correlation.)
b. Produce and interpret the correlogram. Are there strong relationships between
residuals at any lags?
5. Specify and estimate an ARIMA model and compare the results to the simple OLS.
6. Is there still a problem with serial correlation? Repeat 4(a) and 4(b) above.