Basic Econometrics in Transportation
Autocorrelation

Amir Samimi
Civil Engineering Department
Sharif University of Technology

Primary Source: Basic Econometrics (Gujarati)

Outline

What is the nature of autocorrelation?
What are the theoretical and practical consequences of autocorrelation?
Since the assumption of no autocorrelation relates to the unobservable disturbances, how does one know that there is autocorrelation in any given situation?
How does one remedy the problem of autocorrelation?

Nature of the Problem

Autocorrelation may be defined as correlation between members of a series of observations ordered in time or space.
The CLRM assumes that E(u_i u_j) = 0 for i ≠ j.
Consider a regression of output on labor and capital inputs with quarterly time-series data. If a labor strike affects output in one quarter, there is no reason to believe that this disruption will be carried over to the next quarter: if output is lower this quarter, there is no reason to expect it to be lower next quarter.
In cross-section studies, data are often collected on the basis of a random sample of, say, households. There is often no prior reason to believe that the error term pertaining to one household is correlated with the error term of another household.

Nature of the Problem

This is in many ways similar to the problem of heteroscedasticity.
Under both heteroscedasticity and autocorrelation, the usual OLS estimators, although linear, unbiased, and asymptotically normally distributed, no longer have minimum variance among all linear unbiased estimators (they are not efficient).
Thus, they may not be BLUE.
As a result, the usual t, F, and χ² tests may not be valid.

Patterns of Autocorrelation

a. Figure a shows a cyclical pattern.
b. Figure b suggests an upward linear trend.
c. Figure c shows a downward linear trend in the disturbances.
d. Figure d indicates that both linear and quadratic trend terms are present in the disturbances.

Possible Reasons

1. Inertia
A prominent feature of most economic time series. There is a momentum built into them, and it continues until something happens.
2. Specification Bias: Excluded Variables Case
The price of chicken may affect the consumption of beef. If this variable is excluded, the error term will reflect a systematic pattern.
3. Specification Bias: Incorrect Functional Form
If the squared term X_i² is incorrectly omitted from the model, there will be a systematic fluctuation in the error term, and autocorrelation will be detected because an incorrect functional form was used.

Possible Reasons

4. Cobweb Phenomenon
Supply reacts to price with a lag of one time period. The planting of crops is influenced by their price last year.
5. Lags
In a time-series regression of consumption expenditure on income, it is not uncommon to find that consumption expenditure in the current period also depends on consumption expenditure of the previous period:
Consumption_t = β1 + β2 Income_t + β3 Consumption_{t-1} + u_t   (autoregression)
6. Manipulation of Data
Averaging introduces smoothness and dampens the fluctuations in the data; quarterly data look much smoother than monthly data. This smoothness may lend itself to a systematic pattern in the disturbances.

Possible Reasons

7. Data Transformation
Autocorrelation may be induced as a result of transforming the model. If the error term in Y_t = β1 + β2 X_t + u_t satisfies the standard OLS assumptions, particularly the assumption of no autocorrelation, it can be shown that the error term in the first-difference form ΔY_t = β2 ΔX_t + v_t, where v_t = u_t − u_{t-1}, is autocorrelated.
8. Nonstationarity
A time series is stationary if its characteristics (e.g., mean, variance, and covariance) do not change over time. When Y and X are nonstationary, it is quite possible that the error term is also nonstationary.
Most economic time series exhibit positive autocorrelation because most of them move either upward or downward over sustained periods.

Estimation in Presence of Autocorrelation

What happens to the OLS estimators if we introduce autocorrelation in the disturbances by assuming E(u_t u_{t+s}) ≠ 0?
We revert to the two-variable regression model to explain the basic ideas involved.
To make any headway, we must assume the mechanism that generates u_t; E(u_t u_{t+s}) ≠ 0 is too general an assumption to be of any practical use.
As a starting point, one can assume a first-order autoregressive scheme. A considerable amount of theoretical and empirical work has been done on the AR(1) scheme.

AR(1) Scheme

u_t = ρ u_{t-1} + ε_t   (−1 < ρ < 1)
ρ is the coefficient of autocovariance.
ε_t is a stochastic disturbance term that satisfies the standard OLS assumptions:
E(ε_t) = 0
var(ε_t) = σ²_ε
cov(ε_t, ε_{t+s}) = 0 for s ≠ 0
Given the AR(1) scheme, it can be shown that var(u_t) = σ²_ε / (1 − ρ²); since ρ is constant, the variance of u_t is still homoscedastic.
If |ρ| < 1, the AR(1) process is stationary; that is, the mean, variance, and covariance of u_t do not change over time.
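As a quick numerical check (not from the slides), the stationary-variance result var(u_t) = σ²_ε/(1 − ρ²) can be verified by simulating the AR(1) scheme in plain Python; the parameter values below are illustrative:

```python
import random

def simulate_ar1(rho, sigma_eps, n, seed=0, burn=500):
    """Simulate u_t = rho*u_{t-1} + eps_t with Gaussian white-noise eps_t."""
    rng = random.Random(seed)
    u, path = 0.0, []
    for t in range(burn + n):          # discard a burn-in so the process is stationary
        u = rho * u + rng.gauss(0.0, sigma_eps)
        if t >= burn:
            path.append(u)
    return path

rho, sigma_eps = 0.8, 1.0
u = simulate_ar1(rho, sigma_eps, n=200_000)
sample_var = sum(x * x for x in u) / len(u)   # E(u_t) = 0, so this estimates var(u_t)
theory_var = sigma_eps**2 / (1 - rho**2)      # var(u_t) = sigma_eps^2 / (1 - rho^2)
```

The sample variance settles near σ²_ε/(1 − ρ²) even though each u_t is built only from the previous u and fresh noise, which is the homoscedasticity point made above.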

AR(1) Scheme

Under the AR(1) scheme, it can be shown that the variance of the OLS slope estimator becomes
var(β̂2)_AR(1) = (σ²/Σx_t²)[1 + 2ρ Σx_t x_{t+1}/Σx_t² + 2ρ² Σx_t x_{t+2}/Σx_t² + … + 2ρ^{n−1} x_1 x_n/Σx_t²]
The bracketed term depends on ρ as well as on the sample autocorrelations between the values taken by the regressor X at various lags.
In general we cannot foretell whether this variance is greater or smaller than the usual OLS variance.
The OLS estimator with this adjusted variance is still linear and unbiased, but no longer efficient.
In case of autocorrelation, can we find a BLUE estimator?

BLUE Estimator in Autocorrelation Case

For a two-variable model and an AR(1) scheme, we can show that the following GLS estimators are BLUE:
β̂2_GLS = Σ_{t=2}^{n} (x_t − ρx_{t−1})(y_t − ρy_{t−1}) / Σ_{t=2}^{n} (x_t − ρx_{t−1})² + C
var(β̂2_GLS) = σ² / Σ_{t=2}^{n} (x_t − ρx_{t−1})² + D
C and D are correction factors that may be disregarded in practice.
GLS incorporates additional information (the autocorrelation parameter ρ) into the estimating procedure, whereas in OLS such side information is not directly taken into consideration.
What if we use the OLS procedure despite autocorrelation?

Consequences of Using OLS

As in the heteroscedasticity case, the OLS estimators are not efficient.
The OLS confidence intervals derived are likely to be wider than those based on the GLS procedure: since β2 lies in the OLS confidence interval, we may mistakenly accept the hypothesis that the true β2 is zero with 95% confidence.
Use GLS, not OLS, to establish confidence intervals and to test hypotheses, even though the OLS estimators are unbiased and consistent.

Consequences of Using OLS

If we mistakenly believe that the CNLRM assumptions hold true, errors will arise for the following reasons:
1. The residual variance σ̂² is likely to underestimate the true σ². With AR(1) autocorrelation, it can be shown that E(σ̂²) depends on ρ and on r, the sample correlation coefficient between successive values of X. In most economic time series, ρ and r are both positive, so σ̂² is likely to understate σ².
2. As a result, we are likely to overestimate R².
3. The t and F tests are no longer valid and, if applied, are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.

A Concrete Example

(Results of the wage-productivity regression are shown on the slide.)
How reliable are the results if there is autocorrelation?

Detecting Autocorrelation

Graphical Method:
Very often a visual examination of the estimated û's gives us some clues about the likely presence of autocorrelation.
We can simply plot them against time in a time-sequence plot.
Standardized residuals are the residuals divided by the standard error of the regression, so they are devoid of units of measurement.
The figure exhibits a pattern in the estimated error terms, suggesting that perhaps the u_t are not random.

Detecting Autocorrelation

Graphical Method:
To see this differently, we can plot the residuals at time t against their value at time (t − 1), a kind of empirical test of the AR(1) scheme.
In our wage-productivity regression, most of the residuals are grouped in the first and third quadrants (û_t and û_{t−1} share the same sign), suggesting a strong positive correlation in the residuals.
Although the graphical method is suggestive, it is subjective or qualitative in nature.

The Runs Test

Also known as the Geary test, a nonparametric test.
In our wage-productivity regression there are 9 negative residuals, followed by 21 positive, followed by 10 negative:
(−−−−−−−−−)(+++++++++++++++++++++)(−−−−−−−−−−)
Run: an uninterrupted sequence of one symbol.
Length of a run: the number of elements in it.
Too many runs mean the residuals change sign frequently, indicating negative serial correlation.
Too few runs mean the residuals change sign infrequently, indicating positive serial correlation.

The Runs Test

Let
N = total number of observations = N1 + N2
N1 = number of + symbols
N2 = number of − symbols
R = number of runs
H0: successive outcomes are independent.
If N1 > 10 and N2 > 10, then under H0, R is approximately normal with
E(R) = 2N1N2/N + 1
σ²_R = 2N1N2(2N1N2 − N) / [N²(N − 1)]
Decision rule: reject H0 if the estimated R lies outside the limits
Prob[E(R) − 1.96σ_R ≤ R ≤ E(R) + 1.96σ_R] = 0.95

Durbin-Watson d Test

The most celebrated test for detecting serial correlation:
d = Σ_{t=2}^{n} (û_t − û_{t−1})² / Σ_{t=1}^{n} û_t²
Assumptions underlying the d statistic:
1. The regression model includes the intercept term. If it is not present, rerun the regression including the intercept term to obtain the RSS.
2. The explanatory variables are nonstochastic, or fixed in repeated sampling.
3. The disturbances are generated by the first-order autoregressive scheme.
4. The error term is assumed to be normally distributed.
5. The regression model does not include the lagged value of the dependent variable as one of the explanatory variables (no autoregressive model).
6. There are no missing observations in the data.
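The runs-test recipe above can be sketched in plain Python (an illustrative implementation, not the textbook's code); the example reuses the 9 negative / 21 positive / 10 negative sign pattern from the wage-productivity regression:

```python
import math

def runs_test(residuals):
    """Geary runs test on the signs of the residuals, using the normal
    approximation (appropriate when N1 and N2 both exceed 10)."""
    signs = [1 if e > 0 else -1 for e in residuals]
    n1 = sum(1 for s in signs if s > 0)    # number of + symbols
    n2 = len(signs) - n1                   # number of - symbols
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean_r = 2.0 * n1 * n2 / n + 1.0                                # E(R)
    var_r = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n**2 * (n - 1))  # var(R)
    z = (runs - mean_r) / math.sqrt(var_r)
    return runs, mean_r, math.sqrt(var_r), z

# Sign pattern from the slide: 9 negative, 21 positive, 10 negative residuals.
resid = [-1.0] * 9 + [1.0] * 21 + [-1.0] * 10
runs, mean_r, sd_r, z = runs_test(resid)
# Only 3 runs against E(R) of about 21: far too few runs, so we reject
# independence, consistent with positive serial correlation.
```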

Durbin-Watson d Test

The PDF of the d statistic is difficult to derive because it depends in a complicated way on the X values.
Durbin and Watson suggested bounds (Table D.5) for the computed d: a lower limit d_L and an upper limit d_U.
If there is no serial correlation, d is expected to be about 2, since d ≈ 2(1 − ρ̂).
Run the OLS regression and compute d; for the given sample size and number of explanatory variables, find the critical d_L and d_U values; then follow the decision rules in the figure (for example, d < d_L is evidence of positive autocorrelation).

Durbin-Watson d Test

In our wage-productivity regression, the estimated d value can be shown to be 0.1229.
From the Durbin-Watson tables (N = 40 and k = 1), d_L = 1.44 and d_U = 1.54 at the 5 percent level.
Since d < d_L, there is evidence of positive autocorrelation.
In many situations, it has been found that the upper limit d_U is approximately the true significance limit:
If d < d_U, there is statistically significant positive autocorrelation.
If (4 − d) < d_U, there is statistically significant negative autocorrelation.
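The d statistic itself is a one-line computation; below is a minimal plain-Python sketch (with illustrative residuals, not the textbook data) showing that slowly changing residual signs push d toward 0 and alternating signs push it toward 4:

```python
def durbin_watson(resid):
    """d = sum_{t=2..n}(e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2"""
    num = sum((a - b) ** 2 for a, b in zip(resid[1:], resid[:-1]))
    den = sum(e * e for e in resid)
    return num / den

pos = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]  # slowly changing signs
alt = [1.0, -1.0] * 4                               # sign flips every period
d_pos = durbin_watson(pos)   # 0.5: well below 2 -> positive autocorrelation
d_alt = durbin_watson(alt)   # 3.5: well above 2 -> negative autocorrelation
rho_hat = 1 - d_pos / 2      # rough rho implied by d ~ 2(1 - rho)
```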

Remedial Measures

If autocorrelation is found, we have four options:
1. Try to find out if the autocorrelation is pure autocorrelation and not the result of mis-specification of the model.
2. In case of pure autocorrelation, one can use some type of generalized least-squares (GLS) method.
3. In large samples, we can use the Newey-West method to obtain standard errors of the OLS estimators that are corrected for autocorrelation.
4. In some situations we can continue to use the OLS method.

Mis-specification

For the wages-productivity regression, it may be safe to conclude that the model probably suffers from pure autocorrelation and not necessarily from specification bias.

Correcting for Pure Autocorrelation

The remedy depends on the knowledge we have about the structure of the autocorrelation.
As a starter, consider the two-variable regression model and assume that the error term follows the AR(1) scheme:
Y_t = β1 + β2X_t + u_t,   u_t = ρu_{t−1} + ε_t   (−1 < ρ < 1)
We consider two cases: ρ is known, and ρ is not known but has to be estimated.

When ρ Is Known

Y_t = β1 + β2X_t + u_t
If this holds true at time t, it also holds true at time (t − 1):
Y_{t−1} = β1 + β2X_{t−1} + u_{t−1}
Multiplying both sides by ρ:
ρY_{t−1} = ρβ1 + ρβ2X_{t−1} + ρu_{t−1}
Subtracting gives the generalized difference equation:
Y_t − ρY_{t−1} = β1(1 − ρ) + β2(X_t − ρX_{t−1}) + ε_t
or Y*_t = β*1 + β*2X*_t + ε_t
Since the error term ε_t satisfies the CLRM assumptions, we can apply OLS to the transformed variables and obtain BLUE estimators.
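The generalized-difference procedure with a known ρ can be sketched as follows (a plain-Python illustration on simulated data; the helper `ols2` is a minimal two-variable OLS written for this example, not a library routine, and all parameter values are assumptions):

```python
import random

def ols2(y, x):
    """Two-variable OLS: returns (intercept, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b2 * xbar, b2

# Simulated data from Y_t = 2 + 0.5*X_t + u_t with AR(1) errors, rho = 0.9.
rng = random.Random(42)
b1_true, b2_true, rho = 2.0, 0.5, 0.9
x = [float(t) for t in range(200)]
u, y = 0.0, []
for t in range(200):
    u = rho * u + rng.gauss(0.0, 1.0)
    y.append(b1_true + b2_true * x[t] + u)

# Generalized difference: Y*_t = Y_t - rho*Y_{t-1}, X*_t = X_t - rho*X_{t-1};
# the intercept of the transformed regression estimates b1*(1 - rho).
y_star = [y[t] - rho * y[t - 1] for t in range(1, 200)]
x_star = [x[t] - rho * x[t - 1] for t in range(1, 200)]
a_star, b2_hat = ols2(y_star, x_star)
b1_hat = a_star / (1 - rho)   # recover the original intercept
```

Because the transformation removes the AR(1) structure, the slope estimate from the starred regression is close to the true β2 even though ρ is large.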

When ρ Is Unknown

The First-Difference Method:
Since ρ lies between −1 and +1, one could start from two extreme positions:
ρ = 0: no (first-order) serial correlation;
ρ = ±1: perfect positive or negative correlation.
If ρ = +1, the generalized difference equation reduces to the first-difference form:
Y_t − Y_{t−1} = β2(X_t − X_{t−1}) + (u_t − u_{t−1})
This may be appropriate if the coefficient of autocorrelation is very high.
A rough rule of thumb: use the first-difference form whenever d < R².
Use the Berenblut-Webb g statistic to test the hypothesis that ρ = 1; it is based on û_t, the OLS residuals from the original regression, and ê_t, the OLS residuals from the first-difference regression. Use the Durbin-Watson tables, except that now the null hypothesis is ρ = 1.
In our example, g = 0.0012, d_L = 1.435, and d_U = 1.540: as the observed g lies below the lower limit of d, we do not reject the hypothesis that the true ρ = 1.

When ρ Is Unknown

Based on the Durbin-Watson d Statistic:
If we cannot use the first-difference transformation because ρ is not sufficiently close to 1, in reasonably large samples one can obtain ρ̂ ≈ 1 − d/2 and use GLS.
ρ Estimated from the Residuals:
If the AR(1) scheme is valid, run the regression û_t = ρ̂û_{t−1} + v_t. There is no need to introduce an intercept, as we know the OLS residuals sum to zero.
The estimate of ρ is then used to run the generalized difference (GLS) regression.
All these methods of estimation are known as feasible GLS (FGLS) or estimated GLS (EGLS) methods.
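The residual-based route to ρ̂ and the resulting feasible GLS step can be sketched as follows (simulated data; `ols2` is a minimal helper written for this example, and the true ρ used to generate the errors is an assumption for illustration):

```python
import random

def ols2(y, x):
    """Two-variable OLS: returns (intercept, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b2 * xbar, b2

# Simulated data with AR(1) errors, true rho = 0.7.
rng = random.Random(7)
rho_true = 0.7
x = [float(t % 37) for t in range(500)]   # an arbitrary regressor
u, y = 0.0, []
for t in range(500):
    u = rho_true * u + rng.gauss(0.0, 1.0)
    y.append(1.0 + 2.0 * x[t] + u)

# Step 1: OLS on the original model; keep the residuals.
b1, b2 = ols2(y, x)
e = [yi - b1 - b2 * xi for yi, xi in zip(y, x)]

# Step 2: regress e_t on e_{t-1} without an intercept (residuals sum to zero).
rho_hat = sum(a * b for a, b in zip(e[1:], e[:-1])) / sum(a * a for a in e[:-1])

# Step 3: feasible GLS on the quasi-differenced data.
y_star = [y[t] - rho_hat * y[t - 1] for t in range(1, len(y))]
x_star = [x[t] - rho_hat * x[t - 1] for t in range(1, len(x))]
a_star, b2_fgls = ols2(y_star, x_star)
b1_fgls = a_star / (1 - rho_hat)
```

Because ρ̂ replaces the unknown ρ, this is exactly the "estimated GLS" idea: the estimator is no longer exactly BLUE, but it is a feasible approximation in large samples.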

Newey-West Method

Instead of FGLS methods, OLS with corrected standard errors can be used in large samples.
This is an extension of White's heteroscedasticity correction.
The corrected SEs are known as HAC (heteroscedasticity- and autocorrelation-consistent) standard errors.
We will not present the mathematics behind the method, for it is involved.
The wages-productivity regression can be re-estimated with HAC standard errors.

Dummy Variables and Autocorrelation

Recall the US savings-income regression model for 1970-1995:
Y_t = α1 + α2D_t + β1X_t + β2(D_tX_t) + u_t
Y = savings; X = income; D = 1 for 1982-1995; D = 0 for 1970-1981.
Suppose that u_t follows an AR(1) scheme, and use the generalized difference method to estimate the parameters. How do we transform the dummy?
One can follow this procedure:
1. D_t = 0 for 1970-1981; D_t = 1/(1 − ρ) for 1982; D_t = 1 for 1983-1995.
2. Transform X_t as (X_t − ρX_{t−1}).
3. D_tX_t is zero for 1970-1981; D_tX_t = X_t for 1982; the remaining observations are set to (D_tX_t − ρD_{t−1}X_{t−1}) = (X_t − ρX_{t−1}).
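A minimal sketch of the Newey-West idea for the slope of a two-variable model, using Bartlett weights w_l = 1 − l/(L+1) (an illustrative plain-Python implementation under those assumptions, not the textbook's code or a library API; the helper names are made up for this example):

```python
import math
import random

def ols_fit(y, x):
    """Two-variable OLS: returns intercept, slope, demeaned x, and Sxx."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    xd = [xi - xbar for xi in x]
    sxx = sum(v * v for v in xd)
    b2 = sum(v * (yi - ybar) for v, yi in zip(xd, y)) / sxx
    return ybar - b2 * xbar, b2, xd, sxx

def newey_west_se(y, x, lags):
    """HAC (Newey-West) standard error of the OLS slope, Bartlett weights."""
    b1, b2, xd, sxx = ols_fit(y, x)
    e = [yi - b1 - b2 * xi for yi, xi in zip(y, x)]
    g = [v * ei for v, ei in zip(xd, e)]   # score terms x~_t * e_t
    s = sum(gi * gi for gi in g)           # lag-0 (White) piece
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)         # Bartlett kernel weight
        s += 2.0 * w * sum(g[t] * g[t - l] for t in range(l, len(g)))
    return math.sqrt(s) / sxx              # sqrt(S / Sxx^2)

# Illustrative comparison on simulated data with AR(1) errors (rho = 0.8):
rng = random.Random(3)
x = [float(t) for t in range(300)]
u, y = 0.0, []
for t in range(300):
    u = 0.8 * u + rng.gauss(0.0, 1.0)
    y.append(1.0 + 0.3 * x[t] + u)
b1, b2, xd, sxx = ols_fit(y, x)
sigma2 = sum((yi - b1 - b2 * xi) ** 2 for yi, xi in zip(y, x)) / (len(y) - 2)
se_naive = math.sqrt(sigma2 / sxx)     # usual OLS formula, ignores autocorrelation
se_hac = newey_west_se(y, x, lags=8)   # typically larger here, as the slides warn
```

The point mirrors the slide: the OLS point estimates are kept, and only the standard errors are replaced by autocorrelation-robust ones.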

Homework 6

Basic Econometrics (Gujarati, 2003)
1. Chapter 12, Problem 26 [30 points]
2. Chapter 12, Problem 29 [15 points]
3. Chapter 12, Problem 31 [25 points]
4. Chapter 12, Problem 37 [25 points]

Assignment weight factor = 1