The asymptotic power envelope is derived for point-optimal tests of a unit root in the
autoregressive representation of a Gaussian time series under various trend specifications.
We propose a family of tests whose asymptotic power functions are tangent to the power
envelope at one point and are never far below the envelope. When the series has no
deterministic component, some previously proposed tests are shown to be asymptotically
equivalent to members of this family. When the series has an unknown mean or linear
trend, commonly used tests are found to be dominated by members of the family of
1. INTRODUCTION
FOLLOWING THE SEMINAL WORK
for the tests can differ substantially, no general optimality theory has been
developed. In particular, there are few general results (even asymptotic) concerning the relative merits of the competing testing principles and of the various
methods for eliminating trend parameters.
Employing a model common in the previous literature, we assume that the
data $y_1, \ldots, y_T$ were generated as
(1) $y_t = d_t + u_t, \quad u_t = \alpha u_{t-1} + v_t \quad (t = 1, \ldots, T),$
G. ELLIOTT, T. J. ROTHENBERG, AND J. H. STOCK
AUTOREGRESSIVE UNIT ROOT
the distribution of the data were otherwise known, the Neyman-Pearson Lemma
gives us the best test against any given point alternative $\bar\alpha$. The power of this
If there exist feasible tests with the same asymptotic power as the Neyman-Pearson point-optimal tests, the comparison will be appropriate in the nuisance
functions very close to this bound, even when there are additional nuisance
In this section we derive an upper bound to the asymptotic power function for
tests of the hypothesis $\alpha = 1$ when the data are generated by (1) and the
following condition is satisfied.
CONDITION A: The stationary sequence $\{v_t\}$ has a strictly positive spectral density
function; it has a moving average representation $v_t = \sum_{j=0}^{\infty} \delta_j \eta_{t-j}$ where the $\eta_t$ are
independent standard normal random variables and $\sum_{j=0}^{\infty} j|\delta_j| < \infty$. The initial $u_0$ is
0 and the $\delta$'s are known.
The unrealistic assumption of known $u_0$, $\delta$'s and error distribution is made so
we may employ the Neyman-Pearson theory; in Section 3 we show that it may be
dropped without any essential change. Our results, however, are quite sensitive
to the nature of the deterministic components $d_t$. Section 2.1 considers the
simplest case where the $d_t$ are known. Section 2.2 examines the case where the
$d_t$ are "slowly evolving" and Section 2.3 examines the case where $d_t$ is a linear
combination of nonrandom trending regressors. Our purpose here is to derive
the power bound; tests that might be used in practice are discussed later in the
paper. All proofs are given in the Appendix.
2.1. Known Deterministic Component
When the $d_t$ are known, $u_t$ is observable and minus two times the log
likelihood is (except for an additive constant) given by
(2) $L(\alpha) = [\Delta u - (\alpha - 1)u_{-1}]' \Sigma^{-1} [\Delta u - (\alpha - 1)u_{-1}];$
$L(\bar\alpha) - L(1).$
When the sample size is large, any reasonable test will have high power unless
$\alpha$ is close to one. Thus, in obtaining large-sample approximations, it is natural
to employ local-to-unity asymptotics where the parameter space is a shrinking
neighborhood of unity as the sample size grows. In our case the appropriate rate
to get nondegenerate distributions is $T^{-1}$ so we reparameterize the model
writing $c = T(\alpha - 1)$ and take $c$ to be a constant when making limiting arguments. Cf. Chan and Wei (1987), Phillips and Perron (1988). Setting $\bar c = T(\bar\alpha - 1)$,
we can then write the likelihood ratio test statistic as
(3) $L(\bar\alpha) - L(1) = \bar c^2\, T^{-2} u_{-1}' \Sigma^{-1} u_{-1} - 2 \bar c\, T^{-1} u_{-1}' \Sigma^{-1} \Delta u.$
Using a different maintained model, Robinson (1994) develops a "standard" asymptotic theory
of efficient tests for a unit root. This requires dropping the familiar autoregression framework and
assuming, for example, a fractionally differenced process for the data.
For any given $\bar c$, rejecting when the linear combination (3) is small yields the
most powerful test against the alternative that $c = \bar c$.
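As a rough numerical illustration of the linear combination referred to as (3), the sketch below assumes the special case $\Sigma = I$ (white-noise $v_t$), $d_t = 0$, and $u_0 = 0$. The function names `lr_statistic` and `lr_direct` are hypothetical. The sketch checks that the combination of $T^{-2}\sum u_{t-1}^2$ and $T^{-1}\sum u_{t-1}\Delta u_t$ agrees with the difference of the two sums of squares $L(\bar\alpha) - L(1)$.

```python
import random

def lr_statistic(u, c_bar):
    """L(alpha_bar) - L(1) for Sigma = I and u_0 = 0, written as a linear combination."""
    T = len(u)
    lagged = [0.0] + u[:-1]                          # u_{t-1}, with u_0 = 0
    diff = [ut - lag for ut, lag in zip(u, lagged)]  # Delta u_t
    quad = sum(lag * lag for lag in lagged) / T ** 2
    cross = sum(lag * d for lag, d in zip(lagged, diff)) / T
    return c_bar ** 2 * quad - 2.0 * c_bar * cross

def lr_direct(u, c_bar):
    """Same quantity computed directly as a difference of two sums of squares."""
    T = len(u)
    a_bar = 1.0 + c_bar / T                          # alpha_bar = 1 + c_bar / T
    lagged = [0.0] + u[:-1]
    ss_alt = sum((ut - a_bar * lag) ** 2 for ut, lag in zip(u, lagged))
    ss_null = sum((ut - lag) ** 2 for ut, lag in zip(u, lagged))
    return ss_alt - ss_null

# Illustrative data: a Gaussian random walk (alpha = 1), seed chosen arbitrarily.
rng = random.Random(0)
u, level = [], 0.0
for _ in range(200):
    level += rng.gauss(0.0, 1.0)
    u.append(level)
```

Calling `lr_statistic(u, -7.0)` and `lr_direct(u, -7.0)` should give the same number up to rounding error; small values of the statistic count as evidence against the unit root.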
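(4) $\pi(c, \bar c) \equiv \Pr\left[\bar c^2 \int W_c^2 - \bar c\, W_c^2(1) < b(\bar c)\right],$
where $b(\bar c)$ satisfies
(5) $\Pr\left[\bar c^2 \int W_0^2 - \bar c\, W_0^2(1) < b(\bar c)\right] = \varepsilon.$
Because the test indexed by $\bar c$ is optimal against the alternative $c = \bar c$, the envelope
power function for this family of point-optimal tests is $\Pi(c) \equiv \pi(c, c)$.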
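The power function $\pi(c, \bar c)$ can be approximated by simulation. The following is a minimal sketch, assuming that $W_c$ is replaced by a discretized AR(1) recursion with coefficient $1 + c/\text{steps}$ and Gaussian increments; the names `po_statistic` and `power`, the number of replications, and the seed are illustrative choices, not values from the paper.

```python
import random

def po_statistic(c, c_bar, steps, rng):
    """Discretized c_bar^2 * int W_c^2 - c_bar * W_c(1)^2 for one simulated path."""
    a = 1.0 + c / steps
    w, sum_sq = 0.0, 0.0
    for _ in range(steps):
        sum_sq += w * w                              # W_c evaluated at the left endpoint
        w = a * w + rng.gauss(0.0, 1.0) / steps ** 0.5
    return c_bar ** 2 * sum_sq / steps - c_bar * w * w

def power(c, c_bar, size=0.05, reps=2000, steps=200, seed=1):
    """Monte Carlo estimate of pi(c, c_bar); rejects for small values of the statistic."""
    rng = random.Random(seed)
    null = sorted(po_statistic(0.0, c_bar, steps, rng) for _ in range(reps))
    b = null[int(size * reps)]                       # simulated critical value b(c_bar)
    alt = [po_statistic(c, c_bar, steps, rng) for _ in range(reps)]
    return sum(s < b for s in alt) / reps
```

Evaluating `power(c, c)` over a grid of negative `c` values traces out an approximation to the envelope $\Pi(c)$, while fixing `c_bar` and varying `c` gives the power curve of a single point-optimal test.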
CONDITION B: ... the $\Delta d_t$ are bounded with ...
This will automatically be satisfied if the $d_t$ are constant. It will also be
satisfied by a variety of smooth functions of time. These include low frequency
sinusoids (e.g., $d_t = \cos(2\pi k t/T)$ for finite $k$); slowly increasing time trends
(e.g., $d_t = \ln(t)$ or $d_t = t^\delta$ for $\delta < 1/2$); and step functions with finitely many
jumps (e.g., $d_t = \beta_0$ when $t < t_0$ and $d_t = \beta_1$ when $t > t_0$). In the slowly evolving
trend case, the random component of $y_t$ dominates the deterministic component
when $T$ is large. It is tempting therefore to ignore the deterministic term when
constructing the test statistic. In the Appendix we show that, if the $d_t$ evolve
(6) $L^* = \min_\beta L(\bar\alpha, \beta) - \min_\beta L(1, \beta).$
The test statistic is the difference in (weighted) sum of squared residuals from
two constrained GLS regressions, one imposing $\alpha = \bar\alpha$ and the other imposing
$\alpha = 1$.
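The difference of constrained sums of squared residuals can be sketched in a few lines. This example assumes $\Sigma = I$, a constant mean ($z_t = 1$), and $u_0 = 0$, so that the first observation enters in levels and later observations are quasi-differenced; `weighted_ssr` and `l_star` are hypothetical names.

```python
import random

def weighted_ssr(y, a):
    """SSR from regressing quasi-differenced y on the quasi-differenced constant."""
    ya = [y[0]] + [y[t] - a * y[t - 1] for t in range(1, len(y))]
    za = [1.0] + [1.0 - a] * (len(y) - 1)            # (1 - aL) applied to z_t = 1
    beta = sum(z * v for z, v in zip(za, ya)) / sum(z * z for z in za)
    return sum((v - beta * z) ** 2 for z, v in zip(za, ya))

def l_star(y, c_bar):
    """min over beta of L(alpha_bar, beta) minus min over beta of L(1, beta), Sigma = I."""
    a_bar = 1.0 + c_bar / len(y)
    return weighted_ssr(y, a_bar) - weighted_ssr(y, 1.0)

# Illustrative data: a random walk around an unknown mean of 10.
rng = random.Random(3)
y, level = [], 0.0
for _ in range(100):
    level += rng.gauss(0.0, 1.0)
    y.append(10.0 + level)
```

Because the mean is concentrated out in both regressions, `l_star(y, c_bar)` is unchanged when a constant is added to every observation, which is the invariance property the test relies on.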
(7) $V_c(t, \bar c) = W_c(t) - t\left[\lambda W_c(1) + 3(1 - \lambda)\int_0^1 s W_c(s)\,ds\right],$
$d_t$ are
unknown and not slowly evolving is more complicated. Suppose, for example, the
THEOREM 1: Suppose $\{y_t\}$ is generated by the Gaussian model (1) under Condition A. Consider unit-root tests of size $\varepsilon$ under local-to-unity asymptotics where both
$c = T(\alpha - 1)$ and $\bar c = T(\bar\alpha - 1)$ are fixed as $T$ tends to infinity.
a. When $d_t$ is known or satisfies Condition B, the Neyman-Pearson most powerful test against the alternative $c = \bar c$ has asymptotic power function $\pi(c, \bar c)$ defined in
(4). An upper bound to the asymptotic power of any unit-root test is given by the
power envelope $\Pi(c) \equiv \pi(c, c)$.
b. When $d_t = \beta_0$, the most powerful invariant test against the alternative $c = \bar c$ has
asymptotic power function $\pi(c, \bar c)$. The asymptotic power envelope for this family of
point-optimal invariant tests is $\Pi(c)$.
c. When $d_t = \beta_0 + \beta_1 t$, the most powerful invariant test against the alternative
$c = \bar c$ has asymptotic power function
(8) $\pi^\tau(c, \bar c) \equiv \Pr\left[\bar c^2 \int V_c^2(t, \bar c)\,dt + (1 - \bar c) V_c^2(1, \bar c) < b^\tau(\bar c)\right],$
where $b^\tau(\bar c)$ satisfies $\Pr\left[\bar c^2 \int V_0^2(t, \bar c)\,dt + (1 - \bar c) V_0^2(1, \bar c) < b^\tau(\bar c)\right] = \varepsilon$. An upper
bound to the asymptotic power of any unit-root test invariant to the trend parameters
$\beta_0$ and $\beta_1$ is given by the power envelope $\Pi^\tau(c) \equiv \pi^\tau(c, c)$.
POINT-OPTIMAL TESTS
Although the point-optimal test statistics defined in (3) and (6) require $\Sigma$ and
$u_0$ to be known, it is possible to construct tests having the same large-sample
properties even in the absence of this knowledge. Furthermore, the asymptotic
theory is valid under less stringent assumptions than those made in Theorem 1.
In this section, we continue to assume that equation (1) describes the data
generating process but we drop Condition A and consider the properties of
some feasible tests under weaker assumptions. For $0 \le s \le 1$, let $[sT]$ be the
greatest integer less than or equal to $sT$ and let $\Rightarrow$ denote weak convergence of
the underlying probability measures as $T$ tends to infinity.
(9)
where $\hat\omega^2$ is an estimator for $\omega^2$ and
Thus the power functions $\pi(c, \bar c)$ and $\pi^\tau(c, \bar c)$ derived in Theorem 1 for
point-optimal tests in the Gaussian model with $\Sigma$ known can be attained by the
simple $P_T$ family of statistics under the much weaker assumptions of this
section. This is important, because, in practice, $\Sigma$ will generally contain unknown parameters and there is often no compelling reason to believe that the
data are normally distributed. If the errors are nonnormal, tests exploiting the
form of the actual likelihood and possessing power higher than $\Pi(c)$ and $\Pi^\tau(c)$
could be constructed. In the absence of such information, quasi-likelihood tests
based on least-squares regressions are likely to be used in practice. The power
bounds derived under normality are still valid when comparing such tests.
Although our analysis is based on relatively weak assumptions, two interesting
models considered elsewhere in the literature are ruled out. A problem closely
related to ours is to test the null hypothesis that $\{u_t\}$ is an integrated process
against the alternative that it is a strictly stationary process. Under that
alternative, $u_0$ will have a variance proportional to $(1 - \alpha^2)^{-1}$, a violation of
Condition C. The tests studied in Section 2 are not point optimal under this
specification and the asymptotic power bounds are no longer valid. Our $P_T$
statistics, however, still have simple local-to-unity limiting representations under
AS
(10)
(1993).
A second, closely related approach to modeling unit roots is also ruled out
here. One way to avoid making an assumption about the initial error $u_0$ is to
base the entire statistical analysis on the conditional distribution of the data
given the first observation $y_1$. When $d_t$ is known, there is no difference
asymptotically between our analysis based on the full likelihood and analysis
based on the conditional likelihood. But when $d_t$ is unknown, the point-optimal
invariant test based on the full likelihood is not asymptotically equivalent to the
point-optimal test based on the conditional likelihood. Invariance under the
Estimators $\hat\omega^2$ that are consistent under local alternatives and have nonzero
probability limits under fixed alternatives clearly satisfy this condition. Some
examples of such estimators are given in Section 5.
4. SOME
$\bar c = T(\bar\alpha - 1)$ for all $T$, we can find that alternative $\bar\alpha(\pi, T, \varepsilon)$ which yields
(approximate) power $\pi$ when using the point-optimal test of level $\varepsilon$ with a
sample of size $T$. Then, for $\varepsilon < \pi < 1$, the family of test statistics can be written
the stationary alternative. Suppose, for example, $u_0$ is normal with mean zero
and variance $(1 - \alpha^2)^{-1}$ and that the $v_t$ are serially uncorrelated with unit
variance. Then $T^{-1/2} u_{[tT]} \Rightarrow W_c^*(t) = W_c(t) + \eta_0 e^{ct}$ where $\eta_0$ is a normal variate, independent of $W_c(\cdot)$, with mean zero and variance $(-2c)^{-1}$. The $P_T$
statistics can then be written as functionals of the $W_c^*(t)$ process. Further
analysis of the stationary alternative testing problem can be found in Elliott
and
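The variance $(-2c)^{-1}$ of the initial-condition term follows from simple arithmetic: with $\alpha = 1 + c/T$, the variance of $T^{-1/2}u_0$ is $1/[T(1 - \alpha^2)] = 1/(-2c - c^2/T)$. The short sketch below (function name hypothetical) evaluates this for increasing $T$.

```python
def scaled_initial_variance(c, T):
    """Variance of T**-0.5 * u_0 when Var(u_0) = 1 / (1 - alpha**2), alpha = 1 + c/T."""
    alpha = 1.0 + c / T
    return 1.0 / (T * (1.0 - alpha ** 2))

# The values approach 1 / (-2c) = 1/14 for c = -7 as T grows.
for T in (100, 1000, 100000):
    print(T, scaled_initial_variance(-7.0, T))
```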
$P_T(\pi) =$
(We suppress the dependence of $P_T$ on $\varepsilon$.) Although every member of this family
is admissible, past research suggests that values of $\pi$ near one-half often yield
tests whose power functions lie close to the power envelope over a considerable
range. Cf. King (1988).
For the remainder of the paper we restrict attention to the three standard
cases discussed in the literature where $d_t$ is either zero, a constant, or a linear
trend. To distinguish the cases, we follow Dickey and Fuller (1979) and use a
superscript $\mu$ when $d_t$ is constant and a superscript $\tau$ when it is a linear trend.
Since commonly used test statistics have distributions not depending on the
parameters determining the $d_t$, we shall also restrict attention to invariant tests.
When there is no deterministic term, our family of $P_T$ tests includes as special
cases many tests previously proposed. Recall that $P_T(\pi)$ has the asymptotic
representation $\bar c^2(\pi)\int W_c^2 - \bar c(\pi) W_c^2(1)$ where $\bar c(\pi)$ is a monotonically decreasing function taking the value zero when $\pi$ is equal to $\varepsilon$ (the size of the test) and
tending to minus infinity as $\pi$ approaches one. Sargan and Bhargava (1983)
suggest $S(0)/S(1)$ as a test statistic when the $v_t$ are white noise; asymptotically it
behaves like $\int W_c^2$ and corresponds to $P_T(1)$. The locally most powerful test
described by Dufour and King (1991) behaves asymptotically like $W_c^2(1)$ and
corresponds to $P_T(\varepsilon)$. The Dickey-Fuller estimator test (based on their statistic
$\hat\rho$) is also a member, since its rejection region is determined, asymptotically, by a
linear combination of $\int W_c^2$ and $W_c^2(1)$; computations indicate that it has the
same limiting distribution as our $P_T(1 - \varepsilon)$. The Dickey-Fuller $t$ statistic (denoted by $\hat\tau$) is a nonlinear function of $\int W_c^2$ and $W_c^2(1)$. Nonetheless, computations indicate that the asymptotic power function of their $t$ test is tangent to the
power envelope when power is about one-half and behaves like the $P_T(.5)$ test.
Likewise, the $Z_\alpha$ and $Z_t$ tests examined in Phillips (1987) and Phillips and
Perron (1988) behave like members of the $P_T$ family since they are asymptotically equivalent to the $\hat\rho$ and $\hat\tau$ tests, respectively.
Figure 1 graphs the asymptotic power functions of these tests along with the
power envelope when the tests have size 0.05. These are based on 20,000 Monte
Carlo replications where $W_c$ was approximated by its discrete realization from a
sample of size 500; simulation standard errors are less than 0.0013. The power
envelope is monotonic and equals one-half when $c = -7$. With the exception of
the locally most powerful test which puts all the weight on $W_c^2(1)$, all the tests
have power functions very close to the power envelope. Indeed, it is hard to
distinguish them without vastly changing the scale of the figure. Although none
of these tests is uniformly most powerful even asymptotically, our numerical
FIGURE 1. Asymptotic power functions of selected unit root tests: no deterministic component.
Legend: A: $P_T(1.0)$, Sargan-Bhargava; B: $P_T(.95)$, Dickey-Fuller $\hat\rho$; C: $P_T(.5)$, Dickey-Fuller $\hat\tau$; D: $P_T(.05)$, locally most powerful.
Legend (constant mean figure): $P_T^\mu(.5)$; B: DFGLS$^\mu$(.5); C: Sargan-Bhargava; D: Dickey-Fuller $\hat\rho^\mu$; E: Dickey-Fuller $\hat\tau^\mu$.
well.
Things are rather different, however, when $d_t$ contains parameters that have
to be estimated. The Sargan-Bhargava (1983) test for the constant mean case,
Bhargava's (1986) extension for the linear trend case, the Dickey-Fuller estimator tests (based on their statistics $\hat\rho^\mu$ and $\hat\rho^\tau$), the Dickey-Fuller $t$ tests (based
on their $\hat\tau^\mu$ and $\hat\tau^\tau$), and the Phillips-Perron $Z$ tests are no longer asymptotically equivalent to members of the $P_T$ family since they employ OLS estimates
of the $\beta$'s instead of constrained local-to-unity estimates. The power functions
for the $P_T^\mu(\pi)$ and $P_T^\tau(\pi)$ tests remain very close to the relevant power
envelopes $\Pi(c)$ and $\Pi^\tau(c)$ for a broad range of $\pi$ values. The power functions
for the tests which use OLS estimates of $\beta$ are well below the power envelopes.
Some results for tests at the 5% level are presented in Figure 2 for the constant
FIGURE 2. Asymptotic power functions of selected unit root tests: constant mean ($z_t = 1$).
mean case and in Figure 3 for the linear trend case. The envelope power curve
$\Pi^\tau(c)$ has the same shape as $\Pi(c)$, but now takes the value one-half when
$c = -13.5$. The power loss of the commonly used tests is particularly dramatic in
the constant mean case. The same pattern is found for tests at the 1% and 10%
significance levels.
A measure of the difference between two tests is Pitman asymptotic relative
efficiency (ARE), defined as the ratio of the values of $c$ at which the tests
achieve a specified power. Evaluating efficiency at power one-half and using 5%
level tests, we find in the constant mean case the ARE's of the Sargan-Bhargava,
$\hat\rho^\mu$ and $\hat\tau^\mu$ tests relative to the power envelope are, respectively, 1.40, 1.53, and
1.91. Since $c$ is proportional to $T$, this implies that using the Dickey-Fuller $t$ test
instead of the $P_T(.5)$ test is equivalent in large samples to discarding almost half
of the observations. The corresponding ARE's for the linear trend case are 1.07,
1.13, and 1.25.
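The "almost half" statement is a one-line calculation: because $c$ is proportional to $T$, a test with ARE equal to $r$ needs $r$ times as many observations to reach the same power, so the effective share of the sample that is wasted is $1 - 1/r$. A small sketch (the function name is hypothetical) applies this to the constant mean figures quoted above.

```python
def discarded_fraction(are):
    """Share of the sample effectively wasted by the less efficient test."""
    return 1.0 - 1.0 / are

# ARE values for the constant mean case as reported in the text.
for name, are in [("Sargan-Bhargava", 1.40), ("Dickey-Fuller rho", 1.53), ("Dickey-Fuller t", 1.91)]:
    print(f"{name}: {discarded_fraction(are):.2f}")
```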
Since the difficulties with the standard tests are associated with inefficient
estimates of the trend parameters, it is reasonable to expect that modified
estimates could improve their performance. Because of their relatively good size
properties found in small-sample Monte Carlo studies (e.g., Schwert (1989)),
natural tests to modify are those based on the Dickey-Fuller $t$ statistics $\hat\tau^\mu$ and
$\hat\tau^\tau$. Choosing $\bar\alpha$ to be that alternative where maximal power is approximately
one-half, we propose regressing $y_{\bar\alpha}$ on $Z_{\bar\alpha}$ to obtain the estimate $\tilde\beta$. Then one
can perform the usual augmented Dickey-Fuller $t$ test (without deterministic
regressors) using the residual series $y_t^d = y_t - \tilde\beta' z_t$ in place of $y_t$. Thus the
modified test statistic (denoted by DFGLS($\pi$) in the tables and figures) is the $t$
statistic for testing $a_0 = 0$ in the regression
(11) $\Delta y_t^d = a_0 y_{t-1}^d + a_1 \Delta y_{t-1}^d + \cdots + a_p \Delta y_{t-p}^d + \text{error}.$
statistic when there is no intercept. In the linear trend case, the detrended series
$y_t^\tau = y_t - \tilde\beta_0 - \tilde\beta_1 t$ plays the role of $y_t^d$. It is shown in the Appendix that
$T^{-1/2} y^\tau_{[sT]} \Rightarrow \omega V_c(s, \bar c)$ when $\bar\alpha = 1 + \bar c/T$ is used for the estimation of $\beta$; the $t$
power envelope for $.01 < \varepsilon < .10$. Some critical values for this choice of $\bar c$ are
given in Table I. Note that, although the small-sample values are valid only for
Gaussian white noise $\{v_t\}$, the large-sample critical values do not depend
on $\Sigma$ or normality.

FIGURE 3. Asymptotic power functions of selected unit root tests: linear trend ($z_t = (1, t)'$).
Legend: B: DFGLS$^\tau$(.5); C: Bhargava; D: Dickey-Fuller $\hat\rho^\tau$; E: Dickey-Fuller $\hat\tau^\tau$.

TABLE I
CRITICAL VALUES

                         Level
     T        1%      2.5%       5%      10%
Constant Mean: $P_T^\mu$ ($\bar c = -7$)
    50       1.87     2.39     2.97     3.91
   100       1.95     2.47     3.11     4.17
   200       1.91     2.47     3.17     4.33
 $\infty$    1.99     2.55     3.26     4.48
Linear Trend: $P_T^\tau$ ($\bar c = -13.5$)
    50       4.22     4.94     5.72     6.77
   100       4.26     4.90     5.64     6.79
   200       4.05     4.83     5.66     6.86
 $\infty$    3.96     4.78     5.62     6.89
Linear Trend: DFGLS$^\tau$ ($\bar c = -13.5$)
    50      -3.77    -3.46    -3.19    -2.89
   100      -3.58    -3.29    -3.03    -2.74
   200      -3.46    -3.18    -2.93    -2.64
 $\infty$   -3.48    -3.15    -2.89    -2.57
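The modified Dickey-Fuller procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes no augmentation lags ($p = 0$), treats the first observation in levels when quasi-differencing, and uses hypothetical helper names (`ols`, `gls_detrend`, `dfgls_t`).

```python
def ols(X, y):
    """OLS coefficients for one or two regressors via the normal equations."""
    if len(X[0]) == 1:
        return [sum(r[0] * v for r, v in zip(X, y)) / sum(r[0] ** 2 for r in X)]
    s11 = sum(r[0] * r[0] for r in X)
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X)
    b1 = sum(r[0] * v for r, v in zip(X, y))
    b2 = sum(r[1] * v for r, v in zip(X, y))
    det = s11 * s22 - s12 ** 2
    return [(s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det]

def gls_detrend(y, c_bar, trend):
    """Estimate beta from the quasi-differenced regression, then detrend y in levels."""
    T = len(y)
    a = 1.0 + c_bar / T                              # alpha_bar = 1 + c_bar / T
    z = [[1.0, float(t)] if trend else [1.0] for t in range(1, T + 1)]
    ya = [y[0]] + [y[t] - a * y[t - 1] for t in range(1, T)]
    za = [z[0]] + [[p - a * q for p, q in zip(z[t], z[t - 1])] for t in range(1, T)]
    beta = ols(za, ya)
    return [v - sum(b * x for b, x in zip(beta, row)) for v, row in zip(y, z)]

def dfgls_t(y, c_bar, trend=False):
    """Dickey-Fuller t statistic (no intercept, no lags) on the locally detrended series."""
    yd = gls_detrend(y, c_bar, trend)
    lag = yd[:-1]
    diff = [yd[t] - yd[t - 1] for t in range(1, len(yd))]
    sxx = sum(l * l for l in lag)
    a0 = sum(l * d for l, d in zip(lag, diff)) / sxx
    ssr = sum((d - a0 * l) ** 2 for l, d in zip(lag, diff))
    se = (ssr / (len(diff) - 1) / sxx) ** 0.5
    return a0 / se
```

A call such as `dfgls_t(y, -13.5, trend=True)` or `dfgls_t(y, -7.0)` returns a statistic to be compared with critical values of the kind tabulated in Table I; serially correlated errors would call for adding lagged differences to the final regression, which this sketch omits.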
5. FINITE SAMPLE PERFORMANCE
based
I. MA(1): ($\theta$ = .8, .5, 0, -.5, -.8),
II. AR(1): ($\phi$ = .5, -.5),
III. GARCH MA(1): ($\theta$ = .5, 0, -.5).
In each of these models the initial condition was $u_0 = 0$. Although the null
distributions of the test statistics considered here are invariant to the initial
condition, small-sample power typically depends on $u_0$. This dependence is
investigated by considering a variant of the first model where the $\{u_t\}$ are strictly
stationary under the alternative hypothesis. That is, $u_0$ is normal with mean zero
and variance equal to $(1 + \theta^2 - 2\theta\alpha)/(1 - \alpha^2)$, $\alpha \neq 1$. This design violates our
Condition C and is intended to shed light on the importance of that assumption.
The autocorrelation structure of $\{v_t\}$ was assumed to be unknown to the
A Monte Carlo experiment was conducted to see how well the asymptotic
(13) $\hat\omega^2_{AR} = \hat\sigma^2_\eta \Big/ \Big(1 - \sum_{i=1}^{p} \hat a_i\Big)^2,$
where
(14) $\Delta y_t = a_0 y_{t-1} + a_1 \Delta y_{t-1} + \cdots + a_p \Delta y_{t-p} + \cdots + \eta_t.$
Two choices of lag length were employed: the AR(8) estimator used $p = 8$ and
the AR(BIC) estimator used $p$ chosen by the Schwarz (1978) Bayesian information criterion constrained so $3 \le p \le 8$. The SC estimators are given by
$\hat\omega^2_{SC} = \sum_{m=-l_T}^{l_T} K(m/l_T)\,\hat\gamma(m),$
where $K(\cdot)$ is the Parzen kernel, $\hat\gamma(m) = T^{-1}\sum_{t=|m|+1}^{T} \hat e_t \hat e_{t-|m|}$, and $\hat e_t$ is the residual
from an OLS regression of $y_t$ on $(y_{t-1}, z_t)$. Two variants were employed: SC(12)
using $l_T = 12$ and SC(auto) using Andrews' (1991) optimal automatic procedure
(his equations (6.2) and (6.4)).
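The sum-of-covariances (SC) estimator with a Parzen kernel is simple to code. The sketch below takes a residual series as given (how the residuals are produced is outside the example) and uses a fixed truncation lag, as in the SC(12) variant; the function names are hypothetical, and the automatic bandwidth rule of Andrews (1991) is not implemented here.

```python
def parzen(x):
    """Parzen kernel weight for a scaled lag x."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x * x + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def autocov(e, m):
    """Sample autocovariance of the residuals at lag m, divided by T."""
    m = abs(m)
    return sum(e[t] * e[t - m] for t in range(m, len(e))) / len(e)

def omega2_sc(e, lag_truncation):
    """Kernel-weighted sum of autocovariances; uses symmetry of gamma_hat(m) in m."""
    total = autocov(e, 0)
    for m in range(1, lag_truncation + 1):
        total += 2.0 * parzen(m / lag_truncation) * autocov(e, m)
    return total
```

For white-noise residuals with unit variance, `omega2_sc(e, 12)` should be close to one in large samples, since all autocovariances beyond lag zero are near zero.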
The results are summarized in Table II for a constant mean and in Table III
for a linear trend. Tests were at the 5% asymptotic significance level and the
sample size $T$ was 100. For $\alpha = 1$, the tables report the observed rejection rates
from 5000 Monte Carlo replications when critical values were based on the
limiting distributions. For $\alpha < 1$, the tables report size-adjusted power; this is the
rejection rate when critical values are estimated from the $\alpha = 1$ Monte Carlo
trials.
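The distinction between observed size and size-adjusted power can be made concrete with a short sketch. It assumes the test rejects for small values of the statistic (as the $P_T$ and Dickey-Fuller $t$ tests do) and takes simulated statistics as plain lists; the function names are hypothetical.

```python
def empirical_size(null_stats, asymptotic_crit):
    """Rejection rate under the null when the asymptotic critical value is used."""
    return sum(s < asymptotic_crit for s in null_stats) / len(null_stats)

def size_adjusted_power(null_stats, alt_stats, level=0.05):
    """Rejection rate under the alternative with the critical value taken from null draws."""
    ordered = sorted(null_stats)
    crit = ordered[int(level * len(ordered))]        # empirical level-quantile under the null
    return sum(s < crit for s in alt_stats) / len(alt_stats)
```

Here `null_stats` would hold the statistics from the $\alpha = 1$ replications and `alt_stats` those from a given $\alpha < 1$ design.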
The results suggest three conclusions. First, the predicted superiority of the
tests using local-to-unity estimates of the mean and trend parameters is borne
out by the Monte Carlo study. The $P_T$ and modified Dickey-Fuller tests have
higher size-adjusted power than the standard Dickey-Fuller $t$ test for almost all
of the data generating processes and all choices of $\hat\omega^2$. The improvement is
largest in the constant mean case. Although the observed power curves tend to
be somewhat below the asymptotic power curves, the results are generally
consistent with the predictions of the asymptotic theory. The main exception is
the poor performance of the point-optimal tests using SC estimates of $\omega^2$ when
the MA parameter $\theta$ is large.
Second, the choice of estimator for $\omega^2$ has a large effect on the size of the $P_T$
tests, with the AR estimator exhibiting much smaller distortions than the SC
estimator. This mirrors similar results found for other unit-root statistics; see,
for example, DeJong et al. (1992) and Perron (1996). The AR(8) and AR(BIC)
tests have moderate size distortion except in the MA model with large $\theta$. The
modified Dickey-Fuller tests have notably smaller size distortions than those
based on $P_T$. In addition, the tests based on the AR(BIC) estimator have better
size-adjusted power than those based on the AR(8) estimator, which typically
estimates more nuisance parameters. Other experiments not reported in Tables
II or III indicate that the AR(BIC) tests also dominate the ones based on the
AR(4) estimator for $\omega^2$. Lag length selection based on sequential likelihood
ratio statistics was also tried; no general improvement over AR(BIC) was found,
although the LR selector appears to improve the size-adjusted power of the
modified Dickey-Fuller test relative to BIC in the linear trend case, at least for
small values of $\theta$.
Third, the powers of the $P_T$ and modified Dickey-Fuller tests deteriorate
substantially when the $u_t$ are stationary. Even so, in the linear trend case with
Asymptotic
Power
Slalistic
Pi$)
AR(8)
1.00
.95
.90
.80
.70
.05
.32
.'t6
1.00
1.00
Pi$)
1.00 .os
AR(BIC) .95 .32
.90 .76
.80 1.00
.70 1.00
1.00
.05
.95
.32
.90
Pr(J)
SC(auto)
.80
r.00
.70
1.00
l.oo
.95
.90
.80
..10
A(.5) 1.00
AR(8)
.95
.90
.80
.70
DFGLS
P(.5) 1.00
AR(BIC)
.95
.90
.80
.,to
DFGLS
1.00
.95
.90
.80
.'lo
.os
.32
.76
1.00
1.00
.05
.32
..15
1.00
1.00
0.18
0.31
0.47
0.56
o.t4
0.24
0.50
0.82
0.92
0.o2
0.29
0.64
0.96
1.00
0.04
0.30
0.67
0.97
1.00
0.05
0.21
0.42
0.68
0.80
1.00
0.10
0.26
0.56
0.87
0.96
.05
.12
.31
.85
l 00
0.08
0.11
0.23
0.55
0.76
.05
.32
.'t5
1.00
TEsrs oF THE
CoNsTANT
Mrar (2,:
MAo), A
0.18
AUTOREGRESSIVE
1),
0J
REsuLTs
GARCH MA(l),
 0.5
0.20
o.17
0.29
0.46
0.53
0.21
0.18
0.30
0.48
0.s6
0.11
0.27
0.57
0.89
0.97
0.10
0.27
0.56
0.88
0.96
0.13
0.26
0.s4
0.86
0.95
0.8
0.20
0.18
0.31
0.48
0.65
0.97
1.00
0.04
0.30
0.68
0.98
1.00
0.06
o.23
0.43
0.10
0.28
0.59
0.91
0.98
0.84 0.91
0.98
0.08
0.59
0.92
0.98
0.06
0.10
0.22
0.56
0.7e
0.06
0.10
0.20
0.46
0.67
.99
1.oo
.05
0.i3
.95
.10
0.18
0.3?
.90
.27
0.36
0.69
.80
.81
0.65
0.83
.10
.99
o.82
0.10
o.tl
0.36
0.69
0.86
1.00
.05
.10
.90
.27
.80
.81
.'10
.99
0.00
0.10
0.25
0.65
0.87
0.00
0.11
0.26
0.69
0.91
0.57
.95
1.oo
.05
.10
.90
.27
.80
.81
.'70
.99
0.01
o.12
0.30
o.7'1
0.9'1
0.01
0.11
0.30
0.79
0.97
0.00
0.11
0.27
0.69
0.93
0.26
.95
1.00
.05
,10
0.04
0.08
0.16
0.30
0.41
0.0s
0.09
0.7't
0.31
0.42
0.05
0.09
0.7't
0.33
0.45
0.04 0.09
0.10 0.11
o.20 0.25
0.40 0.s3
0.56 0.68
0.0s
0.09
0.15
0.28
0.37
0.05
.95
0.11
0.11
0.23
0.53
0.75
0.08
0.10
0.23
0.57
0.80
0.07
0.10
0.24
0.61
0.84
0.11 0.58
0.11 0.12
0.28 0.2't
0.72 0.70
0.94 0.91
0.06
0.10
0.22
0.48
0.69
0.07
0.10
0.09
0.16
0.36
0.5'1
0.07
0.08
0.74
0.36
0.s8
0.05
0.08
0.14
0.30
0.48
0.06
0.68
0.64
0.68
0.59
o.97
0.95
0.99
0.98
0.80
1.00
0.74
0.o7
0.34
0.31
0.31
0.73
0.04
0.30
0.68
0.69
0.99
o.97
1.00
0.06
0.87
0.82
,,10
0.51
0.7't
o.gz
0.s1
0.28
0.74
o.li
0.46
0.08
0.28
0.62
0.95
1.00
0.05
0.11
0.2s
0.65
0.89
0.02
o.17
0.36
0.63
0.76
Pi$)
AR(8)
Pi(.s\
0.10
AR(BIC)
0.17
Pi$)
0.07
ss(12)
0.17
0.40
0.'13
0.85
0.70
0.04
0.17
0.39
0.44
0.98
0.98
0.80
1.00
0.'12
1.00
1.00
0.8s
0.91
0.06
o.22
0.43
0.69
o.82
0.06 0.07
0.22 0.23
0.44 0.4'1
0.71 0.78
0.83 0.90
0.06
0.14
o.25
0,40
0.46
0.09
0.26
o.57
0.90
0.98
0.08 0.1 I
0.27 0.26
0.59 0.61
0.92 0.95
0.98 1.00
0.08
o.17
o.37
0.66
o.79
0.80
0.07
0.10
o.23
0.54
o.77
0.06 0.08
0.06
0.06
Pi(s)
0.06
Sc(auto)
0.19
DFGLS'(.5)
0.06
AR(8)
.
0.14
0.25
0.40
DFGLS'(.5)
0.07
AR(BIC)
0.16
0.37
0.68
0.10 0.13
0.11
0.r1
0.23 0.29
0.24
0.24
0.58 0.73
0.57
0.60
0.82 0.93
0.80
0.84
replications,
Power
.95
.05
.10
1.00
.90
.80
.81
.'70
.99
1.00
.05
.95
.10
.90
.27
.80
.81
.70
.99
1.00
.05
.9s
.09
.90
.19
.80
.61
.'70
.94
0.5
0.5
0.16
0.35
0.68
0.84
0.11
0.26
0.61
0.71
0.12
0.32
0.86
0.99
0.09
0.18
0.3s
0.48
GARCH MA(l),
Staiionary MA(l),
0J
0.19
0.14
o.23
o.37
0.46
0.o 05
0.19 0.13
0.15 0.15
0.25 0.25
0.40 0.42
0.48 0.51
0J
0.18
0.13
0.23
0.38
0.46
0"0
0J
0.18
0.13
0.24
0.39
0.48
0.14
0.13
0.11
0.17
0.34
0.66
0.83
0.08
0.16
0.34
0.68
0.84
0.06
0.17
0.37
0.73
0.89
0.10
0.14
0.29
0.61
0.78
0.07
0.14
0.30
0.63
0.81
0.05
0.00
0.11
0.27
0.68
0.89
0.03
0.12
0.29
0.74
0.93
0.77
0.11
0.20
0.31
0.25
0.00
0.09
0.21
0.51
0.73
0.03
0.10
0.25
0.64
0.85
0.77
0.01
0.11
0.30
0;76
o.97
0.05
0.11
0.30
0.81
0.98
0.51
0.11
0.30
0.80
0.97
0.01
0.10
0.23
0.62
0.87
0.04
0.10
0.25
0.70
0.93
0.49
0.05
0.08
0.15
0.31
0.43
0.05
0.09
0.16
0.32
0.45
0.0s
0.09
0.18
0.38
0.53
0.05
0.08
0.13
0.22
0.30
0.05
0.08
0.13
0.23
0.31
0.04
0.08
0.10
0.23
0.56
0.78
0.06
0.10
0.24
0.59
0.82
0.11
0.11
0.26
0.69
0.91
0.08
0.09
0.19
0.46
0.67
0.07
0.09
0.19
0.49
0.71
0.11
0.07
0.08
0.14
o.34
0.55
0.06
0.08
0.14
0.37
0.60
0.09
0.09
0.18
0.48
0.78
0.o7
0.08
0.15
0.36
0.58
0.05
0.08
0.15
0.39
0.64
0.09
0.25
0.42
0.50
0.14
0.32
0.68
0.86
0.09
0.16
0.25
0.20
0.10
0.24
0.63
0.82
0.09
0.15
0.26
0.33
0.47
DFz"
AR(BrC)
Notes: For each statistic, entries in the first row are the empirical
rejection rate under the null (the size). The remaining entries are
size-adjusted power under the model described in the column heading.
"Asymptotic power" is the local-to-unity asymptotic
power for each statistic. The entry below the name
of each ... (see Section 5). For the ...
in the final three columns, $u_0$ was drawn from its stationary distribution.
Based on 5000 Monte Carlo
0.05
.81
0.30
0.66
o.tz
0.10
0.16
0.37
0.60
0.76
.2't
.80
0.08
0.24
.90
0.13
0.03
0.31
0.18
0.16
0.26
0.41
0.49
AR(l), d 05 oJ
0.18 0.14 0.13 0.21 0.17
0.16 0.16 0.13 0.15 0.15
0.26 0.2't 0.25 0.2s O.24
0.41 0.44 0.45 0.38 0.38
0.s0 0.53 0.42 0.45 0.46
0.25
0.20
o.29
0.18
MA(l).0:

0.42
0.33
0.99
Asymptotic
Tesl
Statistic
0.8
0.16
0.15
0.26
0.40
0.48
0.0
0.29
0.70
0.27
0 Srationary
0.5  0.5 0.0
05
0.29
829
100
":
ARo), 6
o5
0.02
UNIT ROOT
TABLEIII
sslrcreo
LevuTEsrs,
0.8
STOCK
0.10
0.25
0.63
0.88
0.08
0.15
0.42
0.69
0.09
0.21
0.54
0.76
0.09
0.18
0.52
0.81
$\theta = 0$, the size-adjusted powers of the tests using local detrending exceed that of
DF-$\hat\tau^\tau$. In the constant mean case, the size-adjusted powers of tests using local
demeaning exceed that of DF-$\hat\tau^\mu$ for close but not distant alternatives. The
gains from employing local-to-unity estimates of the intercept appear to depend
crucially on the assumption that, under both the null and the alternative
let
O.ifttrcelementsoftheTxluectot'zandofttrcTxTmatr*Aareboundedinabsoluieualue,then,
(AD
Pnoor:
nt
s l  f
 *)z:
T@
/(
hence,
lll<KT.
>  _ V ) zl
lxt (
)f
llOll
O,
A2:
I
(l)
if c : T(a
posiriue and,
 l)
'u',>'il
is
*,
fued as T 
,a'l
St.,
O.
under Condition
rs
P^
,)ll.
'u',u
O.
Since
<f(l) <M;
T
94720, U.S.A.,
Let $\gamma(k)$ be the autocovariance function and $f(\lambda)$ the spectral density function for the stationary
process $\{v_t\}$ satisfying Condition A. The $rs$ element of the $T \times T$ covariance matrix $\Sigma$ is $\gamma(r - s) = \int_{-\pi}^{\pi} e^{i(r-s)\lambda} f(\lambda)\,d\lambda$. We shall approximate $\Sigma^{-1}$ by the $T \times T$ matrix $\Psi$ with $rs$ element $\psi(r - s) = \int_{-\pi}^{\pi} e^{i(r-s)\lambda} [4\pi^2 f(\lambda)]^{-1}\,d\lambda$. The $rs$ element of $D$ is given by Davies (1973) and Dzhaparidze (1985) as
tim rrx'(.5r
T)@
lD'Al<KTt/zllDll, and
D Dia,,l.2Lly?)lk
D lp(ilt..
j=@
rls:l
kt
under ConditionA,
accurate, these tests are essentially optimal among tests based on second-order
sample moments and should perform considerably better than tests which
employ OLS estimates of the parameters determining $d_t$. Our Monte Carlo
results suggest that the Dickey-Fuller $t$ test applied to a locally demeaned or
detrended time series, using a data-dependent lag length selection procedure,
has the best overall performance in terms of small-sample size and power.
The numerical finding that, as a practical matter, the asymptotic power
functions of the $P_T(.5)$ and the modified Dickey-Fuller $t$ tests effectively lie on
the Gaussian power envelope indicates that, in large samples, there is little
room for improvement under the stochastic specification made here. Of course,
if the errors have a known nonnormal distribution or if the initial error $u_0$ is
large compared to $\omega$, better tests could be constructed. Furthermore, the Monte
Carlo evidence suggests that autocorrelation in the $v_t$ can have very substantial
effects in small samples. Nevertheless, it appears that, when parameters in the
deterministic component of a series have to be estimated, the proposed tests for
a unit root dominate those currently in common use.
final
TT
tlDll:
LEMMA A1: Let 2 and 9 be TxT Toeplilz matices formed from y(k) and p(k), the Fouier
coeficients of 2rf(l) andl2rf(),))r, tcspectiuely. Let xbe a Tx 7 uectorsuch that lim,Tllxl:
1992;
B:lbijl,let
=LLlf:lbijl,<
$\|B\|$
(A1)
6. CONCLUSIONS
The $P_T$ and modified Dickey-Fuller $t$ statistics are easily computed from least
squares regressions.
and
Kennedy School of Government, Harvard University, 79 John F. Kennedy
Cambridge, MA 02138, U.S.A.
hypotheses, only the early observations are informative about that parameter.
cA
UNIT ROOT
'i
:i
a7(k)
T_K T
=T' L L o,,"o,*0," :
TtlRAl
0 as 7+ o. Define
TK
T2(7 + cTr)k
,:1 s:1
a7(k)("2'l2c)/(2c)z
when
ar(k)
E (t + crr;2{'"(T  k  r)'
c*0
and
a7(k)+l/2when c:0,
832
s.=E=l_r*fl(k)o2
Since
Tztt[A,(>srr)A]:2L
and
AUTOREGRESSIVE
rr=L[*r_r+tp(k)
y(k)lar(&) a7(0)l 
t1
Tl
T2trlA, (V  rrl) A) : 2  p(k)ta7(k)
_ a1(0)l
k:1
0,
where
T2
0.
1A
o)
>A
and B, then
(A4)
(As)
plimT2 d' 1 Z
where
d_,
(0, db
..
.,
dr_
, :
I
plim
and Atl
d,_ r
(db d2
S r 4a
dr,
A, A]
0.
dr_ rl
g,
p(o)r
,o?
,f
T1
o?
[d,d,*od,_1d,_111,1
Tr
ad,\padl
 r)2
 a d'
,
>
tA
ZA,
E
 t < T
e2ktr ( E ) 12 (
E
 e
which.implies that plimT2d't>1u_t:0. For the second part of (A5), note that Au_ul
cTtu, so it suffices toverifoihat uur(Trd,_r2rul:T_rd,_1it_l]b.;;"li,Ttad,ad
+ 0 and lD'Al37r/z"tctllDll imply Tt ad,(2t _ v)u_r 5 0 ,in""
)l
d.
rl2
T
ld,1l,
d, _
rp
u_
rl < T
o)  a2lQr(d)  Qr\)l
t<T
k_@
p&)l L
o,
where $x_T$ is the last column of $X'$. Under Condition A, $T^{-1/2} u_{[sT]} \Rightarrow \omega W_c(s)$. By the continuous
mapping theorem and the Ito calculus,

/2
xE + hO, ilw,o)
Io'
n tt, e) + eh(s,
wherelr(s,c)isthevectorconsistingofthe4lfunctions
e))w,( s) ds
:I
h?)ldw,
 ew,l,
h,(s,e),H(s,d) isthevectoroffirst
derivatives dh/s,e)/ds, and the time index is dropped in the final term for notational convenience.
Since Iim T tZ'Z is block diagonal, we find that Qy(a) + u? + o2Q(d) vrhere
orrrroo rw))
o@ =Uh.aaw, ew,)f'
Un<etn,<ttf'11
Setting a to one, the same argument shows that QrG)  Qr1) * azlQc)  O(0)1.
The argument now follows the proof of Lemma A2. Because the elements 2,i of Z
polynomials in t, for all k and for all (i, j) pairs except i:i:1,
the terms
arc
Tk
2,,i21
a,il
rrZ'Z)
o.
mryl,)
t<T
o.
ana
+0and o2TtxtEtxaG(e).similarly,Tt(x'>x.2x'x)+0,so?llRxl20,whereR
G10)
k_
QY
and
o T
EIT d'  2
G) 
,,
Tk
+2\ oG)T, 
kl
,1
in (5).
T2d,_t>rd_1<r(>t)T2dt_i_t<r(Dt)max,.rdlyT0.
II Ad,(rt  ild_t:0. But, defining do =b, wi have
lzTt Ad,pd_i:7rld,vd d,_tpd_r Ad,1q.Adl
I
is given
uo
oY
!ZG'Z)'Z't
(,{6)
Z,
and
pnoor: Define the qxq diagonal matrix Nr: diag(j.1/2,7,r1,...,7,n') and the f x4
matru Z:ZuNr. Then, setting 6: AudTtu1:uf (ce)Ttu, we have QrG):
column of
plim T  Ad, E 1 u _,
dr
...,
Qy
2At
uo:(upuzdu1,...tu7au711
(A8)
Leuva A3: If
t
Q7k) : u'oZolZZ"l Z2u,,
LEMMA A4: Suppose the data are generated by (1) under Condition A and $d_t = \beta' z_t$, where
$z_t' = (1, t, t^2, \ldots, t^{q-1})$. Then $Q_T(\bar\alpha) - Q_T(1)$ has a limiting distribution when $c = T(\alpha - 1)$ and $\bar c = T(\bar\alpha - 1)$
are fixed as $T$ tends to infinity. Furthermore,
(43)
Define
(AT
T1
UNIT ROOT
So:7rtz*'rrt .211u !
O,
tr[E(s+sl)]
Definin_g
x'b) 3
(Ai1)
the chisquare vaiate Xz(u)=(e,rltu)2/e\Ire1, we find that 1*,2tE)z7xtEtxT tZ'> tZ is block diagonal, we have
0. Since lim
OYG): a2er(d)
 .ru? + oo0).
+ x2(u)
on d,
2Ttu', Au=T11u2
Au'
Aul:ftul.
(82)
e2
T
Lt
, E
, 
ZeT
a2T (d'_ t
2
d_
i2
d'_
 tt'
I
E' Au 
e2
y(0).
W,
/u)
converges in probability
(l)  ll.
u_
r)
2eT
t(
Ad,
>
Recalling that
L :
Ii :
B)
y;l 2,
minrL(u, B)
T
a:
>  Au + Ad, E
u_
u,o
E
uo
 e{
(a).
(83)
(T2u'p
u'
 Et
 1,7
2eT
u' _ t
uzr,grG)  ereD
I4 tt
=e2
Iw,'aw"z(7)
+ g(o)
 ,r([w,r,w,2(t),eG) O(0)),
 eG) + c.
For polynomial trend, $L^*$ is a function of $c$, $\bar c$ and $q$; it does not depend on $\beta$ at all. When $q = 1$ so
$z_t = 1$, $Q(\bar c) = 0$ and the result is the same as in part (a).
(c) When $y_t = \beta_0 + \beta_1 t + u_t$, the function $h(s,\bar c)$ in (A10) is given by $(1 - \bar c s)$ so $\int h(\bar c)^2 = 1 - \bar c + \bar c^2/3$
and $\int h(\bar c)(dW_c - \bar c W_c) = (1 - \bar c)W_c(1) + \bar c^2\int sW_c(s)$. After considerable algebraic manipulation,
we find the following alternative expressions for (B4):
(Bs)
t e : r, I r:
(1,
e)w,2 O)
e2
/3),
azPr  s(d)
: izT
u'
,'ouo
 Qrk),
s(1)(1 + er t)
 ru  t
Or( d).
P,
a'.'le'lw,'
 a(e)]
ew,'<t>+ o(o)
= Lx
o2.
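The two closed forms used for the linear-trend case in part (c) can be checked on a discretized path. The sketch below is only an illustration under stated assumptions: $W_c$ is approximated by an Euler scheme for $dW_c = cW_c\,ds + dB$ with $W_c(0) = 0$, integrals are replaced by left Riemann sums, and the function name and parameter values are hypothetical. It compares $\int h(\bar c)(dW_c - \bar c W_c)$ with $(1 - \bar c)W_c(1) + \bar c^2\int sW_c(s)$ for $h(s,\bar c) = 1 - \bar c s$, and $\int h(\bar c)^2$ with $1 - \bar c + \bar c^2/3$.

```python
import numpy as np


def linear_trend_checks(c, c_bar, n_steps=100_000, seed=0):
    """Return (lhs, rhs, integral of h^2) on one simulated path (illustrative)."""
    rng = np.random.default_rng(seed)
    ds = 1.0 / n_steps
    s = np.arange(n_steps) * ds                    # left endpoints of the grid
    dB = rng.standard_normal(n_steps) * np.sqrt(ds)

    # Euler approximation of the diffusion dW_c = c W_c ds + dB, W_c(0) = 0
    W = np.zeros(n_steps + 1)
    for i in range(n_steps):
        W[i + 1] = W[i] + c * W[i] * ds + dB[i]

    h = 1.0 - c_bar * s                            # h(s, c_bar) for a linear trend
    dW = np.diff(W)

    # Left side: integral of h (dW_c - c_bar W_c ds)
    lhs = np.sum(h * dW) - c_bar * np.sum(h * W[:-1]) * ds
    # Right side: (1 - c_bar) W_c(1) + c_bar^2 * integral of s W_c(s) ds
    rhs = (1.0 - c_bar) * W[-1] + c_bar**2 * np.sum(s * W[:-1]) * ds
    h_sq = np.sum(h**2) * ds
    return lhs, rhs, h_sq


if __name__ == "__main__":
    c, c_bar = -5.0, -13.5                         # illustrative values only
    lhs, rhs, h_sq = linear_trend_checks(c, c_bar)
    print("stochastic integral, both forms:", lhs, rhs)
    print("integral of h^2:", h_sq, "closed form:", 1 - c_bar + c_bar**2 / 3)
```

On the grid the two forms differ only by a term of order $\bar c\,W_c(1)/n$, which comes from discrete summation by parts, so the agreement tightens as `n_steps` grows.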
PROOF OF THEOREM 3: It suffices to show that $\hat\omega^2 P_T \to_p 0$ as $T \to \infty$ with $|\alpha| < 1$ held fixed.
Since the initial condition is asymptotically negligible, $\{u_t\}$ behaves like a stationary process. Cf.
Anderson (1971, Section 5.5.2). Thus $T^{-2}u_{-1}'u_{-1} \to_p 0$ and $T^{-1}u_T^2 \to_p 0$. From (A9), $T^{-1/2}X'\tilde u \to_p$
0, so both
+ ?1c, we find
where Q is defined in (A10). Thus, from (A8) and the argument leading to (B2),
(84)
(86)
Z  Au + ey e)
 Oy G).
For the polynomial trend of Lemma A4, we have from the continuous mapping theorem,
c2
d _ L + d, _ 
to the statistic on the left of (B2). But, from Lemma A3, these terms converge in probability to zero
under Condition B, so the limiting distribution is unchanged.
(b) From standard GLS projection theory and (A7)
min L(o,
(.B7)
>
I : o  d)/(l 
The limiting results (A10) and (B3) follow from the fact that $T^{-1/2}u_{[Ts]} \Rightarrow \omega W_c(s)$ and that
$T^{-1}v'v \to_p \gamma(0)$. Since these limits also follow from Condition C, we have
ftrs,
elw,2
$\gamma(0) + o_p(1)$,
where $V_c(s,\bar c)$ is the limiting representation of the standardized detrended series, scaled by $T^{-1/2}\omega^{-1}$. By Lemma
A4, OLS estimates of $\beta$ in the polynomial trend case are asymptotically equivalent to GLS
estimates. Thus the same interpretation holds when detrending is done with OLS.
The statistical theory underlying Theorem 1 can be found in Lehmann (1959, Chapter 6). The
limiting representations for the test statistics are derived as follows:
(a) Under Condition A, $T^{-1/2}u_{[Ts]} \Rightarrow \omega W_c(s)$ and hence $\omega^{-2}(T^{-2}u_{-1}'u_{-1},\ T^{-1}u_T^2) \Rightarrow (\int W_c^2,\ W_c^2(1))$.
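The convergence in part (a) can be illustrated by simulation. The sketch below is a minimal Monte Carlo check, assuming i.i.d. standard normal errors (so that $\omega = 1$), a zero initial condition, and a nonzero local-to-unity parameter $c$; the function names are hypothetical. It compares the sample means of $T^{-2}\sum u_{t-1}^2$ and $T^{-1}u_T^2$ with the means of $\int W_c^2$ and $W_c^2(1)$, which for the diffusion $dW_c = cW_c\,ds + dW$ with $W_c(0) = 0$ are $(e^{2c} - 1 - 2c)/(4c^2)$ and $(e^{2c} - 1)/(2c)$.

```python
import numpy as np


def simulate_functionals(T, c, reps, seed=0):
    """Simulate u_t = (1 + c/T) u_{t-1} + v_t with u_0 = 0 and v_t ~ N(0, 1).

    Returns arrays of T^{-2} * sum_t u_{t-1}^2 and T^{-1} * u_T^2 over replications.
    """
    rng = np.random.default_rng(seed)
    alpha = 1.0 + c / T
    u = np.zeros(reps)                 # u_0 = 0 in every replication
    sum_sq_lag = np.zeros(reps)
    for _ in range(T):
        sum_sq_lag += u**2             # adds u_{t-1}^2 before updating to u_t
        u = alpha * u + rng.standard_normal(reps)
    return sum_sq_lag / T**2, u**2 / T


def ou_moments(c):
    """Means of the integral of W_c^2 and of W_c(1)^2 for c != 0."""
    mean_terminal = (np.exp(2 * c) - 1) / (2 * c)
    mean_integral = (np.exp(2 * c) - 1 - 2 * c) / (4 * c**2)
    return mean_integral, mean_terminal


if __name__ == "__main__":
    c = -7.0                           # illustrative value only
    integral_stat, terminal_stat = simulate_functionals(T=400, c=c, reps=4000)
    print("simulated means:", integral_stat.mean(), terminal_stat.mean())
    print("limiting means: ", ou_moments(c))
```

Matching means is of course only a partial check of convergence in distribution; comparing quantiles against a finely discretized $W_c$ would be a natural extension.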
 0  e + e, fil
tf<r
.)w,(1) *
REFERENCES
ANDERSON, T. W. (1971): The Statistical Analysis of Time Series. New York: Wiley.
ANDREWS, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix
Estimation," Econometrica, 59, 817-858.
BANERJEE, A., J. J. DOLADO, J. W. GALBRAITH, AND D. F. HENDRY (1993): Co-integration, Error
Correction, and the Econometric Analysis of Non-stationary Data. Oxford: Oxford University Press.
BHARGAVA, A. (1986): "On the Theory of Testing for Unit Roots in Observed Time Series," Review
of Economic Studies, 53, 369-384.
CHAN, N. H., AND C. Z. WEI (1987): "Asymptotic Inference for Nearly Nonstationary AR(1)
Processes," Annals of Statistics, 15, 1050-1063.
DAVIES, R. B. (1973): "Asymptotic Inference in Stationary Gaussian Time-Series," Advances in
Applied Probability, 5, 469-497.
DEJONG, D. N., J. C. NANKERVIS, N. E. SAVIN, AND C. H. WHITEMAN (1992): "The Power Problems
of Unit Root Tests in Time Series with Autoregressive Errors," Journal of Econometrics, 53,
323-343.
r,
l rw,<rlf'
DICKEY, D. A.,
cient in Linear Regressions with Stationary or Nonstationary AR(1) Errors," Journal of Econometrics, 47, 115-143.
DZHAPARIDZE, K. (1985): Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series. New York: Springer-Verlag.
ELLIOTT, G. (1993): "Efficient Tests for a Unit Root when the Initial Observation is Drawn from its
Unconditional Distribution," unpublished manuscript.
ENGLE, R. F. (1984): "Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics," in
Handbook of Econometrics, Vol. II, ed. by Z. Griliches and M. Intriligator. New York: North
Holland.
SAIKKONEN, P., AND R. LUUKKONEN (1993): "Point Optimal Tests for Testing the Order of
Differencing in ARIMA Models," Econometric Theory, 9, 343-362.
SARGAN, J. D., AND A. BHARGAVA (1983): "Testing Residuals from Least Squares Regression for
Being Generated by the Gaussian Random Walk," Econometrica, 51, 153-174.
SCHWARZ, G. (1978): "Estimating the Dimension of a Model," Annals of Statistics, 6, 461-464.
SCHWERT, G. W. (1989): "Tests for Unit Roots: A Monte Carlo Investigation," Journal of Business
and Economic Statistics, 7, 147-159.
STOCK, J. H. (1994): "Unit Roots and Trend Breaks in Econometrics," in Handbook of Econometrics,
Vol. 4, ed. by R. F. Engle and D. McFadden. New York: North Holland, pp. 2740-2847.
ZYGMUND, A. (1968): Trigonometric Series, Vol. 1. Cambridge: Cambridge University Press.
ECONOMETRICA, Journal of the Econometric Society, Vol. 64, No. 4 (July, 1996). Graham Elliott, Thomas J. Rothenberg, and James H. Stock: Efficient Tests for an Autoregressive Unit Root, 813-836.