
Heteroskedasticity, Autocorrelation, and Spatial Correlation

Robust Inference in Linear Panel Models with Fixed-Effects


Timothy J. Vogelsang

Departments of Economics, Michigan State University


December 2008, Revised June 2011
Abstract
This paper develops an asymptotic theory for test statistics in linear panel models that are
robust to heteroskedasticity, autocorrelation and/or spatial correlation. Two classes of standard
errors are analyzed. Both are based on nonparametric heteroskedasticity autocorrelation (HAC)
covariance matrix estimators. The first class is based on averages of HAC estimates across
individuals in the cross-section, i.e. "averages of HACs". This class includes the well known
cluster standard errors analyzed by Arellano (1987) as a special case. The second class is
based on the HAC of cross-section averages and was proposed by Driscoll and Kraay (1998).
The "HAC of averages" standard errors are robust to heteroskedasticity, serial correlation and
spatial correlation but weak dependence in the time dimension is required. The "averages of
HACs" standard errors are robust to heteroskedasticity and serial correlation including the
nonstationary case but they are not valid in the presence of spatial correlation. The main
contribution of the paper is to develop a fixed-b asymptotic theory for statistics based on both
classes of standard errors in models with individual and possibly time fixed-effects dummy
variables. The asymptotics is carried out for large time sample sizes for both fixed and large
cross-section sample sizes. Extensive simulations show that the fixed-b approximation is usually
much better than the traditional normal or chi-square approximation, especially for the Driscoll-
Kraay standard errors. The use of fixed-b critical values will lead to more reliable inference in
practice, especially for tests of joint hypotheses.
Keywords: panel data, HAC estimator, kernel, bandwidth, fixed-b asymptotics

I am grateful to Todd Elder, Emma Iglesias, Gary Solon and Jeff Wooldridge for helpful conversations, and I
thank Silvia Gonçalves and Chris Hansen for suggestions and comments on preliminary drafts. I thank seminar
participants at Cornell, U. Michigan, Michigan State, U. Montreal, and Purdue for helpful comments. Young Gui Kim
provided excellent research assistance with the simulations and proofread earlier versions of the paper. Portions
of this research were supported by the National Science Foundation through grant SES-0525707. Correspondence:
Department of Economics, Michigan State University, 110 Marshall-Adams Hall, East Lansing, MI 48824-1038.
Phone: 517-353-4582, Fax: 517-432-1068, email: tjv@msu.edu
1 Introduction
Since the influential work of White (1980) on heteroskedasticity robust standard errors 30 years
ago, it has become standard practice in empirical work in economics to use standard errors that
are robust to potentially unknown variance and covariance properties of the errors and data. In
pure cross-section settings it is now so standard to use heteroskedasticity robust standard errors
that authors often do not indicate they have used robust standard errors. In time series regression
the use of heteroskedasticity and serial correlation robust standard errors is routine with authors
usually indicating that they used Newey and West (1987) standard errors. In panel models where
cross-section individuals are followed over time, the so-called panel cluster standard errors (see
Arellano (1987)) are appealing because they are robust to heteroskedasticity in the cross-section and
quite general forms of serial correlation over time including some nonstationary cases. Even though
panel clustered standard errors are covered by graduate level textbooks (see Wooldridge (2002)),
Bertrand, Duflo and Mullainathan (2004) found that surprisingly few empirical studies based on
panels with relatively many time series observations used clustered standard errors. The situation
in the empirical finance literature is similar, as reported by Petersen (2009).
The validity of panel clustered standard errors requires that individuals in the cross-section be
uncorrelated with each other, i.e. no spatial correlation in the cross-section. Typically spatial corre-
lation is ignored although sometimes the cross-section can be divided into groups or clusters where
it is assumed that individuals within a cluster are correlated but individuals between clusters are
uncorrelated. Then, standard errors can be configured that are robust to the cross-sectional clustering. See Wooldridge (2003) for a useful discussion of cluster methods and additional references.
A recent paper by Bester, Conley and Hansen (2011) provides an interesting analysis of cluster
standard errors in a general setting where the number of clusters is held fixed in the asymptotics.
In some cases an empirical researcher may have a distance measure for pairs of individuals in the
cross-section such that the spatial correlation is decreasing in distance. In a pure time series setting
with stationarity, distance in time is the natural distance measure. When a distance measure is
available (or can be estimated) in a spatial setting, robust standard errors can be obtained using the
approaches of Conley (1999), Kelejian and Prucha (2007), Bester, Conley, Hansen and Vogelsang
(2008) or Kim and Sun (2011) which are extensions of nonparametric kernel heteroskedasticity
autocorrelation consistent (HAC) robust standard errors to the spatial context. For the case of
linear panel models with individual and time dummy variables, a recent paper by Kim (2010)
provides results on kernel HAC standard errors. A distance measure in the cross-section is needed
to implement the approach of Kim (2010).
Suppose that a distance measure is either not readily available or is unknown for the cross-
section of the panel. If the time series dimension of the panel is stationary, then it is possible to
obtain standard errors that are robust to spatial correlation. These standard errors remain robust
to heteroskedasticity and serial correlation. When information in the time dimension is substantial,
relative to the information in the cross-section, the form of the unknown spatial correlation can be
quite general. Several approaches appropriate for this situation can be found in the literature.
One approach is to group the data, estimate the model within each group and average the
estimators across groups. As shown by Ibragimov and Müller (2010), t-statistics constructed using
these average of estimators will be robust to general forms of correlation within each group. A
special case of this approach is the Fama and MacBeth (1973) estimator of a panel model. The
model is estimated for each time period which generates a time series of cross-section estimates for
each parameter. The cross-section estimates are then averaged across time and robustness to serial
correlation can be obtained using HAC standard errors. The reason this approach is robust to
spatial correlation in the cross-section is because the cross-section estimation collapses the spatial
correlation into a single cross-section variance (the variance of a cross-section average) at each time
period.
A second approach is to estimate the panel regression by pooled OLS and use the robust
standard errors proposed by Driscoll and Kraay (1998). These standard errors are computed by
taking cross-section averages of products of the regressors and residuals and then computing a
HAC estimator with these cross-section averages. Driscoll and Kraay (1998) establish consistency
of these standard errors as the cross-section and time dimension sample sizes increase under mixing
conditions that limit the dependence in the data both in the time and cross-section dimensions.
While the Fama and MacBeth (1973) and Driscoll and Kraay (1998) approaches deliver ro-
bustness to spatial correlation and serial correlation in the panel, each approach has important
limitations in practice. If there are individual fixed-effects that are correlated with the regressors,
the Fama-MacBeth estimator is inconsistent. For that reason, one rarely sees the Fama-MacBeth
estimator used in economics applications.^1 There is a simple approach, very similar in spirit to
Fama-MacBeth, that easily handles the fixed-effects. If one averages the model across the cross-
section, a pure time series regression is obtained in the cross-section averages. As long as an
intercept is included in the averaged model, the regression estimators will be exactly invariant
to the fixed-effects and OLS will be consistent under usual exogeneity assumptions. The spatial
correlation and heteroskedasticity are collapsed by the cross-section averaging, and pure time series
HAC robust standard errors can be used to handle serial correlation in the time dimension. The
drawback to this approach is that information within the cross-section is lost upon averaging, and
this leads to less efficient estimates (perhaps substantially so) compared to pooled OLS.

Pooled OLS can be made exactly invariant to individual fixed-effects by the inclusion of individual fixed-effects dummy variables, leading to the well known fixed-effects estimator. Unfortunately,
the mixing conditions used by Driscoll and Kraay (1998) do not hold for the fixed-effects estimator, although a recent paper by Gonçalves (2011) establishes consistency of the Driscoll and
Kraay (1998) standard errors for the fixed-effects estimator for general forms of weakly dependent
cross-section correlation. No results on the consistency of Driscoll and Kraay (1998) standard errors
appear to be available in the literature when time period dummies are included in the model.

^1 The Fama-MacBeth approach was motivated by and designed for finance applications. See Petersen (2009) for a more detailed discussion.
This paper provides an in-depth analysis of the Driscoll and Kraay (1998) heteroskedasticity
autocorrelation spatial correlation (HACSC) robust standard errors in linear panel regressions
estimated by fixed-effects. Results are also provided for the case where time period dummies are also
included in the model. The analysis is carried out using the fixed-b asymptotic framework developed
by Kiefer and Vogelsang (2005) for HAC based tests. The advantage of the fixed-b approach
over the standard asymptotic approach used by Driscoll and Kraay (1998) is that fixed-b delivers
approximations for test statistics that depend on the choice of kernel and bandwidth required to
implement the HACSC robust standard errors. This is important because both choices can have a
big impact on the sampling behavior of the standard errors. Merely establishing consistency of the
standard errors, which can be difficult theoretically, will not capture any of the influence of the choice
of kernel or bandwidth on the sampling behavior of the standard errors.
The main theoretical result of the paper is to show that test statistics constructed using fixed-
effects estimators and the HACSC robust standard errors are valid and have the same fixed-b
asymptotic distributions as found by Kiefer and Vogelsang (2005) in pure time series settings.
Therefore fixed-b critical values can be used for t and Wald tests. The fixed-b result requires weak
dependence in the time dimension and holds as the time dimension sample size, T, goes to infinity.
The cross-section sample size, n, can be fixed or can go to infinity. In the fixed-n case, the
form of spatial correlation in the cross-section can take on very general forms and can be unknown.
In the large-n case the spatial correlation in the cross-section is required to be weakly dependent.
Knowledge of the distance measure itself is not required in either case. This paper does not consider
the case where T is fixed and n is large, i.e. traditional short panels. With T fixed, there is not
sufficient information in the time dimension relative to the cross-section dimension for the Driscoll
and Kraay (1998) approach to work. In this case one would need knowledge of the distance measure
in the cross-section to construct valid standard errors. Opposite to the large-T, fixed-n case, the
spatial correlation would need to have weak dependence, but the form of serial correlation could
be quite general.
The results depend on exogeneity assumptions in important ways. In the fixed-n, large-T case,
only weak exogeneity (in both the cross-section and time dimensions) is required for the fixed-effects
estimator. However, if time period dummies are also included, strict exogeneity in the cross-section
dimension is required but weak exogeneity in the time dimension is allowed. In the large-n, large-T
case, inclusion of individual fixed-effect dummies requires strict exogeneity in the time dimension
although weak exogeneity is still permitted in the cross-section. If time period dummies are also
included, strict exogeneity in the cross-section is also needed.
For completeness the paper also provides a fixed-b analysis for a general class of standard
errors that includes the cluster standard errors of Arellano (1987) as a special case. Results are
provided for large-T and fixed-n and generalize/extend Theorem 4 of Hansen (2007) to include
general kernels and bandwidths while relaxing the strict exogeneity assumption implicitly used by
Hansen (2007). This class of standard errors will only be valid when there is no spatial correlation
in the cross-section.
Because the ultimate goal of this paper is to convince empirical researchers to use HACSC
robust standard errors along with fixed-b critical values in appropriate panel settings, an important
practical contribution of the paper is to provide a simple numerical method for computing fixed-b
critical values and p-values for t-tests and Wald tests (for testing up to 50 restrictions) for the case
of the Bartlett kernel (Newey-West). This numerical method is accurate and removes the need to
simulate the fixed-b critical values on a case-by-case basis.
The remainder of the paper is organized as follows. The next section describes the models
and standard errors. Using a time series perspective, the differences and similarities between
traditional cluster standard errors and HACSC robust standard errors are highlighted. Section
3 develops fixed-b asymptotic results for tests based on the robust standard errors. The finite
sample performance of the various tests is examined in Section 4 using simulation methods. The
use of fixed-b critical values improves the performance of the tests over the standard normal and
chi-square approximations. The presence of some forms of spatial correlation causes tests based
on the traditional cluster standard errors to severely over-reject under the null hypothesis. The
HACSC robust standard errors exhibit robustness to spatial correlation unless the serial correlation
is strong relative to the time dimension sample size. Power simulations show that when there is
no spatial correlation in the model, the traditional cluster standard errors lead to tests with more
power than those based on HACSC robust standard errors. The power simulations also show that
the bandwidth choice has a noticeable impact on power. Section 5 provides an empirical application
based on divorce rate data analyzed by Wolfers (2006). Several of Wolfers' main empirical findings
are shown to be robust when HACSC robust standard errors are used, although it is shown that
inference is very sensitive to the choice of bandwidth when computing the robust standard errors.
Failure to use fixed-b critical values would lead to a misleading sense of precision in the estimates
of the parameters. Conclusions are given in Section 6 and formal proofs are provided in Appendix
A. Appendix B describes the simple numerical method for computing fixed-b critical values and
p-values.
2 Linear Panel Models: Estimation and Standard Errors
Consider a standard fixed-effects panel model given by

$$y_{it} = x_{it}'\beta + a_i + u_{it}, \quad i = 1, 2, \dots, n, \quad t = 1, 2, \dots, T, \tag{1}$$

where $y_{it}$, $a_i$ and $u_{it}$ are scalars and $x_{it}$ and $\beta$ are $k \times 1$ vectors. Often time period fixed-effects are included, which gives the model

$$y_{it} = x_{it}'\beta + a_i + f_t + u_{it}. \tag{2}$$

A more general model might include individual time trends or, if time period fixed-effects are
not included, common time trends. The asymptotic results in the paper remain unchanged when
additional time regressors are included; however, the results do not apply to the estimated coefficients
on the time trend variables themselves. The focus is on estimation and inference about $\beta$. Consider
the fixed-effects ordinary least squares (OLS) estimator of $\beta$ given by

$$\widehat{\beta} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{y}_{it}, \tag{3}$$

where in model (1)

$$\widetilde{y}_{it} = y_{it} - \bar{y}_i, \quad \widetilde{x}_{it} = x_{it} - \bar{x}_i,$$

with $\bar{y}_i = T^{-1} \sum_{t=1}^{T} y_{it}$ and $\bar{x}_i = T^{-1} \sum_{t=1}^{T} x_{it}$. In model (2) we have

$$\widetilde{y}_{it} = y_{it} - \bar{y}_i - \frac{1}{n} \sum_{j=1}^{n} (y_{jt} - \bar{y}_j), \quad \widetilde{x}_{it} = x_{it} - \bar{x}_i - \frac{1}{n} \sum_{j=1}^{n} (x_{jt} - \bar{x}_j).$$

Plugging in for $\widetilde{y}_{it}$ using (1) or (2) gives

$$\widehat{\beta} - \beta = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{u}_{it}.$$
Let $v_{it} = \widetilde{x}_{it} \widetilde{u}_{it}$ and define $\widehat{v}_{it} = \widetilde{x}_{it} \widehat{u}_{it}$, where $\widehat{u}_{it}$ are the OLS residuals given by

$$\widehat{u}_{it} = \widetilde{y}_{it} - \widetilde{x}_{it}' \widehat{\beta}.$$

Define the partial sums of $\widehat{v}_{it}$ as

$$\widehat{S}_{i[rT]} = \sum_{t=1}^{[rT]} \widehat{v}_{it},$$

where $r \in (0, 1]$ and $[rT]$ is the integer part of $rT$. The sample variance-covariance matrix used to
obtain the traditional cluster standard errors takes the sandwich form and is given by

$$T \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1} \left( \sum_{i=1}^{n} T^{-1} \widehat{S}_{iT} \widehat{S}_{iT}' \right) \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1},$$
and it is easy to show that the middle term is a special case of a more general class of variance-
covariance matrix estimators.
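To make the construction concrete, the within transformation, the estimator in (3), and the cluster sandwich can be sketched in a few lines of NumPy. This is an illustrative helper, not code from the paper; the array layout (individuals by time by regressors) and the function name are assumptions.

```python
import numpy as np

def fe_ols_cluster(y, X):
    """Fixed-effects OLS (within estimator) with the traditional
    Arellano cluster sandwich variance.

    y : (n, T) array of outcomes; X : (n, T, k) array of regressors.
    Returns beta-hat and the estimated variance of beta-hat.
    """
    Xt = X - X.mean(axis=1, keepdims=True)        # x-tilde: subtract x-bar_i
    yt = y - y.mean(axis=1, keepdims=True)        # y-tilde: subtract y-bar_i
    Q = np.einsum('itk,itl->kl', Xt, Xt)          # sum_i sum_t x-tilde x-tilde'
    beta = np.linalg.solve(Q, np.einsum('itk,it->k', Xt, yt))
    u = yt - Xt @ beta                            # OLS residuals u-hat_it
    v = Xt * u[:, :, None]                        # v-hat_it = x-tilde_it u-hat_it
    S = v.sum(axis=1)                             # partial sums S-hat_iT, (n, k)
    Qinv = np.linalg.inv(Q)
    return beta, Qinv @ (S.T @ S) @ Qinv          # cluster sandwich variance
```

In a noiseless design the within transformation removes the fixed-effects exactly, so the estimator recovers the true coefficient vector to machine precision, which makes for a quick sanity check of an implementation.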
Let

$$\widehat{\Omega}_j^{(i)} = T^{-1} \sum_{t=j+1}^{T} \widehat{v}_{it} \widehat{v}_{it-j}',$$

and define

$$\widehat{\Omega}_i = \widehat{\Omega}_0^{(i)} + \sum_{j=1}^{T-1} k\left( \frac{j}{M} \right) \left( \widehat{\Omega}_j^{(i)} + \widehat{\Omega}_j^{(i)\prime} \right),$$

which is the nonparametric kernel HAC estimator for cross-section individual $i$ using the kernel,
$k(x)$, and bandwidth $M$. An equivalent expression for $\widehat{\Omega}_i$ is given by

$$\widehat{\Omega}_i = T^{-1} \sum_{t=1}^{T} \sum_{s=1}^{T} K_{ts} \widehat{v}_{it} \widehat{v}_{is}', \tag{4}$$

where

$$K_{ts} = k\left( \frac{|t - s|}{M} \right). \tag{5}$$
Consider the sample variance-covariance matrix

$$\widehat{V}_{ave(n)} = T \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1} \left( \sum_{i=1}^{n} \widehat{\Omega}_i \right) \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1},$$

where the subscript, $ave(n)$, indicates that the "average" of the $n$ individual-by-individual HAC
estimators is used to construct the middle matrix of the sandwich. For the case where $k(x) = 1$
for $|x| \leq 1$ and $k(x) = 0$ otherwise, i.e. $k(x)$ is the truncated kernel, it is well known that
$\widehat{\Omega}_i = T^{-1} \widehat{S}_{iT} \widehat{S}_{iT}'$ when $M = T$, giving exactly the variance estimator for the traditional clustered
standard errors. Obviously, the traditional cluster standard errors are a special case of "cross-
section averages of HACs" standard errors. Petersen (2009) discusses $\widehat{V}_{ave(n)}$ in a simulation study
of robust standard errors used in finance applications.
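The equivalence of the weighted-autocovariance form of $\widehat{\Omega}_i$ and the double-sum form (4) is easy to verify numerically. The sketch below does so for the Bartlett kernel $k(x) = \max(0, 1 - |x|)$; the function names and array layout are assumptions, not the paper's code.

```python
import numpy as np

def hac_double_sum(v, M):
    """Eq. (4): T^{-1} sum_t sum_s K_ts v_t v_s' with Bartlett weights
    K_ts = max(0, 1 - |t-s|/M).  v is a (T, k) array."""
    T = v.shape[0]
    idx = np.arange(T)
    K = np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / M)
    return v.T @ K @ v / T

def hac_lag_sum(v, M):
    """Equivalent form: Gamma_0 + sum_j k(j/M) (Gamma_j + Gamma_j')."""
    T, k = v.shape
    omega = (v.T @ v) / T                        # Gamma_0
    for j in range(1, T):
        w = max(0.0, 1.0 - j / M)                # Bartlett kernel weight k(j/M)
        if w == 0.0:
            break                                # remaining weights are zero
        gamma = (v[j:].T @ v[:-j]) / T           # Gamma_j
        omega += w * (gamma + gamma.T)
    return omega
```

The averages-of-HACs variance is then obtained by summing such HAC matrices over $i$ and wrapping the result in the sandwich above; with the truncated kernel and $M = T$, each per-individual HAC collapses to $T^{-1}\widehat{S}_{iT}\widehat{S}_{iT}'$.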
The statistical properties of test statistics constructed using the traditional clustered standard
errors are well developed in the econometrics literature beginning with Arellano (1987). A recent
paper by Hansen (2007) provides a thorough analysis that extends the traditional fixed-T, large-n results to include large-T, large-n results and large-T, fixed-n results. Hansen (2007) did not
provide results for the case where time period dummies are included as in (2). A key assumption in
showing that the traditional clustered standard errors are valid is an assumption that the regressors
and the error term are independent across i. This assumption rules out the possibility of correlation
across individuals, i.e. spatial correlation in the cross-section is not allowed. When there is spatial
correlation in the cross-section, traditional clustered standard errors are no longer valid.
As shown by Driscoll and Kraay (1998), it is possible to obtain standard errors in a panel
model that are robust to general forms of spatial correlation in the cross-section. These standard
errors retain robustness to heteroskedasticity and serial correlation, although the serial correlation
needs to be weakly dependent. Stack the $k \times 1$ vectors $\widehat{v}_{1t}, \widehat{v}_{2t}, \dots, \widehat{v}_{nt}$ into an $nk \times 1$ vector $\widehat{v}_t$ with
transpose $\widehat{v}_t' = \left[ \widehat{v}_{1t}', \widehat{v}_{2t}', \dots, \widehat{v}_{nt}' \right]$. Define

$$\widehat{\bar{v}}_t = \sum_{i=1}^{n} \widehat{v}_{it}.$$
Note that $\widehat{\bar{v}}_t$ is $n$ times the cross-section average of $\widehat{v}_{it}$. Suppose we compute a HAC estimator
using $\widehat{\bar{v}}_t$ as follows:

$$\widehat{\Omega} = \widehat{\Omega}_0 + \sum_{j=1}^{T-1} k\left( \frac{j}{M} \right) \left( \widehat{\Omega}_j + \widehat{\Omega}_j' \right), \quad \widehat{\Omega}_j = T^{-1} \sum_{t=j+1}^{T} \widehat{\bar{v}}_t \widehat{\bar{v}}_{t-j}'. \tag{6}$$
An equivalent expression for $\widehat{\Omega}$ is given by

$$\widehat{\Omega} = T^{-1} \sum_{t=1}^{T} \sum_{s=1}^{T} K_{ts} \widehat{\bar{v}}_t \widehat{\bar{v}}_s',$$

where $K_{ts}$ is defined by (5). When $\widehat{\Omega}$ is used for the middle term of the estimated variance-covariance
matrix, we obtain

$$\widehat{V}_{HACSC} = T \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1} \widehat{\Omega} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \widetilde{x}_{it} \widetilde{x}_{it}' \right)^{-1}.$$
Note that $\widehat{V}_{HACSC}$ and $\widehat{V}_{ave(n)}$ are very similar except that $\widehat{V}_{HACSC}$ uses the "HAC of the cross-
section averages" whereas $\widehat{V}_{ave(n)}$ uses "cross-section averages of HACs". Note that putting full
weight on all the sample autocovariances is not an option in practice for $\widehat{\Omega}$ because in this case

$$\widehat{\Omega}_0 + \sum_{j=1}^{T-1} \left( \widehat{\Omega}_j + \widehat{\Omega}_j' \right) = T^{-1} \widehat{S}_T \widehat{S}_T' = 0,$$

using $\widehat{S}_T = \sum_{t=1}^{T} \widehat{\bar{v}}_t = \sum_{t=1}^{T} \sum_{i=1}^{n} \widehat{v}_{it} = \sum_{t=1}^{T} \sum_{i=1}^{n} \widetilde{x}_{it} \widehat{u}_{it} = 0$. Using full weights on the sample autocovariances
gives an estimator that is identically zero for any data set.
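The "HAC of averages" construction, and the exact-zero result for full weights, can both be checked numerically: sum the $\widehat{v}_{it}$ across $i$, apply the kernel weights of (5), and observe that setting $K_{ts} = 1$ for all $t, s$ returns a zero matrix because the cross-section sums add to zero over $t$. A sketch under assumed array layouts; the demo data only mimics the orthogonality property that actual OLS residual products satisfy.

```python
import numpy as np

def dk_middle(vhat, K):
    """Driscoll-Kraay middle matrix: kernel HAC of the cross-section sums.
    vhat is (n, T, k); K is the (T, T) matrix of kernel weights K_ts."""
    vbar = vhat.sum(axis=0)                # cross-section sums v-bar-hat_t, (T, k)
    return vbar.T @ K @ vbar / vbar.shape[0]

# Demo data: each individual series sums to zero over t, mimicking the
# OLS orthogonality sum_t sum_i x-tilde_it u-hat_it = 0.
rng = np.random.default_rng(1)
n, T, k, M = 6, 50, 2, 8
vhat = rng.standard_normal((n, T, k))
vhat -= vhat.mean(axis=1, keepdims=True)

idx = np.arange(T)
K_bartlett = np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / M)
omega = dk_middle(vhat, K_bartlett)        # usable middle matrix

# Full weight on every autocovariance (K_ts = 1) gives exactly zero:
assert np.allclose(dk_middle(vhat, np.ones((T, T))), 0.0)
```

The final assertion is the identity $T^{-1}\widehat{S}_T\widehat{S}_T' = 0$ in matrix form: with all weights equal to one, the double sum collapses to the outer product of a zero vector.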
An interesting twist on kernel HAC estimators that are special cases of $\widehat{V}_{ave(n)}$ and $\widehat{V}_{HACSC}$ has
been analyzed recently by Bester et al. (2011) (BCH). BCH consider covariance matrix estimators in
general situations where the data can be divided into clusters that are asymptotically independent.
In the time dimension the data can be clustered in the following general manner. Divide the time
dimension into $G$ contiguous (non-overlapping) exhaustive groups (time clusters) corresponding to
the time periods $1, \dots, [\lambda_1 T], [\lambda_1 T] + 1, \dots, [\lambda_2 T], \dots, [\lambda_{G-1} T] + 1, \dots, T$ where $0 < \lambda_1 < \lambda_2 < \dots < \lambda_{G-1} < 1$. Set $\lambda_0 = 0$ and $\lambda_G = 1$. Let $K_{ts} = 1$ if time periods $t$ and $s$ are in the same
time cluster and let $K_{ts} = 0$ otherwise. Plugging into expressions (4) and (6) leads to the
BCH-HAC estimators

$$\widehat{\Omega}_i^{BCH} = T^{-1} \sum_{t=1}^{T} \sum_{s=1}^{T} K_{ts} \widehat{v}_{it} \widehat{v}_{is}' = T^{-1} \sum_{g=1}^{G} \left( \sum_{t=[\lambda_{g-1}T]+1}^{[\lambda_g T]} \widehat{v}_{it} \right) \left( \sum_{t=[\lambda_{g-1}T]+1}^{[\lambda_g T]} \widehat{v}_{it}' \right),$$

$$\widehat{\Omega}^{BCH} = T^{-1} \sum_{t=1}^{T} \sum_{s=1}^{T} K_{ts} \widehat{\bar{v}}_t \widehat{\bar{v}}_s' = T^{-1} \sum_{g=1}^{G} \left( \sum_{t=[\lambda_{g-1}T]+1}^{[\lambda_g T]} \widehat{\bar{v}}_t \right) \left( \sum_{t=[\lambda_{g-1}T]+1}^{[\lambda_g T]} \widehat{\bar{v}}_t' \right).$$
These estimators can be viewed as generalizations of the variance-covariance matrix estimator
used by Götze and Künsch (1996) in the block bootstrap.
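The algebraic equivalence between the indicator-kernel form and the group-sum form of the BCH-HAC estimators can be confirmed directly; the break points and function names below are illustrative choices.

```python
import numpy as np

def bch_indicator(v, breaks):
    """T^{-1} sum_t sum_s K_ts v_t v_s' with K_ts = 1 iff t and s share a
    time cluster.  v is (T, k); breaks are the cut points [lambda_g T],
    e.g. [0, 17, 33, 50] for G = 3 groups."""
    T = v.shape[0]
    group = np.zeros(T, dtype=int)
    for g, (lo, hi) in enumerate(zip(breaks[:-1], breaks[1:])):
        group[lo:hi] = g                          # assign periods to clusters
    K = (group[:, None] == group[None, :]).astype(float)
    return v.T @ K @ v / T

def bch_group_sums(v, breaks):
    """Equivalent form: T^{-1} sum_g (sum_{t in g} v_t)(sum_{t in g} v_t)'."""
    T, k = v.shape
    omega = np.zeros((k, k))
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        s = v[lo:hi].sum(axis=0)                  # within-cluster sum
        omega += np.outer(s, s)
    return omega / T

v = np.random.default_rng(2).standard_normal((50, 3))
breaks = [0, 17, 33, 50]
assert np.allclose(bch_indicator(v, breaks), bch_group_sums(v, breaks))
```

The same two functions apply to either $\widehat{v}_{it}$ for one individual or to the cross-section sums $\widehat{\bar{v}}_t$, since only the time-dimension weighting differs between the two BCH estimators.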
3 Inference and Asymptotic Approximations
This section defines the test statistics and derives the asymptotic behavior of the tests under null
hypotheses involving linear restrictions on the $\beta$ vector. Results for large-T, fixed-n and large-
T, large-n are treated separately as they require different regularity conditions. Throughout, the
symbol $\Rightarrow$ denotes weak convergence of a sequence of stochastic processes to a limiting stochastic
process and $\xrightarrow{d}$ denotes convergence in distribution.
3.1 The Test Statistics and Definitions
Consider testing linear hypotheses about $\beta$ of the form

$$H_0 : R\beta = r,$$

where $R$ is a $q \times k$ matrix of known constants with full rank with $q \leq k$ and $r$ is a $q \times 1$ vector of
known constants. Define the Wald statistics as

$$Wald_{ave(n)} = (R\widehat{\beta} - r)' \left( R \widehat{V}_{ave(n)} R' \right)^{-1} (R\widehat{\beta} - r),$$

$$Wald_{HACSC} = (R\widehat{\beta} - r)' \left( R \widehat{V}_{HACSC} R' \right)^{-1} (R\widehat{\beta} - r).$$

In the case where $q = 1$ we can define the t-statistics

$$t_{ave(n)} = \frac{R\widehat{\beta} - r}{\sqrt{R \widehat{V}_{ave(n)} R'}}, \quad t_{HACSC} = \frac{R\widehat{\beta} - r}{\sqrt{R \widehat{V}_{HACSC} R'}}.$$
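Given $\widehat{\beta}$ and either variance estimate, forming the statistics is mechanical. A generic sketch follows; the interface is an assumption, not the paper's code.

```python
import numpy as np

def wald_stat(beta, V, R, r):
    """Wald statistic (R beta - r)'(R V R')^{-1}(R beta - r) for H0: R beta = r.
    beta is (k,), V is (k, k), R is (q, k), r is (q,)."""
    d = R @ beta - r
    return float(d @ np.linalg.solve(R @ V @ R.T, d))

def t_stat(beta, V, R, r):
    """t statistic for a single restriction (R is 1 x k)."""
    d = float(R @ beta - r)
    return d / np.sqrt(float(R @ V @ R.T))
```

For a single restriction the Wald statistic is exactly the square of the t statistic, which provides a cheap consistency check on any implementation.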
Asymptotic approximations of the null distributions of the $Wald$ and $t$ statistics are obtained using
large-T asymptotics. This asymptotic approach is convenient and useful in this context because it
highlights the crucial role played by the assumptions of covariance stationarity and weak dependence
in the time dimension in generating asymptotic invariance to general forms of spatial correlation. In
addition, the use of large-T asymptotics allows the standard errors to be approximated within the
fixed-b asymptotic framework of Kiefer and Vogelsang (2005), which captures the choice of kernel
and bandwidth in the asymptotic approximation. With regards to $n$, results are presented for
fixed-n and large-n, and the assumptions depend on how $n$ is treated in the asymptotic analysis.

Because the asymptotic distributions of the statistics depend on the form of the kernel used to
compute the HAC estimators, some notation needs to be defined. Let $h > 0$ be an integer. The
following definition defines some random matrices that appear in the asymptotic results.
Definition 1 Let $B_h(r)$ denote a generic $h \times 1$ vector of stochastic processes. Let the random
matrix, $P(b, B_h)$, be defined as follows for $b \in (0, 1]$. Case (i): if $k(x)$ is twice continuously
differentiable everywhere,

$$P(b, B_h) = \int_0^1 \int_0^1 -\frac{1}{b^2} k''\left( \frac{r - s}{b} \right) B_h(r) B_h(s)' \, dr \, ds + \frac{1}{b} \int_0^1 k'\left( \frac{1 - r}{b} \right) \left( B_h(1) B_h(r)' + B_h(r) B_h(1)' \right) dr + B_h(1) B_h(1)'.$$

Case (ii): if $k(x)$ is continuous, $k(x) = 0$ for $|x| \geq 1$, and $k(x)$ is twice continuously differentiable
everywhere except for $|x| = 1$,

$$P(b, B_h) = \iint_{|r-s|<b} -\frac{1}{b^2} k''\left( \frac{r - s}{b} \right) B_h(r) B_h(s)' \, dr \, ds + \frac{k'_{-}(1)}{b} \int_0^{1-b} \left( B_h(r + b) B_h(r)' + B_h(r) B_h(r + b)' \right) dr$$
$$+ \frac{1}{b} \int_{1-b}^1 k'\left( \frac{1 - r}{b} \right) \left( B_h(1) B_h(r)' + B_h(r) B_h(1)' \right) dr + B_h(1) B_h(1)',$$

where $k'_{-}(1) = \lim_{\varepsilon \to 0} \left[ (k(1) - k(1 - \varepsilon))/\varepsilon \right]$, i.e. $k'_{-}(1)$ is the derivative of $k(x)$ from the left at
$x = 1$. Case (iii): if $k(x)$ is the Bartlett kernel, $k(x) = 1 - |x|$ for $|x| \leq 1$ and $k(x) = 0$ for $|x| \geq 1$,

$$P(b, B_h) = \frac{2}{b} \int_0^1 B_h(r) B_h(r)' \, dr - \frac{1}{b} \int_0^{1-b} \left( B_h(r + b) B_h(r)' + B_h(r) B_h(r + b)' \right) dr$$
$$- \frac{1}{b} \int_{1-b}^1 \left( B_h(1) B_h(r)' + B_h(r) B_h(1)' \right) dr + B_h(1) B_h(1)'.$$
Definition 2 For the BCH-HAC estimators define the random matrix, $P(\lambda_0, \lambda_1, \dots, \lambda_G, B_h)$, as

$$P(\lambda_0, \lambda_1, \dots, \lambda_G, B_h) = \sum_{g=1}^{G} \left( B_h(\lambda_g) - B_h(\lambda_{g-1}) \right) \left( B_h(\lambda_g) - B_h(\lambda_{g-1}) \right)'.$$
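The random matrices in Definitions 1 and 2 are straightforward to simulate. For the Bartlett kernel, one convenient route is to simulate the finite-sample t statistic of a simple location model with bandwidth $M = bT$, since its null distribution converges to the fixed-b limit $W(1)/\sqrt{P(b, \widetilde{W})}$ as $T$ grows. A minimal sketch; the replication count, grid size, and names are illustrative choices, not the paper's Appendix B method.

```python
import numpy as np

def fixed_b_t_cv(b, alpha=0.05, reps=2000, T=200, seed=0):
    """Simulated two-sided fixed-b critical value for the Bartlett kernel.

    Simulates the location-model t statistic with bandwidth M = bT, whose
    null distribution converges to W(1)/sqrt(P(b, W-tilde)) as T grows.
    """
    rng = np.random.default_rng(seed)
    idx = np.arange(T)
    # Bartlett weights K_ts = max(0, 1 - |t-s|/(bT)), as in eq. (5)
    K = np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / (b * T))
    stats = np.empty(reps)
    for i in range(reps):
        e = rng.standard_normal(T)
        v = e - e.mean()                  # residuals: partial sums -> bridge
        omega = v @ K @ v / T             # Bartlett HAC, eq. (4) form
        stats[i] = np.sqrt(T) * e.mean() / np.sqrt(omega)
    return np.quantile(np.abs(stats), 1 - alpha)
```

Because the bandwidth enters the limit, the simulated critical values exceed the standard normal ones and grow with $b$; ignoring the choice of bandwidth therefore understates sampling variability.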
3.2 Large-T, Fixed-n Results
This subsection analyzes the asymptotic properties of the test statistics in the large-T, fixed-n case.
All limits in this section are taken as $T \to \infty$ with $n$ held fixed. The following three high level
assumptions are sufficient for obtaining results for the fixed-effects estimator based on model (1).
Assumption 1: $T^{-1/2} \sum_{t=1}^{[rT]} u_{it} = O_p(1)$.
Assumption 2: $\operatorname{plim} T^{-1} \sum_{t=1}^{T} x_{it} = \mu_i = E(x_{it})$ and $\operatorname{plim} T^{-1} \sum_{t=1}^{[rT]} \widetilde{x}_{it} \widetilde{x}_{it}' = r Q_i$ for $r \in (0, 1]$, where
$Q = \sum_{i=1}^{n} Q_i$ and $Q$ is nonsingular.
Define the $k \times 1$ vector $v_t^{ii} = (x_{it} - \mu_i) u_{it}$. Stack the vectors $v_t^{11}, v_t^{22}, \dots, v_t^{nn}$ to form the $nk \times 1$
vector of time series $v_t$ with transpose $v_t' = \left[ v_t^{11\prime}, v_t^{22\prime}, \dots, v_t^{nn\prime} \right]$.
Assumption 3: $E(u_{it} | x_{it}) = 0$ and $T^{-1/2} \sum_{t=1}^{[rT]} v_t \Rightarrow \Lambda W(r)$, where $W(r)$ is an $nk \times 1$ vector of
standard Wiener processes and $\Lambda \Lambda'$ is the $nk \times nk$ long run variance matrix ($2\pi$ times the zero
frequency spectral density matrix) of $v_t$.
To handle the case where time period fixed-effects are also included (model (2)), Assumption 3
needs to be strengthened as follows. Define the $k \times 1$ vector $v_t^{ij} = (x_{it} - \mu_i) u_{jt}$. For a given $j$,
stack the vectors $v_t^{1j}, v_t^{2j}, \dots, v_t^{nj}$ into an $nk \times 1$ vector $v_t^j$ with transpose $v_t^{j\prime} = \left[ v_t^{1j\prime}, v_t^{2j\prime}, \dots, v_t^{nj\prime} \right]$ and
then stack the vectors $v_t^1, v_t^2, \dots, v_t^n$ into an $n^2 k \times 1$ vector $v_t^{ex}$ with transpose $v_t^{ex\prime} = \left[ v_t^{1\prime}, v_t^{2\prime}, \dots, v_t^{n\prime} \right]$,
where the "ex" superscript denotes an extended vector that includes vectors $v_t^{ij}$ for $i \neq j$.
Assumption 4: $E(u_{it} | x_{jt}) = 0$ for all $i, j = 1, 2, \dots, n$ and $T^{-1/2} \sum_{t=1}^{[rT]} v_t^{ex} \Rightarrow \Lambda^{ex} W^{ex}(r)$, where
$W^{ex}(r)$ is an $n^2 k \times 1$ vector of standard Wiener processes and $\Lambda^{ex} \Lambda^{ex\prime}$ is the $n^2 k \times n^2 k$ long run
variance matrix ($2\pi$ times the zero frequency spectral density matrix) of $v_t^{ex}$.
Both sets of assumptions hold under covariance stationarity and weak dependence in the time
dimension. Assumption 1 essentially requires that the regression error satisfy a functional central
limit theorem (FCLT). Assumption 2 requires that the sample mean and sample variance-covariance
matrix of the regressors across time have well defined limits. The form of $Q_i$ depends on the form
of dummies included in the model. Assumption 3 allows weak exogeneity in the cross-section and
over time and requires that a FCLT holds for $v_t$. Assumption 4 requires strict exogeneity in the
cross-section but allows weak exogeneity^2 over time, and a FCLT is needed^3 for the extended vector
$v_t^{ex}$. Because $Q_i$ is not restricted to be the same for all $i$ and because the forms of $\Lambda \Lambda'$ and $\Lambda^{ex} \Lambda^{ex\prime}$
are not restricted to be block diagonal, the assumptions permit heterogeneity in the conditional
heteroskedasticity and serial correlation while allowing general forms of spatial correlation. Stationarity is not required in the cross-section. This is dual to the fixed-T, large-n situation where
the assumption of random sampling in the cross-section allows general forms of serial correlation
in the model, including nonstationarity.

Assumptions 3 and 4 indicate that the form of exogeneity needed depends on whether time
period dummies are included in the model. Without time period dummies, only weak exogeneity
is needed in both the time and cross-section dimensions. When time period dummies are included,
strict exogeneity is needed in the cross-section while weak exogeneity is still permitted in the time
dimension.

^2 Note that the exogeneity assumptions rule out cases where lagged dependent variables (time and/or spatially
lagged) are included in the model and the errors are allowed to have correlation over time and/or in the cross-section.
In this situation one would need valid instruments to obtain consistent estimators of $\beta$, and inference could be carried
out using results in Kelejian and Prucha (2007) provided a distance measure is available for the cross-section and
that instruments and regressors (besides the lagged dependent variables) are nonrandom.

^3 A sufficient condition for a FCLT to hold for the $v_t$ or $v_t^{ex}$ vectors is that these vectors of time series are covariance
stationary and have one-summable autocovariance functions. This includes the class of covariance stationary vector
autoregressive moving average models.
Let $I_k$ denote a $k \times k$ identity matrix and let $\iota$ denote an $n \times 1$ vector of ones. Let $e_i$ denote an
$n \times 1$ vector with $i$th element equal to one and zeros otherwise, i.e. $e_i = (0, 0, \dots, 0, 1, 0, \dots, 0)'$. Define
$\widetilde{e}_i = e_i - \iota(\iota'\iota)^{-1}\iota' e_i = e_i - \frac{1}{n}\iota$. The following two theorems summarize the theoretical results for
the large-T, fixed-n case.
Theorem 1 Suppose the model is estimated with individual fixed-effects but no time period dummies are included. Suppose Assumptions 1, 2 and 3 hold. For the statistics that use a kernel HAC
estimator assume $M = bT$ where $b \in (0, 1]$ is fixed. For the statistics based on the BCH standard
errors assume $G$ is fixed. Let $\widetilde{B}_k^i(r)$ denote a $k \times 1$ vector of stochastic processes defined as

$$\widetilde{B}_k^i(r) = B_k^i(r) - r Q_i Q^{-1} B_k(1),$$

where $B_k^i(r) = \Lambda_i W(r)$, $\Lambda_i$ is a nonstochastic matrix given by $\Lambda_i = (e_i' \otimes I_k)\Lambda$ and $B_k(1) = \sum_{i=1}^{n} B_k^i(1) = \bar{\Lambda} W(1)$ where $\bar{\Lambda} = \sum_{i=1}^{n} \Lambda_i$. Let $W_q^*(r)$ denote a $q \times 1$ vector of independent standard
Wiener processes and define $\widetilde{W}_q^*(r) = W_q^*(r) - r W_q^*(1)$ to be a $q \times 1$ vector of standard Brownian
bridges. For $n$ fixed as $T \to \infty$ the following hold:

$$\sqrt{T}(\widehat{\beta} - \beta) \xrightarrow{d} Q^{-1} B_k(1),$$

$$Wald_{ave(n)} \xrightarrow{d} \left( R Q^{-1} B_k(1) \right)' \left( R Q^{-1} C_k Q^{-1} R' \right)^{-1} R Q^{-1} B_k(1), \quad t_{ave(n)} \xrightarrow{d} \frac{R Q^{-1} B_k(1)}{\sqrt{R Q^{-1} C_k Q^{-1} R'}},$$

$$Wald_{HACSC} \xrightarrow{d} W_q^*(1)' C_q^{-1} W_q^*(1), \quad t_{HACSC} \xrightarrow{d} \frac{W_1^*(1)}{\sqrt{C_1}},$$

where

$$C_k = \begin{cases} \sum_{i=1}^{n} \widetilde{B}_k^i(1) \widetilde{B}_k^i(1)', & \text{for traditional clustering,} \\ \sum_{i=1}^{n} P(b, \widetilde{B}_k^i), & \text{for kernel HAC,} \\ \sum_{i=1}^{n} P(\lambda_0, \lambda_1, \dots, \lambda_G, \widetilde{B}_k^i), & \text{for BCH,} \end{cases}$$

and for $h = q, 1$,

$$C_h = \begin{cases} P(b, \widetilde{W}_h^*), & \text{for kernel HAC,} \\ P(\lambda_0, \lambda_1, \dots, \lambda_G, \widetilde{W}_h^*), & \text{for BCH.} \end{cases}$$
Corollary 1 Suppose the BCH standard errors are configured so that the $G$ groups have the same
number of observations, i.e. $\lambda_g = \frac{g}{G}$ for $g = 0, 1, 2, \dots, G$. Then the conditions for part (iii) of
Proposition 1 of Bester et al. (2011) hold in the time dimension and it follows that

$$W_q^*(1)' C_q^{-1} W_q^*(1) \sim \frac{qG}{G - q} F_{q, G-q}, \quad \frac{W_1^*(1)}{\sqrt{C_1}} \sim \sqrt{\frac{G}{G - 1}} \, t_{G-1},$$

where $F_{q, G-q}$ is an $F$ random variable with $q$ degrees of freedom in the numerator and $G - q$ degrees
of freedom in the denominator and $t_{G-1}$ is a $t$ random variable with $G - 1$ degrees of freedom.
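With equal-sized groups, Corollary 1 makes inference with BCH standard errors especially simple in practice: standard F and t critical values only need rescaling. A sketch using scipy.stats (an assumed dependency; the function names are illustrative):

```python
import numpy as np
from scipy.stats import f, t

def bch_wald_cv(q, G, alpha=0.05):
    """Critical value for the Wald statistic with G equal time groups:
    the limit in Corollary 1 is (qG / (G - q)) * F(q, G - q)."""
    return q * G / (G - q) * f.ppf(1 - alpha, q, G - q)

def bch_t_cv(G, alpha=0.05):
    """Two-sided critical value for the t statistic: sqrt(G / (G - 1)) * t(G - 1)."""
    return np.sqrt(G / (G - 1)) * t.ppf(1 - alpha / 2, G - 1)
```

For $q = 1$ the two rules agree: the Wald critical value is the square of the two-sided t critical value, since an $F(1, m)$ random variable is the square of a $t(m)$ random variable.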
Theorem 1 provides some interesting insights into using robust standard errors in fixed-effect panel models. The results for the $ave(n)$ statistics formally show that the result of Theorem 4 of Hansen (2007) for the traditional cluster standard errors holds under the assumption of weak exogeneity in the time dimension, and they generalize his result to the case where spatial correlation is permitted in the model. The results for the $ave(n)$ statistics also extend the results of Hansen (2007) to general classes of kernels and bandwidths. Theorem 1 shows that the $ave(n)$ statistics have limiting null distributions that depend on nuisance parameters when there is spatial correlation in the model (i.e. the statistics are not asymptotically pivotal). Therefore, the $ave(n)$ statistics are generally not valid in the presence of spatial correlation. This is not surprising given that these statistics are designed for the case where cross-section individuals are uncorrelated with each other.
In contrast, the HACSC statistics have asymptotic distributions that do not depend on nuisance parameters, i.e. the HACSC statistics are asymptotically pivotal, in the presence of spatial correlation. Therefore, the HACSC statistics have broader robustness properties with respect to correlation in the model. For the kernel HACSC statistics the limiting distributions are identical to the pure time series results obtained by Kiefer and Vogelsang (2005). This may not be obvious at first glance because the form of the random matrix $P(b, \widetilde{W}^*_q)$ follows from Hashimzade and Vogelsang (2008) and is more complicated than the form in Kiefer and Vogelsang (2005). However, because $\widetilde{W}^*_q(1) = 0$, the formula for $P(b, \widetilde{W}^*_q)$ simplifies to the exact same form as in Kiefer and Vogelsang (2005). The limiting distributions are nonstandard, but critical values are easily obtained using simulation methods. A simple and accurate numerical method for computing critical values and p-values for the Bartlett kernel is provided in Appendix B. Stata code that implements fixed-b critical values and p-values for the newey, xtscc and test Stata commands is available on the author's webpage: https://www.msu.edu/~tjv/working.html or by request.
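One direct way to obtain fixed-b critical values by simulation, sketched below under simplifying assumptions (a scalar parameter, the Bartlett kernel), is to approximate the fixed-b limit by computing the HAC-based t-statistic on i.i.d. normal data on a fine time grid with bandwidth $M = bT$. This is an illustrative sketch, not the paper's Appendix B method or its Stata code; the function names and tuning constants are made up.

```python
import numpy as np

def bartlett_hac(v, M):
    """Bartlett-kernel HAC variance estimate for the demeaned series v."""
    T = len(v)
    u = v - v.mean()
    omega = u @ u / T
    for j in range(1, min(M, T)):
        w = 1.0 - j / M            # Bartlett weight, hits zero at j = M
        if w <= 0:
            break
        omega += 2.0 * w * (u[j:] @ u[:-j]) / T
    return omega

def fixed_b_tcrit(b, T=200, reps=5_000, alpha=0.05, seed=1):
    """Simulated two-sided fixed-b critical value for a HAC t-statistic."""
    rng = np.random.default_rng(seed)
    M = max(1, int(b * T))
    t = np.empty(reps)
    for r in range(reps):
        y = rng.standard_normal(T)
        t[r] = np.sqrt(T) * y.mean() / np.sqrt(bartlett_hac(y, M))
    return np.quantile(np.abs(t), 1 - alpha)
```

With a small bandwidth the simulated critical value is close to the standard normal value of 1.96, while $b = 1$ produces a much larger critical value; that gap is the fixed-b correction at work.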
Because the $ave(n)$ statistics are designed for situations where there is no spatial correlation in the model, it is instructive to see how the limiting expressions in Theorem 1 simplify when random sampling in the cross-section is assumed. In this case $\Lambda$ would simplify to a block diagonal matrix with $n$ identical $k \times k$ matrices $\Lambda_0$ along the block diagonal and the $Q_i$ matrices would be the same for all $i$, leading to the following corollary.
Corollary 2 Suppose there is random sampling in the cross-section. The results of Theorem 1 simplify as follows:
$$Wald_{ave(n)} \xrightarrow{d} W_q(1)' C_q^{-1} W_q(1), \qquad t_{ave(n)} \xrightarrow{d} \frac{W_1(1)}{\sqrt{C_1}},$$
where $W_q(1) = \sum_{i=1}^{n} W^i_q(1)$, the $W^i_q(r)$ are $q \times 1$ vectors of independent Wiener processes that are independent of each other, and, for $h = q, 1$,
$$C_h = \begin{cases} \sum_{i=1}^{n} \widetilde{W}^i_h(1)\widetilde{W}^i_h(1)', & \text{for traditional clustering,} \\ \sum_{i=1}^{n} P(b, \widetilde{W}^i_h), & \text{for kernel HAC,} \\ \sum_{i=1}^{n} P(\lambda_0, \lambda_1, \ldots, \lambda_G, \widetilde{W}^i_h), & \text{for BCH,} \end{cases}$$
where $\widetilde{W}^i_h(r) = W^i_h(r) - rn^{-1}W_h(1)$. For the case of the traditional clustered standard errors, Hansen (2007) has shown that
$$W_q(1)' C_q^{-1} W_q(1) \sim \frac{qn}{n-q} F_{q,n-q}, \qquad \frac{W_1(1)}{\sqrt{C_1}} \sim \sqrt{\frac{n}{n-1}}\, t_{n-1}.$$
Under random sampling in the cross-section, the asymptotic distributions of the $ave(n)$ statistics are free of nuisance parameters and critical values can be easily tabulated using simulation methods. The results for the traditional clustered case formally extend Corollary 4.1 of Hansen (2007) to allow weak exogeneity in the time dimension.
When time period dummies are also included in the model, the asymptotic distributions of the $ave(n)$ statistics change whereas the asymptotic distributions of the HACSC statistics remain the same. The next theorem and corollary summarize the results for the case where individual fixed-effects and time period dummies are included in the model. Note that Assumption 3 is replaced with the stronger Assumption 4.
Theorem 2 Suppose the model is estimated with individual fixed-effects and time period dummies are included. Suppose Assumptions 1, 2 and 4 hold. For the statistics that use a kernel HAC estimator assume $M = bT$ where $b \in (0,1]$ is fixed. For the statistics based on the BCH standard errors assume $G$ is fixed. For $n$ fixed as $T \to \infty$ the following hold: (i) the limits of $\sqrt{T}(\widehat{\beta} - \beta)$ and the $ave(n)$ statistics take the same form as in Theorem 1 with the definition of $B^i_k(r)$ changed to
$$B^i_k(r) = \Lambda^{ex}_i W^{ex}(r),$$
where $\Lambda^{ex}_i$ is the nonstochastic matrix given by
$$\Lambda^{ex}_i = \Lambda_{ii} - \frac{1}{n}\sum_{j=1}^{n} \Lambda_{ij}, \qquad \text{where } \Lambda_{ij} = (e_i' \otimes e_j' \otimes I_k)\Lambda,$$
and (ii) the limits of the HACSC statistics are the same as given by Theorem 1 and Corollary 1.
Corollary 3 Suppose there is random sampling in the cross-section. The asymptotic distributions of Theorem 2 simplify as follows:
$$Wald_{ave(n)} \xrightarrow{d} \widehat{W}_q(1)' C_q^{-1} \widehat{W}_q(1), \qquad t_{ave(n)} \xrightarrow{d} \frac{\widehat{W}_1(1)}{\sqrt{C_1}},$$
where $\widehat{W}_q(1) = \sum_{i=1}^{n} \widehat{W}^{ii}_q(1)$,
$$\widehat{W}^{ii}_q(r) = \widetilde{W}^{ii}_q(r) - n^{-1}\sum_{l=1}^{n} \widetilde{W}^{li}_q(r), \qquad \widetilde{W}^{ij}_q(r) = W^{ij}_q(r) - n^{-1}\sum_{l=1}^{n} W^{il}_q(r),$$
the $W^{ij}_q(r)$ are $q \times 1$ vectors of independent Wiener processes that are independent of each other, and, for $h = q, 1$,
$$C_h = \begin{cases} \sum_{i=1}^{n} \breve{W}^{ii}_h(1)\breve{W}^{ii}_h(1)', & \text{for traditional clustering,} \\ \sum_{i=1}^{n} P(b, \breve{W}^{ii}_h), & \text{for kernel HAC,} \\ \sum_{i=1}^{n} P(\lambda_0, \lambda_1, \ldots, \lambda_G, \breve{W}^{ii}_h), & \text{for BCH,} \end{cases}$$
where $\breve{W}^{ii}_h(r) = \widehat{W}^{ii}_h(r) - rn^{-1}\widehat{W}_h(1)$. For the case of the traditional clustered standard errors it follows that
$$\widehat{W}_q(1)' C_q^{-1} \widehat{W}_q(1) \sim \frac{(n-1)\,qn}{(n-2)(n-q)} F_{q,n-q}, \qquad \frac{\widehat{W}_1(1)}{\sqrt{C_1}} \sim \sqrt{\frac{n}{n-2}}\, t_{n-1}.$$
The finding that the HACSC statistics have the same asymptotic distributions when time period dummies are included in the model is useful given that empirical researchers often include both individual and time period dummies. The results for the traditional clustered standard errors under random sampling are interesting because the only difference that the inclusion of time period dummies makes is the $(n-1)/(n-2)$ scaling adjustment to the $F$ and $t$ random variables. Obviously, when $n$ is not small, this adjustment is small.
3.3 Large-T, Large-n Results
This subsection analyzes the asymptotic properties of the HACSC test statistics in the large-T, large-n case. All limits in this section are taken as $n, T \to \infty$. The following four high level assumptions are sufficient for obtaining results for the fixed-effects estimator based on model (1).

Assumption 5 $\operatorname{plim} \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]} x_{it}x_{it}' = rQ$, and $Q^{-1}$ exists.

Assumption 6 $E(u_{it} \mid x_{is}) = 0$ for all $t, s$.
Assumption 7 $\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it} - \mu_i)u_{it} \Rightarrow \Lambda W_k(1, r)$, where $\mu_i = E(x_{it})$ and $W_k(1, r)$ is a $k \times 1$ vector of standard Brownian sheets.

Assumption 8 The process $(x_{it} - \mu_i)u_{is}$ is a mean zero vector of three dimensional stationary random fields indexed by $i, t, s$.
Assumptions 5 and 7 rely on the theory of stationary and ergodic random fields. A random field is simply a random process that is indexed by a vector, with a pure time series being a special case of a random field when the index is a scalar. Assumption 5 is the usual assumption for the second moments of the regressors and requires the random field $x_{it}$ to be ergodic; see Adler (1981, Section 6.5). Assumption 7 is a functional central limit theorem for random fields and it requires covariance stationarity of the random field; see Deo (1975) and Basu and Dorea (1979). As an example of a covariance stationary random field for panel data, suppose the cross-section comprises the 50 states of the United States. Suppose that over time the data follow a covariance stationary autoregressive model and that cross-section observations $i$ and $j$ have correlation that is an exponentially decreasing function of the physical distance between the two states. In this example, the covariance between pairs of data is stationary because it is only a function of distance in time and physical geographic distance.

Assumption 6 imposes strict exogeneity in the time dimension but allows weak exogeneity in the cross-section dimension. Assumption 8 relies on Assumption 6 and stationarity of the random fields $x_{it}$ and $u_{it}$ and is sufficient to make the impact of the fixed-effects demeaning asymptotically negligible. It might be possible to relax Assumption 6 to allow weak exogeneity in the time dimension and to drop Assumption 8, but this would likely change and complicate the asymptotic results. The upshot of Assumptions 5-8 is that they are stronger than the assumptions required when $n$ is held fixed and they require stationarity and weak dependence in the cross-section.

The next theorem summarizes the results for the fixed-effects estimator.
Theorem 3 Suppose the model is estimated with individual fixed-effects but no time period dummies are included. Suppose Assumptions 5-8 hold. As $n, T \to \infty$,
$$\sqrt{nT}(\widehat{\beta} - \beta) \xrightarrow{d} Q^{-1}\Lambda W_k(1, 1) \sim N\left(0,\, Q^{-1}\Lambda\Lambda' Q^{-1}\right),$$
and the asymptotic distributions of $Wald_{HACSC}$ and $t_{HACSC}$ are the same as given by Theorem 1.
For the case where time period dummies are also included in the model, Assumptions 6 and 8 need to be strengthened as follows.

Assumption 9 $E(u_{it} \mid x_{js}) = 0$ for all $i, j$ and $t, s$.

Assumption 10 The process $(x_{it} - \mu_i)u_{js}$ is a mean zero vector of four dimensional stationary random fields indexed by $i, j, t, s$.

We see that strict exogeneity is added to the cross-section dimension. This is sufficient to make the impact of the time period dummies asymptotically negligible and leads to the theorem:
Theorem 4 Suppose the model is estimated with individual fixed-effects and time period dummies are included. Suppose Assumptions 5, 7, 9 and 10 hold. As $n, T \to \infty$,
$$\sqrt{nT}(\widehat{\beta} - \beta) \xrightarrow{d} Q^{-1}\Lambda W_k(1, 1) \sim N\left(0,\, Q^{-1}\Lambda\Lambda' Q^{-1}\right),$$
and the asymptotic distributions of $Wald_{HACSC}$ and $t_{HACSC}$ are the same as given by Theorem 1.
Theorems 3 and 4 show that the fixed-n, large-T results for the HACSC statistics continue to hold in the large-n, large-T case but restrict the spatial correlation to be stationary and weakly dependent and require stronger exogeneity assumptions.
4 Finite Sample Properties
In this section the finite sample performance of the robust standard errors is examined using a simulation study. The asymptotic approximations given by the theorems are compared and contrasted with the traditional asymptotics. The performance of the various standard errors is compared and contrasted in designs with and without spatial correlation. The impact of the strength of the serial correlation is also examined. The data generating process used for the simulations is given by
$$y_{it} = \beta_1 x_{1it} + \beta_2 x_{2it} + \beta_3 x_{3it} + \beta_4 x_{4it} + u_{it}, \qquad (7)$$
where
$$u_{it} = \rho u_{i,t-1} + \lambda \eta_t + \epsilon_{it}, \qquad u_{i0} = 0, \qquad \epsilon_{it} \sim N(0,1), \qquad \eta_t \sim N(0,1),$$
$$cov(\epsilon_{it}, \epsilon_{js}) = 0 \text{ for } t \neq s, \qquad cov(\eta_t, \eta_s) = 0 \text{ for } t \neq s, \qquad cov(\eta_t, \epsilon_{js}) = 0 \;\; \forall\, t, s, j.$$
For $l = 1, 2, 3, 4$,
$$x_{lit} = \rho x_{li,t-1} + \lambda \nu_{lt} + e_{lit}, \qquad x_{li0} = 0, \qquad e_{lit} \sim N(0,1), \qquad \nu_{lt} \sim N(0,1),$$
$$cov(e_{lit}, e_{ljs}) = 0 \text{ for } t \neq s, \qquad cov(\nu_{lt}, \nu_{ls}) = 0 \text{ for } t \neq s, \qquad cov(\nu_{lt}, e_{ljs}) = 0 \;\; \forall\, t, s, j.$$
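To make the design concrete, the error process above can be generated recursively. The sketch below is illustrative, not the paper's actual simulation code (the function name is made up); it draws the common time shock $\eta_t$ and the idiosyncratic shocks $\epsilon_{it}$ as i.i.d. standard normals, leaving out the spatial MA(2) structure described next:

```python
import numpy as np

def gen_errors(n, T, rho, lam, rng):
    """u_it = rho * u_{i,t-1} + lam * eta_t + eps_it, with u_i0 = 0."""
    u = np.zeros((n, T + 1))
    for t in range(1, T + 1):
        eta = rng.standard_normal()       # common time component eta_t
        eps = rng.standard_normal(n)      # idiosyncratic components eps_it
        u[:, t] = rho * u[:, t - 1] + lam * eta + eps
    return u[:, 1:]                       # drop the u_i0 = 0 column
```

The regressors $x_{lit}$ are generated the same way, with shocks $\nu_{lt}$ and $e_{lit}$ in place of $\eta_t$ and $\epsilon_{it}$.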
The four regressors are uncorrelated with $u_{it}$ and each other. The regressors and the error are modeled as AR(1) processes with the same autoregressive parameter. The innovations to the regressors and $u_{it}$ have two components: a time component that is shared by all individuals and an idiosyncratic component uncorrelated over time but with potential correlation in the cross-section. The idiosyncratic errors, $\epsilon_{it}$, $e_{lit}$, are configured to have spatial correlation as follows. For a given time, $t$, $n$ i.i.d. $N(0,1)$ random variables are placed on a rectangular grid (usually a square). At each grid point, $\epsilon_{it}$ is constructed as the weighted sum of the normal random variable at that grid point, the normal random variables that are one step away in either direction on the grid weighted by $\theta$, and the normal random variables that are two steps away in either direction weighted by $\theta^2$. Thus, $\epsilon_{it}$ is a spatial MA(2) process with parameters $\theta$ and $\theta^2$, and the distance measure is maximum coordinate-wise distance on the grid. The $e_{lit}$ shocks are constructed in a similar way. When $\theta = 0$ and $\lambda = 0$, there is no spatial correlation in the model. When $\theta = 0$ but $\lambda \neq 0$, there is equi-spatial correlation, $\rho_{spat}$, between individuals in the cross-section in both the regressors and the idiosyncratic error. The value of that correlation is $\lambda^2/(1 + \lambda^2)$.
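The spatial MA(2) construction just described can be sketched as follows (an illustrative reimplementation, not the paper's code; the function name is made up). The grid is n_side × n_side with $n = $ n_side², and distance is the maximum coordinate-wise (Chebyshev) distance:

```python
import numpy as np

def spatial_ma2(n_side, theta, rng):
    """One cross-section draw of spatial MA(2) errors on an n_side x n_side grid.

    Each grid point is a weighted sum of the i.i.d. N(0,1) draw at that point
    (weight 1), the draws at Chebyshev distance 1 (weight theta), and the
    draws at Chebyshev distance 2 (weight theta**2).
    """
    z = rng.standard_normal((n_side, n_side))
    eps = np.zeros_like(z)
    for i in range(n_side):
        for j in range(n_side):
            for di in range(-2, 3):
                for dj in range(-2, 3):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n_side and 0 <= jj < n_side:
                        d = max(abs(di), abs(dj))   # Chebyshev distance
                        eps[i, j] += theta ** d * z[ii, jj]
    return eps.ravel()                               # n = n_side**2 errors
```

Setting theta = 0 recovers the i.i.d. case, which is a quick way to sanity-check the weights.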
Results are given for sample sizes $T = 10, 50, 250$ and $n = 10, 50, 250$ or $n = 9, 49, 256$. The latter values of $n$ are used when $\theta \neq 0$ so that the grid in the cross-section can be configured as a square. The number of replications is 2,000 in all cases and the nominal significance level is 0.05. Results are reported for the Bartlett kernel. Results using other kernels are similar. Unless otherwise stated, fixed-effects OLS is used to estimate the model.
The first set of simulation results is for t-statistics for testing the null hypothesis $H_0: \beta_1 = 0$ against the alternative $H_1: \beta_1 \neq 0$. Because the fixed-effects estimate of $\beta_1$ and its standard errors are exactly invariant to $\beta_2$, $\beta_3$, and $\beta_4$, these parameters are set to zero without loss of generality.
Table 1 reports empirical null rejection probabilities for the $t^{Bart}_{ave(n)}$ and $t^{clus}_{ave(n)}$ statistics. For $t^{Bart}_{ave(n)}$ a small selection of bandwidths is considered. For each statistic rejections are computed using the traditional $N(0,1)$ critical value and using the fixed-b asymptotic critical value given by Corollary 2, which is only valid when there is random sampling in the cross-section. Table 2 reports empirical null rejection probabilities for the $t^{Bart}_{HACSC}$ statistic. Rejections are computed using the traditional $N(0,1)$ critical value and using the fixed-b asymptotic critical value obtained by the method described in Appendix B. Tables 1 and 2 only consider the case of no spatial correlation ($\lambda = 0$, $\theta = 0$).
An obvious pattern in both tables is that when $n$ is small, all three statistics have rejection probabilities greater than 0.05 when the $N(0,1)$ critical value is used. This happens even when the regressors and error are i.i.d., and the problem becomes more pronounced when there is serial correlation in the model. Some of the over-rejection problem happens because the $N(0,1)$ approximation does not reflect the randomness in the standard error. Because the fixed-b approximation captures some of the randomness in the standard error, the tendency to over-reject is less of a problem when fixed-b critical values are used.
The patterns for $t^{clus}_{ave(n)}$ are similar to the simulation results reported by Hansen (2007). Except when both $n$ and $T$ are small, $t^{clus}_{ave(n)}$ has rejections close to 0.05, especially when the critical value is taken from a $\sqrt{\frac{n}{n-1}}\, t_{n-1}$ random variable (fixed-b columns), and this is true even when the serial correlation becomes strong. In contrast, $t^{Bart}_{ave(n)}$ tends to over-reject when the serial correlation is strong even when fixed-b critical values are used. Notice that the tendency to over-reject diminishes as the bandwidth is increased. This is an expected finding. As the bandwidth is increased, more weight is placed on higher order lags and $\widehat{\Omega}_i$ becomes closer to $T^{-1}\widehat{S}_{iT}\widehat{S}_{iT}'$. From the perspective of size, the traditional cluster standard errors dominate the Bartlett kernel. Unreported results for other kernels were similar.
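To fix ideas, the "averages of HACs" construction behind $t^{clus}_{ave(n)}$ and $t^{Bart}_{ave(n)}$ can be sketched as follows: the score HAC is computed individual by individual and then summed, with the traditional cluster case corresponding to full weight on all sample autocovariances. This is an illustrative reimplementation under simplifying assumptions (balanced panel, regressors and residuals already demeaned), not the paper's code; the function name is made up.

```python
import numpy as np

def ave_of_hacs_se(X, u, M=None):
    """'Averages of HACs' standard errors for OLS slopes.

    X: (n, T, k) demeaned regressors, u: (n, T) OLS residuals.
    M=None: traditional cluster case (full weight on all autocovariances);
    otherwise a Bartlett HAC with bandwidth M is computed per individual.
    """
    n, T, k = X.shape
    v = X * u[:, :, None]                      # scores x_it * u_it
    omega = np.zeros((k, k))
    for i in range(n):
        if M is None:
            S = v[i].sum(axis=0)               # sum of scores over time
            omega += np.outer(S, S) / T
        else:
            omega += v[i].T @ v[i] / T         # lag-0 term
            for j in range(1, M):
                w = 1.0 - j / M                # Bartlett weight
                g = v[i][j:].T @ v[i][:-j] / T
                omega += w * (g + g.T)
    Qinv = np.linalg.inv(np.einsum('ntk,ntl->kl', X, X))
    V = T * Qinv @ omega @ Qinv
    return np.sqrt(np.diag(V))
```

The bandwidth trade-off discussed in the text corresponds to how far `M` is pushed toward `T`.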
The performance of $t^{Bart}_{HACSC}$ is markedly different from $t_{ave}$ as the bandwidth is increased. Notice in Table 2 that when the $N(0,1)$ critical values are used, the tendency of $t^{Bart}_{HACSC}$ to over-reject substantially increases as the bandwidth is increased. This pattern is easy to explain. Because $\widehat{\Omega}$ is an estimator based on a single time series and because $\widehat{\Omega} = 0$ when full weight is placed on all the sample autocovariances, it is well known (see Vogelsang (2008)) that as the bandwidth increases, bias in $\widehat{\Omega}$ initially falls but then increases as the bandwidth increases further. The variance of $\widehat{\Omega}$ is initially increasing in the bandwidth but eventually becomes decreasing in the bandwidth. Therefore, when a large bandwidth is used, $\widehat{\Omega}$ has substantial downward bias and $t^{Bart}_{HACSC}$ tends to over-reject. The fixed-b approximation captures much of the bias and variability in $\widehat{\Omega}$ and thus the over-rejection problem is less severe when fixed-b critical values are used. However, the fixed-b approximation does not capture all of the bias in $\widehat{\Omega}$. Part of the bias in $\widehat{\Omega}$ depends on the strength of the serial correlation and this bias grows as the serial correlation becomes stronger.⁴ This is why we see the over-rejection problem becoming worse as $\rho$ increases. Fortunately, this part of the bias decreases as the bandwidth is increased and this is why the over-rejection problem lessens as the bandwidth is increased when fixed-b critical values are used. To summarize, as the bandwidth increases, the part of the bias of $\widehat{\Omega}$ that depends on the strength of the serial correlation decreases. But the increase in the bandwidth increases the other part of the bias of $\widehat{\Omega}$. This second bias is captured by the fixed-b approximation. This is why the over-rejection problem of $t^{Bart}_{HACSC}$ is lowest when the bandwidth is set equal to the sample size ($M = T$) and fixed-b critical values are used.

⁴ See Vogelsang (2008) for a more detailed discussion of the bias and variance of kernel HAC estimators.
The over-rejection problem of $t^{Bart}_{HACSC}$ dissipates as the time dimension sample size increases. When $T$ is small, larger $n$ helps reduce the over-rejection problem, but not substantially. Overall, it is the size of the time dimension sample relative to the strength of the serial correlation that matters for the robustness of $t^{Bart}_{HACSC}$. The stronger the serial correlation, the larger $T$ needs to be for $t^{Bart}_{HACSC}$ to have finite sample null rejection probabilities close to the desired nominal level.
Table 3 considers a case where there is spatial correlation in the model with $\theta = 0.5$. Critical values for $t^{clus}_{ave(n)}$ and $t^{Bart}_{ave(n)}$ are still based on Corollary 2 to show the extent to which these tests break down when spatial correlation is in the model.⁵ Empirical null rejection probabilities are reported and in all cases fixed-b critical values were used. Not surprisingly, the $t^{clus}_{ave(n)}$ and $t^{Bart}_{ave(n)}$ statistics substantially over-reject across the board. The over-rejection problem becomes worse as either $n$ or $T$ increases. In contrast, $t^{Bart}_{HACSC}$ performs much better as long as $T$ is not too small and/or the serial correlation is not too strong. When $T = 10$, $t^{Bart}_{HACSC}$ tends to over-reject regardless of the serial correlation and the rejections do not change much as $n$ gets bigger. This is not surprising because $\widehat{\Omega}$ is being estimated with a univariate time series with 10 observations and $T$ is too small for the fixed-b approximation to be accurate. When $T = 50$, the rejections are close to 0.05 when the serial correlation is weak ($\rho = 0, 0.3$) and a large bandwidth is used. When $T = 250$, rejections are close to 0.05 when $\rho$ is as large as 0.6, and when $\rho = 0.9$ the over-rejection problem is not severe. If larger values of $T$ were considered, we would see rejections closer to 0.05 when $\rho = 0.9$. Conversely, for a given value of $T$, the value of $\rho$ could be pushed closer to 1, resulting in over-rejections. Again, it is the size of $T$ relative to the strength of the serial correlation that matters⁶ for $t^{Bart}_{HACSC}$.
Tables 4 and 5 provide some results on the power of the tests. Table 4 considers the case without spatial correlation whereas Table 5 has spatial correlation. Rejections are computed for $\beta_1 = 0.1$ using two-tailed tests of $H_0: \beta_1 = 0$. Because of the over-rejection problem under the null, size-adjusted power was computed. This was accomplished by first simulating finite sample null critical values of the statistics for each DGP. Size-adjusted power is useful for making theoretical comparisons of finite sample power. Unfortunately, this size adjustment is not feasible in empirical applications where the DGP is unknown. Table 4 reveals some interesting patterns. Power of $t_{ave(n)}$ and $t^{Bart}_{HACSC}$ tends to decrease as the bandwidth increases. A similar pattern in power was found by Kiefer and Vogelsang (2005) in a pure time series setting. Power of $t^{clus}_{ave(n)}$ tends to be lower than $t^{Bart}_{ave(n)}$. This is not surprising given that $t^{clus}_{ave(n)}$ effectively uses a bigger bandwidth than $t^{Bart}_{ave(n)}$ (it puts more weight on higher order sample autocovariances). Therefore, when empirical researchers use $t^{clus}_{ave(n)}$ rather than the Bartlett kernel, they are giving up power in exchange for greater robustness of the test to serial correlation (in terms of over-rejections). Power of $t^{Bart}_{HACSC}$ is lower than $t^{clus}_{ave(n)}$ and $t^{Bart}_{ave(n)}$. This is to be expected since the variation between the individuals in the cross-section is averaged out to form $\widehat{\Omega}$. For all three statistics, power increases as $n$ or $T$ increases and power decreases as $\rho$ increases.

⁵ The results of Theorem 1 could be used to simulate asymptotic critical values for $t^{clus}_{ave(n)}$ and $t^{Bart}_{ave(n)}$ given spatial correlation in the model. But such an exercise would require knowledge of the form of the spatial correlation and would not be feasible in practice (even if it is feasible in a simulation with a known DGP).
⁶ Simulations reported by Gonçalves (2011) suggest that the moving blocks bootstrap can further reduce the over-rejection problem of $t_{HACSC}$, relative to using fixed-b critical values, when the serial correlation is strong.
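The size adjustment described above amounts to replacing the asymptotic critical value with the empirical null critical value simulated from the same DGP. A minimal sketch (illustrative function name, not the paper's code):

```python
import numpy as np

def size_adjusted_power(t_null, t_alt, alpha=0.05):
    """Reject under the alternative using the empirical (1 - alpha)
    critical value of |t| simulated under the null for the same DGP."""
    crit = np.quantile(np.abs(np.asarray(t_null)), 1 - alpha)
    return np.mean(np.abs(np.asarray(t_alt)) > crit)
```

By construction, feeding the null draws in as the alternative returns (approximately) the nominal level, which is a useful sanity check on the adjustment.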
Table 5 only reports power for $t^{Bart}_{HACSC}$ given that $t^{Bart}_{ave(n)}$ and $t^{clus}_{ave(n)}$ are invalid when there is spatial correlation in the model. Again, we see that power tends to fall as the bandwidth increases. Power decreases as $\rho$ increases and power is increasing in $n$ and/or $T$. One notable feature of Table 5 is that power is lower than in Table 4 across the board. Clearly spatial correlation in the model negatively impacts power. This is expected because spatial correlation reduces the information in the model available from the cross-section.
When $\lambda \neq 0$ and spatial correlation is generated by a random time effect common to all individuals in the sample, an obvious solution to the spatial correlation problem is to include time period fixed-effect dummies and then use $t^{clus}_{ave(n)}$. In fact, this will work quite well because including time dummies will make all three statistics exactly invariant to the spatial correlation in the model. However, the time dummies only do the job when the spatial correlation is generated by a common random time effect and the empirical researcher knows that this is the source of the spatial correlation. If the spatial correlation is driven by another source, time dummies will not make $t^{clus}_{ave(n)}$ or $t^{Bart}_{ave(n)}$ valid. To illustrate this, the cross-section was divided into two groups of equal size. The DGP within each group is given by (7) with $\lambda = \sqrt{3}$ and $\theta = 0$, which generates a correlation of 0.75 for pairs of individuals within each group. The idiosyncratic errors are uncorrelated between the two groups but the random time components are configured to have correlation of 0.25 between the two groups. A simple calculation shows that the correlation between individuals across the two groups is
$$\frac{0.25\lambda^2}{1 + \lambda^2} = \frac{3}{16}.$$
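The cross-group correlation of 3/16 can be verified numerically. The sketch below (illustrative, not the paper's code) builds two common time factors with correlation 0.25 and one individual innovation in each group with loading $\lambda = \sqrt{3}$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, reps = np.sqrt(3.0), 200_000
z = rng.standard_normal(reps)
f1 = z                                               # group 1 time factor
f2 = 0.25 * z + np.sqrt(1 - 0.25**2) * rng.standard_normal(reps)
a = lam * f1 + rng.standard_normal(reps)             # innovation, group 1
b = lam * f2 + rng.standard_normal(reps)             # innovation, group 2
r_across = np.corrcoef(a, b)[0, 1]                   # theory: 3/16 = 0.1875
# within-group correlation would be lam**2 / (1 + lam**2) = 0.75
```

The sample correlation lands within Monte Carlo error of 0.1875, confirming the calculation in the text.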
Table 6 reports empirical rejection probabilities for $t^{Bart}_{HACSC}$ and $t^{clus}_{ave(n)}$ for the two group case. Results for $t^{Bart}_{ave(n)}$ are qualitatively similar to $t^{clus}_{ave(n)}$ and are not reported to save space. Results are given for both the case where no time dummies are included in the model and the case where time dummies are included. As the table clearly shows, $t^{clus}_{ave(n)}$ systematically over-rejects whether or not time dummies are included and the over-rejections become quite large as $n$ increases. On the other hand, the rejections for $t^{Bart}_{HACSC}$ are qualitatively similar to what was seen in Table 3 and the inclusion of time dummies has only a small effect. As long as $T$ is big enough relative to the strength of the serial correlation, $t^{Bart}_{HACSC}$ performs well, especially when a large bandwidth is used. The table clearly shows that the inclusion of time dummies does not deal with the spatial correlation problem when the spatial correlation is not generated by a common random time effect.
Tables 7-9 provide results for testing joint hypotheses. Table 7 reports empirical null rejection probabilities for testing $H_0: \beta_1 = 0, \beta_2 = 0$ for the $Wald^{clus}_{ave(n)}$ and $Wald^{Bart}_{HACSC}$ statistics. Rejections are reported using the traditional $\chi^2_2$ critical value along with fixed-b critical values for $Wald^{Bart}_{HACSC}$ and critical values from $\frac{2n}{n-2}F_{2,n-2}$ for $Wald^{clus}_{ave(n)}$. Table 8 reports similar results for testing $H_0: \beta_1 = 0, \beta_2 = 0, \beta_3 = 0, \beta_4 = 0$. There is no spatial correlation in Tables 7 and 8. Table 9 considers the case of spatial correlation with $\lambda = 0$, $\theta = 0.5$, with results only reported for fixed-b critical values.
Tables 7 and 8 show that the improvement over the standard chi-square approximation provided by the fixed-b approximation for $Wald_{HACSC}$ and by the $F$ approximation for $Wald^{clus}_{ave(n)}$ is much more striking than for the t-statistics. When $n$ is small, $Wald^{clus}_{ave(n)}$ has nontrivial over-rejection problems when the chi-square critical value is used and the problem becomes more severe when testing four restrictions (Table 8). In contrast, using critical values from the scaled $F$ distribution performs much better. As $n$ gets large, the difference between the approximations becomes small.
The over-rejection problem is very severe for $Wald^{Bart}_{HACSC}$ when the chi-square critical value is used. In Table 8 notice that the over-rejection problem is present regardless of the sample size or the strength of the serial correlation. The over-rejection problem becomes quite severe when larger bandwidths are used. In contrast, rejections using the fixed-b critical value are reasonably close to 0.05 when $T$ is not small and/or the serial correlation is not too strong. As was the case for the t-statistics, $Wald^{clus}_{ave(n)}$ is very robust to even strong serial correlation as long as no spatial correlation is in the model.
Table 9 shows that $Wald^{clus}_{ave(n)}$ substantially over-rejects when there is spatial correlation. $Wald^{Bart}_{HACSC}$ performs best when $T$ is not too small and the serial correlation is not too strong. Larger bandwidths tend to reduce the over-rejection problem relative to small bandwidths.
5 Empirical Application: Divorce Rates
This section re-examines some of the empirical findings of Wolfers (2006) on the relationship between unilateral divorce laws and divorce rates. The focus is on column 1 of panel A from Table 4 in Wolfers (2006). A panel of state-level annual data from 1956-1988 was used to estimate the dynamic change over time of divorce rates as a function of the time since a state adopted no-fault unilateral divorce laws. The dependent variable is the divorce rate per 1000 persons for a given state in a given year. The regressors are dummy variables for the number of years the law change had been in effect for a given year in a given state. Panel A of Table 10 reproduces the OLS estimates (using state population weights) of the model along with the OLS standard errors reported by Wolfers (2006). The model includes state and time fixed-effects. The OLS estimates indicate that divorce rates rose soon after the unilateral divorce laws were passed but that within a decade divorce rates had fallen and continued to fall over time. According to the OLS standard errors, the estimates are fairly precise and all coefficients are statistically significant at the 5% level.
It is highly unlikely that OLS standard errors are valid in this data set because this would require the data to be uncorrelated over time with no correlation across states and no heteroskedasticity in the cross-section. At the very least, it would be sensible to compute the traditional cluster standard errors⁷ and these are reported in Table 10. The cluster standard errors are larger in magnitude than the OLS standard errors. The only significant estimates are for 13-14 years and 15+ years. OLS looks much less precise according to the cluster standard errors. Of course, the cluster standard errors are not valid if spatial correlation is in the model.
The last four columns of Panel A report HACSC robust standard errors for a range of bandwidth choices using the Bartlett kernel. The standard errors were computed in Stata using the xtscc procedure developed by Hoechle (2007). For some point estimates, the HACSC robust standard errors systematically decrease as the bandwidth increases. This pattern is consistent with weak serial correlation. For other point estimates, the standard errors initially rise as $M$ increases, but eventually decrease as $M$ becomes large. This pattern is consistent with stronger serial correlation. Panel B of Table 10 reports 95% confidence intervals using the fully robust standard errors whereas Panel C reports the lengths of the confidence intervals. The calculations are carried out for both $N(0,1)$ critical values and for fixed-b critical values. The results for the $N(0,1)$ critical values show that because the $N(0,1)$ approximation does not reflect the bandwidth choice, the tightness of confidence intervals depends heavily on the bandwidth. Using a larger bandwidth leads to much tighter confidence intervals. As the simulations in Section 4 illustrated, these tighter confidence intervals are spurious because the $N(0,1)$ approximation leads to tests that over-reject when the bandwidth increases. In contrast, the confidence intervals using fixed-b critical values change much less as the bandwidth increases, although the lengths of the confidence intervals tend to be wider when a larger bandwidth is used. This is not surprising because the simulations showed that when there is serial correlation in the model that is not weak, tests using larger bandwidths were less size distorted than tests with small bandwidths, where over-rejections were larger. Thus, the shorter confidence intervals when $M$ is small are in part a reflection of the tendency to over-reject.
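The "HAC of averages" (Driscoll-Kraay) standard errors reported here via xtscc can be sketched in a few lines: sum the scores across the cross-section at each date, then apply a Bartlett HAC to that single $k$-dimensional time series. This is an illustrative reimplementation under simplifying assumptions (balanced panel, data already demeaned for the fixed effects), not xtscc itself; the function name is made up.

```python
import numpy as np

def driscoll_kraay_se(X, u, M):
    """Driscoll-Kraay ("HAC of averages") standard errors for OLS slopes.

    X: (n, T, k) demeaned regressors, u: (n, T) OLS residuals,
    M: Bartlett bandwidth. Returns the k-vector of standard errors.
    """
    n, T, k = X.shape
    h = np.einsum('ntk,nt->tk', X, u)          # cross-section score sums h_t
    hbar = h - h.mean(axis=0)                  # innocuous when scores sum to 0
    omega = hbar.T @ hbar / T                  # lag-0 term
    for j in range(1, M):
        w = 1.0 - j / M                        # Bartlett weight
        g = hbar[j:].T @ hbar[:-j] / T
        omega += w * (g + g.T)
    Qinv = np.linalg.inv(np.einsum('ntk,ntl->kl', X, X))
    V = T * Qinv @ omega @ Qinv                # Var(beta_hat) ~ Qinv (T*omega) Qinv
    return np.sqrt(np.diag(V))
```

Because the cross-section is collapsed to a single time series before the HAC step, arbitrary cross-sectional (spatial) correlation is absorbed automatically, which is the robustness property the text emphasizes.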
The last two columns of Panel B show that most of Wolfers' point estimates remain statistically significant when fully robust standard errors are implemented with larger bandwidths and fixed-b critical values are used. These are the standard errors that exhibited the least over-rejection problems in the simulations in the face of strong serial correlation in the model. One can conclude that the results in column 1 of panel A of Table 4 in Wolfers (2006) remain robust to heteroskedasticity, spatial correlation and stationary serial correlation in his model. On the other hand, if one were certain that there was no spatial correlation in the model but one required inference robust to potentially nonstationary serial correlation in the model, the cluster standard errors would be appropriate, in which case the majority of Wolfers' estimates are not statistically significant. A recent paper by Lee and Solon (2011) finds that the empirical findings in Wolfers (2006) are also sensitive to estimation method and functional form, further compounding the uncertainty in the precision of the parameter estimates.

⁷ It is surprising that Wolfers (2006) did not use cluster standard errors given that he cites Bertrand et al. (2004) in footnote 10.
6 Conclusions
This paper develops an asymptotic theory for test statistics in linear panel models with fixed effects that are robust to heteroskedasticity, autocorrelation and spatial correlation. Two classes of standard errors were analyzed. Both are based on nonparametric heteroskedasticity autocorrelation (HAC) covariance matrix estimators. The first class is based on averages of HAC estimates across individuals in the cross-section, i.e. "averages of HACs". This class includes the well known cluster standard errors analyzed by Arellano (1987) as a special case. The second class is based on the HAC of cross-section averages and was proposed by Driscoll and Kraay (1998). The "HAC of averages" standard errors are robust to heteroskedasticity, serial correlation and spatial correlation but covariance stationarity in the time dimension is required. The "averages of HACs" standard errors are robust to heteroskedasticity and serial correlation including the nonstationary case but they are not valid in the presence of spatial correlation. There are two main contributions of the paper. The first contribution is to establish conditions under which the HACSC (Driscoll-Kraay) standard errors lead to valid tests when fixed-effects dummy variables are included in the model and possibly time period dummies are included as well. This is an important contribution from the empirical perspective given that the assumptions used by Driscoll and Kraay (1998) do not permit fixed-effect or time period dummies in the model. The second contribution is to develop a fixed-b asymptotic theory for statistics based on both classes of standard errors. It is shown that tests based on HACSC standard errors have the same fixed-b asymptotic distribution regardless of whether individual and/or time period fixed-effects dummies are included in the model, although exogeneity assumptions depend on the form of dummies included in the model. These asymptotic results hold for both fixed-n, large-T and large-n, large-T frameworks, although assumptions on the strength of the spatial correlation and exogeneity depend on whether n is treated as fixed or going to infinity. The asymptotic results in this paper extend and generalize recent results by Hansen (2007) for the traditional Arellano (1987) cluster standard errors.
Extensive simulations show that the fixed-b approximation is usually much better than the traditional normal or chi-square approximation, especially for the HACSC standard errors. The use of fixed-b critical values will lead to more reliable inference in practice, especially for tests of joint hypotheses. The simulations showed that the choice of bandwidth affects both the tendency to over-reject and the power of the tests. Larger bandwidths tend to reduce the over-rejection problem when there is strong positive autocorrelation in the time dimension. Larger bandwidths also tend to reduce power. As in the pure time series case, there is a trade-off between over-rejections and power when choosing the bandwidth. To make the implementation of fixed-b asymptotic theory easy for practitioners, an appendix provides a simple method for computing fixed-b critical values and p-values for the case of the Bartlett kernel (Newey-West). Stata code that implements fixed-b critical values and p-values for the newey, xtscc and test Stata commands is available on the author's webpage: https://www.msu.edu/~tjv/working.html or by request.
7 Appendix A: Proofs
Proofs of the Theorems are provided in this appendix. The proofs of Theorems 1 and 2 involve five basic steps. The first step is to derive the limit of $T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it}$. The second step is to obtain the asymptotic result for $\sqrt{T}(\hat{\beta}-\beta)$. The third step is to derive the asymptotic limit of $T^{-1/2}\hat{S}_{i[rT]} = T^{-1/2}\sum_{t=1}^{[rT]}\hat{v}_{it}$. The fourth step is to write the standard errors as a function of $T^{-1/2}\hat{S}_{i[rT]}$ using algebra worked out by Hashimzade and Vogelsang (2008). The final step is to use the continuous mapping theorem to obtain the asymptotic distribution of the Wald and t statistics. The proofs of Theorems 3 and 4 follow similar steps although the details are different.
Proof of Theorem 1: Only individual fixed-effects are included in the model so that $\tilde{x}_{it} = x_{it} - \bar{x}_i$. The $k \times 1$ vector $\tilde{x}_{it}u_{it}$ can be written in terms of the $nk \times 1$ vector $\nu_t$ as follows:
$$\tilde{x}_{it}u_{it} = (x_{it} - \bar{x}_i)u_{it} = ((x_{it} - \mu_i) - (\bar{x}_i - \mu_i))u_{it} = \nu_t^{ii} - (\bar{x}_i - \mu_i)u_{it} = (e_i' \otimes I_k)\nu_t - (\bar{x}_i - \mu_i)u_{it} = \lambda_i\nu_t - (\bar{x}_i - \mu_i)u_{it}.$$
Using this formula it directly follows that
$$T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} = T^{-1/2}\sum_{t=1}^{[rT]}\left(\lambda_i\nu_t - (\bar{x}_i - \mu_i)u_{it}\right) = \lambda_i T^{-1/2}\sum_{t=1}^{[rT]}\nu_t - (\bar{x}_i - \mu_i)T^{-1/2}\sum_{t=1}^{[rT]}u_{it} = \lambda_i T^{-1/2}\sum_{t=1}^{[rT]}\nu_t + o_p(1)$$
using Assumptions 1 and 2. Using Assumption 3 we have
$$T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} \Rightarrow \lambda_i\Lambda W(r) = B_k^i(r). \quad (8)$$
Using Assumption 2 and (8) it immediately follows that
$$\sqrt{T}(\hat{\beta}-\beta) = \left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\sum_{i=1}^{n}T^{-1/2}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \Rightarrow \left(\sum_{i=1}^{n}Q_i\right)^{-1}\sum_{i=1}^{n}B_k^i(1) = Q^{-1}B_k(1). \quad (9)$$
Now consider the limit of $T^{-1/2}\hat{S}_{i[rT]}$. Simple algebra gives
$$T^{-1/2}\hat{S}_{i[rT]} = T^{-1/2}\sum_{t=1}^{[rT]}\hat{v}_{it} = T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\hat{u}_{it} = T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \bar{u}_i - \tilde{x}_{it}'(\hat{\beta}-\beta)\right)$$
$$= T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta)$$
$$= T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \left(T^{-1}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)\right)\left(T^{-1/2}\sum_{t=1}^{T}u_{it}\right) - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta)$$
$$= T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta) + o_p(1)$$
using Assumptions 1 and 2. Using (8), (9), and Assumption 2 it follows that
$$T^{-1/2}\hat{S}_{i[rT]} \Rightarrow B_k^i(r) - rQ_iQ^{-1}B_k(1) \equiv \tilde{B}_k^i(r). \quad (10)$$
Using (9) and (10) the results for the $ave(n)$ statistics are straightforward to establish. A formal proof is only provided for the case of the $ave(n)$ statistics using the Bartlett kernel. The other cases are analogous and only differ in the algebraic expressions. The proof proceeds by writing $\hat{V}^{Bart}_{ave(n)}$ in terms of $T^{-1/2}\hat{S}_{i[rT]}$. From calculations in Hashimzade and Vogelsang (2008) we have for the case of the Bartlett kernel
$$\hat{\Omega}_i = T^{-1}\sum_{t=1}^{T}\sum_{s=1}^{T}K_{ts}\hat{v}_{it}\hat{v}_{is}' = \frac{2}{M}\sum_{t=1}^{T-1}T^{-1/2}\hat{S}_{it}\,T^{-1/2}\hat{S}_{it}' - \frac{1}{M}\sum_{t=1}^{T-M-1}\left(T^{-1/2}\hat{S}_{it}\,T^{-1/2}\hat{S}_{i,t+M}' + T^{-1/2}\hat{S}_{i,t+M}\,T^{-1/2}\hat{S}_{it}'\right)$$
$$- \frac{1}{M}\sum_{t=T-M}^{T-1}\left(T^{-1/2}\hat{S}_{it}\,T^{-1/2}\hat{S}_{iT}' + T^{-1/2}\hat{S}_{iT}\,T^{-1/2}\hat{S}_{it}'\right) + T^{-1/2}\hat{S}_{iT}\,T^{-1/2}\hat{S}_{iT}'.$$
Setting $M = [bT]$ and applying (10) and the continuous mapping theorem gives
$$\hat{\Omega}_i \Rightarrow \frac{2}{b}\int_0^1\tilde{B}_k^i(r)\tilde{B}_k^i(r)'\,dr - \frac{1}{b}\int_0^{1-b}\left(\tilde{B}_k^i(r+b)\tilde{B}_k^i(r)' + \tilde{B}_k^i(r)\tilde{B}_k^i(r+b)'\right)dr$$
$$- \frac{1}{b}\int_{1-b}^{1}\left(\tilde{B}_k^i(1)\tilde{B}_k^i(r)' + \tilde{B}_k^i(r)\tilde{B}_k^i(1)'\right)dr + \tilde{B}_k^i(1)\tilde{B}_k^i(1)' = P(b, \tilde{B}_k^i), \quad (11)$$
using (10). Under the null hypothesis that $R\beta = r$, it follows that $R\hat{\beta} - r = R\hat{\beta} - R\beta = R(\hat{\beta}-\beta)$. Therefore we have
$$Wald^{Bart}_{ave(n)} = (R\hat{\beta}-r)'\left(R\hat{V}^{Bart}_{ave(n)}R'\right)^{-1}(R\hat{\beta}-r)$$
$$= (R(\hat{\beta}-\beta))'\left(RT\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\left(\sum_{i=1}^{n}\hat{\Omega}_i\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}R'\right)^{-1}R(\hat{\beta}-\beta)$$
$$= (R\sqrt{T}(\hat{\beta}-\beta))'\left(R\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\left(\sum_{i=1}^{n}\hat{\Omega}_i\right)\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}R'\right)^{-1}R\sqrt{T}(\hat{\beta}-\beta)$$
$$\Rightarrow \left(RQ^{-1}B_k(1)\right)'\left(RQ^{-1}\left(\sum_{i=1}^{n}P(b,\tilde{B}_k^i)\right)Q^{-1}R'\right)^{-1}RQ^{-1}B_k(1),$$
using (9) and (11). The proof for $t^{Bart}_{ave(n)}$ follows trivially.
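The fourth step of the proof relies on the Hashimzade and Vogelsang (2008) identity that rewrites the Bartlett HAC estimator exactly in terms of partial sums. That identity is easy to verify numerically. The following sketch (plain Python, with hypothetical function names, not code from the paper) computes the scalar Bartlett estimator both as the direct kernel-weighted double sum and via the partial-sum formula used in the displays above.

```python
import random

def bartlett_hac_direct(v, M):
    """Direct double sum: T^{-1} sum_t sum_s K_ts v_t v_s with
    Bartlett weights K_ts = max(0, 1 - |t - s| / M)."""
    T = len(v)
    return sum(max(0.0, 1.0 - abs(t - s) / M) * v[t] * v[s]
               for t in range(T) for s in range(T)) / T

def bartlett_hac_partial_sums(v, M):
    """Equivalent expression in terms of partial sums S_t = v_1 + ... + v_t
    (Hashimzade and Vogelsang, 2008), as used in the proofs of Theorems 1-4."""
    T = len(v)
    S, acc = [], 0.0
    for x in v:
        acc += x
        S.append(acc)                      # S[t-1] holds S_t
    term1 = (2.0 / M) * sum(S[t - 1] ** 2 for t in range(1, T))
    term2 = (1.0 / M) * sum(2.0 * S[t - 1] * S[t + M - 1]
                            for t in range(1, T - M))
    term3 = (1.0 / M) * sum(2.0 * S[t - 1] * S[T - 1]
                            for t in range(T - M, T))
    term4 = S[T - 1] ** 2
    return (term1 - term2 - term3 + term4) / T

random.seed(0)
v = [random.gauss(0.0, 1.0) for _ in range(50)]
diff = abs(bartlett_hac_direct(v, 10) - bartlett_hac_partial_sums(v, 10))
```

For any input series and any bandwidth $1 \le M \le T-1$ the two functions agree up to rounding error, which is why the proofs can study the standard errors entirely through the partial-sum process $T^{-1/2}\hat{S}_{i[rT]}$.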
The result for $Wald_{HACSC}$ requires the limit of $T^{-1/2}\hat{S}_{[rT]}$, which using (10) is given by
$$T^{-1/2}\hat{S}_{[rT]} = T^{-1/2}\sum_{t=1}^{[rT]}\hat{v}_t = T^{-1/2}\sum_{t=1}^{[rT]}\sum_{i=1}^{n}\hat{v}_{it} = \sum_{i=1}^{n}T^{-1/2}\hat{S}_{i[rT]} \Rightarrow \sum_{i=1}^{n}\tilde{B}_k^i(r) = \sum_{i=1}^{n}\left(B_k^i(r) - rQ_iQ^{-1}B_k(1)\right) = \sum_{i=1}^{n}B_k^i(r) - rB_k(1)$$
$$= \sum_{i=1}^{n}\lambda_i\Lambda W(r) - r\sum_{j=1}^{n}\lambda_j\Lambda W(1) = \lambda\Lambda\left(W(r) - rW(1)\right) = \lambda\Lambda\tilde{W}(r), \quad (12)$$
where $\lambda \equiv \sum_{i=1}^{n}\lambda_i$ and $\tilde{W}(r) = W(r) - rW(1)$ is an $nk \times 1$ vector of Brownian bridges. Again focusing on the
Bartlett kernel, algebra from Hashimzade and Vogelsang (2008) gives
$$\hat{\Omega} = T^{-1}\sum_{t=1}^{T}\sum_{s=1}^{T}K_{ts}\hat{v}_t\hat{v}_s' = \frac{2}{M}\sum_{t=1}^{T-1}T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_t' - \frac{1}{M}\sum_{t=1}^{T-M-1}\left(T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_{t+M}' + T^{-1/2}\hat{S}_{t+M}\,T^{-1/2}\hat{S}_t'\right)$$
$$- \frac{1}{M}\sum_{t=T-M}^{T-1}\left(T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_T' + T^{-1/2}\hat{S}_T\,T^{-1/2}\hat{S}_t'\right) + T^{-1/2}\hat{S}_T\,T^{-1/2}\hat{S}_T'$$
$$= \frac{2}{M}\sum_{t=1}^{T-1}T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_t' - \frac{1}{M}\sum_{t=1}^{T-M-1}\left(T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_{t+M}' + T^{-1/2}\hat{S}_{t+M}\,T^{-1/2}\hat{S}_t'\right),$$
using the fact that $\hat{S}_T = 0$ by the OLS normal equations. Continuing the calculation with $M = [bT]$,
$$\hat{\Omega} = \frac{2}{[bT]}\sum_{t=1}^{T-1}T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_t' - \frac{1}{[bT]}\sum_{t=1}^{T-[bT]-1}\left(T^{-1/2}\hat{S}_t\,T^{-1/2}\hat{S}_{t+[bT]}' + T^{-1/2}\hat{S}_{t+[bT]}\,T^{-1/2}\hat{S}_t'\right)$$
$$\Rightarrow \lambda\Lambda\left(\frac{2}{b}\int_0^1\tilde{W}(r)\tilde{W}(r)'\,dr - \frac{1}{b}\int_0^{1-b}\left(\tilde{W}(r+b)\tilde{W}(r)' + \tilde{W}(r)\tilde{W}(r+b)'\right)dr\right)\Lambda'\lambda' = \lambda\Lambda P(b,\tilde{W})\Lambda'\lambda',$$
using (12). It directly follows that
$$R\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\hat{\Omega}\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}R' \Rightarrow RQ^{-1}\lambda\Lambda P(b,\tilde{W})\Lambda'\lambda'Q^{-1}R' = P\left(b, RQ^{-1}\lambda\Lambda\tilde{W}\right) = \lambda_q P(b,\tilde{W}_q)\lambda_q',$$
where $\lambda_q$ is the matrix square root of the matrix $RQ^{-1}\lambda\Lambda\Lambda'\lambda'Q^{-1}R'$ and $\tilde{W}_q(r) = W_q(r) - rW_q(1)$ is the $q \times 1$ vector of standard Brownian bridges defined in Theorem 1. Using (9),
$$R\sqrt{T}(\hat{\beta}-\beta) \Rightarrow RQ^{-1}B_k(1) = RQ^{-1}\lambda\Lambda W(1) = \lambda_q W_q(1).$$
Simple algebra gives
$$Wald^{Bart}_{HACSC} = (R\hat{\beta}-r)'\left(R\hat{V}^{Bart}_{HACSC}R'\right)^{-1}(R\hat{\beta}-r)$$
$$= (R(\hat{\beta}-\beta))'\left(RT\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\hat{\Omega}\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}R'\right)^{-1}R(\hat{\beta}-\beta)$$
$$= (R\sqrt{T}(\hat{\beta}-\beta))'\left(R\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\hat{\Omega}\left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}R'\right)^{-1}R\sqrt{T}(\hat{\beta}-\beta)$$
$$\Rightarrow \left(\lambda_q W_q(1)\right)'\left(\lambda_q P(b,\tilde{W}_q)\lambda_q'\right)^{-1}\lambda_q W_q(1) = W_q(1)'P(b,\tilde{W}_q)^{-1}W_q(1),$$
completing the proof. The results for other kernels and the $t_{HACSC}$ statistic use the same arguments, differing only in the algebraic formulas.
Proof of Corollary 2. Under random sampling in the cross-section, $Q_i = Q_0$ for all $i$ and $\Lambda$ is a block diagonal matrix with a $k \times k$ matrix, say $\Lambda_0$, along the block diagonal. In other words, we have $\Lambda = I_n \otimes \Lambda_0$. It follows directly that
$$\lambda_i\Lambda W(r) = (e_i' \otimes I_k)(I_n \otimes \Lambda_0)W(r) = \Lambda_0 W_k^i(r),$$
where $W_k^i(r)$ is a $k \times 1$ vector of standard Wiener processes and $W_k^i(r)$ is independent of $W_k^j(r)$ for $i \neq j$. Applying these simplifications to (9) and (10) gives
$$\sqrt{T}(\hat{\beta}-\beta) \Rightarrow (nQ_0)^{-1}\Lambda_0\sum_{i=1}^{n}W_k^i(1) = Q_0^{-1}\Lambda_0\,n^{-1}\sum_{i=1}^{n}W_k^i(1),$$
$$T^{-1/2}\hat{S}_{i[rT]} \Rightarrow \Lambda_0 W_k^i(r) - rQ_0(nQ_0)^{-1}\Lambda_0\sum_{j=1}^{n}W_k^j(1) = \Lambda_0\left(W_k^i(r) - r\,n^{-1}\sum_{j=1}^{n}W_k^j(1)\right) = \Lambda_0\tilde{W}_k^i(r).$$
For the case of the Bartlett kernel, the limit of $\hat{\Omega}_i$ is given by $\Lambda_0 P(b,\tilde{W}_k^i)\Lambda_0'$ and the result for $Wald^{Bart}_{ave(n)}$ becomes
$$Wald^{Bart}_{ave(n)} \Rightarrow \left(RQ_0^{-1}\Lambda_0\,n^{-1}\sum_{i=1}^{n}W_k^i(1)\right)'\left(Rn^{-1}Q_0^{-1}\left(\Lambda_0\sum_{i=1}^{n}P(b,\tilde{W}_k^i)\Lambda_0'\right)n^{-1}Q_0^{-1}R'\right)^{-1}RQ_0^{-1}\Lambda_0\,n^{-1}\sum_{i=1}^{n}W_k^i(1)$$
$$= \left(RQ_0^{-1}\Lambda_0\sum_{i=1}^{n}W_k^i(1)\right)'\left(RQ_0^{-1}\left(\Lambda_0\sum_{i=1}^{n}P(b,\tilde{W}_k^i)\Lambda_0'\right)Q_0^{-1}R'\right)^{-1}RQ_0^{-1}\Lambda_0\sum_{i=1}^{n}W_k^i(1)$$
$$= \left(\sum_{i=1}^{n}W_q^i(1)\right)'\left(\sum_{i=1}^{n}P(b,\tilde{W}_q^i)\right)^{-1}\sum_{i=1}^{n}W_q^i(1),$$
using straightforward arguments from Kiefer and Vogelsang (2005) to cancel the nuisance parameters $RQ_0^{-1}\Lambda_0$. Results for other kernels and $t_{ave(n)}$ follow similarly. The result for $Wald^{clus}_{ave(n)}$ and $t^{clus}_{ave(n)}$ is shown by Hansen (2007).
Proof of Theorem 2. The key step is to show that the limits of $\sqrt{T}(\hat{\beta}-\beta)$ and $T^{-1/2}\hat{S}_{i[rT]}$ take the same form as in Theorem 1 with $B_k^i(r) = \lambda_i^{ex}\Lambda^{ex}W^{ex}(r)$, and to show that the limit of $T^{-1/2}\hat{S}_{[rT]}$ is proportional to the Brownian bridge $W^{ex}(r) - rW^{ex}(1)$. Once these results are obtained, the rest of the proof closely follows the proof of Theorem 1 and details are omitted. With both individual and time period fixed-effect dummies in the model it follows that
$$\tilde{x}_{it}u_{jt} = (x_{it}-\bar{x}_i)u_{jt} - n^{-1}\sum_{l=1}^{n}(x_{lt}-\bar{x}_l)u_{jt}$$
$$= ((x_{it}-\mu_i)-(\bar{x}_i-\mu_i))u_{jt} - n^{-1}\sum_{l=1}^{n}((x_{lt}-\mu_l)-(\bar{x}_l-\mu_l))u_{jt}$$
$$= (x_{it}-\mu_i)u_{jt} - n^{-1}\sum_{l=1}^{n}(x_{lt}-\mu_l)u_{jt} - (\bar{x}_i-\mu_i)u_{jt} + n^{-1}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)u_{jt}$$
$$= \nu_t^{ij} - n^{-1}\sum_{l=1}^{n}\nu_t^{lj} - (\bar{x}_i-\mu_i)u_{jt} + n^{-1}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)u_{jt}$$
$$= (\tilde{e}_i' \otimes e_j' \otimes I_k)\nu_t^{ex} - (\bar{x}_i-\mu_i)u_{jt} + n^{-1}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)u_{jt} = \tilde{\lambda}_{ij}\nu_t^{ex} - (\bar{x}_i-\mu_i)u_{jt} + n^{-1}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)u_{jt}.$$
Using this formula it directly follows that
$$T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{jt} = \tilde{\lambda}_{ij}T^{-1/2}\sum_{t=1}^{[rT]}\nu_t^{ex} - (\bar{x}_i-\mu_i)T^{-1/2}\sum_{t=1}^{[rT]}u_{jt} + n^{-1}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)T^{-1/2}\sum_{t=1}^{[rT]}u_{jt} = \tilde{\lambda}_{ij}T^{-1/2}\sum_{t=1}^{[rT]}\nu_t^{ex} + o_p(1)$$
using Assumptions 1 and 2. Using Assumption 4 we have
$$T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{jt} \Rightarrow \tilde{\lambda}_{ij}\Lambda^{ex}W^{ex}(r). \quad (13)$$
Using Assumption 2 and (13) it immediately follows that
$$\sqrt{T}(\hat{\beta}-\beta) = \left(\sum_{i=1}^{n}T^{-1}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\sum_{i=1}^{n}T^{-1/2}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \Rightarrow Q^{-1}\sum_{i=1}^{n}\tilde{\lambda}_{ii}\Lambda^{ex}W^{ex}(1)$$
$$= Q^{-1}\sum_{i=1}^{n}\left(\tilde{\lambda}_{ii} - \frac{1}{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij}\right)\Lambda^{ex}W^{ex}(1) = Q^{-1}\sum_{i=1}^{n}\lambda_i^{ex}\Lambda^{ex}W^{ex}(1) = Q^{-1}\sum_{i=1}^{n}B_k^i(1) = Q^{-1}B_k(1). \quad (14)$$
From (14) it follows that $B_k^i(1) = \lambda_i^{ex}\Lambda^{ex}W^{ex}(1)$ as required. Note that this result uses the relationship
$$\sum_{i=1}^{n}\tilde{\lambda}_{ii} = \sum_{i=1}^{n}\left(\tilde{\lambda}_{ii} - \frac{1}{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij}\right),$$
which follows from
$$\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}(\tilde{e}_i' \otimes e_j' \otimes I_k) = \frac{1}{n}\sum_{j=1}^{n}\left(\left(\sum_{i=1}^{n}\tilde{e}_i'\right) \otimes e_j' \otimes I_k\right) = 0,$$
using $\sum_{i=1}^{n}\tilde{e}_i' = 0$. The result for $T^{-1/2}\hat{S}_{i[rT]}$
is given next. Direct calculation gives
$$T^{-1/2}\hat{S}_{i[rT]} = T^{-1/2}\sum_{t=1}^{[rT]}\hat{v}_{it} = T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\hat{u}_{it} = T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \bar{u}_i - \frac{1}{n}\sum_{j=1}^{n}(u_{jt}-\bar{u}_j) - \tilde{x}_{it}'(\hat{\beta}-\beta)\right)$$
$$= T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \frac{1}{n}\sum_{j=1}^{n}u_{jt}\right) - \left(T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\right)\left(T^{1/2}\bar{u}_i - \frac{1}{n}\sum_{j=1}^{n}T^{1/2}\bar{u}_j\right) - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta).$$
From Assumption 1 it follows that
$$T^{1/2}\bar{u}_i - \frac{1}{n}\sum_{j=1}^{n}T^{1/2}\bar{u}_j = T^{-1/2}\sum_{t=1}^{T}u_{it} - \frac{1}{n}\sum_{j=1}^{n}T^{-1/2}\sum_{t=1}^{T}u_{jt} = O_p(1),$$
and using Assumption 2 it follows that
$$T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it} = o_p(1),$$
given that $E(\tilde{x}_{it}) = 0$. Therefore,
$$T^{-1/2}\hat{S}_{i[rT]} = T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \frac{1}{n}\sum_{j=1}^{n}u_{jt}\right) - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta) + o_p(1)$$
$$= T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{n}\sum_{j=1}^{n}\left(T^{-1/2}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{jt}\right) - T^{-1}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{T}(\hat{\beta}-\beta) + o_p(1)$$
$$\Rightarrow \tilde{\lambda}_{ii}\Lambda^{ex}W^{ex}(r) - \frac{1}{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij}\Lambda^{ex}W^{ex}(r) - rQ_iQ^{-1}B_k(1)$$
$$= \left(\tilde{\lambda}_{ii} - \frac{1}{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij}\right)\Lambda^{ex}W^{ex}(r) - rQ_iQ^{-1}B_k(1) = \lambda_i^{ex}\Lambda^{ex}W^{ex}(r) - rQ_iQ^{-1}B_k(1) = B_k^i(r) - rQ_iQ^{-1}B_k(1), \quad (15)$$
where convergence follows from Assumption 2, (13) and (14). Using (15) the result for $T^{-1/2}\hat{S}_{[rT]}$ easily follows:
$$T^{-1/2}\hat{S}_{[rT]} = \sum_{i=1}^{n}T^{-1/2}\hat{S}_{i[rT]} \Rightarrow \sum_{i=1}^{n}\left(B_k^i(r) - rQ_iQ^{-1}B_k(1)\right) = \sum_{i=1}^{n}B_k^i(r) - rB_k(1)$$
$$= \sum_{i=1}^{n}\lambda_i^{ex}\Lambda^{ex}W^{ex}(r) - r\sum_{j=1}^{n}\lambda_j^{ex}\Lambda^{ex}W^{ex}(1) = \lambda^{ex}\Lambda^{ex}\left(W^{ex}(r) - rW^{ex}(1)\right) = \lambda^{ex}\Lambda^{ex}\tilde{W}^{ex}(r),$$
where $\lambda^{ex} = \sum_{i=1}^{n}\lambda_i^{ex}$.
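The cancellation $\sum_{i=1}^{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij} = 0$ used in the proof comes from $\sum_{i=1}^{n}\tilde{e}_i = 0$. The following small sketch (an illustration with hypothetical helper names, not code from the paper) verifies the vector part of this identity, $\sum_i\sum_j(\tilde{e}_i \otimes e_j) = 0$, for a small $n$; the extra $\otimes I_k$ factor does not affect the zero.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors represented as lists."""
    return [x * y for x in a for y in b]

n = 4
e = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# e_tilde_i = e_i - (1/n) * iota, so the e_tilde_i sum to the zero vector
e_tilde = [[e[i][j] - 1.0 / n for j in range(n)] for i in range(n)]

total = [0.0] * (n * n)
for i in range(n):
    for j in range(n):
        v = kron_vec(e_tilde[i], e[j])   # vector part of lambda~_{ij}
        total = [t + x for t, x in zip(total, v)]
```

Each individual $\tilde{e}_i \otimes e_j$ is nonzero, but the double sum collapses to the zero vector, exactly as the proof requires.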
Proof of Corollary 3. Under random sampling in the cross-section, $Q_i = Q_0$ for all $i$ and $\Lambda^{ex}$ is a block diagonal matrix with a $k \times k$ matrix, say $\Lambda_0$, along the block diagonal. In other words, we have $\Lambda^{ex} = I_{n^2} \otimes \Lambda_0$. The form of $\lambda_i^{ex}\Lambda^{ex}W^{ex}(r)$ can be written as
$$\lambda_i^{ex}\Lambda^{ex}W^{ex}(r) = \left(\tilde{\lambda}_{ii} - \frac{1}{n}\sum_{j=1}^{n}\tilde{\lambda}_{ij}\right)(I_{n^2} \otimes \Lambda_0)W^{ex}(r) = \Lambda_0\tilde{\tilde{W}}_k^{ii}(r),$$
where $\tilde{\tilde{W}}_k^{ii}(r) = \tilde{W}_k^{ii}(r) - n^{-1}\sum_{l=1}^{n}\tilde{W}_k^{li}(r)$, $\tilde{W}_k^{ij}(r) = W_k^{ij}(r) - n^{-1}\sum_{l=1}^{n}W_k^{il}(r)$, and the $W_k^{ij}(r)$ are $k \times 1$ vectors of independent Wiener processes that are independent of each other. The asymptotic results (14) and (15) simplify to
$$\sqrt{T}(\hat{\beta}-\beta) \Rightarrow Q_0^{-1}\Lambda_0\,n^{-1}\sum_{i=1}^{n}\tilde{\tilde{W}}_k^{ii}(1), \quad (16)$$
$$T^{-1/2}\hat{S}_{i[rT]} \Rightarrow \Lambda_0\left(\tilde{\tilde{W}}_k^{ii}(r) - r\,n^{-1}\sum_{l=1}^{n}\tilde{\tilde{W}}_k^{ll}(1)\right). \quad (17)$$
For the case of the Bartlett kernel, the limit of $\hat{\Omega}_i$ can be written as $\Lambda_0 P(b,\breve{W}_k^{ii})\Lambda_0'$ where $\breve{W}_k^{ii}(r) = \tilde{\tilde{W}}_k^{ii}(r) - r\,n^{-1}\sum_{l=1}^{n}\tilde{\tilde{W}}_k^{ll}(1)$. Using this result along with (16) and (17), the representation for the limit of $Wald^{Bart}_{ave(n)}$ in the corollary is easily obtained.
Now consider $Wald^{clus}_{ave(n)}$. Using (16) and (17), it is easy to show that
$$Wald^{clus}_{ave(n)} \Rightarrow \left(\sum_{i=1}^{n}\tilde{\tilde{W}}_q^{ii}(1)\right)'\left(\sum_{i=1}^{n}\left(\tilde{\tilde{W}}_q^{ii}(1) - \bar{W}_q(1)\right)\left(\tilde{\tilde{W}}_q^{ii}(1) - \bar{W}_q(1)\right)'\right)^{-1}\sum_{i=1}^{n}\tilde{\tilde{W}}_q^{ii}(1),$$
where $\bar{W}_q(1) = n^{-1}\sum_{l=1}^{n}\tilde{\tilde{W}}_q^{ll}(1)$. It is straightforward to show that $E\left[\tilde{\tilde{W}}_q^{ii}(1)\tilde{\tilde{W}}_q^{ii}(1)'\right] = \left(\frac{n-1}{n}\right)^2 I_q$ and $E\left[\tilde{\tilde{W}}_q^{ii}(1)\tilde{\tilde{W}}_q^{jj}(1)'\right] = \frac{1}{n^2}I_q$ for $i \neq j$. Given this variance-covariance structure and given that each $\tilde{\tilde{W}}_q^{ii}(1)$ is a vector of normal random variables, the vectors $\tilde{\tilde{W}}_q^{11}(1), \tilde{\tilde{W}}_q^{22}(1), \ldots, \tilde{\tilde{W}}_q^{nn}(1)$ can be represented as follows. Let $U_1, U_2, \ldots, U_n$ be i.i.d. $q \times 1$ vectors of mean zero normal random variables each with variance-covariance matrix $\frac{n-2}{n}I_q$. Let $V$ be a $q \times 1$ vector of mean zero normal random variables with variance-covariance matrix $\frac{1}{n^2}I_q$. Then the collection of random vectors $U_1+V, U_2+V, \ldots, U_n+V$ has the same joint distribution as $\tilde{\tilde{W}}_q^{11}(1), \tilde{\tilde{W}}_q^{22}(1), \ldots, \tilde{\tilde{W}}_q^{nn}(1)$. This allows an alternative representation for the asymptotic distribution of $Wald^{clus}_{ave(n)}$ given by
$$\left(\sum_{i=1}^{n}(U_i+V)\right)'\left(\sum_{i=1}^{n}(U_i-\bar{U})(U_i-\bar{U})'\right)^{-1}\sum_{i=1}^{n}(U_i+V),$$
where $\bar{U} = n^{-1}\sum_{i=1}^{n}U_i$. Given well known results from regression analysis, $\sum_{i=1}^{n}U_i$ is independent of $\sum_{i=1}^{n}(U_i-\bar{U})(U_i-\bar{U})'$ and therefore $\sum_{i=1}^{n}(U_i+V)$ is independent of $\sum_{i=1}^{n}(U_i-\bar{U})(U_i-\bar{U})'$. Scaling by $\sqrt{\frac{n}{n-2}}$ standardizes the variances of the $U_i$ random variables and gives
$$\left(\sqrt{\frac{n}{n-2}}\sum_{i=1}^{n}(U_i+V)\right)'\left(\sum_{i=1}^{n}\frac{n}{n-2}(U_i-\bar{U})(U_i-\bar{U})'\right)^{-1}\sqrt{\frac{n}{n-2}}\sum_{i=1}^{n}(U_i+V),$$
which is equivalent in distribution to the random variable
$$\frac{n(n-1)}{n-2}Z'\left(\sum_{i=1}^{n}(Z_i-\bar{Z})(Z_i-\bar{Z})'\right)^{-1}Z,$$
where $Z_1, Z_2, \ldots, Z_n, Z$ are i.i.d. $q \times 1$ vectors of standard normal random variables. The constant $\frac{n(n-1)}{n-2}$ appears because the variance-covariance matrix of $\sqrt{\frac{n}{n-2}}\sum_{i=1}^{n}(U_i+V)$ is $\frac{n(n-1)}{n-2}I_q$. It is well known from the multivariate statistics literature that
$$Z'\left(\frac{1}{n-1}\sum_{i=1}^{n}(Z_i-\bar{Z})(Z_i-\bar{Z})'\right)^{-1}Z$$
is Hotelling's T-squared random variable and is distributed $\frac{q(n-1)}{n-q}F_{q,n-q}$. Therefore, it follows that
$$Wald^{clus}_{ave(n)} \Rightarrow \frac{n(n-1)}{n-2}Z'\left(\sum_{i=1}^{n}(Z_i-\bar{Z})(Z_i-\bar{Z})'\right)^{-1}Z = \frac{n}{n-2}Z'\left(\frac{1}{n-1}\sum_{i=1}^{n}(Z_i-\bar{Z})(Z_i-\bar{Z})'\right)^{-1}Z \sim \frac{qn(n-1)}{(n-2)(n-q)}F_{q,n-q}.$$
When $q = 1$, the result for the $t^{clus}_{ave(n)}$ statistic is
$$t^{clus}_{ave(n)} \Rightarrow \frac{\sqrt{\frac{n(n-1)}{n-2}}\,Z}{\sqrt{\sum_{i=1}^{n}(Z_i-\bar{Z})^2}} = \frac{\sqrt{\frac{n}{n-2}}\,Z}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(Z_i-\bar{Z})^2}} \sim \sqrt{\frac{n}{n-2}}\,t_{n-1}.$$
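The fixed-n limit just derived is easy to check by simulation. The sketch below (an illustration, not part of the paper) draws from the $q = 1$ limiting random variable for $t^{clus}_{ave(n)}$ and examines its tails; by the $\sqrt{n/(n-2)}\,t_{n-1}$ characterization, for $n = 10$ the 97.5th percentile should be near $\sqrt{10/8} \times 2.262 \approx 2.53$, noticeably larger than the standard normal value 1.96.

```python
import random

def simulate_t_clus_limit(n, reps, seed=1234):
    """Monte Carlo draws from sqrt(n(n-1)/(n-2)) * Z / sqrt(sum_i (Z_i - Zbar)^2),
    the q = 1 fixed-n limit of the cluster t-statistic."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = rng.gauss(0.0, 1.0)
        zbar = sum(zs) / n
        ssq = sum((zi - zbar) ** 2 for zi in zs)
        draws.append((n * (n - 1) / (n - 2)) ** 0.5 * z / ssq ** 0.5)
    return draws

draws = simulate_t_clus_limit(n=10, reps=20000)
draws.sort()
crit = draws[int(0.975 * len(draws))]  # empirical 97.5th percentile
```

The same draws also show directly why N(0,1) critical values over-reject in the tables: well over 5% of the draws exceed 1.96 in absolute value.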
Proof of Theorem 3: Using algebra from the proof of Theorem 1,
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)u_{it} - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(\bar{x}_i-\mu_i)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)u_{it} - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}T^{-1}\sum_{s=1}^{T}(x_{is}-\mu_i)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)u_{it} - T^{-1/2}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{s=1}^{T}(x_{is}-\mu_i)u_{it}.$$
By Assumption 8,
$$\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{s=1}^{T}(x_{is}-\mu_i)u_{it} = O_p(1),$$
and it follows that
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)u_{it} - T^{-1/2}O_p(1) \Rightarrow \Lambda W_k(1,r), \quad (18)$$
using Assumption 7. Using Assumption 5 and (18) it immediately follows that
$$\sqrt{nT}(\hat{\beta}-\beta) = \left(\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \Rightarrow Q^{-1}\Lambda W_k(1,1). \quad (19)$$
Now consider the limit of $\frac{1}{\sqrt{nT}}\hat{S}_{[rT]}$. Simple algebra gives
$$\frac{1}{\sqrt{nT}}\hat{S}_{[rT]} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\hat{u}_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \bar{u}_i - \tilde{x}_{it}'(\hat{\beta}-\beta)\right)$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{nT}(\hat{\beta}-\beta). \quad (20)$$
Focusing on the middle term of (20):
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\left((x_{it}-\mu_i)-(\bar{x}_i-\mu_i)\right)\bar{u}_i = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)\bar{u}_i - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(\bar{x}_i-\mu_i)\bar{u}_i$$
$$= T^{-1/2}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{s=1}^{T}(x_{it}-\mu_i)u_{is} - \frac{[rT]}{\sqrt{nT}}\sum_{i=1}^{n}(\bar{x}_i-\mu_i)\bar{u}_i$$
$$= T^{-1/2}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{s=1}^{T}(x_{it}-\mu_i)u_{is} - T^{-1/2}\frac{[rT]}{T}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sum_{s=1}^{T}(x_{it}-\mu_i)u_{is}$$
$$= T^{-1/2}O_p(1) - T^{-1/2}O_p(1) = o_p(1),$$
using Assumption 7. Therefore it follows that
$$\frac{1}{\sqrt{nT}}\hat{S}_{[rT]} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{nT}(\hat{\beta}-\beta) + o_p(1)$$
$$\Rightarrow \Lambda W_k(1,r) - rQQ^{-1}\Lambda W_k(1,1) = \Lambda\left(W_k(1,r) - rW_k(1,1)\right). \quad (21)$$
Because $W_k(1,r)$ is a standard Wiener process in $r$ and $W_k(1,r) - rW_k(1,1)$ is a standard Brownian bridge, the results for the HACSC statistics follow directly from Kiefer and Vogelsang (2005) using (19) and (21).
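The limit in (21) is driven by the Brownian bridge $W_k(1,r) - rW_k(1,1)$. As a small illustration (not from the paper), the following sketch approximates bridge paths by normalized partial sums of i.i.d. N(0,1) increments and checks the bridge covariance $\min(r,s) - rs$ by Monte Carlo.

```python
import random

def bridge_paths(num_paths, steps, seed=7):
    """Approximate Brownian bridge paths W(r) - r W(1) via scaled partial sums."""
    rng = random.Random(seed)
    paths = []
    for _ in range(num_paths):
        w = [0.0]
        for _ in range(steps):
            w.append(w[-1] + rng.gauss(0.0, 1.0) / steps ** 0.5)
        w1 = w[-1]
        paths.append([w[t] - (t / steps) * w1 for t in range(steps + 1)])
    return paths

paths = bridge_paths(5000, 200)
r_idx, s_idx = 60, 140  # r = 0.3, s = 0.7
cov = sum(p[r_idx] * p[s_idx] for p in paths) / len(paths)
# theoretical bridge covariance: min(0.3, 0.7) - 0.3*0.7 = 0.09
```

The same paths also reproduce the bridge variance $r(1-r)$ at any single point, which is the source of the nonstandard fixed-b limiting distributions.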
Proof of Theorem 4: With individual and time period fixed-effects in the model we have
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\left(x_{it}-\bar{x}_i-\frac{1}{n}\sum_{l=1}^{n}(x_{lt}-\bar{x}_l)\right)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} - \frac{1}{n}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}(x_{lt}-\bar{x}_l)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} - \frac{1}{n}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}\left((x_{lt}-\mu_l)-(\bar{x}_l-\mu_l)\right)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} - n^{-1/2}\frac{1}{\sqrt{n^2T}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}(x_{lt}-\mu_l)u_{it} + (nT)^{-1/2}\frac{1}{\sqrt{n^2T^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}\sum_{s=1}^{T}(x_{ls}-\mu_l)u_{it}$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} - n^{-1/2}O_p(1) + (nT)^{-1/2}O_p(1),$$
using Assumption 10. Noting that Assumption 10 implies Assumption 8, it follows from (18) that
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\bar{x}_i)u_{it} + o_p(1) \Rightarrow \Lambda W_k(1,r), \quad (22)$$
and it directly follows that
$$\sqrt{nT}(\hat{\beta}-\beta) = \left(\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{x}_{it}u_{it} \Rightarrow Q^{-1}\Lambda W_k(1,1). \quad (23)$$
Now consider the limit of $\frac{1}{\sqrt{nT}}\hat{S}_{[rT]}$. Simple algebra gives
$$\frac{1}{\sqrt{nT}}\hat{S}_{[rT]} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\left(u_{it} - \bar{u}_i - \frac{1}{n}\sum_{l=1}^{n}(u_{lt}-\bar{u}_l) - \tilde{x}_{it}'(\hat{\beta}-\beta)\right)$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i - \frac{1}{n}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}\tilde{x}_{it}(u_{lt}-\bar{u}_l) - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{nT}(\hat{\beta}-\beta)$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{nT}(\hat{\beta}-\beta), \quad (24)$$
using
$$\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}\tilde{x}_{it}u_{lt} = \sum_{t=1}^{[rT]}\sum_{i=1}^{n}\sum_{l=1}^{n}\tilde{x}_{it}u_{lt} = \sum_{t=1}^{[rT]}\left(\sum_{i=1}^{n}\tilde{x}_{it}\right)\left(\sum_{l=1}^{n}u_{lt}\right) = 0,$$
because $\sum_{i=1}^{n}\tilde{x}_{it} = 0$; the same argument removes the $\bar{u}_l$ piece of the third term. Once it is established that the middle term of (24) is $o_p(1)$, the limit of (24) is easily obtained using Assumption 5, (22) and (23). Expanding the middle term gives
$$\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\bar{u}_i = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\left(x_{it}-\bar{x}_i-\frac{1}{n}\sum_{l=1}^{n}(x_{lt}-\bar{x}_l)\right)\bar{u}_i$$
$$= \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(x_{it}-\mu_i)\bar{u}_i - \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}(\bar{x}_i-\mu_i)\bar{u}_i - \frac{1}{n}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}(x_{lt}-\mu_l)\bar{u}_i + \frac{1}{n}\frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}(\bar{x}_l-\mu_l)\bar{u}_i$$
$$= T^{-1/2}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{s=1}^{T}(x_{it}-\mu_i)u_{is} - T^{-1/2}\frac{[rT]}{T}\frac{1}{\sqrt{nT^2}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sum_{s=1}^{T}(x_{it}-\mu_i)u_{is}$$
$$- (nT)^{-1/2}\frac{1}{\sqrt{n^2T^2}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\sum_{l=1}^{n}\sum_{s=1}^{T}(x_{lt}-\mu_l)u_{is} + (nT)^{-1/2}\frac{[rT]}{T}\frac{1}{\sqrt{n^2T^2}}\sum_{i=1}^{n}\sum_{l=1}^{n}\sum_{t=1}^{T}\sum_{s=1}^{T}(x_{lt}-\mu_l)u_{is}$$
$$= T^{-1/2}O_p(1) = o_p(1),$$
using Assumption 10. Therefore
$$\frac{1}{\sqrt{nT}}\hat{S}_{[rT]} = \frac{1}{\sqrt{nT}}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}u_{it} - \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{[rT]}\tilde{x}_{it}\tilde{x}_{it}'\sqrt{nT}(\hat{\beta}-\beta) + o_p(1)$$
$$\Rightarrow \Lambda W_k(1,r) - rQQ^{-1}\Lambda W_k(1,1) = \Lambda\left(W_k(1,r) - rW_k(1,1)\right),$$
as required. The results for the HACSC statistics follow directly from Kiefer and Vogelsang (2005) using (23) and (24).
8 Appendix B: Bartlett Kernel Fixed-b Critical Values for $Wald^{Bart}_{HACSC}$ and $t^{Bart}_{HACSC}$
While it is a straightforward exercise to simulate the asymptotic critical values for the $Wald_{HACSC}$ and $t_{HACSC}$ statistics for a given kernel and value of b, practitioners would prefer a simple direct method for obtaining critical values and p-values. While some earlier papers on fixed-b asymptotics such as Kiefer and Vogelsang (2005) and Sun, Phillips and Jin (2008) provide simple methods of obtaining or approximating the nonstandard critical values for popular kernels and popular significance levels for t-statistics and Wald statistics (testing only one restriction), no methods are available for Wald tests with more than one restriction being tested, nor are methods available for computing p-values in any cases. This section provides a simple method for computing fixed-b critical values and p-values for $t^{Bart}_{HACSC}$ and $Wald^{Bart}_{HACSC}$ ($q = 1, \ldots, 50$).
For the t-statistic and the Wald statistic with $q = 1$, the higher order expansions derived by Velasco and Robinson (2001) and Sun et al. (2008) suggest that the fixed-b critical values can be approximated by polynomials that are functions of the $N(0,1)$ and $\chi^2_1$ critical values respectively, where the coefficients depend on the moments of the statistics. While these higher order expansions provide the order and the coefficients of the polynomials, the expansions are only likely to be accurate for small values of b given that the expansions are calculated around $b = 0$. As discussed in Kiefer and Vogelsang (2005), the means and variances of the fixed-b asymptotic distributions in the Bartlett case are polynomials in b over the full range [0, 1]. Therefore, it is reasonable to expect that the fixed-b asymptotic critical values can be well approximated by a polynomial in the standard asymptotic critical value with coefficients that depend on b. The order of the polynomial and the coefficients of the polynomial then become a numerical analysis problem where the goal is to find polynomials that accurately give fixed-b critical values over the full range of $b \in [0,1]$ and across all percentage points.
To be more concrete, let $\alpha$ denote a right tail percentage point (e.g. $\alpha = 0.95$ for a 5% right tail test) and let $c_\alpha(b)$ denote the corresponding fixed-b asymptotic critical value for a given statistic where $b = M/T$. Let $z_\alpha$ denote the right tail critical value for the standard asymptotic distribution of the statistic ($N(0,1)$ or $\chi^2_q$). Consider the polynomial
$$c_\alpha(b) = z_\alpha + \gamma(b)z_\alpha + \theta(b)z_\alpha^2 \quad (25)$$
where
$$\gamma(b) = \gamma_1 b + \gamma_2 b^2 + \gamma_3 b^3, \quad \theta(b) = \theta_1 b + \theta_2 b^2 + \theta_3 b^3.$$
Note by construction that $c_\alpha(0) = z_\alpha$ so that the fixed-b critical values simplify to the standard critical values as b approaches zero. Because values for $c_\alpha(b)$ can be computed accurately using simulation methods, the unknown coefficients in the b polynomials can be estimated as follows. For a given statistic, $c_\alpha(b)$ was simulated using 50,000 replications for the grid $b = 0.02, 0.04, \ldots, 0.98, 1.0$. The Wiener processes in the asymptotic distributions were approximated by normalized partial sums of i.i.d. $N(0,1)$ random variables using 1,000 steps. For each value of b the simulated critical values were sorted. From the 50,000 sorted critical values, the 4,995 critical values corresponding to the percentage points $0.0002, 0.0004, \ldots, 0.9986, 0.9988, 0.9990$ were retained, except in the case of the t-statistic where only the 2,495 critical values corresponding to the percentage points $0.5002, 0.5004, \ldots, 0.9986, 0.9988, 0.9990$ were retained because the asymptotic distribution of the t-statistic is symmetric around zero. For each percentage point on these grids, $z_\alpha$ was computed.
By plugging in the formulas for the b polynomials, it is possible to rewrite (25) as a regression-type relationship in terms of polynomials of b, $z_\alpha$ and their interactions as
$$c_\alpha(b) = z_\alpha + \gamma_1 b z_\alpha + \gamma_2 b^2 z_\alpha + \gamma_3 b^3 z_\alpha + \theta_1 b z_\alpha^2 + \theta_2 b^2 z_\alpha^2 + \theta_3 b^3 z_\alpha^2. \quad (26)$$
For a given statistic across the grid of b's there are $4{,}995 \times 50$ observations ($2{,}495 \times 50$ for the t-statistic) for $c_\alpha(b)$, b and $z_\alpha$ for which the functional form (26) can be fit. It was found through trial and error that an excellent fit could not be obtained across all grid points of b for some statistics. However, dividing the grid points of b into the subgrids $(0.02, 0.04, \ldots, 0.1)$, $(0.12, 0.14, \ldots, 0.3)$, $(0.32, 0.34, \ldots, 0.98, 1.0)$ provided excellent fits within each subgrid when OLS was used to fit the model. The OLS estimates of the b polynomials are reported in Table B along with $R^2$ values for each subset of b grids. The lowest $R^2$ obtained was 0.9994. The fits were also checked by visual inspection of plots of the CDF functions based on $c_\alpha(b)$ and the OLS fitted values, $\hat{c}_\alpha(b)$, for each value of b on the grid. In all cases, the plots of the true and the fitted CDF functions were essentially identical. Increasing the order of the b or $z_\alpha$ polynomials made no noticeable change in the plotted fits. Using lower order polynomials resulted in poor fits for some values of b.
Most statistical packages report p-values for test statistics. Because the relationship between $c_\alpha(b)$ and $z_\alpha$ is quadratic, it is easy to obtain a p-value function by inverting (25). The inverse functional relationship is given by
$$z(c(b)) = \frac{-(1+\gamma(b)) + \sqrt{(1+\gamma(b))^2 + 4\theta(b)c(b)}}{2\theta(b)}, \quad (27)$$
and it was verified through plots of the CDF functions that the correct root is always given by (27). Let $x$ denote a value of $Wald^{Bart}_{HACSC}$. Let $Z$ denote the standard asymptotic random variable, i.e. $\chi^2_q$. The fixed-b asymptotic p-value is given by
$$P\left(W_q(1)'P(b,\tilde{W}_q)^{-1}W_q(1) \geq x\right)$$
or equivalently, using (27),
$$P(Z \geq z(x)).$$
Once $z(x)$ is computed, the p-value can be computed easily in most statistical packages. For the t-statistic the calculation is slightly more complicated. Let $Z$ now denote the $N(0,1)$ random variable. For a two-tail test, the p-value is given by $2P(Z \geq z(|x|))$. For a left-tail test with $x < 0$, the p-value is given by $P(Z \geq z(|x|))$, whereas for $x \geq 0$ the p-value is given by $1 - P(Z \geq z(x))$. For a right-tail test with $x < 0$, the p-value is given by $1 - P(Z \geq z(|x|))$, whereas for $x \geq 0$ the p-value is given by $P(Z \geq z(x))$.
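Given fitted coefficients from Table B, evaluating (25) and inverting it via (27) takes only a few lines. The sketch below uses made-up illustrative coefficients (NOT the Table B estimates) purely to show the mechanics and the exact round trip between $c_\alpha(b)$ and $z_\alpha$.

```python
def gamma_theta(b, g, th):
    """Evaluate the cubic b-polynomials of (25): gamma(b) and theta(b)."""
    gam = g[0] * b + g[1] * b ** 2 + g[2] * b ** 3
    the = th[0] * b + th[1] * b ** 2 + th[2] * b ** 3
    return gam, the

def crit_value(z_alpha, b, g, th):
    """Fixed-b critical value via (25): c = z + gamma(b) z + theta(b) z^2."""
    gam, the = gamma_theta(b, g, th)
    return z_alpha + gam * z_alpha + the * z_alpha ** 2

def z_of_c(c, b, g, th):
    """Invert (25) using the positive root given in (27)."""
    gam, the = gamma_theta(b, g, th)
    return (-(1 + gam) + ((1 + gam) ** 2 + 4 * the * c) ** 0.5) / (2 * the)

# hypothetical illustrative coefficients, not the Table B estimates
g = (0.7, -0.3, 0.1)
th = (0.2, 0.05, -0.01)
z = 1.96
c = crit_value(z, 0.4, g, th)
```

The p-value recipe in the text then amounts to feeding $z(x)$ from `z_of_c` into the standard normal or chi-square survivor function of any statistical package.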
Table 1: Empirical Null Rejection Probabilities, 5% level, $t^{Bart}_{ave(n)}$ and $t^{clus}_{ave(n)}$.
No spatial correlation in cross-section ($\lambda = 0$, $\theta = 0$). Two-tailed test of $H_0$: $\beta_1 = 0$.
In each row the first six entries use N(0,1) critical values and the last six use fixed-b and t critical values. Within each set of six, the first five are $t^{Bart}_{ave(n)}$ with bandwidths b = 0.1, 0.2, 0.4, 0.7, 1.0 and the sixth is $t^{clus}_{ave(n)}$.
n T rho | 0.1 0.2 0.4 0.7 1.0 clus | 0.1 0.2 0.4 0.7 1.0 clus
10 10 .0 .084 .085 .090 .091 .096 .110 .083 .079 .075 .071 .068 .063
.3 .112 .117 .120 .121 .122 .112 .110 .110 .106 .097 .091 .071
.6 .178 .164 .162 .156 .154 .128 .174 .158 .146 .129 .115 .081
.9 .277 .241 .225 .206 .196 .150 .275 .226 .209 .181 .162 .106
50 .0 .067 .067 .070 .077 .085 .096 .066 .063 .060 .059 .058 .061
.3 .077 .076 .080 .086 .090 .094 .076 .071 .068 .065 .066 .061
.6 .102 .095 .089 .094 .096 .095 .101 .090 .081 .078 .076 .063
.9 .234 .185 .173 .158 .146 .115 .232 .180 .156 .132 .117 .082
250 .0 .054 .056 .059 .066 .068 .083 .052 .052 .048 .049 .048 .051
.3 .048 .048 .053 .063 .070 .079 .047 .045 .046 .047 .045 .049
.6 .061 .061 .066 .072 .074 .081 .060 .054 .050 .048 .048 .048
.9 .097 .088 .082 .083 .090 .082 .095 .084 .074 .066 .060 .056
50 10 .0 .074 .071 .069 .069 .069 .059 .070 .070 .065 .064 .059 .056
.3 .094 .094 .089 .084 .080 .067 .091 .088 .086 .077 .072 .059
.6 .158 .137 .123 .103 .093 .071 .154 .133 .117 .098 .086 .062
.9 .258 .209 .177 .149 .118 .068 .255 .204 .172 .137 .106 .062
50 .0 .051 .052 .053 .054 .057 .059 .051 .050 .048 .048 .049 .053
.3 .063 .060 .059 .057 .055 .055 .061 .059 .055 .050 .049 .050
.6 .083 .076 .070 .065 .060 .055 .082 .073 .064 .060 .057 .048
.9 .202 .147 .127 .103 .089 .059 .199 .146 .118 .094 .081 .053
250 .0 .052 .051 .053 .053 .053 .057 .049 .050 .050 .050 .050 .048
.3 .052 .051 .053 .055 .056 .057 .051 .049 .049 .049 .051 .051
.6 .057 .055 .053 .055 .057 .057 .055 .053 .050 .050 .051 .051
.9 .095 .077 .074 .067 .066 .056 .090 .075 .066 .062 .058 .052
250 10 .0 .059 .057 .053 .051 .051 .045 .058 .055 .051 .049 .048 .043
.3 .076 .072 .068 .059 .055 .048 .073 .071 .066 .057 .053 .045
.6 .141 .117 .106 .089 .074 .052 .137 .115 .105 .086 .072 .049
.9 .218 .169 .150 .128 .097 .058 .217 .167 .150 .124 .095 .055
50 .0 .052 .051 .049 .051 .049 .048 .049 .049 .049 .048 .048 .046
.3 .066 .065 .063 .063 .060 .058 .066 .064 .062 .060 .059 .055
.6 .090 .079 .072 .067 .066 .059 .089 .077 .071 .066 .066 .055
.9 .202 .148 .125 .097 .081 .044 .200 .146 .124 .095 .079 .041
250 .0 .045 .044 .044 .044 .044 .044 .043 .044 .043 .043 .043 .043
.3 .050 .050 .049 .050 .050 .051 .049 .049 .049 .049 .047 .044
.6 .057 .053 .051 .050 .051 .048 .055 .052 .050 .049 .048 .047
.9 .086 .071 .063 .056 .052 .043 .084 .070 .062 .053 .051 .041
Table 2: Empirical Null Rejection Probabilities, 5% level, $t^{Bart}_{HACSC}$ and $t^{clus}_{ave(n)}$.
No spatial correlation in cross-section ($\lambda = 0$, $\theta = 0$). Two-tailed test of $H_0$: $\beta_1 = 0$.
In each row the first six entries use N(0,1) critical values and the last six use fixed-b and t critical values. Within each set of six, the first five are $t^{Bart}_{HACSC}$ with bandwidths b = 0.1, 0.2, 0.4, 0.7, 1.0 and the sixth is $t^{clus}_{ave(n)}$.
n T rho | 0.1 0.2 0.4 0.7 1.0 clus | 0.1 0.2 0.4 0.7 1.0 clus
10 10 .0 .134 .167 .223 .311 .390 .110 .091 .086 .087 .082 .083 .063
.3 .167 .206 .259 .356 .414 .112 .120 .112 .106 .102 .102 .071
.6 .238 .270 .339 .420 .496 .128 .194 .167 .163 .149 .149 .081
.9 .363 .368 .436 .501 .563 .150 .311 .256 .234 .219 .218 .106
50 .0 .097 .123 .199 .281 .352 .096 .068 .055 .050 .045 .050 .061
.3 .103 .139 .209 .286 .352 .094 .068 .064 .060 .060 .061 .061
.6 .141 .178 .244 .323 .400 .095 .100 .090 .086 .081 .086 .063
.9 .311 .313 .385 .460 .535 .115 .253 .190 .173 .168 .165 .082
250 .0 .082 .116 .178 .270 .349 .083 .049 .049 .049 .047 .047 .051
.3 .084 .111 .174 .271 .337 .079 .053 .050 .045 .047 .048 .049
.6 .087 .119 .186 .268 .332 .081 .060 .053 .046 .044 .047 .048
.9 .140 .170 .225 .318 .394 .082 .096 .086 .079 .078 .077 .056
50 10 .0 .120 .156 .225 .312 .382 .059 .088 .077 .072 .073 .074 .056
.3 .155 .176 .256 .341 .411 .067 .111 .108 .100 .094 .096 .059
.6 .231 .242 .326 .409 .491 .071 .175 .153 .147 .133 .134 .062
.9 .361 .366 .435 .499 .560 .068 .302 .254 .234 .210 .209 .062
50 .0 .090 .123 .190 .282 .344 .059 .056 .058 .058 .058 .060 .053
.3 .099 .128 .191 .274 .347 .055 .061 .060 .056 .059 .053 .050
.6 .119 .145 .205 .298 .372 .055 .081 .075 .063 .064 .064 .048
.9 .284 .281 .340 .430 .489 .059 .231 .171 .162 .150 .152 .053
250 .0 .082 .110 .168 .259 .335 .057 .047 .051 .055 .050 .051 .048
.3 .081 .113 .172 .263 .331 .057 .056 .049 .050 .050 .051 .051
.6 .093 .126 .190 .284 .352 .057 .057 .057 .062 .059 .057 .051
.9 .142 .173 .240 .339 .401 .056 .104 .085 .080 .079 .080 .052
250 10 .0 .109 .144 .201 .280 .353 .045 .070 .062 .058 .061 .059 .043
.3 .129 .163 .233 .320 .393 .048 .091 .075 .081 .077 .080 .045
.6 .209 .236 .306 .391 .463 .052 .165 .144 .128 .128 .133 .049
.9 .339 .335 .405 .473 .532 .058 .286 .245 .218 .202 .199 .055
50 .0 .088 .121 .176 .265 .340 .048 .055 .052 .053 .051 .054 .046
.3 .103 .138 .202 .272 .337 .058 .064 .060 .061 .058 .060 .055
.6 .140 .169 .238 .326 .382 .059 .097 .084 .082 .079 .081 .055
.9 .306 .303 .372 .446 .510 .044 .239 .178 .171 .173 .171 .041
250 .0 .073 .104 .172 .262 .340 .044 .048 .042 .041 .041 .039 .043
.3 .084 .115 .168 .254 .327 .051 .052 .049 .043 .044 .046 .044
.6 .092 .120 .175 .261 .336 .048 .060 .058 .057 .053 .055 .047
.9 .130 .157 .231 .302 .373 .043 .094 .080 .077 .076 .080 .041
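The entries in these rejection-probability tables are Monte Carlo estimates: the model is simulated under the null many times, the test statistic is computed on each replication, and the reported number is the fraction of replications in which the statistic exceeds the critical value. The sketch below illustrates only that mechanism, using a deliberately simple i.i.d. location model and the N(0,1) critical value 1.96 — it is not the paper's panel design, and `rejection_rate` with its defaults is a hypothetical setup for illustration.

```python
import math
import random

def rejection_rate(n_obs=50, reps=5000, crit=1.96, seed=0):
    """Fraction of replications in which a two-tailed 5% z-test of
    H0: mean = 0 rejects, with i.i.d. N(0,1) data generated under H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        draws = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        xbar = sum(draws) / n_obs
        # The population variance is known (= 1), so sqrt(n) * xbar is exactly
        # N(0,1) under the null; no HAC correction is needed in this toy setting.
        zstat = math.sqrt(n_obs) * xbar
        rejections += abs(zstat) > crit
    return rejections / reps

print(rejection_rate())  # close to the nominal 0.05
```

In the tables the same calculation is carried out with serially and spatially correlated panel data, HAC-based t statistics, and either N(0,1) or fixed-b critical values, which is why entries can differ sharply from the nominal 0.05.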
Table 3: Empirical Null Rejection Probabilities, 5% level, MA(2) Spatial correlation
in cross-section, λ = 0, θ = 0.5. Two-Tailed Test of H0: β1 = 0. Fixed-b, t Critical Values
t^Bart_ave(n), values of b | t^clus_ave(n) | t^Bart_HACSC, values of b
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
9 10 .0 .413 .413 .421 .428 .430 .423 .154 .133 .118 .118 .119
.3 .425 .426 .426 .420 .425 .408 .172 .154 .136 .136 .137
.6 .469 .467 .463 .451 .449 .414 .229 .198 .187 .185 .186
.9 .530 .516 .508 .491 .473 .411 .321 .288 .260 .240 .241
50 .0 .382 .389 .401 .422 .433 .456 .070 .065 .059 .064 .065
.3 .393 .401 .408 .416 .426 .441 .081 .084 .081 .074 .079
.6 .434 .430 .434 .443 .446 .437 .120 .102 .099 .100 .095
.9 .535 .501 .485 .466 .460 .423 .272 .227 .209 .197 .200
250 .0 .383 .393 .406 .417 .434 .469 .056 .051 .042 .049 .049
.3 .398 .406 .421 .434 .446 .478 .058 .052 .049 .049 .049
.6 .411 .416 .436 .450 .458 .475 .064 .063 .063 .060 .062
.9 .462 .452 .449 .455 .459 .471 .122 .118 .107 .106 .108
49 10 .0 .526 .526 .523 .525 .525 .523 .110 .097 .093 .092 .091
.3 .537 .537 .533 .529 .527 .517 .133 .120 .108 .112 .114
.6 .584 .569 .561 .547 .538 .513 .196 .164 .162 .158 .162
.9 .637 .602 .585 .566 .548 .498 .297 .246 .230 .215 .212
50 .0 .469 .469 .474 .480 .485 .500 .061 .064 .060 .061 .065
.3 .489 .489 .493 .495 .494 .498 .072 .070 .065 .069 .070
.6 .522 .513 .510 .506 .504 .502 .103 .095 .090 .088 .094
.9 .631 .601 .581 .554 .542 .508 .270 .219 .190 .184 .186
250 .0 .466 .469 .473 .480 .484 .496 .043 .045 .045 .048 .049
.3 .480 .484 .485 .492 .493 .503 .060 .056 .051 .050 .051
.6 .482 .482 .481 .484 .490 .501 .069 .061 .059 .058 .064
.9 .538 .526 .518 .513 .516 .506 .124 .107 .092 .091 .096
256 10 .0 .546 .544 .544 .543 .538 .533 .085 .081 .072 .070 .074
.3 .564 .562 .554 .540 .530 .512 .109 .099 .098 .094 .097
.6 .624 .609 .598 .574 .555 .520 .171 .143 .135 .131 .132
.9 .688 .657 .636 .608 .587 .525 .301 .244 .216 .205 .199
50 .0 .525 .523 .523 .523 .521 .518 .056 .050 .051 .056 .053
.3 .532 .532 .528 .526 .525 .522 .061 .064 .065 .062 .065
.6 .566 .554 .543 .536 .532 .519 .087 .079 .074 .072 .077
.9 .679 .644 .625 .594 .577 .533 .251 .183 .175 .168 .168
250 .0 .505 .506 .509 .508 .510 .506 .047 .046 .049 .043 .047
.3 .500 .501 .502 .503 .504 .503 .050 .048 .050 .049 .048
.6 .510 .506 .505 .505 .505 .507 .057 .053 .050 .048 .051
.9 .559 .542 .533 .524 .518 .506 .093 .081 .076 .073 .073
Table 4: Size-adjusted Power, 5% level, Bartlett Kernel
No spatial correlation in cross-section (λ = 0, θ = 0).
Two-Tailed Test of H0: β1 = 0. Alternative value is β1 = 0.1.
t^Bart_ave(n), values of b | t^clus_ave(n) | t^Bart_HACSC, values of b
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
10 10 .0 .153 .141 .150 .149 .148 .131 .117 .130 .123 .106 .115
.3 .122 .125 .136 .130 .127 .116 .103 .099 .110 .099 .100
.6 .110 .111 .109 .106 .103 .114 .104 .102 .102 .098 .099
.9 .089 .087 .086 .086 .087 .084 .091 .098 .077 .079 .086
50 .0 .578 .577 .577 .553 .550 .498 .510 .499 .460 .449 .447
.3 .513 .502 .484 .462 .459 .405 .452 .393 .368 .375 .371
.6 .312 .300 .289 .274 .260 .264 .289 .266 .249 .250 .243
.9 .137 .131 .129 .122 .111 .107 .112 .101 .086 .091 .099
250 .0 .998 .998 .998 .998 .997 .993 .995 .990 .966 .956 .956
.3 .997 .997 .995 .994 .993 .981 .993 .983 .955 .916 .909
.6 .933 .927 .918 .906 .897 .854 .907 .849 .822 .761 .758
.9 .374 .367 .368 .355 .343 .304 .311 .305 .256 .265 .263
50 10 .0 .535 .544 .536 .533 .523 .501 .423 .424 .396 .386 .398
.3 .471 .468 .461 .463 .465 .457 .392 .360 .350 .311 .323
.6 .342 .333 .335 .336 .331 .334 .292 .284 .251 .245 .251
.9 .235 .239 .229 .226 .218 .211 .227 .209 .192 .199 .201
50 .0 1.00 1.00 1.00 1.00 1.00 1.00 .998 .990 .960 .940 .928
.3 .994 .993 .993 .993 .993 .992 .988 .967 .936 .915 .912
.6 .941 .936 .937 .934 .929 .929 .894 .851 .775 .782 .780
.9 .450 .459 .456 .456 .460 .451 .423 .378 .333 .321 .320
250 .0 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.3 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.6 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 .999 .999 .998
.9 .952 .953 .953 .953 .950 .948 .917 .899 .824 .786 .793
250 10 .0 .998 .998 .998 .998 .998 .998 .991 .977 .949 .907 .912
.3 .996 .996 .995 .995 .995 .996 .967 .955 .922 .882 .876
.6 .959 .960 .959 .958 .959 .958 .890 .872 .777 .777 .777
.9 .831 .830 .282 .828 .829 .833 .743 .710 .687 .630 .617
50 .0 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.3 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.6 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 .999 .997 .997
.9 .988 .988 .988 .988 .988 .987 .960 .941 .882 .830 .829
250 .0 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.3 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.6 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
.9 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 .999 .998
Table 5: Size-adjusted Power, 5% level, t^Bart_HACSC
MA(2) Spatial correlation in cross-section (λ = 0, θ = 0.5)
Two-Tailed Test of H0: β1 = 0. Alternative value is β1 = 0.1.
t^Bart_HACSC, values of b
n T ρ 0.1 0.2 0.4 0.7 1.0
9 10 .0 .076 .070 .062 .058 .059
.3 .071 .066 .059 .067 .069
.6 .070 .059 .067 .074 .073
.9 .062 .067 .059 .059 .060
50 .0 .165 .157 .145 .120 .123
.3 .147 .133 .141 .136 .138
.6 .128 .123 .108 .106 .106
.9 .070 .073 .071 .075 .076
250 .0 .505 .500 .464 .426 .413
.3 .453 .441 .396 .404 .397
.6 .318 .275 .258 .269 .268
.9 .131 .113 .102 .106 .111
49 10 .0 .097 .095 .089 .091 .095
.3 .104 .090 .088 .090 .089
.6 .088 .083 .083 .080 .086
.9 .085 .091 .076 .074 .079
50 .0 .414 .374 .328 .324 .321
.3 .355 .317 .300 .275 .281
.6 .242 .218 .181 .189 .188
.9 .123 .117 .111 .096 .102
250 .0 .971 .953 .915 .880 .873
.3 .932 .898 .866 .832 .826
.6 .721 .680 .639 .591 .597
.9 .250 .235 .229 .212 .202
256 10 .0 .323 .286 .246 .255 .253
.3 .294 .267 .229 .242 .243
.6 .237 .204 .191 .186 .185
.9 .168 .172 .168 .158 .157
50 .0 .937 .906 .862 .800 .814
.3 .875 .838 .779 .729 .744
.6 .717 .688 .600 .587 .596
.9 .275 .246 .216 .206 .226
250 .0 1.00 1.00 1.00 .998 .998
.3 1.00 1.00 .999 .996 .995
.6 .999 .996 .986 .974 .971
.9 .764 .696 .606 .585 .596
Table 6: Empirical Null Rejection Probabilities, 5% level, t^Bart_HACSC and t^clus_ave(n),
Two Group Case: Within Group Correlation 0.75, Between Group Correlation 0.25.
Two-Tailed Test of H0: β1 = 0. Fixed-b, t Critical Values
No Time Dummies | With Time Dummies
t^Bart_HACSC, values of b | t^clus_ave(n) | t^Bart_HACSC, values of b | t^clus_ave(n)
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
10 10 .0 .154 .142 .125 .124 .123 .344 .161 .142 .124 .124 .126 .333
.3 .172 .153 .133 .140 .141 .350 .188 .166 .152 .143 .140 .330
.6 .222 .201 .180 .175 .170 .354 .235 .204 .194 .185 .187 .313
.9 .301 .270 .243 .223 .226 .346 .316 .280 .243 .234 .225 .287
50 .0 .057 .055 .059 .058 .061 .331 .071 .061 .059 .058 .061 .372
.3 .072 .070 .067 .066 .069 .353 .080 .076 .070 .070 .070 .367
.6 .111 .096 .095 .094 .095 .361 .138 .115 .104 .103 .107 .348
.9 .277 .223 .204 .193 .198 .360 .293 .243 .210 .203 .203 .322
250 .0 .051 .051 .050 .046 .048 .358 .053 .049 .049 .052 .052 .373
.3 .055 .057 .052 .049 .050 .349 .052 .050 .050 .050 .052 .372
.6 .063 .066 .061 .058 .061 .346 .062 .055 .059 .058 .061 .352
.9 .128 .114 .091 .096 .095 .358 .130 .112 .102 .100 .106 .357
50 10 .0 .156 .143 .133 .131 .131 .683 .157 .141 .123 .125 .127 .616
.3 .181 .166 .155 .148 .152 .706 .177 .166 .154 .147 .152 .608
.6 .245 .222 .211 .196 .200 .693 .235 .212 .189 .181 .188 .573
.9 .331 .291 .268 .245 .245 .642 .321 .286 .257 .237 .235 .514
50 .0 .067 .060 .059 .060 .060 .664 .061 .059 .060 .061 .064 .663
.3 .074 .069 .068 .070 .071 .669 .076 .067 .071 .071 .073 .647
.6 .108 .099 .088 .086 .089 .666 .107 .098 .088 .088 .085 .638
.9 .281 .220 .200 .196 .201 .674 .272 .222 .194 .178 .179 .566
250 .0 .047 .043 .040 .039 .042 .671 .049 .054 .053 .054 .051 .662
.3 .055 .057 .051 .052 .053 .650 .060 .054 .060 .061 .062 .661
.6 .069 .061 .063 .065 .064 .664 .075 .068 .069 .070 .071 .665
.9 .129 .113 .099 .099 .104 .680 .142 .117 .113 .110 .109 .680
250 10 .0 .135 .120 .114 .103 .106 .846 .162 .140 .130 .135 .135 .826
.3 .147 .133 .131 .122 .121 .851 .182 .156 .149 .141 .148 .806
.6 .207 .185 .174 .161 .164 .841 .237 .210 .195 .191 .191 .768
.9 .295 .257 .235 .227 .224 .835 .314 .276 .238 .228 .228 .730
50 .0 .061 .059 .056 .056 .057 .862 .070 .067 .060 .059 .059 .830
.3 .077 .071 .062 .062 .065 .855 .080 .074 .071 .070 .074 .838
.6 .117 .104 .091 .095 .095 .863 .117 .100 .093 .091 .097 .825
.9 .290 .240 .216 .204 .206 .847 .296 .241 .212 .205 .206 .793
250 .0 .051 .053 .046 .049 .048 .849 .044 .049 .048 .045 .049 .853
.3 .052 .048 .044 .045 .049 .854 .058 .054 .051 .056 .056 .844
.6 .061 .056 .056 .056 .055 .853 .061 .060 .057 .058 .062 .831
.9 .127 .112 .094 .090 .090 .868 .124 .105 .089 .090 .091 .821
Table 7: Empirical Null Rejection Probabilities, 5% level, Wald^Bart_HACSC (Bartlett Kernel) and Wald^clus_ave(n),
No spatial correlation in cross-section (θ = 0). Testing H0: β1 = 0, β2 = 0.
χ²(2) Critical Values | Fixed-b, F Critical Values
Wald^Bart_HACSC, values of b | Wald^clus_ave(n) | Wald^Bart_HACSC, values of b | Wald^clus_ave(n)
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
10 10 .0 .198 .264 .392 .533 .625 .173 .116 .104 .107 .097 .092 .060
.3 .248 .314 .451 .586 .670 .189 .142 .133 .135 .129 .129 .060
.6 .372 .424 .560 .678 .751 .219 .258 .216 .207 .200 .197 .080
.9 .558 .568 .673 .774 .830 .243 .449 .360 .322 .329 .331 .106
50 .0 .141 .199 .322 .469 .569 .173 .068 .063 .059 .066 .059 .052
.3 .147 .219 .351 .499 .594 .180 .078 .066 .071 .074 .071 .060
.6 .209 .281 .407 .561 .647 .176 .124 .108 .108 .103 .107 .069
.9 .469 .489 .618 .730 .793 .212 .376 .263 .256 .248 .253 .072
250 .0 .115 .181 .314 .477 .580 .159 .050 .044 .046 .051 .052 .052
.3 .122 .188 .323 .464 .572 .160 .057 .056 .059 .057 .056 .052
.6 .142 .203 .341 .485 .582 .151 .074 .069 .064 .063 .065 .050
.9 .227 .291 .421 .563 .655 .172 .146 .120 .114 .108 .110 .070
50 10 .0 .188 .260 .391 .532 .618 .075 .111 .100 .098 .091 .089 .059
.3 .247 .315 .442 .578 .662 .083 .146 .141 .134 .129 .129 .067
.6 .385 .428 .560 .680 .750 .087 .275 .216 .209 .208 .208 .068
.9 .564 .579 .681 .761 .817 .096 .459 .381 .342 .338 .344 .073
50 .0 .120 .204 .331 .474 .568 .072 .053 .054 .053 .056 .056 .052
.3 .145 .212 .343 .508 .601 .070 .073 .062 .059 .070 .067 .056
.6 .195 .259 .403 .556 .651 .076 .114 .102 .096 .092 .095 .059
.9 .463 .472 .597 .712 .777 .086 .343 .260 .240 .236 .236 .066
250 .0 .106 .176 .313 .465 .575 .065 .051 .048 .045 .047 .043 .044
.3 .111 .175 .314 .467 .572 .066 .055 .055 .054 .055 .054 .050
.6 .132 .204 .338 .502 .591 .074 .069 .063 .065 .068 .070 .053
.9 .211 .278 .408 .543 .643 .071 .130 .107 .106 .104 .103 .055
250 10 .0 .183 .238 .632 .507 .598 .043 .102 .094 .085 .088 .087 .040
.3 .217 .285 .409 .552 .643 .046 .129 .113 .101 .105 .103 .043
.6 .333 .385 .526 .646 .729 .049 .234 .191 .184 .172 .172 .048
.9 .544 .562 .658 .754 .815 .052 .452 .360 .319 .319 .319 .049
50 .0 .131 .201 .316 .467 .565 .052 .059 .060 .060 .058 .059 .049
.3 .148 .216 .342 .494 .580 .052 .069 .064 .066 .066 .066 .048
.6 .205 .275 .406 .559 .636 .054 .111 .101 .106 .097 .100 .049
.9 .470 .482 .605 .714 .774 .057 .350 .260 .248 .237 .242 .052
250 .0 .098 .164 .306 .453 .564 .047 .043 .043 .042 .041 .042 .043
.3 .105 .176 .314 .480 .570 .042 .053 .047 .045 .048 .044 .041
.6 .127 .203 .344 .489 .588 .049 .068 .066 .058 .056 .056 .046
.9 .232 .308 .449 .575 .666 .055 .133 .119 .118 .120 .122 .051
Table 8: Empirical Null Rejection Probabilities, 5% level, Wald^Bart_HACSC (Bartlett Kernel) and Wald^clus_ave(n),
No spatial correlation in cross-section (λ = 0, θ = 0). Testing H0: β1 = 0, β2 = 0, β3 = 0, β4 = 0.
χ²(4) Critical Values | Fixed-b, F Critical Values
Wald^Bart_HACSC, values of b | Wald^clus_ave(n) | Wald^Bart_HACSC, values of b | Wald^clus_ave(n)
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
10 10 .0 .417 .544 .744 .855 .904 .361 .207 .195 .188 .175 .171 .070
.3 .472 .597 .784 .878 .920 .375 .267 .225 .238 .219 .219 .075
.6 .641 .728 .868 .938 .962 .414 .413 .354 .360 .343 .336 .086
.9 .836 .873 .939 .975 .981 .466 .680 .573 .562 .565 .563 .110
50 .0 .223 .399 .641 .806 .875 .349 .070 .067 .067 .070 .070 .053
.3 .253 .425 .654 .814 .886 .351 .085 .088 .087 .086 .081 .059
.6 .374 .529 .732 .856 .916 .358 .158 .140 .148 .141 .138 .058
.9 .745 .796 .911 .961 .978 .422 .525 .433 .436 .435 .425 .092
250 .0 .195 .364 .636 .789 .868 .331 .053 .054 .059 .053 .056 .055
.3 .211 .379 .627 .800 .870 .341 .053 .063 .065 .060 .064 .043
.6 .240 .415 .654 .818 .889 .330 .066 .063 .071 .067 .068 .048
.9 .406 .551 .763 .877 .923 .343 .193 .174 .171 .168 .169 .060
50 10 .0 .392 .536 .736 .853 .908 .096 .187 .181 .176 .171 .162 .051
.3 .463 .603 .771 .887 .930 .101 .249 .225 .228 .207 .205 .060
.6 .646 .738 .882 .943 .966 .105 .412 .347 .344 .344 .337 .065
.9 .848 .876 .939 .971 .984 .119 .678 .563 .552 .573 .559 .069
50 .0 .207 .376 .625 .781 .855 .087 .058 .055 .051 .050 .053 .047
.3 .246 .416 .653 .811 .873 .089 .084 .077 .073 .071 .069 .051
.6 .353 .517 .733 .861 .919 .096 .149 .135 .142 .136 .132 .058
.9 .730 .789 .901 .955 .973 .109 .511 .407 .426 .415 .413 .065
250 .0 .186 .369 .614 .796 .871 .084 .044 .042 .050 .047 .042 .042
.3 .192 .362 .636 .796 .867 .083 .053 .050 .056 .057 .055 .048
.6 .228 .391 .641 .808 .870 .085 .069 .061 .062 .062 .059 .051
.9 .385 .539 .746 .879 .920 .095 .163 .153 .163 .154 .155 .058
250 10 .0 .383 .519 .740 .857 .912 .057 .184 .167 .170 .165 .157 .049
.3 .431 .577 .767 .885 .927 .064 .232 .201 .204 .199 .193 .057
.6 .612 .716 .864 .932 .967 .066 .398 .325 .328 .325 .316 .057
.9 .837 .866 .944 .974 .983 .068 .654 .545 .545 .552 .541 .060
50 .0 .196 .635 .625 .792 .858 .047 .059 .051 .057 .053 .054 .043
.3 .226 .395 .629 .809 .873 .051 .066 .064 .069 .067 .068 .045
.6 .346 .502 .733 .860 .909 .054 .133 .119 .128 .123 .118 .051
.9 .730 .783 .906 .955 .974 .060 .510 .408 .414 .418 .420 .051
250 .0 .191 .366 .631 .796 .871 .049 .046 .051 .047 .046 .043 .044
.3 .201 .372 .630 .799 .860 .054 .055 .051 .053 .051 .049 .046
.6 .229 .393 .643 .804 .877 .063 .069 .060 .062 .063 .059 .057
.9 .399 .562 .761 .871 .928 .064 .172 .145 .148 .146 .147 .058
Table 9: Empirical Null Rejection Probabilities, 5% level, Wald^Bart_HACSC (Bartlett Kernel) and Wald^clus_ave(n),
MA(2) Spatial correlation in cross-section (λ = 0, θ = 0.5). Fixed-b, F Critical Values.
H0: β1 = 0, β2 = 0 | H0: β1 = 0, β2 = 0, β3 = 0, β4 = 0
Wald^Bart_HACSC, values of b | Wald^clus_ave(n) | Wald^Bart_HACSC, values of b | Wald^clus_ave(n)
n T ρ 0.1 0.2 0.4 0.7 1.0 0.1 0.2 0.4 0.7 1.0
9 10 .0 .217 .186 .170 .169 .174 .591 .370 .322 .328 .314 .309 .771
.3 .254 .225 .205 .204 .205 .578 .413 .369 .356 .354 .354 .770
.6 .334 .289 .268 .257 .264 .582 .521 .458 .441 .441 .438 .755
.9 .466 .406 .362 .361 .358 .580 .704 .624 .594 .601 .602 .744
50 .0 .086 .073 .065 .064 .061 .623 .088 .084 .085 .086 .082 .778
.3 .095 .083 .076 .075 .075 .618 .114 .104 .097 .099 .100 .786
.6 .154 .129 .112 .111 .111 .620 .204 .167 .183 .182 .178 .762
.9 .405 .331 .289 .291 .291 .606 .581 .490 .491 .484 .487 .755
250 .0 .058 .050 .057 .055 .055 .654 .051 .052 .052 .049 .048 .795
.3 .065 .063 .062 .060 .061 .665 .059 .053 .057 .053 .052 .777
.6 .077 .074 .070 .071 .072 .661 .077 .073 .076 .071 .069 .786
.9 .168 .146 .145 .138 .139 .644 .234 .206 .210 .206 .203 .775
49 10 .0 .143 .126 .123 .115 .122 .717 .257 .227 .233 .222 .217 .907
.3 .182 .160 .145 .150 .152 .706 .292 .261 .260 .257 .250 .911
.6 .278 .232 .220 .217 .219 .706 .455 .394 .383 .380 .375 .908
.9 .453 .366 .339 .331 .329 .727 .693 .590 .582 .586 .583 .910
50 .0 .061 .062 .063 .057 .059 .699 .070 .069 .060 .064 .060 .886
.3 .082 .081 .077 .076 .074 .705 .087 .077 .082 .076 .077 .891
.6 .135 .112 .107 .102 .106 .708 .163 .144 .143 .141 .135 .900
.9 .365 .280 .258 .257 .258 .712 .551 .446 .445 .445 .444 .896
250 .0 .051 .047 .046 .050 .048 .699 .053 .053 .057 .058 .058 .894
.3 .057 .052 .051 .051 .049 .694 .064 .059 .060 .057 .060 .896
.6 .067 .064 .059 .061 .062 .700 .069 .060 .063 .064 .064 .887
.9 .152 .124 .118 .112 .117 .717 .181 .160 .155 .154 .160 .897
256 10 .0 .110 .098 .095 .089 .087 .711 .192 .170 .168 .154 .154 .897
.3 .146 .136 .121 .115 .118 .723 .234 .213 .207 .198 .196 .900
.6 .253 .210 .196 .192 .187 .723 .400 .320 .324 .317 .306 .900
.9 .453 .353 .320 .311 .311 .739 .658 .557 .538 .546 .540 .912
50 .0 .057 .057 .053 .055 .053 .730 .062 .060 .059 .060 .059 .917
.3 .073 .072 .067 .065 .067 .736 .083 .089 .080 .077 .078 .914
.6 .121 .106 .098 .105 .102 .738 .155 .144 .142 .142 .141 .916
.9 .344 .247 .231 .237 .236 .716 .513 .408 .410 .411 .405 .907
250 .0 .052 .051 .053 .051 .051 .718 .051 .049 .054 .053 .049 .913
.3 .057 .059 .058 .059 .059 .713 .058 .055 .053 .058 .053 .909
.6 .068 .058 .061 .063 .065 .717 .072 .067 .072 .076 .070 .894
.9 .122 .102 .099 .092 .095 .712 .170 .154 .152 .152 .150 .908
Table 10: Empirical Application: Divorce Rates, U.S. State Level Data, Annual from 1956-1988
Dependent Variable is Divorce Rate per 1,000 Persons per Year
Panel A: Estimates and Standard Errors (includes State and Year fixed-effects)
Law Change | OLS | HACSC Robust se, Bartlett Kernel
in Effect for: Estimate | OLS se | clus(n) se | M = 3 M = 7 M = 17 M = 33
1-2 years .267 .085 .188 .144 .114 .075 .054
3-4 years .210 .085 .159 .090 .079 .046 .035
5-6 years .164 .085 .171 .060 .057 .028 .024
7-8 years .158 .084 .174 .043 .039 .021 .016
9-10 years -.121 .084 .163 .039 .043 .037 .026
11-12 years -.324 .083 .180 .043 .045 .037 .024
13-14 years -.461 .084 .199 .054 .051 .043 .028
15+ years -.507 .080 .233 .042 .047 .037 .029
Panel B: 95% Confidence Intervals using HACSC Robust se, Bartlett Kernel
Using N(0,1) Critical Values | Using Fixed-b Critical Values
M = 3 M = 7 M = 17 M = 33 M = 3 M = 7 M = 17 M = 33
1-2 years -.02, .55 .04, .49 .12, .41 .16, .37 -.05, .59 -.02, .56 .003, .53 .01, .52
3-4 years .03, .39 .06, .36 .12, .30 .14, .28 .01, .41 .01, .41 .05, .37 .04, .38
5-6 years .05, .28 .05, .28 .11, .22 .12, .21 .03, .30 .02, .31 .07, .26 .05, .28
7-8 years .07, .24 .08, .23 .12, .20 .13, .19 .06, .25 .06, .26 .08, .23 .08, .23
9-10 years -.20, -.04 -.20, -.04 -.19, -.05 -.17, -.07 -.21, -.03 -.23, -.01 -.25, .01 -.25, .003
11-12 years -.41, -.24 -.41, -.24 -.40, -.25 -.37, -.28 -.42, -.23 -.44, -.21 -.45, -.19 -.44, -.21
13-14 years -.57, -.36 -.56, -.36 -.54, -.38 -.52, -.41 -.58, -.34 -.59, -.33 -.61, -.31 -.59, -.33
15+ years -.59, -.42 -.60, -.41 -.58, -.43 -.56, -.45 -.60, -.41 -.63, -.39 -.63, -.38 -.65, -.37
Panel C: 95% Confidence Interval Lengths
Using N(0,1) Critical Values | Using Fixed-b Critical Values
M = 3 M = 7 M = 17 M = 33 M = 3 M = 7 M = 17 M = 33
1-2 years .564 .447 .294 .212 .644 .582 .527 .515
3-4 years .353 .310 .180 .137 .402 .403 .323 .334
5-6 years .235 .223 .110 .094 .268 .291 .197 .229
7-8 years .169 .152 .082 .063 .192 .199 .148 .153
9-10 years .153 .169 .145 .102 .174 .220 .260 .248
11-12 years .169 .176 .145 .094 .192 .230 .260 .229
13-14 years .212 .200 .169 .110 .241 .260 .302 .267
15+ years .164 .184 .145 .114 .188 .240 .260 .277
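The intervals in Panels B and C follow the standard construction, estimate ± cv × se, where only the critical value cv changes between the N(0,1) columns (cv = 1.96) and the wider fixed-b columns. The sketch below checks the "1-2 years" row at M = 3 against the Panel A numbers; the fixed-b value 2.24 used here is backed out from the reported interval length rather than taken from a table, so treat it as illustrative only.

```python
def conf_int(estimate, se, crit):
    """Two-sided confidence interval: estimate +/- crit * se."""
    return estimate - crit * se, estimate + crit * se

est, se = 0.267, 0.144  # "1-2 years": estimate and HACSC se with M = 3 (Panel A)

lo, hi = conf_int(est, se, 1.96)   # N(0,1) critical value
print(round(lo, 2), round(hi, 2))  # -0.02 0.55, as in Panel B
print(round(hi - lo, 3))           # 0.564, as in Panel C

lo_b, hi_b = conf_int(est, se, 2.24)  # illustrative fixed-b critical value
print(round(hi_b - lo_b, 3))          # 0.645, close to Panel C's 0.644
```

The same arithmetic explains the pattern in Panel C: the fixed-b intervals are longer exactly because the fixed-b critical values exceed 1.96, increasingly so at the larger bandwidths.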
Table B: Fixed-b Right Tail Critical Value Polynomial Coefficients, Bartlett Kernel
Statistic Range of b γ1 γ2 γ3 θ1 θ2 θ3 R²
t^Bart_HACSC
[0, 0.08] 0.767 -10.864 72.261 0.287 6.576 -42.401 0.9999
(0.08, 0.2] 0.455 -1.617 5.314 0.459 1.289 -3.083 0.9996
(0.2, 0.5] 0.422 -0.650 1.302 0.439 1.071 -1.510 0.9994
(0.5, 1.0] 0.033 0.957 -0.352 0.851 -0.579 0.145 0.9995
Wald^Bart_HACSC
[0, 0.08] 0.990 5.961 -33.645 0.251 2.912 -14.088 0.9999
j = 1 (0.08, 0.2] 1.213 0.144 4.219 0.281 1.657 -2.985 0.9999
(0.2, 0.5] 1.030 1.975 -0.342 0.331 0.989 -0.884 0.9999
(0.5, 1.0] 1.049 1.891 -0.252 0.539 0.234 -0.206 0.9998
Wald^Bart_HACSC
[0, 0.08] 1.890 -2.415 12.158 0.254 2.151 -3.355 0.9999
j = 2 (0.08, 0.2] 1.572 2.679 -1.885 0.321 1.011 0.414 0.9999
(0.2, 0.5] 1.558 1.909 2.335 0.328 1.345 -1.442 0.9997
(0.5, 1.0] 1.944 2.686 -0.766 0.571 0.250 -0.224 0.9997
Wald^Bart_HACSC
[0, 0.08] 2.108 2.866 -14.198 0.292 1.652 0.313 1.0000
j = 3 (0.08, 0.2] 2.138 0.900 5.747 0.291 1.726 -0.452 0.9999
(0.2, 0.5] 1.465 5.299 0.568 0.378 1.580 -1.901 0.9998
(0.5, 1.0] 2.922 3.310 -1.283 0.667 0.130 -0.158 0.9997
Wald^Bart_HACSC
[0, 0.08] 3.018 -7.653 57.987 0.253 2.905 -4.765 1.0000
j = 4 (0.08, 0.2] 2.826 -2.491 23.472 0.276 2.348 -1.433 0.9999
(0.2, 0.5] 1.174 11.497 -5.167 0.489 1.350 -1.769 0.9996
(0.5, 1.0] 4.578 2.631 -1.051 0.687 0.154 -0.173 0.9997
Wald^Bart_HACSC
[0, 0.08] 3.343 -10.310 98.419 0.267 3.357 -6.996 0.9999
j = 5 (0.08, 0.2] 3.775 -13.707 73.263 0.191 4.362 -7.628 0.9999
(0.2, 0.5] 0.665 19.301 -14.013 0.620 0.967 -1.384 0.9996
(0.5, 1.0] 5.683 2.986 -1.454 0.762 0.039 -0.099 0.9996
Wald^Bart_HACSC
[0, 0.08] 3.577 -9.362 99.744 0.276 3.439 -2.745 1.0000
j = 6 (0.08, 0.2] 4.045 -14.568 91.632 0.199 4.941 -9.430 0.9999
(0.2, 0.5] 0.457 26.068 -21.860 0.729 0.615 -1.042 0.9997
(0.5, 1.0] 7.033 2.726 -1.479 0.774 0.057 -0.106 0.9997
Wald^Bart_HACSC
[0, 0.08] 4.351 -26.213 285.725 0.238 5.438 -19.094 1.0000
j = 7 (0.08, 0.2] 4.772 -18.677 125.633 0.161 5.912 -12.945 0.9998
(0.2, 0.5] 0.980 31.088 -28.381 0.798 0.259 -0.619 0.9995
(0.5, 1.0] 8.323 3.129 -1.837 0.796 -0.013 -0.066 0.9995
Wald^Bart_HACSC
[0, 0.08] 4.580 -19.928 217.613 0.258 4.933 -7.968 1.0000
j = 8 (0.08, 0.2] 4.198 -6.203 105.729 0.239 5.496 -12.040 0.9997
(0.2, 0.5] 1.472 35.437 -34.325 0.879 -0.062 -0.237 0.9996
(0.5, 1.0] 9.410 3.547 -2.298 0.827 -0.065 -0.026 0.9997
Wald^Bart_HACSC
[0, 0.08] 4.852 -23.477 300.607 0.260 5.808 -14.526 1.0000
j = 9 (0.08, 0.2] 4.609 -6.845 130.604 0.208 6.618 -16.437 0.9997
(0.2, 0.5] 2.527 37.055 -36.832 0.912 -0.161 -0.144 0.9996
(0.5, 1.0] 10.750 3.338 -2.291 0.824 -0.037 -0.043 0.9997
Wald^Bart_HACSC
[0, 0.08] 4.644 -6.323 165.841 0.301 4.755 -0.145 1.0000
j = 10 (0.08, 0.2] 3.583 13.222 87.276 0.292 6.097 -15.468 0.9996
(0.2, 0.5] 3.386 39.891 -41.152 0.973 -0.448 0.221 0.9995
(0.5, 1.0] 11.898 3.562 -2.539 0.841 -0.060 -0.029 0.9996
Wald^Bart_HACSC
[0, 0.08] 5.116 -7.811 191.689 0.288 5.412 -0.512 1.0000
j = 11 (0.08, 0.2] 2.640 35.573 36.296 0.364 5.591 -14.684 0.9995
(0.2, 0.5] 4.847 40.292 -42.470 0.990 -0.548 0.354 0.9996
(0.5, 1.0] 13.210 3.709 -2.755 0.848 -0.077 -0.017 0.9996
Wald^Bart_HACSC
[0, 0.08] 5.604 -13.995 286.067 0.278 6.224 -4.099 1.0000
j = 12 (0.08, 0.2] 2.268 49.643 11.796 0.397 5.681 -15.847 0.9995
(0.2, 0.5] 6.449 39.614 -42.575 1.001 -0.601 0.443 0.9996
(0.5, 1.0] 14.358 4.055 -3.095 0.867 -0.113 0.006 0.9996
Table B (Continued): Fixed-b Right Tail Critical Value Polynomial Coefficients, Bartlett Kernel
Statistic Range of b γ1 γ2 γ3 θ1 θ2 θ3 R²
Wald^Bart_HACSC
[0, 0.08] 5.921 -16.258 340.065 0.276 6.866 -5.352 1.0000
j = 13 (0.08, 0.2] 1.558 67.687 -27.487 0.453 5.536 -16.284 0.9996
(0.2, 0.5] 8.163 37.200 -40.190 0.997 -0.507 0.330 0.9997
(0.5, 1.0] 15.414 4.290 -3.371 0.889 -0.139 0.024 0.9997
Wald^Bart_HACSC
[0, 0.08] 6.058 -8.989 343.279 0.290 6.843 -3.598 0.9999
j = 14 (0.08, 0.2] 0.720 93.052 -98.291 0.515 4.920 -14.715 0.9996
(0.2, 0.5] 9.655 36.735 -40.084 1.006 -0.551 0.379 0.9997
(0.5, 1.0] 17.061 3.280 -2.797 0.871 -0.086 -0.013 0.9998
Wald^Bart_HACSC
[0, 0.08] 6.574 -19.903 531.143 0.276 8.042 -12.853 0.9999
j = 15 (0.08, 0.2] 0.613 107.546 -130.632 0.547 4.836 -15.185 0.9996
(0.2, 0.5] 11.371 35.533 -39.510 0.998 -0.531 0.378 0.9997
(0.5, 1.0] 18.255 3.540 -3.060 0.881 -0.109 0.001 0.9997
Wald^Bart_HACSC
[0, 0.08] 7.315 -32.404 717.599 0.254 9.283 -21.617 0.9999
j = 16 (0.08, 0.2] 0.006 132.409 -200.503 0.616 4.092 -13.295 0.9996
(0.2, 0.5] 13.228 33.763 -37.825 0.990 -0.505 0.349 0.9997
(0.5, 1.0] 19.822 3.097 -2.866 0.875 -0.099 -0.007 0.9998
Wald^Bart_HACSC
[0, 0.08] 7.459 -22.763 723.933 0.261 9.477 -22.259 0.9999
j = 17 (0.08, 0.2] -0.406 154.898 -267.882 0.677 3.398 -11.275 0.9997
(0.2, 0.5] 14.566 34.318 -39.291 1.008 -0.610 0.494 0.9998
(0.5, 1.0] 21.090 3.098 -2.947 0.882 -0.110 -0.001 0.9998
Wald^Bart_HACSC
[0, 0.08] 8.094 -39.623 956.719 0.241 11.078 -34.693 0.9999
j = 18 (0.08, 0.2] -1.143 179.619 -340.499 0.755 2.607 -9.098 0.9997
(0.2, 0.5] 16.370 31.069 -35.566 0.999 -0.506 0.360 0.9998
(0.5, 1.0] 22.339 2.779 -2.865 0.890 -0.106 -0.004 0.9999
Wald^Bart_HACSC
[0, 0.08] 8.392 -46.713 1169.836 0.236 12.378 -47.795 0.9999
j = 19 (0.08, 0.2] -0.638 188.994 -365.592 0.772 2.598 -9.350 0.9998
(0.2, 0.5] 18.196 28.179 -32.370 0.994 -0.434 0.270 0.9999
(0.5, 1.0] 23.695 2.366 -2.742 0.897 -0.103 -0.004 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.092 -68.480 1527.652 0.210 14.313 -66.525 0.9999
j = 20 (0.08, 0.2] 0.361 192.685 -372.664 0.780 2.650 -9.837 0.9998
(0.2, 0.5] 20.065 25.563 -29.644 0.983 -0.374 0.202 0.9999
(0.5, 1.0] 25.109 1.943 -2.581 0.898 -0.098 -0.007 0.9999
Wald^Bart_HACSC
[0, 0.08] 8.820 -51.312 1478.310 0.229 14.486 -67.010 0.9999
j = 21 (0.08, 0.2] -0.377 218.356 -455.481 0.866 1.728 -6.973 0.9998
(0.2, 0.5] 21.221 25.227 -29.783 1.003 -0.403 0.246 0.9999
(0.5, 1.0] 26.126 1.791 -2.533 0.916 -0.103 -0.006 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.430 -65.719 1735.357 0.210 16.084 -81.472 0.9999
j = 22 (0.08, 0.2] 0.213 226.969 -482.980 0.896 1.513 -6.395 0.9999
(0.2, 0.5] 22.732 23.358 -27.896 1.007 -0.361 0.192 0.9999
(0.5, 1.0] 27.177 1.883 -2.726 0.932 -0.116 0.003 1.0000
Wald^Bart_HACSC
[0, 0.08] 9.373 -49.536 1737.667 0.216 16.498 -86.348 0.9999
j = 23 (0.08, 0.2] 0.531 242.940 -536.779 0.933 1.017 -4.847 0.9999
(0.2, 0.5] 23.957 24.488 -30.171 1.017 -0.433 0.302 0.9999
(0.5, 1.0] 28.231 2.435 -3.161 0.945 -0.149 0.023 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.440 -44.204 1840.596 0.216 17.459 -96.017 0.9999
j = 24 (0.08, 0.2] 0.539 262.616 -603.823 0.993 0.249 -2.342 0.9998
(0.2, 0.5] 25.187 24.845 -31.173 1.031 -0.488 0.378 0.9999
(0.5, 1.0] 29.322 2.695 -3.411 0.958 -0.169 0.035 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.393 -32.553 1908.973 0.219 18.111 -103.806 0.9999
j = 25 (0.08, 0.2] 1.880 262.585 -606.285 0.998 0.266 -2.452 0.9998
(0.2, 0.5] 26.608 23.680 -29.962 1.033 -0.473 0.355 0.9999
(0.5, 1.0] 30.627 2.275 -3.227 0.959 -0.161 0.029 0.9999
Table B (Continued): Fixed-b Right Tail Critical Value Polynomial Coefficients, Bartlett Kernel
Statistic Range of b γ1 γ2 γ3 θ1 θ2 θ3 R²
Wald^Bart_HACSC
[0, 0.08] 9.802 -30.854 2070.549 0.207 19.143 -115.400 0.9999
j = 26 (0.08, 0.2] 3.249 266.259 -619.342 1.003 0.126 -2.071 0.9998
(0.2, 0.5] 28.336 22.852 -29.484 1.026 -0.480 0.371 0.9999
(0.5, 1.0] 32.194 1.954 -3.120 0.951 -0.157 0.026 0.9999
Wald^Bart_HACSC
[0, 0.08] 10.319 -35.179 2305.672 0.186 20.654 -131.977 0.9999
j = 27 (0.08, 0.2] 5.480 257.041 -590.973 0.979 0.443 -3.167 0.9998
(0.2, 0.5] 30.144 21.052 -27.625 1.016 -0.438 0.319 0.9999
(0.5, 1.0] 33.677 1.705 -3.063 0.948 -0.156 0.025 0.9999
Wald^Bart_HACSC
[0, 0.08] 10.045 -7.317 2240.466 0.198 20.888 -136.397 0.9999
j = 28 (0.08, 0.2] 6.941 258.573 -598.196 0.990 0.299 -2.734 0.9998
(0.2, 0.5] 31.954 18.854 -24.937 1.008 -0.391 0.248 0.9999
(0.5, 1.0] 35.415 0.747 -2.569 0.939 -0.131 0.009 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.987 16.000 2196.474 0.200 21.334 -142.404 0.9999
j = 29 (0.08, 0.2] 8.226 263.045 -616.525 1.002 0.081 -2.008 0.9998
(0.2, 0.5] 33.418 18.883 -25.505 1.010 -0.417 0.286 0.9999
(0.5, 1.0] 36.702 0.985 -2.847 0.941 -0.147 0.020 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.815 51.635 1991.299 0.208 21.292 -142.206 0.9999
j = 30 (0.08, 0.2] 9.029 272.876 -651.327 1.033 -0.353 -0.634 0.9998
(0.2, 0.5] 34.892 18.349 -25.264 1.010 -0.419 0.291 0.9999
(0.5, 1.0] 38.087 0.716 -2.777 0.941 -0.146 0.019 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.977 70.237 1981.498 0.204 21.882 -149.443 0.9999
j = 31 (0.08, 0.2] 11.052 265.552 -628.036 1.019 -0.146 -1.414 0.9998
(0.2, 0.5] 36.587 17.111 -24.181 1.001 -0.393 0.260 0.9999
(0.5, 1.0] 39.469 0.677 -2.842 0.940 -0.150 0.021 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.814 92.206 1935.635 0.209 22.367 -155.642 0.9999
j = 32 (0.08, 0.2] 11.876 274.246 -662.133 1.041 -0.452 -0.349 0.9998
(0.2, 0.5] 37.899 16.345 -23.195 1.005 -0.388 0.242 0.9999
(0.5, 1.0] 40.781 0.333 -2.700 0.940 -0.146 0.017 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.698 118.551 1867.242 0.215 22.753 -161.353 0.9999
j = 33 (0.08, 0.2] 13.474 273.929 -664.908 1.043 -0.492 -0.225 0.9998
(0.2, 0.5] 39.479 15.399 -22.388 1.003 -0.378 0.230 0.9999
(0.5, 1.0] 42.189 0.126 -2.681 0.939 -0.145 0.017 0.9999
Wald^Bart_HACSC
[0, 0.08] 9.589 150.054 1743.618 0.222 22.996 -165.694 0.9999
j = 34 (0.08, 0.2] 15.304 270.731 -657.865 1.039 -0.447 -0.378 0.9998
(0.2, 0.5] 41.186 13.927 -20.890 0.997 -0.350 0.193 0.9999
(0.5, 1.0] 43.587 0.057 -2.752 0.940 -0.149 0.020 0.9999
Wald^Bart_HACSC
[0, 0.08] 8.726 205.598 1395.533 0.251 22.479 -162.960 0.9999
j = 35 (0.08, 0.2] 16.928 266.662 -649.353 1.043 -0.433 -0.385 0.9998
(0.2, 0.5] 42.440 13.335 -20.516 1.004 -0.354 0.198 0.9999
(0.5, 1.0] 44.695 -0.008 -2.849 0.948 -0.153 0.023 0.9999
Wald^Bart_HACSC
[0, 0.08] 8.656 231.938 1320.770 0.256 22.861 -168.835 0.9999
j = 36 (0.08, 0.2] 18.817 261.507 -636.513 1.035 -0.322 -0.727 0.9998
(0.2, 0.5] 44.017 12.081 -19.387 1.001 -0.332 0.173 0.9999
(0.5, 1.0] 46.104 -0.464 -2.644 0.946 -0.146 0.018 0.9999
Wald^Bart_HACSC
[0, 0.08] 8.076 285.177 973.220 0.275 22.559 -167.764 0.9999
j = 37 (0.08, 0.2] 20.040 265.989 -656.378 1.044 -0.453 -0.255 0.9998
(0.2, 0.5] 45.348 12.205 -20.147 1.005 -0.349 0.200 0.9999
(0.5, 1.0] 47.301 -0.375 -2.802 0.952 -0.155 0.024 0.9999
Wald^Bart_HACSC
[0, 0.08] 7.747 330.407 691.010 0.289 22.373 -167.543 0.9999
j = 38 (0.08, 0.2] 21.597 265.258 -658.767 1.048 -0.519 0.002 0.9999
(0.2, 0.5] 46.708 12.005 -20.278 1.007 -0.354 0.207 0.9999
(0.5, 1.0] 48.595 -0.480 -2.856 0.955 -0.159 0.027 0.9999
Table B (Continued): Fixed-b Right Tail Critical Value Polynomial Coefficients, Bartlett Kernel
Statistic Range of b γ1 γ2 γ3 θ1 θ2 θ3 R²
Wald^Bart_HACSC
[0, 0.08] 7.483 375.996 412.175 0.301 22.244 -168.025 0.9999
j = 39 (0.08, 0.2] 23.345 263.125 -655.387 1.046 -0.517 -0.006 0.9999
(0.2, 0.5] 48.422 10.326 -18.319 1.000 -0.320 0.158 0.9999
(0.5, 1.0] 50.258 -1.270 -2.473 0.948 -0.145 0.018 0.9999
Wald^Bart_HACSC
[0, 0.08] 6.951 431.171 59.451 0.318 21.957 -167.476 0.9999
j = 40 (0.08, 0.2] 25.027 262.615 -657.932 1.045 -0.536 0.081 0.9999
(0.2, 0.5] 49.929 10.264 -18.740 1.001 -0.333 0.177 0.9999
(0.5, 1.0] 51.686 -1.353 -2.532 0.949 -0.149 0.020 0.9999
Wald^Bart_HACSC
[0, 0.08] 6.655 471.380 -167.572 0.331 21.904 -168.726 0.9999
j = 41 (0.08, 0.2] 26.959 255.140 -637.155 1.039 -0.419 -0.295 0.9999
(0.2, 0.5] 51.332 9.436 -17.966 1.001 -0.322 0.161 0.9999
(0.5, 1.0] 53.081 -1.903 -2.285 0.948 -0.143 0.016 0.9999
Wald^Bart_HACSC
[0, 0.08] 6.323 518.557 -461.231 0.347 21.649 -168.218 0.9999
j = 42 (0.08, 0.2] 28.851 250.031 -624.684 1.033 -0.354 -0.475 0.9999
(0.2, 0.5] 52.764 9.159 -18.135 1.002 -0.328 0.172 0.9999
(0.5, 1.0] 54.327 -1.804 -2.463 0.951 -0.149 0.020 0.9999
Wald^Bart_HACSC
[0, 0.08] 6.006 575.147 -865.939 0.359 21.333 -166.537 0.9999
j = 43 (0.08, 0.2] 30.164 255.677 -647.231 1.041 -0.520 0.098 0.9999
(0.2, 0.5] 54.380 8.752 -18.016 1.000 -0.331 0.175 0.9999
(0.5, 1.0] 55.888 -2.054 -2.439 0.947 -0.148 0.020 0.9999
Wald^Bart_HACSC
[0, 0.08] 6.038 610.703 -1065.306 0.368 21.297 -167.786 0.9999
j = 44 (0.08, 0.2] 31.874 254.269 -646.759 1.039 -0.532 0.161 0.9999
(0.2, 0.5] 55.854 8.704 -18.414 1.000 -0.342 0.191 0.9999
(0.5, 1.0] 57.452 -2.588 -2.222 0.943 -0.141 0.016 0.9999
VolJ
Bart
HACSC
[0, 0.08] 5.328 671.516 -1501.684 0.394 20.696 -163.751 0.9999
j = 45 (0.08, 0.2] 33.155 256.236 -658.655 1.049 -0.638 0.568 0.9999
(0.2, 0.5] 57.012 9.007 -18.934 1.008 -0.361 0.210 0.9999
(0.5, 1.0] 58.668 -2.588 -2.364 0.948 -0.146 0.019 0.9999
VolJ
Bart
HACSC
[0, 0.08] 4.736 744.670 -2073.004 0.417 19.852 -157.049 0.9999
j = 46 (0.08, 0.2] 34.587 259.581 -673.590 1.054 -0.756 0.974 0.9998
(0.2, 0.5] 58.494 9.314 -19.931 1.009 -0.380 0.239 0.9999
(0.5, 1.0] 60.087 -2.649 -2.378 0.948 -0.150 0.021 0.9999
VolJ
Bart
HACSC
[0, 0.08] 4.223 805.883 -2536.192 0.439 19.185 -151.964 0.9999
j = 47 (0.08, 0.2] 35.801 263.561 -691.210 1.063 -0.893 1.455 0.9999
(0.2, 0.5] 59.657 10.416 -21.885 1.014 -0.410 0.281 0.9999
(0.5, 1.0] 61.241 -2.372 -2.645 0.952 -0.160 0.027 0.9999
VolJ
Bart
HACSC
[0, 0.08] 4.094 856.973 -2888.564 0.452 18.828 -150.172 0.9999
j = 48 (0.08, 0.2] 37.578 262.998 -695.698 1.061 -0.922 1.629 0.9999
(0.2, 0.5] 61.245 9.784 -21.300 1.012 -0.407 0.273 0.9999
(0.5, 1.0] 62.850 -2.858 -2.439 0.948 -0.154 0.023 0.9999
VolJ
Bart
HACSC
[0, 0.08] 2.772 959.098 -3712.162 0.488 17.442 -138.922 0.9999
j = 49 (0.08, 0.2] 39.144 263.351 -698.393 1.061 -0.964 1.744 0.9999
(0.2, 0.5] 62.926 8.887 -20.647 1.007 -0.399 0.265 0.9999
(0.5, 1.0] 64.400 -3.247 -2.276 0.944 -0.152 0.021 0.9999
VolJ
Bart
HACSC
[0, 0.08] 2.719 994.386 -3914.358 0.501 17.358 -139.690 0.9999
j = 50 (0.08, 0.2] 41.136 255.143 -676.507 1.055 -0.855 1.430 0.9999
(0.2, 0.5] 63.980 10.145 -22.624 1.015 -0.432 0.307 0.9999
(0.5, 1.0] 65.628 -3.303 -2.319 0.948 -0.155 0.023 0.9999
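The coefficients in Table B are intended to be plugged into the fitted critical-value function defined in the main text. As a purely illustrative sketch, the snippet below evaluates the two piecewise polynomials for the j = 38 rows of the table; the functional forms γ(b) = γ1·b + γ2·b² + γ3·b³ and θ(b) = θ1 + θ2·b + θ3·b², along with all helper names, are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the j = 38 rows of Table B (Bartlett kernel).
# The polynomial forms below are assumed; the actual critical-value
# function combining these coefficients is defined in the main text.
ROWS_J38 = {
    # (b_low, b_high]: ((gamma1, gamma2, gamma3), (theta1, theta2, theta3))
    (0.00, 0.08): ((7.747, 330.407, 691.010), (0.289, 22.373, -167.543)),
    (0.08, 0.20): ((21.597, 265.258, -658.767), (1.048, -0.519, 0.002)),
    (0.20, 0.50): ((46.708, 12.005, -20.278), (1.007, -0.354, 0.207)),
    (0.50, 1.00): ((48.595, -0.480, -2.856), (0.955, -0.159, 0.027)),
}

def _row_for(b):
    """Pick the coefficient row whose half-open range (lo, hi] contains b."""
    for (lo, hi), coefs in ROWS_J38.items():
        if lo < b <= hi or (lo == 0.0 and b == 0.0):
            return coefs
    raise ValueError("b must lie in [0, 1]")

def gamma_poly(b):
    # Assumed form: gamma(b) = g1*b + g2*b**2 + g3*b**3 (no constant term).
    (g1, g2, g3), _ = _row_for(b)
    return g1 * b + g2 * b**2 + g3 * b**3

def theta_poly(b):
    # Assumed form: theta(b) = t1 + t2*b + t3*b**2.
    _, (t1, t2, t3) = _row_for(b)
    return t1 + t2 * b + t3 * b**2
```

One informal check on the assumed forms: adjacent pieces nearly agree at the breakpoints, e.g. the (0.08, 0.2] and (0.2, 0.5] gamma fits both give approximately 9.66 at b = 0.2.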
References
Adler, R.: (1981), The Geometry of Random Fields, Wiley.
Arellano, M.: (1987), Computing robust standard errors for within-groups estimators, Oxford Bulletin of Economics and Statistics 49(4), 431–434.
Basu, A. K. and Dorea, C. C. Y.: (1979), On functional central limit theorem for stationary martingale random fields, Acta Mathematica Academiae Scientiarum Hungaricae 33, 307–316.
Bertrand, M., Duflo, E. and Mullainathan, S.: (2004), How much should we trust differences-in-differences estimates?, Quarterly Journal of Economics 119, 249–275.
Bester, A., Conley, T. and Hansen, C.: (2011), Inference with dependent data using cluster covariance estimators, Journal of Econometrics.
Bester, A., Conley, T., Hansen, C. and Vogelsang, T.: (2008), Fixed-b asymptotics for spatially dependent robust nonparametric covariance matrix estimators. Working Paper, Department of Economics, Michigan State University.
Conley, T. G.: (1999), GMM estimation with cross sectional dependence, Journal of Econometrics 92(1), 1–45.
Deo, C.: (1975), A functional central limit theorem for stationary random fields, The Annals of Probability 3(4), 708–715.
Driscoll, J. and Kraay, A.: (1998), Consistent covariance matrix estimation with spatially dependent panel data, Review of Economics and Statistics 80(4), 549–560.
Fama, E. and MacBeth, J.: (1973), Risk, return, and equilibrium: Empirical tests, Journal of Political Economy 81(3), 607–636.
Gonçalves, S.: (2011), The moving blocks bootstrap for panel linear regression models with individual fixed-effects, Econometric Theory.
Götze, F. and Künsch, H. R.: (1996), Second-order correctness of the blockwise bootstrap for stationary observations, Annals of Statistics 24, 1914–1933.
Hansen, C.: (2007), Asymptotic properties of a robust variance matrix estimator for panel data when T is large, Journal of Econometrics 141(2), 597–620.
Hashimzade, N. and Vogelsang, T. J.: (2008), Fixed-b asymptotic approximation of the sampling behavior of nonparametric spectral density estimators, Journal of Time Series Analysis 29, 142–162.
Hoechle, D.: (2007), Robust standard errors for panel regressions with cross-sectional dependence, Stata Journal 7(3), 281–312.
Ibragimov, R. and Müller, U.: (2010), t-statistic based correlation and heterogeneity robust inference, Journal of Business and Economic Statistics 28(4), 453–468.
Kelejian, H. and Prucha, I.: (2007), HAC estimation in a spatial framework, Journal of Econometrics 140(1), 131–154.
Kiefer, N. M. and Vogelsang, T. J.: (2005), A new asymptotic theory for heteroskedasticity-autocorrelation robust tests, Econometric Theory 21, 1130–1164.
Kim, M.: (2010), Heteroskedasticity and spatiotemporal dependence robust inference for linear panel models with fixed effects. Working Paper, Department of Economics, UCSD.
Kim, M. and Sun, Y.: (2011), Spatial heteroskedasticity and autocorrelation consistent estimation of covariance matrix, Journal of Econometrics 160, 349–371.
Lee, J. and Solon, G.: (2011), The fragility of estimated effects of unilateral divorce laws on divorce rates. Working Paper, Department of Economics, Michigan State University.
Newey, W. K. and West, K. D.: (1987), A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica 55, 703–708.
Petersen, M. A.: (2009), Estimating standard errors in finance panel data sets: Comparing approaches, Review of Financial Studies 22(1), 435–480.
Sun, Y., Phillips, P. C. B. and Jin, S.: (2008), Optimal bandwidth selection in heteroskedasticity-autocorrelation robust testing, Econometrica 76, 175–194.
Velasco, C. and Robinson, P. M.: (2001), Edgeworth expansions for spectral density estimates and studentized sample mean, Econometric Theory 17, 497–539.
Vogelsang, T. J.: (2008), Spectral analysis, in S. N. Durlauf and L. E. Blume (eds), The New Palgrave Dictionary of Economics, Palgrave Macmillan.
White, H.: (1980), A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica 48, 817–838.
Wolfers, J.: (2006), Did unilateral divorce laws raise divorce rates? A reconciliation and new results, The American Economic Review 96(5), 1802–1820.
Wooldridge, J. M.: (2002), Econometric Analysis of Cross Section and Panel Data, Cambridge, MA: MIT Press.
Wooldridge, J. M.: (2003), Cluster-sample methods in applied econometrics, American Economic Review Papers and Proceedings 93(2), 133–138.