
Geophys. J. Int. (2011) ???, 1-8

Are declustered earthquake catalogs Poisson?

Brad Luen
Department of Mathematics
Reed College
Portland, OR, USA

Philip B. Stark
Department of Statistics
University of California
Berkeley, CA, 94720-3860 USA

Accepted 2011 ???? ??; Received 2011 January ??; in original form 2011 January 29

SUMMARY
We present strong statistical evidence that, contrary to previous claims, spatially inhomogeneous, temporally homogeneous Poisson process (SITHP) models do not fit declustered Southern California seismicity catalogs. Earlier claims were based on tests that divide time into intervals, count the events in those intervals, and apply a chi-square test to the counts, approximating the expected number of intervals with each event count by estimating the rate of the hypothesized Poisson process from the data. This test lacks a sound statistical basis and can have low power, for at least two reasons: (1) It ignores earthquake locations. (2) It ignores the order of the time intervals. For instance, if the test does not reject for a given set of counts, it would not reject if those counts had occurred in decreasing order with time, which would be implausible for a SITHP. We evaluate declustered catalogs derived from the 1932-1971 Southern California Earthquake Center catalog of 1,556 events with magnitude 3.8 and above by applying four window declustering methods. A Kolmogorov-Smirnov temporal test finds P-values of 0.006 to 0.022 for the hypothesis that the declustered catalogs follow a homogeneous temporal Poisson process: a poor fit. Since homogeneous Poisson processes do not fit the marginal temporal distribution, SITHPs cannot fit the spatio-temporal distribution. The hypothesis that declustered catalogs follow a SITHP implies the weaker hypothesis that event times and locations in declustered catalogs are conditionally exchangeable. For three of the declustered catalogs, a nonparametric spatio-temporal test finds P-values of 0.005 or below for this weaker hypothesis. For the fourth declustered catalog, the P-value is 0.069, above 0.05 but still small.
Key words: Earthquake interaction, forecasting, and prediction; Probabilistic forecasting; Statistical seismology

1 INTRODUCTION

Earthquake catalogs are often declustered as a precursor to modeling the remaining events as a realization of a spatially inhomogeneous, temporally homogeneous Poisson process (SITHP). Tests of the null hypothesis that declustered catalogs are realizations of a SITHP have not rejected the null hypothesis, leading some studies to conclude that declustered catalogs are Poisson. For instance, the title of Gardner and Knopoff (1974) is "Is the sequence of earthquakes in Southern California, with aftershocks removed, Poissonian?" The abstract: "Yes."

The claim in Gardner and Knopoff (1974) is based on a chi-square test described in section 3.1. The assumptions of the chi-square test fail when seismicity really does follow a SITHP, casting some doubt on the conclusion. Moreover, not rejecting the null hypothesis does not imply that the null hypothesis is true: Failure to reject could be a Type II error, especially if the test has little power against plausible alternatives, which we show to be the case. (And if declustering removes enough events, no test can have much power: The few remaining events will always pass a test for Poisson behavior.)

The Kolmogorov-Smirnov (KS) test, described in section 3.3, can also be used to test the hypothesis that declustered seismicity follows a SITHP. Unlike the assumptions of the chi-square test, the assumptions of the KS test are satisfied when declustered seismicity actually follows a SITHP.
The KS test does not have good power against every alternative, but it has better power than the chi-square test against some kinds of clustering. The KS test finds small P-values for the hypothesis that the 1932-1971 Southern California Earthquake Center (SCEC) catalog of events with magnitude 3.8 and above, declustered with four methods, follows a SITHP.

The chi-square and KS tests use only the times of events, not event locations. Section 4 gives an abstract nonparametric permutation test that uses locations as well as times to test the hypothesis that event times in declustered catalogs are conditionally exchangeable given event locations, a hypothesis implied by the hypothesis that declustered catalogs are a realization of a SITHP. The test, based on ideas in Romano (1988, 1989), finds small P-values for this weaker hypothesis, again using the 1932-1971 SCEC catalog declustered in four ways. We believe this test is new to seismology.

We study only main-shock window and linked-window declustering methods. There are also stochastic declustering methods, which use chance to decide whether to remove a particular event (Zhuang et al. 2002; Vere-Jones 1970); the "waveform similarity approach" (Barani et al. 2007); and others. See Davis and Frohlich (1991) and Zhuang et al. (2002) for taxonomies.

Main-shock window methods remove the earthquakes in a space-time window around every main shock, suitably defined. Main-shock window methods can be thought of as punching a hole in the catalog after each main shock. The hole is the window. Gardner and Knopoff's windows (Knopoff and Gardner 1972; Gardner and Knopoff 1974) are common in main-shock declustering. They are larger in space and time the larger the shock is.

Linked-window methods calculate a space-time window for every event in the catalog, not just main shocks. In linked-window methods, an event is in a cluster if and only if it falls within the window of at least one other event in that cluster. Linked-window declustering replaces each cluster with a single event: for instance, the first, the largest, or an "equivalent" event. The most widely used linked-window method is Reasenberg's (Reasenberg 1985). Reasenberg's windows are larger in space but shorter in time the larger the shock is.

Below, we examine temporal and spatio-temporal tests of the hypotheses that the 1932-1971 SCEC catalog of events with magnitude 3.8 and above, declustered using four main-shock window and linked-window methods, follows a temporally homogeneous, spatially heterogeneous Poisson process. The resulting P-values are generally quite small: The declustered catalogs are not statistically compatible with SITHP models.

2 THE POISSON NULL HYPOTHESIS

Suppose that the declustered catalog has $n$ events. The longitude, latitude, and time of the $i$th event are $(x_i, y_i, t_i)$, $i = 1, \ldots, n$. Events are not necessarily in chronological order. We do not consider earthquake depths. We study the null hypothesis that the points $\{(x_i, y_i, t_i)\}_{i=1}^n$ are a realization of a SITHP on spatial domain $S$ and time interval $(0, T]$, where $S$ and $T$ are fixed.

In a SITHP, event locations are independent of event times, so the space-time rate is the product of the marginal spatial rate and the uniform temporal rate. The total number of events, $N$, is random. We denote the random locations and times of the $N$ events by $\{(X_i, Y_i, T_i)\}_{i=1}^N$. The times between successive events are (marginally) independent and identically distributed (iid) exponential random variables. The numbers of events in disjoint time intervals are independent Poisson random variables with means proportional to the durations of the intervals; this is the basis of the chi-square test described in section 3.1. Conditional on $N = n$, the times $\{T_i\}_{i=1}^n$ are (marginally) iid uniform random variables; this is the basis of the Kolmogorov-Smirnov test described in section 3.3.

3 TEMPORAL TESTS

3.1 The Chi-square test

We believe that the test of the hypothesis that declustered catalogs are realizations of a homogeneous temporal Poisson process used by Gardner and Knopoff (1974) and Barani et al. (2007) works as follows:

(i) Pick $K \ge 1$. Partition the study period into $K$ time intervals of length $T/K$. Count the events in each interval:

$$N_k \equiv \#\{i : t_i \in ((k-1)T/K,\; kT/K]\}, \quad k \in \{1, \ldots, K\}. \tag{1}$$

(ii) Pick $C \ge 2$, the number of categories. For $k \in \{1, \ldots, K\}$, interval $k$ is in category $c \in \{0, \ldots, C-2\}$ if it contains $c$ events; interval $k$ is in category $C-1$ if it contains $C-1$ or more events. Let $O_c$ be the number of intervals observed to be in category $c$.

(iii) Estimate the theoretical rate $\lambda$ of events per interval. We believe Gardner and Knopoff (1974) and Barani et al. (2007) used the estimate

$$\hat{\lambda} = n/K. \tag{2}$$

(iv) Calculate the expected number of intervals in category $c$ on the assumption that events follow a Poisson process with rate $\hat{\lambda}$:

$$E_c \equiv \begin{cases} K e^{-\hat{\lambda}} \hat{\lambda}^c / c!, & c = 0, 1, \ldots, C-2, \\ K - \sum_{j=0}^{C-2} E_j, & c = C-1. \end{cases} \tag{3}$$

(v) Calculate the chi-square statistic:

$$\chi^2 \equiv \sum_{c=0}^{C-1} \frac{(O_c - E_c)^2}{E_c}. \tag{4}$$

Take the nominal P-value to be

$$P \equiv \Pr\{X \ge \chi^2\}, \tag{5}$$

where $X$ is a random variable with a chi-square distribution with $d$ degrees of freedom.

The nominal and true P-values depend on the arbitrary choices $K$ and $C$, on $d$, and on the method of estimating $\lambda$.
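Steps (i)-(v) of section 3.1 can be sketched in code. The block below is a minimal Python illustration under stated assumptions, not the procedure any cited study actually ran: the function names `chi2_sf` and `poisson_chi_square_test` are hypothetical, event times are assumed to lie in $(0, T]$, the default $d = C - 2$ is only one of the choices discussed in this paper, and the chi-square tail probability is computed from the standard incomplete-gamma recurrence rather than from a statistics library.

```python
import math


def chi2_sf(x, d):
    """Pr{X >= x} for a chi-square variable X with integer d >= 1 degrees of freedom."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: d = 1 (complementary error function) or d = 2 (exponential tail).
    if d % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a e^{-z} / Gamma(a + 1), with z = x / 2.
    while 2 * a < d:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def poisson_chi_square_test(times, T, K, C, d=None):
    """Chi-square test of equations (1)-(5) with the plug-in rate n/K.

    Returns (statistic, nominal P-value, observed counts O_c, expected counts E_c).
    """
    n = len(times)
    # Step (i): count events in the K intervals ((k-1)T/K, kT/K].
    counts = [0] * K
    for t in times:
        k = min(K - 1, max(0, math.ceil(t * K / T) - 1))
        counts[k] += 1
    # Step (ii): category c holds intervals with c events; the last holds C-1 or more.
    observed = [0] * C
    for c in counts:
        observed[min(c, C - 1)] += 1
    # Step (iii): plug-in rate estimate, equation (2).
    lam = n / K
    # Step (iv): expected category counts, equation (3).
    expected = [K * math.exp(-lam) * lam ** c / math.factorial(c) for c in range(C - 1)]
    expected.append(K - sum(expected))
    # Step (v): chi-square statistic and nominal P-value, equations (4)-(5).
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    if d is None:
        d = C - 2  # assumption: one of the two candidate choices, C-1 or C-2
    return stat, chi2_sf(stat, d), observed, expected


if __name__ == "__main__":
    # Toy data only: four events in a study period of length 4, split into 4 intervals.
    print(poisson_chi_square_test([0.5, 1.5, 1.7, 3.2], T=4.0, K=4, C=3))
```

Because the statistic is computed only from the category counts, shuffling the order of the intervals leaves the result unchanged, which is the insensitivity to interval order that the paper criticizes.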
3.2 Does the chi-square approximation hold?

Let $I_k = c$ if the number of events in the $k$th interval is in category $c$. In the basic chi-square test for goodness of fit, the null hypothesis is that (i) $\{\Pr\{I_k = c\}\}_{c=0}^{C-1}$ are known and do not depend on $k$, and (ii) $\{I_k\}_{k=1}^{K}$ are independent (Lehmann 2005). The numbers of data in the $C$ categories, $\{O_c\}_{c=0}^{C-1}$, then have a multinomial joint distribution. The null distribution of the chi-square statistic converges to a chi-square distribution with $C - 1$ degrees of freedom as the number $K$ of data increases, but the finite-sample distribution is only approximately chi-squared.

Neither (i) nor (ii) is true in testing whether declustered catalogs are Poisson. (i) is false because the hypothesis that declustered seismicity is Poisson does not completely specify the category probabilities $\{\Pr\{I_k = c\}\}_{c=0}^{C-1}$. Instead, those probabilities are estimated from an estimate of the marginal temporal rate of the Poisson process. Estimating $\{\Pr\{I_k = c\}\}_{c=0}^{C-1}$ from the data changes the distribution of the chi-square statistic.

(ii) is false too: Conditional on the estimated temporal rate, the random variables $\{I_k\}_{k=1}^{K}$ are not independent because they are related through the total number of earthquakes, an ingredient in estimating the rate. For instance, if $n \ge C - 1$ and $I_k = 0$, $k = 1, \ldots, K-1$, we would know that $I_K = C - 1$. Hence, the joint distribution of $\{O_c\}_{c=0}^{C-1}$ is not multinomial when the Poisson null hypothesis is true.

When the category probabilities $\{\Pr\{I_k = c\}\}_{c=0}^{C-1}$ depend on $p$ parameters and those parameters are estimated from $\{O_c\}$ using an efficient maximum likelihood estimator (MLE), asymptotically the chi-square statistic has a chi-square distribution with $C - p - 1$ degrees of freedom (Lehmann 2005). The MLE $\hat{\lambda}_{ML}$ of $\lambda$ from $\{O_c\}$ solves

$$\frac{\sum_{c=0}^{C-2} c\, O_c}{\hat{\lambda}_{ML}} + \frac{\sum_{i=C-2}^{\infty} \hat{\lambda}_{ML}^{i} / i!}{\sum_{j=C-1}^{\infty} \hat{\lambda}_{ML}^{j} / j!}\, O_{C-1} = K. \tag{6}$$

The estimate $\hat{\lambda} = n/K$ is not a function of the counts $\{O_c\}$ alone. Depending on the data, differences between $\hat{\lambda}_{ML}$ and $\hat{\lambda}$ can be minor (the estimates are identical if $O_{C-1} = 0$), but we know of no guarantee that the asymptotic distribution of the chi-square statistic based on $\hat{\lambda}$ is chi-square.

To assess the chi-square approximation to the distribution of the chi-square statistic when the Poisson null hypothesis is true and $\lambda$ is estimated using the naive estimator $\hat{\lambda}$, we simulated $10^6$ independent realizations of a temporal rate 1 Poisson process for 500 time intervals of length one, then counted the number of intervals with 0, 1, 2, or more than 2 events. We estimated the rate to be $\hat{\lambda} = n/K$ and performed a chi-square test with $C = 4$ categories at nominal significance level $\alpha = 5\%$ for each realization. Using the 5% critical value of the chi-square distribution with 2 degrees of freedom led to rejecting the null hypothesis 5.01% of the time. Using the 5% critical value of the chi-square distribution with 3 degrees of freedom led to rejecting the null hypothesis only 2.02% of the time. Similar tests with different rates and different numbers of categories suggest that setting the degrees of freedom to $C - 2$ gives a better approximation to the true significance level when $\lambda$ is estimated by $\hat{\lambda} = n/K$.

3.3 The Kolmogorov-Smirnov test

The Kolmogorov-Smirnov (KS) test compares the empirical cumulative distribution function $\hat{F}_n(x)$ of a random variable to a fixed reference distribution function $F(x)$ (Lehmann 2005). The KS test rejects when

$$D_n \equiv \sup_x \left| \hat{F}_n(x) - F(x) \right| \ge C(n, \alpha). \tag{7}$$

According to the Massart-Dvoretzky-Kiefer-Wolfowitz inequality (Massart 1990), if we take

$$C(n, \alpha) = \sqrt{\frac{\ln (2/\alpha)}{2n}}, \tag{8}$$

the significance level of the resulting test is at most $\alpha$. In seismology, the KS test has been used to assess the uniformity of declustered earthquake sequences preceding main shocks (Matthews and Reasenberg 1988; Reasenberg and Matthews 1988).

If declustered earthquakes follow a SITHP, then conditional on $N = n$ the times $\{T_i\}_{i=1}^n$ are iid uniform on $(0, T]$. Their common cumulative distribution function is $F(t) = t/T$. Hence, conditional on $n$,

$$D_n = \sup_t \left| \frac{1}{n} \sum_{i=1}^{n} 1(t_i \le t) - \frac{t}{T} \right|. \tag{9}$$

Unlike the chi-square test, the KS test has no ad hoc choices analogous to $K$, $C$, $d$ and $\hat{\lambda}$. The KS test has asymptotic power 1 against the alternative that the data are iid with any fixed distribution $G \ne F$.

3.4 Power

The chi-square test has low power against some kinds of clustering that violate the Poisson hypothesis. To see why, suppose all the intervals with events occur at the beginning of the study period. This would be very unlikely if declustered seismicity followed a homogeneous Poisson process. However, the chi-square test is insensitive to the order of the intervals: it depends only on $\{O_c\}_{c=0}^{C-1}$, the numbers of intervals with 0 events, 1 event, etc. Despite the long-term variation in the rate of seismicity, the test will not reject if $O_c$ is close to $E_c$ for all $c$. The KS test tends to be more sensitive to long-term rate variations, but less sensitive to departures from exponential inter-event times. (Neither the chi-square test nor the KS test uses the spatial locations of the events. A process can be non-Poisson in space-time yet have a Poisson marginal temporal distribution; see, e.g., Luen (2010, Chapter 3).)

To illustrate the complementary limitations of the chi-square and KS tests, we estimated the power of significance level-0.05 chi-square and KS tests against the alternatives that seismicity follows an inhomogeneous Poisson process or a gamma renewal process, using 10,000 forty-year simulations of each process. (We chose 40 years to match the lengths of the actual catalogs used below.) The chi-square test used ten-day intervals and four categories. Table 1 shows the results.

In the inhomogeneous Poisson process, the expected rate is 0.25 events per ten days for the first twenty years and 0.5 events per ten days for the next twenty years, a
clear departure from temporally homogeneous Poisson behavior. The KS test rejected in all 10,000 simulations. But the chi-square test rejected in only 1658 of the 10,000 simulations: It has little power because it ignores the order of the intervals.

In the gamma renewal process, inter-event times are independent with a Gamma(2, 1) distribution (the time unit is ten days). The expected rate is constant, but the distribution of inter-event times is not exponential, as it would be in a homogeneous Poisson process. The chi-square test rejected in all 10,000 simulations: More intervals have one event and fewer have two or more events than would be expected if the process were Poisson. The KS test rejected in only 9 of the 10,000 simulations.

The KS test is sensitive to rate variations across the study period, but not to local departures from exponential inter-event times. The chi-square test is sensitive to departures from exponential inter-event times in short intervals, but not to long-term rate variations. Neither test uses spatial information, which we address in section 4.

Process                  KS       Chi-square
Inhomogeneous Poisson    1        0.1658
Gamma renewal            0.0009   1

Table 1. Estimated power of significance level 0.05 tests that two temporal point processes are homogeneous Poisson processes, based on 10,000 40-year simulations of each process. In the "Inhomogeneous Poisson" process, events occur at expected rate 0.25 per ten days for twenty years, then at expected rate 0.5 per ten days for another twenty years. In the "Gamma renewal" process, the times between events are independent and follow a gamma distribution with shape 2 and rate 1. The KS test is described in section 3.3. The chi-square test, which uses ten-day intervals, $C = 4$ categories, and $d = C - 2 = 2$ degrees of freedom, is described in section 3.1. For the inhomogeneous Poisson process, the KS test rejects in all simulations and the chi-square test usually does not reject. For the gamma renewal process, the chi-square test rejects in all simulations and the KS test rarely rejects.

3.5 Tests on declustered catalogs

We consider the following four window-based declustering algorithms:

Method 1: Variant (a) Remove every event that is in the window of some other event (Gardner and Knopoff 1974). Variant (b) Divide the catalog into clusters as follows: An event is in a given cluster if and only if it is in the window of at least one other event in the cluster. In every cluster, remove all events except the largest (Gardner and Knopoff 1974).

Method 2: Consider the events in chronological order. If the $i$th event is in the window of a preceding larger shock that has not already been deleted, delete it. If a larger shock is in the window of the $i$th event, delete the $i$th event. Otherwise, retain the $i$th event (Knopoff and Gardner 1972).

Method 3: Reasenberg's method (Reasenberg 1985).

Methods 1a, 1b, and 3 are linked-window methods. Gardner and Knopoff (1974) found that methods 1a and 1b gave similar results for 1932-1971 Southern California seismicity for their windows. Method 2 is a main-shock window method.

Gardner and Knopoff (1974) performed a variety of chi-square tests on a number of catalogs declustered using Method 1a. Among other things, they report results for a catalog of earthquakes with magnitude at least 3.8 occurring in the "Southern California Local Area" from 1932-1971. That raw catalog had 1,751 events; the declustered catalog had 503 events. They divided the forty-year period into ten-day intervals, found $O_c$ and estimated $E_c$ for some range of $c$, and compared the chi-square statistic to a chi-square distribution with 2 degrees of freedom. They did not state $C$, how they estimated $\lambda$, nor whether they used $d = C - 1$ or $d = C - 2$ in their tests. They found a P-value of 0.0599 and hence did not reject the hypothesis that declustered catalogs are Poisson.

We do not have the catalog used by Gardner and Knopoff (1974), so we used the SCEC catalog (http://www.data.scec.org/catalog_search/data_mag_loc.php) for the same time period, covering approximately the same region. The SCEC catalog contained 1,556 events with magnitude at least 3.8 between 1932 and 1971. We declustered the catalog using Methods 1a, 1b, and 2 with the Gardner and Knopoff (1974) windows and using Method 3. (We used Stefan Wiemer's ZMAP package for MATLAB, http://www.earthquake.ethz.ch/software/zmap, to apply Reasenberg's method.) The declustered catalogs contained 437, 424, 544, and 985 events, respectively. Figure 1 maps the events in the original catalog and the events that remain after declustering using methods 1b, 2, and 3. The maps for methods 1a and 1b are visually indistinguishable.

We tested the SITHP hypothesis using the KS test and the chi-square test with $C = 4$ and $d = C - 2 = 2$. We rejected the null hypothesis if the KS test had a P-value less than 0.025 or the chi-square test had a nominal P-value less than 0.025. If the null hypothesis is true, the chance of a Type I error is no greater than 0.05, by Bonferroni's inequality (ignoring the difference between the nominal and true P-value for the chi-square test). We also compared chi-square tests using $\hat{\lambda} = n/K$ to tests that estimate $\lambda$ by solving (6); differences were negligible. Table 2 shows the results. None of the declustered catalogs appears to be Poisson, contradicting Gardner and Knopoff (1974). For all four declustering methods, the KS test rejects the Poisson hypothesis at level 0.025. The chi-square test rejects at nominal level 0.025 for methods 2 and 3.

4 SPATIO-TEMPORAL TESTS

4.1 A weaker null hypothesis: conditionally exchangeable times

The KS test rejects the hypothesis that the marginal distribution of times in the declustered SCEC catalogs is Poisson. The marginal distribution of times for a SITHP is Poisson. Hence the test in the previous section also rejects the hypothesis that declustered 1932-1971 SCEC catalogs are realizations of a SITHP.
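The KS test of section 3.3, on which the rejection just described rests, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used for the results in this paper: the function names `ks_uniform_test`, `dkw_critical_value`, and `simulate_two_rate_poisson` are hypothetical, the P-value returned is the conservative upper bound implied by the Massart-Dvoretzky-Kiefer-Wolfowitz inequality, and the simulated example approximates twenty years by 730 ten-day intervals.

```python
import math
import random


def ks_uniform_test(times, T):
    """KS distance D_n between event times and the uniform law on (0, T], equation (9).

    Returns (D_n, upper bound on the P-value from the DKW inequality).
    """
    n = len(times)
    u = sorted(t / T for t in times)
    d_n = 0.0
    for i, x in enumerate(u):
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d_n = max(d_n, (i + 1) / n - x, x - i / n)
    p_bound = min(1.0, 2.0 * math.exp(-2.0 * n * d_n ** 2))
    return d_n, p_bound


def dkw_critical_value(n, alpha):
    """Critical value C(n, alpha) of equation (8)."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def simulate_two_rate_poisson(rng, rate1, rate2, half_length):
    """Poisson process with rate1 on (0, half_length] and rate2 on the next half_length."""
    times = []
    for start, rate in ((0.0, rate1), (half_length, rate2)):
        # Memorylessness lets the exponential clock restart at the change point.
        t = start + rng.expovariate(rate)
        while t <= start + half_length:
            times.append(t)
            t += rng.expovariate(rate)
    return times


if __name__ == "__main__":
    # Illustration only: the rate doubles halfway through, as in the power study of section 3.4.
    rng = random.Random(0)
    times = simulate_two_rate_poisson(rng, 0.25, 0.5, 730.0)
    d_n, p_bound = ks_uniform_test(times, 1460.0)
    print(len(times), d_n, p_bound, d_n >= dkw_critical_value(len(times), 0.05))
```

Rejecting when the bound falls below $\alpha$ is equivalent to rejecting when $D_n \ge C(n, \alpha)$, so the sketch has level at most $\alpha$ but may be less powerful than a test based on the exact null distribution of $D_n$.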
Figure 1. (a): 1932-1971 SCEC catalog of 1,556 events of magnitude 3.8 or greater in Southern California. (b): The 424 events that remain after declustering using method 1b. (c): The 544 events that remain after declustering using method 2. (d): The 985 events that remain after declustering using method 3.

Method   n     KS      Chi-square (plug-in)   Chi-square (MLE)   Reject?
1a       437   0.012   0.087                  0.087              Yes
1b       424   0.006   0.297                  0.295              Yes
2        544   0.022   < 0.001                < 0.001            Yes
3        985   0.003   0.000                  0.000              Yes

Table 2. P-values for tests of the null hypotheses that the 1932-1971 SCEC catalog of 1,556 events with magnitude 3.8 or larger, declustered using Methods 1a, 1b, 2, and 3 of section 3.5, has a homogeneous Poisson distribution in time. The number of events that remain after declustering is $n$. "KS" is an upper bound on the P-value for the Kolmogorov-Smirnov test. "Chi-square plug-in" is the nominal P-value for a chi-square test that estimates $\lambda$ by $n/K$. "Chi-square MLE" is the nominal P-value for a chi-square test that estimates $\lambda$ by maximum likelihood from the category counts using equation (6). "Reject" is "yes" if the P-value for the KS test or the nominal P-value for the chi-square test is less than 0.025.

Moreover, SITHPs can have events arbitrarily close together. But catalogs declustered with window methods have a minimum spacing between events: If a catalog contains two events very close in space and time, the later event will fall within the window of the former, and one or both of them will be deleted. Hence, catalogs declustered using window methods cannot be realizations of SITHPs.

To try to salvage part of the SITHP hypothesis, we test a much weaker condition implied by SITHP: the hypothesis that times are conditionally exchangeable given event locations. Let $\Pi$ be the set of all $n!$ permutations of $\{1, \ldots, n\}$. We say a process has conditionally exchangeable times if, conditional on the locations,

$$\{T_1, \ldots, T_n\} \overset{d}{=} \{T_{\pi(1)}, \ldots, T_{\pi(n)}\} \tag{10}$$

for all permutations $\pi \in \Pi$. (The notation $\overset{d}{=}$ means "has the same probability distribution as.") If event times are conditionally iid given event locations, they are conditionally exchangeable given event locations. Since event times in SITHPs are conditionally iid uniform given event locations, SITHPs have conditionally exchangeable times given event locations.

Under the hypothesis of conditionally exchangeable times, given the set of locations $\{(x_i, y_i)\}$ and (separately) the set of times $\{t_i\}$, all one-to-one assignments of times to locations have the same chance. If events close in space tend to be close in time (the kind of clustering real seismicity exhibits), times are not conditionally exchangeable. If events close in space tend to be distant in time (which can result from deleting events in windows), times are not conditionally exchangeable.

We test the hypothesis that times are conditionally exchangeable by adapting abstract methodology developed by Romano (1988, 1989). Let $\hat{P}_n$ be the empirical distribution of the times and locations of the $n$ observed events. Let $\tau \hat{P}_n$ be the distribution that has equal mass at every element of the orbit of the data under the permutation group (permuting times relative to the locations).

A set $V \subset \mathbb{R}^3$ is a lower-left quadrant if, for some $(x_0, y_0, t_0)$, it is of the form:

$$V = \{(x, y, t) \in \mathbb{R}^3 : x \le x_0 \text{ and } y \le y_0 \text{ and } t \le t_0\}. \tag{11}$$

Let $\mathcal{V}$ be the set of all lower-left quadrants. The test statistic is the supremum (over all lower-left quadrants $V \in \mathcal{V}$) of the difference between the probability $\hat{P}_n$ assigns to $V$ and the probability that $\tau \hat{P}_n$ assigns to $V$:

$$\delta(\hat{P}_n) \equiv \sup_{V \in \mathcal{V}} \left| \hat{P}_n(V) - (\tau \hat{P}_n)(V) \right|. \tag{12}$$

The P-value is the fraction of elements of the orbit (permutations) for which the test statistic is at least as large as it is for the actual data. Since the total number of permutations is large, in practice, the P-value is estimated from a random sample of permutations. (Alternatively, the test can be defined in terms of a deterministic or random subset of the permutations; the resulting test will still have level $\alpha$ but its power might suffer.)

We implemented a test for conditionally exchangeable times in R (http://cran.r-project.org/). Code is available online at http://statistics.berkeley.edu/~stark/Code/Quake/permutest.r. Appendix A describes the algorithm. Table 3 shows the P-values that result from applying the test to the 1932-1971 SCEC catalog of events with magnitude 3.8 or above, declustered using the four methods described in section 3.5. The evidence is quite strong that times are not conditionally exchangeable for methods 1a, 1b, and 3. The evidence that times are not conditionally exchangeable for method 2 is somewhat weaker.

Method   n     P-value   CI               Reject?
1a       437   0.005     [0.003, 0.007]   Yes
1b       424   0.000     [0.000, 0.001]   Yes
2        544   0.069     [0.063, 0.076]   No
3        985   0.000     [0.000, 0.001]   Yes

Table 3. P-values for tests of the null hypotheses that the 1932-1971 SCEC catalog of 1,556 events with magnitude 3.8 or larger, declustered using Methods 1a, 1b, 2, and 3 of section 3.5, have conditionally exchangeable times given the locations of the events. The number of events that remain after declustering is $n$. The P-values are estimated from 10,000 random permutations. "CI" is a 99% confidence interval for the P-value that would result from examining all permutations instead of a random sample of permutations.

5 DISCUSSION

The chi-square test others have used to assess whether declustered catalogs fit SITHP models relies on ad hoc tuning constants, lacks theoretical justification, can have a significance level larger than its nominal significance level, and has low power against many plausible alternatives. A Kolmogorov-Smirnov (KS) test for the same hypothesis has no ad hoc tuning constants, is guaranteed to be conservative, and has more power against many alternatives. Both the chi-square test and the Kolmogorov-Smirnov test condition on the total number of events. The KS test shows that declustering the 1932-1971 SCEC catalog of events of magnitude 3.8 and above using main-shock window or linked-window methods does not leave a set of events that looks like a realization of a homogeneous Poisson process: P-values range from 0.006 to 0.022 for the four declustering methods we consider.

Neither the chi-square test nor the KS test takes spatial locations into account. In a spatially inhomogeneous, temporally homogeneous Poisson process (SITHP), two events may occur arbitrarily close to one another with strictly positive probability. Catalogs declustered using window methods never have events very close in space and time, so they cannot be realizations of SITHP. But declustered catalogs may still have properties in common with SITHP. For instance, the times might be conditionally exchangeable given the locations: Knowing the location of an event might give no information about the time of the event, given (separately) the times of all the events and the locations of all the events.

We tested the hypothesis that times of events in declustered catalogs are conditionally exchangeable given the locations using a permutation test. For the 1932-1971 SCEC catalog of events with magnitude 3.8 or greater declustered using Gardner-Knopoff windows applied as a linked-window method or Reasenberg's method, one-tailed P-values were estimated to be 0.005 or below. The one-tailed P-value using Gardner-Knopoff windows applied as a main-shock window method was estimated to be 0.069: larger than for the other methods, and above 0.05, but still small. Declustered catalogs do not appear to be conditionally exchangeable, much less consistent with SITHP. The ordering of the P-values suggests that Reasenberg's method might remove too few events and linked-window declustering using Gardner-Knopoff windows might remove too many events. Of course, by removing almost all the events, the P-value can be made arbitrarily close to 1.

"Ok, so why do you decluster the catalog?" asks the online FAQ for the Earthquake Probability Mapping Application of the USGS (http://earthquake.usgs.gov/learn/faq/?faqID=280, last accessed 29 January 2011). The answers: "to get the best possible estimate for the rate of mainshocks," and "the methodology [of the Earthquake Probability Mapping Application] requires a catalog of independent events (Poisson model), and declustering helps to achieve independence." The evidence presented here suggests that declustered catalogs do not consist of independent events that follow a Poisson model.

To estimate the rate of main shocks presumes an unambiguous definition of "main shock." Often, main shocks are taken to be the events that remain after a catalog is declustered, which is essentially a circular definition. And different methods will produce different declustered catalogs. Rather than try to identify main shocks, it might be better to model all large events. As the USGS FAQ also notes, large foreshocks and aftershocks can do just as much damage as main shocks. Removing them from the catalog will neither avert nor repair the damage.

ACKNOWLEDGMENTS

We are grateful to Steven N. Evans for instructive and helpful conversations.

APPENDIX A: ALGORITHM TO TEST THE HYPOTHESIS THAT TIMES ARE CONDITIONALLY EXCHANGEABLE

R code that implements the algorithm to test whether times are conditionally exchangeable is available at http://statistics.berkeley.edu/~stark/Code/Quake/permutest.r. The algorithm has the following steps:

(i) Sort the catalog of longitudes, latitudes, and times in time order. Label the sorted points $\{x_i, y_i, t_i\}$ for $i \in \{1, \ldots, n\}$. Find the longitude and latitude ranks of every event.

(ii) Find the empirical spatial measure for all lower-left quadrants in $\mathbb{R}^2$ with corners

$$(y_i, x_j), \quad 1 \le i, j \le n. \tag{A.1}$$

In the online R code, this spatial distribution is stored in the matrix xy.upper. The entry indexed by $(i, j)$ is

$$\frac{1}{n} \sum_{k=1}^{n} 1(y_k \le y_i,\; x_k \le x_j). \tag{A.2}$$

This is the fraction of events in the catalog with latitude no greater than the latitude of the $i$th event in the catalog and with longitude no greater than the longitude of the $j$th event in the catalog.

(iii) Find the absolute differences between the empirical measure and the empirical null measure for the $n^3$ lower-left quadrants with corners

$$(x_j, y_i, t_k), \quad i, j, k \in \{1, \ldots, n\}.$$

Find the maximum value of all these differences; this is the test statistic $\delta$. To reduce storage requirements, the code finds the distances for quadrants with corners $(x_j, y_i, t_k)$ for every value of $k$ successively; that is, it finds

$$\delta = \max_k \max_{j,i} \left| \hat{P}_n(V(j, i, k)) - (\tau \hat{P}_n)(V(j, i, k)) \right|,$$

where $V(j, i, k)$ is the lower-left quadrant with corner $(x_j, y_i, t_k)$.

(iv) Set the iteration counter $h$ to 0.

(v) Increment $h$. Create a random permutation of $\{1, \ldots, n\}$. Apply this permutation to the locations, leaving times fixed. (One could permute the times instead of the locations, but that would require re-sorting the catalog into temporal order after each permutation.) The spatial measure has not changed, but its indexing has; apply the permutation to both the rows and the columns of xy.upper.

(vi) As in step iii, find the absolute differences between the empirical measure and the empirical null measure for the $n^3$ lower-left quadrants. Let $\delta_h$ be the maximum value of all these distances.

(vii) Determine whether to stop or to return to step v. (We might simply stop when $h = 10{,}000$, or we might apply Wald's sequential probability ratio test (Wald 1945) to determine whether, on the basis of the random permutations taken so far, it is possible to conclude whether $P \le \alpha$.) If the algorithm stops, estimate the P-value as

$$\hat{P} = \frac{1}{H} \#\{h : \delta_h \ge \delta\},$$

where $H$ is the total number of iterations.

REFERENCES

S. Barani, G. Ferretti, M. Massa, and D. Spallarossa. The waveform similarity approach to identify dependent events in instrumental seismic catalogues. Geophys. J. Intl., 168(1):100-108, 2007.
S.D. Davis and C. Frohlich. Single-link cluster analysis of Earthquake aftershocks: Decay laws and regional variations. J. Geophys. Res., 96:6335-6350, 1991.
J.K. Gardner and L. Knopoff. Is the sequence of earthquakes in Southern California, with aftershocks removed, Poissonian? Bull. Seis. Soc. Am., 64(15):1363-1367, 1974.
L. Knopoff and J.K. Gardner. Higher seismic activity during local night on the raw worldwide earthquake catalogue. Geophys. J. Intl., 28(3):311-313, 1972.
E.L. Lehmann. Testing Statistical Hypotheses. Springer, New York, 3rd edition, 2005.
B. Luen. Earthquake prediction: Simple methods for complex phenomena. PhD thesis, University of California, Berkeley, 2010.
P. Massart. The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Prob., 18:1269-1283, 1990.
M.V. Matthews and P.A. Reasenberg. Statistical methods for investigating quiescence and other temporal seismicity patterns. Pure Appl. Geoph., 126(2-4):357-372, 1988.
P.A. Reasenberg. Second-order moment of central California seismicity, 1969-1982. J. Geophys. Res., 90(B7):5479-5495, 1985.
P.A. Reasenberg and M.V. Matthews. Precursory seismic quiescence: A preliminary assessment of the hypothesis. Pure Appl. Geoph., 126(2-4):373-406, 1988.
J.P. Romano. A bootstrap revival of some nonparametric distance tests. J. Am. Stat. Assoc., 83:698-708, 1988.
J.P. Romano. Bootstrap and randomization tests of some nonparametric hypotheses. Ann. Stat., 17:141-159, 1989.
D. Vere-Jones. Stochastic models for earthquake occurrence. J. Roy. Stat. Soc., Ser. B, 32:1-62, 1970.
A. Wald. Sequential tests of statistical hypotheses. Ann. Math. Stat., 16:117-186, 1945.
J. Zhuang, Y. Ogata, and D. Vere-Jones. Stochastic declustering of space-time earthquake occurrences. J. Am. Stat. Assoc., 97(458):369-380, 2002.
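As a supplementary illustration, the permutation test of Appendix A can be sketched compactly. The Python block below is a hedged sketch, not a port of the authors' R code: the function names `quadrant_statistic` and `permutation_p_value` are hypothetical, event times are assumed to be distinct, the null measure is taken to be the product of the empirical spatial and temporal distributions (an assumption about what averaging over the orbit yields), and a fixed number of permutations replaces any sequential stopping rule. It runs in roughly $n^3$ operations per permutation, so it suits only small catalogs.

```python
import random


def quadrant_statistic(xs, ys, order):
    """Largest |P_n(V) - (tau P_n)(V)| over quadrants with corners (x_j, y_i, t_k).

    order[k] is the index of the event holding the (k+1)-th smallest time.
    Assumes distinct event times.
    """
    n = len(xs)
    # spatial[i][j]: fraction of events with y <= ys[i] and x <= xs[j], as in (A.2).
    spatial = [[sum(1 for m in range(n) if ys[m] <= ys[i] and xs[m] <= xs[j]) / n
                for j in range(n)] for i in range(n)]
    # joint[i][j]: number of events, among the earliest k, inside the same spatial quadrant.
    joint = [[0] * n for _ in range(n)]
    stat = 0.0
    for k, m in enumerate(order, start=1):
        for i in range(n):
            if ys[m] <= ys[i]:
                row = joint[i]
                for j in range(n):
                    if xs[m] <= xs[j]:
                        row[j] += 1
        frac_t = k / n  # fraction of events with time <= t_k
        for i in range(n):
            for j in range(n):
                stat = max(stat, abs(joint[i][j] / n - spatial[i][j] * frac_t))
    return stat


def permutation_p_value(xs, ys, ts, n_perm=1000, seed=0):
    """Estimate the P-value as the fraction of permutations with statistic >= observed."""
    rng = random.Random(seed)
    order = sorted(range(len(ts)), key=lambda m: ts[m])
    observed = quadrant_statistic(xs, ys, order)
    hits = 0
    for _ in range(n_perm):
        perm = order[:]
        rng.shuffle(perm)  # reassign locations to the fixed, sorted times
        if quadrant_statistic(xs, ys, perm) >= observed:
            hits += 1
    return observed, hits / n_perm


if __name__ == "__main__":
    # Toy data only: locations drift steadily with time, so times and locations are dependent.
    xs = [float(i) for i in range(12)]
    ys = [float(i) for i in range(12)]
    ts = [float(i) for i in range(12)]
    print(permutation_p_value(xs, ys, ts, n_perm=200, seed=1))
```

Shuffling which location receives each sorted time, rather than shuffling the times, mirrors step (v) and avoids re-sorting the catalog after every permutation.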