Econometrics

NONLINEAR MODEL SPECIFICATION/DIAGNOSTICS: INSIGHTS FROM A BATTERY OF NONLINEARITY TESTS
Richard A. Ashley Department of Economics Virginia Tech
Douglas M. Patterson Department of Finance Virginia Tech
February 12, 2001
Abstract A single statistical test for nonlinearity can indicate whether or not the generating mechanism of a time series is linear. However, if the null hypothesis of linearity is rejected, the test result conveys little information as to what kind of nonlinear model is appropriate. Here we show that a battery of different nonlinearity tests, in contrast, can yield valuable model identification information. In particular, applying such a battery of tests to data on U.S. real GNP, we are able to reject the hypothesis that this time series is generated by some sort of two-state regime switching process, even though these models fit the data fairly well in a least squares sense.
We are grateful to Ivan Pastine for helpful comments and to M.J. Hinich and B. LeBaron for sharing their computer codes with us. The authors may be contacted at ashleyr@vt.edu and amex@vt.edu, respectively. This paper is available at http://ashleymac.econ.vt.edu/working_papers/E9905.pdf as Economics Department Working Paper E99-05 .
1. Introduction
The proposition that efficient parameter estimation and valid statistical inference hinge crucially on appropriate model specification is hardly controversial. Further, ample theoretical and empirical evidence indicates that nonlinear generating mechanisms are important in a number of macroeconomic and financial processes. For example, many theoretical macroeconomic models are highly nonlinear, from Hicks' (1950) elaboration of the Samuelson multiplier-accelerator theory, to Grandmont's (1985) overlapping generations model, to labor hoarding models such as Hall (1990), and to recent models, such as Palm and Pfann (1997), which are based on an explicit treatment of asymmetric adjustment costs. The nonlinearity in these models is intrinsic to the macroeconomic hypotheses embodied therein and essential to the derivation of observed macroeconomic properties, such as asymmetric business cycles and chaotic dynamics. [2] Moreover, numerous empirical studies have found statistical evidence for significant nonlinearity in the generating mechanisms of important macroeconomic and/or financial time series. Examples include Engle (1982), Tong (1983), Hinich and Patterson (1985), Tsay (1986), Ashley and Patterson (1989), Hamilton (1989), Brock, Hsieh and LeBaron (1991), Potter (1995), and Altug, Ashley and Patterson (1999) among many others. Some of these studies have simply detected nonlinearity (e.g., Ashley and Patterson (1989)); others (e.g., Hamilton (1989) and Potter (1995)) assume at the outset that the nonlinearity takes a particular form.
[2] See Barnett and Hinich (1992) and Barnett, et al. (1995).

The field of nonlinear time series analysis lacks a comprehensive model identification algorithm analogous to that proposed by Box and Jenkins (1976) for linear processes. Here we provide evidence that, while a single nonlinearity test can only detect (or fail to detect) nonlinearity, the application of a battery of nonlinearity tests can provide valuable nonlinear identification information on a given time series. The plan of the paper is as follows. In Section 2 we briefly review each of the six nonlinearity tests considered here: McLeod and Li (1983), Engle LM (1982), Brock, Dechert and Scheinkman (1996), Tsay (1986), Hinich and Patterson bicovariance (1995), and Hinich bispectrum (1982).
In Section 3 we provide evidence for the differential power of these tests to detect nonlinearity of the various forms (ARCH/GARCH, threshold autoregressive, Markov switching, etc.) proposed in the literature. In the main we do not find specific tests that are diagnostic for specific forms of nonlinearity. We do find, however, that the pattern of statistical results across the various tests provides a new stylized fact about the time series which is potentially informative with respect to what kind of nonlinear process generated the series. In particular, this observed pattern of nonlinearity test results allows us to construct a statistical test of the proposition that a specific nonlinear model (e.g., a threshold autoregressive model) is capturing the nonlinear serial dependence in the data, as distinct from merely fitting the data well in a least squares sense. Of course, rejection of this null hypothesis does not specify what nonlinear model will provide an adequate representation of the actual generating mechanism for the time series, but surely if the current framework is not adequate, it is better to know it. Finally, in Section 4 we demonstrate the utility of our approach using data on U.S. real GNP and several models for these data in the empirical literature: a threshold autoregressive model estimated by Potter (1995) and several Markov switching models estimated by Lam (1997). These models are all sufficiently elaborate as to fit the sample data fairly well, but we find that they are all significantly inconsistent with the pattern of nonlinearity test results we observe using the actual data.
2. Nonlinearity Tests
In this section each of the statistical tests considered below is briefly described. These include a test for ARCH effects due to McLeod and Li (1983), the Engle (1982) LM test for ARCH effects, the BDS test proposed by Brock, Dechert, and Scheinkman (1996), the Tsay (1986) test for quadratic serial dependence, the bicovariance test due to Hinich (1996) and Hinich and Patterson (1995), and the Hinich bispectral test proposed in Hinich (1982) and studied in Ashley, Patterson, and Hinich (1986) and in Ashley and Patterson (1989). These tests are implemented in a Nonlinearity Toolkit, a Windows-based computer program which is available from the authors at http://ashleymac.econ.vt.edu/working_papers/toolzipd.exe. A detailed description of the Toolkit can be found in Patterson and Ashley (2000). [3] Except for the Hinich bispectral test, these tests all share the same premise: once any linear serial dependence is removed from the data via a prewhitening model, any remaining serial dependence must be due to a nonlinear generating mechanism. Thus, each of these procedures is actually a test of serial independence applied to the (by construction) serially uncorrelated fitting errors of an AR(p) model for the sample data. [4] This fitting error series, standardized to zero mean and unit variance, is denoted by $\{x_t\}$ below. In contrast, the Hinich bispectral test directly tests for a nonlinear generating mechanism, so prewhitening is not necessary in using this test. Moreover, the Hinich bispectral test is invariant to linear filtering of the data, so if prewhitening is done anyway the adequacy of the prewhitening model is irrelevant to the validity of this test. [5]

Below each test is considered in two forms: using the asymptotic distribution of the relevant test statistic and using the bootstrap. As de Lima (1997) points out, in using these asymptotic distributions, each of these tests differs in the moment restrictions it is implicitly imposing; that is, in its requirement for the existence of moments up to and including some specific order. Below we state for each test the value of this order necessary for asymptotic inferential analysis, taking into account the fact that each test (except, as noted above, the Hinich bispectral test) operates on fitting errors from an estimated AR(p) model.

[3] It is obviously infeasible to include every known test in the Toolkit. Omitted tests include: Ramsey (1969), Ashley and Patterson (1986), Mizrach (1991), Nychka, et al. (1992), Kaplan (1993), and Dalle Molle and Hinich (1995). The most well known test not considered here is the neural net test of White (1989). This test is omitted because it has so many free parameters (connection strengths) that White recommends choosing them at random. Since the test is then substantially different each time it is run, the best one can do is to run it several times and use the Bonferroni inequality to obtain an upper bound on the p-value for the test.

[4] In our implementation p is chosen to minimize the Schwarz (SC) criterion. In contrast to alternative choices (e.g., AIC or FPE) the Schwarz criterion is known to be consistent for AR(p) order determination under the null hypothesis of a linear generating mechanism; see Judge, et al. (1985, p. 246).
[5] See Theorems 1 and 2 below and Ashley, et al. (1986).

McLeod-Li Test
This test for ARCH effects was proposed by McLeod and Li (1983) based on a suggestion in Granger and Andersen (1978). It looks at the autocorrelation function of the squares of the prewhitened data and tests whether $\mathrm{corr}(x_t^2, x_{t-j}^2)$ is non-zero for some j. The autocorrelation at lag j for the squared residuals $\{x_t^2\}$ is estimated by:

$$\hat{r}(j) = \frac{\sum_{t=j+1}^{N} \left(x_t^2 - \hat{\sigma}^2\right)\left(x_{t-j}^2 - \hat{\sigma}^2\right)}{\sum_{t=1}^{N} \left(x_t^2 - \hat{\sigma}^2\right)^2}$$

where

$$\hat{\sigma}^2 = \frac{1}{N}\sum_{t=1}^{N} x_t^2 .$$

Under the null hypothesis that $x_t$ is an i.i.d. process (and assuming that $E(x_t^8)$ exists) McLeod and Li (1983) show that, for fixed L:

$$\sqrt{N}\,\hat{r} = \sqrt{N}\,\left(\hat{r}(1), \ldots, \hat{r}(L)\right)$$

is asymptotically a multivariate unit normal. Consequently, for L sufficiently large, the usual Box-Ljung statistic

$$Q = N(N+2)\sum_{j=1}^{L} \frac{\hat{r}^2(j)}{N-j}$$

is asymptotically $\chi^2(L)$ under the null hypothesis of a linear generating mechanism for the data. Typically L is taken to be around 20; below results are quoted for L = 24, although the Toolkit computes the test for a variety of L values.
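As a rough illustration of the computation just described, the following Python sketch evaluates the lag-j autocorrelations of the squared series and the resulting Q statistic. It is not the Toolkit's code: the function name `mcleod_li_q` and the use of NumPy are assumptions made here for illustration, and the input is assumed to be already prewhitened and standardized.

```python
import numpy as np

def mcleod_li_q(x, L=24):
    """Hypothetical helper: McLeod-Li style Q statistic on prewhitened data x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x**2 - np.mean(x**2)              # x_t^2 minus the sample mean of the squares
    denom = np.sum(d**2)
    # r(j): lag-j autocorrelation of the squared series, j = 1..L
    r = np.array([np.sum(d[j:] * d[:-j]) / denom for j in range(1, L + 1)])
    lags = np.arange(1, L + 1)
    q = n * (n + 2) * np.sum(r**2 / (n - lags))
    return q, r

# Example on simulated i.i.d. data (seed chosen arbitrarily)
rng = np.random.default_rng(0)
q, r = mcleod_li_q(rng.standard_normal(200), L=24)
print(round(q, 2))
```

The value of Q would then be compared with a chi-squared critical value with L degrees of freedom or with a bootstrap distribution of the statistic.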
Engle LM Test
This test was proposed by Engle (1982) to detect ARCH disturbances; as Bollerslev (1986) suggests, it should also have power against GARCH alternatives. As with most Lagrange Multiplier tests, the test statistic itself is based on the $R^2$ of an auxiliary regression, in this case:

$$x_t^2 = \alpha_0 + \sum_{i=1}^{M} \alpha_i x_{t-i}^2 + \nu_t$$

where it is assumed that $E(x_t^8)$ exists. Under the null hypothesis of a linear generating mechanism for $x_t$, $NR^2$ for this regression is asymptotically distributed $\chi^2(M)$. Below results are quoted for M = 5, although the Toolkit computes the test for a variety of M values.
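The auxiliary regression can be sketched in a few lines of Python. This is an illustrative sketch rather than the Toolkit's implementation: the function name `engle_lm` is hypothetical, and the statistic is scaled by the number of usable observations (N minus M), which is one common convention and an assumption here.

```python
import numpy as np

def engle_lm(x, M=5):
    """Hypothetical helper: (N - M) * R^2 from regressing x_t^2 on M of its own lags."""
    s = np.asarray(x, dtype=float) ** 2
    n = s.size
    y = s[M:]                                             # x_t^2 for t = M+1..N
    lagged = [s[M - i:n - i] for i in range(1, M + 1)]    # x_{t-i}^2, i = 1..M
    X = np.column_stack([np.ones(n - M)] + lagged)        # intercept plus M lags
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return (n - M) * r2

rng = np.random.default_rng(1)
print(round(engle_lm(rng.standard_normal(200), M=5), 2))
```

Under the null the returned value would be referred to a chi-squared distribution with M degrees of freedom.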
BDS Test
The BDS test is a nonparametric test for serial independence based on the correlation integral of the scalar series, $\{x_t\}$. For embedding dimension m, let $\{X_t^m\}$ denote the sequence of m-histories generated by $\{x_t\}$:

$$X_t^m = (x_t, \ldots, x_{t+m-1})'$$

where the prime denotes transpose. Then the correlation integral $C_{m,N}(\epsilon)$ for a realization of length N is given by:

$$C_{m,N}(\epsilon) = \frac{2}{N_m (N_m - 1)} \sum_{t<s} I_\epsilon\left(X_t^m, X_s^m\right)$$

where $N_m = N - (m - 1)$ and $I_\epsilon(X_t^m, X_s^m)$ is an indicator function which equals one if the sup norm $\|X_t^m - X_s^m\| < \epsilon$ and equals 0 otherwise. Basically, $C_{m,N}(\epsilon)$ counts up the number of m-histories that lie within a hypercube of size $\epsilon$ of each other. Brock, Dechert, and Scheinkman (1996) exploit the asymptotic normality of $C_{m,N}(\epsilon)$ under the null hypothesis that $\{x_t\}$ is an i.i.d. process to obtain a test statistic which asymptotically converges to a unit normal. This convergence requires extremely large samples for values of the embedding dimension (m) much larger than 2, so attention here is restricted to the cases m = 2, 3, and 4. [6] Where (as here) the data has been normalized to unit variance, the test is ordinarily computed for $\epsilon$ = .5, 1, and 2; results are quoted below for $\epsilon$ = 1. According to de Lima (1997) the BDS test has no moment restrictions. Apparently, this follows from the fact that the test maps the sup norm $\|X_t^m - X_s^m\|$ onto [0, 1]. However, de Lima (1997) also points out that the existence of the second moment is probably required when the test is applied (as it must be) to the residuals from a linear regression. Therefore, we take it that this test requires the existence of moments at least up to order two.
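The correlation integral itself is simple to compute directly. The sketch below is illustrative only: the function name `correlation_integral` is hypothetical, and it omits the variance estimate needed to form the full BDS statistic, which is built from the difference between the correlation integral at dimension m and the m-th power of the correlation integral at dimension 1.

```python
import numpy as np

def correlation_integral(x, m, eps):
    """Hypothetical helper: fraction of pairs of m-histories within eps (sup norm)."""
    x = np.asarray(x, dtype=float)
    n_m = x.size - (m - 1)
    # Row t holds the m-history (x_t, ..., x_{t+m-1})
    hist = np.column_stack([x[i:i + n_m] for i in range(m)])
    # Sup-norm distance between every pair of m-histories
    dist = np.max(np.abs(hist[:, None, :] - hist[None, :, :]), axis=2)
    upper = np.triu_indices(n_m, k=1)                     # pairs with t < s
    return 2.0 * np.sum(dist[upper] < eps) / (n_m * (n_m - 1))

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
c1 = correlation_integral(x, m=1, eps=1.0)
c2 = correlation_integral(x, m=2, eps=1.0)
print(round(c2, 3), round(c1 ** 2, 3))   # close to each other for i.i.d. data
```

The pairwise distance array grows with the square of the sample size, so this direct approach is only practical for modest N such as the N = 200 used here.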
[6] The actual size of the BDS test for modest sample sizes is analyzed below.

Tsay Test
The Tsay (1986) test is a generalization of the Keenan (1985) test; it explicitly looks for quadratic serial dependence in the data.
Let the K = k(k+1)/2 column vectors $V_1, \ldots, V_K$ contain all of the possible crossproducts of the form $x_{t-i}x_{t-j}$, where $i \in [1, k]$ and $j \in [i, k]$. Thus, $v_{t,1} = x_{t-1}^2$; $v_{t,2} = x_{t-1}x_{t-2}$; $v_{t,3} = x_{t-1}x_{t-3}$; ...; $v_{t,k+1} = x_{t-2}^2$; $v_{t,k+2} = x_{t-2}x_{t-3}$; ... and $v_{t,K} = x_{t-k}^2$. And let $\hat{v}_{t,i}$ denote the projection of $v_{t,i}$ on the subspace orthogonal to $x_{t-1}, \ldots, x_{t-k}$, i.e., the residuals from a regression of $v_{t,i}$ on $x_{t-1}, \ldots, x_{t-k}$.

The parameters $\gamma_1, \ldots, \gamma_K$ are then estimated by applying OLS to the regression equation

$$x_t = \gamma_0 + \sum_{i=1}^{K} \gamma_i \hat{v}_{t,i} + \eta_t .$$

Note that the jth regressor in this equation is $\hat{v}_{t,j}$, the period t fitting error from a regression of $v_{t,j}$ on $x_{t-1}, \ldots, x_{t-k}$. So long as p exceeds K, this projection is unnecessary for the dependent variable ($x_t$) since it is prewhitened using an AR(p) model. [7] The Tsay test statistic is then just the usual F statistic for testing the null hypothesis that $\gamma_1, \ldots, \gamma_K$ are all zero, where it is assumed that $E(x_t^8)$ exists.
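The steps of the Tsay procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the Toolkit's code: the function name `tsay_f` is hypothetical, an intercept is included in the projection regression, the dependent variable is left unprojected as described above, and the degrees of freedom are the ordinary OLS ones. With k = 5 lags the sketch builds 15 cross-product regressors, squares included.

```python
import numpy as np

def tsay_f(x, k=5):
    """Hypothetical helper: F statistic for the cross-product terms in a Tsay-type test."""
    x = np.asarray(x, dtype=float)
    n = x.size
    y = x[k:]
    # Column i-1 holds x_{t-i}, i = 1..k
    lags = np.column_stack([x[k - i:n - i] for i in range(1, k + 1)])
    # All cross-products x_{t-i} x_{t-j} with 1 <= i <= j <= k
    V = np.column_stack([lags[:, i] * lags[:, j]
                         for i in range(k) for j in range(i, k)])
    K = V.shape[1]
    # Residuals from regressing each cross-product on an intercept and the k lags
    Z = np.column_stack([np.ones(n - k), lags])
    V_hat = V - Z @ np.linalg.lstsq(Z, V, rcond=None)[0]
    # Regress x_t on an intercept and the projected cross-products
    W = np.column_stack([np.ones(n - k), V_hat])
    resid = y - W @ np.linalg.lstsq(W, y, rcond=None)[0]
    rss_full = resid @ resid
    rss_null = np.sum((y - y.mean()) ** 2)
    dof = (n - k) - K - 1
    f_stat = ((rss_null - rss_full) / K) / (rss_full / dof)
    return f_stat, K

rng = np.random.default_rng(0)
f_stat, K = tsay_f(rng.standard_normal(200), k=5)
print(K, round(f_stat, 3))
```

The F statistic would be compared with an F(K, dof) critical value or with a bootstrap distribution.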
[7] The value of k is user-selectable in the Toolkit; this parameter is set to five in the simulations reported below.

Hinich Bicovariance Test
This test assumes that $\{x_t\}$ is a realization from a third-order stationary stochastic process and tests for serial independence using the sample bicovariances of the data. The (r,s) sample bicovariance is defined as:

$$C_3(r,s) = (N-s)^{-1}\sum_{t=1}^{N-s} x_t x_{t+r} x_{t+s}, \qquad 0 \le r \le s .$$

The sample bicovariances are thus a generalization of a skewness parameter. The $C_3(r,s)$ all have zero expectation (for s > 0) for zero mean, serially i.i.d. data. One would expect non-zero values for the $C_3(r,s)$ from data in which $x_t$ depends on lagged crossproducts, such as $x_{t-i}x_{t-j}$ and higher order terms. Let $G(r,s) = (N-s)^{1/2}\,C_3(r,s)$ and define $X_3$ as

$$X_3 = \sum_{s=2}^{R}\sum_{r=1}^{s-1} G(r,s)^2 .$$

Under the null hypothesis that $\{x_t\}$ is a serially i.i.d. process, Hinich and Patterson (1995) show that $X_3$ is asymptotically distributed $\chi^2(R[R-1]/2)$ for $R < N^{.5}$; based on their simulations, they recommend using $R = N^{.4}$. Under the assumption that $E(x_t^{12})$ exists, the $X_3$ statistic detects non-zero third order correlations. It can be considered a generalization of the Box-Pierce portmanteau statistic.
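A direct computation of the bicovariance statistic might look like the sketch below. It is illustrative only: the function name `bicovariance_x3` is hypothetical, the scaling $G(r,s) = (N-s)^{1/2} C_3(r,s)$ is assumed (so that each G is approximately unit normal under the null), and R defaults to the integer part of $N^{.4}$, which gives R = 8 for N = 200.

```python
import numpy as np

def bicovariance_x3(x, R=None):
    """Hypothetical helper: X3 statistic and its chi-squared degrees of freedom."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if R is None:
        R = int(n ** 0.4)                 # assumed rounding rule
    x3 = 0.0
    for s in range(2, R + 1):
        for r in range(1, s):
            # Sample bicovariance C3(r, s) over t = 1..N-s
            c3 = np.sum(x[:n - s] * x[r:n - s + r] * x[s:]) / (n - s)
            x3 += (n - s) * c3 ** 2       # G(r, s)^2 with G = sqrt(N - s) * C3
    return x3, R * (R - 1) // 2

rng = np.random.default_rng(0)
stat, dof = bicovariance_x3(rng.standard_normal(200))
print(round(stat, 2), dof)
```

For N = 200 the default gives 28 degrees of freedom, matching R(R - 1)/2 with R = 8.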
Hinich Bispectral Test
This nonparametric test examines the third order moments (bicovariances) of the data in the frequency domain to obtain a direct test for a nonlinear generating mechanism, irrespective of any linear serial dependence which might be present. Consequently, when this test rejects, one need not worry about the possibility that the linear prewhitening model has failed to remove all linear serial dependence in the data.

Suppose that $\{y_t\}$, the series of interest, is a third-order stationary time series with, for expositional convenience, $E[y_t] = 0$ and assuming that $E(y_t^6)$ exists. The series $\{y_t\}$ might be serially correlated, in which case it is distinct from the prewhitened fitting error series denoted $\{x_t\}$ above. Letting $c_{yyy}(r,s)$ denote the third order cumulant function for $\{y_t\}$,

$$c_{yyy}(r,s) = E\left[y_t\, y_{t+r}\, y_{t+s}\right],$$

the bispectrum of $\{y_t\}$ at frequency pair $(f_1, f_2)$ is its (double) Fourier transform:

$$B_y(f_1, f_2) = \sum_{r=-\infty}^{\infty}\sum_{s=-\infty}^{\infty} c_{yyy}(r,s)\exp\left[-i2\pi(f_1 r + f_2 s)\right].$$

$B_y(f_1, f_2)$ is a spatially periodic function of $(f_1, f_2)$, whose principal domain is the triangular set $S = \{0 < f_1 < 1/2,\ f_2 < f_1,\ 2f_1 + f_2 < 1\}$; see Brillinger and Rosenblatt (1967) for a rigorous treatment of the bispectrum. The time series $\{y_t\}$ is linear if and only if it can be expressed as

$$y_t = \sum_{n=0}^{\infty} a(n)\,u(t-n),$$

where $u(t) \sim \mathrm{iid}(0, \sigma^2)$ and the sequence of weights $\{a(n)\}$ is fixed. The bispectrum of $\{y_t\}$ has a particularly simple form when the generating mechanism for $\{y_t\}$ is linear:
Theorem 1: If $\{y_t\}$ is linear and the weights $\{a(n)\}$ are absolutely summable, then

1. $B_y(f_1, f_2) = \mu_3 A(f_1)A(f_2)A^*(f_1 + f_2)$, where

$$A(f) \equiv \sum_{s=0}^{\infty} a(s)\exp(-i2\pi f s),$$

$A^*(f)$ is its complex conjugate, and $\mu_3$ is $E\{u(t)^3\}$; and

2. the square of the skewness function, $Q^2(f_1, f_2)$,

$$Q^2(f_1, f_2) \equiv \frac{|B_y(f_1, f_2)|^2}{S_y(f_1)\,S_y(f_2)\,S_y(f_1 + f_2)} = \frac{\mu_3^2}{\sigma^6}$$

is a constant for all frequency pairs $(f_1, f_2)$ in S, where $S_y(f)$ is the spectrum of $\{y_t\}$ at frequency f.

Proof of Theorem 1: This was first proven in Brillinger (1965); an elementary proof is given in Appendix 1.
Under the null hypothesis of a linear generating mechanism, Theorem 1 implies that the sample estimates of $Q^2(f_1, f_2)$ for different frequency pairs will differ from one another no more than one would expect due to sampling error. In particular, Hinich (1982) shows that consistent sample estimates of $2Q^2(f_1, f_2)$ are asymptotically distributed as a noncentral chi-squared variate, $\chi^2(2, \lambda)$, with constant non-centrality parameter ($\lambda$) under the null hypothesis of linearity; whereas, if the null hypothesis of linearity is false, then $\lambda$ is dependent on $f_1$ and $f_2$. The Hinich test then uses an expression for the asymptotic distribution of the interdecile range of observations from a specified distribution given by David (1970) to test whether the dispersion in the estimates of $2Q^2(f_1, f_2)$ exceeds that which one would expect under the null hypothesis. [8]

The bispectrum $B_y(f_1, f_2)$ is consistently estimated using an average of appropriate triple products of the Fourier representation of the observed time series. This average is taken over a square containing M adjacent frequency pairs. As with smoothing of a periodogram so as to obtain a consistent estimate of the spectrum, large M reduces the variance of the estimator at the cost of introducing some small sample bias. [9]

The Hinich bispectral test has the nice property that it is unaffected by the application of a linear filter to $y_t$. This follows from the fact that the squared skewness function, $Q^2(f_1, f_2)$, is invariant to linear filtering:
Theorem 2: If $\{y_t\}$ satisfies the assumptions of Theorem 1 and $z_t = W(B)y_t$, where W(B) is a fixed, linear, causal filter with absolutely summable weights, then $Q_z^2(f_1, f_2)$ is identical to $Q_y^2(f_1, f_2)$.

Proof of Theorem 2: This is proven in Ashley, Patterson, and Hinich (1986, p. 174), but it also follows directly from Theorem 1 since $A_z(f) = W(\exp\{i2\pi f\})A(f)$, so that the same factor of $W(\exp\{i2\pi f_1\})\,W(\exp\{i2\pi f_2\})\,W(\exp\{-i2\pi[f_1 + f_2]\})$ appears in both the numerator and denominator of $Q_z^2(f_1, f_2)$ and hence cancels out.

Consequently, in contrast to all the other tests, the Hinich bispectral test is not confounded by linear serial dependence remaining in the data due to imperfect prewhitening.

[8] The interdecile (rather than the interquartile) range is used here because it yields test results which are more robust to non-gaussianity in the data. See also Subba Rao and Gabr (1980) for an earlier approach.

[9] See Hinich (1982) and Ashley, Patterson, and Hinich (1986) for details. Based on simulation results in the latter paper, M is set to the integer closest to $N^{.6}$ in the calculations reported below.
Sizes of the Tests
Like most econometric procedures, the tests described above are only asymptotically justified. Particular concern has been expressed about the validity of the BDS test for reasonable sample sizes and addressed, to some degree, in Brock, et al. (1991). More recently, de Lima (1997) has considered the behavior of a number of nonlinearity tests where the moment restriction assumptions underlying the asymptotic distributions of these tests are not satisfied, finding particular problems in situations involving leptokurtic (heavy-tailed) data. Because we share these concerns, we routinely bootstrap the significance levels of all the tests used here, as well as computing significance levels based on asymptotic theory. This is very straightforward. After pre-whitening, so that the data is serially i.i.d. under the null hypothesis of a linear generating mechanism, we draw 1000 N-samples at random from the empirical distribution of the observed N-sample of data. The bootstrap significance level for a given test is then just the fraction of these 1000 new N-samples for which the test statistic exceeds that observed in the sample data.

It is simple enough to confirm that 1000 bootstrap replications is sufficient by merely observing that the results are invariant to increasing this number; it is distinctly less clear that N itself is sufficiently large: after all, the pre-whitening procedure and the bootstrap are themselves only asymptotically justified. Consequently, we examined the actual size of each test (using both asymptotic theory and the bootstrap) for samples of N = 200 serially i.i.d. variates generated from each of four distributions: gaussian, exponential, Student's t with 5 degrees of freedom, and the symmetric stable Paretian distribution with index $\alpha$ = 1.93. The exponential distribution is quite asymmetric. Both of the latter two distributions are heavy-tailed, to the point where the variance does not exist for the symmetric stable Paretian distribution with this index value. [10] We also examined the actual sizes of the tests for linearly dependent data, where the observations are generated by an AR(2) process driven by innovations generated from each of these distributions. [11]

The results of these calculations are given in Tables 1a and 1b below. We observe that the concerns about the small-sample validity of the tests (in particular, the BDS test) are justified, at least at this sample length. In contrast, the bootstrap results appear to be satisfactory for all of the tests except the BDS test with m = 3 and m = 4. [12] Consequently, BDS test results are only quoted for m = 2 in the remainder of the paper. We conclude that it is reasonable to proceed using the bootstrapped tests for samples of roughly this length or larger without further concern about moment restrictions or the form of the data's distribution.

[10] Symmetric stable Paretian variates have finite variance only for $\alpha \ge 2.00$. The de Lima (1997) article arbitrarily considers $\alpha$ = 1.50; the value $\alpha$ = 1.93 used here is Fama's (1965) estimate for U.S. stock price data. These variates were obtained using the exact algorithm given by Kanter and Steiger (1974).

[11] The AR(2) process used was $y_t = .28y_{t-1} + .08y_{t-2} + \epsilon_t$ or $(1 - .456B)(1 + .176B)y_t = \epsilon_t$.

[12] The fact that several of the bootstrap size estimates lie outside the 95% confidence interval around .05 is inconsequential in view of the number of estimates made. Since the BDS test is correctly sized at all three embedding dimensions when p is constrained to equal zero (Patterson and Ashley (2000, Tables 4-1, 4-2)), the problem at m = 3 and m = 4 is evidently due to a high sensitivity to minor amounts of linear dependence created on those occasions where the prewhitening procedure mis-identifies the order of the AR(p) prewhitening model as exceeding zero.
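The bootstrap significance level described in this subsection can be sketched as follows. This is an assumption-laden illustration, not the authors' code: the function name `bootstrap_pvalue` and its arguments are hypothetical, the input is assumed to be already prewhitened, and `statistic` stands for any function mapping a series to a scalar test statistic for which large values indicate rejection.

```python
import numpy as np

def bootstrap_pvalue(x, statistic, n_boot=1000, seed=0):
    """Hypothetical helper: share of resampled series whose statistic exceeds the observed one."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = statistic(x)
    # Each row is one N-sample drawn with replacement from the empirical distribution
    draws = rng.choice(x, size=(n_boot, x.size), replace=True)
    boot = np.array([statistic(row) for row in draws])
    return float(np.mean(boot > observed))

def lag1_sq_autocorr(z):
    """Illustrative statistic: absolute lag-1 autocorrelation of the squared series."""
    d = z**2 - np.mean(z**2)
    return abs(np.sum(d[1:] * d[:-1]) / np.sum(d**2))

rng = np.random.default_rng(7)
x = rng.standard_normal(200)
print(bootstrap_pvalue(x, lag1_sq_autocorr))
```

Because resampling with replacement destroys any serial ordering, the resampled series are serially independent by construction, which is what makes them usable as draws under the null.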
Table 1a
Empirical Size of 5% Tests [13]
Serially i.i.d. Data, 200 Observations

                      McLeod-Li  Engle LM  BDS    BDS    BDS    Tsay   Bicov.  Bispectral
                      L = 24     p = 5     m = 2  m = 3  m = 4  k = 5  R = 8   M = 24
Bootstrap
  Gaussian            .052       .056      .057   .075   .088   .053   .052    .048
  Student's t(5)      .054       .044      .056   .057   .076   .062   .061    .051
  Exponential         .051       .070      .057   .061   .067   .058   .061    .053
  Paretian α = 1.93   .051       .032      .062   .062   .065   .062   .049    .053
Asymptotic Theory
  Gaussian            .045       .042      .088   .108   .126   .044   .052    .030
  Student's t(5)      .045       .047      .070   .076   .099   .063   .081    .058
  Exponential         .037       .086      .072   .077   .078   .068   .090    .083
  Paretian α = 1.93   .048       .026      .081   .084   .105   .060   .052    .072

[13] Results significantly different from .05 are shown in bold in the original table. All figures quoted are based on 1000 generated samples. The 5% critical region for each test was obtained using 1000 bootstrap replications. Under the null hypothesis that the actual size is .05, an (asymptotic) 95% confidence interval for these estimates is (.036, .064). The parameters L, p, m, k, R, and M are defined earlier in this Section, where each test is discussed.
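The confidence interval quoted in the note to Table 1a appears to follow from the usual normal approximation to the binomial distribution; this reading is an assumption, but it reproduces the stated interval. With a nominal size of .05 and 1000 generated samples,

$$.05 \pm 1.96\sqrt{\frac{.05 \times .95}{1000}} \approx .05 \pm 1.96 \times .0069 \approx .05 \pm .0135,$$

that is, roughly (.0365, .0635), which matches the quoted (.036, .064) up to rounding.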
Table 1b
Empirical Size of 5% Tests [14]
Linearly Dependent {AR(2)} Data, 200 Observations

Columns: McLeod-Li (L = 24); Engle LM (p = 5); BDS (m = 2, m = 3, m = 4); Tsay (k = 5); Bicov. (R = 8); Bispectral (M = 24).
Rows: Bootstrap (Gaussian, Student's t(5) [15], Exponential, Paretian α = 1.93); Asymptotic Theory (Gaussian, Student's t(5), Exponential, Paretian α = 1.93).

.048 .054 .042 .045 .047 .047 .052 .035 .066 .103 .070 .076 .074 .130 .076 .091 .074 .161 .086 .101 .044 .046 .054 .047 .051 .054 .081 .058 .116 .128 .073 .065 .043 .045 .026 .037 .060 .056 .045 .041 .044 .071 .062 .055 .046 .100 .064 .068 .052 .121 .071 .072 .057 .055 .047 .045 .047 .056 .059 .049 .046 .047 .047 .059

[14] Results significantly different from .05 are shown in bold in the original table. All figures quoted are based on 1000 generated samples. The 5% critical region for each test was obtained using 1000 bootstrap replications. Under the null hypothesis that the actual size is .05, an (asymptotic) 95% confidence interval for these estimates is (.036, .064). The parameters L, p, m, k, R, and M are defined earlier in this Section, where each test is discussed.

[15] In the simulations driven by Student's t innovations, the SC criterion tends to mis-identify the prewhitening model as AR(0) rather than AR(2); where p was constrained to exceed zero, the BDS test results became more or less correctly sized and the other test results were not materially affected.
3. The Differential Power of the Nonlinearity Tests Across Alternatives: Implications for Model Identification
In this Section we discuss our estimates of the power of each test against a number of alternative data generating processes commonly considered in the literature. The processes considered here are listed in Table 2; the estimated power of each test against each of these alternatives is given in Table 3. [16] For each alternative process, 250 samples of length N = 200 are generated; the estimated power of each test is simply the fraction of these samples for which the test rejects the null hypothesis of a linear generating mechanism at the 5% level. Our goal is to answer the following questions:

1. Do one or more of the tests have high power against all of the alternative processes?
2. Is the pattern of power estimates across the alternative processes similar for all of the tests?
3. For a given generating process, are the results of all of the tests highly correlated across the simulations?

[16] Other studies examining the ability of various tests to detect nonlinearity include Lee, et al. (1993), Barnett, et al. (1997), and Lemos and Stokes (1998). The space of all possible nonlinear generating processes is extremely broad; no representation is made here that we have in any sense completely spanned it. On the other hand, the set of processes given in Table 2 does include a number of different nonlinear models which have received empirical and/or theoretical attention in the literature and also includes several models which correspond to the low order terms in a Volterra expansion of a rather general class of nonlinear models.

If one of the tests dominates all the rest in terms of power, then this test is the one to use as a nonlinearity screening test to, for example, routinely check the fitting errors of a proposed model for a time series. On the other hand, since such a test has relatively high power against all of the alternatives, it conveys very little information as to which kind of nonlinear model is appropriate.
Paradoxically, such identifying information is only obtainable from tests whose performance is uneven across the alternatives. In this context, a test conveys identifying information to the extent that it is either particularly powerful or particularly unpowerful against a limited subset of the alternatives.

Finally, we sought to examine the correlations between the results of the tests for a given alternative. Here, again paradoxically, what is useful is a lack of consistency. For example, we generated 250 samples of 200 observations from the threshold autoregression (TAR) model given in Table 2. If test #1 rejects the null hypothesis of linearity at the 5% level for, say, 200 of these samples, then it has higher power than test #2 which only rejects the null for 150 samples. But were most of these 150 samples among the 200 samples for which test #1 rejected, or not? If both tests reject over basically the same samples, then test #2 is simply an inferior alternative to test #1. In contrast, if this rejection overlap is small, then test #2 is sensitive to a different aspect of the data set than test #1 and provides separately useful information as to whether or not to reject the null hypothesis; in this case it is worthwhile to do both tests and perhaps combine them into a portmanteau test.

Our estimated power results are given in Table 3. No single test dominates all the others across all seven alternative generating processes. However, the BDS test clearly stands out in terms of overall power against a variety of alternatives. The BDS test has distinctly the highest power for the data generated from ARCH, Markov switching, quadratic (martingale), and cubic processes; it is a close second for the GARCH and quadratic (non-martingale) processes. The Tsay test stands out for the data generated from a threshold autoregression (TAR) process, but even there the BDS test still has reasonable power. We conclude that the BDS test is the best test of this group for use as a nonlinearity screening test.
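The rejection-overlap question posed in this section (test #1 rejecting for 200 of 250 samples, test #2 for 150) can be made concrete with a small sketch. The function name `rejection_overlap` and the two artificial rejection patterns are hypothetical, constructed only to show the two extremes; they are not simulation results.

```python
import numpy as np

def rejection_overlap(reject_a, reject_b):
    """Hypothetical helper: compare which samples two tests reject on."""
    a = np.asarray(reject_a, dtype=bool)
    b = np.asarray(reject_b, dtype=bool)
    n_b = int(b.sum())
    return {
        "power_a": float(a.mean()),
        "power_b": float(b.mean()),
        # Share of test b's rejections that test a also rejects
        "share_of_b_inside_a": float((a & b).sum() / n_b) if n_b else float("nan"),
        # Power of a combined rule that rejects when either test rejects
        "power_either": float((a | b).mean()),
    }

n = 250
test1 = np.arange(n) < 200      # rejects on samples 0..199
nested = np.arange(n) < 150     # rejects on a subset of test1's samples
spread = np.arange(n) >= 100    # rejects on samples 100..249

print(rejection_overlap(test1, nested))
print(rejection_overlap(test1, spread))
```

In the nested case the second test adds nothing to the first, while in the other case the two tests together reject on every sample. Note that the "either" rule in this sketch ignores the size adjustment a genuine portmanteau test would require.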
Table 2. Data Generating Processes Considered [a]

Conditional Heteroskedasticity Models:
  ARCH(4): $x_t = (h_t)^{.5}\epsilon_t$, $h_t = .000019 + .846\{x_{t-1}^2 + .3x_{t-2}^2 + .2x_{t-3}^2 + .1x_{t-4}^2\}$
  GARCH(1,1): $x_t = (h_t)^{.5}\epsilon_t$, $h_t = .011 + .12(x_{t-1})^2 + .85h_{t-1}$

Switching Models:
  Threshold Autoregression (TAR): $x_t = -.5x_{t-1} + \epsilon_t$ if $x_{t-1} < 1$; $x_t = .4x_{t-1} + \epsilon_t$ otherwise
  Two State Markov: $x_t = -.5x_{t-1} + \epsilon_t$ if in state 1; $x_t = .4x_{t-1} + \epsilon_t$ if in state 2 (Remain in state with probability .90)

Quadratic models:
  martingale: $x_t = \epsilon_t + .8\epsilon_t[\epsilon_{t-1} + .8\epsilon_{t-2} + .8^2\epsilon_{t-3} + \ldots + .8^{17}\epsilon_{t-18}]$
  non-martingale: $x_t = \epsilon_t + .8\epsilon_{t-1}[\epsilon_{t-2} + .8\epsilon_{t-3} + .8^2\epsilon_{t-4} + \ldots + .8^{17}\epsilon_{t-19}]$

Cubic models:
  martingale: $x_t = \epsilon_t + .8\epsilon_t[\epsilon_{t-1}\epsilon_{t-2} + .8\epsilon_{t-2}\epsilon_{t-3} + \ldots + .8^{17}\epsilon_{t-18}\epsilon_{t-19}]$
  non-martingale: $x_t = \epsilon_t + .8\epsilon_{t-1}[\epsilon_{t-2}\epsilon_{t-3} + .8\epsilon_{t-3}\epsilon_{t-4} + \ldots + .8^{17}\epsilon_{t-19}\epsilon_{t-20}]$

[a] $\epsilon_t$ is NIID(0,1) in all models. Student's t, exponential, and symmetric Paretian variates are considered in Patterson and Ashley (2000), yielding results similar to those reported below.
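To make the switching specifications in Table 2 concrete, the following sketch simulates the TAR and two-state Markov processes and shows how an estimated power is formed from a set of p-values. The function names, the burn-in length, the starting value of zero, and the choice of state 1 as the initial state are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def simulate_tar(n=200, burn=100, seed=0):
    """Hypothetical helper: TAR with coefficient -.5 if x_{t-1} < 1, else .4."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        phi = -0.5 if x[t - 1] < 1.0 else 0.4
        x[t] = phi * x[t - 1] + eps[t]
    return x[burn:]                        # drop the burn-in observations

def simulate_markov(n=200, burn=100, p_stay=0.90, seed=0):
    """Hypothetical helper: two-state Markov switching AR(1), staying with probability p_stay."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn)
    u = rng.random(n + burn)
    x = np.zeros(n + burn)
    state = 1
    for t in range(1, n + burn):
        if u[t] > p_stay:                  # switch state with probability 1 - p_stay
            state = 3 - state
        phi = -0.5 if state == 1 else 0.4
        x[t] = phi * x[t - 1] + eps[t]
    return x[burn:]

def estimated_power(pvalues, level=0.05):
    """Fraction of simulated samples on which the test rejects at the given level."""
    return float(np.mean(np.asarray(pvalues) < level))

tar = simulate_tar(seed=3)
mkv = simulate_markov(seed=3)
print(tar.shape, mkv.shape)
```

In a power study of the kind described here, each simulated sample would be prewhitened, passed through each test, and the resulting p-values collected and summarized with `estimated_power`.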
Table 3
Power Estimates of 5% Tests [17]
Gaussian Innovations, 200 Observations

                      McLeod-Li  Engle LM  BDS    Tsay   Bicov.  Bispectral
                      L = 24     p = 5     m = 2  k = 5  R = 8   M = 24
ARCH                  .78        .80       1.00   .32    .46     .10
GARCH                 .60        .72       .65    .38    .68     .09
Switching Models
  TAR                 .45        .13       .62    .78    .10     .12
  Markov              .49        .32       .56    .11    .13     .06
Quadratic
  martingale          .12        .63       .93    .54    .69     .13
  non-martingale      .17        .67       .91    .57    .68     .14
Cubic
  martingale          .62        .88       .98    .84    .94     .16
  non-martingale      .71        .76       .86    .57    .71     .10

[17] All figures quoted are based on 250 generated samples. The 5% critical region for each test was obtained using 1000 bootstrap replications. The parameters L, p, m, R, k, and M are defined in Section 2, where each test is discussed. BDS test results were calculated for $\epsilon$ equal to .5, 1, and 2 standard deviations; for brevity (and without much loss of information) results are quoted only for $\epsilon$ = 1; results at m = 3 and m = 4 are omitted due to the problems with the size of the test at these embedding dimensions reported in Section 2 above. The generating models (GARCH, TAR, etc.) are given in Table 2.
On the other hand, this same consistently high power across the alternatives also implies that the BDS test conveys very little information as to what kind of nonlinear process generated the data. Here it is inconsistent power against the alternatives that is useful. In this context the high power of the Tsay test relative to the BDS test for the TAR model considered here suggests that Tsay test may be useful as a marker for TAR models. 18 Next we turn to the third question raised at the beginning of this Section. Generally speaking, there appear to be few complementarities among the tests when a particular tests
power exceeds that of another against a given alternative, we typically find that the less powerful test is rejecting the null hypothesis over basically the same sample replications in which the more powerful test rejects the null also. Consequently, we conclude that construction of portmanteau tests is probably not worth pursuing.
Footnote 18: Obviously, one would need to see this pattern hold up for a variety of TAR specifications. The result is, however, confirmed using simulated data from the Potter (1995) TAR model for U.S. real GNP; see Section 4 below.
4. An Application to U.S. Real GNP
The battery of tests analyzed above was applied to the logarithmic growth rate of U.S. real GNP over a sample of 163 quarters running from 1953I to 1993III; this sample period was used for consistency with Altug, et al. (1999). These data are plotted in Figure 1 below; they appear to be reasonably stationary over this time period. The test results are given in Table 4; the significance level for each test was obtained using 1000 bootstrap replications.
Table 4. Significance Levels for Nonlinearity Tests on U.S. Real GNP
Test            Significance level
McLeod-Li       .218
Engle LM        .525
BDS             .356
Tsay            .025
Bicovariance    .017
Bispectral      .331
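A bootstrap significance level of the kind reported in Table 4 can be sketched as follows for a McLeod-Li-type statistic. This is a minimal illustration under stated assumptions rather than the authors' implementation: the statistic is taken to be a Ljung-Box-type sum over the autocorrelations of the squared series, the bootstrap resamples the (already prewhitened) series with replacement, and the function names and the demonstration series are hypothetical.

```python
import random


def mcleod_li_stat(x, max_lag=24):
    """Ljung-Box-type statistic computed from autocorrelations of the squared series."""
    n = len(x)
    squares = [v * v for v in x]
    mean_sq = sum(squares) / n
    dev = [s - mean_sq for s in squares]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def bootstrap_significance(x, stat_fn, n_boot=1000, seed=0):
    """Fraction of resampled series whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = stat_fn(x)
    exceed = 0
    for _ in range(n_boot):
        resampled = rng.choices(x, k=len(x))  # i.i.d. draws remove serial dependence
        if stat_fn(resampled) >= observed:
            exceed += 1
    return exceed / n_boot


# Hypothetical ARCH(1)-type series of 163 observations, used only to exercise the code.
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(163):
    prev = rng.gauss(0.0, 1.0) * (0.5 + 0.5 * prev * prev) ** 0.5
    x.append(prev)

# 200 replications keep the demonstration quick; the text uses 1000.
p = bootstrap_significance(x, mcleod_li_stat, n_boot=200, seed=2)
print(p)
```

The same `bootstrap_significance` wrapper could in principle be applied to any of the other statistics, provided the resampling scheme is appropriate to that test's null hypothesis.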
Figure 1. Growth Rate in Real GNP, 1953I - 1993III (plot not reproduced; the vertical axis runs from -0.03 to 0.04)
As expected from results in Ashley and Patterson (1989) on the U.S. Index of Industrial Production and results in Altug, et al. (1999) and in Potter (1995) on real GNP itself, the null hypothesis of a linear generating mechanism for this time series can be rejected at the 5% level. In fact, the results in Table 4 indicate that this null hypothesis can be rejected at the 2.5% level using the Tsay and Hinich bicovariance tests. The pattern of test results for this time series is quite interesting. For one thing, since neither the McLeod-Li nor the Engle LM test rejects the null, the nonlinear generating mechanism for real GNP is apparently not inducing significant conditional heteroskedasticity in the series. Moreover, since the power of the BDS test is noticeably higher than that of the Tsay and Hinich bicovariance tests in most of the simulations reported in Section 3, it is noteworthy
that these two tests reject the null hypothesis for these data and the BDS test does not.
More specifically, real GNP is nowadays commonly modeled as some sort of two-state regime switching process. This specification is consistent with the fact that the Tsay test, which has high relative power against the simple TAR alternative considered in Section 3 above, rejects the null hypothesis of a linear generating mechanism at the 2.5% level for these data. But the pattern of the real GNP test results in Table 4 is quite different from what one might expect based on the regime switching simulation results reported in Section 3. For example, using data generated from the simple TAR model considered there, the Hinich bicovariance test has quite low power and the BDS test has relatively high power. Similarly, for data generated from the simple Markov regime switching model considered in Section 3, both the Tsay and the Hinich bicovariance tests have low power relative to the BDS test. In contrast, using the actual data on real GNP, we see a fairly strong rejection from the Tsay test and the Hinich bicovariance test, but the BDS test cannot reject the null hypothesis. This pattern of results suggests either that the true generating mechanism for real GNP is a regime switching model quite unlike the two switching models considered in Section 3, or that it is not well approximated by a regime switching model at all.
To examine these two possibilities, we estimated the power of all six tests using simulated data generated from each of three estimated switching models for real GNP in the literature. In essence, we take the pattern of nonlinearity test results given in Table 4 to be a new stylized fact concerning real GNP and ask whether or not each (or any) of these putative models for real GNP can reproduce it.
The first of these estimated switching models is a TAR model for U.S. real GNP identified and estimated by Potter (1995), based on an identification procedure suggested by Tsay (1991). Potter's preferred model is:
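\[
y_t =
\begin{cases}
\underset{(.423)}{-.808} + \underset{(.185)}{.516}\, y_{t-1} - \underset{(.353)}{.946}\, y_{t-2} + \underset{(.216)}{.352}\, y_{t-5} + \varepsilon_t & \text{for } y_{t-2} \le 0 \\
\underset{(.161)}{.517} + \underset{(.080)}{.299}\, y_{t-1} + \underset{(.107)}{.189}\, y_{t-2} - \underset{(.069)}{1.143}\, y_{t-5} + \eta_t & \text{for } y_{t-2} > 0
\end{cases}
\]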
where the figures in parentheses are estimated standard errors and the sample period is 1948:3 to 1990:4 (footnote 19). One can wonder whether the terms at lag five are really necessary, but Potter notes that Tiao and Tsay (1991) obtain qualitatively similar behavior from a model omitting them, so they are probably not causing any major problems.
The second model is a two-state Markov chain model first proposed by Hamilton (1989) and re-estimated by Lam (1997) over the sample period 1952:2 to 1996:4 (footnote 20):
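\[
y_t =
\begin{cases}
\underset{(.093)}{.852} & \text{(State I)} \\
\underset{(.453)}{-1.500} & \text{(State II)}
\end{cases}
+ \underset{(.084)}{.388}\, y_{t-1} + \underset{(.102)}{.097}\, y_{t-2} - \underset{(.099)}{.106}\, y_{t-3} - \underset{(.083)}{.127}\, y_{t-4} + \varepsilon_t
\]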
Footnote 19: Potter's sample period includes a number of large variations at the beginning of the sample. In common with most authors, we exclude these unusual observations by starting our sample period somewhat later. Inclusion of these noisier observations in the sample washes out some of the significance of the nonlinearity test results, but the Tsay test still rejects at the 5% level and the BDS test still rejects at the 5% level at embedding dimensions of three and four and fails to reject at an embedding dimension of two, even at the 10% level.
Footnote 20: Lam's results differ somewhat from Hamilton's, partly because he uses more data (Hamilton's sample period ran from 1952:2 to 1984:4) but primarily because of the BEA's shift from fixed-weighted to chain-weighted real GNP. Since the shift is probably an improvement, we use Lam's estimates rather than Hamilton's.
where the system remains in State I with probability .966 and remains in State II with probability .208.
The third model is a two-state Markov chain model proposed by Lam (1997) which generalizes this framework to allow both the mean growth rate and the transition probabilities to depend on how long the system has been in its current state (\(D_t\)):
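\[
\begin{aligned}
y_t = {} &
\begin{cases}
\underset{(.1198)}{1.6250} - \underset{(.0241)}{.0775}\,(D_t - 1) + \underset{(.0006)}{.0013}\,(D_t - 1)^2 & \text{(State I)} \\
\underset{(.1574)}{-.2755} - \underset{(.2237)}{1.2302}\,(D_t - 1) + \underset{(.0884)}{.6267}\,(D_t - 1)^2 & \text{(State II)}
\end{cases} \\
& + \underset{(.099)}{.385}\, y_{t-1} + \underset{(.134)}{.401}\, y_{t-2} - \underset{(.099)}{.292}\, y_{t-3} - \underset{(.114)}{.193}\, y_{t-4} + \varepsilon_t
\end{aligned}
\]
where the probability of remaining in State I is the logistic of \(\underset{(.3861)}{.9867} + \underset{(.0342)}{.0970}\,(D_t - 1)\) and the probability of remaining in State II is the logistic of \(\underset{(.7660)}{2.0873} - \underset{(.5861)}{1.6992}\,(D_t - 1)\).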
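To make the duration dependence concrete, the stay probabilities implied by the coefficients quoted above can be tabulated with a few lines of Python. This is a sketch that assumes "the logistic of x" denotes 1/(1 + exp(-x)); the function names are hypothetical and no rounded values are claimed beyond what the code prints.

```python
import math


def logistic(x):
    """Logistic function, assumed here to be 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def stay_prob_state1(duration):
    """Probability of remaining in State I after `duration` periods in that state."""
    return logistic(0.9867 + 0.0970 * (duration - 1))


def stay_prob_state2(duration):
    """Probability of remaining in State II after `duration` periods in that state."""
    return logistic(2.0873 - 1.6992 * (duration - 1))


for d in range(1, 7):
    print(d, round(stay_prob_state1(d), 3), round(stay_prob_state2(d), 3))
```

With these coefficients the State I stay probability rises with duration while the State II stay probability falls quickly, since the slope on (D_t - 1) is positive in the first case and negative in the second.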
We generated 250 samples of length 163 observations from each of these three models and computed the empirical power of each nonlinearity test using data generated from each model; this sample length was chosen to match the number of observations in the actual sample. The resulting power estimates are given in Table 5.
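The simulation design just described can be sketched for the constant-transition-probability Markov switching model. This is an illustration only: the intercepts, autoregressive coefficients, and stay probabilities are taken from the text, but the innovation standard deviation `sigma`, the burn-in length, the initial state, and all function names are assumptions introduced here, since the text does not report them.

```python
import random

MEANS = {1: 0.852, 2: -1.500}        # state-dependent intercepts quoted in the text
AR = [0.388, 0.097, -0.106, -0.127]  # coefficients on y_{t-1}, ..., y_{t-4}
STAY = {1: 0.966, 2: 0.208}          # probability of remaining in each state


def simulate_markov_switching(n_obs, rng, sigma=1.0, burn_in=200):
    """Simulate one sample path; sigma and burn_in are illustrative assumptions."""
    state = 1
    y = [0.0] * len(AR)  # start-up values, discarded along with the burn-in
    for _ in range(burn_in + n_obs):
        if rng.random() > STAY[state]:
            state = 2 if state == 1 else 1
        ar_part = sum(c * y[-(i + 1)] for i, c in enumerate(AR))
        y.append(MEANS[state] + ar_part + rng.gauss(0.0, sigma))
    return y[-n_obs:]


def empirical_power(samples, pvalue_fn, alpha=0.05):
    """Fraction of samples on which a test with significance level pvalue_fn rejects."""
    rejections = sum(1 for s in samples if pvalue_fn(s) <= alpha)
    return rejections / len(samples)


rng = random.Random(12345)
samples = [simulate_markov_switching(163, rng) for _ in range(250)]
print(len(samples), len(samples[0]))  # 250 163
```

Any function that maps a series to a (bootstrap) significance level can be passed as `pvalue_fn`; the power estimate is then simply the rejection frequency across the 250 samples.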
Table 5. Empirical Power of 5% Tests: Data Generated From Estimated Models for U.S. Real GNP
Generating model                                    McLeod-Li  Engle LM  BDS   Tsay  Bicovariance  Bispectral
Potter (1995) TAR model                             .91        .93       .83   .95   .96           .02
Lam (1997) re-estimation of Hamilton Markov
  switching model (constant transition
  probabilities)                                    .05        .07       .11   .07   .07           .07
Lam (1997) Markov switching model with duration
  dependent mean and transition probabilities       .09        .11       .11   .13   .14           .06
Except for the Hinich bispectral test, all of the tests appear to have high power to detect the nonlinearity in the data generated from Potter's estimated TAR model. With power this high, one would expect the McLeod-Li and Engle LM tests to reject the null hypothesis of linearity in the actual data if they were generated by a model similar to Potter's TAR, but reference to Table 4 shows that they do not reject. In contrast, none of the tests seems particularly effective at detecting the nonlinearity in the data generated using either of the two Markov switching models estimated by Lam (1997). With power this low, one would not expect the Tsay and Hinich bicovariance tests to reject linearity in the actual data if they were generated by a Markov chain model such as these, but they do reject. Thus, the pattern of which tests reject linearity using the actual real GNP data conflicts with the pattern of power results obtained using data generated from each of the three estimated models.
In principle, these discrepancies could be due to ordinary sampling variation. To assess the statistical significance of the observed (uneven) pattern of rejection significance levels across the BDS, Tsay, and Hinich bicovariance tests, we applied these three tests to 1000 data sets (each of length 163) generated from each of the three estimated models. We then computed the fraction of the 1000 data sets generated from each estimated model which yielded test results at least as extreme as those observed with the actual data, where "at least as extreme" means that the BDS test significance level exceeds .356, the Tsay significance level is less than .025, and the Hinich bicovariance test significance level is less than .017. For each of the three generating models, this fraction is given (as a percentage) in Table 6.
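The "at least as extreme" criterion defined above amounts to a joint condition on three significance levels, which can be sketched as follows. The thresholds are the Table 4 values; the function names and the simulated significance levels are hypothetical and serve only to show the calculation.

```python
# Observed significance levels for the actual real GNP data (Table 4).
OBSERVED = {"bds": 0.356, "tsay": 0.025, "bicov": 0.017}


def at_least_as_extreme(p):
    """True if BDS is weaker, and Tsay and bicovariance stronger, than observed."""
    return (p["bds"] > OBSERVED["bds"]
            and p["tsay"] < OBSERVED["tsay"]
            and p["bicov"] < OBSERVED["bicov"])


def extreme_percentage(simulated):
    """Percentage of simulated data sets satisfying the joint criterion."""
    hits = sum(1 for p in simulated if at_least_as_extreme(p))
    return 100.0 * hits / len(simulated)


# Hypothetical significance levels from four simulated data sets (illustration only).
simulated = [
    {"bds": 0.010, "tsay": 0.020, "bicov": 0.010},
    {"bds": 0.600, "tsay": 0.010, "bicov": 0.005},
    {"bds": 0.400, "tsay": 0.300, "bicov": 0.010},
    {"bds": 0.020, "tsay": 0.040, "bicov": 0.200},
]
print(extreme_percentage(simulated))  # 25.0
```

Only the second hypothetical data set meets all three conditions, so the function returns 25.0. Applied to 1000 data sets simulated from an estimated model, the same calculation yields a percentage of the kind reported in Table 6.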
Table 6. Percentage of 1000 Simulated Data Sets Yielding BDS, Tsay, and Hinich Bicovariance Test Results More Extreme Than Those Observed With the Actual Real GNP Data
Generating model                                                                    Percentage
Threshold autoregressive model, Potter (1995)                                       2.9%
Markov switching model, constant transition probabilities, Lam (1997)               .1%
Markov switching model, duration dependent transition probabilities, Lam (1997)     .5%
Evidently, data generated from these estimated models produce a pattern of test results this uneven only very rarely. We conclude that the null hypothesis that the growth rate of real GNP is generated by a threshold autoregression of the form estimated by Potter (1995) can be rejected at the 3% level of significance and that the null hypothesis that the growth rate of real GNP is generated by a Markov switching model of the forms estimated by Lam (1997) can be rejected at the .5% level (footnote 21).
Footnote 21: Further restricting attention to outcomes for which the McLeod-Li and Engle LM tests fail to reject at even the 21.8% and 52.5% levels would only strengthen this result. Our approach is similar in spirit to that of Brock, Lakonishok, and LeBaron (1992), in which a putative underlying null hypothesis model (random walk, AR(1), GARCH-M, etc.) is fit to daily stock price data (the Dow Jones Industrial Average) and used to bootstrap artificial stock price data. The results of several simple (chartist) trading rules applied to the actual stock price data are compared to those obtained applying the same trading rules to the artificial data generated from each model, allowing Brock, et al. to infer whether each null model could have generated the actual data.
5. Summary and Conclusions
The size and power of the McLeod-Li, Engle LM, BDS, Tsay, and Hinich bicovariance and bispectral tests are examined above over a variety of data generating mechanisms. We conclude that:
(1) At N = 200, bootstrapping is necessary (and sufficient) in order for the tests to be properly sized.
(2) Of the tests considered, the BDS test has relatively high power against all of the alternatives, making it a reasonable choice as a nonlinearity screening test for routine use.
(3) The test results appear to be quite highly correlated with one another; based on these results we see little potential benefit in attempting to combine them into a portmanteau test.
(4) Excluding the BDS test, the remaining tests are quite inconsistent in their power across the various alternatives considered. Some of the tests (e.g., McLeod-Li) are simply erratic. The Tsay test, however, appears to have relatively high power against TAR alternatives compared to the BDS test.
Finally, applying the battery of tests to actual data on U.S. real GNP, we find persuasive evidence that the generating mechanism for this time series is nonlinear. However, the pattern of the test results is quite unlike anything we observe in our simulations. In particular, on the actual data, we find that none of the other tests rejects the null hypothesis of a linear generating mechanism for real GNP more strongly than the Hinich bicovariance test; this result suggests that the Hinich bicovariance test may be substantially more useful in practice than our results based on data simulated from the generating mechanisms in Table 2 might indicate. Moreover, with the actual real GNP data we find that the Tsay and Hinich bicovariance tests reject the null hypothesis but the BDS test does not. This result is strikingly at odds with our power estimates based on simulated data, in which the power of the BDS test is generally substantially higher than that of the Tsay and Hinich bicovariance tests.
Taking this observed pattern of nonlinearity test results as a new stylized fact about U.S. real GNP, we examine whether estimated TAR and Markov switching models for U.S. real GNP in the literature are consistent with it. We find that data generated from these estimated TAR and Markov switching models yield patterns of nonlinearity test results which are significantly different from the pattern observed in applying the tests to the sample data itself. Thus, we are able to reject the widely accepted hypothesis that real GNP is generated by some sort of two-state regime switching model, even though several such models fit the data quite well in a least squares sense. While our test does not directly suggest an alternative specification for these data, our finding that these existing specifications inadequately capture the underlying nonlinear dynamics in real GNP is an important result.
What kind of nonlinear process does generate real GNP, then? An ARCH or GARCH model seems quite unlikely in view of the failure of the McLeod-Li and Engle LM tests to reject linearity. A quadratic or a cubic model seems possible, or perhaps some form of nonlinear model yet to be proposed.
References
Altug, S., Ashley, R., and Patterson, D. M. (1999) "Are Technology Shocks Nonlinear?" Macroeconomic Dynamics 3(4), 506-533.
Ashley, R. and Patterson, D. M. (1989) "Linear Versus Nonlinear Macroeconomies" International Economic Review 30, 685-704.
Ashley, R., Patterson, D. M., and Hinich, M. (1986) "A Diagnostic Test for Nonlinear Serial Dependence in Time Series Fitting Errors" Journal of Time Series Analysis 7, 165-78.
Barnett, W. A. and Hinich, M. J. (1992) "Empirical Chaotic Dynamics in Economics" Annals of Operations Research 37, 1-15.
Barnett, W. A., Gallant, A. R., Hinich, M. J., Jungeilges, J. A., Kaplan, D. T., and Jensen, M. J. (1995) "Robustness of Nonlinearity and Chaos Tests to Measurement Error, Inference Method, and Sample Size" Journal of Economic Behavior and Organization 27, 301-320.
Barnett, W. A., Gallant, A. R., Hinich, M. J., Jungeilges, J. A., Kaplan, D. T., and Jensen, M. J. (1997) "A Single-Blind Controlled Competition Among Tests for Nonlinearity and Chaos" Journal of Econometrics 82, 157-92.
Bollerslev, T. (1986) "Generalized Autoregressive Conditional Heteroskedasticity" Journal of Econometrics 31, 307-27.
Box, G. E. P. and Jenkins, G. M. (1976) Time Series Analysis. Holden-Day: San Francisco.
Brillinger, D. and Rosenblatt, M. (1967) "Asymptotic Theory of kth Order Spectra" in Spectral Analysis of Time Series (B. Harris, ed.) Wiley: New York, pp. 153-88.
Brock, W. A., Hsieh, D. A., and LeBaron, B. D. (1991) A Test of Nonlinear Dynamics, Chaos, and Instability: Theory and Evidence. MIT Press: Cambridge.
Brock, W. A., Dechert, W., and Scheinkman, J. (1996) "A Test for Independence Based on the Correlation Dimension" Econometric Reviews 15, 197-235.
Brock, W., Lakonishok, J., and LeBaron, B. (1992) "Simple Technical Trading Rules and the Stochastic Properties of Stock Returns" Journal of Finance XLVII(5), 1731-1764.
Dalle Molle, J. W. and Hinich, M. J. (1995) "Trispectral Analysis of Stationary Random Time Series" Journal of the Acoustical Society of America 97, 2963-2978.
David, H. A. (1970) Order Statistics. Wiley: New York.
Engle, R. F. (1982) "Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation" Econometrica 50, 987-1007.
Fama, E. F. (1965) "The Behavior of Stock Market Prices" Journal of Business 38, 34-105.
Grandmont, J. M. (1985) "On Endogenous Competitive Business Cycles" Econometrica 53, 995-1045.
Granger, C. W. J. and Andersen, A. A. (1978) An Introduction to Bilinear Time Series Models. Vandenhoeck and Ruprecht: Göttingen.
Hall, R. (1990) "Invariance Properties of Solow's Productivity Residual" in P. Diamond (ed.) Growth/Productivity/Employment. MIT Press: Cambridge.
Hamilton, J. (1989) "A New Approach to the Economic Analysis of Non-Stationary Time Series and the Business Cycle" Econometrica 57, 357-84.
Hicks, J. R. (1950) A Contribution to the Theory of the Trade Cycle. Oxford University Press: Oxford.
Hinich, M. (1982) "Testing for Gaussianity and Linearity of a Stationary Time Series" Journal of Time Series Analysis 3, 169-76.
Hinich, M. (1996) "Testing for Dependence in the Input to a Linear Time Series Model" Journal of Nonparametric Statistics 6, 205-221.
Hinich, M. and Patterson, D. M. (1985) "Evidence of Nonlinearity in Daily Stock Returns" Journal of Business and Economic Statistics 3, 69-77.
Hinich, M. and Patterson, D. M. (1995) "Detecting Epochs of Transient Dependence in White Noise" unpublished manuscript, University of Texas at Austin.
Judge, G. G., Griffiths, W. E., Hill, R. C., Lütkepohl, H., and Lee, T. C. (1985) The Theory and Practice of Econometrics. John Wiley and Sons: New York.
Kanter, M. and Steiger, W. L. (1974) "Regression and Autoregression with Infinite Variance" Advances in Applied Probability 6, 768-83.
Kaplan, D. T. (1993) "Exceptional Events as Evidence for Determinism" Physica D 73, 38-48.
Keenan, D. M. (1985) "A Tukey Nonadditivity-type Test for Time Series Nonlinearity" Biometrika 72, 39-44.
Lam, P. (1997) "A Markov Switching Model of GNP Growth With Duration Dependence" unpublished manuscript.
Lee, T., White, H., and Granger, C. W. J. (1993) "Testing for Neglected Nonlinearity in Time Series Models" Journal of Econometrics 56, 269-90.
Lemos, M. and Stokes, H. H. (1998) "A Single-Blind Controlled Competition Among Tests for Nonlinearity and Chaos; Further Results" unpublished manuscript, University of Illinois at Chicago.
de Lima, P. J. F. (1997) "On the Robustness of Nonlinearity Tests to Moment Condition Failure" Journal of Econometrics 76, 251-80.
McLeod, A. I. and Li, W. K. (1983) "Diagnostic Checking ARMA Time Series Models Using Squared-Residual Autocorrelations" Journal of Time Series Analysis 4, 269-73.
Mizrach, B. (1991) "A Simple Nonparametric Test for Independence" unpublished manuscript.
Nychka, D., Ellner, S., Gallant, A. R., and McCaffrey, D. (1992) "Finding Chaos in Noisy Systems" Journal of the Royal Statistical Society B 54, 399-426.
Palm, F. C. and Pfann, G. A. (1997) "Sources of Asymmetry in Production Factor Dynamics" Journal of Econometrics 82, 361-92.
Patterson, D. M. and Ashley, R. (2000) A Nonlinear Time Series Workshop. Kluwer: Norwell.
Potter, S. (1995) "A Nonlinear Approach to U.S. GNP" Journal of Applied Econometrics 10, 109-125.
Ramsey, J. B. (1969) "Tests for Specification Errors in Classical Linear Least Squares Regression Analysis" Journal of the Royal Statistical Society B 31, 350-371.
Subba Rao, T. and Gabr, M. (1980) "A Test for Linearity of Stationary Time Series" Journal of Time Series Analysis 1, 145-58.
Tong, H. (1983) Threshold Models in Non-linear Time Series Analysis. Springer-Verlag: New York.
Tiao, G. and Tsay, R. S. (1991) "Some Advances in Nonlinear and Adaptive Modeling in Time Series Analysis" University of Chicago Graduate School of Business Statistics Research Centre Report #118.
Tsay, R. S. (1986) "Nonlinearity Tests for Time Series" Biometrika 73, 461-6.
Tsay, R. S. (1991) "Detecting and Modeling Nonlinearity in Univariate Time Series" Statistica Sinica 1, 431-452.
White, H. (1989) "Some Asymptotic Results for Learning in Single Hidden-Layer Feedforward Network Models" Journal of the American Statistical Association 84, 1003-1013.
Appendix 1 Proof of Theorem 1
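Lemma 1: If \(\{y_t\}\) is linear, then
\[
B_y(f_1, f_2) = \mu_3 \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)] \sum_{j=n}^{\infty} a(j)\, a(j-n+m)\, a(j-n)
\]
Proof:
\[
\begin{aligned}
B_y(f_1, f_2) &= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} E[\, y_{t+n}\, y_{t+m}\, y_t \,] \exp[-i2\pi(f_1 m + f_2 n)] \\
&= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} E\Big[ \sum_{j=0}^{\infty} a(j)\, u(t+n-j) \sum_{k=0}^{\infty} a(k)\, u(t+m-k) \sum_{l=0}^{\infty} a(l)\, u(t-l) \Big] \exp[-i2\pi(f_1 m + f_2 n)] \\
&= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)] \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{l=0}^{\infty} a(j)\, a(k)\, a(l)\, E[\, u(t+n-j)\, u(t+m-k)\, u(t-l) \,]
\end{aligned}
\]
This expectation equals \(\mu_3\) if \(n-j = m-k = -l\), so that all three innovations carry the same time index; otherwise, it is zero. So \(l = j-n\) and \(k = j-n+m\) are the only values of l and k which lead to nonzero terms in the sums over l and k. Consequently, the expression for \(B_y(f_1, f_2)\) becomes:
\[
B_y(f_1, f_2) = \mu_3 \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)] \sum_{j=0}^{\infty} a(j)\, a(j-n+m)\, a(j-n)
\]
but \(a(j-n)\) is zero for \(j \in [0, n-1]\), so the sum over j can start at j = n. This completes the proof of Lemma 1.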
Lemma 2: Assuming that the weights \(\{a(n)\}\) are absolutely summable, so that the sums defining A(f) converge,
\[
A(f_1)\, A(f_2)\, A^*(f_1+f_2) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)] \sum_{j=n}^{\infty} a(j)\, a(j-n+m)\, a(j-n)
\]
Proof:
\[
\begin{aligned}
A(f_1)\, A(f_2)\, A^*(f_1+f_2) &= \sum_{s=0}^{\infty} a(s) \exp[-i2\pi f_1 s] \sum_{\tau=0}^{\infty} a(\tau) \exp[-i2\pi f_2 \tau] \sum_{w=0}^{\infty} a(w) \exp[i2\pi(f_1+f_2) w] \\
&= \sum_{s=0}^{\infty} \sum_{\tau=0}^{\infty} \sum_{w=0}^{\infty} \exp\{-i2\pi( f_1 [s-w] + f_2 [\tau-w] )\}\, a(s)\, a(\tau)\, a(w)
\end{aligned}
\]
Letting \(m = s-w\), \(m \in [-\infty, \infty]\), and letting \(n = \tau-w\), \(n \in [-\infty, \infty]\),
\[
A(f_1)\, A(f_2)\, A^*(f_1+f_2) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{w=0}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)]\, a(m+w)\, a(n+w)\, a(w)
\]
and letting \(j = n+w\) this becomes
\[
A(f_1)\, A(f_2)\, A^*(f_1+f_2) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{j=n}^{\infty} \exp[-i2\pi(f_1 m + f_2 n)]\, a(m+j-n)\, a(j)\, a(j-n)
\]
proving Lemma 2. Combining Lemma 1 and Lemma 2 proves part 1 of Theorem 1. Part 2 follows directly from the observation that \(S_y(f) = \sigma^2 |A(f)|^2\).
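As an informal numerical check on Lemma 2 (not part of the proof), the identity can be evaluated for a finite set of weights. The sketch below assumes a(k) = 0 outside k = 0, ..., q, so every infinite sum reduces to a finite one; the MA(2) weights, the frequency pair, and the function names are hypothetical choices made for illustration.

```python
import cmath


def transfer(a, f):
    """A(f) = sum over s of a(s) exp(-i 2 pi f s) for a finite weight sequence a."""
    return sum(w * cmath.exp(-2j * cmath.pi * f * s) for s, w in enumerate(a))


def triple_sum(a, f1, f2):
    """Right-hand side of Lemma 2, treating a(k) as zero outside 0..len(a)-1."""
    q = len(a) - 1

    def weight(k):
        return a[k] if 0 <= k <= q else 0.0

    total = 0j
    for n in range(-q, q + 1):
        for m in range(-q, q + 1):
            inner = sum(weight(j) * weight(j - n + m) * weight(j - n)
                        for j in range(q + 1))
            total += cmath.exp(-2j * cmath.pi * (f1 * m + f2 * n)) * inner
    return total


a = [1.0, 0.5, -0.3]  # hypothetical MA(2) weights
f1, f2 = 0.11, 0.23   # arbitrary frequency pair
lhs = transfer(a, f1) * transfer(a, f2) * transfer(a, f1 + f2).conjugate()
rhs = triple_sum(a, f1, f2)
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding error. Summing j over 0, ..., q with zero-padded weights is equivalent to starting the sum at j = n, because a(j-n) vanishes for j < n and a(j) vanishes for j < 0.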