
THERE IS NO CHAOS IN STOCK MARKETS

JAMAL MUNSHI
ABSTRACT: The elegant simplicity of the Efficient Market Hypothesis (EMH) is its greatest weakness, because human nature
demands complicated answers to important questions, and Chaos Theory readily fills that demand for complexity, claiming that it
reveals the hidden structure in stock prices. In this paper we take a close look at the Rescaled Range Analysis tool of chaos
theorists and show that their findings are undermined by weaknesses in their methods.

1. INTRODUCTION

Stock market data have thwarted decades of effort by mathematicians and statisticians to discover their
hidden pattern. Simple time series analyses such as AR, MA, ARMA, and ARIMA (autoregressive, moving
average, autoregressive moving average, and autoregressive integrated moving average models) were eventually
replaced with more sophisticated instruments of torture such as spectral analysis, but the data refused to
confess. The failure to discover structure in price movements convinced many researchers that the
movements were random. The random walk hypothesis (RWH) of Osborne and others (Osborne, 1959)
(Bachelier, 1900) was developed into the efficient market hypothesis (EMH) by Eugene Fama (Fama,
1965) (Fama, 1970), which serves as a foundational principle of finance.
The weak form of the EMH implies that movements in stock returns are random events independent of
historical values. The rationale is that prices contain all publicly available information, and if patterns did
exist, arbitrageurs would take advantage of them and thereby quickly eliminate them. Both the RWH and
the EMH came under immediate attack from technical analysts, and this attack continues to this day, partly
because the statistics used in tests of the EMH are controversial. The null hypothesis states that the market
is efficient. The test then consists of presenting convincing evidence that it is not. The tests usually fail.
Many argue that the failure of these tests represents a Type II error, that is, a failure to detect a real effect
because of the low power of the statistical test employed. It is therefore logical to conjecture that the reason
for the failure of statistics to reject the EMH is not the strength of the theory but the weakness of the
statistics. If that is the case, perhaps a different and more powerful mathematical device that allowed for
more complexity might be successful in discovering the hidden structure of stock prices.
In the early seventies, it appeared that Catastrophe Theory was just such a device (Zeeman, 1974)
(Zeeman, 1976). It had a seductive ability to mimic long bull market periods
followed by catastrophic crashes. But it proved to be a mathematical artifact. Its properties could not be
generalized. It yielded no secret structure or patterns in stock prices. The results of other non-EMH
models such as the Rational Bubble theory (Diba, 1988) and the Fads theory (Camerer, 1989) are equally
unimpressive for the same reasons.
Date: June, 2014
Key words and phrases: rescaled range analysis, fractal, chaos theory, statistics, Monte Carlo, stock returns
Author affiliation: Professor Emeritus, Sonoma State University, Rohnert Park, CA, 94928, munshi@sonoma.edu


2. THEORY

Many economists feel that the mathematics of time series implied by Chaos Theory (Mandelbrot,
1963) is a promising alternative. If time series data have a memory of the past and behave accordingly,
even to a small extent, then much of what appears to be random behavior may turn out to be part of the
deterministic response of the system. Certain non-linear dynamical systems of equations can generate time
series data that appear remarkably similar to the behavior of stock market prices, as the sketch below illustrates.
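As a minimal illustration of this point (ours, not drawn from any of the studies cited), the logistic map is a one-line deterministic recursion whose iterates can look statistically random:

    # A sketch: the logistic map x(t+1) = r*x(t)*(1 - x(t)) is fully
    # deterministic, yet at r = 4 its output looks random to casual inspection.
    def logistic_map(x0, r=4.0, n=500):
        xs = [x0]
        for _ in range(n - 1):
            xs.append(r * xs[-1] * (1.0 - xs[-1]))
        return xs

    series = logistic_map(0.3141)  # every value is determined by x0 alone

Successive differences of such a series can mimic the jagged appearance of a returns series even though it contains no randomness at all.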
Research in this area of finance is motivated by the idea that, by using new mathematical techniques,
hidden structures can be discovered in what appears to be a random time series. One technique, attributed
to Lorenz (Lorenz, 1963), uses a plot of the data in phase space to detect patterns called strange attractors
(Bradley, 2010) (Peters, 1991). Another method, proposed by Takens (Takens, 1981), uses an algorithm
to determine the 'correlation dimension' of the data (Wikipedia, 2014) (Schouten, 1994). A low
correlation dimension indicates a deterministic system. A high correlation dimension is indicative of
randomness.
The correlation dimension technique has yielded mixed results with stock data. Halbert White and others
working with daily returns of IBM concluded that the correlation dimension was sufficiently high to
regard the time series as white noise (White, 1988), although Scheinkman et al. (Scheinkman, 1989) claim
to have found a significant deterministic component in weekly returns.
2.1 Rescaled range analysis. A third technique for discovering structure in time series data has been
described by Mandelbrot (Mandelbrot, 1982), Hurst (Hurst, 1951), Feder (Feder, 1988), and most recently
by Peters (Peters, 1991) (Peters, 1994). Called 'rescaled range analysis', or R/S, it is
a test for randomness of a series not unlike the runs test. The test rests on the relationship that in a truly
random series, a selection of sub-samples of size τ taken sequentially without replacement from a large
sample of size N should produce a random sampling distribution with a standard deviation given by

Equation 1

σx = (σ/τ^0.5)*(N-τ)/(N-1)

Here σx is the standard deviation of the distribution of sample means obtained by drawing samples of size
τ sequentially and without replacement from a large sample of size N, and σ is the standard deviation of
the large sample, i.e., σ = σx when τ = N. However, when the time series has runs, it can be shown that the
exponent of τ in the term τ^0.5 will differ from 0.5. The paper by Peters describes the following
relationship.

Equation 2

R/S = τ^H


where R is the range of the sequential running totals of the deviations of the sub-sample values from the
sub-sample mean (the sum of these deviations is of course zero by definition, but the intermediate values in
the running sub-totals will have a range that is related to the tendency for the data to have positive or
negative runs), S is the standard deviation of the sub-sample, and τ is the size of the sub-sample. The
'H' term is called the Hurst constant or the Hurst exponent. It serves as a measure of the fractal and
non-random nature of the time series. If the series is random, H will have a value of 0.5 as shown in Equation
1, but if it has runs the H exponent will be different from 0.5.
If there is a tendency for positive runs, that is, increases are more likely to be followed by increases and
decreases are more likely to be followed by decreases, then H will be greater than 0.5 but less than 1.0.
Values of H between 0 and 0.5 are indicative of negative runs, that is, increases are more likely to be
followed by decreases and vice versa. Hurst and Mandelbrot found that many natural phenomena
previously thought to be random have high H-values that are indicative of a serious departure from
independence and randomness (Hurst, 1951).
Once 'H' is determined for a time series, the autocorrelation in the time series is computed as follows:

Equation 3

CN = 2^(2H-1) - 1

CN is the correlation coefficient and its magnitude may be interpreted as the degree to which the elements
of the time series are dependent on historical values. The interpretation of this coefficient used by Peters
to challenge the EMH is that it represents the percentage of the variation in the time series that can be
explained by historical data. The weak form of the EMH implies that this correlation is zero, i.e., that the
observations are independent of each other. Therefore, evidence of such a correlation can be interpreted to
mean that the weak form of the EMH does not hold.
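For example, Peters' estimate of H = 0.611 for stocks (Table 2 below) gives CN = 2^(2*0.611-1) - 1 = 2^0.222 - 1 ≈ 0.166, the 16.6% persistence figure discussed in the next section.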
Peters (Peters, 1991) (Peters, 1994) studied monthly returns of the S&P500 index, the 30-year
government T-bond, and the excess of stock returns over bond returns, and computed the R/S values
for all three time series for eleven sequential sub-sample sizes. He found very high values of H and
CN and therefore rejected the EMH null hypothesis. His papers and books on the subject have generated a
great deal of interest in R/S research, with many papers reporting a serious departure from randomness in
stock returns previously assumed to be random (Pallikari, 1999) (Bohdalova, 2010) (Jasic, 1998)
(McKenzie, 1999) (Mahalingam, 2012). We now examine this methodology in some detail.


3. METHODOLOGY

An appropriate time series that is sufficiently long to facilitate sequential sub-sampling without
replacement is selected for R/S analysis. It is sampled in cycles. In each of many sub-sampling cycles,
sub-samples are taken sequentially and without replacement. The sub-sample sizes are
left to the discretion of the researcher. In the example data shown in Table 1, for example, we find that the
researcher (Peters, 1991) has taken eleven cycles of sub-samples from a large sample of N = 463
monthly returns. In the first sub-sampling cycle he selected one sample of τ = N = 463. In the second cycle
he selected two samples of τ = 230 returns sequentially and without replacement. Similarly, in the third
cycle he selected three sub-samples of τ = 150 sequentially and without replacement; and so on, gradually
reducing the sample size until he selected 77 sub-samples of τ = 6.

τ      Stocks   Bonds    Premium
463    31.877   45.050   27.977
230    22.081   21.587   18.806
150    16.795   15.720   15.161
116    12.247   12.805   11.275
75     12.182   10.248   11.626
52     10.121    9.290    8.790
36      7.689    7.711    7.014
25      6.296    5.449    4.958
18      4.454    4.193    4.444
13      3.580    4.471    3.549
6       2.168    2.110    2.209

Table 1: τ and R/S data from Peters

For each of the many sub-samples he generated, he computed the sample mean, the deviation of each
observation in that sub-sample from the sub-sample mean, the running sum of these deviations, the range
(maximum minus minimum) of the running sum of the deviations, and the standard deviation of the sub-sample.
(The sum of all the deviations is of course zero, but the running partial sums may have a large range that
deviates from that of a random series if there is a tendency for positive or negative persistence in the data.)
He then divided the range by the standard deviation of the sub-sample to obtain the values shown in Table 1.
For τ = 463 there was of course just one sample, and the value of Range over Standard deviation (R/S) for
stocks for that single sample was found to be 31.877, as shown in Table 1. For τ = 230, there were two
sub-samples and therefore two values of R/S. We don't know what they were, but we know from Table 1 that
their average for stocks is 22.081. Similarly, there were three values of R/S for the three sub-samples of
size τ = 150, but this information is not reported. Instead we are told that the average of the three R/S
values for stocks is 16.795. Thus, for all sub-sample sizes shown in Table 1, the only R/S values reported
and the only R/S values subjected to further analysis are the averages of the R/S values of all the
sub-samples of a given size. The additional information contained in the original data, particularly with
reference to variance, is lost and gone forever. As we shall see later, this information loss is a serious
weakness in R/S research methodology.
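A sketch of this sub-sampling and averaging step (our reconstruction of the procedure described above, not Peters' own code) makes the information loss explicit:

    import numpy as np

    def rs_by_cycle(x, tau):
        """Split the series into len(x)//tau sequential, non-overlapping
        sub-samples of size tau; return all R/S values and their average."""
        values = []
        for i in range(len(x) // tau):
            sub = x[i * tau:(i + 1) * tau]
            z = np.cumsum(sub - sub.mean())
            values.append((z.max() - z.min()) / sub.std(ddof=1))
        return values, float(np.mean(values))

Conventional R/S research keeps only the second return value, the average, and discards the individual values and their variance.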
Once all the values of τ and the average R/S for sub-samples of each size are tabulated as shown in Table
1, the researcher is ready to estimate the value of H by utilizing Equation 2, which states the relationship
between τ and R/S. To do that, the researcher first renders Equation 2 into linear form by taking the
natural logarithm of both sides to yield

Equation 4

ln(R/S) = H*ln(τ)

Equation 4 implies that there is a linear relationship between ln(R/S) and ln(τ) and that the slope of this
line is H. To estimate the value of H, the researcher takes the natural logarithms of the R/S values and of
the corresponding values of τ and then carries out ordinary least squares (OLS) linear regression between
the logarithms to estimate the slope using the regression model

Equation 5

y = b0 + b1*x

If x is set to ln(τ) and y is set to ln(R/S), then, it is claimed, the regression coefficient b1 can be
interpreted as our best unbiased estimate of the value of H. The regression results and the values of b1 are
shown in Table 2. For example, the value of b1 for stocks is b1 = 0.611, and the researcher uses that
information to conclude that his best unbiased estimate for H is H = 0.611. He may decide that this value is
rather high, much higher than H = 0.5 for a random series, and conclude that the series is not random but
contains a positive persistence. He can now use Equation 3 to put a value on the amount of persistence as
CN = 0.166; that is, 16.6% of the value of the returns can be explained by the effect of past returns.
PETERS' RESULTS

parameter   stocks    bonds     premium
b0          -0.103    -0.151    -0.185
b1           0.611     0.641     0.658
H            0.611     0.641     0.658
CN           0.166     0.216     0.245

Table 2: The regression results and model estimates as presented in the paper by Peters
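The estimation step can be sketched as follows, using the stocks column of Table 1 as input (a reconstruction under our assumptions, not Peters' worksheet):

    import numpy as np

    # Average R/S for each sub-sample size tau (stocks column of Table 1)
    avg_rs = {463: 31.877, 230: 22.081, 150: 16.795, 116: 12.247,
              75: 12.182, 52: 10.121, 36: 7.689, 25: 6.296,
              18: 4.454, 13: 3.58, 6: 2.168}

    taus = sorted(avg_rs)
    x = np.log(np.array(taus, dtype=float))            # ln(tau)
    y = np.log(np.array([avg_rs[t] for t in taus]))    # ln(R/S)
    b1, b0 = np.polyfit(x, y, 1)                       # slope, intercept
    H, CN = b1, 2 ** (2 * b1 - 1) - 1
    print(H, CN)   # should land near the reported H = 0.611 and CN = 0.166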

There are two serious problems with this regression procedure, one of which we will address in this paper.
First, we should take note that the regression algorithm minimizes the sum of squared errors in the natural
logarithm of R/S. We have no assurance that the value of H at which the sum of squared errors in ln(R/S)
is minimized is the same as the value of H at which the sum of squared errors in R/S is also minimized.
We will examine this issue in a future paper. For now we turn to the more serious issue of b0, a value that
is computed and presented but then forgotten and never interpreted. Some R/S researchers (Pallikari,
1999) (Jasic, 1998) have attempted to acknowledge the existence of b0 by changing Equation 2 to the
form

Equation 6

R/S = C*τ^H

Equation 7

C = exp(b0)

For example, in Table 2, the C and H values for stocks are C = e^(-0.103) = 0.902 and H = 0.611. So we get a sense
that the value of b0 now has a place in R/S research, but we still have no attempt by researchers to interpret
C or to examine the relationship between C and H.
In fact, there is an inverse relationship between C and H that works like a see-saw. Higher values of C are
associated with lower values of H, and lower values of C are associated with higher values of H. This
relationship presents the second methodological problem for R/S research because it implies that the
values of H and C must be evaluated together as a pair and not in isolation. Alternatively, one could fix the
regression intercept at zero, where C = 1, and compare H values directly, at the cost of increasing the error
sum of squares in the regression. We investigate these possibilities in the next section.
Yet a third issue in chaos research that must be investigated is that the way these studies are structured often leads
to a high probability of spurious findings. First, the level of hypothesis tests is usually set to α = 0.05. This
means that five percent of the time researchers will find an effect in random data. Recent studies of the
irreproducibility of results in the social sciences have led many to insist that this error level should be set to
α = 0.001 (Johnson, 2013). In many R/S papers, the false positive error rate is further exacerbated by
multiple comparisons that are made without a Bonferroni adjustment of the α level (Mundfrom, 2006).
For example, if five comparisons are made at α = 0.05, the probability of finding an effect in random data
rises to 1 - 0.95^5, or about 23% - an unacceptable rate of production of spurious and irreproducible results.
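The arithmetic is easy to verify with a short sketch (standard formulas, not specific to any of the papers cited):

    # Familywise false-positive rate for m independent tests at level alpha,
    # and the Bonferroni-adjusted per-test level that restores the target.
    def familywise_rate(alpha, m):
        return 1.0 - (1.0 - alpha) ** m

    def bonferroni_alpha(alpha, m):
        return alpha / m

    print(familywise_rate(0.05, 5))   # 0.2262...: chance of a spurious effect
    print(bonferroni_alpha(0.05, 5))  # 0.01: per-test level for 5% overall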

4. DATA ANALYSIS

For a demonstration of R/S analysis and the methodological issues we raised, we have selected four time
series of daily returns and four series of weekly returns, each with N = 2486. The stocks and stock indexes
selected for study are listed in Table 3. The number of cycles and the sub-sample sizes are
arbitrarily assigned as follows. The sub-samples are taken in six cycles. In the first cycle we take one
sample of τ = 2486; in the second cycle, two samples of τ = 1243 each; in the third cycle, four samples of
τ = 621 each; in the fourth cycle, eight samples of τ = 310 each; in the fifth cycle, sixteen samples of τ = 155
each; and in the sixth cycle we take thirty-two samples of τ = 77 each. In each cycle the samples are taken
sequentially and without replacement. We then compute the value of R/S for each sample of each series.
These values are shown in Table 5. Average R/S values as they would appear in conventional research are
shown for reference in Table 4. All data and Microsoft Excel computational files are available in the
online data archive of this paper (Munshi, 2014); the use of the R/S analysis spreadsheet is explained in
the Appendix.

Symbol   Name                           From   To     Returns   N
IXIC     NASDAQ Composite Index         2004   2014   daily     2486
DJI      Dow Jones Industrial Average   2004   2014   daily     2486
BAC      Bank of America                2004   2014   daily     2486
CL       Colgate Palmolive              2004   2014   daily     2486
SPX      S&P500 Index                   1966   2014   weekly    2486
CAT      Caterpillar                    1966   2014   weekly    2486
BA       Boeing                         1966   2014   weekly    2486
IBM      IBM                            1966   2014   weekly    2486

Table 3: Returns series selected for R/S analysis

τ      IXIC     DJI      BAC      CL       SPX      CAT      BA       IBM
2486   67.103   71.259   72.107   41.150   65.899   42.414   62.702   61.596
1243   40.069   39.317   55.765   31.469   52.971   30.363   48.096   44.789
621    26.854   23.887   29.925   21.898   34.672   25.526   35.427   36.034
310    20.426   17.773   18.719   15.808   23.039   19.659   24.614   25.140
155    14.387   13.072   14.453   12.458   15.099   13.723   15.373   14.998
77      9.487    9.231    9.370    8.977    9.796    9.204    9.938   10.614

Table 4: The average value of R/S for each value of τ

By comparing the values of R/S in Table 4 with those in Table 5 for the same value of τ, we can see that
there is a great deal of variance among the actual observed R/S values and that this variance simply
vanishes from the data when the average value is substituted for them. Of course, the reduction of
variance thus achieved increases the precision of the regression, with higher values for R-squared and
correspondingly lower values for the variance of the residuals and the standard error of the regression
coefficient b1. As a result, it appears that we have estimated H with a great deal of precision, but this
precision is illusory and fictional because it does not actually exist in the data.
Yet another problem created by throwing away the actual data and substituting one average value of R/S
to represent many observed values of R/S is that the R/S values at lower values of τ are not weighted
sufficiently in the least squares procedure, and therefore the regression is unduly influenced by the large
samples. For example, the squared residuals of the R/S values for τ = 155 carry a weight of 16 if the actual data
are used in the regression, instead of a weight of 1 when one average value is substituted for sixteen
observations. The variance and weighting problems together act to corrupt the regression procedure.
We therefore choose to carry out our regression using the data in Table 5 instead of the average values in
Table 4. With Equation 6 as our empirical model, we take the logarithm of both sides to obtain the regression
model ln(R/S) = ln(C) + H*ln(τ). Linear regression is carried out between y = ln(R/S) and x = ln(τ), and the
regression coefficients b0 and b1 are used as our unbiased estimates of the parameters in Equation 6 as
H = b1 and C = exp(b0). These results are summarized in Table 6.
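A sketch of this fit on the disaggregated data follows (our notation; rs_all is a hypothetical name standing for the 63 R/S values of one column of Table 5, listed in the same order as the sub-sample sizes):

    import numpy as np

    def fit_H_C(taus, rs):
        """OLS of ln(R/S) on ln(tau); returns H = slope, C = exp(intercept)."""
        b1, b0 = np.polyfit(np.log(np.asarray(taus, dtype=float)),
                            np.log(np.asarray(rs, dtype=float)), 1)
        return b1, float(np.exp(b0))

    # Each sub-sample size appears once per sub-sample, so tau = 77
    # contributes 32 points to the fit rather than one averaged point.
    counts = {2486: 1, 1243: 2, 621: 4, 310: 8, 155: 16, 77: 32}
    taus_all = [t for t, k in counts.items() for _ in range(k)]
    # fit_H_C(taus_all, rs_all) should then reproduce the corresponding
    # H and C values reported in Table 6.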

τ      IXIC     DJI      BAC      CL       SPX      CAT      BA       IBM
2486   67.103   71.259   72.107   41.150   65.899   42.414   62.702   61.596
1243   51.309   52.210   54.747   32.024   43.696   24.955   60.936   36.848
1243   28.829   26.424   56.783   30.915   62.245   35.772   35.255   52.730
621    29.682   21.223   26.771   22.764   34.683   20.575   50.970   37.105
621    31.785   30.288   35.840   25.693   30.024   22.941   33.671   37.393
621    20.237   23.359   27.944   21.509   37.810   24.660   23.726   46.462
621    25.713   20.679   29.143   17.628   36.170   33.931   33.343   23.175
310    21.686   16.900   16.022   19.321   29.664   19.673   31.186   30.035
310    25.180   19.334   15.103   12.126   26.607   19.483   23.536   25.724
310    21.297   17.508   16.620   17.864   20.039   15.674   25.636   27.780
310    21.462   18.272   24.181   17.273   20.073   18.064   26.447   21.216
310    19.619   18.104   19.288   18.850   18.165   18.820   23.861   28.885
310    17.099   19.098   16.541   12.516   23.090   17.210   14.981   25.668
310    22.256   18.486   22.588   14.318   20.320   19.411   26.802   17.154
310    14.810   14.482   19.414   14.195   26.351   28.940   24.465   24.660
155    15.702   13.432   10.524   15.382   14.605    9.877   20.571   17.603
155    20.579   15.052   15.727   12.499   17.725   12.534   14.393   17.697
155    15.521    9.978   15.379    9.674   17.187   14.446   15.825   13.806
155    15.114   11.873   14.014   12.337   17.924   17.099   16.399   21.653
155    13.939   16.481   10.149    9.048   14.473   14.299   15.193   15.273
155    15.930   12.707   10.778   14.061   18.773   14.238   18.211   16.065
155    14.180   11.235   15.423   14.286    9.640   11.549   14.829   14.978
155    11.935   13.367   15.740   13.984   16.650   11.034   16.629   16.202
155    13.455   12.881   14.667   14.152   14.948   12.421   12.580   15.857
155    16.258   13.275   18.580   12.453   18.088   11.937   11.986   12.682
155    12.544   12.515   17.654   10.921   11.005   11.785   13.044   13.058
155    10.086   10.684    9.266   10.798   12.600   15.035   13.145   14.135
155    16.071   14.108   21.575    9.445   14.079   15.665   17.146   14.022
155    17.333   14.939   11.894   12.730   12.496   13.889   13.500   11.632
155     9.892   13.125   13.289   11.921   20.200   21.845   16.938   16.054
155    11.649   13.497   16.583   15.644   11.188   11.925   15.576    9.257
77      8.392    9.484    8.906   11.395   13.576    8.834   14.273   10.773
77      9.855    9.161    7.118    7.119   11.781    7.296    9.586   10.067
77     12.330   11.570   11.016    9.100   14.479   10.615   11.301   14.476
77     12.180    9.469    8.440    9.271   11.659   11.430   10.346   13.900
77      7.868    8.495   12.870    7.219    7.190    8.695    8.927    8.236
77     12.964   10.496   10.287    7.777   13.160    9.894   10.802    9.957
77      9.299    7.426    8.161    8.562    8.972    9.907   10.170   11.608
77      7.807    8.526   11.315   12.522    8.748   11.213   13.913   12.331
77     10.151    9.622    6.730   10.405    7.598    7.069    8.428    9.733
77      9.101    7.086    6.298    8.707   12.652    9.828   10.719   15.051
77      9.589    9.441   10.574    6.744   11.315    8.451   14.611   10.573
77      9.083    8.250    8.774    7.840   10.865    7.108    8.418   11.742
77     10.850   10.348   10.946    7.576    9.079    8.228    7.268   12.453
77     10.262    8.191    7.890   11.614    6.673   10.719    9.732   11.451
77      9.654    9.064    7.986    7.782   12.759    7.743   13.020   12.755
77      9.553    9.052    9.836    8.747   10.282    8.121    7.375   13.922
77      9.967   11.382    9.209    8.527   11.010    9.737    8.988   12.289
77      7.454    6.692    9.604   10.500    7.189    8.491   11.645   12.574
77     10.740   11.077   11.248    8.565    8.270    9.268    8.933    8.091
77      7.553    9.874    6.334   11.670   10.510    8.201    7.447    9.600
77      8.465    8.939    9.684    8.718    8.634   10.098    6.343   10.708
77      9.621   11.993    8.712    8.649    9.350    6.315    6.836    8.260
77      9.003    9.181    8.425    9.078    5.357    7.916   12.648    8.333
77      8.555    7.333    8.455    8.891    9.187    8.003   10.457    8.487
77      6.842    8.966    8.569    8.604   12.017   13.898    9.902   10.859
77     10.029   10.065   10.610    8.742    9.000    9.711    7.341    8.653
77     12.157    5.701   11.106    7.067    6.761    9.779    9.961    8.834
77      8.790   11.964    8.901    7.535    8.546    8.878   11.736    8.119
77      9.993    6.883    8.849    8.204   12.201   13.071    9.974   11.741
77      9.360    8.788   14.334    8.490    9.188    7.354    7.478    9.111
77      6.806   11.101    8.534   10.629    7.687   10.189    8.959    7.085
77      9.310    9.787   10.127   11.013    7.781    8.480   10.470    7.884

Table 5: R/S values for all sub-samples taken

parameter     IXIC     DJI      BAC      CL       SPX      CAT      BA       IBM
C = exp(b0)   0.9550   1.0012   0.7339   1.3406   0.6999   1.1850   0.7721   0.9346
H = b1        0.5282   0.5065   0.5810   0.4352   0.6030   0.4741   0.5872   0.5548
R²            0.8627   0.8639   0.8594   0.8382   0.8472   0.8211   0.8478   0.8395
σ(H)          0.0270   0.0257   0.0301   0.0245   0.0328   0.0283   0.0319   0.0311

Table 6: Results of linear regression on the natural logarithms of the data in Table 5

In Table 6, the R² value expresses the percentage of the total sum of squared deviations from the mean
that is explained by the regression. These values are quite a bit lower than the R² values one usually
encounters in R/S research (typically well above 0.95), but they are probably more realistic. Similarly, the
standard errors in the estimation of H, listed as σ(H) in Table 6, are higher but more reliable for the same reasons.
If we scan Table 6 for H values that appear to be very different from H = 0.5, CL and SPX stand out, with
CL showing a rather low value of H and SPX a somewhat high value. This is of course the line of
reasoning conventionally taken. However, if we also look at the values of C we find something very
interesting. Colgate Palmolive (CL), with a value of H unusually lower than H = 0.5, shows a value of C
that is unusually higher than C = 1. In the same way, the S&P500 Index (SPX), with a value of H unusually
higher than the neutral value of H = 0.5, shows a value of C that is correspondingly lower than the neutral
value of C = 1. The DJIA Index (DJI) appears to be in the middle of these extremes, with an H value very
close to H = 0.5 and a C value very close to C = 1.
In fact, if we compare all the H and C values we can discern the see-saw effect mentioned earlier. Values
of H greater than 0.5 correspond with values of C less than 1.0, and values of H less than 0.5 correspond
with values of C greater than 1.0. Therefore, to interpret the regression results in terms of persistence in
the data and chaos in stock prices, we must first understand the relationship between C and H in a random
series that has no persistence, no memory, and no chaos. To do that, we took forty random samples of size
n = 2486 from a Gaussian distribution with μ = 0 and σ = 1 and computed the R/S values using the same
sub-sampling strategy that we used for the stock returns. Linear regression between ln(R/S) and ln(τ) was
carried out for each of the forty samples. The regression results are shown in Figure 1 and Table 7.
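A sketch of this Monte Carlo experiment follows (our reconstruction in Python; the paper's computations were done in Excel, and the regression here uses the per-τ averages):

    import numpy as np

    rng = np.random.default_rng(1)   # any seed; the samples are arbitrary

    def rs_mean(x, tau):
        """Average R/S over the len(x)//tau sequential sub-samples of size tau."""
        vals = []
        for i in range(len(x) // tau):
            sub = x[i * tau:(i + 1) * tau]
            z = np.cumsum(sub - sub.mean())
            vals.append((z.max() - z.min()) / sub.std(ddof=1))
        return float(np.mean(vals))

    taus = [2486, 1243, 621, 310, 155, 77]
    pairs = []                                  # one (C, H) pair per sample
    for _ in range(40):
        x = rng.standard_normal(2486)           # Gaussian, mu = 0, sigma = 1
        y = [rs_mean(x, t) for t in taus]
        b1, b0 = np.polyfit(np.log(taus), np.log(y), 1)
        pairs.append((np.exp(b0), b1))          # the C-H see-saw shows up here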
[Figure 1: Observed relationship between C and H in random numbers. Scatter plot of ln(H) against ln(C) with fitted line y = -0.4199x - 0.6572, R² = 0.9717.]


            --------- All data ---------   --------- Averages* --------
Sample#     C        H        R²           C        H        R²
14          0.629    0.617    0.816        0.631    0.622    0.989
15          0.683    0.606    0.847        0.754    0.592    0.997
32          0.684    0.601    0.853        0.588    0.630    0.961
-           0.688    0.607    0.786        0.816    0.581    0.998
17          0.758    0.588    0.845        0.732    0.599    0.997
35          0.797    0.572    0.827        0.811    0.572    0.998
37          0.799    0.571    0.781        0.716    0.595    0.987
-           0.804    0.565    0.857        0.880    0.551    0.985
16          0.805    0.579    0.804        0.751    0.597    0.997
40          0.811    0.564    0.773        0.562    0.637    0.979
19          0.814    0.566    0.818        1.205    0.499    0.980
20          0.816    0.560    0.792        0.830    0.562    0.995
28          0.843    0.556    0.833        0.612    0.618    0.961
36          0.847    0.561    0.810        0.844    0.564    0.980
-           0.881    0.550    0.782        0.847    0.561    0.980
-           0.884    0.553    0.814        1.235    0.496    0.994
21          0.884    0.532    0.783        0.654    0.591    0.976
-           0.885    0.555    0.734        1.097    0.522    0.983
10          0.899    0.549    0.784        1.357    0.479    0.980
22          0.924    0.528    0.816        0.840    0.548    0.990
12          0.924    0.531    0.814        0.705    0.585    0.981
38          0.938    0.536    0.848        0.901    0.546    0.997
30          0.939    0.528    0.785        0.910    0.539    0.999
34          0.950    0.526    0.859        0.675    0.590    0.959
27          0.960    0.531    0.738        1.089    0.515    0.995
29          0.980    0.519    0.807        1.192    0.487    0.994
23          0.989    0.520    0.790        1.094    0.505    0.992
18          1.008    0.516    0.765        0.758    0.571    0.959
33          1.030    0.521    0.817        1.364    0.472    0.990
11          1.032    0.524    0.851        1.312    0.485    0.989
25          1.058    0.502    0.855        0.879    0.538    0.973
31          1.072    0.502    0.800        1.019    0.515    0.998
-           1.073    0.496    0.757        1.508    0.438    0.975
24          1.083    0.504    0.775        1.069    0.511    0.982
-           1.098    0.506    0.773        1.644    0.437    0.973
39          1.117    0.509    0.800        1.475    0.461    0.982
-           1.149    0.480    0.761        1.525    0.432    0.981
13          1.303    0.470    0.775        1.881    0.406    0.983
-           1.462    0.442    0.798        2.120    0.376    0.975
26          1.482    0.424    0.682        1.952    0.377    0.993
mean        0.944    0.537    0.800        1.046    0.530    0.984
stdev       0.188    0.042    0.038        0.394    0.068    0.011
max         1.482    0.617    0.859        2.120    0.637    0.999
min         0.629    0.424    0.682        0.562    0.376    0.959
range       0.853    0.193    0.177        1.559    0.260    0.040

Table 7: The values of C and H for random numbers, sorted by the all-data value of C
* The results for the regression of average values are shown only for comparison.

The inverse relationship between C and H in random numbers is clear to see in the sorted data in Table 7
and also graphically in Figure 1. The relationship between H and C for random numbers in our sub-sample
structure may thus be estimated numerically with linear regression in a purely empirical way, as
shown in Figure 1.

Equation 8

ln(H) = -0.6572 - 0.4199*ln(C)

It is noted that Equation 8 does not yield a value of H = 0.5 when C = 1, as we would expect according to
Equation 2. The actual value we compute from Equation 8 is H(C=1) = e^(-0.6572) = 0.5183. We know that
Equation 8 is not accurate, but it is a better alternative than ignoring the value of C. We can now use
Equation 8 for hypothesis tests to determine whether the observed values of H for the stock returns shown in
Table 6 are different from the values of H we would observe in a random series for the same value of C.
The hypothesis tests are shown in Table 8. The row marked H(C) in Table 8 refers to the value of H we
would expect in a random series for the value of C in each column.
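The test itself can be sketched as follows (our reconstruction; the degrees of freedom and the two-sided convention are our assumptions, since the exact p-value convention used is not stated above):

    import math
    from scipy import stats

    def expected_H(C):
        """H expected in a random series for a given C, per Equation 8."""
        return math.exp(-0.6572 - 0.4199 * math.log(C))

    def t_and_p(H, C, se_H, df=61):   # df: 63 points minus 2 parameters
        t = abs(H - expected_H(C)) / se_H        # se_H from Table 6
        p = 2.0 * (1.0 - stats.t.cdf(t, df))     # two-sided p-value
        return t, p

    print(t_and_p(0.5282, 0.9550, 0.0270))       # IXIC: t near zero, p near 1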
             IXIC      DJI       BAC       CL        SPX       CAT       BA        IBM
C            0.9550    1.0012    0.7339    1.3406    0.6999    1.1850    0.7721    0.9346
H            0.5282    0.5065    0.5810    0.4352    0.6030    0.4741    0.5872    0.5548
H(C)         0.5284    0.5180    0.5902    0.4583    0.6021    0.4826    0.5778    0.5332
Difference   0.0002    0.0116    0.0092    0.0231    0.0009    0.0085    0.0094    0.0216
t-value      0.0082    0.4501    0.3055    0.9421    0.0287    0.3006    0.2958    0.6942
p-value      0.9948    0.7308    0.8113    0.5190    0.9817    0.8141    0.8169    0.6137

Table 8: Hypothesis test for H

It is easy to see in Table 8 that the values of H observed for the stock series are very similar to the values
we would expect to see in a random series, and the t-test (computed using the standard error of the estimate
of H in Table 6) confirms our intuition. These results indicate that the R/S procedure carried out in the
manner shown here does not show any evidence of persistence in the eight time series studied, which
include both daily and weekly returns.
Yet another way to deal with C is to simply get rid of it and test Equation 2 directly by setting C = 1. In the
regression model for y = ln(R/S) against x = ln(τ), we force the y-intercept to be zero. That yields the
regression equation y = b1*x. The b0 term is gone, leaving us only with H = b1. The results are shown in Table
9. None of the H-values shows a large departure from H = 0.5, in conformity with our findings in Table 8.
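A sketch of this forced-origin fit (for a regression through the origin, the least-squares slope reduces to sum(x*y)/sum(x*x)):

    import numpy as np

    def fit_H_through_origin(taus, rs):
        """Least squares for ln(R/S) = H*ln(tau) with intercept fixed at 0."""
        x = np.log(np.asarray(taus, dtype=float))
        y = np.log(np.asarray(rs, dtype=float))
        return float((x @ y) / (x @ x))

Applied to the 63 sub-sample R/S values of each series, this should yield the H values in Table 9.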
      IXIC     DJI      BAC      CL       SPX      CAT      BA       IBM
H     0.5192   0.5067   0.5205   0.4925   0.5332   0.5073   0.5366   0.5416
R²    0.8624   0.8639   0.8498   0.8332   0.8355   0.8170   0.8413   0.8390

Table 9: Regression with C = 1

5. CONCLUSIONS

It is proposed that the findings of persistence and chaotic behavior in stock returns by the application of
Rescaled Range Analysis are not real but artifacts of the methodology employed. The regression
procedure normally employed yields values for both H and C, and the interpretation of H in isolation,
without consideration of the corresponding value of C, can lead to spurious results. We also note that the
practice of averaging sub-sample R/S values and then using that average as if it were an observation can
introduce serious errors in the estimation of H and the standard error of H. Yet another weakness of
conventional R/S research is the use of high values of α in hypothesis tests and the absence of Bonferroni
corrections for multiple comparisons. Under these circumstances we see no evidence of non-randomness
in R/S research. We conclude that stock returns are a random walk and that the weak form of the Efficient
Market Hypothesis has not been proven wrong by the R/S methodology.

6. REFERENCES
Bachelier, L. (1900). Théorie de la Spéculation. Annales Scientifiques de l'École Normale Supérieure.
Bohdalova, M. et al. (2010). Markets, Information and their Fractal Analysis. Retrieved 2014, from g-casa.com:
http://www.g-casa.com/conferences/budapest/papers/Bohdalova.pdf
Bradley, L. (2010). Strange attractors. Retrieved 2014, from Space Telescope Science Institute:
http://www.stsci.edu/~lbradley/seminar/attractors.html
Camerer, C. (1989). Bubbles and Fads in Asset Prices. Journal of Economic Surveys, 3-41.
Chen, N. (1983). Economic forces and the stock market: testing the APT and alternate asset pricing theories.
Working paper.
Chen, N. (1983). Some empirical tests of the theory of arbitrage pricing. Journal of Finance, Dec, p414.
Diba, B. (1988). The Theory of Rational Bubbles in Stock Prices. The Economic Journal, Vol. 98, No. 392, p746-754.
Dybvig, P. et al. (1985). Yes, the APT is Testable. Journal of Finance.
Fama, E. (1965). The Behavior of Stock Market Prices. Journal of Business, 38, 34-105.
Fama, E. (1970). Efficient Capital Markets: A Review of Theory and Empirical Work. Journal of Finance, 25(2),
383-417.
Feder, J. (1988). Fractals. NY: Plenum Press.
Hurst, H. (1951). Long term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers,
Vol. 116, p770.
Jasic, T. (1998). Testing of nonlinearity and deterministic chaos in monthly Japanese stock market returns. Zagreb
International Review of Economics and Business, Vol. 1, No. 1, pp. 61-82.
Johnson, V. E. (2013, November). Revised Standards for Statistical Evidence. Retrieved December 2013, from
Proceedings of the National Academy of Sciences: http://www.pnas.org/content/110/48/19313.full
Kryzanowski, L., & Lal, S. (1994). Some tests of APT mispricing using mimicking portfolios. Financial Review,
29(2), p153.
Lorenz, E. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20(2), 130-141.
Mahalingam, G. (2012). Persistence and long range dependence in Indian stock market returns. Retrieved 2014,
from IJMBS: http://www.ijmbs.com/24/gayathri.pdf
Mandelbrot, B. (1963). The variation of certain speculative prices. Journal of Business, 36(4), 394-419.
Mandelbrot, B. (1982). The Fractal Geometry of Nature. NY: Freeman.
McKenzie, M. (1999). Non-periodic Australian stock market cycles. Retrieved 2014, from RMIT University:
http://mams.rmit.edu.au/ztghsoxhhjw1.pdf
Mundfrom, D. (2006). Bonferroni adjustments in tests for regression coefficients. Retrieved 2014, from University
of Northern Colorado: http://mlrv.ua.edu/2006/Mundfrom-etal-MLRV-3.pdf
Munshi, J. (2014). RS data archive. Retrieved 2014, from Dropbox:
https://www.dropbox.com/sh/bu1mdjtg9mvlmfa/AACLwFys7FMblJzJPGandpjfa
Osborne, M. (1959). Brownian motion in the stock market. Operations Research, Vol. 7, 145-173.
Pallikari, F. et al. (1999). A rescaled range analysis of random events. Journal of Scientific Exploration, Vol. 13,
No. 1, pp. 25-40.
Peters, E. (1991). A Chaotic Attractor for the S&P500. Financial Analysts Journal, Vol. 47, No. 2, p55.
Peters, E. (1991). Chaos and order in the capital markets: a new view of cycles, prices, and market volatility. NY:
John Wiley and Sons.
Peters, E. (1994). Fractal market analysis. NY: John Wiley and Sons.
Roll, R. (1977). A critique of the asset pricing theory's tests. Journal of Financial Economics, March, p129.
Roll, R. et al. (1980). An empirical investigation of the arbitrage pricing theory. Journal of Finance, Dec, p1073.
Ross, S. (1976). The arbitrage theory of capital pricing. Journal of Economic Theory, v13, p341.
Scheinkman, J. (1989). Nonlinear dynamics and stock returns. Journal of Business, Vol. 62, p311.
Schouten, J. (1994). Estimation of the dimension of a noisy attractor. Retrieved 2014:
http://repository.tudelft.nl/assets/uuid:bd5339a0-24e5-4362-b378-adc7c60cac99/aps_schouten_1994.pdf
Shanken, J. (1982). The Arbitrage Pricing Theory: Is it Testable? Journal of Finance, 1129-1140.
Sharpe, W. (1962). A simplified model for portfolio returns. Management Science, p277.
Sharpe, W. (1964). Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance,
v19, p425.
Takens, F. (1981). Detecting strange attractors in turbulence. In D. Rand & L.-S. Young (Eds.), Dynamical Systems
and Turbulence (pp. 366-381). Springer-Verlag.
University of South Carolina. (2014). Multicollinearity and variance inflation factors. Retrieved 2014, from
University of South Carolina: http://www.stat.sc.edu/~hansont/stat704/vif_704.pdf
Virginia Tech. (2014). Methods for multiple linear regression analysis. Retrieved 2014, from vt.edu:
http://scholar.lib.vt.edu/theses/available/etd-219182249741411/unrestricted/Apxd.pdf
White, H. (1988). Economic prediction using neural networks: the case of IBM daily stock returns. IEEE
International Conference on Neural Networks, Vol. 2, pp. 451-458.
Wikipedia. (2014). Autoregressive Model. Retrieved 2014, from Wikipedia:
http://en.wikipedia.org/wiki/Autoregressive_model
Wikipedia. (2014). Correlation dimension. Retrieved 2014, from Wikipedia:
http://en.wikipedia.org/wiki/Correlation_dimension
Wikipedia. (2014). Roll's Critique. Retrieved 2014, from Wikipedia: http://en.wikipedia.org/wiki/Roll's_critique
Zeeman, E. (1974). On the unstable behavior of stock exchanges. Journal of Mathematical Economics, 39-49.
Zeeman, E. (1976). Catastrophe Theory. Scientific American, 65-83.
APPENDIX

The Microsoft Excel file that computes R/S values is called "rescaled range analysis worksheet". Use paste values to
put a column of returns data of sample size 2486 starting in cell A6 of the worksheet called "RS Computation".
Then go to the worksheet called "Regression". All the R/S values are there along with the regression results.
