
Predicting Crashes in a Model of Evolving Networks

ANDREAS KRAUSE
University of Bath, School of Management, Bath BA2 7AY, United Kingdom

Received April 3, 2003; revised January 5, 2004; accepted January 5, 2004

We consider an evolving network of interacting species which exhibits self-organization. The system is characterized by repeated crashes, in which a large number of species become extinct, and subsequent recoveries. We investigate the macroscopic properties of this system before such crashes, concentrating on the variance of the relative population sizes of species and its evolution over time. A simple score function is constructed to determine the probability of a crash within a certain time interval, to be used as a predictor for crashes. © 2004 Wiley Periodicals, Inc. Complexity 9: 24–30, 2004

Key Words: crash; prediction; self-organization; evolving networks

Many time series are characterized by rare but repeated large and sudden changes (crashes) with subsequent slow reversals (recoveries). Examples of such behavior include earthquakes, sand piles, the extinction of species in biological evolution, and chemical processes, as well as social systems, where stock markets have been of particular interest. Empirical as well as theoretical contributions in most cases focus on the distribution of the size of such crashes and the waiting time between them. For practical applications, however, it would be of great importance to find properties of these systems that allow us to predict the occurrence of crashes. Recently it has been proposed that earthquakes are preceded by log-periodic oscillations [1, 2]. A similar result has been reported for crashes in stock markets ([3] and the literature mentioned therein), but the empirical evidence for these precursors in stock market crashes is thus far not conclusive. Furthermore, the origin of these oscillations remains unexplained, so that these precursors still lack a sound theoretical foundation for the mechanism causing the crash.

In this article we consider a model of evolving networks of interacting species, which has been found to be useful for a wide range of applications, e.g., in social systems [4–6], evolutionary models [7, 8], or chemical processes [9]. An investigation of the network structure of such a model [10] finds that large crashes are the consequence of a structural change in the network itself, called a core-shift. A further result is that the network structure also affects the robustness of the network to minor exogenous changes causing crashes. In deriving these results it was necessary to know the complete structure of the network at any point in time; we will call this the microscopic properties of a network.

Correspondence to: Andreas Krause, E-mail: mnsak@bath.ac.uk


C O M P L E X I T Y

2004 Wiley Periodicals, Inc., Vol. 9, No. 4

When investigating real networks, such precise knowledge is usually not obtainable. We therefore focus on aggregate outcomes of the network structure, i.e., its macroscopic properties, which are much more easily observed. After briefly outlining the model, we establish that there exists a close relationship between the microscopic and macroscopic properties of the network. We then use these macroscopic properties to determine the probability of a crash within a given time interval and, finally, develop a simple score function to predict the probability of a crash within a certain time period.

1. THE MODEL
We describe a network of species by a directed graph with $n$ nodes representing the species, which is characterized by its adjacency matrix $A_t = (a_t^{ij}) \geq 0$. The edges of the graph represent any links between species. If there is a link originating from node $j$ and directly terminating at node $i$, we set $a_t^{ij} = 1$, and $a_t^{ij} = 0$ otherwise. In a chemical process we can interpret this situation as $j$ being a catalyst for the production of $i$ or, for biological species, as $j$ increasing the chances of $i$ surviving. We exclude self-catalytic processes by setting $a_t^{ii} = 0$.

A fast dynamical variable is given by the relative population of the species, $x_t^i$, where $0 \leq x_t^i \leq 1$ and the $x_t^i$ sum up to unity. This variable evolves according to the following differential equation until it reaches its fixed point:

$$\dot{x}_t^i = \sum_{j=1}^{n} a_t^{ij} x_t^j - x_t^i \sum_{k=1}^{n} \sum_{j=1}^{n} a_t^{kj} x_t^j.$$

We only consider these fixed points in the further analysis. Rather than investigating the dynamics of the relative population, we are here concerned with the dynamics of the graph itself, which serves as the slow dynamic variable. After the fixed point is reached, the node $k$ with the least populated species becomes extinct and is replaced with a new species by reassigning $a_t^{ik}$ and $a_t^{ki}$ for all $i \neq k$ to unity with probability $q$ and to zero with probability $1 - q$, and by giving a small random population $0 < x_t^k < 1$ to the new species; the relative populations are reweighted to ensure they sum up to unity. It has to be noted that the fixed point of the differential equation does not depend on the initial conditions but only on the adjacency matrix $A_t$. After this change of the graph (graph update), the relative population evolves again according to the above equation. We define time as the number of graph updates. The initial graph is chosen such that for all $i \neq j$, $a_t^{ij} = 1$ with probability $q$ and $a_t^{ij} = 0$ with probability $1 - q$. This model was introduced in Jain and Krishna [17], based on the well-documented Bak-Sneppen model of evolution [11], and its properties have been investigated extensively in the literature ([12–16], among others).

The number of populated species, i.e., those with positive relative populations, shows repeated sudden crashes followed by slow recoveries. The properties of these crashes and recoveries are very well documented in a large number of the aforementioned articles and are the topic of investigation here. In this article we use a simulation with $n = 100$ species and $q = 0.0025$, which has become a standard parameterization of this model in the literature. A crash is here defined as a situation in which 60 or more species are depopulated in a single time period. This choice of 60 depopulated species is arbitrary and aims only to capture the need for a substantial fraction of species to become depopulated in order for an event to be classified as a crash. Simulations with other thresholds gave qualitatively similar results to those presented below. In all we simulate 4,000,000 time periods, which in our realization include 1450 crashes, or 0.03625% of all observations.

2. MACROSCOPIC VS. MICROSCOPIC PROPERTIES

In the simple model outlined above, the only observable variable that does not directly relate to the network structure and can be used to predict crashes is the distribution of the species' relative populations. The normalization to unity prevents the mean from being informative, so an obvious choice is to use the variance of the relative population, which we define as

$$\sigma_t^2 = \mathrm{Var}[x_t^i].$$
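The model and the variance defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the results: the function names, the explicit Euler integration with step size `0.5 / n`, the random tie-breaking among equally small populations, and the bound `1e-4` on the population of a new species are all choices made here for illustration.

```python
import random


def relax_to_fixed_point(a, x, tol=1e-12, max_steps=100_000):
    """Euler-integrate dx_i/dt = sum_j a[i][j] x_j - x_i * sum_{k,j} a[k][j] x_j."""
    n = len(x)
    dt = 0.5 / n  # assumed step size; keeps every Euler step non-negative
    for _ in range(max_steps):
        inflow = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(inflow)
        new = [x[i] + dt * (inflow[i] - x[i] * total) for i in range(n)]
        norm = sum(new)  # equals 1 up to rounding; renormalize anyway
        new = [v / norm for v in new]
        if max(abs(u - v) for u, v in zip(new, x)) < tol:
            return new
        x = new
    return x


def variance(x):
    """Variance of the relative populations across species."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / n


def graph_update(a, x, q, rng):
    """Replace the least populated species by a new one; returns its index."""
    n = len(x)
    least = min(x)
    # Assumption: ties between equally small populations are broken at random.
    k = rng.choice([i for i in range(n) if x[i] == least])
    for i in range(n):
        if i != k:
            a[i][k] = 1 if rng.random() < q else 0
            a[k][i] = 1 if rng.random() < q else 0
    x[k] = rng.uniform(0.0, 1e-4)  # assumed scale of the "small" population
    total = sum(x)
    for i in range(n):
        x[i] /= total  # reweight so that the populations sum to unity
    return k
```

For a hypothetical three-species graph in which species 0 and 1 catalyze each other and species 2 has no links, `relax_to_fixed_point` puts half of the population on each member of the cycle, and `graph_update` then replaces species 2.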

The variance can be observed without reference to the details of the network structure; hence, it is a macroscopic variable. The network structure is represented by its adjacency matrix, as mentioned above. The dynamic properties of the network depend at least in part on its largest eigenvalue, $\lambda_t^1$, as shown in Jain and Krishna [17]. If $\lambda_t^1 = 0$, there are only a few crashes and any changes in the number of populated species are random. For $\lambda_t^1 \geq 1$ the number of populated species either tends to grow, with the possibility of a few smaller setbacks in this process, or all species are populated. An eigenvalue $0 < \lambda_t^1 < 1$ cannot be observed in this model. A crash is often, but not always, associated with a significant change in the network structure, a core-shift, which manifests itself in a significant change of the largest eigenvalue of the adjacency matrix. In order to determine the eigenvalues of the adjacency matrix, we need to know the entire network structure; hence, the largest eigenvalue is a microscopic variable.

There is a strong relationship between the largest eigenvalue of the adjacency matrix and the variance of the relative population. Regressing this variance, $\sigma_t^2$, on the eigenvalue, with $D_t$ being a dummy variable that equals 1 for $\lambda_t^1 = 0$ and zero otherwise, we obtain:

$$\sigma_t^2 = 2.643 \times 10^{-3} + 7.357 \times 10^{-3} D_t - 1.795 \times 10^{-3} (1 - D_t) \lambda_t^1, \qquad R^2 = 0.940.$$

Other specifications are consistent with this result and show a similar goodness of fit, $R^2$. From the above result we can deduce that the variance of the relative population depends on the largest eigenvalue of the adjacency matrix. Hence macroscopic and microscopic properties are very close substitutes, and we can use the variance of the relative population as a good approximation to reflect the network structure.

FIGURE 1

Behavior of the median variance of the relative population and its growth rate (as defined in the text) before a crash.
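The regression specification used in this section, a constant, a dummy for a zero eigenvalue, and the eigenvalue itself when it is nonzero, can be illustrated with a small ordinary least squares sketch. The function names, the Gauss-Jordan solution of the normal equations, and the data in the usage note are assumptions made for illustration; none of them is taken from the simulation described in the text.

```python
def design_row(lam):
    """Regressors (1, D, (1 - D) * lambda), with D = 1 when the eigenvalue is zero."""
    d = 1.0 if lam == 0 else 0.0
    return [1.0, d, (1.0 - d) * lam]


def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(rows, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]


def r_squared(rows, y, beta):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

Given a list of eigenvalues `lams` and the matching variances `y`, calling `ols([design_row(l) for l in lams], y)` returns the three coefficients in the order constant, dummy, eigenvalue. The data need both zero and at least two distinct nonzero eigenvalues for the coefficients to be identified.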

3. VARIANCE AND THE OCCURRENCE OF CRASHES


When investigating the behavior of the variance before a crash, we see from Figure 1 that it decreases until approximately 150 time periods before a crash and then increases at an accelerating rate until the crash. Consequently we observe that the growth rate of the variance, measured relative to 100 time periods earlier in order to eliminate any short-term fluctuations and defined as

$$\gamma_t = \frac{\sigma_t^2 - \sigma_{t-100}^2}{\sigma_{t-100}^2},$$

becomes positive shortly before the crash and then increases ever faster. These observations indicate that we can use the variance and its growth rate as a predictor for crashes. However, it has to be noted that the confidence intervals are very wide.

FIGURE 2

Probability of a crash as a function of the variance of the relative population (horizontal axis) and its growth rate (vertical axis). Note: The plots use a linear interpolation between data points that are located at the crossing points of the grid.

On the basis of our simulations, we investigated the fraction of observations for which crashes occur in the time periods $[t + T; t + T + 100]$, where $t$ denotes the current time period, given that the variance $\sigma_t^2 \in I_k$ and its growth rate $\gamma_t \in J_l$. The intervals are determined as follows: the variance is divided into 18 equally wide intervals $I_k$ between 0 and $2.5 \times 10^{-4}$ and a final interval for all values exceeding this upper limit. The growth rate is divided into 17 equally wide intervals $J_l$ between 1 and 3.5, as well as one interval for all values below 1 and one for all those above 3.5. We choose to divide the variance and its growth rate into 19 discrete sections each as a compromise between a sufficient sample size in each section and the accuracy of the values. We determine for each $k, l$ the fraction of crashes for this parameter constellation over all 4,000,000 time periods and denote this by $p_{kl,T}$. Given the size of the simulation, we use this fraction as an estimator for the probability with which a crash occurs in this time period, conditional upon the variance and its growth rate being in the appropriate intervals.

The ex ante probability of a crash in any of the 100 time periods is 0.036, which corresponds to 100 times the per-period crash frequency of 0.03625%. From Figure 2 we see that the predictability of crashes is increased significantly with our variables. A large fraction of crashes is associated with a high growth rate; up to a forecasting period of about 150 time periods the variance associated with the maximum probability decreases, as mentioned before, and thereafter it increases again. From inspection of these plots it becomes apparent that the dependence of $p_{kl,T}$ on the variance and its growth rate is very complex. It is beyond the scope of this article to give a complete characterization of this structure; however, it should be noted that in many constellations the number of observations is relatively small.
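The estimation of the conditional crash frequencies described above can be sketched as follows. This is an illustration under stated assumptions rather than the original procedure: the function names are hypothetical, the bin boundaries are passed in as `(low, high, n_inner)` tuples instead of being hard-coded, a crash window of `window` consecutive periods starting at `t + T` is assumed, and observations whose lagged variance is zero are skipped.

```python
from collections import defaultdict


def growth_rate(var, t, lag=100):
    """Growth rate of the variance relative to `lag` periods earlier."""
    return (var[t] - var[t - lag]) / var[t - lag]


def bin_index(value, low, high, n_inner):
    """0 below `low`, 1..n_inner inside [low, high), n_inner + 1 at or above `high`."""
    if value < low:
        return 0
    if value >= high:
        return n_inner + 1
    width = (high - low) / n_inner
    return min(int((value - low) / width), n_inner - 1) + 1


def crash_fractions(var, crash, T, var_bins, growth_bins, lag=100, window=100):
    """Fraction of observations per (variance bin, growth bin) followed by a crash.

    var   : variance of the relative population, one value per time period
    crash : booleans of the same length, True where a crash occurs
    T     : distance between the observation and the start of the crash window
    """
    seen = defaultdict(int)
    hits = defaultdict(int)
    for t in range(lag, len(var) - T - window + 1):
        if var[t - lag] == 0:
            continue  # growth rate undefined; assumption: drop the observation
        key = (bin_index(var[t], *var_bins),
               bin_index(growth_rate(var, t, lag), *growth_bins))
        seen[key] += 1
        hits[key] += any(crash[t + T:t + T + window])
    return {key: hits[key] / seen[key] for key in seen}
```

With the binning described in the text, `var_bins` would be `(0.0, 2.5e-4, 18)`; the index 0 is then never used for the variance, since it cannot be negative.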
As Figure 3 shows, most observations are concentrated in the lower left corner of the plot. Only future research can determine whether the small number of observations has caused this result or whether there are indeed more complex dependencies to be found. It is also apparent that combining the growth rate and the variance increases the precision of the forecasts, as can be seen from a comparison with Figure 4, which uses only one of these variables at a time to predict crashes.

The analysis shows that for certain variances and growth rates the probability of a crash is increased. So far, however, we have only investigated a single time period in our analysis. Additional gains in predicting a crash should arise from observing the development of these variables over time; in particular, the forecast errors should decrease. In the next section we therefore propose a score function to predict the occurrence of crashes.

FIGURE 3

Relative distribution of observations for the variance and its growth rate.

4. THE SCORE FUNCTION


We here propose to use a weighted average of the $p_{kl,T}$ at distances of 50 time periods, where $T = 500, 450, \ldots, 50$. Let the variance $\sigma_{t-T+50}^2 \in I_k$ and its growth rate $\gamma_{t-T+50} \in J_l$; then we know from the previous section that the probability of a crash in $[t + 50; t + 150]$ is $p_{kl,T}$. For notational simplicity, let us denote this probability by $\pi_t^T$. This allows us to predict whether a crash happens in the time periods $[t + 50; t + 150]$, i.e., in an interval of 100 time periods starting in 50 time periods. Letting $\tau = T/50$, we can then define a score function as

$$S_t = \sum_{\tau=1}^{10} \omega_\tau \pi_t^{50\tau}.$$

Five different weights are investigated:

1. Equal weights: $\omega_\tau = 0.1$,
2. Linear weights: $\omega_\tau = \tau / \sum_{s=1}^{10} s$,
3. Quadratic weights: $\omega_\tau = \tau^2 / \sum_{s=1}^{10} s^2$,
4. Logarithmic weights: $\omega_\tau = \ln(\tau + 1) / \ln(11!)$,
5. Decreasing weights: $\omega_\tau = (1/\tau) / \sum_{s=1}^{10} (1/s)$.
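The five weighting schemes and the resulting score can be sketched as follows. The names `RAW_WEIGHTS`, `weights`, and `score` are hypothetical, and the sketch assumes that every scheme is normalized so that the ten weights sum to one; this reading is consistent with the logarithmic case, where the sum of $\ln(\tau + 1)$ over $\tau = 1, \ldots, 10$ equals $\ln(11!)$.

```python
import math

# Unnormalized weight for each scheme as a function of tau = T / 50.
RAW_WEIGHTS = {
    "equal": lambda tau: 1.0,
    "linear": lambda tau: float(tau),
    "quadratic": lambda tau: float(tau ** 2),
    "log": lambda tau: math.log(tau + 1),
    "decreasing": lambda tau: 1.0 / tau,
}


def weights(scheme, n=10):
    """Weights for tau = 1..n, normalized to sum to one (assumed normalization)."""
    raw = [RAW_WEIGHTS[scheme](tau) for tau in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def score(probs, scheme="equal"):
    """Weighted average of crash probabilities; probs[tau - 1] belongs to T = 50 * tau."""
    w = weights(scheme, len(probs))
    return sum(wi * pi for wi, pi in zip(w, probs))
```

Because the weights sum to one, a constant sequence of probabilities yields that same constant as the score under every scheme, so differences between the schemes arise only when the estimated probabilities change over the ten look-back points.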

On the basis of these score functions, we determine the fraction of crashes in the time period $[t + 50; t + 150]$, which is shown in Figure 5(a). There is a positive relationship between the score and the fraction of crashes. However, for $S_t > 0.1$ the results for the various score functions diverge and the relationship becomes less clear. This observation can be attributed to the small number of events with a high score (see Figure 6), which gives rise to wide error bands.


FIGURE 4

Probability of a crash depending on a single explanatory variable for different times before a crash.

We also estimated the probability of observing a crash in the time period $[t + 50; t + 150]$ using a logit transformation:

$$\mathrm{Prob}_t(\mathrm{Crash}) = \frac{e^{\beta_0 + \beta_1 S_t}}{1 + e^{\beta_0 + \beta_1 S_t}},$$

where the sign of $\beta_1$ should be positive. The results of these estimations are shown in Table 1 and visualized in Figure 5(b). It is apparent that the most recent observations should receive higher weights, as this increases the goodness of fit, but the improvement from equal to quadratic weights is small enough that equal weights remain justifiable as an approximation, especially in light of the few observations with a high score. Using score functions such as those developed above, we get a relatively reliable predictor for crashes in our model, which substantially improves on the ex ante probability of a crash of 0.036.

FIGURE 5

The score function and the probability of a crash.

TABLE 1

OLS Parameter Estimates of the Logit Model Determining the Probability of a Crash

             Equal     Linear    Quadratic   Log       Decreasing
$\beta_0$    −5.631    −5.392    −5.223      −5.502    −5.525
$\beta_1$    46.76     40.04     35.90       43.00     48.46
$R^2$        0.141     0.149     0.151       0.146     0.109

FIGURE 6

Distribution of the score for equal weights.

5. CONCLUSIONS

We have shown first that the variance of the relative population and the eigenvalue of the adjacency matrix are close substitutes; hence, macroscopic and microscopic properties of the network are interchangeable. We then investigated the relationship between the variance and the probability of a crash occurring in a given time interval and developed a score function to predict these crashes. It is observed that a short period before a crash the variance is minimal; hence, the homogeneity of species with respect to their relative population is high. This finding is consistent with the observation that in financial markets, before large changes in the market, most investors agree on the optimal investment strategy, i.e., they show a great degree of homogeneity in their opinions and are thus easily disturbed by minor events causing a crash. Initially only a few dissidents warn about the overvaluation of stocks and a possible crash, but they are mostly ignored. Their presence, however, increases the heterogeneity until the change is triggered by a minor change in the opinion of a single or a few investors.

How these results fit empirical data in the stock market, beyond anecdotal evidence, and in other disciplines, e.g., the extinction of species in the evolutionary process or chemical processes, has to be shown in future research. It would be of further interest to investigate the behavior of higher moments of the distribution of the relative population, e.g., skewness or kurtosis, to develop more precise predictors of crashes.

A note of caution should be made at this point regarding the application to the social sciences. Knowledge of a crash or any other uncertain future event, and the subsequent actions based on this knowledge, can easily change the outcome. Hence with knowledge of a coming crash we may not observe a crash at all; it may be postponed or occur earlier. This result arises from the fact that people will rationally exploit their knowledge and so change the course of events. The prediction of crashes in social systems is thus much more complex than the simple model provided in this article suggests and has to take into account these additional aspects.

REFERENCES
1. Huang, Y.; Saleur, H.; Sornette, D. Re-examination of log-periodicity observed in the seismic precursors of the 1989 Loma Prieta earthquake. J Geophys Res 2000, 105(B12), 28111–28123.
2. Sornette, D.; Sammis, C. Complex critical exponents from renormalization group theory of earthquakes: Implications for earthquake predictions. J Phys 1995, I, 607–619.
3. Zhou, W.-X.; Sornette, D. Non-Parametric Analyses of Log-Periodic Precursors to Financial Crashes. cond-mat/0205531, 2002.
4. Krause, A. Herding Behavior of Financial Analysts: A Model of Self-Organized Criticality. The Complex Dynamics of Economic Interaction: Essays in Economics and Econophysics; Gallegati, M.; Kirman, A.P.; Marsili, M., Eds.; Springer-Verlag: Hamburg, 2004.
5. Wellman, B.; Berkowitz, S.D., Eds. Social Structures: A Network Approach. Cambridge University Press: Cambridge, UK, 1988.
6. Ormerod, P.; Johns, H. System Fitness and the Extinction Patterns of Firms under Pure Economic Competition. cond-mat/0110052, 2001.
7. Newman, M.E.J.; Palmer, R.G. Models of Extinction: A Review. adap-org/9908002, 1999.
8. Drossel, B. Biological evolution and statistical physics. Adv Phys 2001, 50, 209–295.
9. Lodish, H.; Baltimore, D.; Berk, A.; Zipursky, S.L.; Matsudaira, P.; Darnell, J.E. Molecular Cell Biology; Scientific American Books: New York, 1995.
10. Jain, S.; Krishna, S. Crashes, recoveries, and core-shifts in a model of evolving networks. Phys Rev E 2001, 65(2), 026103.
11. Bak, P.; Sneppen, K. Punctuated equilibrium and criticality in a simple model of evolution. Phys Rev Lett 1993, 71(24), 4083–4086.
12. Kulkarni, R.V.; Almaas, E.; Stroud, D. Exact results and scaling properties of small-world networks. Phys Rev E 2000, 61(4), 4268–4271.
13. Moreno, Y.; Vazquez, A. The Bak-Sneppen model on scale-free networks. Europhys Lett 2002, 57(5), 765–771.
14. Jain, S.; Krishna, S. Large extinctions in an evolutionary model: The role of innovation and keystone species. Proc Natl Acad Sci USA 2002, 99(4), 2055–2060.
15. Jain, S.; Krishna, S. A model of the emergence of cooperation, interdependence, and structure in evolving networks. Proc Natl Acad Sci USA 2001, 98(2), 543–547.
16. Jain, S.; Krishna, S. Emergence and growth of complex networks in adaptive systems. Comput Phys Commun 1999, 121-122, 116–121.
17. Jain, S.; Krishna, S. Autocatalytic sets and the growth of complexity in an evolutionary model. Phys Rev Lett 1998, 81(25), 5684–5687.

ACKNOWLEDGMENTS
The author acknowledges the beneficial remarks from the anonymous reviewers. All remaining errors are the sole responsibility of the author.

