Introduction
"Computer modelling will even provide the tools with which to perform in silico clinical trials, based on whole organ body models that test for everything." (Pharma 2005, PricewaterhouseCoopers report, 1998)
When Pharma 2005 was published 3 years ago, that prediction must have seemed simply a dream. Yet today there is a Grand Challenge, the Physiome Project (see www.physiome.org), launched by the International Union of Physiological Sciences (IUPS) (see www.IUPS.org) to facilitate the international academic funding and collaborations required. And there are biotechnology companies already developing the technologies required to achieve it (see the websites of Entelos (www.entelos.com), Pharsight (www.pharsight.com), Physiome Sciences (www.physiome.com) and Simulations Plus (www.simulationsplus.com)).

What has changed? Here I will try to explain why biological computation is becoming a hot topic. Three major national funding agencies, the NIH (National Institutes of Health), the MRC (Medical Research Council) and INSERM (Institut National de la Santé et de la Recherche Médicale), held strategy meetings on it during 2001, while the first International Conference on Computational Biology also took place in 2001 (Carson et al., 2001). I will discuss what the problems are, and why computational biology will have a profound and pervasive impact on biology in the twenty-first century.

Computers are used today in all aspects of biology. It would be impossible to review the whole of this range, which involves analysis of experimental data, development and management of databases (including, of course, gene and protein databases), imaging, statistical analysis; the list is almost endless. There is hardly anything in biological science today that has not been computerized. I will not deal with any of these subjects. This article focuses rather on computer modelling of biological processes, not the analysis, storage and systematization of data. At the end of this article I will, though, discuss how systematization of data and modelling of processes must interact. Even within this topic, I will be fairly restrictive. I will not deal with modelling of biological molecules, protein structure, etc.
These are important applications of computational modelling, but they can be viewed as a branch of theoretical chemistry. Biology begins when biological function emerges, and this is above the level of individual molecules. It lies at the level at which proteins interact to produce phenomena like the heart's pacemaker activity, the secretion of insulin (Chay, 1997), the functioning of the immune system, etc. In each of these functions large numbers of genes interact in complex ways. Biological computation seeks to understand and predict this complexity by modelling it (Platt, 1964; Bray, 1995; Fell, 1996; Bassingthwaighte, 1997; Kohl et al., 2000; Hunter et al., 2001; Kitano, 2002). The molecular biologist Sydney Brenner gave the fundamental reason for biological modelling when he wrote, 'Genes can only specify the properties of the proteins they code for, and any integrative properties of the system must be computed by their interactions' (Brenner, 1998). Brenner meant not only that biological systems themselves compute these interactions but also that, in order to understand them, we need to compute them, and he concluded that 'this provides a framework for analysis by simulation'.
ENCYCLOPEDIA OF LIFE SCIENCES / & 2002 Macmillan Publishers Ltd, Nature Publishing Group / www.els.net
Biological Computation
about possible results. Most importantly, computer models enable the researcher to fail under controlled conditions. Trying ideas out in a virtual environment, where the possible impact of different conditions can be tested systematically, allows the researcher to anticipate problems and to select the best overall design principles in advance of costly real-life studies.

Biological function emerges at many different levels. Integrative interaction between many proteins may be the basis of a metabolic pathway (Schilling and Palsson, 1998; Schilling et al., 1999; Edwards and Palsson, 1999, 2001), a signalling cascade (see www.afcs.org and Shimizu et al., 2000), control of the cell boundaries (membrane transporters of many different kinds), cellular functions like secretion and electrical activity, or multicellular functions as in kidney tubules or cardiac conducting pathways, through the various levels to whole organs and systems. Ultimately, some even conceive the possibility of constructing a virtual human. A partial virtual heart already exists (Nielsen et al., 1991; Winslow et al., 1991, 1993; Hunter, 1995; Costa et al., 1996; Hunter et al., 1997; Holden and Panfilov, 1997; Chen et al., 1998; Smith et al., 2000, 2001). Other organs and systems are beginning to be represented in the same way (Howatson et al., 2000). Connecting them together in appropriate anatomical and biochemical frameworks is indeed not inconceivable. A partial virtual human (partial because, at least for the foreseeable future, probably without much of a brain!) may seem a long way off, but such a project, once it has gathered momentum, would use vast arrays of biological information, much of which already exists. The greatest challenge will lie in connecting it all together.
The scale and necessity of this task can be judged by the fact that the Human Genome Project has so far located around 40 000 genes (but see Hollon, 2001 for the debate on this figure), yet there may be as many as 250 000 different proteins. Another striking statistic is that the mouse genome has the same number of genes, with 96.5% similarity to the human genome! The task of working out protein–protein interactions and the role of their environment is, therefore, of crucial importance. However much experimental data we may accumulate on these interactions, computation must be an essential part of the task of unravelling such complexity.
If we knew the positions and velocities of all the components of a system, then in principle we should be able to foresee all its subsequent states (though Heisenberg's uncertainty principle reminds us that there are strict limits to this, at least on a small scale). An extreme reductionist view is that we should begin with the properties of the individual molecules in a biological system and then exhaustively compute their interactions (see Bock and Goode, 1998 for a lively debate on reductionism in biology). There is, of course, a computer that does this. It is the body itself! This is the first sense of Sydney Brenner's use of the term biological computation. But there are several problems with trying to copy nature so completely in our own simulations, to create a complete computational clone.

The first is that, even if we were able to succeed, the result would not be a model in the strict sense of the term. Models are always, necessarily, partial representations of reality. Their aim is understanding and explanation in a simplified representation, and this must consist partly in determining what features of a system are necessary and sufficient to understand it. Thus, we could try to understand pacemaker function in the heart by computing the interactions of the few thousand protein types that must be involved in making any cardiac cell. In fact, however, this is not necessary. We can understand most of what we wish to know about pacemaker activity by computing the interactions of only around 10–20 protein types (Garny et al., 2002). That is the power of a model. It identifies what is essential. A complete representation, a clone of reality, would not do that. It would leave us just as wise, or as ignorant, as before.

The second problem is that of computability. Such a complete representation of a biological system, from the bottom upwards, would require more computing power than we are likely to achieve in the foreseeable future.
It is estimated that to compute the folding of a single protein from the first principles of theoretical chemistry will require a year's work on Blue Gene, the one-million-processor supercomputer being built by IBM (see www.research.ibm.com/bluegene/). Yet there may be more than 100 000 proteins in the body. Even before we start computing protein–protein interactions, we would already have many years to wait! If we are really talking about simulation, not re-creation, then it is probable that we will not have enough components to build the monstrous computer that would be required to simulate in a complete bottom-up way.

Third, and most fundamentally, bottom-up is not the only rule that nature itself follows. At all levels there exist feedback loops from higher-level function to lower-level activity. Gene expression is not a static property. It is dynamically related to the environment in which genes operate. Genes are as much prisoners of the successful physiological systems that carry them (Noble, 2002) as selfish determinants of what that system will do.
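The tractability point can be made concrete. The sketch below is illustrative only and is not taken from this article: it uses the classic squid-axon parameter values of Hodgkin and Huxley (1952, cited in the references), with forward-Euler integration chosen for brevity. The point it demonstrates is that an action potential can be reconstructed from just three membrane current types, rather than from the thousands of molecular species present in the cell:

```python
# Sketch: reconstructing an action potential from only three membrane
# current types, after Hodgkin and Huxley (1952).  Parameters are the
# classic squid-axon values; forward-Euler integration for simplicity.
import math

C_M = 1.0                              # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3      # peak conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387  # reversal potentials, mV

def rates(v):
    """Voltage-dependent opening/closing rates for the m, h and n gates."""
    a_m = 0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0))
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n

def simulate(i_stim=10.0, t_end=50.0, dt=0.01):
    """Integrate the membrane equation dV/dt = (I_stim - I_ion) / C_m."""
    v, m, h, n = -65.0, 0.05, 0.6, 0.32   # approximate resting state
    trace = []
    for _ in range(round(t_end / dt)):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        i_na = G_NA * m**3 * h * (v - E_NA)   # fast sodium current
        i_k = G_K * n**4 * (v - E_K)          # delayed-rectifier potassium
        i_l = G_L * (v - E_L)                 # leak current
        v += dt * (i_stim - i_na - i_k - i_l) / C_M
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        trace.append(v)
    return trace

v_trace = simulate()
print(f"peak {max(v_trace):.1f} mV, trough {min(v_trace):.1f} mV")
```

With a sustained stimulus the model fires repetitively, the spikes overshooting 0 mV. A handful of protein (channel) descriptions, not thousands, suffices to reproduce the functional behaviour; that is the sense in which a model identifies what is essential.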
Pure bottom-up, therefore, is neither feasible, nor would it necessarily be explanatory. What about the opposite of bottom-up: top-down? The idea here is that, since any model must extract from reality the essential features necessary for an explanation, we might as well start with the function we want to explain and burrow down to the underlying mechanisms. This, at least, ensures that we will focus on the biological phenomena we are trying to explain. But it also runs the risk of being either empty (reformulating interactions at a higher level does not guarantee that we will hit upon the correct lower-level mechanisms) or seriously incomplete. Genes (and proteins), after all, do not come with their own functional labels. We have to work them out, and in most cases the functionality will be both multiple and extensively crosslinked. The discovery that there may be many fewer genes than proteins reinforces this point. Top-down also carries a further risk, which is that we may be misled by the prejudices that are inevitably built into a preconceived idea of functionality. A top-down approach must operate within the boundaries predetermined by the system we think is complete. Recent biological research has thrown up countless examples of underlying processes that we did not know were there, and certainly could not have predicted from a purely functional-level description. Compare any textbook of cell biology with its counterpart of 20 years ago to see the point.

Could a combination of top-down and bottom-up approaches work, a kind of outside-in? This is the way we often think about the relationships between molecular biology and systems biology. How else can we conceivably span the whole range from genes to functions? So there cannot be much doubt that this dialectical way of thinking will inform our efforts in computational biology.
Even the most ardent reductionist is not blind to functional descriptions of biological processes, even if he or she might dream of replacing them, while no systems biologist today can be fully satisfied with explanations that do not connect to the underlying mechanisms and even to the genome level (e.g. Clancy and Rudy, 1999). Could we therefore imagine a kind of double-tunnelling, a bit like the construction of the Eurotunnel linking Britain and France, where the teams started from both sides and met in the middle? The problem with this approach is that, unlike tunnelling, there is not a single predetermined meeting point, or even just one way of describing the meeting points. Because genes have multiple functions, and functions depend on many genes, and because there are feedbacks between the levels (gene expression depends itself on functionality), there will be many ways of dividing the system up at the intermediate levels. Modellers will choose between these according to their own aims. Someone interested in reconstructing electrical activity will describe proteins primarily in terms of their ionic transport properties, whereas someone interested in fluxes and energy expenditure may even regard the electrical properties as a side effect. There will also be different ways of conflating individual protein functions at the higher levels. Electrophysiologists are already familiar with this problem. Reduced systems of equations (van der Pol and van der Mark, 1928; Nagumo and Sato, 1972) can only be said to represent protein activity by combining the effects of several proteins together in a single computed parameter. There are many ways in which this can be done. There will therefore be a multiplicity of parallel models at the intermediate levels, all of which will have their strengths and weaknesses.

This raises the important question of mapping. One way to analyse this heterogeneity of models will be to map models onto each other, something that is rarely done, but which can in fact be highly instructive. One of the major controversies in cardiac modelling was resolved by this approach (DiFrancesco and Noble, 1982). We need to think about the ontology of models. These issues were recently debated in a Novartis Foundation symposium on biological complexity (Goode, 2001). The outcome was the proposal that, if the various outside-in options (including bottom-up and top-down) have so many difficulties, there was only one remaining alternative: modelling had to be middle-out, meaning that computational models inevitably focus on a level between genes and function at which they are most detailed. This might be a biochemical pathway, a cell signalling mechanism, the whole cell, a tissue, an organ, a system, etc. The level of focus can be characterized as that at which the relevant information on which the modelling is based is most detailed. This inevitably leads to the concept of a hierarchy of models, and one of the major challenges will be how to link them together.
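The reduced-equation approach cited above can be sketched with the van der Pol oscillator, which van der Pol and van der Mark (1928) themselves proposed as a relaxation-oscillation model of the heartbeat. In this toy implementation (illustrative only: the value of the damping parameter mu and the integration scheme are my own choices, not taken from the original paper), a single computed parameter stands in for the combined effects of many underlying proteins:

```python
# Sketch: the van der Pol relaxation oscillator, used by van der Pol and
# van der Mark (1928) as a reduced model of the heartbeat.  The single
# nonlinear damping parameter mu conflates the effects of many proteins;
# mu = 5.0 is an illustrative choice, not a fitted value.

def van_der_pol(mu=5.0, t_end=100.0, dt=0.001):
    """Integrate x'' - mu*(1 - x^2)*x' + x = 0 by forward Euler."""
    x, y = 2.0, 0.0          # y = dx/dt; start near the limit cycle
    xs = []
    for _ in range(round(t_end / dt)):
        dx = y
        dy = mu * (1.0 - x * x) * y - x
        x += dt * dx
        y += dt * dy
        xs.append(x)
    return xs

xs = van_der_pol()
# The trajectory settles onto a limit cycle: slow build-up, fast
# discharge, repeating - qualitatively like a pacemaker potential.
print(f"range of x: [{min(xs):.2f}, {max(xs):.2f}]")
```

No individual term in these equations maps onto a single protein; the model only "represents protein activity" in the conflated sense the text describes, which is precisely why mapping such reduced models onto detailed ones is instructive.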
Role of Models
Analogy with the role of modelling in the physical sciences is helpful. In particular, it is important to recall that here also there has been a long period of iterative interaction between computation and observation. No one today builds a new aircraft, for example, without first modelling designs and functions on computer systems. Once, this was done laboriously with wind tunnels. Models of the universe have gone through many stages of development, as astronomy has provided more and more data. In turn, models have suggested new aims for observation. Some of these developments, particularly the iterative nature of the interaction between theory and experiment (see, e.g., Noble and Rudy, 2001), have obvious parallels in the biological sciences. What makes the situation look so different is that the sheer complexity of biological systems, and the recent discovery of much of the relevant data during the molecular biological revolution, means that we are at an earlier stage of development in biology. To some extent,
the differences will become less obvious with time as the field develops. But there will also be some respects in which biological computation must be fundamentally different. Ideally, biological models need to span at least three levels.
Computational work in the physical sciences rests on large bodies of theory that call themselves theoretical physics and theoretical chemistry. Is there, then, a theoretical biology which can act as the base for computational biology?

Note, first, that there is an important difference to be drawn between computational biology and mathematical biology. The latter is a branch of mathematics applied to biology. Its role is to discover explanations, not just numerical solutions. There is a good tradition in many aspects of mathematical biology (Murray, 1983; Keener and Sneyd, 1998), some of which does form a base for computational biology. In the theory of excitable cells there is a substantial literature (Jack et al., 1975; Hunter et al., 1975; DiFrancesco and Noble, 1980; Hinch, 2002) on analytical solutions to problems in excitation theory. Unfortunately, the sheer power of modern computing has sometimes overshadowed the more purely mathematical approach. So some areas of biological computation are already well based in mathematical biology.

Does the existence of mathematical biology necessarily entail that something called theoretical biology should also exist? At this stage, I believe the answer to that question must be no (see also Bray, 2001). If there is to be a theoretical biology, it will have to emerge from the integration of many pieces of the reconstruction of living systems. This question touches on the very nature of biology as a science. Of course, there are theories in biology, many of them! There is also a central theory, that of evolution. So why do I conclude that theoretical biology does not yet exist? The first reason is that much of the increasingly spectacular modelling of biological processes does not use anything called theoretical biology; it uses the basic theories of chemistry and physics.
Reconstructions of the electrical activity of nerves, muscles and organs like the heart, for example, use the well-known laws of electricity and conservation of mass and charge, while a reconstruction of the heart's blood flow uses the equations of fluid dynamics. There are biological data in such models, but there is no sense in which the equations used can be said to come from a theoretical biology. The second reason is that we still have not resolved some fundamental questions within the central theory of biology, that of evolution. Has that process been essentially one of chance, dependent on contingent events such as weather change, meteorite impacts, etc., with no overall trend? Or are there features that would inevitably emerge in any such evolutionary process? Part of our trouble is that we have only one experiment to judge by, and even if we discover life on Mars or elsewhere, it may well not be a fully independent evolutionary event. Yet this question (of contingency or necessity) must be fundamental to any emergence of a subject that could call itself theoretical biology. If we are dealing mostly with the consequences of contingent events, then a level of massive and exhaustive description will always and necessarily
remain at the heart of biological computation. Whereas, if there were to be some general principles (what some would call the Logic of Life (Boyd and Noble, 1993)), then these, in their quantitative formulations, would eventually become the basis of a fully theoretical biology. One of the big future challenges for computational biology will be to try to identify these first principles. We are in a similar situation to that of the theoretical cosmologists. So far as we know, they also are dealing with only one actual event, the evolution of the universe we know. But computation gives us the possibility of exploring many other different possibilities. I believe that will one day be true of biological computation. Can we foresee the ways in which computational biology might eventually identify the principles of a theoretical biology? This is where I can see a role for what I called earlier the ontology of models. I am not a good enough mathematician to see clearly what might be done here, but I have the intuition that, by mapping alternative models and exploring the ontologies of models at different levels, we may begin to reveal at least one thread of the theoretical rules for biological processes.
not fail if we could with reasonable certainty identify these small groups of individuals (and they will be different for different drugs, of course) and allow the great majority who could benefit to do so.
2000). Here I will simply note that biological computation is not just of theoretical interest. The reason that biotechnology companies exist that are built on in silico simulation technology (see the introductory remarks to this article) is that the potential for application to the processes of drug discovery and device development is immense. Virtually all pharmaceutical companies now have, or are developing, in-house efforts in this direction, sometimes in collaboration with biotechnology companies and with university teams. Leading medical device companies are also involved in modelling.
Conclusion
Biological computation is at an early stage of development, one at which it relies heavily on well-established principles from the physical sciences, combined with huge and rapidly growing databases of biological information. I predict that the field will grow rapidly as the techniques become as easy to use, and as widely available, as powerful computers are today. Within probably only 10 years or so, biology will become sufficiently quantitative that it will be as unthinkable not to formulate hypotheses in a mathematical form as it is now to be purely qualitative in our descriptions of biological data. The sheer complexity of the functional level of biology will force us to think in terms of models that address this complexity in a way that the unaided brain cannot possibly do. Conceivably, but not necessarily, out of this will eventually emerge the principles of a theoretical biology. At present it is impossible to tell whether such a discipline will emerge, or what shape it would take. Either way, though, the impact of biological computation will be extensive. Integrative biology will increasingly depend on it, and if a theoretical biology does eventually emerge, biology will never be the same again. Some have surmised that with the sequencing of the genome we have seen the last major revolution in biology. I believe the contrary: that immense achievement is only the beginning of an even bigger revolution, not just to describe, but to understand.
Acknowledgements
I would like to acknowledge the support of the British Heart Foundation, MRC and The Wellcome Trust.
References
Bassingthwaighte JB (1997) Design and strategy for the Cardionome Project. Advances in Experimental Medicine and Biology 430: 325–339.
Bock GR and Goode JA (eds) (1998) The Limits of Reductionism in Modern Biology. Novartis Foundation Symposium, no. 213. Chichester: John Wiley.
Boyd CAR and Noble D (eds) (1993) The Logic of Life. Oxford: Oxford University Press.
Bray D (1995) Protein molecules as computational elements in living cells. Nature 376: 307–312.
Bray D (2001) Reasoning for results. Nature 412: 863.
Brenner S (1998) Biological computation. In: Bock GR and Goode JA (eds) The Limits of Reductionism in Biology. Novartis Foundation Symposium, no. 213, pp. 106–116. Chichester: John Wiley.
Carson JH, Cowan A and Loew LM (2001) Computational cell biologists snowed in at Cranwell. Trends in Cell Biology 101 (in press).
Chay TR (1997) Effects of extracellular calcium on electrical bursting and intracellular and luminal calcium oscillations in insulin secreting pancreatic beta cells. Biophysical Journal 73: 1673–1688.
Chen FT, Vaughan-Jones RD, Clarke K and Noble D (1998) Modelling myocardial ischaemia and reperfusion. Progress in Biophysics and Molecular Biology 69: 515–537.
Clancy CE and Rudy Y (1999) Linking a genetic defect to its cellular phenotype in a cardiac arrhythmia. Nature 400: 566–569.
Costa KD, Hunter PJ, Wayne JS et al. (1996) A three-dimensional finite element method for large elastic deformations of ventricular myocardium. 2. Prolate spheroidal coordinates. Journal of Biomechanical Engineering 118: 464–472.
DiFrancesco D and Noble D (1980) The time course of potassium current following potassium accumulation in frog atrium: analytical solutions using a linear approximation. Journal of Physiology 306: 151–173.
DiFrancesco D and Noble D (1982) Implications of the re-interpretation of iK2 for the modelling of the electrical activity of pacemaker tissues in the heart. In: Bouman LN and Jongsma HJ (eds) Cardiac Rate and Rhythm, pp. 93–128. The Hague, Boston, London: Martinus Nijhoff.
Edwards JS and Palsson BO (1999) Systems properties of the Haemophilus influenzae Rd metabolic genotype. Journal of Biological Chemistry 274: 17410–17416.
Edwards JS and Palsson BO (2001) In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nature Biotechnology 19(2): 125–130.
Fell DA (1996) Understanding the Control of Metabolism. London: Portland Press.
Garny A, Noble PJ, Kohl P and Noble D (2002) Comparative study of rabbit sino-atrial node cell models. Chaos, Solitons & Fractals (in press).
Goode J (ed.) (2001) Complexity in Biological Information Processing. Novartis Foundation Symposium, no. 239. Chichester: John Wiley.
Hinch R (2002) An analytical study of the physiology and pathology of the propagation of cardiac action potentials. Progress in Biophysics and Molecular Biology (in press).
Hodgkin AL and Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology 117: 500–544.
Holden AV and Panfilov AV (1997) Modelling propagation in excitable media. In: Panfilov AV and Holden AV (eds) Computational Biology of the Heart, pp. 65–99. Chichester: John Wiley.
Hollon T (2001) Consolidation of transcript and protein databases suggests humans may have more than 70,000 genes. The Scientist, October 15.
Howatson M, Pullan AJ and Hunter PJ (2000) Generation of an anatomically based three-dimensional model of the conducting airways. Annals of Biomedical Engineering 28(7): 793–802.
Hunter PJ (1995) Myocardial constitutive laws for continuum mechanics models of the heart. In: Sideman S and Beyar R (eds) Molecular and Subcellular Cardiology: Effects of Structure and Function. New York: Plenum Press.
Hunter PJ, McNaughton PA and Noble D (1975) Analytical models of propagation in excitable cells. Progress in Biophysics 30: 99–144.
Hunter PJ, Nash MP and Sands GB (1997) Computational electromechanics of the heart. In: Panfilov A and Holden A (eds) Computational Biology of the Heart, pp. 345–407. Chichester: John Wiley.
Hunter PJ, Kohl P and Noble D (2001) Integrative models of the heart: achievements and limitations. Philosophical Transactions of the Royal Society London A 359: 1049–1054.
Jack JJB, Noble D and Tsien RW (1975) Electric Current Flow in Excitable Cells. Oxford: Oxford University Press.
Keener J and Sneyd J (1998) Mathematical Physiology. New York: Springer.
Kitano H (2002) Systems biology: towards systems-level understanding of biological systems. In: Kitano H (ed.) Foundations of Systems Biology. Cambridge MA, USA: MIT Press.
Kohl P, Noble D, Winslow RL and Hunter P (2000) Computational modelling of biological systems: tools and visions. Philosophical Transactions of the Royal Society London A 358: 579–610.
Murray J (1983) Mathematical Biology. New York: Springer-Verlag.
Nagumo J and Sato S (1972) On a response characteristic of a mathematical neuron model. Kybernetik 10: 155–164.
Nielsen PMF, LeGrice IJ, Smaill BH and Hunter PJ (1991) A mathematical model of the geometry and fibrous structure of the heart. American Journal of Physiology 29(4): H1365–H1378.
Noble D (1962) A modification of the Hodgkin–Huxley equations applicable to Purkinje fibre action and pacemaker potentials. Journal of Physiology 160: 317–352.
Noble D (2002) Is the genome the Book of Life? Physiology News 46: 18–20.
Noble D and Colatsky TJ (2000) A return to rational drug discovery: computer-based models of cells, organs and systems in drug target identification. Emerging Therapeutic Targets 4: 39–49.
Noble D and Rudy Y (2001) Models of cardiac ventricular action potentials: iterative interaction between experiment and simulation. Philosophical Transactions of the Royal Society A 359: 1127–1142.
Noble D, Levin J and Scott W (1999) Biological simulations in drug discovery. Drug Discovery Today 4: 10–16.
Platt JR (1964) Strong inference. Science 146: 347–353.
PricewaterhouseCoopers (1998) Pharma 2005. An Industrial Revolution in R&D, p. 20.
Schilling CH and Palsson BO (1998) The underlying pathway structure of biochemical reaction networks. Proceedings of the National Academy of Sciences of the USA 95: 4193–4198.
Schilling CH, Edwards JS and Palsson BO (1999) Toward metabolic phenomics: analysis of genomic data using flux balances. Biotechnology Progress 15: 288–295.
Shimizu TS, Le Novère N, Levin MD et al. (2000) Molecular model of a lattice of signalling proteins involved in bacterial chemotaxis. Nature Cell Biology 2: 792–796.
Smith NP, Pullan AJ and Hunter PJ (2000) Generation of an anatomically based geometric coronary model. Annals of Biomedical Engineering 28(1): 14–25.
Smith NP, Pullan AJ and Hunter PJ (2001) An anatomically based model of coronary blood flow and myocardial mechanics. SIAM Journal on Applied Mathematics (in press).
Sykes B (2001) The Seven Daughters of Eve. London: Bantam Press.
Tomita M, Shimizu K, Matsuzaki Y et al. (1999) E-Cell: software environment for whole cell simulation. Bioinformatics 15(1): 72–84.
van der Pol B and van der Mark J (1928) The heartbeat considered as a relaxation oscillation, and an electrical model of the heart. The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science VI (7th Series, No. XXXVIII, November (Supplement)): 763–775.
Winslow RL, Kimball AL, Noble D and Denyer JC (1991) Computational models of the mammalian cardiac sinus node implemented on a Connection Machine CM-2. Medical and Biological Engineering and Computing 29(2): 832.
Winslow RL, Varghese A, Noble D, Adlakha C and Hoythya A (1993) Generation and propagation of triggered activity induced by spatially localised Na-K pump inhibition in atrial network models. Proceedings of the Royal Society 254: 55–61.
Further Reading
Bock G and Goode J (eds) (2002) In Silico Simulation of Biological Processes. Novartis Foundation Symposium, no. 247. London: John Wiley.
Csete ME and Doyle JC (2002) Reverse engineering of biological complexity. Science 295: 1664–1669.
Noble D (2002) The rise of computational biology. Nature Reviews Molecular Cell Biology (in press).
Noble D (2002) Modelling the heart: from genes to cells to the whole organ. Science 295: 1678–1682.