It is often supposed that the spectacular successes of our modern mathematical sciences support a lofty vision of a world completely ordered by one
elegant theory. In this book Nancy Cartwright argues to the contrary. When
we draw our image of the world from the way modern science works - as
empiricism teaches us we should - we end up with a world where some
features are precisely ordered, others are given to rough regularity and still
others behave in their own diverse ways. This patchwork of laws makes
sense when we realise that laws are very special productions of nature,
requiring very special arrangements for their generation. Combining
previously published and newly written essays on physics and economics,
The Dappled World carries important philosophical consequences and offers
serious lessons for both the natural and the social sciences.
Nancy Cartwright is Professor of Philosophy at the London School of Economics and Political Science and at the University of California, San Diego,
a Fellow of the British Academy, and a MacArthur Fellow. She is the author
of How the Laws of Physics Lie (1983), Nature's Capacities and their
Measurement (1989), and Otto Neurath: Philosophy Between Science and
Politics, co-authored with Jordi Cat, Lola Fleck and Thomas Uebel (1995).
Cambridge University Press
Contents

Acknowledgements   page ix
Introduction   1
Causal laws
5 Causal diversity; causal stability   104
Probabilistic laws
1 Probability machines: chance set-ups and economic models   152
Bibliography   234
Index   242
Acknowledgements
This book is squarely in the tradition of the Stanford School and is deeply
influenced by the philosophers of science I worked with there. It began with
the pragmatism of Patrick Suppes and the kinds of views he articulated in
his Probabilistic Metaphysics.1 Then there was Ian Hacking, John Dupre,
Peter Galison and, for one year, Margaret Morrison.
The second major influence is the Modelling and Measurement in Physics
and Economics research group at the London School of Economics. The
Modelling portion of the project was directed by Mary Morgan and Margaret
Morrison; Measurement, by Mary Morgan and Hasok Chang. I have been
helped by the ideas and the studies of all of the research assistants in the
project, whose own work on the topics has been a model I would wish to
follow for philosophical originality and getting the details right: Towfic
Shomar, Mauricio Suarez, Marco Del Seta, Cynthia Ma, George Zouros,
Antigone Nounou, Francesco Guala, Sang Wook Yi, Julian Reiss and Makiko
Ito. I have worked out almost all of the ideas in these chapters in detailed
conversations and debates with Jordi Cat.
Julian Reiss and Sang Wook Yi have read and criticised the entire book,
which has improved as a result. The production of the typescript was impressively carried out by Dorota Rejman, with the help of Julian Reiss and Sang
Wook Yi. The original drawings are by Rachel Hacking; the machines by
Towfic Shomar.
I am very grateful to the MacArthur Foundation and the LSE Centre for
Philosophy of Natural and Social Science for financial support throughout,
and to the Latsis Foundation for a grant that allowed me to complete the
book.
Much of the material of this book has been drawn from articles published
elsewhere. Much has not been published before. The exact origin of each
chapter is described in the acknowledgements section at the end of it.
1 Suppes 1984.
Introduction
arguments. I think they show just the opposite. They show a world whose
laws are plotted and pieced.
Consider physics first. I look particularly at quantum physics, because it is
what many suppose - in some one or another of its various guises - to be
the governor of all of matter.2 I also look to some extent at classical physics,
both classical mechanics and classical electromagnetic theory. For these have
an even more firmly established claim to rule, though the bounds of their
empire have contracted significantly since the pretensions of the seventeenth
century Mechanical Philosophy or the hopes for an electromagnetic take-over
at the end of the nineteenth century. And I look at the relations between
them. Or rather, I look at a small handful of cases out of a vast array, and
perhaps these are not even typical, for the relations among these theories are
various and complicated and do not seem to fit any simple formulae.
The conventional story of scientific progress tells us that quantum physics
has replaced classical physics. We have discovered that classical physics is
false and quantum physics is, if not true, a far better approximation to the
truth. But we all know that quantum physics has in no way replaced classical
physics. We use both; which of the two we choose from one occasion to
another depends on the kinds of problems we are trying to solve and the
kinds of techniques we are master of. 'Ah', we are told, 'that is only in
practice. In principle everything we do in classical physics could be done,
and done more accurately, in quantum physics.' But I am an empiricist. I
know no guide to principle except successful practice. And my studies of the
most successful applications of quantum theory teach me that quantum physics works in only very specific kinds of situations that fit the very restricted
set of models it can provide; and it has never performed at all well where
classical physics works best.
This is how I have come to believe in the patchwork of law. Physics in its
various branches works in pockets, primarily inside walls: the walls of a
laboratory or the casing of a common battery or deep in a large thermos,
walls within which the conditions can be arranged just so, to fit the well-confirmed and well-established models of the theory, that is, the models that
have proved to be dependable and can be relied on to stay that way. Very
2 Looked at from a different point of view it is superstring theory that makes the loudest claims
right now to be a theory of everything. But superstring theory is not (yet?) a theory of the
physical world, it is a speculation; and even its strongest advocates do not credit it with its
own empirical content. Mathematics, they say in its defence, is the new laboratory site for
physics. (See Galison forthcoming for a discussion of this.) Its claims to account for anything
at all in the empirical world thus depend on the more pedestrian theories that it purports to
be able to subsume; that is, a kind of 'trickle down' theory of universal governance must be
assumed here. So, setting aside its own enormous internal problems, the empire of superstring
theory hangs or falls with those of all the more long standing theories of physics that do
provide detailed accounts of what happens in the world.
physics, from whom I differ so radically despite our shared interest in physics? These latter, I would say, are primarily interested in the world that science represents. They are interested, for instance, in the geometry of space
and time. Their interest in science generally comes from their belief that
understanding our most advanced scientific representations of the world is
their best route to understanding that world itself. John Dupre too is interested
in the world, but in the material, cultural and politico-economic world of
day-to-day and historical life.5 He is interested in science as it affects that
life, in all the ways that it affects that life. Hence he is particularly interested
in the politics of science, not primarily the little politics of laboratory life
that shapes the internal details of the science but its big politics that builds
bombs and human genomes. That kind of interest is different again from
most historians and sociologists of science whose immediate object is science
itself, but - unlike the philosophers who use our best science as a window to
the world - science as it is practised, as a historical process.
My work falls somewhere in the midst of these three points of departure.
My ultimate concern in studying science is with the day-to-day world where
SQUIDs can be used to detect stroke victims and where life expectancy is
calculated to vary by thirty-five years from one country to another. But the
focus of my work is far narrower than that of Dupre: I look at the claims of
science, at the possible effects of science as a body of knowledge, in order
to see what we can achieve with this knowledge. This puts me much closer
to the 'internalist' philosophers in the detail of treatment that I aim for in
discussing the image science gives us of the world, but from a different
motive. Mine is the motive of the social engineer. Ian Hacking distinguishes
two significant aims for science: representing and intervening.6 Most of my
colleagues in philosophy are interested in representing, and not just those
specialists whose concerns are to get straight the details of mathematical
physics. Consider Bas van Fraassen, who begins with more traditional philosophical worries. Van Fraassen tells us that the foremost question in philosophy of science today is: how can the world be the way science says it is
or represents it to be?7 I am interested in intervening. So I begin from a
different question: how can the world be changed by science to make it the
way it should be?
The hero behind this book is Otto Neurath, social engineer of the short-lived Bavarian Republic and founding member of the Vienna Circle.8 Neurath
is well known among philosophers for his boat metaphor attacking the
5 Cf. Dupre 1993.
6 Hacking 1983.
7 Van Fraassen 1991.
8 See Cartwright et al. 1996.
by the concepts of the theory? The interpretative models of the theory provide
the answer. And what kinds of interpretative models do we have? In
answering this, I urge, we must adopt the scientific attitude: we must look to
see what kinds of models our theories have and how they function, particularly how they function when our theories are most successful and we have
most reason to believe in them. In this book I look at a number of cases
which are exemplary of what I see when I study this question. It is primarily
on the basis of studies like these that I conclude that even our best theories
are severely limited in their scope. For, to all appearances, not many of the
situations that occur naturally in our world fall under the concepts of these
theories. That is why physics, though a powerful tool for predicting and
changing the world, is a tool of limited utility.
This kind of consideration is characteristic of how I arrive at my image
of the dappled world. I take seriously the realists' insistence that where
we can use our science to make very precise predictions or to engineer
very unnatural outcomes, there must be 'something right' about the claims
and practices we employ. I will not in this book go into what the
'something right' could be, about which there is a vast philosophic literature. Rather I want to consider what image of the material world is most
consistent with our experiences of it, including our impressive successes
at understanding, predicting and manipulating it - but not excluding the
limitations within which we find ourselves confined and the repeated
failures to get it right that constitute far and away the bulk of normal
scientific activity. The logic of the realists' claims is two-edged: if it is
the impressive empirical successes of our premier scientific theories that
are supposed to argue for their 'truth' (whatever is the favoured interpretation of this claim), then it is the theories as used to generate these
empirical successes that we are justified in endorsing.
How do we use theory to understand and manipulate real concrete things
to model particular physical or socio-economic systems? How can we use
the knowledge we have encoded in our theories to build a laser or to plan an
economy? The core idea of all standard answers is the deductive-nomological
account. This is an account that serves the belief in the one great scientific
system, a system of a small set of well co-ordinated first principles, admitting
a simple and elegant formulation, from which everything that occurs, or
everything of a certain type or in a certain category that occurs, can be
derived. But treatments of real systems are not deductive; nor are they
approximately deductive, nor deductive with correction, nor plausibly
approaching closer and closer to deductivity as our theories progress. And
this is true even if we tailor our systems as much as possible to fit our
theories, which is what we do when we want to get the best predictions
possible. That is, it is not true even in the laboratory, as we learn from Peter
Religion. The project is natural religion: to establish the properties that God
is supposed to have - omniscience, omnipotence, and benevolence - from
the phenomena of the natural world. The stumbling block is evil. Demea and
Cleanthes try to explain it away, with well-known arguments. Demea, for
example, supposes 'the present evil phenomena, therefore, are rectified in
other regions, and at some future period of existence'.12 Philo replies:
I will allow that pain or misery in man is compatible with infinite power and goodness
in the Deity . . . what are you advanced by all these concessions? A mere possible
compatibility is not sufficient. You must prove these pure, unmixed and uncontrollable attributes from the present mixed and confused phenomena, and from these
alone.13
Philo expands his argument:
[I]f a very limited intelligence whom we shall suppose utterly unacquainted with the
universe were assured that it were the production of a very good, wise, and powerful
being, however finite, he would, from his conjecture, form beforehand a very different
notion of it from what we find it to be by experience; . . . supposing now that this
person were brought into the world, still assured that it was the workmanship of such
a divine and benevolent being, he might, perhaps, be surprised at the disappointment,
but would never retract his former belief if founded on any very solid argument . . .
But suppose, which is the real case with regard to man, that this creature is not
antecedently convinced of a supreme intelligence, benevolent and powerful, but is left
to gather such a belief from the appearance of things; this entirely alters the case, nor
will he ever find any reason for such a conclusion.14
In the same sense in which we judge other general claims true or false, whatever sense that is.
Hausman 1997.
Figure 0.4 Source: Rachel Hacking.
larly data that might be supplied from psychology experiments. Nor do they
use the results of 'high quality tests' in their own field to improve their
theories. Here is Hausman's account of why:
[M]any economists believe that equilibrium theory provides, at a certain level of
resolution, a complete theory of the whole economic domain. Equilibrium theorists
rarely state this thesis explicitly, for on the face of it, such a claim is highly implausible. But equilibrium economists show their commitment to this implausible view
when they reject as ad hoc any behavioural generalizations about individual agents
that are not part of equilibrium theory and are not further specifications of the causal
factors with which equilibrium theory is concerned. For example, as part of his
explanation for the existence of depressions, Keynes offered the generalization that
the marginal propensity to consume out of additional income is less than one. This
psychological generalization is consistent with equilibrium theory, but independent: it
does not follow from equilibrium theory. Instead, it purports to identify an additional
important causal factor. For precisely this reason, regardless of its truth or falsity,
many equilibrium theorists have rejected it as ad hoc . . . If those theorists believed
that economists need to uncover additional causal factors, then they would not reject
attempts to do so.17
Economics is not special in this respect. The pernicious effects of the belief
in the single universal rule of law and the single scientific system are spread
across the sciences. Indeed, a good many physicists right now are in revolt.
Superstring theory is the new candidate for a theory of everything.18 P. W.
Anderson is one of its outspoken opponents. The theory consumes resources
and efforts that could go into the hundreds of other enterprises in physics
that ask different kinds of questions and solve different kinds of problems.
The idea that solutions can trickle down from superstring theory is not even
superficially plausible. Anderson urges,
The ability to reduce everything to simple fundamental laws does not imply the ability
to start from those laws and reconstruct the universe. In fact, the more the elementary
particle physicists tell us about the nature of the fundamental laws, the less relevance
they seem to have to the very real problems of the rest of science, much less to those
of society . . . [T]he behavior of large and complex aggregates of elementary particles,
it turns out, is not to be understood in terms of a simple extrapolation of the properties
of a few particles. Instead, at each level of complexity entirely new properties appear,
and the understanding of the new behaviors requires research which I think is as
fundamental in its nature as any other.19
The damage from the move for unity is not done when there is good empirical
evidence for a specific kind of unification. Nor are attempts at take-overs out
17 Ibid., p. 406.
18 For a full discussion of unificationalist programmes in theoretical physics and what they amount to I recommend Margaret Morrison's study Unifying Theories, Morrison forthcoming.
19 Anderson 1972, p. 393; for a discussion of Anderson's recent attacks, see Cat 1998.
of place when we have good reason to think the methods or ideas we have
developed to solve one kind of problem can be employed to solve a different
kind, when we can argue the case and make a reasonable bet, where the costs
have been thought through and our assessments of the chances for success
can warrant this way of proceeding over others. It is not even objectionable
to invest in a hunt for unity out of sheer passion for order and rationality,
nor to try our favoured theory just because we have no better ideas or because
that is what we are good at. It is not objectionable so long as we are clear
about what we are doing and what it is going to cost us, and we are able to
pay the price.
But we are often not so clear. The yearning for 'the system' is a powerful
one; the faith that our world must be rational, well ordered through and
through, plays a role where only evidence should matter. Our decisions are
affected. After the evidence is in, theories that purport to be fundamental - to be able in principle to explain everything of a certain kind - often gain
additional credibility just for that reason itself. They get an extra dollop of
support beyond anything they have earned by their empirical success or the
empirically warranted promise of their research programme for solving the
problem at hand. I have mentioned already the take-over attempts of strings
and symmetries and dualities, of fundamental particles in physics, of quantum
mechanics over classical, and equilibrium theory, rational expectations and
game theory in political economy. In medicine one of the primary take-over
theories is genetics. Let us consider it.
Genetics has proved very useful across a variety of problems. Genetic
information about the heritability of the production of Down's syndrome
babies has led to much better targeting of amniocentesis among young
mothers. Our knowledge that glycogen storage diseases are due to single
point mutations allows us to administer the somewhat dangerous tests for
these diseases only to babies with a family history. Now that we understand
that the serious mental handicaps of phenylketonuria (PKU) are due to a
single point mutation that leads to too much accumulation of one amino acid
and not enough of another with a resulting failure of neurological development, we can adjust the diet of the children affected till the relevant period
of development is over. Some even hope that we will learn how to use viruses
as vectors to resupply the genetic material.
For many diseases, however, other approaches may be just as, or even
more, fruitful. For example, in breast cancer there is very substantial evidence
that endogenous oestrogen levels are the major determining factor in the
occurrence of the disease in the vast majority of cases. It is well known that
endogenous oestrogen levels are affected by lifestyle, but little emphasis is
put on this aspect of prevention: on finding out how diet, physical activity
(exercise and work) and other modifiable factors may lower endogenous
Part I
For realism
A number of years ago I wrote How The Laws of Physics Lie. That book was
generally perceived to be an attack on realism. Nowadays I think that I was
deluded about the enemy: it is not realism but fundamentalism that we need
to combat.
My advocacy of realism - local realism about a variety of different kinds
of knowledge in a variety of different domains across a range of highly
differentiated situations - is Kantian in structure. Kant frequently used a
puzzling argument form to establish quite abstruse philosophical positions
(Φ): We have X - perceptual knowledge, freedom of the will, whatever. But
without Φ (the transcendental unity of the apperception, or the kingdom of
ends) X would be impossible, or inconceivable. Hence Φ. The objectivity
of local knowledge is my Φ; X is the possibility of planning, prediction,
manipulation, control and policy setting. Unless our claims about the
expected consequences of our actions are reliable, our plans are for nought.
Hence knowledge is possible.
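The argument form can be set out schematically (my notation, writing Φ for the abstruse position to be established and X for the given fact; the text itself states the form only in prose):

```latex
% Transcendental argument schema
\begin{align*}
&\text{P1: } X &&\text{(we have perceptual knowledge, freedom of the will, \ldots)}\\
&\text{P2: } \lnot\Phi \rightarrow \lnot X &&\text{(without } \Phi\text{, } X \text{ would be impossible)}\\
&\text{C: } \therefore\ \Phi &&\text{(by modus tollens on P2, given P1)}
\end{align*}
```

In the instance at hand, Φ is the objectivity of local knowledge and X is the reliability of our planning, prediction, manipulation, control and policy setting.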
What might be found puzzling about the Kantian argument form are the Xs
from which it starts. These are generally facts which appear in the clean and
orderly world of pure reason as refugees with neither proper papers nor proper
introductions, of suspect worth and suspicious origin. The facts which I take to
ground objectivity are similarly alien in the clear, well-lighted streets of reason,
where properties have exact boundaries, rules are unambiguous, and behaviour
is precisely ordained. I know that I can get an oak tree from an acorn, but not
from a pine cone; that nurturing will make my child more secure; that feeding
the hungry and housing the homeless will make for less misery; and that giving
more smear tests will lessen the incidence of cervical cancer. Getting closer to
physics, which is ultimately my topic here, I also know that I can drop a pound
coin from the upstairs window into the hands of my daughter below, but probably not a paper tissue; that I can head north by following my compass needle
(so long as I am on foot and not in my car), that. . .
I know these facts even though they are vague and imprecise, and I have
no reason to assume that that can be improved on. Nor, in many cases, am I
sure of the strength or frequency of the link between cause and effect, nor of
the range of its reliability. And I certainly do not know in any of the cases
which plans or policies would constitute an optimal strategy. But I want to
insist that these items are items of knowledge. They are, of course, like all
genuine items of knowledge (as opposed to fictional items like sense data or
the synthetic a priori) defeasible and open to revision in the light of further
evidence and argument. But if I do not know these things, what do I know
and how can I come to know anything?
Besides this odd assortment of inexact facts, we also have a great deal of
very precise and exact knowledge, chiefly supplied by the natural sciences. I
am not thinking here of abstract laws, which as an empiricist I take to be of
considerable remove from the world they are supposed to apply to, but rather
of the precise behaviour of specific kinds of concrete systems, knowledge of,
say, what happens when neutral K-mesons decay, which allows us to establish
CP violation, or of the behaviour of SQUIDs (Superconducting Quantum
Interference Devices) in a shielded fluctuating magnetic field, which allows
us to detect the victims of strokes. This knowledge is generally regimented
within a highly articulated, highly abstract theoretical scheme.
One cannot do positive science without the use of induction, and where
those concrete phenomena can be legitimately derived from the abstract
schemes, they serve as a kind of inductive base for these schemes. How The
Laws of Physics Lie challenged the soundness of these derivations and hence
of the empirical support for the abstract laws. I still maintain that these
derivations are shaky, but that is not the point I want to make in this chapter.
So let us for the sake of argument assume the contrary: the derivations are
deductively correct and they use only true premises. Then, granting the validity of the appropriate inductions,1 we have reason to be realists about the
laws in question. But that does not give us reason to be fundamentalists. To
grant that a law is true - even a law of 'basic' physics or a law about the
so-called 'fundamental particles' - is far from admitting that it is universal - that it holds everywhere and governs in all domains.
2 Against fundamentalism
Return to my rough division of the concrete facts we know into two categories: (1) those that are legitimately regimented into theoretical schemes, these
generally, though not always, being about behaviour in highly structured,
manufactured environments like a spark chamber; (2) those that are not.
1 These will depend on the circumstances and our general understanding of the similarities and structures that obtain in those circumstances.
There is a tendency to think that all facts must belong to one grand scheme,
and moreover that this is a scheme in which the facts in the first category
have a special and privileged status. They are exemplary of the way nature
is supposed to work. The others must be made to conform to them. This is
the kind of fundamentalist doctrine that I think we must resist. Biologists are
clearly already doing so on behalf of their own special items of knowledge.
Reductionism has long been out of fashion in biology and now emergentism
is again a real possibility. But the long-debated relations between biology
and physics are not good paradigms for the kind of anti-fundamentalism I
urge. Biologists used to talk about how new laws emerge with the appearance
of 'life'; nowadays they talk, not about life, but about levels of complexity
and organisation. Still in both cases the relation in question is that between
larger, richly endowed, complex systems, on the one hand, and fundamental
laws of physics on the other: it is the possibility of 'downwards' reduction
that is at stake.
I want to go beyond this. Not only do I want to challenge the possibility
of downwards reduction but also the possibility of 'cross-wise reduction'. Do
the laws of physics that are true of systems (literally true, we may imagine
for the sake of argument) in the highly contrived environments of a laboratory
or inside the housing of a modern technological device, do these laws carry
across to systems, even systems of very much the same kind, in different and
less regulated settings? Can our refugee facts always, with sufficient effort
and attention, be remoulded into proper members of the physics community,
behaving tidily in accord with the fundamental code? Or must and should they be admitted into the body of knowledge on their own merit?
In moving from the physics experiment to the facts of more everyday
experience, we are not only changing from controlled to uncontrolled environments, but often from micro to macro as well. In order to keep separate
the issues which arise from these two different shifts, I am going to choose
for illustration a case from classical mechanics, and will try to keep the scale
constant. Classical electricity and magnetism would serve as well. Moreover,
in order to make my claims as clear as possible, I shall consider the simplest
and most well-known example, that of Newton's second law and its application to falling bodies, F = ma. Most of us, brought up within the fundamentalist canon, read this with a universal quantifier in front: for any body in any
situation, the acceleration it undergoes will be equal to the force exerted on
it in that situation divided by its inertial mass. I want instead to read it, as
indeed I believe we should read all nomologicals, as a ceteris paribus law.
In later chapters I shall have a lot to say about the form of this ceteris paribus
condition. Indeed, I shall argue that laws like this are very special indeed and
they obtain only in special circumstances: they obtain just when a nomological machine is at work. But for the moment, to get started, let us concentrate
on the more usual observation that, for the most part, the relations laid out
in our laws hold only if nothing untoward intervenes to prevent them. In our
example, then, we may write: for any body in any situation, if nothing interferes, its acceleration will equal the force exerted on it divided by its mass.
But what can interfere with a force in the production of motion other than
another force? Surely there is no problem. The acceleration will always be
equal to the total force divided by the mass. That is just what I question.
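The contrast between the two readings can be written out explicitly (my formalisation, not the book's own notation; the interference clause is deliberately left schematic, since later chapters unpack it):

```latex
% Fundamentalist (universally quantified) reading of Newton's second law:
\forall b\ \forall s:\quad a(b,s) \;=\; \frac{F_{\mathrm{total}}(b,s)}{m(b)}

% Ceteris paribus reading:
\forall b\ \forall s:\quad \text{nothing interferes in } s \;\Rightarrow\; a(b,s) \;=\; \frac{F(b,s)}{m(b)}
```

The question being pressed is whether 'nothing interferes' can always be cashed out as 'F already includes all the forces', which would collapse the second reading into the first.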
Think again about how we construct a theoretical treatment of a real situation. Before we can apply the abstract concepts of basic theory - assign a
quantum field, a tensor, a Hamiltonian, or in the case of our discussion, write
down a force function - we must first produce a model of the situation in
terms the theory can handle. From that point the theory itself provides 'language-entry rules' for introducing the terms of its own abstract vocabulary,
and thereby for bringing its laws into play. How The Laws of Physics Lie
illustrated this for the case of the Hamiltonian - which is roughly the quantum
analogue of the classical force function. Part of learning quantum mechanics
is learning how to write the Hamiltonian for canonical models, for example,
for systems in free motion, for a square well potential, for a linear harmonic
oscillator, and so forth. Ronald Giere has made the same point for classical
mechanics.2 In the next chapter I will explain that there is a very special kind
of abstraction that is at stake here. For now let us consider what follows from
the fact that concepts like force or the quantum Hamiltonian attach to the
world through a set of specific models. (I develop a quantum example in
detail in chapter 9.)
The basic strategy for treating a real situation is to piece together a model
from these fixed components. Then we determine the prescribed composite
Hamiltonian from the Hamiltonians for the parts. Questions of realism arise
when the model is compared with the situation it is supposed to represent.
How the Laws of Physics Lie argued that even in the best cases, the fit
between the two is not very good. I concentrated there on the best cases
because I was trying to answer the question 'Do the explanatory successes
of modern theories argue for their truth?' Here I want to focus on the multitude of 'bad' cases, where the models, if available at all, provide a very poor
image of the situation. These are not cases that disconfirm the theory. You
cannot show that the predictions of a theory for a given situation are false
until you have managed to describe the situation in the language of the theory.
When the models are too bad a fit, the theory is not disconfirmed; it is just
inapplicable.
Now consider a falling object. Not Galileo's from the leaning tower, nor
the pound coin I earlier described dropping from the upstairs window, but
rather something more vulnerable to non-gravitational influence. Otto Neurath
has a nice example. My doctrine about the case is much like his.
2 Giere 1988.
In some cases a physicist is a worse prophet than a [behaviourist psychologist], as
when he is supposed to specify where in Saint Stephen's Square a thousand dollar
bill swept away by the wind will land, whereas a [behaviourist] can specify the result
of a conditioning experiment rather accurately.3
Mechanics provides no model for this situation. We have only a partial
model, which describes the thousand dollar bill as an unsupported object in
the vicinity of the earth, and thereby introduces the force exerted on it due
to gravity. Is that the total force? The fundamentalist will say no: there is in
principle (in God's completed theory?) a model in mechanics for the action
of the wind, albeit probably a very complicated one that we may never succeed in constructing. This belief is essential for the fundamentalist. If there
is no model for the thousand dollar bill in mechanics, then what happens to
the note is not determined by its laws. Some falling objects, indeed a very
great number, will be outside the domain of mechanics, or only partially
affected by it. But what justifies this fundamentalist belief? The successes of
mechanics in situations that it can model accurately do not support it, no
matter how precise or surprising they are. They show only that the theory
is true in its domain, not that its domain is universal. The alternative to
fundamentalism that I want to propose supposes just that: mechanics is true,
literally true we may grant, for all those motions all of whose causes can be
adequately represented by the familiar models which get assigned force functions in mechanics. For these motions, mechanics is a powerful and precise
tool for prediction. But for other motions, it is a tool of limited serviceability.
Let us set our problem of the thousand dollar bill in Saint Stephen's Square
to an expert in fluid dynamics. The expert should immediately complain that
the problem is ill defined. What exactly is the bill like: is it folded or flat?
straight down the middle? or ...? is it crisp or crumpled? how long versus
wide? and so forth and so forth and so forth. I do not doubt that when the
right questions can be asked and the right answers can be supplied, fluid
dynamics can provide a practicable model. But I do doubt that for every real
case, or even for the majority, fluid dynamics has enough of the 'right questions'. It does not have enough of the right concepts to allow it to model the
full set of causes, or even all the dominant ones. I am equally sceptical that
the models which work will do so by legitimately bringing Newton's laws
(or Lagrange's for that matter) into play.4
How then do airplanes stay aloft? Two observations are important. First,
we do not need to maintain that no laws obtain where mechanics runs out.
Fluid dynamics may have loose overlaps and intertwinings with mechanics.
But it is in no way a subdiscipline of basic classical mechanics; it is a discipline on its own. Its laws can direct the thousand dollar bill as well as can
those of Newton or Lagrange. Second, the thousand dollar bill comes as it
comes, and we have to hunt a model for it. Just the reverse is true of the
plane. We build it to fit the models we know work. Indeed, that is how we
manage to get so much into the domain of the laws we know.
Many will continue to feel that the wind and other exogenous factors must
produce a force. The wind after all is composed of millions of little particles
which must exert all the usual forces on the bill, both across distances and
via collisions. That view begs the question. When we have a good-fitting
molecular model for the wind, and we have in our theory (either by composition from old principles or by the admission of new principles) systematic
rules that assign force functions to the models, and the force functions
assigned predict exactly the right motions, then we will have good scientific
reason to maintain that the wind operates via a force. Otherwise the assumption is another expression of fundamentalist faith.
3
If the laws of mechanics are not universal, but nevertheless true, there are at
least two options for them. They could be pure ceteris paribus laws: laws
that hold only in circumscribed conditions or so long as no factors relevant
to the effect besides those specified occur. And that's it. Nothing follows
about what happens in different settings or in cases where other causes occur
that cannot be brought under the concepts of the theory in question. Presumably this option is too weak for our example of Newtonian mechanics. When
a force is exerted on an object, the force will be relevant to the motion of
the object even if other causes for its motion not renderable as forces are at
work as well, and the exact relevance of the force will be given by the
formula F = ma: the (total) force will contribute a component to the acceleration determined by this formula.
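A minimal numeric sketch of this composition rule (the numbers and the helper function are illustrative, not from the text): each force contributes its component, and the resulting acceleration is the vector sum of the forces divided by the inertial mass.

```python
# Sketch of vector composition of forces under F = ma.
# All values are illustrative; 'net_acceleration' is a hypothetical helper.

def net_acceleration(forces, mass):
    """Vector-sum the component forces, then divide by inertial mass."""
    dimensions = range(len(forces[0]))
    total = [sum(force[i] for force in forces) for i in dimensions]
    return [component / mass for component in total]

# A 2 kg body acted on by gravity and a horizontal push (in newtons):
gravity = [0.0, -19.6]   # weight of a 2 kg mass (g = 9.8 m/s^2)
push    = [4.0, 0.0]
print(net_acceleration([gravity, push], mass=2.0))  # [2.0, -9.8]
```

Each force 'tries' to produce its own characteristic acceleration; what actually results is fixed by the sum, which is the point of the passage.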
For cases like this, the older language of natures is appropriate. It is in the
nature of a force to produce an acceleration of the requisite size. That means
that ceteris paribus, it will produce that acceleration. But even when other
causes are at work, it will 'try' to do so. The idea is familiar in the case of
forces. What happens when several forces all try at once to produce their
own characteristic acceleration? What results is an acceleration equal to the
vector sum of the functions that represent each force separately, divided by
the inertial mass of the accelerating body. In general what counts as 'trying'
will differ from one kind of cause to another. To ascribe a behaviour to the
Wholism
We look at little bits of nature, and we look under a very limited range of
circumstances. This is especially true of the exact sciences. We can get very
precise outcomes, but to do so we need very tight control over our inputs.
Most often we do not control them directly, one by one, but rather we use
some general but effective form of shielding. I know one experiment that
aims for direct control - the Stanford Gravity Probe. Still, in the end, they
will roll the space ship to average out causes they have not been able to
command. Sometimes we take physics outside the laboratory. Then shielding
becomes even more important. SQUIDs can make very fine measurements of
magnetic fluctuations that help in the detection of stroke victims. But for
administering the tests, the hospital must have a Hertz box - a small totally
metal room to block out magnetism from the environment. Or, for a more
homely example, we all know that batteries are not likely to work if their
protective casing has been pierced.
We tend to think that shielding does not matter to the laws we use. The
same laws apply both inside and outside the shields; the difference is that
inside the shield we know how to calculate what the laws will produce, but
outside, it is too complicated. Wholists are wary of these claims. If the events
we study are locked together and changes depend on the total structure rather
than the arrangement of the pieces, we are likely to be very mistaken by
looking at small chunks of special cases.
Goodman's new riddle of induction provides a familiar illustration.6 Suppose
that in truth all emeralds are grue (where grue = green and examined
before the year 2000, or blue and either unexamined or examined after the
year 2000). We, however, operate with 'All emeralds are green.' Our law is
not an approximation before 2000 of the true law, though we can see from
the definitions of the concepts involved why it will be every bit as successful
as the truth, and also why it will be hard, right now, to vary the circumstances
in just the right way to reverse our erroneous induction.
5 I have written more about the two levels of generalisation, laws and ascriptions of natures, in Cartwright 1989.
6 Goodman 1983, ch. 3.
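The grue predicate can be rendered as a toy function (a hypothetical helper, added for illustration): before the cutoff date it agrees with 'green' on every examined emerald, which is why the evidence to date cannot tell the two hypotheses apart.

```python
from datetime import date

# Toy rendering of Goodman's 'grue'. An emerald is grue if it is green
# and examined before 2000, or blue and either unexamined or examined
# after 2000. All names here are illustrative.

CUTOFF = date(2000, 1, 1)

def is_grue(colour, examined_on=None):
    if examined_on is not None and examined_on < CUTOFF:
        return colour == 'green'
    return colour == 'blue'

# Every emerald examined before 2000 satisfies 'green' and 'grue'
# equally well, so pre-2000 evidence cannot separate the two laws:
print(is_grue('green', date(1999, 6, 1)))  # True
print(is_grue('blue'))                     # True: unexamined and blue
```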
A second famous example of erroneous induction arising from too narrow
a domain of experience is Bertrand Russell's ignorant chicken. The chicken
waits eagerly for the arrival of the farmer, who comes to feed her first thing
every morning - until the unexpected day that he comes to chop off her head.
The chicken has too limited a view of the total situation in which she is
embedded. The entire structure is different from what she suspects, and
designed to serve very different purposes from her own. Yet, for most of
her life, the induction that she naturally makes provides a precisely accurate
prediction.
For a more scientific example consider the revolution in communications
technology due to fibre optics. Low-loss optical fibres can carry information
at rates of many gigabits per second over spans of tens of kilometres. But the
development of fibre bundles which lose only a few decibels per kilometre is
not all there is to the story. Pulse broadening effects intrinsic to the fibres
can be truly devastating. If the pulses broaden as they travel down the fibre,
they will eventually smear into each other and destroy the information. That
means that the pulses cannot be sent too close together, and the transmission
rate may drop to tens or at most hundreds of megabits per second.
We know that is not what happens. The technology has been successful.
That is because the right kind of optical fibre in the right circumstance can
transmit solitons - solitary waves that keep their shape across vast distances.
I will explain why. The light intensity of the incoming pulse causes a shift
in the index of refraction of the optical fibre, producing a slight non-linearity
in the index. The non-linearity leads to what is called a 'chirp' in the pulse.
Frequencies in the leading half of the pulse are lowered while those in the
trailing half are raised. The effects of the chirp combine with those of dispersion to produce the soliton. Thus stable pulse shapes are not at all a general
phenomenon of low-loss optical fibres. They are instead a consequence of
two different, oppositely directed processes that cancel each other out. The
pulse widening due to the dispersion is cancelled by the pulse narrowing due
to the non-linearity in the index of refraction. We can indeed produce perfectly stable pulses. But to do so we must use fibres of just the right design,
and matched precisely with the power and input frequency of the laser which
generates the input pulses. By chance that was not hard to do. When the ideas
were first tested in 1980 the glass fibres and lasers readily available were
easily suited to each other. Given that very special match, fibre optics was
off to an impressive start.
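The two-process cancellation has a standard textbook rendering, added here as a sketch (the equation and its parameters are not from the text): in the nonlinear Schrödinger model of pulse propagation, the fundamental soliton keeps its shape exactly when anomalous dispersion and the Kerr non-linearity offset.

```latex
% Pulse envelope A(z,t) in a lossless fibre; \beta_2 < 0 is the
% group-velocity dispersion, \gamma the Kerr non-linearity:
i\,\frac{\partial A}{\partial z}
  = \frac{\beta_2}{2}\,\frac{\partial^2 A}{\partial t^2} - \gamma\,|A|^{2}A
% Fundamental soliton: the two right-hand terms cancel for
A(z,t) = \sqrt{P_0}\,\mathrm{sech}\!\left(\frac{t}{T_0}\right)
          e^{\,i\gamma P_0 z/2},
\qquad P_0 = \frac{|\beta_2|}{\gamma T_0^{2}}.
```

The condition on the peak power P₀ is what the text describes as the fibre design having to be 'matched precisely with the power and input frequency of the laser'.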
Solitons are indeed a stable phenomenon. They are a feature of nature, but
of nature under very special circumstances. Clearly it would be a mistake to
suppose that they were a general characteristic of low-loss optical fibres. The
question is, how many of the scientific phenomena we prize are like solitons,
local to the environments we encounter, or - more importantly - to the environments we construct. If nature is more wholistic than we are accustomed to
think, the fundamentalists' hopes to export the laws of the laboratory to the
far reaches of the world will be dashed.
It is clear that I am not very sanguine about the fundamentalist faith. But
that is not really out of the kind of wholist intuitions I have been sketching.
After all, the story I just told accounts for the powerful successes of the
'false' local theory - the theory that solitons are characteristic of low-loss
fibres - by embedding it in a far more general theory about the interaction
of light and matter. Metaphysically, the fundamentalist is borne out. It may
be the case that the successful theories we have are limited in their domain,
but their successes are to be explained by reference to a truly universal
authority. I do not see why we need to explain their successes. I am prepared
to believe in more general theories when we have direct empirical evidence
for them but not merely because they are the 'best explanation' for something
which may well have no explanation. 'The theory is successful in its domain':
the need for explanation is the same whether the domain is small, or large,
or very small, or very large. Theories are successful where they are successful, and that's that. If we insist on turning this into a metaphysical doctrine,
I suppose it will look like metaphysical pluralism, to which I now turn.
5
principal focus. Now I want to draw the divides sharply. Some features of
systems typically studied by physics may get into situations where their
behaviour is not governed by the laws of physics at all. But that does not
mean that they have no guide for their behaviour or only low-level phenomenological laws. They may fall under a quite different organised set of
highly abstract principles. (But then again they may not.)
There are two immediate difficulties that metaphysical pluralism encounters. The first is one we create ourselves, by imagining that it must be joined
with views that are vestiges of metaphysical monism. The second is, I
believe, a genuine problem that nature must solve (or must have solved).
First, we are inclined to ask, 'How can there be motions not governed by
Newton's laws?' The answer: there are causes of motion not included in
Newton's theory. Many find this impossible because, although they have
forsaken reductionism, they cling to a near-cousin: supervenience. Suppose
we give a complete 'physics' description of the falling object and its surrounds. Mustn't that fix all the other features of the situation? Why? This is
certainly not true at the level of discussion at which we stand now: the wind
is cold and gusty; the bill is green and white and crumpled. These properties
are independent of the mass of the bill, the mass of the earth, the distance
between them.
I suppose, though, I have the supervenience story wrong. It is the microscopic properties of physics that matter; the rest of reality supervenes on
them. Why should I believe that? Supervenience is touted as a step forward
over reductionism. Crudely, I take it, the advantage is supposed to be that
we can substitute a weaker kind of reduction, 'token-token reduction', for the
more traditional 'type-type reductions' which were proving hard to carry out.
But the traditional view had arguments in its favour. Science does sketch a
variety of fairly systematic connections between micro-structures and macroproperties. Often the sketch is rough, sometimes it is precise, usually its
reliability is confined to very special circumstances. Nevertheless there are
striking cases. But these cases support type-type reductionism; they are irrelevant for supervenience. Type-type reductionism has well-known problems:
the connections we discover often turn out to look more like causal connections than like reductions; they are limited in their domain; they are rough
rather than exact; and often we cannot even find good starting proposals
where we had hoped to produce nice reductions. These problems suggest
modifying the doctrine in a number of specific ways, or perhaps giving it up
altogether. But they certainly do not leave us with token-token reductionism
as a fallback position. After all, on the story I have just told, it was the
appearance of some degree of systematic connection that argued in the first
place for the claim that micro-structures fixed macro-properties. But it is just
this systematicity that is missing in token-token reductionism.
The view that there are macro-properties that do not supervene on micro-features
studied by physics is sometimes labelled emergentism. The suggestion
is that, where there is no supervenience, macro-properties must miraculously
come out of nowhere. But why? There is nothing of the newly landed
about these properties. They have been here in the world all along, standing
right beside the properties of microphysics. Perhaps we are misled by the
feeling that the set of properties studied by physics is complete. Indeed, I
think that there is a real sense in which this claim is true, but that sense does
not support the charge of emergentism. Consider how the domain of properties for physics gets set. Here is a caricature: we begin with an interest in
some specific phenomena, say motions - deflections, trajectories, orbits. Then
we look for the smallest set of properties that is, ceteris paribus, closed (or
closed enough) under prediction. That is, we expand our domain until we get
a set of factors that are ceteris paribus sufficient to fix our starting factors.
(That is, they are sufficient so long as nothing outside the set occurs that
strongly affects the targeted outcome.) To succeed does not show that we
have got all the properties there are. This is a fact we need to keep in mind
quite independently of the chief claim of this chapter, that the predictive
closure itself only obtains in highly restricted circumstances. The immediate
point is that predictive closure among a set of properties does not imply
descriptive completeness.
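The caricature above - expand the set of factors until predicting each member needs nothing outside the set - is a fixpoint construction, sketched here with hypothetical factor names and a hypothetical dependency map.

```python
# Toy fixpoint sketch of 'predictive closure': grow a starting set of
# factors until predicting each member requires nothing outside the set.
# The factor names and dependency map are illustrative only.

def predictive_closure(start, needed_for):
    """needed_for maps each factor to the factors required to predict it."""
    closed = set(start)
    frontier = list(start)
    while frontier:
        factor = frontier.pop()
        for dep in needed_for.get(factor, set()):
            if dep not in closed:
                closed.add(dep)
                frontier.append(dep)
    return closed

deps = {'motion': {'force', 'mass'}, 'force': {'charge', 'distance'}}
print(sorted(predictive_closure({'motion'}, deps)))
# ['charge', 'distance', 'force', 'mass', 'motion']
```

As the text notes, succeeding at this construction shows only predictive closure of the resulting set, not that the set contains all the properties there are.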
The second problem that metaphysical pluralism faces is that of consistency. We do not want colour patches to appear in regions from which the
laws of physics have carried away all matter and energy. Here are two stories
I used to tell when I taught about the Mechanical Philosophy of the seventeenth century. Both are about how to write the Book of Nature to guarantee
consistency. In the first story, God is very interested in physics. He carefully
writes out all of its laws and lays down the initial positions and velocities of
all the atoms in the universe. He then leaves to Saint Peter the tedious but
intellectually trivial job of calculating all future happenings, including what,
if any, macroscopic properties and macroscopic laws will emerge. That is the
story of reductionism. Metaphysical pluralism supposes that God is instead
very concerned about laws, and so he writes down each and every regularity
that his universe will display. In this case Saint Peter is left with the gargantuan task of arranging the initial properties in the universe in some way that
will allow all God's laws to be true together. The advantage to reductionism
is that it makes Saint Peter's job easier. God may nevertheless have chosen
to be a metaphysical pluralist.
6
I have argued that the laws of our contemporary science are, to the extent
that they are true at all, at best true ceteris paribus. In the nicest cases we
may treat them as claims about natures. But we have no grounds in our
experience for taking our laws - even our most fundamental laws of physics -
as universal. Indeed, I should say 'especially our most fundamental laws of
physics', if these are meant to be the laws of fundamental particles. For we
physics', if these are meant to be the laws of fundamental particles. For we
have virtually no inductive reason for counting these laws as true of fundamental particles outside the laboratory setting - if they exist there at all. Ian
Hacking is famous for the remark, 'So far as I'm concerned, if you can spray
them then they are real.'7 I have always agreed with that. But I would be
more cautious: 'When you can spray them, they are real.'
The claim that theoretical entities are created by the peculiar conditions
and conventions of the laboratory is familiar from the social constructionists.
The stable low-loss pulses I described earlier provide an example of how that
can happen. Here I want to add a caution, not just about the existence of the
theoretical entities outside the laboratory, but about their behaviour. Hacking's point is not only that when we can use theoretical entities in just the
way we want to produce precise and subtle effects, they must exist; but also
that it must be the case that we understand their behaviour very well if we
are able to get them to do what we want. That argues, I believe, for the
truth of some very concrete, context-constrained claims, the claims we use
to describe their behaviour and control them. But in all these cases of precise
control, we build our circumstances to fit our models. I repeat: that does not
show that it must be possible to build a model to fit every circumstance.
Perhaps we feel that there could be no real difference between the one kind
of circumstance and the other, and hence no principled reason for stopping our
inductions at the walls of our laboratories. But there is a difference: some circumstances resemble the models we have; others do not. And it is just the point
of scientific activity to build models that get in, under the cover of the laws in
question, all and only those circumstances that the laws govern.8 Fundamentalists want more. They want laws; they want true laws; but most of all, they want
their favourite laws to be in force everywhere. I urge us to resist fundamentalism. Reality may well be just a patchwork of laws.
ACKNOWLEDGEMENTS
This chapter was originally published almost exactly as it appears here in Cartwright
1994.
Philosophers have tended to fall into two camps concerning scientific laws:
either we are realists or we are instrumentalists. Instrumentalists, as we know,
see scientific theories as tools, tools for the construction of precise and accurate predictions, or of explanations, or - to get down to a far more material
level - tools for constructing devices that behave in the ways we want them
to, like transistors, flashlight batteries, or nuclear bombs. The laws of scientific theory have the surface structure of general claims. But they do not in
fact make claims about the world; they just give you clues about how to
manipulate it.
The scientific realist takes the opposite position. Laws not only appear to
make claims about the world; they do make claims, and the claims are, for
the most part, true. What they claim should happen is what does happen.
This leads realists to postulate a lot of new properties in the world. Look at
Maxwell's equations. These equations are supposed to describe the electromagnetic field: B is the magnetic intensity of the field and E, the electric
intensity. The equations seem to make claims about the behaviour of these
field quantities relative to the behaviour of other properties. We think that
the equations are true just in case the quantities all take on the right values
with respect to each other. There is thus a tendency, when a new theory is
proposed, to secure the truth of its equations by filling up the world with new
properties.
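For reference, Maxwell's equations in their standard modern (SI, vacuum) form are sketched below; the notation is added here and does not appear in the text. They relate the field quantities E and B to the charge density ρ and current density J:

```latex
\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad
\nabla \cdot \mathbf{B} = 0, \qquad
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad
\nabla \times \mathbf{B} = \mu_0 \mathbf{J}
  + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
```

On the realist reading the text describes, the equations are true just in case these quantities take on the right values with respect to each other.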
This tendency is nicely illustrated in some modern work on fibre bundles.1
Nowadays we want our theories to be gauge-invariant. That implies that the
Lagrangian should exhibit a local symmetry: e^{iλ}φ returns the same equations
of motion as φ itself. (You may think of the Lagrangian as an elaborate way
of representing the forces in a situation.) The λ here is a phase factor which
used to be ignored; it was thought of, more or less, as an artefact of the
notation. But now its behaviour must be regulated if the theory is to have the
local symmetries we want. The values of λ are angles: 45°, 80°, 170°, and
the like. This suggests constructing a space for the angle to rotate in. We can
achieve the symmetries that gauge-invariance requires, but, as one author
puts it, 'at the expense of introducing a new structure in our theory'. The
structure is called a principal fibre bundle. It attaches an extra geometric
space to each point in physical space. The primary motivation seems to be
the practice of locating physical properties (e.g. field strengths) at one point
in space. Here the property λ needs three dimensions to represent it. We end
up with a space-time structure that has these fibres, or geometric 'balloons',
attached to it at every point.
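As a sketch of the kind of local symmetry at issue, here is the simplest one-angle (U(1)) case in standard notation; the symbols are added here, and the text's own example evidently involves a richer, three-dimensional phase structure.

```latex
% A charged field picks up a position-dependent phase:
\phi(x) \longrightarrow e^{\,i\lambda(x)}\,\phi(x)
% Invariance of the Lagrangian requires a covariant derivative and a
% compensating transformation of the gauge field:
D_\mu = \partial_\mu - i A_\mu, \qquad
A_\mu \longrightarrow A_\mu + \partial_\mu \lambda(x)
% so that L = (D_\mu\phi)^{*}(D^\mu\phi) - V(|\phi|^2) is unchanged.
```

It is the demand that λ may vary from point to point that forces the introduction of the extra structure, the fibre attached at each point of space-time.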
How shall we think about these structures? Most particle physicists prefer
an instrumentalist reading of the geometry. But a minority take it realistically.
The best example is probably Yuval Ne'eman, who uses the structure to try
to resolve standard paradoxes of quantum mechanics. The paradoxes are set
in the famous paper by Einstein, Podolsky and Rosen. Ne'eman says:
What makes [the phenomena of the] EPR [experiment] appear unreal is the fact that
(with Einstein, Podolsky, and Rosen) we tend to visualize the experiment in the sectional setting of Galilean 3-space or even Riemannian 4-space, whereas the true arena
is one in which the latter are just the base manifolds. It is in the additional dimensionality of the fibres . . . it is in this geometry that we should consider the measurement.2
Ne'eman is a wonderful example of the tendency to keep expanding the
structures and properties of the universe as we expand our mathematical treatments of it.
It is this tendency that I want to resist. I want to defend the view that
although the laws may be true ('literally' true), they need not introduce new
properties into nature. The properties they mention are often already there;
the new concepts just give a more abstract name to them. We tend to think
of the concepts of physics, though theoretical, as very concrete, like is red;
or is travelling at 186,000 miles per second. But it is better to take them as
abstract descriptions, like is a success, or is work. In this case, I will argue,
we have no need to look for a single concrete way in which all the cases that
fall under the same predicate resemble each other. What we need to understand, in order to understand the way scientific laws fit the world, is the
relationship of the abstract to the concrete; and to understand that, it will help
to think about fables and their morals.
Fables transform the abstract into the concrete, and in so doing, I claim,
they function like models in physics. The thesis I want to defend is that the
relationship between the moral and the fable is like that between a scientific
law and a model. Consider some familiar morals:
It is dangerous to choose the wrong time for doing a thing.
(Aesop, Fable 190)
Familiarity breeds contempt.
(Aesop, Fable 206)
The weaker are always prey to the stronger.
(G. E. Lessing)
Are these three claims true? They require in each case an implicit ceteris
paribus clause since they describe only a single feature and its consequences;
and they do not tell you what to expect when different features conflict. For
example, when it comes to parents and children, the first claim is contradicted
by the biblical moral, 'What man is there of you whom if his son ask bread
will give him a stone?'3 But the same is true of the laws of physics. In the
case of picking up pins with a magnet, the law of gravity seems to be in
conflict with the law of magnetic attraction. Both need at least to be read
with an implicit ceteris paribus clause; or better, as a description of the nature
of gravity or magnetic attraction. Barring these considerations, though, I
would say that all three of these morals are most probably true. At least,
there is no special problem about their truth. Nothing about their very nature
prevents them from describing the world correctly.
So, too, with the laws of physics, I will argue. They can be true in just the
same way as these very homely adages. But there is a catch. For making the
parallel between laws and morals will allow us to limit the scope of scientific
laws if we wish. Laws can be true, but not universal. We need not assume
that they are at work everywhere, underlying and determining what is going
on. If they apply only in very special circumstances, then perhaps they are
true just where we see them operating so successfully - in the artificial environments of our laboratories, our high-tech firms, or our hospitals. I welcome
this possible reduction in their dominion; but the fundamentalist will not.
2
In fact, I do not wish to make claims about what the correct relationship
between the moral and the fable should be. Rather, I am interested in a
particular theory of the relationship: that is the theory defended by Gotthold
Ephraim Lessing, the great critic and dramatist of the German Enlightenment.
His is a theory that sees the fable as a way of providing graspable, intuitive
the list above. The first two are from fables of Aesop. The third is a moral
that goes in want of a fable. The fable is constructed by Lessing.7
A marten eats the grouse;
A fox throttles the marten; the tooth of the wolf, the fox.
Lessing makes up this story as a part of his argument to show that a fable is
no allegory.8 Allegories say not what their words seem to say, but rather
something similar. But where is the allegory in the fable of the grouse, the
marten, the fox and the wolf: 'What similarity here does the grouse have
with the weakest, the marten with the weak, and so forth? Similarity! Does
the fox merely resemble the strong and the wolf the strongest or is the former
the strong, the latter the strongest. He is it.'9 For Lessing, similarity is the
wrong idea to focus on. The relationship between the moral and the fable is
that of the general to the more specific, and it is 'a kind of misusage of the
words to say that the special has a similarity with the general, the individual
with its type, the type with its kind'.10 Each particular is a case of the general
under which it falls.
The point comes up again when Lessing protests against those who maintain that the moral is hidden in the fable, or at least disguised there. That is
impossible given his view of the relationship between the two. Lessing
argues: 'How one can disguise (verkleiden) the general in the particular, that
I do not see at all. If one insists on a similar word here, it must at least be
einkleiden rather than verkleiden.'11 Einkleiden is to fit out, as when you take
the children to the department store in the fall and buy them new sets of
school clothes. So the moral is to be 'fitted out' by the fable.
The account of abstraction that I borrow from Lessing to describe how
contemporary physics theories work provides us with two necessary conditions. First, a concept that is abstract relative to another more concrete set of
descriptions never applies unless one of the more concrete descriptions also
applies. These are the descriptions that can be used to 'fit out' the abstract
description on any given occasion. Second, satisfying the associated concrete
description that applies on a particular occasion is what satisfying the abstract
description consists in on that occasion. Writing this chapter is what my
working right now consists in; being located at a distance r from another
charge q₂ is what it consists in for a particle of charge q₁ to be subject to the
Coulomb force q₁q₂/4πε₀r² in the usual cases when that force function applies.
To say that working consists in a specific activity described with the relevant
set of more concrete concepts on any given occasion implies at least that no
further description using those concepts is required for it to be true that
'working' applies on that occasion, though surely the notion is richer than
this.
Although I have introduced Lessing's account of the abstract and the concrete through a discussion of the fable and its moral, it marks an entirely
commonplace feature of language. Most of what we say - and say truly - uses
abstract concepts that want 'fitting out' in more concrete ways. Of
course, that is compared to yet another level of discourse in terms of which
they may be more concretely fitted out in turn. What did I do this morning?
I worked. More specifically, I washed the dishes, then I wrote a grant proposal, and just before lunch I negotiated with the dean for a new position in
our department. A well-known philosophical joke makes clear what is at
stake: 'Yes, but when did you work?' It is true that I worked; but it is not
true that I did four things in the morning rather than three. Working is a more
abstract description of the same activities I have already described when I
say that I washed dishes, wrote a proposal, and bargained with the dean.
This is not to say that the more abstract description does not tell you
anything you did not already know from the more concrete list, or vice versa.
They are, after all, different concepts. Work has implications about leisure,
labour, preference, value, and the like, that are not already there in the
description of my activity as washing the dishes or negotiating with the dean.
(Though, I admit, I have chosen examples where the connection is fairly
transparent for most of us.) Thus I am not suggesting any kind of reductionism of abstract concepts to more specific ones. The meaning of an abstract
concept depends to a large extent on its relations to other equally abstract
concepts and cannot be given exclusively in terms of the more concrete concepts that fit it out from occasion to occasion.
For the converse reason, the abstract-concrete relation is not the same as
the traditional relation between genus and species. The species is defined in
terms of the genus plus differentia. But in our examples, the more concrete
cases, like washing dishes or travelling very fast, have senses of their own,
independent (or nearly independent) of the abstractions they fall under. Nor
should we think in terms of supervenience. Roughly, to say that one set of
concepts supervenes on another is to say that any two situations that have
the same description from the second set will also have the same description
using the first set: the basic concepts 'fix' the values of those that supervene
on them. This is not the case with the abstract-concrete distinction, as we
can see from the example of work. Although washing dishes is what working
amounted to for me in the early part of the morning, washing dishes only
counts as working because certain other propositions using more abstract
concepts like preferences, leisure and value are already presupposed for the
situation. Should these fail, the very same activity need no longer count as
work. Thus the notion of supervenience is in this sense stronger than the
abstract-concrete relation described by Lessing.12 The determinable-determinate relation is also stronger in just the same way.13 For example, the
determinable colour is fixed to hold as soon as any of the determinates that
fall under it are fixed.14
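The contrast with supervenience can be put schematically. The formalisation below is mine, not the text's: a family of concepts 𝒜 supervenes on a family ℬ when sameness at the ℬ-level fixes sameness at the 𝒜-level, whereas Lessing-style abstraction, as reconstructed here, requires only that an abstract description never applies without some concrete description fitting it out.

```latex
% Supervenience (schematic): B-indiscernibility implies A-indiscernibility
\forall s_1, s_2 \;:\;
  \big[\,\forall B \in \mathcal{B},\; B(s_1) \leftrightarrow B(s_2)\,\big]
  \;\Rightarrow\;
  \big[\,\forall A \in \mathcal{A},\; A(s_1) \leftrightarrow A(s_2)\,\big]

% Lessing-style abstraction (first condition only):
% an abstract description applies only if some concrete one does
\forall s \;:\; A(s) \;\Rightarrow\; \exists B \in \mathcal{B} \;:\; B(s)
```

The work example shows why the first relation is stronger: washing dishes can satisfy 'working' in one setting and fail to in another, so agreement at the concrete level need not fix the abstract description.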
Philosophical niceties aside, the important point about labelling work as
an abstract concept in Lessing's sense is just that, in order to say truly that I
worked, we do not have to assume that there was some other activity I did
beyond those already mentioned. Consider the same point in the case of
Lessing's fable, illustrated in figure 2.1. The marten is wily and quick; the
grouse is slow and innocent. That is what it is for the grouse to be weaker
than the marten. The fox is weaker than the wolf. But this is not a new
relation between the fox and the wolf beyond the ones we already know so
well and can readily identify in the picture: the wolf is bigger, stronger and
has sharper teeth. That's what its being stronger than the fox consists in. It
is just this ease of identification that accounts for the widespread use of
animals in fables. Animals are used, Lessing maintains, because their characteristics are so well known and so permanent. About the use of particular persons instead, Lessing says, 'And how many persons are so generally well
12 I have noticed that there is a tendency among reductionists of various kinds to try to collapse the distinction between abstraction and supervenience by arguing that in each case the entire abstract vocabulary will supervene on some more concrete description if only we expand the concrete descriptions to cover a broad enough piece of the surrounding circumstances ('global supervenience'). This is of course a metaphysical doctrine of just the kind I am disputing in this book. The supervenience relation is also, technically, weaker, for many definitions of supervenience do not formally require the first condition I take to be necessary for abstraction: to say that identical descriptions at the base level imply identical descriptions at the second level does not imply that no descriptions at the second level apply without some appropriate description from the base concepts, although this is often assumed.
13 The determinable-determinate relation is stronger in a second way as well, since it requires that the designated determinate descriptions be mutually exclusive.
14 This notion of supervenience - as well as Lessing's concept of abstraction - is also stronger than the notion of the abstract-concrete relation Jordi Cat has shown to be at work in Maxwell's discussions of concrete mechanical models vis-à-vis the more abstract descriptions in the energy-based Lagrangian formalism and its associated general principles of energy and work. The generality of the Lagrangian formalism, like that of a more 'abstract' phenomenological representation of electromagnetic phenomena in terms of electric and magnetic forces and energy (for Green, Maxwell and Heaviside), or that of the more 'general' representation of macroscopic media in continuum mechanics (for Stokes), lies in the elliptic assumption of the existence of an unknown underlying molecular structure represented by a mechanical model with hidden mechanisms - in which energy is manifested in motion (kinetic) or stored in elasticity (potential) - together with the realisation that an infinite number of more concrete mechanical descriptions can realise (or merely illustrate) the more abstract one. The more abstract one, however, needs independently to satisfy the mechanical principles that regulate and characterise the concepts of energy and force. See Cat 1995a, n. 23.
[Figure 2.1 appears here: an illustration of the fable's animals - the grouse, the marten, the fox and the wolf.]
known in history that by merely naming them one can awake in everyone a
concept of their corresponding way of thinking?'15 For just this reason,
though, stereotypical persons can serve. In La Fontaine, for example, the
hero of the sour grapes tale is a Gascon - Gascons typically being pictured
as swaggering and boastful.
3
Turn now from the Gascon and the fox to the stereotypical characters of the
models which 'fit out' the laws of physics. Consider F = ma. I claim this
is an abstract truth relative to claims about positions, motions, masses and
extensions, in the same way that Lessing's moral The weaker are always
prey to the stronger' is abstract relative to the more concrete descriptions
which fit it out. To be subject to a force of a certain size, say F, is an abstract
property, like being weaker than. Newton's law tells us that whatever has this
property has another, namely having a mass and an acceleration which, when
multiplied together, give the already mentioned numerical value, F. That is
like claiming that whoever is weaker will also be prey to the stronger.
In the fable Lessing proposes, the grouse is the stereotypical character
exhibiting weakness; the wolf, exhibiting strength. According to Lessing we
use animals like the grouse and the wolf because their characters are so well
known. We only need to say their names to bring to mind what general
features they have - boastfulness, weakness, stubbornness, pride, or the like.
In physics it is more difficult. It is not generally well known what the stereotypical situations are in which various functional forms of the force are
exhibited. That is what the working physicist has to figure out, and what the
aspiring physicist has to learn.
This point can be illustrated by looking at the table of contents of a typical
mechanics text, for example, the one by Robert Lindsay.16 A major part of a
book like this is aimed at the job I just described - teaching you which
abstract force functions are exhibited in which stereotypical situations. That
is like teaching you what everyone already knows about the grouse, that it is
weak vis-a-vis the marten; or about the marten, that it is weak vis-a-vis the
fox. Lindsay's text begins with a chapter of introduction, 'The Elemental
Concepts of Mechanics'. Already the second chapter starts with simple
arrangements, teaching us their force functions. Chapter Three continues with
slightly more complicated models. Chapter Four introduces energy. Again
though, immediately it turns to an account of what energy functions should
be assigned to what situations: 'Energy Relations in a Central Force Field',
'Inverse Square Field', 'Electron Energies in the Bohr Atom'. The same pattern is followed in the discussion of equilibrium (e.g., 'Equilibrium of a
Particle. Simple Cases . . . A System of Particles . . . Equilibrium of a Flexible
String'), and similarly throughout the text.
Consider some simple examples of force functions. In the opening sections
of Lindsay's chapter 2 we learn, for example, that the net force on a block
pulled by a rope across a flat surface is given by F = Fp − Ff, where Fp is from
the pull of the hand, and Ff is due to friction. Under 'Motion in a Field
Proportional to the First Power of the Distance', the basic model is that of a
block on a spring. In this arrangement, F = - kx, where x is the displacement
from the equilibrium position, and k is the spring constant. 'Motion in a Field
Proportional to the Square of the Distance' is probably the most familiar
since that is the form for the gravitational attraction between two masses.
The simplest case is the two-body system. This is a situation in which a
smaller body, m, is located a distance r from a larger, M. We learn that in
this arrangement the small mass is subject to the force GmM/r².
Once we know how to pick the characters, we can construct a fable to 'fit
out' Newton's law by leaving them alone to play out the behaviour dictated
by their characters: that is what I called at the beginning, a model for a law
of physics. For Lessing's moral, he picked the grouse and the marten. Now
we can look to see if the grouse is prey to the marten. Similarly, we have the
small mass m located a distance r from the larger mass M. Now we can look
to see if the small mass moves with an acceleration GM/r². If it does, we
have a model for Newton's law. Lessing said about his examples, 'I do not
want to say that moral teaching is expressed (ausgedrückt) through the
actions in the fable, but rather . . . that through the fable the general sentence
is led back (zurückgeführt) to an individual case.'17 In the two-body system,
and similarly in each of the models listed in Lindsay's Table of Contents,
Newton's law is 'led back' to the individual case.
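The 'looking to see' described here can be sketched numerically. Using standard astronomical data for the Earth-sun system (my illustrative values, not from the text), the acceleration Newton's law dictates, GM/r², should match the centripetal acceleration v²/r that a near-circular orbit actually exhibits:

```python
# A minimal sketch of checking whether the small mass moves with
# acceleration GM/r^2. Values are standard astronomical data for the
# Earth-sun system (illustrative; not taken from the text).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_sun = 1.989e30   # mass of the sun, kg
r = 1.496e11       # mean Earth-sun distance, m
v = 2.978e4        # Earth's mean orbital speed, m/s

a_newton = G * M_sun / r**2   # what Newton's law dictates
a_orbit = v**2 / r            # centripetal acceleration actually exhibited
print(a_newton, a_orbit)      # both come out near 5.9e-3 m/s^2
```

To the extent the two agree, the Earth-sun pair is a model for Newton's law in exactly the sense of the fable: the general claim is led back to an individual case.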
Consider now Lessing's other claim, which I said at the beginning would
be central to my argument:
The general exists only in the particular, and can only become graphic in the particular.18
On my account force is to be regarded as an abstract concept. It exists only
in the more specific forms to which it is led back via models of the kind
listed in Lindsay's Table of Contents. It is not a new, separate property,
different from any of the arrangements which exhibit it. In each case being
in this arrangement - for example, being located at a distance r from another
I conclude with some possible lessons about truth, objectivity and realism in
physics. Nowadays the social constructivists provide us with powerful arguments against taking the laws of physics as mirrors of nature. Scientists, after
all, operate in a social group like any other; and what they do and what they
say are affected by personal motives, professional rivalries, political pressures, and the like. They have no special lenses that allow them to see through
to the structure of nature. Nor have they a special connection or a special ear
that reveals to them directly the language in which the Book of Nature is
written. The concepts and structures that they use to describe the world must
be derived from the ideas and concepts that they find around them. We can
improve these concepts, refine the structure, turn them upside down, inside
out; we can even make a rather dramatic break. But always the source must
be the books of human authors and not the original Book of Nature. What
we end up with through this process is bound to be a thoroughly human and
social construction, not a replica of the very laws that God wrote.
What then of the startling successes of science in remaking the world
around us? Don't these successes argue that the laws on which the enterprise
is based must be true? Social constructivists are quick to point out that the
successes are rather severely restricted to just the domain I mentioned - the
world as we have made it, not the world as we have found it. With a few
notable exceptions, such as the planetary systems, our most beautiful and
exact applications of the laws of physics are all within the entirely artificial
and precisely constrained environment of the modern laboratory. That in a
sense is a commonplace. Consider one of the founders of econometrics,
Trygve Haavelmo, who won the Nobel prize in 1989 for his work in originating this field. Haavelmo remarks that physicists are very clever. They confine
their predictions to the outcomes of their experiments. They do not try to
predict the course of a rock in the mountains and trace the development of
the avalanche. It is only the crazy econometrician who tries to do that, he
says.19
Even when the physicists do come to grips with the larger world they do
not behave like the econometrician, argue the social constructivists. They do
not take laws they have established in the laboratory and try to apply them
outside. Rather, they take the whole laboratory outside, in miniature. They
The conclusion I am inclined to draw from this is that, for the most part, the
laws of physics are true only of what we make. The social constructivists
tend to be scornful of the 'true' part. There is almost always the suggestion
lurking in their writings that it is no surprise that the laws work for the very
situations they have been designed to work for. The scientists in turn tend to
shrug their shoulders in exasperation: 'You try for a while and you'll find
out. It is a major achievement to get anything to work, and it is just as hard
to get a good model to describe it when it does.'
The observation that laws of physics are general claims, like the morals of
fables, and that the concepts they employ are abstract and symbolic can provide a middle ground in the dispute. Newton's law, for instance, can be true
of exactly those systems that it treats successfully; for we have seen how we
can take it to be true of any situation that can be simulated by one of the
models where the force puts on a concrete dress. That does not mean that we
have to assume that Newton has discovered a fundamental structure that governs all of nature. That is part of the point of seeing force as an abstract
concept, like work, and not a more concrete one, like extension. We may ask
of any object, 'What is its extension?' and expect there to be an answer. But
not work. My children's teachers used to say to me 'Play is a child's work'.
Were they right? I am inclined to think this is one of those situations where
they were neither right nor wrong. The normal activities of middle-class preschoolers are not an arena to which we can lead back the abstract concept of
work and its relations, like labour, value and leisure. The concepts do not
apply here.
Similarly with force. You may assume that for every object - this book, a
ship in the Atlantic Ocean, or the rocks sliding over each other in the avalanche - there is an answer to the question, 'What is the force on this object?'
But you need not. Whether you do will depend on how widely you think our
models apply. I have argued that the laws are true in the models, perhaps
literally and precisely true, just as morals are true in their corresponding
fables. But how much of the world are fables true of? I am inclined to think
that even where the scientific models fit, they do not fit very exactly. This
question bears on how true the theory is of the world. But it is a different
question from the one at stake here. Choose any level of fit. Can we be
assured that for every new situation, a model of our theory will fit at that
level, whether it be a model we already have, or a new one we are willing
to admit into our theory in a principled way? This is a question that bears,
not on the truth of the laws, but rather on their universality.
What about force? You may agree with the great British physicist Kelvin21
that the Newtonian models of finite numbers of point masses, rigid rods, and
springs, in general of inextendable, unbendable, stiff things can never simulate very much of the soft, continuous, elastic, and friction-full world around
us. But that does not stop you from admitting that a crowbar is rigid, and,
being rigid, is rightly described by Newton's laws; or that the solar system
is composed of a small number of compact masses, and, being so composed,
it too is subject to Newton's laws. It is a different question to ask, 'Do
Newton's laws govern all of matter?' from 'Are Newton's laws true?' Once
we recognise the concept of force as an abstract concept, we can take different views about how much of the world can be simulated by the models that
give a concrete context to the concept. Perhaps Newton's models do simulate
primarily what we make with a few fortuitous naturally occurring systems
like the planets to boot. Nevertheless, they may be as unproblematically true
as the unexceptionable and depressing claim that the weaker are prey to the
stronger.
ACKNOWLEDGEMENTS
This chapter was originally published in almost exactly the form it appears here in
Cartwright 1991. My thanks to Conrad Wiedeman for his help and to Stanford-in-Berlin for the opportunity of continuing my studies of Lessing; also to J. B. Kennedy
for discussions of Ne'eman.
21 I have learned about Kelvin from Norton Wise. Cf. Smith and Wise 1989.
Where do laws of nature come from? This will seem a queer question to a
post-logical-positivist empiricist. Laws of nature are basic. Other things come
from, happen on account of, them. I follow Rom Harré1 in rejecting this
story. It is capacities that are basic, and laws of nature obtain - to the extent
that they do obtain - on account of the capacities; or more explicitly, on
account of the repeated operation of a system of components with stable
capacities in particularly fortunate circumstances. Sometimes the arrangement
of the components and the setting are appropriate for a law to occur naturally,
as in the planetary system; more often they are engineered by us, as in a
laboratory experiment. But in any case, it takes what I call a nomological
machine to get a law of nature.
Here, by law of nature I mean what has been generally meant by 'law' in
the liberalised Humean empiricism of most post-logical-positivist philosophy
of science: a law of nature is a necessary regular association between properties antecedently regarded as OK. The association may be either 100 per
cent - in which case the law is deterministic - or, as in quantum mechanics,
only probabilistic. Empiricists differ about what properties they take to be
OK; the usual favourites are sensible properties, measurable properties and
occurrent properties. My objections do not depend on which choice is made.
The starting point for my view is the observation that no matter how we
choose our OK properties, the kinds of associations required are hard to come
by, and the cases where we feel most secure about them tend to be just the
cases where we understand the arrangement of capacities that gives rise to
them. The point is that our knowledge about those capacities and how they
operate in given circumstances is not itself a catalogue of modalised regularity claims. It follows as a corollary from my doctrine about where laws of
nature come from that laws of nature (in this necessary regular association
50
sense of 'law') hold only ceteris paribus - they hold only relative to the
successful repeated operation of a nomological machine.
What is a nomological machine? It is a fixed (enough) arrangement of
components, or factors, with stable (enough) capacities that in the right sort
of stable (enough) environment will, with repeated operation, give rise to the
kind of regular behaviour that we represent in our scientific laws. The next
four chapters will argue for the role of nomological machines in generating
a variety of different kinds of laws: the laws we test in physics, causal laws,
results in economics and probabilistic laws. This chapter aims to provide a
sense of what a nomological machine is and of why the principles we use to
construct nomological machines or to explain their operation cannot
adequately be rendered as laws in the necessary regular association sense of
'law'.
2
I want to thank Jordi Cat for discussions and for contributing significantly to this section of
the chapter.
1 Feynman 1992.
radius of the circular path and towards its centre. In the case of the orbiting
planet, the constituents of the nomological machine are the sun, characterised
as a point-mass of magnitude M, and the planet, a point-mass of magnitude
m, orbiting at a distance r and connected to the former by a constant attractive
force directed towards it. Newton's achievement was to establish the magnitude of the force required to keep a planet in an elliptical orbit:
F = -GmM/r2
See I. Newton, Principia, Proposition 11. The proof solves the so-called 'direct Kepler's
problem'. See Newton 1729.
Ibid., Proposition 13. The proof provides a solution to the so-called 'inverse Kepler's problem'.
A sketch of the analytical proof goes as follows. In the relation between a force, the mass of
a body and the acceleration the body undergoes as a result of the force (alternatively, the
acceleration by virtue of which it is able to exert the force - Newton's second law of motion) the force can be expressed as a function of the position of the body and of time: F(x,t) = m
d²x/dt². A transformation into polar co-ordinates, the radius r and the angle φ, allows for an
expression of the force in terms of the radial co-ordinate, F(r,t), and the angular one, F(φ,t).
By eliminating the time parameter, one can obtain an expression for the force in terms of r
and φ only: F(r,φ) = −(l²/mr²)(d²(1/r)/dφ² + 1/r), where l is the angular momentum of the system.
This is the 'polar orbital equation'. Then by differentiating the orbital equation of the ellipse,
1/r = c(1 + e cos φ) (e is the eccentricity and c is a constant), one can arrive at the inverse-square form of the required force, F = −k′(1/r²), in the direction of the sun, where k′ is empirically determined by the arrangement of the system (k′ = GmM). A discussion of Newton's
geometrical proofs and their correspondence with the modern analytical substitutes can be
found in Brackenridge 1995.
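The analytical route in this footnote can be checked symbolically. The sketch below (my reconstruction of the standard Binet-equation step, using sympy; not the book's own notation) substitutes the ellipse 1/r = c(1 + e cos φ) into the polar orbital equation and confirms that the bracketed term collapses to a constant, leaving an inverse-square force:

```python
import sympy as sp

phi, c, e, l, m = sp.symbols('phi c e l m', positive=True)
u = c * (1 + e * sp.cos(phi))   # u = 1/r along an ellipse with focus at the origin

# Polar ('Binet') orbital equation: F = -(l^2/m) u^2 (d^2u/dphi^2 + u)
F = -(l**2 / m) * u**2 * (sp.diff(u, phi, 2) + u)

# d^2u/dphi^2 = -c e cos(phi), so the bracket collapses to the constant c,
# giving F = -(c l^2 / m) u^2, an attractive force proportional to 1/r^2.
print(sp.simplify(F / u**2))    # -c*l**2/m, independent of phi
```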
It is worth mentioning that Newton's nomological machine derives its unifying power from
its ability to account in addition for the regularities described in Kepler's second and third
law. But in the case of the third law (that the square of the period of a planet's motion is
proportional to the cube of the major axis of its orbit) Newton's description of the setting
includes the assumption that the planet's mass is negligible compared to the mass of the sun.
Kepler's law describes then only an approximation to the actual regularities displayed by the
larger planets, such as Jupiter and Saturn. Cf. Goldstein 1980, p. 101.
Feynman 1992, p. 23.
conditions that are necessary to ensure the stability of the original Newtonian
machine.9
especially pressing here because it does not even make sense to think of
either of the two capacities being exercised on its own.12
These examples bring out the wholistic nature of the project we undertake
in theory formation in exact science. We must develop on the one hand
concepts (like 'the force due to gravity', 'the force due to charge'13 or 'capacitance', 'resistance', 'impedance' . . .) and on the other, rules for combination; and what we assume about each constrains the other, for in the end the
two must work together in a regular way.14 When the concepts are instantiated
in the arrangements covered by the rules, the rules must tell us what happens,
where regularity is built into the demand for a rule: whenever the arrangement is thus-and-so, what happens is what the rule says should happen.
Developing concepts for which we can also get rules that will work
properly in tandem with them is extremely difficult, though we have succeeded in a number of subject areas. In both physics and economics we have
a variety of formal theories with special concepts and explicit rules that allow
us to predict what regular behaviours should occur whenever the concepts
are instantiated in the prescribed kinds of arrangements. And in physics,
where we have been able to build clear samples of these arrangements, a
number of our formal theories are well confirmed. Economics generally must
rely on a more indirect form of testing, and the verdicts there are far less
clear. At any rate, the success in various branches of physics in devising
special concepts and laws that work in cases where the concepts clearly apply
shows that there are at least some domains where the requirements we have
been discussing are not impossible to fulfil.
A common metaphysical assumption about the completeness (or
completability) of theory would go further and put an even more severe
demand on our scientific concepts. The assumption was well expressed by
John Stuart Mill:
The universe, so far as known to us, is so constructed that whatever is true in any
one case is true in all cases of a certain description: the only difficulty is to find what
description.15
The sense of completeness I have in mind is this: a theory is complete with
respect to a set of cases when it supplies for those cases the descriptions that
Mill expects plus the principles that connect the descriptions.
12 This is in sharp contrast with the method of representation just discussed in which factors with different capacities are combined in a single equation. Generally in these cases the value of the other relevant causes can be set to 'zero' to represent situations in which they do not operate.
13 In my vocabulary these would be called 'the Coulomb capacity', 'the capacity for gravitational attraction' and so on.
14 This point, I take it, is similar to that of Donald Davidson in Davidson 1995.
15 Mill 1843, vol. 1, p. 337.
What about this additional demand? Should we accept it? I urge 'no'. The
constraints imposed on concept formation in exact science by the demands
to build at the same time a system of matching rules that will work together
with the concepts in the right way are so severely confining that we have
only satisfied them in a few formal theories in physics, and with great effort.
And even in physics, we never have had a success, nor a near success, at
completeness. It is only subject to the big ceteris paribus condition of the
operation of an appropriate nomological machine that we can ever expect,
'that whatever is true in any one case is true in all cases'. We might well
of course aim for completeness in any case where we have an empirically
well-grounded research programme that offers promising ideas for how to
achieve it. But in general we have no good empirical reason to think the
world at large lends itself to description by complete theories.
This is why the idea of a nomological machine is so important. It is, after
all, only a philosophical concept, like 'unconditional law' or 'complete
theory' or 'universal determinism', a way of categorising and understanding
what happens in the world. But it has the advantage over these that it adds
less than they do to what we are given in our observations of how successful
formal theories work; and it shows that we do not need to use these more
metaphysically extensive concepts in order to make sense of either the successes of our exact sciences or of the pockets of precise order that these
sciences can describe. Where there is a nomological machine, there is lawlike behaviour. But we need parts described by special concepts before we
can build a nomological machine. The everyday concepts of irritability and
inaccuracy will not do, it seems, nor the concept of acceleration in terms of
rate of change of velocity with distance (dv/dx) rather than with time (dv/dt),
which the Medievals struggled to make a science of. We also need a special
arrangement: a bunch of resistors and capacitors collected together in a paper
bag will not conduct an electric current. When we understand it like this, we
are not inclined to think that exact science must be completable, at least in
principle, in order to be possible at all.
There is one further central aspect of nomological machines that I have so
far not discussed: shielding. Recall the irregularity in the orbit of Uranus
from the point of view of the original model of the planetary machine. This
reminds us that it is not enough to insist that the machine have the right parts
in the right arrangement; in addition there had better be nothing else happening that inhibits the machine from operating as prescribed. As we saw in
chapter 1, even a very basic principle like F = ma needs a shield before it
can describe a regularity. We can have all the forces in all the right arrangements that license assignment of a particular 'total' force F. But we cannot
expect an acceleration a = F/m to appear if the wind is blowing too hard. The
need for shielding is characteristic of the ordinary machines we build in
repeated operation are so transparent that they go unnoted. To the extent that
this claim is borne out, to that extent we have powerful empirical evidence
that you cannot get a regularity without a nomological machine. And if nomological machines are as rare as they seem to be, not much of what happens
in nature is regular and orderly, as Mill supposed it to be. The world is after
all deeply dappled.
I argue against laws that are unconditional and unrestricted in scope. Laws
need nomological machines to generate them, and hold only on condition
that the machines run properly. But there are, as we saw in the last section,
some very well understood machines, modelled within the various disciplinary boundaries of our exact sciences. I say our understanding of these
depends on knowledge of capacities, not knowledge of laws. Is there much,
after all, in the difference? I think so, because when we refuse to reconstruct
our knowledge as knowledge of capacities, we deny much of what we know
and we turn many of our best inventions into pure guesses. What is important
about capacities is their open-endedness: what we know about them suggests
strategies rather than underwriting conclusions, as a vending-machine view
of science would require. To see the open-endedness it is useful to understand
how capacities differ from dispositions.
Disposition terms, as they are usually understood, are tied one-to-one to
law-like regularities. But capacities, as I use the term, are not restricted to
any single kind of manifestation. Objects with a given capacity can behave
very differently in different circumstances. Consider Coulomb's law, F =
q₁q₂/4πε₀r², for two particles of charge q₁ and q₂ separated by a distance
r. I will discuss this case in more detail in chapter 4. For here let us just
consider what Coulomb's law tells us about the motions of the particle pair.
It tells us absolutely nothing. Before any motion at all is fixed, the particles
must be placed in a special kind of environment; just the kind of environment
that I have described as a nomological machine. Without a specific environment, no motion at all is determined.
We may think that the natural behaviour for opposite charges is to move
towards each other and for similar charges, to separate from each other. But
it is important to keep in mind that this is not an effect in abstracto. That
motion, like any other, depends on how the environment is structured. There
is no one fact of the matter about what occurs when charges interact. With
the right kind of structure we can get virtually any motion at all. We can even
create environments in which the Coulomb repulsion between two negatively
charged particles causes them to move closer together. Figure 3.1 gives an
example, due to Towfic Shomar, from the LSE Modelling and Measurement
in Physics and Economics Project.

Two electrons e_1 and e_2 are released from rest into a cylinder as in Figure
3.1b. The cylinder is open on one side only, and it opens into a uniform
magnetic field directed along the negative z-axis. The initial distance
between the two electrons is r_1. According to the laws of electromagnetism,
the force between the two electrons is a repulsive force equal to

F = e^2/(4πε_0r_1^2) = m_e a.

Whereas e_2 will be locked inside the cylinder, e_1 will enter the magnetic field
B with a certain velocity v_1. The magnetic field will move e_1 in a circular
motion (as in the figure) with a force equal to

F = e v_1 × B.

This will take the electron e_1 into an insulated chamber attached to the
cylinder. The dimensions of the cylinder and the chamber can be set so
that the distance between the final position of e_1 and e_2 is less than r_1.
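The geometry that does the work in this set-up can be checked with a few lines of arithmetic. The sketch below is not Shomar's actual model; all the numbers (initial separation, exit point, cyclotron radius) are invented for illustration. It shows only that a circular orbit in the field can pass closer to e_2 than the separation e_1 started from, which is where the insulated chamber would be placed.

```python
import math

# Toy geometry for the set-up (all numbers invented). e2 sits at the
# origin; e1 starts at separation r1, is repelled out of the cylinder,
# and exits the mouth at (x_exit, 0) moving in the +x direction. In a
# uniform field B, e1 then moves on a circle of cyclotron radius
# R = m*v1/(e*B); we pick an R for some assumed field strength.
r1 = 1.0          # initial separation of the electrons
x_exit = 2.0      # where e1 leaves the cylinder (farther out than r1)
R = 10.0          # cyclotron radius for the chosen field

# Circle centre lies perpendicular to the exit velocity, at distance R.
cx, cy = x_exit, R

# Sweep the circular orbit and record e1's least distance from e2.
min_dist = min(
    math.hypot(cx + R * math.sin(t), cy - R * math.cos(t))
    for t in (2 * math.pi * k / 3600 for k in range(3600))
)

# The closest approach of the orbit to e2 is sqrt(cx^2 + cy^2) - R,
# which shrinks as R grows; a chamber placed there catches e1 at a
# separation smaller than the one it started with.
print(min_dist < r1)  # True for these numbers
```

The point of the arithmetic is the one made in the text: nothing in Coulomb's law alone fixes this outcome; it is the arrangement of cylinder, chamber and field that does.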
For a different kind of example, let us turn to economics, to a study by
Harold Hotelling16 of Edgeworth's taxation paradox.17 This is a case that I
have worked on with Julian Reiss, also from the LSE Modelling and Measurement in Physics and Economics Project.18 Taxes have the capacity to
affect prices. How do we characterise the effects of this capacity? Think
again about the capacity represented in Coulomb's law with respect to the
motion of oppositely charged particles. We tend to characterise this capacity
in the canonical terms I used above: opposite charges move towards each
other; similar charges, away from each other. Similarly, taxes increase prices.
The 'paradox' pointed out by Edgeworth is that this is not the only possibility.
In the right situations taxes can decrease prices, and they can do so by following just the same principles of operation that 'normally' lead to price increase.
Hotelling produced a toy model of a simple economy that illustrates Edgeworth's paradox. The economy consists of many firms which compete in the
production of the different commodities and many buyers whose sole source
of utility derives from these goods. A version of the Hotelling economy with
16 Hotelling 1932.
17 Cf. Edgeworth 1925, section II.
18 See also Hands and Mirowski 1997 and my comments in Cartwright 1997c.
[Figure 3.1b: the cylinder and attached insulated chamber (labelled 'Insulator') immersed in a uniform magnetic field, shown as a region of ×'s (field directed into the page), with the electrons' positions and the circular path of e_1 indicated.]
two kinds of commodities will suffice. For the buyers, the demand functions
are:

q_i = F_i(p_1, p_2)   (i = 1, 2).   (1)

It is assumed that these equations are solvable, such that we have the inverse
demand functions:

p_i = f_i(q_1, q_2)   (i = 1, 2).   (2)

For the producers the respective supply functions are given with the following expressions:

q_i = G_i(p_1, p_2)   (i = 1, 2)   (3)

p_i = g_i(q_1, q_2)   (i = 1, 2).   (4)

Now, let h_i(q_1, q_2) be the excess of demand price over supply price. Thus we
obtain,

h_i(q_1, q_2) = f_i(q_1, q_2) - g_i(q_1, q_2)   (i = 1, 2).   (5)

A differentiation with respect to q_j will be denoted with a subscript j, so
that,

h_ij = f_ij - g_ij.

The last definition will be the determinant of the marginal excess price
matrix:

D = h_11h_22 - h_21h_12.

Let an asterisk * denote the equilibrium values for which supply and demand
are equal. Then

h_i(q_1*, q_2*) = 0.   (6)

Now, a tax t_i per unit sold is imposed on the ith commodity, payable by the
producers. Let p_i + dp_i and q_i + dq_i be the new prices and quantities. In
equilibrium, the demand price must exceed the supply price by exactly t_i.
Hence,

h_i(q_1* + dq_1, q_2* + dq_2) = t_i.   (7)

A Taylor expansion of the first order and the subtraction of equation (6)
yield the following approximations for small t_i:

h_11dq_1 + h_12dq_2 = t_1,
h_21dq_1 + h_22dq_2 = t_2.   (8)

Solving,

dq_1 = (t_1h_22 - t_2h_12)/D,
dq_2 = (t_2h_11 - t_1h_21)/D.   (9)

The interesting effect the tax has is on the prices. For a tax t on the first
commodity alone, the price changes to buyers resulting from the tax are:

dp_1 = (t/D)(f_11f_22 - f_12f_21 - f_11g_22 + f_12g_21) = tA/D   (10)

dp_2 = (t/D)(f_22g_21 - f_21g_22) = tB/D.   (11)

Since D and t are positive, the paradoxical case arises when

f_22g_21 - f_21g_22 < 0,   (12)

for then the tax on the first commodity lowers the price of the second.
One can easily see the dual (or triple) capacity of the tax to increase or
decrease (or leave unchanged) the prices, depending on the values of the
other parameters. This dual capacity stems from the fact that the two
goods interact both in consumption and production. For a single commodity equation (10) yields:

dp = (f_11/h_11)t > 0,   (13)

since t is always positive and both f_11 and h_11 (= f_11 - g_11) are negative.
A and B are functions, through the demand and supply equations, of partial
derivatives with respect to each of the two quantities (∂f_i(q_1, q_2)/∂q_j, . . .). In the terminology I used in Nature's Capacities
and their Measurement,19 A/D represents the strength of t's capacity to affect
dp_1 and B/D, the strength of t's capacity to affect dp_2. For Hotelling's two-commodity economy, it can be shown that D is always positive. But it is
possible to construct supply and demand functions whose parameters make
B, and hence the price change dp_2, negative.

19 Cartwright 1989.
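A quick numerical instance makes the dual capacity vivid. The linear demand and supply schedules below are invented for illustration (they are not Hotelling's own numbers); they satisfy D > 0 and yet make a tax on the first commodity lower the price of the second.

```python
# Numerical check of the Edgeworth/Hotelling paradox, following the
# derivation in the text, for one invented set of linear demand and
# supply schedules (illustrative coefficients only).
t = 1.0  # tax per unit on the first commodity

# Marginal demand prices f_ij = df_i/dq_j and supply prices g_ij.
f = [[-2.0, -1.0],
     [-1.0, -2.0]]
g = [[ 1.0,  1.0],
     [ 1.0,  1.0]]

# Marginal excess prices h_ij = f_ij - g_ij and their determinant D.
h = [[f[i][j] - g[i][j] for j in range(2)] for i in range(2)]
D = h[0][0] * h[1][1] - h[1][0] * h[0][1]
assert D > 0  # holds in Hotelling's two-commodity economy

# Quantity changes from the first-order conditions, eq. (9) with t_2 = 0.
dq1 = t * h[1][1] / D
dq2 = -t * h[1][0] / D

# Price changes to buyers: dp_i = f_i1*dq1 + f_i2*dq2, as in (10)-(11).
dp1 = f[0][0] * dq1 + f[0][1] * dq2
dp2 = f[1][0] * dq1 + f[1][1] * dq2

print(dp1, dp2)  # dp1 > 0 while dp2 < 0: the tax lowers p2
```

With these coefficients f_22g_21 - f_21g_22 = -1 < 0, so condition (12) holds and dp_2 comes out negative, exactly the 'paradoxical' exercise of the tax's capacity.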
It is now time to defend explicitly my claim that we need claims about capacities to understand nomological machines and cannot make do with laws, in
the necessary regular association sense of 'law'. I shall look at two prominent
places where we can see why we need capacities instead of laws. The first is
in the principles for building nomological machines, the second for describing
20 Ryle 1949, p. 118.
21 Ibid., p. 119.

[...]
The one case we have looked at where the basic principles could legitimately be thought of
as describing what systems do using only occurrent-property language is in the simultaneous
equations models of econometrics. The equations are supposed to involve only measurable
quantities, and since each equation must be separately satisfied, the relations between measurable quantities that really occur are literally in accord with each of the principles.
[...]
when two masses interact, whatsoever these episodes look like. The Concise
Oxford English Dictionary,23 for instance, defines 'attract' when used 'of
a magnet, gravity, etc.' as 'exert a pull on'. 'Attract' and 'pull' are like
'groce' for the activities of a grocer and 'solicit' for the activities of a solicitor. They are not in the usual philosopher's list of occurrent property terms.
Rather, they mark the fact that the relevant capacity has been exercised. That
is what is in common among all the cases when masses interact as Newton
described.
Sometimes we conceal the widespread use in physics of terms like 'attract',
terms that mark the exercise of a capacity, by a kind of equivocation. We
switch back and forth between an occurrent sense of the term - a body has
attracted a second when the second moves towards it - in which Newton's
principle or Coulomb's is generally not borne out - and the sense marking
the exercise of a capacity in which the principles do seem to be true (if not
universally at least as widely as we have looked so far). 'Attract', like many
verbs in both ordinary and technical language, comes with a natural effect
attached, and with two senses. In the first sense the natural effect must occur
if the verb is to be satisfied; in the second sense, it is enough for the system
to exercise its capacity regardless of what results, i.e., for it to try to produce
the associated effect.
The trying is essential, and sometimes verbs like these have it built right
into their definition. To 'court', according to the Concise Oxford Dictionary,24
is to 'try to win the affection or favour of (a person)'. These kinds of words
are common in describing the facts of everyday life: to brake - to apply the
brakes, or to succeed in slowing the vehicle; to anchor - to lower the anchor,
or to succeed in securing the boat; to push, to pull, to resist, to retard, to
damn, to lure, to beckon, to shove, to harden (as in steel), to light (as the
fire), . . . ; and especially for philosophers: to 'explain' is not only, in its first
sense in the Concise Oxford Dictionary, to 'make . . . intelligible', but also,
in its second, to 'say by way of explanation'.25
The technical language of physics shares this feature with our more ordinary language; indeed it shares much of the same vocabulary. Attraction,
repulsion, resistance, pressure, stress, and so on: these are concepts that are
essential to physics in explaining and predicting the quantities and qualities
we can directly measure. Physics does not differ from ordinary language by
needing only some special set of occurrent property terms or directly measurable quantities stripped of all connections with powers and dispositions.
Rather, as I described in section 3, what is distinct about the exact sciences
is that they deal with capacities that can be exercised not only more or less -
push harder, or resist less - but with ones for which the strength of effort is
quantifiable, and for which, in certain very special circumstances, the exact
results of the effort may be predictable.

23 8th edn, 1990.
24 Ibid.
25 Ibid., italics added.
The second place where it is easy to see the need for capacity concepts is
when we set the nomological machine running. This is a point I make much
of at other places, so I will only summarise here. Consider the very simple
case of two charged bodies separated by a distance r. To calculate their
motions, we add vectorially the force written down in Coulomb's principle
and the force written down in Newton's law of gravity; then we substitute
the result into Newton's second law, F = ma. What then are we supposing?
First, that there is nothing that inhibits either object from exerting both its
Coulomb and its gravitational force on the other; second, no other forces are
exerted on either body; and third, everything that happens to the two bodies
that can affect their motions can be represented as a force. Notice that these
caveats all have to do with capacities and their exercise. Nothing must inhibit
either the charges or the gravitational masses from exercising their capacities.
No further capacities studied in classical dynamics should be successfully
exercised; and finally, the capacity of a force to move a body as recorded in
Newton's second law must be exercised successfully, unimpeded and without
interference.
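For concreteness, here is the vector bookkeeping the passage describes, with invented charges, masses and separation (only the two physical constants are the standard SI values):

```python
import math

# Vector composition of the Coulomb and gravitational forces on body 1
# due to body 2, substituted into Newton's second law. The charges,
# masses and separation are arbitrary illustrative inputs.
K = 8.9875517873681764e9   # Coulomb constant 1/(4*pi*eps0), N m^2 C^-2
G = 6.674e-11              # gravitational constant, N m^2 kg^-2

q1, q2 = 1e-6, -1e-6       # charges (C): opposite, so Coulomb attracts
m1, m2 = 0.01, 0.01        # masses (kg)
r = [0.0, 0.1]             # displacement from body 1 to body 2 (m)

d = math.hypot(*r)
u = [c / d for c in r]     # unit vector from body 1 towards body 2

# Coulomb force on body 1 (opposite charges -> directed along +u,
# towards body 2); gravity always attracts along +u.
f_coulomb = [-K * q1 * q2 / d**2 * c for c in u]
f_gravity = [G * m1 * m2 / d**2 * c for c in u]

# Vector sum, then Newton's second law a = F/m for body 1.
f_net = [fc + fg for fc, fg in zip(f_coulomb, f_gravity)]
a1 = [fn / m1 for fn in f_net]
print(f_net, a1)
```

The three caveats in the text are exactly what this little calculation silently assumes: both capacities are exercised unimpeded, no further force acts, and everything relevant to the motion appears as a term in f_net.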
Can we render these caveats without using the family of concepts involving
capacities? Throughout these chapters I argue that we cannot. (In particular,
I treat the first and second conditions in chapters 4 and 8; the third was
discussed at length in chapter 1.) The idea that we can do so is part of the
fundamentalist pretensions of physics: there is some vocabulary special to
physics within which we can describe everything that matters to the motions
of bodies. This view gains support, I take it, from a mistaken understanding
about how deductivity works in physics. In theories like mechanics, electromagnetism and special relativity we have had considerable success in finding
sets of occurrent property descriptions that have a kind of deductive closure:
certain kinds of effects describable in that vocabulary occur reliably in circumstances where all the causes of these kinds of effects (and their
arrangement) can be appropriately described within the designated vocabulary. But that does not cash out into regularity laws with the descriptions of
the causes and their arrangements in the antecedents and the descriptions of
the effects in the consequent. For we still need the shielding: nothing else
must occur that interferes with the capacities of those causes in that arrangement to produce those effects.
The need for this kind of addition is often obscured by the plasticity of the
language of physics. Sometimes terms in physics refer to genuinely measurable quantities that objects or systems might possess and sometimes the use
of the very same terms requires truths about the operation of capacities for
its satisfaction. This plasticity gives physics two different ways to finesse the
problems I have been discussing - either narrow the range of the antecedent
to include the ceteris paribus conditions right in it, or expand the range of
the consequent to cover whatever occurs when the capacities in question
operate.
We have seen lots of illustrations of the second device already, for instance
with the introduction of words like 'attract' and 'repel' into Coulomb's law
and the law of gravity. The first can be seen in the simple case of the law of
the lever. 'Lever' can be defined in terms of occurrent properties, making no
allusion to capacities and their exercise. So, we sometimes use 'lever' to
mean rigid rod, where a rod is rigid just in case the distances between all
the mass points that make it up remain constant through the motions of these
mass points. But sometimes we use 'lever' only for rigid rods so placed that
their capacity to exhibit the behaviour required in the law of the lever will
operate unimpeded. In this sense (if physics is right about the capacities of
rigid rods), then a lever is bound to satisfy the law of the lever.
So far I have argued that there are jobs we do - and indeed should do - with our scientific principles that cannot be done if we render them as laws
instead of as descriptions of capacities. There is one answer to my plea for
capacities that sidesteps these defences of capacities. The answer employs a
kind of transcendental argument. It does not attempt to show how it is possible to do these jobs without capacities but rather tries to establish that it
must be possible to do so. I borrow the form from arguments made by Bas
van Fraassen and by Arthur Fine in debating more general questions of scientific realism.26 The argument presupposes that we have available a pure data
base, cleansed of capacities and their non-Humean relativities. The objection
goes like this: 'You, Cartwright, will defend the design of a given machine by
talking about what impedes and what facilitates the expression of the capacities in question. I take it this is not idle faith but that in each case you will
have reason for that judgement. These reasons must ultimately be based not
in facts about capacities, which you cannot observe, but in facts about actual
behaviour, which you can. Once you have told me these reasons, I should be
able to avoid the digression through capacities and move directly to the same
conclusions you draw with capacities. Talk of capacities may provide a convenient way to encode information about behaviours, but so long as we insist
that scientific claims be grounded in what can be observed, this talk cannot
contribute any new information.'
But what about this decontaminated data base? Where is it in our experience? It is a philosophical construction, a piece of metaphysics, a way to [...]

What facts then are they that make our capacity claims true? Let me turn the question around and see what the traditional view has to say. What facts make
law claims true in the necessary regular association sense of law? There are,
I think, only two honest kinds of answer for an empiricist.
The first is that regularities make law claims true, real regularities, ones
that actually occur. These are undeniably facts in the world, not for instance
putative facts in some merely possible world. They are thus proper empiricist
candidates for truth makers. But we know that this lets in both too little and
too much. Start with too much. What about all the accidental regularities?
There is an honest empiricist answer:29 laws are those regularities that cover
the widest range of occurrences in the most efficient way.30 The objection
that there are too few regularities was taken up by Bertrand Russell:31 a
good many of the claims we are most interested in, especially in contexts of
forecasting and planning, are about situations that may never occur or only
rarely get repeated. Russell claimed that physics solves this problem by using
very abstract descriptions; at that level 'the same thing' does generally occur
repeatedly. (So, for example, the trajectories of the planets and of cannon
balls and of electrons in a cloud chamber are all supposed to instantiate F =
ma.)
My objection is the same in both cases. I summarise the lessons argued
for in various places throughout this book: there are no such regularities to
begin with. Unless we take capacities robustly, Coulomb's and Newton's
principles are ruled out immediately. Perhaps they are to get relegated to the
status of calculational tools for getting 'real' regularities, like F = ma. But
even this is not a true regularity without adding to the antecedent the caveat
that the force operates unimpeded. Russell's proposal fares better; but, as I
argued in section 5, only if we allow our abstract vocabulary to include terms
like 'attract' and 'repel', terms that have implications about capacities and
their operations built in. So regularity theorists cannot even get started unless
they too take facts involving how capacities operate to be part of the constitution of the world.32
Alternatively, there are proposals33 to take necessitation as one of the kinds
of facts that make up the world. Then we can still be empiricist in the sense
that we can stick to the demand that scientific claims be judged against facts
about the real world around us.34 The drawback to this proposal from my
point of view is not that it lets modal facts into the world but rather that it
lets in the wrong kind of modal fact. The inversion of a population of atoms
does not necessitate the emission of coherent radiation; it allows it. But it
allows it in some very special way. After all, anything can cause anything
else. In fact, it seems to me not implausible to think that, with the right kind
of nomological machine, almost anything can necessitate anything else. That
is, you give me a component with a special feature and a desired outcome,
and I will design you a machine where the first is followed by the second
with total reliability. Just consider, for example, Rube Goldberg machines or
Paolozzi sculptures.
So, if anything can cause practically anything else, what is special about
the claim that an inversion in a population of atoms allows (or can cause)
coherent radiation? We can use the expression we often introduce in
explaining our intuitions about laws of nature here: the inversion allows the
coherent radiation by virtue of the structure of the world, or by virtue of the
way the world is made. But what does that mean? To mark the distinction
between the kind of accidental possibility, where anything can result in anything else, and this other more nomological sense of possibility, Max Weber
labelled the latter 'objective possibility'.35 Weber's ideas seem to me very
much worth pursuing in our contemporary attempts to understand scientific
knowledge. But so far I still think that the best worked out account that suits
our needs most closely is Aristotle's doctrines on natures, which I shall
defend in the next chapter. Capacity claims, about charge, say, are made true
by facts about what it is in the nature of an object to do by virtue of being
charged. To take this stance of course is to make a radical departure from
the usual empiricist view about what kinds of facts there are.
That returns me to the plea for the scientific attitude. Philosophical arguments for the usual empiricist view about what there is and what there is not
are not very compelling to begin with. They surely will need to be given up
if they land us with a world that makes meaningless much of what we do
and say when we use our sciences most successfully. What makes capacity
claims true are facts about capacities, where probably nature's grammar for
capacities is much like our own - or at least as much like our own as any
other claims about the structure of the world that we back-read from successful scientific formulations. What makes true, then, the claim, 'Inversion in a
population of atoms has the capacity to produce coherent radiation'? In
simple Tarski style, just that: the fact that inversion has the capacity to produce coherent radiation. And this fact, so far as our evidence warrants,36 has
as much openness about it with respect to determining occurrent properties
as does our own claim about the capacity.

34 Perhaps I should say that this allows us to satisfy the ontological demands of empiricism. There are of course in addition in the empiricist canon also epistemological demands and demands about how meanings can be fixed. In my view, as I argue in different places here and in Cartwright 1989, all these kinds of demands are just as well met by claims about capacities as by claims about occurrent properties.
35 Cf. Weber's 'The Logic of Historical Explanation' in Runciman 1978, which is translated from Weber 1951. In that essay, Weber attributes the concept of objective possibility to the German physiologist Johannes von Kries.
7
I have been defending the claim that facts about capacities and how they
operate are as much a part of the world as pictured by the exact sciences as
are facts about occurrent properties and measurable quantities. One may be
inclined to query what all the fuss is about. Once we have forsaken the
impressions-and-ideas theory of concept formation defended by Hume and
all forms of sense-data theories as well, how are we to draw a distinction
between facts about occurrent properties and ones about capacities in the first
place?
I have no qualms about giving up the distinction. But in doing so we must
not lose sight of one important feature of capacities that affects our doctrines
about the limits of science. There is no fact of the matter about what a system
can do just by virtue of having a given capacity. What it does depends on its
setting, and the kinds of settings necessary for it to produce systematic and
predictable results are very exceptional. I have argued here that it takes a
nomological machine to get a regularity. But nomological machines have
very special structures. They require the conditions to be just right for a
system to exercise its capacities in a repeatable way, and the empirical indications suggest that these kinds of conditions are rare. No matter how much
knowledge we might come to have about particular situations, predictability
in the world as it comes is not the norm but the exception. So we should
expect regularities to be few and far between. If we want situations to be
predictable, we had better engineer them carefully.
ACKNOWLEDGEMENTS
This chapter is dedicated to Rom Harre, who taught me that it is all right to believe
in powers. I wish to thank Jordi Cat, who has contributed substantially to section 2
and has provided valuable editorial help and useful conversations, and Towfic Shomar
and Julian Reiss, who developed the examples in section 4. Work on this chapter was
supported by the LSE Modelling and Measurement in Physics and Economics Project.
About half of this chapter has been previously published. Sections 1 and 2, bits of
section 5 and all of section 7 appeared in much the same form in Cartwright 1997b.
Section 3 is entirely new, as are the crucial examples of section 4 and all of section
6. Some of section 5 appeared in Cartwright 1992.

36 Again I should add that I do not think there are any successful arguments that the evidence here is less good than for any of the more usual empiricist claims about what kinds of facts there are. Indeed, if my arguments are right, the evidence is much better, since a reconstruction of scientific claims using capacity language will go very much farther in capturing our empirical successes than will a reconstruction that uses only the language of occurrent properties.
Part II
Beyond regularity
of what happens or what can happen. John Stuart Mill talked of knowledge
like this as knowledge of what things tend to do. But that kind of talk is rarely
heard now.1 Our contemporary account of scientific knowledge is dominated
instead by categories favoured by British empiricists from the eighteenth and
the twentieth centuries, most notably Locke, Berkeley, Hume, Ayer and Ryle.
These are the categories of the 'sensible' or 'observable' properties and of
properties that are 'occurrent' as opposed to powers and dispositions. But we
get a far better description of scientific knowledge if we adopt a category of
Aristotle's. The knowledge we have of the capacity of a feature is not knowledge of what things with that feature do but rather knowledge of the nature
of the feature. This chapter will explain why I reject the conventional categories of British empiricism and turn instead to more ancient ones. A concept
like Aristotle's notion of nature is far more suitable than the concepts of law,
regularity and occurrent property to describe the kind of knowledge we have
in modern science: knowledge that provides us the understanding and the
power to change the regularities around us and produce the laws we want.
2
Historical background
the world as a whole, made at the point of creation, but derives proximately from the
local conditions and characters in the Aristotelian pattern
if we look more closely at the seventeenth century we see an insistence, even more
adamant than Aquinas', upon the autonomy of physics from theology. Descartes
insists on it most stringently
The Drang nach Autonomie of physics, even as developed by such theological
thinkers as Descartes, Newton, and Leibniz, needed an intermediate link between
God's decree and nature. Aquinas had needed such a link to explain proximate causation, and found it in the Aristotelian substantial forms (individual natures). For the
seventeenth century another kind was needed, one that could impose a global constraint on the world process. In general terms, this link was provided by the idea that
nature has its inner necessities, which are not mere facts, but constrain all mere facts
into a unified whole. The theological analogy and dying metaphor of law provided
the language in which the idea could be couched.8
My thesis here is that this story is distorted, at least as it applies to modern
experimental science. We have not replaced natures by laws of nature. For
our basic knowledge - knowledge of capacities - is typically about natures
and what they produce. Instead what we have done is to replace occult powers
by powers that are visible, though it may take a very fancy experiment to see
them. This is already apparent in Francis Bacon. Bacon still employs the
Aristotelian idea of natures or essences, but for him these are not hidden.
Bacon looks for the explanatory essences, but he looks for them among qualities that are observable. Consider his hunt for the essence of heat.9 He makes
large tables of situations in which heat occurs, in which it is absent, and in
which it varies by degrees. Instances agreeing in the Form of Heat include,
for instance, rays of the sun; damp, hot weather; flames; horse dung; strong
vinegar; and so forth. Then he looks to see what other quality is always
present when heat is present and always absent when heat is lacking. In this
way he finds the true, simple nature that constitutes heat: motion. The point
is that Bacon still hopes to find the nature of heat, but among visible, not
occult, qualities.
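Bacon's eliminative procedure is mechanical enough to state in a few lines. The instances below are a cartoon of his tables of presence and absence, not his actual entries; the procedure keeps just those qualities present in every hot instance and absent from every cold one.

```python
# A toy version of Bacon's tables of presence and absence: find the
# qualities present in every instance where heat is present and absent
# wherever heat is absent. The instances are caricatures of his tables.
instances = [
    ({"sun-rays", "motion", "light"}, True),   # heat present
    ({"flame", "motion", "light"}, True),
    ({"horse-dung", "motion"}, True),
    ({"moonlight", "light"}, False),           # heat absent
    ({"cold-vinegar"}, False),
]

# Qualities common to all instances agreeing in the Form of Heat ...
present = set.intersection(*(qs for qs, hot in instances if hot))

# ... and never found where heat is lacking.
form_of_heat = {q for q in present
                if not any(q in qs for qs, hot in instances if not hot)}
print(form_of_heat)  # {'motion'}: the surviving candidate 'form'
```

On these toy tables the method isolates motion, which is indeed the nature Bacon claimed to have found; the point in the text is that he sought it among visible qualities.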
Modern explanation similarly relies on natures, I will argue, though
modern natures are like Bacon's and unlike those of the Scholastics in that
they are attributed to structures and qualities we can independently identify.
Generally they differ from Bacon's in that they do not lie on the surface and
are not to be observed with the naked eye. We often need very subtle and
elaborate experiments in order to see them. Modern science insists that we
found explanation on experimentally identifiable and verifiable structures and
qualities. But, I maintain, what we learn about these structures and qualities
is what it is in their natures to do.
What we have done in modern science, as I see it, is to break the connection between what the explanatory nature is - what it is, in and of itself - and what it does. An atom in an excited state, when agitated, emits photons
and produces light. It is, I say, in the nature of an excited atom to produce
light. Here the explanatory feature - an atom's being in the excited state - is a
structural feature of the atom, which is defined and experimentally identified
independently of the particular nature that is attributed to it. It is in the nature
of the excited atom to emit light, but that is not what it is to be an atom in
an excited state. For modern science what something really is - how it is
defined and identified - and what it is in its nature to do are separate things.
So even a perfect and complete modern theory would never have the complete deductive structure that the Aristotelians envisaged. Still, I maintain,
the use of Aristotelian-style natures is central to the modern explanatory programme. We, like Aristotle, are looking for 'a cause and principle of change
and stasis in the thing in which it primarily subsists', and we, too, assume
that this principle will be 'in this thing of itself and not per accidens'.
Yet even at this very cursory level of description we differ from Aristotle
in three important ways. First, as in my example of an atom in an excited
state, we assign natures not to substances but rather to collections or configurations of properties, or to structures. Second, like the early empiricists and
the mechanical philosophers of the scientific revolution, modern physics supposes that the 'springs of motion' are hidden behind the phenomena and that
what appears on the surface is a result of the complex interaction of natures.
We no longer expect that the natures that are fundamental for physics will
exhibit themselves directly in the regular or typical behaviour of observable
phenomena. It takes the highly controlled environment of an experiment to
reveal them. Third, having made the empiricist turn, we no longer identify
natures with essences. As I have described in this section, in modern science
we separate our definition of a property from our characterisation of what
kind of change it naturally produces. Still, when we associate a particular
principle of change with a given structure or characteristic, we expect that
association to be permanent, to last so long as the structure is what it is.
Indeed, it is this permanence of association that I underline by claiming that
modern science still studies Aristotelian-style natures. Of course these are not
really Aristotelian natures. For one thing, we seem to share none of the concerns about substance and individuation in which Aristotle's concept was
embedded. There are a number of other differences as well. Nevertheless I
Buchdahl 1969.
esting. The ultimate aim is to find out how the charged bodies interact not
when their masses are zero, nor under any other specific set of circumstances,
but rather how they interact qua charged. That is the second stage of inquiry:
we infer the nature of the charge interaction from how charges behave in
these specially selected 'ideal' circumstances.
The key here is the concept 'ideal'. On the one hand we use this term to
mark the fact that the circumstances in question are not real or, at least, that
they seldom obtain naturally but require a great deal of contrivance even to
approximate. On the other, the 'ideal' circumstances are the 'right' ones right for inferring what the nature of the behaviour is, in itself. Focusing on
the first aspect alone downplays our problems. We tend to think that the chief
difficulties come from the small departures from the ideal that will always
be involved in any real experiment: however small we choose the masses in
tests of Coulomb's law, we never totally eliminate the gravitational interaction between them; in Galilean experiments on inertia, the plane is never
perfectly smooth nor the air resistance equal to zero; we may send our experiments deep into space, but the effect of the large massive bodies in the
universe can never be entirely eliminated; and we can perform them at
cryogenic temperatures, but the conditions will never, in fact, reach the
ideal.
The problem I am concerned with is not whether we can get the system
into ideal circumstances but rather, what makes certain circumstances ideal
and others not. What is it that dictates which other effects are to be
minimised, set equal to zero, or calculated away? This is the question, I
maintain, that cannot be answered given the conventional empiricist account
of scientific knowledge. If we consider any particular experiment, it may
seem that the equipment we move about, the circumstances we contrive,
and the properties we calculate away are ones that can be described without
mentioning natures. But in each case, what makes that arrangement of equipment in those particular circumstances 'ideal' is the fact that these are the
circumstances where the feature under study operates, as Galileo taught, without hindrance or impediment, so that its nature is revealed in its behaviour.
Until we are prepared to talk in this way about natures and their operations,
to fix some circumstances as felicitous for a nature to express itself and
others as impediments, we will have no way of determining which principle
is tested by which experiment. It is this argument that I develop in the next
section.
Before turning to this argument I should make a point about terminology.
The terms capacity and nature, as I use them, are closely related. When we ascribe
to a feature (like charge) a generic capacity (like the Coulomb capacity) by
mentioning some canonical behaviour that systems with that capacity would
display in ideal circumstances, then I say that that behaviour is in the nature
of that feature. Most of my arguments about capacities could have been put
in terms of natures had I recognised soon enough how similar capacities, as
I see them, are to Aristotelian natures. On the other hand, the use of the term
'natures' would seem very odd in the contemporary philosophical literature
on causation, and would probably divert attention from the central points I
want to make there about capacities versus laws, so perhaps it is not such a
bad idea to keep both terms.
4
For anyone who believes that induction provides the primary building tool
for empirical knowledge, the methods of modern experimental physics must
seem unfathomable. Usually the inductive base for the principles under test
is slim indeed, and in the best experimental designs, where we have sufficient
control of the materials and our knowledge of the requisite background
assumptions is secure, one single instance can be enough. The inference of
course is never certain, nor irrevocable. Still we proceed with a high degree
of confidence, and indeed, a degree of confidence that is unmatched in large-scale studies in the social sciences where we do set out from information
about a very great number of instances. Clearly in these physics experiments
we are prepared to assume that the situation before us is of a very special
kind: it is a situation in which the behaviour that occurs is repeatable. Whatever happens in this situation can be generalised.
This particular kind of repeatability that we assume for physics experiments requires a kind of permanence of behaviour across varying external
conditions that is comparable to that of an essence although not as strong.
For example, we measure, successfully we think, the charge or mass of an
electron in a given experiment. Now we think we know the charge or mass
of all electrons; we need not go on measuring hundreds of thousands. In so
doing we are making what looks to be a kind of essentialist assumption: the
charge or mass of a fundamental particle is not a variable quantity but is
characteristic of the particle so long as it continues to be the particle it is.
In most experiments we do not investigate just the basic properties of
systems, such as charge, but rather more complicated trains of behaviour.
Diagrammatically we may think of Galileo's attempts to study the motions
of balls rolling down inclined planes; or, entirely at the opposite end of the
historical spectrum, the attempts in Stanford's Gravity-Probe-B experiment
to trace the precession of four gyroscopes in space in order to see how they
are affected by the space-time curvature relativistically induced by the earth.
Here too some very strong assumptions must back our willingness to draw a
general conclusion from a very special case. On the surface it may seem that
the licence to generalise in these cases can be put in very local terms that
need no reference to natures. We require only the assumption that all systems
so situated as the one in hand will behave identically. But on closer inspection
we can see that this is not enough.
We may begin to see why by considering Hume himself. Hume maintained
the principle 'same cause, same effect'. For him every occurrence is an exemplar of a general principle. It is simply a general fact about the world, albeit
one we can have no sure warrant for, that identically situated systems behave
identically. Hence for Hume the licence to generalise was universal. But not
for us. We cannot so easily subscribe to the idea that the same cause will
always be succeeded by the same effect. Hume assumed the principle to be
true, though not provable. He worried that principles like this one could only
be circularly founded, because they could have no evidence that is not inductive. But nowadays we question not just whether our belief in them can be
well-founded but whether they are true.
Even if we were content with merely inductive warrant, in what direction
does our evidence point? The planetary motions seem regular, as do the successions of the seasons; but in general nature in the mundane world seems
obstinately unruly. Outside the supervision of a laboratory or the closed casement of a factory-made module, what happens in one instance is rarely a
guide to what will happen in others. Situations that lend themselves to generalisations are special, and it is these special kinds of situations that we aim
to create, both in our experiments and in our technology. My central thesis
in this chapter is that what makes these situations special is that they are
situations that permit a stable display of the nature of the process under study,
or the stable display of the interaction of several different natures.
The case is especially strong when we turn from fictional considerations
of ideal reasoning to considerations of actual methodology. Here questions
of true identity of circumstance drop away. We never treat complete descriptions; rather we deal with salient characteristics and relevant similarities.
This is a familiar point. You do not have to specify everything. If the right
combination of factors is fixed, you are in a position to generalise. Yet what
makes a specific combination a right one? What is the criterion that makes
one similarity relevant and another irrelevant? Experiments are designed with
intense care and precision. They take hard work and hard thought and enormous creative imagination. The Gravity-Probe experiment which I mentioned
is an exaggerated example. It will only be set running twenty-five years - twenty-five years of fairly continuous effort - after it was initiated, and it
will have involved teams from thirty or forty different locations, each solving
some separate problem of design and implementation.
What can account for our effort to make the experimental apparatus just
so and no other way? Take the Gravity Probe as a case in point. Each effort
is directed to solving a specific problem. One of the very first in the Gravity
Probe involved choosing the material for the gyroscopes. In the end they are
to be made of fused quartz, since fused quartz can be manufactured to be
homogeneous to more than one part in 10^6. The homogeneity is crucial. Any
differences in density will introduce additional precessions, which can be
neither precisely controlled nor reliably calculated, and these would obscure
the nature of the general-relativistic precession that the experiment aims to
learn about.
In this case we can imagine that the physicists designing the experiment
worked from the dictum, which can be formulated without explicit reference
to natures, 'If you want to see the relativistic precession, you had better make
the gyroscope as homogeneous as possible.' They wanted to do that because
they wanted to eliminate other sources of precession. But more than that is
necessary. The total design of the experiment must take account not only of
what else might cause precession but also of what kinds of features would
interfere with the relativistic precession, what kinds of factors could inhibit
it, and what is necessary to ensure that it will, in the end, exhibit itself in
some systematic way. When all these factors are properly treated, we should
have an experiment that shows what the nature of relativistic precession is.
That is the form, I maintain, that the ultimate conclusion will take.
But that is not the immediate point I want to make. What I want to urge
is that, by designing the experiment to ensure that the nature of relativistic
precession can manifest itself in some clear sign, by blocking any interference
and by opening a clear route for the relativistic coupling to operate unimpeded - to operate according to its nature - by doing just this, the Gravity-Probe team will create an experiment from which it is possible to infer a
general law. At the moment the form of this law is not my chief concern.
Rather, what is at stake is the question, 'What must be true of the experiment
if a general law of any form is to be inferred from it?' I claim that the
experiment must succeed at revealing the nature of the process (or some
stable consequence of the interaction of natures) and that the design of the
experiment requires a robust sense of what will impede and what will facilitate this. The facts about an experiment that make that experiment generalisable are not facts that exist in a purely Humean world.
It is, of course, not really true that my thesis about the correct form of
natural laws is irrelevant to my argument. Put in the most simple-minded
terms, what I point out is the apparent fact that we can generalise from a
single observation in an experimental context just because that context is one
in which all the relevant sources of variation have been taken into account.
Then, after all, what I claim is that it is laws in the form I commend - that
is, laws about natures - that determine what is and what is not relevant. This
sets the obvious strategy for the Humean reply: laws, in the sense of universal
or probabilistic generalisations, determine the relevant factors an experiment
Collins 1985.
Still, the knowledge cannot be too implicit. Trivially, where the experiment is to serve as a
test, we must know enough to be assured that the behaviour we see is a manifestation of the
nature of the phenomenon we want to study and not a manifestation of some other side aspect
of the arrangement.
I do not mean to suggest that there can be no other basis for this generalisation. Sheer repetition will serve as well, and that is an important aspect of claims like those of Hacking 1983,
that the stable phenomena that are created in the experimental setting have a life of their own
and continue to persist across great shifts in their abstract interpretation.
Our degree of confidence in the generalisation will be limited, of course, by how certain we
are that our assessment of the situation is correct.
An objection
among all factors that appear in all the universal generalisations supposedly
true in the world which factors are to be fixed, and how, in this particular
experiment?
Perhaps the answer comes one level up. Here I think is where we get the
idea that there might be a relatively small number of fixed, probably articulable, factors that are relevant. We may think in terms of forces, how few in
kind they are; or of long lists of causes and preventives. What is crucial is
that, at the higher level, context seems irrelevant. Either it is or it is not the
case that magnetic fields deflect charged particles or that, as quantum mechanics teaches, an inversion in a population of molecules can cause lasing.
Perhaps we can even find a sufficiently abstract law so that the problem
seems to evaporate. For example if we are thinking of an experiment where
the effect we look for involves particle motions, we turn to the law F = ma,
and that tells us that we must control all sources of forces. I, of course, as
we have seen in chapters 1 and 2, do not think this enough. We must control
all unwanted causes of motion, and not all of these can be represented
properly as forces. For the moment though, for the sake of argument, I am
not pressing this point but will rather accept the more usual fundamentalist
view in order to focus on the other problems for the Humean account. In the
gyroscope experiment the law of choice in this case would be:
Precession: d(n̂_s)/dt = (Γ × n̂_s)/(Iω_s)

This formula gives the drift rate (d(n̂_s)/dt) of a gyro spin vector as a function
of the total torque (Γ) exerted on the gyro along with its moment of inertia
(I) and its spin angular velocity (ω_s). From this we learn: control all sources
of torque except that due to the relativistic coupling, as well as any sources
of deviation in the angular velocity and in the moment of inertia.
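To convey the scale of control this dictates, a rough back-of-envelope estimate can be sketched in Python. This is not the Gravity Probe team's own calculation: the rotor radius and the fused-quartz density below are illustrative assumptions; only the roughly 170 Hz spin rate and the limiting accuracy of 0.5 × 10^-16 rad/sec come from the design requirements of figure 4.1.

```python
# Rough sketch of the precession law d(n_s)/dt = (Gamma x n_s)/(I * omega_s):
# the admissible extraneous torque Gamma scales with I * omega_s times the
# tolerable drift rate. Radius and density are illustrative assumptions.
import math

radius = 0.019           # m, assumed rotor radius (illustrative)
density = 2200.0         # kg/m^3, approximate density of fused quartz
spin_hz = 170.0          # Hz, spin rate from the design requirements
target_drift = 0.5e-16   # rad/s, limiting accuracy quoted for figure 4.1

mass = density * (4.0 / 3.0) * math.pi * radius**3   # kg, homogeneous sphere
inertia = 0.4 * mass * radius**2                     # kg m^2, I = (2/5) m r^2
omega_s = 2.0 * math.pi * spin_hz                    # rad/s

# Largest extraneous torque compatible with the target drift rate:
# |d(n_s)/dt| = Gamma / (I * omega_s)  =>  Gamma = target_drift * I * omega_s
max_torque = target_drift * inertia * omega_s        # N m

print(f"rotor mass            = {mass:.4f} kg")
print(f"moment of inertia     = {inertia:.3e} kg m^2")
print(f"max extraneous torque = {max_torque:.3e} N m")
```

Even on these rough numbers the admissible extraneous torque comes out on the order of 10^-19 N m, which is some indication of why each entry in the design table must be engineered so stringently.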
The difficulty with this advice is that it does not justify the replicability
we expect unless we join to it a commitment to natures, or something very
much like them. To see why, imagine a single successful run of the experiment, successful in the sense that, first, we have indeed managed to set the
total net torque, barring that due to relativistic coupling, equal to zero - or,
as the Gravity Probe hopes to do, at least to an order of magnitude lower
than that predicted for the relativistic effect; and second, it turns out that the
observed precession is just that predicted. We seem to have succeeded in
giving a purely Humean recipe for when to generalise, and this case fits.
Roughly, we can generalise the quantitative relation we see between a designated input (here the relativistic coupling) and the precession actually
observed in a given situation if that situation sets the remaining net torque
equal to zero (or, more realistically, calculates it away), where the rationale
for picking net torque = 0 as the relevant feature comes from the 'Humean
association' recorded in the functional law that describes the size of precessions.
The problem is that this does not get us the detailed generalisation we
expect at the lower level. The Gravity-Probe team has worked hard to set the
total net torque extremely low by a large number of specific hard-won
designs; and they are entitled to think that the results are replicable in that
experimental design. What the Humean prescription entitles them to is
weaker. It gives them the right to expect only that on any occasion when the
net nonrelativistic torque is zero, the precession will be the value predicted
from the general theory of relativity. But we expect the more concrete general
claim to hold as well.
Consider the table of design requirements for the gyroscope experiment
(figure 4.1). The table tells how controlled each foreseeable source of torque
must be in order for the total extraneous precession to be an order of magnitude smaller than that predicted from relativistic coupling. Each such source - rotor homogeneity, rotor sphericity, housing sphericity, optimum preload, and
so on - presents a special design problem; and for each, the experiment has
a special solution. Using fused quartz to get maximum rotor homogeneity is,
for example, the starting point for the solution of the first problem. What all
this careful planning, honing, and calculation entitles us to is a far more
concrete generalisation than the one above about (near) zero external torque.
We are entitled to infer from a successful run that in any experiment of this
very specific design, the observed precession should be that predicted by the
general theory of relativity.17
The table of requirements highlights the analytic nature of this kind of
experiment. What happens if something goes wrong with the rotor housing
as it was originally planned and the fault cannot be repaired? With a lot of
effort the Probe team will make a new design and slot it into the old general
scheme, making appropriate changes. Because we are working in a domain
where we trust analytic methods, a peculiar kind of sideways induction is
warranted: from the successful run with the original design plus our confidence in the new rotor housing and its placement, we are entitled to infer a
second, highly specific 'lower-level' generalisation to the effect that the precession in situations meeting the new design will be that predicted for relativistic coupling as well. Again, the new situation will indeed be one that falls
under the 'Humean' generalisation involving zero torques. What is missing
17. The inference is ceteris paribus, of course - 'so long as nothing goes wrong'. The 'zero
torque' generalisation apparently has the advantage that it needs no such ceteris paribus
clause. But that is a mixed blessing since the advantage is bought at the cost of making 'zero
torque' a concept that is not identifiable independently of its effect. As soon as we begin to
fill in what makes for zero torque, anything we say will inevitably have to contain a ceteris
paribus proviso, as I have argued in chapter 3.
[Figure 4.1 Design requirements for a relativity gyroscope with limiting accuracy of 0.5 × 10^-16 rad/sec (0.3 milliarc-sec/year). The table lists numerical requirements for mechanical parts (rotor homogeneity, housing sphericity, optimum preload, preload symmetry, centring accuracy, optimum spin speed ~170 Hz), suspension systems (suspension torques, torque switching ratio for spin-up system), and other factors (distance from drag-free proof mass). Source: Everitt 1980.]
is the connection. The new situation is one of very small extraneous torque;
but the expectation that it should be cannot be read from the regularities of
nature.
The regularity theorist is thus faced with a dilemma. In low-level, highly
concrete generalisations, the factors are too intertwined to teach us what will
and what will not be relevant in a new design. That job is properly done in
physics using far more abstract characterisations. The trouble is that once we
have climbed up into this abstract level of law, we have no device within a
pure regularity account to climb back down again.
6
deriving it from experiments concluding positively and directly.' Or, 'If the
Experiments, which I urge, be defective, it cannot be difficult to show the
defects; but if valid then by proving the theory they must render all objections
invalid.' One last remark to illustrate the steadfastness of Newton's views on
the role of the experimentum crucis in proving this claim appears in Newton's
letter of 1676,21 four years after his initial report to the Royal Society. This
letter concerned the difficulties Anthony Lucas had reported in trying to
duplicate Newton's experiments and also some of Lucas' own results that
contradicted Newton's claims. Newton replies, 'Yet it will conduce to his
more speedy and full satisfaction if he a little change the method he has
propounded, and instead of a multitude of things try only the Experimentum
Crucis. For it is not number of experiments, but weight to be regarded; and
where one will do, what need many?'
Goethe's point of view is entirely opposite to Newton's: 'As worthwhile
as each individual experiment may be, it receives its real value only when
united or combined with other experiments . . . I would venture to say that
we cannot prove anything by one experiment or even several experiments
together.'22 For Goethe, all phenomena are connected together, and it is essential to follow through from each experiment to another that 'lies next to it or
derives directly from it'. According to Goethe, 'To follow every single
experiment through its variations is the real task of the scientific researcher.'
This is illustrated in his own work in optics where he produces long series
of 'contiguous' experiments, each of which is suggested by the one before
it. The point is not to find some single set of circumstances that are special
but rather to lay out all the variations in the phenomena as the circumstances
change in a systematic way. Then one must come to see all the interrelated
experiments together and understand them as a whole, 'a single piece of
experimental evidence explored in its manifold variations'.
Goethe is sharp in his criticisms of Newton. Two different kinds of criticism are most relevant here. The first is that Newton's theory fails to account
for all the phenomena it should and that is no surprise since Newton failed
to look at the phenomena under a sufficient range of variation of circumstance. Second, Newton's inferences from the experiments he did make were
not valid; the experimentum crucis is a case in point. The chief fault which
Goethe finds with Newton's inferences is one that could not arise in Goethe's
method. Newton selects a single revealing experiment to theorise from; since
he does not see how the phenomena change through Goethe's long sequences
of experiments, he does not recognise how variation in circumstance affects
the outcome: '[Newton's] chief error consisted in too quickly and hastily
21. Newton's second letter (1676) to the Royal Society, quoted from Newton 1959-76.
22. Goethe 1812 [1988].
[Figure 4.2 Newton's experimentum crucis (window shut; boards 1 and 2; screen). Source: Sepper 1988; recreated by G. Zouros.]
setting aside and denying those questions that chiefly relate to whether
external conditions cooperate in the appearance of colour, without looking
more exactly into the proximate circumstances.'23
The crucial experiment involves refracting a beam of light through a prism,
which elongates the initial narrow beam and 'breaks' it into a coloured band,
violet at the top, red at the bottom. Then differently coloured portions of the
elongated beam are refracted through the second prism. Consider figure 4.2,
which is taken from Dennis L. Sepper's study, Goethe contra Newton. In all
cases the colour is preserved, but at one end of the elongated beam the second
refracted beam is elongated more than it is at the other. In each case there is
no difference in the way in which the light falls on the prism for the second
refraction. Newton immediately concludes, 'And so the true cause of the
length of the image was detected to be no other than that light consists of
rays differently refrangible.'24
We should think about this inference in the context of my earlier cursory
description of the modern version of the deductive method, called 'bootstrapping' by Clark Glymour,25 who has been its champion in recent debates. In
the bootstrapping account, we infer from an experimental outcome to a scientific law, as Newton does, but only against a backdrop of rather strong
assumptions. Some of these assumptions will be factual ones about the specific arrangements made - for example, that the angle of the prism was 63°;
some will be more general claims about how the experimental apparatus
works - the theory of condensation in a cloud chamber, for instance; some
will be more general claims still - for example, all motions are produced by
forces; and some will be metaphysical, such as the 'same cause, same effect'
principle mentioned above. The same is true of Newton's inference. It may
be a perfectly valid inference, but there are repressed premises. It is the
repressed premises that Goethe does not like. On Goethe's view of nature,
they are not only badly supported by the evidence; they are false. Colours,
like all else in Goethe's world,26 are a consequence of the action of opposites,
in this case light and darkness:
We see on the one side light, the bright; on the other darkness, the dark; we bring
what is turbid between the two [such as a prism or a semitransparent sheet of paper], and out of these opposites, with the help of this mediation, there develop, likewise in an opposition, colors.27

[Figure 4.3.]
[Figure 4.4.]
the prism functions in the same way in both cases: it just transports the
coloured light through, bending it in accord with its fixed degree of refrangibility.
Consider an analogous case. You observe a large, low building. Coloured
cars drive through. Cars of different colours have different fixed turning radii.
You observe for each colour that there is a fixed and colour-dependent angle
between the trajectory on which the car enters the building and the trajectory
on which it exits; moreover, this is just the angle to be expected if the cars
were driven through the building with steering wheels locked to the far left.
Besides cars, other vehicles enter the building, covered; and each time a
covered vehicle enters, a coloured car exits shortly afterward. It exits at just
that angle that would be appropriate had the original incoming vehicle been
a car of the same colour driven through with its steering wheel locked. Two
hypotheses are offered about what goes on inside the building. Both hypotheses treat the incoming coloured cars in the same way: on entering the
building their steering wheels get locked and then they are driven through.
The two hypotheses differ, however, about the covered vehicles. The first
hypothesis assumes that these, too, are coloured cars. Inside the building they
get unwrapped, and then they are treated just like all the other coloured cars.
The second hypothesis is more ambitious. It envisages that the low building
contains an entire car factory. The covered vehicles contain raw material, and
inside the building there are not only people who lock steering wheels, but a
whole crew of Fiat workers and machinery turning raw materials into cars.
Obviously, the first hypothesis is simpler, but it has more in its favour than
that. For so far, the second hypothesis has not explained why the manufactured cars exit at the angle they do, relative to their incoming raw materials;
and there seems to be no immediate natural account to give on the second
story. True, the cars are manufactured with fixed turning radii, but why
should they leave the factory at just the same angle relative to the cart that
carries in their raw materials as a drive-through does relative to its line of
entry? After all, the manufactured car has come to exist only somewhere
within the factory, and even if its steering wheel is locked, it seems a peculiar
coincidence should that result in just the right exit point to yield the required
angle vis-a-vis the raw materials. In this case, barring other information, the
first, Newtonian, hypothesis seems the superior. The caveat, 'barring other
information', is central, of course, to Goethe's attack. For, as I have already
remarked, Goethe was appalled at the small amount of information that
Newton collected, and he argued that Newton's claim was in no way adequate
to cover the totality of the phenomena. What looks to be the best hypothesis
in a single case can certainly look very different when a whole array of
different cases have to be considered.
The principal point to notice, for my purpose, is that the argument is not
window illuminates the pencil's shadow from the candle ('the shadow will
appear of the most beautiful blue'31). Even when described from the point of
view of Goethe's final account of colour formation, in the prism experiments
Goethe is not looking at light but rather at light (or darkness)-in-interaction-with-a-turbid-medium.
Newton focuses on his one special experiment and maintains that the
account of the phenomena in that experiment will pinpoint an explanation
that is generalisable. The feature that explains the phenomena in that situation
will explain phenomena in other situations; hence he looks to a feature that
is part of the inner constitution of light itself. To place it in the inner constitution is to cast it not as an observable property characteristic of light but rather
as a power that reveals itself, if at all, in appropriately structured circumstances. To describe it as part of light's constitution is to ascribe a kind of
permanence to the association: light retains this power across a wide variation
in circumstance - indeed, probably so long as it remains light. That is, I
maintain, to treat it as an Aristotelian-style nature. This is why Newton,
unlike Goethe, can downplay the experimental context. The context is there
to elicit the nature of light; it is not an essential ingredient in the ultimate
structure of the phenomenon.
7
through the experiment with the two prisms, the underlying nature expresses
itself in a clearly visible behaviour: the colours are there to be seen, and the
purely dispositional property, degree-of-refrangibility, is manifested in the
actual angle through which the light is bent. The experiment is brilliantly
constructed: the connection between the natures and the behaviour that is
supposed to reveal them is so tight that Newton takes it to be deductive.
Goethe derides Newton for surveying so little evidence, and his worries
are not merely questions of experimental design: perhaps Newton miscalculated, or mistakenly assumed that the second prism was identical in structure
with the first, or Newton takes as simple what is not . . . Goethe's disagreement with Newton is not a matter of mere epistemological uncertainty. It is
rather a reflection of deep ontological differences. For Goethe, all phenomena
are the consequence of interaction between polar opposites. There is nothing
in light to be isolated, no inner nature to be revealed. No experiment can
show with a single result what it is in the nature of light to do. The empiricists
of the scientific revolution wanted to oust Aristotle entirely from the new
learning. I have argued that they did no such thing. Goethe, by contrast, did
dispense with natures; there are none in his world picture. But there are, I
maintain, in ours.
ACKNOWLEDGEMENTS
This chapter is a slightly shortened version of Cartwright 1992. I owe special thanks
to Hasok Chang, as well as to the members of the Philosophy of Language and
Science Seminar and the Romanticism and Science Seminar at Stanford University,
fall and winter terms, 1989/90. This chapter is dedicated to Eckart Förster.

5 Causal diversity; causal stability
This book takes its title from a poem by Gerard Manley Hopkins. Hopkins
was a follower of Duns Scotus; so too am I. I stress the particular over the
universal and what is plotted and pieced over what lies in one gigantic plane.
This book explores the importance of what is different and separate about
the sciences as opposed to what is common among them. The immediate
topic of the present chapter is the Markov condition. This condition lies at
the heart of a set of very well grounded and powerful techniques for causal
inference developed during the last fifteen years by groups working at Carnegie Mellon University with Clark Glymour and Peter Spirtes1 and at UCLA
with Judea Pearl2 using directed acyclic graphs (DAGs). I think of it as a
small case study in particularism. Contrary to what I take to be the hopes of
many who advocate DAG-techniques, I argue that the methods are not universally applicable, and even where they work best, they are never sufficient
by themselves for causal inferences that can ground policy recommendations.
For sound policy, we need to know not only what causal relations hold but
what will happen to them when we undertake changes. And that requires that
we know something about the design of the nomological machine that generates these causal relations.
Section 2 of this chapter points out that the Markov condition is in general
not satisfied by causes that act probabilistically, though it may be appropriate
for many kinds of deterministic models. I use this to illustrate that there is a
great variety of different kinds of causes and that even causes of the same
kind can operate in different ways. Consider a causal law of the form 'X
causes Y': e.g., 'Sparking an inverted population causes coherent radiation.'
This could be true because sparking precipitates the coherent radiation, which
is already waiting in the wings; or because it removes a prior impediment;
or because it acts to create a coherent radiation that would never have been
possible otherwise. Does the sparking cause coherent radiation in a way that
also tends to decohere it? Does it do so deterministically? Or probabilistically? The term 'cause' is highly unspecific. It commits us to nothing about
the kind of causality involved nor about how the causes work. Recognising
this should make us more cautious about investing in the quest for universal
methods for causal inference. Prima facie, it is more reasonable to expect
different kinds of causes operating in different ways or embedded in differently structured environments to be tested by different kinds of statistical
tests.
Section 3 rehearses some problems generated for the Markov condition by
mixing, that is by combining together different populations in which either
different causal relations hold for the same set of factors or different probability measures. These problems are well known; the fact that there can be
different causal relations among the very same features in different circumstances as well as different probabilities is widely recognised. This suggests
that neither the causal relations studied by these techniques nor the probabilities are fundamental. Rather, as I have been arguing, both depend on the
stable operation of a nomological machine. The most straightforward way to
try to save the Markov condition in the face of the mixing problem is to treat
the design of the machine itself as just another cause, adding to the Book of
Nature a great number of new law claims in which it is supposed to figure,
both causal laws and probabilities, and always treating the new law claims
as just the same in type as the old ones. Reflecting on the kinds of peculiarities and difficulties that this strategy creates will help us to understand why
we need new concepts and new methods to understand where the laws of
nature come from.
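The mixing problem can be previewed numerically before the details arrive. The sketch below is my own illustration, not drawn from the text, and its probabilities are invented: in each of two populations the common cause C produces its effects X and Y independently, so screening off holds within each population taken alone; pooling the two populations induces a conditional dependence between X and Y that corresponds to no causal arrow.

```python
import random

random.seed(1)

def draw(p):
    """One individual with the cause C present: X and Y occur
    independently, each with probability p, so Prob(XY/C) factors
    within this population."""
    return (random.random() < p, random.random() < p)

# Same causal structure (C causes X, C causes Y), different measures.
pop_a = [draw(0.9) for _ in range(100_000)]
pop_b = [draw(0.1) for _ in range(100_000)]
mixed = pop_a + pop_b          # the pooled population

def probs(pop):
    n = len(pop)
    px = sum(x for x, _ in pop) / n
    py = sum(y for _, y in pop) / n
    pxy = sum(x and y for x, y in pop) / n
    return px, py, pxy

for name, pop in (("A", pop_a), ("B", pop_b), ("mixed", mixed)):
    px, py, pxy = probs(pop)
    print(f"{name:5}: Prob(XY/C) = {pxy:.2f}  Prob(X/C)Prob(Y/C) = {px*py:.2f}")
```

In the pooled population Prob(XY/C) comes to about 0.41 against a product of about 0.25: conditioning on C no longer screens off, though it does in each unmixed population. Whether factoring holds thus depends on the stability of the arrangement that fixes the probability measure, not on the causal arrows alone.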
2
2.1
Although the techniques I discuss here have been developed both by Pearl's
group and by the Glymour/Spirtes group, I will centre my discussion on the
latter, which I know better, and in particular on the 1993 book, Causation,
Prediction and Search by Peter Spirtes, Clark Glymour and Richard
Scheines. Spirtes, Glymour and Scheines assume that causality and probability are distinct but related features of the world. They suppose that the causal
laws relating a set of variables can be represented in a directed acyclic graph
like that in figure 5.1, and in addition that Nature provides a probability
measure over these variables. The first job is to say what relations can be
assumed to hold between the causal structure depicted in a directed graph
and the associated probability measure. Given these relations, Spirtes,
Figure 5.1 (directed acyclic graph; key to the variables):
X   fumigants
Y   yields
B   the population of birds and other predators
Z0  last year's eelworm population
Z1  eelworm population before treatment
Z2  eelworm population after treatment
Z3  eelworm population at the end of the season
Because of the central role that probabilities play, one might think that the
methods employed by Pearl's group and by Spirtes, Glymour and Scheines
are methods that can allow us to study probabilistic causes, causes that operate purely probabilistically, without determinism in the offing. This is not
entirely so. Why, after all, should we accept the Markov condition? That is
the question that starts off section 3.5 of Causation, Prediction and Search:
'When and why should it be thought that probability and causality together
satisfy these conditions, and when can we expect the conditions to be violated?'3 The next sentence but one gives one of their two principal answers.
'If we consider probability distributions for the vertices of causal graphs
of deterministic or pseudo-indeterministic systems in which the exogenous
variables are independently distributed, then the Markov condition must be
satisfied.'4 (The exogenous variables in a graph are the ones with no causal
inputs pictured in the graph.)
To understand what this means we must realise that what we can hope for
with one of these graphs is accuracy; we never expect completeness. So we
want a graph whose causal claims are a correct though incomplete representation of the causal structure that obtains in the world. Consider the relation of
a particular effect to its causal parents. On most graphs this relation will be
indeterministic: the values of the parent variables do not fix the value of the
effect variable. But the indeterminism may not be genuine. The structure will
be pseudo-indeterministic if it can be embedded in another more complete
graph that also makes only correct claims and in which the parents of the
given effect are sufficient to fix the value of the effect.
When we do have a graph in which the parents of a given effect fix its
value in this way, then the Markov condition is trivially true, since the probabilities are always either 0 or 1. But deterministic causes of this kind seem
to be few and far between in nature, and it is important to be clear that the
Markov condition will not in general hold for genuine probabilistic causality,
where a richer set of choices confronts Nature. When causality is probabilistic, a question arises for Nature that does not have to be addressed for purely
deterministic causes: what are the probabilistic relations among the operations of the cause to produce its various effects? In the case of determinism
the answer is trivial. Whenever the cause occurs, it operates to produce each
and every one of its effects. But when it can occur and nevertheless not
operate, some decision is required about how to marshal its various operations together.
Screening off offers one special solution - the one in which all operations
occur independently of each other. Sociology regularly provides us with cases
modelled like this. Consider the familiar example of Hubert Blalock's negative correlation between candy consumption and divorce, which disappears
if we condition on the common causal parent, age. Age operates totally independently in its action to decrease candy consumption from its operation to
increase the chance of divorce. This is a kind of 'split-brain' model of the
common cause. The single cause, age, sets about its operations as if it were
two separate causes that have nothing to do with each other. It is as if the
cause had two independent agents in one head. But the split-brain is by no
means the most typical case. We confront all different kinds of joint operations every day, both in ordinary life and, more formally, throughout the social
sciences.
Consider a simple example. Two factories compete to produce a certain
chemical that is consumed immediately in a nearby sewage plant. The city is
doing a study to decide which to use. Some days chemicals are bought from
Clean/Green; others from Cheap-but-Dirty. Cheap-but-Dirty employs a genuinely probabilistic process to produce the chemical. The probability of getting
the desired chemical on any day the factory operates is eighty per cent. So
in about one-fifth of the cases where the chemical is bought from Cheap-but-Dirty, the sewage does not get treated. But the method is so cheap the city
is prepared to put up with that. Still they do not want to buy from Cheap-but-Dirty because they object to the pollutants that are emitted as a by-product
whenever the chemical is produced.
That is what is really going on. But Cheap-but-Dirty will not admit to it.
They suggest that it must be the use of the chemical in the sewage plant
itself that produces the pollution. Their argument relies on the screening-off
condition. If the factory were a common parent, C, producing both the chemical X and the pollutant Y, then (assuming all other causes of X and Y have
already been taken into account) conditioning on which factory was
employed should make the chemical probabilistically independent from the
pollutant, they say. Their claim is based on the screening-off condition.
Screening off requires that conditioning on the joint cause C should make the
probabilities for its two effects factor. We should expect then:
Prob(XY/C) = Prob(X/C)Prob(Y/C)
Cheap-but-Dirty is indeed a cause of the chemical X, but they cannot then be
a cause of the pollutant Y as well, they maintain, since
Prob(XY/C) = 0.8 ≠ (0.8) × (0.8) = Prob(X/C)Prob(Y/C)
Indeed the conditional probabilities do not factor, but that does not establish
Cheap-but-Dirty's innocence. In this case it is very clear why the probabilities
do not factor. Cheap-but-Dirty's process is a probabilistic one. Knowing that
the cause occurred will not tell us whether the product resulted or not.
Information about the presence of the by-product will be relevant since this
information will tell us (in part) whether, on a given occasion, the cause
actually 'fired'.
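The failure to factor can be checked with a short simulation. This is my sketch, not the authors'; the only number taken from the example is the 0.8 firing probability, and whenever the process fires it yields the chemical and the by-product together.

```python
import random

random.seed(42)

# Each day the chemical is bought from Cheap-but-Dirty (C), the factory's
# genuinely probabilistic process "fires" with probability 0.8; a firing
# yields the chemical X and the pollutant Y together.
N = 200_000
days = []
for _ in range(N):
    fired = random.random() < 0.8
    days.append((fired, fired))       # X and Y stand or fall together

p_x = sum(x for x, _ in days) / N          # Prob(X/C)
p_y = sum(y for _, y in days) / N          # Prob(Y/C)
p_xy = sum(x and y for x, y in days) / N   # Prob(XY/C)

print(f"Prob(XY/C) = {p_xy:.2f}")                  # about 0.80
print(f"Prob(X/C) x Prob(Y/C) = {p_x * p_y:.2f}")  # about 0.64
```

Knowing that C occurred leaves open whether the process fired; seeing the pollutant settles that question, which is exactly why the conditional probabilities fail to factor.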
We may think about the matter more abstractly. Consider a simple case of
a cause, C, with two separate yes-no effects, X and Y. Looking at the effects,
we have in this case an event space with four different outcomes:
(X & Y), (X & not-Y), (not-X & Y), (not-X & not-Y)
realm.' Spirtes, Glymour and Scheines do not say this anywhere in the
book. Still, the fact that the Markov condition is true of pseudoindeterministic systems is one of their two chief reasons for thinking the
Markov condition holds for 'macroscopic natural and social systems for
which we wish causal explanations'.5 This argument supposes, I think
wrongly, that there are no macroscopic situations that can be modelled
well by quantum theory but not by classical theory. But that is a minor
point. Far more importantly, it confuses the world with our theories of it.
Classical mechanics and classical electromagnetic theory are deterministic.
So the models these theories provide, and derivatively any situations that
can be appropriately represented by them, should have causal structures
that satisfy screening off. The same cannot be expected of quantum
models and the situations they represent.
But what does this tell us about the macroscopic world in general, and in
particular about the kinds of medical, social, political or economic features
of it that DAG techniques are used to study? The macroscopic world is to all
appearances neither deterministic nor probabilistic. It is for the most part an
unruly mess. Our job is to find methods of modelling the phenomena of
immediate concern that will provide us with as much predictive power as
possible. Just as with social statistics, where we realise that if we want to
model the associations of quantities of interest we will have more success
with probabilistic models than with deterministic ones, so too with causality:
models of probabilistic causality have a far greater representational range
than those restricted to deterministic causality.
This is not an unrecognised lesson. Consider very simple linear models
with error terms that, following the early work of the Cowles Commission,
may be used in econometrics to represent causal relations:
x2 = a21x1 + u2
The convention for these representations is that the variables written on the
right-hand side in each equation are supposed to pick out a set of causal
parents for the feature designated by the variable on the left-hand side. The
WjS represent, in one heap, measurement error, omitted factors, and whatever
truly probabilistic element there may be in the determination of an effect by
its causes. This last is my concern here. Simultaneous linear equation models
look deterministic (fixing the values of the variables on the right-hand side
completely fixes the values of the variables on the left), but they can neverthe5
Ibid., p. 64.
111
less be used to represent genuine probabilistic causality via the 'error terms'
that typically occur in them. For other matters, life is made relatively simple
when we assume that the error terms are independent of each other6 (M, is
independent of u}, for i ^ j): the system is identifiable, simple estimators work
well, etc. But when the concern is how causes operate, these independence
conditions on the error terms are more or less identical to the independence
conditions laid down by the Markov condition for the special case of linear
models.
One major trend in econometrics has moved away from these kinds of
simple across-the-board assumptions. We are told instead that we have to
model the error terms in order to arrive at a representation of the processes
at work in generating the probabilistic relations observed.7 So these models
need not be committed to the screening-off constraint on the operation of
probabilistic causes. Violations of this constraint will show up as correlations
between error terms in different equations.
The use of error terms in simultaneous equations models is not a very
sensitive instrument for the representation of probabilistic causality, however.
Typically there is one error term for each dependent variable - i.e., for each
effect - whereas what ideally is needed is one term per causal factor per
effect, to represent the probabilities with which that cause 'fires' at each level
to contribute to the effect. As it is (even subtracting out the alternative roles
for the error terms), they still can represent only the overall net probabilistic
effect of all the causes at once on a given dependent variable. A finer-grained
representation of exactly how a given cause contributes to its different effects
is not provided. Thus the simultaneous equation models can admit violations
of the screening-off condition, but they are not a very sensitive device for
modelling them.
The DAG approach also has a method that allows for the representation
of purely probabilistic causality in a scheme that looks formally deterministic.
To construct this kind of representation, we pretend that Nature's graph is
indeed deterministic, but that some of the determining factors are not known
to us nor ever would be. The set of vertices is then expanded and divided
into two kinds: those which stand for unknown quantities, including ones that
it will never be possible to know about (represented by, say, empty circles
as in figure 5.1) and those that stand for quantities that we do know about
(represented by, say, solid circles). It is very easy then to construct examples
in which the independence condition is violated for the subgraph that includes
only the vertices standing for known quantities, but not for the full graph which we have introduced as a representational fiction.
6 This in fact is a really powerful motive for aiming for models with independent error terms. For attempts to find methods of inference that are robust even in the face of correlation, see for instance Toda and Phillips 1993.
7 For an example of a case that has proved incredibly hard to model with independent errors, see Ding et al. 1993.
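The representational device just described is easy to exhibit concretely. In this sketch of mine (reusing the 0.8 firing probability from the factory example), the empty circle is a hidden 'firing' vertex F: on the subgraph of known vertices, conditioning on C fails to make the effects factor, while on the full graph that includes F every probability is 0 or 1 and the independence condition holds trivially.

```python
import random

random.seed(5)

N = 100_000
samples = []
for _ in range(N):
    fired = random.random() < 0.8   # hidden vertex F (empty circle)
    x = y = fired                   # known vertices X and Y (solid circles)
    samples.append((fired, x, y))

# Subgraph of known vertices: condition on C alone (C holds in every sample).
p_x = sum(x for _, x, _ in samples) / N
p_y = sum(y for _, _, y in samples) / N
p_xy = sum(x and y for _, x, y in samples) / N
print(f"given C alone: {p_xy:.2f} vs {p_x * p_y:.2f}")   # factoring fails

# Full graph: condition on C and the hidden F as well.
on = [(x, y) for f, x, y in samples if f]
q_x = sum(x for x, _ in on) / len(on)
q_y = sum(y for _, y in on) / len(on)
q_xy = sum(x and y for x, y in on) / len(on)
print(f"given C and F: {q_xy:.2f} vs {q_x * q_y:.2f}")   # trivially factors
```

The fiction restores the Markov condition only by populating the graph with a vertex we could never measure, and that is precisely where the methodological cost enters.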
The trouble with this strategy is that it tends to be methodologically
costly. There would in principle be no harm to the strategy if it were
kept clearly in mind that it, like the use of error terms in simultaneous
equations models, is a purely representational device. But that may be far
removed from what happens in practice. In a number of social science
applications I have observed, quite the opposite is the case. Rather than
first taking an informed case-by-case decision about whether, for the case
at hand, we are likely to be encountering causes that act purely probabilistically, we instead set out on a costly and often futile hunt to fill in the
empty circles. The motive may be virtuous: the DAG theorems justify a
number of our standard statistical methods for drawing causal inferences
if we can get our graphs complete enough. Attempting to fill in the graphs
is a clear improvement over much social science practice that uses the
familiar inference techniques without regard for whether they are justified.
But the far better strategy is to fit our methods to our best assessment of
the individual situations we confront. This means that we may generally
have to settle for methods - like the old-fashioned hypothetico-deductive
method - that provide less certainty in their conclusions than those offered
by DAG proponents, but are more certainly applicable.
We began this section with the observation that the Markov condition can
be assumed wherever the causal structures in our graphs are not genuinely
indeterministic but only pseudo-indeterministic.
Before closing, we should
look at other reasons Spirtes, Glymour and Scheines themselves give in support of the Markov condition. These appear in the concluding two paragraphs
of their section 3.5:
The Causal Markov Condition is used all the time in laboratory, medical and engineering settings, where an unwanted or unexpected statistical dependency is prima facie
something to be accounted for. If we give up the Condition everywhere, then a statistical dependency between treatment assignment and the value of an outcome variable
will never require a causal explanation and the central idea of experimental design
will vanish. No weaker principle seems generally plausible; if, for example, we were
to say only that the causal parents of Y make Y independent of more remote causes,
then we would introduce a very odd discontinuity: So long as X has the least influence
on Y, X and Y are independent conditional on the parents of Y. But as soon as X has
no influence on Y whatsoever, X and Y may be statistically dependent conditional on
the parents of Y.
The basis for the Causal Markov Condition is, first, that it is necessarily true of
populations of structurally alike pseudo-indeterministic systems whose exogenous
variables are distributed independently, and second, it is supported by almost all of
our experience with systems that can be put through repetitive processes and whose
fundamental propensities can be tested. Any persuasive case against the Condition
would have to exhibit macroscopic systems for which it fails and give some powerful
reason why we should think the macroscopic natural and social systems for which we
wish causal explanations also fail to satisfy the condition. It seems to us that no such
case has been made.8
I shall look at each point in turn.
How do we eliminate factors as causes when they have 'an unwanted or
unexpected' correlation with the target effects of the kind I have described?
In particular, what kind (if any) of probabilistic test will help us tell a by-product from a cause? The kind of worry I raise is typical in medical cases,
where we are very wary of the danger of confusing a co-symptom with a
cause. For just one well-known example to fix the structure, recall R. A.
Fisher's hypothesis that, despite the correlation between the two, smoking
does not cause lung cancer. Instead, both are symptoms of a special gene.
One good strategy for investigating hypotheses like this where we are
unwilling to assume that the putative joint cause must satisfy the Markov
condition9 is to conduct a randomised treatment-control experiment with the
questionable factor as treatment. In an ideal experiment of this kind, we must
introduce the treatment by using a cause that has no way to produce the effect
other than via the treatment.10 That means that we must not use a common
cause of the effect and the treatment; so in particular we must not introduce
the treatment by using a cause which has both the treatment and the target
effect as by-products. If the probability of the effect in the treatment group
is higher than in the control group we have good reason to think the treatment
is a genuine cause and not a joint effect. This line of reasoning works whether
or not the possible joint cause11 (the gene in Fisher's example) screens off
the two possible effects (smoking and lung cancer) or not. So we do not need
to use the Markov condition in medical, engineering and laboratory settings
to deal with 'unwanted' correlations. But even if we had no alternatives, it
would not be a good idea to assume the principle just for the sake of having
something to do. If we do use it in a given case, the confidence of our results
must be clearly constrained by the strength of the evidence we have that the
causes in this case do satisfy the principle.
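The logic of this experiment can be sketched in a toy version of Fisher's hypothesis. All the numbers below are invented: a gene raises the probability of both smoking and lung cancer, and smoking itself has no effect on cancer at all. Observationally the two are correlated; when the treatment is introduced by a coin flip, a cause with no other route to the effect, the correlation disappears, whether or not the gene screens off its two effects.

```python
import random

random.seed(3)

N = 100_000

def study(randomised):
    """Fisher's world: gene G causes smoking and cancer; smoking is inert."""
    treated, control = [], []
    for _ in range(N):
        gene = random.random() < 0.3
        if randomised:
            smokes = random.random() < 0.5                 # coin-flip assignment
        else:
            smokes = random.random() < (0.8 if gene else 0.1)
        cancer = random.random() < (0.4 if gene else 0.05)  # depends on G only
        (treated if smokes else control).append(cancer)
    return sum(treated) / len(treated), sum(control) / len(control)

obs = study(randomised=False)
exp = study(randomised=True)
print(f"observational: smokers {obs[0]:.2f} vs non-smokers {obs[1]:.2f}")
print(f"randomised:    smokers {exp[0]:.2f} vs non-smokers {exp[1]:.2f}")
```

In the observational arm smokers show roughly four times the cancer rate of non-smokers; under randomisation the two rates coincide, exposing the correlation as the gene's work. Nothing in the reasoning required the Markov condition of the putative joint cause.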
Next, Spirtes, Glymour and Scheines claim that if we give up the Markov
principle everywhere, probabilistic dependencies will not require explanation.
I do not see why this claim follows, though it clearly is true that without the
Markov condition, we would sometimes have different kinds of explanations.
But at any rate, I do not think anyone claims that it fails everywhere.
As to the weaker condition they discuss, this is the one that I said in section
2.1 would be more natural to call a 'Markov' condition. I will argue in the
next section that the ideas motivating this kind of condition may not always
be appropriate when we are discussing laws, as we are here, rather than
singular causal claims. But that is beside the point right now. The point here
is that this weaker principle excludes causal action across temporal gaps.
What Spirtes, Glymour and Scheines complain about is that it does not talk
about anything else. But it does not seem odd to me that a principle should
confine itself to a single topic. I admit that if all we know about how a given
set of causes acts is that it satisfies this weaker principle, we will not be in a
very strong position to make causal inferences. But again, that is not a reason
to supplement what we may have good reason to assume about the set, by
further principles, like screening off, that we have no evidence for one way
or another.
Next, we see that Spirtes, Glymour and Scheines claim as their second
major defence of screening off that it is supported by 'almost all of our
experience with systems that can be put through repetitive processes and
whose fundamental propensities can be tested'.12 Here, it seems to me, they
must be failing to take account of all those areas of knowledge where probabilistic causality is the norm, and where products and by-products, sets of
multiple symptoms, and effects and side-effects are central topics of concern.
Drug studies are one very obvious - and mammoth - example. We have vast
numbers of clinical trials establishing with a high degree of certainty stable
probabilistic relations between a drug and a variety of effects, some beneficial
and some harmful; and independence among the effects conditional on the
drug being used is by no means the norm.
Last, do we have 'powerful reason' for thinking the macroscopic systems
we want to study fail the screening-off condition? I have given three kinds
of reasons:
(1) In a great variety of areas we have very well tested claims about causal
and probabilistic relations and those relations do not satisfy the screening-off
condition.13
(2) Determinism is sufficient for screening off. But our evidence is not
sufficient for universal determinism. To the contrary, for most cases of causality we know about, we do not know how to fit even a probabilistic model,
12 Ibid., p. 64.
13 Though of course they are consistent with the claim that there are in fact missing (unknown) causal laws that, when added, will restore screening off.
away, where we have represented this on a DAG with C as the causal parent
of X and Y. We know there must be some continuous causal process that
connects C with X and one that connects C with Y, and the state of those
processes at any later time t2 must contain all the information from C that is
relevant at that time about X and Y. Call these states Px and PY. We are then
justified in drawing a more refined graph in which Px and PY appear as the parents of X and Y, and on this graph the independence condition will be satisfied
for X and for Y (although not for Px and PY). Generalising this line of argument
we conclude that any time a set of factors on an accurate graph does not satisfy
the independence condition, it is possible to embed that graph into another
accurate graph that does satisfy independence for that set of factors.
What is wrong with this argument is that it confuses singular causal claims
about individual events that occur at specific times and places (e.g., 'My
taking two aspirins at 11:00 this morning produced headache relief by noon')
with general claims about causal relations between kinds of events (e.g.,
'Taking aspirin relieves headaches an hour later') - claims of the kind we
identify with scientific laws. Simultaneous equation models, DAGs, and path
diagrams are all scientific methods designed for the representation of generic-level claims; they are supposed to represent causal laws (which are relations
between kinds), not singular causal relations (which hold between individual
events). Arguments that convince us that there must be a continuous process
to connect every individual causal event with its distant effect event do not
automatically double as arguments to establish that we can always put more
vertices between causes and effects in a graph that represents causal laws.
Although it is necessary for the causal message to be transmitted somehow
or other from each individual occurrence of a cause-kind to the individual
occurrence of its effect-kind, the processes used to carry the message can be
highly varied and may have nothing essential in common. The transmission
of the causal message between the given instance of a cause-kind and the
associated instance of the effect-kind can piggy-back on almost anything that
follows the right spatio-temporal route and has an appropriate structure. For
some causal laws the cause itself may initiate the connecting process, and in
a regular law-like way. For these laws there will be intermediate vertices on
a more refined graph. But this is by no means necessary, nor, I think, even
typical.
In principle all that is needed is to have around enough processes following
the space-time routes that can carry the right sort of 'message' when the
causes operate. The connection of these processes with either the general
cause-kind or the general effect-kind need have none of the regularity that is
necessary for a law-claim. Glenn Shafer makes a similar point in his defence
of the use of probability trees over approaches that use stochastic processes.
As Shafer says, 'Experience teaches us that regularity can dissolve into
irregularity when we insist on making our questions too precise, and this
lesson applies in particular when the desired precision concerns the timing
of cause and effect.'15
This point is reinforced when we realise that the kind of physical and
institutional structures that guarantee the capacity of a cause to bring about
its effect may be totally different from those that guarantee that the causal
message is transmitted. Here is an example I will discuss at greater length in
chapter 7. My signing a cheque at one time and place drawn on the Royal
Bank of Scotland in your favour can cause you to be given cash by your
bank at some different place and time, and events like the first do regularly
produce events like the second. There is a law-like regular association and
that association is a consequence of a causal capacity generated by an elaborate banking and legal code. Of course the association could not obtain if it
were not regularly possible to get the cheque from one place to the other.
But it is; and the ways to do so are indefinitely varied and the existence of
each of them depends on quite different and possibly unconnected institutional and physical systems: post offices, bus companies, trains, public streets
to walk along, legal systems that guarantee the right to enter the neighbourhood where the bank is situated, and so on and so on.
The causal laws arising from the banking system piggy-back on the vast
number of totally different causal processes enabled by other institutions, and
there is nothing about those other processes that could appropriately appear
as the effect of cheque-signing or the parent of drawing-cash-on-the-cheque.
We might of course put in a vertex labelled 'something or other is happening
that will ensure that the signed cheque will arrive at the bank', but to do that
is to abandon one of the chief aims we started out with. What we wanted was
to establish causal relations by using information that is far more accessible,
information about readily identifiable event-types or easily measured quantities and their statistical associations. Here we have fallen back on a variable
that has no real empirical content and no measurement procedures associated
with it.
2.4
The arguments here clearly do not oppose the Markov condition tout court.
They show rather that it is not a universal condition that can be imposed
willy-nilly on all causal structures. The same is true for a second important
condition that DAG-techniques employ: the Faithfulness condition. This condition is roughly the converse of screening off. Screening off says that, once
the parents of a factor have been conditioned on, that factor will be independent
15 Shafer 1997, p. 6.
I have argued that neither the Markov nor the Faithfulness condition can
serve as weapons in an arsenal of universal procedures for determining
whether one factor causes another. I should like to close with some brief
consideration of why this is so. The mistake, it seems to me, is to think that
there is any such thing as the causal relationship for which we could provide
a set of search procedures. Put that way, this may seem like a Humean point.
But I mean the emphasis to be on the 'the' in the expression 'no such thing as
the causal relationship'. There are such things as causal relations, hundreds
of thousands of them going on around us all the time. We can, if we are
sufficiently lucky and sufficiently clever, find out about them, and statistics
can play a crucial role in our methods for doing so. But each causal relation
may be different from each other, and each test must be made to order.
Consider two causal hypotheses I have worked with. The first is the principle of the laser: sparking an inverted population stimulates emission of
highly coherent radiation. The second is an important principle we have
looked at in chapter 1, the principle that makes fibre bundles useful for communication: correct matching of frequency to materials in a fibre bundle narrows the Doppler broadening in packets travelling down the fibre. In both
cases we may put the claims more abstractly, with the causal commitment
made explicit: 'Sparking an inverted population causes coherent radiation'
and 'Correct frequency matching causes wave-packets to retain their shapes.'
Now we have two claims of the familiar form, 'X causes Y.' But no one
would take this to mean that the same relation holds in both cases or that the
same tests can be applied in both cases. Yet that is just the assumption that
most of our statistical methodology rests on, both methodologies that use
DAG-techniques and methodologies that do not.
I have argued against the assumption that we are likely to find universal
statistical methods that can decide for us whether one factor causes another.
But we must not follow on from that to the stronger view that opponents of
causal inference often take, that we cannot use statistics at all to test
specific causal hypotheses. We all admit the truism that real science is difficult; and we are not likely to find any universal template to carry around
from one specific hypothesis to another to make the job easy. In so far as DAG
methods purport to do that, it is no surprise that they might fail. Yet with
care and caution, specialists with detailed subject-specific knowledge can and
do devise reliable tests for specific causal hypotheses using not a universally
applicable statistical method but rather a variety of statistical methods - DAG
techniques prominent among them - different ones, to be used in different
ways in different circumstances, entirely variable from specific case to specific case. Science is difficult; but it has not so far proved to be impossible.
If, as I claim, there is no such thing as the causal relation, what are we to
make of claims of the form 'X causes Y' - e.g., 'Correct frequency matching
causes wave-packets to retain their shape'? Consider first, what do we mean
by 'cause' here? The Concise Oxford English Dictionary tells us that 'to
cause' is 'to effect', 'to bring about' or 'to produce'. Causes make their
effects happen. That is more than, and different from, mere association. But
it need not be one single different thing. One factor can contribute to the
production or prevention of another in a great variety of ways. There are
standing conditions, auxiliary conditions, precipitating conditions, agents,
interventions, contraventions, modifications, contributory factors, enhancements, inhibitions, factors that raise the number of effects, factors that only
raise the level, etc.
But it is not just this baffling array of causal roles that is responsible for
the difficulties with testing - it is even more importantly the fact that the
term 'cause' is abstract. It is abstract relative to our more humdrum action
verbs in just the sense introduced in chapter 2: whenever it is true that 'X
causes Y', there will always be some further more concrete description that
the causing consists in. This makes claims with the term 'cause' in them
unspecific: being abstract, the term does not specify the form that the action
in question takes. The cat causes the milk to disappear; it laps it up. Bombarding the population of atoms of a ruby-rod with light from intense flash
lamps causes an inversion of the population; it pumps the population to an
inverted state. Competition causes innovation. It raises the number of patents;
it lowers production costs.
This matters for two reasons. The first is testing. Reliable tests for whether
one factor causes another must be finely tuned to how the cause is supposed
to function. With fibre bundles, for example, we proceed dramatically differently for one specification, say, 'Correct matching gates the packets', from
the different causal hypothesis, 'Correct matching narrows the packets.' The
two different hypotheses require us to look for stability of correlation across
different ranges of frequencies and materials, so the kinds of statistical tests
appropriate for the first are very different from the kinds appropriate for the
second. The second reason is policy. We may be told that X causes Y. But
what is to be done with that information is quite different if, for example, X
is a precipitating cause from what we do if it is a standing cause, or an active
cause versus the absence of an impediment.
Here is a homely analogue to illustrate the problems we have both in
testing claims that are unspecific and in using them to set policy. I overheard
my younger daughter Sophie urging her older sister Emily to turn off the
television: 'Term has just begun and you know mama is always very irritable
when she has to start back to teaching.' Sophie's claim, 'mama is irritable',
was true but unspecific. It provided both a general explanation for a lot of
my peculiar behaviour of late and also general advice: 'Watch your step.'
But Emily wanted to know more. 'Don't you think I can get away with just
another half hour on the telly?' Sophie's hypothesis could not answer. Nor,
if called upon, could she test it. Behaviourist attempts to operationalise irritability failed not just because mental states are not reducible to patterns of
behaviour but equally because descriptions like 'irritable' are unspecific.
Claims of the form 'X causes Y' are difficult to test for the same reason.
Sophie's claim is also, like the general claim 'X causes Y', weak in the
advice-giving department. It gives us a general description of what to do,16
The general advice that follows from 'X causes Y' seems to be of this form: if you want Y,
increase the frequency (or level) of X without in any way thereby increasing the frequency
but it does not tell us if we can risk another half hour on the telly. Usually
in looking for causal information we are in Emily's position - we are concerned with a very concrete plan of action. So then we had better not be
looking to claims of the form 'X causes Y' but instead for something far more
specific.
3
3.1
Following the terminology that is now standard, I spoke in the last section
about causal structures. What is a causal structure; and where does it come
from? According to Spirtes, Glymour and Scheines, a causal structure is
an ordered pair <V,E> where V is a set of variables, and E is a set of ordered pairs
of V, where <X,Y> is in E if and only if X is a direct cause of Y relative to V.17
Alternatively, V can be a set of events. But we should not be misled into
thinking we are talking about specific events occurring at particular times
and places. The causal relations are supposed to be nomic relations between
event-types, not singular relations between tokens.
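The ordered-pair definition above is easy to render as a data structure. The following is a minimal sketch of my own (not code from Spirtes, Glymour and Scheines), with a hypothetical three-variable structure W → C → E used purely for illustration:

```python
# A sketch of a causal structure <V, E> in the sense of Spirtes, Glymour
# and Scheines: V a set of variable names, E a set of ordered pairs
# <X, Y> present exactly when X is a direct cause of Y relative to V.
# The particular structure W -> C -> E is an invented example.

def direct_causes(structure, y):
    """Return the causal parents of y in the structure <V, E>."""
    V, E = structure
    return {x for (x, effect) in E if effect == y}

V = {"W", "C", "E"}
E = {("W", "C"), ("C", "E")}

assert direct_causes((V, E), "E") == {"C"}
assert direct_causes((V, E), "C") == {"W"}
assert direct_causes((V, E), "W") == set()   # W has no parents in V
```

Note that the relata here are variable names standing for event-types, in keeping with the point that the structure encodes nomic relations, not token causings.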
Where does the complex of causal laws represented in a causal structure
come from and what assures its survival? These laws, I maintain, like all
laws, whether causal or associational, probabilistic or deterministic, are transitory and epiphenomenal. They arise from - and exist only relative to - a
nomological machine. In figure 5.2 we have a typical graph of the kind
Spirtes, Glymour and Scheines employ to represent sets of causal laws. What
we see represented in the graph are fixed connections between event-types.
One event-type either causes another or it does not, and it does so with a
definite and fixed strength of influence if Spirtes, Glymour and Scheines' best
theorems are to be applicable. We typically apply these methods to event-types like unemployment and inflation, income and parents' education level,
divorce rates and mobility, or a reduction in tax on 'green' petrol and the
incidence of lung cancer. What could possibly guarantee the required kind of
determinate relations between event-types of these kinds? You could think
of reducing everything ultimately to physics in the hope that in fundamental
physics unconditional laws of the usual kind expressed in generalisations will
(or level) of features that prevent Y. Even that advice does not seem correct, however, if X is
an enhancer and the basic process to be enhanced does not occur, or if X needs some imperfection to precipitate it and we have bought ourselves pure samples of X in the expectation that
we will thus maximise its efficacy.
Spirtes et al. 1993, p. 45.
Figure 5.3 The nomological machine giving rise to the structure of Figure 5.2.
Source: designed by Towfic Shomar; recreation, George Zouros.
It is interesting to note that this way of thinking about the two levels is
similar to the way matters were originally viewed when the kinds of probability methods we are now studying were first introduced into social
enquiry. The founders of econometrics, Trygve Haavelmo and Ragnar Frisch,
are good examples. Both explicitly believed in the socio-economic machine.
Frisch for instance proposed a massive reorganisation of the economy relying
on the economic knowledge that he was acquiring with his new statistical
techniques.18 Or consider Haavelmo's remarks about the relation between
pressure on the throttle and the acceleration of the car.19 This is a perfectly
useful piece of information if you want to drive the car you have, but it is
not what you need to know if you are expecting change. For that you need
to understand how the fundamental mechanisms operate.
Cf. Andvig 1988.
Haavelmo 1944.
My own hero Otto Neurath expressed the view clearly before the First
World War in criticising conventional economics. Standard economics, he
insisted, makes too much of the inductive method, taking the generalisations
that hold in a free market economy to be generalisations that hold simpliciter.
'Those who stay exclusively with the present will very soon only be able to
understand the past.'20 Just as the science of mechanics provides the builder
of machines with information about machines that have never been constructed, so too the social sciences can supply the social engineer with
information about economic orders that have never yet been realised. The
idea is that we must learn about the basic capacities of the components; then
we can arrange them to elicit the regularities we want to see. The causal laws
we live under are a consequence - conscious or not - of the socio-economic
machine that we have constructed.
In the remainder of this chapter I want to make a small advance in favour
of the two-tiered ontology I propose. I will do this by defending a distinction
between questions about what reliably causes what in a given context and
questions about the persistence of this reliability. The first set of questions
concerns a causal structure; the second is about the stability of the structure.
Recall the machine in figure 5.3 that supports the graph in figure 5.2. There
is a sense in which this is a 'badly shielded' machine. Normally we want to
construct our machine so that it takes inputs at very few points. But for this
machine every single component is driven not only by the other components
of the machine but also by influences from outside the system. We had to
build it like that to replicate the graphs used by Spirtes, Glymour and
Scheines. In another sense, though, the machine is very well shielded, for
nothing disturbs the arrangement of its parts.
My defence here of the distinction between the causal structure and the
nomological machine that gives rise to it will consist of two separate parts:
the first is a summary of a talk I gave with Kevin Hoover21 at the British
Society for the Philosophy of Science (BSPS) in September 1991, and the
second is a local exercise to show the many ways in which this distinction is
maintained in the Spirtes, Glymour and Scheines work. In particular I try to
show why it would be a mistake to let this distinction collapse by trying to
turn the nomological machine into just another kind of cause.
3.2
Stability22
substitute for the experimental method. I defended their belief. As I see it, the
early econometrics work provided a set of rules for causal inference (RCI):
RCI: Probabilities and old causal laws → New causal laws
Why should we think that these rules are good rules? The econometricians
themselves - especially Herbert Simon - give a variety of reasons; I gave a
somewhat untidy account in Nature's Capacities and their Measurement. At
the BSPS I reported on a much simpler and neater argument that is a development of Hoover's work (following Simon) and my earlier proof woven
together. This was the thesis:
Claim: RCI will lead us to infer a causal law from a population probability only if
that law would be established in an ideal controlled experiment in that population.
That is, when the conditions for inferring a causal law using RCI are satisfied,
we can read what results would obtain in an experiment directly from the
population probabilities. We do not need to perform the experiment - Nature
is running it for us. That is stage 1. The next stage is an objection that Hoover
and I both raise to our arguments. The objection can be very simply explained
without going into what the RCI really say. It is strictly analogous to a transparent objection to a very stripped-down version of the argument. Consider
a model with only two possible event types, Ct and Et+Δt. Does C at t cause
E at t + Δt? The conventional answer is that it does so if and only if Ct raises
the probability of Et+Δt, i.e. (suppressing the time indices),
RCI2-event model: C causes E iff Prob(E/C) > Prob(E/-C)
The argument in defence of this begins with the trivial expansion:
Prob(E) = Prob(E/C)Prob(C) + Prob(E/-C)Prob(-C)
Let us now adopt, as a reasonable test, the assumption that if C causes E then
(ceteris paribus) raising the number of Cs should raise the number of Es.
In the two-variable model there is nothing else to go wrong, so we can drop
the ceteris paribus clause. Looking at the expansion, we see that Prob(E) will
go up as Prob(C) does if and only if Prob(E/C) > Prob(E/-C). So our mini-rule of inference seems to be vindicated.
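The expansion can be checked numerically. The probabilities below are invented for illustration; the point is only that, with the two-variable expansion, raising Prob(C) raises Prob(E) exactly when Prob(E/C) > Prob(E/-C):

```python
# A numerical check of the trivial expansion
#   Prob(E) = Prob(E/C)Prob(C) + Prob(E/-C)Prob(-C)
# using made-up conditional probabilities in which C favours E.

def prob_E(p_C, p_E_given_C, p_E_given_notC):
    """Total probability of E as a function of Prob(C)."""
    return p_E_given_C * p_C + p_E_given_notC * (1 - p_C)

p_E_given_C, p_E_given_notC = 0.8, 0.3   # assumed: Prob(E/C) > Prob(E/-C)

low  = prob_E(0.2, p_E_given_C, p_E_given_notC)   # few Cs in the population
high = prob_E(0.6, p_E_given_C, p_E_given_notC)   # more Cs

assert abs(low - 0.4) < 1e-12
assert abs(high - 0.6) < 1e-12
assert high > low   # more Cs, more Es: the mini-rule seems vindicated
```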
But is it really? That depends on how we envisage proceeding. We could
go to a population and collect instances of Cs. If Prob(E/C) > Prob(E/-C),
this will be more effective in getting Es than picking individuals at random.
But this is not always what we have in mind. Perhaps we want to know
whether a change in the level of C will be reflected in a change in the level
of E. What happens then? Suppose we propose to effect a change in the
level of C by manipulating a 'control' variable W (which for simplicity's
sake is a 'switch' with values +W or -W). Now we have three variables and
it seems we must consider a probability over an enlarged event space:
Prob(E·C·W). Once W is added in this way, the joint probability of C and
E becomes a conditional probability, conditional on the value of W. It looks
as if the Prob(E) that we have been looking at then is really Prob(E/-W). But
what we want to learn about is Prob(E/+W), and the expansion
Prob(E/-W) = Prob(E/C·-W)Prob(C/-W)
+ Prob(E/-C·-W)Prob(-C/-W)
will not help, since from the probabilistic point of view, there is no reason
to assume that
Prob(E/C·+W) = Prob(E/C·-W)
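The objection can be put in two lines of arithmetic. All the numbers here are invented: the point is just that the population statistics were gathered with W off, so the conditional probability we estimated is really Prob(E/C·-W), and nothing in the probabilities themselves guarantees it survives the switch:

```python
# An illustration (numbers invented) of the objection. The estimate below
# comes from the unmanipulated population, where W is off; if flipping
# the switch W also alters how C produces E, the estimate misleads any
# policy that works by manipulating W.

p_E_given_C_minus_W = 0.8   # estimated from data gathered with W off
p_E_given_C_plus_W  = 0.2   # what in fact holds once W is on (assumed)

naive_forecast = p_E_given_C_minus_W   # treats Prob(E/C) as invariant
actual_rate    = p_E_given_C_plus_W

# The gap is large, yet no purely probabilistic principle rules it out.
assert naive_forecast != actual_rate
assert abs((naive_forecast - actual_rate) - 0.6) < 1e-9
```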
But probabilities are not all there is. In my view probabilities are a shadow of
the underlying machine. The question of which are fundamental, nomological
machines or probabilities, matters because reasoning about nomological
machines can give us a purchase on invariance that probabilities cannot provide. To see this let us look once again at the Markov condition.
Think again of our model (C,E, W) in terms of just its causal structure. The
hypothesis we are concerned with is whether C at t causes E at t + Δt in a
toy model with very limited possibilities: (i) W at some earlier time t0 is the
complete and sole cause of C at t; and (ii) C is the only candidate cause for
E at t. In this case the invariance we need is a simple consequence of the
Markov assumption: a full set of intermediate causes screens off earlier
causes from later effects. (This is the part of the condition that seems to me
appropriately called 'Markov'.) In our case this amounts to
MC: Prob(E/C·W) = Prob(E/C)
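Condition MC can be verified directly for a toy machine of this shape. The conditional probabilities below are my own invented numbers for a chain W → C → E whose mechanisms stay fixed; the joint distribution is generated by the machine, and conditioning on the intermediate cause C then screens off the earlier cause W:

```python
# A check that in a chain W -> C -> E with fixed mechanisms,
# Prob(E/C.W) = Prob(E/C): the intermediate cause screens off W.
from itertools import product

p_W = {1: 0.5, 0: 0.5}                                   # the 'switch'
p_C_given_W = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}} # W's effect on C
p_E_given_C = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.1, 0: 0.9}} # C's effect on E

# Joint distribution generated by the machine.
joint = {(w, c, e): p_W[w] * p_C_given_W[w][c] * p_E_given_C[c][e]
         for w, c, e in product((0, 1), repeat=3)}

def cond(e, c, w=None):
    """Prob(E=e / C=c), or Prob(E=e / C=c, W=w) when w is given."""
    num = sum(p for (W, C, E), p in joint.items()
              if C == c and E == e and (w is None or W == w))
    den = sum(p for (W, C, E), p in joint.items()
              if C == c and (w is None or W == w))
    return num / den

# MC holds: the earlier cause W is screened off by C.
assert abs(cond(1, 1, w=1) - cond(1, 1)) < 1e-12
assert abs(cond(1, 1, w=0) - cond(1, 1)) < 1e-12
assert abs(cond(1, 0, w=1) - cond(1, 0)) < 1e-12
```

The invariance is built in: the same table p_E_given_C operates whichever value W takes, which is precisely the assumption about the underlying machine discussed below.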
Spirtes, Glymour and Scheines have a theorem to the same effect, except that
their theorem is geared to their graph representation of complex sets of causal
laws. Kevin Hoover and I use the less general but more finely descriptive
linear-modelling representations. Spirtes, Glymour and Scheines call theirs
the 'manipulation theorem'. They say of it,
The importance of this theorem is that if the causal structure and the direct effects of
the manipulation . . . are known, the joint distribution [in the manipulated population]
can be estimated from the unmanipulated population.23
Let us look just at the opening phrases of the Spirtes, Glymour and Scheines
theorem: the antecedent restricts consideration to distributions that satisfy their
Markov condition. So the Markov condition plays a central role in the Spirtes,
Glymour and Scheines argument as well.
Spirtes et al. 1993, p. 79.
What exactly are we presupposing when we adopt condition MC? Two
different things I think. The first is a metaphysical assumption that in the
kinds of cases under consideration causes do not operate across time gaps;
there is always something in between that 'carries' the influence. Conditioning on these intermediate events in the causal process will render information about the earlier events irrelevant, so the probability of E given C should
be the same regardless of the earlier state of W.
I objected to this assumption about conditioning in section 2.3 on the
grounds that the intermediate factors that ensure temporal continuity for the
individual causes may not fall under any general kinds of the right type and
hence may not appear in the causal structure, since it represents causal laws
not singular causal facts. But more is involved. Clearly the assumption that
Prob(E/C) is the same given either +W or -W assumes that changes in W
are not associated with changes in the underlying nomological machine. This
is required not just with respect to the qualitative causal relation - whether
or not C causes E - but also with respect to the exact quantitative strengths
of the influence, which are measured by the conditional probabilities (or functions of them). This assumption is built into the Spirtes, Glymour and
Scheines theorem in two ways: (1) The antecedent of the theorem restricts
consideration to distributions Prob(V, W) that satisfy the Markov condition, a restriction which, as we will see in the next section, has no general
justification in cases where causal structures vary; (2) they assume in the
proof that GUnman and GMan are subgraphs of GComb. That is, they assume that
changes in the distribution of the externally manipulated 'switch variable' W
are not associated with any changes in the underlying causal structure. That
is the same assumption that Hoover and I pointed to as well.
Where do we stand if we build this assumption in, as Spirtes, Glymour
and Scheines do? It looks as if the original thesis is correct: if changes in the
variables that serve as control variables in Nature's experiment are not associated with changes in the remainder of the causal structure then RCI (or the
Spirtes, Glymour and Scheines programme) will allow us to infer causal laws
from laws of probabilistic association. But it is the nomological machine that
makes the structure what it is. The way we switch W on and off must not
interfere with that if we are to expect the laws it gives rise to to be the same
once we have turned W on. So the RCI of our miniature 2-variable model
are good rules of inference - but only if the underlying machine remains
intact as W and C change. That seems to be the situation with respect to the
back and forth about causal inference between Hoover and me at the BSPS
meeting. The reason for covering this again is to point out that central to the
argument is the distinction between (1) the nomological machine that gives
rise to the laws and (2) the laws themselves - both the causal laws and the
associated laws of probabilistic association.
3.3
There is an obvious strategy for trying to undo the distinction between (1)
and (2). That is to try to take the nomological machine 'up' from its foundational location and to put it into each of the emerging causal laws themselves,
on the model of an auxiliary. Following John Stuart Mill25 or John L.
Mackie26 (as Spirtes, Glymour and Scheines do too) suppose we take the law
connecting causal parents with their effects to have the form:
C1A1 v C2A2 v . . . v CnAn causes E,
where the Cs are the 'salient' factors and the As the necessary helping conditions that make up a complete cause. The idea is to include a description that
picks out the nomological machine that gives rise to the causal relation
between the Ci and E in each of the disjuncts. That is, to assume for each Ai
in the above formula that
Mill 1843.
Mackie 1965.
sent all possible nomological machines that could give rise to some causal
structure or other over the designated set of vertices. Does this even make
sense? And what kinds of values should these be (integers, real numbers,
complex numbers, vectors . . .?) and what will make for an ordering on them?
For an ordinary random variable of the kind that usually appears in the vertex
set of a causal structure, we expect that there should be a fairly uniform set
of fairly specific procedures for measuring the variable, regardless of its
value. That does not seem possible here. We also expect that the order relation on the values reflects some ordering of magnitudes represented by those
values. Distances too are generally significant: if the distance between one
pair of values is greater than between a second pair, we expect two items
represented by the first pair to differ from each other in some respect in
question more than two items represented by the second pair.
These are just the simplest requirements. In general when we construct a
new random variable, we first have in mind some features that the qualitative
relations that items to be assigned values have. For temperature, for example,
the items can be completely ordered according to '. . . is hotter than . . .'.
Then we construct the right kind of mathematical object to represent those
features. And when we do it formally, we finally try to prove a representation
theorem, to show that the two match in the intended way. None of this seems
to make sense for the random variable labelled NM. It seems we are trying
to press nomological machines into the same mould as the measurable quantities that appear as causes and effects, but they do not fit.
This leads immediately to a third objection: nomological machines are not
easy to represent in the same way that we represent causes because they
aren't causes. Look again at our paradigm, Mackie's and Mill's account. The
relationship between the Cs and As on the one hand and the Es on the other
is a causal relation. That is not true of the relation between the nomological
machine and the set of laws (causal and probabilistic) that it gives rise to.
Consider the machine of figure 5.3 and its concomitant causal graph. The
machine causes neither the causal laws in the graph nor the effects singled
out in the causal laws pictured there. The importance of this is not just metaphysical; there are real methodological implications. The two types of relation
are very different. So too will be the methods for investigating them. In
particular we must not think that questions about the stability of causal laws
can be established in the same way that the laws themselves are established,
for instance by including a variable that describes the causal machine in the
vertex set and looking at the probabilistic relations over the bigger vertex set.
One last objection concerns how the nomological machine variable would
actually have to be deployed. If it is to be guaranteed to do its job of ensuring
that the Markov condition holds for a causal structure, it will have to appear
as a cause of every effect in the structure. Besides being cumbersome, this
cannot be right. For the machine that is responsible for the causal laws is
not part of the cause of the laws' effects: it is the causes cited in those laws
that (quoting the Concise Oxford Dictionary again) 'bring about' or 'produce'
the effects. Again, the difference will come out methodologically. For many
causal laws in nice situations we will be able to devise statistical tests. But
the same kind of tests do not make sense when the nomological machine is
considered as a cause. Consider the most simple tests in a 2-variable model.
Are increased levels of the putative cause associated with increased levels of
the effect? For this we need a cause that has at least the first requirements of
a quantity - it comes in more and less.
What about the test that looks for an increased probability of the effect
when the putative cause is present compared with when it is absent? This
test cannot apply either, for the reasons I urge in chapter 7. There is no
probability measure to be assigned across different nomological machines. In
general there is no answer to the question, 'What is the probability that
Towfic Shomar builds this machine rather than that machine or some other?'
just as there need be no probability for one socio-economic set-up rather than
another to obtain in a developing country.27 We see thus that there are a
number of defects to the proposal to escalate the nomological machine into
the causal structure itself. What I want to do in the next section is to look - very locally - at how these defects are bad for the Spirtes, Glymour and
Scheines techniques themselves.
3.4
Mixing
Almost all of the Spirtes, Glymour and Scheines theorems presuppose that
Nature's graph - that is, a graph of the true causal relations among a causally
sufficient set that includes all the variables under study - satisfies their Markov
condition. But that assumption will not in general be true if the vertex set
includes variables, like the nomological machine variable, whose different
values affect the causal relations between other variables. The problem is one
that Spirtes, Glymour and Scheines themselves discuss, although their aim is
to establish causal relations not to establish stability, so they do not put it in
exactly this context. Here is the difficulty: consider two different values (say
NM = 1, NM = 2) for the nomological machine variable. There are cases in
which the true causal relations among a set of variables V (not including NM)
will satisfy the Markov condition both when NM = 1 and when NM = 2; but
when they are put together, the combined graph does not. This is an instance
Except of course as a question about the probability of drawing one set-up rather than another
from a given sample, which is not the kind of probability that we need to match a causal
structure.
Figure 5.5 A causal structure with NM as additional variable.
[Figure 5.6: numerical probability tables, for NM = 1 and NM = 2, over the values ±X, ±Y, ±Z.]
The numerical example of figure 5.6 shows that both Prob1 and Prob2 can
satisfy the Markov condition relative to their graphs, but Prob does not do
so relative to its graph.
Consider Prob(+Y·+Z/+X) from figure 5.6; the others follow the same pattern.
Prob1(+Y·+Z/+X) = 1 = 1 · 1 = Prob1(+Y/+X)Prob1(+Z/+X)
Prob2(+Y·+Z/+X) = 50/200 = 1/4 = (100/200)(100/200) = Prob2(+Y/+X)Prob2(+Z/+X)
But,
Prob(+Y·+Z/+X) ≠ Prob(+Y/+X)Prob(+Z/+X)
Simpson's paradox is the name given to the following fact (or its generalisation to cases with more variables and more compartments in the partition):
There are probability distributions such that a conditional dependence relation
(positive, negative or zero) that holds between two variables (here Y, Z) may be
changed to any other in both compartments of a partition along a third variable (here
NM).
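The mixing phenomenon is easy to reproduce numerically. The construction below is my own toy version, not the figure 5.6 example: in machine NM = 1, X deterministically causes Y and Z; in NM = 2, Y and Z are fair coins unrelated to X. Each sub-population makes Y and Z independent given X, but the 50/50 mixture does not:

```python
# Two nomological machines over X, Y, Z (my own invented toy case).
from itertools import product

def joint(nm):
    """Prob(X, Y, Z) inside one machine."""
    dist = {}
    for x, y, z in product((0, 1), repeat=3):
        if nm == 1:   # X (a fair coin) causes Y and Z deterministically
            dist[(x, y, z)] = 0.5 if (y == x and z == x) else 0.0
        else:         # Y, Z independent fair coins; X irrelevant
            dist[(x, y, z)] = 0.5 * 0.5 * 0.5
    return dist

def p(dist, pred):
    return sum(pr for xyz, pr in dist.items() if pred(*xyz))

def screening(dist):
    """Return Prob(+Y.+Z/+X) and Prob(+Y/+X)Prob(+Z/+X)."""
    px  = p(dist, lambda x, y, z: x == 1)
    pyz = p(dist, lambda x, y, z: x == 1 and y == 1 and z == 1) / px
    py  = p(dist, lambda x, y, z: x == 1 and y == 1) / px
    pz  = p(dist, lambda x, y, z: x == 1 and z == 1) / px
    return pyz, py * pz

# Within each machine, Y and Z are independent given X ...
for nm in (1, 2):
    pyz, prod = screening(joint(nm))
    assert abs(pyz - prod) < 1e-12

# ... but not in the 50/50 mixture of the two populations.
mix = {k: 0.5 * joint(1)[k] + 0.5 * joint(2)[k] for k in joint(1)}
pyz, prod = screening(mix)
assert abs(pyz - 0.625) < 1e-12 and abs(prod - 0.5625) < 1e-12
```

Pooling the two populations thus manufactures a conditional dependence that neither machine contains, which is exactly why the Markov condition fails for the combined graph.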
dependent on each of the other two (Y, Z). That is the case for each of the
distributions Prob+X, Prob_x:
Prob+X(Y/NM = 1) = 1 ≠ 0.5 = Prob+X(Y/NM = 2),
Prob-X(Y/NM = 1) = 0 ≠ 0.5 = Prob-X(Y/NM = 2),
and similarly for Prob±X(Z/NM). The dependence between Y (or Z) and NM
arises in Prob+X and Prob-X because, given either +X or -X, information about
whether we are in NM = 1 or NM = 2 (i.e., whether we are considering a
population in which X causes Y (or Z) or not) will certainly affect the probability of finding Y (or Z).
The case I have considered here is an all-or-nothing affair: the causal law
is either there and deterministic, or it is altogether missing. Figures 5.7 and
5.8 show another case where the causal graph gets entirely reversed, and not
even because of changes in the structure of the machine but only because of
changes in the relative values of the independent variables. But we do not
need such dramatic changes. Any change in the numerical strength of the
influence of X on Y and Z can equally generate a Simpson's paradox, and
hence a failure of the Markov condition for the combined graph.29
29
As an aside, let me make two remarks about the Spirtes, Glymour and Scheines scheme. First,
it is important to note that mixing is a general problem for them. According to Spirtes,
Glymour and Scheines (p. 44), a causal arrow gets filled in on a graph between two random
Returning to the main argument, I wanted this foray into Simpson's paradox to provide a concrete example of the harmful methodological consequences of a mistaken elevation of the description of the supporting nomological machine into the set of causes and effects under study. The work on
causal structures by Spirtes, Glymour and Scheines is very beautiful: they
provide powerful methods for causal inference, and their methods are well
suited to a number of real situations in which the methods work very well.
But if we model the situations in the wrong way - with the causal structure
in the vertex set - the theorems will in general no longer apply. In such
circumstances no reason can be provided why these methods should be
adopted. The right way to think about them, I urge, is with nomological
machines. The causal structure arises from a nomological machine and holds
only conditional on the proper running of the machine; and the methods for
variables if any value of the first causes any value of the second. So anytime that causal
influences are of different strengths for different values, we have one graph representing two
different situations. There are even cases where for one pair of values, one variable causes
another, and for a different pair, the second causes the first. In this case, under their prescription the graph will not even be acyclic.
Second, they say they are interested in qualitative relations - whether or not X causes Y - rather than in the quantitative strength of the causal relation, and that they do not like results
that depend on particular values for those strengths. Nevertheless the plausibility of their
Markov condition presupposes that they are studying sets of causal relations that are stable
not only qualitatively but also across changes in the quantitative strengths of influence as
well.
Figure 5.8 The nomological machine giving rise to the structures of figure 5.7.
Source: Towfic Shomar.
studying nomological machines are different from those we use to study the
structures they give rise to. Unfortunately these methods do not yet have the
kind of careful articulation and defence that Spirtes, Glymour and Scheines
and the Pearl group have developed for treating causal structures.
ACKNOWLEDGEMENTS
This chapter is dedicated to Elizabeth Anscombe, from whom I learned, 'The word
"cause" itself is highly general . . . I mean: the word "cause" can be added to a
language in which are already represented many causal concepts' (Sosa and Tooley
(eds.) 1993, p. 93; reprinted from Anscombe 1971), and also that for many causes,
'shields can be put up against them' (ibid., p. 100).
Sections 1 to 2.5 of this chapter are drawn from material to be published in Cartwright forthcoming (b). The bulk of section 3 is taken from Cartwright 1997a, though
new arguments have been added, as well as the discussion of the Faithfulness condition.
Thanks to Jordi Cat for very helpful discussions as well as to the participants of
the British Society for the Philosophy of Science Conferences in 1991 and 1993 and
the participants of the Notre Dame Causality in Crisis Conference. Thanks also to
Towfic Shomar for comments, graphs and help in the production. Research for this
chapter was supported by the LSE Research Initiative Fund and the LSE Modelling
and Measurement in Physics and Economics Project.
Economics, we are told, studies laws that hold only ceteris paribus.1 Does
this point to a deficiency in the level of accomplishment of economics; does
it mean that the claims of economics cannot be real laws? The conventional
regularity account of laws answers 'yes'. On this account a theoretical law
is a statement of some kind of regular association,2 either probabilistic or
deterministic, that is usually supposed to hold 'by necessity'. The idea of
necessity is notoriously problematic. Within the kind of empiricist philosophy
that motivates the regularity account it is difficult to explain what constitutes
the difference between law-like regularities and those that hold only by accident, 'nonsense' correlations that cannot be relied on. I shall not be primarily
concerned with necessity here; I want to focus on the associations themselves.
As I have rehearsed in chapter 3, empiricism puts severe restrictions on
the kinds of properties that appear in Nature's laws, or at least on the kinds
of properties that can be referred to in the law-statements we write down in
our theories. These must be observable or measurable or occurrent. Economists are primarily concerned with what is measurable, so that is what I shall
concentrate on here. Empiricism also restricts the kinds of facts we can learn. The only
claims about these quantities that are admissible into the domain of science
are facts about patterns of their co-occurrence. Hence the specification of
either an equation (in the case of determinism) or of a probability distribution
over a set of measurable quantities becomes the paradigm for a law of nature.
These two assumptions work together to ban ceteris paribus laws from the
nature that we study. Together they tell us that all the quantities we study are
qualitatively alike. There are no differences between them not fixed either by
the patterns of their associations or by what is observable or measurable
about them. As a consequence it becomes impossible to find any appropriate
1 Cf. Hausman 1992, ch. 8; Blaug 1992, pp. 59-62; Hutchison 1938, pp. 40-46.
2 Hempel and Oppenheim 1960.
category that would single out conditioning factors for laws from the factors
that fall within them.
What then could it mean to include as a law in one's theory that (to use
an example I will discuss later) 'In conditions C, all profitable projects will
be carried out'? All that is law-like on the Humean picture are associations
between measurable quantities. That's it. The only way a condition could
restrict the range of an association in a principled or nomological way would
be via a more complex law involving a general association among all the
quantities in question. The effect of this is to move the conditioning factor C
inside the scope of the law: 'All projects that are both profitable and satisfy
conditions C will be carried out.' That is what the laws of nature that we are
aiming for are like. If theories in economics regularly leave out some factors
like C (whatever C stands for) from the antecedents of their laws, and hence
write down law claims that turn out to be, read literally, false, then economics
is regularly getting it wrong. These theories need to keep working on their
law-statements till they get ones that express true regularities.
As we have seen, I defend a very different understanding of the concept
of natural law in modern science from the 'laws = universal regularities'
account I have been describing. We aim in science to discover the natures of
things; we try to find out what capacities they have and in what circumstances
and in what ways these capacities can be harnessed to produce predictable
behaviours. The same is true in theoretical economics.3 Regularities are
secondary. Fixed patterns of association among measurable quantities are a
consequence of the repeated operation of factors that have stable capacities
arranged in the 'right' way in the 'right kind' of stable environment: regularities are a consequence of the repeated successful running of a socio-economic
machine. The alternative account of laws as regularities goes naturally with
a covering law theory of prediction and explanation. One set of regularities - the more concrete or phenomenological - is explained by deducing them from
another set of regularities - the more general and fundamental. The distinction is like that in econometrics between structural equations (like the demand
and supply equations of chapter 3) and reduced-form equations (like the
simple equations that tell us what the single allowed value of the price is).
As I urged in chapter 4, the alternative theory of explanation in terms of
natures rejects the covering law account. You cannot have regularities 'all
the way down'.
Ceteris paribus conditions play a special role when explanation depends
3 I use 'theoretical' in this case to contrast the kind of model I look at in this chapter with models that we arrive at primarily by induction, for instance, large macro-economic econometric models.
on natures and not on covering laws. On the natures picture there is a general
and principled distinction between the descriptions that belong inside a law
statement and those that should remain outside as a condition for the regularity described in the law to obtain. The regularities to be explained hold only
ceteris paribus; they hold relative to the implementation and operation of a
machine of an appropriate kind to give rise to them. The hypothesis I defend
in this chapter is that the covering-law account is inappropriate for much of
the work in modern economic theory.
The detailed example I will present is one in which we cross levels of
scale, or aggregation, in moving from explanandum to explanans. The model
I will describe is one in which a (highly idealised) macroeconomic regularity
is shown to arise from the rational behaviour of individual agents. This crossing of levels is not important to the point. We could attempt to explain macroeconomic regularities by reference to capacities and relations that can only
be attributed to institutions or to the economy as a whole, with no promise
of reduction to features of individuals.4 On the other hand we could undertake - as psychologists often do - to explain behavioural or psychological
regularities by reference to behavioural or psychological characteristics of the
individuals involved. The point is that in all these cases it is rarely laws, in
the sense of regular associations among observable or measurable quantities,
that are fundamental. Few, if any, economic regularities, no matter how basic
or important or central or universal we take them to be, can stand on their
own. If there are laws in the regularity sense to record in economic theories,
that is because they have been created.
Economists usually talk in terms of models rather than theories. In part
they do so to suggest a more tentative attitude to the explanations in question
than would be warranted in a fully-fledged well-confirmed theory; and in part
they do so to mark a difference in the degree of articulation: theory is a
large-scale (and not necessarily formalised) outline whereas a model gives a
more specified (and formalised) depiction. But I think the terminology also
marks a difference of importance to the hypothesis of this chapter. 'Theory'
suggests a set of covering laws like Maxwell's equations, laws of the kind
physics explanations are supposed to begin from. Models in economics do
not usually begin from a set of fundamental regularities from which some
further regularity to be explained can be deduced as a special case. Rather
they are more appropriately represented as a design for a socio-economic
machine which, if implemented, should give rise to the behaviour to be
explained. I illustrate this idea in Section 3 with a game theory example.
4 For an account of how to do non-reductionistic social science, see Ruben 1985 or Phillips 1987.
The most popular regularity account has it that laws of nature are necessary
regular associations between GOOD (= sensory or measurable or occurrent)
properties. What is wrong with this account? The answer is - everything.
There are four ingredients in the characterisation, and each fails. My major
concern in this chapter is with regularities in economics, so I shall discuss
them at length. Before that, though, for the sake of completeness, let me
review what goes wrong with each of the others.
Necessity. The usual objection is to the introduction of modalities. I argue
by contrast that modal notions cannot be eliminated. The problem with the
usual account is that it uses the wrong modality. We need instead a notion
akin to that of objective possibility. I have discussed this briefly in section 6,
chapter 3, and will say no more about it here because this issue, though
important, is tangential to my central theses about the dappled world.
Association. Causal laws are central to our stock of scientific knowledge,
and contrary to the hopes of present-day Humeans, causal laws can not be
reduced to claims about association. In fact if we want to know what causal
laws say, the most trouble-free reconstruction takes them to be claims about
what kinds of singular causings can be relied on to happen in a given situation. For instance, 'Aspirins relieve headaches in population p' becomes 'In
p, it can be relied on that aspirins do cause headache relief some of the time.'
I do not of course wish to argue that facts about associations do not matter
in science; nor to deny that we can construct machines like those in chapter
5 to produce causal laws. Rather I wish to point out that laws must not be
understood in terms of association since we have both lawful associations
and causal laws, and neither is more fundamental than the other. Again since
this issue is not central to the concerns of this book, I shall not go into any
more detail here.5
Sensible/measurable/occurrent properties. Economists like to stick with
properties that are easy to observe and to which we can readily assign numbers. I have no quarrel with measurable properties - when they are indeed
properties. But we should remember that we can, after all, produce a lot of
nonsense measurements. Consider, for example, David Lewis' nice case in
which we give the graduate record examination in philosophy to the students
in his class, but we mark the results with the scoring grid for the French
examination. Our procedure satisfies all the requirements of the operationalist - it is public, well articulated and gives rise to results that command
full intersubjective agreement. But the numbers do not represent any real
property of the students. Real properties bring specific capacities with them,
like the capacity Lewis' students should have to tell a modus ponens argument from a modus tollens or the capacity to look 'red' to 'normal' observers
or to enrage a bull. As I have argued in chapter 3, once we have given up
Hume's associationist account of concept formation, I do not think we can
find anything wrong with the very idea of a capacity, nor, as I argued in
Cartwright 1989, is there anything epistemically second-rate about capacities.
Nevertheless, in certain specific scientific contexts we may wish to restrict
ourselves to making claims about readily measurable quantities or to using
only 'occurrent property' descriptions for a variety of reasons. We may, for
example, wish to ensure a uniform standard across different studies or different contexts. But we will not thereby succeed in describing a nature stripped
of capacities, powers and causes. Nor will we succeed in producing true law
claims purely in a vocabulary like this, since the concealed ceteris paribus
conditions will always have modal force. This point is defended at length in
a number of chapters in this book, so I need not go into it further here.
Instead I turn to regularity which I have so far not treated with sufficient
attention.
Regularity. The most immediate problem with regularities is that, as John
Stuart Mill observed,6 they are few and far between. That is why economics
cannot be an inductive science. What happens in the economy is a consequence of a mix of factors with different tendencies operating in a particular
environment. The mix is continually changing; so too is the background
environment. Little is in place long enough for a regular pattern of associations to emerge that we could use as a basis for induction. Even if the
situation were stable enough for a regularity to emerge, finding it out and
recording it would be of limited use. This is the point of Trygve Haavelmo's
example of the relation between the height of the throttle and the speed of
the automobile.7 This is a useful relation to know if we want to make the car
go faster. But if we want to build a better car, we should aim instead at
understanding the capacities of the engine's components and how they will
behave in various different arrangements.
The second problem with regularities is that, as in physics, most of the
ones there are do not reflect the kind of fundamental knowledge we want,
and indeed sometimes have. We want, as Mill and Haavelmo point out, to
understand the functioning of certain basic rearrangeable components.8 Most
of what happens in the economy is a consequence of the interaction of large
numbers of factors. Even if the arrangement should be appropriate and last
6 Mill 1836; see also Mill 1872.
7 Haavelmo 1944.
8 As I use the term here, features are 'fundamental' to the extent that their capacities stay relatively fixed over a wide (or wide enough) range of circumstances.
long enough for a regular association to arise (that is, they make up a socio-economic machine) that association would not immediately teach us what we
want to know about how the parts function separately. Hence the laments
about the near impossibility in economics of doing controlled experiments
designed especially to do this job. What we need to know is about the capacities of the distinct parts. We may ask, though, 'Are not the laws for describing
the capacities themselves just further regularity statements?' But we know
from chapters 3 and 4 that the answer to this is 'no'.
Let us turn next to the idea of a mechanism operating on its own, say the
supply mechanism or the demand mechanism. We may conceive of the
demand mechanism in terms of individual preferences, goals and constraints
or alternatively we may conceive it as irreducibly institutional or structural.
In either case on the regularity account of laws the law of demand records
the regular behaviour that results when the demand mechanism is set running
alone. This paradigmatic case (which Haavelmo himself uses in talking about
laws for fundamental mechanisms) shows up the absurdity of trying to
describe the capacities of mechanisms in terms of regularities. No behaviour
results from either the supply or the demand mechanism operating on its
own, and that is nothing special about this case. In general it will not make
sense to talk about a mechanism operating on its own. That is because in this
respect economic mechanisms really are like machine parts - they need to
be assembled and set running before any behaviour at all results. This is true
in even the most stripped-down cases. Recall the lever. A rigid rod must not
only be affixed to a fulcrum before it becomes the simple machine called a
lever; it must also be set into a stable environment in which it is not jiggled
about.
The same is true in economics. Consider an analogous case, a kind of
economic lever that multiplies money just as the lever multiplies force. In
this example banking behaviour will play the role of the rigid rod with 'high-powered money' as the force on one end. The reserve ratio will correspond
to where the fulcrum is located. Here is how the money-multiplying mechanism works. The 'central bank' has a monopoly on making money. Following
convention, let us call this 'high-powered money'. The commercial banking
system blows up, or multiplies, the high-powered money into the total money
supply. The banks do this by lending a fraction of the money deposited with
them. The larger the proportion they can lend (i.e., the smaller the reserve
ratio) the more they can expand money.
Two factors are at work: the proportion of high-powered money held as
currency, which is one minus the proportion deposited; and the proportion
lent, which is one minus the reserve ratio. Suppose, for example, that high-powered money = 100. All of it could be deposited in a bank. That bank
could lend 80. All 80 could be deposited in another bank. That bank could
lend 64, and so on. The total of all deposits, 100 + 80 + 64 + . . . = 500.
Assume that banks lend all they can, which as profit makers they are disposed
to do. Then we can derive
M = H(1 + cu)/(re + cu)    (1)
where H = high-powered money; re = the reserve ratio; cu = the ratio of currency to deposits; M = the money stock. Equation (1) is like the law of the
lever. Regularity theorists would like to read it as a statement of regular
association simpliciter. But that is too quick. We do not have a description
of some law-like association that regularly occurs. Rather we have a socio-economic machine that would give rise to a regular association if it were set
running repeatedly and nothing else relevant to the size of the money stock
happens.9
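The arithmetic of the multiplier can be checked with a short sketch. This is my own illustration, not part of the text: the function names and the round-by-round simulation are assumptions; only equation (1) and the 100 + 80 + 64 + . . . series come from the example above.

```python
# Illustrative sketch (not from the text) of the money-multiplier machine.
# H = high-powered money, re = reserve ratio, cu = currency-to-deposit ratio.

def money_stock_formula(H, re, cu):
    """Closed-form money stock, as in equation (1): M = H(1 + cu)/(re + cu)."""
    return H * (1 + cu) / (re + cu)

def money_stock_simulated(H, re, cu, rounds=10_000):
    """Set the machine running repeatedly: each round the incoming money is
    split between currency held and deposits, and banks lend the fraction
    (1 - re) of deposits onward."""
    currency = deposits = 0.0
    flow = H  # money entering the system this round
    for _ in range(rounds):
        held = flow * cu / (1 + cu)  # share the public keeps as currency
        dep = flow - held            # share deposited with the banks
        currency += held
        deposits += dep
        flow = dep * (1 - re)        # banks lend all they can
    return currency + deposits       # total money stock M

# The text's example: H = 100, re = 0.2, everything deposited (cu = 0).
# Deposits run 100 + 80 + 64 + ..., totalling 500, matching equation (1).
print(round(money_stock_formula(100.0, 0.2, 0.0), 6))    # 500.0
print(round(money_stock_simulated(100.0, 0.2, 0.0), 6))  # 500.0
```

The simulation converges to the closed form for any positive reserve ratio, which is just the point at issue: the regular association holds only while the machine, properly shielded and repeatedly set running, stays in place.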
This simple example illustrates both of the central points I want to argue
in this chapter. First, regularity theorists have the story just upside down.
They mistake a sometime-consequence for the source. In certain very felicitous circumstances a regularity may occur, but the occurrence - or even the
possibility of the occurrence - of these regularities is not what is nomologically basic. We want to learn how to construct the felicitous circumstances
where output is predictable. Our theories must tell us what the fundamental
mechanisms available to us are, how they function, and how to construct a
machine with them that can predictably give rise to a regular association.
And the information that does this for us is not itself a report of any regular
association - neither a real regular association (one that occurs) nor a counterfactual association (one that might occur). Second, the example shows the
special place of ceteris paribus conditions. Equation (1) holds ceteris paribus,
that is under very special circumstances we may designate C. C does not
mark yet another variable like H, M, cu and re that has mistakenly been
omitted from equation (1), either by ignorance or sloth. The ceteris paribus
clause marks the whole description of the socio-economic machine that if
properly shielded and run repeatedly will produce a regular rise and fall of
money with deposits of the kind characterised in equation (1).
I use the word 'mechanism' in describing the money multiplier. Regularity
theorists also talk about mechanisms. But for them a mechanism can only be
another regularity, albeit a more 'fundamental' one. 'More fundamental' here
usually means either more encompassing, as in the relation of Newton's to
Kepler's laws, or having to do with smaller parts as in the relation between
individuals with their separate behaviours and expectations on the one hand
and macroeconomic regularities on the other. Economics does not have any
fundamental laws in the first sense. But the second sense will not help the
regularity theorist either since explanations citing regularities about individuals have all the same problems as any other. When we say 'mechanism' I
think we mean 'mechanism' in the literal sense - 'the structure or adaptation
of parts of a machine'.10 The little banking model is a good case. In our
banking example the fixed relationship between high-powered money that
the central bank creates and the ultimate size of the money stock depends on
the currency-to-deposit ratio not changing and on every commercial bank
being fully lent up, that is, being tight on, but not beyond, the legal reserve
ratio. It is not immediately obvious that a group of commercial banks can
'multiply' the amount of currency issued by the central bank when no one
bank can lend more than is deposited with it. The banking system can lend
more even though no one bank can. How much more? That is what the 'law'
tells us. But it is not derived by referring to a regularity, but rather by deducing the consequences of a mechanism. I illustrate the relations between economic models and regularities with another example in section 3.
When confronted with the fact that there seem to be a lot more laws than
there are regularities, regularity theorists are apt to defend their view by
resorting to counterfactual regularities. Laws are identified not with regularities that obtain but with regularities that would obtain if the circumstances
were different. It is fairly standard by now to go on to analyse counterfactuals
by introducing an array of other 'possible' worlds. Laws then turn out to be
regularities that occur 'elsewhere' even though they do not occur here - that
is, even though they do not occur at all. As a view about what constitutes a
law of nature this 'modalised' version of the regularity account seems even
more implausible than the original. A law of nature of this world - that is, a
law of nature full stop, since this is the only world there is - consists in a
regularity that obtains nowhere at all.
There are two strategies, both I think due to David Lewis, that offer the
beginnings of an account of how possible regularities can both be taken seriously as genuine regularities and also be seen to bear on what happens in
the actual world. The first takes possible worlds not as fictional constructions
but as real.11 The second takes them as bookkeeping devices for encoding
very complicated patterns of occurrences in the real world. In this case the
truth of a counterfactual will be entirely fixed by the complicated patterns
of events that actually occur. Rather than employing this very complicated
semantics directly we instead devise a set of rules for constructing a kind of
chart from which, with the aid of a second set of rules, we can read off
whether a counterfactual is true or false. The chart has the form of a description of a set of possible worlds with a similarity relation defined on them.
But the description need not be of anything like a world; the 'worlds' need
not be 'possible' in any familiar sense, and the special relation on them can
be anything at all so long as a recipe is provided for how to go from facts
about our world to the ordering on the possible worlds. The trick in both
cases, whether the possible worlds are real or only function as account books,
is to ensure that we have good reasons for making the inferences we need;
that is, that we are able to infer from the truth of a counterfactual as thus
determined to the conclusion that if we really were repeatedly to instantiate
the antecedent of the counterfactual, the consequent would regularly follow.
This of course is immediately guaranteed by the interpretation that reads
claims about counterfactual regularities just as claims about what regularities
would occur if the requisite antecedents were to obtain. But then we are back
to the original question. What sense does it make to claim that laws consist
in these nowhere existent regularities?
The point of this question is to challenge the regularity theorist to explain
what advantage these non-existent regularities have over a more natural ontology of natures which talks about causes, preventions, contributing factors,
triggering factors, retardants and the like. We may grant an empiricist point
of view in so far as we require that the claims of our theories be testable. But
that will not help the regularity theorist since causal claims and ascriptions of
capacities or natures are no harder to test than claims about counterfactual
regularities and indeed, I would argue, in many cases you can not do one
without the other.12
3
The game-theoretic model proposed by Oliver Hart and John Moore in their
'Theory of Debt Based on the Inalienability of Human Capital' provides a
good example of a blueprint for a nomological machine.13 I pick this example
not because it is especially representative of recent work in economic theory
but rather because the analogy with machine design is transparent in this case
and the contrast with a covering law account is easy to see. The central idea
behind the model is that crucial aspects of debt contracts are determined by
the fact that entrepreneurs cannot be locked into contracts but may withdraw
with only small (in their model no) penalties other than loss of the project's
assets. That means that some debt contracts may be unenforceable and hence
12 I have defended the second of these claims in Cartwright 1989 and in chapter 4. The first is argued for in chapter 3.
13 Hart and Moore 1991; republished with minor changes as Hart and Moore 1994.
inefficiency may result, i.e., some profitable projects may not be undertaken.
Hart and Moore derive a number of results. I shall discuss only the very first
(corollary 1) as an illustration of how socio-economic machines give rise to
regularities. As Hart and Moore describe, 'Corollary 1 tells us that inefficiency arises only if either (a) there is an initial sunk cost of investment . . . and/or (b) the project's initial returns are smaller than the returns from the assets' alternative use . . .'14
The model presents a toy machine that if set running repeatedly generates
an economically interesting regularity described in corollary 1. The inputs
can vary across the identity of individual players, sunk costs, income streams,
liquidation-value streams, and the initial wealth of the debtor. The output we
are considering is the regularity described in corollary 1:
R: All profitable projects which have no initial sunk costs and whose initial returns
are at least as large as the returns from the alternative use of the assets will be
undertaken.
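The regularity R can be restated as a checkable condition on a project. The sketch below is my own hypothetical encoding of R's antecedent clauses, the ones ruling out inefficiency sources (a) and (b); the Project record and the predicate are illustrative assumptions, not Hart and Moore's formalism.

```python
# Hypothetical encoding of regularity R (my illustration, not Hart and
# Moore's model): a profitable project is predicted to be undertaken when
# neither inefficiency source applies, i.e. (a) no initial sunk cost and
# (b) initial returns at least the assets' alternative-use returns.

from dataclasses import dataclass

@dataclass
class Project:
    profitable: bool
    sunk_cost: float           # initial sunk cost of investment (source (a))
    initial_return: float      # project's initial returns
    alternative_return: float  # returns from assets' alternative use (source (b))

def r_predicts_undertaken(p: Project) -> bool:
    """True exactly when R's antecedent holds for this project."""
    return (p.profitable
            and p.sunk_cost == 0                           # (a) ruled out
            and p.initial_return >= p.alternative_return)  # (b) ruled out

print(r_predicts_undertaken(Project(True, 0.0, 12.0, 10.0)))  # True
print(r_predicts_undertaken(Project(True, 5.0, 12.0, 10.0)))  # False: sunk cost present
```

The point of the sketch is only that R is conditional on the machine: outside the arrangement the model lays out, nothing guarantees that the predicate tracks what actually happens.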
The model lays out a number of features necessary for the design of a
machine: It tells us (i) the parts that make up the machine, their properties
and their separate capacities, (ii) how the parts are to be assembled and (iii)
the rules for calculating what should result from their joint operation once
assembled. The components are two game players ('debtor' or 'entrepreneur',
and 'creditor') who (a) have the same discount rates, (b) are motivated only
by greed, (c) operate under perfect certainty and (d) are perfect and costless
calculators. The arrangement of the players is claustrophobic: two players
set against each other with no possible interaction outside. Later extensions
consider what happens when either the debtor or the creditor has profitable
reinvestment opportunities elsewhere, but these are fixed opportunities not
involving negotiations with new players. Another extension tries in a very
'indirect and rudimentary fashion'15 to mimic with just the two players what
would happen if the debtor could negotiate with a new creditor. Other
assumptions about the arrangement are described as well; for instance, as
part of the proof of R it is assumed 'for simplicity' that the debtor 'has all
the bargaining power' in the original negotiation.16 The central features of
the arrangement are given by the rules laid out for the renegotiation game,
plus a number of further 'simplifying' assumptions about the relations among
capital costs, project returns and liquidation returns that contribute to the
proof of R. As we saw in chapter 3, rules for calculating how parts function
14 Hart and Moore 1991, p. 19. The result need mention only initial returns because of the simplifying assumption Hart and Moore make that if the return at any period is greater than the liquidation income at that period, this will continue to be the case in all subsequent periods.
15 Ibid., p. 40.
16 Ibid., p. 17.
17 This is part of the reason it is difficult to formulate a general informative philosophical account of mechanisms and their operation.
as a matter of law, there must be a machine like the one modelled by Hart
and Moore (or some other, with an appropriate structure) to give rise to it.
There are no law-like regularities without a machine to generate them. Thus
ceteris paribus conditions have a very special role to play in economic laws
like R. They describe the structure of the machine that makes the laws true.
The relation of laws to models I describe is familiar in economics, where
a central part of the theoretical enterprise consists in devising models in
which socio-economic regularities can be derived. But it is important to realise how different this is from the regularity theory. Look back to the regularity theory. R is not a universal association that can be relied on outside
various special arrangements. On the regularity theory law-likeness consists
in true universality. So there must be some universal association in the offing
or else R cannot be relied on at all, even in these special circumstances. If
an association like R appears to hold in some data set, that cannot be a matter
of law but must be viewed as merely a chance accident of a small sample
unless there is some kind of true universal association to back it up.
The difference between an account of natural law in terms of nature's
capacities and machines and the regularity account is no mere matter of metaphysics. It matters to method as well, both the methods used in the construction of theory and those used in its testing. R tells us that ceteris paribus all
profitable ventures will be taken up - except in conditions (a) and (b). The
regularity theory invites us to eliminate the ceteris paribus clause by
extending this list to include further factors - (c), (d), (e), . . . , (x) - until
finally a true universal is achieved: 'All profitable ventures that satisfy (a),
(b), . . ., (x) will be taken up.' This way of looking at it points the investigation in an entirely wrong direction. It focuses the study on more and more
factors like (a) and (b) themselves rather than on the structural features and
arrangements like those modelled by Hart and Moore that we need to put in
place if we want to ensure efficiency.
The regularity theory also carries with it an entourage of methods for testing that have no place here. When is a model like that of Hart and Moore a
good one? There are a large number of different kinds of problems involved.
Some are due to the fact that theirs is a game-theoretic model; these are to
some extent independent of the issue raised by the differences between a
regularity and a capacity view of laws. The advantage to game theory is that
it makes the relationship between the assumptions of the explanatory model
and laws like R that are to be explained in the model very tight. The results
that are 'derived' in the model are literally deduced. The cost is that the rules
of the games that allow these strict deductions may seem to be very unrealistic as representations of real life situations in which the derived regularities
occur. As Hart and Moore say about their own model, 'The game may seem
ad hoc, but we believe that almost any extensive form bargaining game is
Ibid., p. 12.
I began with the conventional claim that the laws of economics hold only
ceteris paribus. This is supposed to contrast with the laws of physics. On the
regularity account of laws this can only reflect an epistemological difference
between the two. Economists simply do not know enough to fill in their law
claims sufficiently. I have proposed that, if there were a difference, it would
have to be a metaphysical difference as well. Laws in the conventional regularity sense are secondary in economics. They must be constructed, and the
knowledge that aids in this construction is not itself again a report of some
actual or possible regularities. It is rather knowledge about the capacities of
institutions and individuals and what these capacities can do if assembled
and regulated in appropriate ways.
Does this really constitute a difference between economics on one hand
and physics on the other? As we have seen in earlier chapters, I think not. It
is sometimes argued that the general theory of relativity functions like a
true covering-law theory. It begins with regularities that are both genuinely
universal and true (or, 'true enough'); the phenomena to be explained are just
special cases of these very general regularities. Perhaps. But most of physics
works differently. Like economics, physics uses the analytic method. We
come to understand the operation of the parts - for example, Coulomb's
force, the force of gravity, weak and strong nuclear interactions, or the behaviour of resistors, inductors and capacitors - and we piece them together to
predict the behaviour of the whole. Even physics, I argue, needs 'machines'
to generate regularities - machines in the sense of stable configurations of
components with determinate capacities properly shielded and repeatedly set
running. If this is correct then differences in the metaphysics of natural laws
that I have been describing are not differences between economics and physics but rather between domains where the covering-law model obtains and
those where the analytic method prevails. Economics and physics equally
employ ceteris paribus laws, and that is a matter of the systems they study,
not a deficiency in what they have to say about them.
ACKNOWLEDGEMENTS
The bulk of this chapter is taken from Cartwright 1995a. The review of problems
with the regularity account of laws in section 2 is new. Research for this paper was
sponsored by the Modelling and Measurement in Physics and Economics Project at
the Centre for Philosophy of Natural and Social Science, London School of Economics. I would like to thank the members of that research group for help with this paper,
as well as especially Mary Morgan and Max Steuer. The paper was read at a symposium on realism in the Tinbergen Institute; I would also like to thank the students and
faculty who participated in that symposium for their helpful comments.
Hacking 1965.
Probability machines
ever it takes to set the machine running. Remember the case of the lever
from chapter 3. A lever is not just a rigid rod appropriately affixed to a
fulcrum. For we do not call something a 'lever' unless the rod is appropriately affixed and shielded so that it will obey the law of the lever. From the
point of view of empiricism, physics regularly cheats. Economists tend to be
truer to empiricism. In formulating laws they do try to stick to properties that
are measurable in some reasonable sense. This is one of the reasons why I
want to illustrate my views about machines with socio-economic examples.
The example I will consider in this section involves a debate about Sri
Lanka and the case for direct government action to maintain the food entitlement of the population. I focus on a discussion fairly late in the debate, by
Sudhir Anand and Ravi Kanbur.2 Sri Lanka has been considered as a test
case in development strategy because of its 'exceptionally high achievements
in the areas of health and education'.3 For a long while Sri Lanka has had a
very high life expectancy and high educational levels in comparison with
other developing countries in spite of being one of the low-income group.
Anand and Kanbur state the point at issue thus: 'The remarkable record in
achievement is attributed by some to a systematic and sustained policy of
government intervention in the areas of health, education, and food over a
long period. The counter to this position comes in several forms, which can
perhaps best be summarised in the statement that the intervention was, or has
become, "excessive" relative to the achievement.'4 Anand and Kanbur
criticise a 'cross-sectional' model standard in the literature, for which they
want to substitute a time series model. Their criticisms are detailed and
internal.5 What I want to object to is the whole idea of using a cross-sectional
model of this kind to think about the success of Sri Lanka's welfare programme.
Here is their model, which is intended 'to explain some measure of living
standard, H_it, for country i at time t':

H_it = α_t + βY_it + δE_it + λ_i + u_it

where Y_it is per capita income; E_it is social welfare expenditure; α_t is a time-specific but country-invariant effect assumed to reflect technological
advances; λ_i is a country-specific and time-invariant 'fixed effect'; δ is the
marginal impact of social expenditure on living standards; and u_it is a random
error term.
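The mechanics of such an estimation can be sketched with synthetic data. Everything below is illustrative: the coefficient values, panel dimensions and noise level are invented, not Anand and Kanbur's; the point is only that once country and time dummies absorb λ_i and α_t, least squares recovers δ.

```python
import numpy as np

rng = np.random.default_rng(0)
C, T = 30, 10                       # hypothetical panel: 30 countries, 10 periods
beta, delta = 0.5, 2.0              # invented 'true' coefficients
alpha = rng.normal(size=T)          # time-specific effects (alpha_t)
lam = rng.normal(size=C)            # country fixed effects (lambda_i)
Y = rng.uniform(1.0, 10.0, (C, T))  # per capita income
E = rng.uniform(0.0, 3.0, (C, T))   # social welfare expenditure
u = rng.normal(scale=0.1, size=(C, T))
H = alpha + beta * Y + delta * E + lam[:, None] + u   # living standard

# Regress H on Y, E plus country dummies and (T-1) time dummies,
# so that the fixed effects are absorbed and delta is identified.
rows, obs = [], []
for i in range(C):
    for t in range(T):
        rows.append(np.concatenate(([Y[i, t], E[i, t]],
                                    np.eye(C)[i], np.eye(T)[t][1:])))
        obs.append(H[i, t])
coef, *_ = np.linalg.lstsq(np.array(rows), np.array(obs), rcond=None)
delta_hat = coef[1]
print(delta_hat)  # close to the invented value 2.0
```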
The trick is to estimate δ: the marginal impact of social expenditure on
Anand and Kanbur 1995.
Ibid., p. 298.
Ibid.
And not, to my mind, correct. Cf. Sen 1988.
Anand and Kanbur 1995, p. 321.
Sen 1981.
α: technological advances
Y: per capita income
E: social welfare expenditure
λ: country-specific 'fixed effect'
u: 'error'
H: living standard
Ibid., p. 298.
Ibid., p. 299.
Ibid., p. 298.
Figure 7.2: Sen's causal structure for (a) South Korea and Taiwan and (b)
Sri Lanka.
p: poverty removal
b.c.: guarantee of basic entitlements
c: expanded employment
g: expansion of economy using labour-absorbing production
x: export expansion
s: variety of state activities
picture I defend two further assumptions to get their conclusion. (1) The
underlying structure must be one of individual rational decisions. (2) The
way these decisions are made will always after the short term readjust to
undo whatever a government intervention might try to achieve. I note these
two additional assumptions since there is no reason in the arguments I give
to think that we need have a micro base in order to account for macro regularity. The planetary system is a nomological machine: the capacities of the
Hendry 1995, p. 7.
This understanding of Hendry is supported by his own example of the dice rolling experiment:
Hendry 1995, p. 8.
enough) description to bring it into the theory? On the other hand, to insist
that every data-generating process, regardless of its features, is correctly
described by some probability measure or other is just to assert the point at
issue, not to defend it.
We can of course construct an analogous situation for the developing countries, where the conditional probability for the standard of living given a
certain level of welfare expenditure does make sense. Imagine you represent
a medical insurance company in a position to set up business in one of the
developing countries. Which of these countries you will be allowed to invest
in will be assigned in a fair lottery. You will not be able to find out the
standard of living in the country you draw but information on its level of
welfare expenditure will be available. In this case I have no quarrel with the
existence of a conditional probability for a country you might draw to have
a given standard of living for a given level of expenditure - that is just like
the probability that a card drawn from my pack will be a diamond given it
is red; and indeed it will be extremely useful to you in representing the
insurance company to know what this conditional probability is.
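The card case can be made completely explicit. A minimal sketch (the pack and suit names are the standard ones; nothing here is specific to the insurance example):

```python
from fractions import Fraction
from itertools import product

deck = list(product(["hearts", "diamonds", "clubs", "spades"], range(1, 14)))
red = [card for card in deck if card[0] in ("hearts", "diamonds")]
diamonds = [card for card in deck if card[0] == "diamonds"]

# Prob(diamond / red) = Prob(diamond & red) / Prob(red)
p = Fraction(len(diamonds), 52) / Fraction(len(red), 52)
print(p)  # 1/2
```

The chance set-up (a fair draw from a standard pack) is what licenses the computation; the same arithmetic applied to an arbitrary collection of countries would have nothing to attach to.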
I have been talking throughout about laws of nature in the realist way that
is typical nowadays, and I have been assuming the usual connection between
laws and warranted prediction. Laws I claim, including probabilistic laws,
obtain in only very special kinds of circumstances. Concomitantly, rational
prediction is equally limited. For great swathes of things we might like to
know about, there are no good grounds for forecasting. Glenn Shafer in his
recent book, The Art of Causal Conjecture,13 develops his own special
account of probability, but he arrives at the same conclusion. Shafer's standpoint is that of a staunch empiricist. His probabilities have directly to do with
rational prediction, without any mediation by laws. Shafer finds the concepts
of a law governing the evolution of events and of the determination of one
state of affairs by another to be problematic. I take it that his views are
similar to those of our Vienna Circle forebears. Perhaps we know how to
begin to think about matters concerning our making something happen. But
what guides do we have when it comes to one natural event forging another
into a prescribed form, much less that of laws of nature writ nowhere by no
one constraining the behaviour of objects in their dominion?
Shafer's alternative is to take probability as the rational degree of belief
of a Laplacean demon armed with complete information about everything
that has happened so far in history. The demon thus has just the same kind
of information that we do, only lots more of it. And the demon is engaged
in just the kind of activity that we engage in daily in science and thus have
a firm grasp on. It is very important in Shafer's thinking that the Laplacean
Shafer 1996.
demon - whom he calls 'Nature' - has the same kind of information that we
do. 14 This means for him, first, that Nature knows no laws but only facts;
second, that the facts, even when they are all known up to a certain time,
may exhibit no coherent pattern. They may continue to look just as disorderly
as they do to real practising scientists, who know only a small subset of them.
This I take it is why Shafer argues that for many cases, probability - i.e.,
Nature's rational degrees of belief - may not exist. Consider the diagram in
figure 7.3a. Shafer says:
Consider the determination of a woman's level of education in the society represented
[on the left hand side of figure 7.3a]. Perhaps Nature can say something about this.
Perhaps when Nature observes certain events in a woman's childhood she changes
her predictions about how much schooling the woman will receive. [The right hand
side of figure 7.3a] gives an example, in which we suppose that the experience of
being a girl scout encourages further schooling. But it is not guaranteed that Nature
will observe such regularities. Perhaps the proportion of girls becoming scouts and
the schooling received by scouts and non-scouts varies so unpredictably that Nature
cannot make probabilistic predictions, as indicated in [figure 7.3b]. Perhaps there are
no signals that can help Nature predict in advance the amount of schooling a girl will
get.15
Figure 7.3b is especially interesting because probability gaps appear in the
middle of a tree. I too have been arguing for probability gaps like these.
Recall the example of chapter 5. I sign a cheque for £300, as I did this past
weekend to pay our nanny. I suppose that the banking system in England for
ordinary transactions is well regulated, the rules are clearly laid out, and the
bank employees are strongly disposed and competent to follow the rules.
They do have a policy reviewing the signatures on cheques, however, which
leads to a small number of false rejections. Coupling that with other failures
(like an inattentive clerk) we assume that not all properly signed cheques for
£300 cause the person to whom they are written - in this case, our nanny - to be £300 richer if they reach that person's bank. Still there is a very good
chance they will: we may say 'a high qualitative probability'. But that qualitative probability is not free standing. Rather it obtains on account of the
institutional set-ups that I described - crudely, the banking laws and people's
dispositions to obey them.
Notice, however, that I said the payee should be made richer by the
cheque signing if the cheque arrives. That is in general not a problem. There
are a thousand and one ways, or more, to get a cheque from here to there,
and all guaranteed by some different institutional arrangements than those
Ibid.
Shafer 1997, pp. 7-8.
that make possible the causal law connecting cheque signing with the payee's
getting money. In fact it makes sense to set up the banking machine in the
way we do just because there are a variety of ways readily available to carry
the causal message from the cause to the effect. They make it possible for
there to be a high probability for the payee to become richer, given that a
cheque is signed; but there is no necessity that they themselves have any
analogous conditional probability. A signed cheque must travel by bus or
mail or train or foot or messenger or some other way in order to be successful
at providing money to the payee. But that goes no way to showing that there
is some conditional probability or other that a cheque goes by, say, Royal
Mail, given it is signed.
Think about my situation for example. The children were staying with me
in London and would see the nanny on the way back to boarding school. I
was flying to California the next day. Would I send the cheque by hand with
them, or post it with the Royal Mail, or, too busy to buy English stamps,
take it all the way to California and use the United States mail? Nothing, I
think, fixes even a single case propensity, let alone a population probability.
Certainly if there is one, it is not guaranteed by the same nomological
machine that fixes the possibility for a conditional probability for a payee to
receive money given that a cheque is signed, nor by the many institutions
that guarantee the existence of the various routes the cheque may follow: for
example, the charter for the Royal Mail that runs the post; the Department
of Transport that keeps up the roads; the statutes and attitudes that prohibit
apartheid so that my children will be able to enter Oxford; and so forth. In
order to ensure that there is a conditional probability that a cheque is, say, in
the mail given it is signed, one would need another nomological machine
beyond all these, one that structured the dispositions of the banking population in a reliable way. I do not think there is one. But that of course is an
empirical question. The point is that these new conditional probabilities are
not required by the other probabilistic and causal laws we have assumed we
have;16 and if they are to obtain we will need some different socio-economic
machines beyond the ones we have so far admitted.
The mention of conditional probabilities should remind us of another range
of cases where we have long recognised that there may be probability gaps:
cases where the conventional characterisation of conditional probability - Prob(B/A) = Prob(A&B)/Prob(A) - seems to fail. For instance situations
where the conditional probability seems easy to assess, but the quantities in
Shafer 1996 uses the theory of martingales to establish a number of results about when 'new'
probabilities, either qualitative or quantitative, will be required by probabilistic assumptions
already made.
Figure 7.3a: Some detail about how the educational level of a woman is determined. Notice that the refinement agrees with all the causal
assertions in the original tree. When a woman leaves school after eight years, Nature gives her a 5% chance of earning $12.
How she decided to leave school does not matter.

Figure 7.3b: In this version of the story, Nature does not observe any stable pattern in the
proportion of girls who become girl scouts or in the proportion of girl scouts and
non-girl scouts who finish 12 years of schooling. Source: Shafer 1997, p. 8.
be displayed; nor, given the set-up, does it seem possible for the particles to
communicate with each other at the last moment about what the outcome
will be. How then do the results come to be correlated? One hypothesis is
that the initial quantum state - the singlet state - acts as a joint cause producing the two outcomes in tandem. How should we test that hypothesis? One
natural idea is to make use of the standard probabilistic analysis of causality.
According to this analysis, (ceteris paribus) C causes E if Prob(E/C) >
Prob(E/-C). So for each of the particles we consider the question: is the
probability of a spin-up outcome higher given the singlet state than it is
without the singlet state?
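For contrast, here is a case where the probabilistic criterion does apply, computed from a hypothetical frequency table (the counts are invented purely to illustrate the comparison Prob(E/C) > Prob(E/-C)):

```python
from fractions import Fraction

# Invented co-occurrence counts for a putative cause C and effect E.
counts = {(True, True): 30, (True, False): 10,
          (False, True): 15, (False, False): 45}
total = sum(counts.values())

p_c = Fraction(counts[True, True] + counts[True, False], total)
p_e_given_c = Fraction(counts[True, True], total) / p_c           # Prob(E/C)
p_e_given_not_c = Fraction(counts[False, True], total) / (1 - p_c)  # Prob(E/-C)
print(p_e_given_c, p_e_given_not_c)  # 3/4 versus 1/4
```

Every quantity here is well defined because the whole table is given. The trouble in the quantum case is precisely that nothing fixes the analogues of these counts for the no-singlet-state alternative.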
Now we are back at Hajek's problem with the coin tossing example. If we
have a well constructed experimental arrangement with particles prepared
and maintained in the singlet state, quantum mechanics can tell us the probability of a given outcome in a spin measurement. But if we try to read that as
a conditional probability, as the probabilistic analysis of causation requires,
we are in trouble. Corresponding to Prob(C&E) in this case is the probability
that we create the right kind of experimental arrangement involving a singlet
state and then get a spin-up outcome. Quantum theory has nothing to say
about what this probability could be; and it does not look as if any other
theory will do the job either. The point is even more striking when we consider Prob(E/-C), for now we have to calculate what the probability of a
spin-up outcome is if we measure a particle not prepared in the singlet state.
What state, if any, should the particle be in then, and what should be the
probability of this state? We do not have an answer. Nor, I think, does anyone
expect one. We intuitively recognise that here there is no probability to be
had.
4
Salmon 1971.
Figure 7.4. Salmon machine (three panels showing an alpha source, protons, α-particles and a magnetic field). Source: designed by Towfic Shomar.
First we need an arena - say, a closed container. Then we need a mechanism, a mechanism that can ensure two things: that there is a fixed joint probability (or range of fixed probabilities with a fixed probability for mixing)
for the presence and absence of the two kinds of causes in the container, and
that under this probability there is sufficient anticorrelation, given the levels
of effectiveness, to guarantee the decrease in probability. Next we must
ensure that there is no other source of the effect in the container or introduced
with either of the elements. We must also guarantee that there is nothing
present in correlation with either of the causes that annihilates the effect as
it is produced, etc., etc. Figure 7.4 is a model of the kind of arrangement that
experiments in mind is that the tosses are independent and undertaken with a randomly varying force, and we construe the probability of a half as the parameter and
the outcomes as the random variable. Now imagine the tossing being done by a
perfectly calibrated machine in a vacuum, where the coin is always loaded with the
same face up in the machine. If the outcome on the first toss is a head, the probability
of a head becomes unity in successive tosses. Here, changing the design of experiments has radically altered the parameter and the realizations. Consequently, all
aspects of the design of experiments must be selected jointly.19
In Hendry and Morgan's example we have one coin with a fixed physical
structure, but two probability machines. We could of course have a vast
variety of other machines generating a vast variety of other probabilistic
behaviours. We may if we like say of the symmetric coin that it has a fifty
per cent probability to land heads and fifty per cent to land tails. But that is
just a shorthand way to refer to a very generic capacity the coin has by virtue
of being symmetric. In this case we are picking out the capacity by implicitly
pointing to the behaviour it gives rise to in what we all recognise to be a
canonical chance set-up. But as Hendry and Morgan make clear, with different surrounding structures the coin will give rise to different probabilities.
Imagine by contrast that we flip the coin a number of times and record the
outcomes, but that the situation of each flip is arbitrary. In this case we cannot
expect any probability at all to emerge.20 Just notice how specific Hendry and
Morgan are in their description of the two machines. In the first, 'the tosses
are independent and undertaken with a randomly varying force'; in the
second, we 'imagine the tossing being done by a perfectly calibrated machine
in a vacuum, where the coin is always loaded with the same face up in the
machine'; and clearly in both cases they mention only a few of the necessary
features they are assuming the set-ups to have. The matter is different of
course if by 'arbitrary' tosses we mean 'random' tosses, either independent
or with some special kind of fixed dependencies, shielded so that nothing
further affects the outcome than what has already been captured by previous
descriptions. In this case we have a probability. But we also have a well
understood chance set-up. With no fixed surrounding structure of the right
kind for the flips, there will be no probabilities at all to describe the outcomes.
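The two machines Hendry and Morgan describe can be caricatured in a few lines of code. The mechanisms below are stand-ins of my own devising, but they show how the same coin yields a stable frequency under one set-up and a degenerate one under the other:

```python
import random

def random_force_machine(n, seed=0):
    # Independent tosses with randomly varying force: the canonical chance set-up.
    rng = random.Random(seed)
    return [rng.choice("HT") for _ in range(n)]

def calibrated_machine(n):
    # A perfectly calibrated tosser in a vacuum, the coin always loaded the
    # same face up: every toss repeats the first outcome.
    return ["H"] * n

tosses = random_force_machine(10_000)
freq_heads = tosses.count("H") / len(tosses)
print(freq_heads)                   # settles near 0.5
print(set(calibrated_machine(10)))  # a single repeated outcome
```

With no such fixed surrounding structure - tosses under arbitrary, shifting conditions - neither function applies, and there is no frequency for a simulation to converge to.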
5
Turning from these specific examples, we can ask, 'In general, how do probabilities attach to the world?' The answer is via models, just like the abstract
descriptions of physics that I discussed in chapter 2. Just like the classical
McAllister 1975.
An economics example
Ibid., p. 111.
Ibid., p. 120.
Mulholland and Jones 1968, p. 167.
distributions that they give rise to. I would like now to look at an example
from an empirical science, in particular economics. Most economic models
are geared to produce totally regular behaviour, represented, on the standard
account, by deterministic laws. My example here is of a model designed to
guarantee that a probabilistic law obtains.
The paper I will discuss is titled 'Loss of Skill during Unemployment and
the Persistence of Unemployment Shocks' by Christopher Pissarides.25 I
choose it because, out of a series of employment search models in which the
number of jobs available depends on workers' skills and search intensities,
Pissarides' is the first to derive results of the kind I shall describe about the
probabilities of unemployment in a simple way. The idea investigated in the
paper is that loss of skill during unemployment leads to less job creation by
employers which leads to continuing high levels of unemployment. The
method is to produce a model in which
f_t = the probability of a worker getting a job at period t

(i) depends on the probability of getting a job at the previous period (f_{t-1}) if
there is skill loss during unemployment - i.e., shows 'persistence'; and (ii)
does not depend on f_{t-1} if not. The model supposes that there is such a probability and puts a number of constraints on it in order to derive a further constraint on its dynamics:
(i)
(ii)
The point for us is to notice how finely tuned the details of the model plus
the initial constraints on the probability must be in order to fix even a well-defined constraint on the dynamics of f_t, let alone f_t itself.
The model is for two overlapping generations each in the job market for
two periods only: workers come in generations, and jobs are available for
one period only so that at the end of each period every worker is, at least for
the moment, unemployed. 'Short-term unemployed' refers to 'young'
workers just entering the job market at a given time with skills acquired
through training plus those employed, and thus practising their skills, in the
previous period; 'long-term unemployed' refers to those from the older generation who were not employed in the previous period. The probability / t of
a worker getting a job in the between-period search depends critically on %,
the number of times a job and worker meet and are matched so that a hire
would take place if the job and the worker were both available. By assumption, % at r is a determinate function of the number of jobs available at t (7t)
and the number of workers available at / (2L). Wages in the model are deter25
Pissarides 1992.
170
mined by a static Nash bargain, which in the situation dictates that the worker
and employer share the output equally and guarantees that all matches of
available workers and jobs lead to hiring. The central features of the first
model are listed in figure 7.6. Variations on the basic model that relax the
assumptions that all workers search in the same way and thus have the same
probability for a job match are developed in later sections of the paper.
The details of the argument that matter to us can be summarised in three
steps. (I follow Pissarides' numbering of formulae, but use primes on a
number to indicate formulae not in the text but that follow in logical sequence
the numbered formula.)
A firm's expected profit, π_t, from opening a job at t is

(4)

where γ = 1 represents no skill loss, γ < 1 the opposite. It is crucial that f_{t-1}
appears in this formula. It enters because the expected profit depends on the
probability of a job meeting a short- and a long-term unemployed worker,
which in turn depends on the number of long-term unemployed workers
available and hence on the probability of employment at t-1.
B. The number of jobs will adjust so that no firm can make a profit by
opening one more, which, given that the cost of opening a job is 1/k, leads to

π_t = 1/k    (5')

(7)

since the number of hires cannot be greater than the number of jobs or
workers available nor the number of meetings that take place. Coupling these
with the assumption that the function χ is homogeneous of degree 1 gives

f_t = min{χ(J_t/2L, 1), 1}    (8)

(8')

f_t = χ(Φ, 1)    (9')
1. Discrete time.
2. Two overlapping generations.
a. Each of fixed size, L.
b. Each generation is in the job market exactly two periods.
3. Each job lasts one period only and must be refilled at the beginning of
every period.
4. The number of jobs, J_t, available at beginning of period t is
endogenous.
5. Workers in each of their two life periods are either employed or
unemployed.
6. a. Output for young workers and old, previously employed
workers = 2.
b. Output for old, previously unemployed workers = 2γ, 0 < γ < 1
(γ < 1 represents skill loss during unemployment.)
7. Unemployed workers have 0 output, no utility, no income.
(This is relevant to calculating wages and profits.)
8. In each period all workers and some jobs are available for matching.
9. Each job must be matched at the beginning of a period to be filled in
that period.
10. In each period workers and jobs meet at most one partner.
11. The number of matches between a job and a worker is designated by χ,
where
a. χ is at least twice differentiable.
b. First derivatives of χ are positive; second, negative.
c. χ is homogeneous of degree 1.
d. χ(0, 2L) = χ(J_t, 0) = 0.
e. χ(J_t, 2L) ≤ min(J_t, 2L).
12. There is a probability that a worker meets a job at the beginning of t,
designated by f_t.
a. f_t does not depend on what a worker does nor on whether the worker
is employed or unemployed.
b. f_t is a function only of J_t and L.
13. There is a probability that a job meets a worker at the beginning of
period t.
a. This probability is independent of what jobs do.
b. This probability is a function only of J_t and L.
14. The cost of opening a job and securing the output as described in 6, is
equal to 1/k
(whether the job is filled or not).
15. Wages are determined by a Nash bargain.
16. Workers and employers maximise expected utility.
Figure 7.6 Assumptions of Pissarides' 1992 model 1.
ning with the second: (ii) The case where there is no skill loss during unemployment is represented by γ = 1. (Short- and long-term workers are equally
productive. See assumption 6, figure 7.6.) Then

from which we see that f_t does not depend on f_{t-1}. Hence with no skill loss
there is no unemployment persistence in this model.

(i) When there is skill loss, γ < 1. Differentiating (9') with respect to f_{t-1} in
this case gives

[1 - (dχ/dΦ)(k/2){1 + γ + (1 - γ)f_{t-1}}] (df_t/df_{t-1}) = (k/2)(1 - γ)f_t (dχ/dΦ)    (11)

Then by the homogeneity of
Ibid., p. 1377.
I repeat the lesson I wish to draw from looking at Pissarides' search model.
Turn again to figure 7.6. It takes a lot of assumptions to define this model
and, as we have seen, the exact arrangement matters if consequences are to
be fixed about whether there is persistence in the dynamics of unemployment
probability or not. Those arrangements are clearly not enough to fix the exact
nature of the persistence, let alone the full probability itself. In model 1,
where job openings are endogenous, the dependence of jobs on workers'
histories must be engineered just so, so that J_t will be a function of the
product f_t f_{t-1}. In model 2, where the product could not possibly enter through
J_t, the facts about how workers search must be aligned just right to get the
product into S_t. And so forth.
My claim is that it takes hyperfine-tuning like this to get a probability.
Once we review how probabilities are associated with very special kinds of
models before they are linked to the world, both in probability theory itself
and in empirical theories like physics and economics, we will no longer be
tempted to suppose that just any situation can be described by some probability distribution or other. It takes a very special kind of situation with the
arrangements set just right - and not interfered with - before a probabilistic
law can arise.
As I noted at the beginning, what is special about these situations can be
pointed to by labelling them nomological machines: they are situations with
a fixed arrangement of parts where the abstract notions of operation, repetition, and interference have concrete realisations appropriate to a particular
law and where, should they operate repeatedly without interference, the outcome produced would accord with that law.
8
Hajek 1997a.
Ibid., p. 1.
Ibid., p. 2.
Ibid., p. 3.
of the results from toss to toss. So what seemed at first a refutation turns into
a confirmation of the points I am defending here. Hajek can get a probability
from almost anything. But to do so he must construct a chance set-up. The
probabilities that come out can be finely tuned by adjusting the exact construction of the chance set-up. Depending on just what the chance set-up is,
the probabilities can be anything, to any degree of accuracy, or they can be
nothing at all if we do not treat the key in any special way.31
9
Chance set-ups
I began with Ian Hacking's views from The Logic of Statistical Inference.
There Hacking urges that propensities in chance set-ups and frequencies are
obverse sides of the same phenomenon. Propensities in chance set-ups give
rise to frequencies; frequencies are the expression of these propensities. This
is a point of view that I have tried to defend here. A chance set-up may
occur naturally or it may be artificially constructed, either deliberately or by
accident. In any case probabilities are generated by chance set-ups, and their
characterisation necessarily refers back to the chance set-up that gives rise to
them.32 We can make sense of the probability of drawing two red balls in a
row from an urn of a certain composition with replacement; but we cannot
make sense of the probability of six per cent inflation in the United Kingdom
next year without an implicit reference to a specific social and institutional
structure that will serve as the chance set-up that generates this probability.
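The urn case can be made concrete in a short simulation (a sketch of my own, not from the text): the probability of two reds in a row is well defined only once the set-up is fixed — the urn's composition, drawing with replacement, and the drawing procedure itself.

```python
import random

def chance_setup_two_reds(red, black, trials=100_000, seed=0):
    """Simulate the urn chance set-up: draw twice with replacement
    and count how often both draws come up red."""
    rng = random.Random(seed)
    n = red + black
    hits = sum(
        1 for _ in range(trials)
        if rng.randrange(n) < red and rng.randrange(n) < red
    )
    return hits / trials

# The set-up's composition fixes the probability:
# P(two reds) = (red / (red + black)) ** 2
print(chance_setup_two_reds(3, 7))  # close to (3/10)**2 = 0.09
```

Change the composition or drop the replacement and a different probability results; no probability at all is defined until such a set-up is specified.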
The originators of social statistics followed this pattern, and I think rightly
so. When they talked about the 'iron law of probability' that dictated a fixed
number of suicides in Paris every year or a certain rising rate of crime, this
was not conceived as an association laid down by natural law as they (though
not I) conceived the association between force and acceleration, but rather as
an association generated by particular social and economic structures and
susceptible to change by change in these structures.
The same, I claim, is true of all our laws, whether we take them to be
iron - the typical attitude towards the laws of physics - or of a more flexible
material, as in biology, economics or psychology. J. J. C. Smart33 has urged
that biology, economics, psychology and the like are not real sciences. That
is because they do not have real laws. Their laws are ceteris paribus laws,
31 I think that Hajek could agree with my insistence that the chance set-up is necessary, for it in no way counters his own interpretation of probabilities as single case propensities.
32 Hugh Mellor also thinks that probabilities, or at least objective chances, are not defined for every arbitrary situation. For instance, an effect does not confer a chance on its causes, though the causes do fix the chance of the effect (see Mellor 1995). His arguments will help to support some of my positions as well.
33 Smart 1963.
and a ceteris paribus law is no law at all. The only real laws are, presumably,
down there in fundamental physics. I put an entirely different interpretation
on the phenomena Smart describes. As we have seen, if the topic is laws in
the traditional empiricist sense of claims about necessary patterns of regular
association, we have ceteris paribus laws all the way down: laws hold only
relative to the chance set-ups that generate them.
What basic science aims for, whether in physics or economics, is not primarily to discover laws but to find out what stable capacities are associated
with features and structures in their domains, and how these capacities behave
in complex settings. What is fundamental about fundamental physics is that
it studies the capacities of fundamental particles. These have the advantage,
supposedly, of being in no way dependent on the capacities of parts or
materials that make them up. That is not true of the capacities of a DNA
chain or a person. But the laws expressing the probabilistic regularities that
arise when these particles are structured together into a nucleus or an atom
are every bit as dependent on the stability of the structure and its environment
as are the regularities of economics or psychology. A nucleus in an atom
may be a naturally occurring chance set-up, but it is a set-up all the same.
Otherwise the probabilistic laws of nuclear physics make no sense. I repeat
the lesson about the dual nature of frequencies and propensities: probabilities
make sense only relative to the chance set-up that generates them, and that
is equally true whether the chance set-up is a radio-active nucleus or a socioeconomic machine.
Part III
In the 1960s when studies of theory change were in their heyday, models
were no part of theory. Nor did they figure in how we represent what happens
in the world. Theory represented the world. Models were there to tell us how
to change theory. Their role was heuristic, whether informally, as in Mary
Hesse's neutral and negative analogies, or as part of the paraphernalia of a
more formally laid out research programme, as with Imre Lakatos. The 1960s
were also the heyday of what Fred Suppe dubbed 'the received view' of
theory,1 the axiomatic view. Theory itself was supposed to be a formal system
of internal principles on the one hand - axioms and theorems - and of bridge
principles on the other, principles meant to interpret the concepts of the
theory, which are only partially defined by the axioms. With the realisation
that axiomatic systems expressed in some one or another formal language are
too limited in their expressive power and too bound to the language in which
they are formulated, models came to be central to theory; they came to constitute theory. On the semantic view of theories, theories are sets of models.2
The sets must be precisely delimited in some way or another, but we do not
need to confine ourselves to any formal language in specifying exactly what
the models are that constitute the theory.
Although doctrines about the relation of models to theory changed from
the 1960s to the 1990s, the dominant view of what theories do has not
changed: theories represent what happens in the world. For the semantic view
that means that models represent what happens. One of the working hypotheses of the LSE/Amsterdam Modelling Project has been that this view is
mistaken. There are not theories, on the one hand, that represent and phenomena, on the other, that get represented (though perhaps only more or less
accurately). Rather, as Margaret Morrison3 put it in formulating the background to our project, models mediate between theory and the world. The
1 Suppe 1977.
2 Van Fraassen 1980, or Giere 1988.
3 Morrison 1997.
theories I will discuss here are the highly abstract theories of contemporary
physics. I want to defend Morrison's view of models not as constituting these
theories but as mediating between them and the world.
Of course there are lots of different kinds of models serving lots of different purposes, from Hesse's and Lakatos' heuristics for theory change to Morrison's own models as contextual tools for explanation and prediction. In this
discussion I shall focus on two of these. The first are models that we construct
with the aid of theory to represent real arrangements and affairs that take
place in the world - or could do so under the right circumstances. I call these
representative models. This is a departure from the terminology I have used
before. In How the Laws of Physics Lie,4 I called these models phenomenological to stress the distance between fundamental theory and theory-motivated
models that are accurate to the phenomena. But How the Laws of Physics Lie
supposed, as does the semantic view, that the theory itself in its abstract
formulation supplies us with models to represent the world. They just do not
represent it all that accurately. Here I want to argue for a different kind of
separation: theories in physics do not generally represent what happens in the
world; only models represent in this way, and the models that do so are not
already part of any theory. It is because I want to stress this conclusion that
I have changed the label for these models.
Following the arguments about capacities initiated in chapter 10 of
Nature's Capacities and their Measurement5 and further developed here, I
want to argue that the fundamental principles of theories in physics do not
represent what happens; rather, the theory gives purely abstract relations
between abstract concepts. For the most part, it tells us the capacities or
natures of systems that fall under these concepts. As we saw in chapter 3, no
specific behaviour is fixed until those systems are located in very specific
kinds of situations. When we want to represent what happens in these situations we will need to go beyond theory and build a model, a representative
model. And, as I described in chapter 3, if what happens in the situation
modelled is regular and repeatable, these representative models will look very
much like blueprints for nomological machines.
For a large number of our contemporary theories, such as quantum mechanics, quantum electrodynamics, classical mechanics and classical electromagnetic theory, when we wish to build a representative model in a systematic or principled way, we shall need to use a second kind of model. For all
of these theories use abstract concepts, 'abstract' in the sense developed in
chapter 2: concepts that need fitting out in more concrete form. The models
that do this are laid out within the theory itself in its bridge principles. First
4 Cartwright 1983.
5 Cartwright 1989.
Boumans 1998.
It should be noted that this is not merely a matter of the distinction between the logic of
discovery and the logic of justification, for my claim is not just about where many of our
most useful representative models come from but also about their finished form: these models
are not models of any of the theories that contribute to their construction.
Towfic Shomar, in his PhD dissertation,8 gives a nice example both of the
importance of co-operation and of the role of the Ansatz. As Shomar stresses,
this model, built upwards from the phenomena themselves, is still for a great
many purposes both more useful for prediction and wider in scope than what
for years stood as the fundamental and correct model, by Bardeen, Cooper
and Schrieffer (BCS).9 The situation is reflected in the description in a standard text by Orlando and Delin of the development followed up to the point
at which the Ginzburg-Landau model is introduced.10 As Orlando and Delin
report, their text started with electrodynamics as the 'guiding principle' for
the study of superconductivity; this led to the first and second London equations.11 The guiding discipline at the second stage was quantum mechanics,
resulting in a 'macroscopic model' in which the superconducting state is
described by a quantum wave function. This led to an equation for the
supercurrent uniting quantum mechanical concepts with the electrodynamic
ones underlying the London equations. The supercurrent equation described
flux quantization and properties of type-II superconductors and led to a
description of the Josephson effect. The third stage introduced thermodynamics to get equilibrium properties. Finally, with the introduction of the
Ginzburg-Landau model, Orlando and Delin were able to add considerations
depending on 'the bi-directional coupling between thermodynamics and electrodynamics in a superconducting system'.12
This kind of creative and co-operative treatment is not unusual in physics,
and the possibility of producing models that go beyond the principles of any
of the theories involved in their construction is part of the reason that modern
physics is so powerful. So, under the influence of examples like the Ginzburg-Landau model, I would no longer make my earlier points by urging that the
laws of physics lie, as they inevitably will do when they must speak on their
own. Rather, I would put the issue more positively by pointing out how
powerful their voice can be when put to work in chorus with others.
The first point I want to urge in this chapter then is one about how far the
knowledge contained in the theories of physics can go towards producing
accurate predictive models when these theories are set to work co-operatively
with what else we know or are willing to guess for the occasion. But I shall
not go into this in great detail since it is aptly developed and defended in the
volume coming out of the research on our Modelling Project.13 My principal
thesis is less optimistic. For I shall also argue that the way our theories get
8 Shomar 1998.
9 Bardeen, Cooper and Schrieffer 1957.
10 Orlando and Delin 1990.
11 See Suarez 1998 for a discussion of these.
12 Orlando and Delin 1990, p. 508.
13 Morrison and Morgan 1998.
applied - even when they co-operate - puts serious limits on what we can
expect them to do. My chief example will be the BCS theory of superconductivity, which has been one of the central examples in the LSE Modelling
Project. Readers interested in a short exposition of the core of the argument
about the limits of theory in physics can move directly to Section 6.
2
'Good theory already contains all the resources necessary for the representation of the happenings in its prescribed domain.' I take this to be a doctrine
of the 'received' syntactic view of theories, which takes a theory to be a set
of axioms plus their deductive consequences. It is also a doctrine of many
standard versions of the semantic view, which takes a theory to be a collection of models.
Consider first the syntactic view. C. G. Hempel and others of his generation
taught that the axioms of the theory consist of internal principles, which
show the relations among the theoretical concepts, and bridge principles. But
Hempel assigned a different role to bridge principles than I do. For Hempel,
bridge principles do not provide a way to make abstract terms concrete but
rather a way to interpret the terms of theory, whose meanings are constrained
but not fixed by the internal principles. Bridge principles, according to
Hempel, interpret our theoretical concepts in terms of concepts of which we
have an antecedent grasp. On the received view, if we want to see how
specific kinds of systems in specific circumstances will behave, we should
look to the theorems of the theory, theorems of the form, 'If the situation
(e.g., boundary or initial conditions) is X, Y happens'.
Imagine for example that we are interested in a simple well-known case the motion of a small moving body subject to the gravitational attraction of
a larger one. The theorems of classical mechanics will provide us with a
description of how this body moves. We may not be able to tell which theorem we want, though, for the properties described in the theory do not match
the vocabulary with which our system is presented. That is what the bridge
principles are for. 'If the force on a moving body of mass m is GmM/r², then
the body will move in an elliptical orbit 1/r = c(1 + e cos φ) (where e is the
eccentricity; c a constant)'. To establish the relevance of this theorem to our
initial problem we need a bridge principle that tells us that the gravitational
force between a large mass M and a small mass m is of size
GmM/r². Otherwise the theory cannot predict an elliptical orbit for a planet.
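The step from the bridge principle to the theorem can be sketched in the standard textbook way (my notation, an addition to the text): writing u = 1/r and using conservation of angular momentum, Newton's second law in polar coordinates reduces to the Binet equation, whose solution is the ellipse quoted above.

```latex
F = ma, \quad F = -\frac{GmM}{r^{2}}\hat{r}
\;\Longrightarrow\;
\frac{d^{2}u}{d\varphi^{2}} + u = \frac{GM}{h^{2}},
\qquad u \equiv \frac{1}{r},\; h \equiv r^{2}\dot{\varphi}
```

The general solution is u(φ) = (GM/h²)(1 + e cos φ), i.e. 1/r = c(1 + e cos φ) with c = GM/h², which is an ellipse for 0 ≤ e < 1.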
The bridge principles are crucial; without them the theory cannot be put
to use. We may know for example from Schrödinger's equation that a
quantum system with some particular initial state Ψ_i and Hamiltonian H =
−(ℏ²/2m)∇² + V(r) + (ieℏ/mc)A(r,t)·∇ will evolve into the state Ψ_f. But this is
of no practical consequence till we know that Ψ_i is one of the excited stationary states for the electrons of an atom, H is the Hamiltonian representing the
interaction with the electromagnetic field and from Ψ_f we can predict an
exponentially decaying probability for the atom to remain in its excited state.
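The decay claim here is the familiar one; in standard notation (mine, not the text's), for a decay rate Γ fixed by the interaction Hamiltonian, the survival probability of the excited state is

```latex
P_{\text{excited}}(t) = e^{-\Gamma t}, \qquad \Gamma = \frac{1}{\tau}
```

where τ is the lifetime of the excited state.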
The usefulness of theory is not the issue here, however. The point is that on
the 'received view' the theorems of the theory are supposed to describe what
happens in all those situations where the theory matters, and this is true
whether or not we have bridge principles to make the predictions about what
happens intelligible to us. On this view the only problem we face in applying
the theory to a case we are concerned with is to figure out which theoretical
description suits the starting conditions of the case.
Essentially the same is true for the conventional version of the semantic
view as well. The theory is a set of models. To apply the theory to a given
case we have to look through the models to find one where the initial conditions of the model match the initial conditions of the case. Again it helps to
have the analogue of bridge principles. When we find a model with an atom
in state Ψ, subject to Hamiltonian H, we may be at a loss to determine if this
model fits our excited atom. But if the atoms in the models have additional
properties - e.g., they are in states labelled 'ground state', 'first excited state',
'second excited state', and so on - and if the models of the theory are constrained so that no atom has the property labelled 'first excited state' unless
it also has a quantum state Ψ_1, then the task of finding a model that matches
our atom will be far easier. I stress this matter of bridge principles because I
want to make clear that when I urge, as I do here and in chapter 3, that the
good theory need not contain the resources necessary to represent all the
causes of the effects in its prescribed domain, I am not just pointing out that
the representations may not be in a form that is of real use to us unless
further information is supplied. Rather I want to deny that the kinds of highly
successful theories that we most admire represent what happens, in however
usable or unusable a form.
I subscribe neither to the 'received' syntactic view of theories nor to this
version of the semantic account. For both are cases of the 'vending machine'
view. The theory is a vending machine: you feed it input in certain prescribed
forms for the desired output; it gurgitates for a while; then it drops out the
sought-for representation, plonk, on the tray, fully formed, as Athena from
the brain of Zeus. This image of the relation of theory to the models we use
to represent the world is hard to fit with what we know of how science
works. Producing a model of a new phenomenon like superconductivity is
an incredibly difficult and creative activity. It is how Nobel prizes are won.
On the vending machine view you can of course always create a new theory,
but there are only two places for any kind of human activity in deploying
existing theory to produce representations of what happens, let alone finding
a place for genuine creativity. The first: eyeballing the phenomenon, measuring it up, trying to see what can be abstracted from it that has the right form
and combination that the vending machine can take as input; second - since
we cannot actually build the machine that just outputs what the theory
should - we either do tedious deduction or clever approximation to get a
facsimile of the output the vending machine would produce.
This is not, I think, an unfair caricature of the traditional syntactic/semantic
view of theory. For the whole point of the tradition that generates these two
views is the elimination of creativity - or whim - in the use of theory to treat
the world. That was part of the concept of objectivity and warrant that this
tradition embraced.14 On this view of objectivity you get some very good
evidence for your theory - a red shift or a Balmer series or a shift in the
trend line for the business cycle - and then that evidence can go a very long
way for you: it can carry all the way over to some new phenomenon that the
theory is supposed to 'predict'.
In The Scientific Image15 Bas van Fraassen asks: why are we justified in
going beyond belief in the empirical content of theory to belief in the theory
itself? It is interesting to note that van Fraassen does not restrict belief to the
empirical claims we have established by observation or experiment but rather
allows belief in the total empirical content. I take it the reason is that he
wants to have all the benefits of scientific realism without whatever the cost
is supposed to be of a realist commitment. And for the realist there is a
function for belief in theory beyond belief in evidence. For it is the acceptability of the theory that warrants belief in the new phenomena that theory
predicts. The question of transfer of warrant from the evidence to the predictions is a short one since it collapses to the question of transfer of warrant
from the evidence to the theory. The collapse is justified because theory is a
vending machine: for a given input the predictions are set when the machine
is built.
I think that on any reasonable philosophical account of theories of anything
like the kind we have reason to believe work in the world, there can be no
such simple transfer of warrant. We are in need of a much more textured,
and I am afraid much more laborious, account of when and to what degree
we might bet on those claims that on the vending machine view are counted
as 'the empirical content' or the deductive consequences of theory. The vending machine view is not true to the kind of effort that we know it takes in
physics to get from theories to models that predict what reliably happens;
and the hopes that it backs up for a shortcut to warranting a hypothesised
model for a given case - just confirm theory and the models will be warranted
The first step beyond the vending machine view are various accounts that
take the deductive consequences of a single theory as the ideal for building
representative models but allow for some improvements,17 usually improvements that customise the general model produced by the theory to the special
needs of the case at hand. These accounts recognise that a theory may be as
good as we have got and yet still need, almost always, to be corrected if it is to
provide accurate representations of behaviour in its domain. They nevertheless
presuppose that good scientific theories already contain representations of the
regular behaviours of systems in their domain even though the predicted
behaviours will not for the most part be the behaviours that occur.
To bring together clearly the main reasons why I am not optimistic about
the universality of mechanics - or any other theory we have in physics, or
almost have, or are some way along the road to having, or could expect to
have on the basis of our experiences so far - I shall go step-by-step through
what I think is wrong with the customisation story. The problem set to us is
to predict or account for some aspect of the behaviour of a real physical
system, say the pendulum in the Museum of Science and Industry that illustrates the rotation of the earth by knocking over one-by-one a circle of pegs
centred on the pendulum's axis. On any of a number of customisation
accounts,18 we begin with an idealised model in which the pendulum obeys
Galileo's law. Supposing that this model does not give an accurate enough
account of the motion of the Museum's pendulum for our purposes, we
undertake to customise it. If the corrections required are ad hoc or are at
odds with the theory - as I have observed to be the usual case in naturally
occurring situations like this - a successful treatment, no matter how accurate
16 In order to treat warrant more adequately in the face of these kinds of observations, Joel Smith suggests that conclusions carry their warrant with them so that we can survey it at the point of application to make the best informed judgements possible about the chances that the conclusion will obtain in the new circumstances where we envisage applying it. See Mitchell 1997 for a discussion.
17 It is important to keep in mind that what is suggested are changes to the original models that often are inconsistent with the principles of the theory.
18 For example, Ronald Giere's. See Giere 1988.
and precise its predictions are, will not speak for the universality of the
theory. So we need not consider these kinds of corrections here.
Imagine then that we are in the nice situation where all the steps we take
as we correct the model are motivated by the theory; and eventually we
succeed in producing a model with the kind of accuracy we require. What
will we have ended up with? On the assumption that Newton's theory is
correct, we will have managed to produce a blueprint for a nomological
machine, a machine that will, when repeatedly set running, generate trajectories satisfying to a high degree of approximation not, I suppose, Galileo's law
for a very idealised pendulum, but some more complex law; and since, as we
are assuming for the sake of argument, all the corrections are dictated by
Newtonian theory given the circumstances surrounding the Museum's pendulum, we will ipso facto have a blueprint for a machine that generates trajectories satisfying the general Newtonian law, F = ma. (Of course, the original
ideal model was also a blueprint for a machine generating the F = ma law.)
Once we have conceived the idealised and the deidealised models as
nomological machines, we can see immediately what is missing from the
customisation account. In a nomological machine we need a number of components with fixed capacities arranged appropriately to give rise to regular
behaviour. The interpretative models of the theory give the components and
their arrangement: the mass-point bob, a constraint that keeps it swinging
through a small angle along a single axis, the massive earth to exert a gravitational pull plus whatever additional factors must be added (or subtracted) to
customise the model. But that is not enough. Crucially, nothing must significantly affect the outcome we are trying to derive except for factors whose
overall contribution can be modelled by the theory. This means both that the
factors can be represented by the theory and that they are factors for which
the theory provides rules for what the net effect will be when they function
in the way they do in the system conjointly with the factors already modelled.
This is why I say, in talking of the application of a model to a real situation,
that resemblance is a two-way street. The situation must resemble the model
in that the factors that appear in the model must represent features in the real
situation (allowing whatever is our favoured view about what it is to 'represent appropriately'). But it must also be true that nothing too relevant occurs
in the situation that cannot be put into the model. What is missing from the
account so far, then, is something that we know matters enormously to the
functioning of real machines that are very finely designed and tuned to yield
very precise outputs - the shielding that I have stressed throughout this book.
This has to do with the second aspect of resemblance: the situation must not
have extraneous factors that we have not got into the model. Generally, for
naturally occurring systems, when a high degree of precision is to be hoped
for, this second kind of resemblance is seldom achieved. For the theories we
know, their descriptive capacities give out.
Let us lay aside for now any worries about whether corrections need to be
made that are unmotivated by the theory or are inconsistent with it, in order
to focus on the question of how far the theory can stretch. In exact science
we aim for theories where the consequences for a system's behaviour can be
deduced, given the way we model it. But so far the kinds of concepts we
have devised that allow this kind of deducibility are not ones that easily cover
the kinds of causes we find naturally at work bringing about the behaviours
we are interested in managing. That is, as I have been arguing, why the laws
of our exact sciences must all be understood with implicit ceteris paribus
clauses in front. As I shall argue in the rest of this chapter, our best and most
powerful deductive sciences seem to support only a very limited kind of
closure: so long as the only relevant factors at work are ones that can be
appropriately modelled by the theory, the theory can produce exact and precise predictions. This is in itself an amazing and powerful achievement, for
it allows us to engineer results that we can depend on. But it is a long distance
from hope that all situations lend themselves to exact and precise prediction.
4
I have made a point of mentioning bridge principles, which get little press
nowadays, because they are of central importance both practically and philosophically. Practically, bridge principles are a first step in what I emphasise
as a sine qua non of good theory - the use of theory to effect changes in the
world. They also indicate the limitations we face in using any particular
theory, for the bridge principles provide natural boundaries on the domain
the theory can command. So they matter crucially to philosophical arguments
about the relations between the disciplines and the universal applicability of
our favoured theories. These are arguments I turn to in later sections.
I take the general lack of philosophic investigation nowadays of what
bridge principles are and how they function in physics to be a reflection of
two related attitudes that are common among philosophers of physics. The
first is fascination with theory per se, with the details of the formulation and
the exact structure of a heavily reconstructed abstract, primarily mathematical, object: theory. I say 'heavily reconstructed' because 'theory' in this
sense is far removed from the techniques, assumptions, and various understandings that allow what is at most a shared core of equations, concepts, and
stories to be employed by different physicists and different engineers in different ways to produce models that are of use in some way or another in
manipulating the world. The second attitude is one about the world itself,
one I remarked on in the Introduction, an attitude that we could call Platonist
or Pauline: 'For now we see through a glass darkly, but then face to face.
Now I know in part; but then shall I know even as also I am known.'19 It
would be wrong to say, as a first easy description might have it, that these
philosophers are not interested in what the world is like. Rather they are
interested in a world that is not our world, not the world of appearances but
rather a purer, more orderly world, a world which is thought to be represented
'directly' by the theory's equations. But that is not the world that contemporary physics gives us reasons to believe in when physics is put to work to
manage what happens in the world.
As I have argued in the earlier chapters in this book, physics needs bridge
principles because a large number of its most important descriptive terms do
not apply to the world directly; rather, they function as abstract terms. The
quantum Hamiltonian, the classical force function and the electromagnetic
field vectors are all abstract. Whenever they apply there is always some more
concrete description that also applies and that constitutes what the abstract
concept amounts to in the given case. Mass, charge, acceleration, distance,
and the quantum state are not abstract. When a particle accelerates at 32 ft/sec², there is nothing further that constitutes the acceleration. Similarly,
although it may be complicated to figure out what the quantum state of a
given system is, there is nothing more about the system that is what it is for
that system to have that state. In chapter 2 we saw some simple examples.
For a more complex illustration consider the Hall effect. We start with
what in classical electromagnetic theory serves as a very concrete description:20 a conductor carries a uniform current density J of electric charge nq
with velocity v parallel to, say, the y axis in a material of conductivity σ. So
J_y = nqv_y. Ohm's law tells us how to put this in the more abstract vocabulary
of electric fields:
J = σE
Now we know that there is an electric field parallel to the current. What
happens when the conductor is placed in an external magnetic field B parallel
to the z axis? An electric field appears across the conductor in the direction
of the x axis and the magnetic field exerts a force F = qv_yB_z on the moving
charge carriers in the current.
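The standard steady-state balance behind this effect can be sketched as follows (textbook notation, my addition to the text): the transverse electric force on the carriers grows until it cancels the magnetic force, which fixes the Hall field and the Hall coefficient.

```latex
qE_x = qv_yB_z
\;\Longrightarrow\;
E_x = v_yB_z = \frac{J_yB_z}{nq},
\qquad
R_H \equiv \frac{E_x}{J_yB_z} = \frac{1}{nq}
```

Measuring E_x, J_y and B_z thus yields the carrier density n and the sign of the charge q.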
From our general knowledge of what forces do, we can then predict that
the force will tend to displace the moving carriers in the x direction. This
gives rise to a non-uniform charge distribution, which licenses, in turn, a new
e.g., whether some kinds of descriptions are abstract relative to other kinds
of descriptions in this theory, the answers must be constrained by considerations about what makes for the empirical success of the theory. Once we call
this reconstructed object 'quantum theory' or 'the BCS theory of superconductivity', it will be reasonably assumed that we can attribute to this object
all the empirical successes usually acknowledged for these theories. What
this requirement amounts to in different cases will get argued out in detail
on a case-by-case basis. The point here is about bridge principles. In the
successful uses of classical mechanics, force functions are not applied to
situations that satisfy arbitrary descriptions but only to those situations that
can be fitted to one of the standard interpretative models by bridge principles
of the theory; so too for all of physics' abstract terms that I have seen in
producing the predictions that give us confidence in the theory.
Recall the analogue of this issue for the semantic view, again with the
most simple-minded version of classical mechanics to illustrate. Does the
set of models that constitutes the theory look much as Ronald Giere and
I both picture it: pendulums, springs, 'planetary' systems, and the like,
situated in specific circumstances, each of which also has a force acting
on it appropriate to the circumstances; that is, do the objects of every
model of the theory have properties marked out in the 'interpretative'
models and a value for the applied force as well? Or, are there models
where objects have simply masses, forces and accelerations ascribed to
them with no other properties in addition? I think there is a tendency to
assume that to be a scientific realist about force demands the second. But
that is a mistake, at least in so far as scientific realism asserts that the
claims of the theory are true and that its terms refer, in whatever sense
we take 'true' and 'refer' for other terms and claims.
The term 'geodesic' is abstract, as, I claim, are many central terms of
theoretical physics: it never applies unless some more concrete description
applies in some particular geometry, e.g. 'straight line' on a Euclidean plane
or 'great circle' on a sphere. But this does not mean that we cannot be realists
about geodesics. The same holds for questions of explanatory or predictive
or causal power. The set of models that I focus on, where forces always
piggy-back on one or another of a particular kind of more concrete description, will predict accelerations in accordance with the principle F = ma. Still
there is nothing across the models that all objects with identical accelerations
and masses have in common except that they are subject to the same force.
Putting the bridge principles into the theory when we reconstruct it does not
conflict with realism. And it does produce for us theories that are warranted
by their empirical successes.
23 Hughes 1998.
representation and does not match up with a separation of the causes into
two distinct mechanisms.
Third, without a broader notion of representation than one based on some
simple idea of picturing we should end up faulting some of our most powerful
models for being unrealistic. Particularly striking here is the case of second
quantization,25 from which quantum field theory originates. In this case we
model the field as a collection of harmonic oscillators in order to get Hamiltonians that give the correct structure to the allowed energies. But we are not
thus committed to the existence of a set of objects behaving just like springs - though this is not ruled out either, as we can see with the case of the phonon
field associated with the crystal lattice described below or the case of the
electric dipole oscillator that I describe in chapter 9.
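A minimal sketch of what 'the correct structure to the allowed energies' amounts to for a single oscillator: evenly spaced levels E_n = ℏω(n + 1/2). The unit choice below is my own; the level formula is the standard quantum result that second quantization reuses for each field mode.

```python
hbar_omega = 1.0   # the energy quantum hbar*omega, in arbitrary units (my choice)

def level(n):
    """Allowed energy of the n-th harmonic-oscillator state: hbar*omega*(n + 1/2)."""
    return hbar_omega * (n + 0.5)

levels = [level(n) for n in range(5)]
# the spacing between adjacent levels is uniform: one quantum hbar*omega
spacings = [levels[i + 1] - levels[i] for i in range(len(levels) - 1)]
```

It is this even spacing, not any spring-like object, that the oscillator model is required to deliver.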
Last, we make it easy to overlook the fact that when we want to use
physics to effect changes in the world we not only need ways to link the
abstract descriptions from high theory to the more concrete descriptions of
models; we also need ways to link the models to the world. This is a task
that begins to fall outside the interests of theorists, to other areas of physics
and engineering. Concomitantly it gets little attention from philosophers of science. We tend to try to make do with a loose notion of resemblance. I shall
do this too. Models, I say, resemble the situations they represent. This at
least underlines the fact that in order for a description to count as a correct
representation of the causes, it is not enough that it predicts the right effects;
independent ways of identifying the representation as correct are required. I
realise that this is just to point to the problem, or to label it, rather than to
say anything in solution to it. But I shall leave it at that in order to focus on
the separate problem of how we use the interpretative models of our theories
to justify the abstract descriptions we apply when we try to represent the
world. I choose the quantum Hamiltonian as an example. In the next section
we will look in detail at one specific model - the BCS model for superconductivity - to see how Hamiltonians are introduced there.
6
The Ginzburg-Landau equations are phenomenological in two senses: first, they are not derived by constructing a model to
which a Hamiltonian is assigned, but rather are justified by an ad hoc combination of considerations from thermodynamics, electromagnetism and
quantum mechanics itself. Second, the model does not give us any representation of the causal mechanisms that might be responsible for superconductivity. The first of these senses is my chief concern here.
The Ginzburg-Landau equations describe facts about the behaviour of the
quantum state that, according to proper quantum theory, must be derived
from a quantum Hamiltonian. Hence they impose constraints on the class of
Hamiltonians that can be used to represent superconducting materials. But
this is not the procedure I have described as the correct, principled way for
arriving at Hamiltonians in quantum theory, and indeed the equations were
widely faulted for being phenomenological, where it seems both senses of
'phenomenological' were intended at once. The description of the Ginzburg-Landau model in the recent text by Poole, Farach and Creswick is typical:
'The approach begins by adopting certain simple assumptions that are later
justified by their successful predictions of many properties of superconducting materials.'27 Indeed it is often claimed that the Ginzburg-Landau
model was not treated seriously until after we could see, thanks to the work
by Gor'kov, how it followed from the more principled treatment of the BCS
theory.28
Before turning to the construction of the BCS Hamiltonian I begin with a
review of my overall argument. We are invited to believe in the truth of our
favourite explanatory theories because of their precision and their empirical
successes. The BCS account of superconductivity must be a paradigmatic
case. We build real operating finely-tuned superconducting devices using the
Ginzburg-Landau equations. And, since the work of Gor'kov, we know that
the Ginzburg-Landau equations can be derived from quantum mechanics or
quantum field theory using the BCS model. So every time a SQUID detects
a magnetic fluctuation we have reason to believe in quantum theory.
But what is quantum theory? Theory, after all, is a reconstruction. In the
usual case it includes 'principles' but not techniques, mathematical relations
but little about the real materials from which we must build the superconducting devices that speak so strongly in its favour. Theory, as we generally
reconstruct it, leaves out most of what we need to produce a genuine empirical prediction. Here I am concerned with the place of bridge principles in
our reconstructed theories. The quantum Hamiltonian is abstract in the sense
of 'abstract' I have been describing: we apply it to a situation only when that
Messiah 1961.
are few in number. Just as with internal principles, so too with bridge principles: there are just a handful of them, and that is in keeping with the point
of abstract theory as it is described by empiricists and rationalists alike.30 We
aim to cover as wide a range as we can with as few principles as possible.
How much then can our theories cover? More specifically, exactly what
kinds of situations fall within the domain of quantum theory? The bridge
principles will tell us. In so far as we are concerned with theories that are
warranted by their empirical successes, the bridge principles of the theory
will provide us with an explicit characterisation of its scope. The theory
applies exactly as far as its interpretative models can stretch. Only those
situations that are appropriately represented by the interpretative models fall
within the scope of the theory. Sticking to Messiah's catalogue of interpretative models as an example, that means that quantum theory extends to all
and only those situations that can be represented as composed of central
potentials, scattering events, Coulomb interactions and harmonic oscillators.
So far I have mentioned four basic bridge principles from Messiah. We
may expect more to be added as we move through the theory net from fundamental quantum theory to more specific theories for specific topics. Any good
formalisation of the theory as it is practised at some specific time will settle
the matter for itself. In the next section I want to back up my claim that this
is how quantum theory works by looking at a case in detail. I shall use the
Bardeen-Cooper-Schrieffer account of superconductivity as an example. This
account stood for thirty-five years as the basic theory of superconductivity
and, despite the fact that the phenomena of type-II superconductors and of
high temperature superconductivity have now shown up problems with it, it
has not yet been replaced by any other single account.
I chose this example because it was one I knew something about from my
study of SQUIDs at Stanford and from our research project on Modelling at
LSE. It turns out to be a startling confirmation of my point. The important
derivations in the BCS paper are based on a 'reduced' Hamiltonian with just
three terms: two for the energies of electrons moving in a distorted periodic
potential and one for a very simple scattering interaction. This Hamiltonian
is 'reduced' from a longer one that BCS introduce a page earlier. When we
look carefully at this longer Hamiltonian, we discover that it too uses only
the basic models I have already described plus just one that is new: the kinetic
energy of moving particles, the harmonic oscillator, the Coulomb interaction,
and scattering between electrons with states of well defined momentum, and
then, in addition, the 'Bloch' Hamiltonian for particles in a periodic potential
(itself closely related to the central potential, which is already among the
30 For example, David Lewis, John Earman and Michael Friedman on the empiricist side, and on the rationalist, Abner Shimony.
with well defined momenta which repeatedly interact via the lattice vibrations, changing their individual momenta as they do so but always maintaining a total null momentum. The Pauli exclusion principle dictates that no
two electrons can occupy the same state. So normally at a temperature of
absolute zero the lowest energy for the sea of conduction electrons is
achieved when the energy levels are filled from the bottom up till all the
electrons are exhausted. This top level is called the 'Fermi level'. So all the
electrons' energy will normally be below the Fermi level. The interaction of
two electrons under an attractive potential decreases the total potential
energy. Raising them above the Fermi level increases their energy of course.
What Cooper showed was that for electrons of opposite momenta interacting
through an attractive potential - Cooper pairs - the decrease in potential
energy will be greater than the increase in kinetic energy for energies in a
small band above the Fermi level. This suggested that there is a state of lower
energy at absolute zero than the one in which all the levels in the Fermi sea
are filled. This is essentially the superconducting state.
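The filling argument can be put as a toy calculation. The energies and electron count below are invented for illustration, and I assume two electrons (of opposite spin) per level; nothing here is from the BCS paper itself.

```python
# Toy single-particle energies, lowest first (invented for illustration)
levels = [0.1 * k for k in range(10)]
n_electrons = 12

# Pauli exclusion: fill from the bottom up, at most two electrons
# (opposite spins) per level, until the electrons are exhausted
filled = []
remaining = n_electrons
for energy in levels:
    if remaining <= 0:
        break
    filled.append(energy)
    remaining -= 2

# the top filled level is the Fermi level
fermi_level = filled[-1]
```

Cooper's point then concerns pairs promoted just above `fermi_level`: for an attractive interaction the drop in potential energy can outweigh the rise in kinetic energy.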
The first job of the BCS paper was to produce a Hamiltonian for which
such a state will be the solution of lowest energy and to calculate the state.
This is why the paper is such a good example for a study of how Hamiltonians get assigned. The construction of the Hamiltonian takes place in the
opening sections of the paper. The bulk of the paper which follows is devoted
to showing how the allowed solutions to the BCS Hamiltonian can account
for the typical features of superconductivity. We will need to look only at
the first two sections.
8
[Equation: the phonon-mediated interaction Hamiltonian, with coefficient 2ℏω_κ|M_κ|² / ((ε_(k−κ) − ε_k)² − (ℏω_κ)²), acting on pairs of electrons]
with equal and opposite momenta. As in the earlier work of Frohlich and
Bardeen, electrons with kinetic energies in the range just above the Fermi
level can have a lower total energy if the interaction between them is attractive. But there is a limit set on the total number of pairs that will appear in
this range because not all pairs can be raised to these states since the Pauli
exclusion principle prohibits more than one pair of electrons in a state with
specific, oppositely directed values of the momentum. So here we see one of
the features that quantum theory assigns to electrons that are retained by the
electron-like particles in the model.
What is interesting for our topic is that these physical ideas are not built
in as explicit features of the model that are then used in a principled way to
put further restrictions on the Hamiltonian beyond those already imposed
from the model. Rather the assumptions about what states will interact
significantly are imposed as an Ansatz, motivated but not justified, to be tried
out and ultimately judged by the success of the theory at accounting for the
peculiar features associated with superconductivity.32 Thus in the end the
BCS Hamiltonian is a rich illustration: it is a Hamiltonian at once both theoretically principled and phenomenological or ad hoc. Let us look at each of
the terms of the BCS Hamiltonian (figure 8.2) in turn.
Terms (1) and (2): Bloch electrons. The 'electrons' in our model are not
'free' electrons moving independently of each other, electrons moving unencumbered in space. They are, rather, 'Bloch' electrons and their Hamiltonian
is the 'Bloch Hamiltonian'. It is composed of the sum of the energies for a
32 Because of these ad hoc features it makes sense to talk both of the BCS theory and separately of the BCS model, since the assumptions made in the theory go beyond what can be justified using acceptable quantum principles from the model that BCS offer to represent superconducting phenomena.
with screening in his 1956 survey. For brevity I shall describe here only the
Fermi-Thomas treatment.36
A usual first step for a number of approaches is to substitute a new model,
the 'independent electron model', for the interacting Bloch electrons, a model
in which the electrons are not really interacting but instead each electron
moves independently in a modified form of the Bloch potential. What potential should this be? There is no bridge principle in basic theory to give us a
Hamiltonian for this model. Rather the Hamiltonian for the independent electron model is chosen ad hoc. The task is to pick a potential that will give, to
a sufficient degree of approximation in the problems of interest, the same
results from the independent electron model as from the original model. We
can proceed in a more principled way, though, if we are willing to study
models that are more restrictive. That is the strategy of the Fermi-Thomas
approach.
The Fermi-Thomas approach refines the independent electron model several steps further. First, its electrons are fairly (but not entirely) localized.
These will be represented by a wave-packet whose width is of the order of
l/kF, where kF is the wave vector giving the momentum for electrons at the
Fermi energy level. One consequence of this is that the Fermi-Thomas
approach is consistent only with a choice of total potential that varies slowly
in space. The Fermi-Thomas treatment also assumes that the total potential
(when expressed in momentum space) is linearly related to the potential of
the fixed ions. To these three constraints on the model a fourth can be consistently added, that the energy is modified from that for a free electron model
by subtracting the total local potential. As a result of these four assumptions
it is possible to back-calculate the form of the total potential for the model
using standard techniques and principles. The result is that the usual Coulomb
potential is attenuated by a factor 1/e^(k_F|r_i − r_j|). The Fermi-Thomas approach
is a nice example of how we derive a new bridge principle for quantum
theory. The trick here was to find a model with enough constraints in it that
the correct Hamiltonian for the model could be derived from principles and
techniques already in the theory. Thus we are able to admit a new bridge
principle linking a model with its appropriate Hamiltonian. By the time of
the BCS paper the Fermi-Thomas model had long been among the models
35 Bardeen 1956.
36 In fact I do this not just for brevity but because the Bardeen and Pines approach introduces a different kind of model in which the electrons and ions are both treated as clouds of charge rather than treating the ions as an ordered solid. Tracing out the back and forth between this model and the other models employed, which are literally inconsistent with it, is a difficult task and the complications would not add much for the kinds of points I want to make here, although they do have a lot of lessons for thinking about modelling practices in physics.
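The shape of the screened potential can be sketched directly. The screening wave-vector value below is an arbitrary assumption of mine; the point is only that the Coulomb 1/r form acquires an exponential attenuation factor.

```python
import math

def coulomb(r):
    """Bare Coulomb form, 1/r (units suppressed)."""
    return 1.0 / r

def screened(r, k0=2.0):
    """Screened form: the 1/r potential attenuated by exp(-k0*r).
    k0 = 2.0 is an arbitrary illustrative screening wave vector."""
    return math.exp(-k0 * r) / r

# at any separation r the attenuation factor is exp(-k0*r)
r = 1.5
factor = screened(r) / coulomb(r)
```

At separations large compared with 1/k0 the interaction is effectively switched off, which is what makes the independent-electron picture workable.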
[Figure: atoms in a lattice joined by springs, shown at rest and under displacement. The term corresponding to the linear expansion, i.e., the first derivative, is null, as it expresses the equilibrium condition for the distance at rest.]
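The caption's point - that the linear term of the expansion vanishes at the rest distance, leaving a spring-like quadratic term - can be checked numerically. The Morse potential below is my own stand-in for an interatomic potential; the parameters are arbitrary.

```python
import math

def morse(r, D=1.0, a=1.0, r0=1.0):
    """Morse potential, a stand-in interatomic potential with minimum at r0."""
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2

h = 1e-4
r0 = 1.0

# first derivative at the rest distance: numerically zero, the 'null linear term'
V1 = (morse(r0 + h) - morse(r0 - h)) / (2.0 * h)

# second derivative at the rest distance: positive, so the leading behaviour
# near equilibrium is harmonic (here with effective spring constant 2*D*a**2)
V2 = (morse(r0 + h) - 2.0 * morse(r0) + morse(r0 - h)) / h ** 2
```

This is the step that licenses replacing the atoms-and-springs picture by the harmonic oscillator Hamiltonian.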
that the transverse waves are treated by a different model, a model in which
the ions vibrate in a fixed negative sea of electrons. The Frohlich model also
assumes that the spatial vibration of the electron waves across the distance
between ions in the lattice is approximately the same for all wave vectors.
Using perturbation methods in the first instance and later a calculation that
does not require so many constraints as perturbation analysis, Frohlich was
able to justify a Hamiltonian of the form of term (4), with an interaction
coefficient in it reflecting the screening of the Coulomb force between ions
by the motion of the electrons. The model assumes that in the presence of
electron interactions the motion of the ions is still harmonic, with the frequencies shifted to take into account the screening of the ion interactions by the
electrons. Similarly, in Frohlich's version of term (4) the negative charge
density in the model is no longer the density of the original Bloch electrons
(which, recall, move in a fixed external potential) but rather that of electrons
carrying information about the lattice deformation due to their interaction
with it.
Now we can turn to the treatment by Bardeen and Pines. They have a
model with plasma oscillations of the electron sea. They deal thus with a
longer Hamiltonian involving energies of the individual Bloch electrons, the
harmonic oscillations of the lattice, the harmonic oscillations of the plasma,
an electron-lattice interaction, a plasma-phonon interaction, a plasma-electron
interaction and a term for those wave vectors for which plasma oscillations
were not introduced, including the shielded Coulomb electron-electron interaction. They rely on previous arguments of Pines' to show that this model
can approximate in the right way a suitably constrained version of the 'full
underlying model'. They are then able to show that the electron-lattice interaction can be replaced by a phonon-mediated electron-electron interaction
described by a Hamiltonian of the form of term (4). In Bardeen and Pines'
version of this term, as with Frohlich's, the interaction coefficient is adjusted
to provide approximately enough for the effects produced by shielding in the
underlying model.
Other contemporary treatments. We have looked at the BCS theory in
detail, but there is nothing peculiar about its use of quantum Hamiltonians.
A number of accounts of superconductivity at the time treat the quantum
mechanical superconducting state directly without attempting to write down
a Hamiltonian for which this state will be a solution. This is the case, for
instance, with Born and Cheng,40 who argue from a discussion of the shape
of the Fermi surface that in superconductivity spontaneous currents arise from
a group of states of the electron gas for which the free energy is less with
currents than without. It is also the case with Wentzel,41 who treats superconductivity as a magnetic exchange effect in an electron fluid model, as well
as Niessen,42 who develops Heisenberg's account.
Those contemporary accounts that do provide Hamiltonians work in just
the way we have seen with BCS: only Hamiltonians of stock forms are introduced and further details that need to be filled in to turn the stock form into
a real Hamiltonian are connected in principled ways with features of the
model offered to represent superconducting materials. By far the most
common Hamiltonians are a combination of kinetic energy in a Bloch potential plus a Coulomb term. This should be no surprise since theorists were
looking at the time for an account of the mechanism of superconductivity
and Coulomb interactions are the most obvious omission from Bloch's
theory, which gives a good representation of ordinary conduction and resistance phenomena. We can see this in Heisenberg's theory,43 which postulates
that Coulomb interactions cause electrons near the Fermi energy to form low
density lattices where, as in Born and Cheng, there will be a lower energy
with currents than without, as well as in Schachenmeier,44 who develops
Heisenberg's account and in Bohm,45 who shows that neither the accounts of
Heisenberg nor of Born and Cheng will work. Macke46 also uses the kinetic
energy plus Coulomb potential Hamiltonians, in a different treatment from
that of Heisenberg.
Of course after the work of Bardeen and of Frohlich studies were also
based on the electron scattering Hamiltonian we have seen in term (4) of the
BCS treatment, for instance in the work of Singwi47 and Salam.48 A very
different kind of model studied both by Wentzel49 and by Tomonaga50 supposes that electron-ion interactions induce vibrations in the electron gas,
where the vibrations are assigned the traditional harmonic oscillator Hamiltonian. Tomonaga, though, does not just start with what is essentially a spring
model which he then describes with the harmonic oscillator Hamiltonian.
Rather, he back-calculates constraints on the Hamiltonian from assumptions
made about the density fields of the electron and ion gases in interaction. In
these as in all other cases I have looked at in detail the general point I want
to make is borne out. We do not keep inventing new Hamiltonians for each
41 Wentzel 1951.
42 Niessen 1950.
43 Heisenberg 1947.
44 Schachenmeier 1951.
45 Bohm 1949.
46 Macke 1950.
47 Singwi 1952.
48 Salam 1953.
49 Wentzel 1951.
50 Tomonaga 1950.
new phenomenon, as we might produce new quantum states. Rather the Hamiltonians function as abstract concepts introduced only in conjunction with
an appropriate choice of a concrete description from among our set stock of
interpretative models.
9
It is well known that relativistic quantum field theory makes central the imposition of symmetry constraints, such as local gauge invariance, for determining the form of Lagrangians
and Hamiltonians for situations involving interactions where the force fields are represented
by gauge fields. But it cannot be claimed on these grounds alone that the Hamiltonians, say,
are built, not from the bottom up, as I have discussed, but rather from the top down. Even in
the case of interaction between electrons involving electromagnetic fields the local gauge
symmetry requirement that establishes the form of the Lagrangian and Hamiltonian is only
formal. It is satisfied by the introduction of a quantity that the symmetry requirement itself
leaves uninterpreted. As Jordi Cat has argued, we still need in addition some bridge principles
that tell us that the gauge field corresponds to electromagnetic fields and that indeed the
Lagrangian or Hamiltonian describe a situation with some interaction with it (see Cat 1993
and 1995b). It is in this sense that it is often argued, mistakenly, that symmetry principles
alone can dictate the existence of forces.
says nothing one way or another about how much of the world our stock
models can represent.
ACKNOWLEDGEMENTS
This chapter is a shortened version, with some changes, of Cartwright 1998b. Thanks
to both Jordi Cat and Sang Wook Yi for comments, for detailed discussion of both
the physics and the philosophy and for help in the production. Research for the chapter
was supported by the LSE Modelling in Physics and Economics Project, and a number
of the ideas were developed during our group meetings.
Robert Clifton gives a formal defence of the bi-orthogonal rule in Clifton 1995. For a discussion see Cartwright and Zouros in preparation.
Fano 1957.
For a fuller discussion, see Cartwright 1974a and Del Seta and Cattaneo 1998.
I want to propose here a different kind of strategy for having your cake and
eating it too, one with neither a universal rule for double state assignment nor
a general demonstration of correctness and consistency. This is a strategy
that instead undertakes to produce the requisite demonstrations in exactly
those cases where detailed physical analysis results in two distinct representations for the same system.
The strategy employs a number of ideas that are taken from Willis Lamb.
The ideas are developed in a series of papers of his on measurement.4 I am
going to locate these ideas of Lamb's in the frame I have just been sketching
and then elaborate on some of the philosophical underpinning that supports
the use I put them to. I will borrow two particular features of Lamb's account.
The first is that the second set of states to be assigned beyond the conventional quantum superposition are not further quantum states but rather
classical states. The second is that the assignment of these classical states is
not a consequence of a general rule but comes instead from case-by-case
analyses of specific physical situations. The view is, in a way I shall describe,
peculiarly realistic about the quantum theory.
I begin with a standard assumption of the 'have-your-cake-and-eat-it-too'
strategy: once a quantum state is assigned to a system in a particular kind
of situation that can be represented with a quantum Hamiltonian, all future
quantum states involving that system in that situation are determined by
the Schrodinger equation in the ordinary way given its interactions. This
is true no matter what other descriptions we assign to the system. This
means that, so long as the system stays in situations where all the causes
of change can be represented by a quantum Hamiltonian, once the system
has been in an interaction it will never again have a state of its own but
rather be represented as part of an ever increasing composite. (Of course
there are familiar shortcuts for disregarding the rest of the composite in
calculations where it is irrelevant.) The restriction to specific kinds of
situations should be noted because Lamb does not think that quantum
descriptions apply to systems in arbitrary situations. My agreement with
Lamb on this is central to my view that quantum theory is severely
limited in its scope of application. But that is not the central point in this
chapter, so for most of the discussion here the restriction to very particular
kinds of situations will be repressed.
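The claim that a system which has interacted 'will never again have a state of its own' can be illustrated with the smallest possible example. The two-qubit states and the determinant test below are my own illustration, not Lamb's or the author's.

```python
import math

# Write a two-qubit state as a 2x2 coefficient matrix c[i][j] for |i>|j>.
# The joint state factors into separate single-system states exactly when
# the determinant of this matrix is zero.
inv = 1.0 / math.sqrt(2.0)

# before interaction: (|0> + |1>)/sqrt(2) for system A, |0> for system B
product_state = [[inv, 0.0],
                 [inv, 0.0]]

# after a CNOT-style interaction: (|00> + |11>)/sqrt(2), which no longer factors
entangled_state = [[inv, 0.0],
                   [0.0, inv]]

def factorisation_det(c):
    """Zero exactly when the joint state is a product of single-system states."""
    return c[0][0] * c[1][1] - c[0][1] * c[1][0]
```

After the interaction neither subsystem has a quantum state of its own; only the composite does, which is the bookkeeping situation the text describes.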
The core of the idea I want to develop here is that in addition to the
quantum state, some systems will be correctly described by classical states
as well. Both descriptions may be true of the system and true of it in exactly
the same sense. Classical quantities (like the length of Einstein's bed or the
position of particles on a photographic plate) depend on the classical state;
and the quantum state fixes quantum mechanical features (like the dipole
moment of the atoms in the semi-classical theory of the laser described
below). There is thus no difficulty in knowing which states to use for which
predictions. So, in this picture the book-keeping problem is trivial - at least
at the formal level where these issues are generally discussed in the foundations of physics literature. I shall return to this issue, however, to explain
why I add this caveat.
The idea that the states from the second set are classical is a simple one
with obvious historical roots yet I do not think it has before been seriously
considered in this setting. But classical states seem prima facie at least on a
par with the more standard choice of quantum states with respect to both the
book-keeping problem and to the question of when the second set of descriptions are supposed to apply. On other grounds classical states have a decided
advantage over the quantum-state alternative:5
(a) Classical states have well-defined values for all classical observables.
One of the chief reasons for wanting a second set of states is to deal
with the problems of macroscopic objects that have interacted with
quantum systems and might thereby become part of a composite superposition, objects like Schrodinger's cat. The second state is supposed to ensure
that the macroscopic object nevertheless has well-defined values for the
quantities we are used to seeing well-defined. But quantum states do not
serve this purpose since systems with well-defined values for some observables will still be in superpositions across a variety of values of other
observables that do not commute with the first. (b) Classical states do not
spread as quantum states do. For example a system with a well-defined
value of position will always retain a well-defined position under classical
laws of evolution. So, Einstein's bed will not fill his room after all. (c)
No rules for state selection are necessary. Problems like those I mentioned
in discussing the bi-orthogonal rule do not arise. If we consider assigning
some member of a second set of states to a system, the set of states is
always the same - the set of classical states.
5 These are also advantages that classical states have over the decoherence approach, which employs a form of the reduction strategy.
Turn now to the second idea that contributes to the version of 'have-yourcake-and-eat-it-too' that I propose. The idea is that the assignment of classical
states follows no formal universal rule. Rather it is part of the elaborate and
ever expanding corpus of knowledge we acquire in quantum mechanics about
how to solve problems and produce models for great varieties of different
situations. This is a proposal that I have been long thinking about in my own
way, so my remarks about it may not accurately reflect Lamb's view.
I begin with the familiar observation that the Schrodinger equation guarantees a deterministic evolution for the quantum state function. Nevertheless
quantum mechanics is a probabilistic theory. What are the probabilities
probabilities of? One answer is that they are probabilities that various
'observables' on the system possess certain allowed values. This is the
answer associated with the ensemble interpretation of quantum mechanics. It
is widely rejected on the grounds that results like those of the two-slit experiment show that observable quantities do not have 'possessed values' in
quantum systems. The more favoured alternative is that the probabilities are
probabilities for certain allowed values to be found when an observable is
measured on a quantum system.
I reject both of these answers. I propose instead that the probabilities when
they exist at all are probabilities for classical quantities to possess particular
values. What is usually formalised as quantum theory is a theory about how
systems with quantum states will evolve in the special circumstances that can
be represented by quantum Hamiltonians. Of course our theoretical knowledge goes well beyond that, and we are often in a position (as I illustrate
below) to use our quantum descriptions to help predict facts about the
classical state of some system. As with Max Born's original treatment of
scattering, these predictions may well be probabilistic, and it seems there is
no way to eliminate the probabilistic element. But there is no guarantee that
a quantum analysis will yield such predictions and there is no universal
principle by which we infer classical claims from quantum descriptions. This
is rather a piecemeal matter differing from case to case.
The fundamental picture I want to urge, following my interpretation of
Lamb's ideas, is that in the right kind of situations some systems have
quantum states, some have classical states, and some have both. The presupposition is that macro-objects in the usual settings that are well represented
by classical models have classical states and that micro-objects in the kinds
of settings that resemble quantum models have quantum states. But perhaps
some macro-objects have quantum states as well. Macro-interference effects
may give us one reason to assign a quantum state to macro-objects in certain
kinds of situations. Sketches of micro-macro interactions like Schrödinger's
cat or the von Neumann theory of measurement have also been thought to
provide macro-quantum states. A third commonly accepted reason for
assigning quantum states to macro-objects relies on standard procedures for
modelling quantum interactions. It is supposed that (i) given any two systems
with quantum states, the composite of the two must itself have a quantum
state represented in the tensor product of the spaces representing the two
separately; and also that (ii) all interactions between systems with quantum
states are quantum interactions, representable by an interaction Hamiltonian
on the tensor product space. Hence step by step the quantum states of a
macro-object are built up out of the quantum states of the systems that make
it up.
I myself do not find any of these reasons compelling, especially not the
second and the third, which make claims about what kind of treatments
quantum mechanics can in principle provide without actually producing the
treatments. But that is not the most pressing matter. For the point is that
there is no automatic incompatibility between quantum and classical states.
Although contradictions may be unearthed in one case or another, they are
not automatic.
What are classical states then? They are the states we represent in classical
physics. Classically, the characteristics of physical systems are represented
by analytic functions onto real numbers. In classical statistical mechanics, for
instance, these quantities are functions of the basic dynamical variables, position and momentum, and, hence, they are defined on the n-dimensional phase
space that the position and momentum values define. On this phase space
physical states are represented by probability densities.
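A small numerical sketch may make this concrete (the example and all numbers are my own illustration, not from the text): the classical state of a one-dimensional harmonic oscillator in thermal equilibrium, represented as a probability density on its two-dimensional phase space. Averaging the Hamiltonian over the density recovers the equipartition value kB·T for two quadratic degrees of freedom.

```python
import numpy as np

# A classical state as a probability density on phase space (toy example).
# For a 1-D harmonic oscillator in equilibrium at temperature T, the density
# is rho(q, p) ~ exp(-H(q, p)/(kB*T)); its average energy is kB*T.
m, omega, kB, T = 1.0, 1.0, 1.0, 2.0

q = np.linspace(-12.0, 12.0, 801)       # position grid
p = np.linspace(-12.0, 12.0, 801)       # momentum grid
Q, P = np.meshgrid(q, p)
H = P ** 2 / (2.0 * m) + 0.5 * m * omega ** 2 * Q ** 2

rho = np.exp(-H / (kB * T))
rho /= rho.sum()                        # normalise the density on the grid

mean_energy = float((rho * H).sum())
print(round(mean_energy, 3))            # ~2.0, i.e. kB*T
```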
Position and momentum are also basic dynamical variables in quantum
mechanics, although in this case they are represented mathematically by noncommuting Hermitian operators on a Hilbert space. It is on this complex
vector space that quantum physical states are defined and can be associated
with a statistical operator, the density matrix, that represents probability distributions. Despite some formal analogies, however, quantum and classical
quantities fall under different mathematical characterisations, and there is no
general rule that enables us to represent any classical quantity with a quantum
one and vice versa. We do make attempts to formulate a general rule that
enables us to represent classical quantities with quantum operators and vice
versa, but they have all proven problematic.6 In my opinion this is entirely
6 Kochen and Specker argued in 1967 that the interpretive - or metaphysical - consistency of
such a general mapping would require a hidden variable quantum theory, in which a phase
space of hidden states has the formal structure of a classical phase space. However, they
showed that no mapping exists that (1) allows the calculation of hidden variable theory
expectation values by the classical method and (2) maps quantum operators onto real-valued
functions on the space of hidden states in such a way that (2a) for any operator and
7 But Wigner's distribution is not non-negative as a probability should be. In addition L. Cohen
and M. Margenau have shown that in general no permissible distribution generates a rule for
associating quantum operators with classical quantities that preserves algebraic relations. See
Margenau and Cohen 1967 and Cartwright 1974b.
8 Or, 'after some spontaneous reduction along the eigenstates of O'. Cf. James Park's measurement1 and measurement2 in Park 1969.
Figure 9.1 The semi-classical theory of the laser. Source: Sargent III,
Scully and Lamb 1974. [The block diagram shows the self-consistency loop: an assumed field E(r, t) enters quantum statistical mechanics to give the dipole expectations ⟨p_i⟩; their summation yields the macroscopic polarisation P(r, t), which, inserted in Maxwell's equations, returns the field E'(r, t).]
For a number of examples and a further discussion, see Chang 1997a, 1997b and also 1995.
Sargent III, Scully and Lamb 1974, p. 96.
Figure 9.2 The quantum electric dipole oscillator: (a) P = 0; (b) P ≠ 0.
Source: Sargent III, Scully and Lamb 1974.
physics, that quantum electrons in atoms 'behave like charges subject to the
forced, damped oscillation of an applied electromagnetic field'.11 This charge
distribution oscillates like a charge on a spring, as in Figure 9.2:
The discussions of the oscillator waver between the suggestive representation of ⟨er⟩ as a genuinely oscillating charge density and the more 'pure'
interpretation in terms of time-evolving probability amplitudes. From the
standard point of view this should be troubling; but without the generalised
Born interpretation looming in the background I see nothing in this to worry
about. The oscillator analogy provides a heuristic for identifying a quantum
and a classical quantity in the laser model. The identification is supported by
the success of the model in treating a variety of multimode laser phenomena - the time and tuning dependency of the intensity, frequency pulling, spiking,
mode locking, saturation, and so forth.
Another example where a quantum characteristic is typically associated
with a classical quantity is the standard semi-classical treatment of the
Aharonov-Bohm effect. Consider Figure 9.3. In a two-slit experiment two
diffracted electron beams produce an interference pattern on a screen. For
the Aharonov-Bohm effect a solenoid is added at a required location between
the slits and the screen. In that region a current in the solenoid produces an
electromagnetic potential but no magnetic field. The electromagnetic vector
potential is then causally associated with the occurrence of a phase shift in
the interference fringes on the screen.
Let Ψ0 be the wavefunction representing the electrons before the appearance of the vector potential. The wavefunctions corresponding to the paths 1
and 2 between slits 1 and 2, respectively, and an arbitrary point P on the
screen are modified by the presence of the vector potential A in the following
way:

Ψ1 → Ψ1 exp(ieS1/ℏ),   Ψ2 → Ψ2 exp(ieS2/ℏ)
where S1 and S2 are the action integrals, S = ∫ A·dx, evaluated along the paths
1 and 2 respectively.
The phase shift at point P is
Δδ = e(S1 − S2)/ℏ
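Since S = ∫ A·dx, Stokes' theorem makes S1 − S2 equal to the magnetic flux enclosed between the two paths, so one flux quantum h/e shifts the fringes by a full period. A toy numerical sketch of this arithmetic (my own illustration; only the phase-shift relation itself is from the text):

```python
import numpy as np

# Aharonov-Bohm fringe shift: delta = e*(S1 - S2)/hbar = e*Phi/hbar, where
# Phi is the flux enclosed between the two paths (illustrative sketch).
e, h = 1.602176634e-19, 6.62607015e-34
hbar = h / (2.0 * np.pi)

def intensity(phase, ab_shift):
    # Two-slit intensity for unit amplitudes with an extra AB phase on path 2.
    return np.abs(np.exp(0.5j * phase) + np.exp(-0.5j * phase + 1j * ab_shift)) ** 2

flux_quantum = h / e                      # h/e shifts the pattern by 2*pi
delta = e * (0.5 * flux_quantum) / hbar   # half a flux quantum -> shift of pi
print(round(delta / np.pi, 12))           # 1.0
print(intensity(0.0, 0.0), round(intensity(0.0, delta), 12))  # 4.0, then 0.0: bright fringe goes dark
```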
Consider now how problems are usually generated for views that assign a
new state to the measuring device after measurement. There are two ways.
First, using von Neumann's simple theory of measurement, it is supposed that
the measurement interaction takes the composite (Σi ci Φi)Ψ0, with Ψ0 the ready state of the apparatus, into the state Σi ci Φi Ψi
for some set of 'pointer position' eigenstates {Ψi}. When the interaction
ceases each pointer must have a possessed value for position and thus be in
one of the eigenstates Ψi. Since under the generalised Born interpretation
pointer position i will occur with probability |ci|², by observing the distribution of the pointer positions we can infer the values of the |ci|². The problem
arises when we consider the implications of the assumption that each pointer
is in state Ψi for some i. In that case, it is argued, the state of the measured system
in each case must be Φi, since under the generalised Born interpretation the
probability is zero for the pointer to be in state Ψi and the system to have some
different state Φj (j ≠ i) for the measured observable O. So for each individual
composite the joint state must be Φi Ψi; the ensemble of these composites will
then be in the mixed state Σi |ci|² |Φi Ψi⟩⟨Φi Ψi|. This contradicts the original assumption that the ensemble was in the pure, or non-mixed, state Σi ci Φi Ψi.
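The contrast between these two state assignments is easy to exhibit in a toy two-outcome model (my own illustration, not from the text): the pure superposition and the mixture give exactly the same pointer-position statistics; they differ only in the off-diagonal terms, where interference lives.

```python
import numpy as np

# Pure superposition sum_i c_i Phi_i Psi_i versus the mixture
# sum_i |c_i|^2 |Phi_i Psi_i><Phi_i Psi_i| for two outcomes (toy model).
c = np.array([np.sqrt(0.3), np.sqrt(0.7)])          # amplitudes c_1, c_2

phi = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # system states Phi_i
psi = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # pointer states Psi_i
basis = [np.kron(f, s) for f, s in zip(phi, psi)]   # composite |Phi_i Psi_i>

pure_vec = sum(ci * b for ci, b in zip(c, basis))
rho_pure = np.outer(pure_vec, pure_vec.conj())
rho_mixed = sum(abs(ci) ** 2 * np.outer(b, b) for ci, b in zip(c, basis))

# Pointer-position probabilities: project onto |Psi_i> of the apparatus.
probs_pure = [np.trace(rho_pure @ np.kron(np.eye(2), np.outer(s, s))).real
              for s in psi]
probs_mixed = [np.trace(rho_mixed @ np.kron(np.eye(2), np.outer(s, s))).real
               for s in psi]
print([round(x, 3) for x in probs_pure])    # [0.3, 0.7]
print([round(x, 3) for x in probs_mixed])   # [0.3, 0.7] - identical statistics
print(np.allclose(rho_pure, rho_mixed))     # False: off-diagonal terms differ
```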
This argument motivates advocates of the 'have-your-cake-and-eat-it-too'
strategy to fix it so that the second set of states assigned (here {Φi Ψi}) are
true of systems in some different way than the original superposition so that
the assignment of both the mixed and pure states will not be in contradiction.
When the second set of states assigned is classical, no such assumption is yet
necessary, for quantum and classical states do not contradict each other in
this simple way. I say 'yet' because I take it that in each new analysis the
question remains open. There are ways of inferring quantum states from
information about classical states and the reverse, as we saw in Section 4.
Nothing I have said, or will say, guarantees that no situations will arise in
which the classical information we can get in that situation and the inferences
we take to be legitimate will produce a new quantum state for the apparatus
and system that is inconsistent with the supposition that the Schrödinger-evolved superposition continues to be correct. But finding these situations,
undertaking the analysis, conducting the experiments and drawing this surprising conclusion is a job that remains to be done. This is the job of hunting
the measurement problem.
The second way to generate a measurement problem is to look at the
time-evolved behaviour of the composite, system plus apparatus, to see if the
predictions generated from the superposition contradict those from the second
state assignment. As I remarked at the beginning, most versions of the 'have-your-cake-and-eat-it-too' strategy make some starting attempt to show that this
can not happen. What happens in the version I propose? We imagine that
after the measurement interaction the apparatus carries on with further interactions. Let us assume that the system it next interacts with has a quantum
description. A problem could arise if this interaction also had a classical
description so that the classical values assigned to the apparatus evolved
as well. A contradiction is possible: we may find some interactions where
the quantum-mechanically evolved state implies classical claims which
contradict the information contained in the classically-evolved state. But such
analyses are very uncommon and very difficult. We virtually never give a
serious quantum-mechanical analysis of the continuing interactions of the
measuring apparatus. To produce a contradiction we not only must do this
but we must do this in a situation where we can draw classical inferences
from the final quantum state as well. Again the job in front of us is not to
solve the measurement problem but to hunt it.
6
How do we relate quantum and classical properties in modelling real systems? I maintain that in practice we follow no general formulae. The association is piecemeal and proceeds in thousands of different ways in thousands
of different problems. Figuring out these connections is a good deal of what
physics is about, though we may fail to notice this fact in our fascination
with the abstract mathematical structures of quantum theory and of quantum
field theory. Consider superconductivity. This is a quantum phenomenon. As
we saw in the last chapter, we really do use quantum mechanics to understand
it. Yet superconducting devices are firmly embedded in classical circuits
studied by classical electronics. This is one of the things that most puzzled
me in the laboratory studying SQUIDs. You would wire the device up, put
it in the fridge to get it down below critical temperature, and then turn on
Figure 9.4 A superconducting configuration and its corresponding
classical circuit diagram. Source: Beasley 1990; recreation: George Zouros.
[The figure shows a coaxial cable and a strip line (a = inner radius), each with its equivalent circuit: a resistance Rn in the normal case, an inductance in the superconducting case.]
the switch. Very often you simply would not get the characteristic I-V curve
which is the first test that the device is operating as a SQUID. What had gone
wrong? To figure it out the experimenter would begin to draw classical circuit
diagrams.
Without going all the way to SQUIDs we can see in Figure 9.4 an example
of the simplest kind of superconducting configuration that can be found in
any standard treatment of superconducting electronics. 'What allows you to
draw classical circuit diagrams for these quantum devices?' I would ask. The
reply: 'There are well-known theorems that show that any complicated circuit
is equivalent to certain simple circuits'. But that missed my point. What
allows us to associate classical circuits with quantum devices?
No matter what theory you use - London, Ginzburg-Landau (Gor'kov),
BCS - all have as a central assumption the association of a familiar quantum
quantity

Js = (e*ℏ/2m*i)(Ψ*∇Ψ − Ψ∇Ψ*) − (e*²/m*)|Ψ|²A
with a classical current that flows around the circuit. I say it is familiar
because this is just what, in the Born interpretation, would be described as a
probability current, taking |Ψ|² as a probability and using the conventional
replacement for situations where magnetic fields play a role

∇ → ∇ − (ie*/ℏ)A
Yet we have all learned that we must not interpret e|Ψ|² as a charge density
as Schrödinger wished to do. One of the reasons is that Ψ cannot usually
be expressed in the co-ordinates of physical space but needs some higher
dimensional co-ordinate space. But in this case it can be. And we have
learned from the success of the theory that this way of calculating the electrical current is a good one.
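The calculation in question is easy to exhibit numerically. A sketch in toy units (my own; the expression is the standard gauge-invariant probability current with the ∇ − (ie*/ℏ)A replacement): for a plane wave Ψ = exp(ikx) the current comes out as the constant drift e*(ℏk − e*A)/m*.

```python
import numpy as np

# Gauge-invariant probability current for a plane wave psi = exp(i*k*x),
# J = (e*hbar/2m*i)(psi* dpsi - psi dpsi*) - (e*^2/m*)|psi|^2 A  (toy units).
hbar = 1.0
e_star, m_star = 2.0, 2.0        # Cooper-pair charge and mass, toy values
k, A = 3.0, 0.5

x = np.linspace(0.0, 10.0, 100001)
psi = np.exp(1j * k * x)
dpsi = np.gradient(psi, x)       # numerical d(psi)/dx

J = (e_star * hbar / (2j * m_star)) * (psi.conj() * dpsi - psi * dpsi.conj()) \
    - (e_star ** 2 / m_star) * np.abs(psi) ** 2 * A
expected = e_star * (hbar * k - e_star * A) / m_star
print(round(float(J.real[len(x) // 2]), 6), expected)   # both 2.0
```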
But aren't we here seeing just the Born interpretation? No. On the Born
interpretation what we have here is a probability and a probability in need of
an elaborate story. The story must do one of two things. Either it both
provides a mechanism that keeps reducing the paired electrons of the
superconductor so that they are in an eigenstate of the current operator and
also shows that this is consistent with the predictions that may well require
a non-reduced state; or else in some way or another it ensures that the mean
value (or something like it that evolves properly enough in time) is the one
that almost always occurs. We have no such stories. More importantly, we
where Nr(0) and Nl(0) are the densities of states at the Fermi level at the right
Figure 9.5 Typical equivalent circuit for a SQUID, with accompanying I-V
curve. The model used for the computer simulations of the SQUID with transmission
line. The RSJ has stray inductance Ls. The dimensionless capacitance of the transmission
line is βctl = 2πIcRsCs/Φ0.
and left metal respectively and ⟨|T|²⟩ is the mean transition probability for
an electron to move from the left to the right.
Where does this identification come from? I follow here the treatment of
Antonio Barone and Gianfranco Paterno.13 They first derive from fundamental microscopic theory an expression for the quasi-particle tunnelling current in a Josephson junction. (Quasi-particles are something like single electrons as opposed to the Cooper pairs which are responsible for
superconductivity.)
Iqp = (ℏ/eRN) ∫ dω nl(ω) nr(ω − eV0/ℏ) [f(ω) − f(ω − eV0/ℏ)]

Here nl(ω) and nr(ω) are the quasi-particle densities as a function of frequency
on the left and on the right, f(ω) = 1/(exp(ℏω/kBT) + 1) where kB is the Boltzmann
constant, and V0 is the voltage across the junction. We now apply the formula
13 Barone and Paterno 1982.
Figure 9.5 cont. I-V characteristic of a single Josephson junction from the
first generation, V at 5×10⁻⁴ V/div. The zeros for I and V are at the centre
of the figure. Source: Beasley 1990.
for Iqp to the case when both junction electrodes are in the normal state. Then
n(ω) = 1, and since

∫ dω [f(ω) − f(ω − eV0/ℏ)] = eV0/ℏ

we get for the tunnelling current in the normal state (INN)

INN = V0/RN
This shows that the quantity RN defined above is the resistance of the junction
when both metals are in the normal state.
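The identity invoked in this reduction can be checked numerically. A sketch (mine; apart from the physical constants, the temperature and voltage are illustrative) confirming that the integral of f(ω) − f(ω − eV0/ℏ) over frequency has magnitude eV0/ℏ, whatever the temperature:

```python
import numpy as np

# Check that |integral of f(w) - f(w - e*V0/hbar) dw| = e*V0/hbar for the
# Fermi function f(w) = 1/(exp(hbar*w/(kB*T)) + 1) (illustrative values).
hbar, kB, e = 1.054571817e-34, 1.380649e-23, 1.602176634e-19

def fermi(w, T):
    return 1.0 / (np.exp(np.clip(hbar * w / (kB * T), -500.0, 500.0)) + 1.0)

T = 4.2                          # liquid-helium temperature, K
V0 = 1e-4                        # 0.1 mV across the junction
shift = e * V0 / hbar            # the frequency shift e*V0/hbar

w = np.linspace(-200.0 * shift, 200.0 * shift, 400001)
g = fermi(w, T) - fermi(w - shift, T)
dw = w[1] - w[0]
integral = float(np.sum(0.5 * (g[1:] + g[:-1])) * dw)   # trapezoidal rule

print(round(abs(integral) / shift, 6))   # 1.0, independent of T
```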
There is one last point to make with this example. The identification of an
expression we can calculate in quantum mechanics with the quantity RN,
which we can measure using Ohm's law with totally conventional voltmeters
and ammeters, is not a once and for all identification. The expression for RN
depends very much on the specifics of the situation studied: here one simple
quantum junction driven by a constant DC circuit with no spatial phase variation. In other situations we will need a very different quantum expression
for the normal resistance. Finding it is just the kind of job that theoreticians
working in superconductivity spend their time doing.
Examples like this show first that quantum mechanics and classical mechanics are both necessary and neither is sufficient; and secondly, that the two
relate to each other differently from case to case. Both the conclusions speak
in favour of using classical states if we propose to adopt a have-your-cake-and-eat-it-too strategy for dealing with the superposition problem. On the
other hand, they do discount the advantage I claimed for this approach with
respect to the issue of book-keeping. For matters are not after all so simple
as I first suggested. In one sense they are. Once we have a model which
assigns states to the systems under study, book-keeping is straightforward
on the classical-state view. Quantum features of the systems in the model are
calculated from their quantum states and classical features from their classical
states. But when we want to develop the model, in particular to think about
how the systems might evolve, the underlying problems come to the fore
once more.
To see how, consider the treatment of measurement in the modal interpretation. According to the modal interpretation, the observed outcome from
a measurement interaction is to be accounted for by the 'second' state
assigned to the measured system (the state 'true in the actual world' but not
'necessarily' true). The point of keeping the big superposition for the
apparatus plus measured system is to account for the statistical features of
any future measurement results on the composite in which interference would
be revealed (i.e. to account for the wave aspect of the composite in its future
behaviour). To account for the individual outcomes in the new measurement,
though, a new set of second states comes into play. In early versions of the
modal interpretation the exact nature of these second states and the probability distributions over them were to be fixed by the big evolved state for the
composite in conjunction with information about which observables exactly
were being measured in the second interaction. This simple version of the
modal interpretation had little to say about the evolution of the second-level
states: how does the state true of the measured system in the actual world at
one time turn into the state true at another?
This is a question that both advocates and critics of the modal interpretation have put a good deal of effort into. It demands an answer just as
much on the view I am defending, where the 'second' set of states are
classical rather than quantum states. We might try the simple answer:
classical states evolve by classical Lagrangians into further classical
states; quantum states evolve into quantum states under quantum Hamiltonians. But as I have indicated, this answer will not do from my own point
of view. That is because of the possibility of mixed modelling that I have
just been describing.
We can and do build models in which classical features affect the
quantum description, and the reverse. Often the mixing is concealed at
the very beginning or at the very end of the model. Quantum 'state
preparation' is a clear general case of the first. You send your particles
down a long tube in order to prepare a quantum state with a narrow
Philosophical remarks
Lamb 1998.
We have not learned in the course of our work that quantum mechanics is
true and classical mechanics false. At most we have learned that some kinds
of systems in some kinds of situations (e.g. electrons circulating in an atom)
have states representable by quantum state functions, that these quantum
states evolve and interact with each other in an orderly way (as depicted in
the Schrödinger equation), and that they have an assortment of causal relations with other non-quantum states in the same or other systems. One might
maintain even the stronger position that all systems in all situations have
quantum states. As we have seen throughout this book I think this is a claim
far beyond what even an ardent quantum devotee should feel comfortable in
believing. But that is not the central matter here. Whether it is true or not, it
has no bearing one way or another on whether some or even all systems have
classical states as well, states whose behaviour is entirely adequately
described by classical physics.
One reason internal to the theory that may lead us to think that quantum
mechanics has replaced classical physics is reflected in the generalised Born
interpretation. We tend to think that quantum mechanics itself is in some
ways about classical quantities, in which case it seems reasonable for the
realist to maintain that it rather than classical theory must provide the correct
way of treating these quantities. But that is not to take quantum realism
seriously. In developing quantum theory we have discovered new features of
reality that realists should take to be accurately represented by the quantum
state function. Quantum mechanics is firstly about how these features interact
and evolve and secondly about what effects they may have on other properties
not represented by quantum state functions.
In my opinion there is still room in this very realistic way of looking at
the quantum theory for seeing a connection between quantum operators and
classical quantities and that is through the correspondence rule that we use
as heuristic in the construction of quantum mechanical Hamiltonians. But
that is another long chapter. All that I want to stress here is my earlier remark
that quantum realists should take the quantum state seriously as a genuine
feature of reality and not take it as an instrumentalist would, as a convenient way of summarising information about other kinds of properties. Nor
should they insist that other descriptions cannot be assigned besides quantum
descriptions. For that is to suppose not only that the theory is true but that it
provides a complete description of everything of interest in reality. And that
is not realism; it is imperialism.
But is there no problem in assigning two different kinds of descriptions to
the same system and counting both true? Obviously there is not from anything like a Kantian perspective on the relations between reality and our
descriptions of it. Nor is it a problem in principle even from the perspective of
the naive realist who supposes that the different descriptions are in one-to-one
correspondence with distinct properties. Problems are not there just because
we assign more than one distinct property to the same system. If problems
arise, they are generated by the assumptions we make about the relations
among those properties: do these relations dictate behaviours that are somehow contradictory? The easiest way to ensure that no contradictions arise is
to become a quantum imperialist and assume there are no properties of interest besides those studied by quantum mechanics. In that case classical
descriptions, if they are to be true at all, must be reducible to (or supervene
on) those of quantum mechanics. But this kind of wholesale imperialism and
reductionism is far beyond anything the evidence warrants. We must face up
to the hard job of practical science and continue to figure out what predictions
quantum mechanics can make about classical behaviour.
Is there then a problem of superposition or not? In the account of quantum
mechanics that I have been sketching the relations between quantum and
classical properties are complicated and the answer is not clear. I do not know
of any appropriately detailed analysis in which contradiction cannot be
avoided. Lamb's view is more decided. So far we have failed to find a single
contradiction, despite purported paradoxes like two-slit diffraction, Einstein-Podolsky-Rosen or Schrödinger's cat:
There is endless talk about these paradoxes,
Talk is cheap, but you never get more than you pay for.
If such problems are treated properly by quantum mechanics, there are no
paradoxes.15
Still many will find their suspicions aroused by these well-known attempts
to formulate a paradox. In that case I urge that we look harder, and find the
quantum measurement problem.
ACKNOWLEDGEMENTS
This chapter combines material previously published in Cartwright 1995b, 1995c and
1998a. Thanks to Jordi Cat for his help with the Aharonov-Bohm discussion and to
Sang Wook Yi for comments.
15 Lamb 1993.
Bibliography
Allport, P. 1991, 'Still Searching For the Holy Grail', New Scientist 5, pp. 55-56.
1993, 'Are the Laws of Physics "Economical with the Truth"?', Synthese 94, pp.
245-290.
Anand, S. and Kanbur, R. 1995, 'Public Policy and Basic Needs Provision: Intervention and Achievement in Sri Lanka', in Dreze, Sen and Hussain, 1995.
Anderson, P. W. 1972, 'More Is Different', Science 177, pp. 393-396.
Andvig, J. 1988, 'From Macrodynamics to Macroeconomic Planning: A Basic Shift
in Ragnar Frisch's Thinking', European Economic Review 32, pp. 495-502.
Anscombe, E. 1971, Causality and Determination, Cambridge: Cambridge University
Press. Reprinted in Sosa and Tooley 1993, pp. 88-104.
Aristotle [1970], Aristotle's Physics: Book I and II, trans. W. Charlton, Oxford: Clarendon Press.
Backhouse, R., Maki, U., Salanti, A. and Hausman, D. (eds.) 1997, Economics and
Methodology, International Economic Association Series, London: Macmillan
and St Martin's Press.
Bacon, F. 1620 [1994], Novum Organum, Chicago: Open Court.
Bardeen, J. 1956, 'Theory of Superconductivity', in Encyclopedia of Physics, Berlin:
Springer-Verlag, pp. 274-309.
Bardeen, J. and Pines, D. 1955, 'Electron-Phonon Interactions in Metals', Physical
Review 99, pp. 1140-1150.
Bardeen, J., Cooper, L. N. and Schrieffer, J. R. 1957, 'Theory of Superconductivity',
Physical Review 108, pp. 1175-1204.
Barone, A. and Paterno, G. 1982, Physics and Applications of the Josephson Effect,
New York: Wiley.
Beasley, M. 1990, Superconducting Electronics, class notes (EE334), Stanford University.
Blaug, M. 1992, The Methodology of Economics, Cambridge: Cambridge University
Press.
Bohm, D. 1949, 'Note on a Theorem of Bloch Concerning Possible Causes of Superconductivity', Physical Review 75, pp. 502-504.
Born, M. and Cheng, K. C. 1948, 'Theory of Superconductivity', Nature 161, pp.
968-969.
Boumans, M. 1999, 'Built-in Justification', in Morrison and Morgan 1999.
Brackenridge, J. B. 1995, The Key to Newton's Dynamics, Berkeley: University of
California Press.
Brush, C. B. (ed.) 1972, The Selected Works of Pierre Gassendi, New York: Johnson
Reprint Corporation.
Hands, D. W. and Mirowski, P. 1997, 'Harold Hotelling and the Neoclassical Dream',
in Backhouse et al. 1997.
Harre, R. 1993, Laws of Nature, London: Duckworth.
Hart, O. and Moore, J. 1991, 'Theory of Debt Based on the Inalienability of Human
Capital', LSE Financial Markets Group Discussion Paper Series, No. 129,
London: LSE.
1994, 'Theory of Debt Based on the Inalienability of Human Capital', Quarterly
Journal of Economics 109, pp. 841-879.
Hausman, D. 1992, The Inexact and Separate Science of Economics, Cambridge:
Cambridge University Press.
1997, 'Why Does Evidence Matter So Little To Economic Theory?', in Dalla
Chiara, Doets, Mundich and van Beuttem, 1997, pp. 395-407.
Heisenberg, W. 1947, 'Zur Theorie der Supraleitung', Zeitschrift für Naturforschung
2a, pp. 185-201.
Hempel, C. G. and Oppenheim, P. 1960, 'Problems in the Concept of General Law',
in Danto and Morgenbesser 1960.
Hendry, D. F. 1995, Dynamic Econometrics, Oxford: Oxford University Press.
Hendry, D. F. and Morgan M. S. (eds.) 1995, The Foundations of Econometric
Analysis, Cambridge: Cambridge University Press.
Hoover, K. forthcoming, Causality in Macroeconomics, Cambridge: Cambridge University Press.
Hotelling, H. 1932, 'Edgeworth's Taxation Paradox and the Nature of Demand and
Supply Functions', Journal of Political Economy 40, pp. 577-616.
Hughes, R. I. G. 1999, 'The Ising Model, Computer Simulation, and Universal
Physics', in Morrison and Morgan 1999.
Hume, D. 1779 [1980], Dialogues Concerning Natural Religion, Indianapolis:
Hackett.
Hutchison, T. W. 1938, The Significance and Basic Postulates of Economic Theory,
New York: Kelley.
Jackson, J. D. 1975, Classical Electrodynamics, 2nd Edition, New York: Wiley.
Kitcher, P. and Salmon, W. (eds.) 1988, Scientific Explanation, Minneapolis: Univ.
of Minnesota Press.
Kittel, C. 1963, Quantum Theory of Solids, New York: Wiley.
Kochen, S. and Specker, E. P. 1967, 'The Problem of Hidden Variables in Quantum
Mechanics', Journal of Mathematics and Mechanics 17, pp. 59-87.
Krüger, L. and Falkenburg, B. (eds.) 1995, Physik, Philosophie und die Einheit der
Wissenschaften, Heidelberg: Spektrum Akademischer Verlag.
Kuhn, T. S. 1958, 'Newton's Optical Papers', in Cohen 1958.
Kyburg, H. 1969, Probability Theory, Englewood Cliffs NJ: Prentice Hall.
Lamb, W. 1969, 'An Operational Interpretation of Nonrelativistic Quantum Mechanics', Physics Today 22, pp. 23-28.
1986, 'Quantum Theory of Measurement', Annals of the New York Academy of
Sciences 48, pp. 407-416.
1987, 'Theory of Quantum Mechanical Measurements', in Namiki et al. 1987.
1989, 'Classical Measurements on a Quantum Mechanical System', Nuclear Physics B. (Proc. Suppl.) 6, pp. 197-201.
1993, 'Quantum Theory of Measurement: Three Lectures', lectures, LSE. July
1993.
Sargent III, M., Scully, M. O. and Lamb, W. Jr 1974, Laser Physics, Reading MA:
Addison-Wesley.
Schachenmeier, R. 1951, 'Zur Quantentheorie der Supraleitung', Zeitschrift für Physik
129, pp. 1-26.
Sen, A. 1981, 'Public Action and the Quality of Life in Developing Countries', Oxford
Bulletin of Economics and Statistics 43, pp. 287-319.
1988, 'Sri Lanka's Achievements: How and When', in Srinivasan and Bardhan
1988.
Sepper, D. L. 1988, Goethe contra Newton, Cambridge: Cambridge University Press.
Shafer, G. 1996, The Art of Causal Conjecture. Cambridge MA: MIT Press.
1997, 'How to Think about Causality', MS, Faculty of Management, Rutgers University.
Shoemaker, S. 1984, Identity, Cause and Mind: Philosophical Essays, Cambridge:
Cambridge University Press.
Shomar, T. 1998, Phenomenological Realism, Superconductivity and Quantum Mechanics, PhD Dissertation, University of London.
Singwi, K. S. 1952, 'Electron-Lattice Interaction and Superconductivity', Physical
Review 87, pp. 1044-1047.
Smart, J. J. C. 1963, Philosophy and Scientific Realism, London: Routledge.
Smith, C. and Wise, M. N. 1989, Energy and Empire: A Biographical Study of Lord
Kelvin, Cambridge: Cambridge University Press.
Sosa, E. and Tooley, M. (eds.) 1993, Causation. Oxford: Oxford University Press.
Spirtes, P., Glymour, C. and Scheines, R. 1993, Causation, Prediction and Search, New
York: Springer-Verlag.
Srinivasan, T. N. and Bardhan, P. K. (eds.) 1988, Rural Poverty in South Asia,
New York: Columbia University Press.
Stegmüller, W. 1979, The Structuralist View of Theories, Berlin: Springer-Verlag.
Suarez, M. 1999, 'The Role of Models in the Application of Scientific Theories:
Epistemological Implications', in Morrison and Morgan 1999.
Suppe, F. 1977, The Structure of Scientific Theories, Urbana: University of Illinois
Press.
Suppes, P. 1984, Probabilistic Metaphysics, Oxford: Blackwell.
Toda, H. and Phillips, P. 1993, 'Vector Autoregressions and Causality', Econometrica
61, pp. 1367-1393.
Tomonaga, S. 1950, 'Remarks on Bloch's Method of Sound Waves Applied to Many-Fermion Problems', Progress in Theoretical Physics 5, pp. 544-569.
Van Fraassen, B. 1980, The Scientific Image, Oxford: Clarendon Press.
1989, Laws and Symmetry, Oxford: Clarendon Press.
1991, Quantum Mechanics: An Empiricist View, Oxford: Clarendon Press.
Vidali, G. 1993, Superconductivity: The Next Revolution?, Cambridge: Cambridge
University Press.
Weber, M. 1951, Gesammelte Aufsätze zur Wissenschaftslehre, Tübingen: J. C. B.
Mohr (Paul Siebeck).
Weinert, F. (ed.) 1995, Laws of Nature: Essays on the Philosophical, Scientific and
Historical Dimensions, New York: Walter de Gruyter.
Wentzel, G. 1951, 'The Interaction of Lattice Vibrations with Electrons in a Metal',
Physical Review 84, pp. 168-169.
Wise, N. (ed.) forthcoming, Growing Explanations: Historical Reflections on the Sciences of Complexity.
Index
cause, common 107-108, 128
ceteris paribus 4, 10, 25, 28, 33, 37,
50, 57, 68, 93n, 125, 137-139, 141,
143, 147-148, 151-152, 163, 168,
175-176, 188
chance set-ups 152, 154, 157, 162,
165-166, 168, 174-176
Chang, Hasok ix, 103, 219n
circuit diagrams 223-226, 229, 231
classical states 214-217, 222-223. See
also quantum states
Clifton, Robert 212n
Collins, Harry 88
completability, of the sciences 56
concrete 36, 39-41, 41n, 43, 45-46, 91,
183, 189-193, 189n, 195. See also
abstract
conditional probability 109, 126-127,
157-158, 161-163
Cooper pairs 197-198, 200, 228
correspondence rule 195
Coulomb force 39, 60, 67, 82, 151,
202
Coulomb interaction 192-193, 195-198,
202, 206, 208
Coulomb's law 53-54, 59-60, 64-65,
67-68, 71, 82-84
counterfactual(s) 4, 83, 143-145
covering law account 138-139, 145,
151
Cowles Commission 110
crucial experiment (experimentum
crucis) 95-96, 98, 103
customisation account(s) 186-187
DAGs (Directed Acyclic Graphs) 104,
110-112, 116-119
dappled world 1, 9-10, 12, 140
Davidson, Donald 56
De Broglie's double solution 212
decoherence approach 215n
deductive closure 67, 188
deductive method 95, 101, 112
degree-of-refrangibility. See
refrangibility
Del Seta, Marco ix, 213n
demand curve 55
Descartes, Rene 82
determinable-determinate relationship
41,41n, 64
determinism 6, 57, 107-108, 114, 137
deterministic causation 107-108,
110-111
Ding, Zhuanxin 111
dipole expectation 219
Ito, Makiko ix
I-V curve 224, 226-228
Jones, C.R. 167-168, 168n
Josephson junction(s) 226-228, 230
Kanbur, Ravi 153-154, 153n, 154n
Kant, Immanuel 23
Kelvin, Lord 48
Kennedy, J.B. 35n, 48
Kepler, Johannes 50-53, 51n, 52n, 152,
157
Keynes, John Maynard 7, 16, 150
Kitcher, Philip 71n
knowledge 4-7, 9-10, 17-18, 23-24,
49, 54, 64, 72-73, 119, 123, 140-141, 151
knowledge, local 23
Kochen and Specker 217n, 218n
Kuhn, Thomas 95
Kyburg, Henry 167
Lakatos, Imre 179-180, 198
Lamb, Willis 214, 216, 219, 222, 230-231, 233
Laplacean demon 158-159
lattice vibrations 197-199, 202, 204,
206
law(s), causal 50, 104-105, 116-117,
121-122, 124-130, 133, 140,
161-162
law(s), ceteris paribus 4, 25, 28f., 137,
151, 175-176
law(s), higher-level 89-90
law(s) of nature 12, 24, 49, 72, 78-80,
82, 105, 140, 144, 158
law(s), phenomenological 10, 31-32
law(s), probabilistic 20, 152, 158, 168,
173
law(s), theoretical 137
Lessing, Gotthold Ephraim 37-44,
48
lever 58, 68, 142-143
Lewis, David 144, 144n, 196n
light (or darkness)-in-interaction-with-a-turbid-medium 102
light, the inner constitution of 102
limits of scientific laws 37, 47-48
limits of quantum physics 193-197, 214
Lindsay, Robert 43-44
Locke, John 70, 78, 82
London equations 53, 182, 225
Lucas, Anthony 96
Ma, Cynthia ix
Mackie, John L. 128-129, 128n
macro-micro interactions 216
macro-objects 215-217
Maki, Uskali 3n
manipulation theorem 126
Markov condition 104-114,117-118,
126-127, 129-130, 132-134
Marxism 7
Maudlin, Tim 71
Maxwell, James Clerk 218n
Maxwell's equations 35, 147, 219
McAllister, Harry E. 166-168, 174
measurement problem, of quantum
mechanics 162, 211-212, 222-223,
231
mechanical philosophy 1, 2, 33
Menger, Carl 3
Messiah, Albert 195-196
metaphysics ix, 12, 69-70, 148-149,
151
methods of modern experimental
physics 80, 85, 102
methodology and ontology, their
connection 102-103
micro-objects 216
Mill, John Stuart 56, 59, 78, 128,
151
Millman's theorem 55
Mirowski, Philip 60
Mitchell, Sandra 186n
mixing 105, 130
modal interpretation 212, 229
model(s) 36, 43-44, 48, 179-181, 184,
191-193, 197, 199, 216, 230
Moore, John 145-148, 145n, 146n
moral of a fable 37, 39, 44, 47
Morgan, Mary ix, 124, 151, 165-167,
176
Morrison, Margaret ix, 16n, 179-180
Mulholland, H. 167-168, 168n
Nagel, Ernst 195
Nash bargain 169-170
natures, Aristotelian 4, 10, 28-29, 34,
37, 72, 78-92, 98-99, 101-103,
138, 145, 149
necessitation 71-72
necessity 4, 137, 140
Ne'eman, Yuval 36, 48
Nemeth, E. 124
Neurath, Otto 5-6, 27, 124
Newton, Isaac 28, 47, 50-51, 95-96,
98-103
Newton's laws 25, 27, 32, 43-44, 47-48, 52, 65, 67, 187
neuromagnetism 47
nomological machine 4, 25, 49-53,
science, exact 10, 53-54, 56-59, 64, 67,
73, 188
scientific attitude 6-7, 9, 11-12, 72
scientific method 18
scientific revolution 78-79, 81
screened Coulomb potential 192, 202,
207
screening off 107-110, 114
screening-off condition 109, 111, 115
second quantization 193
semantic view, of theory 179-180,
183-185, 191
semi-classical theory of the laser 215,
219-220
Sen, Amartya 153n, 154-156, 154n,
176
Sepper, Dennis L. 98-99
Shafer, Glenn 116, 117n, 158-162,
158n, 159n, 160n
shielding 29, 50-52, 57-58, 67, 147,
152, 187
Shimony, Abner 196n
Shoemaker, Sydney 70, 70n
Shomar, Towfic ix, 60, 73, 122-123,
130, 133-136, 182, 193
Simon, Herbert 125
Simpson's paradox 131-132, 131n, 134
simultaneous equations models 55, 65n,
111-112, 116
Smart, J. J. C. 175, 175n
Smith, Joel 186n
spring model of ions 204-205
social constructivist(s) 46-47
socio-economic machine 123-124, 137-139, 142-143, 145-146, 149-150,
152, 154, 155-156, 162, 176
Sosa, Ernest 135
special theory of relativity 67
Spirtes, Peter 104-107, 110, 112,
113-114, 113n, 121, 124, 126-128, 130-131, 133n, 134
SQUIDs 5, 24, 29, 47, 194, 196, 219,
224, 230
standard model, of particle physics 12
state preparation, quantum 230
Steuer, Max 143n, 151
stock models 198-199, 208-210
structuralist(s), German 190n, 201
Suarez, Mauricio ix
superconductivity 193-194, 196-198,
200, 204, 206-208, 223, 226, 228
superstring theory 2n, 16
supervenience 32-33, 40, 41n, 233